    1. eLife Assessment

      This study provides evidence that single-cell multi-omics profiling can reveal key regulators of HIV-1 persistence and early immune dysregulation, particularly implicating KLF2 and Th17 cells as major players in viral reservoir dynamics. The findings are solid, supported by rigorous integration of scRNA-seq and scATAC-seq data, but are limited by sample size and lack of validation with external datasets. Overall, this work makes a valuable contribution to understanding HIV-1 immune evasion and highlights potential therapeutic targets for reservoir eradication.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to elucidate the molecular mechanisms underlying HIV-1 persistence and host immune dysfunction in CD4+ T cells during early infection (<6 months). Using single-cell multi-omics technologies (including scRNA-seq, scATAC-seq, and single-cell multiome analyses), they characterized the transcriptional and epigenomic landscapes of HIV-1-infected CD4+ T cells. They identified key transcription factors (TFs), signaling pathways, and T cell subtypes involved in HIV-1 persistence, particularly highlighting KLF2 and Th17 cells as critical regulators of immune suppression. The study provides new insights into immune dysregulation during early HIV-1 infection and reveals potential epigenetic regulatory mechanisms in HIV-1-infected T cells.

      Strengths:

      The study excels through its innovative integration of single-cell multi-omics technologies, enabling detailed analysis of gene regulatory networks in HIV-1-infected cells. Focusing on early infection stages, it fills a crucial knowledge gap in understanding initial immune responses and viral reservoir establishment. The identification of KLF2 as a key transcription factor and Th17 cells as major viral reservoirs, supported by comprehensive bioinformatics analyses, provides robust evidence for the study's conclusions. These findings have immediate clinical relevance by identifying potential therapeutic targets for HIV-1 reservoir eradication.

      Weaknesses:

      Despite its strengths, the study has several limitations. By focusing exclusively on CD4+ T cells, the study overlooks other relevant immune cells such as CD14+ monocytes, NK cells, and B cells. Additionally, while the authors generated their own single-cell datasets, they need to validate their findings using other publicly available single-cell data from HIV-1-infected PBMCs.

    3. Reviewer #2 (Public review):

      Summary:

      The authors observed gene ontologies associated with upregulated KLF2 target genes in HIV-1 RNA+ CD4 T cells using scRNA-seq and scATAC-seq datasets from the PBMCs of early HIV-1-infected patients, revealing immune responses that contribute to HIV pathogenesis and suggesting novel targets for viral elimination.

      Strengths:

      The authors carried out detailed transcriptomic profiling with scRNA-seq and scATAC-seq datasets to conclude that KLF2 target genes are upregulated in HIV-1 RNA+ CD4 T cells.

      Weaknesses:

      This key observation of upregulation of the KLF2-associated gene family might be important in the HIV field for early diagnosis and viral clearance. However, with the limited sample size and the in vivo study model, it is hard to draw firm conclusions. I highly recommend increasing the sample size of early HIV-1-infected patients.

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript studies intracellular changes and immune processes during early HIV-1 infection with an additional focus on the small CD4+ T cell subsets. The authors used single-cell omics to achieve high resolution of transcriptomic and epigenomic data on the infected cells, which were verified by viral RNA expression. The results add to the understanding of transcriptional regulation that may allow progression or HIV latency later in infected cells. The biosamples were derived from early HIV infection cases, providing particularly valuable data for the HIV research field.

      Strengths:

      The authors examined the heterogeneity of infected cells within CD4 T cell populations, identified a significant and unexpected difference between naive and effector CD4 T cells, and highlighted the differences in Th2 and Th17 cells. Multiple methods were used to show the role of the increased KLF2 factor in infected cells. This is a valuable finding of a new role for the major transcription factor in further disease progression and/or persistence.

      The methods employed by the authors are robust. Single-cell RNA-Seq from PBMC samples was followed by a comprehensive annotation of immune cell subsets, 16 in total. This manuscript presents to the scientific community a valuable multi-omics dataset of good quality, which could be further analyzed in the context of larger studies.

      Weaknesses:

      Methods and Supplementary materials

      Some technical aspects could be described in more detail. For example, it is unclear how the authors filtered out cells that did not pass quality control, such as doublets and cells with low transcript/UMI content. Next, in cell annotation, what is the variability in cell types between donors? This information is important to include in the supplementary materials, especially with such a small sample size. Without this, it is difficult to determine whether the differences between subsets at the transcriptomic level, viral RNA expression level, and chromatin assessment are due to cell type variations or individual patient-specific variations. For the DEG analysis, did the authors exclude the most variable genes?

      The annotation of 16 cell types from PBMC samples is impressive and of good quality; however, not all cell types receive attention in further analysis. It's natural to focus primarily on the CD4 T cells according to the research objectives. The authors also study potential interactions between CD4 and CD8 T cells by cell communication inference. It would be interesting to ask additional questions for other underexplored immune cell subsets, such as: 1) Could viral RNA be detected in monocytes or macrophages during early infection? 2) What are the inferred interactions between NK cells and infected CD4 T cells, and are these interactions similar to the CD4-CD8 results? 3) What are the inferred interactions between monocytes or macrophages and infected CD4 T cells?

      Discussion

      It would be interesting to see more discussion of the observation that naïve T cells produce more viral RNA than effector T cells. It seems counterintuitive given the general levels of transcriptional and translational activity in these subsets.

      Another discussion block could be added comparing the results and conclusions with the Ashokkumar et al. paper published earlier in 2024 (10.1093/gpbjnl/qzae003). This earlier publication used both a cell line-based HIV infection model and primary infected CD4 T cells and identified certain transcription factors correlated with viral RNA expression.

    1. eLife Assessment

      This manuscript reports fundamental and important discoveries on how necrotic cells contribute to organ regeneration through apoptotic signalling to produce cells with non-lethal apoptotic caspase activity that contribute to the regenerated tissue. These findings would be of broad interest to those who study wound repair and tissue regeneration. The strength of the evidence is solid and has been improved in the revised version.

    2. Reviewer #2 (Public review):

      In this revised manuscript, Klemm et al. build on past published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. Previously, in Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors investigate how tissues regenerate following necrotic cell death. They find that

      (1) the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum),

      (2) two signaling pathways that are known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells, and may partially explain the region specificity,

      (3) the presence of NiA (and/or NiCP) cells promotes regenerative proliferation in the late stages of regeneration,

      (4) not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells), these NiCP cells continue to live and promote proliferation in adjacent cells,

      (5) the initiator caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage.

      In the revised manuscript, the authors provide improvements through additional data quantifications and text changes to better explain NiA/NiCP lineage tracing methods.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration. The paper raises many interesting questions for future investigation, including what is the nature of the signaling between the damaged tissue and the NiA/NiCP responsive areas (such as the identity of the DAMPs)? What determines if these cells at a distance undergo apoptosis or remain viable in the tissue as caspase-positive cells? And since the authors have data that indicates that the phenomenon is distinct from 'undead cells', what are the mechanisms by which these cells promote local proliferation?

    3. Reviewer #3 (Public review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead.

      The authors have addressed the concerns raised, including those about drawing conclusions from RNAi knockdown without evaluating the efficacy of the knockdown, and in doing so they revised their conclusions after ascertaining that the Zfh2 RNAi was not effective.

      The authors have added quantification of the imaging data throughout, which strengthens their conclusions.

      In addition, the authors have revised some of the text describing the changes in EdU signal and added explanations of reagents such as the caspase sensors to clarify the experimental approaches, results, and interpretation of those results.

      The authors have also addressed the minor concerns and questions about the figures and text.

      A few questions remain, which the authors may choose to address.

      (1) The hh>Stat92ERNAi was assessed by the 10xSTAT-GFP reporter, as shown in Fig 2 Supp1 F. The authors point out the marked reduction in GFP in the ventral part of the hinge but do not comment on the lack of change in GFP in the dorsal part of the hinge. However, the open arrowhead in Figure 2H indicating the lack of cDcp-1 signal in the hinge in the same experiment points to the dorsal hinge, where the reporter suggests no difference in JAK-STAT signaling.

      (2) The data used to conclude that DRONC-DN and UAS-DIAP1 do not affect regenerative proliferation were normalized EdU intensities. As discussed in the prior review round, normalized EdU may not be a good comparison across experimental conditions given that the remainder of the disc may also have altered EdU incorporation, so this measurement may not be enough by itself to draw conclusions about regenerative proliferation. To strengthen the conclusion that regenerative proliferation is unaffected under these conditions, the authors may want to consider using a second measure such as adult wing size, PCNA, or quantification of mitoses via anti-phospho-histone H3 staining.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In previous work, the authors described necrosis-induced apoptosis (NiA) as a consequence of induced necrosis. Specifically, experimentally induced necrosis in the distal pouch of larval wing imaginal discs triggers NiA in the lateral pouch. In this manuscript, the authors confirmed this observation and found that while necrosis can kill all areas of the disc, NiA is limited to the pouch and to some extent to the notum, but is excluded from the hinge region. Interestingly and unexpectedly, signaling by the Jak/Stat and Wg pathways inhibits NiA. Further characterization of NiA by the authors reveals that NiA also triggers regenerative proliferation which can last up to 64 hours following necrosis induction. This regenerative response to necrosis is significantly stronger compared to discs ablated by apoptosis. Furthermore, the regenerative proliferation induced by necrosis is dependent on the apoptotic pathway because RNAi targeting the RHG genes is sufficient to block proliferation. However, NiA does not promote proliferation through the previously described apoptosis-induced proliferation (AiP) pathway, although cells at the wound edge undergo AiP. Further examination of the caspase levels in NiA cells allowed the authors to group these cells into two clusters: some cells (NiA) undergo apoptosis and are removed, while others referred to as Necrosis-induced Caspase Positive (NiCP) cells survive despite caspase activity. It is the NiCP cells that repair cellular damage including DNA damage and that promote regenerative proliferation. Caspase sensors demonstrate that both groups of cells have initiator caspase activity, while only the NiA cells contain effector caspase activity. Under certain conditions, the authors were also able to visualize effector caspase activity in NiCP cells, but the level was low, likely below the threshold for apoptosis. 

      Finally, the authors found that loss of the initiator caspase Dronc blocks regenerative proliferation, while inhibiting effector caspases by expression of p35 does not, suggesting that Dronc can induce regenerative proliferation following necrosis in a non-apoptotic manner. This last finding is very interesting as it implies that Dronc can induce proliferation in at least two ways in addition to its requirement in AiP.

      Strengths:

      This is a very interesting manuscript. The authors demonstrate that epithelial tissue that contains a significant number of necrotic cells is able to regenerate. This regenerative response is dependent on the apoptotic pathway which is induced at a distance from the necrotic cells. Although regenerative proliferation following necrosis requires the initiator caspase Dronc, Dronc does not induce a classical AiP response for this type of regenerative response. In future work, it will be very interesting to dissect this regenerative response pathway genetically.

      Weaknesses:

      No weaknesses were identified.

      We thank the reviewer for their positive evaluation and kind words.

      Reviewer #2 (Public Review):

      Summary / Strengths:

      In this manuscript, Klemm et al. build on past published findings (Klemm et al., 2021) to characterize caspase activation in distal cells following necrotic tissue damage within the Drosophila wing imaginal disc. Previously, in Klemm et al., 2021, the authors described necrosis-induced apoptosis (NiA) following the development of a genetic system to study necrosis caused by the expression of a constitutively active GluR1 (Glutamate/Ca2+ channel), and they discovered that the appearance of NiA cells was important for promoting regeneration.

      In this manuscript, the authors aim to investigate how tissues regenerate following necrotic cell death. They find that the cells of the wing pouch are more likely to have non-autonomous caspase activation than other regions within the wing imaginal disc (hinge and notum); that two signaling pathways known to be upregulated during regeneration, Wnt (wingless) and JAK/Stat signaling, act to prevent additional NiA in pouch cells and may explain the region specificity; that the presence of NiA cells promotes regenerative proliferation in late stages of regeneration; that not all caspase-positive cells are cleared from the epithelium (these cells are then referred to as Necrosis-induced Caspase Positive (NiCP) cells) and that these NiCP cells continue to live and promote proliferation in adjacent cells; and that the caspase Dronc is important for creating NiA/NiCP cells and for these cells to promote proliferation. Animals heterozygous for a Dronc null allele show a decrease in regeneration following necrotic tissue damage.

      The study has the potential to be broadly interesting due to the insights into how tissues differentially respond to necrosis as compared to apoptosis to promote regeneration.

      Weaknesses:

      However, here are some of my current concerns for the manuscript in its current version:

      The presence of cells with activated caspase that don't die (NiCP cells) is an interesting biological phenomenon but is not described until Figure 5. How does the existence of NiCP cells impact the earlier findings presented? Is late proliferation due to NiA, NiCP, or both? Do Wg and JAK/STAT signaling act to prevent the formation of both NiA and NiCP cells or only NiA cells? Moreover, the authors are able to specifically manipulate the wound edge (WE) and lateral pouch cells (LP), but don't show how these manipulations within these distinct populations impact regeneration. The authors provide evidence that driving UAS-mir(RHG) throughout the pouch, in the LP, or in the WE decreases the amount of NiA/NiCP (Figure 3G-O), but no data on final regenerative outcomes for these manipulations are presented (such as those presented for Dronc-/+ in Fig 7M). The manuscript would be greatly enhanced by quantification of more of the findings, especially in describing whether the specific manipulations that impacted NiA/NiCP cells disrupt end-point regeneration phenotypes.

      We have added a line to the results to clarify that we believe the finding that some NiA likely persist as NiCP does not affect our conclusions up to this point.

      We have added a statement emphasizing the results from our first paper, which demonstrate that LP>miRHG expression reduces the overall capacity to regenerate.

      Quantification of the change in posterior NiA number has been added to Figure 2L to strengthen the evidence. Likewise, we have included quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), and we have added quantification of the change in GC3Ai signal over time (Figure 5 – Figure supplement 1D) to emphasize the perdurance of GC3Ai-positive NiA/NiCP.

      How long does apoptosis take within the wing disc epithelium? How many of the caspase(+) cells are present for the whole 48 hours of regeneration? Are new cells also induced to activate caspase during this time window? The authors presented a number of interesting experiments characterizing the NiCP cells. For the caspase sensor GC3Ai experiments in Figure 5, is there a way to differentiate between cells that have maintained fluorescent GC3Ai and cells that have newly activated caspase? What is the timeline for when NiA and NiCP are specified? In addition, what fraction of NiCP cells contribute to the regenerated epithelium? Additional information about the temporal dynamics of NiA and NiCP specification/commitment would be greatly appreciated.

      We have included more information concerning the kinetics of apoptotic cell removal, and how this compares to the observations we have made with NiA/NiCP in our GC3Ai experiments. Additionally, we have included a quantification of the percent of the whole wing pouch with GC3Ai signal over time (Figure 5F) as well as the distal wing pouch with GC3Ai signal over time (Figure 5 – Figure supplement 1D) to further support the idea that NiCP persist over time.

      We acknowledge that our GC3Ai time course unfortunately cannot confirm whether the increase in GC3Ai signal over time is due to cells with new caspase activity or proliferating NiCP and have included this point in the discussion.

      We attempted to track the lineage of NiA/NiCP into the pupal and adult wings with CasExpress and DBS; however, the results of these experiments were inconsistent, and we therefore did not feel confident including these data or drawing conclusions in either direction. We are currently designing variations of these lineage-tracing tools to better track the lineage of these cells, which we hope to include in a future paper.

      The notum also does not express developmental JAK/STAT, yet little NiA was observed within the notum. Do the authors have any additional insights into the differential response between the pouch and notum? What makes the pouch unique? Are NiA/NiCP cells created within other imaginal discs and other tissues? Are they similarly important for regenerative responses in other contexts?

      We have added a brief mention of these points to the appropriate results section to avoid further increasing the length of the discussion.

      Data on the necrosis of other imaginal discs through FLP/FRT clone formation in haltere and leg discs have been added to Figure 1 – Figure supplement 1J and described in the text.

      Reviewer #3 (Public Review):

      The manuscript "Regeneration following tissue necrosis is mediated by non-apoptotic caspase activity" by Klemm et al. is an exploration of what happens to a group of cells that experience caspase activation after necrosis occurs some distance away from the cells of interest. These experiments have been conducted in the Drosophila wing imaginal disc, which has been used extensively to study the response of a developing epithelium to damage and stress. The authors revise and refine their earlier discovery of apoptosis initiated by necrosis, here showing that many of those presumed apoptotic cells do not complete apoptosis. Thus, the most interesting aspect of the paper is the characterization of a group of cells that experience mild caspase activation in response to an unknown signal, followed by some effector caspase activation and DNA damage, but that then recover from the DNA damage, avoid apoptosis, and proliferate instead. Many questions remain unanswered, including the signal that stimulates the mild caspase activation, and the mechanism through which this activation stimulates enhanced proliferation.

      The authors should consider answering additional questions, clarifying some points, and making some minor corrections:

      Major concerns affecting the interpretation of experimental results:

      Expression of STAT92E RNAi had no apparent effect on the ability of hinge cells to undergo NiA, leading the authors to conclude that other protective signals must exist. However, the authors have not shown that this STAT92E RNAi is capable of eliminating JAK/STAT signaling in the hinge under these experimental conditions. Using a reporter for JAK/STAT signaling, such as the STAT-GFP, as a readout would confirm the reduction or elimination of signaling. This confirmation would be necessary to support the negative result as presented.

      We have included data demonstrating our ability to knock down JAK/STAT activity in the hinge with UAS-Stat92E<sup>RNAi</sup> (Figure 2 – Figure supplement 1E and F). Additionally, we have included a quantification of posterior NiA/NiCP with the Stat92E<sup>RNAi</sup> (as well as wg<sup>RNAi</sup> and Zfh-2<sup>RNAi</sup>, Figure 2L) to strengthen our conclusion that JAK/STAT and WNT signaling act to regulate NiA formation within the pouch.

      Similarly, the authors should confirm that the Zfh2 RNAi is reducing or eliminating Zfh2 levels in the hinge under these experimental conditions, before concluding that Zfh2 does not play a role in stopping hinge cells from undergoing NiA.

      We have repeated this experiment with a longer knockdown using a GAL4 driver that expresses from early larval stages until our evaluation at L3, but were unable to demonstrate a loss of Zfh-2 with IF labeling. Additionally, we have quantified posterior NiA/NiCP with Zfh-2<sup>RNAi</sup> (Figure 2L) and do find a slight increase in NiA/NiCP number; however, this change is not significant. We have altered our conclusions to reflect these new data.

      EdU incorporation was quantified by measuring the fluorescence intensity of the pouch and normalizing it to the fluorescence intensity of the whole disc. However, the images show that EdU fluorescence intensity of other regions of the disc, especially the notum, varied substantially when comparing the different genetic backgrounds (for example, note the substantially reduced EdU in the notum of Figure 3 B' and B'). Indeed, it has been shown that tissue damage can lead to suppression of proliferation in the notum and elsewhere in the disc, unless the signaling that induces the suppression is altered. Therefore, the normalization may be skewing the results because the notum EdU is not consistent across samples, possibly because the damage-induced suppression of proliferation in the notum is different across the different genetic backgrounds.
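
The normalization concern above can be illustrated with a small numerical sketch. The function name and all intensity values below are hypothetical and chosen only for illustration; they are not measurements from the manuscript. The sketch assumes the metric is simply mean pouch intensity divided by mean whole-disc intensity.

```python
def normalized_edu(pouch_mean: float, disc_mean: float) -> float:
    """Mean pouch EdU intensity divided by mean whole-disc EdU intensity."""
    if disc_mean <= 0:
        raise ValueError("whole-disc intensity must be positive")
    return pouch_mean / disc_mean


# Hypothetical intensities in arbitrary units (not measured values).
# Both conditions have the SAME pouch signal.
control = normalized_edu(pouch_mean=100.0, disc_mean=200.0)

# Same pouch signal, but lower EdU elsewhere in the disc (for example in
# the notum), which lowers the whole-disc mean used as the denominator.
suppressed_elsewhere = normalized_edu(pouch_mean=100.0, disc_mean=125.0)

print(f"control ratio: {control:.2f}")
print(f"ratio with reduced signal elsewhere: {suppressed_elsewhere:.2f}")
```

Under these assumed numbers the normalized value rises even though the pouch signal itself is unchanged, which is why a second, independent readout of proliferation can help separate the two effects.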

      To more accurately reflect the observations that we have made with the EdU assay, we have changed our terminology to indicate that the EdU signal is more localized to the damaged tissue in ablated discs, thus taking into account the relative changes across the disc, rather than referring to it as an increase in the pouch. To further strengthen our observation that damage results in a localized proliferation, we have included a quantification of the E2F time course presented in Figure 3A (Figure 3 – Figure supplement 1C), which underscores the trend observed in our EdU experiments.

      The authors expressed p35 to attempt to generate "undead cells". They take an absence of mitogen secretion or increased proliferation as evidence that undead cells were not generated. However, there could be undead cells that do not stimulate proliferation non-autonomously, which could be detected by the persistence of caspase activity in cells that do not complete apoptosis. Indeed, expressing p35 and observing sustained effector caspase activation could help answer the later question of what percentage of this cell population would otherwise complete apoptosis (NiA, rescued by p35) vs reverse course and proliferate (NiCP, unaffected by p35).

      In our previous work, we showed that P35 expression impairs our ability to detect effector caspases with IF-based tools. This can also be seen in Figure 4 of this work (Figure 4C and F). Given that P35 expression precludes our ability to label and assay effector caspase activity visually, and thus address the concerns outlined above, we relied on other tools such as reporters of AiP mitogens (wg-lacZ & dpp-lacZ) to assay whether NiA participate in AiP. As a functional readout, we also paired P35 expression with the EdU assay to test whether proliferation was altered by the presence of undead cells. The results discussed in Figure 4 lead us to conclude that NiA likely do not participate in the canonical AiP feedforward loop, although it is possible that these experiments generate another type of undead cell – one that utilizes a different mechanism to promote proliferation.

      It is unclear if the authors' model is that the NiCP cells lead to autonomous or non-autonomous cell proliferation, or both. Could the lineage-tracing experiments and/or the experiments marking mitosis relative to caspase activity answer this question?

      We have added further details to the discussion on the potential for NiA/NiCP to induce cell autonomous/non-autonomous proliferation.

      Many of the conclusions rely on single images. Quantification of many samples should be included wherever possible.

      We have added quantification to strengthen the results of Figures 2, 3 and 5.

      Why does the reduction of Dronc appear to affect regenerative growth in females but not males?

      We have repeated this regeneration scoring experiment and have increased the N for control versus droncI29 mutant males; however, the results of the analysis for male wing size remain not significant, although the general trend is that droncI29 wings are slightly smaller. While there could be sex-specific differences in the capacity to regenerate that contribute to this observation, it is unclear what the underlying mechanism could be.

      Reviewer #1 (Recommendations for the authors):

      The work in this paper is already very complete and very well worked out. The conclusions are well supported by the data in this manuscript. I do not have any experimental requests, only a few minor and formal requests/questions.

      (1) Why does Diap1 overexpression not affect regenerative proliferation, whereas mir(RHG) and dronc[I29] do, given that Diap1 acts between RHG and Dronc?

      We speculate on this point in the discussion section but have adjusted some of the phrasing for clarity.

      (2) I assume that the authors used the cleaved Dcp-1 antibody from Cell Signaling Technologies. I recommend that the authors refer to this antibody as cDcp-1 in text and figures as this antibody specifically detects the cleaved, and thus activated form of Dcp-1, and not the uncleaved, inactive form of Dcp-1 which has a uniform expression in the discs.

      Changed to cDcp-1.

      (3) Line 299: Hay et al. 1994 did not show that p35 inhibits Drice and Dcp-1 (in fact, both genes were not even cloned yet). This was shown by Meier et al. 2000 and Hawkins et al. 2000. Please correct references.

      Corrected.

      (4) Line 574/575. Meier et al. 2000 did not show that Dronc is mono-ubiquitylated. This was shown by Kamber-Kaya et al., 2017. Please correct.

      Corrected.

      Reviewer #2 (Recommendations for the authors):

      (1) Does domeless knockdown cause apoptosis without tissue ablation (Figures 2C-E)? Currently, the non-ablation control is not shown.

      Domeless knockdown does not cause apoptosis in the absence of ablation (Added Figure 2 – Figure supplement 1A).

      (2) The supplemental experiment with zfh2-RNAi is hard to interpret because there is no evidence of RNAi knockdown based on the staining with the anti-Zfh2 antibody.

      As noted above, a longer zfh-2 knockdown does not appear to alter Zfh-2 protein levels. A quantification of posterior NiA/NiCP following knockdown shows a slight (non-significant) increase in posterior NiA/NiCP. Considering these new results, we have altered our interpretation within the appropriate results and discussion sections.

      (3) The authors should consider adding a diagram showing where mir(RHG) and DIAP1 are in the apoptotic/caspase activation pathway (Figure 7N).

      Completed, Figure 7N and 7O.

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 2 I -The purported increase in NiA should be quantitated relative to the NiA in G across many discs.

      Completed (Figure 2L)

      (2) Figure 2 M - contrary to the conclusion drawn, the posterior Dcp1 does not appear different from that in the control (K). This conclusion that the NiA does not occur in the margin could be better supported with more images/quantification.

      We have exchanged the image for a representative one that more clearly shows the lack of margin NiA and highlighted with an arrowhead (Figure 2K)

      (3) Figure 2 supp 1 E - the "slight increase" in NiA in the pouch is relative to which control? Can this conclusion be supported by quantification?

      Figure 2L now quantifies this change.

      (4) Figure 2 Supp 1 D, E - these discs supposedly have Zfh2 RNAi expressed, but there appears to be no reduction in Zfh2.

      We were unable to demonstrate a reduction of Zfh2, even with a longer knockdown. Considering these new data, we have altered our conclusions from the Zfh2 experiments.

      (5) Figure 2 Supp 1 I - please quantitate the Dcp-1 across many discs to support the conclusion.

      This is the UAS-wg experiment, which we decided to remove from the quantification given the non-specific increase in cDcp-1 throughout the disc (likely as a result of ectopic Wg expression).

      (6) Figure 4 legend M - The authors conclude that the experiment indicates that "NiA promote proliferation independent of AiP". It would be more precise to say that NiA cells do not secrete AiP mitogens and do not increase the proliferation of surrounding cells when prevented from completing apoptosis. To say that the NiA-induced proliferation does not require AiP would require eliminating AiP, perhaps through reaper hid grim knockdown or mitogen knockdown.

      Corrected.

      Minor concerns and clarification needed:

      (7) Line 61 - consider the distinction between a feed-forward loop and a positive feedback loop.

      Corrected.

      (8) Line 338 - it would be helpful to have a brief explanation of what the GC3Ai consists of and how it reports caspase activity.

      Corrected.

      (9) Line 343 - the authors should clarify by what they mean when they state GC3Ai-positive cells are "associated with" mitotic cells. Are the GC3Ai cells undergoing mitosis? Or is the increase in mitosis non-autonomous?

      Adjusted. “associated with adjacent proliferative cells”.

      (10) Lines 392-394 - the authors should add brief descriptions of how the Drice-Based sensor and the CasExpress function, so the readers can better understand the distinctions between these sensors and the previously mentioned sensors (anti-Dcp1 and GC3Ai). In addition, please clarify how the Gal80ts modulates the sensitivity of the CasExpress.

      Descriptions of DBS and CasExpress and additional clarification provided.

      (11) Line 413: How does Gal80ts suppress the background developmental caspase signal, and how does this suppression lead to NiCP cells expressing GFP?

      This section has been reworded to clarify.

      (12) Line 417 - which GFP label is referred to here?

      This section has been reworded to clarify.

      (13) Line 445 is the first mention of the CARD domain - it could be introduced more fully and explained why the DroncDN's lack of effect on proliferation excludes the CARD domain as being important.

      Clarified. See also the discussion for the significance of the CARD domain as dispensable for regenerative proliferation following necrosis.

      (14) Line 452 - "As mentioned" - the manuscript has not previously mentioned DIAP1 modification of the CARD domain and what that modification does. Perhaps the previous explanatory text was inadvertently removed?

      Corrected.

      (15) The Discussion is a lengthy list of experiments that the authors did not do or observations they were unable to make. This section could benefit from a more in-depth discussion of necrosis and the possibility that NiCP cells contribute to repair after injury across contexts and species.

      We have made several changes to the discussion that elaborate on some of the points listed in the public reviews.

      (16) All figures: Consider making single-channel panels grayscale to aid visualization. Also consider using color combinations that can be distinguished by color-blind readers.

      We appreciate these suggestions and will consider them for future manuscripts.

      (17) All figure legends - are error bars SD or SEM?

      Standard deviation. Added to appropriate legends.

      (18) Figure 1A,C - it would be helpful in the diagrams to note when the necrosis occurs/completes.

      The endpoint of necrosis is not well defined, given the simultaneous changes that occur with regeneration. Thus, we opted to not include an indicator of when necrotic ablation ends.

      (19) Figure 1B - it would be helpful to name the GAL4 drivers whose expression domain is depicted to correlate with the terms used in the text.

      Completed.

      (20) Figure 1 legend- what do the different colors of the arrowheads denote? The dotted lines are in R' and S', not N' and O'.

      Completed.

      (21) Figure 2G - the yellow dashed line is not in the same place in the two images.

      Corrected.

      (22) Figure 2I - what is the open arrowhead?

      Completed (Figure 2I legend).

      (23) Figure 3 legend - please describe what the time course is observing (EdU).

      Completed.

      (24) Figure 4 - please include the yellow boxes in the Dcp-1 channels.

      Completed.

      (25) Figure 5 F' - add the arrowheads to all the panels. The yellow arrowhead appears to be pointing to nothing.

      Completed.

      (27) Figure 5 legend - what is a "cytoplasmic undisturbed cell"? What is the arrowhead in G? J and J' should show the same view at different time points or different views at the same time point.

      Figure legend has been corrected.

      (28) Figure 5 Supp 1 would be especially helped by having more single-channel panels in grayscale.

      For clarity and consistency, we chose to maintain the different color channels.

      (29) Figure 5 Supp 1 D and E - It would be helpful to have higher magnification and arrows pointing to the cells of interest. Why are there TUNEL+ cells that do not have caspase activation (green)?

      We have added arrowheads as suggested. We believe the disparity between the TUNEL and GC3Ai signals is a result of the different sensitivities of the IF staining and the TUNEL assay.

      (30) Figure 5 Supp 1 F - perhaps the arrowheads should be in all panels - they point to empty spaces with no H2Av staining in the final panel. Perhaps a higher magnification image would make the "strong overlap" of the two signals more apparent?

      We have added arrowheads where appropriate.

      (31) Figure 6 D-E - does the widespread GFP lineage tracing signal suggest that most cells in the repaired tissue originated from cells that once had caspases activity?

      Possibly; however, given that CasExpress leads to significant developmental labeling, we were unable to determine to what extent the signal in this experiment comes from NiA/NiCP activity versus developmental labeling. Note that tubGAL80ts is not present in this experiment.

      (32) Writing corrections:

      Line 343 "positive" is misspelled.

      Completed

      Line 429 - a word may be missing.

      Completed

      Line 639 - the word "day" may be missing.

      Completed

      Line 658 - what temperature was the recovery?

      Completed

      Lines 706-708 - were the discs incubated in 55 mL and 65 mL of liquid, or a smaller volume?

      Completed

    1. eLife Assessment

      This valuable study describes a software package in R for visualizing metabolite ratio pairs. The evidence supporting the claims of the authors is solid and broadly supports the authors' conclusions. This work would be of interest to the mass spectrometry community.

    2. Reviewer #2 (Public review):

      Summary:

      In the article, the authors describe their software package in R for visualizing metabolite ratio pairs. I think the work would be of interest to the mass spectrometry community.

      Strengths:

      The authors describe a software that would be of use to those performing MALDI MSI. This software would certainly add to the understanding of metabolomics data and enhance the identification of critical metabolites.

      Weaknesses:

      The figures are difficult to interpret/ analyze in their current state but are significantly better in the revision.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Cheng et al explore the utility of analyte ratios instead of relative abundance alone for biological interpretation of tissue in a MALDI MSI workflow. Utilizing the ratio of metabolites and lipids that have complementary value in metabolic pathways, they show the ratio as a heat map which enhances the understanding of how multiple analytes relate to each other spatially. Normally, this is done by projecting each analyte as a unique color but using a ratio can help clarify visualization and add to biological interpretability. However, existing tools to perform this task are available in open-source repositories, and fundamental limitations inherent to MALDI MSI need to be made clear to the reader. The study lacks rigor and controls, i.e. without quantitative data from a variety of standards (internal isotopic or tissue mimetic models for example), the potential delta in ionization efficiencies of different species subtracts from the utility of pathway analysis using metabolite ratios.

      We thank the reviewer for comments on the availability of four other commercial and open-source tools for performing ratio imaging: ENVI® Geospatial Analysis Software, MATLAB image processing toolbox, Spectral Python (SPy) and QGIS. We now highlight these in the introduction (page 3, line 80-86). However, in contrast to these targeted ratio imaging methods, our approach uniquely enables the untargeted discovery of correlated (or anti-correlated) ratios of molecular features, whether the species are structurally known or unknown.

      ENVI® Geospatial Analysis Software and the MATLAB image processing toolbox for hyperspectral imaging are both paid programs, limiting free access and software evaluation for the potential application of untargeted ratiometric imaging. We were able to evaluate the application of MATLAB RatioImage because Weill Cornell Medicine has an institutional subscription for MathWorks MATLAB. Notably, MATLAB RatioImage computes and displays an individual intensity-modulated ratiometric image by choosing a numerator and a denominator image. This software tool only images the ratios of selected metabolites from an input list of multiple species and does not allow for untargeted ratiometric imaging of all metabolite pairs.

      While Spectral Python (SPy) and QGIS are both freely-available software packages, and both can perform individual metabolite ratio images, neither allows for untargeted ratiometric imaging of all pairs from a multiple metabolite input list. Table S1 (below) provides a comparison of the ratio imaging tool that we offer in comparison with other previously available tools.

      We appreciate the reviewer’s insightful comments on differential ionization efficiency among metabolites and the importance of using stable isotope internal standards to obtain absolute quantification.

      A fundamental advantage of our ratiometric imaging tool is to provide better image contrast for tissue regions with differential ionization efficiency, with the potential to discover new “metabolic” regions that can be revealed by metabolite ratio. Note that comparison of ratio image abundance is limited to equivalent regions across tissue groups, which are expected to have similar ionization efficiency for given metabolites. Furthermore, the power of our strategy is to provide untargeted (and targeted) ratio imaging as a hypothesis generation tool, and this use does not require absolute quantification. If cost were not an issue, an extensive group of stable isotope standards could theoretically be used for absolute quantification of target metabolites with known identity.

      Using the tissue mimetic model, we generated calibration curves for stable isotope standards spiked into carboxymethylcellulose (CMC)-embedded brain homogenate cryosections and quantified the concentrations of brain glucose, lactate and ascorbate. Ratio images obtained from abundance data for these metabolites are similar to those obtained from quantified concentration data (Fig S3). While stable isotope standards are often used to obtain quantitative concentrations of metabolites/lipids of interest, this is not applicable to untargeted metabolite ratios that include an assessment of structurally undefined species. Nevertheless, our data indicate that absolute quantification is not necessary for the targeted and untargeted ratio imaging described here (Page 6, line 196-205).

      Reviewer #2 (Public Review):

      Summary:

      In the article, "Untargeted Pixel-by-Pixel Imaging of Metabolite Ratio Pairs as a Novel Tool for Biomedical Discovery in Mass Spectrometry Imaging" the authors describe their software package in R for visualizing metabolite ratio pairs. I think the novelty of this manuscript is overstated and there are several notable issues with the figures that prevent detailed assessment but the work would be of interest to the mass spectrometry community.

      Strengths:

      The authors describe a software that would be of use to those performing MALDI MSI. This software would certainly add to the understanding of metabolomics data and enhance the identification of critical metabolites.

      Weaknesses:

      The authors are missing several references and discussion points, particularly about SIMS MSI, where ratio imaging has been previously performed.

      There are several misleading sentences about the novelty of the approach and the limitations of metabolite imaging.

      Several sentences lack rigor and are not quantitative enough.

      The figures are difficult to interpret/ analyze in their current state and lack some critical components, including labels and scale bars.

      We thank the reviewer for very helpful comments. The tone of the manuscript has been adjusted to highlight the real novelty of this method in the ease of computing and application to MS-specific projects (abstract line 26-30). All figures have been updated to include labels and scale bars with improved resolution. References for the use of ratio imaging in SIMS MSI have been added in the introduction (Page 3, line 80-89).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments:

      In the Abstract it is stated that: "the research community lacks a discovery tool that images all metabolite abundance ratio pairs." However, the following tools exist that perform this fundamental task.

      A "pixel by pixel" data frame in .csv form has a very similar data structure to many instruments like satellite imaging or other hyperspectral tools. It is true this does not exist in the MALDI-specific context, but it would not be difficult to perform this task on the following programs. Highlight that the novelty here is not ratios but the ease of computing them and the application in the specific project. Also, describe the available tools and the shortcomings of others that this package addresses. A supplemental table of MSI data analysis tools and the function of each would be a good addition.

      List of tools to perform band ratio computation with minimal modification:

      (1) ENVI IDL: geospatial imaging tool that allows ratio computation between spectral bands.

      (2) MATLAB image processing toolbox for hyperspectral imaging.

      (3) Spectral Python package (SPy).

      (4) QGIS with plugins can be used for hyperspectral image analysis with a ratio between bands.

      We revised the abstract and introduction to include novelty and comparison to other existing methods listed in Table S1.

      "untargeted R package workflow" - If there are functions used outside the SCiLS Lab API client then write it up and include a GitHub link for open access to fit the mission of eLife.

      As shown in Scheme I, we developed two types of code for untargeted ratio imaging. The first type uses the SCiLS Lab API client to extend the function of targeted and untargeted ratio imaging and all related spatial image analysis. This is suitable for SCiLS Lab users. The second type does not require the SCiLS Lab API; it extracts pixel data from an imzML file and then proceeds to targeted and untargeted imaging and analysis. Both codes are now deposited on GitHub with public access (https://github.com/qic2005/Untargeted-massspectrometry-ratio-imaging.git).

      "across cells and tissue subregions" The value in reporting cell type and tissue type-specific differences in any metric is powerful, but not done in this paper. Only whole samples are compared such as "KO vs WT" and the annotations in Figure 3 are not leveraged for increased biological relevance. This paper treats each image as a homogenization experiment in a practical sense beyond just visually inspecting each image. Remove this claim or do the calculations on region/tissue/cell-type specific differences with the appropriate tools to show the data beyond simple heat map images.

      We have deleted the sentence containing "across cells and tissue subregions" from the abstract.

      "enhances spatial image resolution" Clarify. The resolution in MALDI is set by the raster size of the pixels which is an instrument parameter and cannot be changed post-acquisition. Image-specific methods to increase resolution exist, but dividing the value in one peak column by another does not change functional resolution in the context of the instruments here.

      We thank the reviewer for pointing out this typo. We have changed it to "enhances spatial image contrast" in the abstract (line 34).

      "pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue" - Appropriately calibrate the impact of this work and correct this statement to better reflect the capabilities of this approach. Do not oversell the exploration of pathway activity since the raw quantity reported as relative abundance does not provide biologically interpretable pathway information. This is due to unaccounted differences in ionization efficiencies between analytes in a pathway and lack of determination of rate. Without a calibration curve and more techniques on the analytical chemistry side of the project, it is possible a relative abundance of one analyte (like the product of a pathway) could be higher than the relative abundance of another analyte (a precursor), but due to structural differences, the actual quantity of the higher relative abundance species could be significantly different or even lower than its counterpart. Secondly, "functional activity" cannot be assessed in this manner without isotopic labeling or additional techniques. This does not subtract from the overall validity and impact of the work, but highlighting these shortcomings and slight alterations to the claim are important for a multidisciplinary audience.

      Although we show that the abundance ratio yields images similar to those of the concentration ratio for brain metabolites such as lactate, glucose and ascorbate, we agree with the reviewer that the abundance ratio differs numerically from the absolute concentration ratio because of differences in ionization efficiency. We have deleted the sentence "pixel-by-pixel imaging of the ratio of an enzyme's substrate to its derived product offers an opportunity to view the distribution of functional activity for a given metabolic pathway across tissue" from the abstract. We apologize for not explaining this application more clearly. We meant to compare pathway activity among equivalent or similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e. same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head groups are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, line 239-247).
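
      The assumption stated in this response can be written out explicitly. The following is only a sketch under a simplified linear-response model that is not taken from the manuscript: the measured abundance I of each species is assumed to equal its concentration c scaled by an ionization efficiency ε. Under that assumption, the efficiency terms cancel when the same ratio is compared between equivalent regions of two groups (labelled KO and WT here for illustration), even though each ratio on its own is not a concentration ratio.

```latex
% Assumed linear response: measured abundance I = ionization efficiency (epsilon) x concentration (c)
\begin{align}
  R_{\text{obs}} &= \frac{I_A}{I_B} = \frac{\varepsilon_A\, c_A}{\varepsilon_B\, c_B}, \\
  \frac{R_{\text{obs}}^{\text{KO}}}{R_{\text{obs}}^{\text{WT}}}
    &= \frac{(c_A / c_B)^{\text{KO}}}{(c_A / c_B)^{\text{WT}}}
    \quad \text{if } \left(\frac{\varepsilon_A}{\varepsilon_B}\right)^{\text{KO}}
                   = \left(\frac{\varepsilon_A}{\varepsilon_B}\right)^{\text{WT}}.
\end{align}
```

      If the efficiency ratio differs between the regions being compared, the cancellation fails, which is the limitation raised by the reviewer.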

      "We further show that ratio imaging minimizes systematic variations in MSI data by sample handling and instrument drift, improves image resolution, enables anatomical mapping of metabotype heterogeneity, facilitates biomarker discovery, and reveals new spatially resolved tissue regions of interest (ROIs) that are metabolically distinct but otherwise unrecognized."

      Instrument drift is not accounted for by ratios as it impacts the process before ratio computation. "metabotype" - spelling?

      Instrument drift here refers to changes in individual ion abundance during long data acquisition. A ratio may offer a better read-out than individual metabolite abundance alone. However, for data that have already been total ion normalized, ratio data would not differ from non-ratio data in this respect. Therefore, we have deleted instrument drift from the sentence (Page 2, line 33, and Page 3, line 99).

      Metabotype is a term widely used in the metabolomics field. It denotes a group defined by similar metabolic profiles, which are based on combinations of specific metabolites. https://nutritionandmetabolism.biomedcentral.com/articles/10.1186/s12986-020-00499-z
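
      The point about drift can be illustrated with a small numerical sketch. It is written in Python rather than the authors' R, all values are hypothetical, and it assumes that drift acts as a single multiplicative factor per pixel that affects both ions equally. Under that assumption the factor cancels in the pixel-by-pixel ratio, although it distorts each individual ion image.

```python
def ratio_image(numerator, denominator):
    """Pixel-by-pixel ratio of two ion images stored as flat lists."""
    return [n / d for n, d in zip(numerator, denominator)]

# Hypothetical abundances of two ions in four pixels.
ion_a = [200.0, 150.0, 300.0, 120.0]
ion_b = [100.0, 50.0, 150.0, 40.0]

# Hypothetical per-pixel sensitivity loss over a long acquisition,
# assumed to affect both ions in a given pixel equally.
drift = [1.0, 0.75, 0.5, 0.25]
drifted_a = [a * f for a, f in zip(ion_a, drift)]
drifted_b = [b * f for b, f in zip(ion_b, drift)]

true_ratio = ratio_image(ion_a, ion_b)
drifted_ratio = ratio_image(drifted_a, drifted_b)

# Individual images change with drift, the ratio image does not.
print(drifted_a != ion_a, true_ratio == drifted_ratio)
```

      Total ion normalization also rescales every ion in a pixel by one common factor, so by the same argument it leaves the ratio unchanged.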

      Results 3: Justify the claim that the ratio reduces artifacts. A ratio is the value from one m/z area over another and would seem that the quality of the ratio would be always lower than the individually higher quality pixel signal of the two analytes that compose a ratio.

      Ratio images are indeed heatmaps of pixel-by-pixel ratio data, set by the scale of all ratio values. For very abundant ion pairs, the individual images may not be better than the ratio image, depending on how abundance changes among pixels within tissue sections. Similarly, the quality of the ratio image may not be higher than that of the individual images if the distribution of ratios does not change much among pixels in tissue sections. For example, the metabolites and lipids in Figures 2 and 5 are abundant, but the non-ratio images do not have better quality than the ratio images. Furthermore, a ratio image provides additional information on how the ratio of a metabolite pair changes pixel by pixel in all tissue sections; such additional information could be useful for data interpretation.

      Results 4: The metabolite pairs are biologically sensible but should be clearly stated that they do not account for differences in ionization efficiency between metabolites and cannot provide quantitative pathway analysis with a high degree of biological confidence.

      We apologize for not explaining this application more clearly. We meant to compare pathway activity among equivalent or similar pixels/regions of tissues from different biological groups, under the assumption that ionization efficiency is identical for equivalent pixels from different tissue sections (i.e. same cell type and microenvironment), especially for metabolites with similar functional structure in the same pathway. For example, fatty acids with different chain lengths and phospholipids with the same head groups are expected to have similar ionization efficiency in the same tissue pixel/region. We have therefore rewritten this section (Page 7, line 239-247, 254-255).

      Results 4: "cell-type specific metabolic activity at cellular (10 µm) spatial resolution" Prove the cell type differences with IHC coregistration or MALDI IHC if you want to make claims about them. Just visually determining a tissue type of a scan of a slide is inadequate to support this claim.

      We agree with the reviewer’s comments. We meant to provide additional information on cellular-level metabolic activity, such as adenosine nucleotide phosphorylation status (ATP/AMP ratio), at 10 µm resolution. Hippocampal neurons provide a good example of this utility. We have rewritten the claim to highlight the role of ratio imaging in providing additional metabolic information (Page 8, line 288-290).

      Minor Comments:

      Table 2 "Aspartiate" spelling

      We have corrected it.

      Describe the process and mathematical background for ratio computation in the Methods section. As this paper introduces a package, describing its underlying functions has value.

      We have added R-script comments to illustrate the untargeted ratio calculation, which uses R's mathematical functions for combination and division to compute the ratio for every pair of metabolites in a data matrix (Page 4, line 139-141).
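
      For readers who want a concrete picture of "combination and division", the calculation might look roughly like the sketch below. It is written in Python, not taken from the authors' R scripts, and the function name, the feature labels and the abundances are all hypothetical. It assumes that missing values have already been replaced, so no denominator is zero.

```python
from itertools import combinations

def all_pair_ratios(pixel_table):
    """Return {(numerator, denominator): [ratio per pixel]} for every
    unordered pair of features.

    pixel_table maps a feature name to its list of per-pixel abundances;
    all lists share the same pixel order and contain no zeros.
    """
    ratios = {}
    for num, den in combinations(sorted(pixel_table), 2):
        ratios[(num, den)] = [
            a / b for a, b in zip(pixel_table[num], pixel_table[den])
        ]
    return ratios

# Hypothetical data: three features measured in three pixels.
pixels = {
    "mz_A": [4.0, 6.0, 8.0],
    "mz_B": [2.0, 3.0, 2.0],
    "mz_C": [1.0, 2.0, 4.0],
}
pair_ratios = all_pair_ratios(pixels)
print(len(pair_ratios))  # number of unordered pairs, n * (n - 1) / 2
```

      With n features this yields n(n-1)/2 ratio images, so the output grows quadratically with the length of the feature list; the reciprocal of each ratio carries the same information and is not computed separately here.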

      "we annotate missing values with 1/5 the minimum value quantified in all pixels in which it was detected" This is explicit (ie only values with exactly 1/5 the value are annotated" - make it clear this is a threshold.

      We apologize for the misunderstanding. Missing values either have no value or have an abundance of exactly zero. We first calculate the minimum abundance of a particular m/z among all pixels with detectable abundance (i.e. excluding missing values), then use 1/5 of this minimum value to replace the missing values (Page 4, line 133-139).
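
      The imputation rule described in this response could be sketched as follows. This is an illustrative Python version, not the authors' R code; the function name and the example values are hypothetical, and missing values are represented here as either None or zero.

```python
def impute_missing(abundances, divisor=5):
    """Replace missing (None) or zero abundances of one m/z feature with
    the minimum detected abundance divided by `divisor`."""
    detected = [x for x in abundances if x is not None and x > 0]
    if not detected:
        raise ValueError("feature was not detected in any pixel")
    fill_value = min(detected) / divisor
    return [
        x if (x is not None and x > 0) else fill_value
        for x in abundances
    ]

# Hypothetical abundances of one feature across five pixels.
feature = [50.0, 0.0, 20.0, None, 35.0]
imputed = impute_missing(feature)
print(imputed)
```

      Filling with a small positive value rather than zero keeps every pixel usable as a denominator in the subsequent ratio calculation.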

      Figure 1: legend scils is branded SCiLS and EXCEL does not need caps lock (Excel).

      Figure 1 legend has been corrected.

      Conflicts of interest "None" - there are Bruker employees on a paper about MALDI method development in a field they dominate.

      We added Joshua Fischer as a Bruker employee.

      Figure 3: The legend does not describe the purple arrow in J.

      Purple arrow description is added to figure legend.

      Figure 5: Fix orientation inconsistencies in G, H, I, and J. Especially in J - they are opposite directions. This is arbitrary and determined in SCiLS lab with simple rotation.

      Orientation has been made consistent in G, H, I and J.

      Figure S8: Provide exact number of biological and technical replicates used to generate this figure.

      Figure S8, now Figure S9, was generated from 4 biological replicates of KO and 4 biological replicates of WT brain section in the ROI7 region. This information has been added to the figure legend.

      Figure S9: Make consistent orientation of all brains

      We have made brain orientations consistent.

      In addition to ionization efficiencies impacting the value of the numeric relative abundance where ratio computation originates from, it should be mentioned how different classes of metabolites are differentially impacted by the euthanasia and collection methods used for various tissue types. For example, it is well established the ATP/AMP ratio can change drastically from tissue collection.

      We have added this to page 8, line 315-319.

      Perform standards to adjust for ionization efficiency between different m/z features.

      Untargeted ratio imaging serves as an add-on MSI data analysis tool whose primary use is to compare ratios among equivalent regions/pixels with similar ionization efficiencies. It is a hypothesis generation tool. Using standards to adjust for ionization efficiency would be a great idea for a more accurate assessment of ratio values. Because of the cost and limited availability of stable isotope standards for different m/z features, we chose the example brain metabolites glucose, lactate and ascorbate to show that abundance ratios and concentration ratios result in similar images (page 6, line 196-205).

      Add more controls to support the claims.

      We have 4 biological replicates for each genotype of brain. We have added the number of controls in all figure legends.

      Significantly tone down the claims, it is unclear how knowledgeable the authors are about the current literature of SW regarding MALDI.

      The tone has been significantly toned down throughout the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Abstract:

      "relative abundance of structurally identified and yet-undefined metabolites across tissue cryosections" is misleading, since tandem MS can be performed in an imaging context and is often also compatible with the same instrument.

      We have deleted this sentence in the abstract.

      Intro:

      Paragraph 1: The authors mention MALDI and DESI, but I would argue that SIMS is more abundantly used than DESI within single-cell applications.

      We have added SIMS to the introduction Page 3, line 67.

      Paragraph 2: While it may not be all detected pairs, there are many examples of ratio imaging in the MALDI MSI and SIMS communities, particularly for bacterial signaling. These would be important examples to reference.

      We have added the application of SIMS ratio imaging to the introduction, page 3, line 74-75.

      Materials :

      Paragraph 1: More specificity on sample size is required. 3 or 4 per group is not specific. Which has four and which has three? Why are they different?

      We have corrected the sample numbers for each genotype in the text and figure legends. The number of sections per group differs because of the availability of fresh-frozen tissues (Page 4, line 115-117).

      Results:

      Paragraph 1: Am I correct in reading that an .imzml can't be used directly? Why not?

      Imaging Mass Spectrometry Markup Language (imzML) is a common data format for mass spectrometry imaging. It was developed to allow the flexible and efficient exchange of large MS imaging data between different instruments and data analysis software (Schramm et al, 2012). It contains two sets of data: the mass spectral data, which are stored in a binary file (.ibd file) to ensure efficient storage, and the XML metadata (.imzML file), which stores instrumental parameters and sample details. Therefore, it can’t be used directly. We have added this to Results section 1 (Page 5, line 160-169).

      Paragraph 4: "Additionally, nonlipid small molecule metabolites suffer from smearing and/or diffusion during cryosection processing, including over the course of matrix deposition for MALDI-MSI." This is misleading. There are several examples of MALDI MSI of small metabolites that are nonlipids, where smearing or diffusion have not occurred. It would be beneficial to have a more accurate discussion of this instead. The authors should also provide some evidence of this, since they continue to focus on it for the full paragraph and don't provide references.

      We initially meant that the poor image quality of small-molecule metabolites is due to their interaction with the aqueous phase of the spraying solution, rapid degradation rate and matrix interference. We have deleted this sentence in the revised version.

      Section 5 Paragraph 2; "However, ratio imaging revealed a much greater aspartate to glutamate ratio in an unusual "moon arc" region across the amygdala and hypothalamus relative to the rest of the coronal brain." Much greater isn't scientifically accurate or descript. Use real numbers and be quantitative.

      We used pixel data from all 8 sections to obtain quantitative changes in the ratio-generated “moon arc” region compared to the rest of the coronal brain (page 8, line 331-337). Ratio imaging revealed an average 1.59-fold increase in the aspartate to glutamate ratio in an unusual “moon arc” region across the amygdala and hypothalamus (mean abundance 0.563 in 6345 pixels) relative to the rest of the coronal brain (mean abundance 0.353 in 45742 pixels, Figure 5D). Similar but different arc-like structures are encompassed within the ventral thalamus and hypothalamus, wherein the glutamate to glutamine ratio shows a 1.63-fold increase in intensity compared to the rest of the brain (mean abundance of 0.695 in 7108 pixels vs 0.428 in 44979 pixels, Figure 5E).
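
      The fold changes quoted in this response can be re-derived from the quoted means as a quick consistency check. The sketch below is in Python and uses only the numbers given in the response; the dictionary layout and variable names are illustrative.

```python
# Mean ratio values and pixel counts as quoted in the response.
regions = {
    "aspartate/glutamate": {
        "arc_mean": 0.563, "arc_pixels": 6345,
        "rest_mean": 0.353, "rest_pixels": 45742,
    },
    "glutamate/glutamine": {
        "arc_mean": 0.695, "arc_pixels": 7108,
        "rest_mean": 0.428, "rest_pixels": 44979,
    },
}

def fold_change(region):
    """Mean ratio inside the arc divided by the mean ratio elsewhere."""
    return region["arc_mean"] / region["rest_mean"]

fold_changes = {name: fold_change(r) for name, r in regions.items()}
total_pixels = {
    name: r["arc_pixels"] + r["rest_pixels"] for name, r in regions.items()
}

for name in regions:
    print(name, round(fold_changes[name], 2), total_pixels[name])
```

      Recomputed from the rounded means, the first ratio reproduces the reported 1.59, and the second comes out near 1.62, which agrees with the reported 1.63 to within the rounding of the quoted means. Both comparisons also sum to the same total of 52087 pixels.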

      Section 8 Paragraph 2: "UMAPing" is not scientifically written.

      We have replaced UMAPing with UMAP.

      Figure 2 is difficult to interpret, given the small sizes of the images. Align the images, reduce the white space, clearly label the different tissues, add scale bars, increase size, etc. This applies to all figures, except for 3. This will make it possible to review.

      All figures have been resized by removing extra space between sections.

      Figure 3. There seems to be a change in tissue after section I, so a different diagram would be helpful. SCD has a high abundance in an area that seems to be off of the tissue. Can the authors explain this? Some of the images also appear to be low signal-to-noise. Example spectra in the SI would be helpful, so I can more accurately judge the quality of the data.

      We apologize for the discrepancy. All images are from the same sample. We initially cropped each individual image from a multi-page PDF plot, then inserted it in Figure 3. Inconsistent resizing and cropping may have led to the small differences in image size. In the revised version, we plot all images on one page, which eliminates the inconsistency.

      Figure 3 example pixel data, ratio pixel data, mass spectra and ratio images can be downloaded below:

      https://wcm.box.com/s/2d5jch45ar8upjzytljnylt6doewcsqc

    1. eLife Assessment

      The study by Chen and Phillips provides evidence for a dynamic switch in the small RNA repertoire of the Argonaute protein NRDE-3 during embryogenesis in C. elegans. The work is supported by convincing experimental data, shedding light on RNA regulation during development. While the functional relevance of this process warrants further investigation, this study provides valuable insights into small RNA pathways with broader implications for developmental biology and gene regulation in other systems.

    2. Reviewer #1 (Public review):

      Summary:

      Chen and Phillips describe the dynamic appearance of cytoplasmic granules during embryogenesis analogous to SIMR germ granules, and distinct from CSR-1-containing granules, in the C. elegans germline. They show that the nuclear Argonaute NRDE-3, when mutated to abrogate small RNA binding, or in specific genetic mutants, partially colocalizes to these granules along with other RNAi factors, such as SIMR-1, ENRI-2, RDE-3, and RRF-1. Furthermore, NRDE-3 RIP-seq analysis in early vs. late embryos is used to conclude that NRDE-3 binds CSR-1-dependent 22G RNAs in early embryos and ERGO-1-dependent 22G RNAs in late embryos. These data lead to their model that NRDE-3 undergoes small RNA substrate "switching" that occurs in these embryonic SIMR granules and functions to silence two distinct sets of target transcripts - maternal, CSR-1 targeted mRNAs in early embryos and duplicated genes and repeat elements in late embryos.

      Strengths:

      The identification and function of small RNA-related granules during embryogenesis is a poorly understood area and this study will provide the impetus for future studies on the identification and potential functional compartmentalization of small RNA pathways and machinery during embryogenesis.

      Weaknesses:

      (1) As the authors acknowledge, their finding that loss of SIMR granules has no significant impact on NRDE-3 small RNA loading weakens the case for the functional relevance of these structures. However, this point is clearly discussed and, as they note in their Discussion, it is entirely possible that these embryonic granules may be "incidental condensates."

    3. Reviewer #2 (Public review):

      Summary:

      NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.

      The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents lots of data that beautifully documents the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.

    4. Reviewer #3 (Public review):

      Summary:

      Chen and Phillips present intriguing work that extends our view on the C. elegans small RNA network significantly. While the precise findings are rather C. elegans specific there are also messages for the broader field, most notably the switching of small RNA populations bound to an argonaute, and RNA granules behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.

      Strengths:

      Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.

      Weaknesses:

      The switch between maternal and zygotic NRDE-3 remains unaddressed.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Chen and Phillips describe the dynamic appearance of cytoplasmic granules during embryogenesis analogous to SIMR germ granules, and distinct from CSR-1-containing granules, in the C. elegans germline. They show that the nuclear Argonaute NRDE-3, when mutated to abrogate small RNA binding, or in specific genetic mutants, partially colocalizes to these granules along with other RNAi factors, such as SIMR-1, ENRI-2, RDE-3, and RRF-1. Furthermore, NRDE-3 RIP-seq analysis in early vs. late embryos is used to conclude that NRDE-3 binds CSR-1-dependent 22G RNAs in early embryos and ERGO-1-dependent 22G RNAs in late embryos. These data lead to their model that NRDE-3 undergoes small RNA substrate "switching" that occurs in these embryonic SIMR granules and functions to silence two distinct sets of target transcripts - maternal, CSR-1 targeted mRNAs in early embryos and duplicated genes and repeat elements in late embryos.

      Strengths:

      The identification and function of small RNA-related granules during embryogenesis is a poorly understood area and this study will provide the impetus for future studies on the identification and potential functional compartmentalization of small RNA pathways and machinery during embryogenesis.

      Weaknesses:

      (1) While the authors acknowledge the following issue, their finding that loss of SIMR granules has no apparent impact on NRDE-3 small RNA loading puts the functional relevance of these structures into question. As they note in their Discussion, it is entirely possible that these embryonic granules may be "incidental condensates." It would be very welcomed if the authors could include some evidence that these SIMR granules have some function; for example, does the loss of these SIMR granules have an effect on CSR-1 targets in early embryos and ERGO-1-dependent targets in late embryos?

      We appreciate reviewer 1’s concern that we do not provide enough evidence for the function of the SIMR granules. As suggested, we examined the NRDE-3 bound small RNAs more deeply, and we do observe a slight but significant increase in CSR-class 22G-RNA binding to NRDE-3 in late embryos of simr-1 and enri-2 mutants (see below, right). We hypothesize that this result could be due to a slower switch from CSR to ERGO 22G-RNAs in the absence of SIMR granules. We added these data to Figure 6G.

      (2) The analysis of small RNA class "switching" requires some clarification. The authors re-define ERGO-1-dependent targets in this study to arrive at a very limited set of genes and their justification for doing this is not convincing. What happens if the published set of ERGO-1 targets is used?

      As we mentioned in the manuscript, we initially attempted to use the previously defined ERGO targets. However, the major concern is that fewer than half the genes classified as ERGO targets by Manage et al. and Fischer et al. overlap with one another (Figure 6—figure supplement 1D and below). We reason this might be because the gene sets were defined as genes that lose small RNAs in various ERGO pathway mutants and because different criteria were used to define the lists, as discussed in the manuscript (lines 471-476). As a result, some of the previously defined ERGO target genes may actually be indirect targets of the pathway. Here we focus on genes targeted by small RNAs enriched in an ERGO pathway Argonaute IP, which should be more specific.

      In this manuscript, we are interested specifically in the ERGO targets bound by NRDE-3, thus we utilized the IP-small RNA sequencing data from young adult animals (Seroussi et al, 2023) to define a new ERGO list. We are confident about this list because 1) most of our new ERGO genes fall within the overlap between the ERGO-Manage and ERGO-Fischer lists (see Figure 6—figure supplement 1D in our manuscript and below), and 2) we observed the most significant decrease of small RNA levels and increase of mRNA levels in the nrde-3 mutants using our newly defined list (see Figure 6—figure supplement 1E-F in our manuscript).

      To further address reviewer 1’s concern about whether the data would look significantly different when using the ERGO-Manage and ERGO-Fischer lists, we made new scatter plots, shown in Author response image 1 panels A-C below (ERGO-Manage: purple; ERGO-Fischer: yellow; overlap: yellow with purple ring). We found that the small RNA switching pattern of NRDE-3 is consistent with our newly defined list, particularly if we look at the overlap of the ERGO-Manage and ERGO-Fischer lists (Author response image 1 panels D-F below, red).

      Author response image 1.

      Further, the NRDE-3 RIP-seq data is used to conclude that NRDE-3 predominantly binds CSR-1 class 22G RNAs in early embryos, while ERGO-1-dependent 22G RNAs are enriched in late embryos. a) The relative ratios of each class of small RNAs are given in terms of unique targets. What is the total abundance of sequenced reads of each class in the NRDE-3 IPs? 

      To address the reviewer’s question about the total abundance of sequenced reads of each class in the NRDE-3 IPs: Author response image 2 panels A-B below show the total RPM of CSR and ERGO class sRNAs in inputs and IPs at different stages. Focusing on late embryos, the total abundance of ERGO-dependent sRNAs is similar to that of CSR-class sRNAs in input, while much higher in IP, indicating an enrichment of ERGO-dependent 22G-RNAs in NRDE-3 consistent with our log2FC (IP vs input) in Figure 6B. These data support our conclusion that NRDE-3 preferentially binds to ERGO targets in late embryos.
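
      For readers unfamiliar with the two quantities compared here, reads per million (RPM) scales a class's read count by library size, and the log2 fold change compares IP to input. The sketch below illustrates both under stated assumptions: the read counts are hypothetical, and the pseudocount of 1 is a common convention rather than a parameter reported by the authors.

```python
import math


def rpm(class_reads: int, total_mapped_reads: int) -> float:
    """Reads per million: class read count scaled by library size."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return class_reads / total_mapped_reads * 1_000_000


def log2_fold_change(ip_rpm: float, input_rpm: float, pseudocount: float = 1.0) -> float:
    """log2(IP / input); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((ip_rpm + pseudocount) / (input_rpm + pseudocount))


# Hypothetical counts for one small RNA class, NOT values from the study.
input_rpm = rpm(2_000, 10_000_000)
ip_rpm = rpm(40_000, 10_000_000)
enrichment = log2_fold_change(ip_rpm, input_rpm)

print(f"input RPM = {input_rpm:.1f}, IP RPM = {ip_rpm:.1f}, log2FC = {enrichment:.2f}")
```

      A positive log2 fold change indicates that the class is over-represented in the IP relative to input, which is the sense in which ERGO-dependent 22G-RNAs are described as enriched in late embryos.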

      Author response image 2.

      b) The "switching" model is problematic given that even in late embryos, the majority of 22G RNAs bound by NRDE-3 is the CSR-1 class (Figure 5D). 

      It is important to keep in mind the difference in the total number of CSR target genes (3834) and ERGO target genes (119). The pie charts shown in Figure 6D are looking at the total proportion of the genes enriched in the NRDE-3 IP that are CSR or ERGO targets. For the NRDE-3 IP in late embryos, 70/119 (58.8%) of ERGO targets are enriched, while 172/3834 (4.5%) of CSR targets are enriched. These data are also supported by the RPM graphs shown in Author response image 2 panels A-B above, which show that the majority of the small RNAs bound by NRDE-3 in late embryos are ERGO targets. Nonetheless, NRDE-3 still binds to some CSR targets, as shown in Figure 6D and panel B, which may be because the amount of CSR-class 22G-RNAs is reduced gradually across embryonic development as the maternally-deposited NRDE-3 loaded with CSR-class 22G-RNAs is diluted by newly transcribed NRDE-3 loaded with ERGO-dependent 22G-RNAs (lines 857-862).
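
      The two ways of counting in this exchange can be checked with a few lines of arithmetic. The sketch below uses only the gene counts quoted above; the helper name is illustrative, and the final calculation assumes, for illustration only, that CSR and ERGO targets are the only classes among the enriched genes.

```python
def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts quoted in the response for the late-embryo NRDE-3 IP.
ergo_enriched, ergo_total = 70, 119
csr_enriched, csr_total = 172, 3834

# Fraction of each class that is enriched (the authors' framing).
ergo_fraction = percent(ergo_enriched, ergo_total)
csr_fraction = percent(csr_enriched, csr_total)

# Share of enriched genes that are CSR targets (the reviewer's framing).
# Assumption: only these two classes are counted.
csr_share = percent(csr_enriched, csr_enriched + ergo_enriched)

print(f"ERGO targets enriched: {ergo_fraction:.1f}%")
print(f"CSR targets enriched:  {csr_fraction:.1f}%")
print(f"CSR share of enriched genes: {csr_share:.1f}%")
```

      Under that assumption, CSR targets can make up most of the enriched genes even though only a small fraction of all CSR targets is enriched, simply because the CSR target list is about 32 times larger than the ERGO list.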

      c) A major difference between NRDE-3 small RNA binding in eri-1 and simr-1 mutants appears to be that NRDE-3 robustly binds CSR-1 22G RNAs in eri-1 but not in simr-1 in late embryos. This result should be better discussed.

      In the eri-1 mutant, we hypothesize that NRDE-3 robustly binds CSR-class 22G-RNAs because ERGO-class 22G-RNAs are not synthesized during mid-embryogenesis, so NRDE-3 is either unloaded (in granules at the 100-cell stage in Figure 2A) or mis-loaded with CSR-class 22G-RNAs (in the nucleus at the 100-cell stage in Figure 2A). We don’t have a robust method to address the proportion of loaded vs. unloaded NRDE-3, so it is difficult to address the degree to which NRDE-3 is misloaded in the eri-1 mutant. In the simr-1 mutant, both classes of small RNAs are present and NRDE-3 is still preferentially loaded with ERGO-dependent 22G-RNAs, though we do see a subtle increase in association with CSR-class 22G-RNAs. These data could suggest a less efficient loading of NRDE-3 with ERGO-dependent 22G-RNAs, but we would need more precise methods to address the loading dynamics in the simr-1 mutant.

      (3) Ultimately, if the switching is functionally important, then its impact should be observed in the expression of their targets. RNA-seq or RT-qPCR of select CSR-1 and ERGO-1 targets should be assessed in nrde-3 mutants during early vs late embryogenesis.

      The function of NRDE-3 at ERGO targets has been well studied (Guang et al, 2008) and is also assessed in our H3K9me3 ChIP-seq analysis in Figure 7E where, in mixed staged embryos, H3K9me3 level on ERGO targets (labeled as ‘NRDE-3 targets in young adults’) is reduced significantly in the nrde-3 mutant.

      To understand the function of NRDE-3 binding on CSR targets in early embryos, we attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos, and we either failed to obtain enough signal or failed to detect any significant difference (data not shown). We additionally tested the possibility that NRDE-3 functions with CSR-class 22G-RNAs in oocytes. We present new data showing that NRDE-3 represses RNA Pol II in oocytes to promote global transcriptional repression at the oocyte-to-embryo transition; we have now included these data in Figure 8.

      Reviewer #2 (Public review):

      Summary:

      NRDE-3 is a nuclear WAGO-clade Argonaute that, in somatic cells, binds small RNAs amplified in response to the ERGO-class 26G RNAs that target repetitive sequences. This manuscript reports that, in the germline and early embryos, NRDE-3 interacts with a different set of small RNAs that target mRNAs. This class of small RNAs was previously shown to bind to a different WAGO-clade Argonaute called CSR-1, which is cytoplasmic, unlike nuclear NRDE-3. The switch in NRDE-3 specificity parallels recent findings in Ascaris where the Ascaris NRDE homolog was shown to switch from sRNAs that target repetitive sequences to CSR-class sRNAs that target mRNAs.

      The manuscript also correlates the change in NRDE-3 specificity with the appearance in embryos of cytoplasmic condensates that accumulate SIMR-1, a scaffolding protein that the authors previously implicated in sRNA loading for a different nuclear Argonaute HRDE-1. By analogy, and through a set of correlative evidence, the authors argue that SIMR foci arise in embryogenesis to facilitate the change in NRDE-3 small RNA repertoire. The paper presents lots of data that beautifully documents the appearance and composition of the embryonic SIMR-1 foci, including evidence that a mutated NRDE-3 that cannot bind sRNAs accumulates in SIMR-1 foci in a SIMR-1-dependent fashion.

      Weaknesses:

      The genetic evidence, however, does not support a requirement for SIMR-1 foci: the authors detected no defect in NRDE-3 sRNA loading in simr-1 mutants. Although the authors acknowledge this negative result in the discussion, they still argue for a model (Figure 7) that is not supported by genetic data. My main suggestion is that the authors give equal consideration to other models - see below for specifics.

      We appreciate reviewer 2’s comments on the genetic evidence for the function of SIMR foci. A similar concern was also brought up by reviewer 1. By re-examining our sequencing data, we found that there is a modest but significant increase in NRDE-3 association with CSR-class sRNAs in simr-1 and enri-2 mutants in late embryos. We believe that these data support our model that SIMR-1 and ENRI-2 are required for an efficient switch of NRDE-3 bound small RNAs. Please refer to our response to reviewer 1, point (1), and Figure 6G in the updated manuscript.

      Reviewer #3 (Public review):

      Summary:

      Chen and Phillips present intriguing work that extends our view on the C. elegans small RNA network significantly. While the precise findings are rather C. elegans specific there are also messages for the broader field, most notably the switching of small RNA populations bound to an argonaute, and RNA granules behavior depending on developmental stage. The work also starts to shed more light on the still poorly understood role of the CSR-1 argonaute protein and supports its role in the decay of maternal transcripts. Overall, the work is of excellent quality, and the messages have a significant impact.

      Strengths:

      Compelling evidence for major shift in activities of an argonaute protein during development, and implications for how small RNAs affect early development. Very balanced and thoughtful discussion.

      Weaknesses:

      Claims on colocalization of specific 'granules' are not well supported by quantitative data.

      We have now included zoomed images of individual granules to better show the colocalization in Figure 4 and Figure 4—figure supplement 1, and performed Pearson’s colocalization analysis between different sets of proteins in Figure 4B. 

      Reviewer #2 (Recommendations for the authors):

      - The manuscript is very dense and the gene names are not helpful. For example, the authors mention ERGO-1 without clarifying the type of protein, etc. I suggest the authors include a figure to go with the introduction that describes the different classes of primary and secondary sRNAs, associated Argonautes, and other accessory proteins. Also include a table listing relevant gene names, protein classes, main localizations, and proposed functions for easy reference by the readers.

      We agree that the gene names in different small RNA pathways are easily confused. We added a diagram and table in Figure 1—figure supplement 1 depicting the ERGO/NRDE and CSR pathways and added clarification about the ERGO/NRDE-3 pathway in the text at lines 126-128.

      - Line 424 - the wording here and elsewhere seems to imply that SIMR-1 and ENRI-2, although not essential, contribute to NRDE-3 sRNA loading. The sequencing data, however, do not support this - the authors should be clearer on this. If the authors believe there are subtle but significant differences, they should show them perhaps by adding a panel in Figure 5 that directly compares the NRDE-3 IPs in wildtype versus simr-1 mutants. Figure 5H however does not support such a requirement.

      As brought up by reviewer 1, we do not see a difference in binding of ERGO-dependent sRNAs in the simr-1 mutant in late embryos. We do, however, see a modest, but significant, increase of CSR-sRNAs bound by NRDE-3 in simr-1 and enri-2 mutants, which we hypothesize could be due to a less efficient loading of ERGO-dependent 22G-RNAs by NRDE-3. The updated data are now in Figure 6G. We have also edited the text and model figure to soften these conclusions.

      - Condensates of PGL proteins appear at a similar time and place (somatic cells of early embryos) as the embryonic SIMR-1 foci. The PGL foci correspond to autophagy bodies that degrade PGL proteins. Is it possible that SIMR-1 foci also correspond to degradative structures? The possibility that SIMR-1 foci are targeted for autophagy and not functional would fit with the finding that simr-1 mutants do not affect NRDE-3 loading in embryos.

      We appreciate reviewer 2’s comments on the possibility of SIMR granules acting as sites for degradation of SIMR-1 and NRDE-3. We think this is not the case for the following reasons: 1) if SIMR granules are sites of autophagic degradation, then we would expect that embryonic SIMR granules in somatic cells, like PGL granules, should only be observed in autophagy mutants; however, we see them in wild-type embryos; 2) we would not expect a functional Tudor domain to be required for granule localization; however, in Figure 1—figure supplement 2B, we show that a point mutation in the Tudor domain of SIMR-1 abrogates SIMR granule formation; and 3) if NRDE-3(HK-AA) is recruited to SIMR granules for degradation while wild-type NRDE-3 is cytoplasmic, then NRDE-3(HK-AA) should show a significantly reduced protein level compared to wild-type NRDE-3. In the western blot in Figure 2—figure supplement 1B, NRDE-3 and NRDE-3(HK-AA) protein levels are similar, indicating that NRDE-3(HK-AA) is not degraded despite being unloaded. This is in contrast to what we have observed previously for HRDE-1, which is degraded in its unloaded state. If SIMR-1 played a role directly in promoting degradation of NRDE-3(HK-AA), we would similarly expect to see a change in NRDE-3 or NRDE-3(HK-AA) expression in a simr-1 mutant. We performed a western blot and did not observe a significant change in protein expression for NRDE-3 (Figure 3—figure supplement 1A).

      Although under wild-type conditions SIMR granules do not appear to be sites of autophagic degradation, upon treatment with lgg-1 (an autophagy protein) RNAi, we found that SIMR-1, as well as many other germ granule and embryonic granule-localized proteins, increases in abundance in late embryos. These data demonstrate that ZNFX-1, CSR-1, SIMR-1, MUT-2/RDE-3, RRF-1, and unloaded NRDE-3 are removed by autophagic degradation, similar to what has been shown previously for PGL-1 proteins (Zhang et al, 2009, Cell). We added these data to Figure 5. It is important to emphasize, however, that the timing of degradation differs for each granule assayed (Lines 447-450), indicating that there must be multiple waves of autophagy to selectively degrade subsets of proteins when they are no longer needed by the embryo.

      - The observation that an NRDE-3 mutant that cannot load sRNAs localizes to SIMR-1 foci does not necessarily imply that wild-type unloaded NRDE-3 would also localize there. Unless the authors have additional data to support this idea, the authors should acknowledge that this hypothesis is speculative. In fact, why does cytoplasmic NRDE-3 not localize to granules in the rde-3;ego-1degron strain shown in Figure 6B?? Is it possible that the NRDE-3 mutant accumulates in SIMR-1 foci because it is unfolded and needs to be degraded?

      We believe that wild-type NRDE-3 also localizes to SIMR foci when unloaded. This is supported by the localization of wild-type NRDE-3 in eri-1 and rde-3 mutants, where a subset of small RNAs is depleted. Wild-type NRDE-3 localizes to both somatic SIMR-1 granules and the nucleus, depending on embryo stage (Figure 2A, Figure 2—figure supplement 1C). There are fewer granules in eri-1 and rde-3 mutants than in the nrde-3(HK-AA) mutant, consistent with the imaging data showing that NRDE-3 only partially localizes to somatic granules (Figure 2A – 100-cell stage).

      In the rde-3; ego-1 double mutant, the embryos have severe developmental defects: they cannot divide properly after the 4-8 cell stage and exhibit morphology defects after that stage. In wild-type, SIMR foci do not appear until around the 8-28-cell stage (shown in Figure 1C), so we believe that the failure of cytoplasmic NRDE-3 to localize to foci in the double mutant is a matter of timing.

      - The authors propose that NRDE-3 functions in nuclei to target mRNAs also targeted in the cytoplasm by CSR-1. If so, how do they propose that NRDE-3 might do this since little transcription occurs in oocytes/early embryos?? Are the authors suggesting that NRDE-3 targets germline genes for silencing specifically at the times that zygotic transcription comes back on, or already in maturing oocytes? Is the transcription of most CSR-1 targets silenced in early embryos??

      We appreciate the suggestion to check the function of NRDE-3 in oocytes. We tested this possibility and found it to be correct. NRDE-3 functions in oocytes for transcriptional repression by inhibiting RNA Pol II elongation. We added these data to Figure 8. We also attempted to do RT-qPCR, smFISH, and anti-H3K9me3 CUT&Tag-seq on early embryos to further test the hypothesis that NRDE-3 acts with CSR-class 22G-RNAs in early embryos, but we either failed to obtain enough signal or failed to detect any significant difference (data not shown). Therefore, we think that the primary role for NRDE-3 bound to CSR-class 22G-RNAs may be for global transcriptional repression of oocytes prior to fertilization.

      - Line 684-686: "In summary, this work investigating the role of SIMR granules in embryos, together with our previous study of SIMR foci in the germline (Chen and Phillips 2024), has identified a new mechanism for small RNA loading of nuclear Argonaute proteins in C. elegans". This statement appears overstated/incorrect since there is no evidence that SIMR-1 foci are required for sRNA loading of NRDE-3. The authors should emphasize other models, as suggested above.

      We have revised the text on lines 869-871 to emphasize that SIMR granules regulate the localization of nuclear Argonaute proteins, rather than suggesting a direct role in controlling small RNA loading. We also edited the title, text, and legend for our model in Figure 9.

      Reviewer #3 (Recommendations for the authors):

      Issues to be addressed:

      - The authors show a switch in 22G RNA binding by NRDE-3 during embryogenesis. While the data is convincing, it would be great if it could be tested if the preferred NRDE-3 replacement model is indeed correct. This could be done relatively easily by giving NRDE-3 a Dendra tag, allowing one to colour-switch the maternal WAGO-3 pool before the zygotic pool comes up. Such data would significantly enhance the manuscript, as this would allow the authors to follow the fate of maternal NRDE-3 more precisely, perhaps identifying a period of sharp decline of maternal NRDE-3.

      We think the NRDE-3 Dendra tag experiment suggested by the reviewer is a clever approach and we will consider generating this strain in the future. However, we feel that optimization of the color-switching tag between the maternal germline and the developing embryos is beyond the scope of this manuscript. To partially address the question about NRDE-3 fate during embryogenesis, we examined the single-cell sequencing data of C. elegans embryos from the 1-cell to 16-cell stage (Tintori et al, 2016, Dev Cell; Visualization tool from John I Murray lab). As shown in Author response image 3 Panel A below, the NRDE-3 transcript level increases as the embryo develops, indicating that zygotic NRDE-3 is being actively expressed starting very early in development. We hypothesize that maternal NRDE-3 will either be diluted as the embryo develops or actively degraded during early embryogenesis.

      Author response image 3.

      - Figure 3A: * should mark PGCs, but this seems incorrect. At the 8-cell stage there still is only one PGC (P4), not two, and at 100 cells there are only two, not three germ cells. Also, the identification of PGCs with a marker (PGL for instance) would be much more convincing.

      We apologize for the confusion in Figure 3A. We changed the figure legend to clarify that the * indicates nuclear NRDE-3 localization in somatic cells for 8- and 100-cell stage embryos rather than the germ cells.

      - Overall, the authors should address colocalization more robustly. In the current manuscript, just one image is provided, and often rather zoomed-out. How robust are the claims on colocalization, or lack thereof? With the current data, this cannot be assessed. Pearson correlation, combined with line-scans through a multitude of granules in different embryos will be required to make strong claims on colocalization. This applies to all figures (main and supplement) where claims on different granules are derived from.

      We thank reviewer 3 for this important suggestion. To better address the colocalization, we included insets of individual granules in Figure 2D and Figure 4. We also performed colocalization analysis by calculating the Pearson’s R value between different groups of proteins in Figure 4B, to highlight that SIMR-1 colocalizes with ENRI-2, NRDE-3(HK-AA), RDE-3, and RRF-1, while CSR-1 colocalizes with EGO-1.
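
      For context, Pearson's R in colocalization analysis is the correlation between pixel intensities of two channels over the same pixels. The sketch below shows the calculation on synthetic intensities; it is not the authors' analysis workflow, and the simulated channels, noise level, and function name are assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between paired pixel intensities of two channels."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("channels must have the same length (>= 2 pixels)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a channel with constant intensity has no defined R")
    return cov / math.sqrt(var_x * var_y)


# Synthetic intensities (hypothetical, seeded for reproducibility).
rng = random.Random(0)
channel_a = [rng.random() for _ in range(4096)]
# A channel that tracks channel_a with a little noise: strong colocalization.
overlapping = [0.8 * a + rng.gauss(0, 0.05) for a in channel_a]
# A channel drawn independently: no colocalization expected.
unrelated = [rng.random() for _ in range(4096)]

print(f"R (overlapping channels): {pearson_r(channel_a, overlapping):.2f}")
print(f"R (unrelated channels):   {pearson_r(channel_a, unrelated):.2f}")
```

      Values near 1 indicate that the two signals rise and fall together across pixels, while values near 0 indicate no linear relationship. In practice the pixels are usually restricted to a region of interest before the correlation is computed.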

      For the proteins that lack colocalization in Figure 4—figure supplement 1, we also added insets of individual granules. Additionally, we included a new set of panels showing SIMR-1 localization compared to tubulin::GFP (Figure 4—figure supplement 1I) in response to a recent preprint (Jin et al, 2024, BioRxiv), which finds NRDE-3 (expressed under a mex-5 promoter) associating with pericentrosomal foci and the spindle in early embryos. We do not see SIMR-1 (or NRDE-3, data not shown) at centrosomes or spindles in wild-type conditions but made a similar observation for SIMR-1 in a mut-16 mutant (Figure 4E). All of the localization patterns were examined in at least 5 individual 100-cell stage embryos showing the same localization pattern.

      - Figure 7: Its title is: Function of cytoplasmic granules. This is a much stronger statement than provided in the nicely balanced discussion. The role of the granules remains unclear, and they may well be just a reflection of activity, not a driver. While this is nicely discussed in the text, figure 7 misses this nuance. For instance, the title suggests function, and also the legend uses phrases like 'recruited to granule X'. If granules are the results of activity, 'recruitment' is really not the right way to express the findings. The nuance that is so nicely worded in the discussion should come out fully in this figure and its legend as well.

      We have changed the title of Figure 7 (now Figure 9) to “Model for temporally- and developmentally-regulated NRDE-3 function” to deemphasize the role of the granules and to highlight the different functions of NRDE-3. Similarly, we have rephrased the text in the figure and legend and added some details about our new results.

      Minor:

      Typo: line 663 Acaris

      We corrected the typo.

    1. eLife Assessment

      This study provides valuable information on the single nucleus RNA sequencing transcriptome, pathways, and cell types in pig skeletal muscle in response to conjugated linoleic acid (CLA) supplementation. Based on the comprehensive data analyses, the data are considered compelling and provide new insight into the mechanisms underlying intramuscular fat deposition and muscle fiber remodeling. The study contributes significantly to the understanding of nutritional strategies for fat infiltration in pig muscle.

    2. Joint Public Review:

      This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at a depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations of high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.

      [Editors' note: the authors have responded to the previous rounds of review: https://doi.org/10.7554/eLife.99790.1.sa1 and https://doi.org/10.7554/eLife.99790.2.sa1]

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      In this revised manuscript, the authors aim to elucidate the cytological mechanisms by which conjugated linoleic acids (CLAs) influence intramuscular fat deposition and muscle fiber transformation in pig models. They have utilized single-nucleus RNA sequencing (snRNA-seq) to explore the effects of CLA supplementation on cell populations, muscle fiber types, and adipocyte differentiation pathways in pig skeletal muscles. Notably, the authors have made significant efforts in addressing the previous concerns raised by the reviewers, clarifying key aspects of their methodology and data analysis.

      Strengths:

      (1) Thorough validation of key findings: The authors have addressed the need for further validation by including qPCR, immunofluorescence staining, and western blotting to verify changes in muscle fiber types and adipocyte populations, which strengthens their conclusions.

      (2) Improved figure presentation: The authors have enhanced figure quality, particularly for the Oil Red O and Nile Red staining images, which now better depict the organization of lipid droplets (Figure 7A). Statistical significance markers have also been clarified (Figure 7I and 7K).

      Thanks!

      Weaknesses:

      (1) Cross-species analysis and generalizability of the results: Although the authors could not perform a comparative analysis across species due to data limitations, they acknowledged this gap and focused on analyzing regulatory mechanisms specific to pigs. Their explanation is reasonable given the current availability of snRNA-seq datasets on muscle fat deposition in other species such as human and mouse.

      Thanks for your suggestion!

      (2) Mechanistic depth in JNK signaling pathway: While the inclusion of additional experiments is a positive step, the exploration of the JNK signaling pathway could still benefit from deeper analysis of downstream transcriptional regulators. The current discussion acknowledges this limitation, but future studies should aim to address this gap fully.

      Thanks! As we noted in the Discussion, further studies should focus on the downstream transcriptional regulators through which the JNK signaling pathway affects IMF deposition.

      (3) Limited exploration of other muscle groups: The authors did not expand their analysis to additional muscle groups, leaving some uncertainty regarding whether other muscle groups might respond differently to CLA supplementation. Further studies in this direction could enhance the understanding of muscle fiber dynamics across the organism.

      Thanks for your suggestion! In this study, we mainly focused on the adipocyte, muscle, and FAP subpopulations, which play important roles in lipid deposition. As you suggested, our future studies will focus on other subpopulations, such as endothelial cells and immune cells.

      Reviewer #2 (Public review):

      Summary:

      This study comprehensively presents data from single nuclei sequencing of Heigai pig skeletal muscle in response to conjugated linoleic acid supplementation. The authors identify changes in myofiber type and adipocyte subpopulations induced by linoleic acid at a depth previously unobserved. The authors show that linoleic acid supplementation decreased the total myofiber count, specifically reducing type II muscle fiber types (IIB), myotendinous junctions, and neuromuscular junctions, whereas type I muscle fibers are increased. Moreover, the authors identify changes in adipocyte pools, specifically in a population marked by SCD1/DGAT2. To validate the skeletal muscle remodeling in response to linoleic acid supplementation, the authors compare transcriptomics data from Laiwu pigs, a model of high intramuscular fat, to Heigai pigs. The results verify changes in adipocyte subpopulations when pigs have higher intramuscular fat, either genetically or diet-induced. Targeted examination using cell-cell communication network analysis revealed associations of high intramuscular fat with fibro-adipogenic progenitors (FAPs). The authors then conclude that conjugated linoleic acid induces FAPs towards adipogenic commitment. Specifically, they show that linoleic acid stimulates FAPs to become SCD1/DGAT2+ adipocytes via JNK signaling. The authors conclude that their findings demonstrate the effects of conjugated linoleic acid on skeletal muscle fat formation in pigs, which could serve as a model for studying human skeletal muscle diseases.

      Strengths:

      The comprehensive data analysis provides information on conjugated linoleic acid effects on pig skeletal muscle and organ function. The notion that linoleic acid induces skeletal muscle composition and fat accumulation is considered a strength and demonstrates the effect of dietary interactions on organ remodeling. This could have implications for the pig farming industry to promote muscle marbling. Additionally, these data may inform the remodeling of human skeletal muscle under dietary behaviors, such as elimination and supplementation diets and chronic overnutrition of nutrient-poor diets. However, the biggest strength resides in thorough data collection at the single nuclei level, which was extrapolated to other types of Chinese pigs.

      Weaknesses:

      Although the authors compiled a substantial and comprehensive dataset, the scope of cellular and molecular-level validation still needs to be expanded. For instance, the single nuclei data suggest changes in myofiber type after linoleic acid supplementation, but these findings need more thorough validation. Further histological and physiological assessments are necessary to address fiber types and oxidative potential. Similarly, the authors propose that linoleic acid alters adipocyte populations, FAPs, and preadipocytes; however, there are limited cellular and molecular analyses to confirm these findings. The identified JNK signaling pathways require additional follow-ups on the molecular mechanism or transcriptional regulation. However, these issues are discussed as potential areas for future exploration. While various individual studies have been conducted on mouse/human skeletal muscle and adipose tissues, these have only been briefly discussed, and further investigation is warranted. Additionally, the authors incorporate two pig models into their results, but they only examine one muscle group. Exploring whether other muscle groups respond similarly or differently to linoleic acid supplementation would be valuable. Furthermore, the authors should discuss how their results translate to human and pig nutrition, such as the desirability and cost-effectiveness for pig farmers and human diets high in linoleic acid. Notably, while the single nuclei data is comprehensive, there needs to be a statement on data deposition and code availability, allowing others access to these datasets.

      Thanks for your suggestion!

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors have discussed and provided some experimental evidence to address the related issues to help justify their conclusions. The reviewer believes that authors should deposit their single-cell sequencing data and code for the broader research community.

      Thank you! We have uploaded our raw dataset to the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) at the National Genomics Data Center (Nucleic Acids Res 2022), China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences, and the data availability section has been updated (lines 575-579).

    1. eLife Assessment

      This important study reveals that disrupting fatty acid metabolism in macrophages significantly restricts the growth of Mycobacterium tuberculosis, showing that impaired lipid processing triggers various antimicrobial responses. Overall, the approach is robust, utilizing CRISPR-Cas9 knockout of multiple genes involved in lipid metabolism which yielded convincing data. This work highlights how host lipid metabolism affects the ability of tubercle bacilli to thrive intracellularly, pointing to potential new therapeutic targets.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates the role of macrophage lipid metabolism in the intracellular growth of Mycobacterium tuberculosis. By using a CRISPR-Cas9 gene-editing approach, the authors knocked out key genes involved in fatty acid import, lipid droplet formation, and fatty acid oxidation in macrophages. Their results show that disrupting various stages of fatty acid metabolism significantly impairs the ability of Mtb to replicate inside macrophages. The mechanisms of growth restriction included increased glycolysis, oxidative stress, pro-inflammatory cytokine production, enhanced autophagy, and nutrient limitation. The study demonstrates that targeting fatty acid homeostasis at different stages of the lipid metabolic process could offer new strategies for host-directed therapies against tuberculosis.

      The work is convincing and methodologically strong, combining genetic, metabolic, and transcriptomic analyses to provide deep insights into how host lipid metabolism affects bacterial survival.

      Strengths:

      The study uses a multifaceted approach, including CRISPR-Cas9 gene knockouts, metabolic assays, and dual RNA sequencing, to assess how various stages of macrophage lipid metabolism affect Mtb growth. The use of CRISPR-Cas9 to selectively knock out key genes involved in fatty acid metabolism enables precise investigation of how each step (lipid import, lipid droplet formation, and fatty acid oxidation) affects Mtb survival. The study offers mechanistic insights into how different impairments in lipid metabolism lead to diverse antimicrobial responses, including glycolysis, oxidative stress, and autophagy. This deepens the understanding of macrophage function in immune defense.

      The use of functional assays to validate findings (e.g., metabolic flux analyses, lipid droplet formation assays, and rescue experiments with fatty acid supplementation) strengthens the reliability and applicability of the results.

      By highlighting potential targets for HDT that exploit macrophage lipid metabolism to restrict Mtb growth, the work has significant implications for developing new tuberculosis treatments.

      Weaknesses:

      The experiments were primarily conducted in vitro using CRISPR-modified macrophages. While these provide valuable insights, they may not fully replicate the complexity of the in vivo environment where multiple cell types and factors influence Mtb infection and immune responses. Yet, I agree that the Hoxb8 in vitro model provides a powerful genetic tool to interrogate host-Mtb interactions using primary macrophages that represent the bone marrow-derived macrophage lineage, instead of using cell lines.

      Comments on revisions: The authors have addressed my comment satisfactorily.

    3. Reviewer #2 (Public review):

      Summary:

      Host-derived lipids are an important factor during Mtb infection. In this study, using CRISPR knockouts of genes involved in fatty acid uptake and metabolism, the authors claim that a compromised uptake, storage or metabolism of fatty acid in the hosts restricts Mtb growth upon infection. The mechanism involves increased glycolysis, autophagy, oxidative stress, pro-inflammatory cytokines and nutrient limitation. The study may be useful for developing novel host-directed approaches against TB.

      Strengths:

      The study's strength is the use of clean HOXB8-derived primary mouse macrophage lines for generating CRISPR knockouts.

      Weaknesses:

      The strength of evidence on autophagy and redox stress remains incomplete.

      Comments on revisions:

      The authors have revised the manuscript and addressed some of the earlier concerns. However, some of the interpretations and responses are incorrect.

      Overall, the level of evidence to state the following in the abstract- "Our analyzes demonstrate that macrophages which cannot either import, store or catabolize fatty acids restrict Mtb growth by both common and divergent anti-microbial mechanisms, including increased glycolysis, increased oxidative stress, production of pro-inflammatory cytokines, enhanced autophagy and nutrient limitation" is incomplete.

      There is an increase in glycolysis and pro-inflammatory cytokines and, to some extent, oxidative stress. The same cannot be said about autophagy. Unfortunately, the authors did not try to establish a direct role of any of these pathways in restricting bacterial growth in the absence of any of the three genes studied.

      Major concern:

      Autophagy: The LC3 WB does not, by any stretch of the imagination, convince that there is an increase in autophagy flux, as inferred by the authors. Authors correctly cite the "Guidelines to autophagy" paper. Unfortunately, they cite it only selectively to justify their assessment. The LC3II/LC3I ratio indicates the number of autophagosomes present. This ratio can also increase if there is an active block of autophagosome maturation. That's why having BafA1 or CQ controls is important to assess the active autophagosome maturation. However, the authors sidestep this serious consideration by claiming some "pleiotropic impact on Mtb". With BafA1 and CQ, the only assay one needs is to measure the impact on LC3II levels. In the absence of this assay, the evidence supporting the role of autophagy is incomplete.

      The main concern regarding the autophagy results is that autophagy induction typically reduces oxidative stress and classically has an anti-inflammatory effect. Thus, increased glycolysis, inflammatory cytokine production and redox stress point more towards a potential block in autophagy at the maturation step. This necessitates validation using autophagy flux assays.

      Oxidative stress: Showing a representative image for the corresponding representative groups would be more convincing. For example, there is no clarity on whether, in the infected group, there was any staining for Mtb to analyse only the infected cells.

    4. Reviewer #3 (Public review):

      Summary:

      This study provides significant insights into how host metabolism, specifically of lipids, influences the pathogenesis of Mycobacterium tuberculosis (Mtb). It builds on existing knowledge about Mtb's reliance on host lipids and emphasizes the potential of targeting fatty acid metabolism for therapeutic intervention.

      Strengths:

      To generate the data, the authors use CRISPR technology to precisely disrupt the genes involved in lipid import (CD36, FATP1), lipid droplet formation (PLIN2) and fatty acid oxidation (CPT1A, CPT2) in mouse primary macrophages. The Mtb Erdman strain is used to infect the macrophage mutants. The study reveals specific roles of different lipid-related genes. Importantly, results challenge previous assumptions about lipid droplet formation and show that macrophage responses to lipid metabolism impairments are complex and multifaceted. The experiments are well-controlled and the data is convincing.

      Overall, this well-written paper makes a meaningful contribution to the field of tuberculosis research, particularly in the context of host-directed therapies (HDTs). It suggests that manipulating macrophage metabolism could be an effective strategy to limit Mtb growth.

      Weaknesses:

      None noted. The manuscript provides important new knowledge that will lead to novel host-directed therapies to control Mtb infections.

      Comments on revisions: The authors have addressed the concerns of the reviewers.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study investigates the role of macrophage lipid metabolism in the intracellular growth of Mycobacterium tuberculosis. By using a CRISPR-Cas9 gene-editing approach, the authors knocked out key genes involved in fatty acid import, lipid droplet formation, and fatty acid oxidation in macrophages. Their results show that disrupting various stages of fatty acid metabolism significantly impairs the ability of Mtb to replicate inside macrophages. The mechanisms of growth restriction included increased glycolysis, oxidative stress, pro-inflammatory cytokine production, enhanced autophagy, and nutrient limitation. The study demonstrates that targeting fatty acid homeostasis at different stages of the lipid metabolic process could offer new strategies for host-directed therapies against tuberculosis.

      The work is convincing and methodologically strong, combining genetic, metabolic, and transcriptomic analyses to provide deep insights into how host lipid metabolism affects bacterial survival.

      Strengths:

      The study uses a multifaceted approach, including CRISPR-Cas9 gene knockouts, metabolic assays, and dual RNA sequencing, to assess how various stages of macrophage lipid metabolism affect Mtb growth. The use of CRISPR-Cas9 to selectively knock out key genes involved in fatty acid metabolism enables precise investigation of how each step (lipid import, lipid droplet formation, and fatty acid oxidation) affects Mtb survival. The study offers mechanistic insights into how different impairments in lipid metabolism lead to diverse antimicrobial responses, including glycolysis, oxidative stress, and autophagy. This deepens the understanding of macrophage function in immune defense.

      The use of functional assays to validate findings (e.g., metabolic flux analyses, lipid droplet formation assays, and rescue experiments with fatty acid supplementation) strengthens the reliability and applicability of the results.

      By highlighting potential targets for HDT that exploit macrophage lipid metabolism to restrict Mtb growth, the work has significant implications for developing new tuberculosis treatments.

      Weaknesses:

      The experiments were primarily conducted in vitro using CRISPR-modified macrophages. While these provide valuable insights, they may not fully replicate the complexity of the in vivo environment where multiple cell types and factors influence Mtb infection and immune responses.

      We thank the reviewer for pointing this out. We acknowledge that our in vitro system may indeed not fully replicate the complex in vivo environment, given what is coming to light about heterogeneous macrophage responses to Mtb infection in whole-animal models. We do believe, however, that the Hoxb8 in vitro model provides a powerful genetic tool to interrogate host-Mtb interactions using primary macrophages that represent the bone marrow-derived macrophage lineage.

      Reviewer #2 (Public review):

      Summary:

      Host-derived lipids are an important factor during Mtb infection. In this study, using CRISPR knockouts of genes involved in fatty acid uptake and metabolism, the authors claim that a compromised uptake, storage, or metabolism of fatty acid restricts Mtb growth upon infection. Further, the authors claim that the mechanism involves increased glycolysis, autophagy, oxidative stress, pro-inflammatory cytokines, and nutrient limitation. The authors also claim that impaired lipid droplet formation restricts Mtb growth. However, promoting lipid droplet biogenesis does not reverse/promote Mtb growth.

      Strengths:

      The strength of the study is the use of clean HOXB8-derived primary mouse macrophage lines for generating CRISPR knockouts.

      Weaknesses:

      There are many weaknesses of this study; they are grouped into four categories below.

      (1) Evidence and interpretations: The results shown in this study at several places do not support the interpretations made or are internally contradictory or inconsistent. There are several important observations, but none were taken forward for in-depth analysis.

      a) The phenotypes of PLIN2<sup>-/-</sup>, FATP1<sup>-/-</sup>, and CPT<sup>-/-</sup> are comparable in terms of bacterial growth restriction; however, their phenotypes in terms of lipid body formation, IL1B expression, etc., are not consistent. These are interesting observations and suggest additional mechanisms specific to specific target genes; however, clubbing them all as altered fatty acid uptake or catabolism-dependent phenotypes takes away this important point.

      We thank the reviewer for highlighting this. Our focus was on assessing the impact of manipulating lipid homeostasis in macrophages at several stages and the consequences this has on the intracellular growth of Mtb. Throughout the manuscript (abstract, results and discussion), we have continuously emphasized that interfering with lipid handling at several stages in macrophages results in both conserved and divergent antimicrobial responses against intracellular Mtb.

      b) Finding the FATP1 transcript in the HOXB8-derived FATP1<sup>-/-</sup> CRISPR KO line is a bit confusing. There is less than a two-fold decrease in relative transcript abundance in the KO line compared to the WT line, leaving concerns regarding the robustness of other experiments as well using FATP1<sup>-/-</sup> cells.

      CRISPR-Cas9 targeting of genes with single sgRNAs, as is the case with our mutants, generates insertions and deletions (INDELs) at the CRISPR cut site. These INDELs do not block mRNA transcription totally, and this is widely reported in the field. Because of this, quantitative RT-PCR or RNA-seq methods are not routinely used to verify CRISPR knockouts as they are not sensitive enough to identify INDELs. We provide INDEL quantification and knockout efficiencies by ICE analysis in supplemental file 1 for all the mutants used in the study. We also demonstrate protein depletion by western blot and flow cytometry for all the mutants (Figure 1 - figure supplement 1). Only mutants with greater than 90% protein depletion were used for subsequent characterization.

      c) No gene showing differential regulation in FATP<sup>-/-</sup> macrophages, which is very surprising.

      We assume the reviewer is referring to the Mtb transcriptome response in FATP1<sup>-/-</sup> macrophages, which we agree was unexpected.  However, we saw a significant compensatory response in the host cell (at transcriptional level) in FATP1<sup>-/-</sup> macrophages as evidenced by an upregulation of other fatty acid transporters (Figure 5 - figure supplement 1, now Figure 6 - figure supplement 1). We believe that these compensatory responses could, in part, alleviate the stresses the bacteria experience within the cell. We discuss this point in the manuscript.

      d) ROS measurements should be done using flow cytometry and not by microscopy to nail the actual pattern.

      We thank the reviewer for the suggestion. However, confocal imaging is also widely used to measure ROS with similar quantitative power and individual cell resolution (PMID: 32636249, 35737799).

      (2) Experimental design: For a few assays, the experimental design is inappropriate

      a) For autophagy flux assay, immunoblot of LC3II alone is not sufficient to make any interpretation regarding the state of autophagy. This assay must be done with BafA1 or CQ controls to assess the true state of autophagy.

      We would like to point out that monitoring LC3I to LC3II conversion by western blot, confocal imaging of LC3 puncta and qPCR analysis of autophagy related genes are all validated assays for monitoring autophagic flux in a wide variety of cells. We refer the reviewer to the latest extensive guidelines on the subject (PMID: 33634751). Furthermore, Bafilomycin A and chloroquine are not specific inhibitors of autophagy and therefore are of limited value as controls. BafA is an inhibitor of the proton-ATPase apparatus and can indirectly impact autophagy through activity on the Ca-P60A/SERCA pathway. Chloroquine impacts vacuole acidification, autophagosome/lysosome fusion and slows phagosome maturation. So, while BafA and chloroquine will reduce autophagy, their effects are pleiotropic and their impact on Mtb is unknown.

      b) Similarly, qPCR analyses of autophagy-related gene expression do not reflect anything on the state of autophagy flux.

      See our response above.

      (3) Using correlative observations as evidence:

      a) Observations based on RNAseq analyses are presented as functional readouts, which is incorrect.

      We are not entirely sure where we used our RNA-seq data sets as functional readouts. We used our transcriptome data to provide a preliminary identification of anti-microbial responses in the mutant macrophages infected with Mtb and we mention this at the beginning of the RNA-seq results sections. Where applicable, we followed up and confirmed the more compelling RNA-seq data by metabolic flux analyses, qPCR, ROS measurements, and quantitative imaging.

      b) Claiming that the inability to generate lipid droplets in PLIN2<sup>-/-</sup> cells led to the upregulation of several pathways in the cells is purely correlative, and the causal relationship does not exist in the data presented.

      It was not our intention to infer causality. We have re-written the beginning of the sentence, and it now starts with “Meanwhile, Mtb infection of PLIN2<sup>-/-</sup> macrophages led to upregulation” which hopefully eliminates any association to causality.

      (4) Novelty: A few main observations described in this study were previously reported. That includes Mtb growth restriction in PLIN2 and FATP1 deficient cells. Similarly, the impact of Metformin and TMZ on intracellular Mtb growth is well-reported. While that validates these observations in this study, it takes away any novelty from the study.

      To the best of our knowledge, Mtb growth restrictions in PLIN2 and FATP1 deficient macrophages have not been reported elsewhere. To the contrary, PLIN2 knockout macrophages obtained from PLIN2 deficient mice have been reported to robustly support Mtb replication (PMID: 29370315). We extensively discuss these discrepancies in the manuscript. We also discuss and cite appropriate references where Mtb growth restriction for similar macrophage mutants has been reported (CD36<sup>-/-</sup> and CPT2<sup>-/-</sup>). Our aim was to carry out a systematic myeloid specific genetic interference of fatty acid import, storage and catabolism to assess the effect on Mtb growth at all stages of lipid handling instead of focusing on one target. In the chemical approach, we used TMZ and Metformin deliberately because they had already been reported as being active against intracellular Mtb and we wished to place our data in the context of existing literature. These studies have been referenced extensively in the text.

      (5) Manuscript organisation: It will be very helpful to rearrange figures and supplementary figures.

      New figures have been added, and existing ones have been re-arranged where necessary. See our responses to recommendations for authors.

      Reviewer #3 (Public review):

      Summary:

      This study provides significant insights into how host metabolism, specifically lipids, influences the pathogenesis of Mycobacterium tuberculosis (Mtb). It builds on existing knowledge about Mtb's reliance on host lipids and emphasizes the potential of targeting fatty acid metabolism for therapeutic intervention.

      Strengths:

      To generate the data, the authors use CRISPR technology to precisely disrupt the genes involved in lipid import (CD36, FATP1), lipid droplet formation (PLIN2), and fatty acid oxidation (CPT1A, CPT2) in mouse primary macrophages. The Mtb Erdman strain is used to infect the macrophage mutants. The study reveals specific roles of different lipid-related genes. Importantly, results challenge previous assumptions about lipid droplet formation and show that macrophage responses to lipid metabolism impairments are complex and multifaceted. The experiments are well-controlled and the data is convincing.

      Overall, this well-written paper makes a meaningful contribution to the field of tuberculosis research, particularly in the context of host-directed therapies (HDTs). It suggests that manipulating macrophage metabolism could be an effective strategy to limit Mtb growth.

      Weaknesses:

      None noted. The manuscript provides important new knowledge that will lead to novel host-directed therapies to control Mtb infections.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The study presents compelling and well-supported conclusions based on a solid body of evidence. However, the clarity of several figures could be improved for better understanding.

      (1) In Figure 1, panels B and C are referenced incorrectly in the text.

      We thank the reviewer for identifying the error. This has now been corrected.

      (2) Figures 2 and S2 would benefit from being combined or reorganized to display the data related to infected and uninfected cells together, making it easier for the reader to interpret.

      We thank the reviewer for the suggestion. However, we believe that combining the two figures would further complicate the merged figure, making it even more difficult to interpret. We decided to highlight the mutant macrophages’ responses upon Mtb infection in Figure 2 and put the uninfected data sets in supplementary information given that the OCR and ECAR trends were similar and as expected in both infected and uninfected states.

      (3) Figure 3 is mislabeled, with four panels shown in the figure, but only panels A and B are mentioned in both the text and the figure legend.

      We thank the reviewer for the observation. Figure 3 has been extensively revised. We have included new blots, statistical comparisons and a corresponding new supplementary figure (Figure 3 - figure supplement 1). We have verified that the figure panels are labelled correctly and appropriately referenced in the manuscript text.

      (4) Figure 5 is overly complex and difficult to interpret. Simplifying the figure, possibly by reducing the amount of data or breaking it into more digestible parts, would enhance its readability.

      We thank the reviewer for the suggestion. We have separated the figure into two parts which are now Figure 5 for the PCA and Venn diagrams and Figure 6 for the pathway enrichment figure panels. We have increased the resolution of both figures in the revised manuscript to improve readability.

      (5) Panel 6A is not particularly informative and could either be omitted with a more detailed explanation provided in the text, or replaced with a clearer visual representation, such as Venn diagrams, to improve data visualization.

      We thank the reviewer for the suggestion. We have removed Figure 6A given that detailed explanation of the panel is already available in the manuscript text.

      (6) Additionally, on line 309, the word "to" is missing before "generate".

      We thank the reviewer for identifying this. This sentence has now been re-written to address some unintended inferences of causation in line with recommendations from reviewer 2.

      Reviewer #2 (Recommendations for the authors):

      (1) Manuscript Organisation: The manuscript is very poorly organised. Supplemental figures are labelled very unconventionally, and that creates much confusion in following the manuscript. Some of the results in the supplementary figures could be easily kept in the main figures, as it is difficult to compare plots between the main figures and the supplementary figures. The results of RNAseq experiments are impossible to follow with very small fonts. Overall, the figures are very casually organised and can certainly be improved.

      We would like to clarify that supplemental figures are labelled and organized in line with the eLife formatting of supplemental figures. We deliberately put some redundant figures like Figure 2 - figure supplement 1 in supplementary information (see our response to reviewer 1 recommendations on the same). We have split the RNA-seq Figure 5 into two separate figures (now Figures 5 and 6) and increased their resolution to improve readability.

      (2) Figure 3: Among the KO lines, only PLIN2<sup>-/-</sup> had a higher HIF1a level before infection. Infection surely leads to higher levels across the three cases.

      We have generated replicate western blots and provide statistical quantitation for HIF1a, AMPK and pAMPK. Figure 3 has now been revised extensively; replicate blots are in Figure 3 - figure supplement 1. We have updated the text to reflect the reviewer's observation, which was also consistent with our statistical quantification.

      (3) pAMPK blots are of very poor quality. Without quantification, the trend mentioned in the text is not clearly visible.

      We have provided two more replicate blots for AMPK/pAMPK and provide statistical quantification as described above.

      (4) Line 230: Regarding autophagy flux, neither the data suggest what is interpreted nor is this experiment correctly done. LC3 WB and autophagy gene qPCR: Unfortunately, LC3 WB, the way it was done, does not tell anything about the state of autophagy in these cells. A very mild LC3II increase is noted in CPT2<sup>-/-</sup> cells upon infection; the rest of the others do not show any change. This assay is not done correctly. To interpret LC3II WB, one needs to include the Bafilomycin A1 control, usually +Baf and -Baf run in the adjacent wells in the gel. Similarly, qPCR results are not indicative of any increase in autophagy. Regulation of ATG7, MAP1LC3B, and ULK1 is more at the post-translational level than the transcriptional level.

      We have provided an additional replicate blot together with statistical quantification of LC3II/LC3I ratios in the revised Figure 3 - figure supplement 2. Our quantifications remain consistent with our prior assertions in the manuscript text. See our response in the public review section concerning autophagy assays and the use of Baf or chloroquine as controls.

      (5) Exogenous oleate fails to rescue the Mtb icl1-deficient mutant in FATP1<sup>-/-</sup>, PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages: this result is confusing. Lipid uptake and metabolism have been the central players so far; however, here, the phenotypes of FATP1 and CPT2 in terms of lipid body accumulation are very distinct. Therefore, the assessment that Mtb growth inhibition is due to factors other than limited access to fatty acid is not consistent with the theme of the study.

      Nutrient limitation is a distinct transcriptional signature of Mtb, at least in PLIN2<sup>-/-</sup> macrophages (Figure 7). We used the oleate supplementation assay with the Mtb Δicl1 mutant to assess whether nutrient restriction was the sole anti-microbial pathway against Mtb in the knockout macrophages. This would have been the case (to a certain extent) if the growth of the Mtb Δicl1 mutant was rescuable upon addition of exogenous oleate in the knockout macrophages. Our data clearly show that this is not the case and that in addition to nutrient limitation, interference with lipid processing results in several other macrophage anti-microbial responses against the bacteria. We extensively discuss these points in the abstract, results and discussion sections of the manuscript.

      (6) Line 309: "Meanwhile, inability generate lipid droplets in Mtb infected PLIN2<sup>-/-</sup> macrophages led to upregulation in pathways involved in ribosomal biology, MHC class 1 antigen presentation, canonical glycolysis, ATP metabolic processes and type 1 interferon responses (Figure 5C, Supplementary file 3)." This is just a correlative observation. However, it is mentioned here as a causal mechanism.

      We have revised this sentence to remove any unintended inference of causation.

      (7) IL-1b is upregulated in FATP-/- macrophages, no effect in CPT2<sup>-/-</sup> macrophages, but downregulated in PLIN2<sup>-/-</sup> macrophages. Moreover, this effect is very transient, and by 24 hours, all these differences are lost. This suggests the mechanism of action, as their pro-bacterial function shown in Figure 1, is very distinct for different proteins, and FA metabolism is probably not the common denominator across these phenotypes.

      We agree with the reviewer, and we extensively discuss this in the manuscript text (results and discussion). Clearly, there are shared anti-microbial responses across the mutants, but there are also points of divergence. We would like to further clarify that pro-inflammatory responses (IL-1b or IFN-B) in Mtb infected macrophages show a biphasic early upregulation (up to 8 hours of infection) followed by a rapid resolution phase (24-48 hours post infection). This is well reported in the literature (PMID: 30914513). It is common for pro-inflammatory gene expression differences to be temporarily lost during the resolution phase (PMID: 30914513, 39472457). IL-1b expression profiles return to the 4-hour equivalent profile in Mtb infected FATP1<sup>-/-</sup> and PLIN2<sup>-/-</sup> macrophages 4 days post infection (Figure 6A, Figure 6 - figure supplement 2B, Supplementary file 2).

      (8) It is very surprising that FATP-/- macrophages do not show any change in Mtb gene expression. The robustness of this experiment and analysis appears doubtful, given that the phenotype in terms of bacterial growth was clean.

      See our response to this comment in the public reviews section.

      (9) Figure 5, Supplementary Figure 1: Among the FA transporters, authors also show data for FATP1. I am surprised to see FATP1 expression levels in the FATP1<sup>-/-</sup> cells. This puts into doubt every dataset using FATP-/- cells in this study.

      See our response to this comment in the public reviews section.

      (10) Unfortunately, with the kind of evidence presented, it is far-fetched to claim that PLIN2<sup>-/-</sup> macrophages restrict Mtb growth by increasing ROS production. There is no evidence for this statement. The MFI units in Figure 6, Supplementary 1 are too small to extract meaningful interpretations. Moreover, the data appears to be arrived at by combining multiple technical replicates. Usually, flow cytometry data are more reliable for CellROX assays. Microscopy is not the technique of choice for this assay.

      We would like to point out that MFIs are arbitrary units set to predetermined reference points. In our case, the reference was background fluorescence in CellROX unstained cells and cells stained with CellROX equivalent fluorophore conjugated isotype antibodies. We are not entirely sure what the reviewer means by “small” in these contexts. And the data is not entirely from technical replicates. Reported MFIs are from three independent repeats with MFI reads of at least 30 cells per replicate. We have added this clarification in Figure 6 - figure supplement 1 legend, now Figure 7 - figure supplement 1. See our response in the public reviews section on the use of confocal microscopy to image and quantify ROS. Furthermore, the Mtb transcriptional response in PLIN2<sup>-/-</sup> and CPT2<sup>-/-</sup> macrophages is clearly indicative of increased oxidative stresses (Figure 7).

      (11) The CFU results with Metformin and TMZ are on the expected lines, as published earlier by others. FATP1 In data is good and aligned with the knockout phenotype.

      We thank the reviewer for the note.

      (12) Western blots, when interpreted for quantitative differences, must be quantified, and data should be represented as plots with statistical analysis.

      Replicate blots have been provided and statistical quantifications performed.

    1. eLife Assessment

      In this work, the authors use a Drosophila melanogaster adult ventral nerve cord injury model extending and confirming previous observations. This important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity that are increased upon injury. The significance of the generated cell types under homeostatic conditions and in response to injury remains to be further explored and opens up new avenues of research.

    2. Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is favored following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment, compared to the remaining intact and with the potential to later link observed plasticity to behavior such as locomotion.

      Numerous experiments have been carried out in 7-day old flies, showing that the observed plasticity is not due to residual developmental remodeling or a still immature VNC.

      Different techniques are used to observe proliferation in the VNC.

      By elegantly combining different methods, the authors show glial divisions including with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      Weaknesses:

      The authors do not relate their results on gliogenesis or neurogenesis in the adult VNC to previous findings made in the context of the injured adult brain.

      The authors speculate about the role of glial inter-conversion for tissue homeostasis or regeneration, but no supportive evidence is cited or provided. Further experiments will be required to test the function of the described glial plasticity.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. If they will eventually differentiate or what type of structure is formed by them will have to be followed up in future studies.

      Context/Discussion

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could reveal important differences in repair strategies in different areas of the CNS.

    3. Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.

      The authors have now addressed all my concerns.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      eLife Assessment

      In this work, the authors use a Drosophila adult ventral nerve cord injury model extending and confirming previous observations; this important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity, that are increased upon injury. The data on detected plasticity under physiologic conditions and especially the extent of cell divisions and cell fate changes upon injury would benefit from validation by additional markers. The experimental part would improve if strengthened and accompanied by a more comprehensive integration of results regarding glial reactivity in the adult CNS.

      Thank you very much for your thoughtful comments and constructive feedback regarding our manuscript. We appreciate all the positive remarks on the significance of our findings on neural plasticity in this Drosophila adult ventral nerve cord injury model.

      In response to your suggestion, we fully agree that the continuation of this project should address cell fate changes in detail with additional markers, if available, or with an “omic” approach such as scRNAseq. Unfortunately, these further experiments are beyond the scope of this paper, which aims to describe the in vivo phenomena of cell reprogramming and the cellular events that lead glial cells to convert into neurons or neuronal precursors.

      Additionally, we agree that the experimental part can be further improved by providing a more comprehensive integration of our results with current knowledge on glial reactivity in the adult CNS. We will revise the manuscript accordingly to include a deeper discussion of the broader implications of our findings and their alignment with existing literature.

      Thank you again for your valuable input, which will undoubtedly enhance the quality of our work. We look forward to submitting the revised manuscript for your consideration.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility to get a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be only possible in nerve cords which are still developing. The observation of so-called VC cells which do not express the glial marker repo could point to the generation of neurons by former glial cells.

      Conclusion:

      The authors present an interesting story which is technically sound and could form the basis for an in-depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.

      Strengths:

      The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.

      Weaknesses:

      Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore this report adds to but does not significantly change our current understanding. The origin and identity of VC cells is still unclear. The authors show that VC cells are not GABA- or glutamatergic. Yet, there are many other neurotransmitters or neuropeptides. It would have been nice to see a staining with another general neuronal marker such as anti-Syt1 to confirm the neuronal identity of VC cells.

      We thank the reviewer for the constructive comments and positive feedback. We concur that previous studies have demonstrated glial cell proliferation in response to CNS injury. In contrast, our study focuses on glial transdifferentiation that emerges as a novel phenomenon, particularly in response to injury. We found that neuropile glia lose their glial identity and express the pan-neuronal marker Elav. To investigate the identity of these newly observed elav-positive cells, we employed anti-ChAT, anti-GABA and anti-GluRIIA antibodies to determine the functional identity of these cells. In addition, we stained them with other neuronal markers such as Enabled, Gigas or Dac (not shown); however, our attempts yielded limited success. To address this, we have now included a discussion section exploring the potential identity of these cells, considering the possibility that they may represent immature neurons.

      Reviewer #2 (Public review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favoured following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment, compared to the remaining intact and with the potential to later link observed plasticity to behaviour such as locomotion.

      Numerous experiments have been carried out in 7-day old flies, showing that the observed plasticity is not due to residual developmental remodelling or a still immature VNC.

      By elegantly combining different methods, the authors show glial divisions including with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      We would like to thank the reviewer for his/her comments and the positive analysis of this work.

      Weaknesses:

      The authors observe consistent inter-conversion of EG to ALG glial subtypes that is further stimulated upon injury. The authors conclude that these findings have important consequences for CNS regeneration and potentially for memory and learning. However, it remains somewhat unclear how glial transformation could contribute to regeneration and functional recovery.

      This is an ongoing question in the laboratory and in the field. We know that glial cells contribute to the regenerative program in the nervous system, and molecular signalling in glial cells is a determinant of functional recovery (Losada-Perez et al 2021). Therefore, we include this concept in the discussion as the evidence indicates that glial cells participate in these programs. However, further investigation is required to clarify and determine the mechanisms underlying this glial contribution. To determine if glia-to-neuron transformation contributes to functional recovery, we would need to compare the recovery of animals with new VC to animals without VC. However, the molecular mechanism that produces this change of identity is still unknown, and therefore we are not able to generate injured flies with no new VC.

      The signal of the Fucci cell cycle reporter seems more complex to interpret based on the panels provided compared to the other methods employed by the authors to assess cell divisions.

      We agree that Fly Fucci is a genetic reporter that might be more complex to interpret than EdU staining or other markers. However, glial cell proliferation is a milestone of this manuscript, and we used different available tools to confirm our results. We have revised this specific section to ensure that the text is clear and straightforward.

      Elav+ cells originating from glia do not express markers for mature neurons at the analysed time-point. If they will eventually differentiate or what type of structure is formed by them will have to be followed up in future studies.

      We fully agree with the reviewer, and we will analyze later days to study neuronal fate and contribution to VNC function.

      Context/Discussion

      There is some lack of connecting or later comparing the observed forms of glial plasticity in the VNC with respect to plasticity described in the fly brain.

      Highlighting some differences in the reactiveness of glia in the VNC compared to the brain could point to relevant differences in repair capacity in different areas of the CNS.

      Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions under homeostatic conditions. The potential significance of this rather unexpected "baseline" plasticity in adult tissues is not explicitly pointed out and could improve the understanding of the findings.

      Some speculations if "interconversion" of glia is driven by the needs in the tissue could enrich the discussion.

      We would like to thank the reviewer for these suggestions. We have changed the discussion to introduce these concepts.

      Reviewer #3 (Public review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism, and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury. This paper provides a new mechanism in regeneration, and gives an understanding of the role of glial cells in the process.

      Comments on revisions:

      In the previous version of the manuscript, I had suggested several recommendations for the authors. Unfortunately, none of these were addressed in the authors' revision.

      We are sorry for this error. We apologize but we never received these comments. We have now found them, and we have incorporated these comments in the new version of the manuscript.

      (1) Have you tried screening for other markers for the EdU+ Repo+ Pros- cells?

      We have identified these cells as glial cells (Repo+) that are not astrocyte-like glia (pros-), but we have not further characterized their identity. Our aim was to identify these proliferating glial cells as NPG (Neuropile glia), which are Astrocyte-Like Glia (ALG), as previous works suggest in larvae (Kato et al., 2020; Losada-Perez et al., 2016), or Ensheathing Glia (EG). To rule out the ALG identity, we used prospero as the best marker. The results indicate that there are ALG among the proliferating population, but in addition, we also found pros- glial cells that were EdU positive. These cells are located in the interface between cortex and neuropile, where the neuropile glia position is described. The anti-pros staining indicated they were not ALG, which suggests that they are EG.

      There is no specific nuclear marker for EG cells; therefore, we used FLY_FUCCI under the control of an EG-specific promoter (R56F03-Gal4) to determine if the other dividing cells were EG. These results indicate that EG divide, although their proliferation does not increase upon injury.

      The R56F03 Gal4 construct is described as ensheathing glia specific by previous publications, including:

      (1) Kremer M. C., Jung C., Batelli S., Rubin G. M. and Gaul U. (2017). The glia of the adult Drosophila nervous system. Glia 65, 606-638. 10.1002/glia.23115

      (2) Ren Q., Awasaki T., Wang Y.-C., Huang Y.-F. and Lee T. (2018). Lineage-guided Notch-dependent gliogenesis by Drosophila multi-potent progenitors. Development 145, dev160127. 10.1242/dev.160127

      To summarize, our results suggest that these proliferating glial cells include both ALG and EG. Our results cannot rule out that a residual fraction of these proliferating cells are neither ALG nor EG.

      (2) You mentioned that ALG are heterogenous in size and shape, does that mean that you may have different subpopulations of ALG? Would that also mean that only a portion of them responds to injury?

      Yes, as with astrocytes in vertebrates, this population is highly heterogeneous. Currently there are no molecular tools to specifically identify these subpopulations and characterize their distinct roles. However, emerging research suggests that differences in size, shape, and potentially molecular markers could correlate with functional diversity. This implies that certain subpopulations of ALG may be more specialized or primed to respond to injury, while others may play roles in homeostasis or other processes. Understanding this heterogeneity will require advanced techniques such as single-cell RNA sequencing, spatial transcriptomics, or live imaging to unravel how these subpopulations contribute to injury responses and overall tissue dynamics.

      (3) You mentioned that NP-like cells have similar nuclear shape and size to ALG and EG, while Ventral cortex cells have larger nuclei. Can you please show a quantification of the NP-like cells and Ventral cortex cells size, and show a direct comparison with ALG and EG cells to support those claims (images, quantification and analysis)?

      We added a new supplementary figure with a graph showing nuclei size differences between VC and NP-like cells, and a diagram showing VC cell localization. Images in figure 2A-A’ and 2B-B’ show both types of cells with the same scale, additionally, NPG cells are shown in red (current expression of the specific Gal4 line). A direct comparison between EG and NP-like glia can be observed in Figure 3 as well.

      Besides size and localization, we conclude that VC and NP-like cells present different molecular markers, as VC are elav-positive and repo-negative whereas NP-like cells are repo-positive and elav-negative.

      (4) In Figure 2B, the repo expression is not very clear. I suggest using a different example to support the claim that NP cells are Repo+.

      We have changed the color of anti-elav staining to facilitate visualisation.

      (5) Again, in Figure 2C, you need quantification and analysis to support the claim that you used nuclear shape and size to identify VC vs. NP like cells.

      Quantification is described in point 3; criteria are shown in Figure S1.

      (6) What is the identity of the newly formed neurons? Other than Elav, have you tried using other markers of neurons that are typically found in this area?

      This question is of great interest and relevance. We have made great efforts to answer this open question and, so far, our data suggest that these neurons might be in an immature state. In this last version of the manuscript, we included the results (Figure S1) with several different markers.

      The molecular identity of these cell populations, glia and neurons, is currently under investigation.

      Minor comments:

      (1) In the abstract, EG and ALG abbreviations are not introduced properly.

      Thank you very much for noticing this missing information; we have now included it in the abstract.

      (2) Please include a representation of the NPG somata location in Figure 1A.

      We have included this information in the figure.

      (3) A schematic showing the differences between ALG and EG cells would be helpful as well.

      We have included in the introduction references and reviews where other authors describe in detail the differences.

      (4) In Figure 1 E, G, H, please indicate the genotype of the fly used in the panel as well as the cell type studied.

      The complete genotype is included in the corresponding figure legend. We have added a simplified genotype in the figure for clarity.

      (5) Please show the genotype used for images in Figure 2: ALG or EG specific drivers.

      This information is included in the corresponding figure legend. We believe that it is better to keep the figure clean so we decided to keep the complete genotype, which is considerably long, only in the figure legend.

    1. eLife Assessment

      This manuscript establishes a mathematical model to estimate the key parameters that control the repopulation of planarian stem cells after sublethal irradiation as they undergo fate-switching as part of their differentiation and self-renewal process. The findings are important for future investigation of stem cell division in planarians and have implications for analyzing stem cell biology in other systems. The methods are convincing, integrating modeling with perturbations of key transcription factors known to be critical for cell fate decisions, but the authors have only shown that this is the case for a small number of stem cell types.

    2. Reviewer #1 (Public review):

      Summary:

      This is a very creative study using modeling and measurement of neoblast dynamics to gain insight into the mechanism that allows these highly potent cells to undergo fate-switching as part of their differentiation and self-renewal process. The authors estimate growth equation parameters for expanding neoblast clones based on new and prior experimental observations. These results indicate neoblasts likely undergo much more symmetric self-amplifying division than loss of the population through symmetric differentiation, in the case of clone expansion assays after sublethal irradiation. Neoblasts take on multiple distinct transcriptional fates related to their terminally differentiated cell types, and prior work indicated neoblasts have a high plasticity to switch fates in a way linked to cell cycle progression and possibly through a random process. Here, the authors explore the impact of inhibition of key transcription factors defining such states (i.e., "fate specifying transcription factors", FSTFs) plus measurement and modeling in the clone expansion assay, to find that inhibition of factors like zfp1 likely causes otherwise zfp1-fated neoblasts to fail to proliferate and differentiate, without causing compensatory gains in other lineages. A mathematical model of this process assuming that neoblasts do not retain a memory of prior states while they proliferate and transition across specified states can mimic the experimentally determined decreased sizes of clones following inhibition of zfp1. Complementary approaches to inhibit more than one lineage (muscle plus intestine) support the idea that this is a more general process in planarian stem cells. These results provide an important advance for understanding the fate-switching process and its relationship to neoblast growth.

      Overall, I find the evidence very well presented and the study compelling; it offers an important new perspective on the key properties of neoblasts. I have some comments to clarify the presentation and significance of the work.

      Comments on revisions:

      In this revised version, the authors nicely address all of my comments and I find the work makes a strong case for its main conclusions.

    3. Reviewer #2 (Public review):

      Summary:

      Cell cycle duration and cell fate choice are critical to understanding the cellular plasticity of neoblasts in planarians. In this study, Tamar et al. integrated experimental and computational approaches to simulate a model for neoblast behaviors during colony expansion.

      Strengths:

      The finding that "arresting differentiation into specific lineages disrupts neoblast proliferative capacities without inducing compensatory expression of other lineages" is particularly intriguing. This concept could inspire further studies on pluripotent stem cells and their application for regenerative biology.

      Comments on revisions:

      The authors have addressed all of my comments and concerns.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews

      Reviewer #1 (Public review):

      Overall I find the evidence very well presented and the study compelling. It offers an important new perspective on the key properties of neoblasts. I do have some comments to clarify the presentation and significance of the work.

      We thank the reviewer for the positive feedback and plan to improve the presentation of the work.

      Reviewer #2 (Public review):

      However, the absence of a cell-cell feedback mechanism during colony growth and the likelihood of the difference needs to be clarified. Is there any difference in interpreting the results if this mechanism is considered?

      We will improve the description of the model assumptions and the interpretation of the data on the basis of these assumptions.

      Although hnf-4 and foxF have been silenced together to validate the model, a deeper understanding of the tgs-1+ cell type and the non-significant reduction of tgs-1+ neoblasts in zfp-1 RNAi colonies is necessary, considering a high neural lineage frequency.

      We will improve the analysis of this result in light of the experimentally determined frequency of the tgs-1+ neoblast population.

      Recommendations for the authors

      Reviewing Editor Comments:

      After consultation, we have compiled a list of the key changes to be made to the manuscript, along with reviewer-specific recommendations to follow.

      (1) Include a section that explicitly describes the assumptions and limitations of the study, particularly with respect to the following assumptions:

      We thank the reviewers for the comment. We added a description of the model assumptions in the methods section “Assumptions underlying neoblast colony growth model”.

      a) All known types of specialized neoblasts cycle at the same rate (see points from Reviewer 1).

      We thank the reviewers for the comment. The current data used to estimate τ (Lei et al., Dev Cell, 2016) does not allow the direct estimation of individual cycling behaviors. Consequently, we assume that all specialized neoblasts cycle at the same average rate, a simplification supported by the model's accurate prediction of colony growth.

      b) The assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. The manuscript does not mention that there may be fundamental differences among these different FSTFs that could be uncovered by future work. A strong addition to the paper would be to test other epithelial genes (e.g. p53, chd4, egr5) to show reproducible behavior within a single lineage.

      We thank the reviewers for the comment. Colony size reduction following inhibition of Smed-p53 and failure to produce epidermal progenitors is strongly supported by previous analysis (Wagner et al., Cell Stem Cell, 2012). We refer to this observation in the paper in the section titled: “Inhibition of zfp-1 does not induce overexpression of other lineages in homeostasis”. We added the following sentence to the discussion (Line 460-462): Interestingly, suppression of Smed-p53, a TF expressed in neoblasts and required for epidermal cell production, has resulted in a similar reduction in colony size (Wagner et al., Cell Stem Cell, 2012).

      Of note, Chd4 expression is not limited to specialized neoblasts or to a specific lineage (Scimone et al., Development, 2010), and therefore its inhibition likely has a more complex outcome than an effect on a single lineage. Furthermore, egr-5 is not expressed in neoblasts (Tu et al., eLife, 2015), making this experimental condition more challenging to examine in the context of neoblast colonies at the time points assessed in this study.

      c) The fact that the data used to feed the model relies on radiated animals which are likely to have altered cell cycle rates compared to unirradiated animals (see comment by Reviewer 1). Of note, the model predicts a steady increase in colony size, but colony size does not change between 9dpi and 12dpi.

      We thank the reviewers for the comment. The colony size in control animals increased between 9 and 12 dpi (Fig 3B), as predicted by the model. In zfp-1 (RNAi) animals, the median colony size has also increased over this period, at a slower rate, which we attribute to the increase in q. We attribute the unchanged average colony size to an increase in the frequency of cells failing to proliferate, because of selection of a fate they cannot fully differentiate into.

      d) In light of both reviewers' comments about colony expansion vs. feedback, the authors should discuss how predicted changes to division frequencies might change as homeostasis is reached, or explain how their model accounts for the predicted rate differences under homeostatic conditions in which overall neoblast numbers do not change. Can the model estimate when this transition might occur?

      We thank the reviewers for the comment. Our colony assays are constrained by the animals' survival following sub-total irradiation (16 to 20 days). In this timeframe, the neoblast population is far smaller than in non-irradiated animals. Therefore, the animals do not reach homeostasis during the experiment, and the model does not allow us to estimate the time the system would need to return to homeostasis.

      (2) In Figure 2D, the assumption is that these adjacent smedwi-1+ cells are sisters. Previous data analyzing this relied on EdU or H3P staining to show a shared division history. When these images were collected is therefore extremely critical to include (the methods suggest 7, 9, or 12 days). The authors should justify why they believe that these adjacent cells are derived from a single neoblast that has divided only once.

      We thank the reviewers for the comment. The images were collected at 7 dpi. We modified the figure legend and the associated methods to include this information. At this early time point, smedwi-1+ cell dyads are spatially separated from other neighboring cells, suggesting that they are the product of a single cell division. Importantly, our data is in complete agreement with previous estimates of symmetric renewal division rate (Raz et al., Cell Stem Cell, 2021; Lei et al, Developmental Cell, 2016).

      (3) Clarify the wording 'pre-selected' in the abstract as described by Reviewer 1.

      We thank the reviewers for the comment, and for clarity we replaced the wording “pre-select” with “select”. 

      (4) Experimental details that are important to the interpretation should be added. For example, how is belonging to a colony defined? This is important because some of the data (e.g. Figure S1A: similar numbers of smedwi-1+ cells are observed at 2dpi and 4dpi, but 4dpi is considered a colony whereas 2dpi is not). The timing of quantification should be included in each figure (it is missing in Figure S2, and Figure 3C and 3D). How the authors distinguish biological vs technical replicates is not mentioned.

      We thank the reviewers for the comment. Subtotal irradiation may result in formation of a spatially-isolated cluster of neoblasts that is not distributed throughout the animal (Wagner et al., Science, 2011). This localized cluster of neoblasts is defined as a neoblast colony (Wagner et al., Science, 2011; Wagner et al., Cell Stem Cell, 2012). The small number of high smedwi-1+ cells observed at 4 dpi in our experiments aligns with this definition (Fig S1A). By contrast, the low smedwi-1 expression detected across the animal 2 dpi does not fit this definition and likely reflects remnants of dying neoblasts resulting from irradiation. The following text was added to the figure legend: “isolated cells expressing low levels of smedwi-1+ were scattered in the planarian parenchyma, likely reflecting remnants of dying neoblasts”.

      (5) Figure 5F appears to use SMEDWI-1 antibody (based on capital letters and increased signal in the brain). Is this the case? The methods do not mention the use of a SMEDWI-1 antibody, and the text indicates that these are progenitors, but SMEDWI-1 protein is well known to not mark neoblasts. If the antibody was used, the authors should not claim that these are neoblasts.

      We thank the reviewers for the comment. The SMEDWI-1 antibody used in the experiments described in Figure 5F indeed labels neoblasts and their progeny (Guo et al., Developmental Cell, 2006). The methods section “Immunofluorescence combined with FISH” details the labeling procedure, which combines FISH and IF using this antibody.

      All microscopy images are difficult to see. Perhaps this is because they are formatted as CMYK images. They should be converted to RGB format to make them appear less dull.

      We thank the reviewer for the comment. Improved versions of the figures have now been uploaded.

      The terminology used in Figure 5 to describe upregulation should not be "overexpression".

      We thank the reviewers for the comment. We changed the terminology to “upregulated”.

      Reviewer #1 (Recommendations for the authors):

      I think the authors should include a section that explicitly lays out the assumptions and limitations of the study. For example, I believe that determining tau requires assuming that all different types of specialized neoblasts cycle at the same rates. Also there is the assumption that any FSTF-like gene would behave like zfp1 or foxF and hnfA genes. It seems to remain possible that a future study could find that a subset of FSTFs might indeed exert "either/or" decisions in fating, just not the particular genes under investigation here.

      We thank the reviewer for the comment. We added a description of the model assumptions in the methods section.

      In the abstract, the wording "pre-selected" is somewhat puzzling to me. I would interpret a preselection as a process that defines the next specified state prior to its manifestation. Instead, and as I understand the authors argue this as well, the study provides good evidence that the determination mechanism is random in that subsequent neoblast choices do not likely depend on prior states. So I would suggest changing that wording.

      We thank the reviewer for the comment. We replaced “pre-select” with “select”.

      Is it possible to determine the uncertainty in measuring tau the cell cycle time and would this have an impact on subsequent modeling?

      We thank the reviewers for the comment. The current data that was used to estimate tau (Lei et al., Dev Cell, 2016) does not allow us to directly estimate the uncertainty in measuring τ.

      For lines 154-164 I would suggest doing a little more to explicitly write out the logic of determining the growth constants within the main text and not just in methods, for ease of reading.

      We thank the reviewer for the comment, and added explanations for how we determined the growth constant in the text. The text now reads (lines 160-166): “Considering an average cell cycle length of 29.7 hours, we calculated the value of q using the following approach: the probabilities of all cell division outcomes must sum to 1. Our experimental data showed that symmetric renewal (p) and asymmetric division (a) occur at equal rates (i.e., p = a). By fitting these parameters to the experimental data, we determined that the difference between the probabilities of symmetric renewal and symmetric differentiation (i.e., p - q) was 0.345 (Fig 2E, S1D-E). Therefore, with these criteria, we estimated the probabilities of cell division outcomes in the colony as p = 0.45, a = 0.45, and q = 0.1 (Fig 2G; Methods).”
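
      The arithmetic behind these estimates can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes only the three constraints quoted above (p + a + q = 1, p = a, and p - q = 0.345), and the function name `division_probabilities` is hypothetical.

```python
def division_probabilities(p_minus_q: float) -> dict[str, float]:
    """Solve p + a + q = 1 under the constraints a = p and p - q = p_minus_q.

    Substituting a = p and q = p - d into the sum gives 3p - d = 1,
    so p = (1 + d) / 3.
    """
    p = (1.0 + p_minus_q) / 3.0  # symmetric renewal
    a = p                        # asymmetric division (constraint a = p)
    q = p - p_minus_q            # symmetric differentiation
    return {"p": p, "a": a, "q": q}


probs = division_probabilities(0.345)
rounded = {name: round(value, 2) for name, value in probs.items()}
print(rounded)
```

      Rounded to two decimals, the solution matches the reported values of p = 0.45, a = 0.45, and q = 0.1.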

      Line 192 why does post-mitotic progeny number linearly relate to neoblast number? In clones, a change in q has an exponential effect. I feel like I am missing something.

      We thank the reviewer for the comment. In colonies, 50% of cell divisions result in the production of post-mitotic progeny (asymmetric division). Therefore, the number of produced progenitors in a given cell cycle is linearly correlated with the number of neoblasts. This statement is in line with previous analysis of planarian colony size (Wagner et al., Cell Stem Cell, 2012).
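
      The distinction between exponential colony growth and linear progeny production can be illustrated with a short sketch. It is not the authors' model code and rests on simplifying assumptions: every neoblast divides once per cycle, divisions are synchronous, only expected values are tracked, and (following the response above) only asymmetric divisions are counted as producing post-mitotic progeny. The function name `expected_colony` and the default parameter values are illustrative.

```python
def expected_colony(n_cycles, p=0.45, a=0.45, q=0.10, n0=1.0):
    """Return expected (neoblasts, new_progeny) pairs, one per cell cycle.

    p: symmetric renewal (+1 neoblast), a: asymmetric division (no net change),
    q: symmetric differentiation (-1 neoblast).
    """
    history = []
    neoblasts = n0
    for _ in range(n_cycles):
        # Progeny made in this cycle scale linearly with the current neoblast count.
        new_progeny = a * neoblasts
        history.append((neoblasts, new_progeny))
        # Net expected change per division is p - q, so growth is geometric.
        neoblasts *= 1.0 + p - q
    return history


history = expected_colony(5)
for cycle, (n, progeny) in enumerate(history):
    print(cycle, round(n, 3), round(progeny, 3), round(progeny / n, 3))
```

      Under these assumptions the neoblast count grows by a factor of 1 + p - q per cycle, so a change in q compounds over time, whereas the ratio of newly produced progeny to neoblasts within any single cycle stays fixed at a.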

      Line 103: it also seems possible, although less likely, that the specified state is not fixed within a given cell cycle, and it could be that cells that try to switch into zeta-neoblasts mid-cell cycle arrest in proliferation, etc., just for that time.

      We thank the reviewer for the comment and agree that this is a possibility. However, our observations suggest that incorporating this factor into the model is unnecessary for accurately predicting colony size.

      In terms of the feedback mechanism proposed to operate in homeostasis, I think in the case of zfp-1 it is quite likely that loss of epidermal differentiation results in wound responses (this phenomenon has been documented in egr-5 RNAi in Tu et al 2015 I believe). This could play out differently in the clone assay because the effects of sublethal irradiation on this process would predominate in both control versus zfp1(RNAi) conditions.

      We thank the reviewer for the comment. Our RNA-seq analysis following zfp-1 inhibition did not show overexpression of injury-induced genes at an early time point (6 days; Fig. 5B-C). However, an increase in cycling cells was detected much earlier via EdU labeling (3 days; Fig. 5D). In the case of egr-5 suppression, Tu et al. analyzed injury-induced gene expression at a later stage (21 days of RNAi), where they found significant epidermal defects (see Fig. 5C in Tu et al.). We agree that sublethal irradiation effects likely predominate in colony analysis for both control and zfp-1 (RNAi) animals. In homeostasis, additional factors likely influence cell proliferation and differentiation.

      It seems likely that some of the differences noted between homeostasis versus clone growth could ultimately arise from the different growth parameters under each setting. Could the rate parameters be estimated from prior data in homeostasis as well? It seems to me that with the framework the authors use, homeostasis must involve a net zero change to neoblast abundance (also shown by Wagner 2011 by the sigmoidal curve of neoblast abundance at the endpoint of clone expansion). Therefore, in these conditions p=q by definition. Experimental evidence from Lei 2016 (Figure S7M) suggests asymmetric divisions and symmetric renewing divisions are about equally abundant (5/12 41% sym renewing vs 7/12 58% asymmetric renewing). Therefore, under homeostasis, there would be an estimated p=q=0.3 and a=0.4. Compared to clone growth conditions then, in homeostasis, it seems that roughly the rate of symmetric renewal decreases and the rate of symmetric differentiation also increases. I wonder, could this kind of difference potentially account for the differences between homeostasis versus clone expansion settings? It is also worth noting that the clone expansion context has been used as a sensitized genetic background for identifying effects of gene inhibition on neoblast self-renewal, so perhaps the reason this works is that the rates of self-renewal are relatively lower in homeostasis so that clone expansion represents a case where there is greater demand for self-renewal.

      We thank the reviewer for the comment. We agree that under homeostatic conditions, where the population size remains stable, the average probability of symmetric renewal matches the average probability of symmetric differentiation or elimination. By contrast, during colony expansion, the probability of symmetric renewal exceeds that of symmetric differentiation or elimination. The differences in response to a lineage block between homeostasis and colony expansion can have multiple interpretations. However, data from homeostatic animals does not permit the analysis of individual neoblasts or their specific responses to a lineage block. Consequently, we cannot determine whether the proliferative response following the lineage block during homeostasis is a direct response to the lineage block or an indirect effect resulting from changes in other neoblasts. We discuss these possibilities further in lines 472 - 484.

      In terms of the memory effect, I recall some arguments presented in the Raz 2021 study that were consistent with a slight memory for neoblast specification being retained. I believe this was a minor point from detecting a slightly higher likelihood of identifying 2-cell clones that both took on prog1+ identity compared to the population average. If this is the case, it may be worth the authors commenting on reconciling those observations with their model.

      We thank the reviewer for their comment. Raz et al. (Cell Stem Cell, 2021) reported that in the asymmetric division of a zeta-neoblast, which generates a prog-2+ cell and a neoblast, there was a slightly higher observed frequency of zfp-1 expression in the neoblast compared to the expected rate (Expected: 32%, Observed: 44%). This small increase may reflect a mild memory effect, experimental variability, or both. However, statistical analysis using Fisher's exact test yielded a non-significant p-value (p = 0.1), suggesting that this difference could be attributed to experimental variability. Other data from Raz et al., such as lineage representation in early colonies, also did not show significant memory effects, indicating that any such effects, if present, are minimal and difficult to detect. Therefore, while we do not, and cannot, rule out the presence of minor memory effects, we expect that effects of this magnitude will have minimal impact on our model.

      Reviewer #2 (Recommendations for the authors):

      Figure 2C and 2D:

      Please provide the specific time points for the data presented.

      We thank the reviewer for the comment. The information was added to the figure legend.

      Colony growth and homeostasis:

      It would be beneficial to estimate a time point at which colony growth transitions to a model with a cell-cell feedback mechanism, similar to that observed in homeostasis. This would help in understanding the dynamics and timing of these processes.

      We thank the reviewers for the comment. Our colony assays were constrained by the animals' survival following sub-total irradiation (16 to 20 days). Neoblast numbers are substantially reduced compared to unirradiated animals, preventing us from determining the time point at which homeostasis is achieved.

      Methods:

      μl should be μL  

      The text was changed accordingly.

      Line 526: H2O should be H₂O

      The text was changed accordingly.

    1. eLife Assessment

      This important and well-written study uses functional neuroimaging in human observers to provide compelling evidence that activity in the early visual cortex is suppressed at locations that are frequently occupied by a task-irrelevant but salient item. This suppression appears to be general to any kind of stimulus and also occurs in advance of any item actually appearing. The work will be of great interest to psychologists and neuroscientists examining attention, perception, learning and prediction.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigated if/how distractor suppression derived from statistical learning may be implemented in early visual cortex. While in a scanner, participants conducted a standard additional singleton task in which one location more frequently contained a salient distractor. The results showed that activity in EVC was suppressed for the location of the salient distractor as well as for neighbouring neutral locations. This suppression was not stimulus specific - meaning it occurred equally for distractors, targets and neutral items - and it was even present in trials in which the search display was omitted. Generally, the paper was clear, the experiment was well-designed, and the data are interesting.

      The authors addressed all of my concerns and the revised manuscript will make a beautiful addition to the literature.

    3. Reviewer #2 (Public review):

      The authors of this work set out to test ideas about how observers learn to ignore irrelevant visual information. Specifically, they used fMRI to scan participants who performed a visual search task. The task was designed in such a way that highly salient but irrelevant search items were more likely to appear at a given spatial location. With a region-of-interest approach, the authors found that activity in visual cortex that selectively responds to that location was generally suppressed, in response to all stimuli (search targets, salient distractors, or neutral items), as well as in the absence of an anticipated stimulus.

      Strengths of the study include: A well-written and well-argued manuscript; clever application of a region of interest approach to fMRI design, which allows articulating clear tests of different hypotheses; careful application of follow-up analyses to rule out alternative, strategy-based accounts of the findings; tests of the robustness of the findings to detailed analysis parameters such as ROI size; and exclusion of the role of regional baseline differences in BOLD responses. The main findings are enhanced by supplementary analyses that distinguish between the responses of early visual areas.

      The study provides an advance over previous studies, which identified enhancement or suppression in visual cortex as a function of search target/distractor predictability, but in less spatially-specific way. It also speaks to open questions about whether such suppression/enhancement is observed only in response to the arrival of visual information, or instead is preparatory, favouring the latter view. These questions have been at the heart of theoretical debates in this literature on how distractor suppression unfolds in the context of visual search.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This well-written report uses functional neuroimaging in human observers to provide convincing evidence that activity in the early visual cortex is suppressed at locations that are frequently occupied by a task-irrelevant but salient item. This suppression appears to be general to any kind of stimulus, and also occurs in advance of any item actually appearing. The work in its present form will be valuable to those examining attention, perception, learning and prediction, but with a few additional analyses could more informatively rule out potential alternative hypotheses. Further discussion of the mechanistic implications could clarify further the broad extent of its significance. 

      We thank the editor and the reviewers for the positive evaluation of our manuscript and the thoughtful comments. Below we provide a detailed point-by-point reply to the reviewers’ comments.

      In addition to addressing the reviewers' comments, we have improved the figure legends by explicitly describing the type of error bars depicted in the figures, information which was previously only listed in the Materials and Methods section. Specifically, the statement: “Error bars denote within-subject SEM” was added to several figures, as applicable. We believe that briefly reiterating this information in the figure legends enhances clarity and enables readers to interpret the results more accurately and efficiently. We also updated our code and data sharing statement, as well as opened the repository for the public: “Analysis and experiment code, as well as data required to replicate the results reported in this manuscript are available here: https://doi.org/10.17605/OSF.IO/G4RXV. Raw MRI data is available upon request.”

      Public Reviews

      Reviewer #1 (Public review): 

      Summary: 

      The authors investigated if/how distractor suppression derived from statistical learning may be implemented in early visual cortex. While in a scanner, participants conducted a standard additional singleton task in which one location more frequently contained a salient distractor. The results showed that activity in EVC was suppressed for the location of the salient distractor as well as for neighbouring neutral locations. This suppression was not stimulus specific - meaning it occurred equally for distractors, targets and neutral items - and it was even present in trials in which the search display was omitted. Generally, the paper was clear, the experiment was well-designed, and the data are interesting. Nevertheless, I do have several concerns mostly regarding the interpretation of the results. 

      (1) My biggest concern with the study is regarding the interpretation of some of the results. Specifically, regarding the dynamics of the suppression. I appreciate that there are some limitations with what you might be able to say here given the method but I do feel as if you have committed to a single interpretation where others might still be at play. Below I've listed a few alternatives to consider. 

      We agree with the reviewer that there are important alternatives to consider. Adequately addressing these alternatives will substantially increase the inferences we can draw from our data. Therefore, we address each alternative interpretation in detail below.

      (a) Sustained Suppression. I was wondering if there is anything in your results that would speak for or against the suppression being task specific. That is, is it possible that people are just suppressing the HPDL throughout the entire experiment (i.e., also through ITI, breaks, etc., rather than just before and during the search). Since the suppression does not seem volitional, I wonder if participants might apply a blanket suppression to HPDL until they learn otherwise. Since your localiser comes after the task you might be able to see hints of sustained suppression in the HPDL during these trials.

      It is indeed possible that participants suppressed the HPDL throughout the entire experiment, instead of proactively instantiating suppression on each trial. While possible, we believe that this account is less likely to explain the present results, given the utilized analysis approach, a voxel-wise GLM fit to the BOLD data per run (see Materials and Methods for details). Specifically, we derived parameter estimates from this GLM per location to estimate the relative suppression. Sustained suppression would modulate BOLD responses throughout the run, i.e. presumably also during the implicit baseline period used to estimate the contrast parameter estimates per location. Hence, sustained suppression should not result in a differential modulation between locations, as the BOLD response at the HPDL during the baseline period would be equally suppressed as during the trial. Inspired by the reviewer’s comment, we now clarify this critical point in the manuscript’s Discussion section:

      “Third, participants might have suppressed the HPDL consistently throughout the experiment. This sustained suppression account differs from the proactive suppression proposed here. While this alternative is plausible, we believe that it is less likely to account for the present results, given the analysis conducted. Specifically, we computed voxel-wise parameter estimates and contrasted the obtained betas between locations. Under a sustained suppression account, the HPDL would show suppression even during the implicit baseline period, which would obscure the observed BOLD suppression at and near the HPDL.” 
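
      The logic of this argument can be illustrated with a toy regression. The sketch below is not the authors' analysis pipeline: it assumes a noiseless boxcar design, an additive suppression offset, and made-up signal values, and the helper `fit_task_beta` is hypothetical. It only shows that a constant offset present in both baseline and trial periods is absorbed by the intercept, whereas suppression confined to trial periods lowers the task beta.

```python
import numpy as np


def fit_task_beta(signal, task_regressor):
    """Least-squares fit of signal = beta * task + intercept; returns beta."""
    design = np.column_stack([task_regressor, np.ones_like(task_regressor)])
    coefficients, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return coefficients[0]


# Boxcar regressor: 0 during implicit baseline, 1 during trials (5 blocks).
task = np.tile(np.r_[np.zeros(10), np.ones(10)], 5)

neutral = 100.0 + 2.0 * task     # unsuppressed location (illustrative values)
sustained = neutral - 0.5        # lowered throughout, baseline included
proactive = 100.0 + 1.5 * task   # lowered only around trials

beta_neutral = fit_task_beta(neutral, task)
beta_sustained = fit_task_beta(sustained, task)
beta_proactive = fit_task_beta(proactive, task)
print(beta_neutral, beta_sustained, beta_proactive)
```

      In this toy case the sustained offset leaves the task beta unchanged relative to the neutral location, so a between-location difference in betas points to suppression that is tied to the trial period.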

      (b) Enhancement followed by suppression. Another alternative that wasn't discussed would be an initial transient enhancement of the HPDL which might be brought on by the placeholders followed by more sustained suppression through the search task. Of course, on the whole this would look like suppression, but this still seems like it would hold different implications compared to simply "proactive suppression". This would be something like search and destroy however could be on the location level before the actual onset of the search display.  

      R1 correctly points out that BOLD data, given the poor temporal resolution, do not allow for the detection of potential transient enhancements at the HPDL followed by a later and more pronounced suppression (akin to “search and destroy”). We fully agree with this assessment. However, we also argue that a transient enhancement followed by sustained suppression before search display onset constitutes proactive suppression in line with our interpretation, because suppression would still arise proactively (i.e., before search, and hence distractor, onset). Whether transient enhancement precedes suppression cannot be elucidated by our data, but we believe that it constitutes an interesting avenue for future studies using time-resolved and spatially specific recording methods. We now clarify this important implementational variation in the updated manuscript.

      “Finally, due to the limited temporal resolution of BOLD data, the present data do not elucidate whether the present suppression is preceded by a brief attentional enhancement of the HPDL, as implied by some prior work (Huang et al., 2024). On this account the HPDL would see transient enhancement, followed by sustained suppression, akin to a ‘search and destroy’ mechanism. Critically, we believe that this variation would nonetheless constitute proactive distractor suppression as the suppression would still arise before search onset. Using temporally and spatially resolved methods to explore potential transient enhancements preceding suppression is a promising avenue for future research charting the neural mechanisms underlying distractor suppression.”

      (2) I was also considering whether your effects might be at least partially attributable to priming type effects. This would be on the spatial (not feature) level as it is clear that the distractors are switching colours. Basically, is it possible that on trial n participants see the HPDL with the distractor in it and then on trial n+1 they suppress that location. This would be something distinct from the statistical learning framework and from the repetition suppression discussion you have already included. To test for this, you could look at the trials that follow omission or trials. If there is no suppression or less suppression on these trials it would seem fair to conclude that the suppression is at least in part due to the previous trial. 

      We agree with the reviewer that it is plausible that participants particularly suppress locations which on previous trials contained a distractor. To address this possibility, we conducted a new analysis and adjusted the manuscript accordingly:

      “Second, participants may have suppressed locations that contained the distractor on the previous trial, reflecting a spatial priming effect. This account constitutes a complementary but different perspective than statistical learning, which integrates implicit prior knowledge across many trials. We ruled out that spatial priming explains the present results by contrasting BOLD suppression magnitudes on trials with the distractor at the HPDL and trials where the distractor was not at the HPDL on the previous trial. Results, depicted in Supplementary Figure 4 showed that distractor suppression was statistically significant across both trial types, including trials without a distractor at the HPDL on the preceding trial. This indicates that the observed BOLD suppression is unlikely to be driven by priming and is instead more consistent with statistical learning. Moreover, results did not yield a statistically significant difference between trial types based on the distractor location in the preceding trial. However, these results should not be taken to suggest that spatial priming cannot contribute to distractor suppression; for details see: Supplementary Figure 4.” (p. 13).

      We note that this analysis approach slightly differs from the reviewer’s suggestion, which considered omission trials. However, we decided to exclude trials immediately following an omission to ensure that both conditions were matched as closely as possible. In particular, omission trials represent extended rest periods, which could alter participants’ state and especially modulate the visually evoked BOLD responses (e.g., potentially increasing the dynamic range) compared to trials that did not follow omissions. Our analysis approach avoids this difference while still addressing the hypothesis put forward by the reviewer. We now provide the full explanation and results figure of this priming analysis in the figure text of Supplementary Figure 4: 

      Reviewer #2 (Public review): 

      The authors of this work set out to test ideas about how observers learn to ignore irrelevant visual information. Specifically, they used fMRI to scan participants who performed a visual search task. The task was designed in such a way that highly salient but irrelevant search items were more likely to appear at a given spatial location. With a region-of-interest approach, the authors found that activity in visual cortex that selectively responds to that location was generally suppressed, in response to all stimuli (search targets, salient distractors, or neutral items), as well as in the absence of an anticipated stimulus. 

      Strengths of the study include: A well-written and well-argued manuscript; clever application of a region of interest approach to fMRI design, which allows articulating clear tests of different hypotheses; careful application of follow-up analyses to rule out alternative, strategy-based accounts of the findings; tests of the robustness of the findings to detailed analysis parameters such as ROI size; and exclusion of the role of regional baseline differences in BOLD responses. 

      We thank the reviewer for the positive evaluation of our manuscript.

      The report might be enhanced by analyses (perhaps in a surface space) that distinguish amongst the multiple "early" retinotopic visual areas that are analysed in the aggregate here. 

      We agree with the reviewer that an exploratory analysis separating early visual cortex (EVC) into its retinotopic areas could be an interesting addition. Our reasoning to combine early visual areas into one mask in the original analyses was two-fold: First, we did not have an a priori reason to expect distinct neural suppression between these early ROIs. Therefore, we did not acquire retinotopy data to reliably separate early visual areas (e.g. V1, V2 and V3), instead opting to increase the number of search task trials. The lack of retinotopy data inherently limits the reliability of the resulting cortical segmentation. However, we have now performed an analysis separating early visual cortex into V1 and V2 and report the details as Supplementary Text 1:

      “In an exploratory analysis we investigated whether subdivisions of EVC exhibit different representations of priority signals. In brief, we used FreeSurfer to reconstruct brain surfaces (recon-all) from each subject’s anatomical scan. From these reconstructions we derived V1_exvivo and V2_exvivo labels, which were transformed into volume space using ‘mri_label2vol’ and merged into a bilateral mask for each ROI. We then selected the voxels within each ROI that were most responsive to the four stimulus locations, based on independent localizer data. This voxel selection followed the procedure outlined in the Materials and Methods: Region of Interest (ROI) Definition. To accommodate the subdivision into two ROIs (V1 and V2) compared to the single EVC ROI in the main analysis, we halved the number of voxels selected per location. Finally, we applied the same ROI analysis to investigate distractor suppression during search and omission trials, following the procedure described in Materials and Methods: Statistical Analysis. 

      Results of this more fine-grained ROI analysis are depicted in Supplementary Figure 1. First, the results from V2 qualitatively mirrored our primary ROI analysis. BOLD responses in V2 differed significantly between stimulus types (main effect of stimulus type: F<sub>(2,54)</sub> = 31.11, p < 0.001, 𝜂 = 0.54). Targets elicited larger BOLD responses compared to distractors (t<sub>(27)</sub> = 3.05, p<sub>holm</sub> = 0.004, d = 0.06) and neutral stimuli (t<sub>(27)</sub> = 7.82, p<sub>holm</sub> < 0.001, d = 0.14). Distractors also evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 4.78, p<sub>holm</sub> < 0.001, d = 0.09). These results likely reflect top-down modulation due to target relevance and bottom-up effects of distractor salience. Consistent with the primary ROI analysis, the manipulation of distractor predictability showed a distinct pattern of location specific BOLD suppression in V2 (main effect of location: F<sub>(1.1,52.8)</sub> = 5.01, p = 0.030, 𝜂 = 0.16). Neural populations with receptive fields at the HPDL showed significantly reduced BOLD responses compared to the diagonally opposite neutral location (NL-far; post hoc test HPDL vs NL-far: t<sub>(27)</sub> = 2.69, p<sub>holm</sub> = 0.022, d = 0.62). Again, this suppression was not confined to the HPDL but also extended to close by neutral locations (NL-near vs NL-far: t<sub>(27)</sub> = 2.79, p<sub>holm</sub> = 0.022, d = 0.65). BOLD responses did not differ between HPDL and NL-near locations (HPDL vs NL-near: t<sub>(27)</sub> = 0.11, p<sub>holm</sub> = 0.915, d = 0.03; BF<sub>10</sub> = 0.13). As in the EVC ROI analysis, this suppression pattern was consistent across distractor, target, and neutral stimuli presented at the HPDL and NL-near locations compared to NL-far. 
In sum, neural responses in V2 were significantly modulated by the distractor contingencies, evident as reduced BOLD responses in neural populations with receptive fields at the HPDL and neutral locations near the location of the frequent distractor (NL-near), relative to the neutral location diagonally across the HPDL (NL-far). 

      In V1, BOLD responses also differed significantly between stimulus types (main effect of stimulus type: F<sub>(1.3,35.6)</sub> = 6.69, p = 0.009, 𝜂 = 0.20). Targets elicited larger BOLD responses compared to neutral stimuli (t<sub>(27)</sub> = 3.52, p<sub>holm</sub> = 0.003, d = 0.12) and distractors evoked larger responses than neutral stimuli (t<sub>(27)</sub> = 2.62, p<sub>holm</sub> = 0.023, d = 0.09). However, no difference between targets and distractors was observed (t<sub>(27)</sub> = 0.90, p<sub>holm</sub> = 0.375, d = 0.03; BF<sub>10</sub> = 0.17), suggesting reduced sensitivity to task-related effects in V1. Indeed, analyzing the effect of distractor predictability for BOLD responses in V1 showed a different result than in V2 and the combined EVC ROI. There was no significant main effect of location (F<sub>(2,54)</sub> = 2.20, p = 0.120, 𝜂 = 0.08; BF<sub>10</sub> = 0.77). BOLD responses at NL-near and NL-far were similar (BF<sub>10</sub> = 0.171), with the only reliable difference found between target stimuli at the HPDL and NL-far locations (W = 94, p<sub>holm</sub> = 0.012, r = 0.54).”  

      We include the new result figure as Supplementary Figure 5.

      We now include reference to these results in the manuscript’s Discussion section:

      “Are representations of priority signals uniform across EVC? A priori we did not have any hypotheses regarding distinct neural suppression profiles across different early visual areas, hence our primary analyses focused on stimulus responses of neural populations in EVC, irrespective of subdivision. However, an exploratory analysis suggests that distractor suppression may show different patterns in V1 compared to V2 (Supplementary Figure 5 and Supplementary Text 1). In brief, results in V2 mirrored those reported for the combined EVC ROI (Figure 4). In contrast, results in V1 appeared to be only partially modulated by distractor contingencies, and if so, the modulation was less robust and not as spatially broad as in V2. This suggests the possibility of different effects of distractor predictability across subdivisions of early visual areas. However, these results should be interpreted with caution. First, our design did not optimize the delineation of early visual areas (e.g., no functional retinotopy), limiting the accuracy of V1 and V2 segmentation. Additionally, analyses were conducted in volumetric space, which further reduces spatial precision. Future studies could improve this by including retinotopy runs to accurately delineate V1, V2, and V3, and by performing analyses in surface space. Higher-resolution functional and anatomical MRI sequences would also help elucidate how distractor suppression is implemented across EVC with greater precision.”

      Furthermore, the study could benefit from an analysis that tests the correlation over observers between the magnitude of their behavioural effects and their neural responses. 

      R2 highlights that behavioral facilitation and neural suppression could be correlated across participants. The rationale is that if neural suppression in EVC is related to the facilitation of behavioral responses, we should expect a positive relationship between neural suppression at the HPDL and RTs across participants. In this analysis we focused on the contrast between HPDL and NL-far, as this contrast was statistically significant in both the RT (Figure 2) and the neural suppression analysis (Figure 4). First, we computed for each participant the behavioural benefit of distractor suppression as: RT<sub>facilitation</sub> = RT<sub>NL-far</sub> – RT<sub>HPDL</sub>. Thereby RT facilitation reflects the response speeding due to a distractor appearing at the high probability distractor location compared to the far neutral location. Next, we computed neural suppression as: BOLD<sub>suppression</sub> = BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>. Thus, positive values reflect the suppression of BOLD responses at the HPDL compared to the NL-far location. The BOLD suppression index was computed for each stimulus type separately, as in the main ROI analysis (i.e. for Targets, Neutrals and Distractors). Finally, we correlated RT<sub>facilitation</sub> with BOLD<sub>suppression</sub> across participants using Pearson correlation. Results showed a small, but not statistically significant correlation between RT facilitation and BOLD suppression for distractor (r<sub>(26)</sub> = 0.22, p = 0.257), target (r<sub>(26)</sub> = 0.10, p = 0.598) and neutral (r<sub>(26)</sub> = 0.13, p = 0.519) stimuli. Thus, while the direction of the correlation was in line with the speculation by the reviewer in the “Recommendations for the authors”, results were not statistically reliable and therefore inconclusive. As also noted in our preliminary reply to the reviewer comments, it was a priori unlikely that this analysis would yield a statistically significant correlation. 
An a priori power analysis suggested that, to reach a power of 0.8 at a standard alpha of 0.05, given the present sample size of n=28, the effect size would need to exceed r = 0.75, which seemed unlikely for the correlation of behavioural and neural difference scores. Given the inconclusive nature of the results, we prefer not to include this additional analysis in the manuscript, as we believe that it does not add to the main message of the paper, but it remains accessible to the interested reader in the public “peer review process”.
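
The difference scores and across-participant correlation described in this reply can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function names and the example numbers are hypothetical, and only the two formulas (RT facilitation = RT at NL-far minus RT at HPDL; BOLD suppression = BOLD at NL-far minus BOLD at HPDL) and the use of a Pearson correlation are taken from the text.

```python
from math import sqrt


def difference_scores(nl_far, hpdl):
    """Per-participant NL-far minus HPDL difference (used for both RT and BOLD)."""
    if len(nl_far) != len(hpdl):
        raise ValueError("both conditions need one value per participant")
    return [far - near for far, near in zip(nl_far, hpdl)]


def pearson_r(x, y):
    """Pearson correlation coefficient across participants (df = n - 2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired samples from at least 3 participants")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical values for four participants, for illustration only.
    rt_nl_far = [712.0, 655.0, 690.0, 730.0]   # mean RT (ms), distractor at NL-far
    rt_hpdl = [690.0, 650.0, 660.0, 715.0]     # mean RT (ms), distractor at HPDL
    bold_nl_far = [1.10, 0.95, 1.20, 1.05]     # parameter estimates at NL-far
    bold_hpdl = [0.90, 0.93, 0.95, 1.00]       # parameter estimates at HPDL

    rt_facilitation = difference_scores(rt_nl_far, rt_hpdl)
    bold_suppression = difference_scores(bold_nl_far, bold_hpdl)
    print(pearson_r(rt_facilitation, bold_suppression))
```

With the study's n = 28, the correlation has 26 degrees of freedom, which matches the r<sub>(26)</sub> values reported above.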

      The study provides an advance over previous studies, which identified enhancement or suppression in visual cortex as a function of search target/distractor predictability, but in a less spatially specific way. It also speaks to open questions about whether such suppression/enhancement is observed only in response to the arrival of visual information, or instead is preparatory, favouring the latter view. The theoretical advance is moderate, in that it is largely congruent with previous frameworks, rather than strongly excluding an opposing view or providing a major step change in our understanding of how distractor suppression unfolds. 

      We agree with the reviewer that our results are an advancement of prior work, particularly with respect to narrowing down the role of sensory areas and the proactive nature of distractor suppression. However, we argue that this represents a significant step forward for several reasons. First, to our knowledge, the literature on distractor suppression, and visual search in general, is by no means unanimous with respect to the conclusion that distractor suppression is instantiated proactively (Huang et al., 2021, 2022). Indeed, there are several studies suggesting the opposite account: reactive suppression (Chang et al., 2023) or contributions by both proactive and reactive mechanisms (Sauter et al., 2021; Wang et al., 2019). Moreover, studies in support of proactive distractor suppression did not investigate the involvement of (early) sensory areas during suppression. Conversely, to our knowledge most studies investigating the involvement of sensory cortex during distractor suppression did not address the question whether suppression arises proactively or reactively.

      Recommendations for the authors: 

      Reviewer #1 (Recommendations for the authors): 

      Minor Points: 

      (1) There are several disconnects between the behaviour and the MR results - i.e. not stimulus specific yet there are no deficits for targets appearing at the HPDL, also no behavioural suppression for the NL-near but neural suppression found. Nevertheless, the behaviour is used as a way to rule out potential attentional strategies when considering whether there is enhancement in the NL-far condition. I realise you have a few other points here, but I think it's worth addressing what could be seen as a double standard.

      The reviewer points out an important concern, which we feel could have better been addressed in the manuscript. From our point of view a partial dissociation between neural modulations in EVC and eventual behavioural facilitation is not surprising, given the extensive neural processing beyond EVC required for behaviour. However, this assessment may differ, if one stresses an explicit volitional attentional strategy over an implicit statistical learning account. That said, we clearly do not want to create the impression of using a double standard. The lack of behavioural facilitation for targets at NL-far is not a critical part of our argument against explicit attentional strategies. Therefore, we rephrased the relevant paragraph in the Discussion section to now emphasize the importance of the control analysis excluding participants who reported the correct HPDL in the questionnaire (Figure 5), but nonetheless yielded qualitatively identical results to the main ROI analysis (Figure 4). In our opinion, this control analysis provides more compelling evidence against a volitional attentional strategy account without the risk of creating the impression of applying a double standard in the interpretation of behavioural data. Additionally, we now acknowledge the limitation of relying on behavioral data in ruling out volitional attentional strategies in the updated manuscript:

      “It is well established that attention enhances BOLD responses in visual cortex (Maunsell, 2015; Reynolds & Chelazzi, 2004; Williford & Maunsell, 2006). If participants learned the underlying distractor contingencies, they could deploy an explicit strategy by directing their attention away from the HPDL, for example by focusing attention on the diagonally opposite neutral location. This account provides an alternative explanation for the observed EVC modulations. However, while credible, the current findings are not consistent with such an interpretation. First, there was no behavioral facilitation for target stimuli presented at the far neutral location, contrary to what one might expect if participants employed an explicit strategy. However, given the partial dissociation between neural suppression in EVC and behavioral facilitation, additional neural data analyses are required to rule out volitional attention strategies. Thus, we performed a control analysis that excluded all participants that indicated the correct HPDL location in the questionnaire, thereby possibly expressing explicit awareness of the contingencies. This control analysis yielded qualitatively identical results to the full sample, showing significant distractor suppression in EVC. Therefore, it is unlikely that explicit attentional strategies, and the enhancement of locations far from the HPDL, drive the results observed here. Instead the current findings are consistent with an account emphasizing the automatic deployment of spatial priors (He et al., 2022) based on implicitly learned statistical regularities.”

      (2) Does the level of suppression change in any way through the experiment? I.e., does it get stronger in the second vs. first half of the experiment? 

      The reviewer asks an interesting question, whether BOLD suppression may change across the experiment. To address this question, we performed an additional analysis testing BOLD suppression in EVC during the first compared to second half of the MRI experiment. Here we defined BOLD suppression as: BOLD<sub>suppression</sub> = ((BOLD<sub>NL-far</sub> – BOLD<sub>HPDL</sub>) + (BOLD<sub>NL-far</sub> – BOLD<sub>NL-near</sub>)) / 2. Thus, in this formulation of BOLD suppression we summarize the two primary BOLD suppression effects observed in our main results (Figure 4). Additionally, as we previously did not observe any significant differences in BOLD suppression magnitudes between different stimulus types (i.e. suppression was similar for target, distractor and neutral stimuli), we collapsed across stimulus types in this analysis.
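
The combined suppression index defined above can be written out as a small helper. This is a sketch for illustration only: the function name and the example values are hypothetical, and only the formula itself comes from the reply.

```python
def combined_bold_suppression(nl_far, hpdl, nl_near):
    """Average of the two suppression contrasts from the reply:
    (NL-far - HPDL) and (NL-far - NL-near).

    Algebraically this equals NL-far minus the mean of HPDL and NL-near,
    so positive values mean lower BOLD at and near the HPDL than at NL-far.
    """
    return ((nl_far - hpdl) + (nl_far - nl_near)) / 2


if __name__ == "__main__":
    # Hypothetical parameter estimates for one participant in one half of the experiment.
    print(combined_bold_suppression(nl_far=1.0, hpdl=0.6, nl_near=0.8))
```

Computing this index once for the early runs and once for the late runs gives the two per-participant values that a paired comparison would then contrast.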

      Results, depicted below, showed that during both the initial (Run 1+2) and later part (Run 4+5) of the MRI experiment BOLD suppression was statistically significant (BOLD suppression Run 1+2: W = 331, p = 0.003, r = 0.63; BOLD suppression Run 4+5: W = 320, p = 0.007, r = 0.58), confirming our main results of reliable distractor suppression even in this subset of trials. However, we did not observe any statistically significant differences between early and late runs of the experiment (t<sub>(27)</sub> = -0.21, p = 0.835, d = -0.04). In fact, a Bayesian paired t-test provided evidence for the absence of a difference in BOLD suppression between early compared to later runs (BF<sub>10</sub> = 0.205), suggesting that distractor suppression in EVC was stable throughout the experiment. A qualitatively similar pattern was evident during omission trials, with significant distractor suppression during early runs (t<sub>(27)</sub> = 2.70, p = 0.012, d = 0.51), but not quite a statistically significant modulation for later runs (t<sub>(27)</sub> = 1.97, p = 0.059, d = 0.37). Again, there was no evidence for a difference in suppression magnitudes across the experiment (W = 198, p = 0.920, d = -0.025) and support for the absence of a difference in BOLD suppression between early and late runs (BF<sub>10</sub> = 0.278).

      Author response image 1.

      Analysis of BOLD suppression magnitudes in EVC across the MRI experiment phases. BOLD suppression was comparable between early (Run 1+2) and late (Run 4+5) phases of the MRI experiment, suggesting consistent suppression in EVC following statistical learning. Error-bars denote within-subject SEM. * p < 0.05, ** p < 0.01, = BF<sub>10</sub> < 1/3.

      In sum, results suggest that distractor suppression in EVC was stable across runs and did not change significantly throughout the experiment. This result was a priori likely, given that participants already underwent behavioral training before entering the MRI. This enabled them to establish modified spatial priority maps, containing the high probability distractor location contingencies, already before the first MRI run. While speculative, it is possible that participants may still have consolidated the spatial priority maps during the initial runs, but that this additional consolidation is not evident in the data, as later runs may see less engagement by participants due to increasing fatigue towards the end of the MRI experiment. Indeed, rapid learning and stable suppression throughout the remainder of the experiment is also reported by prior work (Lin et al., 2021). We believe that it is highly interesting for future studies to investigate the development of distractor suppression across learning, with initial exposure to the contingencies inside the MRI. However, as the present results are inconclusive, we prefer to not include this analysis in the main manuscript, as it may not provide significant additional insight into the neural mechanisms underlying distractor suppression. 

      (3) In the methods vs. results you have reported the probabilities slightly differently. In the methods you say the HPDL was 6x more likely to contain a distractor whereas in the results you say 4x. Based on the reported trial numbers I think it should be 4, but probably you want to double check that this is consistent and correct throughout. 

      We thank the reviewer for bringing this inconsistency to our attention. We have corrected this oversight in the adjusted manuscript: 

      “One of the four locations of interest was designated the high probability distractor location (HPDL), which contained distractor stimuli (unique color) four times more often than any of the remaining three locations of interest. In other words, if a distractor was present on a given trial (42 trials per run), the distractor appeared 57% (24 trials per run) at the HPDL and at one of the other three locations with equal probability (i.e., 14% or 6 trials per run per location).” 
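
The trial counts in the corrected passage can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (42 distractor-present trials per run, 24 at the HPDL, 6 at each of the three other locations); the function and variable names are hypothetical.

```python
def distractor_probabilities(hpdl_trials=24, other_trials_each=6, n_other_locations=3):
    """Return (P(HPDL), P(each other location), HPDL-to-other ratio)
    for distractor-present trials within one run."""
    total = hpdl_trials + other_trials_each * n_other_locations
    p_hpdl = hpdl_trials / total
    p_other = other_trials_each / total
    ratio = hpdl_trials / other_trials_each
    return p_hpdl, p_other, ratio


if __name__ == "__main__":
    p_hpdl, p_other, ratio = distractor_probabilities()
    # Percentages are rounded to whole numbers, as in the quoted passage.
    print(f"HPDL: {p_hpdl:.0%}, each other location: {p_other:.0%}, ratio: {ratio:g}x")
```

The counts sum to 42 and give a ratio of 4, consistent with the reviewer's reading that the correct figure is 4x rather than 6x.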

      Reviewer #2 (Recommendations for the authors): 

      The authors have performed their analyses in the volume rather than the surface, and have grouped together V1, V2, and V3 as "early visual cortex". As the authors' claims lean heavily on the idea that they are measuring "early" visual responses, the study would be improved by delineating the ROIs within these different retinotopic regions. Such an approach might be facilitated by analysing data on the reconstructed surface. 

      Please refer to our reply to this analysis suggested in the Public review.

      The authors rightly tread carefully on the causal link between their neural findings and the behavioural outcomes. The picture might be clarified somewhat further by testing for a positive relationship between behavioural effect sizes and neural effect sizes across participants, e.g. to what extent is the search advantage when distractors are presented at the "HPDL" linked to greater suppression of BOLD at the HPDL region of early visual cortex? 

      Please refer to our reply to this analysis suggested in the Public review.

      Some of the claims based on null hypotheses would be better supported by Bayesian tests e.g. page 6 "This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far ..." and "BOLD responses between HPDL and NL-near locations did not reliably differ ..." This is similar to the approach that the authors adopted later in the section "Ruling out attentional modulation".

      We agree with the reviewer that our ROI analyses would benefit from providing evidence for the absence of a modulation. Accordingly, we updated our results by adding equivalent Bayesian tests. Bayes Factors were computed using JASP 0.18.2 (JASP Team, 2024; RRID:SCR_015823) with default settings; i.e. for Bayesian paired t-tests with a Cauchy prior width of 0.707. Qualitative interpretations of BFs were based on Lee and Wagenmakers (2014). We now report the obtained BF in the Results section. 

      “BOLD responses between HPDL and NL-near locations did not reliably differ (HPDL vs NL-near: t<sub>(27)</sub> = 0.47, p<sub>holm</sub> = 0.643, d = 0.08; BF<sub>10</sub> = 0.19).”

      And:

      “Neural responses at HPDL and NL-near did not reliably differ (t<sub>(27)</sub> = 0.21, p<sub>holm</sub> = 0.835, d = 0.04; BF<sub>10</sub> = 0.21).”

      Moreover, we now denote any equivalent results (defined as BF<sub>10</sub><1/3) in Fig. 4 and Fig. 5, and included the description of the associated symbol in the figure text (“ = BF<sub>10</sub> < 1/3”).

      Additionally, we now also report the BF for all paired t-tests reported in Supplementary Table 1.

      Finally, we addressed the statement: “This pattern of results was the same regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”. Our intention was to emphasize that the pattern of results reported in the sentence preceding it was evident for distractor, target, or neutral stimulus, and not to suggest that the magnitude of the effect is the same. Hence, to more accurately reflect the results, we changed this sentence to: “This pattern of results was present regardless whether the distractor, target, or a neutral stimulus presented at the HPDL and NL-near locations compared to NL-far”

    1. eLife Assessment

      This valuable work presents how PRDM16 plays a critical role during choroid plexus development, through regulating BMP signaling. Solid evidence supports the context-dependent gene regulatory mechanisms both in vivo and in vitro. The work will be of broad interest to researchers working on growth factor signaling mechanisms and vertebrate development.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      Main weaknesses of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.

      (2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)

      Other experimental weaknesses that make the evidence less convincing:

      (1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure 1C?

      (2) Figure 3: The authors use H3K4me3 to measure gene activity. This is, however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.

      (3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.

      (4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.

      (5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.

      (6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.

      (7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.

      Weaknesses of the presentation:

      Overall, the manuscript is not easy to read. This can cause confusion.

    3. Reviewer #2 (Public review):

      Summary:

      This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.

      The key findings of the study are:

      (1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.

      (2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.

      (3) BMP signaling and PRDM16 cooperatively repress proliferation genes.

      (4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.

      (5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.

      (6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.

      (7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.

      In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.

      Strengths:

      (1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.

      (2) Use of a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.

      (3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.

      (4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.

      (5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.

      Weaknesses:

      (1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.

      (2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.

      (3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.

    4. Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (consisting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.

      Weaknesses:

      I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      Main weaknesses of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.

      We acknowledge that our in vitro experiments may not ideally replicate the in vivo situation, a common limitation of such experiments. Our primary aim was to explore the molecular relationship between PRDM16 and BMP signaling in gene regulation. Such molecular investigations are challenging to conduct using in vivo tissues. In vitro NSCs treated with BMP4 have been used as a model to investigate NSC proliferation and quiescence in previous studies (e.g., Helena Mira, 2010; Marlen Knobloch, 2017). Crucially, to ensure the relevance of our in vitro findings to the in vivo context, we confirmed that cultured cells could indeed be induced into quiescence by BMP4, and that this induction required PRDM16. Furthermore, upon identifying target genes co-regulated by PRDM16 and SMADs, we validated PRDM16's regulatory role on a subset of these genes in the developing choroid plexus (ChP) (Fig. 7 and Suppl. Fig. 7-8). Only by combining evidence from both in vitro and in vivo experiments could we confidently conclude that PRDM16 serves as an essential co-factor for BMP signaling in restricting NSC proliferation.

      (2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)

      We agree that Prdm16 KO cells carrying the Prdm16-expressing vector would be a good comparison with those carrying the KO_vector. However, despite more than 10 attempts under various optimization conditions, we were unable to establish a viable cell line after infecting Prdm16 KO cells with the Prdm16-expressing vector. The overall survival rate for primary NSCs after viral infection is low, and we observed that KO cells were particularly sensitive to infection when the viral vector was large (the Prdm16 ORF is more than 3 kb).

      As an alternative way to assess vector effects, we included two other control cell lines, wt and KO cells infected with the 3xNLS_Flag-tag viral vector, and presented the results in Supplementary Fig. 2. When we compared the responses of the four lines (wt, KO, wt infected with the Flag vector, and KO infected with the Flag vector) to the addition and removal of BMP4, we confirmed that the viral infection itself has no significant impact on the responses of these cells to these treatments in terms of cell proliferation and Ttr induction.

      Given that wt cells and KO cells, with or without viral backbone infection, behave quite similarly in terms of cell proliferation, we speculate that even if we had succeeded in obtaining a KO cell line carrying the Prdm16-expressing vector, it might not differ substantially from wt cells infected with the Prdm16-expressing vector.

      Other experimental weaknesses that make the evidence less convincing:

      (1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure 1C?

      The reviewer’s point is that there was no significant increase in Ttr expression in Prdm16_KO cells after BMP4 treatment (Fig. 2E), but residual Ttr mRNA signal remained in the Prdm16 mutant ChP (Fig. 1C). We think the difference lies in the measurable level of Ttr expression induced by BMP4 in NSC culture versus that in the ChP. This is based on our immunostaining experiment, in which we tried to detect Ttr using a Ttr antibody. This antibody could not detect the Ttr protein in BMP4-treated Prdm16_expressing NSCs but clearly showed Ttr signal in the wt ChP. This means that although Ttr expression can be significantly increased by BMP4 in vitro to a level measurable by RT-qPCR, its absolute quantity even in the Prdm16_expressing condition is much lower than that in vivo. Our results in Fig. 1C and Fig. 2E, as well as Fig. 7B, all consistently showed that Prdm16 depletion significantly reduced Ttr expression in vitro and in vivo.

      (2) Figure 3: The authors use H3K4me3 to measure gene activity. This is however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.

      H3K4me3 has been widely used as an indicator of active transcription and is a mark for cell identity genes. It has also been demonstrated that H3K4me3 has a direct function in regulating transcription at the step of RNA Pol II pause release. As stated in the text, there are advantages and disadvantages of using H3K4me3 compared to RNA-seq. RNA-seq profiles all gene products, which are affected by transcription as well as RNA stability and turnover. In contrast, H3K4me3 levels at gene promoters reflect transcriptional activity. In our case, we aimed to identify differential gene expression between proliferation and quiescence states. The transition between these two states is fast and dynamic. RNA-seq may not identify functionally relevant genes and is more likely to produce false positive and false negative results. Therefore, we chose H3K4me3 profiling.

      We agree that transcription may change without histone methylation changes. This may cause an under-estimation of the number of changed genes between the conditions. 

      We validated 7 out of 31 genes (Wnt7b, Id3, Mybl2, Spc24, Spc25, Ndc80 and Nuf2). We chose these genes based on two criteria: 1) their function is implicated in cell proliferation and cell-cycle regulation based on gene ontology analysis; 2) their gene products are detectable in the developing ChP based on the scRNA-seq data. Three of these genes (Wnt7b, Id3, Mybl2) are not related to the kinetochore. We now clarify this description in the revised text.

      (3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.

      This indicates that, in addition to co-repressing cell-cycle genes, BMP and PRDM16 have independent functions. For example, it was reported that BMP regulates neuronal and astrocyte differentiation (Katada, S. 2021), while our previous work demonstrated that Prdm16 controls the temporal identity of NSCs (He, L. 2021).
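
      To put the size of this overlap in context, one option is to compare it with the overlap expected by chance under a hypergeometric model. The sketch below is illustrative only: the background size (the number of genes actually assessed for H3K4me3 changes) is not stated here, so `BACKGROUND_PLACEHOLDER` is a hypothetical value that would need to be replaced, and the function names are illustrative rather than taken from the manuscript.

```python
from math import comb


def expected_overlap(n_background: int, n_set_a: int, n_set_b: int) -> float:
    """Mean overlap of two gene sets drawn independently from the background."""
    return n_set_a * n_set_b / n_background


def overlap_p_value(overlap: int, n_background: int, n_set_a: int, n_set_b: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap)."""
    max_overlap = min(n_set_a, n_set_b)
    total = comb(n_background, n_set_b)
    tail = sum(
        comb(n_set_a, k) * comb(n_background - n_set_a, n_set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Gene-set sizes come from the text; the background size is a HYPOTHETICAL
# placeholder and must be replaced with the number of genes actually tested.
BACKGROUND_PLACEHOLDER = 20_000
print("expected overlap:", expected_overlap(BACKGROUND_PLACEHOLDER, 184, 240))
print("P(overlap >= 31):", overlap_p_value(31, BACKGROUND_PLACEHOLDER, 184, 240))
```

      The result depends strongly on the chosen background, so the calculation is only meaningful once the real number of assessed genes is used.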

      (4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.

      The track in Fig. 3G shows the absolute H3K4me3 signal after mapping the sequencing reads to the genome and normalizing them to library size. Comparing the signal in Prdm16_E with BMP4 to that in Prdm16_E without BMP4, the one with BMP4 has a lower peak. The same trend can be seen for the pair of Prdm16_KO cells with or without BMP4. The heatmap in Fig. 3E shows the relative level of H3K4me3 in three conditions. The Prdm16_E cells with BMP4 have the lowest level, while the other two conditions (Prdm16_KO with BMP4 and Prdm16_E without BMP4) display a higher level. These two graphs show a consistent trend of H3K4me3 changes at the Wnt7b promoter across these conditions.
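
      The distinction between an absolute, library-size-normalized track and a relative heatmap can be illustrated with a minimal sketch. It assumes counts-per-million scaling for the track and per-gene z-scoring for the heatmap, which may differ from the exact procedures used for Fig. 3E and 3G. All read counts and library sizes below are toy numbers, not measurements.

```python
from statistics import mean, pstdev


def counts_per_million(read_count: float, library_size: float) -> float:
    """Scale a raw read count by library size (absolute, track-style signal)."""
    return read_count * 1_000_000 / library_size


def row_zscore(values: list[float]) -> list[float]:
    """Centre and scale one gene's values across conditions (heatmap-style signal)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# TOY (promoter reads, library size) pairs for three conditions; hypothetical values.
toy = {
    "Prdm16_E_noBMP4": (400, 20_000_000),
    "Prdm16_E_BMP4": (150, 15_000_000),
    "Prdm16_KO_BMP4": (500, 25_000_000),
}
absolute = {name: counts_per_million(reads, size) for name, (reads, size) in toy.items()}
relative = dict(zip(absolute, row_zscore(list(absolute.values()))))
print(absolute)
print(relative)
```

      Row scaling preserves the ordering of conditions within a gene, so a condition that is lowest in the absolute track is also lowest in the relative heatmap, even though the displayed magnitudes differ.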

      (5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.

      In our text, we indicated the genes containing PRDM16 binding peaks in the figures and described them as “Text in black in Fig. 6A and Supplementary Fig. 5A”. We will add the precise number, “25 of these genes”, in the main text to clarify this. To define a negative control set of genes, we will use the 184 - 31 = 153 BMP-only repressed genes (excluding the PRDM16-BMP4 co-repressed genes) and determine how many of these 153 genes have PRDM16 peaks in the E12.5 ChP data, say X. We will then use a binomial test to calculate the p-value: binom_test(25, 31, X/153, alternative="greater").
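
      The proposed one-sided binomial test can be sketched with the standard library alone. This is a minimal illustration, not the analysis code: the function name is illustrative, and `X_PLACEHOLDER` is a hypothetical stand-in for the not-yet-determined number of control genes with PRDM16 peaks.

```python
from math import comb


def binom_test_greater(successes: int, trials: int, p_null: float) -> float:
    """One-sided binomial test: P(K >= successes) for K ~ Binomial(trials, p_null)."""
    if not 0.0 <= p_null <= 1.0:
        raise ValueError("p_null must lie in [0, 1]")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie in [0, trials]")
    return sum(
        comb(trials, k) * p_null**k * (1.0 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 25 of the 31 co-repressed genes carry a PRDM16 peak (from the text).
# X_PLACEHOLDER is HYPOTHETICAL: replace it with the observed number of the
# 153 BMP-only repressed genes that carry a PRDM16 peak.
X_PLACEHOLDER = 60
p_value = binom_test_greater(25, 31, X_PLACEHOLDER / 153)
print(f"one-sided p-value = {p_value:.3g}")
```

      The null proportion X/153 is itself an estimate from a finite control set, so a test that accounts for uncertainty in both groups (for example Fisher's exact test on the 2x2 table) could be considered as an alternative.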

      We are confused by the second part of the comment: “And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.” If the reviewer meant why we didn’t sequence the material from sequential ChIP or validate more target genes, the reason is the limited material. Sequential ChIP requires a large quantity of antibody and yields little material, barely sufficient for a few qPCR reactions after the second round of IP. This yield was far below the minimum required for library construction. The PRDM16 antibody was a gift, and the quantity we had was very limited. We made considerable efforts to optimize all available commercial antibodies in ChIP and Cut&Tag, but none of them worked.

      (6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.

      To address these questions, in the revised manuscript we will include an individual channel for Wnt2b and mark the boundaries. Because of space limitations in the main figures, we will also provide full-view images and examples of spot segmentation in supplementary figures.

      (7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.

      We agree that this result (the correlation between mKi67 and Axin2) alone only suggests that Wnt signaling is related to the proliferation defect in the Prdm16 mutant, and does not necessarily mean that Wnt is downstream of PRDM16 and BMP. Our conclusion is backed up by two additional lines of evidence: the Cut&Tag data, in which PRDM16 binds to regulatory regions of Wnt7b and Wnt3a, and the finding that BMP and PRDM16 co-repress Wnt7b in vitro.

      An ideal result would be that down-regulating Wnt signaling in the Prdm16 mutant rescues the Prdm16 mutant phenotype. Such an experiment is technically challenging. Wnt plays diverse and essential roles in NSC regulation, and one would need a cell-type- and stage-specific tool to down-regulate Wnt in the background of the Prdm16 mutation. Moreover, Wnt genes are not the only targets regulated by PRDM16 in these cells, and down-regulating Wnt may not be sufficient to rescue the phenotype.

      Weaknesses of the presentation:

      Overall, the manuscript is not easy to read. This can cause confusion.

      We will revise the text to improve the clarity.

      Reviewer #2 (Public review):

      Summary:

      This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.

      The key findings of the study are:

      (1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.

      (2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.

      (3) BMP signaling and PRDM16 cooperatively repress proliferation genes.

      (4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.

      (5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.

      (6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.

      (7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.

      In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.

      Strengths:

      (1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.

      (2) Use of a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.

      (3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.

      (4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.

      (5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.

      Weaknesses:

      (1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.

      While mechanisms controlling PRDM16 protein stability and nuclear localization in the developing brain are interesting, the scope of this paper is to reveal the function of PRDM16 in the choroid plexus and its interaction with BMP signaling. We will be happy to pursue this direction in our next study.

      (2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.

      As stated above, we acknowledge that findings from cultured NSCs may not directly apply to ChP cells in vivo. We are cautious with our statements. The cell culture work was aimed at identifying potential mechanisms by which PRDM16 and SMADs interact to regulate gene expression, and target genes co-regulated by these factors. We expect that not all targets from cell culture are regulated by PRDM16 and SMADs in the ChP, so we validated expression changes of several target genes in the developing ChP and have now included the new data in Fig. 7 and Supplementary Fig. 7. Out of the 31 genes identified from cultured cells, four cell cycle regulators, Wnt7b, Id3, Spc24/25/Nuf2 and Mybl2, showed de-repression in the Prdm16 mutant ChP. These genes may be relevant downstream genes in the ChP, whereas other target genes may be cortical NSC-specific or less dependent on Prdm16 in vivo.

      (3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.

      We agree that the finding of the AP1 motif enriched at the PRDM16 and SMAD co-binding regions in Prdm16 KO cells only indirectly suggests AP1 as a co-factor for SMAD relocation. That is why we used ChIP-qPCR to examine the presence of c-FOS at these sites. Although we only validated two targets, the result confirms that c-FOS binds to the sites only in the Prdm16 KO cells but not in Prdm16_expressing cells, suggesting that AP1 is a co-factor. Our results cannot rule out the presence of other co-factors.

      Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (consisting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.

      Weaknesses:

      I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).

      We will revise the discussion according to the suggestions.

    1. eLife Assessment

      The authors utilize a valuable computational approach to exploring the mechanisms of memory-dependent klinotaxis, with a hypothesis that is both plausible and testable. Although they provide a solid hypothesis of circuit function based on an established model, the model's lack of integration of newer experimental findings, its reliance on predefined synaptic states, and oversimplified sensory dynamics, make the investigation incomplete for both memory and internal-state modulation of taxis.

    2. Reviewer #1 (Public review):

      Summary:

      This research focuses on C. elegans klinotaxis, a chemotactic behavior characterized by gradual turning, aiming to uncover the neural circuit mechanism responsible for the context-dependent reversal of salt concentration preference. The phenomenon observed is that the preferred salt concentration depends on the difference between the pre-assay cultivation conditions and the current environmental salt levels.

      The authors propose that a synaptic-reversal plasticity mechanism at the primary sensory neuron, ASER, is critical for this memory- and context-dependent switching of preference. They build on prior findings regarding synaptic reversal between ASER and AIB, as well as the receptor composition of AIY neurons, to hypothesize that similar "plasticity" between ASER and AIY underpins salt preference behavior in klinotaxis. This plasticity differs conceptually from the classical one as it does not rely on any structural changes but rather synaptic transmission is modulated by the basal level of glutamate, and can switch from inhibitory to excitatory.

      To test this hypothesis, the study employs a previously established neuroanatomically grounded model [4] and demonstrates that reversing the ASER-AIY synapse sign in the model agent reproduces the observed reversal in salt preference. The model is parameterized using a computational search technique (evolutionary algorithm) to optimize unknown electrophysiological parameters for chemotaxis performance. Experimental validity is ensured by incorporating constraints derived from published findings, confirming the plausibility of the proposed mechanism.

      Finally, the circuit mechanism allowing C. elegans to switch behaviour to an exploration run when starved is also investigated. This extension highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      Strengths and weaknesses:

      The authors' approach of integrating prior knowledge of receptor composition and synaptic reversal with the repurposing of a published neuroanatomical model [4] is a significant strength. This methodology not only ensures biological plausibility but also leverages a solid, reproducible modeling foundation to explore and test novel hypotheses effectively.

      The evidence produced that the original model has been successfully reproduced is convincing.

      The writing of the manuscript needs revision as it makes comprehension difficult.

      One major weakness is that the model does not incorporate key findings that have emerged since the original model's publication in 2013, limiting the support for the proposed mechanism. In particular, ablation studies indicate that AIY is not critical for chemotaxis, and other interneurons may play partially overlapping roles in positive versus negative chemotaxis. These findings challenge the centrality of AIY and suggest the model oversimplifies the circuit involved in klinotaxis.

      Reference [1] also shows that ASER neurons exhibit complex, memory- and context-dependent responses, which are not accounted for in the model and may have a significant impact on chemotactic model behaviour.

      The hypothesis of synaptic reversal between ASER and AIY is not explicitly modeled in terms of receptor-specific dynamics or glutamate basal levels. Instead, the ASER-to-AIY connection is predefined as inhibitory or excitatory in separate models. This approach limits the model's ability to test the full range of mechanisms hypothesized to drive behavioral switching.

      While the main results - such as response dependence on step inputs at different phases of the oscillator - are consistent with those observed in chemotaxis models with explicit neural dynamics (e.g., Reference [2]), the lack of richer neural dynamics could overlook critical effects. For example, the authors highlight the influence of gap junctions on turning sensitivity but do not sufficiently analyze the underlying mechanisms driving these effects. The role of gap junctions in the model may be oversimplified because, as in the original model [4], the oscillator dynamics are not intrinsically generated by an oscillator circuit but are instead externally imposed via $z_\text{osc}$. This simplification should be carefully considered when interpreting the contributions of specific connections to network dynamics. Lastly, the complex and context-dependent responses of ASER [1] might interact with circuit dynamics in ways that are not captured by the current simplified implementation. These simplifications could limit the model's ability to account for the interplay between sensory encoding and motor responses in C. elegans chemotaxis.

      Appraisal:

      The authors show that their model can reproduce memory-dependent reversal of preference in klinotaxis, demonstrating that the ASER-to-AIY synapse plays a key role in switching chemotactic preferences. By switching the ASER-AIY connection from excitatory to inhibitory they indeed show that salt preference reverses. They also show that the curving/turn rate underlying the preference change is gradual and depends on the weight between ASER-AIY. They further support their claim by showing that curving rates also depend on the cultivation concentration (set-point).

      Thus within the constraints of the hypothesis and the framework, the model operates as expected and aligns with some experimental findings. However, significant omissions of key experimental evidence raise questions on whether the proposed neural mechanisms are sufficient for reversal in salt-preference chemotaxis.

      Previous work [1] has shown that individually ablating the AIZ or AIY interneurons has essentially no effect on the Chemotactic Index (CI) toward the set point ([1] Figure 6). Furthermore, in [1] the authors report that different postsynaptic neurons are required for movement above or below the set point. The manuscript should address how this evidence fits with their model by attempting similar ablations. It is possible that the CI is rescued by klinokinesis but this needs to be tested on an extension of this model to provide a more compelling argument.

      The investigation of dispersal behaviour in starved individuals is rather limited to testing by imposing inhibition of the SMB neurons. Although a circuit is proposed for how hunger states modulate taxis in the absence of food, this circuit hypothesis is not explicitly modelled to test the theory or provide novel insights.

      Impact:

      This research underscores the value of an embodied approach to understanding chemotaxis, addressing an important memory mechanism that enables adaptive behavior in the sensorimotor circuits supporting C. elegans chemotaxis. The principle of operation - the dependence of motor responses to sensory inputs on the phase of oscillation - appears to be a convergent solution to taxis. Similar mechanisms have been proposed in Drosophila larvae chemotaxis [2], zebrafish phototaxis [3], and other systems. Consequently, the proposed mechanism has broader implications for understanding how adaptive behaviors are embedded within sensorimotor systems and how experience shapes these circuits across species.

      Although the reported reversal of synaptic connection from excitatory to inhibitory is an exciting phenomenon of broad interest, it is not entirely new, as the authors acknowledge similar reversals have been reported in ASER-to-AIB signaling for klinokinesis (Hiroki et al., 2022). The proposed reversal of the ASER-to-AIY synaptic connection from inhibitory to excitatory is a novel contribution in the specific context of klinotaxis. While the ASER's role in gradient sensing and memory encoding has been previously identified, the current paper mechanistically models these processes, introducing a hypothesis for synaptic plasticity as the basis for bidirectional salt preference in klinotaxis.

      The research also highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      The methodology of parameter search on a neural model of a connectome used here yielded the valuable insight that connectome information alone does not provide enough constraints to reproduce the neural circuits for behaviour. It demonstrates that additional neurophysiological constraints are required.

      Additional Context

      Oscillators with stimulus-driven perturbations appear to be a convergent solution for taxis and navigation across species. Similar mechanisms have been studied in zebrafish phototaxis [3], Drosophila larvae chemotaxis [2], and have even been proposed to underlie search runs in ants. The modulation of taxis by context and memory is a ubiquitous requirement, with parallels across species. For example, Drosophila larvae modulate taxis based on current food availability and predicted rewards associated with odors, though the underlying mechanism remains elusive. The synaptic reversal mechanism highlighted in this study offers a compelling framework for understanding how taxis circuits integrate context-related memory retrieval more broadly.

      As a side note, an interesting difference emerges when comparing C. elegans and Drosophila larvae chemotaxis. In Drosophila larvae, oscillatory mechanisms are hypothesized to underlie all chemotactic reorientations, ranging from large turns to smaller directional biases (weathervaning). By contrast, in C. elegans, weathervaning and pirouettes are treated as distinct strategies, often attributed to separate neural mechanisms. This raises the possibility that their motor execution could share a common oscillator-based framework. Re-examining their overlap might reveal deeper insights into the neural principles underlying these maneuvers.

      (1) Luo, L., Wen, Q., Ren, J., Hendricks, M., Gershow, M., Qin, Y., Greenwood, J., Soucy, E.R., Klein, M., Smith-Parker, H.K., & Calvo, A.C. (2014). Dynamic encoding of perception, memory, and movement in a C. elegans chemotaxis circuit. Neuron, 82(5), 1115-1128.

      (2) Antoine Wystrach, Konstantinos Lagogiannis, Barbara Webb (2016) Continuous lateral oscillations as a core mechanism for taxis in Drosophila larvae eLife 5:e15504.

      (3) Wolf, S., Dubreuil, A.M., Bertoni, T. et al. Sensorimotor computation underlying phototaxis in zebrafish. Nat Commun 8, 651 (2017).

      (4) Izquierdo, E.J. and Beer, R.D., 2013. Connecting a connectome to behavior: an ensemble of neuroanatomical models of C. elegans klinotaxis. PLoS computational biology, 9(2), p.e1002890.

    3. Reviewer #2 (Public review):

      Summary:

      This study explores how a simple sensorimotor circuit in the nematode C. elegans enables it to navigate salt gradients based on past experiences. Using computational simulations and previously described neural connections, the study demonstrates how a single neuron, ASER, can change its signaling behavior in response to different salt conditions, with which the worm is able to "remember" prior environments and adjust its navigation toward "preferred" salinity accordingly.

      Strengths:

      The key novelty and strength of this paper is the explicit demonstration of computational neurobehavioral modeling and evolutionary algorithms to elucidate the synaptic plasticity in a minimal neural circuit that is sufficient to replicate memory-based chemotaxis. In particular, with changes in ASER's glutamate release and sensitivity of downstream neurons, the ASER neuron adjusts its output to be either excitatory or inhibitory depending on ambient salt concentration, enabling the worm to navigate toward or away from salt gradients based on prior exposure to salt concentration.

      Weaknesses:

      While the model successfully replicates some behaviors observed in previous experiments, many key assumptions lack direct biological validation. As to the model output readouts, the model considers only endpoint behaviors (chemotaxis index) rather than the full dynamics of navigation, which limits its predictive power. Moreover, some results presented in the paper lack interpretation, and many descriptions in the main text are overly technical and require clearer definitions.

    4. Author response:

      eLife Assessment 

      The authors utilize a valuable computational approach to exploring the mechanisms of memory-dependent klinotaxis, with a hypothesis that is both plausible and testable. Although they provide a solid hypothesis of circuit function based on an established model, the model's lack of integration of newer experimental findings, its reliance on predefined synaptic states, and oversimplified sensory dynamics, make the investigation incomplete for both memory and internal-state modulation of taxis.

      We would like to express our gratitude to the editor for the assessment of our work. However, we respectfully disagree with the assessment that our investigation is incomplete, if the negative assessment is primarily due to the impact of AIY interneuron ablation on the chemotaxis index (CI) which was reported in Reference [1]. It is crucial to acknowledge that the CI determined through experimental means incorporates contributions from both klinokinesis and klinotaxis [1]. It is plausible that the impact of AIY ablation was not adequately reflected in the CI value. Consequently, the experimental observation does not necessarily diminish the role of AIY in klinotaxis. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. These findings provide substantial evidence supporting the validity of the presented minimal neural network responsible for salt klinotaxis.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      This research focuses on C. elegans klinotaxis, a chemotactic behavior characterized by gradual turning, aiming to uncover the neural circuit mechanism responsible for the context-dependent reversal of salt concentration preference. The phenomenon observed is that the preferred salt concentration depends on the difference between the pre-assay cultivation conditions and the current environmental salt levels. 

      We would like to express our gratitude for the time and consideration you have dedicated to reviewing our manuscript.

      The authors propose that a synaptic-reversal plasticity mechanism at the primary sensory neuron, ASER, is critical for this memory- and context-dependent switching of preference. They build on prior findings regarding synaptic reversal between ASER and AIB, as well as the receptor composition of AIY neurons, to hypothesize that similar "plasticity" between ASER and AIY underpins salt preference behavior in klinotaxis. This plasticity differs conceptually from the classical one as it does not rely on any structural changes but rather synaptic transmission is modulated by the basal level of glutamate, and can switch from inhibitory to excitatory. 

      To test this hypothesis, the study employs a previously established neuroanatomically grounded model [4] and demonstrates that reversing the ASER-AIY synapse sign in the model agent reproduces the observed reversal in salt preference. The model is parameterized using a computational search technique (evolutionary algorithm) to optimize unknown electrophysiological parameters for chemotaxis performance. Experimental validity is ensured by incorporating constraints derived from published findings, confirming the plausibility of the proposed mechanism. 

      Finally, the circuit mechanism allowing C. elegans to switch behaviour to an exploration run when starved is also investigated. This extension highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.

      We would like to thank the reviewer for the appropriate summary of our work. 

      Strengths and weaknesses: 

      The authors' approach of integrating prior knowledge of receptor composition and synaptic reversal with the repurposing of a published neuroanatomical model [4] is a significant strength.

      This methodology not only ensures biological plausibility but also leverages a solid, reproducible modeling foundation to explore and test novel hypotheses effectively.

      The evidence produced that the original model has been successfully reproduced is convincing.

      The writing of the manuscript needs revision as it makes comprehension difficult.  

      We would like to thank the reviewer for recognizing the usefulness of our approach. In the revised version, we will improve the explanation.  

      One major weakness is that the model does not incorporate key findings that have emerged since the original model's publication in 2013, limiting the support for the proposed mechanism. In particular, ablation studies indicate that AIY is not critical for chemotaxis, and other interneurons may play partially overlapping roles in positive versus negative chemotaxis. These findings challenge the centrality of AIY and suggest the model oversimplifies the circuit involved in klinotaxis.

      We would like to express our gratitude for the constructive feedback we have received. We concur with some of your assertions. In fact, our model is the minimal network for salt klinotaxis, which includes solely the interneurons that are connected to each other via the highest number of synaptic connections. It is important to note that our model does not consider redundant interneurons that exhibit overlapping roles. Consequently, the model is not applicable to the study of the impact of interneuron ablation. Reference [1] investigated the influence of interneuron ablations on the chemotaxis index (CI). The experimentally determined CI value incorporates the contributions from both klinokinesis and klinotaxis. Consequently, it is plausible that the impact of AIY ablation was not significantly reflected in the CI value. The experimental observation does not necessarily diminish the role of AIY in klinotaxis.

      Reference [1] also shows that ASER neurons exhibit complex, memory- and context-dependent responses, which are not accounted for in the model and may have a significant impact on chemotactic model behaviour. 

      As pointed out by the reviewer, our model does not incorporate the context-dependent response of the ASER. Instead, the present study considers the salt-concentration-dependent glutamate release from the ASER [S. Hiroki et al. Nat Commun 13, 2928 (2022)], which results from the ASER responses.

      The hypothesis of synaptic reversal between ASER and AIY is not explicitly modeled in terms of receptor-specific dynamics or glutamate basal levels. Instead, the ASER-to-AIY connection is predefined as inhibitory or excitatory in separate models. This approach limits the model's ability to test the full range of mechanisms hypothesized to drive behavioral switching.  

      We would like to thank the reviewer for the helpful comments. In the revised version, we will mention the limitation.

      While the main results - such as response dependence on step inputs at different phases of the oscillator - are consistent with those observed in chemotaxis models with explicit neural dynamics (e.g., Reference [2]), the lack of richer neural dynamics could overlook critical effects. For example, the authors highlight the influence of gap junctions on turning sensitivity but do not sufficiently analyze the underlying mechanisms driving these effects. The role of gap junctions in the model may be oversimplified because, as in the original model [4], the oscillator dynamics are not intrinsically generated by an oscillator circuit but are instead externally imposed via $z_\text{osc}$. This simplification should be carefully considered when interpreting the contributions of specific connections to network dynamics. Lastly, the complex and context-dependent responses of ASER [1] might interact with circuit dynamics in ways that are not captured by the current simplified implementation. These simplifications could limit the model's ability to account for the interplay between sensory encoding and motor responses in C. elegans chemotaxis.

      We may not have fully understood the substance of this comment. However, we acknowledge that the oscillator dynamics were not generated by an oscillator neural circuit in our modeling. The present study instead focuses on how the sensory input and resulting interneuron dynamics regulate the oscillatory activity of SMB motor neurons to generate klinotaxis.

      Appraisal: 

      The authors show that their model can reproduce memory-dependent reversal of preference in klinotaxis, demonstrating that the ASER-to-AIY synapse plays a key role in switching chemotactic preferences. By switching the ASER-AIY connection from excitatory to inhibitory they indeed show that salt preference reverses. They also show that the curving/turn rate underlying the preference change is gradual and depends on the weight between ASER-AIY. They further support their claim by showing that curving rates also depend on the cultivation concentration (set-point).

      We would like to thank the reviewer for assessing our work.

      Thus within the constraints of the hypothesis and the framework, the model operates as expected and aligns with some experimental findings. However, significant omissions of key experimental evidence raise questions on whether the proposed neural mechanisms are sufficient for reversal in salt-preference chemotaxis.  

      We agree with your opinion. The present hypothesis should be verified by experiments.

      Previous work [1] has shown that individually ablating the AIZ or AIY interneurons has essentially no effect on the Chemotactic Index (CI) toward the set point ([1] Figure 6). Furthermore, in [1] the authors report that different postsynaptic neurons are required for movement above or below the set point. The manuscript should address how this evidence fits with their model by attempting similar ablations. It is possible that the CI is rescued by klinokinesis but this needs to be tested on an extension of this model to provide a more compelling argument.  

      We would like to express our gratitude for the constructive feedback we have received. Reference [1] investigated the influence of interneuron ablations on the chemotaxis index (CI). It is important to acknowledge that the experimentally determined CI value encompasses the contributions of both klinokinesis and klinotaxis. It is plausible that the impact of AIY ablation was not reflected in the CI value. Consequently, these experimental observations do not necessarily diminish the role of AIY in klinotaxis. The neural circuit model employed in the present study constitutes a minimal network for salt klinotaxis, encompassing solely interneurons that are connected to each other via the highest number of synaptic connections. Anatomical evidence provided by the database (http://ims.dse.ibaraki.ac.jp/ccep-tool/) substantiates that ASE sensory neurons and AIZ interneurons, which have been demonstrated to play a crucial role in klinotaxis [Matsumoto et al., PNAS 121 (5) e2310735121], have the highest number of synaptic connections with AIY interneurons. Our model does not take into account redundant interneurons with overlapping roles, and is therefore not applicable to the study of the effects of interneuron ablation.

      The investigation of dispersal behaviour in starved individuals is rather limited to testing by imposing inhibition of the SMB neurons. Although a circuit is proposed for how hunger states modulate taxis in the absence of food, this circuit hypothesis is not explicitly modelled to test the theory or provide novel insights.  

      As pointed out by the reviewer, the neural circuit that inhibits the SMB motor neurons was not explicitly incorporated in our model. Instead, we examined whether our minimal network model could reproduce dispersal behavior under starvation conditions solely through the experimentally identified inhibition of the SMB motor neurons.

      Impact:

      This research underscores the value of an embodied approach to understanding chemotaxis, addressing an important memory mechanism that enables adaptive behavior in the sensorimotor circuits supporting C. elegans chemotaxis. The principle of operation - the dependence of motor responses to sensory inputs on the phase of oscillation - appears to be a convergent solution to taxis. Similar mechanisms have been proposed in Drosophila larvae chemotaxis [2], zebrafish phototaxis [3], and other systems. Consequently, the proposed mechanism has broader implications for understanding how adaptive behaviors are embedded within sensorimotor systems and how experience shapes these circuits across species.

      We would like to express our gratitude for the useful suggestion. We will add the argument that the reviewer mentioned in the revised version.

      Although the reported reversal of synaptic connection from excitatory to inhibitory is an exciting phenomenon of broad interest, it is not entirely new, as the authors acknowledge similar reversals have been reported in ASER-to-AIB signaling for klinokinesis (Hiroki et al., 2022). The proposed reversal of the ASER-to-AIY synaptic connection from inhibitory to excitatory is a novel contribution in the specific context of klinotaxis. While the ASER's role in gradient sensing and memory encoding has been previously identified, the current paper mechanistically models these processes, introducing a hypothesis for synaptic plasticity as the basis for bidirectional salt preference in klinotaxis.

      The research also highlights how internal states, such as hunger, can dynamically reshape sensory-motor programs to drive context-appropriate behaviors.  

      The methodology of parameter search on a neural model of a connectome used here yielded the valuable insight that connectome information alone does not provide enough constraints to reproduce the neural circuits for behaviour. It demonstrates that additional neurophysiological constraints are required.  

      We would like to acknowledge the appropriate recognition of our work.

      Additional Context 

      Oscillators with stimulus-driven perturbations appear to be a convergent solution for taxis and navigation across species. Similar mechanisms have been studied in zebrafish phototaxis [3], Drosophila larvae chemotaxis [2], and have even been proposed to underlie search runs in ants. The modulation of taxis by context and memory is a ubiquitous requirement, with parallels across species. For example, Drosophila larvae modulate taxis based on current food availability and predicted rewards associated with odors, though the underlying mechanism remains elusive. The synaptic reversal mechanism highlighted in this study offers a compelling framework for understanding how taxis circuits integrate context-related memory retrieval more broadly.

      We would like to express our gratitude for the insightful commentary. In the revised version, we will add a discussion noting that a similar oscillator mechanism with stimulus-driven perturbations has been observed in zebrafish phototaxis [3] and Drosophila larvae chemotaxis [2].

      As a side note, an interesting difference emerges when comparing C. elegans and Drosophila larvae chemotaxis. In Drosophila larvae, oscillatory mechanisms are hypothesized to underlie all chemotactic reorientations, ranging from large turns to smaller directional biases (weathervaning). By contrast, in C. elegans, weathervaning and pirouettes are treated as distinct strategies, often attributed to separate neural mechanisms. This raises the possibility that their motor execution could share a common oscillator-based framework. Re-examining their overlap might reveal deeper insights into the neural principles underlying these maneuvers. 

      We would like to acknowledge your thoughtfully articulated comment. As pointed out by the reviewer, from the anatomical database (http://ims.dse.ibaraki.ac.jp/ccep-tool/), we found that the neural circuits underlying weathervaning and pirouettes in C. elegans are predominantly distinct but exhibit partial overlap. When we restrict our search to the neurons that are connected to each other with the highest number of synaptic connections, we identify projections from the weathervaning circuit to the pirouette circuit; however, we observed no projections in the reverse direction. This finding suggests that the weathervaning circuit, namely our minimal neural network, is unlikely to be affected by the pirouette circuit, which consists of AIB interneurons and their downstream interneurons and motor neurons.

      (1) Luo, L., Wen, Q., Ren, J., Hendricks, M., Gershow, M., Qin, Y., Greenwood, J., Soucy, E.R., Klein, M., Smith-Parker, H.K., & Calvo, A.C. (2014). Dynamic encoding of perception, memory, and movement in a C. elegans chemotaxis circuit. Neuron, 82(5), 1115-1128. 

      (2) Antoine Wystrach, Konstantinos Lagogiannis, Barbara Webb (2016) Continuous lateral oscillations as a core mechanism for taxis in Drosophila larvae eLife 5:e15504. 

      (3) Wolf, S., Dubreuil, A.M., Bertoni, T. et al. Sensorimotor computation underlying phototaxis in zebrafish. Nat Commun 8, 651 (2017). 

      (4) Izquierdo, E.J. and Beer, R.D., 2013. Connecting a connectome to behavior: an ensemble of neuroanatomical models of C. elegans klinotaxis. PLoS computational biology, 9(2), p.e1002890. 

      Reviewer #2 (Public review): 

      Summary: 

      This study explores how a simple sensorimotor circuit in the nematode C. elegans enables it to navigate salt gradients based on past experiences. Using computational simulations and previously described neural connections, the study demonstrates how a single neuron, ASER, can change its signaling behavior in response to different salt conditions, with which the worm is able to "remember" prior environments and adjust its navigation toward "preferred" salinity accordingly.  

      We would like to express our gratitude for the time and consideration the reviewer has dedicated to reviewing our manuscript.

      Strengths: 

      The key novelty and strength of this paper is the explicit demonstration of computational neurobehavioral modeling and evolutionary algorithms to elucidate the synaptic plasticity in a minimal neural circuit that is sufficient to replicate memory-based chemotaxis. In particular, with changes in ASER's glutamate release and sensitivity of downstream neurons, the ASER neuron adjusts its output to be either excitatory or inhibitory depending on ambient salt concentration, enabling the worm to navigate toward or away from salt gradients based on prior exposure to salt concentration.

      We would like to thank the reviewer for appreciating our research. 

      Weaknesses: 

      While the model successfully replicates some behaviors observed in previous experiments, many key assumptions lack direct biological validation. As to the model output readouts, the model considers only endpoint behaviors (chemotaxis index) rather than the full dynamics of navigation, which limits its predictive power. Moreover, some results presented in the paper lack interpretation, and many descriptions in the main text are overly technical and require clearer definitions.  

      We would like to thank the reviewer for the constructive feedback. As the reviewer noted, the fundamental assumptions posited in the study have yet to be substantiated by biological validation. Consequently, these assumptions must be directly assessed by biological experimentation. The model performance for salt klinotaxis is evaluated by multiple measures, including not only a chemotaxis index but also the curving rate vs. bearing (Fig. 4a, the bearing is defined in Fig. A3) and the curving rate vs. normal gradient (Fig. 4c). The latter two measures characterize the trajectory during salt klinotaxis. In the revised version, we will carefully revise the manuscript according to the reviewer's suggestions. We would like to express our sincere gratitude for your insightful review of our work.

    1. eLife Assessment

      This important study examines the role of pericytes in patterning the zebrafish blood-brain barrier (BBB) and controlling its permeability. Using pdgfrb mutant zebrafish models lacking brain pericytes, the authors report that pericyte-deficient cerebrovasculatures are ill-patterned, yet display unaltered restrictive BBB permeability properties at larval and juvenile stages. More severe phenotypes are detected in adults, with focal leakage sites associated with hemorrhages and aneurysms. Using solid and beautifully documented imaging, the authors suggest that, contrary to the situation described in rodent models, pdgfrb-dependent pericytes are not required to maintain the BBB in the zebrafish brain; these unexpected and intriguing findings reshape our understanding of BBB permeability regulation in vertebrates.

    2. Reviewer #1 (Public review):

      Summary:

      The study investigates the role of vascular mural cells, specifically pericytes and vascular smooth muscle cells (vSMCs), in maintaining blood-brain barrier (BBB) integrity and regulating vascular patterning. Analyzing zebrafish pdgfrb mutants that lack brain pericytes and vSMCs, they show that mural cell deficiency does not impair BBB establishment or maintenance during larval and early juvenile stages. However, mural cells seem to be crucial for preventing vascular aneurysms and hemorrhage in adulthood as focal leakage, basement membrane disruption, and increased caveolae formation are observed in adult zebrafish at aneurysm hotspots. The authors challenge the paradigm that mural cells are essential for BBB regulation in early development while highlighting their importance for long-term vascular stability.

      Strengths:

      Previous studies have established that the zebrafish BBB shares molecular and morphological homology with the mammalian BBB and therefore represents a suitable model. By examining mural cell roles across different life stages - from larval to adult zebrafish - the study provides an unprecedentedly comprehensive analysis of brain vascular development and of how mural cells influence BBB integrity and vascular stability over time. The use of live imaging, whole-brain clearing, and electron microscopy offers high-resolution insights into cerebrovascular patterning, aneurysm development, and structural changes in endothelial cells and basement membranes. By analyzing "leakage hotspots" and their association with structural endothelial defects in adults, the presented findings add novel insights into how mural cell loss may lead to vascular instability.

      Weaknesses:

      The study uses quantitative tracer assays with multiple molecular weight dyes to evaluate blood-brain barrier (BBB) permeability. The study normalizes the intensity of tracer signals (e.g., 10 kDa, 70 kDa dextrans) in the brain parenchyma to the vascular signal of a 2000 kDa dextran tracer (assumed to remain within vessels). Intensity normalization is used to control for variations in tracer injection efficiency or vascular density. This method doesn't directly assess the absolute amount of tracer present in the parenchyma, potentially underestimating leakage severity. As the lack of BBB impairment is a "negative" finding, more rigorous controls or other methods might be needed to corroborate it.
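
      The normalization described above can be sketched as a simple ratio. The example below is only an illustration under stated assumptions, not the authors' analysis pipeline: the function names and all intensity values are hypothetical. It shows why a ratio-based readout alone cannot tell whether cohort variability comes from the parenchymal tracer signal or from the 2000 kDa vascular reference, since both situations can yield the same normalized values.

```python
from statistics import mean, stdev


def normalized_leakage(parenchyma, vessel_ref):
    """Parenchymal tracer intensity divided by the 2000 kDa vascular reference."""
    if vessel_ref <= 0:
        raise ValueError("vascular reference intensity must be positive")
    return parenchyma / vessel_ref


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical (parenchyma, vessel_ref) intensities in arbitrary units.
# Cohort A: the parenchymal signal varies; the vascular reference is stable.
cohort_a = [(8.0, 100.0), (10.0, 100.0), (12.0, 100.0), (10.0, 100.0)]
# Cohort B: the parenchymal signal is stable; the vascular reference varies.
cohort_b = [(12.0, 150.0), (12.0, 120.0), (12.0, 100.0), (12.0, 120.0)]

ratios_a = [normalized_leakage(p, v) for p, v in cohort_a]
ratios_b = [normalized_leakage(p, v) for p, v in cohort_b]

print("cohort A ratios:", ratios_a, "CV:", coefficient_of_variation(ratios_a))
print("cohort B ratios:", ratios_b, "CV:", coefficient_of_variation(ratios_b))
```

      Both hypothetical cohorts give the same set of normalized values, so reporting the raw parenchymal and vascular intensities alongside the ratio would be needed to separate the two sources of variability.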

    3. Reviewer #2 (Public review):

      Summary:

      The authors generated a zebrafish mutant of the pdgfrb gene. The presented analyses and data confirm previous studies demonstrating that Pdgfrb signaling is necessary for mural cell development in zebrafish. In addition, the data support previously published studies in zebrafish showing that mural cell deficiency leads to hemorrhages later in life. The authors presented quantified data on vessel density and branching, assessed tracer extravasation, and investigated the vasculature of adult mice using electron microscopy.

      Strengths:

      The strength of this article is that it provides independent confirmation of the important role of Pdgfrb signaling for the development of mural cells in the zebrafish brain. In addition, it confirms previous literature on zebrafish that provides evidence that, in the absence of pericytes/VSMC, hemorrhages appear (Wang et al, 2014, PMID: 24306108 and Ando et al 2021, PMID: 3431092). The study by Ando et al, 2021 did not report experiments assessing BBB leakage in pdgfrb mutants but in the review article by Ando et al (PMID: 34685412) it is stated that "indicating that endothelial cells can produce basic barrier integrity without pericytes in zebrafish".

      Weaknesses:

      (1) The authors should avoid using violin plots, which show distribution. Instead, they should replace all violin plots in the figures with graphs showing individual data points and standard deviation. For Figure 2f specifically, the standard deviation in the analyzed cohort should be shown.

      (2) The authors have not shown the reduced PDGFRB protein or the effect of mutation on mRNA level in their zebrafish mutant.

      (3) Statistical data analysis: Did the authors perform analyses to investigate whether the data has a normal distribution (e.g., Figures 1d, e)?

      (4) Analysis of tracer extravasation. The use of 2000 kDa dextran intensity as an internal reference is problematic because the authors have not provided data demonstrating that the 2000 kDa dextran signal remains consistent across the entire vasculature, that is, that its variance is low enough for it to serve as a reliable internal reference. The variability of this signal within a single animal remains unknown, and the presented data do not address this aspect.

      Additionally, it's intriguing that the signal intensity in the parenchyma of the tested tracers presents a substantial range, varying by 20-30% in the analysed cohort (Figure 1g, Extended Figure 1e). Such large variability raises the question of its origin. Could it be a consequence of the normalization to 2000 kDa dextran intensity which differs between different fish? Or is it due to the differences in the parenchymal signal intensity while the baseline 2000 kDa intensity is stable? Or is the situation mixed?

      An alternative and potentially more effective approach would be to cross the pdgfrb mutant line with a line where endothelial cells are genetically labeled to define vessels (e.g. the line kdrl used in acquiring data presented in Figure 2a). Non-injected controls could then be used as a baseline to assess tracer extravasation into the parenchyma.

      How is the data presented in Figure 3e generated? How was the dextran intensity calculated? It looks like the authors have used the kdrl line to define vessels. Was the 2000 kDa still used as in previous figures? If not, please describe this in the Materials and Methods section.

      (5) The authors state that both controls and mutants show extravasation of 1 kDa NHS-ester into the parenchyma. However, the presented images do not illustrate this; it is not obvious from these images (Extended Data Figure 1c). Additionally, the presented quantification data (Extended Data Figure 1e) do not show that, at 7 dpf, the vasculature is permeable to this tracer. Note that the range of signal intensity of the 1 kDa NHS-ester is similar to the 70 kDa dextran (Figure 1g and Extended Figure 1e). Would one expect an increase in the ratio in case of extravasation, considering that the 2000 kDa dextran has the same intensity in all experiments? Please explain.

      (6) The study would be strengthened by a more detailed temporal analysis of the phenotype. When do the aneurysms appear? Is there an additional loss of VSMC?

      (7) The authors intended to analyze the BBB at later stages (line 128), but there is not a significant time difference between 2 months (Figure 2) and 3 months (Figure 3) considering that zebrafish live on average 3 years. Therefore, the selection of only two time-points, 2 and 3 months, to analyze BBB changes does not provide a comprehensive overview of temporal changes throughout the zebrafish's lifespan. How long do the pdgfrb mutants live?

      (8) Why is there a difference in tracer permeability between 2 and 3 months (Figures 2 and 3)? Are hemorrhages not detected in 2-month-old zebrafish?

      (9) Figure 3: The capillary bed should be presented in magnified images as it is not clearly visible. Figure 3e shows that in the pdgfrb mutant the dextran intensity is also higher in regions 6-10. How do the authors explain this?

      (10) In general, the manuscript would benefit from a more detailed description of the performed experiments. How long did the tracer circulate in the experiments presented in Figures 2, 3, and 4?

      (11) How do the authors explain the poor signal of the 70 kDa dextran from the vasculature of 5-month-old zebrafish presented in Extended Data Figure 3?

      (12) The study would benefit from a clear separation of the phenotypes caused by the loss of VSMC. The title implies that capillaries also present hemorrhages, which is not the case. How do vascular mural cells differ from mural cells? Are there any other mural cells?

      (13) I have a few comments about how the authors have interpreted the literature and why, in my opinion, they should revise their strong statements (e.g., the last sentence in the abstract).

      Scientists have their own insights and interpretations of data. However, when citing published data, it should be clearly indicated whether the statement is a direct quote from the original publication or an interpretation. In the current manuscript, the authors have not correctly cited the data presented in the two published papers (references 5 and 6). These papers do not propose a model where pericytes suppress "adsorptive transcytosis" (lines 73-76). While increased transcytosis is observed in pericyte-deficient mice, the specific type of vesicular transport that is increased or induced remains unknown.

      Similarly, lines 151-152 refer to references 5 and 6 and use the term "adsorptive transcytosis," but the authors of both papers did not use this term. Attributing this term to the original authors is inaccurate. Additionally, lines 152-153 do not accurately represent the findings of references 5 and 6. These papers do not state that there is an induction of "caveolae" in endothelial cells in pericyte-deficient mice. In the absence of pericytes, many vesicles can be observed in endothelial cells, but these vesicles are relatively large. It is more likely that there is some form of uncontrolled transcytosis, perhaps macropinocytosis. Please refer to the original papers accurately.

      Also, the authors have missed the fact that in mice, the extent of pericyte loss correlates with the extent of BBB leakage. To a certain extent, the remaining pericytes can compensate for the loss by extending longer processes and so ensure full longitudinal coverage of the endothelium. This was shown in the initial work of Armulik et al (reference 5) and later in other studies.

      The bold assertion on lines 183-187 that a lack of specific BBB phenotype in pdgfrb zebrafish mutant invalidates mouse model findings is unfounded. Despite the notion that zebrafish endothelium possesses a BBB, I present a few examples highlighting the differences in brain vascular development and why the authors' expectation of a straightforward extrapolation of mouse BBB phenotypes to zebrafish is untenable.

      In mice, Pdgfrb knockout is lethal. In marked contrast, zebrafish pdgfrb null mutants reach adulthood despite extensive cerebral vascular anomalies and hemorrhage. Following the authors' argumentation about the unlikely divergence of zebrafish and mouse evolution, does it mean that the described mouse phenotype warrants a revisit and that the Pdgfrb knockout in mice perhaps is not lethal? Another example where the role of a gene product is not one-to-one, which relates to pericyte development, is Notch3. Notch3-null mice do not show significant changes in pericyte numbers or distribution, suggesting a less prominent role in pericyte development compared to zebrafish.

      Although many aspects of development are conserved between species, there are significant differences during brain vascular development between zebrafish and mice. These differences could reveal why the BBB is not impaired in zebrafish pdgfrb mutants. There is a difference in the temporal aspect when various cellular players emerge. The timing of microglia colonization in the brain differs. In mice, microglia colonization starts before the first vessel sprouts enter the brain, while in zebrafish, microglia enter after. Additionally, microglia in zebrafish and mice have a different ontogeny. In mice, astrocytes specialize postnatally and form astrocyte endfeet postnatally. In zebrafish, radial glia/astrocytes form at 48 hpf, and as early as 3 dpf, gfap+ cells have a close relationship with blood vessels. Thus, these radial glia/astrocyte-like cells could play an important role in BBB induction in zebrafish. It's worth noting that in Drosophila, the blood-brain barrier is located in glial cells. While speculative, these cells might still play a role in zebrafish, while the role of pericytes does not seem to be crucial. Pericytes enter the brain and contact the developing vasculature (endothelium) relatively late in zebrafish (60 hpf). In mice, the situation is different, as there is no such lag between endothelium and pericyte entry into the brain. I suggest that the authors approach the observed data with curiosity and ask: Why are these differences present? Are all aspects of the BBB induced by neural tissue in zebrafish? What is the contribution of microglia and astrocytes?

      Another interesting aspect to consider is the endothelial-pericyte ratio and longitudinal coverage of pericytes in the zebrafish brain, and how this relates to what is observed in mice. How similar is the zebrafish vasculature to the mouse vasculature when it comes to the average length of pericytes in the zebrafish brain? Does the longitudinal coverage of pericytes in the zebrafish brain reach nearly 100%, as it does in mice?

      Based on the preceding arguments, it is recommended that the authors present a balanced, insightful discussion that situates their work within a broader framework.
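
      The question raised in point (4), whether the 20-30% spread comes from the parenchymal signal, from the 2000 kDa reference, or from both, could be examined by comparing coefficients of variation. The sketch below is only an illustration: the function names and the example intensities are hypothetical, and it assumes per-fish raw intensities are available for both channels.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def variability_report(parenchyma, reference):
    """Compare the spread of the raw channels with the spread of their ratio."""
    if len(parenchyma) != len(reference):
        raise ValueError("one parenchyma and one reference value per fish is required")
    ratios = [p / r for p, r in zip(parenchyma, reference)]
    return {
        "parenchyma_cv": coefficient_of_variation(parenchyma),
        "reference_cv": coefficient_of_variation(reference),
        "ratio_cv": coefficient_of_variation(ratios),
    }


# Hypothetical arbitrary-unit intensities (not study data): the parenchymal
# signal is constant across fish, while the reference signal varies.
report = variability_report(
    parenchyma=[50.0, 50.0, 50.0, 50.0],
    reference=[80.0, 100.0, 120.0, 100.0],
)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

      In this constructed case all of the spread in the ratio originates in the reference channel, which is one of the scenarios the reviewer asks the authors to rule out.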

    4. Reviewer #3 (Public review):

      This manuscript examines the role of pdgfrb-positive pericytes in the establishment and maintenance of the blood-brain barrier (BBB) in the zebrafish. Previous studies in PDGFB- or PDGFRB-deficient mice have suggested that loss of pericytes results in disruption of the BBB. The authors show that zebrafish pdgfrb mutant larvae have an intact BBB and that pdgfrb mutant adult fish show large vessel defects and hemorrhage but do not exhibit substantial leakage from brain capillaries, suggesting loss of pericytes is not sufficient to "open" the BBB. The authors use beautiful and compelling images and rigorous quantification to back up most of their conclusions. The imaging of the adult brain is particularly nice. The authors rigorously document the lack of BBB leakage in pdgfrbuq30bh mutant larvae and large vessel phenotypes (eg, enlargement and rupture) in pdgfrbuq30bh mutant adults. A few points would help the authors to further strengthen their findings contradicting the current dogma from rodent models.

      Major point:

      The authors document pericyte loss using a single TgBAC(pdgfrb:egfp)ncv22 transgenic line driven by the promoter of the same gene mutated in their pdgfrbuq30bh mutants. Given their findings on the consequences of pericyte loss directly contradict current dogma from rodent studies, it would be useful to further validate the absence of brain pericytes in these mutants using one of several other transgenic lines marking pericytes currently available in the zebrafish. This could be done using pdgfrb crispants, which the authors show nicely phenocopy the germline mutants, at least in larvae. This would help nail down the absence of any currently identifiable pericyte population or sub-population in the loss of pdgfrb animals and substantially strengthen the authors' conclusions.

      Other issues:

      The authors should provide more information about the pdgfrbuq30bh mutant and how it was generated (including a diagram in a supplemental figure would be useful).

      It would be helpful to show some data on whether mutants show morphological phenotypes or developmental delay at 7 and 14 dpf, to provide some context to better assess the reduced branching and vessel length vascular phenotypes (see Figures 1c-e).

      If available, it would be helpful to have a positive control for the tracer leakage experiments - a genetic manipulation that does cause disruption of the BBB and leakage at 2 hours post-tracer injection (see Figures 1f and g).

      Quantification of the findings in Figure 4c,d would be useful, as would the use of germline fish for these experiments if these are now available. If this is not possible, it would be helpful to document that the crispants used in these experiments lack pdgfrb:egfp pericytes at adult stages (this is only shown for 5 dpf larvae, in Extended Data Figure 4b).

      Adult mutants clearly show less dye leakage in the more superficial capillary regions than WT siblings, but dextran intensity is a bit higher, although this could well be diffusion from more central brain regions where overt hemorrhage is occurring. Along similar lines though, the authors' TEM data in Extended Data Figure 4d hints that there may be more caveolae in mutant brain capillaries, although the N number was lower here than for the measurements from TEM of larger central vessels (Figure 4g). It would be useful to carry out additional measurements to increase the N number in Figure 4d to see whether the difference between wild-type sibling and mutant capillary caveolae numbers remains non-significant.

      It might be helpful to include some orienting labels and/or additional descriptions in the figure legends to help readers who are not used to looking at zebrafish brain vessels have an easier time figuring out what they are looking at and where it is in the brain.

    5. Author response:

      We thank all the reviewers for their detailed comments. In response, we will address the comments with further analysis, experiments and an expanded discussion.

      In terms of each specific reviewer's comments:

      Reviewer 1 was positive overall but had several suggestions and requested further rigorous controls. These are highly constructive technical concerns and will be addressed through additional experimentation and methods for quantification.

      Reviewer 2 summarised the strengths of the study as being largely confirmatory. They have perhaps not fully appreciated that this is the first published functional assessment of cerebral vascular permeability in a pericyte deficient zebrafish model.

      The reviewer has made a number of very helpful suggestions to improve technical aspects of the analysis. Many align with the suggestions of Reviewer 1. Additional experiments that include more rigorous controls and further methods to quantify vessel permeability will address these concerns in revision.

      We also note that the reviewer calls for a more nuanced and careful discussion section. We take the reviewer's point and do appreciate their concerns. We were limited by word count in the initial submission in short report format, but in response will expand and provide a more thorough discussion.

      Reviewer 3 was positive overall but has suggested additional controls and experiments to further strengthen the findings and support our conclusions. Some align with the suggestions of Reviewers 1 and 2. We agree and aim to address them through additional work in revision.

    1. eLife Assessment

      This important manuscript proposes a new strategy for the identification of new mechanisms of drug resistance based on SAturated Transposon Analysis in Yeast (SATAY), a powerful transposon sequencing method in Saccharomyces cerevisiae. This method allows us to uncover loss- and gain-of-function mutations conferring resistance to 20 different antifungal compounds. The method is convincing, allowing the authors to identify a novel interaction of chitosan with the cell wall mannosylphosphate, and show that the transporter Hol1 concentrates the novel antifungal ATI-2307 within yeast.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors employed Saturated Transposon Analysis in Yeast (SATAY) in the model yeast Saccharomyces cerevisiae to uncover mutations conferring resistance to 20 different antifungal compounds. These screens revealed novel resistance mechanisms and the modes of action for the antifungal compounds Chitosan and ATI-2307. The authors discovered that Chitosan electrostatically interacts with cell wall mannosylphosphate and identified Hol1 as the transporter of ATI-2307.

      Strengths:

      The study highlights the power of SATAY in uncovering drug-resistance mechanisms, modes of action, and cellular processes influencing fungal responses to drugs. Identifying novel resistance mechanisms and modes of action for various compounds in this model yeast provides valuable insights for further investigating these compounds in fungal pathogens and developing antifungal strategies. This study thus represents a significant resource for exploring cellular responses to chemical stresses.

      The manuscript is well-written and highly clear.

      Weaknesses:

      As the study was conducted using highly modified non-pathogenic laboratory yeast strains, verification of the findings in fungal pathogens would greatly enhance its relevance and applicability.

    3. Reviewer #2 (Public review):

      The study begins by exposing wild-type yeast libraries to some well-understood antifungals (amphotericin B, caspofungin, myriocin) to illustrate the complexity and power of the analytical method. These toxins positively selected for loss-of-function transposon insertions in the coding sequences (CDS) of many of the genes identified in earlier studies. The outlier genes were visually evident in scatter plots (Figure 1A, 1B, 1C) but the magnitude and statistical significance of the effects were not presented in tables. There were some unexplained and unexpected findings as well. For example, caspofungin targets the product of the GSC2 gene, and yet transposon insertions in this gene were positively selected rather than negatively selected (seemingly discordant from other studies).

      Interestingly, transposon insertions immediately upstream of toxin targets (Figure 1D) and toxin efflux transporters or their regulators (Figure 1E) were visibly selected by exposure to the toxins, suggesting gain-of-expression. Most of these findings are convincing, even without statistical tests. However, some were not (for example, Soraphen A on YOR1). A relevant question emerges here: Do both ends of the transposon confer the same degree of cryptic enhancer/promoter activity? If one end contains strong activity on downstream gene expression while the other does not, the effects of one may be obscured by the other. The directionality of transposon insertions (not provided) would then be important to consider when interpreting the raw data.

      A masterful rationalization of transposon insertion selection in the YAP1 and FLR1 genes was presented wherein loss of C-terminal auto-inhibitory domain of the Yap1 transcription factor resulted in FLR1 overexpression and resistance to Cerulenin. Transposon insertions in the CDS of YAP1 and FLR1 were negatively selected in Chlorothalonil while the gain-of-function and -expression insertions (enriched in Cerulenin) were not. The rationalization of these findings - that Chlorothalonil activates Yap1 while Cerulenin does not - was much less convincing and should be tested directly with a simple experiment such as Q-PCR.

      Moving to specially engineered yeast strains (Figure 2) where multiple efflux transporters were eliminated (for Prochloraz testing) or new drug targets were inserted (for Fludioxonil and Iprodione), numerous interesting observations were obtained. For instance, transposon insertions in totally different sets of genes were enriched by prochloraz depending on the strain background. Conversely, almost the exact same genes were selected in Fludioxonil and Iprodione, including genes in the well-known HOG pathway. Because several candidate receptors of these compounds were not significant in the Tn-seq dataset, the authors add new evidence to the field suggesting that the introduced gene (BdDRK1) represents the direct, or near-direct, target of these compounds.

      Chitosan effectiveness was studied by Tn-seq in yet another specialized strain of yeast that is uniquely susceptible to the toxin. Once again, the authors masterfully rationalize the complex effects, leading to a simple model where chitosan interacts with mannosyl-phosphate in the cell wall and membrane, which is deposited by Mnn4 and Mnn6 and masked by Mnn1 enzymes in the Golgi complex (themselves regulated or dependent on a number of additional gene products such as YND1). This research compellingly adds to our understanding of an industrial antifungal.

      Finally, the effects of a preclinical antifungal ATI-2307 were studied for the first time. Remarkably, ATI-2307 efficacy greatly depended on HOL1 coding sequences and an upstream enhancer (Figure 4). After engineering hol1∆ strains, uptake of the compound and sensitivity to the compound were lost and then restored by heterologous expression of CaHOL1 from a pathogenic yeast. HOL1 also conferred susceptibility to polyamines with related structures (Pentamidine, Iminoctadine). Remarkably, separation-of-function mutations were obtained in HOL1 that abolished the uptake of the toxins while preserving the uptake of nutrient polyamines in low nitrogen conditions, which strongly suggests that HOL1 encodes a direct transporter of the toxins. The implications are important for ATI-2307 efficacy in patients, where resistance mutations could arise spontaneously and produce poor clinical outcomes.

      Additional comments:

      The experiments presented here are often convincing and serve to illustrate the power of Tn-seq approaches in elucidating drug resistance mechanisms in eukaryotic microbes. The gain-of-expression effects (upstream of CDS), gain-of-function effects (elimination of auto-inhibitory domains), and loss-of-function effects were all carefully exposed and discussed, leading to numerous new insights on the action of diverse toxins.

      On the other hand, several deficiencies and weaknesses (in addition to the minor ones described above) limit the utility of the data that has been generated.

      (1) There was no summary table of Tn-seq data for different genes in the different conditions, so readers could not easily access data for genes and pathways not mentioned in the text. This is especially important because transposon insertions that were negatively selected (of great interest to the community) were barely mentioned. Additionally, the statistical significance of outlier genes was not reported. The same is true for insertions within the DNA segments upstream of CDSs. Users of these data are therefore restricted to visually inspecting insertion sites on a genome browser.

      (2) Only one dose of each toxin was studied, which therefore produces a limited perspective on the genetic mechanisms of resistance in each case.

      (3) No Tn-seq experiments were performed in diploid yeast strains. The gain-of-expression and gain-of-function insertions under positive selection in haploid strains in the different conditions are expected to be dominant in diploid strains as well, while loss-of-function insertions in CDS are expected to be recessive. Do these expectations hold? Could such experiments potentially confirm the models for Cerulenin and Chlorothalonil effects on YAP1 and FLR1? Pathogenic Candida species are usually diploid where gain-of-function/expression mutants most frequently lead to poor clinical outcomes. Resistance to ATI-2307 through loss of HOL1 may not be as significant for diploid C. albicans with two functional copies of all genes. On a related note, is it possible that transposon insertions in the 3' untranslated region produce anti-sense transcripts that lower the expression of the upstream gene from both alleles in diploids, thereby producing a strong selective advantage in ATI-2307? This study already touches on exciting new applications of the Tn-seq method but could easily go a bit further.
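
      The per-gene summary table requested in point (1) could take a form like the following sketch. This is not the authors' pipeline: the function names, the pseudocount, the library sizes, and the placeholder gene labels and read counts are all assumptions made for illustration. The sketch computes a library-size-normalized log2 fold change per gene; assessing statistical significance would additionally require replicate libraries, which are not modeled here.

```python
import math


def log2_fold_change(treated_reads, control_reads, treated_total, control_total,
                     pseudocount=1.0):
    """log2 ratio of library-size-normalized read counts (treated over control)."""
    treated_norm = (treated_reads + pseudocount) / treated_total
    control_norm = (control_reads + pseudocount) / control_total
    return math.log2(treated_norm / control_norm)


def summarize(counts, treated_total, control_total):
    """Return (gene, log2 fold change) rows sorted from most enriched to most depleted."""
    rows = [
        (gene, log2_fold_change(treated, control, treated_total, control_total))
        for gene, (treated, control) in counts.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder gene labels and read counts (treated, control); illustration only.
example_counts = {
    "GENE_A": (799, 99),
    "GENE_B": (99, 99),
    "GENE_C": (9, 99),
}
table = summarize(example_counts, treated_total=1_000_000, control_total=1_000_000)
for gene, lfc in table:
    print(f"{gene}\t{lfc:+.2f}")
```

      Sorting by fold change places positively selected genes at the top and negatively selected genes at the bottom, so both classes would be accessible to readers in a single table.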

    4. Reviewer #3 (Public review):

      Summary:

      This manuscript describes an extensive application of the Saturated Transposon Analysis in Yeast (SATAY) transposon mutagenesis and sequencing method to explore loss- and gain-of-function mutations conferring resistance to 20 different antifungal compounds. Impressively, the authors demonstrate that SATAY can be used to identify mutations that lead to antifungal resistance, including promoter mutations that affect the direct targets of antifungal compounds and drug efflux pumps. Because SATAY is not tied to a specific genetic background, the sensitivity of an S. cerevisiae strain, AD1-8, that specifically displays Chitosan susceptibility was examined in detail, and the results suggest that Chitosan acts through interactions with the fungal cell wall. Through a series of experiments that expand upon the SATAY analysis of the novel antifungal ATI-2307, the authors clearly show that the transporter Hol1 concentrates this compound within yeast.

      General Comments:

      This is a very impressive application of SATAY, highlighting many different strategies for exploring the mechanism of action of various antifungal compounds. It's clear from the findings presented that SATAY is a powerful and potentially highly productive approach for chemical-genetic analysis.

    1. eLife Assessment

      Birdsong production depends on precise neural sequences in a vocal motor nucleus HVC. In this useful biophysical model, Daou and colleagues identify specific biophysical parameters that result in sparse neural sequences observed in vivo. While the model is presently incomplete because it is overfit to produce sequences and therefore not robust to real biological variation, the model has the potential to address some outstanding issues in HVC function.

    2. Reviewer #1 (Public review):

      Summary:

      The paper presents a model for sequence generation in the zebra finch HVC, which adheres to cellular properties measured experimentally. However, the model is fine-tuned and exhibits limited robustness to noise inherent in the inhibitory interneurons within the HVC, as well as to fluctuations in connectivity between neurons. Although the proposed microcircuits are introduced as units for sub-syllabic segments (SSS), the backbone of the network remains a feedforward chain of HVC_RA neurons, similar to previous models.

      Strengths:

      The model incorporates all three of the major types of HVC neurons. The ion channels used and their kinetics are based on experimental measurements. The connection patterns of the neurons are also constrained by the experiments.

      Weaknesses:

      The model is described as consisting of micro-circuits corresponding to SSS. This presentation gives the impression that the model's structure is distinct from previous models, which connected HVC_RA neurons in feedforward chain networks (Jin et al 2007; Li & Greenside 2006; Long et al 2010; Egger et al 2020). However, the authors implement single HVC_RA neurons into chain networks within each micro-circuit and then connect the end of the chain to the start of the chain in the subsequent micro-circuit. Thus, the HVC_RA neuron in their model forms a single-neuron chain. This structure is essentially a simplified version of earlier models.

      In the model of the paper, the chain network drives the HVC_I and HVC_X neurons. The role of the micro-circuits is more significant in organizing the connections: specifically, from HVC_RA neurons to HVC_I neurons, and from HVC_I neurons to both HVC_X and HVC_RA neurons.

      How useful is this concept of micro-circuits? HVC neurons fire continuously even during the silent gaps. There are no SSS during these silent gaps.

      A significant issue of the current model is that the HVC_RA to HVC_RA connections require fine-tuning, with the network functioning only within a narrow range of g_AMPA (Figure 2B). Similarly, the connections from HVC_I neurons to HVC_RA neurons also require fine-tuning. This sensitivity arises because the somatic properties of HVC_RA neurons are insufficient to produce the stereotypical bursts of spikes observed in recordings from singing birds, as demonstrated in previous studies (Jin et al 2007; Long et al 2010). In these previous works, to address this limitation, a dendritic spike mechanism was introduced to generate an intrinsic bursting capability, which is absent in the somatic compartment of HVC_RA neurons. This dendritic mechanism significantly enhances the robustness of the chain network, eliminating the need to fine-tune any synaptic conductances, including those from HVC_I neurons (Long et al 2010).

      Why is it important that the model should NOT be sensitive to the connection strengths?

      First, the firing of HVC_I neurons is highly noisy and unreliable. HVC_I neurons fire spontaneous, random spikes under baseline conditions. During singing, their spike timing is imprecise and can vary significantly from trial to trial, with spikes appearing or disappearing across different trials. As a result, their inputs to HVC_RA neurons are inherently noisy. If the model relies on precisely tuned inputs from HVC_I neurons, the natural fluctuations in HVC_I firing would render the model non-functional. The authors should incorporate noisy HVC_I neurons into their model to evaluate whether this noise would render the model non-functional.

      Second, Kosche et al. (2015) demonstrated that reducing inhibition by suppressing HVC_I neuron activity makes HVC_RA firing less sparse but does not compromise the temporal precision of the bursts. In this experiment, the local application of gabazine should have severely disrupted HVC_I activity. However, it did not affect the timing precision of HVC_RA neuron firing, emphasizing the robustness of the HVC timing circuit. This robustness is inconsistent with the predictions of the current model, which depends on finely tuned inputs and should, therefore, be vulnerable to such disruptions.

      Third, the reliance on fine-tuning of HVC_RA connections becomes problematic if the model is scaled up to include groups of HVC_RA neurons forming a chain network, rather than the single HVC_RA neurons used in the current work. With groups of HVC_RA neurons, the summation of presynaptic inputs to each HVC_RA neuron would need to be precisely maintained for the model to function. However, experimental evidence shows that the HVC circuit remains functional despite perturbations, such as a few degrees of cooling, micro-lesions, or turnover of HVC_RA neurons. Such robustness cannot be accounted for by a model that depends on finely tuned connections, as seen in the current implementation.
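
      The cumulative effect of unreliable links on a long chain can be illustrated with a toy calculation. It is not the authors' biophysical model: it assumes, purely for illustration, that each link in the chain fails independently with a fixed probability, so that a sequence of n links completes with probability (1 - q)^n. The function name and the example values are hypothetical.

```python
def chain_success_probability(link_failure_probability, n_links):
    """Probability that every link of an n-link chain propagates activity,
    assuming independent failures with the same probability at each link."""
    if not 0.0 <= link_failure_probability <= 1.0:
        raise ValueError("failure probability must lie in [0, 1]")
    if n_links < 0:
        raise ValueError("number of links must be non-negative")
    return (1.0 - link_failure_probability) ** n_links


# Illustrative values only: a 1% per-link failure rate over chains of growing length.
for n_links in (10, 100, 1000):
    probability = chain_success_probability(0.01, n_links)
    print(f"{n_links} links: completion probability {probability:.4f}")
```

      Under this simplifying assumption even a small per-link failure rate compounds over a long chain, which is consistent with the reviewer's argument that a mechanism tolerant of noise at each step is needed for reliable sequences.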

      The authors examined how altering the channel properties of neurons affects the activity in their model. While this approach is valid, many of the observed effects may stem from the delicate balancing required in their model for proper function.

      In the current model, HVC_X neurons burst as a result of rebound activity driven by the I_H current. Rebound bursts mediated by the I_H current typically require a highly hyperpolarized membrane potential. However, this mechanism would fail if the reversal potential of inhibition is higher than the required level of hyperpolarization. Furthermore, Mooney (2000) demonstrated that depolarizing the membrane potential of HVC_X neurons did not prevent bursts of these neurons during forward playback of the bird's own song, suggesting that these bursts (at least under anesthesia, which may be a different state altogether) are not necessarily caused by rebound activity. This discrepancy should be addressed or considered in the model.

      Some figures contain direct copies of figures from published papers. It is perhaps a better practice to replace them with schematics if possible.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors use numerical simulations to try to understand better a major experimental discovery in songbird neuroscience from 2002 by Richard Hahnloser and collaborators. The 2002 paper found that a certain class of projection neurons in the premotor nucleus HVC of adult male zebra finch songbirds, the neurons that project to another premotor nucleus RA, fired sparsely (once per song motif) and precisely (to about 1 ms accuracy) during singing.

      The experimental discovery is important to understand since it initially suggested that the sparsely firing RA-projecting neurons acted as a simple clock that was localized to HVC and that controlled all details of the temporal hierarchy of singing: notes, syllables, gaps, and motifs. Later experiments suggested that the initial interpretation might be incomplete: that the temporal structure of adult male zebra finch songs instead emerged in a more complicated and distributed way, still not well understood, from the interaction of HVC with multiple other nuclei, including auditory and brainstem areas. So at least two major questions remain unanswered more than two decades after the 2002 experiment: What is the neurobiological mechanism that produces the sparse precise bursting: is it a local circuit in HVC or is it some combination of external input to HVC and local circuitry? And how is the sparse precise bursting in HVC related to a songbird's vocalizations?

      The authors only investigate part of the first question, whether the mechanism for sparse precise bursts is local to HVC. They do so indirectly, by using conductance-based Hodgkin-Huxley-like equations to simulate the spiking dynamics of a simplified network that includes three known major classes of HVC neurons and such that all neurons within a class are assumed to be identical. A strength of the calculations is that the authors include known biophysically deduced details of the different conductances of the three major classes of HVC neurons, and they take into account what is known, based on sparse paired recordings in slices, about how the three classes connect to one another. One weakness of the paper is that the authors make arbitrary and not well-motivated assumptions about the network geometry, and they do not use the flexibility of their simulations to study how their results depend on their network assumptions. A second weakness is that they ignore many known experimental details such as projections into HVC from other nuclei, dendritic computations (the somas and dendrites are treated by the authors as point-like isopotential objects), the role of neuromodulators, and known heterogeneity of the interneurons. These weaknesses make it difficult for readers to know the relevance of the simulations for experiments and for advancing theoretical understanding.

      Strengths:

      The authors use conductance-based Hodgkin-Huxley-like equations to simulate spiking activity in a network of neurons intended to model more accurately songbird nucleus HVC of adult male zebra finches. Spiking models are much closer to experiments than models based on firing rates or on 2-state neurons.

      The authors include information deduced from modeling experimental current-clamp data such as the types and properties of conductances. They also take into account how neurons in one class connect to neurons in other classes via excitatory or inhibitory synapses, based on sparse paired recordings in slices by other researchers.

      The authors obtain some new results of modest interest such as how changes in the maximum conductances of four key channels (e.g., A-type K+ currents or Ca-dependent K+ currents) influence the structure and propagation of bursts, while simultaneously being able to mimic accurately current-clamp voltage measurements.

      Weaknesses:

      One weakness of this paper is the lack of a clearly stated, interesting, and relevant scientific question to try to answer. In the introduction, the authors do not discuss adequately which questions recent experimental and theoretical work have failed to explain adequately, concerning HVC neural dynamics and its role in producing vocalizations. The authors do not discuss adequately why they chose the approach of their paper and how their results address some of these questions.

      For example, the authors need to explain in more detail how their calculations relate to the works of Daou et al, J. Neurophys. 2013 (which already fitted spiking models to neuronal data and identified certain conductances), to Jin et al J. Comput. Neurosci. 2007 (which already discussed how to get bursts using some experimental details), and to the rather similar paper by E. Armstrong and H. Abarbanel, J. Neurophys 2016, which already postulated and studied sequences of microcircuits in HVC. This last paper is not even cited by the authors.

      The authors' main achievement is to show that simulations of a certain simplified and idealized network of spiking neurons, which includes some experimental details but ignores many others, match some experimental results like current-clamp-derived voltage time series for the three classes of HVC neurons (although this was already reported in earlier work by Daou and collaborators in 2013), and simultaneously the robust propagation of bursts with properties similar to those observed in experiments. The authors also present results about how certain neuronal details and burst propagation change when certain key maximum conductances are varied.

      However, these are weak conclusions for two reasons. First, the authors did not do enough calculations to allow the reader to understand how many parameters were needed to obtain these fits and whether simpler circuits, say with fewer parameters and simpler network topology, could do just as well. Second, many previous researchers have demonstrated robust burst propagation in a variety of feed-forward models. So what is new and important about the authors' results compared to the previous computational papers?

      Also missing is a discussion, or at least an acknowledgment, of the fact that not all of the fine experimental details of undershoots, latencies, spike structure, spike accommodation, etc may be relevant for understanding vocalization. While it is nice to know that some models can match these experimental details and produce realistic bursts, that does not mean that all of these details are relevant for the function of producing precise vocalizations. Scientific insights in biology often require exploring which of the many observed details can be ignored and especially identifying the few that are essential for answering some questions. As one example, if HVC-X neurons are completely removed from the authors' model, does one still get robust and reasonable burst propagation of HVC-RA neurons? While part of the nucleus HVC acts as a premotor circuit that drives the nucleus RA, part of HVC is also related to learning. It is not clear that HVC-X neurons, which carry out some unknown calculation and transmit information to area X in a learning pathway, are relevant for burst production and propagation of HVC-RA neurons, and so relevant for vocalization. Simulations provide a convenient and direct way to explore questions of this kind.

      One key question to answer is whether the bursting of HVC-RA projection neurons is based on a mechanism local to HVC or is some combination of external driving (say from auditory nuclei) and local circuitry. The authors do not contribute to answering this question because they ignore external driving and assume that the mechanism is some kind of intrinsic feed-forward circuit, which they put in by hand in a rather arbitrary and poorly justified way, by assuming the existence of small microcircuits consisting of a few HVC-RA, HVC-X, and HVC-I neurons that somehow correspond to "sub-syllabic segments". To my knowledge, experiments do not suggest the existence of such microcircuits nor does theory suggest the need for such microcircuits.

      Another weakness of this paper is an unsatisfactory discussion of how the model was obtained, validated, and simulated. The authors should state as clearly as possible, in one location such as an appendix, what is the total number of independent parameters for the entire network and how parameter values were deduced from data or assigned by hand. With enough parameters and variables, many details can be fit arbitrarily accurately so researchers have to be careful to avoid overfitting. If parameter values were obtained by fitting to data, the authors should state clearly what the fitting algorithm was (some iterative nonlinear method, whose results can depend on the initial choice of parameters), what the error function used for fitting (sum of least squares?) was, and what data were used for the fitting.

      The authors should also state clearly the dynamical state of the network, the vector of quantities that evolve over time. (What is the dimension of that vector, which is also the number of ordinary differential equations that have to be integrated?) The authors do not mention what initial state was used to start the numerical integrations, whether transient dynamics were observed and what were their properties, or how the results depended on the choice of the initial state. The authors do not discuss how they determined that their model was programmed correctly (it is difficult to avoid typing errors when writing several pages or more of a code in any language) or how they determined the accuracy of the numerical integration method beyond fitting to experimental data, say by varying the time step size over some range or by comparing two different integration algorithms.
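
      The step-size check suggested above can be illustrated with a minimal sketch. It does not use the authors' HVC model: the single leaky membrane equation, the parameter values, and the function names `euler` and `leak` are all assumptions chosen because the equation has a closed-form solution. For a first-order method such as forward Euler, halving the time step should roughly halve the error, and a ratio far from 2 would point to a coding or stability problem.

```python
import math

def euler(f, y0, t_end, dt):
    """Integrate dy/dt = f(t, y) from t = 0 to t_end with forward Euler."""
    steps = round(t_end / dt)
    y = y0
    for i in range(steps):
        y += dt * f(i * dt, y)
    return y

# Hypothetical passive membrane (illustrative values, not taken from the paper):
# tau * dV/dt = -(V - V_REST) + DRIVE
TAU, V_REST, DRIVE = 10.0, -65.0, 15.0

def leak(t, v):
    return (-(v - V_REST) + DRIVE) / TAU

def exact(t):
    """Closed-form solution for V(0) = V_REST."""
    return V_REST + DRIVE * (1.0 - math.exp(-t / TAU))

T_END = 20.0
err_coarse = abs(euler(leak, V_REST, T_END, 0.1) - exact(T_END))
err_fine = abs(euler(leak, V_REST, T_END, 0.05) - exact(T_END))
ratio = err_coarse / err_fine  # close to 2 for a first-order method
print(f"error(dt=0.1)={err_coarse:.5f}  error(dt=0.05)={err_fine:.5f}  ratio={ratio:.2f}")
```

      For a full network model with no closed-form solution, the same comparison can be made between runs at two step sizes, or between two different integration schemes, as the paragraph above suggests.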

      Also disappointing is that the authors do not make any predictions to test, except rather weak ones such as that varying a maximum conductance sufficiently (which might be possible by using dynamic clamps) might cause burst propagation to stop or change its properties. Based on their results, the authors do not make suggestions for further experiments or calculations, but they should.

    1. eLife Assessment

      This important study seeks to examine the relationship between pupil size and information gain, showing opposite effects dependent upon whether the average uncertainty increases or decreases across trials. Given the broad implications for learning and perception, the findings will be of broad interest to researchers in cognitive neuroscience, decision-making, and computational modelling. Nevertheless, the evidence in support of the particular conclusion is at present incomplete - the conclusions would be strengthened if the authors could both clarify the differences between model-updating and prediction error in their account and clarify the patterns in the data.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigates whether pupil dilation reflects prediction error signals during associative learning, defined formally by Kullback-Leibler (KL) divergence, an information-theoretic measure of information gain. Two independent tasks with different entropy dynamics (decreasing and increasing uncertainty) were analyzed: the cue-target 2AFC task and the letter-color 2AFC task. Results revealed that pupil responses scaled with KL divergence shortly after feedback onset, but the direction of this relationship depended on whether uncertainty (entropy) increased or decreased across trials. Furthermore, signed prediction errors (interaction between frequency and accuracy) emerged at different time windows across tasks, suggesting task-specific temporal components of model updating. Overall, the findings highlight that pupil dilation reflects information-theoretic processes in a complex, context-dependent manner.

      Strengths:

      This study provides a novel and convincing contribution by linking pupil dilation to information-theoretic measures, such as KL divergence, supporting Zénon's hypothesis that pupil responses reflect information gained during learning. The robust methodology, including two independent datasets with distinct entropy dynamics, enhances the reliability and generalisability of the findings. By carefully analysing early and late time windows, the authors capture the temporal dynamics of prediction error signals, offering new insights into the timing of model updates. The use of an ideal learner model to quantify prediction errors, surprise, and entropy provides a principled framework for understanding the computational processes underlying pupil responses. Furthermore, the study highlights the critical role of task context - specifically increasing versus decreasing entropy - in shaping the directionality and magnitude of these effects, revealing the adaptability of predictive processing mechanisms.

      Weaknesses:

      While this study offers important insights, several limitations remain. The two tasks differ significantly in design (e.g., sensory modality and learning type), complicating direct comparisons and limiting the interpretation of differences in pupil dynamics. Importantly, the apparent context-dependent reversal between pupil constriction and dilation in response to feedback raises concerns about how these opposing effects might confound the observed correlations with KL divergence. Finally, subjective factors such as participants' confidence and internal belief states were not measured, despite their potential influence on prediction errors and pupil responses.

    3. Reviewer #2 (Public review):

      Summary:

      The authors proposed that variability in post-feedback pupillary responses during the associative learning tasks can be explained by information gain, which is measured as KL divergence. They analysed pupil responses in a later time window (2.5s-3s after feedback onset) and correlated them with information-theory-based estimates from an ideal learner model (i.e., information gain as KL divergence, surprise as subjective probability, and entropy as average uncertainty) in two different associative decision-making tasks.

      Strength:

      The exploration of task-evoked pupil dynamics beyond the immediate response/feedback period and then associating them with model estimates was interesting and inspiring. This offered a new perspective on the relationship between pupil dilation and information processing.

      Weakness:

      However, disentangling these later effects from noise needs caution. Noise in pupillometry can arise from variations in stimuli and task engagement, as well as artefacts from earlier pupil dynamics. The increasing variance in the time series of pupillary responses (e.g., as shown in Figure 2D) highlights this concern.

      It's also unclear what this complicated association between information gain and pupil dynamics actually means. The complexity of the two different tasks reported made the interpretation more difficult in the present manuscript.

    4. Reviewer #3 (Public review):

      Summary:

      This study examines prediction errors, information gain (Kullback-Leibler [KL] divergence), and uncertainty (entropy) from an information-theory perspective using two experimental tasks and pupillometry. The authors aim to test a theoretical proposal by Zénon (2019) that the pupil response reflects information gain (KL divergence). In particular, the study defines the prediction error in terms of KL divergence and speculates that changes in pupil size associated with KL divergence depend on entropy. Moreover, the authors examine the temporal characteristics of pupil correlates of prediction errors, which differed considerably across previous studies that employed different experimental paradigms. In my opinion, the study does not achieve these aims due to several methodological and theoretical issues.

      Strengths:

      (1) Use of an established Bayesian model to compute KL divergence and entropy.

      (2) Pupillometry data preprocessing, including deconvolution.

      Weaknesses:

      (1) Definition of the prediction error in terms of KL divergence:

      I'm concerned about the authors' theoretical assumption that the prediction error is defined in terms of KL divergence. The authors primarily refer to a review article by Zénon (2019): "Eye pupil signals information gain". It is my understanding that Zénon argues that KL divergence quantifies the update of a belief, not the prediction error: "In short, updates of the brain's internal model, quantified formally as the Kullback-Leibler (KL) divergence between prior and posterior beliefs, would be the common denominator to all these instances of pupillary dilation to cognition." (Zénon, 2019).

      From my perspective, the update differs from the prediction error. Prediction error refers to the difference between outcome and expectation, while update refers to the difference between the prior and the posterior. The prediction error can drive the update, but the update is typically smaller, for example, because the prediction error is weighted by the learning rate to compute the update. My interpretation of Zénon (2019) is that they explicitly argue that KL divergence defines the update in terms of the described difference between prior and posterior, not the prediction error.
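
      The distinction between prediction error and update can be made concrete with a minimal sketch. It is not the authors' ideal learner model: the count-based (Dirichlet-style) learner, the direction of the KL divergence (posterior relative to prior), and the function names are assumptions made for illustration. The same outcome yields the same prediction error early and late in learning, yet a much smaller belief update once many observations have accumulated.

```python
import math

def normalize(counts):
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(q, p):
    """KL(q || p) in bits; terms with q_i == 0 contribute nothing."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def observe(counts, outcome):
    """Update outcome counts with one observation and report trial-wise quantities."""
    prior = normalize(counts)
    new_counts = list(counts)
    new_counts[outcome] += 1
    posterior = normalize(new_counts)
    return {
        # outcome (coded as 1) minus the expectation for that outcome
        "prediction_error": 1.0 - prior[outcome],
        "surprise": -math.log2(prior[outcome]),
        # belief update: divergence of the posterior from the prior
        "information_gain": kl_divergence(posterior, prior),
        "entropy": entropy(posterior),
    }

early = observe([1, 1], outcome=0)    # little evidence so far
late = observe([50, 50], outcome=0)   # same expectation, much more evidence

print(early["prediction_error"], late["prediction_error"])
print(early["information_gain"], late["information_gain"])
```

      In this toy case both trials start from the same expectation of 0.5, so prediction error and surprise are identical, while the information gain shrinks as evidence accumulates. This mirrors the role of a decreasing learning rate in weighting the prediction error.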

      The authors also cite a few other papers, including Friston (2010), where I also could not find a definition of the prediction error in terms of KL divergence. For example [KL divergence:] "A non-commutative measure of the non-negative difference between two probability distributions." Similarly, Friston (2010) states: Bayesian Surprise - "A measure of salience based on the Kullback-Leibler divergence between the recognition density (which encodes posterior beliefs) and the prior density. It measures the information that can be recognized in the data." Finally, also in O'Reilly (2013), KL divergence is used to define the update of the internal model, not the prediction error.

      The authors seem to mix up this common definition of the model update in terms of KL divergence and their definition of prediction error along the same lines. For example, on page 4: "KL divergence is a measure of the difference between two probability distributions. In the context of predictive processing, KL divergence can be used to quantify the mismatch between the probability distributions corresponding to the brain's expectations about incoming sensory input and the actual sensory input received, in other words, the prediction error (Friston, 2010; Spratling, 2017)."

      Similarly (page 23): "In the current study, we investigated whether the pupil's response to decision outcome (i.e., feedback) in the context of associative learning reflects a prediction error as defined by KL divergence."

      This is problematic because the results might actually have limited implications for the authors' main perspective (i.e., that the pupil encodes prediction errors) and could be better interpreted in terms of model updating. In my opinion, there are two potential ways to deal with this issue:

      a) Cite work that unambiguously supports the perspective that it is reasonable to define the prediction error in terms of KL divergence and that this has a link to pupillometry. In this case, it would be necessary to clearly explain the definition of the prediction error in terms of KL divergence and dissociate it from the definition in terms of model updating.

      b) If there is no prior work supporting the authors' current perspective on the prediction error, it might be necessary to revise the entire paper substantially and focus on the definition in terms of model updating.

      (2) Operationalization of prediction errors based on frequency, accuracy, and their interaction:

      The authors also rely on a more model-agnostic definition of the prediction error in terms of stimulus frequency ("unsigned prediction error"), accuracy, and their interaction ("signed prediction error"). While I see the point here, I would argue that this approach offers a simple approximation to the prediction error, but it is possible that factors like difficulty and effort can influence the pupil signal at the same time, which the current approach does not take into account. I recommend computing prediction errors (defined in terms of the difference between outcome and expectation) based on a simple reinforcement-learning model and analyzing the data using a pupillometry regression model in which nuisance regressors are controlled, and results are corrected for multiple comparisons.

      (3) The link between model-based (KL divergence) and model-agnostic (frequency- and accuracy-based) prediction errors:

      I was expecting a validation analysis showing that KL divergence and model-agnostic prediction errors are correlated (in the behavioral data). This would be useful to validate the theoretical assumptions empirically.

      (4) Model-based analyses of pupil data:

      I'm concerned about the authors' model-based analyses of the pupil data. The current approach is to simply compute a correlation for each model term separately (i.e., KL divergence, surprise, entropy). While the authors do show low correlations between these terms, single correlational analyses do not allow them to control for additional variables like outcome valence, prediction error (defined in terms of the difference between outcome and expectation), and additional nuisance variables like reaction time, as well as x and y coordinates of gaze.

      Moreover, including entropy and KL divergence in the same regression model could, at least within each task, provide some insights into whether the pupil response to KL divergence depends on entropy. This could be achieved by including an interaction term between KL divergence and entropy in the model.
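
      A regression of this kind can be sketched as follows. Everything in the example is an assumption made for illustration: the data are synthetic, reaction time stands in for the nuisance regressors, and the helper names `solve` and `ols` are hypothetical. The synthetic pupil signal is built so that the slope for KL divergence changes sign with entropy, and ordinary least squares with an interaction column recovers that dependence.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

rng = random.Random(0)
n_trials = 500
kl = [rng.uniform(0, 1) for _ in range(n_trials)]
ent = [rng.uniform(0, 1) for _ in range(n_trials)]
rt = [rng.uniform(0.3, 1.5) for _ in range(n_trials)]

# Synthetic pupil response: the KL slope is 0.5 at zero entropy and -0.5 at entropy 1.
pupil = [0.5 * k - 1.0 * k * e + 0.2 * r + rng.gauss(0, 0.02)
         for k, e, r in zip(kl, ent, rt)]

# Design matrix columns: intercept, KL, entropy, KL x entropy, reaction time.
design = [[1.0, k, e, k * e, r] for k, e, r in zip(kl, ent, rt)]
beta = ols(design, pupil)
for name, b in zip(["intercept", "kl", "entropy", "kl_x_entropy", "rt"], beta):
    print(f"{name:>13}: {b: .3f}")
```

      A reliably non-zero coefficient on the interaction column would indicate that the pupil response to KL divergence depends on entropy within a task. Further nuisance regressors (outcome valence, gaze coordinates) would enter as additional columns of the design matrix.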

      (5) Major differences between experimental tasks:

      More generally, I'm not convinced that the authors' conclusion that the pupil response to KL divergence depends on entropy is sufficiently supported by the current design. The two tasks differ on different levels (stimuli, contingencies, when learning takes place), not just in terms of entropy. In my opinion, it would be necessary to rely on a common task with two conditions that differ primarily in terms of entropy while controlling for other potentially confounding factors. I'm afraid that seemingly minor task details can dramatically change pupil responses. The positive/negative difference in the correlation with KL divergence that the authors interpret to be driven by entropy may depend on another potentially confounding factor currently not controlled.

      (6) Model validation:

      My impression is that the ideal learner model should work well in this case. However, the authors don't directly compare model behavior to participant behavior ("posterior predictive checks") to validate the model. Therefore, it is currently unclear if the model-derived terms like KL divergence and entropy provide reasonable estimates for the participant data.

      (7) Discussion:

      The authors interpret the directional effect of the pupil response w.r.t. KL divergence in terms of differences in entropy. However, I did not find a normative/computational explanation supporting this interpretation. Why should the pupil (or the central arousal system) respond differently to KL divergence depending on differences in entropy?

      The current suggestion (page 24) that might go in this direction is that pupil responses are driven by uncertainty (entropy) rather than learning (quoting O'Reilly et al. (2013)). However, this might be inconsistent with the authors' overarching perspective based on Zénon (2019) stating that pupil responses reflect updating, which seems to imply learning, in my opinion. To go beyond the suggestion that the relationship between KL divergence and pupil size "needs more context" than previously assumed, I would recommend a deeper discussion of the computational underpinnings of the result.

    1. eLife Assessment

      This important study analyzes a large dataset of Salmonella gallinarum whole-genome sequences and provides findings regarding the population structure of this avian-specific pathogen. The convincing results indicate regional adaptation of the mobilome-driven resistome and a role in the evolutionary trajectory of this pathogen that will interest microbiologists and researchers working on genomics, evolution, and antimicrobial resistance.

    2. Reviewer #1 (Public review):

      Summary:

      The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.

      Strengths:

      The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.

    3. Reviewer #2 (Public review):

      Summary:

      The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.

      Strengths:

      - It doesn't seem that much is known about this serovar, so publicly available new sequences from a high burden region are a valuable addition to the literature.

      - Combining these sequences with publicly available sequences is a good way to better contextualise any findings.

      - The genomic analyses have been greatly improved since the first version of the manuscript, and appropriately analyse the population and date emergence of clades.

      - The SNP thresholds are contextualised in terms of evolutionary time.

      - The importance and context of the findings are fairly well described.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review):

      Summary:

      The investigators in this study analyzed the dataset assembly from 540 Salmonella isolates, and those from 45 recent isolates from Zhejiang University of China. The analysis and comparison of the resistome and mobilome of these isolates identified a significantly higher rate of cross-region dissemination compared to localized propagation. This study highlights the key role of the resistome in driving the transition and evolutionary history of S. Gallinarum.

      Strengths:

      The isolates included in this study were from 16 countries in the past century (1920 to 2023). While the study uses S. Gallinarum as the prototype, the conclusion from this work will likely apply to other Salmonella serotypes and other pathogens.

      Thank you very much for your positive feedback. We recognize, as you noted, that emphasizing Salmonella enterica Serovar Gallinarum in the title may lead readers to perceive our methods and conclusions as overly restrictive. In light of your evaluation of our work, we have revised the title to: “Avian-specific Salmonella transition to endemicity is accompanied by localized resistome and mobilome interaction”. We believe this final version not only reflects the applicability of our conclusions, as you appreciated, but also addresses your previous suggestion to highlight the resistome and mobilome.

      Revisions in the manuscript Lines: 1-3

      Weaknesses:

      While the isolates came from 16 countries, most strains in this study were originally from China.

      We believe that this issue was discussed in detail in our previous response. Although potential bias exists, we have minimized its impact by constructing the largest global S. Gallinarum genome dataset to date. In addition, we have further emphasized these limitations in the manuscript.

      Comments on revisions:

      This reviewer is happy with the detailed responses from the authors regarding revising this manuscript. I do not have further comments.

      We greatly appreciate your positive feedback and are pleased that our responses have addressed your concerns.

      Reviewer #2 (Public review):

      Summary:

      The authors sequence 45 new samples of S. Gallinarum, a commensal Salmonella found in chickens, which can sometimes cause disease. They combine these sequences with around 500 from public databases, determine the population structure of the pathogen, and coarse relationships of lineages with geography. The authors further investigate known anti-microbial genes found in these genomes, how they associate with each other, whether they have been horizontally transferred, and date the emergence of clades.

      Strengths:

      - It doesn't seem that much is known about this serovar, so publicly available new sequences from a high burden region are a valuable addition to the literature.

      - Combining these sequences with publicly available sequences is a good way to better contextualise any findings.

      - The genomic analyses have been greatly improved since the first version of the manuscript, and appropriately analyse the population and date emergence of clades.

      - The SNP thresholds are contextualised in terms of evolutionary time.

      - The importance and context of the findings are fairly well described.

      Thank you so much for your thorough review and constructive comments on the manuscript.

      Weaknesses:

      - There are still a few issues with the genomic analyses, although they no longer undermine the main conclusions:

      We are grateful for the valuable time and effort you have dedicated to improving our manuscript. In this revision, we have provided a point-by-point response to each of your concerns. Moreover, with the addition of new supplementary materials and modifications to the figures, we have re-examined and adjusted the numbering of figures and supplementary materials in the text to ensure they appear correctly in the manuscript.

      (1) Although the SNP distance is now considered in terms of time, the 5 SNP distance presented still represents ~7yrs evolution, so it is unlikely to be a transmission event, as described. It would be better to use a much lower threshold or describe the interpretation of these clusters more clearly. Bringing in epidemiological evidence or external references on the likely time interval between transmissions would be helpful.

      We sincerely thank you for highlighting this issue. We appreciate your concern regarding the use of a 5-SNP threshold to define a transmission event, especially given the approximate 7-year evolutionary timeframe. Considering our updated estimate for the evolutionary rate of S. Gallinarum (approximately 0.74 SNPs per year, with a 95% HPD range of 0.42 to 1.06), we have revised the manuscript to use a 2-SNP threshold (approximately representing less than two years of evolution) to better control the temporal span of transmission events. In addition, we have updated the manuscript to reflect this new threshold and demonstrated that the use of a more stringent SNP threshold does not affect the overall conclusions of the study.

      Specifically, we adopted the newly established 2-SNP threshold to update Figure 3a and corresponding Supplementary Figure 8. The heatmap on the far right of New Figure 3a illustrates the SNP distances among 45 newly isolated S. Gallinarum strains from two locations in Zhejiang Province (Taishun and Yueqing). New Supplementary Figure 8 simulates potential transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other regions of China with available provincial information (n=435). These analyses collectively demonstrate the localized transmission patterns of bvSP within China.

      For New Figure 3a, we found that even with the 2-SNP threshold, the number of potential transmission events among the 45 newly isolated S. Gallinarum strains from the two Zhejiang locations (Taishun and Yueqing) remains unchanged. In fact, we observed that the results from SNP tracing using an SNP threshold of less than 5 are consistent (see Author response image 1). 

      Author response image 1.

      Clustering results of 45 newly isolated S. Gallinarum strains using different SNP thresholds of 1, 2, 3, 4, and 5 SNPs. The five subplots represent the clustering results under each threshold. Each point corresponds to an individual strain, and lines connect strains with potential transmission relationships.
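      The kind of threshold-based clustering described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the toy isolate names and the toy distances are all hypothetical, and treating a pair as linked when its distance is less than or equal to the threshold is an assumption.

```python
from itertools import combinations


def cluster_by_snp_threshold(names, dist, threshold):
    """Link isolates whose pairwise SNP distance is <= threshold and
    return (links, clusters) using single-linkage via union-find."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent pointers to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    links = []
    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            links.append((a, b))
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return links, sorted(sorted(c) for c in clusters.values())


# Toy data (invented for illustration only).
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "C")): 4,
    frozenset(("B", "C")): 3,
    frozenset(("A", "D")): 40,
    frozenset(("B", "D")): 41,
    frozenset(("C", "D")): 39,
}

if __name__ == "__main__":
    for threshold in (1, 2, 3, 4, 5):
        links, clusters = cluster_by_snp_threshold(names, dist, threshold)
        print(threshold, len(links), clusters)
```

      Running the same function over several thresholds mirrors the five subplots: if the links do not change between thresholds, the clustering is insensitive to the choice within that range.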

      For New Supplementary Figure 8, we employed the 2-SNP threshold and found that the number of transmission events between the bvSP strains isolated from Zhejiang Province (n=95) and those from other Chinese provinces (n=435) decreased from 91 to 53. The names of the strains involved in these potential transmission events are listed in Supplementary Table 5.

      Revisions in the manuscript

      Lines: 352-357

      Figures: Figure 3; Supplementary Figure 8

      Table: Supplementary Table 5

      (2) The HGT definition has not fundamentally been changed and therefore still has some issues, mainly that vertical evolution is still not systematically controlled for. 

      We sincerely thank you for highlighting this issue. We hope the following explanation will help clarify and improve our manuscript, as well as address your concerns.

      In bacteria, mobile genetic elements (MGEs) such as plasmids, transposons, integrons, and prophages, as mentioned in our manuscript, are segments of DNA that encode enzymes and proteins responsible for mediating the movement of genetic material between bacterial genomes (commonly referred to as “jumping genes”). These MGEs contribute to the mechanisms of horizontal gene transfer (HGT) in Salmonella, including transduction (via prophages), conjugation (via plasmids), and transposition (via integrons and transposons) (Nat Rev Microbiol. 2005 Sep;3(9):722-32). These “jumping genes” can enable Salmonella to acquire additional antimicrobial resistance genes (ARGs), which may not only originate from other Salmonella strains but also from distantly related species.

      To further address your concern regarding the systematic control of vertical evolution, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences mentioned in our manuscript. We chose HGTphyloDetect because, as noted, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.

      Using the HGTphyloDetect pipeline, we extracted base sequences for the eight ARGs shown in Figure 6b with an HGT frequency greater than zero (bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, sul2, aph(3’’)-Ib, tet(A), aph(6)-Id). For bla<sup>TEM-1B</sup>, sul1, dfrA17, aadA5, and sul2, the HGT frequency reached 100% across different isolates, indicating that these ARG sequences have a unique sequence type. In contrast, due to the ResFinder settings requiring both similarity and coverage to meet a minimum value of 90%, the base sequences for aph(3’’)-Ib, tet(A), and aph(6)-Id are not unique. Consequently, we applied the HGTphyloDetect pipeline individually to each sequence type of ARGs to verify their association with HGT events. Specifically, among 436 bvSP isolates collected in China, we identified two sequence types of aph(3’’)-Ib, four sequence types of tet(A), and three sequence types of aph(6)-Id.

      Subsequently, to identify potential ARGs horizontally acquired from evolutionarily distant organisms, we queried the translated amino acid sequences of each ARG against the National Center for Biotechnology Information (NCBI) non-redundant protein database. We then evaluated whether these sequences were products of HGT by calculating Alien Index (AI) scores and out_perc values.

      The calculation of AI score is as follows:

      In this study, bbhG and bbhO represent the E-values of the best blast hit in ingroup and outgroup lineages, respectively. The outgroup lineage is defined as all species outside of the kingdom, while the ingroup lineage encompasses species within the kingdom but outside of the subphylum. An AI score ≥ 45 is considered a strong indicator that the gene in question is likely derived from an HGT event.

      Regarding the calculation method for out_perc:

      Finally, according to the definition provided by the HGTphyloDetect pipeline, ARGs with AI score ≥ 45 and out_perc ≥ 90% are presumed to be potential candidates for HGT from evolutionarily distant species. We have compiled the calculation results for the aforementioned genes in New Supplementary Table 9. The results indicate that all ARGs presented in Figure 6b, which exhibited an HGT frequency greater than zero, were acquired horizontally by S. Gallinarum. Based on these findings, we have revised the manuscript accordingly.
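      The two criteria above can be sketched in a few lines. This is a hedged illustration rather than the pipeline's code: it assumes the Alien Index takes the form AI = ln(bbhG + 1e-200) - ln(bbhO + 1e-200), as we understand it from the HGTphyloDetect publication, and it assumes out_perc is the share of outgroup hits among hits classified as ingroup or outgroup. The function names and the label-based input are hypothetical, and the pipeline's own hit-filtering rules are not reproduced.

```python
import math

# Small constant that keeps the logarithm defined for E-values of 0
# (assumed to match the published Alien Index definition).
EPS = 1e-200


def alien_index(bbh_ingroup, bbh_outgroup):
    """AI from the best-hit E-values in the ingroup (bbhG) and
    outgroup (bbhO) lineages."""
    return math.log(bbh_ingroup + EPS) - math.log(bbh_outgroup + EPS)


def out_perc(hit_groups):
    """Fraction of classified hits that fall in the outgroup.

    hit_groups: iterable of 'ingroup' / 'outgroup' labels, one per hit.
    The choice of denominator is an assumption of this sketch.
    """
    labels = [g for g in hit_groups if g in ("ingroup", "outgroup")]
    if not labels:
        return 0.0
    return labels.count("outgroup") / len(labels)


def is_hgt_candidate(ai, out_fraction, ai_min=45.0, out_min=0.90):
    """Apply the two thresholds quoted in the response."""
    return ai >= ai_min and out_fraction >= out_min


if __name__ == "__main__":
    # Invented example values, for illustration only.
    ai = alien_index(bbh_ingroup=1.0, bbh_outgroup=1e-50)
    frac = out_perc(["outgroup"] * 9 + ["ingroup"])
    print(round(ai, 2), frac, is_hgt_candidate(ai, frac))
```

      A much smaller E-value in the outgroup than in the ingroup gives a large positive AI, which is why a high score points to a donor outside the recipient's own kingdom.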

      Revisions in the manuscript

      Lines: 302-307; 616-650; 955-957

      Table: Supplementary Table 9

      Using a 5kb window is not sufficient, as LD may extend across the entire genome.

      We agree with your point that linkage disequilibrium (LD) could influence the transmission of genes within chromosomal regions. LD can lead to the non-random co-occurrence of alleles at different loci within a population. Considering that horizontal gene transfer (HGT) events involving more distantly related ARGs may be accompanied by vertical propagation on chromosomes, and to simultaneously assess the impact of LD, we conducted two evaluations.

      It is important to note that the following assessments are based on the assumption that plasmid replicons detected by PlasmidFinder are part of self-replicating, extrachromosomal DNA.

      (1) In the revised pipeline used to calculate ARG HGT frequencies, we categorized a total of 621 ARGs carried by 436 bvSP isolates collected in China and found that 415 of these ARGs were located on MGEs. We further investigated the distribution of these 415 ARGs across different MGEs, taking into account the complex nesting relationships among them. We observed that 90% of the ARGs (372/415) were located on plasmid contigs. It is important to clarify that this finding does not contradict our statement in the manuscript regarding plasmids and transposons as the primary reservoirs for resistome geo-temporal dissemination. This is because transposons, integrons, and prophages carrying ARGs can also be found on plasmids. Additionally, only 25 bvSG isolates from China contained ARGs, which were likely acquired via transposons or integrons located on the chromosome.  

      (2) In our manuscript, we searched for ARGs within 5 kb upstream and downstream (10 kb in total) of transposons and integrons. (The BLASTn parameters used in the Bacant pipeline to identify transposons and integrons were set to a coverage threshold of 60%, rather than 100%.) However, in light of the potential impact of LD on vertical transmission, we expanded our search to 10 kb upstream and downstream (20 kb in total) for these 25 isolates. The decision to expand the search range to 10 kb upstream and downstream was based on two considerations: 1) Based on the literature, we determined the overall lengths of the integrons and transposons carried by the 25 isolates (Tn801, Tn6205, Tn1721, In498, In1440, In473, and In282) and found that the maximum length of these elements is ~13.5 kb, so a 10 kb upstream and downstream threshold effectively covers these integrons/transposons. 2) Genomic fragmentation due to next-generation sequencing restricts the feasible search range. We present the results of this expanded search for colocalization of ARGs with transposons and integrons at Figshare: https://doi.org/10.6084/m9.figshare.28129130.v1

      We found that these results were consistent with those obtained using the previous search range.
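      The flanking-window search described above can be sketched as an interval check on each contig. This is a minimal illustration under stated assumptions, not the authors' notebook: the `Feature` class, the function names and the toy coordinates are hypothetical, and an ARG is counted as co-located when any part of it overlaps the element extended by the flank on both sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    contig: str
    start: int  # 1-based start coordinate on the contig
    end: int    # inclusive end coordinate


def within_window(arg, mge, flank_bp):
    """True if the ARG overlaps the MGE extended by flank_bp on each side."""
    if arg.contig != mge.contig:
        return False
    return arg.end >= mge.start - flank_bp and arg.start <= mge.end + flank_bp


def colocated_pairs(args, mges, flank_bp):
    """List (ARG name, MGE name) pairs that satisfy the window criterion."""
    return [(a.name, m.name)
            for a in args
            for m in mges
            if within_window(a, m, flank_bp)]


# Toy coordinates, invented for illustration only.
mges = [Feature("toy_transposon", "contig_1", 20_000, 31_000)]
args = [
    Feature("toy_ARG_1", "contig_1", 37_000, 38_200),  # 6 kb downstream
    Feature("toy_ARG_2", "contig_2", 500, 1_700),      # different contig
]

if __name__ == "__main__":
    for flank in (5_000, 10_000):
        print(flank, colocated_pairs(args, mges, flank))
```

      Comparing the output at 5 kb and 10 kb flanks is the same kind of sensitivity check reported in the response: if the set of co-located pairs does not change, the conclusion does not depend on the window size.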

      Taken together, these results suggest that although linkage disequilibrium may influence genetic processes within chromosomal regions (particularly for the few chromosome-associated antibiotic resistance genes linked to integrons and transposons), the overall impact in our study is likely minimal. This conclusion is supported by the observation that 90% of the ARGs in our dataset are located on plasmids, and even an expanded search range does not alter this outcome. Additionally, by incorporating Alien Index scores and calculating out_perc, we can further confirm the occurrence of horizontal gene transfer events.

      However, it is undeniable that other studies using our current pipeline may be affected. As a temporary remedial measure, we have included the following note in the "README" file (https://github.com/tjiaa/Cal_HGT_Frequency):

      “Note: Considering that ARGs located on the chromosome and carried by mobile genetic elements—such as integrons and transposons—may introduce potential computational errors, we recommend evaluating the number of ARGs associated with these elements on the chromosome during your analysis. If a majority of ARGs in your dataset fall into this category, we suggest using additional methods to evaluate the potential impact of linkage disequilibrium. Additionally, by modifying the “MGE_start” and “MGE_end” parameters in the “eLife_MGE_ARG_Co_location.ipynb” script, you can assess the distance between different ARGs and integrons or transposons on the chromosome. This approach will further aid in evaluating the impact of linkage disequilibrium on the genetic process.”

      We believe this approach will assist researchers in further assessing the potential impact of vertical evolution and help other users determine whether additional methods are necessary to account for such effects.

      As the authors have now run gubbins correctly, they could use the results from this existing analysis to find recent HGT.

      We sincerely thank you for your valuable suggestion. Utilizing additional methods to predict potential horizontal gene transfer (HGT) events could indeed enhance the robustness of the results. However, "jumping genes" often occur among evolutionarily distant species, rendering the use of Gubbins potentially unsuitable for these distant HGT events.

      Furthermore, the primary focus of our study is to identify HGT of antimicrobial resistance genes (ARGs) in the Salmonella genome driven by mobile genetic elements. Therefore, we employed the HGTphyloDetect pipeline developed by Le Yuan et al. (Brief Bioinform. 2023 Mar 19;24(2):bbad035) to control for vertical evolution in the ARG sequences. The specific computational methods and conclusions have been detailed above.

      To define mobilisation, perhaps a standard pipeline (e.g. https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) would be more convincing.

      Thank you for your valuable suggestion. We agree that defining mobilization using a standardized pipeline can add rigor and clarity to our analysis. The pipeline you referenced (https://github.com/EBI-Metagenomics/mobilome-annotation-pipeline) is an excellent resource and provides a robust approach to the identification and annotation of mobile genetic elements.

      We have examined and run this pipeline, which uses “IntegronFinder” and “ICEfinder” to detect integrons, “geNomad” to identify plasmids, and “geNomad” and “VIRify” to detect prophages. Our initial checks revealed that the numbers of integrons, plasmids, and prophages identified using this pipeline were consistent with those detected in our study. However, due to the significantly different output formats, the results from this pipeline could not be integrated with the pipeline we used for calculating HGT frequency.

      We will incorporate the standardized pipeline you suggested in future studies to further improve the reliability of our findings.

      (3) The invasiveness index is better described, but the authors still did not provide convincing evidence that the small difference is actually biologically meaningful (there was no statistical difference between the two strains provided in response Figure 6). What do other Salmonella papers using this approach find, and can their links be brought in? If there is still no good evidence, a better description of this difference would help make the conclusions better supported.

      We sincerely appreciate your thoughtful feedback. The initial introduction of the invasiveness index in our manuscript aimed to quantitatively assess the differences in invasiveness between two geographically distinct strains of S. Gallinarum (isolated from Taishun and Yueqing) by comparing the degradation of 196 top predicted genes associated with invasiveness in their genomes. We found a highly significant statistical difference (P < 0.0001) in the invasiveness index between them.

      Several studies have also employed the invasiveness index to predict biological relevance in Salmonella strains, and we believe these examples provide further context for our approach:

      (1) Caisey V. Pulford et al, Nat Microbiol, 2021, used the same method to calculate the invasiveness index for Salmonella Typhimurium and employed it to characterize the invasiveness of different lineage strains. They found that Salmonella in Lineage-3 exhibited the highest invasiveness index, suggesting an adaptation from an intestinal to a systemic lifestyle. The authors noted, "Although the invasiveness index cannot yet be experimentally validated, Salmonella isolates with different invasiveness indices produce distinct clinical symptoms in a human population (BMC Med. 2020 Jul 17; 18(1):212)". They emphasized the necessity of developing more robust methods to measure Salmonella invasiveness.

      (2) Sandra Van Puyvelde et al, Nat Commun, 2019, reported that Salmonella Typhimurium sequence type 313 (ST313) lineage II.1 exhibited a higher invasiveness index compared to lineage II, suggesting that the two lineages might have distinct adaptations to an invasive lifestyle. Further experiments demonstrated significant differences between these lineages in terms of biofilm formation (a red, dry and rough (RDAR) assay) and metabolic capacity for carbon compounds.

      (3) Wim L. Cuypers et al, Nat Commun, 2023, calculated the invasiveness index for 284 global Salmonella Concord strains across different lineages and found that Lineage-4 potentially exhibited the highest invasiveness.

      Given this evidence, we acknowledge that no significant difference in mortality was observed between the L2b and L3b S. Gallinarum strains in 16-day-old SPF chicken embryos. Existing literature suggests that strains with higher invasiveness indices may still exhibit differences in biofilm formation and metabolic capacities, reflecting their adaptation to different host environments. As such, we maintain that the invasiveness index remains a valuable metric for evaluating the genomic differences between S. Gallinarum strains from Taishun and Yueqing. We plan to further investigate these differences through phenotypic experiments in future work.

      In the revised manuscript, we have added the following discussion along with additional references:

      Lines 358-365: “Moreover, the invasiveness index of bvSP from Taishun and Yueqing suggests that different lineages of S. Gallinarum recovered from distinct regions may exhibit biological differences. Previous studies have shown that strains with higher invasiveness indexes tend to be more virulent in hosts (30, 31), potentially causing neurological or arthritic symptoms in S. Gallinarum infections. Furthermore, strains with varying invasiveness indexes have been confirmed to differ in their biofilm formation abilities and metabolic capacities for carbon compounds (32).”

      Revisions in the manuscript:

      Lines: 358-365, 806-827.

      In summary, the analysis is broadly well described and feels appropriate. Some of the conclusions are still not fully supported, although the main points and context of the paper now appear sound.

      Thank you so much for your positive evaluation of our work. We hope that the revised manuscript meets your expectations and offers a more accurate interpretation of our findings.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      This is a great improvement over the first version and I thank the authors for a thorough response, as well as changing their conclusions in response to their improvements.

      Other small remaining issues:

      Figure 3: Heatmap of SNPs is hard to read in grayscale. It also just represents the between clade distances already shown by the tree. It would be more useful to present intraclade distances only to see the SNP resolution _within_ each lineage. Using a better colour scheme would also help.

      Thank you for your insightful comments and suggestions regarding Figure 3. We agree that the grayscale heatmap may present challenges in terms of visual clarity. To address this, we have updated the heatmap with a more distinct color gradient, ensuring better contrast and easier interpretation (New Figure 3). 

      Regarding your second suggestion: "It would be more useful to present intraclade distances only to see the SNP resolution within each lineage," we believe it is already addressed in the current version of New Figure 3. Specifically, the heatmap on the right side of New Figure 3 illustrates the SNP distances between S. Gallinarum isolates from Taishun and Yueqing, with the goal of demonstrating that genomic variation within isolates from a single region is generally smaller compared to those from different regions. In this figure, 45 newly isolated S. Gallinarum strains are categorized into two lineages: L2b and L3b. The heatmap on the right side of Figure 3 displays the SNP distances between all pairwise combinations of these 45 strains, where the intraclade distances are represented by the red regions (highlighting the pairwise distances within each lineage, specifically L3b and L2b, which are indicated by two triangles). The between-clade distances are shown by the blue regions.

      We also agree that exploring the intraclade distances across the entire dataset of 580 S. Gallinarum strains could provide additional insights. However, this analysis would extend beyond the scope of the current section.

      Revisions in the manuscript

      Line: 998

      Figure: Figure 3

      Please remove Figure 6c, it does not add anything to the paper and raises questions about performing this regression.

      Thank you for pointing out this issue. We have removed Figure 6c and the corresponding description in the "Results" section from the manuscript (New Figure 6).

      Revisions in the manuscript

      Lines: 316, 319, 1035-1041.

      Figure: Figure 6

      Again, thank you all for your time and efforts in reviewing our work. We believe the improved manuscript meets the high standards of the journal.

    1. eLife Assessment

      NAD deficiency perturbs embryonic development resulting in multiple congenital malformations, collectively termed Congenital NAD Deficiency Disorder (CNDD). The authors report fundamental findings demonstrating that extra-embryonic visceral yolk sac endoderm is critical for NAD de novo synthesis during early organogenesis and perturbations of this pathway may underlie CNDD. The authors combine gene expression with metabolic assays to provide solid evidence of an essential role of the extra-embryonic visceral yolk sac in both mouse and human embryos.

    2. Reviewer #1 (Public review):

      Summary:

      This study investigated the mechanism underlying Congenital NAD Deficiency Disorder (CNDD) using a mouse model with loss of function of the HAAO enzyme, which mediates a key step in the NAD de novo synthesis pathway. This study builds on the observation that the kynurenine pathway is required in the conceptus, as HAAO null embryos are sensitive to maternal deficiency of NAD precursors (vitamin B3) and tryptophan, and narrows the window of sensitivity to a 3-day period.

      An important finding is that de novo NAD synthesis occurs in an extra-embryonic tissue, the visceral yolk sac, before the liver develops in the embryo. It is suggested that lack of this yolk sac activity leads to impaired NAD supply in the embryo leading to structural abnormalities found later in development.

      Strengths:

      Previous studies show that HAAO activity is required for normal development, as HAAO null embryos develop abnormalities under conditions of maternal vitamin B3 deficiency, indicating a requirement for NAD synthesis in the conceptus. Analysis of scRNA-seq datasets combined with metabolite analysis of yolk sac tissue shows that the NAD synthesis pathway is expressed and functional in the yolk sac from E10.5 onwards (prior to liver development).

      HAAO enzyme assay enabled quantification of enzyme activity in relevant tissues including liver (from E12.5), embryo, placenta and yolk sac (from E11.5).

      Comprehensive metabolite analysis of the NAD synthesis pathway supports the predicted effects of HAAO knockout and provides analysis of yolk sac, placenta and embryo at a series of stages.

      The dietary study (with lower vitamin B3 in maternal diet from E7.5-10.5) is an incremental addition to previous studies which imposed similar restrictions from E7.5-12.5. Nevertheless, this emphasises the importance of the synthesis pathway on the conceptus at stages before liver activity is prominent.

      Weaknesses:

      The current dietary study narrows the period when deficiency can cause malformations (analysed at E18.5), and altered metabolite profiles (eg, increased 3HAA, lower NAD) are detected in yolk sac and embryo at E10.5.

      More importantly, there is still a question of whether, in addition to the yolk sac, there is HAAO activity within the embryo itself. HAAO activity in the embryo has been assayed as early as E11.5, with minimal activity prior to E12.5 (when it is assayed in liver). These findings support the hypothesis that within the conceptus (embryo, chorioallantoic placenta and visceral yolk sac) the embryo is unlikely to be the site of NAD synthesis prior to liver development.

      Evidence for lack of function of the NAD synthesis pathway from kynurenine in the embryo itself at E7.5-10.5 comes from reanalysis of scRNA-seq. This suggests low or absent expression of HAAO in the embryo prior to E10.5 (corresponding to the period when the authors have demonstrated that de novo NAD synthesis in the conceptus is needed). The caveat to this conclusion is that additional analysis of RNA and/or protein expression in the embryos at E7.5-10.5 has not been performed to validate the scRNA-seq data.

    3. Reviewer #2 (Public review):

      Summary:

      Disruption of the nicotinamide adenine dinucleotide (NAD) de novo synthesis pathway, by which L-tryptophan is converted to NAD, results in multi-organ malformations, which have collectively been termed Congenital NAD Deficiency Disorder (CNDD).

      While NAD de novo synthesis is primarily active in the liver postnatally, the site of activity prior to and during organogenesis is unknown. However, mouse embryos are susceptible to CNDD between E7.5-E12.5, before the embryo has developed a functional liver. Therefore, NAD de novo synthesis is likely active in another cell or tissue during this time window of susceptibility.

      The body of work presented in this paper continues the corresponding author's lab's investigation of the causes and effects of NAD deficiency. The primary goal was to determine the cell or tissue responsible for NAD de novo synthesis during early embryogenesis.

      The authors conclude that visceral yolk sac endoderm is the source of NAD de novo synthesis, which is essential for mouse embryonic development, and furthermore that the dynamics of NAD synthesis are conserved in human equivalent cells and tissues, the perturbation of which results in CNDD.

      Strengths:

      Overall, the primary findings regarding the source of NAD synthesis, the temporal requirement and the conservation between rodent and human species are quite novel and important for our understanding of NAD synthesis, its function and its role in CNDD.

      The authors used UHPLC-MS/MS to quantify NAD+ and NAD-related metabolites and showed convincingly that the NAD salvage pathway can compensate for the loss of NAD synthesis in Haao-/- embryos, then determined that Haao activity was present in the yolk sac prior to hepatic development identifying this organ as the site of de novo NAD synthesis. Dietary modulation between E7.5-10.5 was sufficient to induce CNDD phenotypes, narrowing the window of susceptibility, and then re-analysis of RNA-seq datasets suggested the endoderm was the cell source of NAD synthesis.

      Weaknesses:

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.

      Can the authors define whether the role of the NAD pathway is the same across tissue or organ systems? By this I mean: is the molecular or cellular effect of NAD deficiency the same in the vertebrae and in organs such as the kidney? What unifies the effects on these specific tissues and organs, and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depends on number of cells sequenced, the technology (methodology) used, the depth of sequencing and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.

      Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or, in the absence of a conditional allele, tetraploid embryo-ES cell complementation approaches. However, temporal dietary intervention can approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.

      In further revisions, the authors have added data to Supp Table 4 and Supplemental Figures 1 and 2.

      Although the authors did not perform in situ hybridization for some of the genes requested to define the critical cell type of expression, available scRNA-sequencing suggests the yolk sac endoderm is the only likely source of NAD synthesis prior to its synthesis in the liver. Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD still requires validation, but nonetheless it seems likely given the absence of a functional liver in embryos prior to E12.5. The authors provided some additional data pertaining to the type of kidney and vertebral anomalies observed, which makes these data more complete.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      A number of modifications/additions have been made to the text which help to clarify the background and details of the study and I feel have improved the study.

      NAD deficiency induced using the dietary/Haao null model showed a window of susceptibility at E7.5-10.5. Further, HAAO enzyme activity data has been added at E11.5, and the minimal HAAO activity in the embryo at E11.5 supports the hypothesis that the NAD synthesis pathway from kynurenine is not functional until the liver starts to develop.

      The caveat to this is that absence of expression/activity in embryonic cells at E7.5-10.5 relies on previous scRNA-seq data. Both reviewers commented that analysis of RNA and/or protein expression at these stages (E7.5-10.5) would be necessary to rule this out, and would strongly support the conclusions regarding the necessity for yolk sac activity.

      There are a number of antibodies for HAAO, KYNU etc., so it is surprising if none of these are specific for the mouse proteins, while an alternative approach, in situ hybridisation, would also be possible.


      The following is the authors’ response to the original reviews.

      Reviewer 1 Public Review:

      The current dietary study narrows the period when deficiency can cause malformations (analysed at E18.5), and altered metabolite profiles (eg, increased 3HAA, lower NAD) are detected in the yolk sac and embryo at E10.5. However, without analysis of embryos at later stages in this experiment it is not known how long is needed for NAD synthesis to be recovered - and therefore until when the period of exposure to insufficient NAD lasts. This information would inform the understanding of the developmental origin of the observed defects.

      Our previously published work (Cuny et al 2023 https://doi.org/10.1242/dmm.049647) indicates that the timing of NAD de novo synthesis pathway precursor availability, and consequently the timing of NAD deficiency during organogenesis, drives which organs are affected in their development. Furthermore, experimental data from another project (manuscript submitted) show that mouse embryos (from mothers on an NAD precursor restricted diet that induces CNDD) were NAD deficient at E9.5 and E11.5, but embryo NAD levels were fully recovered at E14.5 when compared to same-stage embryos from mothers on a precursor-sufficient diet. This was observed irrespective of the embryos’ Haao genotype. In the current study, NAD precursor provision was only restricted until E10.5. Thus, we expect that our embryos phenotyped at E18.5 had recovered their NAD levels back to normal by E14.5 at the latest. More research, beyond the scope of the current manuscript, is required to spatio-temporally link embryonic NAD deficiency to the occurrence of specific defect types and elucidate the mechanistic origin of the defects. To acknowledge this, we updated the respective Discussion paragraph on page 7 and added the following statement: “This observation supports our hypothesis that the timing of NAD deficiency during organogenesis determines which organs/tissues are affected (Cuny et al., 2023), but more research is needed to fully characterise the onset and duration of embryonic NAD deficiency in dietary NAD precursor restriction mouse models.”

      More importantly, there is still a question of whether in addition to the yolk sac, there is HAAO activity within the embryo itself prior to E12.5 (when it has first been assayed in the liver - Figure 1C). The prediction is that within the conceptus (embryo, chorioallantoic placenta, and visceral yolk sac) the embryo is unlikely to be the site of NAD synthesis prior to liver development. Reanalysis of scRNA-seq (Fig 1B) shows expression of all the enzymes of the kynurenine pathway from E9.5 onwards. However, the expression of another available dataset at E10.5 (Fig S3) suggested that expression is 'negligible'. While the expression in Figure 1B, Figure S1 is weak, this creates a lack of clarity about the possible expression of HAAO in the hepatocyte lineage, or especially elsewhere in the embryo prior to E10.5 (corresponding to the period when the authors have demonstrated that de novo NAD synthesis in the conceptus is needed). Given these questions, a direct analysis of RNA and/or protein expression in the embryos at E7.5-10.5 would be helpful.

      We now have included additional data showing that whole embryos at E11.5 and embryos with their livers removed at E14.5 have negligible HAAO enzyme activity. The observed lack of HAAO activity in the embryo at E11.5 is consistent with the absence of a functional embryonic liver at that stage. Thus, it confirms that the embryo is dependent on extraembryonic tissues (the yolk sac) for NAD de novo synthesis prior to E12.5. The additional datasets are now included in Supplementary Table S1 and as Supplementary Figure 2. The Results section on page 2 has been updated to refer to these datasets.

      Reviewer #2 (Public Review): 

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency. 

      We now provide more information about the malformation types in the Results on page 4. Also, Table S4 now defines the missing vertebral, sternum, and kidney descriptors.

      Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean is the molecular or cellular effect of NAD deficiency is the same in the vertebrae and organs such as the kidney. What unifies the effects on these specific tissues and organs and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway? 

      This is a good comment, highlighting that further research, beyond the scope of this manuscript, is needed to better understand the underlying mechanisms of CNDD causation. We have expanded the Discussion paragraph “NAD deficiency in early organogenesis is sufficient to cause CNDD” to indicate that while the timing of NAD deficiency during embryogenesis explains variability in phenotypes among the CNDD spectrum, it is unknown why other organs/tissues are seemingly not affected by NAD deficiency.

      To answer the reviewer’s questions and elucidate the underlying cellular and molecular processes in individual organs affected by NAD deficiency, a multiomic approach is required. This is because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis, the reviewer may refer to our recent review article (Dunwoodie et al 2023 https://doi.org/10.1089/ars.2023.0349). 

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis. 

      We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the main site of NAD de novo synthesis) but these produced non-specific bands in western blotting experiments. Therefore, immunostaining studies on embryonic tissues were not feasible. 

      However, we agree that histological methods such as in situ hybridisation would provide secondary validation of the exact cell types that express these genes. To acknowledge this, we have updated a sentence on page 5 referring to the data shown in Figure 6C as follows: “While histological methods such as in situ hybridisation would be required to confirm the exact cell types expressing these genes, the available expression data indicates that the genes encoding those enzymes required to convert L-kynurenine to NAD (kynurenine pathway) are exclusively expressed in the yolk sac endoderm lineage from the onset of organogenesis (E8.0-8.5).”

      Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or, in the absence of a conditional allele, could be performed by tetraploid embryo-ES cell complementation approaches. But temporal dietary intervention can also approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.

      Reviewer 1 has made a similar comment about confirming that NAD de novo synthesis activity is indeed limited to extraembryonic tissues (the yolk sac) and absent in the embryo prior to development of an embryonic liver. We now have included additional data showing that whole embryos at E11.5 and embryos with their livers removed at E14.5 have negligible HAAO enzyme activity. The observed lack of HAAO activity in the embryo at E11.5 is consistent with the absence of a functional embryonic liver at that stage. We think this provides enough proof that the embryo is dependent on extraembryonic tissues (the yolk sac) for NAD de novo synthesis prior to E12.5. The additional datasets are now included in Supplementary Table S1 and as Supplementary Figure 2. The Results section on page 2 has been updated to refer to these data.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Introduction (page 1) introduces mouse models with defects in the kynurenine pathway "confirming that NAD de novo synthesis is required during embryogenesis ...". This requirement is revealed by the imposition of maternal dietary deficiency and more detail (or a more clear link to the following sentences) here would help the reader who is not familiar with the previous papers using the HAAO mice and dietary modulation.

      We have updated this paragraph in the Introduction to better indicate that the requirement of NAD de novo synthesis for embryogenesis was confirmed in mouse models by modulating the maternal dietary NAD precursor provision during pregnancy.

      (2) Discussion - throughout the introduction and results the authors refer to the NAD de novo synthesis pathway, with the study focussing on the effects of HAAO loss of function. Data implies that the kynurenine pathway is active in the yolk sac but whether de novo synthesis from L-tryptophan occurs has not been addressed. The first sub-heading of the discussion could be more accurate referring to the kynurenine pathway, or synthesis from kynurenine. 

      We agree that our manuscript needed to make a better distinction between NAD de novo synthesis starting from kynurenine and starting from tryptophan. We removed “from L-tryptophan” from the sub-heading in the Discussion and clarified in this paragraph which genes are required to convert tryptophan to kynurenine and which genes to convert kynurenine to NAD. We also updated two Results paragraphs (page 2, 2nd paragraph; page 5, 5th paragraph) to improve clarity.

      It is worth noting that our statement in the Discussion “this is the first demonstration of NAD de novo synthesis occurring in a tissue outside of the liver and kidney.” is valid because vascular smooth muscle cells express Tdo2 and in combination with the other requisite genes expressed in endoderm cells, the yolk sac has the capability to synthesise NAD de novo from L-tryptophan.

      (3) Outlook - While this section is designed to be looking ahead to the potential implications of the work, the last section on gene therapy of the yolk sac seems far removed from the paper content and highly speculative. I feel this could detract from the main points of the study and could be removed. 

      We have updated the Outlook paragraph and shortened the final part to “Further research is required to better understand the mechanisms of CNDD causation and of other causes of adverse pregnancy outcomes involving the yolk sac.”

      (4) In Figure 2D it would be useful to label the clusters as the colours in the legend are difficult to match to the heatmap. 

      We now have labelled the clusters with lowercase letters above the heatmap to make it easier to match the clusters in Figure 2D to the colours used for designating tissues and genotypes. These labels are described in the figure’s key and the figure legend.  

      Reviewer #2 (Recommendations For The Authors): 

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency. 

      We now provide more information about the malformation types in the Results on page 4. Also, Table S4 now defines the missing vertebral, sternum, and kidney descriptors.

      Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean is the molecular or cellular effect of NAD deficiency is the same in the vertebrae and organs such as the kidney. What unifies the effects on these specific tissues and organs and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway? 

      This is a good comment, highlighting that further research, beyond the scope of this manuscript, is needed to better understand the underlying mechanisms of CNDD causation. We have expanded the Discussion paragraph “NAD deficiency in early organogenesis is sufficient to cause CNDD” to indicate that while the timing of NAD deficiency during embryogenesis explains variability in phenotypes among the CNDD spectrum, it is unknown why other organs/tissues are seemingly not affected by NAD deficiency.

      To answer the reviewer’s questions and elucidate the underlying cellular and molecular processes in individual organs affected by NAD deficiency, a multiomic approach is required. This is because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis, the reviewer may refer to our recent review article (Dunwoodie et al 2023 https://doi.org/10.1089/ars.2023.0349). 

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis. 

      We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the main site of NAD de novo synthesis) but these produced non-specific bands in western blotting experiments. Therefore, immunostaining studies on embryonic tissues were not feasible. 

      However, we agree that histological methods such as in situ hybridisation would provide secondary validation of the exact cell types that express these genes. To acknowledge this, we have updated a sentence on page 5 referring to the data shown in Figure 6C as follows: “While histological methods such as in situ hybridisation would be required to confirm the exact cell types expressing these genes, the available expression data indicates that the genes encoding those enzymes required to convert L-kynurenine to NAD (kynurenine pathway) are exclusively expressed in the yolk sac endoderm lineage from the onset of organogenesis (E8.0-8.5).”

    1. eLife Assessment

      In this valuable study, Tutak and colleagues set out to identify factors that mediate Repeat Associated Non-AUG (RAN) translation of CGG repeats in the FMR1 mRNA which are implicated in toxic protein accumulation that underpins ensuing neurological pathologies. The authors provide solid evidence that RPS26 may be implicated in mediating the RAN translation of FMR1 mRNA. This article should be of broad interest to researchers in the variety of disciplines including post-transcriptional regulation of gene expression and neurobiology.

    2. Reviewer #2 (Public review):

      Summary:

      Translation of CGG repeats leads to accumulation of polyG, which is associated with neurological disorders. This is an important paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in HeLa cells were enriched on CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. Experiments were performed in several cell lines and with several reporters with differences in repeats and transfection methods to increase confidence that changes were occurring. New data and details of the methods increase confidence that reporter translation but not global translation is diminished by RPS26 knockdown as concluded. The manuscript has been improved by data showing that new proteins are being synthesized in cells following RPS26 knockdown, and that near-cognate start codon usage is diminished in lines when RPS26 is knocked down, but the mechanism by which RPS26 depletion affects translation is still unclear.

      Strengths:

      - The authors have proteomics data that show enrichment of a set of proteins on FMR1-polyG RNA but not a related RNA.
      - Knockdown of RPS26, which was enriched on the FMR1 RNA, led to decreases in cell growth, but surprisingly did not strongly affect global translation, as assessed by puromycin incorporation.
      - There is some new evidence that near-cognate start codon selection is affected by RPS26 knockdown.

      Weaknesses:

      - The mechanism by which RPS26 knockdown affects translation of the polyG sequences is unclear; it is not known whether knockdown affects ribosome levels, extra-ribosomal RPS26, or ribosome composition.

    3. Reviewer #3 (Public review):

      Tutak et al provide intriguing findings demonstrating that insufficiency of RPS26 and related proteins, such as TSR2 and RPS25, downregulates RAN translation from CGG repeat RNA in fragile X-associated conditions. Using RNA-tagging system and mass spectrometry-based screening, the authors identified RPS26 as a potential regulator of RAN translation. They further confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models. Quantitative mass spectrometry analysis revealed that the expression of some ribosomal proteins is sensitive to RPS26 depletion, while approximately 80% of proteins, including FMRP, were not influenced. Given the limited understanding of the roles of ribosomal proteins in RAN translation regulation, this study provides novel insights into this research field. However, certain data do not fully support the authors' critical conclusions.

      (1) While the authors substituted the ACG near-cognate initiation codon with other near-cognate codons, such as GTG and CTG, in the luciferase assay (Figure 4F), substitution of the ACG codon with an ATG codon should also be performed. Although they evaluated RPS26 knockdown effect on AUG-dependent FMRP translation in Figure 3C, investigating its effect on AUG-dependent repeat-associated translation (e.g., AUG-CGG-repeat) is necessary to substantiate their claim that ACG codon selection is important for RAN translation downregulation by RPS26 knockdown.

      (2) The results of the ASO-based ACG codon-blocking experiment in Figure 4G are difficult to interpret. While RPS26 knockdown reduces FMRpolyG expression, the effect appears attenuated by the ASO-ACG treatment compared to the control. However, for several reasons, this does not conclusively demonstrate that the regulatory effect is directly due to ACG codon selection during translation initiation. For example, ASO-ACG treatment possibly interferes with ribosomal scanning rather than ACG-codon selection, or alters the expression of template CGG repeat RNA. To validate the effect of RPS26 knockdown on ACG codon selection, experiments using the ACG-to-ATG substituted CGG repeat reporter are recommended, as suggested in comment 1.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have been investigated as effects on the expression levels of FMRpolyG proteins upon knockdown of these molecules in disease model cells expressing CGG repeat sequences (Figures 1C, 1D, 3B, 3C, 3E, 4F, 4G, 5A, 5C, 6A, 6D). However, FMRpolyG expression levels can be influenced by factors other than RAN translation in these cellular experiments, such as template RNA level, template RNA localization, and FMRpolyG protein degradation. Although the authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the direct effect of these regulators on RAN translation by other experiments. In vitro translation assay that can directly evaluate RAN translation is preferable, but experiments using the ACG-to-ATG substituted CGG repeat reporter, as suggested in comment 1, would also provide valuable insights.

    4. Author response:

      The following is the authors’ response to the current reviews.

      We thank Reviewers for highlighting the strengths of our work along with suggestions for future directions.

      We agree with the Reviewers that RPS26 depletion may impact not only RAN translation initiation and codon selection (as shown in the experiments in Figure 4G), but also other mechanisms, such as the speed of PIC scanning, as we stated in the discussion. However, we did provide data showing that the mRNA level of exogenous FMR1-GFP does not change upon RPS26 depletion (Figure 3B&C); hence, the observed effect most likely stems from translation regulation. In addition, an experiment with ASO-ACG treatment (Figure 4G) suggests that near-cognate start codon selection or the speed of PIC scanning may be part of the regulation of RAN translation sensitive to RPS26 depletion. Furthermore, our latest unpublished results (Niewiadomska D. et al., in revision) indicate that FMRpolyG in fusion with GFP is fairly stable, in particular when derived from long repeats (>90xCGG), suggesting that protein stability is not at play in RPS26-dependent regulation.

      We would like to stress that, in order to avoid bias in result interpretation and to mimic the natural situation, the majority of experiments concerning levels of FMRpolyG were performed in cell models with stable expression of ACG-initiated FMRpolyG. Currently, we do not possess a cell model with stable expression of AUG-initiated FMRpolyG, and experiments based on a transient transfection system would not necessarily be comparable to the results obtained in a stable expression system. However, we believe that the experiment presented in Figure 2B serves as a good control for the overall translation level upon RPS26 depletion, indicating that RPS26 insufficiency does not affect global translation and that the observed regulation is specific to some mRNAs, including the one encoding the FMRpolyG frame. We also show that the level of ca. 80% of identified canonical proteins, including FMRP, did not change upon RPS26 silencing (SILAC-MS, Figure 4A). Indeed, we did not explore the ribosome composition upon RPS26 and TSR2 depletion, although most likely the pool of functional ribosomes in the cell is sufficient to support the basal translation level (SUnSET assays, Figure 2B & 5C). However, we cannot exclude the possibility that for some mRNAs, including the one encoding FMRpolyG, the observed effect can be partially caused by a lower number of fully active ribosomes, especially in transient transfection experiments, where transgene expression is hundreds of times higher than for an average native mRNA.

      Finally, we agree with the Reviewer that an in vitro translation assay would provide evidence of a direct effect of RPS26 on the FMRpolyG level; however, we did not manage to overcome technical difficulties in obtaining cellular lysate devoid of RPS26 from vendor companies.


      The following is the authors’ response to the original reviews.

      General Comments

      We thank the Reviewers for the critical comments and experimental suggestions. We followed most of the advice in the revised version of the manuscript, which allowed for a more balanced interpretation of the results presented and further supported the major statement of the manuscript, that insufficiency of RPS26 and RPS25 plays a role in modulating the efficiency of noncanonical RAN translation from FMR1 mRNA, which results in the production of toxic polyglycine protein (FMRpolyG). Firstly, by performing new experiments, we showed that silencing of RPS26 and its chaperone protein TSR2, which regulates loading/exchange of RPS26 in the maturing small ribosomal subunit, did not elicit global translation inhibition. Secondly, we demonstrated that, in contrast to RPS26 and RPS25 depletion, silencing of the RPS6 protein, a core component of the 40S subunit, did not affect FMRpolyG production, further supporting the specific effect of RPS26 and RPS25 on RAN translation regulation of mutant FMR1 mRNA. We also observed that depletion of RPS26, RPS25 and RPS6 had a significant negative effect on cell proliferation, which is in line with previously published results indicating that insufficiencies of ribosomal proteins negatively affect cell growth. Moreover, we showed that FMRpolyG production is significantly affected by RPS26 depletion when initiated at ACG, but not at other near-cognate start codons. Importantly, translation of FMRP initiated at the canonical AUG codon of the same mRNA, upstream of the CGGexp, was not affected by RPS26 silencing, similarly to the vast majority of the human proteome. This implies that the effect of RPS26 insufficiency on RAN translation of FMR1 mRNA is likely to depend on start codon selection/fidelity. In essence, we provide several lines of evidence indicating that the cellular amount of the 40S ribosomal proteins RPS26 and RPS25 is an important factor in CGG-related RAN translation regulation. Finally, we also decided to tone down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation, rather than that the composition of the 40S ribosomal subunit per se influences RAN translation. We have addressed all specific concerns below and made changes to the new version of the manuscript.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process. We believe your guidance has been instrumental in significantly enhancing the quality of our research. Below, we have addressed your comments point-by-point.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances. And no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from block in protein synthesis, and cell division coupled to differential stability of the proteins.

      We agree that the data presented in the first version of the manuscript did not directly address the following processes: ribosome content, global translation rate and cell viability upon RPS26 depletion. Therefore, we addressed some of these issues in the revised version of the manuscript. In particular, we showed that RPS26 and TSR2 knockdown did not inhibit global translation (new Figure 2B & 4C); hence, we concluded that the changes in FMRpolyG level did not arise from a general translational shutdown. On the other hand, RPS26, RPS25 and RPS6 depletion negatively affected cell proliferation (new Figure 2A, 5D, 6C), which is in line with a number of previously published studies (e.g. Cheng et al, 2019; Havkin-Solomon et al, 2023). However, the extent of the proliferation abnormalities is limited. We agree that the observed effects on RAN translation from mutant FMR1 mRNA may stem from a combination of altered protein synthesis and the condition of the cells, but also from cis-acting factors of mRNA sequence/structure. In new experiments, we showed that single-nucleotide substitution of ACG with other near-cognate start codons changes the sensitivity of RAN translation to insufficiency of RPS26 (new Figure 4F). Also, the inhibitory effect of an antisense oligonucleotide binding to the region of the 5’UTR containing the ACG initiation codon (ASO_ACG) is different in cells differing in the amount of RPS26 (new Figure 4G).

      We also agree that our data only partially support the role of RPS26-deficient ribosomes in RAN translation. Therefore, we have toned down our claims. Now, we state that RPS26/25/TSR2 insufficiency or depletion affects RAN translation. We also changed the title of the manuscript to: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25, negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions” (previously it was: “Ribosomal composition affects the noncanonical translation and toxicity of polyglycine-containing proteins in fragile X-associated conditions”).

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      I am not convinced that the mass spec data is reliable.

      We thank the Reviewer for the comment concerning the MS data; however, we believe that it may stem from a misunderstanding of the data presented in Tables S3 and S6. Both tables represent the output from MaxQuant analysis (so-called ProteinGroup) of MS .raw files, without any filtering. As stated in the Materials & Methods, we applied the default parameters suggested by the MaxQuant developers to analyze the MS data. These include identification of proteins based on at least 1 unique peptide, and thus some of the proteins with only 1 unique peptide are shown in Tables S1 and S3. The Reviewer is also right that common contaminants, such as keratins, are included in this output table. However, these identifications are denoted as “CON_”, and are filtered out during statistical analysis in Perseus software: we first removed irrelevant protein group identifications, such as contaminants or those only identified by site modifications.
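
      As an illustration only, and not the Perseus workflow actually used, the filtering step described above can be sketched in Python. The sketch assumes the column names of a typical MaxQuant proteinGroups.txt export ("Potential contaminant", "Reverse", "Only identified by site", "Unique peptides", "Protein IDs") and the "CON__" identifier prefix; the function names and the example table are hypothetical and contain no data from the study.

```python
import csv
import io

# Column names assumed from a typical MaxQuant proteinGroups.txt export.
FLAG_COLUMNS = ("Potential contaminant", "Reverse", "Only identified by site")


def read_protein_groups(handle):
    """Read a tab-separated proteinGroups table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def filter_protein_groups(rows, min_unique_peptides=1):
    """Drop flagged protein groups and those below the unique-peptide threshold."""
    kept = []
    for row in rows:
        # Flagged rows are assumed to carry "+" in the corresponding column.
        if any((row.get(col) or "").strip() == "+" for col in FLAG_COLUMNS):
            continue
        # Contaminant entries are also assumed to carry a "CON__" prefix.
        if (row.get("Protein IDs") or "").startswith("CON__"):
            continue
        if int(row.get("Unique peptides") or 0) < min_unique_peptides:
            continue
        kept.append(row)
    return kept


# Tiny made-up table with hypothetical identifiers, for illustration only.
EXAMPLE = (
    "Protein IDs\tUnique peptides\tPotential contaminant\tReverse\tOnly identified by site\n"
    "PROT_A\t4\t\t\t\n"
    "CON__KERATIN\t12\t+\t\t\n"
    "PROT_B\t1\t\t\t\n"
    "REV__PROT_C\t2\t\t+\t\n"
)

rows = read_protein_groups(io.StringIO(EXAMPLE))
default_ids = [r["Protein IDs"] for r in filter_protein_groups(rows)]
strict_ids = [r["Protein IDs"] for r in filter_protein_groups(rows, 2)]
print(default_ids)  # groups kept with the default threshold of 1 unique peptide
print(strict_ids)   # groups kept when at least 2 unique peptides are required
```

      Raising `min_unique_peptides` to 2 in such a sketch shows how single-peptide identifications would drop out under a stricter criterion than the default described above.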

      We have changed the names of the Supplementary Table files, giving more detailed descriptions. We hope this will help to avoid misunderstanding among a broader readership. Secondly, when comparing the data presented in Table S3 with the volcano plot presented in Figure 1B, one can notice that the majority of identified proteins are indeed not statistically significant (grey points), and thus were not selected for further stratification. The lack of significance of these proteins may be partially due to poor MS identification; however, they are not included in the following parts of the manuscript. Further, we selected only eight proteins (out of over 150) for stratification by orthogonal techniques; thus, we argue that this step validates the biological relevance of the chosen candidate RAN-translation modifiers. One should also keep in mind that pull-down samples analyzed by MS often yield lower intensity and identification rates compared to whole-cell analysis, as a result of lower protein input or the stringent washes used during sample preparation.

      Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2,000 proteins were identified in a 125-min gradient, over 80% of which were identified with at least 2 unique peptides. Each of three biological replicates was analyzed three times (technical replicates), giving a total of 9 high-resolution MS runs. Together, we strongly believe that these data are of high confidence.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein that co-precipitated with FMR1 containing expanded CGG repeats (Supplementary Figure 1G) and found that depletion of RPS26 hindered RAN translation of FMRpolyG, suggesting that RPS26 positively affects RAN translation. However, we did not state that RPS26 directly interacts with toxic RNA. In order to confirm the specificity of RAN translation regulation by RPS26 insufficiency, we tested whether depletion of another 40S ribosomal protein, RPS6, affects FMRpolyG synthesis. Our experiments showed no significant effect on RAN translation efficiency after RPS6 silencing (new Figure 5C). Importantly, we showed that RPS26 depletion did not inhibit global translation (new Figure 2B). In addition, mutagenesis of the near-cognate start codon (new Figure 4F) and ASO_ACG treatment (new Figure 4G) provided evidence that modulation of FMRpolyG biosynthesis by the RPS26 level may depend on start codon selection. In essence, our data suggest that RPS26 depletion specifically affects synthesis of FMRpolyG, but not of FMRP derived from the same FMR1 mRNA with CGGexp. However, we do not claim that the observed effect is the consequence of a direct interaction between RPS26 and the 5’UTR of FMR1 mRNA. Downregulation of FMRpolyG biosynthesis could be an outcome of altered ribosomal assembly, decreased efficiency and fidelity of PIC scanning/initiation, impeded elongation, or a combination of all these processes. In the manuscript, we presented the results of experiments which tested many of these possibilities.

      (3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.

      We agree with the Reviewer that RPS26 and DHX15 are essential proteins, similarly to all RNA binding proteins, and caution should be taken during experimental design. To address this, we titrated different concentrations of siRPS26, and found that administration of 5 nM siRPS26, which just partially silenced RPS26, decreased FMRpolyG by around 50% (new Figure 1D). This impact was even greater with 15 nM siRPS26, as we observed around 80% decrease of FMRpolyG.

      Havkin-Solomon et al. (2023) showed that the proliferation rate is decreased in cells with a mutated C-terminus of RPS26, which is required for contacting mRNA. In accordance with this study, we showed that cells with knocked-down RPS26 proliferate less efficiently (new Figure 2A), but depletion of RPS26 did not impact global translation (new Figure 2B). In addition, our SILAC-MS data indicate that ~80% of proteins with a determined expression level were not affected by RPS26 insufficiency, and ~20% of the proteins turned out to be sensitive to the RPS26 decrease. However, these data do not take protein stability into account.

      (4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.

      The current version of the manuscript contains representative western blots with validation of knock-down efficiency (for example in Figure 3B, C, E, Figure 6A) and we included knock-down validations where applicable (Figures 1D, 2B, 4G and 5C).

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      Mock control corresponds to the cells treated with lipofectamine reagent and was included in the study to determine the “background” signal from cells treated with delivery agent and reagents used to measure the apoptosis process. These cells were neither expressing FMRpolyG, nor siRNAs. Luminescence signals were normalized to the values obtained from mock control. We added more details describing this assay in the Figure 1 legend.
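      The normalization described above amounts to dividing each luminescence reading by the mean mock reading, so that values are expressed as fold change over background. A minimal Python sketch of that arithmetic follows; the function name `normalize_to_mock` and all readings are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def normalize_to_mock(signals, mock_signals):
    """Express each reading as a fold change over the mean mock (background) reading."""
    baseline = mean(mock_signals)
    if baseline == 0:
        raise ValueError("mock baseline must be non-zero")
    return [s / baseline for s in signals]


# Hypothetical raw luminescence readings (arbitrary units), NOT data from the study.
mock = [100.0, 110.0, 90.0]      # delivery reagent only: no FMRpolyG, no siRNA
control_sirna = [300.0, 330.0]   # FMRpolyG + control siRNA
sirps26 = [150.0, 180.0]         # FMRpolyG + siRPS26

print("control siRNA:", normalize_to_mock(control_sirna, mock))
print("siRPS26:", normalize_to_mock(sirps26, mock))
```

      Because both siRNA conditions are divided by the same mock baseline, a single mock control is sufficient for the two treated groups in this kind of design.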

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effects may stem from reduced ribosome content; however, we argue that this is not the only possible explanation. Previously, it was shown that RPS25 regulates G4C2-related RAN translation, but knockout of RPS25 does not affect global translation (Yamada S, 2019, Nat. Neuroscience). Similarly, we showed that KD of RPS26 or TSR2 did not significantly reduce the global translation rate (SUnSET assay; new Figure 2B and 5C, respectively).

      Moreover, in the new version of the manuscript we included a control experiment in which we silenced a core ribosomal protein (RPS6) and found that RPS6 depletion did not affect RAN translation from mutant FMR1 mRNA (new Figure 5C), thus strengthening our conclusion that RAN translation is specifically regulated by the level of RPS26 and RPS25.

      Finally, our observation aligns well with current knowledge about how deficiency of different ribosomal proteins alters translation of some classes of mRNAs (Luan Y, 2022, Nucleic Acids Res; Cheng Z, 2019, Mol Cell). It was shown that depletion of RPS26 affects translation rate of different mRNAs compared to depletion of other proteins of small ribosomal subunit.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      Supplementary Figure 3D shows that the mutation at the -4 position (G to A) did not affect RAN translation regardless of RPS26 presence or depletion. However, this result does not imply that RPS26 has no effect on start codon selection in a sequence- or RNA structure-dependent context. We tested this particular -4 position because it was previously suggested to be an important RPS26-sensitive site in yeast (Ferretti M, 2017, Nat Struct Mol Biol). We agree with the Reviewer that none of the 5’UTR logos presented in our paper showed statistical significance at any tested position for human mRNAs. In contrast, we observed that regulation sensitive to RPS26 level depends on the start codon used for RAN translation, in particular ACG initiation (new Figure 4F&G). RPS26 depletion affected ACG-initiated but not GTG- or CTG-initiated RAN translation.

      In the previous version of the manuscript, we wrote that we did not identify any specific motifs or enrichment within the analyzed transcripts in comparison to the background. On the other hand, we found that the GC-content among the analyzed transcripts is higher within 5’UTRs and in close proximity to the ATG in coding sequences (Figure 4D), which suggests the importance of stable RNA structures in this region. In addition, we showed that mRNAs encoding proteins responding to RPS26 depletion have shorter-than-average 5’UTRs (new Figure 4E).

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      Indeed, collisions as well as other mechanisms such as skewed start codon fidelity may have an effect on the efficiency of FMRpolyG biosynthesis. In the current version of the manuscript, we show that RPS26 amount-sensitive regulation seems to be start codon selection-dependent (new Figure 4F&G).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in Hela cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      We thank the Reviewer for critical comments and suggestions. We sincerely appreciate your rigorous, insightful, and constructive feedback throughout the revision process.

      Below each specific point, we addressed the mentioned issues.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      We thank the Reviewer for appreciating the MS-screening results, which identified proteins enriched on FMR1 RNA with expanded CGG repeats.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      In the previous version of the manuscript we did not state that RPS26 binds directly to RNA with expanded CGG repeats, and we did not show any experiment indicating a direct interaction between the studied RNA and RPS26. What we showed is that RPS26 was enriched in FMR1 RNA MS samples; however, we did not verify whether this interaction is direct or indirect. We also tested the hypothesis that a lack of RPS26 in the PIC may affect the efficiency of RAN translation initiation via a specific Kozak context previously described in yeast (Ferretti M, 2017, Nat Struct Mol Biol). As we described, this hypothesis was not supported. However, we showed that other features of 5’UTR sequences (e.g. higher GC-content or shorter leader sequence) are potentially important for translation efficiency in cells with depleted RPS26.

      Indeed, RPS26 is involved in 40S maturation steps (Plassart L, 2021, eLife), and its insufficiency, its mutation, or blocking of its inclusion into the 40S ribosome may result in incomplete 40S maturation, which subsequently might negatively affect translation per se. However, we did not observe global translation inhibition after depletion of RPS26 or of TSR2, the chaperone involved in incorporation/exchange of RPS26 into the small ribosomal subunit (new Figure 2B and 5C). In addition, our SILAC-MS data indicate that the majority of the studied proteins (including FMRP, the main product of the FMR1 gene) were not affected by RPS26 depletion, which can be carefully extrapolated to global translation. In the revised manuscript we also showed that relatively weak silencing of RPS26 also decreased FMRpolyG production in model cells (new Figure 1D).

      We agree that reduced ribosome levels can result in different translation efficiencies for different RNA pools. We have emphasized this point in the revised manuscript. However, we also showed that placing different near-cognate start codons specific to RAN translation (single- or two-nucleotide substitutions) in the same mRNA, or targeting this codon with antisense oligonucleotides, altered the sensitivity of FMR1 mRNA translation to RPS26 depletion (new Figure 4F).

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank the Reviewer for this comment. In the new version of the manuscript, we have added new microscopic images and improved the explanation of Figure 1E. We have also completed the interpretation of Figure 1F in the main text, the figure image and the figure legend, and we hope that these changes will improve understanding of our data.

      Recommendations For The Authors:

      - A significant claim is that RPS26 KD alleviates the effects of FMR polyG expression, but those data aren't presented well:

      Figure 1D (supporting data in S2) and 2D - the authors need to show representative images of a control that has aggregation and indicate aggregates being counted on an image. The legend states that there are no aggregates, but the quantification of aggregates/nucleus is ~1, suggesting there are at least 1 per cell. It is preferred to show at least a representative of what is quantified in the main figure instead of a bar graph.

      The representative images of control and siRPS26-treated cells are now shown in revised version of Figure 1E. Additionally, we completed the Figure legend concerning this part, as well as extended description of the experiment in Materials&Methods section.

      Figure 1E - it is unclear what luminescence signal is being measured. Is this a dye for an apoptotic marker? More information is needed in the legend.

      This information was added to the legend of modified Figure 1F (previously 1E) as suggested.

      - Some of the Western blots are not very convincing. Better evidence for the changes in bar graphs would improve how convincing the data are:

      Fig 2B. The western for FMR95G in the first model is not very convincing. The difference by eye for the second siRNA seems to give a larger effect than the first for 95G construct but they appear almost the same on the graph. More supporting information for the quantification is needed.

      We provided a better explanation of WB quantification in the M&M section of the manuscript. Also, we provided an additional blot demonstrating an independent biological replicate of the mentioned experiment in the supplementary materials (Supplementary Figure S2E).

      Figure 4A, the blots for RPS26 and FMR95G are not convincing. They are quite smeary compared to all of the others shown for these proteins in other figures. Could a different replicate be shown?

      We provided additional blot demonstrating the effect on transiently expressed FMRpolyG affected by depletion of TSR2 in COS7 cell line (Supplementary Figure S4A).

      Figure 5A and 5B blots are not ideal. Could a different replicate be shown? Or show multiple replicates in the supplemental figure?

      We provided additional blots from the same experiment, although data is not statistically significant, most likely due to low quality of normalization factor, which is Vinculin (Supplementary Figure S5A). Nevertheless, the level of FMRpolyG is decreased by ~70% after RPS25 silencing in SH-SY5Y cells.

      Figure 2C. Please use the same y axes for all four Westerns in B and C. One would like to compare 95 and 15 repeats, but it is difficult when the y axes are different.

      Thank you for this comment. The y axis was adjusted as suggested by the Reviewer.

      Figure 3D-The text suggests a significant difference between positive and negative responders that is not clear in the figure.

      In the main body of the manuscript we state that: “We did not observe any significant differences in the frequency of individual nucleotide positions in the 20-nucleotide vicinity of the start codon relative to the expected distribution in the BG”, which is in line with the graph shown in Figure 4D (previously 3D).

      Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by an RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B); however, the change in the composition of the ribosomes involved in actual translation has not been elucidated. Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation", is not fully supported by the presented data and is misleading.

      We thank the Reviewer for critical comments and suggestions. We agree that the initial title and some statements in the text were misleading and that the presented data did not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Therefore, in the revised version of the manuscript we included a control experiment indicating that depletion of another core 40S ribosomal protein (RPS6) did not impact FMRpolyG synthesis (new Figure 5C), which supports our hypothesis that RPS26 and RPS25 are specific CGG-related RAN translation modifiers. To convey the main message of our work precisely, we changed the title to indicate the specific effect of RPS26 and RPS25 insufficiency on RAN translation of FMRpolyG. Proposed title: “Insufficiency of 40S ribosomal proteins, RPS26 and RPS25 negatively affects biosynthesis of polyglycine-containing proteins in fragile-X associated conditions”. We also changed all statements regarding “ribosomal composition” in the main text of the new version of the manuscript.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We agree that the data presented in the manuscript imply that insufficiency of RPS26 plays a pivotal role in the regulation of CGG-related RAN translation, and in the revised version of the manuscript we included a series of experiments indicating that ACG codon selection seems to be an important part of RPS26 level-dependent regulation of polyglycine production (new Figure 4F&G; see point 3 below for more details). Importantly, in the luciferase assay shown in Figure 4F we used the AUG-initiated firefly luciferase reporter as a normalization control.

      Moreover, to verify if FMRpolyG response to RPS26 deficiency depends on the type of reporter used, we repeated many experiments using FMRpolyG fused with different tags. The luciferase-based assays were in line with experiments conducted on constructs with GFP tag (new Figure 1D), thus strengthening our previous data. Moreover, in the series of experiments, we show that FMRP synthesis which is initiated from ATG codon located in FMR1 exon 1, was not affected by RPS26 depletion (Figure 3E & 4C), even though its translation occurs on the same mRNA as FMRpolyG. This indicates a specific RPS26 regulation of polyglycine frame initiated from ACG near cognate codon.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that multiple factors, including the aforementioned processes, affect the final levels of FMRpolyG-GFP proteins. We evaluated the level of FMR1 mRNA, which was not decreased upon RPS26 depletion (Figure 3B&C); we therefore assumed that the observed effect reflects regulation at the level of translation, especially given that RPS26 is a ribosomal protein contacting mRNA in the E-site. We believe that direct assays such as in vitro translation may be beneficial; however, depleting RPS26 from a vendor-supplied cellular lysate seems technically challenging, if not impossible. Instead, we focused on sequence/structure-specific regulation of RAN translation, with emphasis on start codon selection. This yielded results pointing to a role of RPS26 in start codon fidelity (Figure 4F&G). These new results showed that translation from mRNAs differing by only one or two nucleotide substitutions in the near-cognate start codon (ACG to GUG or ACG to CUG), although yielding exactly the same protein, differs in its sensitivity to RPS26 silencing (new Figure 4F). Similar differences were observed in translation efficiency from the same mRNA with or without targeting by an antisense oligonucleotide complementary to the region of the RAN translation initiation codon (new Figure 4G). These results also suggest that the stability of FMRpolyG is not affected in cells with a decreased level of RPS26.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G, 2019, Front Genet), additional evaluations for cellular viability would strengthen this conclusion.

      We thank the Reviewer for this suggestion. We addressed the apoptotic process in order to determine the effect of RPS26 depletion on RAN translation-related toxicity (Figure 1F). In the revised version of the manuscript, we also added an evaluation of how cell proliferation was affected by RPS26, RPS25, RPS6 and TSR2 depletion. Our data indicate that TSR2 silencing slightly impacted cellular fitness (new Figure 5D), whereas insufficiencies of RPS26, RPS25 and RPS6 had a much stronger negative effect on proliferation (new Figure 2A, 5D, 6C), which is in line with previous data (Cheng Z, 2019, Mol Cell; Luan Y, 2022, Nucleic Acids Res). The difference in proliferation rate after treatment with siRPS26 makes proper interpretation of cellular viability assessment very difficult.

      Recommendations For The Authors:

      (1) It would be nice to validate the effects of overexpression of RPS26 and other regulators on RAN translation, not limited to knockdown experiments, to support the conclusion.

      We did not perform such experiments because we believed that RPS26 overexpression may have no or only a marginal effect on translation or RAN translation. It is likely impossible to efficiently incorporate overexpressed RPS26 into 40S subunits, because the concentration of all ribosomal proteins in the cells is very high.

      (2) It would be better to explain how authors selected 8 proteins for siRNA-based validation (Figure 1C, 1D, S1D) from 32 proteins enriched in CGG repeat RNA in the first screening.

      We selected those candidates based on their functions connected to translation, structured RNA unwinding or mRNA processing. For example, we tested a few RNA helicases because of their known function in RAN translation regulation described by other researchers. This explanation was added to the revised version of the manuscript.

      (3) Original image data showing nuclear FMRpolyG-GFP aggregates should be presented in Figure 1D.

      The representative images of control and siRPS26-treated cells are now shown in modified version of Figure 1E and described with more details in the legend.

      (4) Image data in Figure 2A and 2D have poor signal/noise ratio and the resolution should be improved. In addition, aggregates should be clearly indicated in Figure 2D in an appropriate manner.

      The stable S-FMR95xG cellular model is characterized by very low expression of RAN-translated FMR95xG; therefore, it is challenging to obtain microscopic images of better quality with higher GFP signal. In the L-99xCGG model, expression of the transgene is higher. Therefore, we provided a new image in the new version of Figure 3D (former 2D). Moreover, we showed aggregates on the image obtained using confocal microscopy (new Supplementary Figure 2D).

      (5) The detailed information on patient-derived fibroblast (age and sex of the patient, the number of CGG repeats, etc.) in Figure 2F needed to be presented.

      This information was added to the figure legend (Figure 3F; previously 2F) and in the Material and Methods section as suggested.

      (6) It would be better to normalize RNA expression levels of FMR1 and FMR1-GFP by the housekeeping gene in Figure S2C, like other RT-qPCR experimental data such as Figure 2B.

      Normalization of FMR1-GFP to GAPDH is now shown in modified version of Figure S2C (right graph) as requested by the Reviewer.

      (7) It would be better to add information on molecular weight on all Western blotting data.

      Marks corresponding to the molecular weight ladder were added to all images.

      Full blots, including protein ladders were deposited in Zenodo repository, under doi: 10.5281/zenodo.13860370

      References

      Cheng Z, Mugler CF, Keskin A, Hodapp S, Chan LYL, Weis K, Mertins P, Regev A, Jovanovic M & Brar GA (2019) Small and Large Ribosomal Subunit Deficiencies Lead to Distinct Gene Expression Signatures that Reflect Cellular Growth Rate. Mol Cell 73: 36-47.e10

      Havkin-Solomon T, Fraticelli D, Bahat A, Hayat D, Reuven N, Shaul Y & Dikstein R (2023) Translation regulation of specific mRNAs by RPS26 C-terminal RNA-binding tail integrates energy metabolism and AMPK-mTOR signaling. Nucleic Acids Res 51: 4415–4428

      Hoem G, Larsen KB, Øvervatn A, Brech A, Lamark T, Sjøttem E & Johansen T (2019) The FMRpolyGlycine protein mediates aggregate formation and toxicity independent of the CGG mRNA hairpin in a cellular model for FXTAS. Front Genet 10: 1–18

      Luan Y, Tang N, Yang J, Liu S, Cheng C, Wang Y, Chen C, Guo YN, Wang H, Zhao W, et al (2022) Deficiency of ribosomal proteins reshapes the transcriptional and translational landscape in human cells. Nucleic Acids Res 50: 6601–6617

      Plassart L, Shayan R, Montellese C, Rinaldi D, Larburu N, Pichereaux C, Froment C, Lebaron S, O’Donohue MF, Kutay U, et al (2021) The final step of 40S ribosomal subunit maturation is controlled by a dual key lock. eLife 10

    1. eLife Assessment

      This valuable study uses a massive and long-term experimental data set to provide solid evidence on how tree diversity affects host-parasitoid communities of insects in forests. The work will be of interest to ecologists working on biodiversity conservation, community ecology, and food webs.

    2. Reviewer #2 (Public review):

      Summary

      The authors use a tree biodiversity experiment to evaluate the effects of tree community and canopy cover on communities of cavity-nesting Hymenoptera and their parasitoids and the interactions between these two guilds. They find that multiple measures of tree diversity influence the hosts, parasitoids, and their interactions. In addition, host-parasitoid interactions show a phylogenetic signal.

      Strength

      The authors use a massive, long-term data set, meaningful community descriptors, and a solid set of analyses to explore the impacts of tree communities on host-parasitoid networks. It is rare to have such detailed data from multiple different trophic levels.

      Weakness

      Even though the data span several seasons, this is not considered in the analyses; instead, communities sampled in different years are pooled at the plot level. A more detailed analysis of the variation between years could reveal underlying patterns, as the differences in the communities and their structure between years are currently ignored (e.g., when estimating the phylogenetic compositions, not all the species pooled together actually coexist in time). Also, the precision of the writing should be improved, as it was not always easy to follow the text and the thoughts.

    3. Author response:

      The following is the authors’ response to the original reviews.

      It would be great if the authors could add clarification about the NMDS analyses and the associated results (Fig. 1, Table 1 and Tables S2-4). The overall aim of these analyses was to see how plot characteristics (e.g. canopy cover) and composition of one taxonomic group were related to the composition of another taxonomic group. The authors quantified species composition by two axes from NMDS. (1) This analysis may yield an interpretation problem: if we only find one of the axes, but not the other, was significantly related to one variable, it would be difficult to determine whether that specific variable is important to the species composition because the composition is co-determined by two axes. (2) It is unclear how the authors did the correlation analyses for Tables S2-4. If correlation coefficients were presented in these tables, then these coefficients should be the same or very similar if we switch the positions of y vs. x. That is, the correlation between host vs. parasite phylogenetic composition would be very close to the correlation between parasite vs. host phylogenetic composition, whereas the authors found that these two relationships were quite different, leading to the interpretation of bottom-up or top-down processes. It is also unclear which correlation coefficient was significant or not because only one P value was provided per row in these tables. (3) In addition to the issues of multiple axes (point 1), NMDS axes simply define the relative positions of the objects in multi-dimensional space, but not the actual dissimilarities. Other methods, such as generalized dissimilarity modeling, redundancy analysis and MANOVA, can be better alternatives.
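      Point (2) rests on a basic property: a correlation coefficient is symmetric in its two arguments, whereas a regression slope is not. The short Python sketch below illustrates this with hypothetical first-axis NMDS scores; the values and the names `host_axis` and `parasitoid_axis` are invented for illustration and are not data from the study, and the sketch makes no claim about which statistic the tables actually report.

```python
from math import sqrt


def _sums(x, y):
    """Return the centered cross-product and sums of squares of x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy, sxx, syy


def pearson(x, y):
    """Pearson correlation: unchanged when x and y are swapped."""
    sxy, sxx, syy = _sums(x, y)
    return sxy / sqrt(sxx * syy)


def ols_slope(x, y):
    """Least-squares slope of y regressed on x: changes when x and y are swapped."""
    sxy, sxx, _ = _sums(x, y)
    return sxy / sxx


# Hypothetical plot scores on one NMDS axis (illustration only).
host_axis = [1.0, 2.0, 3.0, 4.0, 5.0]
parasitoid_axis = [2.0, 1.0, 4.0, 3.0, 6.0]

print("r(host, parasitoid):", pearson(host_axis, parasitoid_axis))
print("r(parasitoid, host):", pearson(parasitoid_axis, host_axis))
print("slope parasitoid ~ host:", ols_slope(host_axis, parasitoid_axis))
print("slope host ~ parasitoid:", ols_slope(parasitoid_axis, host_axis))
```

      If the two directions in a table differ substantially, the reported values are therefore more likely to be model coefficients (for example, slopes from models with different covariates) than plain correlation coefficients.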

      Thank you for the thorough and constructive review. We have taken the concerns and questions raised by the editors and reviewers into account and provided clarification about the NMDS analyses as well as additional analyses to confirm our results. First, we have now added a brief explanation in the manuscript regarding the interpretation of the two NMDS axes and how they relate to species composition. Specifically, we clarified that while NMDS defines the relative positions of objects in multi-dimensional space, the two axes together provide a more comprehensive representation of the community composition, which is not solely determined by either axis independently. Second, we acknowledge that alternative approaches could help further strengthen our conclusions. To address this, we incorporated Mantel tests and PERMANOVA (with ‘adonis2’) as additional validation methods. These analyses allowed us to summarize compositional patterns while testing our hypotheses within the framework of the plot characteristics and taxonomic relationships. We have added these analyses and their results in the manuscript to reinforce our findings.

      In methods: L478-481 “To strengthen the robustness of our findings based on NMDS, we further validated the results using Mantel test and PERMANOVA (with ‘adonis2’) for correlation between communities and relationships between communities and environmental variables.”

      L469-475 “NMDS was used to summarize the variation in species composition across plots. The two axes extracted from the NMDS represent gradients in community composition, where each axis reflects a subset of the compositional variation. These axes should not be interpreted in isolation, as the overall species composition is co-determined by their combined variation. For clarity, results were interpreted based on the relationships of variables with the compositional gradients captured by both axes together.”

      In results: L172-177 “The PERMANOVA analysis also highlighted the important role of canopy cover for host and parasitoid community (Table S6-9). The Mantel test revealed a consistent pattern with the NMDS analysis, highlighting a pronounced relationship between the species composition of hosts and parasitoids (Table S10). However, the correlation between the phylogenetic composition of hosts and parasitoids was not significant.”

      In discussion: L257-261 “However, this significant pattern was observed only in the NMDS analysis and not in the Mantel test, suggesting that the non-random interactions between hosts and parasitoids could not be simply predicted by their community similarity and associations between the phylogenetic composition of hosts and parasitoids are more complex and require further investigation in the future.”
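For readers unfamiliar with the Mantel test mentioned in the response above, the following is a minimal sketch of a permutation-based Mantel test between two distance matrices. It is an illustration only and not the authors' analysis code; the function names (`upper_triangle`, `pearson`, `mantel`), the number of permutations, and the seed are assumptions introduced here.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): the matrix correlation and a one-sided permutation p-value.

    Rows and columns of dist_a are permuted together, which keeps the
    dependence structure of a distance matrix intact under the null.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)
```

In this sketch, `dist_a` and `dist_b` would be plot-by-plot dissimilarity matrices for hosts and parasitoids, respectively; a small p-value suggests that plots with similar host communities also tend to have similar parasitoid communities.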

      -- One additional minor point: "site" would be better set as a fixed rather than random term in the linear mixed-effects models, because the site number (2) is too small to make a proper estimate of random component.

      We now treat “site” as a fixed factor in our models, interacting with tree species richness/tree MPD and tree functional diversity, to reflect the variation in spatial and tree composition between the two sites. The main results did not change: both sites showed consistent patterns for the effects of tree richness/MPD on network metrics, although the effects were more pronounced at one site.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors analyzed how biotic and abiotic factors impact antagonistic host-parasitoid interaction systems in a large BEF experiment. They found the linkage between the tree community and host-parasitoid community from the perspective of the multi-dimensionality of biodiversity. Their results revealed that the structure of the tree community (habitat) and canopy cover influence host-parasitoid compositions and their interaction pattern. This interaction pattern is also determined by phylogenetic associations among species. This paper provides a nice framework for detecting the determinants of network topological structures.

      Strengths:

      This study was conducted using a five-year sampling in a well-designed BEF experiment. The effects of the multi-dimensional diversity of tree communities have been well explained in a forest ecosystem with an antagonistic host-parasitoid interaction.

      The network analysis has been well conducted. The combination of phylogenetic analysis and network analysis is uncommon among similar studies, especially for studies of trophic cascades. Still, this study has discussed the effect of phylogenetic features on interacting networks in depth.

      Weaknesses:

      (1) The authors should examine species and interaction completeness in this study to confirm that their sampling efforts are sufficient.

      (2) The authors only used Rao's Q to assess the functional diversity of tree communities. However, multiple metrics of functional diversity exist (e.g., functional evenness, functional dispersion, and functional divergence). It is better to check the results from other metrics and confirm whether these results further support the authors' results.

      (3) The authors did not elaborate on which extinction sequence was used in robustness analysis. The authors should consider interaction abundance in calculating robustness. In this case, the author may use another null model for binary networks to get random distributions.

      (4) The causal relationship between host and parasitoid communities is unclear. Normally, it is easy to understand that host community composition (low trophic level) could influence parasitoid community composition (high trophic level). I suggest using the 'correlation' between host and parasitoid communities unless there is strong evidence of causation.

      Thank you very much for your thoughtful and constructive review of our manuscript. We have carefully addressed your comments and made several revisions to improve the clarity and robustness of our work. 1) We appreciate your suggestion regarding species and interaction completeness. To confirm that our sampling efforts were sufficient, we have now included a figure (Fig. S1) showing the species accumulation curve and the coverage of interactions in our study. This ensures that the data collected provide a comprehensive representation of the system. 2) Regarding the use of only Rao’s Q to assess functional diversity, we acknowledge that multiple metrics of functional diversity exist. However, due to the large number of predictors in our analysis, we decided to streamline our approach and focus on Rao’s Q as it provides a robust measure for our research objectives. We have discussed this decision in the revised manuscript and clarified that, while additional metrics could be informative, we believe Rao’s Q sufficiently captures the key aspects of functional diversity in our study. 3) We have elaborated on the robustness analysis and the null model used in our study. Specifically, we have now clarified which extinction sequence (random extinction) was used in our manuscript, and explained that interaction abundance was incorporated into the robustness calculations (networklevel function, weighted=TRUE; see L506). 4) We have revised the text to clarify the relationship between host and parasitoid communities. As you correctly pointed out, while it is intuitive that host community composition influences parasitoid community composition, we have reframed our analysis to emphasize the correlation between the two communities rather than implying causation without strong evidence. We have revised the manuscript to reflect this distinction.
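To illustrate what robustness under a random extinction sequence means, here is a minimal sketch. It is a simplified, binary version: hosts are removed in random order, a parasitoid counts as secondarily extinct once all of its hosts are gone, and robustness is the area under the resulting curve (trapezoid rule). It does not reproduce the abundance weighting or the exact procedure the authors describe; the function names and the rows-as-hosts, columns-as-parasitoids layout are assumptions made for this example.

```python
import random


def surviving_fraction(web, removed_hosts):
    """Fraction of parasitoids that still have at least one remaining host.

    web[h][p] > 0 means host h is attacked by parasitoid p.
    """
    n_hosts, n_paras = len(web), len(web[0])
    alive = 0
    for p in range(n_paras):
        if any(web[h][p] > 0 for h in range(n_hosts)
               if h not in removed_hosts):
            alive += 1
    return alive / n_paras


def robustness(web, order):
    """Area under the secondary-extinction curve for one removal order."""
    n_hosts = len(web)
    removed = set()
    curve = [surviving_fraction(web, removed)]
    for h in order:
        removed.add(h)
        curve.append(surviving_fraction(web, removed))
    step = 1.0 / n_hosts  # x-axis: fraction of hosts removed
    return sum((curve[k] + curve[k + 1]) / 2 * step for k in range(n_hosts))


def mean_random_robustness(web, replicates=100, seed=0):
    """Average robustness over random host-removal sequences."""
    rng = random.Random(seed)
    hosts = list(range(len(web)))
    total = 0.0
    for _ in range(replicates):
        order = hosts[:]
        rng.shuffle(order)
        total += robustness(web, order)
    return total / replicates
```

Values close to 1 indicate that most parasitoids persist until nearly all hosts are lost; lower values indicate that parasitoids are lost quickly as hosts disappear.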

      Reviewer #2 (Public Review):

      Summary:

      In their manuscript, Multi-dimensionality of tree communities structure host-parasitoid networks and their phylogenetic composition, Wang et al. examine the effects of tree diversity and environmental variables on communities of reed-nesting insects and their parasitoids. Additionally, they look for correlations in the community composition and network properties of the two interacting insect guilds. They use a data set collected in a subtropical tree biodiversity experiment over five years of sampling. The authors find that tree species, functional, and phylogenetic diversity, as well as some of the environmental factors, have varying impacts on both host and parasitoid communities. Additionally, the host and parasitoid communities showed correlations in their structures. The network metrics of the host-parasitoid network also showed patterns against environmental variables.

      Strengths:

      The main strength of the manuscript lies in the massive long-term data set collected on host-parasitoid interactions. The data provides interesting opportunities to advance our knowledge on the effects of environmental diversity (tree diversity) on the network and community structure of insect hosts and their parasitoids in a relatively poorly known system.

      Weaknesses:

      To me, there are no major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results (namely the top-down control in the system). Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning. Sometimes, the logic of the text could be improved to better support the studied hypotheses throughout the text. Also, the results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses should be described somewhere in the intro or presented with the results.

      Thank you very much for your valuable comments and suggestions on our manuscript! We appreciate your feedback and have made revisions accordingly. Specifically, we have rephrased the interpretation of the results and conclusions to better align with the analyses and avoid overstatements, particularly concerning the top-down control in the system. In addition, we have expanded the methods section by providing more details, especially regarding the statistical approaches, to address the points you raised. To enhance the clarity of the manuscript, we have also ensured that the logic of the text better supports the hypotheses throughout. Please see our point-by-point responses below for additional clarifications.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 120: "... and large ecosystems susceptible to global change (add citation here)": Citation(s)?

      We have now provided the missing citations.

      Line 141: Add sampling completeness information.

      Now we provide a new figure about sampling completeness (Fig. S1) in the supplementary materials, showing the adequate sampling effort for our study.

      Line 151: use more metrics in the evaluation of functional diversity

      We used Rao’s Q for tree functional diversity, which is an integrated and widely used metric of functional dissimilarity among trees. As our study focuses on multiple diversity indices of trees, we prefer not to give extra weight to one type of diversity. Thank you for your suggestion!

      Line 164: host vulnerability. Although generality and vulnerability are commonly used in network analysis, it is better to link these metrics with the trophic level, like the 'host vulnerability' you used. Thus, you can use 'parasitoid generality' instead of 'generality'.

      Thanks for your suggestion. The metrics are now labeled with their trophic levels throughout the text.

      Line 169: two'.'

      Corrected.

      Line 173: 'parasitoid robustness' Or 'robustness of parasitoids'?

      We have now changed it to ‘robustness of parasitoid’.

      Lines 173, 468: For the robustness estimations, maybe use null model for binary networks to get random distributions?

      Thanks for the suggestion. We have in fact used Patefield null models to compare the observed robustness with the randomized robustness, which helps to assess whether the robustness of the observed network differs significantly from that expected by chance. All robustness indices across plots were significantly different from a random distribution; see the Results section, L197-201.

      Line 184: modulating interacting communities of hosts and parasitoids.

      Changed accordingly.

      Line 186: determined host-parasitoid interaction patterns

      Changed accordingly.

      Line 191: Biodiversity loss in this study refers to low trophic levels.

      Now we clarified this point.

      Line 190: understand

      Changed accordingly.

      Lines 215-216: Reorganize these sentences

      Line 227: indirectly influenced by...

      Changed accordingly.

      Line 238: Be more specific. Which type of further study?

      Rephrased to be more specific.

      Lines 297-299: rewrite this sentence to make it more transparent.

      Now we rewrote the sentence accordingly.

      Line 302: Certain

      Changed accordingly.

      Line 453: effective

      Changed accordingly.

      Finally, the authors should check the text carefully to avoid grammatical errors.

      Thanks, now we have checked the full text to avoid grammatical errors.

      Reviewer #2 (Recommendations For The Authors):

      I feel that the authors have very interesting data and have a solid set of analyses. I do not have major issues regarding the manuscript, though sometimes I disagree with the interpretation of the results and some of the conclusions might be too far-fetched given the analyses and the results. Additionally, the methods section (especially statistics) was lacking some details, but I would not consider it too concerning at this point.

      I feel that the largest caveat of the manuscript remains in the representation of the rationale of the study. I felt the introduction could be more concise and better focused to back up the study questions and hypotheses. Many times, the sentences were too vague and unspecific, and thus, it was difficult to understand what was meant to be said. The authors could mention something more about how the community composition of hosts and parasitoids is expected to change with the studied experimental design regarding the metrics you mention in the introduction (stronger hypotheses). The results section cannot be understood as a stand-alone without reading the methods first. The study design and the rationale of the analyses must be described somewhere in the intro or results, if the journal/authors want to keep the methods-last structure. Also, the results and discussion could be more focused around the hypotheses. Naturally, these things can be easily fixed.

      I also disagree with the interpretation of results finding top-down control in the system (it might well be there, but I do not think that the current methods and tests are suitable for finding it). First, the methodology used cannot detect parasitoids if the hosts are not there, and the probability of detecting a parasitoid likely depends on the abundance of the host. Thus, top-down regulation is difficult to prove (is it the parasitoids that have driven the host population down?). Secondly, I would be hesitant to say anything about top-down and bottom-up control in the system, as the data in the manuscript are pooled across five years, while top-down/bottom-up regulation in insect systems usually spans only one season/generation in time (much shorter than five years). Consequently, the analyses are comparing communities of species, some of which most likely do not co-exist (they were found in the same space but not during the same time). Luckily, the top-down/bottom-up effects could potentially be explored by using separately the time steps of the now pooled community data: e.g., does the population of the host decrease in t if the parasitoids are abundant in t-1? There are also other statistical tests to explore these patterns.

      In the manuscript "Phylogenetic composition" refers to Mean Pairwise Distance. I would use "phylogenetic diversity" instead throughout the text. Also, to my understanding, in trees both "phylogenetic composition" and "phylogenetic diversity" are used even though based on their descriptions, they are the same.

      Detailed comments:

      Punctuation needs to be checked and edited at some point (I think copy-pasting had left things in the wrong places). Please check that "-" instead of "-" is used in host-parasitoid.

      1-2 The title is not very matching with the content. "Multi-dimensionality" is not mentioned in the text. "phylogenetic composition" -> "phylogenetic diversity"

      We did not find a role for the functional diversity of trees in host-parasitoid interactions, but we still have tree richness and phylogenetic diversity. We also disagree with using phylogenetic diversity to replace phylogenetic composition, because diversity highlights higher or lower phylogenetic distance among communities, while the latter highlights the phylogenetic dissimilarity across communities.

      53-57 This sentence is quite vague and because of it, difficult to follow. Consider rephrasing and avoiding unspecified terms such as "tree identity", "genetic diversity", and "overall community composition of higher trophic levels" (at least, I was not sure what taxa/level you meant with them).

      Rephrased.

      L58-61 “Especially, we lack a comprehensive understanding of the ways that biotic factors, including plant richness, overall community phylogenetic and functional composition of consumers, and abiotic factors such as microclimate, determining host–parasitoid network structure and host–parasitoid community dynamics.”

      56 I would remove "interact" as no interactions were tested.

      Removed accordingly.

      59-60 This needs rephrasing. I feel "taxonomic and phylogenetic composition" should be just "species composition". To better match what was done: "taxonomic, phylogenetic, and network composition of both host and parasitoid communities" -> "species and phylogenetic diversity of both host and parasitoid communities and the composition of their interaction networks"

      Changed accordingly.

      62 Remove "tree composition".

      Done.

      62 Replace "taxonomic" with "species". Throughout the text.

      Done.

      63-64 "Generally, top-down control was stronger than bottom-up control via phylogenetic association between hosts and parasitoids" I disagree, see my comments elsewhere.

      We have now rephrased the sentence.

      L68-70 “Generally, phylogenetic associations between hosts and parasitoids reflect non-randomly structured interactions between phylogenetic trees of hosts and parasitoids.”

      68 "habitat structure and heterogeneity" This is too strong and general of a statement based on the results. You did not really measure habitat structure or heterogeneity.

      We have now rephrased the statement to avoid an overly strong and general description.

      L71-73 “Our study indicates that the composition of higher trophic levels and corresponding interaction networks are determined by plant diversity and canopy cover especially via trophic phylogenetic links in species-rich ecosystems.”

      69 Specify "phylogenetic links". Trophic links?

      Specified to “trophic phylogenetic links”.

      75-77 The sentence is a bit difficult to follow. Consider rephrasing.

      We have now rephrased it.

      L79-82 “Changes in network structure of higher trophic levels usually coincide with variations in their diversity and community, which could be in turn affected by the changes in producers via trophic cascades”

      76 Be more specific about what you mean by "community of trophic levels".

      Specified to “community composition”.

      79 Remove "basal changes of", it only makes the sentence heavier.

      Done.

      81 What is "species codependence"?

      We aimed to describe species co-occurrence arising from their close relationships. For clarity, we have now changed this to “species coexistence”.

      82 What do you mean by "complex dynamics"?

      Rephrased to “mechanisms on dynamics of networks”.

      83 onward: I would not focus so much on top-down/bottom-up as I feel that your current analyses cannot really say anything too strong about these causalities but are rather correlative.

      Thanks, we have now removed the relevant content from the discussion. However, we kept one sentence in the Introduction, because it should be highlighted to make reviewers aware of this (the other text about this was removed).

      89 Remove "environmental".

      Done.

      90 Specify what you mean by "these forces".

      Done.

      98-99 I have difficulties following the logic here "potential specialization of their hosts may cascade up to impact the parasitoids' presence or absence". Consider rephrasing.

      We have now rephrased it.

      L101-102 “…and their host fluctuations may cascade up to impact the parasitoids’ presence or absence.”

      100 Be more specific with "habitat-level changes".

      Specified to “community-level changes”

      100 I do not see why host-parasitoid systems would be ideal to study "species interactions". There are much simpler and easier systems available.

      Changed to “… one of ideal…”

      101-103 "influence of" on what?

      We have now rephrased the sentence.

      L104-105 “Previous studies mainly focused on the influence of abiotic factors on host-parasitoid interactions”

      104 Be more specific in "the role of multiple components of plant diversity".

      Now we specified "the role of multiple components of plant diversity".

      L107-108 “…the role of multiple components of plant diversity (i.e. taxonomic, functional and phylogenetic diversity)…”

      106 "diversity associations" of what?

      “diversity associations between host and parasitoids”.

      108 Specify the "direct and indirect effects".

      Now we specified it to “…direct and indirect effects (i.e. one pathway and more pathways via other variables)…”

      110-113 A bit heavy sentence to follow. Consider rephrasing.

      We have now rephrased the sentence to make it more readable.

      114 Give an example of "phylogenetic dependences".

      Done. Phylogenetic dependences (e.g. phylogenetic diversity)

      117 Move the "e.g. taxonomic, phylogenetic, functional" within brackets in 117 after "dimensions of biodiversity".

      Done.

      120 "(add citation here)" Yes please!

      Done.

      120-121 Specify "such relationships".

      Done. Specified to “multiple dimensions of biodiversity”

      128-130 This is difficult to follow. Please rephrase.

      We have now rephrased the sentence.

      L135-137 “We aimed to discern the primary components of the diversity and composition of tree communities that affect higher trophic level interactions via quantifying the strength and complexity of associations between hosts and parasitoid.”

      131-132 Remove "phylogenetic and". It is redundant to phylogenetic diversity.

      Done.

      128 Tested robustness does not really capture "stability of associations".

      Yes, we agree. We have now rephrased the sentence and excluded the “stability” description.

      133 Specify "phylogenetic processes".

      Now we specified “phylogenetic processes”.

      L140-141 “…especially via phylogenetic processes (e.g. lineages of trophic levels diverge and evolve over time)…”

      141 I would like to have more details on the data set somewhere in the results. How many individuals and species were found in each plot (on average)? Was there a lot of temporal variation (e.g. between the seasons)? On how many sites were the insect species found?

      Thanks for your suggestion. We now provide more details on the data set in the results (L153-156), including the mean numbers of individuals and species in each plot. However, temporal variation is a relatively independent topic that should be studied separately, as our study focuses on the general pattern of interactions between hosts and parasitoids. Therefore, we have not added more information on temporal changes, to avoid readers getting lost in the text.

      153-156 “Among them, we found 56 host species (12 bees and 44 wasps, mean abundance and richness are 400.05 and 45.14, respectively, for each plot) and 50 parasitoid species (38 Hymenoptera and 12 Diptera, mean abundance and richness are 14.07 and 9.05, respectively, for each plot).”

      149 tree -> trees

      Done.

      149 Should there read also some else than "NMDS scores"?

      Thanks! Now we provided more details about “NMDS scores”.

      L161-162 “(NMDS axis scores; i.e. preserving the rank order of pairwise dissimilarities between samples)”

      149 You could mention the amount of variation explained by the first two axes of the NMDSs. Now it is difficult to estimate how much the models actually explain.

      Thanks for your comments! However, we could not directly provide the explanatory power of the two axes, because NMDS is based on rank-order distances rather than linear relationships like in PCA. However, the goodness of fit for the NMDS solution is typically evaluated using the stress value. We provide the stress value in the figure caption.
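For reference, the stress value mentioned in this response is commonly defined as Kruskal's stress (type 1). The formula below is the standard textbook definition and is given only as a reminder; the symbols are introduced here and are not taken from the manuscript. Here $d_{ij}$ is the distance between plots $i$ and $j$ in the ordination space, and $\hat{d}_{ij}$ is the value fitted by monotone regression of those distances on the observed dissimilarities.

```latex
S = \sqrt{\frac{\sum_{i<j} \left( d_{ij} - \hat{d}_{ij} \right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
```

A stress of 0 means the rank order of the observed dissimilarities is reproduced perfectly in the ordination; larger values indicate a poorer fit. Because only the rank order enters the fit, the axes do not carry a "percentage of variance explained" as in PCA.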

      150 "tree MPD" is mentioned for the first time. Spell it out.

      Done.

      150 Explain "eastness".

      Done.

      L163-164 “…eastness (sine-transformed radian values of aspect) )”
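As a small illustration of the eastness transformation quoted above, the sketch below converts an aspect to radians and takes its sine. It assumes that aspect is recorded in degrees clockwise from north, which is not stated in the response; the function name is hypothetical.

```python
import math


def eastness(aspect_degrees):
    """Sine of the aspect after converting degrees to radians.

    Assumes aspect is measured in degrees clockwise from north, so that
    east-facing slopes (90) approach +1 and west-facing slopes (270)
    approach -1.
    """
    return math.sin(math.radians(aspect_degrees))
```

Under this assumption, north- and south-facing slopes both map to values near 0, so eastness captures only the east-west component of slope orientation.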

      151 How was "tree functional diversity" quantified?

      Please see methods. L437-L438.

      160 Specify that you talk about phylogenetic compositions of the host and parasitoid communities here.

      We prefer to keep the wording concise here, consistent with “species composition”. Phylogenetic composition simply represents the dissimilarities of phylogenetic lineages within a community.

      161 Describe "parafit" test here when first mentioned.

      Done, see methods L485-487.

      182 Keep on referring to tables and figures in the discussion! Also, more clearly discuss your hypotheses. There are lots of discussions on top-down/bottom-up control. It could be good to form a hypothesis on them and predict what kind of patterns would suggest either one and what would you expect to find regarding them.

      We now refer to figures and tables in the discussion. As the content on top-down and bottom-up control did not fit our study very well (as also suggested by the reviewers), we rephrased the discussion and now discuss our hypotheses more clearly there. See L218, L226, and L237 etc.

      186 "partly determined host-parasitoid networks" Be more specific.

      Done.

      L206-207 “…partly determined host-parasitoid network indices, including vulnerability, linkage density, and interaction evenness.”

      195 Tell what you mean by "other biotic factors".

      Specified it: “…other biotic factors such as elevation and slope…”

      197-198 "It seems likely that these results are based on bee linkages to pollen resources" I would be hesitant to conclude this as the bees most likely forage way beyond the borders of the 30m by 30m study plots.

      Thanks for your concern about this problem. While it is true that bees can forage beyond 30 x 30m, the study focuses on their nesting behavior and activity within this defined area, rather than their entire foraging range. Existing literature shows bees often forage locally when resources are available (e.g. Ebeling et al., 2012 Oecologia; Guo et al., year, Basic and Applied Ecology). Therefore, we are confident that this pattern could be associated with the resources around the trap nests.

      223 "This could be further tested by collecting the food directly used by the wasps (caterpillars)" A bit unnecessary addition.

      Thanks for your suggestion. This is definitely a good point. We currently do not have enough data on caterpillars, but we will follow this up in the future.

      232-238 I disagree with the authors on the interpretation of the causality of the results here. I think that the community of parasitoids simply indicates which host species are available, while the host community does not have an as strong effect on parasitoid community as parasitoids do not utilise the whole species pool of the hosts. (Presence of parasitoid tells that the host is around while the presence of the host does not necessarily tell about the presence of the parasitoid.) To me, this would rather indicate a bottom-up than top-down regulation. Similar patterns are also visible in species communities of hosts and parasites.

      Thank you for your suggestion. We agree that parasitoids are more dependent on hosts, as hosts are not always attacked by parasitoids. We have now rephrased our explanation to follow this argument.

      L254-256 “Such pattern could be further confirmed by the significant association between host phylogenetic composition and parasitoid phylogenetic composition (Fig. 1c), which suggested that their interactions are phylogenetically structured to some extent.”

      247-266 The logic in this section is difficult to follow. Try rephrasing.

      We have now rephrased the section for a clearer logic.

      L270-287 “Tree community species richness did not significantly influence the diversity of hosts targeted by parasitoids (parasitoid generality), but caused a significant increase in the diversity of parasitoids per host species (host vulnerability) (Fig. 3a; Table 2). This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity (Lopez-Carretero et al. 2014). Such positive relationship between host vulnerability and tree species richness suggested that host-parasitoid interactions could be driven through bottom-up effects via benefit from tree diversity. For example, parasitoid species increases more than host diversity with increasing tree species richness (Guo et al. 2021), resulting increasing of host vulnerability at community level. According to the enemies hypothesis (Root 1973), which posits a positive effects of plant richness on natural enemies, the higher trophic levels in our study (e.g. predators and parasitoids) would benefit from tree diversity and regulate herbivores thereby (Staab and Schuldt 2020). Indeed, previous studies at the same site found that bee parasitoid richness and abundance were positively related to tree species richness, but not their bee hosts (Fornoff et al. 2021, Guo et al. 2021). Because our dataset considered all hosts and reflects an overall pattern of host-parasitoid interactions, the effects of tree species richness on parasitoid generality might be more complex and difficult to predict, as we found that neither tree species richness nor tree MPD were related to parasitoid generality.”

      249 "This is likely because niche differentiation often influences network specialization via potential higher resource diversity in plots with higher tree diversity" This is a bit contradicting your vulnerability results as niche differentiation should increase specialization and diversity and specialization should decrease vulnerability (less host per parasitoid).

      Thanks! We understand that the concepts of “generality” and “vulnerability” can be a bit confusing. To clarify, “fewer hosts per parasitoid” actually corresponds to lower generality at the community level.
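To make the distinction between the two metrics concrete, here is a minimal sketch of their qualitative (unweighted) forms: generality as the mean number of host species per parasitoid species, and vulnerability as the mean number of parasitoid species per host species. The weighted versions typically reported by network packages differ in detail; the function names and the rows-as-hosts, columns-as-parasitoids layout are assumptions made for this example.

```python
def parasitoid_generality(web):
    """Mean number of host species attacked per parasitoid species.

    web[h][p] > 0 means host h (row) is attacked by parasitoid p (column).
    """
    n_hosts, n_paras = len(web), len(web[0])
    hosts_per_parasitoid = [
        sum(1 for h in range(n_hosts) if web[h][p] > 0)
        for p in range(n_paras)
    ]
    return sum(hosts_per_parasitoid) / n_paras


def host_vulnerability(web):
    """Mean number of parasitoid species attacking each host species."""
    parasitoids_per_host = [sum(1 for x in row if x > 0) for row in web]
    return sum(parasitoids_per_host) / len(web)
```

In this reading, stronger specialization of parasitoids lowers generality (fewer hosts per parasitoid), while vulnerability can still rise if more parasitoid species attack each host.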

      332-337 How did you select the species growing on your plots? Or was only species number considered? What was the pool of tree species growing on the selected plots? Was the selection similar at both sites?

      We have now provided more information on the experimental design.

      L354-356 “The species pools of the two plots are nonoverlapping (16 species for each site). The composition of tree species within the study plots is based on a “broken-stick” design (see Bruelheide et al. 2014).”

      342 Remove "centrally per plot"?

      Done.

      346-347 Was the selection of different reed diameters similar in all the plots?

      Diameters and the relative distribution of diameters were similar in all trap nests.

      399 & 432 Are "phylogenetic diversity of the tree communities" and "phylogenetic composition of trees" the same? They are both described as mean pairwise distance.

      These two are actually different: we use them to distinguish phylogenetic diversity within communities from the rank order of dissimilarities between tree communities. Here, the phylogenetic diversity of a tree community is the mean pairwise phylogenetic distance among its species. Tree phylogenetic composition is the rank order of pairwise dissimilarities between tree communities based on NMDS.

      400 Do you think that MPD makes any sense with the monocultures (value is always 0)? Does this have a potential to bias your analyses and result?

      We agree with your point. However, we do not think that this is a major problem in the analyses. We followed the experimental design and considered low phylogenetic relatedness of tree species in a plot (likewise, in monocultures the tree species richness is always 1).

      402-405 MNTD is not mentioned before or after this. Consider removing this section.

      We tested the potential effects of MNTD in our models and now mention it in our results.

      L194-195 “Tree mean nearest taxon distance (MNTD) was unrelated to any network indices.”

      405 "Phylogenetic metrics of trees" Which ones?

      Both tree MPD and MNTD. Now we have noted it in the manuscript. (L432)

      410 Further details on "Rao's Q" and how the functional diversity of the communities was calculated are needed.

      Now more details were provided.

      L435-438 “Specifically, seven leaf traits were used for calculation of tree functional diversity, which was calculated as the mean pairwise distance in trait values among tree species, weighted by tree wood volume, and expressed as Rao's Q”
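As an illustration of the abundance-weighted pairwise-distance idea behind Rao's Q described above, here is a minimal sketch. It sums the trait distance over all ordered species pairs, weighted by relative abundance; some formulations sum over unordered pairs only, which halves the value. The function name and input layout are assumptions, and the weights stand in for the wood volumes mentioned in the quoted text.

```python
def rao_q(weights, distance):
    """Rao's quadratic entropy: sum over i, j of d_ij * p_i * p_j.

    weights  -- one non-negative weight per species (e.g. wood volume)
    distance -- square matrix of pairwise trait distances, zero diagonal
    """
    total = sum(weights)
    p = [w / total for w in weights]  # relative abundances
    n = len(p)
    return sum(distance[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))
```

With a single species the sum contains only the zero diagonal, so Q is 0 for monocultures regardless of the traits involved.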

      413 Specify "higher trophic levels".

      Now we specified the trophic levels.

      L440-441 “…higher trophic levels in our study area, such as herbivores and predators”

      417-424 What about the position of the plots within study sites? Is there potential for edge effects (e.g. bees finding easier the trap nest close to the edge of the experimental forest)? Were there any differences between the two sites? What is the elevation range of the plots?

      Thank you for your attention to the details of our study. First, all the plots were randomly distributed within the study sites (see Fig. S2). Admittedly, several plots are located at the edges of the sites. However, we did not consider potential edge effects in our analysis; this will be a good point to address in future studies. Moreover, the biggest difference between the two sites is their non-overlapping tree species pools; the two sites are 5 km apart in the same town. Finally, there is no pronounced elevation gradient across the plots (112 m - 260 m).

      432-434 "The species and phylogenetic composition of trees, hosts, and parasitoids were quantified at each plot with nonmetric multidimensional scaling (NMDS) analysis based on Morisita-Horn distances" This section needs to be more specific and detailed. Did you do the NMDS separately for each plot as suggested in the text?

      We have provided more details in this section.

      L462-465 “The minimum number of required dimensions in the NMDS based on the reduction in stress value was determined in the analysis (k = 2 in our case). We centred the results to acquire maximum variance on the first dimension, and used the principal components rotation in the analysis.”

      435 Specify how picante was used (function and arguments)!

      We have now specified the function.

      L465-467 “The phylogenetic composition was calculated by mean pairwise distance among the host or parasitoid communities per plot with the R package “picante” with ‘mpd’ function.”

      436 "standardized values" Of what? How was the standardisation done?

      We now cite a supplementary table (Table S2) to specify this (see L469). For the standardization, we used the ‘scale’ function in R, which standardizes data by centering and scaling. Specifically, it subtracts the mean and divides by the standard deviation for each variable.
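      The centering and scaling described here can be sketched in a few lines of Python. This is an illustration rather than the authors' code: the input values are invented, and the sketch uses the sample standard deviation (denominator n - 1), which is what R's ‘scale’ applies by default to centered data.

```python
from statistics import mean, stdev

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD, denominator n - 1
    return [(v - m) / s for v in values]

# Hypothetical values of one predictor across five plots
richness = [1, 2, 4, 8, 16]
z_scores = standardize(richness)
```

      After this transformation each variable has mean 0 and standard deviation 1, so model coefficients are expressed on a common scale.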

      443 Provide more details on parafit.

      Actually, we have provided the reason why we use the parafit test and the usage.

      L483-486 “We used a parafit test (9,999 permutations) with the R package “ape” to test whether the associations were non-random between hosts and parasitoids. This is widely used to assess host-parasite co-phylogeny by analyzing the congruence between host and parasite phylogenies using a distance-based matrix approach.”

      449-451 Rephrase the sentence.

      Rephrased.

      L490-491 “We constructed quantitative host-parasitoid networks at community level with the R package “bipartite” for each plot of the two sites.”

      451 "six" Should this be five?

      Yes, should be five, thanks.

      470-481 What package and function were used for the LMMs?

      As we now use linear models, we no longer use an R package for LMMs.

      470 "mix" -> mixed

      Changed to linear models.

      472 "six" Should this be five?

      Again, we changed it to five.

      479-481 How did you treat the variables from the two different sites when testing for the correlations to avoid two geographic clusters of data points?

      We now include study site as a fixed factor in our linear models. Moreover, tree-based variables were additionally included as interaction terms with study site.

      501 "mix" -> mixed

      Changed to linear models.

      The panel selection for figures 3 and 4 seems random. Justify it!

      Thank you. To avoid including too many figures in the main text, which could potentially confuse readers, we have selected the key results that are of primary interest. The remaining figures are provided in the appendix for reference.

      533 "Note that axes are on a log scale for tree species richness." Why the log-scale if the analyses were performed with linear fit? Also, the drawn regression lines do not match the model description (non-linear, while a linear model is described in the text). The models should probably be described in more detail.

      We used a log transformation to improve the normality of the data. The regression lines drawn are linear, consistent with our models.

      539 "Values were adjusted for covariates of the final regression model." How?

      We used a residual plot, which directly visualizes the relationship between the predictor and the response variable together with the fitted regression line, making it easier to assess the model's fit.

      Fig. S4 text does not match the figure.

      Thanks! We have now deleted the mismatched text in the figure.

    1. eLife Assessment

      This important study provides new insights into the mechanisms that underlie perceptual and attentional impairments of conscious access. The paper presents convincing evidence of a dissociation between the early stages of low-level perception, which are impermeable to perceptual or attentional impairments, and subsequent stages of visual integration which are susceptible to perceptual impairment but resilient to attentional manipulations. This study will be of interest to scientists working on visual perception and consciousness.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing.

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point.

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory share the same shape, so more elaborate object processing could also be occurring. Please discuss.

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead.

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy?

      Comments on revisions:

      I'm very pleased with the responses to my previous comments, and congratulate the authors on this excellent piece of work.

    3. Reviewer #2 (Public review):

      Summary:

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access : (i) an early (< 80ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event.

      Strengths:

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing.

      Comments on revisions:

      I congratulate the authors for the quality of their revised ms. They convincingly addressed each of the issues raised in my previous review.

    4. Reviewer #3 (Public review):

      Summary:

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths:

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions.

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response).

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception.

      Weaknesses:

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. In contrast, an early classification peak was exclusively affected by masking. A later classification peak mirrored the behavioural findings, with classification performance impacted by both masking and AB.

      The interpretation of the results primarily relies on the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast and illusory perception reflect feedforward and (lateral and feedback) recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. Although both studies reference feedback connections, neither provides clear evidence for the involvement of lateral connections.

      The evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Additionally, the other studies cited in the manuscript focused solely on lateral connections without examining feedback pathways, making it challenging to draw definitive conclusions.

      Comments on revisions:

      The authors have thoroughly addressed all my comments and provided comprehensive responses to each point raised.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing. 

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text. 

      We thank the reviewer for their positive assessment of our work and for their extremely helpful and constructive comments that helped to significantly improve the quality of our manuscript.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to post-selection or exclusion of participants, but at the same time do not discuss this equally important point.

      Performance was indeed highly variable between observers, as is commonly found in attentional-blink (AB) and masking studies. For some observers, the AB pushes performance almost to chance level, whereas for others it has almost no effect. A similar effect can be seen in masking. We did our best to match accuracy over participants, while also matching accuracy within participants as well as possible, adjusting mask contrast manually during the experimental session. Naturally, those who are strongly affected by masking need not be the same participants as those who are strongly affected by the AB, given the fact that they rely on different mechanisms (which is also one of the main points of the manuscript). To answer the research question, what mattered most was that at the group level, performance was well matched between the two key conditions. As all our statistical inferences, both for behavior and EEG decoding, rest on this group level, we do not think that variability at the individual-subject level detracts from this general approach.

      In the Results, we added that our goal was to match performance across participants:

      “Importantly, mask contrast in the masked condition was adjusted using a staircasing procedure to match performance in the AB condition, ensuring comparable perceptual performance in the masked and the AB condition across participants (see Methods for more details).”

      In the Methods, we added:

      “Second, during the experimental session, after every 32 masked trials, mask contrast could be manually updated in accordance with our goal to match accuracy over participants, while also matching accuracy within participants as well as possible.”

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and non-illusory share the same shape, so more elaborate object processing could also be occurring. Please discuss.

      We agree with this qualification of our interpretation, and included the reviewer’s account as an alternative explanation in the Discussion section:  

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processes representing the triangular shapes as well.”

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead. 

      We agree with the reviewer that the interpretation of this result depends on the definition of consciousness that one adheres to. If one takes report as the leading metric for consciousness (=conscious access), one can indeed conclude that perceptual segmentation/organization can also occur unconsciously. However, if the processing that results in the qualitative nature of an image (rather than whether it is reported) is taken as leading – such as the processing that results in the formation of an illusory percept – (=phenomenal) the conclusion can be quite different. This speaks to the still ongoing debate regarding the existence of phenomenal vs access consciousness, and the literature on no-report paradigms amongst others (see last paragraph of the discussion). Because the current data do not speak directly to this debate, we decided to remove the sentence about “conscious experience”, and edited this part of the manuscript (also addressing a comment about preserved unconscious processing during masking by Reviewer 2) by limiting the interpretation of unconscious processing to those aspects that are uncontroversial:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling deep unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      (4) The two paradigms developed here could be used jointly to highlight non-idiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer?

      To avoid issues with post-hoc selection of (visible vs. invisible) trials (discussed in the Introduction), we did not divide our trials into conscious and unconscious trials, and thus did not attempt to reveal NCCs, or NCCs generalizing across the two paradigms. Note also that this approach alone would not resolve the debate regarding the ‘true’ NCC as it hinges on the operational definition of consciousness one adheres to; also see our response to the previous point the reviewer raised. Our main analysis revealed that the illusory triangle could be decoded with above-chance accuracy during both masking and the AB over extended periods of time with similar topographies (Fig. 2B), so that significant cross-decoding would be expected over roughly the same extended period of time (except for the heightened 200-250 ms peak). However, as our focus was on differences between the two manipulations and because we did not use post-hoc sorting of trials, we did not add these analyses.

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy? 

      Compared to certain manipulations of spatial attention, the AB phenomenon is generally considered to represent an instance of “late” attentional filtering. In the Discussion section we included a paragraph on classic load theory, where early and late filtering depend on perceptual and attentional load. Just preceding this paragraph, we added this:

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      Reviewer #2 (Public Review): 

      Summary: 

      This is a very elegant and important EEG study that unifies within a single set of behaviorally equated experimental conditions conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' unconscious levels of processing. In particular, the authors could provide strong evidence to distinguish here within the same paradigm these two levels of unconscious processing that precede conscious access : (i) an early (< 80ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, (ii) a later stage and more integrated processing (200-250ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. Then, the late conscious access stage appeared as a P3b-like event. 

      Strengths: 

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing. 

      Weaknesses: 

      - The authors could improve clarity of the rich set of decoding analyses across conditions. 

      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (revision of traditional concept of 'automaticity'), that may introduce some complexity in Results interpretation 

      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking). 

      We thank the reviewer for their positive assessment of our study and for their insightful comments and helpful suggestions that helped to significantly strengthen our paper. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we followed the reviewer’s suggestions and revised the Results/Discussion to include references to influences on unconscious processes and expanded our discussion of unconscious effects during masking vs. AB.  

      Reviewer #3 (Public Review): 

      Summary: 

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access.

      Strengths: 

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions. 

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response). 

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception. 

      Weaknesses: 

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB. 

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast, collinearity, and the illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems at least limited. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to a selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of the Kanizsa figures, particularly in the superficial layers. They all mention feedback connections, but none seem to point to lateral connections.

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions. 

      We thank the reviewer for their careful review and positive assessment of our study, as well as for their constructive criticism and helpful suggestions. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we addressed the reviewer’s comments and suggestions by better relating our study to Fahrenfort et al.’s (2017) paper and by highlighting the limitations inherent in linking our findings to distinct neural mechanisms (in particular, to lateral vs. feedback connections).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      -  Methods: it states that "The distance between the three Pac-Man stimuli as well as between the three aligned two-legged white circles was 2.8 degrees of visual angle". It is unclear what this distance refers to. Is it the shortest distance between the edges of the objects? 

      It is indeed the shortest distance between the edges of the objects. This is now included in the Methods.

      -  Methods: It's unclear to me if the mask updating procedure during the experimental session was based on detection rate or on the perceptual performance index reported on Fig1D. Please clarify. 

      It was based on accuracy calculated over 32 trials. We have included this information in the Methods.

      -  Methods and Results: I did not understand why the described procedure used to ensure that confidence ratings are not contaminated by differences in perceptual performance was necessary. To me, it just seems to make the "no manipulations" and "both manipulations" less comparable to the other 2 conditions. 

      To calculate accurate estimates of metacognitive sensitivity for the two matched conditions, we wanted participants to make use of the full confidence scale (asking them to distribute their responses evenly over all ratings within a block). By mixing all conditions in the same block, we would have run the risk of participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition). We made this point explicit in the Results section and in the Methods section:

      “To ensure that the distribution of confidence ratings in the performance-matched masked and AB condition was not influenced by participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition, respectively), the masked and AB condition were presented in the same experimental block, while the other block type included the no and both manipulations condition.”

      “To ensure that confidence ratings for these matched conditions (masked, long lag and unmasked, short lag) were not influenced by participants anchoring their confidence ratings to the very easy and very difficult unmatched conditions (no and both manipulations, respectively), one type of block only contained the matched conditions, while the other block type contained the two remaining, unmatched conditions (masked, short lag and unmasked, long lag).”

      - Methods: what priors were used for Bayesian analyses? 

      Bayesian statistics were calculated in JASP (JASP Team, 2024) with default prior scales (Cauchy distribution, scale 0.707). This is now added to the Methods.
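
      To illustrate what a Cauchy prior with scale 0.707 implies in the simplest case, the following minimal sketch computes the default (JZS) Bayes factor for a one-sample or paired t-test by numerical integration. It is not JASP's code and does not cover the rm ANOVAs reported in the manuscript; the function name `jzs_bf10`, the number of integration steps, and the example t-values are assumptions chosen for illustration only.

```python
import math


def jzs_bf10(t, n, r=0.707, steps=20000):
    """BF10 for a one-sample (or paired) t-test.

    Assumes a Cauchy(0, r) prior on the standardized effect size under H1,
    written as a normal prior whose variance g has an inverse-chi-square(1)
    distribution. The integral over g is evaluated with a midpoint rule
    after mapping g in (0, inf) onto u in (0, 1) via g = u / (1 - u).
    """
    nu = n - 1  # degrees of freedom

    def integrand(g):
        a = 1.0 + n * g * r * r
        likelihood = a ** -0.5 * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2.0)
        prior_on_g = (2.0 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1.0 / (2.0 * g))
        return likelihood * prior_on_g

    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps
        g = u / (1.0 - u)
        total += integrand(g) / (1.0 - u) ** 2  # Jacobian of the substitution

    marginal_h1 = total / steps
    marginal_h0 = (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)
    return marginal_h1 / marginal_h0


# Hypothetical t-values for a sample of 30 participants.
for t_value in (0.0, 2.0, 5.0):
    print(t_value, jzs_bf10(t_value, n=30))
```

      With this prior, a t-value of zero yields a Bayes factor below 1 (evidence for the null), and the Bayes factor grows as the t-value moves away from zero.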

      - Results, line 162: It states that classifiers were applied on "raw EEG activity" but the Methods specify preprocessing steps. "Preprocessed EEG activity" seems more appropriate. 

      We changed the term to “preprocessed EEG activity” in the Methods and to “(minimally) preprocessed EEG activity (see Methods)” in the  Results, respectively.

      - Results, line 173: The effect of masking on local contrast decoding is reported as "marginal". If the alpha is set at 0.05, it seems that this effect is significant and should not be reported as marginal. 

      We changed the wording from “marginal” to “small but significant.”  

      - Fig1: The fixation cross is not displayed. 

      Because adding the fixation cross would have made the figure of the trial design look crowded and less clear, we decided to exclude it from this schematic trial representation. We now also state this in the legend of Figure 1.

      - Fig 3A: In the upper left panel, isn't there a missing significant effect of the "local contrast training and testing" condition in the first window? If not, this condition seems oddly underpowered compared to the other two conditions. 

      Thanks for the catch! The highlighting in bold and the significance bar were indeed lacking for this condition in the upper left panel (blue line). We corrected the figure in our revision.

      - Supplementary text and Fig S6: It is unclear to me why the two control analyses (the black lines vs. the green and purple lines) are pooled together in the same figure. They seem to test for different, non-comparable contrasts (they share neither training nor testing sets), and I find it confusing to find them on the same figure. 

      We agree that this may be confusing, and deleted the results from one control analysis from the figure (black line, i.e., training on contrast, testing on illusion), as the reviewer correctly pointed out that it displayed a non-comparable analysis. Given that this control analysis did not reveal any significant decoding, we now report its results only in the Supplementary text.  

      - Fig S6: I think the title of the legend should say testing on the non-illusory triangle instead of testing on the illusory triangle to match the supplementary text. 

      This was a typo – thank you! Corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      Issue #1: One key asymmetry between the three levels of T2 attributes (i.e.: local contrast; non-illusory triangle; illusory Kanizsa triangle) is related to the top-down conscious posture driven by the task that was exclusively focusing on the last attribute (illusory Kanizsa triangle). Therefore, any difference in EEG decoding performance across these three levels could also depend on this asymmetry. For instance, if participants were engaged to report local contrast or non-illusory triangle, one could wonder if decoding performance could differ from that observed here. This potential confound was addressed by the authors by using decoders trained on different datasets in which the main task was to report one of the two other attributes. They could then test how classifiers trained on the task-related attribute behave on the main dataset. However, this part of the study, although crucial, is not 100% clear, and the links with the results of these control experiments are not fully explicit. Could the authors better clarify this important point (see also Issue #1 and #3)?

      The reviewer raises an important point, alluding to potential differences between decoded features regarding task relevance. There are two separate sets of analyses where task relevance may have been a factor, our main analyses comparing illusion to contrast decoding, and our comparison of collinearity vs. illusion-specific processing.  

      In our main analysis, we are indeed reporting decoding of a task-relevant feature (illusion) and of a task-irrelevant feature (local contrast, i.e., rotation of the Pac-Man inducers). Note, however, that the Pac-Man inducers were always task-relevant, as they needed to be processed to perceive illusory triangles, so that local contrast decoding was based on task-relevant stimulus elements, even though participants did not respond to local contrast differences in the main experiment. However, we also ran control analyses testing the effect of task-relevance on local contrast decoding in our independent training data set and in another (independent) study, where local contrast was, in separate experimental blocks, task-relevant or task-irrelevant. The results are reported in the Supplementary Text and in Figure S5. In brief, task-relevance did not improve early (70–95 ms) decoding of local contrast. We are thus confident that the comparison of local contrast to illusion decoding in our main analysis was not substantially affected by differences in task relevance. In our previous manuscript version, we referred to these control analyses only in the collinearity-vs-illusion section of the Results. In our revision, we added the following in the Results section comparing illusion to contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      In addition to our main analysis, there is the concern that our comparison of collinearity vs. illusion-specific processing may have been affected by differences in task-relevance between the stimuli inducing the non-illusory triangle (the “two-legged white circles”, collinearity-only) and the stimuli inducing the Kanizsa illusion (the Pac-Man inducers, collinearity-plus-illusion). We would like to emphasize that in our main analysis classifiers were always used to decode T2 illusion presence vs. absence (collinearity-plus-illusion), and never to decode T2 collinearity-only. To distinguish collinearity-only from collinearity-plus-illusion processing, we only varied the training data (training classifiers on collinearity-only or collinearity-plus-illusion), using the independent training data set, where collinearity-only and collinearity-plus-illusion (and rotation) were task-relevant (in separate blocks). As discussed in the Supplementary Information, for this analysis approach to be valid, collinearity-only processing should be similar for the illusory and the non-illusory triangle, and this is what control analyses demonstrated (Fig. S7). In any case, general task-relevance was equated for the collinearity-only and the collinearity-plus-illusion classifiers.

      Finally, in supplementary Figure 6 we also show that our main results reported in Figure 2 (discussed at the top of this response) were very similar when the classifiers were trained on the independent localizer dataset in which each stimulus feature could be task-relevant.  

      Together, for the reasons described above, we believe that differences in EEG decoding performance across these three stimulus levels are unlikely to depend on a “task-relevance” asymmetry.

      Issue #2: Following on from my previous point, the authors should better mention the concept of conscious influences on unconscious processing that led to a full revision of the notion of automaticity in cognitive science [1, 2, 3, 4]. For instance, the discovery that conscious endogenous temporal and spatial attention modulate unconscious subliminal processing paved the way to this revision. This concept raises the importance of Issue #1: equating performance on the main task across AB and masking is not enough to guarantee that differences of neural processing of the unattended attributes of T2 (i.e.: task-unrelated attributes) are not, in part, due to this asymmetry rather than to a systematic difference of unconscious processing strength [5, 6-8]. Obviously, the reported differences for real-triangle decoding between AB and masking cannot be totally explained by such a factor (because this is a task-unrelated attribute for both AB and masking conditions), but still this issue should be better introduced, addressed, clarified (Issue #1 and #3) and discussed.

      We would like to refer to our response to the previous point: Control analyses for local contrast decoding showed that task relevance had no influence on our marker for feedforward processing. Most importantly, as outlined above, we did not perform real-triangle decoding – all our decoding analyses focused on comparing collinearity-only vs. collinearity-plus-illusion were run on the task-relevant T2 illusion (decoding its presence vs. absence). The key difference was solely the training set, where the collinearity-only classifier was trained on the (task-relevant) real triangle and the collinearity-plus-illusion classifier was trained on the (task-relevant) Kanizsa triangle. Thus, overall task relevance was controlled in these analyses.  

      In our revision, we are now also citing the studies proposed by the reviewer, when discussing the control analyses testing for an effect of task-relevance on local contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      Issue #3: In terms of clarity, I would suggest that the authors add a synthetic figure providing an overall view of all pairs of intra and cross-conditions decoding analyses and mentioning the main task for training and testing sets for each analysis (see my previous and related points). Indeed, at one point, the reader can get lost and this would not only strengthen accessibility to the detailed picture of results, but also pinpoint the limits of the work (see previous point).

      We understand the point the reviewer is raising and acknowledge that some of our analyses, in particular those using different training and testing sets, may be difficult to grasp. But given the variety of different analyses using different training and testing sets, different temporal windows, as well as different stimulus features, it was not possible to design an intuitive synthetic figure summarizing the key results. We hope that the added text in the Results and Discussion section will be sufficient to guide the reader through our set of analyses.  

      In our revision, we are now more clearly highlighting that, in addition to presenting the key results in our main text that were based on training classifiers on the T1 data, “we replicated all key findings when training the classifiers on an independent training set where individual stimuli were presented in isolation (Fig. 3A, results in the Supplementary Information and Fig. S6).” For this, we added a schematic showing the procedure of the independent training set to Figure 3, more clearly pointing the reader to the use of a separate training data set.  
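
      The logic of training a classifier on one data set and testing it on another, which underlies the analyses described in this response, can be sketched on synthetic data. This is a minimal illustration and not the pipeline used in the study: the nearest-class-mean classifier, the simulated “channels”, and all names and parameter values are assumptions made for the example.

```python
import random


def simulate(n_trials, n_channels, effect, rng):
    """Synthetic trials: label 1 shifts every channel by `effect`; noise is unit Gaussian."""
    X, y = [], []
    for i in range(n_trials):
        label = i % 2  # alternate "feature absent" (0) and "feature present" (1)
        X.append([rng.gauss(effect * label, 1.0) for _ in range(n_channels)])
        y.append(label)
    return X, y


def fit_class_means(X, y):
    """Training step: store the mean channel pattern of each class."""
    means = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict(means, x):
    """Assign a trial to the class whose mean pattern is closest (squared Euclidean distance)."""
    def sqdist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda lab: sqdist(means[lab]))


def accuracy(means, X, y):
    return sum(predict(means, x) == lab for x, lab in zip(X, y)) / len(y)


rng = random.Random(0)

# Hypothetical stand-in for an independent training set.
X_train, y_train = simulate(200, 8, effect=1.0, rng=rng)
classifier = fit_class_means(X_train, y_train)

# Hypothetical test sets: one carrying the trained pattern, one carrying no signal.
X_signal, y_signal = simulate(200, 8, effect=1.0, rng=rng)
X_null, y_null = simulate(200, 8, effect=0.0, rng=rng)

acc_signal = accuracy(classifier, X_signal, y_signal)
acc_null = accuracy(classifier, X_null, y_null)
print("test set with signal:", acc_signal)
print("test set without signal:", acc_null)
```

      Above-chance accuracy on the test set indicates that the pattern learned from the training data is also present in the test data; when the test data carry no such pattern, accuracy stays near 50%.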

      Issue #4: In the light of these findings, the authors should discuss more thoroughly the question of unconscious high-level representations in masking versus AB: in particular, a longstanding issue relates to unconscious semantic processing of words, numbers or pictures. According to their findings, they tend to suggest that semantic processing should be more enabled in AB than in masking. However, a rich literature provided a substantial number of results (including results from the last author, Simon van Gaal) that tend to support the notion of unconscious semantic processing in subliminal processing (see in particular: [9, 10, 11, 12, 13]). So, and as mentioned by the authors, while there is evidence for semantic processing during AB they should better discuss how they would explain unconscious semantic subliminal processing. While a possibility could be to question the unconscious attribute of several subliminal results, the same argument also holds for AB studies. Another possible track of discussion would be to differentiate AB and subliminal perception in terms of strength and durability of the corresponding unconscious representations, but not necessarily in terms of cognitive richness. Indeed, one may discuss that semantic processing of stimuli that do not need complex spatial integration (e.g.: words or digits as compared to the illusory Kanizsa figure tested here) can still be observed under subliminal conditions.

      We thank the reviewer for pointing us to this shortcoming of our previous Discussion. Note that our data does not directly speak to the question of high-level unconscious representations in masking vs AB, because such conclusions would hinge on the operational definition of consciousness one adheres to (also see response to Reviewer 1). Nevertheless, we do follow the reviewer’s suggestions and added the following in the Discussion (also addressing a point about other forms of attention raised by Reviewer 1):

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      And, in a following paragraph in the Discussion:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling high-level unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      Reviewer #3 (Recommendations For The Authors): 

      (1) The objective of Fahrenfort et al., 2017 seems very similar to that of the current study. What are the main differences between the two studies? Moreover, Fahrenfort et al., 2017 conducted similar decoding analyses to those performed in the current study.

      Which results were replicated in the current study, and which ones are novel? Highlighting these differences in the manuscript would be beneficial. 

      We now provide a more comprehensive coverage of the study by Fahrenfort et al., 2017. In the Introduction, we added a brief summary of the key findings, highlighting that this study’s findings could have reflected differences in task performance rather than differences between masking and AB:

      “For example, Fahrenfort and colleagues (2017) found that illusory surfaces could be decoded from electroencephalogram (EEG) data during the AB but not during masking. This was taken as evidence that local recurrent interactions, supporting perceptual integration, were preserved during inattention but fully abolished by masking. However, masking had a much stronger behavioral effect than the AB, effectively reducing task performance to chance level. Indeed, a control experiment using weaker masking, which resulted in behavioral performance well above chance similar to the main experiment’s AB condition, revealed some evidence for preserved local recurrent interactions also during masking. However, these conditions were tested in separate experiments with small samples, precluding a direct comparison of perceptual vs. attentional blindness at matched levels of behavioral performance. To test …”

      In the Results, we are now also highlighting this key advancement by directly referencing the previous study:

      “Thus, whereas in previous studies task performance was considerably higher during the AB than during masking (e.g., Fahrenfort et al., 2017), in the present study the masked and the AB condition were matched in both measures of conscious access.” When reporting the EEG decoding results in the Results section, we continuously cite the Fahrenfort et al. (2017) study to highlight similarities in the study’s findings. We also added a few sentences explicitly relating the key findings of the two studies:

      “This suggests that the AB allowed for greater local recurrent processing than masking, replicating the key finding by Fahrenfort and colleagues (2017). Importantly, the present result demonstrates that this effect reflects the difference between the perceptual vs. attentional manipulation rather than differences in behavior, as the masked and the AB condition were matched for perceptual performance and metacognition.”

      “This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      We also more clearly highlighted where our study goes beyond Fahrenfort et al.’s (2017), e.g., in the Results:

      “The addition of this element of collinearity to our stimuli was a key difference to the study by Fahrenfort and colleagues (2017), allowing us to compare non-illusory triangle decoding to illusory triangle decoding in order to distinguish between collinearity and illusion-specific processing.”

      And in the Discussion:

      “Furthermore, the addition of line segments forming a non-illusory triangle to the stimulus employed in the present study allowed us to distinguish between collinearity and illusion-specific processing.”

      Also, in the Discussion, we added a paragraph “summarizing which results were replicated in the current study, and which ones are novel”, as suggested by the reviewer:

      “This pattern of results is consistent with a previous study that used EEG to decode Kanizsa-like illusory surfaces during masking and the AB (Fahrenfort et al., 2017). However, the present study also revealed some effects where Fahrenfort and colleagues (2017) failed to obtain statistical significance, likely reflecting the present study’s considerably larger sample size and greater statistical power. For example, in the present study the marker for feedforward processing was weakly but significantly impaired by masking, and the marker for local recurrency was significantly impaired not only by masking but also by the AB, although to a lesser extent. Most importantly, however, we replicated the key findings that local recurrent processing was more strongly impaired by masking than by the AB, and that global recurrent processing was similarly impaired by masking and the AB and closely linked to task performance, reflecting conscious access. Crucially, having matched the key conditions behaviorally, the present finding of greater local recurrency during the AB can now unequivocally be attributed to the attentional vs. perceptual manipulation of consciousness.”

      Finally, we changed the title to “Distinct neural mechanisms underlying perceptual and attentional impairments of conscious access despite equal task performance” to highlight one of the crucial differences between the Fahrenfort et al. study and this study, namely the fact that we equalized task performance between the two critical conditions (AB and masking).

      (2) It is not clear from the text the link between the current study and the literature on the role of lateral and feedback connections in consciousness (Lamme, 2020). A better explanation is needed. 

      To our knowledge, consciousness theories such as Lamme's recurrent processing theory currently make no distinction between the role of lateral and feedback connections for consciousness. The principled distinction lies between unconscious feedforward processing and phenomenally conscious or “preconscious” local recurrent processing, where local recurrency refers to both lateral (or horizontal) and feedback connections. We added a sentence in the Discussion:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness …”

      (3) When training on T1 and testing on T2, EEG data showed an early peak in local contrast classification at 75-95 ms over posterior electrodes. The authors stated that this modulation was only marginally affected by masking (and not at all by AB); however, the main effect of masking is significant. Why was this effect interpreted as nonrelevant? 

      Following this and Reviewer 1’s comment, we changed the wording from “marginal” to “weak but significant.” We considered this effect “weak” and of lesser relevance, because its Bayes factor indicated that the alternative hypothesis was only 1.31 times more likely than the null hypothesis of no effect, representing only “anecdotal” evidence, which is in sharp contrast to the robust effects of the consciousness manipulations on illusion decoding reported later. Furthermore, later ANOVAs comparing the effect of masking on contrast vs. illusion decoding revealed much stronger effects on illusion decoding than on contrast decoding (BFs > 3.59 × 10⁴).
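
      The verbal label “anecdotal” for a Bayes factor of 1.31 follows a widely used classification with thresholds at 3, 10, 30, and 100. The following minimal sketch assumes those conventional thresholds; the function name `evidence_label` is hypothetical, and the cut-offs are a convention for describing Bayes factors rather than part of the analysis itself.

```python
def evidence_label(bf10):
    """Verbal label for a Bayes factor BF10 (evidence for H1 relative to H0).

    Assumes the conventional thresholds 3, 10, 30 and 100. Values below 1
    are inverted and reported as evidence for H0.
    """
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return "no evidence either way"
    favoured = "H1" if bf10 > 1 else "H0"
    strength = bf10 if bf10 > 1 else 1.0 / bf10
    for upper, label in ((3, "anecdotal"), (10, "moderate"),
                         (30, "strong"), (100, "very strong")):
        if strength < upper:
            return f"{label} evidence for {favoured}"
    return f"extreme evidence for {favoured}"


print(evidence_label(1.31))    # anecdotal evidence for H1
print(evidence_label(3.59e4))  # extreme evidence for H1
```

      A Bayes factor reported in the other direction (BF01) can be passed as its reciprocal, in which case the label refers to evidence for the null hypothesis.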

      (4) The decoding analysis on the illusory percept yielded two separate peaks of decoding, one from 200 to 250 ms and another from 275 to 475 ms. The early component was localized occipitally and interpreted as local sensory processing, while the late peak was described as a marker for global recurrent processing. This latter peak was localized in the parietal cortex and associated with the P300. Can the authors show the topography of the P300 evoked response obtained from the current study as a comparison? Moreover, source reconstruction analysis would probably provide a better understanding of the cortical localization of the two peaks. 

      Figure S4 now shows the P300 from electrode Pz, demonstrating a stronger positivity between 375 and 475 ms when the illusory triangle was present than when it was absent. We did not run a source reconstruction analysis.  

      (5) The authors mention that the behavioural results closely resembled the pattern of the second decoding peak results. However, they did not show any evidence for this relationship. For instance, is there a correlation between the two measures across or within participants? Does this relationship differ between the illusion report and the confidence rating? 

      This relationship became evident from simply eyeballing the results figures: in both behavior and EEG decoding, performance dropped from the no-manipulations condition to the AB and masked conditions, while the latter two conditions did not differ significantly. Following a similar observation of a close similarity between behavior and the second/late illusion decoding peak in the study by Fahrenfort et al. (2017), we adopted their analysis approach and ran two additional ANOVAs, adding “measure” (behavior vs. EEG) as a factor. For this analysis, we dropped the both-manipulations condition due to scale restrictions (as noted in footnote 1: “We excluded the both-manipulations condition from this analysis due to scale restrictions: in this condition, EEG decoding at the second peak was at chance, while behavioral performance was above chance, leaving more room for behavior to drop from the masked and AB condition.”). The analysis revealed that there were no interactions with condition:

      “The pattern of behavioral results, both for perceptual performance and metacognitive sensitivity, closely resembled the second decoding peak: sensitivity in all three metrics dropped from the no-manipulations condition to the masked and AB conditions, while sensitivity did not differ significantly between these performance-matched conditions (Fig. 2C). Two additional rm ANOVAs with the factors measure (behavior, second EEG decoding peak) and condition (no-manipulations, masked, AB)¹ for perceptual performance and metacognitive sensitivity revealed no significant interaction (performance: F(2,58) = 0.27, P = 0.762, BF₀₁ = 8.47; metacognition: F(2,58) = 0.54, P = 0.586, BF₀₁ = 6.04). This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017) who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      (6) The marker for illusion-specific processing emerged later (200-250 ms), with the no-manipulation decoding performing better after training on the illusion than the non-illusory triangle. This difference emerged only in the AB condition, and it was fully abolished by masking. The authors confirmed that the illusion-specific processing was not affected by the AB manipulations by running a rm ANOVA which did not result in a significant interaction between condition and training set. However, unlike the other non-significant results, a Bayes Factor is missing here.

      We added Bayes factors to all (significant and non-significant) rm ANOVAs.

      (7) The same analysis yielded a second illusion decoding peak at 375-475 ms. This effect was impaired by both masking and AB, with no significant differences between the two conditions. The authors stated that this result was directly linked to behavioural performance. However, it is not clear to me what they mean (see point 5). 

      We added analyses comparing behavior and EEG decoding directly (see our response to point 5).

      (8) The introduction starts by stating that perceptual and attentional processes differently affect consciousness access. This differentiation has been studied thoroughly in the consciousness literature, with a focus on how attention differs from consciousness (e.g., Koch & Tsuchiya, TiCS, 2007; Pitts, Lutsyshyna & Hillyard, Phil. Trans. Roy. Soc. B Biol. Sci., 2018). The authors stated that "these findings confirm and enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness clearly distinguishing and specifying the neural profiles of each processing stage of the influential four-stage model of conscious experience". I found it surprising that this aspect was not discussed further. What was the state of the art before this study was conducted? What are the mentioned neural profiles? How did the current results enrich the literature on this topic? 

      We would like to point out that our study is not primarily concerned with the conceptual distinction between consciousness and attention, which has been the central focus of e.g., Koch and Tsuchiya (2007). While this literature was concerned with ways to dissociate consciousness and attention, we tacitly assumed that attention and consciousness are now generally considered as different constructs. Our study is thus not dealing with dissociations between attention and consciousness, nor with the distinction between phenomenal consciousness and conscious access, but is concerned with different ways of impairing conscious access (defined as the ability to report about a stimulus), either via perceptual or via attentional manipulations. For the state of the art before the study was conducted, we would like to refer to the motivation of our study in the Introduction, e.g., previous studies’ difficulties in unequivocally linking greater local recurrency during attentional than perceptual blindness to the consciousness manipulation, given performance confounds (we expanded this Introduction section). We also expanded a paragraph in the discussion to remind the reader of the neural profiles of the 4-stage model and to highlight the novelty of our findings related to the distinction between lateral and feedback processes:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness (Block, 2005; Dehaene et al., 2006; Hatamimajoumerd et al., 2022; Lamme, 2010; Pitts et al., 2018; Sergent & Dehaene, 2004), clearly distinguishing the neural profiles of each processing stage of the influential four-stage model of conscious experience (Fig. 1A). Along with the distinct temporal and spatial EEG decoding patterns associated with lateral and feedback processing, our findings suggest a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report.”

      (9) When stating that this is the first study in which behavioural measures of conscious perception were matched between the attentional blink and masking, it would be beneficial to highlight the main differences between the current study and the one from Fahrenfort et al., 2017, with which the current study shares many similarities in the experimental design (see point 1). 

      We would like to refer the reviewer to our response to point 1), where we detail how we expanded the discussion of similarities and differences between our present study and Fahrenfort et al. (2017).

      (10) The discussion emphasizes how the current study "suggests a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report". For transparency, it is nonetheless important to highlight that one limitation of the current study is that it does not provide direct evidence for the specified types of connections (see point 6).

      We added a qualification in the Discussion section:

      “Although the present EEG decoding measures cannot provide direct evidence for feedback vs. lateral processes, based on neurophysiological evidence, …”

      Furthermore, we added this qualification in the Discussion section:

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processing as well.”

      References

      Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J.-M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 22(19), 8633–8646.

      Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 23(20), 7690–7701.

      Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9(2), 46–52.

      Chen, M., Yan, Y., Gong, X., Gilbert, C. D., Liang, H., & Li, W. (2014). Incremental integration of global contours through interplay between visual cortical areas. Neuron, 82(3), 682–694.

      Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: a testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211.

      Hatamimajoumerd, E., Ratan Murty, N. A., Pitts, M., & Cohen, M. A. (2022). Decoding perceptual awareness across the brain with a no-report fMRI masking paradigm. Current Biology: CB. https://doi.org/10.1016/j.cub.2022.07.068

      JASP Team. (2024). JASP (Version 0.19.0) [Computer software]. https://jasp-stats.org/

      Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (2004). Spatial attention speeds discrimination without awareness in blindsight. Neuropsychologia, 42(6), 831–835.

      Kiefer, M., & Brendel, D. (2006). Attentional Modulation of Unconscious “Automatic” Processes: Evidence from Event-related Potentials in a Masked Priming Paradigm. Journal of Cognitive Neuroscience, 18(2), 184–198.

      Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: a critical review of visual masking. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 857–875.

      Lamme, V. A. F. (2010). How neuroscience will change our view on consciousness. Cognitive Neuroscience, 1(3), 204–220.

      Luck, S. J., & Hillyard, S. A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31(3), 291–308.

      Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13(5), 416–424.

      Pitts, M. A., Lutsyshyna, L. A., & Hillyard, S. A. (2018). The relationship between attention and consciousness: an expanded taxonomy and implications for ‘no-report’ paradigms. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 373(1755), 20170348.

      Sergent, C., & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science, 15(11), 720–728.

      Van den Bussche, E., Van den Noortgate, W., & Reynvoet, B. (2009). Mechanisms of masked priming: a meta-analysis. Psychological Bulletin, 135(3), 452–477.

      van Gaal, S., & Lamme, V. A. F. (2012). Unconscious high-level information processing: implication for neurobiological theories of consciousness. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry, 18(3), 287–301.

      Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology. Human Perception and Performance, 24(6), 1656–1674.

    1. eLife Assessment

      This important manuscript sets out to identify sleep/arousal phenotypes in larval zebrafish carrying mutations in Alzheimer's disease (AD)-associated genes. The authors provide detailed phenotypic data for F0 knockouts of each of 7 AD-associated genes and then compare the resulting behavioral fingerprints to those obtained from a large-scale chemical screen to generate new hypotheses about underlying molecular mechanisms. The data presented are solid, although extensive interpretation of pharmacological screen data does not necessarily reflect the limited mechanistic data. Nonetheless, the authors address most reviewer concerns in their revised version, providing invaluable new analyses. Phenotypic characterization presented is comprehensive, and the authors develop a well-designed behavioral analysis pipeline that will provide considerable value for zebrafish neuroscientists.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits between genetic mutants with those obtained from a panel of small molecules to promote identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in nighttime sleep loss, consistent with the well-supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen-1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of AB40 and AB42 in zebrafish, also shows a slight increase in activity at night and slight decreases in nighttime sleep. Conversely, psen-2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.

      This is a highly innovative and thorough study, yet a handful of key questions remain. First, are the nighttime sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified which have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes? Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes. Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule induced sleep phenotypes. The platform could also be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors. Moreover, the new analyses looking for pathways that might be co-regulated by AD risk genes and discussion of cholinergic signaling as a potentially meaningful target downstream of 5/7 knockouts are valuable.

      Strengths:

      - Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.

      - Presents convincing evidence that nighttime sleep is disrupted in mutants for multiple late-onset AD-related genes.

      - Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes.

      Weaknesses:

      - Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.

      - The pipeline developed is only used to examine sleep-related / spontaneous movement phenotypes. Stimulus-evoked behaviors are not examined.

    3. Reviewer #2 (Public review):

      Summary:

      This work delineates the larval zebrafish behavioral phenotypes caused by F0 knockout of several important genes that increase risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.

      Strengths:

      This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance. Most concerns are sufficiently addressed in the revised manuscript or response to reviewers.

      Weaknesses:

      - The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. While I agree with the authors' detailed response that rescuing most behavioral parameters is a good indication that the underlying mechanism is normalized, I disagree that high-throughput larval behavior kinematics is a sufficient representation of most behavioral parameters to be indicative of molecular mechanism normalization. There are many instances of mutants with completely normal kinetics at baseline, but a behavioral difference that emerges during stimulation or in a new paradigm such as hunting. Without testing far more behavioral paradigms than are possible in the multi-well plate format, as well as possibly multiple life stages, I remain unconvinced that this approach will yield valuable therapeutic insights. I do agree that it can yield insight for future investigation, such as in the case of cntnap2a/cntnap2b and GABA receptor agonists, but even in that instance it is not clear that such an agonist would rescue abnormalities in a meaningful way. In the case of a disorder such as autism, the early locomotor phenotypes may be disconnected from the molecular mechanisms underlying later social deficits, and it is far more challenging to screen on juvenile behaviors that would be a more appropriate target for a behavior-first approach. The added experiment of testing fluvoxamine, a second SSRI, yielded very different behavioral responses to the SSRI citalopram, supporting my assertion that this approach and the disrupted underlying mechanisms are more complicated than suggested by the authors.
      I disagree that the connection between sorl1 and serotonin is strengthened by this experiment. The authors suggest that since the knockout larvae react differently than control siblings to both SSRIs, it indicates that serotonin is disrupted. There is no negative control included, where a pathway that is clearly not indicated to be important is pharmacologically manipulated. It is possible that the mutants would also behave differently compared to siblings when other pathways are perturbed. The authors acknowledge in the response to reviewers that they may not have identified the underlying molecular disruption in this mutant, but they did not substantially alter the Discussion section on this point. I agree with the authors that using a different wild-type strain in a different lab could lead to discrepancies, but these issues could have been experimentally mitigated or more clearly highlighted in the manuscript itself.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Kroll et al. conduct an in-depth behavioral analysis of F0 knockouts of 4 genes associated with late-onset Alzheimer's Disease (AD), together with 3 genes associated with early-onset AD. Kroll and colleagues developed a web application (ZOLTAR) to compare sleep-associated traits between genetic mutants with those obtained from a panel of small molecules to promote the identification of affected pathways and potential therapeutic interventions. The authors make a set of potentially important findings vis-à-vis the relationship between AD-associated genes and sleep. First, they find that loss-of-function in late-onset AD genes universally results in night-time sleep loss, consistent with the well supported hypothesis that sleep disruption contributes to Alzheimer's-related pathologies. psen-1, an early-onset associated AD gene, which the authors find is principally responsible for the generation of AB40 and AB42 in zebrafish, also shows a slight increase in activity at night and slight decreases in night-time sleep. Conversely, psen-2 mutations increase daytime sleep, while appa/appb mutations have no impact on sleep. Finally, using ZOLTAR, the authors identify serotonin receptor activity as potentially disrupted in sorl1 mutants, while betamethasone is identified as a potential therapeutic to promote reversal of psen2 knockout-associated phenotypes.

      This is a highly innovative and thorough study, yet a handful of key questions remain. First, are night-time sleep loss phenotypes observed in all knockouts for late-onset AD genes in the larval zebrafish a valid proxy for AD risk?

      We cannot say, but it is an interesting question. We selected the four late-onset Alzheimer’s risk genes (APOE, CD2AP, CLU, SORL1) based on human genetics data and brain expression in zebrafish larvae, not based on their likelihood to modify sleep behaviour, which we could have tried by searching for overlaps with GWAS of sleep phenotypes, for example. Consequently, we find it remarkable that all four of these genes caused a night-time sleep phenotype when mutated. We also find it reassuring that knockout of appa/appb and psen2 did not cause a night-time sleep phenotype, which largely excludes the possibility that the phenotype is a technical artefact (e.g. caused by the F0 knockout method) or a property of every gene expressed in the larval brain.

      Having said that, it could still be a coincidence, rather than a special property of genes associated with late-onset AD. In addition to testing additional late-onset Alzheimer’s risk genes, the ideal way to answer this question would be to test in parallel a random set of genes expressed in the brain at this stage of development. From this random set, one could estimate the proportion of genes that cause a night-time sleep phenotype when mutated. One could then use that information to test whether late-onset Alzheimer’s risk genes are indeed enriched for genes that cause a night-time sleep phenotype when mutated.
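
The enrichment test proposed above could be sketched as follows. This is only an illustration, not an analysis from the study: the background rate (10 of 100 random brain-expressed genes causing a night-time sleep phenotype) is a made-up placeholder, and `binomial_upper_tail` is a hypothetical helper name. The only number taken from the text is that all four late-onset risk genes tested caused the phenotype.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# HYPOTHETICAL background rate: assume 10 of 100 randomly chosen
# brain-expressed genes cause a night-time sleep phenotype when mutated.
background_rate = 10 / 100

# From the text: 4 of the 4 late-onset risk genes tested caused the phenotype.
risk_genes_tested = 4
risk_genes_with_phenotype = 4

# Probability of seeing at least this many hits if risk genes behaved
# like random brain-expressed genes.
p_value = binomial_upper_tail(
    risk_genes_with_phenotype, risk_genes_tested, background_rate
)
print(f"one-sided binomial p = {p_value:.2g}")
```

This sketch treats the background rate as known exactly. With a small random set, a test that also accounts for uncertainty in that rate (for example Fisher's exact test on the two gene sets) would probably be more appropriate.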

      For those mutants that cause night-time sleep disturbances, do these phenotypes share a common underlying pathway? e.g. Do 5-HT reuptake inhibitors promote sleep across all 4 late-onset genes in addition to psen1? Can 5-HT reuptake inhibitors reverse other AD-related pathologies in zebrafish? Can compounds be identified that have a common behavioral fingerprint across all or multiple AD risk genes? Do these modify sleep phenotypes?

      To attempt to answer these questions, we used ZOLTAR to generate predictions for all the knockout behavioural fingerprints presented in the study, in the same way as for sorl1 in Fig. 5 and Fig. 5–supplement 1. Here are the indications, targets, and KEGG pathways which are shared by the largest number of knockouts (Author response image 1):

      – One indication is shared by 4/7 knockouts: “opioid dependence” (significant for appa/appb, psen1, apoea/apoeb, cd2ap).

      – Four targets are shared by 4/7 knockouts: “strychnine-binding glycine receptor” (psen1, apoea/apoeb, clu, sorl1); “neuronal acetylcholine receptor beta-2” (psen1, apoea/apoeb, cd2ap, clu); “thyroid peroxidase” (psen1, apoea/apoeb, cd2ap, clu); and “carbonic anhydrase IV” (appa/appb, psen1, psen2, cd2ap).

      – Three KEGG pathways are shared by 5/7 knockouts: “cholinergic synapse” (psen1, apoea/apoeb, cd2ap, clu, sorl1); “tyrosine metabolism” (psen2, apoea/apoeb, cd2ap, clu, sorl1); and “nitrogen metabolism” (appa/appb, psen1, psen2, apoea/apoeb, cd2ap).

      As a reminder, we hypothesised that loss of Sorl1 affected serotonin signalling based on the following annotations being significant: indication “depression”, target “serotonin transporter”, and KEGG pathway “serotonergic synapse”. Indication “depression” is only significant for sorl1 knockouts; target “serotonin transporter” is also significant for appa/appb and psen2 knockouts; and KEGG pathway “serotonergic synapse” is also significant for psen2 knockouts. ZOLTAR therefore does not predict serotonin signalling to be a major theme common to all mutants with a night-time sleep loss phenotype.

      Particularly interesting is cholinergic signalling appearing in the most common targets and KEGG pathways. Acetylcholine signalling is a major theme in research on AD. For example, the first four drugs ever approved by the FDA to treat AD were acetylcholinesterase inhibitors, which increase acetylcholine signalling by preventing its breakdown by acetylcholinesterase. These drugs are generally considered only to treat symptoms and not modify disease course, but this view has been called into question (Munoz-Torrero, 2008; Relkin, 2007). If, as ZOLTAR suggests, mutations in several Alzheimer’s risk genes affect cholinergic signalling early in development, this would point to a potential causal role of cholinergic disruption in AD.

      Author response image 1.

      Common predictions from ZOLTAR for the seven Alzheimer’s risk genes tested. Predictions from ZOLTAR which are shared by multiple knockout behavioural fingerprints presented in the study. Only indications, targets, and KEGG pathways which are significant for at least three of the seven knockouts tested are shown, ranked from the annotations which are significant for the largest number of knockouts.
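
The filtering and ranking described in this legend can be illustrated with a short sketch. It is not ZOLTAR code (ZOLTAR's own implementation is not shown here), and `shared_annotations` is a hypothetical helper. The input lists only the KEGG pathways named in this response, so each knockout's set is deliberately incomplete.

```python
from collections import Counter

# KEGG pathways reported as significant per knockout, restricted to the
# four pathways named in this response (the real lists are longer).
significant = {
    "appa/appb": {"nitrogen metabolism"},
    "psen1": {"cholinergic synapse", "nitrogen metabolism"},
    "psen2": {"tyrosine metabolism", "nitrogen metabolism", "serotonergic synapse"},
    "apoea/apoeb": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "cd2ap": {"cholinergic synapse", "tyrosine metabolism", "nitrogen metabolism"},
    "clu": {"cholinergic synapse", "tyrosine metabolism"},
    "sorl1": {"cholinergic synapse", "tyrosine metabolism", "serotonergic synapse"},
}


def shared_annotations(per_knockout, min_knockouts=3):
    """Count knockouts per annotation, keep those at or above the threshold,
    and rank by decreasing count (ties broken alphabetically)."""
    counts = Counter(a for annots in per_knockout.values() for a in annots)
    kept = [(a, n) for a, n in counts.items() if n >= min_knockouts]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


ranked = shared_annotations(significant)
for annotation, n in ranked:
    print(f"{annotation}: {n}/{len(significant)} knockouts")
```

With this input, “serotonergic synapse” is dropped because it is significant for only two knockouts (sorl1 and psen2), which matches the point made above that serotonin signalling is not a theme common to all mutants.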

      Finally, the web-based platform presented could be expanded to facilitate comparison of other behavioral phenotypes, including stimulus-evoked behaviors.

      Yes, absolutely. The behavioural dataset we used (Rihel et al., 2010) did not measure other stimuli than day/night light transitions, but the “SauronX” platform and dataset (Myers-Turnbull et al., 2022) seems particularly well suited for this. To provide some context, we and collaborators have occasionally used the dataset by Rihel et al. (2010) to generate hypotheses or find candidate drugs that reverse a behavioural phenotype measured in the sleep/wake assay (Ashlin et al., 2018; Hoffman et al., 2016). The present work was the occasion to enable a wider and more intuitive use of this dataset through the ZOLTAR app, which has already proven successful. Future versions of ZOLTAR may seek to incorporate larger drug datasets using more types of measurements.

      Finally, the authors propose but do not test the hypothesis that sorl1 might regulate localization/surface expression of 5-HT2 receptors. This could provide exciting / more convincing mechanistic support for the assertion that serotonin signaling is disrupted upon loss of AD-associated genes.

      While working on the Author Response, we made some changes to the analysis run by ZOLTAR to calculate enrichments (see Methods and github.com/francoiskroll/ZOLTAR, notes on v2). With the new version, 5-HT receptor type 2 is not a significantly enriched target for the sorl1 knockout fingerprint but type 4 is. 5-HT receptor type 4 was also shown to interact with sorting nexin 27, a subunit of retromer, so is a promising candidate (Joubert et al., 2004). Antibodies against human 5-HT receptor type 2 and 4a exist; whether they would work in zebrafish remains to be tested. In our experience, the availability of antibodies suitable for immunohistochemistry in the zebrafish is a serious experimental roadblock.

      Note, all the results presented in the “Version of Record” are from ZOLTAR v2.

      Despite these important considerations, this study provides a valuable platform for high-throughput analysis of sleep phenotypes and correlation with small-molecule-induced sleep phenotypes.

      Strengths:

      - Provides a useful platform for comparison of sleep phenotypes across genotypes/drug manipulations.

      - Presents convincing evidence that night-time sleep is disrupted in mutants for multiple late-onset AD-related genes.

      - Provides potential mechanistic insights for how AD-related genes might impact sleep and identifies a few drugs that modify their identified phenotypes

      Weaknesses:

      - Exploration of potential mechanisms for serotonin disruption in sorl1 mutants is limited.

      - The pipeline developed can only be used to examine sleep-related / spontaneous movement phenotypes and stimulus-evoked behaviors are not examined.

      - Comparisons between mutants/exploration of commonly affected pathways are limited.

      Thank you for these excellent suggestions, please see our answers above.

      Reviewer #2 (Public Review):

      Summary:

      This work delineates the larval zebrafish behavioral phenotypes caused by the F0 knockout of several important genes that increase the risk for Alzheimer's disease. Using behavioral pharmacology, comparing the behavioral fingerprint of previously assayed molecules to the newly generated knockout data, compounds were discovered that impacted larval movement in ways that suggest interaction with or recovery of disrupted mechanisms.

      Strengths:

      This is a well-written manuscript that uses newly developed analysis methods to present the findings in a clear, high-quality way. The addition of an extensive behavioral analysis pipeline is of value to the field of zebrafish neuroscience and will be particularly helpful for researchers who prefer the R programming language. Even the behavioral profiling of these AD risk genes, regardless of the pharmacology aspect, is an important contribution. The recovery of most behavioral parameters in the psen2 knockout with betamethasone, predicted by comparing fingerprints, is an exciting demonstration of the approach. The hypotheses generated by this work are important stepping stones to future studies uncovering the molecular basis of the proposed gene-drug interactions and discovering novel therapeutics to treat AD or co-occurring conditions such as sleep disturbance.

      Weaknesses:

      - The overarching concept of the work is that comparing behavioral fingerprints can align genes and molecules with similarly disrupted molecular pathways. While the recovery of the psen2 phenotypes by one molecule with the opposite phenotype is interesting, as are previous studies that show similar behaviorally-based recoveries, the underlying assumption that normalizing the larval movement normalizes the mechanism still lacks substantial support. There are many ways that a reduction in movement bouts could be returned to baseline that are unrelated to the root cause of the genetically driven phenotype. An ideal experiment would be to thoroughly characterize a mutant, such as by identifying a missing population of neurons, and use this approach to find a small molecule that rescues both behavior and the cellular phenotype. If the connection to serotonin in the sorl1 was more complete, for example, the overarching idea would be more compelling.

      Thank you for this cogent criticism.

      On the first point, we were careful not to claim that betamethasone normalises the molecular/cellular mechanism that causes the psen2 behavioural phenotype. Having said that, yes, to a certain extent that would be the hope of the approach. As you say, every compound which normalises the behavioural fingerprint will not normalise the underlying mechanism, but the opposite seems true: every compound that normalises the underlying mechanism should also normalise the behavioural fingerprint. We think this logic makes the “behaviour-first” approach innovative and interesting. The logic is to discover compounds that normalise the behavioural phenotype first, only subsequently test whether they also normalise the molecular mechanism, akin to testing first whether a drug resolves the symptoms before testing whether it actually modifies disease course. While in practice testing thousands of drugs in sufficient sample sizes and replicates on a mutant line is challenging, the dataset queried through ZOLTAR provides a potential shortcut by shortlisting in silico compounds that have the opposite effect on behaviour.

      You mention a “reduction in movement bouts” but note here that the number of behavioural parameters tested is key to our argument. To take the two extremes, say the only behavioural parameter we measured in psen2 knockout larvae was time active during the day, then, yes, any stimulant used at the right concentration could probably normalise the phenotype. In this situation, claiming that the stimulant is likely to also normalise the underlying mechanism, or even that it is a genuine “phenotypic rescue”, would not be convincing. Conversely, say we were measuring thousands of behavioural parameters under various stimuli, such as swimming speed, position in the well, bout usage, tail movements, and eye angles, it seems almost impossible for a compound to rescue most parameters without also normalising the underlying mechanism. The present approach is somewhere in between: ZOLTAR uses six behavioural parameters for prediction (e.g. Fig. 6a), but all 17 parameters calculated by FramebyFrame can be used to assess rescue during a subsequent experiment (Fig. 6c). For both, splitting each parameter into day and night increases the resolution of the approach, which partly answers your criticism. For example, betamethasone rescued the day-time hypoactivity without causing night-time hyperactivity, so we are not making the “straw man argument” explained above of using any broad stimulant to rescue the hypoactivity phenotype.

      Furthermore, for diseases where the behavioural defect is the primary concern, such as autism or bipolar disorder, perhaps this behaviour-first approach is all that is needed, and whether or not the compound precisely rescues the underlying mechanism is somewhat secondary. The use of lithium to prevent manic episodes in bipolar disorder is a good example. It was initially tested because mania was thought to be caused by excess uric acid and lithium can dissolve uric acid (Mitchell and Hadzi-Pavlovic, 2000). The theory is now discredited, but lithium continues to be used without a precise understanding of its mode of action. In this example, behavioural rescue alone, assuming the secondary effects are tolerable, is sufficient to be beneficial to patients, and whether it modulates the correct causal pathway is secondary.

      On the second point, we agree that first testing ZOLTAR on a mutant for which we have a fairly good understanding of the mechanism causing the behavioural phenotype could have been a productive approach. Note, however, that examples already exist in the literature (Ashlin et al., 2018; Hoffman et al., 2016). The example from Hoffman et al. (2016) is especially convincing. Drugs generating behavioural fingerprints that positively correlate with the cntnap2a/cntnap2b double knockout fingerprint were enriched with NMDA and GABA receptor antagonists. In experiments analogous to our citalopram and fluvoxamine treatments (Fig. 5c,d and Fig. 5–supplement 1c,d), cntnap2a/cntnap2b knockout larvae were overly sensitive to the NMDA receptor antagonist MK-801 and the GABAA receptor antagonist pentylenetetrazol (PTZ). Among other drugs tested, zolpidem, a GABAA receptor agonist, caused opposite effects on wild-type and cntnap2a/cntnap2b knockout larvae. Knockout larvae were found to have fewer GABAergic neurons in the forebrain. While these studies did not use precisely the same analysis that ZOLTAR runs, they used the same rationale and behavioural dataset to make these predictions (Rihel et al., 2010), which shows that approaches like ZOLTAR can point to causal processes.

      On your last point, we hope our experiment testing fluvoxamine, another selective serotonin reuptake inhibitor (SSRI), makes the connection between Sorl1 and serotonin signalling more convincing.

      - The behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram is based on a small number of animals. The KO Euclidean distance measure is also more spread out than for the other datasets, and it looks like only five or so fish are driving the group difference. It also appears as though the numbers were also from two injection series. While there is nothing obviously wrong with the data, I would feel more comfortable if such a strong statement of a result from a relatively subtle phenotype were backed up by a higher N or a stable line. It is not impossible that the observed difference is an experimental fluke. If something obvious had emerged through the HCR, that would have also supported the conclusions. As it stands, if no more experiments are done to bolster the claim, the confidence in the strength of the link to serotonin should be reduced (possibly putting the entire section in the supplement and modifying the discussion). The discussion section about serotonin and AD is interesting, but I think that it is excessive without additional evidence.

      We mostly agree with this criticism. One could interpret the larger spread of the data for sorl1 KO larvae treated with 10 µM citalopram as evidence that the knockout larvae do indeed react differently to the drug at this dose, regardless of being driven by a subset of the animals. The result indeed does not survive removing the top 3 (p = 0.18) or top 5 (p = 0.87) sorl1 KO + 10 µM larvae, but this amounts to excluding roughly 20% (3/14) or 35% (5/14) of the datapoints as potential outliers, which is unreasonable. In fact, excluding the top 5 sorl1 KO + 10 µM larvae is equivalent to calling any datapoint with z-score > 0.2 an outlier (z-scores of the top 5 datapoints are 0.2–1.8). Applying the same criterion consistently to the scrambled + 10 µM group would remove the top 6 datapoints (z-scores = 0.5–3.9). Comparing the resulting two distributions again gives the sorl1 KO + 10 µM distribution as significantly higher (p = 0.0015). We would also mention that Euclidean distance, as a summary metric for distance between behavioural fingerprints, has limitations. For example, the measure will be more sensitive to changes in some parameters than in others, depending on how much room there is for a given parameter to change. We included this metric to lend support to the observation one can draw from the fingerprint plot (Fig. 5c) that sorl1 mutants respond in an exaggerated way to citalopram across many parameters, while being agnostic to which parameter might matter most.

      Given that the HCR did not reveal anything striking, we agree with you that too much of our argument relied on this result being robust. As you and Reviewer #3 suggested, we repeated this experiment with a different SSRI, fluvoxamine (Fig. 5–supplement 1). We cannot readily explain why the result was opposite to what we found with citalopram, but in both cases sorl1 knockout larvae reacted differently than their control siblings, which adds an argument to our claim that ZOLTAR correctly predicted serotonin signalling as a disrupted pathway from the behavioural fingerprint. Accordingly, we mostly kept the Discussion on Sorl1 the same, although we concede that we may not have identified the molecular mechanism.

      - The authors suggest two hypotheses for the behavioral difference between the sorl1 KO and scrambled at the higher dose of the citalopram. While the first is tested, and found to not be supported, the second is not tested at all ("Ruling out the first hypothesis, sorl1 knockouts may react excessively to a given spike in serotonin." and "Second, sorl1 knockouts may be overly sensitive to serotonin itself because post-synaptic neurons have higher levels of serotonin receptors."). Assuming that the finding is robust, there are probably other reasons why the mutants could have a different sensitivity to this molecule. However, if this particular one is going to be mentioned, it is surprising that it was not tested alongside the first hypothesis. This work could proceed without a complete explanation, but additional discussion of the possibilities would be helpful or why the second hypothesis was not tested.

      There are no strong scientific reasons why this hypothesis was not tested. The lead author (F Kroll) moved to a different lab and country so the project was finalised at that time. We do not plan on testing this hypothesis at this stage. However, we adapted the wording to make it clear this is one possible alternative hypothesis which could be tested in the future. The small differences found by HCR are actually more in line with the new results from the fluvoxamine experiment, so it may also be that both hypotheses (pre-synaptic neurons releasing less serotonin when reuptake is blocked; or post-synaptic neurons being less sensitive) contribute. The fluvoxamine experiment was performed in a different lab (ICM, Paris; all other experiments were done in UCL, London) in a different wild-type strain (TL in ICM, AB x Tup LF in UCL), which complicates how one interprets this discrepancy.

      - The authors claim that "all four genes produced a fairly consistent phenotype at night". While it is interesting that this result arose in the different lines, the second clutch for some genes did not replicate as well as others. I think the findings are compelling, regardless, but the sometimes missing replicability should be discussed. I wonder if the F0 strategy adds noise to the results and if clean null lines would yield stronger phenotypes. Please discuss this possibility, or others, in regard to the variability in some phenotypes.

      For the first part of this point, please see below our answer to Reviewer #3, point (2) c.

      Regarding the F0 strategy potentially adding variability, it is an interesting question which we tested in a larger dataset of behavioural recordings from F0 and stable knockouts for the same genes (unpublished). In summary, the F0 knockout method does not increase clutch-to-clutch or larva-to-larva variability in the assay. F0 knockout experiments found many more significant parameters and larger effect sizes than stable knockout experiments, but this difference could largely be explained by the larger sample sizes of F0 knockout experiments. In fact, larger sample sizes within individual clutches appear to be a major advantage of the F0 knockout approach over in-cross of heterozygous knockout animals, as they increase the sensitivity of the assay without causing substantial variability. We plan to report in more detail on this analysis in a separate paper as we think it would dilute the focus of the present work.

      - In this work, the knockout of appa/appb is included. While APP is a well-known risk gene, there is no clear justification for making a knockout model. It is well known that the upregulation of app is the driver of Alzheimer's, not downregulation. The authors even indicate an expectation that it could be similar to the other knockouts ("Moreover, the behavioural phenotypes of appa/appb and psen1 knockout larvae had little overlap while they presumably both resulted in the loss of Aβ." and "Comparing with early-onset genes, psen1 knockouts had similar night-time phenotypes, but loss of psen2 or appa/appb had no effect on night-time sleep."). There is no reason to expect similarity between appa/appb and psen1/2. I understand that the app knockouts could unveil interesting early neurodevelopmental roles, but the manuscript needs to be clarified that any findings could be the opposite of expectation in AD.

      On “there is no reason to expect similarity […]”, we disagree. Knockout of appa/appb and knockout of psen1 will both result in loss of Aβ (appa/appb encode Aβ and psen1 cleaves Appa/Appb to release Aβ, cf. Fig. 3e). Consequently, a phenotype caused by the loss of Aβ, or possibly other Appa/Appb cleavage products, should logically be found in both appa/appb and psen1 knockouts.

      On “it is well known that the upregulation of APP is the driver of Alzheimer’s, not downregulation”, we of course agree. Among others, the examples of Down syndrome, APP duplication (Sleegers et al., 2006), or mouse models overexpressing human APP show definitively that overexpression of APP is sufficient to cause AD. Having said that, we would not be so quick to dismiss APP knockout as potentially relevant to the understanding of AD.

      Loss of soluble Aβ due to aggregation could contribute to pathology (Espay et al., 2023). Without getting too much into this intricate debate, links between levels of Aβ and risk of disease are often counter-intuitive too. For example, out of 138 PSEN1 mutations screened in vitro, 104 reduced total Aβ production and 11 even seemingly abolished the production of both Aβ40 and Aβ42 (Sun et al., 2017). In short, loss of soluble Aβ occurs in both AD and in our appa/appb knockout larvae.

      We added a sentence in Results (section psen2 knockouts […]) to briefly justify our appa/appb knockout approach. To be clear, we do not want to imply, for example, that the absence of a night-time sleep phenotype for appa/appb is contradictory to the body of literature showing links between Aβ and sleep, including in zebrafish (Özcan et al., 2020). As you say, our experiment tested loss of App, including Aβ, while the literature typically reports on overexpression of APP, as in APP/PSEN1-overexpressing mice (Jagirdar et al., 2021).

      Reviewer #3 (Public Review):

      In this manuscript by Kroll and colleagues, the authors describe combining behavioral pharmacology with sleep profiling to predict disease and potential treatment pathways at play in AD. AD is used here as a case study, but the approaches detailed can be used for other genetic screens related to normal or pathological states for which sleep/arousal is relevant. The data are for the most part convincing, although generally the phenotypes are relatively small and there are no major new mechanistic insights. Nonetheless, the approaches are certainly of broad interest and the data are comprehensive and detailed. A notable weakness is the introduction, which overly generalizes numerous concepts and fails to provide the necessary background to set the stage for the data.

      Major points

      (1) The authors should spend more time explaining what they see as the meaning of the large number of behavioral parameters assayed and specifically what they tell readers about the biology of the animal. Many are hard to understand--e.g. a "slope" parameter.

      We agree that some parameters do not tell something intuitive about the biology of the animal. It would be easy to speculate. For example, the “activity slope” parameter may indicate how quickly the animal becomes tired over the course of the day. On the other hand, fractal dimension describes the “roughness/smoothness” of the larva’s activity trace (Fig. 2–supplement 1a); but it is not obvious how to translate this into information about the physiology of the animal. We do not see this as an issue though. While some parameters do provide intuitive information about the animal’s behaviour (e.g. sleep duration or sunset startle as a measure of startle response), the benefit of having a large number of behavioural parameters is to compare behavioural fingerprints and assess rescue of the behavioural phenotype by small molecules (Fig. 6c). For this purpose, the more parameters the better. The “MoSeq” approach from Wiltschko et al., 2020 is a good example from literature that inspired our own Fig. 6c. While some of the “behavioural syllables” may be intuitive (e.g. running or grooming), it is probably pointless to try to explain the ‘meaning’ of the “small left turn in place with head motion” syllable (Wiltschko et al., 2020). Nonetheless, this syllable was useful to assess whether a drug specifically treats the behavioural phenotype under study without causing too many side effects. Unfortunately, ZOLTAR has to reduce the FramebyFrame fingerprint (17 parameters) to just six parameters to compare it to the behavioural dataset from Rihel et al., 2010, but here, more parameters would almost certainly translate into better predictions too, regardless of their intuitiveness.

      It is true however that we did not give much information on how some of the less intuitive parameters, such as activity slope or fractal dimension, are calculated or what they describe about the dataset (e.g. roughness/smoothness for fractal dimension). We added a few sentences in the legend of Fig. 2–supplement 1.

      (2) Because in the end the authors did not screen that many lines, it would increase confidence in the phenotypes to provide more validation of KO specificity. Some suggestions include:

      a. The authors cite a psen1 and psen2 germline mutant lines. Can these be tested in the FramebyFrame R analysis? Do they phenocopy F0 KO larvae?

      We unfortunately do not have those lines. We looked into importing a psen2 knockout line from abroad, but shipping live animals is becoming increasingly cost- and time-prohibitive. However, we observed the same pigmentation phenotype for psen2 knockouts as reported by Jiang et al., 2018, which is at least a partial confirmation of phenocopying a loss-of-function stable mutant.

      b. psen2_KO is one of the larger centerpieces of the paper. The authors should present more compelling evidence that animals are truly functionally null. Without this, how do we interpret their phenotypes?

      We disagree that there should be significant doubt about these mutants being truly functionally null, given the high mutation rate and presence of the expected pigmentation phenotype (Jiang et al., 2018, Fig. 3f and Fig. 3–supplement 3a). The psen2 F0 knockouts were virtually 100% mutated at three exons across the gene (mutation rates were locus 1: 100 ± 0%; locus 2: 99.99 ± 0.06%; locus 3: 99.85 ± 0.24%). Additionally, two of the three mutated exons had particularly high rates of frameshift mutations (locus 1: 97 ± 5%; locus 2: 88 ± 17% frameshift mutation rate). It is virtually impossible that a functional protein is translated given this burden of frameshift mutations. Phenotypically, in addition to the pigmentation defect, double psen1/psen2 F0 knockout larvae had curved tails, the same phenotype as caused by a high dose of the γ-secretase inhibitor DAPT (Yang et al., 2008). These double F0 knockouts were lethal, while knockout of psen1 or psen2 alone did not cause obvious morphological defects. Evidently, most larvae must have been psen2 null mutants in this experiment, otherwise functional Psen2 would have prevented early lethality.

      Translation of zebrafish psen2 can start at downstream start codons if the first exon has a frameshift mutation, generating a seemingly functional Psen2 missing the N-terminus (Jiang et al., 2020). Zebrafish homozygous for this early frameshift mutation had normal pigmentation, showing it is a reliable marker of Psen2 function even when it is mutated. This mechanism is not a concern here as the alternative start codons are still upstream of two of the three mutated exons (the alternative start codons discovered by Jiang et al., 2020 are in exon 2 and 3, but we targeted exon 3, exon 4, and exon 6).

      We understand that the zebrafish community may be cautious about F0 phenotyping compared to stably generated mutants. As mentioned to Reviewer #2, we are planning to assemble a paper that expressly compares behavioural phenotypes measured in F0 vs. stable mutants to allay some of these concerns. Our current manuscript, which combines CRISPR-Cas9 rapid F0 screening with in silico pharmacological predictions, inevitably represents a first step in characterizing the functions of these genes.

      c. Related to the above, for cd2AP and sorl1 KO, some of the effect sizes seem to be driven by one clutch and not the other. In other words, great clutch-to-clutch variability. Should the authors increase the number of clutches assayed?

      Correct, there is substantial clutch-to-clutch variability in this behavioural assay. This is not specific to our experiments. Even within the same strain, wild-type larvae from different clutches (i.e. non-siblings) behave differently (Joo et al., 2021). This is why it is essential to compare behavioural phenotypes within individual clutches (i.e. from a single pair of parents, one male and one female), as we explain in Methods (section Behavioural video-tracking) and in the documentation of the FramebyFrame package. We often see two different experimental designs in literature: comparing non-sibling wild-type and mutant larvae, or pooling different clutches which include all genotypes (e.g. pooling multiple clutches from heterozygous in-crosses or pooling wild-type clutches before injecting them). The first experimental design causes false positive findings (Joo et al., 2021), as the clutch-to-clutch variability we and others observe gets interpreted as a behavioural phenotype. The second experimental design should not cause false positives but likely decreases the sensitivity of the assay by increasing the spread within genotypes. In both cases, the clutch-to-clutch variability is hidden, either by interpreting it as a phenotype (first case) or by adding it to animal-to-animal variability (second case). Our experimental design is technically more challenging as it requires obtaining large clutches from unique pairs of parents. However, this approach is better as it clearly separates the different sources of variability (clutch-to-clutch or animal-to-animal). As for every experiment, yes, a larger number of replicates would be better, but we do not plan to assay additional clutches at this time. Our work heavily focuses on the sorl1 and psen2 knockout behavioural phenotypes. The key aspects of these phenotypes were effectively tested in four experiments (five to six clutches) as sorl1 knockout larvae were also tracked in the citalopram and fluvoxamine experiments (Fig. 5 and Fig. 5–supplement 1), and psen2 knockout larvae were also tracked in the small molecule rescue experiment (Fig. 6 and Fig. 6–supplement 1).

      The psen2 behavioural phenotype replicated well across the six clutches tested (pairwise cosine similarities: 0.62 ± 0.15; Author response image 2a). 5/6 clutches were less active and initiating more sleep bouts during the day, as we claimed in Fig. 3.
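      For readers unfamiliar with the metric, the pairwise cosine similarities quoted here can be illustrated with a minimal Python sketch. It assumes each fingerprint is simply a list of per-parameter mean z-scores in a fixed order; the function names and dictionary layout are hypothetical and are not taken from the FramebyFrame package.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprints (ranges from -1 to 1)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of parameters")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero fingerprint")
    return dot / (norm_a * norm_b)


def pairwise_cosine(fingerprints):
    """fingerprints: dict mapping a clutch label to its list of mean z-scores.

    Returns a dict mapping each unordered pair of labels to its cosine similarity.
    """
    labels = sorted(fingerprints)
    return {
        (i, j): cosine_similarity(fingerprints[i], fingerprints[j])
        for i, j in combinations(labels, 2)
    }
```

      Because cosine similarity depends only on the direction of the fingerprint vector, two clutches whose parameters shift in the same directions score highly even if the magnitude of the shifts differs.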

      In the citalopram experiment, the H<sub>2</sub>O-treated sorl1 knockout fingerprint replicated fairly well the baseline recordings in Fig. 4, despite the smaller sample size (cos = 0.30 and 0.78; Author response image 2b, see “KO Fig. 5”). 5/6 of the significant parameters presented in Fig. 4–supplement 4 moved in the same direction, and knockout larvae were also hypoactive during the day but hyperactive at night. Note that two clutches were tracked on the same 96-well plate in this experiment. We calculated each larva’s z-score using the average of its control siblings, then we averaged all the z-scores to generate the fingerprint. The H<sub>2</sub>O treated sorl1 knockout clutch from the fluvoxamine experiment did not replicate well the baseline recordings (cos = 0.08 and 0.11; Author response image 2b, see “KO Fig. 5–suppl. 1”). Knockout larvae were hypoactive during the day as expected, but behaviour at night was not as robustly affected. As mentioned above, knockouts were made in a different genetic background (TL, instead of AB x Tup LF used for all other experiments), which could explain the discrepancy.
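      The within-clutch normalisation described above (z-scoring each larva against its own control siblings, then averaging all z-scores) can be sketched as follows. This is an illustration under stated assumptions, not the FramebyFrame code: the data layout and function names are hypothetical, and scaling by the sample standard deviation of the control siblings is our assumption for the sketch.

```python
from statistics import mean, stdev


def zscores_vs_siblings(knockout_values, control_values):
    """z-score each knockout larva against its same-clutch control siblings.

    Assumption: deviations are scaled by the sample standard deviation of the
    control siblings.
    """
    mu = mean(control_values)
    sd = stdev(control_values)
    return [(v - mu) / sd for v in knockout_values]


def fingerprint(clutches):
    """Build one fingerprint from several clutches tracked together.

    clutches: list of dicts, one per clutch, each mapping a parameter name to a
    (knockout_values, control_values) tuple. Larvae are z-scored within their
    own clutch first; all z-scores are then pooled and averaged per parameter.
    """
    pooled = {}
    for clutch in clutches:
        for param, (ko, ctrl) in clutch.items():
            pooled.setdefault(param, []).extend(zscores_vs_siblings(ko, ctrl))
    return {param: mean(z) for param, z in pooled.items()}
```

      Normalising before pooling keeps clutch-to-clutch differences in baseline behaviour out of the fingerprint, which is the point of comparing siblings only.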

      We also took the opportunity to check whether our SSRI treatments replicated well the data from Rihel et al., 2010. For both citalopram (n = 3 fingerprints in the database) and fluvoxamine (n = 4 fingerprints in the database), replication was excellent (cos ≥ 0.67 for all comparisons of a fingerprint from this study vs. a fingerprint from Rihel et al. 2010; Author response image 2c,d). Note that the scrambled + 10 µM citalopram and + 10 µM fluvoxamine fingerprints correlate extremely well (cos = 0.92; can be seen in Author response image 2c,d), which was predicted by the small molecule screen dataset.

      Author response image 2.

      Replication of psen2 and sorl1 F0 knockout fingerprints and SSRI treatments from Rihel et al., 2010. a, (left) Every psen2 F0 knockout behavioural fingerprint generated in this study. Each dot represents the mean deviation from the same-clutch scrambled-injected mean for that parameter (z-score, mean ± SEM). From the experiments in Fig. 6, the psen2 F0 knockout + H<sub>2</sub>O fingerprints are presented. The fingerprints in grey (“not shown”) are from a preliminary drug treatment experiment we did not include in the final study. These fingerprints are from psen2 F0 knockout larvae treated with 0.2% DMSO, normalised to scrambled-injected siblings also treated with 0.2% DMSO. (right) Pairwise cosine similarities (−1.0–1.0) for the fingerprints presented. b, Every sorl1 F0 knockout behavioural fingerprint, as in a). c, The scrambled-injected + citalopram (10 µM) fingerprints (grey) in comparison to the citalopram (10–15 µM) fingerprints from the Rihel et al., 2010 database (green). d, The scrambled-injected + fluvoxamine (10 µM) fingerprint (grey) in comparison to the fluvoxamine fingerprints from the Rihel et al., 2010 database (pink). In c) and d), the scrambled-injected fingerprints are from the experiments in Fig. 5 and Fig. 5–suppl. 1, but were converted here into the behavioural parameters used by Rihel et al., 2010 for comparison. Parameters: 1, average activity (sec active/min); 2, average waking activity (sec active/min, excluding inactive minutes); 3, total sleep (hr); 4, number of sleep bouts; 5, sleep bout length (min); 6, sleep latency (min until first sleep bout).

      (3) The authors make the point that most of the AD risk genes are expressed in fish during development. Is there public data to comment on whether the genes of interest are expressed in mature/old fish as well? Just because the genes are expressed early does not at all mean that early-life dysfunction is related to future AD (though this could be the case, of course). Genes with exclusive developmental expression would be strong candidates for such an early-life role, however. I presume the case is made because sleep studies are mainly done in juvenile fish, but I think it is really a pretty minor point and such a strong claim does not even need to be made.

      This is a fair criticism but we do not make this claim (“early-life dysfunction is related to future AD”) from expression alone. The reviewer is probably referring to the following quote:

      “[…] most of these were expressed in the brain of 5–6-dpf zebrafish larvae, suggesting they play a role in early brain development or function,” which does not mention future risk of AD. We do suggest that these genes have a function in development. After all, every gene that plays a role in brain development must be expressed during development, so this wording seemed reasonable. Nevertheless, we adapted the wording to address this point and Reviewer #2’s complaint below. As noted, the primary goal was to check that the genes we selected were indeed expressed in zebrafish larvae before performing knockout experiments. Our discussion does raise the hypothesis that mutations in Alzheimer’s risk genes impact brain development and sleep early in life, but this argument primarily relies on our observation that knockout of late-onset Alzheimer’s risk genes causes sleep phenotypes in 7-day old zebrafish larvae and from previous work showing brain structural differences in children at high genetic risk of AD (Dean et al., 2014; Quiroz et al., 2015), not solely on gene expression early in life.

      Please also see our answer to a similar point raised by Reviewer #2 below (cf. Author response image 7).

      (4) A common quandary with defining sleep behaviorally is how to rectify sleep and activity changes that influence one another. With psen2 KOs, the authors describe reduced activity and increased sleep during the day. But how do we know if the reduced activity drives increased behavioral quiescence that is incorrectly defined as sleep? In instances where sleep is increased but activity during periods during wake are normal or elevated, this is not an issue. But here, the animals might very well be unhealthy, and less active, so naturally they stop moving more for prolonged periods, but the main conclusion is not sleep per se. This is an area where more experiments should be added if the authors do not wish to change/temper the conclusions they draw. Are psen2 KOs responsive to startling stimuli like controls when awake? Do they respond normally when quiescent? Great care must be taken in all models using inactivity as a proxy for sleep, and it can harm the field when there is no acknowledgment that overall health/activity changes could be a confound. Particularly worrisome is the betamethasone data in Figure 6, where activity and sleep are once again coordinately modified by the drug.

      This is a fair criticism. We agree it is a concern, especially in the case of psen2 as we claim that day-time sleep is increased while zebrafish are diurnal. We do not rely heavily on the day-time inactivity being sleep (the ZOLTAR predictions or the small molecule rescue do not change whether the parameter is called sleep or inactivity), but our choice of labelling can fairly be challenged.

      To address “are psen2 KO responsive to startling stimuli like controls when awake/when quiescent”, we looked at the larvae’s behaviour immediately after lights abruptly switched on in the mornings. Almost every larva, regardless of genotype, responded strongly to every lights-off transition during the experiment. We therefore chose the lights-on transition for this analysis, as it is a weaker startling stimulus for the larvae than the lights-off transition (Fig. 3–supplement 3), potentially exposing differences between genotypes or behavioural states (quiescent or awake). We defined a larva as having reacted to the lights switching on if it made a swimming bout during the second (25 frames) after the lights-on transition. Across two clutches and two lights-on transitions, an average of 65% (range 52–73%) of all larvae reacted to the stimulus. psen2 knockout larvae were as likely to respond as controls, if not more likely (on average 69% responded, range 60–76%, vs. 60% on average for controls, range 44–75%). When the lights switched on, about half of the larvae (39–51%) would have been classified as asleep according to the one-minute inactivity definition (i.e. the larva did not move in the minute preceding the lights transition). This allowed us to also compare behavioural states, as suggested by the reviewer. For three of the four light transitions, larvae which were awake when lights switched on were more likely to react than asleep larvae, but this difference was not striking (overall, awake larvae were only 1.1× more likely to react; Author response image 3). Awake psen2 knockout larvae were 1.1× (range 1.04–1.11×) more likely to react than awake control larvae, so, yes, psen2 knockout larvae respond normally when awake. Asleep psen2 knockout larvae were 1.4× (range 0.63–2.19×) more likely to react than asleep control larvae, so psen2 knockouts are also at least as likely to react as control larvae when asleep. In summary, the overall health of psen2 knockouts did not seem to be a significant confound in the experiment. As the reviewer suggested, if psen2 knockout larvae were seriously unhealthy, they would not be as responsive as control larvae to a startling stimulus.
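      The classification used in this analysis can be summarised in a short Python sketch. It assumes each larva's recording is a per-frame list in which any non-zero value means movement, and that the frame indexed by `lights_on` is the first frame after the transition; the 25 frames-per-second rate and the one-minute window come from the text above, while the function names and data layout are hypothetical.

```python
FPS = 25  # frames per second, as stated in the text above


def classify_at_transition(trace, lights_on, fps=FPS):
    """Return (state, reacted) for one larva at one lights-on transition.

    state:   "awake" if the larva moved in the minute before lights-on,
             "asleep" if it was inactive for that whole minute.
    reacted: True if the larva moved during the second after lights-on.
    """
    before = trace[max(0, lights_on - 60 * fps):lights_on]
    after = trace[lights_on:lights_on + fps]
    state = "awake" if any(before) else "asleep"
    return state, any(after)


def reaction_rates(traces, lights_on, fps=FPS):
    """Fraction of awake and of asleep larvae that reacted to lights-on."""
    tally = {"awake": [0, 0], "asleep": [0, 0]}  # [reacted, total]
    for trace in traces:
        state, reacted = classify_at_transition(trace, lights_on, fps)
        tally[state][0] += int(reacted)
        tally[state][1] += 1
    return {s: (r / n if n else None) for s, (r, n) in tally.items()}
```

      Running `reaction_rates` separately on knockout and control larvae and taking the ratio of the two rates gives fold differences of the kind reported above.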

      Author response image 3.

      psen2 F0 knockouts react normally to lights switching on, indicating they are largely healthy. At each lights-on transition (9 AM), each larva was categorised as awake if it had moved in the preceding one minute or asleep if it had been inactive for at least one minute. Darker tiles represent larvae which performed a swimming bout during the second following lights-on; lighter tiles represent larvae which did not move during that second. The total count of each waffle plot was normalised to 25 so plots can be compared to each other. The real count is indicated in the corner of each plot. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.

      Next, we compared inactive period durations during the day between psen2 and control larvae. If psen2 knockout larvae indeed sleep more during the day compared to controls, we may predict inactive periods longer than one minute to increase disproportionately compared to the increase in shorter inactive periods. This broadly appeared to be the case, especially for one of the two clutches (Author response image 4). In clutch 1, inactive periods lasting 1–60 sec were equally frequent in both psen2 and control larvae (fold change 1.0× during both days), while inactive periods lasting 1–2 min were 1.5× (day 1) and 2.5× (day 2) more frequent in psen2 larvae compared to control larvae. In clutch 2, 1–60 sec inactive periods were also equally frequent in both psen2 and control larvae, while inactive periods lasting 1–2 min were 3.4× (day 1) and 1.5× (day 2) more frequent in psen2 larvae compared to control larvae. Therefore, psen2 knockouts disproportionately increased the frequency of inactive periods longer than one minute, suggesting they genuinely slept more during the day.
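      The comparison above can be sketched in Python as follows, assuming each larva is represented by a list of its inactive-period durations in seconds. Frequencies are expressed as the fraction of a larva's inactive periods falling in each duration bin, averaged across larvae of the same genotype; the bin edges, function names and data layout are illustrative assumptions rather than the code used for the analysis.

```python
def bin_frequencies(durations, edges):
    """Fraction of one larva's inactive periods falling in each duration bin.

    Periods shorter than 1 second are excluded, as in the analysis described.
    Bins are half-open: edges[i] <= duration < edges[i + 1].
    """
    kept = [d for d in durations if d >= 1.0]
    freqs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        n = sum(lo <= d < hi for d in kept)
        freqs.append(n / len(kept) if kept else 0.0)
    return freqs


def mean_distribution(larvae, edges):
    """Average the per-larva bin frequencies across larvae of one genotype."""
    per_larva = [bin_frequencies(d, edges) for d in larvae]
    return [sum(col) / len(per_larva) for col in zip(*per_larva)]


def fold_changes(knockout_larvae, control_larvae, edges):
    """Knockout-to-control ratio of mean frequency for each duration bin."""
    ko = mean_distribution(knockout_larvae, edges)
    ctrl = mean_distribution(control_larvae, edges)
    return [k / c if c > 0 else float("nan") for k, c in zip(ko, ctrl)]
```

      With edges of `[1, 60, 120]`, the two returned values correspond to the 1–60 sec and 1–2 min categories discussed above; a fold change near 1 in the first bin together with a larger value in the second is the disproportionate increase we describe.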

      Author response image 4.

      psen2 F0 knockouts preferentially increased the frequency of longer inactive bouts. For each day and clutch, we calculated the mean distribution of inactive bout lengths across larvae of the same genotype (psen2 F0 knockout or scrambled-injected), then compared the frequency of inactive bouts of different lengths between the two genotypes. For example, in clutch 1 during day 2, 0.01% of the average scrambled-injected larva’s inactive bouts lasted 111–120 seconds (X axis 120 sec) while 0.05% of the average psen2 F0 knockout larva’s inactive bouts lasted this long, so the fold change was 5×. Inactive bouts lasting < 1 sec were excluded from the analysis. In clutch 2, day 1 plot, two datapoints fall outside the Y axis limit: 140 sec, Y = 32×; 170 sec, Y = 16×. Data is from the baseline psen2 knockout trackings presented in Fig. 3 and Fig. 3–suppl. 2.

      Ultimately, this criticism seems challenging to definitively address experimentally. A possible approach could be to use a closed-loop system which, after one minute of inactivity, triggers a stimulus that is sufficient to startle an awake larva but not an asleep larva. If psen2 knockout larvae indeed sleep more during the day, the stimulus should usually not be sufficient to startle them. Nevertheless, we believe the two analyses presented here are consistent with psen2 knockout larvae genuinely sleeping more during the day, so we decided to keep this label. We agree with the reviewer that the one-minute inactivity definition has limitations, especially for day-time inactivity.

      (5) The conclusions for the serotonin section are overstated. Behavioural pharmacology purports to predict a signaling pathway disrupted with sorl1 KO. But is it not just possible that the drug acts in parallel to the true disrupted pathway in these fish? There is no direct evidence for serotonin dysfunction - that conclusion is based on response to the drug. Moreover, it is just one drug - is the same phenotype present with another SSRI? Likewise, language should be toned down in the discussion, as this hypothesis is not "confirmed" by the results (consider "supported"). The lack of measured serotonin differences further raises concern that this is not the true pathway. This is another major point that deserves further experimental evidence, because without it, the entire approach (behavioral pharm screen) seems more shaky as a way to identify mechanisms. There are any number of testable hypotheses to pursue such as a) Using transient transgenesis to visualize 5HT neuron morphology (is development perturbed: cell number, neurite morphology, synapse formation); b) Using transgenic Ca reporters to assay 5HT neuron activity.

      Regarding the comment, “is it not just possible that the drug acts in parallel to the true disrupted pathway”, we think not, assuming we understand the question correctly. Key to our argument is the fact that sorl1 knockout larvae react differently to the drug(s) than control larvae. As an example, take night-time sleep bout length, which was not affected by knockout of sorl1 (Fig. 4–supplement 4). For the sake of the argument, say only dopamine signalling (the “true disrupted pathway”) was affected in sorl1 knockouts and that serotonin signalling was intact. Assuming that citalopram specifically alters serotonin signalling, then treatment should cause the same increase in sleep bout length in both knockouts and controls as serotonin signalling is intact in both. This is not what we see, however. Citalopram caused a greater increase in sleep bout length in sorl1 knockouts than in scrambled-injected larvae. In other words, the effect is non-additive, in the sense that citalopram did not add the same number of z-scores to sorl1 knockouts and controls. We think this shows that serotonin signalling is somehow different in sorl1 knockouts. Nonetheless, we concede that the experiment does not necessarily say much about the importance of the serotonin disruption caused by loss of Sorl1. It could be, for example, that the most salient consequence of loss of Sorl1 is cholinergic disruption (see reply to Reviewer #1 above) and that serotonin signalling is a minor theme.
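      The non-additivity argument can be made concrete with a small Python sketch. The z-scores below are hypothetical values chosen for illustration, not data from the study, and the function names are ours: if the drug acted on a pathway that is intact in both genotypes, the drug effect would be the same in knockouts and controls and the contrast would be close to zero.

```python
def drug_effect(treated_z, untreated_z):
    """Change in a parameter (in z-scores) caused by the drug in one genotype."""
    return treated_z - untreated_z


def interaction_contrast(ko_drug, ko_vehicle, ctrl_drug, ctrl_vehicle):
    """Drug effect in knockouts minus drug effect in controls.

    Zero means the drug adds the same number of z-scores to both genotypes
    (additive); a non-zero value means the genotypes respond differently.
    """
    return drug_effect(ko_drug, ko_vehicle) - drug_effect(ctrl_drug, ctrl_vehicle)


# Hypothetical z-scores for night-time sleep bout length (not data from the study).
additive = interaction_contrast(ko_drug=1.0, ko_vehicle=0.0, ctrl_drug=1.0, ctrl_vehicle=0.0)
non_additive = interaction_contrast(ko_drug=2.5, ko_vehicle=0.0, ctrl_drug=1.0, ctrl_vehicle=0.0)
```

      In the hypothetical non-additive case the knockout gains 1.5 more z-scores from the drug than the control does, which is the pattern we argue points to altered serotonin signalling.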

      Furthermore, we agree with the reviewer and Reviewer #2 that the conclusions were overly confident. As suggested, we decided to repeat this experiment with another SSRI, fluvoxamine. Please find the results of this experiment in Fig. 5–supplement 1. The suggestions to further test the serotonin system in the sorl1 knockouts are excellent as well; however, we do not plan to pursue them at this stage.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major Comments:

      - Data are presented in a variety of different ways, occasionally making comparisons across figures difficult. Perhaps at a minimum, behavioral fingerprints as in Figure 3 - Supplementary Figure 1 should be presented for all mutants in the main figures.

      We like this suggestion! Thank you. We brought the behavioural fingerprints figure (previously Fig. 4–supplement 5) as main Fig. 4, and put the figure focused on the sorl1 knockout behavioural phenotype in supplementary, with the other gene-by-gene figures.

      - It is not clear why some data were selected for supplemental rather than main figures. In many cases, detailed phenotypic data is provided for one example mutant in the main figures, and then additional mutants are described in detail in the supplement. Again, to facilitate comparisons between mutants, fingerprints could be provided for all mutants in a main figure, with detailed analyses moved to the supplements.

      The logic was to dedicate one main figure to psen2 (Fig. 3) as an example of an early-onset Alzheimer’s risk gene, and one to sorl1 (previously Fig. 4) as an example of a late-onset Alzheimer’s risk gene. We focused on them in main figures as they are both tested again later (Fig. 5 and Fig. 6). Having said that, we agree that the fingerprints may be a better use of main figure space than the parameters plots. In addition to the above (fingerprints of late-onset Alzheimer’s risk genes in main figure), we rearranged the figures in the early-onset AD section to have the psen2 F0 knockout fingerprint in main.

      - The explication of the utility of behavioral fingerprinting on page 35 is somewhat confusing. The authors describe drugs used to treat depression as enriched among small molecules anti-correlating with the sorl1 fingerprint. However, in Figure 5 - Supplementary Figure 1, drugs used to treat depression are biased toward positive cosines, which are indicated as having a more similar fingerprint to sorl1. These drugs should be described as more present among compounds positively correlating with the sorl1 fingerprint.

      Sorry, the confusion is about “(anti-)correlating”. Precisely, we meant “correlating and/or anti-correlating”, not just anti-correlating. We changed to that wording. In short, the analysis is by design agnostic to whether compounds with a given annotation are found more on the positive cosines side (left side in Fig. 5–supplement 1a) or the negative cosines side (right side). This is because the dataset often includes both agonists and antagonists to a given pathway but these are difficult to annotate. For example, say 10 compounds in the dataset target the dopamine D4 receptor, but these are an unknown mix of agonists and antagonists. In this case, we want ZOLTAR to generate a low p-value when all 10 compounds are found at extreme ends of the list, regardless of which end(s) that is (e.g. top 8 and bottom 2 should give an extremely low p-value). Initially, we were splitting the list, for each annotation, into positive-cosine fingerprints and negative-cosine fingerprints and testing enrichment on both separately, but we think the current approach is better as it reflects better the cases we want to detect and considers all available examples for a given annotation in one test. In sum, yes, in this case drugs used to treat depression were mostly in the positive-cosine side, but the other drugs on the negative-cosine side also contributed to what the p-value is, so it reflects better the analysis to say “correlating and/or anti-correlating”. You can read more about our logic for the analysis in Methods (section Behavioural pharmacology from sorl1 F0 knockout’s fingerprint).
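
      The top-and/or-bottom enrichment logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ZOLTAR implementation: the function name `enrichment_p_value`, the toy list of cosines, and the choice of annotated positions are all hypothetical. The sketch sums the absolute cosines of the annotated fingerprints and compares that sum with the sums obtained from random draws of the same size.

```python
import random

def enrichment_p_value(cosines, annotated, n_draws=10_000, seed=1):
    """Simulated p-value for enrichment at either end of a ranked list.

    cosines   -- one cosine per fingerprint (vs. the knockout fingerprint)
    annotated -- indices of fingerprints sharing an annotation (hypothetical)
    """
    rng = random.Random(seed)
    abs_cosines = [abs(c) for c in cosines]
    observed = sum(abs_cosines[i] for i in annotated)
    # Null distribution: draw the same number of fingerprints at random
    # and count how often their summed |cosine| exceeds the observed sum.
    larger = sum(
        sum(rng.sample(abs_cosines, len(annotated))) > observed
        for _ in range(n_draws)
    )
    return observed, larger / n_draws

# Toy ranked list: 100 cosines evenly spaced from +1 to -1 (made-up data)
toy_cosines = [1 - 2 * i / 99 for i in range(100)]
# Ten annotated compounds: the top 8 and the bottom 2 of the ranked list
toy_annotated = list(range(8)) + [98, 99]

observed_sum, p_value = enrichment_p_value(toy_cosines, toy_annotated)
print(f"sum of |cosines| = {observed_sum:.2f}, simulated p = {p_value}")
```

      Because absolute cosines are summed, the ten annotated compounds produce a small p-value whether they sit at the positive end, the negative end, or a mix of both.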

      - The authors conclude the above-described section by stating: "sorl1 knockout larvae behaved similarly to larvae treated with small molecules targeting serotonin signaling, suggesting that the loss of Sorl1 disrupted serotonin signaling." Directionality here may be important. Are all of the drugs targeting the serotonin transporter SSRIs or similar? If so, then a correct statement would be that loss of Sorl1 causes similar phenotypes to drugs enhancing serotonin signaling. Finally, based on the correlation between serotonin transporter inhibitor trazodone and the sorl1 crispant phenotype, it is potentially surprising that the SSRI citalopram caused the opposite phenotype from sorl1, that is, increased sleep during the day and night. It is potentially interesting that this result was enhanced in mutants, and suggests dysfunction of serotonin signaling, but the statement that "our behavioral pharmacology approach correctly predicted from behaviour alone that serotonin signaling was disrupted" is too strong a conclusion.

      We understand “disrupt” as potentially going either way, but this may not be the common usage. We changed to “altered”.

      The point regarding directionality is excellent, however. We tested the proportion of serotonin transporter agonists and antagonists (SSRIs) on each side of the ranked list of small molecule fingerprints. We used the STITCH database for this analysis as it has more drug–target interactions, but is likely less curated, than the Therapeutic Target Database (Szklarczyk et al., 2016). As with the Therapeutic Target Database, most fingerprints of compounds interacting with the serotonin transporter SLC6A4 were found on the side of positive cosines (p ~ 0.005 using the custom permutation test), which replicates Fig. 5a with a different source for the drug–target annotations (Author response image 5). On the side of positive cosines (small molecules which generate behavioural fingerprints correlating with the sorl1 fingerprint), there were 2 agonists and 26 antagonists. On the side of negative cosines (small molecules which generate behavioural fingerprints anti-correlating with the sorl1 fingerprint), there were 3 agonists and 2 antagonists. Using a Chi-squared test, this suggests a significant (p = 0.002) over-representation of antagonists (SSRIs) on the positive side (expected count = 24, vs. 26 observed) and agonists on the negative side (expected count = 1, vs. 3 observed). If SLC6A4 antagonists, i.e. SSRIs, indeed tend to cause a similar behavioural phenotype to knockout of sorl1, this would point in the direction of our original interpretation of the citalopram experiment, which was that excessive serotonin signalling is what causes the sorl1 behavioural phenotype.
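
      The expected counts quoted above follow from the row and column totals of the 2 × 2 table. The sketch below recomputes them from the four observed counts given in the text. It assumes a Pearson Chi-squared test without continuity correction (the response does not state which variant was used), and the function name `chi_squared_2x2` is made up for illustration.

```python
import math

# Observed counts from the text: (agonists, antagonists) on each side
observed = {
    "positive cosines": (2, 26),
    "negative cosines": (3, 2),
}

def chi_squared_2x2(table):
    """Pearson Chi-squared test on a 2x2 table, no continuity correction
    (an assumption; the source does not specify the variant)."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    # Expected count = row total x column total / grand total
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Survival function of the Chi-squared distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p

expected, chi2, p = chi_squared_2x2(observed)
print(f"expected antagonists, positive side: {expected[0][1]:.1f}")
print(f"expected agonists, negative side:    {expected[1][0]:.1f}")
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

      With only 33 compounds and one expected count below 1, an exact test would be a natural cross-check on this approximation.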

      Author response image 5.

      Using the STITCH database as source of annotations also predicts SLC6A4 as an enriched target for the sorl1 behavioural fingerprint. Same figures as Fig. 5a,b but using the STITCH database (Szklarczyk et al., 2016) as source for the drug targets. a, Compounds annotated by STITCH as interacting with the serotonin transporter SLC6A4 tend to generate behavioural phenotypes similar to the sorl1 F0 knockout fingerprint. 40,522 compound–target protein pairs (vertical bars; 1,592 unique compounds) are ranked from the fingerprint with the most positive cosine to the fingerprint with the most negative cosine in comparison with the mean sorl1 F0 knockout fingerprint. Fingerprints of drugs that interact with SLC6A4 are coloured in yellow. Simulated p-value = 0.005 for enrichment of drugs interacting with SLC6A4 at the top (positive cosine) and/or bottom (negative cosine) of the ranked list by a custom permutation test. b, Result of the permutation test for top and/or bottom enrichment of drugs interacting with SLC6A4 in the ranked list. The absolute cosines of the fingerprints of drugs interacting with SLC6A4 (n = 52, one fingerprint per compound) were summed, giving sum of cosines = 15.9. To simulate a null distribution, 52 fingerprints were randomly drawn 100,000 times, generating a distribution of 100,000 random sums of cosines. Here, only 499 random draws gave a larger sum of cosines, so the simulated p-value was p = 499/100,000 = 0.005 **.

      If this were true, we would expect, as the reviewer suggested, SSRI treatment (citalopram or fluvoxamine) on control larvae to give a similar behavioural phenotype as knockout of sorl1. However, this generally did not appear to be the case (sorl1 knockout fingerprint vs. SSRI-treated control fingerprint, cosine = 0.08 ± 0.35; Author response image 6).

      Author response image 6.

      sorl1 F0 knockouts in comparison to controls treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H₂O fingerprint from the citalopram experiment) in comparison with the scrambled-injected + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H₂O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H₂O fingerprint from the fluvoxamine experiment) in comparison with the scrambled-injected + fluvoxamine (10 µM) fingerprint.

      The comparison with trazodone is an interesting observation, but it is only a weak serotonin reuptake inhibitor (Ki for SLC6A4 = 690 nM, vs. 8.9 nM for citalopram; Owens et al., 1997) and it has many other targets, as agonist or antagonist, including serotonin, adrenergic, and histamine receptors (Mijur, 2011). In any case, the average trazodone fingerprint does not correlate particularly well with the sorl1 knockout fingerprint (cos = 0.3). Finally, the sorl1 knockout behavioural phenotype could be primarily caused by altered serotonin signalling in the hypothalamus, where we found both the biggest difference in tph1a/1b/2 HCR signal intensity (Fig. 5f) and the highest expression of sorl1 across scRNA-seq clusters (Fig. 1–supplement 2). In this case, it would be correct to expect sorl1 knockouts to react differently to SSRIs than controls, but it would be incorrect to expect SSRI treatment to cause the same behavioural phenotype, as it concurrently affects every other serotonergic neuron in the brain.

      Finally, we agree the quoted conclusion was too strong given the current evidence. We since tested another SSRI, fluvoxamine, on sorl1 knockouts.

      - Also in reference to Figure 5: in panel c, data are presented as deviation from vehicle treated. Because of this data presentation choice, it's no longer possible to determine whether, in this experiment, sorl1 crispants sleep less at night relative to their siblings. Does citalopram rescue / reverse sleep deficits in sorl1 mutants?

      On your first point, please see our response to Reviewer #3 (2)c and Author Response 2b above.

      On “does citalopram rescue/reverse sleep deficits in sorl1 mutants”: citalopram (and fluvoxamine) tends to reverse the key aspects of the sorl1 knockout behavioural phenotype by reducing night-time activity (% time active and total Δ pixels), increasing night-time sleep, and shortening sleep latency (Author response image 7). Extrapolating from the hypothesis presented in Discussion, this may be interpreted as a hint that sorl1 knockouts have reduced levels of 5-HT receptors, as increasing serotonin signalling using an SSRI tends to rescue the phenotype. However, we do not think that focusing on the significant behavioural parameters necessarily makes sense here. Rather, one should take all parameters into account to conclude whether knockouts react differently to the drug than wild types (also see answer to Reviewer #3, (7) on this). For example, citalopram increased the night-time sleep bout length of sorl1 knockouts more than that of controls (Fig. 5), but this parameter was not modified by knockout of sorl1 (Fig. 4). To explain the rationale more informally, citalopram is only used as a tool here to probe serotonin signalling in sorl1 knockouts. Whether it worsens or rescues the behavioural phenotype is somewhat secondary; the key question is whether knockouts react differently than controls.

      Author response image 7.

      Comparing untreated sorl1 F0 knockouts vs. treated with SSRIs. a, sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H₂O fingerprint from the citalopram experiment) in comparison with the sorl1 knockout + citalopram (1 or 10 µM) fingerprints. Each dot represents the mean deviation from the same-clutch scrambled-injected H₂O-treated mean for that parameter (z-score, mean ± SEM). b, As in a), sorl1 F0 knockout fingerprints (baseline recordings and sorl1 + H₂O fingerprint from the fluvoxamine experiment) in comparison with the sorl1 + fluvoxamine (10 µM) fingerprint.

      - Possible molecular pathways targeted by tinidazole, fenoprofen, and betamethasone are not described.

      Tinidazole is an antibiotic, fenoprofen is a non-steroidal anti-inflammatory drug (NSAID), and betamethasone is a steroidal anti-inflammatory drug. Interestingly, long-term use of NSAIDs reduces the risk of AD (in ’t Veld Bas A. et al., 2001). Several mechanisms are possible (Weggen et al., 2007), including reduction of Aβ42 production by interacting with γ-secretase (Eriksen et al., 2003). However, we did not explore the mechanism of action of these drugs on psen2 knockouts so do not feel comfortable speculating. We do not know, for example, whether these findings apply to betamethasone.

      Minor Comments:

      - On page 25, panel "g" should be labeled as "f".

      Thank you!

      - On page 35, a reference should be provided for the statement "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes.".

      Thank you, this is now corrected. These were the same studies as mentioned in the Introduction.

      - On page 43, the word "and" should be added - "in wild-type rats and mice, overexpressing mutated human APP and PSEN1, AND restricting sleep for 21 days...".

      Right, this sentence could be misread, we edited it. “overexpressing […]” only applied to the mice, not the rats (as they are wild-type); and both are sleep-deprived.

      - On page 45, a reference should be provided for the statement "SSRIs can generally be used continuously with no adverse effects" and this statement should potentially be softened.

      The reference is at the end of that sentence (Cirrito et al., 2011). You are correct though; we reformulated this statement to: “SSRIs can generally be used safely for many years”. SSRIs indeed have side effects.

      - On page 54, a 60-minute rolling average is described as 45k rows, but this seems to be a 30-minute rolling average.

      Thank you! We corrected this. It should have been 90k rows, as in: 25 frames-per-second × 60 seconds × 60 minutes.
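
      The arithmetic can be checked with a one-line helper. This is a sketch only: the function name `rolling_window_rows` is hypothetical and is not part of the FramebyFrame package; the frame rate of 25 frames per second is the one stated above.

```python
FRAMES_PER_SECOND = 25  # recording frame rate stated in the response

def rolling_window_rows(minutes, fps=FRAMES_PER_SECOND):
    """Number of frame-by-frame rows spanned by a rolling window."""
    return fps * 60 * minutes  # frames per second x seconds per minute x minutes

print(rolling_window_rows(60))  # window length in rows for 60 minutes
print(rolling_window_rows(30))  # window length in rows for 30 minutes
```

      At 25 frames per second, 45k rows corresponds to 30 minutes, which is the discrepancy the reviewer noticed.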

      Reviewer #2 (Recommendations For The Authors):

      "As we observed in the scRNA-seq data, most genes tested (appa, appb, psen1, psen2, apoea, cd2ap, sorl1) were broadly expressed throughout the 6-dpf brain (Fig. 1d and Fig. 1supplement 3 and 4)."

      - apoea and appb are actually not expressed highly in the scRNA-seq data, and the apoea in situ looks odd, as if it has no expression. The appb gene mysteriously does not look as though it has high expression in the Raj data, but it is clearly expressed based on the in situ. I had previously noticed the same discrepancy, and I attribute it to the transcriptome used to map the Raj data, as the new DanioCell data uses a new transcriptome and indicates high appb expression in the brain. Please point out the discrepancy and possible explanation, perhaps in the figure legend.

      All excellent points, thank you. We included them directly in Results text.

      "most of these were expressed in the brain of 5-6-dpf zebrafish larvae, suggesting they play a role in early brain development or function."

      - Evidence of expression does not suggest function, particularly not a function in brain development. As one example, almost half of the genome is expressed prior to the maternal-zygotic transition but does not have a function in those earliest stages of development. There are numerous other instances where expression does not equal function. Please change the sentence even as simply as "it is possible that they".

      We mostly agree and edited to “[…], so they could play a role […]”.

      Out of curiosity, we plotted, for each zebrafish developmental stage, the proportion of Alzheimer’s risk gene orthologues expressed in comparison to the proportion of all genes expressed (Author response image 8). We defined “all genes” as every gene that is expressed in at least one of the developmental stages (n = 24,856), not the complete transcriptome, to avoid including genes that are never expressed in the brain or whose expression is always below detection limit. We counted a gene as “expressed” if at least three cells had detectable transcripts. Using these definitions, 82 ± 7% of genes are expressed during development. For every developmental stage except 5 dpf (so 11/12), a larger proportion of Alzheimer’s risk genes than all genes are expressed (+5 ± 4%).
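
      The definition of “expressed” used here can be written down compactly. The sketch below is illustrative only: the gene names and counts are made up, the helper names are hypothetical, and it assumes that “detectable transcripts” means a count above zero in a cell.

```python
def is_expressed(counts_per_cell, min_cells=3):
    """A gene counts as expressed if at least `min_cells` cells
    have a detectable (here: non-zero) transcript count."""
    return sum(1 for count in counts_per_cell if count > 0) >= min_cells

def proportion_expressed(stage_counts, genes):
    """Proportion of `genes` expressed at one developmental stage."""
    return sum(is_expressed(stage_counts[g]) for g in genes) / len(genes)

# Made-up per-cell counts for one stage (five cells, four genes)
toy_stage = {
    "geneA": [0, 2, 1, 0, 5],  # detected in 3 cells -> expressed
    "geneB": [0, 0, 1, 0, 4],  # detected in 2 cells -> not expressed
    "geneC": [3, 1, 1, 2, 0],  # detected in 4 cells -> expressed
    "geneD": [0, 0, 0, 0, 0],  # never detected
}
risk_genes = ["geneA", "geneB"]  # hypothetical risk gene orthologues

# Background set: genes expressed in at least one stage (one stage in this toy)
background = [g for g in toy_stage if is_expressed(toy_stage[g])]

print(proportion_expressed(toy_stage, risk_genes))
print(proportion_expressed(toy_stage, background))
```

      In the real analysis the background would be built across all developmental stages, so a gene in the background can still be absent at a given stage.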

      Author response image 8.

      Proportion of Alzheimer’s risk genes orthologues expressed throughout zebrafish development. Proportion of Alzheimer’s risk genes orthologues (n = 42) and all genes (n = 24,856) expressed in the zebrafish brain at each developmental stage, from 12 hours post-fertilisation (hpf) to 15 days post-fertilisation (dpf). “All genes” corresponds to every gene expressed in the brain at any of the developmental stages, not the complete transcriptome. A gene is considered “expressed” (green) if at least three cells had detectable transcripts. Single-cell RNA-seq dataset from Raj et al., 2020.

      "This frame-by-frame analysis has several advantages over previous methods that analysed activity data at the one-minute resolution."

      - Which methods are these? There are no citations. There are certainly existing methods in the zebrafish field that can produce similar data to the method developed for this project. This new package is useful, as most existing software is not written in R, so it would help scientists who prefer this programming language. However, I would be careful not to oversell its novelty, since many methods do exist that produce similar results.

      We added the references. They were referenced above after “we combined previous sleep/wake analysis methods”, but should have been referenced again here.

      We are not convinced by this criticism. We would obviously not claim that the FramebyFrame package is as sophisticated and versatile as video-tracking tools like SLEAP or DeepLabCut, but we do think it answers a genuine need that was not addressed by other methods. Specifically, we know of many labs recording pixel count data across multiple days using the Zebrabox or DanioVision (we added support for DanioVision data after submission), but there were no packages to extract behavioural parameters from these data. Other methods involved standalone scripts with no documentation or version tracking. We would concede the FramebyFrame package is mostly targeted at these labs, but we already know of six labs routinely using it and were recently contacted by a researcher tracking Daphnia in the Zebrabox.

      "F0 knockouts of both cutches" - "clutches"

      Thank you!

      Reviewer #3 (Recommendations For The Authors):

      I would suggest totally revamping the Introduction section, and being sure to provide readers with the context and background they need for the data that comes thereafter. Key areas to touch on, in no particular order, include:

      • Far more detail on the behavioral pharm screen upon which this paper builds, as a brief overview of that approach and the data generated are needed.

      Thank you for the suggestion, we added a sentence hinting at this work in the last Introduction paragraph.

      • Limitations of current zebrafish sleep/arousal assays that motivated the authors to develop a new, temporally high-resolution system.

      We think this is better explained in Results, as is currently. For example, we need to point to Fig. 2–supplement 2a,b,c to explain that one-minute methods were missing sleep bouts and how FramebyFrame resolves this issue.

      • A paragraph about sleep and AD, that does a better job of citing work in humans, mammalian, and invertebrate models that motivate the interest in the connection pursued here.

      Sorry, we think this would place too much focus on sleep and AD. We want the main topic of the paper to be the behavioural pharmacology approach, not AD or sleep per se. As the Introduction states, we see Alzheimer’s risk genes as a case study for the behavioural pharmacology approach, rather than the reason why the approach was developed. Additionally, presenting sleep and AD in Introduction risks sounding like ZOLTAR is specifically designed for this context, while we conceived of it as much more generalisable and explicitly encourage its use to study genes associated to other diseases. Note that the paragraph you suggest is, we think, mostly present in Discussion (section Disrupted sleep and serotonin signalling […]).

      • I modestly suggest eliminating making such a strong case for a gene-first approach being the best way to understand disease. It is not a zero-sum game, and there is plenty to learn from proteomics, metabolomics, etc. I suspect nobody will argue with the authors saying they leveraged the strength of their system and focused on key AD genes of interest.

      From your point below, we understand the following quote is the source of the issue: “For finding causal processes, studying the genome, rather than the transcriptome or epigenome, is advantageous because the chronology from genomic variant to disease is unambiguous […]”. We did not want to suggest it is a zero-sum game, but we now understand how it can be read this way. We slightly adapted the wording. What we want to do is highlight the causality argument as the advantage of the genomics approach. We feel we do not read this argument often enough, while it remains a ‘magic power’ of genomics. One essentially does not have to worry about causality when studying a pathogenic germline variant, while it is a constant concern when studying the transcriptome or epigenome (i.e. did the change in this transcript’s level cause disease, or vice-versa?). To take an example in the context of AD, arguments based on genomics (e.g. Down syndrome or APP duplication) are often the definite arbiters when debating the amyloid hypothesis, exactly because their causality cannot be doubted.

      Minor comments

      (1) The opening of the introduction is perhaps overly broad, spending an entire paragraph on genome vs transcriptome, etc and making the claim that a gene-first approach is the best path. It isn't zero-sum, and the authors could just get right into AD and study genes of interest. Similar issues occur throughout the manuscript, with sentences/paragraphs that are not necessarily needed.

      Please see our answer to your previous point. On the introduction being overly broad, we perfectly agree it is broad, but related to your point about presenting sleep and AD in the Introduction, we wish to talk about finding causal processes from genomics findings using behavioural pharmacology. We purposefully present research on AD as one instance of this broader goal, not the primary topic of the paper.

      Another example are these sentences, which could be totally removed as the following paragraph starts off making the same point much more succinctly. "From genomic studies of AD, we know that mutations in genes such as SORL1 modify risk by disrupting some biological processes. Presumably, the same processes are disrupted in zebrafish sorl1 knockouts, and some caused the behavioural alterations we observed. Can we now follow the thread backwards and predict some of the biological processes in which Sorl1 is involved based on the behavioural profile of sorl1 knockouts?"

      Thanks for the suggestion, but we think these sentences are useful to place this Results section back in the context of the Introduction. Think of the paper as mainly about the behavioural pharmacology approach, not about Alzheimer’s risk genes. The function of the paragraph here is not simply to explain the method by which we decided to study sorl1; it is to reiterate the rationale behind the behavioural pharmacology approach so that the reader understands where this Results section fits in the overall structure.

      (2) Related to the above, the authors use lecanemab as an example to support their approach, but there has been a great deal of controversy regarding this drug. I don't think such extensive justification is needed. This study uses AD risk genes as a case study in a newly developed behavioral pharm pipeline. A great deal of the rest of the intro seems to just fill space and could be more focused on the study at hand. Interestingly, after gene selection, the next step in their pipeline is sleep/wake analysis yet nothing is covered about AD and sleep in the intro. Some justification of that approach (why focus on sleep/wake as a starting point for behavioral pharm rather than learning and memory?) would be a better use of intro space.

      There has indeed been controversy about lecanemab, but even the harshest critiques of the amyloid hypothesis concede that it slows down cognitive decline (Espay et al., 2023). That is all that is needed to support our argument, which is that research on AD started primarily from genomics and thereby yielded a disease-modifying drug. The controversy seems mostly focused on whether this effect size is clinically significant, and we think we correctly represent this uncertainty (e.g. “antibodies against Aβ such as lecanemab show promise in slowing down disease progression” and “the beneficial effects from targeting Aβ aggregation currently remain modest”).

      Your next point is entirely fair. We mostly answered it above. To explain further, the primary reason why we measured sleep/wake behaviour is to match the behavioural dataset from Rihel et al., 2010 so we can use it to make predictions, not to study sleep in the context of AD per se. Sure, perhaps learning and memory would have been interesting, but we do not know of any study testing thousands of small molecules on zebrafish larvae during a memory task. We understand it can be slightly confusing though, as we then spend a paragraph of Discussion on sleep as a causal process in AD, but we obviously need to discuss this topic given the findings. However, to reiterate, we purposefully designed FramebyFrame and ZOLTAR to be useful beyond studying sleep/wake behaviour. For example, FramebyFrame would not calculate 17 behavioural parameters if the only goal was to measure sleep. We now mention the Rihel et al., 2010 study in the Introduction as you suggested above (“Far more detail on the behavioral pharm screen […]”), as that is the real reason why sleep/wake behaviour was measured in the first place.

      (3) Also related to the above, another more relevant point that could be talked about in the intro is the need for more refined approaches to analyze sleep in zebrafish, given the effort that went into the new analysis system described here. Again, I think the context for why the authors developed this system would be more meaningful than the current content.

      Thank you, we think we answered this point above (especially below Limitations of current zebrafish sleep/arousal assays […]).

      (4) GWAS can stand for Genome-wide association studies (plural) so I do not think the extra "s" is needed (GWASs).

      Indeed, that seems to be the common usage. Thank you.

      (5) AD candidate risk genes were determined from loci using "mainly statistic colocalization". Can the authors add a few more details about what was done and what the "mainly" caveat refers to?

      “Mainly” simply refers to the fact that other methods were used by Schwartzentruber et al. (2021) to annotate the GWAS loci with likely causal genes, but that most calls were ultimately made from statistic colocalisation. Readers can refer to this work to learn more about the methods used.

      (6) The authors write "The loss of psen1 only had mild effects on behaviour" but I think they mean "sleep behaviors" as there could be many other behaviors that are disrupted but were not assessed. The same issue a few sentences later with "Behaviour during the day was not affected" and at the end of the following paragraph.

      Yes, that would be more precise, thank you.

      (7) For the Sorl1 pharmacology data, it is very hard to understand what is being measured behaviorally. Are the authors measuring sleep +/- citalopram, or something else, and why the change to Euclidean distance rather than all the measures we were just introduced to earlier in the manuscript?

      We understand these plots (Fig. 5c,d) are less intuitive, but it is important that we show the difference in behaviour compared to H₂O-treated larvae of the same genotype. The claim is that citalopram has a larger effect on knockouts than on controls, so the reader needs to focus on the effect of the drug on each genotype, not on the effect of sorl1 knockout. We added the standard fingerprints (i.e. setting controls to z-score = 0) here in Author response figures.

      Euclidean distance takes as input all the measures we introduced. The point is precisely not to select a single measure. For example, if we were only plotting active bout number during the day, we would conclude that 10 µM citalopram has the same effect on knockouts and controls. Conversely, if we had taken sleep bout length at night, we would conclude that 10 µM citalopram has a stronger effect on knockouts. What is the correct parameter to select? Using Euclidean distance resolves this by taking all parameters into account, rather than arbitrarily choosing one.
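
      A small numerical sketch may help show why a single parameter can mislead. All numbers and parameter names below are invented for illustration and do not come from the experiments; the sketch assumes each fingerprint is a vector of z-scores and that the drug effect is the distance between the treated and vehicle-treated fingerprints of the same genotype.

```python
import math

# Hypothetical parameter names, in the order used by the vectors below
parameters = ["day_active_bout_number", "night_sleep_bout_length", "night_total_sleep"]

# Made-up fingerprints (z-scores), one vector per genotype and treatment
fingerprints = {
    ("control", "vehicle"): [0.0, 0.0, 0.0],
    ("control", "drug"): [-1.0, 0.5, 0.4],
    ("knockout", "vehicle"): [0.3, 0.0, -0.6],
    ("knockout", "drug"): [-0.7, 1.5, 0.6],
}

def drug_effect(genotype):
    """Euclidean distance between the drug-treated and vehicle-treated
    fingerprints of the same genotype, across all parameters."""
    return math.dist(fingerprints[(genotype, "drug")],
                     fingerprints[(genotype, "vehicle")])

for genotype in ("control", "knockout"):
    treated = fingerprints[(genotype, "drug")]
    vehicle = fingerprints[(genotype, "vehicle")]
    per_parameter = [round(t - v, 2) for t, v in zip(treated, vehicle)]
    print(genotype, per_parameter, round(drug_effect(genotype), 2))
```

      In this toy case the first parameter shifts by the same amount in both genotypes, while the distance over all parameters is larger for the knockouts.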

      And what exactly is a "given spike in serotonin"? and how is this hypothesis the conclusion based on the lack of evidence for the second hypothesis? As the authors say, there could be other ways sorl1 knockouts are more sensitive to citalopram, so the absence of evidence for one hypothesis certainly does not support the other hypothesis.

      We mean a given release of serotonin in the synaptic cleft. We have fixed this wording. 

      We tend to disagree on the second point. We can think of two ways that sorl1 knockouts are more sensitive to citalopram: 1) they produce more serotonin, so blocking reuptake causes a larger spike in knockouts; or 2) blocking reuptake causes the same increase in both knockouts and wild-types but knockouts react more strongly to serotonin. We cannot in fact think of another way to explain the citalopram results. Not finding overwhelming evidence for 1) surely supports 2) somewhat, even if we do not have direct evidence for it. As an analogy, if two diagnoses are possible for a patient, testing negative for the first one supports the other one, even before it is directly tested.

      (8) Again some language is used without enough care. Fish are referred to as "drowsier" under some drug conditions. How do the authors know the animal is drowsy? The phenotype is more specific - more sleep, less activity.

      Thank you, we switched to “Furthermore, fenoprofen worsened the day-time hypoactivity of psen2 knockout larvae […]”.

      (9) This sentence is misleading as it gives the impression that results in this manuscript suggest the conclusion: "Our observation that disruption of genes associated with AD diagnosis after 65 years reduces sleep in 7-day zebrafish larvae suggest that disrupted sleep may be a common mechanism through which these genes exert an effect on risk." That idea is widely held in the field, and numerous other previous manuscripts/reviews should be cited for clarity of where this hypothesis came from.

      This idea is not widely held in the field. You likely read this point as “disrupted sleep is a risk factor for AD”, which, yes, is widely discussed in the field, but is not precisely what we are saying. We hypothesise that mutations in some of the Alzheimer’s risk genes cause disrupted sleep, possibly from a very early age, which then causes AD decades later. Studies and reviews on sleep and AD rarely make this hypothesis, at least not explicitly. The closest we know of are a few recent human genetics studies, typically using Mendelian Randomisation, finding that higher genetic risk of AD correlates with some sleep phenotypes, such as sleep duration (Chen et al., 2022; Leng et al., 2021). The work of Muto et al. (2021) is particularly interesting as it found correlations between higher genetic risk of AD and some sleep phenotypes in men in their early twenties, which seems unlikely to be a consequence of early pathology. Note, however, that even these studies do not mention sleep possibly being disrupted early in development, which is what our findings in zebrafish larvae support. As we mention, we think a team should test whether sleep is different in infants at higher genetic risk of AD, essentially performing an experiment analogous to, but obviously much more difficult than, the one we did in zebrafish larvae. We do not know of any study testing this or even raising this idea, so evidently it is not widely held. Having said that, the studies we mention here were not referenced in the Discussion paragraph. We have now corrected this.

      Ashlin TG, Blunsom NJ, Ghosh M, Cockcroft S, Rihel J. 2018. Pitpnc1a Regulates Zebrafish Sleep and Wake Behavior through Modulation of Insulin-like Growth Factor Signaling. Cell Rep 24:1389–1396. doi:10.1016/j.celrep.2018.07.012

      Chen D, Wang X, Huang T, Jia J. 2022. Sleep and Late-Onset Alzheimer’s Disease: Shared Genetic Risk Factors, Drug Targets, Molecular Mechanisms, and Causal Effects. Front Genet 13. doi:10.3389/fgene.2022.794202

      Cirrito JR, Disabato BM, Restivo JL, Verges DK, Goebel WD, Sathyan A, Hayreh D, D’Angelo G, Benzinger T, Yoon H, Kim J, Morris JC, Mintun MA, Sheline YI. 2011. Serotonin signaling is associated with lower amyloid-β levels and plaques in transgenic mice and humans. Proc Natl Acad Sci U S A 108:14968–14973. doi:10.1073/pnas.1107411108

      Dean DC, Jerskey BA, Chen K, Protas H, Thiyyagura P, Roontiva A, O’Muircheartaigh J, Dirks H, Waskiewicz N, Lehman K, Siniard AL, Turk MN, Hua X, Madsen SK, Thompson PM, Fleisher AS, Huentelman MJ, Deoni SCL, Reiman EM. 2014. Brain Differences in Infants at Differential Genetic Risk for Late-Onset Alzheimer Disease: A Cross-sectional Imaging Study. JAMA Neurol 71:11–22. doi:10.1001/jamaneurol.2013.4544

      Eriksen JL, Sagi SA, Smith TE, Weggen S, Das P, McLendon DC, Ozols VV, Jessing KW, Zavitz KH, Koo EH, Golde TE. 2003. NSAIDs and enantiomers of flurbiprofen target γ-secretase and lower Aβ42 in vivo. J Clin Invest 112:440–449. doi:10.1172/JCI18162

      Espay AJ, Herrup K, Kepp KP, Daly T. 2023. The proteinopenia hypothesis: Loss of Aβ42 and the onset of Alzheimer’s Disease. Ageing Res Rev 92:102112. doi:10.1016/j.arr.2023.102112

      Hoffman EJ, Turner KJ, Fernandez JM, Cifuentes D, Ghosh M, Ijaz S, Jain RA, Kubo F, Bill BR, Baier H, Granato M, Barresi MJF, Wilson SW, Rihel J, State MW, Giraldez AJ. 2016. Estrogens Suppress a Behavioral Phenotype in Zebrafish Mutants of the Autism Risk Gene, CNTNAP2. Neuron 89:725–733. doi:10.1016/j.neuron.2015.12.039

      in ’t Veld Bas A, Ruitenberg A, Hofman A, Launer LJ, van Duijn CM, Stijnen T, Breteler MMB, Stricker BHC. 2001. Nonsteroidal Antiinflammatory Drugs and the Risk of Alzheimer’s Disease. N Engl J Med 345:1515–1521. doi:10.1056/NEJMoa010178

      Jagirdar R, Fu C-H, Park J, Corbett BF, Seibt FM, Beierlein M, Chin J. 2021. Restoring activity in the thalamic reticular nucleus improves sleep architecture and reduces Aβ accumulation in mice. Sci Transl Med 13:eabh4284. doi:10.1126/scitranslmed.abh4284

      Jiang H, Newman M, Lardelli M. 2018. The zebrafish orthologue of familial Alzheimer’s disease gene PRESENILIN 2 is required for normal adult melanotic skin pigmentation. PLOS ONE 13:e0206155. doi:10.1371/journal.pone.0206155

      Jiang H, Pederson SM, Newman M, Dong Y, Barthelson K, Lardelli M. 2020. Transcriptome analysis indicates dominant effects on ribosome and mitochondrial function of a premature termination codon mutation in the zebrafish gene psen2. PLOS ONE 15:e0232559. doi:10.1371/journal.pone.0232559

      Joo W, Vivian MD, Graham BJ, Soucy ER, Thyme SB. 2021. A Customizable Low-Cost System for Massively Parallel Zebrafish Behavioral Phenotyping. Front Behav Neurosci 14.

      Joubert L, Hanson B, Barthet G, Sebben M, Claeysen S, Hong W, Marin P, Dumuis A, Bockaert J. 2004. New sorting nexin (SNX27) and NHERF specifically interact with the 5-HT4a receptor splice variant: roles in receptor targeting. J Cell Sci 117:5367–5379. doi:10.1242/jcs.01379

      Leng Y, Ackley SF, Glymour MM, Yaffe K, Brenowitz WD. 2021. Genetic Risk of Alzheimer’s Disease and Sleep Duration in Non-Demented Elders. Ann Neurol 89:177–181. doi:10.1002/ana.25910

      Mitchell PB, Hadzi-Pavlovic D. 2000. Lithium treatment for bipolar disorder. Bull World Health Organ 78:515–517.

      Mittur A. 2011. Trazodone: properties and utility in multiple disorders. Expert Rev Clin Pharmacol 4:181–196. doi:10.1586/ecp.10.138

      Munoz-Torrero D. 2008. Acetylcholinesterase Inhibitors as Disease-Modifying Therapies for Alzheimer’s Disease. Curr Med Chem 15:2433–2455. doi:10.2174/092986708785909067

      Muto V, Koshmanova E, Ghaemmaghami P, Jaspar M, Meyer C, Elansary M, Van Egroo M, Chylinski D, Berthomier C, Brandewinder M, Mouraux C, Schmidt C, Hammad G, Coppieters W, Ahariz N, Degueldre C, Luxen A, Salmon E, Phillips C, Archer SN, Yengo L, Byrne E, Collette F, Georges M, Dijk D-J, Maquet P, Visscher PM, Vandewalle G. 2021. Alzheimer’s disease genetic risk and sleep phenotypes in healthy young men: association with more slow waves and daytime sleepiness. Sleep 44. doi:10.1093/sleep/zsaa137

      Myers-Turnbull D, Taylor JC, Helsell C, McCarroll MN, Ki CS, Tummino TA, Ravikumar S, Kinser R, Gendelev L, Alexander R, Keiser MJ, Kokel D. 2022. Simultaneous analysis of neuroactive compounds in zebrafish. doi:10.1101/2020.01.01.891432

      Owens MJ, Morgan WN, Plott SJ, Nemeroff CB. 1997. Neurotransmitter receptor and transporter binding profile of antidepressants and their metabolites. J Pharmacol Exp Ther 283:1305–1322.

      Özcan GG, Lim S, Leighton PL, Allison WT, Rihel J. 2020. Sleep is bi-directionally modified by amyloid beta oligomers. eLife 9:e53995. doi:10.7554/eLife.53995

      Quiroz YT, Schultz AP, Chen K, Protas HD, Brickhouse M, Fleisher AS, Langbaum JB, Thiyyagura P, Fagan AM, Shah AR, Muniz M, Arboleda-Velasquez JF, Munoz C, Garcia G, Acosta-Baena N, Giraldo M, Tirado V, Ramírez DL, Tariot PN, Dickerson BC, Sperling RA, Lopera F, Reiman EM. 2015. Brain Imaging and Blood Biomarker Abnormalities in Children With Autosomal Dominant Alzheimer Disease: A Cross-Sectional Study. JAMA Neurol 72:912–919. doi:10.1001/jamaneurol.2015.1099

      Relkin NR. 2007. Beyond symptomatic therapy: a reexamination of acetylcholinesterase inhibitors in Alzheimer’s disease. Expert Rev Neurother 7:735–748. doi:10.1586/14737175.7.6.735

      Rihel J, Prober DA, Arvanites A, Lam K, Zimmerman S, Jang S, Haggarty SJ, Kokel D, Rubin LL, Peterson RT, Schier AF. 2010. Zebrafish Behavioral Profiling Links Drugs to Biological Targets and Rest/Wake Regulation. Science 327:348–351. doi:10.1126/science.1183090

      Sleegers K, Brouwers N, Gijselinck I, Theuns J, Goossens D, Wauters J, Del-Favero J, Cruts M, van Duijn CM, Van Broeckhoven C. 2006. APP duplication is sufficient to cause early onset Alzheimer’s dementia with cerebral amyloid angiopathy. Brain J Neurol 129:2977–2983. doi:10.1093/brain/awl203

      Sun L, Zhou R, Yang G, Shi Y. 2017. Analysis of 138 pathogenic mutations in presenilin-1 on the in vitro production of Aβ42 and Aβ40 peptides by γ-secretase. Proc Natl Acad Sci 114:E476–E485. doi:10.1073/pnas.1618657114

      Szklarczyk D, Santos A, von Mering C, Jensen LJ, Bork P, Kuhn M. 2016. STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data. Nucleic Acids Res 44:D380–D384. doi:10.1093/nar/gkv1277

      Weggen S, Rogers M, Eriksen J. 2007. NSAIDs: small molecules for prevention of Alzheimer’s disease or precursors for future drug development? Trends Pharmacol Sci 28:536–543. doi:10.1016/j.tips.2007.09.004

      Wiltschko AB, Tsukahara T, Zeine A, Anyoha R, Gillis WF, Markowitz JE, Peterson RE, Katon J, Johnson MJ, Datta SR. 2020. Revealing the structure of pharmacobehavioral space through motion sequencing. Nat Neurosci 23:1433–1443. doi:10.1038/s41593-020-00706-3

      Yang T, Arslanova D, Gu Y, Augelli-Szafran C, Xia W. 2008. Quantification of gamma-secretase modulation differentiates inhibitor compound selectivity between two substrates Notch and amyloid precursor protein. Mol Brain 1:15. doi:10.1186/1756-6606-1-15

    1. eLife Assessment

      This study examined how multidimensional social relationships influence social attention in rhesus macaques, linking individual and group-level behaviors to attentional processes. The findings that oxytocin altered social attention and its relationship to both social tendencies and dyadic relationships are important, as recent technological advances allow for the exploration of neuronal activities and mechanisms in free-moving macaques. This work is convincing and will be of interest to those studying the interplay between social dynamics and information processing in primates.

    2. Reviewer #1 (Public review):

      Summary:

      This study aims to investigate the links between social behaviors observed in free-moving situations and behavioral performances measured in well-controlled, laboratory settings. The authors assessed general social tendencies and dyadic relationships among four monkeys in a group by scoring agonistic (aggression) and affiliative (grooming and proximity) behaviors in each pair. By measuring the saccadic reaction time in a classic social interference task, the authors reported that the monkeys with higher SEIs (i.e., more social individuals) were less distracted by the faces of other monkeys. These effects were enhanced when the distractors were out-group monkey faces rather than in-group ones. Lastly, oxytocin administration increased the impact of the out-group monkey faces in the social interference task, while reducing the magnitude of general social tendencies measured with SEI.

      Strengths:

      (1) The combination of behavioral data obtained in a colony room and in a laboratory environment is rare and important.<br /> (2) The evaluation of social interactions was successfully performed based on an automated target detection algorithm. The resulting multi-dimensional, complicated social interactions were summarized into simple indices (SEI and IEI). These indices provide a good measure for the social tendencies of each monkey.<br /> (3) Well-designed and robust experiments in the laboratory environment that are linked nicely with the general social tendencies observed in spontaneous behaviors.

      Weaknesses:

      (1) While the overall results are interesting, I am somewhat left confused about how to interpret the difference in the scores derived from different conditions. For example, the authors stated "Comparing the weights for in-group and out-group distractors, the effect of proximity was larger than that of aggression and grooming" in p.8. Does this mean that the proximity is indeed the type of behavior most affected in the out-group condition compared to the in-group condition? The out-group effects are difficult to examine with actual behavioral data, but some in-group effects such as those involving OT can be tested, which possibly provides good insights into interpreting the differences of the weights observed across the experimental conditions.

      (2) I think it is important to provide how variable spontaneous social interactions were across sessions and how impactful the variability of the interactions is on the SEI and IEI, as it helps to understand how meaningful the differences of weights are across the conditions, but such data are missing. In line with this point, although the conclusions still hold as those data were obtained during the same experimental periods, shouldn't the weights in Fig. 3f and Figs. 4g and 4h (saline) be expected to be similar, if not the same?

      Comments on revisions: I do not have further comments.

    3. Reviewer #2 (Public review):

      Summary:

      The study presents significant findings that elucidate the relationship between multi-dimensional social relationships and social attention in rhesus macaques. By integrating advanced computational methods, behavioral analyses, and neuroendocrine manipulation, the authors provide strong evidence for how oxytocin modulates attention within social networks. The results are robust and address critical gaps in understanding the dynamics of social attention in primates.

      Strengths:

      (1) The use of YOLOv5 for automatic behavioral detection is an exceptional methodological advance. The combination of automated analyses with manual validation enhances confidence in the data.<br /> (2) The study's focus on three distinct dimensions of social interaction (aggression, grooming, and proximity) is comprehensive and provides nuanced insights into the complexity of primate social networks.<br /> (3) The investigation of oxytocin's role adds a compelling neuroendocrine dimension to the findings, providing a bridge between behavioral and neural mechanisms.

      Weaknesses:

      (1) The study's conclusions are based on observations of only four monkeys, which limits the generalizability of the findings. Larger sample sizes could strengthen the validity of the results.<br /> (2) The limited set of stimulus images (in-group and out-group faces) may introduce unintended biases. This could be addressed by increasing the diversity of stimuli or incorporating a broader range of out-group members.

      Comments on revisions: I have no further comments!

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Weaknesses:<br /> (1) While the overall results are interesting, I am somewhat left confused about how to interpret the difference in the scores derived from different conditions. For example, the authors stated "Comparing the weights for in-group and out-group distractors, the effect of proximity was larger than that of aggression and grooming" in p.8. Does this mean that the proximity is indeed the type of behavior most affected in the out-group condition compared to the in-group condition? The out-group effects are difficult to examine with actual behavioral data, but some in-group effects such as those involving OT can be tested, which possibly provides good insights into interpreting the differences of the weights observed across the experimental conditions.

      Thank you for your thoughtful comments and for highlighting an important aspect of our findings. The statement on page 8 refers to the relative impact of different social behaviors (proximity, aggression, and grooming) on the derived weights for in-group and out-group distractors. Specifically, the data suggest that proximity exerts a stronger influence than aggression or grooming in differentiating the effects of out-group versus in-group distractors. Regarding the out-group condition, we acknowledge that it presents challenges for direct behavioral observation, as interactions involving out-group members are often more difficult to quantify in naturalistic settings. However, we agree with your suggestion to test certain in-group effects, particularly those influenced by oxytocin (OT), as they offer a more controlled framework to validate and interpret the observed differences in weights across experimental conditions. In line with this, we examined specific in-group behaviors under OT administration to disentangle their contributions to attentional dynamics (Fig. 4 and Fig. 5 e to h). By integrating controlled experimental manipulations, we think these results could provide deeper insights into how social relationships shape the observed patterns of attention.

      (2) I think it is important to provide how variable spontaneous social interactions were across sessions and how impactful the variability of the interactions is on the SEI and IEI, as it helps to understand how meaningful the differences of weights are across the conditions, but such data are missing. In line with this point, although the conclusions still hold as those data were obtained during the same experimental periods, shouldn't the weights in Fig. 3f and Figs. 4g and 4h (saline) be expected to be similar, if not the same?

      Thank you for your insightful comments. As highlighted, we utilized the entire experimental period as the dataset to evaluate the monkeys' social interactions. The experiments presented in Figures 3 and 4 were designed to examine how social relationships correlate with patterns of social attention under two distinct conditions: without manipulation (Fig. 3) and with nebulized exposure to oxytocin and saline (Fig. 4). Theoretically, the weights observed in the unmanipulated condition and the nebulized saline condition should be similar. However, our results indicate that distractor biases shifted significantly following nebulized saline exposure (Fig. 4) compared to the unmanipulated condition (Fig. 3) (MK: p = 9.3×10<sup>-3</sup>, ML: p = 9.77×10<sup>-4</sup>, MC: p = 9.77×10<sup>-4</sup>, MA: p = 0.09; n<sub>1</sub> = n<sub>2</sub> = 12 experimental days; Two-sided Wilcoxon signed-rank test). This suggests that the nebulization process itself, despite acclimating the monkeys to saline exposure for approximately two weeks prior to the experiments, still influenced their attentional behaviors.
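      For readers who want to see how a paired, two-sided Wilcoxon signed-rank p-value arises with 12 experimental days, here is a minimal pure-Python sketch of the exact test. It assumes no zero differences and no tied absolute differences; the daily differences are hypothetical, and whether the authors used the exact or a normal-approximation variant is not stated. As a point of arithmetic, an exact test with n = 12 gives 4/2<sup>12</sup> ≈ 9.77×10<sup>-4</sup> when only the smallest-ranked difference goes against the trend.

```python
from itertools import product


def exact_wilcoxon_two_sided(diffs):
    """Exact two-sided Wilcoxon signed-rank p-value.

    Assumes no zero differences and no ties among absolute differences.
    """
    n = len(diffs)
    # Rank absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Under the null hypothesis every rank is equally likely to be + or -.
    count = 0
    for signs in product((0, 1), repeat=n):
        rank_sum = sum(r for r, keep in zip(range(1, n + 1), signs) if keep)
        if rank_sum <= w:
            count += 1
    return min(1.0, 2 * count / 2 ** n)


# Hypothetical paired differences in distractor bias (saline minus unmanipulated)
# for 12 experimental days; only the smallest difference is negative.
diffs = [-0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
p_value = exact_wilcoxon_two_sided(diffs)
print(p_value)
```

      Because the exact null distribution is discrete, different subjects can share exactly the same p-value, as in the values reported above.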

      While the primary goal of nebulization was to assess the effects of oxytocin on social attention, our main conclusions remain robust, even considering the impact of nebulization on distractor biases. We acknowledge that variability in spontaneous social interactions across days or experimental sessions could be an important factor influencing the SEI and IEI. The dynamic nature of social interactions within the colony is likely affected by numerous variables. Future research will aim to integrate these factors into a more comprehensive and dynamic framework to better interpret their influence on social attention metrics.

      Reviewer #2 (Public review):

      Weaknesses:<br /> (1) The study's conclusions are based on observations of only four monkeys, which limits the generalizability of the findings. Larger sample sizes could strengthen the validity of the results.

      Thank you for your valuable comment. We acknowledge that the relatively small sample size could influence the generalizability of the findings. However, despite this limitation, our work systematically examined multifaceted social relationships among monkeys and their attentional strategies within a well-controlled experimental setup. We reported results across sessions and conditions (e.g., in-group vs. out-group; saline vs. oxytocin), which strengthens the reliability of the observed effects of social networks within this context. We agree that increasing the sample size would improve the generalizability of the results. Future studies with a larger cohort will be critical for confirming the robustness of our findings and expanding their broader applicability. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research with larger sample sizes to validate and extend our conclusions.

      (2) The limited set of stimulus images (in-group and out-group faces) may introduce unintended biases. This could be addressed by increasing the diversity of stimuli or incorporating a broader range of out-group members.

      Thank you for your thoughtful comment. We acknowledge that the use of a limited set of six monkey faces as stimuli for in-group and out-group conditions could potentially introduce biases. To address this concern, we conducted an additional analysis to minimize the potential impact of individual images on our findings using the current dataset. Specifically, we randomly excluded one in-group and one out-group image and reanalyzed distractor biases using the remaining two images per group (Supplementary Fig. 3a). For each subject, this approach generated three sets of two distractors per group, resulting in 81 (3<sup>4</sup>) combinations across four monkey subjects, and a total of 81 × 81 subject-distractor pairings. We statistically compared distractor biases between in-group and out-group faces for each combination (Supplementary Fig. 3b). As shown in Supplementary Fig. 3c, 99.30% of the 6,561 combinations demonstrated significantly lower distractor biases towards in-group faces compared to out-group faces (two-sided Wilcoxon signed-rank test, p < 0.05). These results suggest that the observed differences in social attention between in-group and out-group monkeys are unlikely to be driven by specific images within the stimulus set. That said, we agree that increasing the diversity of stimulus images or incorporating a broader range of out-group members would improve the generalizability of the results. We have acknowledged this limitation in the revised manuscript and highlighted the potential for further research to incorporate a more diverse stimulus set to validate and extend our findings.
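      The counting in this leave-one-image-out analysis can be reproduced with a short enumeration. The sketch below assumes three in-group and three out-group images per subject, so that dropping one image leaves three possible pairs per group. The image labels are hypothetical placeholders, and the per-combination statistical comparison is not reproduced here.

```python
from itertools import combinations, product

# Subject labels as used in the response; image labels are hypothetical.
subjects = ["MK", "ML", "MC", "MA"]
images_per_group = ["img1", "img2", "img3"]

# Dropping one of three images leaves C(3, 2) = 3 possible pairs.
pairs = list(combinations(images_per_group, 2))

# One pair chosen independently for each of the four subjects: 3**4 sets.
in_group_sets = list(product(pairs, repeat=len(subjects)))
out_group_sets = list(product(pairs, repeat=len(subjects)))

# Every in-group set is compared against every out-group set.
n_pairings = len(in_group_sets) * len(out_group_sets)

print(len(pairs), len(in_group_sets), n_pairings)
```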

      “However, these conclusions may be constrained by the relatively small sample size and the homogeneity of stimulus set in the study. Future research focusing on larger, more diverse cohorts and incorporating a broader range of stimuli will enhance the generalizability and applicability of the findings.”

      Reviewer #1 (Recommendations for the authors):

      It is difficult to distinguish "Getting fighted" and "Fighting partner" in Fig. 1b (esp. when printed). I thought Actor showed "Fighting partner" several times in Session 2, but it seems to be "Getting fighted" judging from Figs. 1c and 1d. Is this correct? If so, I would suggest to change the color to improve visibility.

      Thank you for your valuable comment. We apologize for the confusion in the previous version. To improve clarity, we have changed both terms to “begin fighting” and “being fought”. As shown in Figure 1b, we now explicitly define the identities of the two monkeys as the actor (K) and the partner (L), with all behaviors described from the perspective of the actor. For example, when the actor (K) initiates the fight, it is marked as “begin fighting”, whereas when the partner (L) initiates the fight, the actor (K) is the recipient and labeled as “being fought”. Additionally, we have implemented your suggestion by changing the colors to enhance visibility, especially for the terms “begin fighting” and “being fought”.

      Reviewer #2 (Recommendations for the authors): 

      I have some minor concerns:

      (1) Figure1B, caption for x axis is missing, 4 means 4 days?

      Thank you so much for the comment. We have clarified the x-axis in Figure 1B, where the label "4" corresponds to 4 hours of video typing on each experimental day. The revised figure now includes the appropriate label for better clarity. We appreciate your careful attention to this detail.

      (2) I am slightly concerned about animal safety. How do the experimenters ensure the animals' safety and well-being in cases of aggressive interactions or attacks?

      Thank you for your comment. We share your concern regarding animal safety and take the well-being of the monkeys in the study seriously. All experimental procedures were reviewed and approved by the Institutional Animal Care and Use Committee at the Institute of Biophysics, Chinese Academy of Sciences (IBP-NHP-002(22)). The monkeys were housed together in the same colony room for over four years, in interconnected cages that allowed for direct physical interaction. Animal behaviors in cages were closely monitored via a live video system to ensure their safety. To prevent potential injuries, a sliding partition system was in place, enabling the isolation of individual animals when necessary, minimizing risks to their well-being.

    1. eLife Assessment

      This important study combines convincing evolution experiments with molecular and genetic techniques to study how a genetic lesion in MreB that causes rod-shaped cells to become spherical, with concomitant deleterious fitness effects, can be rescued by natural selection. The detailed mechanistic investigation increases our understanding of how mreB contributes to cell wall synthesis and shows how compensatory mutations may reestablish its homogeneity.

    2. Reviewer #1 (Public review):

      Summary:

      The authors performed experimental evolution of MreB mutants that have a slow-growing round phenotype and studied the subsequent evolutionary trajectory using analysis tools from molecular biology. Remarkably, they found that the original phenotype was not restored (the most common outcome in such studies) but that the round phenotype was maintained.

      Strengths:

      The finding that the round phenotype was maintained during evolution, rather than the original rod-shaped phenotype being recovered, is interesting. The paper extensively investigates what happens during adaptation using a variety of techniques. The extensive discussion of the findings at the end of the paper is also well thought through and insightful.

    3. Reviewer #3 (Public review):

      This paper addresses a long-standing problem in microbiology: the evolution of bacterial cell shape. Bacterial cells can take a range of forms, among the most common being rods and spheres. The consensus view is that rods are the ancestral form and spheres the derived form. The molecular machinery governing these different shapes is fairly well understood but the evolutionary drivers responsible for the transition between rods and spheres are not. Enter Yulo et al.'s work. The authors start by noting that deletion of a highly conserved gene called MreB in the Gram-negative bacterium Pseudomonas fluorescens reduces fitness but does not kill the cell (as happens in other species like E. coli and B. subtilis) and causes cells to become spherical rather than their normal rod shape. They then ask whether evolution for 1000 generations restores the rod shape of these cells when propagated in a rich, benign medium.

      The answer is no. The evolved lineages recovered fitness by the end of the experiment, growing just as well as the unevolved rod-shaped ancestor, but remained spherical. The authors provide an impressively detailed investigation of the genetic and molecular changes that evolved. Their leading results are:

      (1) The loss of fitness associated with MreB deletion causes high variation in cell volume among sibling cells after cell division;<br /> (2) Fitness recovery is largely driven by a single, loss-of-function point mutation that evolves within the first ~250 generations that reduces the variability in cell volume among siblings;<br /> (3) The main route to restoring fitness and reducing variability involves loss of function mutations causing a reduction of TPase and peptidoglycan cross-linking, leading to a disorganized cell wall architecture characteristic of spherical cells.

      The inferences made in this paper are on the whole well supported by the data. The authors provide a uniquely comprehensive account of how a key genetic change leads to gains in fitness and the spectrum of phenotypes that are impacted and provide insight into the molecular mechanisms underlying models of cell shape.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      We made a serious effort to address the reviewers’ comments. If we have come up short, then let this be stated and explained in the eLife review. But we would be grateful if you did not include in the revised eLife review comments that were corrected / addressed last time – unless of course there is disagreement, or if our response was unsatisfactory. If either of the latter, then please explain and we will respond.

      As to the exceptionally minor issue, namely correction for multiple statistical tests (minor because the data and the error are presented in the text), we have now conducted one-way ANOVA to back the data displayed in Fig. 4A and Supp. Figs 19 and 21. In each case ANOVA revealed a highly significant difference among means; Dunnett’s post hoc test was then used to test each result against SBW25, with the multiple comparisons corrected for in the analysis.

      This resulted in changes to the description of the statistical analysis in the following captions:

      To Figure 4.

      Where we previously referred to paired t-tests we now state:  ANOVA revealed a highly significant difference among means [F<sub>7,16</sub> = 8.19, p < 0.001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that five genotypes (*) differ significantly (p < 0.05) from SBW25.

      To Supplementary Figure 19.

      Where we previously referred to paired t-tests we now state: ANOVA revealed a highly significant difference among means [F<sub>7,16</sub> = 16.74, p < 0.001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that three genotypes (*) differ significantly (p < 0.05) from SBW25.

      To Supplementary Figure 21.

      Where we previously referred to paired t-tests we now state:  ANOVA revealed a highly significant difference among means [F<sub>7,89</sub> = 9.97, p < 0.0001] with Dunnett’s post-hoc test adjusted for multiple comparisons showing that SBW25 ∆mreB and SBW25 ∆PFLU4921-4925 are significantly different (*) from SBW25 (p < 0.05).
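      To make the reported degrees of freedom easier to read, the following minimal sketch computes a one-way ANOVA F statistic in plain Python. The subscript in F<sub>7,16</sub> corresponds to k − 1 = 7 (eight groups) and N − k = 16 (24 observations in total). Three replicates per genotype is an assumption used here for illustration, and all values are hypothetical. Dunnett’s post-hoc comparison against the control is not reproduced; recent SciPy releases provide it as `scipy.stats.dunnett`.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements: a control plus seven genotypes, three replicates each.
groups = [
    [1.00, 1.02, 0.98],  # control (e.g. the wild-type reference)
    [0.70, 0.74, 0.72],
    [0.95, 0.99, 1.01],
    [0.80, 0.83, 0.78],
    [0.97, 1.00, 1.03],
    [0.85, 0.88, 0.86],
    [0.90, 0.93, 0.91],
    [0.75, 0.79, 0.77],
]
f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between},{df_within}) = {f_stat:.2f}")
```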

    1. eLife Assessment

      This study reveals a novel mechanism of glutamine synthetase (GS) regulation in Methanosarcina mazei, demonstrating that 2-oxoglutarate (2-OG) directly promotes GS activity by stabilizing its dodecameric assembly. Using mass photometry, activity assays, and cryo-electron microscopy, the authors show that GS transitions from a dimeric, inactive form at low 2-OG concentrations to a fully active dodecameric complex at saturating 2-OG levels, highlighting 2-OG as a key effector in C/N sensing. The findings are valuable, supported by solid data, and provide new insights into archaeal GS regulation, though further clarification of interactions with known partners like GlnK1 and sP26 is needed.

    2. Reviewer #1 (Public review):

      Summary:

      Shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Strengths:

      Mass photometry reveals the dynamic effect of 2-OG on the oligomerization state of GS. Single-particle cryo-EM reveals the mechanism of 2-OG-mediated dodecamer formation.

      Weaknesses:

      It is not entirely clear how very high 2-OG concentrations activate GS beyond dodecamer formation.

      In the revised version, most of my concerns were adequately addressed. In the summary it is stated that glutamine acts as an allosteric inhibitor of dodecameric GS. This is not correct: glutamine binds to the active site and is therefore not allosteric. This kind of feedback inhibition is a type of product inhibition.

    3. Reviewer #2 (Public review):

      Summary:

      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local sidechain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation examines the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

      Comments on revisions:

      My comments have been addressed adequately.

      I recognize that determining the structure of the GS complex bound to ATP and/or other ligands would enhance this study by offering a more comprehensive understanding of 2-oxoglutarate-mediated dodecameric assembly and activation. However, I accept the authors' explanation for not including this aspect in the current work.

    4. Reviewer #3 (Public review):

      The current manuscript investigates the effect of 2-oxoglutarate (2OG) as a modulator of glutamine synthetase (GS). To do this, the authors rely on mass photometry, specific activity measurements and single-particle cryo-EM data. From the results, the authors conclude that the GS from Methanosarcina mazei shifts from a dimeric, non-active state under low concentrations of 2OG, to a dodecameric and fully active complex at saturating concentrations of 2OG.

      GS is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaeal organisms, but the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      A novel role for 2OG is presented for this mesophilic methanoarchaeon, as a crucial effector for the enzyme oligomerization and full reactivity.

      The conclusions of this paper are mostly well supported by data, but some aspects of this GS regulation and interaction with known partners like GlnK1 and sP26 need to be clarified and extended.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Mass photometry and single-particle cryo-EM structure analysis convincingly show the direct regulation of GS activity by 2-OG-promoted formation of the dodecameric structure of GS. The previously recognized small proteins GlnK1 and sP26 seem to play a subordinate role in GS regulation, which is in good agreement with previous data. Although these data are quite clear now, there remains one major open question: how does 2-OG further increase GS activity once the full dodecameric state is achieved (at 5 mM)? This point needs to be reconsidered.

      Weaknesses:

      It is not entirely clear how very high 2-OG concentrations activate GS beyond dodecamer formation.

      The data presented in this work are in stark contrast to the previously reported structure of M. mazei GS by the Schumacher lab. This is very confusing for the scientific community and requires clarification. The discussion should consider possible reasons for the contradictory results.

      Importantly, it is puzzling how Schumacher could obtain an apo-structure of dodecameric GS. If 2-OG is necessary for dodecamer formation, this should be discussed. If GlnK1 doesn't form a complex with the dodecameric GS, how could such a complex be resolved there?

      In addition, the text is in principle clear but could be improved by professional editing. Most obviously there is insufficient comma placement.

      We thank Reviewer #1 for the professional evaluation and raising important points. We will address those comments in the updated manuscript and especially improve the discussion in respect to the two points of concern.

      (1) How can GlnA1 activity be stimulated further by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?

      We assume a two-step requirement for 2-OG: the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require the presence of 2-OG in all 2-OG-binding pockets: 2-OG binding to one binding pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Due to the introduction of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG-binding sites being occupied. With higher 2-OG concentrations present (> 5 mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.

      (2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.

      We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we have performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that a dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.

      The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.

      Reviewer #2 (Public Review):

      Summary:

      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local side-chain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation studies the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

      We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.

      (1) The discrepancy between the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG). We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site (see also Reviewer #1 R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.

      (2) The lack of the structure of a 2-OG and ATP-bound GlnA1. Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (3) The observed GlnA1-filaments are an interesting finding. We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.

      Reviewer #3 (Public Review):

      Summary:

      The current manuscript investigates the effect of 2-oxoglutarate and the GlnK1 protein as modulators of the enzymatic reactivity of glutamine synthetase. To do this, the authors rely on mass photometry, specific activity measurements, and single-particle cryo-EM data.

      From the results obtained, the authors convey that glutamine synthetase from Methanosarcina mazei exists in a non-active monomeric/dimeric form under low concentrations of 2-oxoglutarate, and its oligomerization into a dodecameric complex is triggered by higher concentration of 2-oxoglutarate, also resulting in the enhancement of the enzyme activity.

      Strengths:

      Glutamine synthetase is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaea organisms, while the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      The role of 2-OG is here highlighted as a crucial effector for enzyme oligomerization and full reactivity.

      Weaknesses:

      Various opportunities to enhance the current state-of-the-art were missed. In particular, omissions of the ligand-bound state of GlnK1 leave unexplained the lack of its interaction with GS (in contradiction with previous results from the authors). A finer dissection of the effect and role of 2-oxoglutarate is missing and important questions remain unanswered (e.g. are dimers relevant during early stages of the interaction, or why do previous GS dodecameric structures not show 2-oxoglutarate).

      We thank Reviewer #3 for the expert evaluation and inspiring criticism.

      (1) Encouragement to examine ligand-bound states of GlnK1. We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. In GlnA1 activity test assays when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (2) The exact role of 2-OG could have been dissected much better. We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.

      (3) The lack of studies on dimers. This is actually an interesting point, which we did not consider during writing the manuscript. Now, re-analysing all our MP data in this respect, GlnA1 is likely a dimer as smallest species. Consequently, we will add more supplementary data which supports this observation and change the text accordingly.

      (4) Previous studies and structures did not show the 2-OG. We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain the oxidative part of the TCA cycle exclusively for the generation of glutamate (anabolism) but not a closed TCA cycle enabling them to use internal 2-OG concentration as internal signal for nitrogen availability. In the case of bacterial GS from organisms with a closed TCA cycle used for energy metabolism (oxidation of acetyl CoA) like e.g. E. coli, the formation of an active dodecameric GS form underlies another mechanism independent of 2-OG. In case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure is probably a result from the heterologous expression and purification from E. coli. (See also Reviewer #1 R.2). One example of methanoarchaeal glutamine synthetases that do in fact contain the 2-OG in the structure, is Müller et al. 2023.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific issues:

      L 141: 2-OG levels increase due to slowing GOGAT reaction (due to Gln limitation as a consequence of N-starvation).... (2-OG also increases in bacteria that lack GDH...)

      As the GS-GOGAT cycle is the major route of ammonium assimilation, consumption of 2-OG by GDH is probably only relevant under high ammonium concentrations.

      In Methanoarchaea, GS is strictly regulated and its expression is strongly repressed under nitrogen sufficiency. Thus, under N sufficiency, glutamate for anabolism is mainly generated by GDH, consuming 2-OG delivered by the oxidative part of the TCA cycle (methanogenesis is the energy metabolism in methanoarchaea; a closed TCA cycle is not present). Consequently, 2-OG increases under nitrogen limitation, when no NH3 is available for GDH.

      L148: it is not clear what is meant by: "and due to the indirect GS activity assay"

      We apologize for not being clear here. The GS activity assay used is the classical assay by Shapiro & Stadtman 1970 and is a coupled optical test assay (coupling the ATP consumption of the GS activity to the oxidation of NADH by lactate dehydrogenase). Because of the coupled test assay, the measurements of low activities show a high deviation. We have now added this information to the revised MS.

      L 177: arguing about 2-OG affinities: more precisely, the 0.75 mM 2-OG is the EC50 concentration of 2-OG for triggering dodecameric formation; it might not directly reflect the total 2-OG affinity, since the affinity may be modulated by (anti)cooperative effects, or by additional sites... as there may be different 2-OG binding sites involved... (same in line 201)

      Thank you for the valuable input. We changed KD to EC50 within the entire manuscript. Concerning possible additional 2-OG binding sites: we did not see any other 2-OG in the cryo-EM structure aside from the described one and we therefore assume that the one described in the manuscript is the main and only one. Considering the high amounts of 2-OG (12.5 mM) used in the structure, it is quite unlikely that additional 2-OG sites exist since they would have unphysiologically low affinities.

      In this respect, instead of the rather poor assay shown in Figure 1D, a more detailed determination of catalytic activation by different 2-OG concentrations should be done (similar to 1A)... This would allow a direct comparison between dodecamerization and enzymatic activation.

      We agree and performed the respective experiments, which are now presented in the revised Fig. 1D.

      Discussion: the role of 2-OG as a direct activator, comparison with other prokaryotic GS: in other cases, 2-OG affects GS indirectly by being sensed by PII proteins or other 2-OG sensing mechanisms (like 2OG-NtcA-mediated repression of IF factors in cyanobacteria)

      We agree and have added that information in the discussion as suggested.

      290. Unclear: As a second step of activation, the allosteric binding of 2-OG causes a series of conformational.... where is this site located? According to the catalytic effects (compare 1A and 1D) this site should have a lower affinity …

      Thank you very much for pointing this out. Binding of 2-OG only occurs in one specific allosteric binding site. Binding, however, has two effects on GlnA1: dodecamer assembly and priming of the active site (with two specific EC50 values, which are now shown in Fig. 1A and D).

      See also public comment #1 (1).
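
      To illustrate how a single binding site could give two different EC50 values (one for assembly, one for activity), here is a minimal sketch using a Hill-type dose-response function. Only the 0.75 mM assembly EC50 is taken from the reviewer comment above; the activity EC50 and the Hill coefficient are hypothetical placeholders, and the function name `hill_fraction` is an assumption made for illustration. This is not the authors' fitting procedure.

```python
def hill_fraction(conc_mM, ec50_mM, hill_n=1.0):
    """Fractional response (0 to 1) of a Hill-type dose-response curve."""
    if conc_mM < 0 or ec50_mM <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    if conc_mM == 0:
        return 0.0
    return conc_mM ** hill_n / (ec50_mM ** hill_n + conc_mM ** hill_n)


EC50_ASSEMBLY_MM = 0.75  # quoted in the review for dodecamer formation
EC50_ACTIVITY_MM = 3.0   # hypothetical value, for illustration only
HILL_N = 2.0             # hypothetical cooperativity, for illustration only

for conc in (0.0, 0.75, 5.0, 12.5):
    assembly = hill_fraction(conc, EC50_ASSEMBLY_MM, HILL_N)
    activity = hill_fraction(conc, EC50_ACTIVITY_MM, HILL_N)
    print(f"{conc:5.2f} mM 2-OG: assembly {assembly:.2f}, activity {activity:.2f}")
```

      With a lower EC50 for assembly than for activity, the assembly curve approaches saturation at concentrations where the activity curve is still rising, which is the qualitative pattern discussed in the response.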

      Reviewer #2 (Recommendations For The Authors):

      The primary concern for me is that mass photometry might lead to incorrect conclusions. The differences in the forms of GS seen in SEC and MP suggest that GS can indeed form a stable dodecamer when the concentration of GS is high enough, as shown in Figure S1B. I strongly suggest using an additional biophysical method to explore the connection between GS and 2-OG in terms of both assembly and activity, to truly understand 2-OG's role in the process of assembly and catalysis.

      We apologize if we did not present this clearly enough. However, the MP analysis of GlnA1 in the absence of 2-OG always showed (monomers/)dimers; dodecamers were only present in the presence of 2-OG. The SEC analysis in Fig. S1B was performed in the presence of 12.5 mM 2-OG; we realized this information was missing in the figure legend and have now added it in the revised version. In addition, the 2-OG is visible in the cryo-EM structure. Thus, we do not agree that additional biophysical methods are required.

      As for the other experimental findings, they appear satisfactory to me, and I have no reservations regarding the cryoEM data.

      (1) Mass photometry is a fancy technique that uses only a tiny amount of protein to study how they come together. However, the concentration of the protein used in the experiment might be lower than what's needed for them to stick together properly. So, the authors saw a lot of single proteins or pairs instead of bigger groups. They showed in Figure S1B that the M. mazei GS came out earlier than a 440-kDa reference protein, indicating it's actually a dodecamer. But when they looked at the dodecamer fraction using mass photometry, they found smaller bits, suggesting the GS was breaking apart because the concentration used was too low. To fix this, they could try using a technique called analytic ultracentrifuge (AUC) with different amounts of 2-OG to see if they can spot single proteins or pairs when they use a bit more GS. They could also try another technique called SEC-MALS to do similar tests. If they do this, they could replace Figure 1A with new data showing fully formed GS dodecamers when they use the right amount of 2-OG.

      Thank you for this input. In MP we looked at dodecamer formation after removing the 2-OG entirely and re-adding it in the respective concentration. We think that GlnA1 is much more unstable in its monomeric/dimeric fraction and that the complete and harsh removal of 2-OG results in some dysfunctional protein which does not recover the dodecameric conformation after dialysis and re-addition of 2-OG. Looking at the dodecamer-peak right after SEC however, we exclusively see dodecamers, which is now included as an additional supplementary figure (suppl. Fig. 1C). Consequently, we did not perform additional experiments.

      (2) Building on the last point, the estimated binding strength (Kd) between 2-OG and GS might be lower than it really is, because the GS often breaks apart from its dodecameric form in this experiment, even though 2-OG helps keep the pairs together, as seen with cryoEM. What if they used 5-10 times more GS in the mass photometry experiment? Would the estimated bond strength stay the same? Could they use AUC or other techniques like ITC to find out the real, not just estimated, strength of the bond?

      We agree that the term KD is not suitable. We have changed the term KD to EC50 as suggested by reviewer #1, which describes the effective concentration required for 50 % dodecamer assembly. Furthermore, we disagree that the dodecamer breaks apart when the concentrations are as low as in MP experiments. The actual reason for the breaking is rather the harsh dialysis to remove all 2-OG before MP experiments. Right after SEC, we exclusively see dodecamers in MP (suppl. Fig. S1C). See also #2 (1).

      (3) The fact that the GS hardly works without 2-OG is interesting. I tried to understand the experiment setup, but it wasn't clear as the protocol mentioned in the author's 2021 FEBS paper referred to an old paper from 1970. The "coupled optical test assay" they talked about wasn't explained well. I found other papers that used phosphometry assays to see how much ATP was used up. I suggest the authors give a better, more detailed explanation of their experiments in the methods section. Also, it's unclear why the GS activity keeps going up from 5 to 12.5 mM 2-OG, even though they said it's saturated. They suggested there might be another change happening from 5 to 12.5 mM 2-OG. If that's the case, they should try to get a cryo-EM picture of the GS with lots of 2-OG, both with and without ATP/glutamate (or the Met-Sox-P-ADP inhibitor), to see what's happening at a structural level during this change caused by 2-OG.

      We agree with the reviewer that the GS assay was not explained in detail (since it has been published and known for several years). However, we have now added a more detailed description of the assay to the revised MS. The assay also measures the ATP used up by GS, but couples the generation of ADP to an optical test: pyruvate kinase present in the assay produces pyruvate from PEP using the generated ADP. The pyruvate is finally reduced to lactate by lactate dehydrogenase, consuming NADH, the oxidation of which is monitored at 340 nm.
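
      For readers unfamiliar with coupled optical assays, the following minimal sketch shows how a rate of absorbance decrease at 340 nm can be converted into a specific activity in U/mg. It assumes the standard NADH extinction coefficient of 6.22 mM⁻¹ cm⁻¹ and a 1:1 stoichiometry between ATP consumed and NADH oxidized; the function name, assay volume, and protein amount are hypothetical and not taken from the manuscript.

```python
NADH_EXT_COEFF_PER_MM_CM = 6.22  # extinction coefficient of NADH at 340 nm


def specific_activity_u_per_mg(delta_a340_per_min, assay_volume_ml,
                               protein_mg, path_length_cm=1.0):
    """Convert an A340 decrease per minute into U/mg (1 U = 1 umol/min)."""
    if assay_volume_ml <= 0 or protein_mg <= 0 or path_length_cm <= 0:
        raise ValueError("volume, protein amount and path length must be > 0")
    # Beer-Lambert law: mM per min, which equals umol per mL per min
    nadh_mm_per_min = delta_a340_per_min / (
        NADH_EXT_COEFF_PER_MM_CM * path_length_cm)
    # Total NADH oxidized per minute in the cuvette (umol/min)
    umol_per_min = nadh_mm_per_min * assay_volume_ml
    # One NADH oxidized per ADP formed, i.e. per ATP consumed by GS
    return umol_per_min / protein_mg


# Hypothetical reading: 0.0622 A340 units per minute, 1 mL assay, 10 ug enzyme
print(specific_activity_u_per_mg(0.0622, assay_volume_ml=1.0, protein_mg=0.01))
```

      Because small absorbance slopes are divided by small protein amounts, noise in the slope is amplified at low activities, which is consistent with the high deviation the authors mention for such measurements.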

      The still increasing activity of GS after dodecamer formation (max. at 5 mM 2-OG) and the continuously increasing enzyme activity (max. at 12.5 mM 2-OG): See also public reviews, we assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site.

      The suggested additional experiments with and without ATP/Glutamate: Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (4) Please remake Figure S2, the panels are too small to read the words. At least I have difficulty doing so.

      We assume the reviewer is referring to Suppl. Fig. S3; we have now changed this figure accordingly.

      Line 153, the reference Schumacher et al. 23, should be 2023?

      Yes, thank you. We corrected that.

      Line 497. I believe it's UCSF ChimeraX, not Chimera.

      We apologize and corrected accordingly.

      Reviewer #3 (Recommendations For The Authors):

      Recent studies on the Methanothermococcus thermolithotrophicus glutamine synthetase, published by Müller et al., 2024, have identified the binding site for 2-oxoglutarate as well as the conformational changes that were induced in the protein by its presence. In the present study, the authors confirm these observations and additionally establish a link between the presence of 2-oxoglutarate and the dodecameric fold and full activation of GS.

      Curiously, here, the authors could not confirm their own findings that the dodecameric GS can directly interact with the PII-like GlnK1 protein and the small peptide sP26. However, the lack of mention of the GlnK-bound state in these studies is very alarming since it certainly is highly relevant here.

      We agree with the reviewer that we have not observed the interaction with GlnK1 and sP26 in the recent study. Consequently, we speculate that yet unknown cellular factor(s) might be required for an interaction of GlnA1 with GlnK1 and sP26, which were not present in the in vitro experiments using purified proteins, however they were present in the previous pull-down approaches (Ehlers et al. 2005, Gutt et al. 2021). Another reason might be that post-translational modifications occur in M. mazei, which might be important for the interaction, which are also not present in purified proteins expressed in E. coli.

      The interest of the manuscript could have been substantially increased if the authors had done finer biochemical and enzymatic analyses of the oligomerization process of GS, used GlnK1 bound to known effectors in their assays, and made more effort to extrapolate their findings (even if to a small niche) to related glutamine synthetases.

      We thank the reviewer for their valuable encouragement to explore ligand-bound-states of GlnK1. However, in this manuscript we mainly focused on 2-OG as activator of GlnA1 and decided to dedicate future experiments to the exploration of conditions that possibly favor GlnK1-binding.

      In principle, we have explored the ATP bound GlnK1 effects on GlnA1 activity in the activity assays (Fig. 2E) since ATP (3.6 mM) is present. GlnK1 however showed no effects on GlnA1 activity.

      In general, the manuscript is poorly written, with grammatically incorrect sentences that at times stand in the way of conveying the message of the manuscript.

      Particular points:

      (1) It is mentioned that 2-OG induces the active oligomeric (dodecamer, 12-mer) state of GlnA1 without detectable intermediates. However, only 62 % of the starting inactive enzyme yields active 12-mers. Note that this is contradicted in line 212.

      Thanks for pointing out this discrepancy. After removing all 2-OG, as we did before the MP experiments, GlnA1 no longer assembles fully into dodecamers when 2-OG is re-added. This is not because the amount of 2-OG is insufficient to trigger full assembly, but because the protein is much more unstable in the absence of 2-OG, so we predict that some GlnA1 breaks down during dialysis. See also our answer to reviewer #2 (1) and supplementary figure S1C.

      Is there any protein precipitation upon the addition of 2-OG? Is all protein being detected in the assay, meaning, do the monomer/dimer + dodecamer yields add up to close to 100% of the total enzyme in the assay?

      There is no protein precipitation upon the addition of 2-OG; indeed, GlnA1 is much more stable in the presence of 2-OG. In the mass photometry experiments, all particles are measured, so precipitated protein would be visible as large entities in the MP.

      Please add to Figure 1 the amount of monomer/dimer during titration. Some debate why there is no full conversion should be tentatively provided.

      We agree with the reviewer and have included the amount of monomer/dimer in the figure, as well as some discussion on why the protein is not fully converted again. GlnA1 is unstable without 2-OG, and it was dialysed against buffer without 2-OG before the MP measurements. This sample mistreatment prevented full re-assembly after re-adding 2-OG (although full dodecamers were present before dialysis; suppl. Fig. S1C).

      (2) Figure 1B reflects an exemplary result. Here, the addition of 0.1 mM 2-OG seems to promote monomer to dimer transition. Why was this not studied in further detail? It seems highly relevant to know from which species the dodecamer is assembled.

      We thank the reviewer for their comment. However, we would like to point out that, although not shown in the figure, the smallest entity of GlnA1 is always mainly the dimer. As suggested earlier, we have added the amount of monomers/dimers to Figure 1A, which shows low monomer counts at all 2-OG concentrations. Although it is not depicted in the graph, which starts at 0.01 mM 2-OG, we also see mainly dimers at 0 mM 2-OG.

      How does the y-axis compare to the number and percentage of counts assigned to the peaks? In line 713, it is written that the percentage of dodecamer considers the total number of counts, and this was plotted against the 2-OG concentration.

      We thank the reviewer for addressing this lack of clarity. Line 713 corresponds to Figure 1A, where we indeed plotted the percentage of dodecamer against the 2-OG concentration. There, the percentage of dodecamer corresponds to the percentage calculated from the Gaussian fit of the MP dodecamer peak. In Figure 1B, however, the y-axis displays the relative amount of counts per mass; multiple similar masses then add up to the percentage of the respective peak (Gaussian fit over similar masses).

      (3) Lines 714 and 721 (and elsewhere): Why is only partial data used for statistical purposes?

      We in general only show one exemplary biological replicate, since the quality of the respective GlnA1 purification sometimes varied (maximum activity ranging from 5 - 10 U/mg). Therefore, we only compared activities within the same protein purification. For the EC50 calculations of all measurements, we refer to the supplement.

      (4) Lines 192-193: It is claimed that GlnK1 was previously shown to both regulate the activity of GlnA1 and form a complex with GlnA1. Please mention the ratio between GlnK1 and GlnA1 in this complex.

      We have now included the requested information (GlnA1:GlnK1 1:1, Ehlers et al. 2005; His6-GlnA1 (0.95 μM), His6-GlnK1 (0.65 μM), 2:1.4, Gutt et al. 2021).

      It is also known that PII proteins such as GlnK1 can bind ADP, ATP, and 2-OG. Interestingly, however, for various described PII proteins, 2-OG can only bind after the binding of ATP.

      So, the crucial question here is what is the binding state of GlnK1? 

      Were these assays performed in the absence of ATP? This is key to fully understand and connect the results to the previous observations. For example, if the GlnK1 used was bound to ADP but not to ATP, then the added 2-OG might indeed only be able to affect GlnA1 (leading to its activation/oligomerization). If this were true and according to the data reported, ADP would prevent GlnK1 from interacting with any oligomeric form of GlnA1. However, if GlnK1 bound to ATP is the form that interacts with GlnA1 (potentially validating previous results?) then, 2-OG would first bind to GlnK1 (assuming a higher affinity of 2-OG to GlnK1), eventually causing its release from GlnA1 followed by binding and activation of GlnA1.

      These experiments need to be done as they are essential to further understand the process. Given the ability of the authors to produce the protein and run such assays, it is unclear why they were not done here. As written in line 203, in this case, "under the conditions tested" is not a good enough statement, considering what is known in the field and how many more conclusions could easily be taken from such a setup.

      Thanks for the encouragement to investigate the ligand-bound states of GlnK1. We agree and plan to perform the suggested mass photometry experiments exploring the conditions under which GlnA1 and GlnK1 might interact in future work. In GlnA1 activity test assays, when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (5) Figure 2D legend claims that the graphic shows the percentage of dodecameric GlnA1 as a function of the concentration of 2-OG. This is not what the figure shows; Figure 2D shows the dodecamer/dimer (although legend claims monomer was used, in line 732) ratio as a function of 2-OG (stated in line 736!). If this is true, a ratio of 1 means 50 % of dodecamers and dimers co-exist. This appears to be the case when GlnK1 was added, while in the absence of GlnK1 higher ratios are shown for higher 2-OG concentration implying that about 3 times more dodecamers were formed than dimers. However, wouldn't a 50 % ratio be physiologically significant?

      We apologize for the partially incorrect and also misleading figure legend and corrected it. Indeed, the ratio of dodecamers and dimers is shown. Furthermore, we did not use monomeric GlnA1 (the smallest entity is mainly a dimer, see Fig 1A), however, the molarity was calculated based on the monomer-mass. Concerning the significance of the difference between the maximum ratio of GlnA1 and GlnK1: The ratio does appear higher, but this is mostly because adding large quantities of GlnK1 broadens all peaks at low molecular weight. This happens because the GlnK1 signal starts overlapping with the signal from GlnA1, leading to inflated GlnA1 dimer counts. We therefore do not think that this is biologically significant, especially as the activities do not differ under these conditions.
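To make the ratio discussed in point (5) concrete, the following is a minimal sketch of how a dodecamer/dimer ratio maps onto percentages. It assumes that only dimers and dodecamers are present and that the ratio is a ratio of particle counts; the function names are hypothetical and are not taken from the manuscript.

```python
def dodecamer_fraction_by_particles(ratio: float) -> float:
    """Fraction of counted particles that are dodecamers.

    `ratio` is the dodecamer/dimer particle-count ratio (assumed definition).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


def dodecamer_fraction_by_protomers(ratio: float,
                                    dodecamer_size: int = 12,
                                    dimer_size: int = 2) -> float:
    """Fraction of protomers (monomer units) that sit in dodecamers.

    Each dodecamer contributes 12 protomers and each dimer 2, so the same
    particle ratio corresponds to a larger share of the protein mass.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    in_dodecamers = dodecamer_size * ratio
    return in_dodecamers / (in_dodecamers + dimer_size)


if __name__ == "__main__":
    for r in (1.0, 3.0):
        print(r,
              dodecamer_fraction_by_particles(r),
              dodecamer_fraction_by_protomers(r))
```

Under these assumptions, a ratio of 1 corresponds to 50 % of particles being dodecamers, and a ratio of 3 to 75 %. Because molarity in the manuscript is calculated from the monomer mass, the per-protomer view may be the more relevant one when comparing with activity data.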

      (6) Is it possible that the uncleaved GlnA1 tag is preventing interaction with GlnK1? This should be discussed.

      This is of course a very important point. We however realized that Schumacher et al. also used an N-terminal His-tag, so we assume that the N-terminal tag is not hampering the interaction.

      (7) Line 228: Please detail the reported discrepancies in rmsd between the current protein and the gram-negative enzymes.

      The differences in rmsd between our M. mazei GlnA1 structure and the structures of gram-negative enzymes are caused by a) sequence similarity: e.g. M. mazei GlnA1 and B. subtilis GlnA have a sequence identity of 58.47 percent; b) ligands in the structure: the B. subtilis structure contains L-methionine-S-sulfoximine phosphate, a transition state inhibitor, while the M. mazei structure contains 2-OG; c) methodology: the structure determination methods also contribute to these differences. B. subtilis GlnA was determined using X-ray crystallography, while the M. mazei GlnA1 structure was resolved using cryo-EM, where the protein behaves differently in ice compared to a crystal.

      (8) Line 747: The figure title claims "dimeric interface" although the manuscript body only refers to "hexameric interface" or "inter-hexamer interface" (line 224). Moreover, the figure 4 legend uses terms such as vertical and horizontal dimers and this too should be uniformized within the manuscript.

      Thank you for your valuable feedback. We have updated the figure title and the figure legend as well as the main text to ensure consistency in the description.

      (9) Line 752: The description of the color scheme used here is somewhat unclear.

      Thanks for pointing this out. We changed the description to make it more comprehensible.

      (10) Please label H14/H15 and H14'/H15' in Fig 4C zoom.

      We agree that this has not been very clear. We added helix labels.

      (11) In Figure 4D legend, make sure to note that the binding sites for the substrate are based on homologies with another enzyme poised with these molecules.

      The same should be clear in the text: sites are not known, they are assumed to be, based on homologies (paragraph starting at line 239).

      Concerning this comment we want to point out that we studied the exact same enzyme as the Schumacher group, except that we used 2-OG in our experiments, which they did not.

      (12) Figure 3 appears redundant in light of Figure 4. 

      (13) Line 235: When mentioning F24, please refer to Figure 5.

      Thank you, we changed that accordingly.

      (14) Please provide the distances for the bonds depicted in Figure 4B.

      Thanks for pointing this out. We added distance labels to Figure 4B. For reasons of clarity, we labeled only three H-bonds.

      (15) Line 241: D57 is likely serving to abstract a proton from ammonium, what is residue Glu307 potentially doing? The information seems missing in light of how the sentence is built.

      Thanks for pointing this out. According to previous studies, both residues are likely involved in proton abstraction, first from ammonium and then from the formed gamma-ammonium group. Additionally, they contribute to shielding the active site from bulk solvent to prevent hydrolysis of the formed phospho-glutamate.

      (16) Why do the authors assume that increased concentrations of 2-OG are a signal for N starvation only in M. mazei and not in all prokaryotic equivalent systems (line 288)?

      In line 288, we did not claim that this is a unique signal for M. mazei. It is also the central N-starvation signal in Cyanobacteria, but there it is not perceived by the cyanobacterial GS through direct binding.

      The authors should look into the residues that bind 2-OG and check if they are conserved in other GS. The results of this sequence analysis should be discussed in line with the variable prokaryotic glutamine synthetase types of activity modulation that were exposed in the introduction and Figure 7.

      Please refer to supplementary figure S5, where we already aligned the mentioned glutamine synthetase sequences. Since this was also already discussed in Müller et al. 2024, we did not want to repeat their observations and refer to our supplementary figure in too much detail.

      (17) Figure 5 title: Replace TS by transition state structures of homology enzymes, or alike.

      Thank you for this suggestion. We did not change the title however, since it is not a homologue but the exact same glutamine synthetase from Methanosarcina mazei.

      (18) Line 249: D170 is not shown in Figure 5A or elsewhere in Figure 5.

      Thank you for pointing this out. We added D170 to figure 5A.

      (19) Representative density for the residues binding 2-OG should be provided, maybe in a supplemental figure.

      Thank you for the suggestion. We added the densities of 2-OG-binding residues to figure 4B.

      (20) Line 260: Please add a reference when describing the phosphoryl transfer.

      We thank the reviewer for this important point and added that accordingly.

      (21) Line 296: The binding of 2-OG indeed appears to be cooperative, such that at concentrations above its binding affinity to the protein, only dodecamers are seen (under experimental conditions). However, claiming that the oligomerization is fast is not correct when the experimental setup includes 10 minutes of incubation before measurements are done. Please correct this within the entire manuscript.

      A (fast) continuous kinetic assay could have confirmed this point and revealed the oligomerization steps and the intermediaries in the process (maybe monomer/dimers, then dimers/hexamers, and then hexamers/dodecamers). Such assays would have been highly valuable to this study.

      We thank the reviewer for this suggestion, but disagree. It is indeed a rather fast regulation (activity assays without pre-incubation take only 1 min longer to reach full activity, see the newly included suppl. Fig S6). Compared with other regulation mechanisms, such as transcriptional or translational regulation, an activation that takes only 60 s is actually quite quick.

      (22) Line 305 (and elsewhere in the manuscript): the authors state that 2-OG primes the active site for a transition state. This appears incorrect. The transition state is the highest energy state in an enzymatic reaction progressing from substrate to product. Meaning, the transition state is a state that has a more or less modified form of the original substrate bound to the active site. This is not the case.

      In line 366 an "active open state" appears much more adequate to use. 

      We agree and changed accordingly throughout the manuscript.

      (23) Line 330: Please delete "found". Eventually replace it with "confirmed": As the authors write, others have described this residue as a ligand to glutamine.

      Thanks, we changed that accordingly, although previous descriptions were just based on homologies without the experimental validation.

      (24) The discussion at various points summarizes the results again. It should be trimmed and improved.

      (25) Line 381: replace "two fast" with "fast"?

      We thank the reviewer for this suggestion, but disagree on this point. We especially wanted to highlight that there are two central nitrogen-metabolites involved in the direct regulation of GlnA1, that means TWO fast direct processes mediated by 2-OG and glutamine.

    1. eLife Assessment

      This important paper reports functional interactions between L1TD1, an RNA binding protein (RBP), and its ancestral LINE-1 retrotransposon which is not modulated at the translational level. The evidence for the association between L1TD1 and LINE-1 ORF1p is solid. The work implies that the transposon-derived RNA binding protein in the human genome can interact with the ancestral transposable element from which this protein was initially derived. This work spurs interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

    2. Reviewer #1 (Public review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Future studies are bound to further explore this intriguing phenomenon.

    3. Reviewer #2 (Public review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.

      Comments on revised version:

      Thank you for this revised manuscript and for addressing our concerns and suggestions. These improvements have significantly enhanced the quality and reliability of the results presented and have addressed all our questions.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells compared to DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      Suggestions for refinement:

      The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants more detailed description. How many genes experience misregulation or aberrant expression? What phenotypic changes occur in these cells? Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1.

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing.

      Reviewer #2 (Public review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for a role for domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish feasibility of this relationship existing in vivo in either development or disease, or both.

      Comments on revised version:

      In general, the authors did an acceptable job addressing the major concerns throughout the manuscript. This revision is much clearer and has improved in terms of logical progression.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors have addressed all my questions in the revised version of the manuscript.

      Reviewer #2 (Recommendations for the authors):

      Revised comments:

      A few points we'd like to see addressed are our comments about the model (Figure S7C), as this is important for the readership to understand this complex finding. Please try to apply some quantification, if possible (question 8). Please do your best to tone down the direct relationship of these findings to embryology (question 11). Based on both reviewers' comments, we believe that addressing reviewer #1's "Suggestions for refinement" (2 points) would help us change our view from solid to convincing.

      Responses to changes:

      Major

      (1) The study only used one knockout (KO) cell line generated by CRISPR/Cas9.

      Considering the possibility of an off-target effect, I suggest the authors attempt one or both of these suggestions.

      A)  Generate or acquire a similar DNMT1 deletion that uses distinct sgRNAs, so that the likelihood of off-targets is negligible. A few simple experiments such as qRT-PCR would be sufficient to suggest the same phenotype.

      B)  Confirm the DNMT1 depletion also by siRNA/ASO KD to phenocopy the KO effect.

      (2) In addition to the strategies to demonstrate reproducibility, a rescue experiment restoring DNMT1 to the KO or KD cells would be more convincing. (Partial rescue would suffice in this case, as exact endogenous expression levels may be hard to replicate).

      We have undertaken several approaches to study the effect of DNMT1 loss or inactivation: As described above, we have generated a conditional KO mouse with ablation of DNMT1 in the epidermis. DNMT1-deficient keratinocytes isolated from these mice show a significant increase in L1TD1 expression. In addition, treatment of primary human keratinocytes and two squamous cell carcinoma cell lines with the DNMT inhibitor aza-deoxycytidine led to upregulation of L1TD1 expression. Thus, the derepression of L1TD1 upon loss of DNMT1 expression or activity is not a clonal effect.

      Also, the spectrum of RNAs identified in RIP experiments as L1TD1-associated transcripts in HAP1 DNMT1 KO cells showed a strong overlap with the RNAs isolated by a related yet different method in human embryonic stem cells. When it comes to the effect of L1TD1 on LINE-1 retrotransposition, a recent study has reported a similar effect of L1TD1 upon overexpression in HeLa cells [4].

      All of these points together help to convince us that our findings with HAP1 DNMT KO are in agreement with results obtained in various other cell systems and are therefore not due to off-target effects. With that in mind, we would pursue the suggestion of Reviewer 1 to analyze the effects of DNA hypomethylation upon DNMT1 ablation.

      Thank you for addressing this concern. The reference to Beck 2021 and the additional cell lines (R2: keratinocytes and R3: squamous cell carcinoma) provide sufficient evidence that this result is unlikely to be a result of clonal expansion or off-targets.

      Question: Was the human ES Cell RIP Experiment shown here? What is the overlap?

      We refer to the recently published study by Jin et al. (PMID: 38165001). As stated in the Discussion, the majority of L1TD1-associated transcripts in HAP1 cells (69%) identified in our study were also reported as L1TD1 targets in hESCs suggesting a conserved binding affinity of this domesticated transposon protein across different cell types.  

      (3) As stated in the introduction, L1TD1 and ORF1p share "sequence resemblance" (Martin 2006). Is the L1TD1 antibody specific or do we see L1 ORF1p if Fig 1C were uncropped?

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).

      This is a relevant question. We are convinced that the L1TD1 antibody does not cross-react with L1 ORF1p for the following reasons: Firstly, the antibody does not recognize L1 ORF1p (40 kDa) in the uncropped Western blot for Figure 1C (Figure R4A). Secondly, the L1TD1 antibody gives only background signals in DKO cells in the indirect immunofluorescence experiment shown in Figure 1E of the manuscript.

      Thirdly, the immunogen sequence of L1TD1 that determines the specificity of the antibody was checked in the antibody data sheet from Sigma Aldrich. The corresponding epitope is not present in the L1 ORF1p sequence.

      Finally, we have shown that the ORF1p antibody does not cross-react with L1TD1 (Figure R4B).

      Response: Thank you for sharing these images. These full images relieve concerns about specificity. The increase of ORF1P in R4B and Main figure 3C is interesting and pointed out in the manuscript. Not for the purposes of this review, but the observation of reduced transposition despite increased ORF1P could be an interesting follow up to this study (combined with the similar UPF1 result could indicate a complex of some kind).

      (4) In abstract (P2), the authors mentioned that L1TD1 works as an RNA chaperone, but in the result section (P13), they showed that L1TD1 associates with L1 ORF1p in an RNA independent manner. Those conclusions appear contradictory. Clarification or revision is required.

      Our findings that both proteins bind L1 RNA, and that L1TD1 interacts with ORF1p are compatible with a scenario where L1TD1/ORF1p heteromultimers bind to L1 RNA. The additional presence of L1TD1 might thereby enhance the RNA chaperone function of ORF1p. This model is visualized now in Suppl. Figure S7C.

      Response: Thank you for the model. To further clarify, do you mean that L1TD1 can bind L1 RNA, but this is not needed for the effect, however this "bonus" binding (that is enabled by heteromultimerization) appears to enhance the retrotransposition frequency? Do you think L1TD1 is binding L1 RNA in this context or simply "stabilizing" ORF1P (Trimer) RNP?

      Based on our data, L1TD1 associates with L1 RNA and interacts with L1 ORF1p. Both features might contribute to the enhanced retrotransposition frequency. Interestingly, the L1TD1 protein shares with its ancestor L1 ORF1p the non-canonical RNA recognition motif and the coiled-coil motif required for the trimerization but has two copies instead of one of the C-terminal domain (CTD), a structure with RNA binding and chaperone function. We speculate that the presence of an additional CTD within the L1TD1 protein might thereby enhance the RNA binding and chaperone function of L1TD1/ORF1p heteromultimers.

      (5) Figure 2C fold enrichment for L1TD1 and ARMC1 is a bit difficult to fully appreciate. A 100 to 200-fold enrichment does not seem physiological. This appears to be a "divide by zero" type of result, as the CT for these genes was likely near 40 or undetectable. Another qRT-PCR based approach (absolute quantification) would be a more revealing experiment.

      This is the validation of the RIP experiments, and the presentation mode is specifically developed for quantification of RIP assays (Sigma Aldrich RIP-qRT-PCR: Data Analysis Calculation Shell). The unspecific binding of the transcript in the absence of L1TD1 in DNMT1/L1TD1 DKO cells is set to 1, and the value in KO cells represents the specific binding relative to the unspecific binding. The calculation also corrects for potential differences in the abundance of the respective transcript in the two cell lines. This is not a physiological value but the quantification of specific binding of transcripts to L1TD1. GAPDH as negative control shows no enrichment, whereas specifically associated transcripts show strong enrichment. We have explained the details of the RIP-qRT-PCR evaluation in Materials and Methods (page 14) and in the legend of Figure 2C in the revised manuscript.
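The normalization described in this response can be illustrated with a generic ΔΔCt-style calculation. This is only a sketch under stated assumptions: it follows the common scheme of normalizing each RIP Ct to its own input (which corrects for transcript abundance per cell line) and then expressing binding in KO cells relative to DKO background set to 1. It is not a reproduction of the calculation shell the authors used, and the function names and all Ct values below are hypothetical.

```python
import math


def normalized_rip_dct(ct_rip: float, ct_input: float,
                       input_fraction: float) -> float:
    """Delta-Ct of a RIP sample relative to its own input.

    `input_fraction` is the share of lysate saved as input (e.g. 0.1 for 10 %);
    the input Ct is adjusted as if 100 % of the lysate had been measured.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return ct_rip - adjusted_input_ct


def fold_enrichment(dct_specific: float, dct_background: float) -> float:
    """Specific binding relative to background binding (background = 1)."""
    return 2.0 ** -(dct_specific - dct_background)


if __name__ == "__main__":
    # Hypothetical Ct values for one transcript; 10 % input in both lines.
    dct_ko = normalized_rip_dct(ct_rip=25.0, ct_input=24.0, input_fraction=0.1)
    dct_dko = normalized_rip_dct(ct_rip=32.0, ct_input=24.0, input_fraction=0.1)
    print(fold_enrichment(dct_ko, dct_dko))
```

With these made-up numbers, a 7-cycle difference between KO and DKO immunoprecipitates gives a 128-fold enrichment, which shows how values in the 100 to 200-fold range can arise from a background-relative readout without implying a physiological fold change.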

      Response: Thank you for the clarification and additional information in the manuscript.

      (6) Is it possible the L1TD1 antibody binds L1 ORF1p? This could make Figure 2D somewhat difficult to interpret. Some validation of the specificity of the L1TD1 antibody would remove this concern (see minor concern below).

      See response to (3).

      Response: Thanks.

      (7) Figure S4A and S4B: There appear to be a few unusual aspects of these figures that should be pointed out and addressed. First, there doesn't seem to be any ORF1p in the Input (if there is, the exposure is too low). Second, there might be some L1TD1 in the DKO (lane 2) and lane 3. This could be non-specific, but the size is concerning. Overexposure would help see this.

      The ORF1p IP gives rise to strong ORF1p signals in the immunoprecipitated complexes even after short exposure. Under these conditions ORF1p is hardly detectable in the input. Regarding the faint band in DKO HAP1 cells, this might be due to a technical problem during Western blot loading. Therefore, the input samples were loaded again on a Western blot and analyzed for the presence of ORF1p, L1TD1 and beta-actin (as loading control) and shown as separate panel in Suppl. Figure S4A.

      The enhanced image is clearer. Thanks.

      S4A and S4B now appear to be S6A and S6B, is that correct? (This is due to the addition of new S1 and S2, but please verify image orders were not disturbed).

      Yes, the input is shown now as a separate panel in Suppl. Figure S6A.

      (8) Figure S4C: This is related to our previous concerns involving antibody cross-reactivity. Figure 3E partially addresses this, where it looks like the L1TD1 "speckles" outnumber the ORF1p puncta, but overlap with all of them. This might be consistent with the antibody cross-reacting. The western blot (Figure 3C) suggests an upregulation of ORF1p by at least 23x in the DKO, but the IF image in 3E is hard to tell if this is the case (slightly more signal, but fewer foci). Can you return to the images and confirm the contrast are comparable? Can you massively overexpose the red channel in 3E to see if there is residual overlap?

      In Figure 3E the L1TD1 antibody gives no signal in DNMT1/L1TD1 DKO cells confirming that it does not recognize ORF1p. In agreement with the Western blot in Figure 3C the L1 ORF1p signal in Figure 3E is stronger in DKO cells. In DNMT1 KO cells the L1 ORF1p antibody does not recognize all L1TD1 speckles. This result is in agreement with the Western blot shown above in Figure R4B and indicates that the L1 ORF1p antibody does not recognize the L1TD1 protein. The contrast is comparable and after overexposure there are still L1TD1 specific speckles. This might be due to differences in abundance of the two proteins.

      Response: Suggestion: Would it be possible to use a program like ImageJ to supplement the western blot observation? Qualitatively, in Figure 3E, it appears that there is more signal in the DKO, but this could also be due to there being multiple cells clustered together or a particularly nicely stained region. Could you randomly sample 20-30 cells across a few experiments to see if this holds up? I am interested in whether the puncta in the KO image(s) is a very highly concentrated region and in the DKO this is more dispersed. Also, the representative DKO seems to be cropped slightly wrong. (Please use puncta as a guide to make the cropping more precise.)

      As suggested by the reviewer we have quantified the signals of 60 KO cells and 56 DKO cells in three different IF experiments by ImageJ. We measured a 1.4-fold higher expression level of L1 ORF1p in DKO cells. However, the difference is not statistically significant. This is most probably due to the change in cell size and protein content during the cell cycle with increasing protein contents from G1 to G2. Western blot analysis provides signals of comparable protein amounts representing an average expression level over tens of thousands of cells. Nevertheless, the quantification results reflect in principle the IF pictures shown in Figure 3E but IF is probably not the best method to quantify protein amounts. We have also corrected Figure 3E.

      Author response image 1.

      (9) The choice of ARMC1 and YY2 is unclear. What are the criteria for the selection?

      ARMC1 was one of the top hits in a pilot RIP-seq experiment (IP versus input and IP versus IgG IP). In the actual RIP-seq experiment with DKO HAP1 cells instead of IgG IP as a negative control, we found ARMC1 as an enriched hit, although it was not among the top 5 hits. The results from the 2nd RIP-seq further confirmed the validity of ARMC1 as an L1TD1-interacting transcript. YY2 was of potential biological relevance as an L1TD1 target due to the fact that it is a processed pseudogene originating from YY1 mRNA as a result of retrotransposition. This is mentioned on page 6 of the revised manuscript.

      Response: Appreciated!

      (10) (P16) L1 is the only protein-coding transposon that is active in humans. This is perhaps too generalized of a statement as written. Other examples are readily found in the literature.

      Please clarify.

      We will tone down this statement in the revised manuscript.

      Response: Appreciated! To further clarify, the term "active" when it comes to transposable elements, has not been solidified. It can span "retrotransposition competent" to "transcripts can be recovered". There are quite a few reports of GAG transcripts and protein from various ERV/LTR subfamilies in various cells and tissues (in mouse and human at least), however whether they contribute to new insertions is actively researched.

      (11) In both the abstract and last sentence in the discussion section (P17), embryogenesis is mentioned, but this is not addressed at all in the manuscript. Please refrain from implying normal biological functions based on the results of this study unless appropriate samples are used to support them.

      Much of the published data on L1TD1 function are related to embryonic stem cells [3-7]. Therefore, it is important to discuss our findings in the context of previous reports.

      Response: It is well established that embryonic stem cells are not perfect or direct proxies for the inner cell mass of embryos, as multiple reports have demonstrated transcriptomic, epigenetic, and chromatin accessibility differences. The exact origin of ES cells is also considered controversial. We maintain that the distinction between embryos/embryogenesis and the results presented in the manuscript are not yet interchangeable. An important exception would be complex models of embryogenesis such as embryoids (or synthetic/artificial embryo models that have been carefully termed as such so as to not suggest direct implications to embryos).

      https://www.nature.com/articles/ncb2965

      https://link.springer.com/article/10.1007/s00018-018-2965-y  

      https://www.cell.com/developmental-cell/abstract/S1534-5807(24)00363-0?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS1534580724003630%3Fshowall%3Dtrue

      We have deleted the corresponding paragraph in the Discussion.

      (12) Figure 3E: The format of Figures 1A and 3E are internally inconsistent. Please present similar data/images in a cohesive way throughout the manuscript.

      We show now consistent IF Figures in the revised manuscript.

      Response: Thanks
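
      To make the fold-enrichment presentation discussed under point (5) concrete, the following is a minimal sketch of a generic ΔΔCt-style RIP calculation, in which the IP signal in each cell line is normalized to its own input and the DKO (unspecific) value is set to 1. The function names, the 1% input fraction, and all Ct values are hypothetical, and the formula is an assumption about how such calculations are commonly done; it is not taken from the manuscript or from the cited calculation shell.

```python
import math


def normalized_delta_ct(ct_rip: float, ct_input: float, input_fraction: float) -> float:
    """Ct of the RIP sample relative to its own input.

    The input Ct is first shifted to represent 100 % of the lysate
    (a 1 % input is shifted by log2(100) cycles), which corrects for
    differences in transcript abundance between cell lines.
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return ct_rip - adjusted_input_ct


def fold_enrichment(ko, dko, input_fraction: float = 0.01) -> float:
    """Specific binding (KO) relative to unspecific binding (DKO, set to 1).

    `ko` and `dko` are (ct_rip, ct_input) pairs; all values are hypothetical.
    """
    delta_ko = normalized_delta_ct(ko[0], ko[1], input_fraction)
    delta_dko = normalized_delta_ct(dko[0], dko[1], input_fraction)
    return 2.0 ** -(delta_ko - delta_dko)


if __name__ == "__main__":
    # Hypothetical Ct values: same input abundance, IP signal 7 cycles
    # later in the DKO background than in the KO.
    print(fold_enrichment(ko=(25.0, 24.0), dko=(32.0, 24.0)))
```

      Under these assumptions, a seven-cycle difference between the specific and unspecific IP already corresponds to a roughly 128-fold value, which illustrates why such numbers describe relative binding rather than a physiological quantity.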

      Minor:

      In general:

      Still need checking for typos, mostly in Materials and Methods section; Please keep a consistent writing style throughout the whole manuscript. If you use L1 ORF1p, then please use L1 instead of LINE-1, or if you keep LINE-1 in your manuscript, then you should use LINE-1 ORF1p.

      A lab member from the US checked again the Materials and Methods section for typos. We keep the short version L1 ORF1p.

      (1) Intro:

      - Is L1TD1 in mice and humans? How "conserved" is it and does this suggest function?

      Murine and human L1TD1 proteins share 44% identity on the amino acid level and it was suggested that the corresponding genes were under positive selection during evolution with functions in transposon control and maintenance of pluripotency [8].

      - Why HAP1? (Haploid?) The importance of this cell line is not clear.

      HAP1 is a nearly haploid human cancer cell line derived from the KBM-7 chronic myelogenous leukemia (CML) cell line [9, 10]. Due to its haploidy, it is perfectly suited and widely used for loss-of-function screens and gene editing. After gene editing, cells can be used in the nearly haploid or in the diploid state. We usually perform all experiments with diploid HAP1 cell lines. Importantly, in contrast to other human tumor cell lines, this cell line tolerates ablation of DNMT1. We have included a corresponding explanation in the revised manuscript on page 5, first paragraph.

      - Global methylation status in DNMT1 KO? (Methylations near L1 insertions, for example?)

      The HAP1 DNMT1 KO cell line with a 20 bp deletion in exon 4 used in our study was validated in the study by Smits et al. [11]. The authors report a significant reduction in overall DNA methylation. However, we are not aware of a DNA methylome study on this cell line. We show now data on the methylation of L1 elements in HAP1 cells and upon DNMT1 deletion in the revised manuscript in Suppl. Figure S1B.

      Response: Looks great!

      (2) Figure 1:

      - Figure 1C. Why is LMNB used instead of Actin (Fig1D)?

      We show now beta-actin as loading control in the revised manuscript.

      - Figure 1G shows increased Caspase 3 in KO, while the matching sentence in the result section skips over this. It might be more accurate to mention this and suggest that the single KO has perhaps an intermediate phenotype (Figure 1F shows a slight but not significant trend).

      We fully agree with the reviewer and have changed the sentence on page 6, 2nd paragraph accordingly.

      - Would 96 hrs trend closer to significance? An interpretation is that L1TD1 loss could speed up this negative consequence.

      We thank the reviewer for the suggestion. We have performed a time course experiment with 6 biological replicates for each time point up to 96 hours and found significant changes in the viability upon loss of DNMT1 and again a significant reduction in viability upon additional loss of L1TD1 (shown in Figure 1F). These data suggest that, as expected, loss of DNMT1 leads to a significant reduction in viability and that additional ablation of L1TD1 further enhances this effect.

      Response: Looks good!

      - What are the "stringent conditions" used to remove non-specific binders and artifacts (negative control subtraction?)

      Yes, we considered only hits from both analyses, L1TD1 IP in KO versus input and L1TD1 IP in KO versus L1TD1 IP in DKO. This is now explained in more detail in the revised manuscript on page 6, 3rd paragraph.

      (3) Figure 2:

      - Figure 2A is a bit too small to read when printed.

      We have changed this in the revised manuscript.

      - Since WT and DKO lack detectable L1TD1, would you expect any difference in RIP-Seq results between these two?

      Due to the lack of DNMT1 and the resulting DNA hypomethylation, DKO cells are more similar to KO cells than WT cells with respect to the expressed transcripts.

      - Legend says selected dots are in green (it appears blue to me).

      We have changed this in the revised manuscript.

      - Would you recover L1 ORF1p and its binding partners in the KO? (Is the antibody specific in the absence of L1TD1 or can it recognize L1?) I noticed an increase in ORF1p in the KO in Figure 3C.

      Thank you for the suggestion. Yes, L1 ORF1p shows slightly increased expression in the proteome analysis and we have marked the corresponding dot in the Volcano plot (Figure 3A).

      - Should the figure panel reference near the (Rosspopoff & Trono) reference instead be Sup S1C as well? Otherwise, I don't think S1C is mentioned at all.

      - What are the red vs. green dots in 2D? Can you highlight ERV and ALU with different colors?

      We added the reference to Suppl. Figure S1C (now S3C) in the revised manuscript. In Figure 2D L1 elements are highlighted in green, ERV elements in yellow, and other associated transposon transcripts in red.

      Response: Much better, thanks!

      - Which L1 subfamily from Figure 2D is represented in the qRT-PCR in 2E "LINE-1"? Do the primers match a specific L1 subfamily? If so, which?

      We used primers specific for the human L1.2 subfamily.

      - Pulling down SINE element transcripts makes some sense, as many insertions "borrow" L1 sequences for non-autonomous retrotransposition, but can you speculate as to why ERVs are recovered? There should be essentially no overlap in sequence.

      In the L1TD1 evolution paper [8], a potential link between L1TD1 and ERV elements was discussed:

      "Alternatively, L1TD1 in sigmodonts could play a role in genome defense against another element active in these genomes. Indeed, the sigmodontine rodents have a highly active family of ERVs, the mysTR elements [46]. Expansion of this family preceded the death of L1s, but these elements are very active, with 3500 to 7000 species-specific insertions in the L1-extinct species examined [47]. This recent ERV amplification in Sigmodontinae contrasts with the megabats (where L1TD1 has been lost in many species); there are apparently no highly active DNA or RNA elements in megabats [48]. If L1TD1 can suppress retroelements other than L1s, this could explain why the gene is retained in sigmodontine rodents but not in megabats."

      Furthermore, Jin et al. report the binding of L1TD1 to repetitive sequences in transcripts [12]. It is possible that some of these sequences are also present in ERV RNAs.

      Response: Interesting, thanks for sharing

      - Is S2B a screenshot? (the red underline).

      No, it is a Powerpoint figure, and we have removed the red underline.

      (4) Figure 3:

      - Text refers to Figure 3B as a western blot. Figure 3B shows a volcano plot. This is likely 3C but would still be out of order (3A>3C>3B referencing). I think this error is repeated in the last result section.

      - Figure and legends fail to mention what gene was used for ddCT method (actin, gapdh, etc.).

      - In general, the supplemental legends feel underwritten and could benefit from additional explanations. (Main figures are appropriate but please double-check that all statistical tests have been mentioned correctly).

      Thank you for pointing this out. We have corrected these errors in the revised manuscript.

      (5) Discussion:

      - AluY connection is interesting. Is there an "Alu retrotransposition reporter assay" to test whether L1TD1 enhances this as well?

      Thank you for the suggestion. There is indeed an Alu retrotransposition reporter assay reported by Dewannieux et al. [13]. The assay is based on a Neo selection marker. We have previously tested a Neo selection-based L1 retrotransposition reporter assay, but this system failed to properly work in HAP1 cells, therefore we switched to a blasticidin-based L1 retrotransposition reporter assay. A corresponding blasticidin-based Alu retrotransposition reporter assay might be interesting for future studies (mentioned in the Discussion, page 11 paragraph 4 of the revised manuscript).

      (6) Material and Methods :

      - The number of typos in the materials and methods is too numerous to list. Instead, please refer to the next section that broadly describes the issues seen throughout the manuscript.

      Writing style

      (1) Keep a consistent style throughout the manuscript: for example, L1 or LINE-1 (also L1 ORF1p or LINE-1 ORF1p); per or "/"; knockout or knock-out; min or minute; 3 times or three times; media or medium. Additionally, as TE naming conventions are not uniform, it is important to maintain internal consistency so as to not accidentally establish an imprecise version.

      (2) There's a period between "et al" and the comma, and "et al." should be italic.

      (3) The authors should explain what the key jargon is when it is first used in the manuscript, such as "retrotransposon" and "retrotransposition".

      (4) The authors should show the full spelling of some acronyms when they use it for the first time, such as RNA Immunoprecipitation (RIP).

      (5) Use a space between numbers and letters, such as 5 μg.

      (6) 2.0 × 10⁵ cells, that's not an "x".

      (7) Numbers in the reference section are lacking (hard to parse).

      (8) In general, there are a significant number of typos in this draft which at times becomes distracting. For example, (P3) Introduction: Yet, co-option of TEs thorough (not thorough, it should be through) evolution has created so-called domesticated genes beneficial to the gene network in a wide range of organisms. Please carefully revise the entire manuscript for these minor issues that collectively erode the quality of this submission.

      Thank you for pointing out these mistakes. We have corrected them in the revised manuscript. A native speaker from our research group has carefully checked the paper. In summary, we have added Supplementary Figure S7C and have changed Figures 1C, 1E, 1F, 2A, 2D, 3A, 4B, S3A-D, S4B and S6A based on these comments.

      Response: Thank you for taking these comments on board!

    1. eLife Assessment

      The study reports valuable findings from a very rich EEG-fMRI dataset including 107 participants, which was collected during nocturnal naps. The authors link activity in memory-related brain regions (e.g., hippocampus, thalamus, and medial prefrontal cortex), and their functional connectivity, to the occurrence of canonical sleep rhythms, namely spindles and slow oscillations in non-rapid eye movement sleep. This work could contribute to further understanding of sleep neural dynamics, although the evidence for some of the main claims is incomplete at present.

    2. Reviewer #1 (Public review):

      Wang et al. recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded that the brain-wide activation pattern resembles episodic memory processing but is dissociated from task-related processing, and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented.

      I have the following specific comments.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, although 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. Moreover, this focus on the hippocampus contrasts with the Results section, which instead concentrates on investigating the key role of the thalamus.

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows the cables to be guided straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs-spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).
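
      The collinearity risk mentioned in comment (12) can be checked before fitting such a model. The following is a minimal sketch, assuming per-event SO up-state and down-state amplitudes have already been extracted; the function names and the example amplitudes are hypothetical and are not taken from the manuscript. In an actual GLM the check would more appropriately be run on the HRF-convolved regressors.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation_factor(r):
    """VIF for a model with exactly two regressors: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical per-event amplitudes in microvolts (down-state as magnitude).
    up_state = [18.0, 22.0, 25.0, 31.0, 27.0, 20.0]
    down_state = [24.0, 30.0, 29.0, 41.0, 33.0, 28.0]
    r = pearson_r(up_state, down_state)
    print(f"r = {r:.2f}, VIF = {variance_inflation_factor(r):.1f}")
```

      A high variance inflation factor would indicate that the two parametric modulators cannot be separated reliably, in which case orthogonalizing one with respect to the other, or modeling only one of them, may be preferable.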

      References:

      Bergmann TO, Molle M, Diedrichs J, Born J, Siebner HR (2012) Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. Neuroimage 59:2733-2742.
      Caporro M, Haneef Z, Yeh HJ, Lenartowicz A, Buttinelli C, Parvizi J, Stern JM (2011) Functional MRI of sleep spindles and K-complexes. Clin Neurophysiol.
      Czisch M, Wehrle R, Stiegler A, Peters H, Andrade K, Holsboer F, Samann PG (2009) Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PLoS One 4:e6749.
      Fogel S, Albouy G, King BR, Lungu O, Vien C, Bore A, Pinsard B, Benali H, Carrier J, Doyon J (2017) Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PLoS One 12:e0174755.
      Hale JR, White TP, Mayhew SD, Wilson RS, Rollings DT, Khalsa S, Arvanitis TN, Bagshaw AP (2016) Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. Neuroimage 125:657-667.
      Horovitz SG, Fukunaga M, de Zwart JA, van Gelderen P, Fulton SC, Balkin TJ, Duyn JH (2008) Low frequency BOLD fluctuations during resting wakefulness and light sleep: a simultaneous EEG-fMRI study. Hum Brain Mapp 29:671-682.
      Ilhan-Bayrakci M, Cabral-Calderin Y, Bergmann TO, Tuscher O, Stroh A (2022) Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cereb Cortex 32:4782-4796.
      Laufs H (2008) Endogenous brain oscillations and related networks detected by surface EEG-combined fMRI. Hum Brain Mapp 29:762-769.
      Laufs H, Walker MC, Lund TE (2007) 'Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study'--its limitations and an alternative approach. Brain 130:e75; author reply e76.
      Moehlman TM, de Zwart JA, Chappel-Farley MG, Liu X, McClain IB, Chang C, Mandelkow H, Ozbay PS, Johnson NL, Bieber RE, Fernandez KA, King KA, Zalewski CK, Brewer CC, van Gelderen P, Duyn JH, Picchioni D (2018) All-Night Functional Magnetic Resonance Imaging Sleep Studies. J Neurosci Methods.
      Picchioni D, Horovitz SG, Fukunaga M, Carr WS, Meltzer JA, Balkin TJ, Duyn JH, Braun AR (2010) Infraslow EEG oscillations organize large-scale cortical-subcortical interactions during sleep: A combined EEG/fMRI study. Brain Res.
      Schabus M, Dang-Vu TT, Albouy G, Balteau E, Boly M, Carrier J, Darsaud A, Degueldre C, Desseilles M, Gais S, Phillips C, Rauchs G, Schnakers C, Sterpenich V, Vandewalle G, Luxen A, Maquet P (2007) Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proc Natl Acad Sci U S A 104:13164-13169.
      Spoormaker VI, Schroter MS, Gleiser PM, Andrade KC, Dresler M, Wehrle R, Samann PG, Czisch M (2010) Development of a large-scale functional brain network during human non-rapid eye movement sleep. J Neurosci 30:11379-11387.
      Staresina BP, Bergmann TO, Bonnefond M, van der Meij R, Jensen O, Deuker L, Elger CE, Axmacher N, Fell J (2015) Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nat Neurosci 18:1679-1686.

    3. Reviewer #2 (Public review):

      In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex-three key regions involved in memory consolidation and episodic memory processes.

      This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an MRI scanner while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions seem overall well supported by the data, some aspects with regard to the detection of sleep oscillations need clarification.

      The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough - or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.

    4. Reviewer #3 (Public review):

      Summary:

      Wang et al. examined the brain activity patterns during sleep, especially when locked to canonical sleep rhythms such as SOs, spindles, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found that a variety of cognitive processes may be involved during SO-spindle coupling and during other sleep events. The authors next conducted functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.

      Strengths:

      There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.

      Weaknesses:

      Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.

      A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.

      Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.

      Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How should these conflicting results be interpreted? This ambiguity is also reflected in the terms used: declarative memory and episodic memory are both indexed in the results, yet these two processes can largely overlap. So which specific memory processes do these brain activity patterns reflect? The Discussion should address these results and the limitations of this method.

      The coupling strength is somewhat inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.

      The discussion is rather superficial at only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the author wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." This statement is made without any references to working memory and without explaining why WM is considered an external task, even though working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why would DMN activity be reduced during SOs or during memory consolidation? Were there differences in DMN activity between SO and SO-spindle coupling events?

    5. Author response:

      Reviewer #1 (Public review):

      Wang et al., recorded concurrent EEG-fMRI in 107 participants during nocturnal NREM sleep to investigate brain activity and connectivity related to slow oscillations (SO), sleep spindles, and in particular their co-occurrence. The authors found SO-spindle coupling to be correlated with increased thalamic and hippocampal activity, and with increased functional connectivity from the hippocampus to the thalamus and from the thalamus to the neocortex, especially the medial prefrontal cortex (mPFC). They concluded the brain-wide activation pattern to resemble episodic memory processing, but to be dissociated from task-related processing and suggest that the thalamus plays a crucial role in coordinating the hippocampal-cortical dialogue during sleep.

      The paper offers an impressively large and highly valuable dataset that provides the opportunity for gaining important new insights into the network substrate involved in SOs, spindles, and their coupling. However, the paper does unfortunately not exploit the full potential of this dataset with the analyses currently provided, and the interpretation of the results is often not backed up by the results presented. I have the following specific comments.

      Thank you for your thoughtful and constructive feedback. We greatly appreciate your recognition of the strengths of our dataset and findings. Below, we address your specific comments and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We hope these revisions address your comments and further strengthen our manuscript. Thank you again for the constructive feedback.

      (1) The introduction is lacking sufficient review of the already existing literature on EEG-fMRI during sleep and the BOLD-correlates of slow oscillations and spindles in particular (Laufs et al., 2007; Schabus et al., 2007; Horovitz et al., 2008; Laufs, 2008; Czisch et al., 2009; Picchioni et al., 2010; Spoormaker et al., 2010; Caporro et al., 2011; Bergmann et al., 2012; Hale et al., 2016; Fogel et al., 2017; Moehlman et al., 2018; Ilhan-Bayrakci et al., 2022). The few studies mentioned are not discussed in terms of the methods used or insights gained.

      We acknowledge the need for a more comprehensive review of prior EEG-fMRI studies investigating BOLD correlates of slow oscillations and spindles. However, not all of these articles concern sleep SOs or spindles. Several articles (Hale et al., 2016; Horovitz et al., 2008; Laufs, 2008; Laufs, Walker, & Lund, 2007; Spoormaker et al., 2010) mainly focus on methodology for EEG-fMRI, sleep stages, or brain networks, which are not the focus of our study. Thank you again for your attention to the comprehensiveness of our literature review. We will expand the introduction to include a more detailed discussion of the existing literature, ensuring that the contributions of previous EEG-fMRI sleep studies are adequately acknowledged.

      Introduction, Page 4 Lines 62-76

      “Investigating these sleep-related neural processes in humans is challenging because it requires tracking transient sleep rhythms while simultaneously assessing their widespread brain activation. Recent advances in simultaneous EEG-fMRI techniques provide a unique opportunity to explore these processes. EEG allows for precise event-based detection of neural signal, while fMRI provides insight into the broader spatial patterns of brain activation and functional connectivity (Horovitz et al., 2008; Huang et al., 2024; Laufs, 2008; Laufs, Walker, & Lund, 2007; Schabus et al., 2007; Spoormaker et al., 2010). Previous EEG-fMRI studies on sleep have focused on classifying sleep stages or examining the neural correlates of specific waves (Bergmann et al., 2012; Caporro et al., 2012; Czisch et al., 2009; Fogel et al., 2017; Hale et al., 2016; Ilhan-Bayrakcı et al., 2022; Moehlman et al., 2019; Picchioni et al., 2011). These studies have generally reported that slow oscillations are associated with widespread cortical and subcortical BOLD changes, whereas spindles elicit activation in the thalamus, as well as in several cortical and paralimbic regions. Although these findings provide valuable insights into the BOLD correlates of sleep rhythms, they often do not employ sophisticated temporal modeling (Huang et al., 2024), to capture the dynamic interactions between different oscillatory events, e.g., the coupling between SOs and spindles.”

      (2) The paper falls short in discussing the specific insights gained into the neurobiological substrate of the investigated slow oscillations, spindles, and their interactions. The validity of the inverse inference approach ("Open ended cognitive state decoding"), assuming certain cognitive functions to be related to these oscillations because of the brain regions/networks activated in temporal association with these events, is debatable at best. It is also unclear why eventually only episodic memory processing-like brain-wide activation is discussed further, despite the activity of 16 of 50 feature terms from the NeuroSynth v3 dataset were significant (episodic memory, declarative memory, working memory, task representation, language, learning, faces, visuospatial processing, category recognition, cognitive control, reading, cued attention, inhibition, and action).

      Thank you for pointing this out, particularly regarding the use of inverse inference approaches such as “open-ended cognitive state decoding.” Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7. We will refocus the main text on direct neurobiological insights gained from our EEG-fMRI analyses, particularly emphasizing the hippocampal-thalamocortical network dynamics underlying SO-spindle coupling, and we will acknowledge the exploratory nature of these findings and highlight their limitations.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) Hippocampal activation during SO-spindles is stated as a main hypothesis of the paper - for good reasons - however, other regions (e.g., several cortical as well as thalamic) would be equally expected given the known origin of both oscillations and the existing sleep-EEG-fMRI literature. However, this focus on the hippocampus contrasts with the focus on investigating the key role of the thalamus instead in the Results section.

      We appreciate your insight regarding the relative emphasis on hippocampal and thalamic activation in our study. We recognize that the manuscript may currently present an inconsistency between our initial hypothesis and the main focus of the results. To address this concern, we will ensure that our Introduction and Discussion section explicitly discusses both regions, highlighting the complementary roles of the hippocampus (memory processing and reactivation) and the thalamus (spindle generation and cortico-hippocampal coordination) in SO-spindle dynamics.

      Introduction, Page 5 Lines 87-103

      “To address this gap, our study investigates brain-wide activation and functional connectivity patterns associated with SO-spindle coupling, and employs a cognitive state decoding approach (Margulies et al., 2016; Yarkoni et al., 2011)—albeit indirectly—to infer potential cognitive functions. In the current study, we used simultaneous EEG-fMRI recordings during nocturnal naps (detailed sleep staging results are provided in the Methods and Table S1) in 107 participants. Although directly detecting hippocampal ripples using scalp EEG or fMRI is challenging, we expected that hippocampal activation in fMRI would coincide with SO-spindle coupling detected by EEG, given that SOs, spindles, and ripples frequently co-occur during NREM sleep. We also anticipated a critical role of the thalamus, particularly thalamic spindles, in coordinating hippocampal-cortical communication.

      We found significant coupling between SOs and spindles during NREM sleep (N2/3), with spindle peaks occurring slightly before the SO peak. This coupling was associated with increased activation in both the thalamus and hippocampus, with functional connectivity patterns suggesting thalamic coordination of hippocampal-cortical communication. These findings highlight the key role of the thalamus in coordinating hippocampal-cortical interactions during human sleep and provide new insights into the neural mechanisms underlying sleep-dependent brain communication. A deeper understanding of these mechanisms may contribute to future neuromodulation approaches aimed at enhancing sleep-dependent cognitive function and treating sleep-related disorders.”

      Discussion, Page 16-17 Lines 292-307

      “When modeling the timing of these sleep rhythms in the fMRI, we observed hippocampal activation selectively during SO-spindle events. This suggests the possibility of triple coupling (SOs–spindles–ripples), even though our scalp EEG was not sufficiently sensitive to detect hippocampal ripples—key markers of memory replay (Buzsáki, 2015). Recent iEEG evidence indicates that ripples often co-occur with both spindles (Ngo, Fell, & Staresina, 2020) and SOs (Staresina et al., 2015; Staresina et al., 2023). Therefore, the hippocampal involvement during SO-spindle events in our study may reflect memory replay from the hippocampus, propagated via thalamic spindles to distributed cortical regions.

      The thalamus, known to generate spindles (Halassa et al., 2011), plays a key role in producing and coordinating sleep rhythms (Coulon, Budde, & Pape, 2012; Crunelli et al., 2018), while the hippocampus is found essential for memory consolidation (Buzsáki, 2015; Diba & Buzsáki, 2007; Singh, Norman, & Schapiro, 2022). The increased hippocampal and thalamic activity, along with strengthened connectivity between these regions and the mPFC during SO-spindle events, underscores a hippocampal-thalamic-neocortical information flow. This aligns with recent findings suggesting the thalamus orchestrates neocortical oscillations during sleep (Schreiner et al., 2022). The thalamus and hippocampus thus appear central to memory consolidation during sleep, guiding information transfer to the neocortex, e.g., mPFC.”

      (4) The study included an impressive number of 107 subjects. It is surprising though that only 31 subjects had to be excluded under these difficult recording conditions, especially since no adaptation night was performed. Since only subjects were excluded who slept less than 10 min (or had excessive head movements) there are likely several datasets included with comparably short durations and only a small number of SOs and spindles and even less combined SO-spindle events. A comprehensive table should be provided (supplement) including for each subject (included and excluded) the duration of included NREM sleep, number of SOs, spindles, and SO+spindle events. Also, some descriptive statistics (mean/SD/range) would be helpful.

      We appreciate your recognition of our sample size and the challenges associated with simultaneous EEG-fMRI sleep recordings. We acknowledge the importance of transparently reporting individual subject data, particularly regarding sleep duration and the number of detected SOs, spindles, and SO-spindle events. To address this, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1) as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.

      However, most of the excluded participants were unable to fall asleep or slept too briefly and therefore had essentially no NREM sleep, so NREM sleep duration and the numbers of SOs, spindles, and coupling events could not be computed for them.

      Supplementary Materials, Page 42-54, Table S1-S4

      (Considering their length, we do not list all the tables here. Please refer to the revised manuscript.)

      (5) Was the 20-channel head coil dedicated for EEG-fMRI measurements? How were the electrode cables guided through/out of the head coil? Usually, the 64-channel head coil is used for EEG-fMRI measurements in a Siemens PRISMA 3T scanner, which has a cable duct at the back that allows to guide the cables straight out of the head coil (to minimize MR-related artifacts). The choice for the 20-channel head coil should be motivated. Photos of the recording setup would also be helpful.

      Thank you for your comment regarding our choice of the 20-channel head coil for EEG-fMRI measurements. We acknowledge that the 64-channel head coil is commonly used in Siemens PRISMA 3T scanners; however, the 20-channel coil was selected due to specific practical and technical considerations in our study. In particular, the 20-channel head coil was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil allowed us to maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.

      We have made this clearer in the revised manuscript.

      Methods, Page 20 Lines 385-392

      “All MRI data were acquired using a 20-channel head coil on a research-dedicated 3-Tesla Siemens Magnetom Prisma MRI scanner. Earplugs and cushions were provided for noise protection and head motion restriction. We chose the 20-channel head coil because it was compatible with our EEG system and ensured sufficient signal-to-noise ratio (SNR) for both EEG and fMRI acquisition. The EEG electrode cables were guided through the lateral and posterior openings of the head coil, secured with foam padding to reduce motion and minimize MR-related artifacts. Moreover, given the extended nature of nocturnal sleep recordings, the 20-channel coil helped maintain participant comfort while still achieving high-quality simultaneous EEG-fMRI data.”

      (6) Was the EEG sampling synchronized to the MR scanner (gradient system) clock (the 10 MHz signal; not referring to the volume TTL triggers here)? This is a requirement for stable gradient artifact shape over time and thus accurate gradient noise removal.

      Thank you for raising this important point. We confirm that the EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This synchronization was achieved using the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift. As a result, the gradient artifact waveform remained stable across volumes, allowing for more effective artifact correction during preprocessing. We appreciate your attention to this critical aspect of EEG-fMRI data acquisition.

      We have made this clearer in the revised manuscript.

      Methods, Page 19-20 Lines 371-383

      “EEG was recorded simultaneously with fMRI data using an MR-compatible EEG amplifier system (BrainAmps MR-Plus, Brain Products, Germany), along with a specialized electrode cap. The recording was done using 64 channels in the international 10/20 system, with the reference channel positioned at FCz. In order to adhere to polysomnography (PSG) recording standards, six electrodes were removed from the EEG cap: one for electrocardiogram (ECG) recording, two for electrooculogram (EOG) recording, and three for electromyogram (EMG) recording. EEG data was recorded at a sample rate of 5000 Hz, the resistance of the reference and ground channels was kept below 10 kΩ, and the resistance of the other channels was kept below 20 kΩ. To synchronize the EEG and fMRI recordings, the BrainVision recording software (BrainProducts, Germany) was utilized to capture triggers from the MRI scanner. The EEG sampling was synchronized to the MR scanner’s 10 MHz gradient system clock, ensuring a stable gradient artifact shape over time and enabling accurate artifact removal. This was achieved via the standard clock synchronization interface of the EEG amplifier, minimizing timing jitter and drift.”
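
      The reason clock synchronization matters for template-based gradient artifact removal can be illustrated with a toy simulation. The sketch below is not the authors' preprocessing pipeline: it assumes a purely sinusoidal, TR-periodic "artifact" at a hypothetical 500 Hz, a simple mean-template subtraction across TR-length epochs, and a made-up 50 ppm mismatch between the EEG and scanner clocks. Only the 5000 Hz sampling rate and the 2 s TR are taken from the text.

```python
import math

def residual_rms(fs_actual, fs_nominal=5000.0, tr=2.0, f_art=500.0, n_epochs=20):
    """RMS left after subtracting the across-epoch mean template
    from a TR-periodic artifact sampled at fs_actual."""
    n = int(round(fs_nominal * tr))  # samples per TR assumed by the correction
    # One epoch per TR; the artifact repeats exactly every TR in scanner time.
    epochs = [
        [math.sin(2 * math.pi * f_art * (j * n + k) / fs_actual) for k in range(n)]
        for j in range(n_epochs)
    ]
    # Average artifact template across epochs.
    template = [sum(ep[k] for ep in epochs) / n_epochs for k in range(n)]
    # Residual energy after template subtraction.
    sq = sum((ep[k] - template[k]) ** 2 for ep in epochs for k in range(n))
    return math.sqrt(sq / (n_epochs * n))

synced = residual_rms(fs_actual=5000.0)                   # EEG clock locked to scanner
drifting = residual_rms(fs_actual=5000.0 * (1 + 50e-6))   # hypothetical 50 ppm drift

print(f"residual RMS, synchronized: {synced:.2e}")
print(f"residual RMS, drifting:     {drifting:.2e}")
```

      With a locked clock every epoch samples the artifact at the same phase, so the template cancels it almost exactly. With drift, the artifact slides relative to the epoch grid and the averaged template no longer matches any single epoch.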

      (7) The TR is quite long and the voxel size is quite large in comparison to state-of-the-art EPI sequences. What was the rationale behind choosing a sequence with relatively low temporal and spatial resolution?

      We acknowledge that our chosen TR and voxel size are relatively long and large compared to state-of-the-art EPI sequences. This decision was made to optimize the signal-to-noise ratio (SNR) and reduce susceptibility-related distortions, which are particularly critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. A longer TR allowed us to sample whole-brain activity with sufficient coverage, while a larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures such as the thalamus and hippocampus, which are key regions of interest in our study. We appreciate your concern and hope this clarification provides sufficient rationale for our sequence parameters.

      We have made this clearer in the revised manuscript.

      Methods, Page 20-21 Lines 398-408

      “Then, the “sleep” session began after the participants were instructed to try and fall asleep. For the functional scans, whole-brain images were acquired using k-space and steady-state T2*-weighted gradient echo-planar imaging (EPI) sequence that is sensitive to the BOLD contrast. This measures local magnetic changes caused by changes in blood oxygenation that accompany neural activity (sequence specification: 33 slices in interleaved ascending order, TR = 2000 ms, TE = 30 ms, voxel size = 3.5 × 3.5 × 4.2 mm<sup>3</sup>, FA = 90°, matrix = 64 × 64, gap = 0.7 mm). A relatively long TR and larger voxel size were chosen to optimize SNR and reduce susceptibility-related distortions, which are critical in EEG-fMRI sleep studies where head motion and physiological noise can be substantial. The longer TR allowed whole-brain coverage with sufficient temporal resolution, while the larger voxel size helped enhance BOLD sensitivity and minimize partial volume effects in deep brain structures (e.g., the thalamus and hippocampus), which are key regions of interest in this study.”

      (8) The anatomically defined ROIs are quite large. It should be elaborated on how this might reduce sensitivity to sleep rhythm-specific activity within sub-regions, especially for the thalamus, which has distinct nuclei involved in sleep functions.

      We appreciate your insight regarding the use of anatomically defined ROIs and their potential limitations in detecting sleep rhythm-specific activity within sub-regions, particularly in the thalamus. Given the distinct functional roles of thalamic nuclei in sleep processes, we acknowledge that using a single, large thalamic ROI may reduce sensitivity to localized activity patterns. To address this, we will discuss this limitation in the revised manuscript, acknowledging that our approach prioritizes whole-structure effects but may not fully capture nucleus-specific contributions.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (9) The study reports SO & spindle amplitudes & densities, as well as SO+spindle coupling, to be larger during N2/3 sleep compared to N1 and REM sleep, which is trivial but can be seen as a sanity check of the data. However, the amount of SOs and spindles reported for N1 and REM sleep is concerning, as per definition there should be hardly any (if SOs or spindles occur in N1 it becomes by definition N2, and the interval between spindles has to be considerably large in REM to still be scored as such). Thus, on the one hand, the report of these comparisons takes too much space in the main manuscript as it is trivial, but on the other hand, it raises concerns about the validity of the scoring.

      We appreciate your concern regarding the reported presence of SOs and spindles in N1 and REM sleep and its potential implications. Our methods for detecting SOs, spindles, and their coupling were originally designed only for N2 and N3 sleep data, based on the characteristics of those data, and they are widely recognized and used in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because the detection methods for SOs and spindles are based on percentiles, they will always detect a certain number of events when applied to data from other stages (N1 and REM), and the differences between these events and those detected in N2/N3 remain unclear. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”
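
      The point about percentile-based detection can be made concrete with a small sketch. It assumes a simplified detector that keeps candidate waves whose amplitude reaches a fixed percentile of the same recording; the 75th percentile, the amplitude scales, and the function name `detect_by_percentile` are illustrative assumptions, not the authors' parameters.

```python
import random

def detect_by_percentile(amplitudes, pct=75.0):
    """Keep candidates whose amplitude reaches the pct-th percentile
    of the same recording (threshold is relative, not absolute)."""
    ranked = sorted(amplitudes)
    threshold = ranked[int(len(ranked) * pct / 100.0)]
    return [a for a in amplitudes if a >= threshold], threshold

rng = random.Random(0)
# Hypothetical candidate-wave amplitudes in µV; the scales are made up.
n23_like = [abs(rng.gauss(0.0, 30.0)) for _ in range(1000)]  # large-amplitude stage
rem_like = [abs(rng.gauss(0.0, 5.0)) for _ in range(1000)]   # small-amplitude stage

n23_events, n23_thr = detect_by_percentile(n23_like)
rem_events, rem_thr = detect_by_percentile(rem_like)

print("events detected:", len(n23_events), len(rem_events))
print("thresholds (µV):", round(n23_thr, 1), round(rem_thr, 1))
```

      Because the threshold adapts to each recording, both simulated stages yield the same fraction of "events" even though their amplitudes differ several-fold. This is why event counts in N1 or REM under such a criterion say little about whether genuine SOs or spindles are present, and why an additional absolute amplitude criterion is sometimes applied.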

      (10) Why was electrode F3 used to quantify the occurrence of SOs and spindles? Why not a midline frontal electrode like Fz (or a number of frontal electrodes for SOs) and Cz (or a number of centroparietal electrodes) for spindles to be closer to their maximum topography?

      We appreciate your suggestion regarding electrode selection for SO and spindle quantification. Our choice of F3 was primarily based on previous studies (Massimini et al., 2004; Molle et al., 2011), where bilateral frontal electrodes are commonly used for detecting SOs and spindles. Additionally, we considered the impact of MRI-related noise and, after a comprehensive evaluation, determined that F3 provided an optimal balance between signal quality and artifact minimization. We also acknowledge that alternative electrode choices, such as Fz for SOs and Cz for spindles, could provide additional insights into their topographical distributions.

      (11) Functional connectivity (hippocampus -> thalamus -> cortex (mPFC)) is reported to be increased during SO-spindle coupling and interpreted as evidence for coordination of hippocampo-neocortical communication likely by thalamic spindles. However, functional connectivity was only analysed during coupled SO+spindle events, not during isolated SOs or isolated spindles. Without the direct comparison of the connectivity patterns between these three events, it remains unclear whether this is specific for coupled SO+spindle events or rather associated with one or both of the other isolated events. The PPIs need to be conducted for those isolated events as well and compared statistically to the coupled events.

      We appreciate your critical perspective on our functional connectivity analysis and the interpretation of hippocampus-thalamus-cortex (mPFC) interactions during SO-spindle coupling. We acknowledge that, in the current analysis, functional connectivity was only examined during coupled SO-spindle events, without direct comparison to isolated SOs or isolated spindles. To address this concern, we have conducted PPI analyses for all three ROIs (hippocampus, thalamus, mPFC) and all three event types (SO-spindle couplings, isolated SOs, and isolated spindles). Our results indicate that neither isolated SOs nor isolated spindles yielded significant connectivity changes in any of the three ROIs, as all failed to survive multiple comparison corrections. This suggests that the observed connectivity increase is specific to SO-spindle coupling, rather than being independently driven by either SOs or spindles alone.

      Results, Page 14 Lines 248-255

      “Crucially, the interaction between FC and SO-spindle coupling revealed that only the functional connectivity of hippocampus -> thalamus (ROI analysis, t<sub>(106)</sub> = 1.86, p = 0.0328) and thalamus -> mPFC (ROI analysis, t<sub>(106)</sub> = 1.98, p = 0.0251) significantly increased during SO-spindle coupling, with no significant changes in all other pathways (Fig. 4e). We also conducted PPI analyses for the other two events (SOs and spindles), and neither yielded significant connectivity changes in the three ROIs, as all failed to survive whole-brain FWE correction at the cluster level (p < 0.05). Together, these findings suggest that the thalamus, likely via spindles, coordinates hippocampal-cortical communication selectively during SO-spindle coupling, but not isolated SOs or spindle events alone.”
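
      The logic of a PPI (psychophysiological interaction) analysis can be sketched with simulated data. This is a deliberately simplified illustration, not the authors' model: it omits HRF convolution and deconvolution, uses hypothetical event blocks, and fits the regression with a small hand-written least-squares routine. The interaction regressor is the product of the mean-centred seed time course and the mean-centred event regressor, and its coefficient estimates how much the seed-target slope changes during events.

```python
import random

def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan elimination)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(p):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][p] for i in range(p)]

rng = random.Random(1)
n = 400
# Hypothetical event regressor: alternating 20-sample blocks with and without events.
psych = [1.0 if (t // 20) % 2 == 0 else 0.0 for t in range(n)]
psych_mean = sum(psych) / n
psych_c = [v - psych_mean for v in psych]

seed = [rng.gauss(0.0, 1.0) for _ in range(n)]   # simulated seed-region signal
seed_mean = sum(seed) / n
seed_c = [v - seed_mean for v in seed]

ppi = [s * p for s, p in zip(seed_c, psych_c)]   # interaction regressor

# Simulated target region: baseline coupling 0.2, extra event-dependent coupling 0.5.
target = [0.2 * s + 0.5 * i + rng.gauss(0.0, 0.3) for s, i in zip(seed_c, ppi)]

X = [[1.0, s, p, i] for s, p, i in zip(seed_c, psych_c, ppi)]
b0, b_seed, b_psych, b_ppi = ols(X, target)
print("seed effect:", round(b_seed, 2), "PPI effect:", round(b_ppi, 2))
```

      Comparing event types, as requested by the reviewer, would amount to fitting the same model with a separate interaction regressor for each event type and contrasting the resulting PPI coefficients.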

      (12) The limited temporal resolution of fMRI does indeed not allow for easily distinguishing between fMRI activation patterns related to SO-up- vs. SO-down-states. For this, one could try to extract the amplitudes of SO-up- and SO-down-states separately for each SO event and model them as two separate parametric modulators (with the risk of collinearity as they are likely correlated).

      We appreciate your insightful comment regarding the challenge of distinguishing fMRI activation patterns related to SO-up vs. SO-down states due to the limited temporal resolution of fMRI. While our current analysis does not differentiate between these two phases, we acknowledge that separately modeling SO-up and SO-down states using parametric modulators could provide a more refined understanding of their distinct neural correlates. However, as you note, this approach carries the risk of collinearity, and the two amplitudes are indeed highly correlated across all subjects in our data (r = 0.98). Future studies could address this by leveraging high-temporal-resolution techniques. Although implementing this is beyond the scope of the current study, we will acknowledge this limitation in the Discussion section.
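
      To put the reported correlation in perspective, the collinearity between two regressors can be summarised by the variance inflation factor (VIF). The sketch below is a simplified illustration that assumes a model with only these two parametric modulators, in which case the VIF of each is 1 / (1 - r²). The function name is hypothetical, and a full GLM with additional regressors would give a different value.

```python
def variance_inflation_factor(r):
    """VIF of either regressor in a two-regressor model with correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlation between SO up-state and down-state amplitudes reported above.
vif = variance_inflation_factor(0.98)   # roughly 25
se_inflation = vif ** 0.5               # roughly 5
```

      Under this assumption, the standard error of each modulator's estimate would be inflated about fivefold relative to uncorrelated regressors, which is why separate UP- and DOWN-state effects would be hard to estimate reliably.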

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (13) L327: "It is likely that our findings of diminished DMN activity reflect brain activity during the SO DOWN-state, as this state consistently shows higher amplitude compared to the UP-state within subjects, which is why we modelled the SO trough as its onset in the fMRI analysis." This conclusion is not justified as the fact that SO down-states are larger in amplitude does not mean their impact on the BOLD response is larger.

      We appreciate your concern regarding our interpretation of diminished DMN activity as reflecting the SO DOWN-state. We acknowledge that the original wording was somewhat misleading. Our revised interpretation is that the effect could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SO events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from the DOWN-state. We will make this clear in the Discussion section.

      Discussion, Page 17 Lines 308-322

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.”

      (14) Line 77: "In the current study, while directly capturing hippocampal ripples with scalp EEG or fMRI is difficult, we expect to observe hippocampal activation in fMRI whenever SOs-spindles coupling is detected by EEG, if SOs- spindles-ripples triple coupling occurs during human NREM sleep". Not all SO-spindle events are associated with ripples (Staresina et al., 2015), but hippocampal activation may also be expected based on the occurrence of spindles alone (Bergmann et al., 2012).

      We appreciate your clarification regarding the relationship between SO-spindle coupling and hippocampal ripples. We acknowledge that not all SO-spindle events are necessarily accompanied by ripples (Staresina et al., 2015). However, previous research indicates that hippocampal ripples are significantly more likely to occur during SO-spindle coupling events. This suggests that while ripple occurrence is not guaranteed, SO-spindle coupling creates a favorable network state for ripple generation and potential hippocampal activation. To ensure accuracy, we will delete this misleading sentence from the Introduction and acknowledge in the Discussion that our data cannot directly demonstrate the triple coupling of SOs, spindles, and hippocampal ripples.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      Reviewer #2 (Public review):

      In this study, Wang and colleagues aimed to explore brain-wide activation patterns associated with NREM sleep oscillations, including slow oscillations (SOs), spindles, and SO-spindle coupling events. Their findings reveal that SO-spindle events corresponded with increased activation in both the thalamus and hippocampus. Additionally, they observed that SO-spindle coupling was linked to heightened functional connectivity from the hippocampus to the thalamus, and from the thalamus to the medial prefrontal cortex-three key regions involved in memory consolidation and episodic memory processes.

      This study's findings are timely and highly relevant to the field. The authors' extensive data collection, involving 107 participants sleeping in an fMRI scanner while undergoing simultaneous EEG recording, deserves special recognition. If shared, this unique dataset could lead to further valuable insights. While the conclusions seem overall well supported by the data, some aspects of the detection of sleep oscillations need clarification.

      The authors report that coupled SO-spindle events were most frequent during NREM sleep (2.46 ± 0.06 events/min), but they also observed a surprisingly high occurrence of these events during N1 and REM sleep (2.23 ± 0.09 and 2.32 ± 0.09 events/min, respectively), where SO-spindle coupling would not typically be expected. Combined with the relatively modest SO amplitudes reported (~25 µV, whereas >75 µV would be expected when using mastoids as reference electrodes), this raises the possibility that the parameters used for event detection may not have been conservative enough - or that sleep staging was inaccurately performed. This issue could present a significant challenge, as the fMRI findings are largely dependent on the reliability of these detected events.

      Thank you very much for your thorough and encouraging review. We appreciate your recognition of the significance and relevance of our study and dataset, particularly in highlighting how simultaneous EEG-fMRI recordings can provide complementary insights into the temporal dynamics of neural oscillations and their associated spatial activation patterns during sleep. In the sections that follow, we address each of your comments in detail. We have revised the text and conducted additional analyses wherever possible to strengthen our argument and clarify our methodological choices. We believe these revisions improve the clarity and rigor of our work, and we thank you for helping us refine it.

      We appreciate your insightful comments regarding the detection of sleep oscillations. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM. We will acknowledge the reasons for these results in the Methods section and emphasize that they are used only for sanity checks.
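
      The point that a percentile-based criterion will always return events can be made concrete with a small sketch. Everything here is illustrative rather than the authors' detector: the function names, the 75th-percentile cut-off, and the two synthetic amplitude lists are assumptions, and the real pipeline applies further criteria (e.g. duration) that are not modelled.

```python
import math


def percentile_threshold(amplitudes, q):
    """Nearest-rank q-th percentile (q in 0..100) of a list of amplitudes."""
    ordered = sorted(amplitudes)
    rank = max(1, math.ceil(len(ordered) * q / 100))
    return ordered[rank - 1]


def detect_events(amplitudes, q=75):
    """Keep candidate waves whose amplitude exceeds the stage's own percentile."""
    threshold = percentile_threshold(amplitudes, q)
    return [a for a in amplitudes if a > threshold]


# Two synthetic stages with 1000 candidate waves each: one with large
# amplitudes (N2/N3-like) and one with amplitudes five times smaller.
deep_sleep = [0.1 * k for k in range(1, 1001)]
light_sleep = [0.02 * k for k in range(1, 1001)]

n_deep = len(detect_events(deep_sleep))
n_light = len(detect_events(light_sleep))
```

      Because the threshold is computed from each stage's own amplitude distribution, both stages yield the same number of detections despite very different amplitudes. Event counts in N1 or REM under such a criterion therefore say little by themselves about whether the detected events are physiologically comparable to those in N2/N3.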

      Regarding the reported SO amplitudes (~25 µV), during preprocessing, we applied the Signal Space Projection (SSP) method to more effectively remove MRI gradient artifacts and cardiac pulse noise. While this approach enhances data quality, it also reduces overall signal power, leading to systematically lower reported amplitudes. Despite this, our SO detections in NREM sleep (especially N2/N3) remain physiologically meaningful and are consistent with previous fMRI studies using similar artifact removal techniques. We appreciate your careful evaluation and valuable suggestions.

      In addition, we will provide comprehensive tables in the supplementary materials, containing descriptive information about sleep-related characteristics (Table S1), as well as detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: (1) duration of each sleep stage; (2) number of detected SOs; (3) number of detected spindles; (4) number of detected SO-spindle coupling events; (5) density of detected SOs; (6) density of detected spindles; (7) density of detected SO-spindle coupling events.

      Methods, Page 25 Lines 515-524

      “We note that the above methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).”

      Supplementary Materials, Page 42-54, Table S1-S4

      (Given their length, we do not list all the tables here. Please refer to the revised manuscript.)

      Reviewer #3 (Public review):

      Summary:

      Wang et al. examined the brain activity patterns during sleep, especially when locked to canonical sleep rhythms such as SOs, spindles, and their coupling. Analyzing data from a large sample, the authors found significant coupling between spindles and SOs, particularly during the upstate of the SO. Moreover, the authors examined the patterns of whole-brain activity locked to these sleep rhythms. To understand the functional significance of these brain activities, the authors further conducted open-ended cognitive state decoding and found that a variety of cognitive processes may be involved during SO-spindle coupling and during other sleep events. The authors next conducted functional connectivity analyses and found enhanced connectivity between the hippocampus, the thalamus, and the medial PFC. These results reinforced the theoretical model of sleep-dependent memory consolidation, such that SO-spindle coupling is conducive to systems-level memory reactivation and consolidation.

      Strengths:

      There are obvious strengths in this work, including the large sample size, state-of-the-art neuroimaging and neural oscillation analyses, and the richness of results.

      Weaknesses:

      Despite these strengths and the insights gained, there are weaknesses in the design, the analyses, and inferences.

      Thank you for your detailed and thoughtful review of our manuscript. We are delighted that you recognize the large sample size, the advanced neuroimaging and neural oscillation analyses, and the richness of the results. In the following sections, we provide detailed responses to each of your comments. We have revised the text and conducted additional analyses to strengthen our arguments and clarify our methodological choices. We believe these revisions enhance the clarity and rigor of our work, and we sincerely appreciate your thoughtful feedback in helping us refine the manuscript.

      (1) A repeating statement in the manuscript is that brain activity could indicate memory reactivation and thus consolidation. This is indeed a highly relevant question that could be informed by the current data/results. However, an inherent weakness of the design is that there is no memory task before and after sleep. Thus, it is difficult (if not impossible) to make a strong argument linking SO/spindle/coupling-locked brain activity with memory reactivation or consolidation.

      We appreciate your suggestion regarding the lack of a pre- and post-sleep memory task in our study design. We acknowledge that, in the absence of behavioral measures, it is hard to directly link SO-spindle coupling to memory consolidation in an outcome-driven manner. Our interpretation is instead based on the well-established role of these oscillations in memory processes, as demonstrated in previous studies. We sincerely appreciate this feedback and will adjust our Discussion accordingly to reflect a more precise interpretation of our findings.

      Discussion, Page 18 Lines 333-341

      “Despite providing new insights, our study has several limitations. First, our scalp EEG did not directly capture hippocampal ripples, preventing us from conclusively demonstrating triple coupling. Second, the combination of EEG-fMRI and the lack of a memory task limit our ability to parse fine-grained BOLD responses at the DOWN- vs. UP-states of SOs and link observed activations to behavioral outcomes. Third, the use of large anatomical ROIs may mask subregional contributions of specific thalamic nuclei or hippocampal subfields. Finally, without a memory task, we cannot establish a direct behavioral link between sleep-rhythm-locked activation and memory consolidation. Future studies combining techniques such as ultra-high-field fMRI or iEEG with cognitive tasks may refine our understanding of subregional network dynamics and functional significance during sleep.”

      (2) Relatedly, to understand the functional implications of the sleep rhythm-locked brain activity, the authors employed the "open-ended cognitive state decoding" method. While this method is interesting, it is rather indirect given that there were no behavioral indices in the manuscript. Thus, discussions based on these analyses are speculative at best. Please either tone down the language or find additional evidence to support these claims.

      Moreover, the results from this method are difficult to understand. Figure 3e showed that for all three types of sleep events (SO, spindle, SO-spindle), the same mental states (e.g., working memory, episodic memory, declarative memory) showed opposite directions of activation (left and right panels showed negative and positive activation, respectively). How to interpret these conflicting results? This ambiguity is also reflected by the term used: declarative memory and episodic memories are both indexed in the results. Yet these two processes can be largely overlapped. So which specific memory processes do these brain activity patterns reflect? The Discussion shall discuss these results and the limitations of this method.

      We appreciate your critical assessment of the open-ended cognitive state decoding method and its interpretational challenges. Given the concerns about the indirectness of this approach, we decided to remove its related content and results from Figure 3 in the main text and include it in Supplementary Figure 7.

      Due to the complexity of memory-related processes, we acknowledge that distinguishing between episodic and declarative memory based solely on this approach is not straightforward. We will revise the Supplementary Materials to explicitly discuss these limitations and clarify that our findings do not isolate specific cognitive processes but rather suggest general associations with memory-related networks.

      Discussion, Page 17-18 Lines 323-332

      “To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      (3) The coupling strength is somehow inconsistent with prior results (Hahn et al., 2020, eLife, Helfrich et al., 2018, Neuron). Specifically, Helfrich et al. showed that among young adults, the spindle is coupled to the peak of the SO. Here, the authors reported that the spindles were coupled to down-to-up transitions of SO and before the SO peak. It is possible that participants' age may influence the coupling (see Helfrich et al., 2018). Please discuss the findings in the context of previous research on SO-spindle coupling.

      We appreciate your concern regarding the temporal characteristics of SO-spindle coupling. We acknowledge that the SO-spindle coupling phase results in our study are not identical to those reported by Hahn et al. (2020) and Helfrich et al. (2018). These differences may arise from slight variations in event detection parameters, which can influence the precise phase estimation of coupling. Notably, Hahn et al. (2020) also reported slight discrepancies in their group-level coupling phase results, highlighting that methodological differences can contribute to variability across studies. Furthermore, our findings are consistent with those of Schreiner et al. (2021), further supporting the robustness of our observations.

      That said, we acknowledge that our original description of SO-spindle coupling as occurring at the "transition from the lower state to the upper state" was not entirely precise. The -π/2 phase represents the true transition point, while our observed coupling phase is actually closer to the SO peak rather than strictly at the transition. We will revise this statement in the manuscript to ensure clarity and accuracy in describing the coupling phase.

      Discussion, Page 16 Lines 283-291

      “Our data provide insights into the neurobiological underpinnings of these sleep rhythms. SOs, originating mainly in neocortical areas such as the mPFC, alternate between DOWN- and UP-states. The thalamus generates sleep spindles, which in turn couple with SOs. Our finding that spindle peaks consistently occurred slightly before the UP-state peak of SOs (in 83 out of 107 participants), concurs with prior studies, including Schreiner et al. (2021). Yet it differs from some results suggesting spindles might peak right at the SO UP-state (Hahn et al., 2020; Helfrich et al., 2018). Such discrepancies could arise from differences in detection algorithms, participant age (Helfrich et al., 2018), or subtle variations in cortical-thalamic timing. Nonetheless, these results underscore the importance of coordinated SO-spindle interplay in supporting sleep-dependent processes.”
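
      The distinction between coupling "at the transition" and "just before the peak" is a statement about the preferred phase, which is usually summarised with circular statistics. The sketch below assumes the convention implied by the response (0 rad at the SO peak, -π/2 at the down-to-up transition). The function names and the phase values are hypothetical and are not taken from the study.

```python
import math


def circular_mean(phases):
    """Mean direction (radians, in (-pi, pi]) of a list of phase angles."""
    s = sum(math.sin(p) for p in phases)
    c = sum(math.cos(p) for p in phases)
    return math.atan2(s, c)


def resultant_length(phases):
    """Mean resultant length: 0 = uniform phases, 1 = identical phases."""
    s = sum(math.sin(p) for p in phases)
    c = sum(math.cos(p) for p in phases)
    return math.hypot(s, c) / len(phases)


# Hypothetical SO phases at spindle peaks for one participant.
phases = [-0.6, -0.5, -0.4, -0.3, -0.2]
preferred = circular_mean(phases)        # about -0.4 rad
strength = resultant_length(phases)      # close to 1 for this tight cluster

# The preferred phase lies between the transition (-pi/2) and the peak (0),
# and nearer to the peak.
closer_to_peak = abs(preferred) < abs(preferred + math.pi / 2)
```

      A circular mean is used rather than an arithmetic mean because phases wrap around at ±π; averaging the raw numbers would give misleading results for participants whose phases straddle that boundary.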

      (4) The discussion is rather superficial with only two pages, without delving into many important arguments regarding the possible functional significance of these results. For example, the author wrote, "This internal processing contrasts with the brain patterns associated with external tasks, such as working memory." This is stated without any references to working memory, and without delineating why WM is considered an external task even though working memory operations can be internal. Similarly, for the interesting results on SO and reduced DMN activity, the authors wrote "The DMN is typically active during wakeful rest and is associated with self-referential processes like mind-wandering, daydreaming, and task representation (Yeshurun, Nguyen, & Hasson, 2021). Its reduced activity during SOs may signal a shift towards endogenous processes such as memory consolidation." This argument is flawed. DMN is active during self-referential processing and mind-wandering, i.e., when the brain shifts from external stimuli processing to internal mental processing. During sleep, endogenous memory reactivation and consolidation are also part of the internal mental processing given the lack of external environmental stimulation. So why during SO or during memory consolidation, the DMN activity would be reduced? Were there differences in DMN activity between SO and SO-spindle coupling events?

      We appreciate your concerns regarding the brevity of the discussion and the need for clearer theoretical arguments. We will expand this section to provide more in-depth interpretations of our findings in the context of prior literature. Regarding working memory (WM), we acknowledge that our phrasing was ambiguous. We will modify this statement in the Discussion section.

      For the SO-related reduction in DMN activity, we recognize the need for a more precise explanation. This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state.

      To address your final question, we have conducted an additional post hoc comparison of DMN activity between isolated SOs and SO-spindle coupling events. Our results indicate that DMN activation during SOs was significantly lower than during SO-spindle coupling (t<sub>(106)</sub> = -4.17, p < 1e-4). This suggests that SO-spindle coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. We appreciate your constructive feedback and will integrate these expanded analyses and discussions into our revised manuscript.

      Results, Page 11 Lines 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Discussion, Page 17-18 Lines 308-332

      “An intriguing aspect of our findings is the reduced DMN activity during SOs when modeled at the SO trough (DOWN-state). This reduced DMN activity may reflect large-scale neural inhibition characteristic of the SO trough. The DMN is typically active during internally oriented cognition (e.g., self-referential processing or mind-wandering) and is suppressed during external stimuli processing (Yeshurun, Nguyen, & Hasson, 2021). It is unlikely, however, that this suppression of DMN during SO events is related to a shift from internal cognition to external responses given it is during deep sleep time. Instead, it could be driven by the inherent rhythmic pattern of SOs, which makes it difficult to separate UP- from DOWN-states (the two temporal regressors were highly correlated, and similar brain activation during SOs events was obtained if modelled at the SO peak instead, Fig. S5). Since the amplitude at the SO trough is consistently larger than that at the SO peak, the neural activation we detected may primarily capture the large-scale inhibition from DOWN-state. Interestingly, no such DMN reduction was found during SO-spindle coupling, implying that coupling may involve distinct neural dynamics that partially re-engage DMN-related processes, possibly reflecting memory-related reactivation. Future research using high-temporal-resolution techniques like iEEG could clarify these possibilities.

      To explore functional relevance, we employed an open-ended cognitive state decoding approach using meta-analytic data (NeuroSynth: Yarkoni et al. (2011)). Although this method usefully generates hypotheses about potential cognitive processes, particularly in the absence of a pre- and post-sleep memory task, it is inherently indirect. Many cognitive terms showed significant associations (16 of 50), such as “episodic memory,” “declarative memory,” and “working memory.” We focused on episodic/declarative memory given the known link with hippocampal reactivation (Diekelmann & Born, 2010; Staresina et al., 2015; Staresina et al., 2023). Nonetheless, these inferences regarding memory reactivation should be interpreted cautiously without direct behavioral measures. Future research incorporating explicit tasks before and after sleep would more rigorously validate these potential functional claims.”

      Reviewing Editor Comment:

      The reviewers think that you are working on a relevant and important topic. They are praising the large sample size used in the study. The reviewers are not all in line regarding the overall significance of the findings, but they all agree the paper would strongly benefit from some extra work, as all reviewers raise various critical points that need serious consideration.

      We appreciate your recognition of the relevance and importance of our study, as well as your acknowledgment of the large sample size as a strength of our work. We understand that there are differing perspectives regarding the overall significance of our findings, and we value the constructive critiques provided. We are committed to addressing the key concerns raised by all reviewers, including refining our analyses, clarifying our interpretations, and incorporating additional discussions to strengthen the manuscript. Below, we address your specific recommendations and provide responses to each point you raised to ensure our methods and results are as transparent and comprehensible as possible. We believe that these revisions will significantly enhance the rigor and impact of our study, and we sincerely appreciate your thoughtful feedback in helping us improve our work.

      Reviewer #1 (Recommendations for the authors):

      (1) The phrase "overnight sleep" suggests an entire night, while these were rather "nocturnal naps". Please rephrase.

      Thank you for pointing this out. We have revised the phrasing in our manuscript to "nocturnal naps" instead of "overnight sleep" to more accurately reflect the duration of the sleep recordings.

      (2) Sleep staging results (macroscopic sleep architecture) should be provided in more detail (at least min and % of the different sleep stages, sleep onset latency, total sleep duration, total recording duration), at least mean/SD/range.

      Thank you for this suggestion. We will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics. This information will help provide a clearer overview of the macroscopic sleep architecture in our dataset.

      Supplementary Materials, Page 42, Table S1

      Author response table 1.

      Descriptive results of demographic information and sleep characteristics. Note: The total recorded time is equal to the awake time plus the total sleep time. The sleep onset latency is the time taken to reach the first sleep epoch. The Sleep Efficiency is the ratio of actual sleep time to total recording time.
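
      The two relations stated in the note above can be written out as a minimal sketch. The function name and the example durations are hypothetical and are not values from the study.

```python
def sleep_summary(awake_min: float, sleep_min: float) -> dict:
    """Derive total recorded time and sleep efficiency from awake and sleep minutes."""
    # Total recorded time = awake time + total sleep time (as defined in the table note).
    total_recorded = awake_min + sleep_min
    if total_recorded <= 0:
        raise ValueError("total recorded time must be positive")
    return {
        "total_recorded_min": total_recorded,
        # Sleep efficiency = actual sleep time / total recording time.
        "sleep_efficiency": sleep_min / total_recorded,
    }


# Hypothetical example: 30 min awake and 90 min asleep.
summary = sleep_summary(awake_min=30.0, sleep_min=90.0)
print(summary)
```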

      Reviewer #2 (Recommendations for the authors):

      In order to allow for a better estimation of the reliability of the detected sleep events, please:

      (1) Provide densities and absolute numbers of all detected SOs and spindles (N1, NREM, and REM sleep).

      Thank you for pointing this out. We will provide comprehensive tables in the supplementary materials containing detailed information about sleep waves at each sleep stage for all 107 subjects (Table S2-S4), listing for each subject: 1) duration of each sleep stage; 2) number of detected SOs; 3) number of detected spindles; 4) number of detected SO-spindle coupling events; 5) density of detected SOs; 6) density of detected spindles; 7) density of detected SO-spindle coupling events.

      Supplementary Materials, Page 43-54, Table S2-S4

      (Given their length, we do not list all the tables here. Please refer to the revised manuscript.)

      (2) Show ERPs for all detected SOs and spindles (per sleep stage).

      Thank you for the suggestion. We will provide ERPs for all detected SOs and spindles, separated by sleep stage (N1, N2&N3, and REM) in supplementary Fig. S2-S4. These ERP waveforms will help illustrate the characteristic temporal profiles of SOs and spindles across different sleep stages.

      Methods, Page 25, Line 525-532

      “Event-related potentials (ERP) analysis. After completing the detection of each sleep rhythm event, we performed ERP analyses for SOs, spindles, and coupling events in different sleep stages. Specifically, for SO events, we took the trough of the DOWN-state of each SO as the zero-time point, then extracted data in a [-2 s to 2 s] window from the broadband (0.1–30 Hz) EEG and used [-2 s to -0.5 s] for baseline correction; the results were then averaged across 107 subjects (see Fig. S2a). For spindle events, we used the peak of each spindle as the zero-time point and applied the same data extraction window and baseline correction before averaging across 107 subjects (see Fig. S2b). Finally, for SO-spindle coupling events, we followed the same procedure used for SO events (see Fig. 2a, Figs. S3–S4).”
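
      The epoching steps described in this Methods passage can be sketched for a single recording as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the 100 Hz sampling rate, and the synthetic signal are hypothetical, and the sketch averages over events within one recording, whereas the manuscript additionally averages across 107 subjects.

```python
import numpy as np


def event_locked_average(signal, event_samples, sfreq,
                         window=(-2.0, 2.0), baseline=(-2.0, -0.5)):
    """Average baseline-corrected epochs cut around each event sample."""
    pre = int(round(-window[0] * sfreq))   # samples before the zero-time point
    post = int(round(window[1] * sfreq))   # samples after the zero-time point
    times = np.arange(-pre, post + 1) / sfreq
    base_mask = (times >= baseline[0]) & (times <= baseline[1])

    epochs = []
    for s in event_samples:
        # Skip events whose window would run past the edges of the recording.
        if s - pre < 0 or s + post >= len(signal):
            continue
        epoch = signal[s - pre:s + post + 1].astype(float)
        # Baseline correction: subtract the mean of the [-2 s, -0.5 s] interval.
        epochs.append(epoch - epoch[base_mask].mean())
    if not epochs:
        raise ValueError("no complete epochs available")
    return times, np.mean(epochs, axis=0)


# Synthetic demo (hypothetical data): three trough-like deflections at known times.
sfreq = 100.0
t = np.arange(0, 60.0, 1.0 / sfreq)
trough_times = [10.0, 25.0, 40.0]
signal = np.zeros_like(t)
for t0 in trough_times:
    signal += -50.0 * np.exp(-((t - t0) ** 2) / (2 * 0.2 ** 2))

event_samples = [int(round(t0 * sfreq)) for t0 in trough_times]
times, erp = event_locked_average(signal, event_samples, sfreq)
```

      For spindle events, the same function would be called with the spindle peak samples as `event_samples`.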

      Supplementary Materials, Page 36-38, Fig. S2-S4

      Author response image 1.

      ERPs of SOs and spindles coupling during different sleep stages across all 107 subjects. a. ERP of SOs in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the trough of the DOWN-state of each SO at time zero (see Methods for details). The orange line represents the SO ERP in the N1 stage, the black line represents the SO ERP in the N2&N3 stage, and the green line represents the SO ERP in the REM stage. b. ERP of spindles in different sleep stages using the broadband (0.1–30 Hz) EEG data. We align the peak of each spindle at time zero (see Methods for details). The color scheme is the same as in panel a.

      Author response image 2.

      ERP and time-frequency patterns of SO-spindle coupling in the N1 stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, following the same procedure as in Fig. 2a, but for N1 stage.

      Author response image 3.

      ERP and time-frequency patterns of SO-spindle coupling in the REM stage. The averaged temporal frequency pattern and ERP across all instances of SO-spindle coupling, computed over all subjects, again following the same procedure as in Fig. 2a, but for REM stage.

      (3) Provide detailed info concerning sleep characteristics (time spent in each sleep stage etc.).

      Thank you for this suggestion. As in the response above, we will provide comprehensive tables in the supplementary materials containing descriptive information about sleep-related characteristics.

      Supplementary Materials, Page 42, Table S1 (same as above)

      (4) What would happen if more stringent parameters were used for event detection? Would the authors still observe a significant number of SO spindles during N1 and REM? Would this affect the fMRI-related results?

      Thank you for this suggestion. Our methods for detecting SOs, spindles, and their couplings were originally developed for N2 and N3 sleep data, based on the specific characteristics of these stages. These methods are widely recognized in sleep research (Hahn et al., 2020; Helfrich et al., 2019; Helfrich et al., 2018; Ngo, Fell, & Staresina, 2020; Schreiner et al., 2022; Schreiner et al., 2021; Staresina et al., 2015; Staresina et al., 2023). However, because this percentile-based detection approach will inherently identify a certain number of events if applied to other stages (e.g., N1 and REM), the nature of these events in those stages remains unclear compared to N2/N3. We nevertheless identified and reported the detailed descriptive statistics of these sleep rhythms in all sleep stages, under the same operational definitions, both for completeness and as a sanity check. Within the same subject, there should be more SOs, spindles, and their couplings in N2/N3 than in N1 or REM (see also Figure S2-S4, Table S1-S4).

      Furthermore, to explore how this affects our fMRI results, we conducted an additional sensitivity analysis by applying different detection parameters for SOs. Specifically, we adjusted the amplitude percentile threshold for SO detection (the parameter that has the greatest impact on the results). Using the hippocampal activation value during N2&N3 stage SO-spindle coupling as an anchor, we found that as the parameters became gradually stricter, the results were similar to or even better than the current results. However, as we continued to raise the threshold, the effect gradually weakened until the threshold reached 80%, and the results were no longer significant. This indicates that our results are robust within a specific range of parameters, but as the threshold increases, the number of trials decreases, ultimately weakening the statistical power of the fMRI analysis.
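
      The trade-off described here, where a stricter percentile threshold leaves fewer events for the fMRI model, can be illustrated with a minimal sketch. The selection rule, the function name, and the simulated amplitudes are assumptions made for illustration and do not reproduce the authors' SO detection pipeline.

```python
import numpy as np


def select_by_percentile(amplitudes, percentile):
    """Return indices of candidate events whose amplitude exceeds the given percentile."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    cutoff = np.percentile(amplitudes, percentile)
    return np.flatnonzero(amplitudes > cutoff)


# Hypothetical candidate amplitudes (arbitrary units) for one recording.
rng = np.random.default_rng(0)
candidate_amplitudes = rng.gamma(shape=2.0, scale=40.0, size=1000)

# Count how many events survive as the threshold becomes stricter.
counts = {p: len(select_by_percentile(candidate_amplitudes, p))
          for p in (70, 75, 80, 90)}
print(counts)
```

      Because a percentile cutoff always retains a fixed fraction of candidates, such a rule will return events in any sleep stage, which is why the number of events alone does not establish their physiological nature.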

      Thank you again for your suggestions on sleep rhythm event detection. We will add the results in Supplementary and revise our manuscript accordingly.

      Results, Page 11, Line 199-208

      “Spindles were correlated with positive activation in the thalamus (ROI analysis, t<sub>(106)</sub> = 15.39, p < 1e-4), the anterior cingulate cortex (ACC), and the putamen, alongside deactivation in the DMN (Fig. 3c). Notably, SO-spindle coupling was linked to significant activation in both the thalamus (ROI analysis, t<sub>(106)</sub> = 3.38, p = 0.0005) and the hippocampus (ROI analysis, t<sub>(106)</sub> = 2.50, p = 0.0070, Fig. 3d). However, no decrease in DMN activity was found during SO-spindle coupling, and DMN activity during SO was significantly lower than during coupling (ROI analysis, t<sub>(106)</sub> = -4.17, p < 1e-4). For more detailed activation patterns, see Table S5-S7. We also varied the threshold used to detect SO events to assess its effect on hippocampal activation during SO-spindle coupling and observed that hippocampal activation remained significant when the percentile thresholds for SO detection ranged between 71% and 80% (see Fig. S6).”

      Supplementary Materials, Page 40, Fig. S6

      Author response image 4.

      Influence of the percentile threshold for SO detection on hippocampal activation (ROI) during SO-spindle coupling. We changed the percentile threshold for SO event detection in the EEG data analysis and then reconstructed the GLM design matrix based on the SO events detected at each threshold. The brain-wide activation pattern of SO-spindle couplings in the N2/3 stage was extracted using the same method as shown in Fig. 3. The gray horizontal line represents the significant range (71%–80%). * p < 0.05.

      Finally, we sincerely thank you all again for your thoughtful and constructive feedback. Your insights have been invaluable in refining our analyses, strengthening our interpretations, and improving the clarity and rigor of our manuscript. We appreciate the time and effort you have dedicated to reviewing our work, and we are grateful for the opportunity to enhance our study based on your recommendations.

      References:

      Bergmann, T. O., Mölle, M., Diedrichs, J., Born, J., & Siebner, H. R. (2012). Sleep spindle-related reactivation of category-specific cortical regions after learning face-scene associations. NeuroImage, 59(3), 2733-2742.

      Buzsáki, G. (2015). Hippocampal sharp wave‐ripple: A cognitive biomarker for episodic memory and planning. Hippocampus, 25(10), 1073-1188.

      Caporro, M., Haneef, Z., Yeh, H. J., Lenartowicz, A., Buttinelli, C., Parvizi, J., & Stern, J. M. (2012). Functional MRI of sleep spindles and K-complexes. Clinical Neurophysiology, 123(2), 303-309.

      Coulon, P., Budde, T., & Pape, H.-C. (2012). The sleep relay—the role of the thalamus in central and decentral sleep regulation. Pflügers Archiv-European Journal of Physiology, 463, 53-71.

      Crunelli, V., Lőrincz, M. L., Connelly, W. M., David, F., Hughes, S. W., Lambert, R. C., Leresche, N., & Errington, A. C. (2018). Dual function of thalamic low-vigilance state oscillations: rhythm-regulation and plasticity. Nature Reviews Neuroscience, 19(2), 107-118.

      Czisch, M., Wehrle, R., Stiegler, A., Peters, H., Andrade, K., Holsboer, F., & Sämann, P. G. (2009). Acoustic oddball during NREM sleep: a combined EEG/fMRI study. PLoS ONE, 4(8), e6749.

      Diba, K., & Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during ripples. Nature Neuroscience, 10(10), 1241.

      Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11(2), 114-126.

      Fogel, S., Albouy, G., King, B. R., Lungu, O., Vien, C., Bore, A., Pinsard, B., Benali, H., Carrier, J., & Doyon, J. (2017). Reactivation or transformation? Motor memory consolidation associated with cerebral activation time-locked to sleep spindles. PLoS ONE, 12(4), e0174755.

      Hahn, M. A., Heib, D., Schabus, M., Hoedlmoser, K., & Helfrich, R. F. (2020). Slow oscillation-spindle coupling predicts enhanced memory formation from childhood to adolescence. eLife, 9, e53730.

      Halassa, M. M., Siegle, J. H., Ritt, J. T., Ting, J. T., Feng, G., & Moore, C. I. (2011). Selective optical drive of thalamic reticular nucleus generates thalamic bursts and cortical spindles. Nature Neuroscience, 14(9), 1118-1120.

      Hale, J. R., White, T. P., Mayhew, S. D., Wilson, R. S., Rollings, D. T., Khalsa, S., Arvanitis, T. N., & Bagshaw, A. P. (2016). Altered thalamocortical and intra-thalamic functional connectivity during light sleep compared with wake. NeuroImage, 125, 657-667.

      Helfrich, R. F., Lendner, J. D., Mander, B. A., Guillen, H., Paff, M., Mnatsakanyan, L., Vadera, S., Walker, M. P., Lin, J. J., & Knight, R. T. (2019). Bidirectional prefrontal-hippocampal dynamics organize information transfer during sleep in humans. Nature Communications, 10(1), 3572.

      Helfrich, R. F., Mander, B. A., Jagust, W. J., Knight, R. T., & Walker, M. P. (2018). Old brains come uncoupled in sleep: slow wave-spindle synchrony, brain atrophy, and forgetting. Neuron, 97(1), 221-230. e224.

      Horovitz, S. G., Fukunaga, M., de Zwart, J. A., van Gelderen, P., Fulton, S. C., Balkin, T. J., & Duyn, J. H. (2008). Low frequency BOLD fluctuations during resting wakefulness and light sleep: A simultaneous EEG‐fMRI study. Human Brain Mapping, 29(6), 671-682.

      Huang, Q., Xiao, Z., Yu, Q., Luo, Y., Xu, J., Qu, Y., Dolan, R., Behrens, T., & Liu, Y. (2024). Replay-triggered brain-wide activation in humans. Nature Communications, 15(1), 7185.

      Ilhan-Bayrakcı, M., Cabral-Calderin, Y., Bergmann, T. O., Tüscher, O., & Stroh, A. (2022). Individual slow wave events give rise to macroscopic fMRI signatures and drive the strength of the BOLD signal in human resting-state EEG-fMRI recordings. Cerebral Cortex, 32(21), 4782-4796.

      Laufs, H. (2008). Endogenous brain oscillations and related networks detected by surface EEG‐combined fMRI. Human Brain Mapping, 29(7), 762-769.

      Laufs, H., Walker, M. C., & Lund, T. E. (2007). ‘Brain activation and hypothalamic functional connectivity during human non-rapid eye movement sleep: an EEG/fMRI study’—its limitations and an alternative approach. Brain, 130(7), e75.

      Margulies, D. S., Ghosh, S. S., Goulas, A., Falkiewicz, M., Huntenburg, J. M., Langs, G., Bezgin, G., Eickhoff, S. B., Castellanos, F. X., & Petrides, M. (2016). Situating the default-mode network along a principal gradient of macroscale cortical organization. Proceedings of the National Academy of Sciences, 113(44), 12574-12579.

      Massimini, M., Huber, R., Ferrarelli, F., Hill, S., & Tononi, G. (2004). The sleep slow oscillation as a traveling wave. Journal of Neuroscience, 24(31), 6862-6870.

      Moehlman, T. M., de Zwart, J. A., Chappel-Farley, M. G., Liu, X., McClain, I. B., Chang, C., Mandelkow, H., Özbay, P. S., Johnson, N. L., & Bieber, R. E. (2019). All-night functional magnetic resonance imaging sleep studies. Journal of Neuroscience Methods, 316, 83-98.

      Mölle, M., Bergmann, T. O., Marshall, L., & Born, J. (2011). Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep, 34(10), 1411-1421.

      Ngo, H.-V., Fell, J., & Staresina, B. (2020). Sleep spindles mediate hippocampal-neocortical coupling during long-duration ripples. eLife, 9, e57011.

      Picchioni, D., Horovitz, S. G., Fukunaga, M., Carr, W. S., Meltzer, J. A., Balkin, T. J., Duyn, J. H., & Braun, A. R. (2011). Infraslow EEG oscillations organize large-scale cortical–subcortical interactions during sleep: a combined EEG/fMRI study. Brain Research, 1374, 63-72.

      Schabus, M., Dang-Vu, T. T., Albouy, G., Balteau, E., Boly, M., Carrier, J., Darsaud, A., Degueldre, C., Desseilles, M., & Gais, S. (2007). Hemodynamic cerebral correlates of sleep spindles during human non-rapid eye movement sleep. Proceedings of the National Academy of Sciences, 104(32), 13164-13169.

      Schreiner, T., Kaufmann, E., Noachtar, S., Mehrkens, J.-H., & Staudigl, T. (2022). The human thalamus orchestrates neocortical oscillations during NREM sleep. Nature Communications, 13(1), 5231.

      Schreiner, T., Petzka, M., Staudigl, T., & Staresina, B. P. (2021). Endogenous memory reactivation during sleep in humans is clocked by slow oscillation-spindle complexes. Nature Communications, 12(1), 3112.

      Singh, D., Norman, K. A., & Schapiro, A. C. (2022). A model of autonomous interactions between hippocampus and neocortex driving sleep-dependent memory consolidation. Proceedings of the National Academy of Sciences, 119(44), e2123432119.

      Spoormaker, V. I., Schröter, M. S., Gleiser, P. M., Andrade, K. C., Dresler, M., Wehrle, R., Sämann, P. G., & Czisch, M. (2010). Development of a large-scale functional brain network during human non-rapid eye movement sleep. Journal of Neuroscience, 30(34), 11379-11387.

      Staresina, B. P., Bergmann, T. O., Bonnefond, M., van der Meij, R., Jensen, O., Deuker, L., Elger, C. E., Axmacher, N., & Fell, J. (2015). Hierarchical nesting of slow oscillations, spindles and ripples in the human hippocampus during sleep. Nature Neuroscience, 18(11), 1679-1686.

      Staresina, B. P., Niediek, J., Borger, V., Surges, R., & Mormann, F. (2023). How coupled slow oscillations, spindles and ripples coordinate neuronal processing and communication during human sleep. Nature Neuroscience, 1-9.

      Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature Methods, 8(8), 665-670.

      Yeshurun, Y., Nguyen, M., & Hasson, U. (2021). The default mode network: where the idiosyncratic self meets the shared social world. Nature Reviews Neuroscience, 1-12.

    1. eLife Assessment

      The authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein coupled receptor associated with obesity. They develop new, more precise approaches to deep mutational scanning, enabling them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor. In this important work, the authors provide compelling evidence that variants impact signaling through MC4R in different ways, that some defective variants are amenable to a corrector drug and that deep mutational scanning data could guide compound optimization.

    2. Reviewer #1 (Public review):

      Summary:

      Howard et al. performed deep mutational scanning on the MC4R gene, using a reporter assay to investigate two distinct downstream pathways across multiple experimental conditions. They validated their findings with ClinVar data and previous studies. Additionally, they provided insights into the application of DMS results for personalized drug therapy and differential ligand responses across variant types.

      Strengths:

      They captured over 99% of variants with robust signals and investigated subtle functionalities, such as pathway-specific activities and interactions with different ligands, by refining both the experimental design and analytical methods.

      They provided additional details regarding the quality of the library, including the even composition of variants, sufficient readout from tested cells, and adequate sequencing depth. Additionally, they clarified the underlying assay mechanisms, effectively demonstrating the robustness of their results.

    3. Reviewer #2 (Public review):

      Overview

      In this manuscript the authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. Reasoning that current deep mutational scanning approaches are insufficiently precise for some drug development applications, they focus on articulating new, more precise approaches. These approaches, which include a new statistical model and innovative reporter assay, enable them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor with high precision and statistical rigor.

      They use the resulting data for a variety of purposes, including probing the relationship between MC4R's sequence and structure, analyzing the effect of clinically important variants, identifying variants that disrupt downstream MC4R signaling via one but not both pathways, identifying loss-of-function variants that are amenable to a corrector drug, and exploring how deep mutational scanning data could guide small molecule drug optimization.

      Strengths

      The analysis and statistical framework developed by the authors represent a significant advance. In particular, it makes use of barcode-level internally replicated measurements to more accurately estimate measurement noise. The framework allows variant effects to be compared across experimental conditions, a task which is currently hard to do with rigor. Thus, this framework will be applicable to a large number of existing and future deep mutational scanning experiments.

      The authors refine their existing barcode transcription-based assay for GPCR signaling, and develop a clever new "relay" reporter system to boost signaling in a particular pathway. They show that these reporters can be used to measure both gain of function and loss of function effects, which many deep mutational scanning approaches cannot do.

      The use of systematic approaches to integrate and then interrogate high-dimensional deep mutational scanning data is a big strength. For example, the authors applied PCA to the variant effect results from reporters for two different MC4R signaling pathways and were able to discover variants that biased signaling through one or the other pathway. This approach paves the way for analyses of higher dimensional deep mutational scans.
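
      The idea of using PCA to separate shared from pathway-biased variant effects can be sketched as follows. This is a hedged illustration only: the effect values are hypothetical, the function name is invented, and the authors' actual analysis may differ in preprocessing and scaling.

```python
import numpy as np


def pca_scores(effects):
    """PCA via SVD on mean-centered data; returns scores, components, explained variance."""
    X = np.asarray(effects, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each pathway's effects
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                           # project variants onto the components
    explained = S ** 2 / np.sum(S ** 2)          # fraction of variance per component
    return scores, Vt, explained


# Hypothetical variant effects: columns are (Gs effect, Gq effect).
# The first four variants affect both pathways equally; the last one is Gq-biased.
effects = np.array([
    [0.0, 0.0],
    [-1.0, -1.0],
    [-2.0, -2.0],
    [-3.0, -3.0],
    [-0.2, -2.5],
])
scores, components, explained = pca_scores(effects)

# PC1 captures the shared loss of function; a large |PC2| score flags pathway bias.
most_biased = int(np.argmax(np.abs(scores[:, 1])))
print(most_biased, explained)
```

      In this toy example, the first component tracks the effect common to both pathways, and the variant with the largest absolute score on the second component is the one whose Gs and Gq effects diverge.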

      The authors use the deep mutational scanning data they collect to map how different variants affect the way small molecule agonists activate MC4R signaling. This is an exciting idea because developing small-molecule protein-targeting therapeutics is difficult, and this manuscript suggests a new way to map small molecule-protein interactions.

      Weaknesses

      The authors derive insights into the relationship between MC4R signaling through different pathways and its structure. While these make sense based on what is already known, the manuscript would be stronger if some of these insights were validated using methods other than deep mutational scanning.

      Likewise, the authors use their data to identify positions where variants disrupt MC4R activation by one small molecule agonist but not another. They hypothesize these effects point to positions that are more or less important for the binding of different small molecule agonists. The manuscript would be stronger if some of these insights were explored further.

      Impact

      In this manuscript the authors present new methods, including a statistical framework for analyzing deep mutational scanning data that will have a broad impact. They also generate MC4R variant effect data that is of interest to the GPCR community.

      Comments on revisions:

      I do not have additional comments, and feel that the authors addressed most of my concerns!

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public reviews):

      Summary

      Howard et al. performed deep mutational scanning on the MC4R gene, using a reporter assay to investigate two distinct downstream pathways across multiple experimental conditions. They validated their findings with ClinVar data and previous studies. Additionally, they provided insights into the application of DMS results for personalized drug therapy and differential ligand responses across variant types.

      Strengths

      They captured over 99% of variants with robust signals and investigated subtle functionalities, such as pathway-specific activities and interactions with different ligands, by refining both the experimental design and analytical methods.

      Weaknesses

      While the study generated informative results, it lacks a detailed explanation regarding the input library, replicate correlation, and sequencing depth for a given number of cells. Additionally, there are several questions that it would be helpful for authors to clarify.

      (1) It would be helpful to clarify the information regarding the quality of the input library and experimental replicates. Are variants evenly represented in the library? Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct? Finally, could the authors provide details on the correlation between experimental replicates under each condition?

      Are variants evenly represented in the library?

      We strive to achieve as evenly balanced a library as possible at every stage of the DMS process (e.g., initial cloning in E. coli through integration into human cells). Below is a representative plot showing the number of barcodes per amino acid variant at each position in a given ~60 amino acid subregion of MC4R, which highlights how evenly variants are represented at the E. coli cloning stage.

      Author response image 1.

      We also make similar measurements after the library is integrated into HEK293T cell lines, and see similarly even coverage across all variants, as shown in the plot below:

      Author response image 2.

      Additionally, have the authors considered using long-read sequencing to confirm the presence of a single intended variant per construct?

      We agree long-read sequencing would be an excellent way to confirm that our constructs contain a single intended variant. However, we opted for an alternate method (outlined in more detail in Jones et al. 2020) that leverages multiple layers of validation. First, the oligo chip-synthesized portions of the protein containing the variants are cloned into a sequence-verified plasmid backbone, which greatly decreases the chances of spuriously generating a mutation in a different portion of the protein. We then sequence both the oligo portion and random barcode using overlapping paired-end reads during barcode mapping to avoid sequencing errors and to help detect DNA synthesis errors. At this stage, we computationally reject any constructs that have more than one variant. Given this, the vast majority of remaining unintended variants would come from somatic mutations introduced by the E. coli cloning or replication process, which should be low frequency. We have used our in-house full plasmid sequencing method, OCTOPUS, to sample and spot check this for several other DMS libraries we have generated using the same cloning methods. We have found variants in the plasmid backbone in only ~1% of plasmids in these libraries. Our statistical model also helps correct for this by accounting for barcode-specific variation. Finally, we believe this provides further motivation for having multiple barcodes per variant, which dilutes the effect of any unintended additional variants.

      Finally, could the authors provide details on the correlation between experimental replicates under each condition?

      Certainly! In general, the Gs reporter had higher correlation between replicates than the Gq system (r ~ 0.5 vs r ~ 0.4). The plots below, which have been added as a panel to Supplementary Figure 1, show two representative correlations at the RNA-seq stage of read counts for barcodes between the low α-MSH conditions.

      We added the following text to reference this panel:

      (see Methods > Sequence processing for barcode expression): “The correlation (r) of barcode read counts between replicates was ~0.5 and ~0.4 for the Gs and Gq assays, respectively (Supplementary Fig. 1E).”
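
      A replicate correlation of this kind can be computed as in the minimal sketch below. It assumes Pearson's r on raw read counts over the barcodes present in both replicates; whether the authors transformed the counts or handled missing barcodes differently is not stated here. The barcode sequences and counts are hypothetical.

```python
import numpy as np


def replicate_correlation(rep_a: dict, rep_b: dict) -> float:
    """Pearson r of read counts over barcodes observed in both replicates."""
    shared = sorted(set(rep_a) & set(rep_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared barcodes")
    a = np.array([rep_a[bc] for bc in shared], dtype=float)
    b = np.array([rep_b[bc] for bc in shared], dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical barcode read counts for two replicates of one condition.
rep_a = {"AAAC": 8, "AAGT": 12, "ACGT": 3, "TTGA": 20}
rep_b = {"AAAC": 10, "AAGT": 9, "ACGT": 5, "GGCA": 7}
r = replicate_correlation(rep_a, rep_b)
print(round(r, 3))
```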

      One important advantage of our statistical model is that it’s able to leverage information from barcodes regardless of the number of replicates they appear in.

      (2) Since the functional readout of variants is conducted through RNA sequencing, it seems crucial to sequence a sufficient number of cells with adequate sequencing saturation. Could the authors clarify the coverage depth used for each RNA-seq experiment and how this depth was determined? Additionally, how many cells were sequenced in each experiment?

      The text has been added in the manuscript as follows:

      (in Methods > Running DMS Assays): “Given the seeding density (~17x10<sup>6</sup> cells per 150 mm replicate dish), time from seeding to collection, and doubling time of HEK293T cells, approximately 25.5x10<sup>6</sup> cells were collected per replicate. This translates to approximately 30-60x cellular coverage per amino acid variant in each replicate.”

      (in Methods > Sequence processing for barcode expression): “Total mapped reads per replicate at the RNA-seq stage were as follows:

      - Gs/CRE: 9.1-18.2 million mapped reads, median=12.3

      - Gq/UAS: 8.6-24.1 million mapped reads, median=14.5

      - Gs/CRE+Chaperone: 6.4-9.5 million mapped reads, median=7.5

      The median read counts per sample per barcode were 8, 10, and 6 reads for Gs/CRE, Gq/UAS, and Gs/CRE+Chaperone assays, respectively. The median number of barcodes per variant across all samples (the “median of medians”) was 56 for Gs/CRE, 28 for Gq/UAS, and 44 for Gs/CRE+Chaperone.”

      (3) It appears that the frequencies of individual RNA-seq barcode variants were used as a proxy for MC4R activity. Would it be important to also normalize for heterogeneity in RNA-seq coverage across different cells in the experiment? Variability in cell representation (i.e., the distribution of variants across cells) could lead to misinterpretation of variant effects. For example, suppose barcode_a1 represents variant A and barcode_b1 represents variant B. If the RNA-seq results show 6 reads for barcode_a1 and 7 reads for barcode_b1, it might initially appear that both variants have similar effect sizes. However, if these reads correspond to 6 separate cells each containing 1 copy of barcode_a1, and only 1 cell containing 7 copies of barcode_b1, the interpretation changes significantly. Additionally, if certain variants occupy a larger proportion of the cell population, they are more likely to be overrepresented in RNA sequencing.

      We account for this heterogeneity in several ways. First, as shown above (see Response to Reviewer 1, Question 1), we aim to have even representation of variants within our libraries. Second, we utilize compositional control conditions like forskolin or unstimulated conditions to obtain treatment-independent measurements of barcode abundance and, consequently, of mutant-vs-WT effects that are due to compositional rather than biological variability. We expect that variability observed under these controls is due to subtle effects of molecular cloning, gene expression, and stochasticity. Using these controls, we observe that mutant-vs-WT effects are generally close to zero in these normalization conditions (e.g., in untreated Gq, see Supplementary Figure 3) as compared to treated conditions. For example, premature stops behave similarly to WT in normalization conditions. This indicates that mutant abundance is relatively homogeneous. Where there are barcode-dependent effects on abundance, we can use information from these conditions to normalize that effect. Finally, our mixed-effect model accounts for barcode-specific deviations from the expected mutant effect (e.g., a “high count” barcode consistently being high relative to the mean).
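
      The logic of using a control condition to separate compositional from biological effects can be illustrated with a simple difference of log-ratios. This is a deliberately simplified sketch and not the authors' mixed-effect model; the function name, the pseudocount, and all counts are hypothetical.

```python
import math


def normalized_effect(mut_treated, wt_treated, mut_control, wt_control,
                      pseudocount=0.5):
    """Mutant-vs-WT log2 ratio under treatment, minus the same ratio under control."""
    treated = math.log2((mut_treated + pseudocount) / (wt_treated + pseudocount))
    control = math.log2((mut_control + pseudocount) / (wt_control + pseudocount))
    # Abundance differences that also appear in the control condition cancel out.
    return treated - control


# A variant that is twice as abundant as WT in the library but signals like WT:
# the twofold excess appears in both conditions, so the normalized effect is zero.
abundant_but_neutral = normalized_effect(200, 100, 200, 100)

# A variant with WT-like abundance in the control but reduced reads under treatment.
loss_of_function = normalized_effect(25, 100, 100, 100)
print(abundant_but_neutral, loss_of_function)
```

      In the reviewer's example, a barcode that is overrepresented for compositional reasons would also be overrepresented under the control condition, so the excess would not be attributed to receptor activity.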

      (4) Although the assay system appears to effectively represent MC4R functionality at the molecular level, we are curious about the potential disparity between the DMS score system and physiological relevance. How do variants reported in gnomAD distribute within the DMS scoring system?

      Figure 2D shows DMS scores (variant effect on Gs signaling) relative to human population frequency for all MC4R variants reported in gnomAD as of January 8, 2024.

      (5) To measure Gq signaling, the authors used the GAL4-VPR relay system. Is there additional experimental data to support that this relay system accurately represents Gq signaling?

      The full Gq reporter uses an NFAT response element from the IL-2 promoter to regulate the expression of the GAL4-VPR relay. In this system, the activation of Gq signaling results in the activation of the NFAT response element, and this signal is then amplified by the GAL4-VPR relay. The NFAT response element has been previously well-validated to respond to the activation of Gq signaling (e.g., Boss, Talpade, and Murphy 1996). We have added this reference to the text (see Results > Assays for disease-relevant mechanisms) to further support the use of the Gq assay.

      (6) Identifying the variants responsive to the corrector was impressive. However, we are curious about how the authors confirmed that the restoration of MC4R activity was due to the correction of the MC4R protein itself. Is there a possibility that the observed effect could be influenced by other factors affected by the corrector? When the corrector was applied to the cells, were any expected or unexpected differential gene expression changes observed?

      While we do not directly measure whether Ipsen-17 has effects on other signaling processes, previous work has shown that Ipsen-17 treatment does not indirectly alter signaling kinetics such as receptor internalization (Wang et al., 2014). Furthermore, our analysis methods inherently account for this by normalizing variant effects to WT signaling levels. Any observed rescue of a given variant inherently means that the variant is specifically more responsive to Ipsen-17 than WT, and the fact that different variants exhibit different levels of rescue is reassuring that the mechanism is on target to MC4R. Lastly, Ipsen-17 is known to be an antagonist of alpha-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al., 2014).

      We have revised the text in the Methods section as follows (see Running DMS Assays) to better articulate this: “For chaperone experiments, cells were washed 3x with 10 mL DMEM to remove Ipsen 17 prior to agonist stimulation as it has been shown to be an antagonist of α-MSH activity and is thought to bind directly to the same site on MC4R (Wang et al. 2014).”

      (7) As mentioned in the introduction, gain-of-function (GoF) variants are known to be protective against obesity. It would be interesting to see further studies on the observed GoF variants. Do the authors have any plans for additional research on these variants?

      We agree this would be an excellent line of inquiry, but due to changes in company priorities we unfortunately do not have any plans for additional research on these variants.

      Reviewer 2 (Public reviews):

      Overview

      In this manuscript, the authors use deep mutational scanning to assess the effect of ~6,600 protein-coding variants in MC4R, a G protein-coupled receptor associated with obesity. Reasoning that current deep mutational scanning approaches are insufficiently precise for some drug development applications, they focus on articulating new, more precise approaches. These approaches, which include a new statistical model and innovative reporter assay, enable them to probe molecular phenotypes directly relevant to the development of drugs that target this receptor with high precision and statistical rigor.

      They use the resulting data for a variety of purposes, including probing the relationship between MC4R's sequence and structure, analyzing the effect of clinically important variants, identifying variants that disrupt downstream MC4R signaling via one but not both pathways, identifying loss-of-function variants that are amenable to a corrector drug, and exploring how deep mutational scanning data could guide small molecule drug optimization.

      Strengths

      The analysis and statistical framework developed by the authors represent a significant advance. In particular, the study makes use of barcode-level internally replicated measurements to more accurately estimate measurement noise.

      The framework allows variant effects to be compared across experimental conditions, a task that is currently hard to do with rigor. Thus, this framework will be applicable to a large number of existing and future deep mutational scanning experiments.

      The authors refine their existing barcode transcription-based assay for GPCR signaling, and develop a clever new "relay" reporter system to boost signaling in a particular pathway. They show that these reporters can be used to measure both gain of function and loss of function effects, which many deep mutational scanning approaches cannot do.

      The use of systematic approaches to integrate and then interrogate high-dimensional deep mutational scanning data is a big strength. For example, the authors applied PCA to the variant effect results from reporters for two different MC4R signaling pathways and were able to discover variants that biased signaling through one or the other pathway. This approach paves the way for analyses of higher dimensional deep mutational scans.

      The authors use the deep mutational scanning data they collect to map how different variants affect the ability of small-molecule agonists to activate MC4R signaling. This is an exciting idea, because developing small-molecule protein-targeting therapeutics is difficult, and this manuscript suggests a new way to map small-molecule-protein interactions.

      Weaknesses

      The authors derive insights into the relationship between MC4R signaling through different pathways and its structure. While these make sense based on what is already known, the manuscript would be stronger if some of these insights were validated using methods other than deep mutational scanning.

      Likewise, the authors use their data to identify positions where variants disrupt MC4R activation by one small molecule agonist but not another. They hypothesize these effects point to positions that are more or less important for the binding of different small molecule agonists. The manuscript would be stronger if some of these insights were explored further.

      Impact

      In this manuscript, the authors present new methods, including a statistical framework for analyzing deep mutational scanning data that will have a broad impact. They also generate MC4R variant effect data that is of interest to the GPCR community.

      Recommendations for the authors:

      (1) Page 7 - the Gq reporter relay system is clever. Could the authors include the original data showing that the simpler design didn't work at all, or at least revise the text to say more precisely what "not suitable due to weak SNR" means?

      We added a panel (D) to Supplementary Figure 2 showing that the native NFAT reporter was ~10x weaker than the CRE reporter, and the relay system amplified the NFAT signal to be comparable to the CRE reporter:

      (2) Page 7 - Even though the relay system gives some signal, it's clearly less sensitive/higher background than Gs. How does that play out in the quantitative analysis?

      —AND—

      (4) Page 10 - The Gq library had fewer barcodes per variant, and, as noted above, the Gq reporter doesn't work quite as well as the Gs one. It would be nice if the authors could comment on how these aspects of the Gq experiments affected data quality/power to detect effects.

      Following the reviewer's excellent suggestion, we updated Supplementary Figure 2B to better contextualize the quantitative effects of the difference in signal-to-noise ratio of the Gq versus the Gs reporter system (see changes below). These distributions show the Z-statistic for testing either each stop mutation (red) or all possible coding variants against WT. Thus, |Z| > 1.96 corresponds to p < 0.05 in a two-sided Wald test. We can see that in the Gs reporter, 95% of the stops are nominally significantly different from WT (visualized above with the majority of the red distribution being < -1.96). In contrast, only 64% of stops are nominally significantly different from WT in Gq. This implies that it will be more difficult to detect effects in the Gq system, especially those less severe than stops.
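
      As an illustration only (not the authors' code), the link between the Z-statistic threshold and the nominal significance level can be sketched in a few lines. The function names are hypothetical, and the sketch assumes the Z-statistic is standard normal under the null hypothesis, as in a Wald test.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a Z-statistic that is standard normal under the null."""
    # For a standard normal, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_nominally_significant(z: float, alpha: float = 0.05) -> bool:
    """True when the two-sided p-value falls below alpha (no multiple-testing correction)."""
    return two_sided_p(z) < alpha


# A stop variant with Z = -2.5 would count as nominally significant; Z = -1.0 would not.
examples = {z: is_nominally_significant(z) for z in (-2.5, -1.0)}
```

      Under this assumption, a stop variant is counted as nominally different from WT whenever its Z-statistic lies beyond roughly ±1.96.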

      In addition to the overall signal-to-noise ratio being lower in the Gq system, there were also fewer barcodes per variant (28 vs 56 barcodes per variant on average for Gq vs Gs). As demonstrated in Supplementary Figure 2C, the error bars on our estimates are related to the number of barcodes per variant (Standard Error ~ 1 / sqrt(Number of Barcodes), as shown in the plot below). This suggests that our estimates of mutant effects will be less certain in the Gq library than the Gs library. For example, the average standard error in the Gq library was 0.260, which was ~1.58 times larger than the Gs library's 0.165. Finally, we believe this further reiterates the power of our statistical framework, as it naturally enables formalized hypothesis testing that takes these errors into account when making comparisons both within reporters and across reporters.
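
      The scaling quoted above can be checked with a short back-of-the-envelope calculation. This is a rough sketch rather than the authors' analysis: it plugs the average barcode counts into the 1/sqrt(N) relationship, which is only an approximation because the average of per-variant standard errors is not exactly the standard error at the average barcode count.

```python
import math


def expected_se_ratio(n_barcodes_a: float, n_barcodes_b: float) -> float:
    """Expected SE(a) / SE(b) if the standard error scales as 1 / sqrt(number of barcodes)."""
    return math.sqrt(n_barcodes_b / n_barcodes_a)


# Ratio predicted from barcode counts alone (Gq: 28 per variant, Gs: 56 per variant).
barcode_only_ratio = expected_se_ratio(28, 56)

# Ratio of the average standard errors reported in the response (Gq 0.260, Gs 0.165).
observed_ratio = 0.260 / 0.165

# Whatever the barcode counts do not explain, e.g. the lower signal-to-noise of the Gq reporter.
residual_factor = observed_ratio / barcode_only_ratio
```

      Under these assumptions, halving the number of barcodes accounts for a factor of about 1.41, so most of the reported ~1.58-fold difference in standard error is consistent with barcode count alone, with a smaller remainder left for other sources of noise.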

      (3) Page 9 - it would be nice to see the analysis framework applied to a few existing datasets from other types of assays, to really judge its performance. That's not the main point of this paper, and it's fine, but it would be lovely!

      We agree with the reviewer and hope others apply our framework to their problems to further refine its utility and applicability! To that end, we’ve open-sourced it under a permissive license to help encourage the community to use it. Part of the challenge in applying it to other existing datasets is that few DMS experiments leverage variant-level replication through barcodes. While we re-analyzed an older DMS dataset from Jones et al. 2020 to produce the distributions in Supplementary Figure 2B, a more thorough comparison is outside the scope of this paper. That said, we have two additional manuscripts in preparation that leverage this framework to analyze DMS data in different proteins and assay types.

      (5) Page 10 - In discussing the relationship of the data to ClinVar and AM, the authors use qualitative comparisons like "majority" and "typically." Just giving numbers would better help the reader appreciate how the data compare.

      We added specific proportions for these statements to the text for the ClinVar and AlphaMissense comparisons as follows:

      (See Results > Comprehensive Deep Mutational Scanning of MC4R): “For example, the majority (63.3%, 31/49) of human MC4R variants classified as pathogenic or likely pathogenic in ClinVar (Landrum et al., 2014) lead to a significant reduction of Gs signaling under low α-MSH stimulation conditions (significance threshold: false discovery rate (FDR) < 1%; Fig. 2C). Variants that are significantly loss-of-function in this condition are rarer in the human population, and more common human variants have no significant effect on MC4R function (significance threshold: FDR < 1%; Fig. 2D). Loss-of-function variants by our DMS assay are also typically (e.g., AlphaMissense: 93.4%, 1894/2028) predicted to be deleterious by commonly used variant effect predictors like AlphaMissense (Cheng et al., 2023) and popEVE (Orenbuch et al., 2023) (Supplementary Fig. 5).”
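
      The proportions quoted in the revised text follow directly from the stated counts. The snippet below is a simple arithmetic check; the helper name is hypothetical.

```python
def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Pathogenic / likely pathogenic ClinVar variants with significantly reduced Gs signaling.
clinvar_lof_percent = percent(31, 49)

# DMS loss-of-function variants predicted deleterious by AlphaMissense.
alphamissense_percent = percent(1894, 2028)
```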

      (6) Pages 10-12, Figures 2C, E. The data look really nice, but the correlation with clinvar and the Huang data is not perfect (e.g. many pathogenic variants are classified as WT and partial LoF variants too). Can the authors comment on this discrepancy? For ClinVar, they should say when ClinVar was accessed and also how they filtered variants. I would recommend using variants with at least 1 star. Provided they did use high-quality clinical classifications, do they think the classifications are wrong, or their data? The same goes for Huang.

      —AND—

      (7) Page 13 - similar to previous comments, I'm curious about the 5 path/likely path ClinVar variants that are not LoF in the assay. Are they high noise/fewer barcodes? Or does the assay just miss some aspect of human biology?

      ClinVar data was accessed on January 5, 2024 (see Methods: Comparison to human genetics data and variant effect predictors). No annotation quality filtering was performed, and we have revised the text as follows to clarify this:

      (see Methods > Comparison to Human Genetics Data and Variant Effect Predictors): “Pathogenicity classifications of MC4R missense and nonsense variants were obtained from ClinVar (Landrum et al., 2014) on January 5, 2024, and all available annotations were included in the analysis regardless of ClinVar review status metric.”

      A substantial proportion of the discrepancy between our data and ClinVar is, as the reviewer suggests, likely due to low quality ClinVar annotations. Of the five variants that the reviewer notes were reported as pathogenic/likely pathogenic but did not result in loss of protein function in any of our DMS assays, two (V50M and V166I) have been reclassified in ClinVar to uncertain or conflicting interpretation since we accessed annotations in early 2024. An additional two of the five discrepant variants (Q43K and S58C) currently have 0 star ratings to support their pathogenic/likely pathogenic annotation. The remaining discrepant variant (S94N) has a 1 star rating supporting an annotation of “likely pathogenic.”

      The Huang et al. paper did an admirably thorough job of aggregating variant annotations from more than a dozen primary literature sources that each reported functional validation data for small panels of variants. However, one inherent limitation of this approach is that the resulting annotation classes are based on experiments that were carried out using inconsistent methods and/or scoring criteria. For example, classifications in the Huang et al. paper are based on an inconsistent mix of functional assay types (e.g., Gs signaling, Gq signaling, protein cell surface expression, etc.), and different variants were tested in different cell types (e.g., HEK293T, CHO, Cos-7, etc.). In principle, DMS assays should provide a more accurate assessment of the relative quantitative differences between alleles since each variant was tested using identical experimental conditions and analysis parameters.

      That being said, while very good, our assays are likely missing or only indirectly reporting on at least some aspects of MC4R biology. For example, in addition to Gs and Gq signaling, MC4R interfaces with β-arrestin. Variants that are protective against obesity-related phenotypes have been shown to increase recruitment of β-arrestin to MC4R, and we did not directly assess this function.

      (8) Page 15, Fig 3C - The three variants they highlight all have paradoxical changes in bias as a-MSH dose is increased (e.g. the bias inverts). I'm not a GPCR expert, but this seems interesting and a little weird. Perhaps the authors could comment on it?

      We agree this is an interesting observation that deserves further study, but unfortunately is outside the scope of our priorities at the moment. As noted, all three highlighted variants in this region have a biased basal activity, and this bias inverts upon stimulation. While we don’t have a good explanation for why this would be the case, this phenomenon has been previously observed for 158R (Paisdzior et al., 2020). Our DMS data emphasizes how diverse biased effects can be and further highlights the importance of characterizing these effects. It would be interesting if further studies could elucidate the mechanistic basis for this behavior and how it may be related to G protein coupling in this region.

      (9) Page 16 - I'm not familiar with the A21x1 formalism. For the general reader, maybe the authors could introduce this formalism.

      Given the shared structural topology of GPCRs, others have developed a variety of numbering schemes that describe where a residue sits in the receptor, allowing more direct comparisons between different GPCRs. We use the GPCRDB.org numbering scheme (e.g., F202<sup>5x48</sup>) as it takes experimentally determined structures into account. Roughly speaking, the number preceding the “x” corresponds to which transmembrane domain (one through seven) or region the residue is located in. The numbers following the “x” correspond to where that residue is located in that region relative to a structurally conserved residue that is always assigned 50. For example, F202<sup>5x48</sup> means that F202 is located in the 5th transmembrane helix and is 2 residues before the most conserved M204<sup>5x50</sup>. We updated the text to clarify this accordingly:

      (see Results > Structural Insights into Biased Signaling): “Upon ligand binding, W258 (W258<sup>6x48</sup> in https://gpcrdb.org/ nomenclature, where 6 corresponds to the 6th transmembrane helix and 48 denotes 258 is 2 residues before the most conserved residue in that helix (Isberg et al., 2015)) of the conserved CWxP motif undergoes a conformational rearrangement that is translated to L133<sup>3x36</sup> and I137<sup>3x40</sup>, of the conserved PIF motif (MIF in melanocortin receptors).”
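
      To make the numbering scheme concrete, the following sketch parses a label and reports its offset from the reference residue of its region. It is illustrative only: the plain-text input format ("F202 5x48"), the class, and the function names are assumptions, and only the simple two-digit positions used in the examples above are handled.

```python
import re
from typing import NamedTuple


class GpcrdbLabel(NamedTuple):
    residue: str   # receptor residue, e.g. "F202"
    region: int    # number before the "x" (transmembrane helix 1-7 or another region)
    position: int  # number after the "x"; 50 marks the reference residue of the region


# Hypothetical plain-text form of a label: residue, a space, then "<region>x<position>".
_LABEL = re.compile(r"^([A-Z]\d+)\s+(\d+)x(\d{2})$")


def parse_label(text: str) -> GpcrdbLabel:
    """Split a label such as 'F202 5x48' into its three parts."""
    match = _LABEL.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised label: {text!r}")
    residue, region, position = match.groups()
    return GpcrdbLabel(residue, int(region), int(position))


def offset_from_reference(label: GpcrdbLabel) -> int:
    """Signed distance from the reference residue (position 50); negative means before it."""
    return label.position - 50


f202 = parse_label("F202 5x48")
w258 = parse_label("W258 6x48")
```

      Here `offset_from_reference(f202)` evaluates to -2, matching the statement that F202 lies 2 residues before the reference residue of the 5th transmembrane helix.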

      (10) Page 17, Figure 3A - Since 137, 254, and 140 are not picked out on the structure, I have no idea where they are. If the authors want to show readers these residues, perhaps they could be annotated or a panel added. Since ~1 entire page of the manuscript is dedicated to this cascade, it might make sense to add a panel. Just amplifying the comment above as regards position 79, others were discussed in that paragraph but not highlighted.

      We updated Supplementary Fig. 6C,D to label all of the listed residues on the protein structure for easy reference.

    1. eLife Assessment

      This manuscript describes an important study of the giant virus Jyvaskylavirus. The characterisation presented is compelling. The work will be of interest to virologists working on giant viruses as well as those working with other members of the PRD1/Adenoviridae lineage.

    2. Reviewer #1 (Public review):

      This study presents Jyvaskylavirus, a new member of the Marseilleviridae family, infecting Acanthamoeba castellanii. The study provides a detailed and comprehensive genomic and structural analysis of Jyvaskylavirus. The authors identified ORF142 as the capsid penton protein and additional structural proteins that comprise the virion. Using a combination of imaging techniques the authors provide new insights into the giant virus architecture and lifecycle. The study could be improved by providing atomic coordinates and refinement statistics, comparisons with available giant virus structures could be expanded, and the novelty in terms of the first isolated example of a giant virus from Finland could be expounded upon.

      The study contributes new structural and genomic diversity to the Marseilleviridae family, hinting at a broader distribution and ecological significance of giant viruses than previously thought.

      Comments on revisions: I'm satisfied with the authors' responses to the review, and request no further changes.

    3. Reviewer #2 (Public review):

      This paper describes the molecular characterisation of a new isolate of the giant virus Jyvaskylavirus, a member of the Marseilleviridae family infecting Acanthamoeba castellanii. The isolate comes from a boreal environment in Finland, showcasing that giant viruses can thrive in this ecological niche. The authors came up with a non-trivial isolation procedure that can be applied to characterise other members of the family and will be beneficial for the virology field. The genome shows typical Marseilleviridae features and phylogenetically belongs to their clade B. The structural characterisation was performed on the level of isolated virion morphology by negative stain EM, virions associated with cells either during the attachment or release by helium microscopy, the visualisation of the virus assembly inside cells using stained thin sections, and lastly on the protein secondary structure level by reconstructing ~6 Å icosahedral map of the massive virion using cryoEM. The cryoEM density combined with gene product structure prediction enabled the identification and functional assessment of various virion proteins. The visualisation of ongoing virus assembly inside virus factories brings interesting hypotheses about the process that, however, need to be verified in future studies.

      Strengths:

      The detailed description of the virus isolation protocol is the largest strength of the paper and I believe it can be modified for isolating various viruses infecting small eukaryotes. The cryoEM map allows us to understand how exceptionally large virions of these viruses are stabilised by minor capsid proteins and nicely demonstrates the integration of medium-resolution cryoEM with protein structure prediction in deciphering virion protein function.

      Weaknesses:

      No mass spectrometry data are presented to supplement and confirm the identity of virion proteins whose predicted models were fitted into the cryoEM density.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This manuscript describes an important study of the giant virus Jyvaskylavirus. The characterisation presented is solid, although, in the current form, it is not clear to what extent these findings change our perception of how giant viruses, especially those isolated from a cold environment, function. The work will be of interest to virologists working on giant viruses as well as those working with other members of the PRD1/Adenoviridae lineage.

      Thank you for the revision and positive comments. We decided to submit our revised version of the manuscript with changes made in light of the comments made by the editorial team and the reviewers. We hope that the manuscript is now in better shape and satisfies all comments received. Major changes made were:

      - We changed the author order considering reviewer 2 comments (point 11). Note that no author was added or removed; we only rearranged the order of authorship.

      - We included a new supplementary table with the Jyvaskylavirus genome annotation. This is now supplementary table 2.

      - We included a supplementary figure 9 to support our changes based on reviewer 2 comments (point 6).

      - Figures 2,5,6,7 and the supplementary figure 2 were updated to accommodate our answers to different reviewer comments.

      - Three new references were added to support some of our changes.

      Below you will find our responses to each specific point raised by the reviewers.

      Public Reviews:

      Reviewer #1 (Public review):

      This study presents Jyvaskylavirus, a new member of the Marseilleviridae family, infecting Acanthamoeba castellanii. The study provides a detailed and comprehensive genomic and structural analysis of Jyvaskylavirus. The authors identified ORF142 as the capsid penton protein and additional structural proteins that comprise the virion. Using a combination of imaging techniques the authors provide new insights into the giant virus architecture and lifecycle. The study could be improved by providing atomic coordinates and refinement statistics, comparisons with available giant virus structures could be expanded, and the novelty in terms of the first isolated example of a giant virus from Finland could be expounded upon.

      The study contributes new structural and genomic diversity to the Marseilleviridae family, hinting at a broader distribution and ecological significance of giant viruses than previously thought.

      Thank you for your constructive comments. We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly. By following your specific comments, we improved the manuscript regarding atomic coordinates, refinement statistics and novelty of finding a Finnish marseillevirus. Details are provided in the specific answers to your points.

      Reviewer #2 (Public review):

      Summary:

      This paper describes the molecular characterisation of a new isolate of the giant virus Jyvaskylavirus, a member of the Marseilleviridae family infecting Acanthamoeba castellanii. The isolate comes from a boreal environment in Finland, showcasing that giant viruses can thrive in this ecological niche. The authors came up with a non-trivial isolation procedure that can be applied to characterise other members of the family and will be beneficial for the virology field. The genome shows typical Marseilleviridae features and phylogenetically belongs to their clade B. The structural characterisation was performed on the level of isolated virion morphology by negative stain EM, virions associated with cells either during the attachment or release by helium microscopy, the visualisation of the virus assembly inside cells using stained thin sections, and lastly on the protein secondary structure level by reconstructing ~6 Å icosahedral map of the massive virion using cryoEM. The cryoEM density combined with gene product structure prediction enabled the identification and functional assessment of various virion proteins.

      Strengths:

      The detailed description of the virus isolation protocol is the largest strength of the paper and this reviewer believes it can be modified for isolating various viruses infecting small eukaryotes. The cryoEM map allows us to understand how exceptionally large virions of these viruses are stabilised by minor capsid proteins and nicely demonstrates the integration of medium-resolution cryoEM with protein structure prediction in deciphering virion protein function. The visualisation of ongoing virus assembly inside virus factories brings interesting hypotheses about the process that, however, need to be verified in future studies.

      Weaknesses:

      The conclusions from helium microscopy images are overinterpreted, as the native membrane structure cannot be preserved in a fixed and dehydrated sample. In the image, there are many other parts of the curved membrane and a lot of virions; to me, it seems the specific position of the highlighted virion could arise by random chance. The claim that the cells were imaged in the near-original state by this method should therefore be omitted. Also, no mass spectrometry data are presented that would supplement and confirm the identity of virion proteins whose predicted models were fitted into the cryoEM density. For a general virology reader outside of the giant virus field, the results presented in the current state might not have enough influence and the section should be rewritten to better showcase the novelty of findings.

      Thank you for your constructive comments. We thank reviewer #2 for highlighting these weaknesses, giving us the opportunity to improve our study. We have removed the claim that the cells were imaged in a near-original state. Additionally, we agree that the positions of the virions on the cell surface could result from a random distribution. However, the specific virion in panel 3C is situated halfway into a crevice, and it cannot be ruled out that this particular one could be in the process of being taken up by endocytosis. This is why we used the term "probably" while referring to this finding. Regarding the mass spectrometry data, while we understand that MS data would provide an additional layer of evidence to validate the specific proteins present in the virion, they would not confirm the precise location or role of these proteins within the virion.

      We have addressed each point raised in our rebuttal letter and revised the manuscript accordingly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I have only minor comments which should be relatively simple to address:

      (1) Atomic coordinates should be deposited in the PDB, and refinement statistics for the models provided, for example by expanding Table S2.

      We thank reviewer #1 for the suggestion. In the original submission in the ‘Data availability’ statement we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. So, atomic coordinates of all predicted models are publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity we also added the link in the ‘Data availability’ statement in the revised version.

      Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map of 6.3 Å resolution. However, we have added to our BioStudies deposition (BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.

      We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the corresponding Melbournevirus available densities with entries EMD-37188, 37189, 37190 at ~ 3.5 Å resolution (by block-based reconstruction methods) as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).

      In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7). The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.

      The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:

      “Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”

      (2) The results section 'Jyvaskylavirus three-dimensional architecture' could be expanded to compare and contrast with other giant virus structures, in terms of T-number, diameter, and features on and inside the capsid. This is not essential but would help focus claims of novelty with regard to structure.

      We have added a few lines, as suggested by reviewer #1, to place Jyvaskylavirus in morphological context with other NCLDVs, as follows:

      “Both the capsid organization and virion size are similar to those of other Marseilleviruses, such as Melbournevirus and Tokyovirus. Pacmanvirus, considered to be at the crossroads between Asfarviridae and Faustoviruses, also possesses the same T number (309) and a comparable diameter to Jyvaskylavirus. In contrast, other giant viruses, such as African swine fever virus (ASFV), representative of the Asfarviridae family, have a T number of 277 and a diameter of approximately 2,100 Å, while PBCV-1, a member of the Phycodnaviridae family, has a T number of 169 and an average diameter of 1,900 Å. All of the above-mentioned viruses have been shown to possess a major capsid protein with a vertical double jelly-roll fold that composes the capsid shell, along with an internal membrane bilayer. Minor capsid proteins have been identified and structurally modelled for the smaller virions ASFV and PBCV-1 (Wang et al. 2019; Shao et al. 2022).”
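
      The T numbers quoted above can be related to icosahedral lattice indices through the standard Caspar-Klug relation T = h² + hk + k². The sketch below simply enumerates the index pairs compatible with a given T. It is an illustration with hypothetical function names and makes no claim about which pair or handedness applies to any particular virus.

```python
from typing import List, Tuple


def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for non-negative lattice indices h and k."""
    return h * h + h * k + k * k


def lattice_indices(t: int) -> List[Tuple[int, int]]:
    """All pairs (h, k) with 0 <= h <= k whose triangulation number equals t."""
    pairs = []
    h = 0
    # With h <= k, the triangulation number is at least 3 * h^2, which bounds the search.
    while 3 * h * h <= t:
        k = h
        while triangulation_number(h, k) <= t:
            if triangulation_number(h, k) == t:
                pairs.append((h, k))
            k += 1
        h += 1
    return pairs


# T numbers mentioned in the text above.
indices_by_t = {t: lattice_indices(t) for t in (309, 277, 169)}
```

      For T = 309 and T = 277 the enumeration yields a single pair each, (7, 13) and (7, 12), whereas T = 169 is compatible with two pairs, (0, 13) and (7, 8), so the T number alone does not always determine the lattice.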

      (3) The authors highlight one of the main novelties of the virus as being the first to be isolated from Finland. The first isolation of a giant virus from the region is indeed a success but reported isolation experiments for giant viruses are still relatively few. To help shed light on the likely distribution of Jyvaskylavirus-like viruses in the region, and further afield, the genome of Jyvaskylavirus could be searched against relevant available metagenomes.

      In the last decade, interest in finding giant viruses by metagenomics has increased. However, the focus has been on marine environments, where these viruses are shown to be prevalent. Besides the few isolates from the Northern hemisphere mentioned in the manuscript, northern giant viruses were detected in metagenome datasets from glacier samples, epishelf lakes, the permafrost, the Nordic seas and a deep-sea hydrothermal vent. Most of the genomic hits are for mimivirus-like or phycodnavirus-like sequences. A few marseilleviruses were found in the Loki’s Castle deep-sea vent, and we have already included these sequences in the analysis shown in Supplementary Figure 3. In this case, the deep-sea vent viruses cluster outside the conventional clades of the Marseilleviridae family, evidencing their uniqueness.

      In response to the suggestion of exploring the distribution of Jyvaskylavirus, we used the MGnify database to search for DNA polymerase (DNApol) and major capsid protein (MCP) sequences. Our findings revealed multiple hits with significantly low E-values (< 1e-80), where both DNApol and MCP were detected in the same studies, indicating the presence of similar virus-like particles (VLPs) globally. Of particular interest was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland (https://www.ebi.ac.uk/metagenomics/studies/MGYS00005650#overview). We have acknowledged this in the manuscript and cited the appropriate references, as follows:

      Results: “Searching the Jyvaskylavirus major capsid protein and DNA polymerase sequences in the MGnify-database (Richardson et al 2023) yields multiple hits with significantly low E-values (< 1e-80), as expected from the apparent ubiquity of marseilleviruses. Of note was the detection of similar sequences in metagenomes and transcriptomes obtained from drinking water distribution systems of ground and surface waterworks in central and eastern Finland, evidencing that marseilleviruses are prevalent but still unexplored in this region (Tiwari et al 2022)”.

      Discussion: “Marseillevirus DNA polymerase sequences are present in metagenomes from Finnish drinking water distribution systems (Tiwari et al 2022), hinting at a wide distribution of these viruses and a still unknown ecological role in Central and Eastern Finland.”
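
      The selection logic described in this response (keep hits below an E-value cutoff, then retain studies in which both marker genes are found) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the study labels and the E-values are hypothetical, and the snippet does not call the MGnify service or reproduce the authors' actual search.

```python
from collections import defaultdict

# Hypothetical hit records: (study label, marker gene, E-value).
# Illustrative values only; these are not results from the manuscript.
hits = [
    ("STUDY_A", "DNApol", 1e-120),
    ("STUDY_A", "MCP", 1e-95),
    ("STUDY_B", "DNApol", 1e-90),
    ("STUDY_B", "MCP", 1e-20),
    ("STUDY_C", "MCP", 1e-150),
]


def studies_with_both_markers(records, max_evalue=1e-80, markers=("DNApol", "MCP")):
    """Return studies in which every marker has at least one hit below max_evalue."""
    found = defaultdict(set)
    for study, marker, evalue in records:
        if evalue < max_evalue:
            found[study].add(marker)
    return sorted(study for study, seen in found.items() if set(markers) <= seen)


print(studies_with_both_markers(hits))
```

      Requiring both markers in the same study is a stricter criterion than a single-gene hit, which is why it is treated here as the condition for reporting a study.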

      Reviewer #2 (Recommendations for the authors):

      Apart from the major comments in the weaknesses section, I have these additional minor comments to the authors:

      (1) I do not understand why the authors emphasized the uniqueness of isolating a giant virus from Finland. I think the manuscript would benefit if they rather emphasize that the virus comes from a boreal environment.

      The first giant virus, APMV, was described in 2003. In the following years, the apparent ubiquity of these viruses became evident on two fronts. Metagenomics made clear that giant viruses are found almost everywhere, with a bias towards the oceans. Isolation efforts brought new virus groups to light but have so far been biased towards samples from central Europe and South America. The closest isolated giant viruses to Jyvaskylavirus would be either an uncharacterized Swedish cedratvirus or a few microalgae-infecting mimivirus-like and phycodnavirus-like isolates from Norway. Among marseilleviruses, Jyvaskylavirus is the northernmost isolate so far. Other marseilleviruses from the northern hemisphere were found in France, India, Japan and Algeria only.

      We still believe that finding a giant virus in Finland is relevant, considering that no other is known to date, whether as an isolate or detected by genomics. We have made these observations clearer in the manuscript, while also emphasizing the boreal environment.

      (2) All discussed AlphaFold models should be added as Supplementary PDB data.

      We thank reviewer #2 for the suggestion. In the ‘Data availability’ statement of the original submission, we stated that ‘Predicted Jyvaskylavirus PDB models using ModelAngelo and Alphafold have been deposited at BioStudies under the accession number S-BSST1654’. The atomic coordinates of all predicted models are therefore publicly available at https://www.ebi.ac.uk/biostudies/; for additional clarity, we have also added the link to the ‘Data availability’ statement in the revised version.

      Our reason for not depositing them in the Protein Data Bank in association with our EMD-51613 entry is that they remain predicted models rigid-body fitted into the Jyvaskylavirus density map at 6.3 Å resolution. However, we have added to our BioStudies deposition (BSST1654) the whole Jyvaskylavirus pentameric assembly model (including all identified and predicted major and minor capsid proteins) rigid-body fitted into the Jyvaskylavirus map, and it can be easily downloaded.

      We did not perform the real-space ‘minimization_global’ refinement of the predicted models corresponding to the ORFs of Melbournevirus (or Jyvaskylavirus) into the available Melbournevirus densities with entries EMD-37188, 37189, 37190 at ~ 3.5 Å resolution (by block-based reconstruction methods), as these maps were generated and deposited by other authors. Instead, we performed the rigid-body fit-into-map procedure of the individual predicted Jyvaskylavirus models into the previously deposited Melbournevirus maps using ChimeraX, demonstrating a fold-map alignment and assignment (see for example the individual stereo views in Supplementary Figure 6).

      In the revised version, we now provide the refinement statistics for the complete Jyvaskylavirus pentameric assembly (inclusive of peripentonal major capsid and minor capsid proteins) rigid-body fitted as a whole into the Melbournevirus 5-block reconstruction map using PHENIX, resulting in a CC<sub>mask</sub> of 57.3% (this is also stated in Supplementary Figure 7).

      The same pentameric assembly model was then placed into our lower-resolution 6.3 Å Jyvaskylavirus 3D density map in ChimeraX and rigid-body refined as a whole in PHENIX, yielding a predictably lower CC<sub>mask</sub> of 33%. This pentameric assembly model has now also been included in the BioStudies entry.

      The procedure for this rigid body fitting and refinement has been clarified and added to the 'Materials and Methods' section as follows:

      “Then, the corresponding full 3D models were predicted using AlphaFold3 and fitted into the Melbournevirus and Jyvaskylavirus cryoEM density using the fit-into-map routine in ChimeraX together with the peripentonal capsomers (Meng et al 2023). To assess the metric of this fitting (Supplementary Figure 7), the 3.5 Å five-fold Melbournevirus block 3D density (EMDB-37190) was boxed around the pentameric assembly model and refined as a whole using rigid-body refinement in PHENIX, yielding a CC<sub>mask</sub> of 57.3%. The same pentameric model was subsequently fitted into the 6.3 Å Jyvaskylavirus 3D cryo-EM density (previously boxed around the model), resulting in a lower CC<sub>mask</sub> of 33%, consistent with the limited resolution of the capsid map and below regions.”
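
      For readers unfamiliar with mask-restricted map-model correlation, the general idea is a Pearson correlation between experimental and model-derived density, computed only over voxels inside a mask around the model. The following is a minimal sketch of that idea on flat lists of voxel values. The function name and the toy numbers are our own assumptions, and this is not PHENIX's implementation.

```python
import math


def masked_correlation(map_values, model_values, mask):
    """Pearson correlation between two density arrays over voxels where mask is True."""
    a = [v for v, keep in zip(map_values, mask) if keep]
    b = [v for v, keep in zip(model_values, mask) if keep]
    n = len(a)
    if n < 2:
        raise ValueError("need at least two voxels inside the mask")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("density is constant inside the mask")
    return cov / math.sqrt(var_a * var_b)


# Toy example: the last voxel lies outside the mask and is ignored.
experimental = [0.1, 0.9, 0.8, 0.2, 5.0]
from_model = [0.0, 1.0, 1.0, 0.0, 0.0]
inside_mask = [True, True, True, True, False]
cc = masked_correlation(experimental, from_model, inside_mask)
print(f"masked correlation: {cc:.3f}")
```

      Because the score depends on how much map detail is available to match, a lower value for the same model in a lower-resolution map is expected, as the authors note.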

      (3) Figure 2A: Could ORFs that encode structural proteins discussed in the paper, be somehow highlighted?

      We have updated Figure 2A to include this information.

      (4) Figure 2C: Could the members for which structural characterisation was conducted be highlighted somehow (e.g. by a symbol next to the name)?

      We have updated Figure 2C to include this information.

      (5) Figure 5A: Could the central bid be shown in a lower threshold (you can retain the threshold for the protein shell)? It would be interesting to see some details of the interior, rather than a massive blob.

      We have decreased the threshold level of the map as suggested.

      (6) Figure 6: the density corresponding to MCPs, minor capsid, and penton proteins respectively could be colour-zoned in Chimera(X). This would better visualise where each entity lies.

      About ORF142 - what other virus protein possesses this fold? Is it similar to the penton protein in other PRD1/Adenoviridae viruses? Maybe some comparison could be presented?

      We have incorporated the feedback from reviewer #2 by modifying the corresponding panel A in Figure 6. We have colour-zoned the penton (ORF142), some of the density region corresponding to the MCPs (ORF184) and to the minor cap proteins (ORF121). We have kept in grey the density corresponding to other minor proteins; those we were able to identify are introduced later and shown as individual coloured cartoon tube models fitted into the density in panel A of Figure 7.

      Regarding ORF142, we have included a reference in the Discussion section to a new Supplementary Figure 9, where we provide a side-by-side comparison of the predicted Jyvaskylavirus penton protein model with experimentally derived penton protein models of PRD1 and HCIV-1. In light of this comparison, we have also added a brief clarification in the Discussion as follows:

      “However, in ORF142, the CHEF strands are predicted to be tilted relative to the BIDG strands, with an estimated angle of approximately 60° based on visual inspection (Supplementary Figure 9).”

      (7) Figure 7B: Could the density around the protein be zoned (rather than side view clipped), as this would better showcase how it fits the density?

      Initially, we presented a side view of the clipped surface to highlight the correspondence between the wall-shaped density, characteristic of a low-resolution beta-barrel, and the beta-barrel of the predicted model. Following the Reviewer’s suggestion, we have now surface-zoned the density and provided a stereo view of the density with the model fitted into the map using ChimeraX. While we recognize that stereo views are no longer commonly used in main text figures, we believe they remain valuable for visually assessing the overall match in low-resolution 3D density maps.

      (8) The authors did not try to reconstruct the asymmetric feature of the virion by classifying pentons, which may have identified a special vertex, one they claim might be required for genome packaging in "open particles". I understand the number of particles is low, but even low-resolution classification in C5 might be of interest in the field.

      We thank reviewer #2 for this valuable comment. The potential existence of a unique vertex in Marseilleviruses remains an open and intriguing question. Further investigations, including a significant increase in the number of particles, may help clarify this issue, and we plan to explore this topic in future structural studies.

      (9) Supplementary Figure 2: It would be interesting how the titre changes after the 12 hours, will it plateau? Could you add a bar showing the original titre to the chart showing stability after 109 days? I like the data in this figure and think it should be transferred to the main text.

      The titre at the 12h time point is very close to the titre we often get in our stocks, indicating that indeed it is close to peaking. For comparison: the titre of the 12-hour time point was 10<sup>11.55</sup> TCID50/ml, whereas our stock has a titre of 10<sup>11.66</sup> TCID50/ml. Our growth curve had more time points up to 48h and we lost the later time points due to a higher viral load than predicted, which led to us not being able to count these time points with the dilutions used. Showing the first 12 hours was enough for our initial purpose, which was to show a quick replication cycle for Jyvaskylavirus, in accordance with the other marseilleviruses in which the timing of the replication cycle was observed (see the answer for point 10 below).
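
      Because both titres are reported as powers of ten, the comparison is easiest to read as a fold difference. A short arithmetic sketch (the helper name `fold_difference` is ours):

```python
def fold_difference(log10_a, log10_b):
    """Fold difference between two titres given as log10(TCID50/ml)."""
    return 10 ** (log10_a - log10_b)


stock_log10 = 11.66        # stock titre quoted in the response
twelve_hour_log10 = 11.55  # 12-hour time point quoted in the response
ratio = fold_difference(stock_log10, twelve_hour_log10)
print(f"stock / 12 h titre: {ratio:.2f}-fold")
```

      The two values differ by about 1.3-fold, which is small on the logarithmic scale normally used for titration data.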

      We have added a bar representing the original titre of the stock used for the stability experiment as suggested.

      While preparing the draft, we were divided on whether to place the growth and stability figure in the main text or in the supplementary material. Our decision was to move these data to the supplementary material and keep the focus of the main text on the discovery, genome analysis and structural data, as these are the main findings of our work. The specifics regarding stability, growth and other uncharacterized VLPs went to the supplementary material for those in the field who are interested in looking deeper. That being said, we will keep these data as supplementary material if you and the editor agree.

      (10) In the Discussion, the authors should focus on how our perception of giant viruses changes by this study - compare with other growth curves, stability assays, and structures of giant viruses, showcasing how prevalent those stabilising minor capsid proteins are, etc. My impression is that in the current form, it is just not clear if/how substantial these findings are and such a comparison and putting the results in a bigger picture would considerably increase the impact of the paper.

      Our comparisons with other marseilleviruses were based on genomic and structural characteristics, the two fronts on which we had data from the literature and databases to compare with. Unfortunately, there is little information on the stability and growth of other isolates that could be used for an in-depth comparison. For example, although marseilleviruses are known to have a fast replication cycle, this has been measured by DAPI staining of DNA inside infected cells to evaluate viral factory formation (Boyer et al 2009), or by time-series observations of viral cycle stages by electron microscopy (Fabre et al 2017), and not by viral titration as done here. We included a mention of these references in the results:

      “A fast replication cycle is a feature also shown for other marseilleviruses (Boyer et al 2009 ; Fabre et al 2017).”

      The literature also does not report virion stability for other isolates, making a comparison with Jyvaskylavirus impossible. A comparative study testing different isolates side by side is definitely of relevance and interest, but it would be difficult to complete in a short time because other isolates would first have to be obtained. We believe the results in this manuscript might set some parameters to be used for comparison with other marseilleviruses, by our groups and others, in the future.

      Regarding the prevalence of the minor capsid proteins, we have expanded and clarified the identification of ORFs in Melbournevirus in the ‘Results’ and ‘Discussion’ sections. The revised Supplementary Table 4 has been updated accordingly and referenced in the results to clarify that the identification of Melbournevirus ORFs was carried out in BLASTp by querying the Jyvaskylavirus minor protein sequences exclusively against the Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1). BLASTp was then performed against the full sequence database, and homologous sequences were primarily retrieved from other marseilleviruses. These results have been compiled in a new Supplementary Table 5.

      However, Supplementary Table 5 also shows that the hits for Melbournevirus are not ranked at the top, and in some cases, they do not appear among the top hits.

      The ‘Results’ section now contains the following text:

      “To this end, we identified the corresponding Jyvaskylavirus ORFs in Melbournevirus through sequence comparison with Melbournevirus isolate 1 (NCBI Reference Sequence: NC_025412.1) (Supplementary Table 4). However, when the identified Jyvaskylavirus ORF sequences were analyzed using BLASTp without restricting the search to the Melbournevirus reference, many hits were observed in other giant viruses, primarily marseillevirus. Remarkably, some of these hits scored higher than those for Melbournevirus, supporting the presence of homologous proteins in these viruses (Supplementary Table 5).”

      The ‘Discussion’ section now contains the following text:

      “Additionally, the observation that the identified Jyvaskylavirus minor capsid protein sequences are shared across other marseilleviruses supports their essential structural and stabilizing roles in these viruses.”

      At the same time, we have modified the ‘Materials and Methods’ section to include a reference to Supplementary Figure 5, where the use of ModelAngelo is mentioned. Additionally, a new Supplementary Figure 10 has been included to clarify how the residues built into the Melbournevirus density using ModelAngelo (without prior knowledge of any sequence) are subsequently matched with the Jyvaskylavirus sequences.

      (11) Based on the author's statement, Iker Arriaga did all the cryoEM experiments. It is strange to me they are not placed higher on the author's list.

      We thank you for this observation and agree with your comment. This manuscript has been in preparation for a few years, and the first draft had the author order defined before the structural data collection and analyses were completed. Iker's participation was indeed important and substantial from the first draft to the submitted version, and he definitely deserves a better author placement. We have modified the author order to accommodate this. Note that only the author order changed and that no author has been added or removed.

    1. eLife Assessment

      This manuscript presents solid evidence suggesting that the loss of ZNRF3 and RNF43, two E3 ubiquitin ligases, leads to dysregulation of EGFR signaling in cancer. The authors propose that EGFR is a direct substrate of ZNRF3/RNF43. While the authors provide immunoprecipitation data showing increased detection of ubiquitinated species, this evidence does not definitively establish that EGFR itself is ubiquitinated by RNF43/ZNRF3. The absence of direct evidence for EGFR ubiquitination is a major limitation, although the findings are useful as they may provide novel insights into the mechanisms underlying EGFR-driven cancers and open new therapeutic avenues.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors provide strong evidence that the cell surface E3 ubiquitin ligases RNF43 and ZNRF3, which are well known for their role in regulating cell surface levels of WNT receptors encoded by FZD genes, also target EGFR for degradation. This is a newly identified function for these ubiquitin ligases beyond their role in regulating WNT signaling. Loss of RNF43/ZNRF3 expression leads to elevated EGFR levels and signaling, suggesting a potential new axis to drive tumorigenesis, whereas overexpression of RNF43 or ZNRF3 decreases EGFR levels and signaling. Furthermore, RNF43 and ZNRF3 directly interact with EGFR through their extracellular domains.

      Strengths:

      The data showing that RNF43 and ZNRF3 interact with EGFR and regulate its levels and activity are thorough and convincing, and the conclusions are largely supported.

      Weaknesses:

      Prior work established a clear role for RNF43 and ZNRF3 in regulating cell surface levels of FZD, a class of WNT receptors. These new findings that these E3 ubiquitin ligases also target EGFR add a new layer of complexity, and it remains unclear to what extent WNT signaling versus EGFR signaling are impacted in cancer settings. The authors acknowledge this gap in our understanding, which will likely be the topic of follow-up studies.

      Comments on revisions:

      The authors addressed my main concerns in this revised version and in their rebuttal comments. I have no further critiques to add.

    3. Reviewer #2 (Public review):

      1st Public review:

      Using proteogenomic analysis of human cancer datasets, Yu et al. found that EGFR protein levels negatively correlate with ZNRF3/RNF43 expression across multiple cancers. Interestingly, they found that CRC harbouring the frequent RNF43 G659Vfs*41 mutation exhibits higher levels of EGFR when compared to RNF43 wild-type tumors. This is highly interesting since this mutation is generally not thought to influence Frizzled levels and Wnt-β-catenin pathway activity. Using CRISPR knockouts and overexpression experiments, the authors show that EGFR levels are modulated by ZNRF3/RNF43. Supporting these findings, modulation of ZNRF3/RNF43 activity using Rspondin also leads to increased EGFR levels. Mechanistically, the authors show that ZNRF3/RNF43 ubiquitinate EGFR and lead to its degradation. Finally, the authors present functional evidence that loss of ZNRF3/RNF43 unleashes EGFR-mediated cell growth in 2D culture and organoids and promotes tumor growth in vivo.

      Overall, the conclusions of the manuscript are well supported by the data presented, but some aspects of the mechanism presented need to be reinforced to fully support the claims made by the authors. Additionally, the title of the paper suggests that ZNRF3 and RNF43 loss leads to hyperactivity of EGFR and that its signalling activity contributes to cancer initiation/progression. I don't think the authors convincingly showed this in their study.

      Major points:

      (1) EGFR ubiquitination. All of the experiments supporting that ZNRF3/RNF43 mediate EGFR ubiquitination are performed under overexpression conditions. A major caveat is also that none of the ubiquitination experiments are performed under denaturing conditions. Therefore, it is impossible to claim that the ubiquitin immunoreactivity observed on the western blots presented in Fig. 4 corresponds to ubiquitinated-EGFR species.

      Another issue is that in Figure 4A, the experiments suggest that the RNF43-dependent ubiquitination of EGFR is promoted by EGF. However, there is no control showing the ubiquitination of EGFR in the absence of EGF but under RNF43 overexpression. According to the other experiments presented in Figures 4B, 4C and 4F, there seems to be a constitutive ubiquitination of EGFR upon overexpression. How do the authors reconcile the role of ZNRF3/RNF43 vs c-cbl?

      (2) EGFR degradation vs internalization. In Figure 3C, the authors show experiments that demonstrate that RNF43 KO increases steady state levels of EGFR and prevents its EGF-dependent proteolysis. Using flow cytometry they then present evidence that the reduction in cell surface levels of EGFR mediated by EGF is inhibited in the absence of RNF43. The authors conclude that this is due to inhibition of EGF-induced internalization of surface EGFR. However, the experiments are not designed to study internalization and rather merely examine steady state levels of surface EGFR pre and post treatment. These changes are an integration of many things (retrograde and anterograde transport mechanisms presumably modulated by EGF). What process(es) is/are specifically affected by ZNRF3/RNF43? Are these processes differently regulated by c-cbl? If the authors are specifically interested in internalization/recycling, the use of cell surface biotinylation experiments and time courses are needed to examine the effect of EGF in the presence or absence of the E3 ligases.

      (3) RNF43 G659Vfs*41. The authors make a point in Figure 1D that this mutant leads to elevated EGFR in cancers but do not present evidence that this mutant is ineffective in mediating ubiquitination and degradation of EGFR. As this mutant maintains its ability to promote Frizzled ubiquitination and degradation, it would be important to show side by side that it does not affect EGFR. This would perhaps imply differential mechanisms for these two substrates.

      (4) "Unleashing EGFR activity". The title of the paper implies that ZNRF3/RNF43 loss leads to increased EGFR expression and hence increased activity that underlies cancer. However, I could find only one piece of direct evidence, showing that increased proliferation of the HT29 cell line mutant for RNF43 could be inhibited by the EGFR inhibitor Erlotinib. All the other evidence presented that I could find is correlative or indirect (e.g. RPPA showing increased phosphorylation of pathway members upon RNF43 KO, increased proliferation of a cell line upon ZNRF3/RNF43 KO, decreased proliferation of a cell line upon ZNRF3/RNF43 OE in vitro or in xeno...). Importantly, the authors claim that cancer initiation/progression in ZNRF3/RNF43 mutants may in some contexts be independent of their regulation of Wnt-β-catenin signaling and rely on EGFR activity upregulation. However, this has not been tested directly. Could the authors leverage their znrf3/RNF43 prostate cancer model to test whether EGFR inhibition could lead to reduced cancer burden whereas a Frizzled or Wnt inhibitor does not?

      More broadly, if EGFR signaling were to be unleashed in cancer, then one prediction would be that these cells would be more sensitive to EGFR pathway inhibition. Could the authors provide evidence that this is the case? Perhaps using isogenic cell lines or a panel of patient derived organoids (with known genotypes).

      Comments on revisions:

      The most important criticism of this manuscript that I raised in my original review has not been addressed. Indeed, the authors claim that EGFR is a direct substrate of the RNF43/ZNRF3 E3 ligase. This has not been directly demonstrated. Indeed, showing increased detection of ubiquitinated species in an immunoprecipitate could mean that a protein is directly modified. However, an alternative explanation is that a protein that is co-immunoprecipitated with the target protein is ubiquitinated (such as several EGFR adapters and interacting partners). Performing these experiments under denaturing conditions is one way to determine that EGFR is the substrate. Alternatively, a quantitative MS approach to quantify an increase in ubiquitinated peptides would also enable the authors to conclude that EGFR is indeed a substrate.

      In addition, one of the main conclusions of the authors is that EGFR activity is unleashed in cancer following ZNRF3 and/or RNF43 loss (as the title suggests). There is still no direct evidence in the manuscript that this is the case. I appreciate the new data showing that MEFs with knockout of RNF43/ZNRF3 are sensitive to EGFR inhibitor (and not porcupine inhibitor), but what are the data supporting that EGFR activity is "unleashed" in cancer? The authors still claim that ZNRF3 and RNF43 loss could impact cancer initiation/development in a Wnt-independent fashion (see lines 341-343). I believe this conclusion is based on correlative staining of nuclear β-catenin (which is in itself not a reliable readout of active signaling) and not on functional data.... I suggested in my original review that the authors should test the efficacy of EGFR inhibitor and Wnt inhibitor in the prostate cancer model that they present in Figure 7, which would have enabled them to firmly conclude about their relative contribution. This was largely handwaved in their rebuttal letter... Doing the experiment in WT cells is not the same as addressing this question in the context of cancer.

      Finally, the authors use CRISPR KO experiments without assessing editing or KO efficiencies throughout the manuscript and simply assume that the gRNAs work. In my opinion this is an unacceptable practice.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors provide strong evidence that the cell surface E3 ubiquitin ligases RNF43 and ZNRF3, which are well known for their role in regulating cell surface levels of WNT receptors encoded by FZD genes, also target EGFR for degradation. This is a newly identified function for these ubiquitin ligases beyond their role in regulating WNT signaling. Loss of RNF43/ZNRF3 expression leads to elevated EGFR levels and signaling, suggesting a potential new axis to drive tumorigenesis, whereas overexpression of RNF43 or ZNRF3 decreases EGFR levels and signaling. Furthermore, RNF43 and ZNRF3 directly interact with EGFR through their extracellular domains.

      Strengths:

      The data showing that RNF43 and ZNRF3 interact with EGFR and regulate its levels and activity are thorough and convincing, and the conclusions are largely supported.

      Weaknesses:

      While the data support that EGFR is a target for RNF43/ZNRF3, some of the authors' interpretations of the data on EGFR's role relative to WNT's roles downstream of RNF43/ZNRF3 are overstated. The authors, perhaps not intentionally, promote the effect of RNF43/ZNRF3 on EGFR while minimizing their role in WNT signaling. This is the case in most of the biological assays (cell and organoid growth and mouse tumor models). For example, the conclusion of "no substantial activation of Wnt signaling" (page 14) in the prostate cancer model is currently not supported by the data and requires further examination. In fact, examination of the data presented here indicates effects on WNT/β-catenin signaling, consistent with previous studies.

      Cancers in which RNF43 or ZNRF3 are deleted are often considered to be "WNT addicted", and inhibition of WNT signaling generally potently inhibits tumor growth. In particular, treatment of WNT-addicted tumors with Porcupine inhibitors leads to tumor regression. The authors should test to what extent PORCN inhibition affects tumor (and APC-min intestinal organoid) growth. If the biological effects of RNF43/ZNRF3 loss are mediated primarily or predominantly through EGFR, then PORCN inhibition should not affect tumor or organoid growth.

      We thank the reviewer for appreciating the key strength of our study. We fully agree with the reviewer that RNF43/ZNRF3 play key roles in restraining WNT signaling and that their deletions activate WNT signaling, which leads to cancer promotion, as discussed and cited in our manuscript (Hao et al, 2012; Koo et al, 2012). We have revised the language in this manuscript to avoid any confusion or appearance of downplaying this known signaling pathway in cancer progression.

      What we would like to highlight in this work is that our study uncovered an effect of RNF43/ZNRF3 on EGFR, leading to biological impact in multiple model systems. In particular, we included the APC-mutated human cancer cell line HT29 and Apc min mouse intestinal tumor organoids. In the context of APC mutations, β-catenin stabilization and the activation of WNT target genes are essentially decoupled from upstream WNT ligand binding to WNT receptors; thus we could primarily focus on the effect of RNF43/ZNRF3 on EGFR. Our statement of “no substantial activation of WNT signaling”, as cited by the reviewer, was made in describing the data in Fig. 7E, where we did not observe β-catenin accumulation in the nucleus and therefore inferred no substantial activation of canonical WNT signaling. We agree that further examination would help strengthen the conclusion and appreciate the reviewer’s suggestion of PORCN inhibition experiments. While PORCN inhibition is a valuable experiment in models with an abundance of WNT ligands/receptors and non-mutationally activated regulators of WNT signaling (Yu et al, 2020), in biological scenarios with existing APC mutations, another group has previously demonstrated that PORCN inhibition had no observable effect on WNT signaling in APC-deficient cells (PMID: 29533772). In our initial submission, we confirmed this predicted low response to manipulation of WNT signaling components upstream of a mutated APC. We showed that addition of RSPO1 in Apc min mouse intestinal tumor organoids failed to further activate WNT target expression (Fig. 6G). Furthermore, in this revised manuscript, we added new data on EGFR inhibition and PORCN inhibition in WT and Znrf3 KO MEFs (Fig. 6L). PORCN inhibition had no impact on cell growth in either WT or Znrf3 KO MEFs, suggesting that the growth-promoting effect of Znrf3 KO in MEFs is WNT-independent. In contrast, inhibition of EGFR downstream signaling components (Fig. 6L) significantly blocked MEF growth and abolished the impact of Znrf3 KO on MEF growth. This new evidence further supports our main conclusion that RNF43/ZNRF3 controls EGFR signaling to regulate cell growth.

      Reviewer #2 (Public Review):

      Using proteogenomic analysis of human cancer datasets, Yu et al found that EGFR protein levels negatively correlate with ZNRF3/RNF43 expression across multiple cancers. Interestingly, they found that CRC harbouring the frequent RNF43 G659Vfs*41 mutation exhibits higher levels of EGFR when compared to RNF43 wild-type tumors. This is highly interesting since this mutation is generally not thought to influence Frizzled levels and Wnt-bcatenin pathway activity. Using CRISPR knockouts and overexpression experiments, the authors show that EGFR levels are modulated by ZNRF3/RNF43. Supporting these findings, modulation of ZNRF3/RNF43 activity using Rspondin also leads to increased EGFR levels. Mechanistically, the authors show that ZNRF3/RNF43 ubiquitinate EGFR, leading to its degradation. Finally, the authors present functional evidence that loss of ZNRF3/RNF43 unleashes EGFR-mediated cell growth in 2D culture and organoids and promotes tumor growth in vivo.

      Overall, the conclusions of the manuscript are well supported by the data presented, but some aspects of the mechanism presented need to be reinforced to fully support the claims made by the authors. Additionally, the title of the paper suggests that ZNRF3 and RNF43 loss leads to the hyperactivity of EGFR and that its signalling activity contributes to cancer initiation/progression. I don't think the authors convincingly showed this in their study.

      We thank the reviewer for commenting that our “conclusions of the manuscript are well supported by the data presented.” We address the concerns raised by this reviewer point by point below:

      Major points:

      (1) EGFR ubiquitination. All of the experiments supporting that ZNRF3/RNF43 mediates EGFR ubiquitination are performed under overexpression conditions. A major caveat is also that none of the ubiquitination experiments are performed under denaturing conditions. Therefore, it is impossible to claim that the ubiquitin immunoreactivity observed on the western blots presented in Figure 4 corresponds to ubiquitinated-EGFR species. Another issue is that in Figure 4A, the experiments suggest that the RNF43-dependent ubiquitination of EGFR is promoted by EGF. However, there is no control showing the ubiquitination of EGFR in the absence of EGF but under RNF43 overexpression. According to the other experiments presented in Figures 4B, 4C, and 4F, there seems to be a constitutive ubiquitination of EGFR upon overexpression. How do the authors reconcile the role of ZNRF3/RNF43 vs c-cbl?

      We agree with the reviewer about the limitations of overexpression experiments. In this manuscript, we leveraged both overexpression and knockout systems to demonstrate that ZNRF3/RNF43 regulates EGFR ubiquitination: in Fig 4A, we showed that overexpression of RNF43 increased EGFR ubiquitination; in Fig 4B&C and Fig S3A, we showed that RNF43 knockout decreased EGFR ubiquitination; in Fig 4F, we showed that overexpression of ZNRF3 WT increased EGFR ubiquitination, whereas overexpression of the ZNRF3 RING domain deletion mutant failed to do so.

      We also appreciate the rigor with which the reviewer has approached our methodology. We acknowledge that denaturing conditions can provide additional validation, but the technical challenges associated with denaturing conditions include the potential disruption of epitope structures recognized by these antibodies. Our methodology was chosen to balance the need for accurate detection with the preservation of protein structure and function, which are crucial for understanding the biological implications of EGFR ubiquitination. Moreover, our immunoprecipitation and subsequent Western blotting were stringent with high SDS and 2-ME, optimized to minimize non-specific binding and enhance the specificity of detection. We believe that the data presented are robust and contribute significantly to the existing body of knowledge on EGFR ubiquitination.

      CBL is a well-known E3 ligase of EGFR, and it induces EGFR ubiquitination upon EGF ligand stimulation. Therefore, to allow a fair comparison of RNF43 and CBL on EGFR ubiquitination, we designed Fig 4A and related experiments in the setting of EGF stimulation. We observed that RNF43 overexpression increased EGFR ubiquitination as potently as CBL did. Following this result, we further demonstrated that knockout of RNF43 decreased the endogenous ubiquitinated EGFR level in the unstimulated/basal condition (Fig 4B) as well as in the EGF-stimulated condition (Fig 4C). We acknowledge the importance of, and interest in, fully understanding how ZNRF3/RNF43 interplay with CBL in regulating EGFR ubiquitination. This line of investigation holds the potential to uncover novel regulatory mechanisms in detail. However, the primary focus of the current study was to establish a foundational understanding of the role of ZNRF3/RNF43 in regulating EGFR ubiquitination. We look forward to exploring this further in future work.

      (2) EGFR degradation vs internalization. In Figure 3C, the authors show experiments that demonstrate that RNF43 KO increases steady-state levels of EGFR and prevents its EGF-dependent proteolysis. Using flow cytometry they then present evidence that the reduction in cell surface levels of EGFR mediated by EGF is inhibited in the absence of RNF43. The authors conclude that this is due to inhibition of EGF-induced internalization of surface EGFR. However, the experiments are not designed to study internalization and rather merely examine steady-state levels of surface EGFR pre and post-treatment. These changes are an integration of many things (retrograde and anterograde transport mechanisms presumably modulated by EGF). What process(es) is/are specifically affected by ZNRF3/RNF43? Are these processes differently regulated by c-cbl? If the authors are specifically interested in internalization/recycling, the use of cell surface biotinylation experiments and time courses are needed to examine the effect of EGF in the presence or absence of the E3 ligases.

      We agree that our study design primarily assesses EGFR levels on the cell surface before and after EGF treatment and does not comprehensively measure the whole internalization process. In response to the reviewer’s comments, we have revised the relevant sections of the manuscript to clarify that our current findings are focused on changes in cell surface EGFR and do not extend to the detailed mechanisms of EGF-induced internalization or recycling.

      (3) RNF43 G659Vfs*41. The authors make a point in Figure 1D that this mutant leads to elevated EGFR in cancers but do not present evidence that this mutant is ineffective in mediating ubiquitination and degradation of EGFR. As this mutant maintains its ability to promote Frizzled ubiquitination and degradation, it would be important to show side by side that it does not affect EGFR. This would perhaps imply differential mechanisms for these two substrates.

      Fig 1D is based on bioinformatic analysis of colon cancer patient samples, showing that RNF43 G659Vfs*41 mutant tumors exhibited significantly higher levels of EGFR protein compared to RNF43 WT tumors. Following this lead, we investigated whether the RNF43 G659Vfs*41 hotspot mutation abolishes the ability of RNF43 to downregulate EGFR. To this end, we transfected the same amount of control vector, RNF43 WT, RING deletion mutant, or G659Vfs*41 mutant DNA into 293T cells and measured the level of (co-transfected) EGFR. As shown in Author response image 1, overexpression of RNF43 WT decreased the EGFR level, while overexpression of the RING deletion mutant had no impact on the EGFR level as compared with the Vector group, which is consistent with our findings in the manuscript. Cells transfected with the RNF43 G659Vfs*41 mutant exhibited nearly normal levels of EGFR; however, we also observed that RNF43 G659Vfs*41 was expressed at a lower level than WT, even though the same amounts of DNA were transfected. Therefore, the insubstantial impact on EGFR levels could be attributed to either functional loss or compromised stability of RNF43 G659Vfs*41 mRNA or protein. Further investigation of RNF43 G659Vfs*41 mRNA and protein stability vs. RNF43 G659Vfs*41 protein function is needed to draw a solid conclusion.

      Author response image 1.

      (4) "Unleashing EGFR activity". The title of the paper implies that ZNRF3/RNF43 loss leads to increased EGFR expression and hence increased activity that underlies cancer. However, I could find only one piece of direct evidence, showing that increased proliferation of the HT29 cell line mutant for RNF43 could be inhibited by the EGFR inhibitor Erlotinib. All the other evidence presented that I could find is correlative or indirect (e.g. RPPA showing increased phosphorylation of pathway members upon RNF43 KO, increased proliferation of a cell line upon ZNRF3/RNF43 KO, decreased proliferation of a cell line upon ZNRF3/RNF43 OE in vitro or in xeno...). Importantly, the authors claim that cancer initiation/progression in ZNRF3/RNF43 mutants may in some contexts be independent of their regulation of Wnt-bcatenin signaling and rely on EGFR activity upregulation. However, this has not been tested directly. Could the authors leverage their ZNRF3/RNF43 prostate cancer model to test whether EGFR inhibition could lead to reduced cancer burden whereas a Frizzled or Wnt inhibitor does not?

      More broadly, if EGFR signaling were to be unleashed in cancer, then one prediction would be that these cells would be more sensitive to EGFR pathway inhibition. Could the authors provide evidence that this is the case? Perhaps using isogenic cell lines or a panel of patient-derived organoids (with known genotypes).

      We appreciate the reviewer’s suggestion to provide more direct evidence demonstrating the importance of the ZNRF3/RNF43-EGFR axis in cancer cell proliferation. In this revised manuscript, we further studied this issue in WT vs. Znrf3 KO MEF cells. We observed that treatment with the EGFR inhibitor erlotinib did not affect WT MEFs but stunted the growth advantage of Znrf3 KO MEF cells (Fig. 6L). On the other hand, treatment with the porcupine inhibitor C59 did not impact either WT or Znrf3 KO MEF cells (Fig. 6L), suggesting a more important role of the ZNRF3/RNF43-EGFR axis in mediating the enhanced growth of MEFs caused by Znrf3 knockout. Furthermore, considering that EGFR is often mutated in human cancer, to increase the clinical relevance of our study, we also tested the effect of RNF43 knockout on EGFR L858R (Fig. 2D), a common oncogenic EGFR mutant, and found that RNF43 knockout in HT29 boosted levels of this EGFR mutant detected by its FLAG tag, suggesting that RNF43 degrades both WT and mutated EGFR and that its loss can enhance signaling of both WT EGFR and its oncogenic mutant. However, we emphasize again that this manuscript is in no way written to diminish the proven importance of the ZNRF3/RNF43-WNT-β-catenin axis in cancer and development.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The main conclusion that EGFR is targeted for degradation by RNF43 and ZNRF3 is well supported and documented. Figures 1-5 and associated supplemental figures contain largely convincing data. Figures 6 and 7, however, require some modifications, as follows in order of appearance:

      Figure 6C: Growth of intestinal tumor organoids from Apcmin mice does not require Rspo, however, the authors show that these organoids grow larger in the presence of Rspo, an effect they attribute to increased EGFR activity, rather than increased WNT activity. While this conclusion may be correct, the authors should address this possibility by treating the organoids with PORCN inhibitor. The prediction would be that Rspo treatment still increases organoid size in the presence of PORCN inhibition. A further prediction would be that blocking EGFR (e.g. with Cetuximab) will abrogate the RSPO1 effect.

      Yes, we attributed the impact of Rspo on Apc min organoid growth to enhanced EGFR activity because we observed increased EGFR levels (Fig 6F) but no detectable increase in the eight WNT target genes assayed. We agree that additional pharmacologic experiments would strengthen our conclusion, but our few attempts at treating organoids encountered technical difficulties. Hence, we switched to testing PORCN inhibition vs EGFR inhibition in WT and Znrf3 KO MEFs. As shown in the revised Fig. 6L, EGFR inhibition significantly reversed the growth advantage caused by Znrf3 KO, but the PORCN inhibitor C59 did not.

      Figure 6G: It is unclear why the authors provide "8-day RSPO1 treatment" data. Here, EGFR mRNA appears to be elevated 2-fold (perhaps not statistically significant), and the Wnt targets Lef1 and Axin2 are decreased, as indicated by the statistical significance. What point is being made here?

      Our observations of increased size of Apc min mouse intestinal tumor organoids and increased EGFR protein levels were made at 8 days of RSPO1 treatment. Therefore, we measured mRNA levels at the same time point, with the 2-day time point also included for comparison. The goal of this qPCR experiment was to detect the contribution of WNT signaling, and we did not detect an increased transcriptional readout. We included EGFR mRNA levels for comparison, and we did not detect a statistically significant increase, consistent with our experiments concluding that ZNRF3/RNF43 regulate EGFR at the protein level. As stated in the preceding response, these data led us to attribute the impact of Rspo on Apc min organoid growth to enhanced EGFR activity.

      Figure 7A: This requires quantitation. How many mice were used per cell line? The data shown is not particularly convincing, with ZNRF3 overexpressing HT29 cells growing detectably. Showing representative mice is fine, but this should be supplemented with quantitation of all mice.

      We had provided this data. The BLI signal quantification was shown below the representative BLI images. Seven mice were used per cell line, as annotated at the top of the graph.

      Figure 7B: The authors assert that "canonical WNT signaling, based on levels of active-β-Catenin (non-phosphorylated at Ser33/37/Thr41; Figure 7B), remained unaffected". As shown, 2 of the 3 Myc-Znrf3 tumors have increased active-b-catenin signal over the GFP tumors. This indicates to me that canonical Wnt signaling was affected. The authors either need to present quantitative data that supports this claim or modify their conclusions. As presented, I don't think it is appropriate to decouple the effect of Znrf3 overexpression on EGFR from its effect on WNT.

      As requested, we have quantified the level of non-phospho β-Catenin at Ser33/37/Thr41 and found no significant differences (p > 0.05) between the control group vs. ZNRF3 overexpression group. We once again note that our manuscript was not meant to dispute the proven signaling and biological significance of WNT signaling regulation by ZNRF3/RNF43, and we have proof-read the manuscript multiple times to ensure that we did not make any generalized or misleading statements in this aspect.

      Author response image 2.

      Figure 7E: Here the authors assert that "no substantial activation of canonical Wnt signaling" in the Z&R KO tumors, however, the figure shows a substantial increase in active b-catenin staining. The current resolution is insufficient to claim that there is no increase in nuclear b-catenin. The authors' claim that WNT signaling is not involved here is not supported by the data presented here. One way to demonstrate that this effect is through EGFR activation and not through WNT activation is to treat mice with PORCN inhibitor. WNT-addicted tumors, such as by Rnf43 or Znrf3 deletion, regress upon PORCN inhibition. In this case, if the effect of Z&R KO is mediated through EGFR rather than WNT, then there should be no effect on tumor growth upon PORCN inhibition. This is a critical experiment in order to make this point.

      We appreciate the reviewer’s comments and suggested experiments. We based our initial statement on insubstantial nuclear β-catenin staining, but we agree that immunohistochemical staining lacks the resolution suitable for quantification. We could not generate an adequate number of KO animals for these in vivo experiments in the window of time planned for this revision. Instead, as shown in the newly added Fig. 6L, we tested EGFR inhibition and PORCN inhibition in Znrf3 KO MEFs and obtained strong data further supporting a role for EGFR in mediating the promotion of MEF growth by Znrf3 KO. Nevertheless, we have carefully revised our description of the in vivo data in Fig 7E to avoid any confusion or over-interpretation.

      Minor points:

      Figure 2A: provide quantitation of this immunoblot.

      We have revised the manuscript to show the quantification next to the immunoblot.

      Figure 2B: provide more detail in the figure legend and in the Materials and Methods section on how the KO MEFs were generated. Confirmation that Znrf3 (or in cases of Rnf43 KO) expression is lost in KO would be advisable.

      We have confirmed Znrf3 KO by genotyping and RNF43 KO by immunofluorescent staining. We have also tested multiple commercial anti-ZNRF3 antibodies and anti-RNF43 antibodies for Western blotting, but they all failed.

      Figure 4C is a little misleading. The schematic indicates that ECD-TM and TM-ICD truncations were analyzed for both ZNRF3 and RNF43. However, Figure 4 only shows data for ZNRF3, and the corresponding Figure S4 lacks data for the TM-ICD of Rnf43. A recommendation is to show only those schematics for which data is presented in that figure. On a related topic, the results using the deltaRING constructs (Figure S5) are not mentioned/described in the text.

      We think that the reviewer meant Fig 5C. We have revised Fig 5C by removing the RNF43 label, and we confirm that the Results section does include the data in Fig S5.

      Figure S4A: Only ZNRF3 is indicated in this figure. Please explain why RNF43 is not represented here. Also, indicate what is plotted along the x-axis.

      We only detected the endogenous ZNRF3-EGFR interaction, possibly because the RNF43 protein level is relatively low in the cell line we used for the mass spec experiment. The x-axis shows the proteins ordered by their y-axis values, as detailed in the figure legend: each data point was arranged along the x-axis based on the log2 fold change of iBAQ for EGFR-associated proteins identified in EGF-stimulated vs. control conditions, from low to high (left to right on the x-axis). We have added the label “Proteins detected by Mass-Spec” to the x-axis.
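      For readers unfamiliar with this kind of ranked plot, the ordering described above can be sketched as follows. This is a minimal illustration only: the protein names and iBAQ intensities are hypothetical placeholders, not data from the study, and the function name is ours.

```python
import math

# Hypothetical iBAQ intensities as (EGF-stimulated, control) pairs.
# These names and numbers are placeholders, NOT values from the study.
ibaq = {
    "PROTEIN_A": (8.0e6, 1.0e6),
    "PROTEIN_B": (2.0e6, 4.0e6),
    "PROTEIN_C": (3.0e6, 3.0e6),
}


def rank_by_log2_fold_change(intensities):
    """Return (protein, log2 fold change) pairs sorted from low to high."""
    fold_changes = {
        name: math.log2(stimulated / control)
        for name, (stimulated, control) in intensities.items()
    }
    return sorted(fold_changes.items(), key=lambda item: item[1])


ranked = rank_by_log2_fold_change(ibaq)

# The x position is simply the rank; the y value is the log2 fold change.
for x_position, (name, log2_fc) in enumerate(ranked, start=1):
    print(x_position, name, round(log2_fc, 2))
```

      In such a plot the x-axis carries no quantity of its own; it only encodes rank, so the proteins most enriched upon EGF stimulation appear at the far right.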

      Reviewer #2 (Recommendations For The Authors):

      Minor Points.

      (1) In Figure 2B, the authors claim that Znrf3 KO enhanced both EGFR and p-EGFR levels both in the absence and presence of EGF. Although it is clear in the presence of EGF, the increase in p-EGFR in the absence of EGF is less than clear.

      We have revised the manuscript to more clearly state the result in Fig 2B.

      (2) Importantly the authors validated their findings using three independent RNF43 gRNA (fig S2D) but they do not show the editing efficiency obtained with the gRNA.

      We did not include an RNF43 IB in this figure due to the lack of specific antibodies for detecting RNF43 by IB. We have no reason to doubt that knockout efficiency was adequate, since EGFR was increased compared to the control group. As a result, we did not perform deep sequencing to validate knockout efficiency.

      (3) In S2E, the authors show that KO of either ZNRF3 or RNF43 enhance HER2 levels. This suggests that there is no redundancy between these E3 ligases, at least in this context. How do the authors reconcile that?

      The reviewer raised an interesting issue. Due to the lack of WB antibodies for these two proteins, we could not easily assess the feedback impact of knocking out either gene on the protein level of the other. We speculate that there may be a threshold level of the sum of the two proteins that is needed for adequate degradation of HER2, leading to a HER2 increase when either gene is knocked out. A detailed study of this issue is beyond the scope of the current work.

      (4) Experiments in Fig 3C are performed in only one clone. The authors need to repeat them in an additional clone or rescue this phenotype using an RNF43 cDNA.

      Our RNF43 KO HT29 line is a pool of KO cells, not a single clone.

      (5) In Figure 7E, the authors suggest that the absence of nuclear bcatenin means that canonical Wnt signaling is unaffected. It is widely known that nuclear bcatenin is often not correlating with pathway activity.

      As stated above, we have revised the manuscript to avoid confusion and misinterpretation.

      (6) What is the nature of the error bars in Fig 3c? Are the differences statistically significant?

      As mentioned in the figure legend, the error bars are SEM. The result is statistically significant, and p-value is noted in the graph.

      (7) In the Figure legends, it should be stated clearly how many biological replicates were performed for each experiment and single data points should be plotted where applicable (e.g. qPCR data). It would be helpful if the uncropped and unprocessed Western blot membranes and replicates that are not shown would be accessible to allow the reader a more comprehensive view of the acquired data, especially for blots that were quantified (e.g. Figure 2F, Figure 3C, there is clearly some defect on the blot).

      For WB representation, it would be helpful to include more size markers on the Western blots (especially on the IPs that show a ubiquitin smear) and in general to use a reference protein (GAPDH, Actin, Vinculin) that is closer in size to the protein being assessed.

      More details should be added in the Methods section to explain how protocols were performed in detail. For example, it should be explained how the viruses used for infecting cells were produced (which plasmids were transfected using which transfection reagent, how long was the virus collected for, etc). Then, it should be stated how long the cells were undergoing selection before being harvested. Because the expression of the viral constructs potentially has an effect on cell proliferation through EGFR, this information is quite relevant. This is just an example, there are details missing in nearly every section (Flow: washing protocols, gating protocols (Live/dead stain?), WB: RIPA lysis buffer composition? How much protein was loaded on blots? How was protein quantification done? IP: how were washes performed and how often repeated?)

      Missing: antibody dilutions for IF, IHC, and WB, plasmid backbones, sequences and availability, qPCR primer sequences from Origene.

      Incucyte experiments are not described.

      We have revised the relevant sections to include more details.

      (8) Line 141: revise text: 2x mRNA abundance in the same sentence.

      Line 162: define intermediate expression better.

      Line 197/198: revise text ('the predominant one'?).

      Line 218/219: revise text (Internalisation of surface EGFR?).

      Line 245: clarify in text that it is endogenous EGFR that is being pulled down.

      Line 264: typo: conserved instead of conservative.

      Line 324: revise text (What does 'unknown significance' mean).

      Line 396/397: revise text: 2x Co-IP in the same sentence.

      Figure 3 D/E: more details on the Method in the figure legend.

      We have revised them accordingly.

    1. eLife Assessment

      This manuscript presents a clever and powerful approach to examining differential roles of Nav1.2 and Nav1.6 channels in excitability of neocortical pyramidal neurons, by engineering mice in which a sulfonamide inhibitor of both channels has reduced affinity for one or the other channels. Overall, the results in the manuscript are compelling and give important information about differential roles of Nav1.6 and Nav1.2 channels. Activity-dependent inactivation of NaV1.6 was also found to attenuate seizure-like activity in cells, demonstrating the promise of activity-dependent NaV1.6-specific pharmacotherapy for epilepsy.

    2. Reviewer #1 (Public review):

      Summary:

      Prior research indicates that NaV1.2 and NaV1.6 have different compartmental distributions, expression timelines in development, and roles in neuron function. The lack of subtype-specific tools to control Nav1.2 and Nav1.6 activity however has hampered efforts to define the role of each channel in neuronal behavior. The authors attempt to address the problem of subtype specificity here by using aryl sulfonamides (ASCs) to stabilize channels in the inactivated state in combination with mice carrying a mutation that renders NaV1.2 and/or NaV1.6 genetically resistant to the drug. Using this innovative approach, the authors find that action potential initiation is controlled by NaV1.6 while both NaV1.2 and NaV1.6 are involved in backpropagation of the action potential to the soma, corroborating previous findings. Additionally, NaV1.2 inhibition paradoxically increases the firing rate, as has also been observed in genetic knockout models. Finally, the potential anticonvulsant properties of ASCs were tested. NaV1.6 inhibition but not NaV1.2 inhibition was found to decrease action potential firing in prefrontal cortex layer 5b pyramidal neurons in response to current injections designed to mimic inputs during seizure. This result is consistent with studies of loss-of-function Nav1.6 models and knockdown studies showing that these animals are resistant to certain seizure types. These results lend further support for the therapeutic promise of activity-dependent, NaV1.6-selective, inhibitors for epilepsy.

      Strengths:

      (1) The chemogenetic approaches used to achieve selective inhibition of NaV1.2 and NaV1.6 are innovative and help resolve long-standing questions regarding the role of Nav1.2 and Nav1.6 in neuronal electrogenesis.

      (2) The experimental design is overall rigorous, with appropriate controls included.

      (3) The assays to elucidate the effects of channel inactivation on typical and seizure-like activity were well selected.

      Weaknesses:

      (1) The potential impact of the YW->SR mutation in the voltage sensor does not appear to have been sufficiently assessed. The activation/inactivation curves in Figure 1E show differences in both activation and inactivation at physiologically relevant membrane voltages, which may be significant even though the V1/2 and slope factors are roughly similar.

      (2) Additional discussion of the fact that channels are only partially blocked by the ASC and that ASCs act in a use-dependent manner would improve the manuscript and help readers interpret these results.

      (3) NaV1.6 was described as being exclusively responsible for the change in action potential threshold, but when NaV1.6 alone was inactivated, the effect was significantly reduced from the condition in which both channels were inactivated (Figure 4E). Similarly, Figure 6C shows that blockade of both channels causes threshold depolarization prior to the seizure-like event, but selective inactivation of NaV1.6 does not. As NaV1.2 does not appear to be involved in action potential initiation and threshold change, what is the mechanism of this dissimilarity between the NaV1.6 inactivation and combined NaV1.6/NaV1.2 inactivation?

      (4) The idea that use-dependent VGSC-acting drugs may be effective antiseizure medications is well established. Additional discussion or at least acknowledgement of the existing, widely used, use-dependent VGSC drugs should be included (e.g. Carbamazepine, Lamotrigine, Phenytoin). Also, the idea that targeting NaV1.6 may be effective for seizures is established by studies using genetic models, knockdown, and partially selective pharmacology (e.g. NBI-921352). Additional discussion of how the results reported here are consistent with or differ from studies using these alternative approaches would improve the discussion.
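      To illustrate the concern in weakness (1), the sketch below shows how a modest shift in V1/2 can translate into a sizeable difference in channel availability near resting potentials. It assumes a standard single Boltzmann description of steady-state inactivation; the parameter values are hypothetical and are not taken from the manuscript, whose fitting procedure may differ.

```python
import math


def boltzmann_availability(v_mV, v_half_mV, slope_mV):
    """Fraction of channels available (not inactivated) at voltage v_mV,
    using a single Boltzmann function for steady-state inactivation."""
    return 1.0 / (1.0 + math.exp((v_mV - v_half_mV) / slope_mV))


# Hypothetical parameters for illustration only, NOT values from the manuscript.
wild_type = {"v_half_mV": -65.0, "slope_mV": 6.0}
variant = {"v_half_mV": -68.0, "slope_mV": 6.0}  # 3 mV hyperpolarizing shift

test_voltage_mV = -70.0  # an assumed, physiologically plausible membrane voltage
wt_available = boltzmann_availability(test_voltage_mV, **wild_type)
variant_available = boltzmann_availability(test_voltage_mV, **variant)
relative_change = (variant_available - wt_available) / wt_available

print(f"WT availability:      {wt_available:.3f}")
print(f"Variant availability: {variant_available:.3f}")
print(f"Relative change:      {relative_change:+.1%}")
```

      Because the curve is steepest around V1/2, parameters that look roughly similar in a summary table can still produce a meaningful change in available channels at voltages in this range, which is why a direct test in neurons would be informative.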

    3. Reviewer #2 (Public review):

      The authors used a clever and powerful approach to explore how Nav1.2 and Nav1.6 channels, which are both present in neocortical pyramidal neurons, differentially control firing properties of the neurons. Overall, the approach worked very well, and the results show very interesting differences when one or the other channel is partially inhibited. The experimental data are solid and are very nicely complemented by a computational model incorporating the different localization of the two types of sodium channels.

      In my opinion the presentation and interpretation of the results could be improved by a more thorough discussion of the fact that only incomplete inhibition of the channels can be achieved by the inhibitor under physiological recording conditions and I thought the paper could be easier to digest if the figures were re-organized. However, the key results are well-documented.

    4. Reviewer #3 (Public review):

      Summary:

      The authors used powerful and novel reagents to carefully assess the roles of the voltage gated sodium channel (NaV) isoforms in regulating the neural excitability of principal neurons of the cerebral cortex. Using this approach, they were able to confirm that two different isoforms, NaV1.2 and NaV1.6 have distinct roles in electrogenesis of neocortical pyramidal neurons.

      Strengths:

      Development of very powerful transgenic mice in which NaV1.2 and/or NaV1.6 were modified to be insensitive to ASCs, a particular class of NaV blocker. This allowed them to test for roles of the two isoforms in an acute setting, without concerns of genetic or functional compensation that might result from a NaV channel knockout.

      Careful biophysical analysis of ASC effects on different NaV isoforms.

      Extensive and rigorous analysis of electrogenesis - action potential production - under conditions of blockade of either NaV1.2 or NaV1.6 or both.

      Weaknesses:

      Some results are overstated in that the representative example records provided do not directly support the conclusions.

      Results from a computational model are provided to make predictions of outcomes, but the computational approach is highly underdeveloped.

    5. Author response:

      We thank the reviewers and editors for these careful and constructive comments. Based on these comments, we plan to perform new experiments and revise our analyses, as summarized below:

      (1) A more thorough analysis and experimental test of the effects of YW->SR variants on baseline AP excitability in neurons in the absence of any pharmacology.

      (2) More details on modeling of selective block of NaV1.2 and NaV1.6.

      (3) Revisions to text, figure contents, and figure order to better convey key points and better frame these findings in the context of current clinically available anti-seizure medications that interact with sodium channels.

    1. eLife Assessment

      This valuable study provides a novel framework for leveraging longitudinal field observations to examine the effects of aging on stone tool use behaviour in wild chimpanzees. However, the analysis and interpretation are currently incomplete and would benefit from a more robust consideration of additional sources of variance for the data (e.g., foraging ecology, nut and tool properties, etc.). Despite the low sample size of five individuals, this study is of broad interest to ethologists, primatologists, archaeologists, and psychologists.

    2. Reviewer #1 (Public review):

      Summary:

      Howard-Spink et al. investigated how older chimpanzees changed their behavior regarding stone tool use for nut-cracking over a period of 17 years, from late adulthood to old age. This behavior is cognitively demanding, and it is a good target for understanding aging in wild primates. They used several measures to follow the aging process of five individuals, from attendance at the nut-cracking outdoor laboratory site to tool-selection time and nut-cracking efficiency, to check whether older chimpanzees changed their behavior.

      Indeed, older chimpanzees reduced their visits to the outdoor lab, which was not observed in the younger adults. The authors discuss several reasons for that; the main ones being physiological changes, cognitive and physical constraints, and changes in social associations. Much of the discussion is hypothetical, but a good starting point, as there is not much information about senescence in wild chimpanzees.

      The efficiency for nut-cracking was variable, with some individuals taking a long time to crack nuts while others showed little variance. As this is not compared with the younger individuals and the sample is small (only five individuals), it is difficult to be sure if this is also partly a normal variance caused by other factors (ecology) or is only related to senescence.

      Strengths:

      (1) 17 years of longitudinal data in the same setting, following the same individuals.

      (2) Using stone tool use, a cognitively demanding behavior, to understand the aging process.

      Weaknesses:

      A lack of comparison of the stone tool use behavior with younger individuals in the same period, to check if the changes observed are only related to age or if it is an overall variance. The comparison with younger chimpanzees was only done for one of the variables (attendance).

    3. Reviewer #2 (Public review):

      Summary:

      Primates are a particularly important and oft-applied model for understanding the evolution of, e.g., life history and senescence in humans. Although there is a growing body of work on aging in primates, there are three components of primate senescence research that have been underutilized or understudied: (1) longitudinal datasets, (2) wild populations, and (3) (stone) tool-use behaviors. Therefore, the goal of this study was to (1) use a 17-year longitudinal dataset (2) of wild chimpanzees in the Bossou forest, (3) visiting a site for field experiments on nut-cracking. They sampled and analyzed data from five field seasons for five chimpanzees of old age. From this sample, Howard-Spink and colleagues noted a decline in tool-use and tool-use efficiency in some individuals, but not in others. The authors then conclude that there is a measurable effect of senescence on chimpanzee behavior, but that it varies individually. The study has major intellectual value as a building block for future research, but there are several major caveats.

      Strengths:

      With this study, Howard-Spink and colleagues make a foray into a neglected topic of research: the impact of the physiological and cognitive changes due to senescence on stone tool use in chimpanzees. Based on novelty alone, this is a valuable study. The authors cleverly make use of a longitudinal record covering 17 years of field data, which provides a window into long-term changes in the behavior of wild chimpanzees, which I agree cannot be understood through cross-sectional comparisons.

      The metrics of 'efficiency' (see caveats below) are suitable for measuring changes in technological behavior over time, as specifically tailored to the nut-cracking (e.g., time, number of actions, number of strikes, tool changes). The ethogram and the coding protocol are also suitable for studying the target questions and objectives. I would recommend, however, the inclusion of further variables that will assist in improving the amount of valid data that can be extrapolated (see also below).

      With this pilot, Howard-Spink and colleagues have established a foundation upon which future research can be designed, including further investigation with the Bossou dataset and other existing video archives, but especially future targeted data collection, which can be designed to overcome some of the limits and confounds that can be identified in the current study.

      Weaknesses:

      Although I agree with the reasoning behind conducting this research and understand that, as the authors state, there are logistical considerations that have to be made when planning and executing such a study, there are a number of methodological and theoretical shortcomings that either need to be more explicitly stated by the authors or would require additional data collection and analysis.

      One of the main limitations of this study is the small sample size. There are only 5 of the old-aged individuals, which is not enough to draw any inferences about aging for chimpanzees more generally. Howard-Spink and colleagues also study data from only five of the 17 years of recorded data at Bossou. The selection of this subset of data requires clarification: why were these intervals chosen, why this number of data points, and how do we know that it provides a representative picture of the age-related changes of the full 17 years?

      With measuring and interpreting the 'efficiency' of behaviors, there are in-built assumptions about the goals of the agents and how we can define efficiency. First, it may be that efficiency is not an intentional goal for nut-cracking at all, but rather, e.g., productivity as far as the number of uncrushed kernels (cf. Putt 2015). Second, what is 'efficient' for the human observer might not be efficient for the chimpanzee who is performing the behavior. More instances of tool-switching may be considered inefficient, but it might also be a valid strategy for extracting more from the nuts, etc. Understanding the goals of chimpanzees may be a difficult proposition, but these are uncertainties that must be kept in mind when interpreting and discussing 'decline' or any change in technological behaviors over time.

      For the study of the physiological impact of senescence of tool use (i.e., on strength and coordination), the study would benefit from the inclusion of variables like grip type and (approximate) stone size (Neufuss et al., 2016). The size and shape of stones for nut-cracking have been shown to influence the efficacy and 'efficiency' of tool use (i.e., the same metrics of 'efficiency' implemented by Howard-Spink et al. in the current study), meaning raw material properties are a potential confound that the authors have not evaluated.

      Similarly, inter- and intraspecific variation in the properties of nuts being processed is another confound (Falótico et al., 2022; Proffitt et al., 2022). If oil palm nuts were varying year-to-year, for example, this would theoretically have an effect on the behavioral forms and strategies employed by the chimpanzees, and thus, any metric of efficiency being collected and analyzed. Further, it is perplexing that the authors analyze only one year where the coula nuts were provided at the test site, but these were provided during multiple field seasons. It would be more useful to compare data from a similar number of field seasons with both species if we are to study age-related changes in nut processing over time (one season of coula nut-cracking certainly does not achieve this).

      Both individual personality (especially neophilia versus neophobia; e.g., Forss & Willems, 2022) and motivation factors (Tennie & Call, 2023) are further confounds that can contribute to a more valid interpretation of the patterns found. To draw any conclusions about age-related changes in diet and food preferences, we would need to have data on the overall food intake/preferences of the individuals and the food availability in the home range. The authors refer briefly to this limitation, but the implications for the interpretation of the data are not sufficiently underlined (e.g., for the relevance of age-related decline in stone tool-use ability for individual survival).

      Generally speaking, there is a lack of consideration for temporal variation in ecological factors. As a control for these, Howard-Spink and colleagues have examined behavioral data for younger individuals from Bossou in the same years, to ostensibly show that patterns in older adults are different from patterns in younger adults, which is fair given the available data. Nonetheless, they seem to focus mostly on the start and end points and not patterns that occur in between. For example, there is a curious drop in attendance rate for all individuals in the 2008 season, the implications of which are not discussed by the authors.

      As for attendance, Howard-Spink and colleagues also discuss how this might be explained by changes in social standing in later life (i.e., chimpanzees move to the fringes of the social network and become less likely to visit gathering sites). This is not senescence in the sense of physiological and cognitive decline with older age. Instead, the reduced attendance due to changes in social standing seems to exacerbate signs of aging rather than be an indicator of aging itself. The authors also mention a flu-like epidemic that caused the death of 5 individuals; the subsequent population decline and related changes in demography also warrant more discussion and characterization in the manuscript.

      Understandably, some of these issues cannot be evaluated or corrected with the presented dataset. Nonetheless, these undermine how certain and/or deterministic their conclusions can really be considered. Howard-Spink et al. have not strongly 'demonstrated' the validity of relationships between the variables of the study. If anything, their cursory observations provide us with methods to apply and hypotheses to test in future studies. It is likely that with higher-resolution datasets, the individual variability in age-related decline in tool-use abilities will be replicated. For now, this can be considered a starting point, which will hopefully inspire future attempts to research these questions.

      Falótico, T., Valença, T., Verderane, M. & Fogaça, M. D. Stone tools differences across three capuchin monkey populations: food's physical properties, ecology, and culture. Sci. Rep. 12, 14365 (2022).

      Forss, S. & Willems, E. The curious case of great ape curiosity and how it is shaped by sociality. Ethology 128, 552-563 (2022).

      Neufuss, J., Humle, T., Cremaschi, A. & Kivell, T. L. Nut-cracking behaviour in wild-born, rehabilitated bonobos (Pan paniscus): a comprehensive study of hand-preference, hand grips and efficiency. Am. J. Primatol. 79, e22589 (2016).

      Proffitt, T., Reeves, J. S., Pacome, S. S. & Luncz, L. V. Identifying functional and regional differences in chimpanzee stone tool technology. R. Soc. Open Sci. 9, 220826 (2022).

      Putt, S. S. The origins of stone tool reduction and the transition to knapping: An experimental approach. J. Archaeol. Sci.: Rep. 2, 51-60 (2015).

      Tennie, C. & Call, J. Unmotivated subjects cannot provide interpretable data and tasks with sensitive learning periods require appropriately aged subjects: A Commentary on Koops et al. (2022) "Field experiments find no evidence that chimpanzee nut cracking can be independently innovated". ABC 10, 89-94 (2023).

    4. Author response:

      We thank both reviewers for their comments on our manuscript. We are pleased that the value of this research has been communicated effectively, and that the reviewers agree that whilst our sample size of individuals is relatively small, it offers a unique perspective for understanding the effects of aging for wild chimpanzees’ technological behaviors. Whilst only yielding data on a few individuals, the Bossou archive is the only available data source with which we can currently address these questions over extended timescales, and is key for understanding longitudinal effects of aging for specific individuals. This is particularly true if we are to understand the life-long dynamics of chimpanzees’ technical skills during tasks which require the organization of multiple movable elements. Bossou is the only community where chimpanzees both perform nut cracking with moveable hammer and anvil stones, and have been systematically studied over a period of decades. Moreover, given the dwindling population at Bossou (N = 3 as of 2025), we must make every effort to understand these effects with existing data. We agree that this work will likely form a valuable foundation for future studies, which may aim to either replicate our results, or use our findings to design more specific research questions and approaches.

      In the next iteration of the manuscript, we will explain our choice of field seasons more clearly. However, this was a logistical tradeoff between the need to sample across a long lifespan using fine-granularity behavior coding and the time constraints for our project and the likely yield of data collection. We sampled from the middle of individuals’ prime age, up until the oldest recorded ages of individuals’ lifespans (17 years). Where possible we aimed to use consistent time intervals (approximately 4 years); however, this was not always possible, as in some years data was not collected by researchers at Bossou (for example, during years where there were Ebola outbreaks affecting the region). In such instances, we sampled the closest available year that offered sufficient data to meet our sampling requirements.

      Reviewer 2 raises the point that there may be a disconnect between how human observers and chimpanzees conceive of efficiency when nut cracking, and supports this idea with a citation to previous work on efficiency of Oldowan stone knapping. We agree that knowing precisely how chimpanzees perceive their own efficiency during tool use is not available through observation alone, nor can we assess the true extent to which chimpanzees are concerned about the efficiency of their nut-cracking. However, following previous studies, it is reasonable to assume that adult chimpanzees embody some level of efficiency, given that adults often select tools which aid efficient nut cracking (Braun et al. 2025, J. Hum. Evol.; Carvalho et al. 2008, J. Hum. Evol.; Sirianni et al. 2015, Animal Behav.); perform nut cracking using more streamlined combinations of actions than less experienced individuals (Howard-Spink et al. 2024, Peer J; Inoue-Nakamura & Matsuzawa 1997, J. Comp. Psychol.); and consequently end up cracking nuts using fewer hammer strikes, indicating a higher level of skill (Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.). Ultimately, these factors suggest that across adulthood, experienced chimpanzees perform nut cracking with a level of efficiency which exceeds that of novice individuals, including across the chaîne opératoire.

      To account for the multiple ways in which reduced efficiency may manifest later in life, we provide one of the most flexible measures of efficiency in wild chimpanzee tool use to date, which incorporates more classical measures of time and hammer strikes (see previous examples of Biro et al. 2003, Animal Cogn.; Boesch et al. 2019, Sci. Rep.) as well as additional variables which aim to characterize how streamlined behavioral sequences are (tool rotations, tool swaps, nut replacements, etc.; see Berdugo et al. 2024, Nat. Hum. Behav., for other analyses using similar metrics). In the case of swapping out tools, Reviewer 2 suggests that some of these tool swaps may in fact be to aid nut cracking, by maintaining kernel integrity (a key result relating to Yo’s coula nut cracking efficiency). This, however, seems unlikely, given that these behaviors were performed extremely rarely by chimpanzees in early field seasons, and were not performed more frequently by other individuals with aging. We will provide additional information behind our metrics for measuring efficiency, with reference to earlier work, and also will incorporate the points raised by Reviewer 2 concerning the limitations with which we can infer chimpanzees’ goals, and how efficiently they meet them.

      Reviewer 1 questioned why we did not sample efficiency data for younger individuals, and compare this data with older individuals to detect the effects of aging. Throughout our manuscript, we compared aging individuals’ nut-cracking efficiency with their efficiency in previous years (thus, at younger ages). This offered each individual a personalized benchmark of efficiency in early life, and allowed us to identify aging effects whilst controlling for long-term interindividual variation in skill levels. Indeed, previous analyses at Bossou find that across the majority of adulthood, efficiency varies between individuals, but is relatively stable within individuals (see Berdugo et al. 2024, Nat. Hum. Behav.). As focal aging chimpanzees cracked multiple nuts each field season (and each encounter), we had ample data to fit models that examine individuals’ efficiency over field seasons, using random slopes to model correlations for each individual. By taking this approach, our paper offers a novel perspective by being able to report the longitudinal effects of aging on tool-using efficiency, rather than averaged cross-sectional effects between young and old cohorts. As random slope models (and not just random intercept models) offered the best explanation for variation in aging individuals’ efficiency over our sample period, this implies that focal chimpanzees were experiencing individual-level changes in efficiency over time, thus giving us key evidence that interindividual variation in tool-using efficiency can be compounded by aging.

      We argue that the reductions in efficiency observed for some individuals (e.g. Yo & Velu) are unlikely to be due to environmental changes (e.g. nuts becoming harder in later field seasons), as if this was the case, these effects would be detected across the behaviors of all individuals (which was not observed). Additionally, in the specific case of the hardness of nuts, nuts used in our experiment were sourced from local communities, and were moderately aged. This avoided the use of young nuts which are harder to crack, or older nuts which are often worm-eaten or can be empty (Sakura & Matsuzawa, 1991; Ethology). We will update our manuscript with this information.

      Whilst other factors may introduce general variation into our efficiency data (such as different stones used on different encounters, or more general variation in nut hardness across encounters), very few of these factors predict directional long-term changes in efficiency. Rather, if these factors were driving the majority of variation in our data, we would expect them to lead to variation across visits during earlier field seasons (such as 1999-2008) and later field seasons (2011 onwards) equally, and in a way which does not necessarily correlate with age. This does not match the pattern we observed in our data, where for some individuals (e.g. Yo & Velu), efficiency in nut cracking reduced in later field seasons only, and was relatively consistent across field seasons prior to 2011. Moreover, for Yo – the individual who exhibited the greatest reductions in tool-using efficiency - efficiency continued to decrease across the three latest sampled field seasons. Thus, it is more likely Yo was experiencing deleterious effects of aging. We do however agree that additional data on these variables would help us to rule out confounding factors more rigorously – we will include recommendations for this data to be collected in future studies.

      When modelling the effect of aging on attendance at the outdoor laboratory, we could not use the same approach we used when modelling tool-using efficiency, as we could only acquire one datapoint (attendance rate) per individual for each field season. We therefore had to adapt our analysis, and introduce attendance rates for younger individuals as a baseline to compare against the attendance rates of older individuals across years. We observed a significant interaction effect, where across field seasons, attendance dropped significantly more rapidly for older individuals than younger ones. Reviewer 2 has asked why we do not consider inter-annual variability across this time period, and suggested that we ignored intervening years. This is not the case. When fitting models that examined the effects of aging on attendance, we used all data across all field seasons. We reported an approximate effect size for this significant correlation using a digestible comparison of the attendance rates in the initial and final field seasons sampled. We will ensure that this is clear in the next iteration of our manuscript.

      Reviewer 2 noted that many factors may have influenced the decision for chimpanzees to attend the outdoor laboratory in older field seasons, and the current data may not be used to make strong arguments for changes in attendance rates being due to dietary preferences. We agree that many factors may have influenced these attendance rates, and that is what we have aimed to transparently report within our discussion where we raise an extensive, non-exhaustive list of hypotheses for why we have observed this age-related change in our data. We will aim to ensure that this is exceptionally clear prior to resubmission, and where relevant, will further emphasize points raised by Reviewer 2. We consider some points raised by Reviewer 2 to be unlikely to apply for our study; for example, it is unlikely neophobia has influenced the behaviors of chimpanzees, as these chimpanzees habitually attended the outdoor laboratory of their own accord for over a decade prior to the earliest year we sampled in this study (reflecting extremely high levels of habituation to the experimental setup). Previous studies at Bossou have surveyed the ecology of stone tool use across the home range, and confirm that the outdoor laboratory is visited by chimpanzees during ranging as a food patch (Almeida-Warren et al. 2022, Int. J. Primatol.).

      Reviewer 2 suggested that it would be helpful to have additional data on variables such as hand grip, as this may reveal further information about how cognitive and physiological senescence influences reductions in tool-using efficiency. We agree that whilst further data on hand grips are not required to detect reductions in efficiency per se, it would be profitable for future analyses to collect similar data – we will add this as a recommendation to our discussion.

      Finally, Reviewer 2 commented that they found our discussion of coula-nut cracking disruptive to the flow of the manuscript, given that we could not compare with coula-nut cracking in earlier years. We reported the coula nut cracking of Yo in 2011 as it was part of our sampled data, and we felt that the comparison with other individuals in the same year was an interesting discussion point; however, we acknowledge this limitation. We will move all data and discussion of coula-nut cracking to the Supplementary Materials, which we will present as an interesting additional observation which may warrant further investigation using additional data from the Bossou archive. Data collection for this future project could include collecting data on the additional variables raised by both reviewers (e.g. hand grips).

      We thank both reviewers for their comments. We believe that their feedback will improve the quality of our reporting, and the validity of our interpretations.

    1. eLife Assessment

      The conclusions of this work are based on valuable simulations of a detailed model of striatal dopamine dynamics. Establishing that a lower dopamine uptake rate can lead to a 'tonic' level of dopamine in the ventral but not dorsal striatum, and that dopamine concentration changes at short delays can be tracked by D1 but not D2 receptor activation, is of value and will be of interest to dopamine aficionados. However, the simulations are incomplete, providing only partial support for the key claims. Several things can be done to strengthen the conclusions, including, for example, but not exclusively, a demonstration of how the results would change as a function of changes in D2 affinity.

    2. Reviewer #1 (Public review):

      Ejdrup, Gether, and colleagues present a sophisticated simulation of dopamine (DA) dynamics based on a substantial volume of striatum with many DA release sites. The key observation is that a reduced DA uptake rate in the ventral striatum (VS) compared to the dorsal striatum (DS) can produce an appreciable "tonic" level of DA in VS and not DS. In both areas they find that a large proportion of D2 receptors are occupied at "baseline"; this proportion increases with simulated DA cell phasic bursts but has little sensitivity to simulated DA cell pauses. They also examine, in a separate model, the effects of clustering dopamine transporters (DAT) into nanoclusters and say this may be a way of regulating tonic DA levels in VS. I found this work of interest and I think it will be useful to the community. At the same time, there are a number of weaknesses that should be addressed, and the authors need to more carefully explain how their conclusions are distinct from those based on prior models.

      (1) The conclusion that even an unrealistically long (1s) and complete pause in DA firing has little effect on DA receptor occupancy is potentially important. The ability to respond to DA pauses has been thought to be a key reason why D2 receptors (may) have high affinity. This simulation instead finds evidence that DA pauses may be useless. This result should be highlighted in the abstract and discussed more.

      (2) The claim of "DAT nanoclustering as a way to shape tonic levels of DA" is not very well supported at present. None of the panels in Figure 4 simply show mean steady-state extracellular DA as a function of clustering. Perhaps mean DA is not the relevant measure, but then the authors need to better define what is and why. This issue may be linked to the fact that DAT clustering is modeled separately (Figure 4) to the main model of DA dynamics (Figures 1-3) which per the Methods assumes even distribution of uptake. Presumably, this is because the spatial resolution of the main model is too coarse to incorporate DAT nanoclusters, but it is still a limitation. As it stands it is convincing (but too obvious) that DAT clustering will increase DA away from clusters, while decreasing it near clusters. I.e. clustering increases heterogeneity, but how this could be relevant to striatal function is not made clear, especially given the different spatial scales of the models.

      (3) I question how reasonable the "12/40" simulated burst firing condition is, since to my knowledge this is well outside the range of firing patterns actually observed for dopamine cells. It would be better to base key results on more realistic values (in particular, fewer action potentials than 12).

      (4) There is a need to better explain why "focality" is important, and justify the measure used.

      (5) Line 191: "D1 receptors (-Rs) were assumed to have a half maximal effective concentration (EC50) of 1000 nM"

      The assumptions about receptor EC50s are critical to this work and need to be better justified. It would also be good to show what happens if these EC50 numbers are changed by an order of magnitude up or down.

      (6) Line 459: "we based our receptor kinetics on newer pharmacological experiments in live cells (Agren et al., 2021) and properties of the recently developed DA receptor-based biosensors (Labouesse & Patriarchi, 2021). Indeed, these sensors are mutated receptors but only on the intracellular domains with no changes of the binding site (Labouesse & Patriarchi, 2021)"

      This argument is diminished by the observation that different sensors based on the same binding site have different affinities (e.g. in Patriarchi et al. 2018, dLight1.1 has Kd of 330nM while dLight1.3b has Kd of 1600nM).

      (7) Estimates of Vmax for DA uptake are entirely based on prior fast-scan voltammetry studies (Table S2). But FSCV likely produces distorted measures of uptake rate due to the kinetics of DA adsorption and release on the carbon fiber surface.

      (8) It is assumed that tortuosity is the same in DS and VS - is this a safe assumption?

      (9) More discussion is needed about how the conclusions derived from this more elaborate model of DA dynamics agree with, and differ from, conclusions drawn from prior relevant models (including those cited, e.g. from Hunger et al. 2020, etc).
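      To make the request in point (5) concrete, the following is a minimal sketch of the kind of sensitivity check meant there. It assumes a simple one-site equilibrium occupancy, occupancy = [DA] / ([DA] + EC50), which may not be the binding model the authors actually use. The 1000 nM value is the D1 EC50 quoted from Line 191; the function names and the DA concentrations are hypothetical and chosen only for illustration.

```python
def occupancy(da_nM, ec50_nM):
    """Equilibrium bound fraction for a one-site model (an assumption,
    not necessarily the receptor model used in the manuscript)."""
    return da_nM / (da_nM + ec50_nM)


def ec50_sensitivity(da_levels_nM, ec50_nM, factors=(0.1, 1.0, 10.0)):
    """Occupancy at each DA level when EC50 is scaled by each factor.

    Returns a dict mapping scale factor -> list of occupancies, one per
    DA level, so a tenfold shift up or down can be compared directly.
    """
    return {
        f: [occupancy(da, ec50_nM * f) for da in da_levels_nM]
        for f in factors
    }


if __name__ == "__main__":
    # Illustrative DA concentrations in nM (hypothetical, not from the paper).
    da_levels = [10.0, 100.0, 1000.0]
    # 1000 nM is the D1 EC50 quoted by the reviewer from Line 191.
    table = ec50_sensitivity(da_levels, ec50_nM=1000.0)
    for factor, occ in table.items():
        print(f"EC50 x {factor}: " + ", ".join(f"{o:.3f}" for o in occ))
```

      Under this assumed model, occupancy at a DA concentration equal to the EC50 is one half by construction, so a tenfold change in EC50 moves the same DA level to roughly 9% or 91% occupancy. This is why the choice of EC50 matters so much for the conclusions.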

    3. Reviewer #2 (Public review):

      The work presents a model of dopamine release, diffusion, and reuptake in a small (100 micrometer^2 maximum) volume of striatum. This extends previous work by this group and others by comparing dopamine dynamics in the dorsal and ventral striatum and by using a model of immediate dopamine-receptor activation inferred from recent dopamine sensor data. From their simulations, the authors report two main conclusions. The first is that the dorsal striatum does not appear to have a sustained, relatively uniform concentration of dopamine driven by the constant 4Hz firing of dopamine neurons; rather that constant firing appears to create hotspots of dopamine. By contrast, the lower density of release sites and lower rate of reuptake in the ventral striatum creates a sustained concentration of dopamine. The second main conclusion is that D1 receptor (D1R) activation is able to track dopamine concentration changes at short delays but D2 receptor activation cannot.

      The simulations of the dorsal striatum will be of interest to dopamine aficionados as they throw some doubt on the classic model of "tonic" and "phasic" dopamine actions, further show the disconnect between dopamine neuron firing and consequent release, and thus raise issues for the reward-prediction error theory of dopamine.

      There is some careful work here checking the dependence of results on the spatial volume and its discretisation. The simulations of dopamine concentration are checked over a range of values for key parameters. The model is good, the simulations are well done, and the evidence for robust differences between dorsal and ventral striatum dopamine concentration is good.

      However, the main weakness here is that neither of the main conclusions is strongly evidenced as yet. The claim that the dorsal striatum has no "tonic" dopamine concentration is based on the single example simulation of Figure 1 not the extensive simulations over a range of parameters. Some of those later simulations seem to show that the dorsal striatum can have a "tonic" dopamine concentration, though the measurement of this is indirect. It is not clear why the reader should believe the example simulation over those in the robustness checks, for example by identifying which range of parameter values is more realistic.

      The claim that D1Rs can track rapid changes in dopamine is not well supported. It is based on a single simulation in Figure 1 (DS) and 2 (VS) by visual inspection of simulated dopamine concentration traces - and even then it is unclear that D1Rs actually track dynamics because they clearly do not track rapid changes in dopamine that are almost as large as those driven by bursts (cf Figure 1i). The claim also depends on two things that are poorly explained. First, the model of binding here is missing from the text. It seems to be a simple bound-fraction model, simulating a single D1 or D2 receptor. It is unclear whether more complex models would show the same thing. Second, crucial to the receptor model here is the inference that D1 receptor unbinding is rapid; but this inference is made based on the kinetics of dopamine sensors and is superficially explained - it is unclear why sensor kinetics should let us extrapolate to receptor kinetics, and unclear how safe is the extrapolation of the linear regression by an order of magnitude to get the D1 unbinding rate.
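      The dependence on unbinding rate can be illustrated with a minimal sketch of a single-receptor bound-fraction model, dB/dt = k_on * [DA] * (1 - B) - k_off * B. This is my guess at the general form of model involved, not the authors' implementation, and every parameter value below (Kd, the two k_off values, the DA pulse) is hypothetical. With Kd held fixed, only a receptor with fast unbinding follows a brief DA transient.

```python
import math


def simulate_bound_fraction(da_trace_nM, dt_s, kd_nM, koff_per_s):
    """Bound fraction over time for one receptor population.

    Assumes first-order binding with k_on = k_off / Kd. Each step uses the
    exact solution for constant [DA] over dt, so it is stable for any dt.
    The receptor starts at equilibrium with the first DA value.
    """
    kon = koff_per_s / kd_nM  # units: 1 / (nM * s)
    b = da_trace_nM[0] / (da_trace_nM[0] + kd_nM)
    out = []
    for c in da_trace_nM:
        rate = kon * c + koff_per_s      # relaxation rate toward equilibrium
        b_eq = kon * c / rate            # equilibrium bound fraction at this [DA]
        b = b_eq + (b - b_eq) * math.exp(-rate * dt_s)
        out.append(b)
    return out


def pulse_trace(baseline_nM, peak_nM, dt_s, pre_s, pulse_s, post_s):
    """Square DA pulse: baseline, then peak, then baseline again."""
    n = lambda seconds: int(round(seconds / dt_s))
    return ([baseline_nM] * n(pre_s)
            + [peak_nM] * n(pulse_s)
            + [baseline_nM] * n(post_s))


if __name__ == "__main__":
    dt = 0.001
    # Hypothetical 200 ms pulse from 20 nM to 500 nM.
    trace = pulse_trace(20.0, 500.0, dt, pre_s=1.0, pulse_s=0.2, post_s=1.0)
    fast = simulate_bound_fraction(trace, dt, kd_nM=1000.0, koff_per_s=20.0)
    slow = simulate_bound_fraction(trace, dt, kd_nM=1000.0, koff_per_s=0.2)
    print(f"equilibrium at peak: {500.0 / 1500.0:.3f}")
    print(f"fast k_off peak:     {max(fast):.3f}")
    print(f"slow k_off peak:     {max(slow):.3f}")
```

      In this sketch the two receptors share the same affinity and differ only in kinetics, which separates the question of how fast the receptor unbinds from the question of what its EC50 is. A similar comparison across a range of plausible k_off values would show how much the tracking claim rests on the extrapolated D1 unbinding rate.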

    4. Author response:

      eLife Assessment

      The conclusions of this work are based on valuable simulations of a detailed model of striatal dopamine dynamics. Establishing that a lower dopamine uptake rate can lead to a 'tonic' level of dopamine in the ventral but not dorsal striatum, and that dopamine concentration changes at short delays can be tracked by D1 but not D2 receptor activation, is of value and will be of interest to dopamine aficionados. However, the simulations are incomplete, providing only partial support for the key claims. Several things can be done to strengthen the conclusions, including, for example, but not exclusively, a demonstration of how the results would change as a function of changes in D2 affinity.

      We sincerely thank the Editors and Reviewers for their insightful comments on our manuscript. We are pleased that our simulations are recognized as interesting, sophisticated and valuable. Moreover, we fully agree that many of the findings will be of particular interest to dopamine aficionados. While we maintain that our simulations provide a solid basis for the key claims, we acknowledge that the conclusions can be further strengthened by the revisions suggested below.

      Reviewer #1 (Public review):

      Ejdrup, Gether, and colleagues present a sophisticated simulation of dopamine (DA) dynamics based on a substantial volume of striatum with many DA release sites. The key observation is that a reduced DA uptake rate in the ventral striatum (VS) compared to the dorsal striatum (DS) can produce an appreciable "tonic" level of DA in VS and not DS. In both areas they find that a large proportion of D2 receptors are occupied at "baseline"; this proportion increases with simulated DA cell phasic bursts but has little sensitivity to simulated DA cell pauses. They also examine, in a separate model, the effects of clustering dopamine transporters (DAT) into nanoclusters and say this may be a way of regulating tonic DA levels in VS. I found this work of interest and I think it will be useful to the community. At the same time, there are a number of weaknesses that should be addressed, and the authors need to more carefully explain how their conclusions are distinct from those based on prior models.

      (1) The conclusion that even an unrealistically long (1s) and complete pause in DA firing has little effect on DA receptor occupancy is potentially important. The ability to respond to DA pauses has been thought to be a key reason why D2 receptors (may) have high affinity. This simulation instead finds evidence that DA pauses may be useless. This result should be highlighted in the abstract and discussed more.

      We appreciate that the reviewer finds our work interesting and useful to the community. However, we acknowledge that in the revised version we need to better describe how our conclusions differ from those reached with previous models.

      We will also carry out new simulations across a range of D2R affinities to assess how this affects the finding that even a long pause in DA firing has little effect on D2 receptor occupancy. As also suggested, the results will be highlighted and further discussed.

      (2) The claim of "DAT nanoclustering as a way to shape tonic levels of DA" is not very well supported at present. None of the panels in Figure 4 simply show mean steady-state extracellular DA as a function of clustering. Perhaps mean DA is not the relevant measure, but then the authors need to better define what is and why. This issue may be linked to the fact that DAT clustering is modeled separately (Figure 4) to the main model of DA dynamics (Figures 1-3) which per the Methods assumes even distribution of uptake. Presumably, this is because the spatial resolution of the main model is too coarse to incorporate DAT nanoclusters, but it is still a limitation.

      We will improve our definitions and descriptions relating to nanoclustering of DAT in the revised version of the manuscript. We fully agree that the spatial resolution of the main model is a limitation and, ideally, that the nanoclustering should be combined with the large-scale release simulations. Unfortunately, this would require many orders of magnitude more computational power than currently available.

      As it stands it is convincing (but too obvious) that DAT clustering will increase DA away from clusters, while decreasing it near clusters. I.e. clustering increases heterogeneity, but how this could be relevant to striatal function is not made clear, especially given the different spatial scales of the models.

      Thank you for raising this important point. While it is true that DAT clustering increases heterogeneity in DA distribution at the microscopic level, the diffusion rate is, in most circumstances, too fast to permit concentration differences on a spatial scale relevant for nearby receptors. Accordingly, we propose that the primary effect of DAT nanoclustering is to decrease the overall uptake capacity, which in turn increases overall extracellular DA concentrations. Thus, homogeneous changes in extracellular DA concentrations can arise from regulating a heterogeneous DAT distribution. An exception would be a receptor located directly next to a dense cluster – i.e. within nanometers. In such cases, local DA availability may be more directly influenced by clustering effects. This will be further discussed in the revised manuscript.

      (3) I question how reasonable the "12/40" simulated burst firing condition is, since to my knowledge this is well outside the range of firing patterns actually observed for dopamine cells. It would be better to base key results on more realistic values (in particular, fewer action potentials than 12).

      We fully agree that this typically is outside the physiological range. The values are included to showcase what extreme situations would look like.

      (4) There is a need to better explain why "focality" is important, and justify the measure used.

      We will expand on the intention of this measure in the revised manuscript. Thank you for pointing out this lack of clarification.

      (5) Line 191: "D1 receptors (-Rs) were assumed to have a half maximal effective concentration (EC50) of 1000 nM". The assumptions about receptor EC50s are critical to this work and need to be better justified. It would also be good to show what happens if these EC50 numbers are changed by an order of magnitude up or down.

      We agree that these assumptions are critical. Simulations on effective off-rates across a range of EC50 values will be included in the revised version.

      (6) Line 459: "we based our receptor kinetics on newer pharmacological experiments in live cells (Agren et al., 2021) and properties of the recently developed DA receptor-based biosensors (Labouesse & Patriarchi, 2021). Indeed, these sensors are mutated receptors but only on the intracellular domains with no changes of the binding site (Labouesse & Patriarchi, 2021)"

      This argument is diminished by the observation that different sensors based on the same binding site have different affinities (e.g. in Patriarchi et al. 2018, dLight1.1 has Kd of 330nM while dlight1.3b has Kd of 1600nM).

      We sincerely thank the reviewer for highlighting this important point. We fully recognize the fundamental importance of absolute and relative DA receptor kinetics for modeling DA actions and acknowledge that differences in affinity estimates from sensor-based measurements highlight the inherent uncertainty in selecting receptor kinetics parameters. While we have based our modeling decisions on what we believe to be the most relevant available data, we acknowledge that the choice of receptor kinetics is a topic of ongoing debate. Importantly, we are making our model available to the research community, allowing others to test their own estimates of receptor kinetics and assess their impact on the model’s behavior. In our revised manuscript, we will further discuss the rationale behind our parameter choices, including (i) our selection of a Kd value of 1000 nM for D1R (based on the observed affinities for D1R sensors) and an extrapolated Koff of 19.5 s^-1 (Labouesse & Patriarchi, 2021), and (ii) our use of a Kd value of 7 nM and an extrapolated Koff of 0.2 s^-1 for D2R, consistent with recent binding studies (Ågren et al., 2021).

      (7) Estimates of Vmax for DA uptake are entirely based on prior fast-scan voltammetry studies (Table S2). But FSCV likely produces distorted measures of uptake rate due to the kinetics of DA adsorption and release on the carbon fiber surface.

      We fully agree that this is a limitation of FSCV. However, most of the cited papers attempt to correct for this by way of fitting the output to a multi-parameter model for DA kinetics. If newer literature brings the Vmax values estimated into question, we have made the model publicly available to rerun the simulations with new parameters.

      (8) It is assumed that tortuosity is the same in DS and VS - is this a safe assumption?

      The original paper cited does not specify which region the values are measured in. However, a separate paper estimates the rat cerebellum has a comparable tortuosity index (Nicholson and Phillips, J Physiol. (1981)), suggesting it may be a rather uniform value across brain regions.

      (9) More discussion is needed about how the conclusions derived from this more elaborate model of DA dynamics are the same, and different, to conclusions drawn from prior relevant models (including those cited, e.g. from Hunger et al. 2020, etc).

      As part of our revision, we will expand the current discussion of our findings in the context of previous models in the manuscript.

      Reviewer #2 (Public review):

      The work presents a model of dopamine release, diffusion, and reuptake in a small (100 micrometer^2 maximum) volume of striatum. This extends previous work by this group and others by comparing dopamine dynamics in the dorsal and ventral striatum and by using a model of immediate dopamine-receptor activation inferred from recent dopamine sensor data. From their simulations, the authors report two main conclusions. The first is that the dorsal striatum does not appear to have a sustained, relatively uniform concentration of dopamine driven by the constant 4Hz firing of dopamine neurons; rather that constant firing appears to create hotspots of dopamine. By contrast, the lower density of release sites and lower rate of reuptake in the ventral striatum creates a sustained concentration of dopamine. The second main conclusion is that D1 receptor (D1R) activation is able to track dopamine concentration changes at short delays but D2 receptor activation cannot.

      The simulations of the dorsal striatum will be of interest to dopamine aficionados as they throw some doubt on the classic model of "tonic" and "phasic" dopamine actions, further show the disconnect between dopamine neuron firing and consequent release, and thus raise issues for the reward-prediction error theory of dopamine.

      There is some careful work here checking the dependence of results on the spatial volume and its discretisation. The simulations of dopamine concentration are checked over a range of values for key parameters. The model is good, the simulations are well done, and the evidence for robust differences between dorsal and ventral striatum dopamine concentration is good.

      However, the main weakness here is that neither of the main conclusions is strongly evidenced as yet. The claim that the dorsal striatum has no "tonic" dopamine concentration is based on the single example simulation of Figure 1 not the extensive simulations over a range of parameters. Some of those later simulations seem to show that the dorsal striatum can have a "tonic" dopamine concentration, though the measurement of this is indirect. It is not clear why the reader should believe the example simulation over those in the robustness checks, for example by identifying which range of parameter values is more realistic.

      We appreciate that the reviewer finds our work interesting and carefully performed. The reviewer is correct that DA dynamics, including the presence and level of tonic DA, are parameter-dependent in both the dorsal striatum (DS) and ventral striatum (VS). Indeed, our simulations across a broad range of biological parameters were intended to help readers understand how such variation would impact the model’s outcomes, particularly since many of the parameters remain contested. Naturally, altering these parameters results in changes to the observed dynamics. However, to derive possible conclusions, we selected a subset of parameters that we believe best reflect the physiological conditions, as elaborated in the manuscript. Such a choice is ultimately required in computational modelling of biological systems. In response to the reviewer’s comment, we will place greater emphasis on clarifying which parameter regimes produce a "tonic" versus "non-tonic" DA state in the DS. Additionally, we will underscore that the distinction between tonic and non-tonic states is not a binary outcome but a parameter-dependent continuum, one that our model now allows researchers to explore systematically. Finally, we will highlight how our simulations across parameter space not only capture this continuum but also identify the regimes that produce the most heterogeneous DA signaling, both within and across striatal regions.

      The claim that D1Rs can track rapid changes in dopamine is not well supported. It is based on a single simulation in Figure 1 (DS) and 2 (VS) by visual inspection of simulated dopamine concentration traces - and even then it is unclear that D1Rs actually track dynamics because they clearly do not track rapid changes in dopamine that are almost as large as those driven by bursts (cf Figure 1i).

      We would also like to draw attention to Fig. S1, where the claim that D1Rs track rapid changes is supported in more depth. According to this figure, upon coordinated burst firing, D1R occupancy rapidly increased because diffusion no longer equilibrated the extracellular concentrations on a timescale faster than the receptors, and D1R occupancy closely tracked extracellular DA with a delay on the order of tens of milliseconds. Note that the increases in [DA] from uncoordinated stochastic release events during tonic firing in Fig. 1i are too brief to drive D1 signaling, as the DA diffuses into the remaining extracellular space on a timescale of 1-5 ms. This is faster than the receptors' response rate, and does not lead to any downstream signaling according to our simulations. This means D1 kinetics are rapid enough to track coordinated signaling on a ~50 ms timescale and slower, but not fast enough to respond to individual release events from tonic activity. In our revised manuscript we will expand the discussion of this topic to provide greater clarity.

      The claim also depends on two things that are poorly explained. First, the model of binding here is missing from the text. It seems to be a simple bound-fraction model, simulating a single D1 or D2 receptor. It is unclear whether more complex models would show the same thing.

      We realize that this is not made clear in the methods and, accordingly, we will update the method section to elaborate on how we model receptor binding. The model simulates occupied fraction of D1R and D2R in every single voxel of the simulation space.
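
      To make the binding discussion concrete, the following is a minimal sketch of a first-order bound-fraction model of the kind the reviewer describes. It is not the authors' implementation: the binding scheme, the function names, and the per-step exact integration are assumptions made for illustration. Only the Kd and Koff values are taken from the author response above.

```python
import math

# Kd and Koff values as quoted in the author response; everything else
# (names, the first-order binding scheme itself) is an illustrative assumption.
D1R = {"kd_nM": 1000.0, "koff_per_s": 19.5}
D2R = {"kd_nM": 7.0, "koff_per_s": 0.2}


def kon(receptor):
    """On-rate in nM^-1 s^-1, derived from Kd = koff / kon."""
    return receptor["koff_per_s"] / receptor["kd_nM"]


def equilibrium_occupancy(da_nM, receptor):
    """Steady-state bound fraction at a fixed dopamine concentration."""
    return da_nM / (da_nM + receptor["kd_nM"])


def relaxation_time_s(da_nM, receptor):
    """Time constant with which occupancy approaches its equilibrium value."""
    return 1.0 / (kon(receptor) * da_nM + receptor["koff_per_s"])


def simulate_occupancy(da_trace_nM, dt_s, receptor, b0=0.0):
    """Integrate dB/dt = kon*DA*(1 - B) - koff*B over a dopamine trace.

    DA is treated as constant within each step, so each step applies the
    exact exponential solution and remains stable for any dt_s.
    """
    b, out = b0, []
    for da in da_trace_nM:
        b_eq = equilibrium_occupancy(da, receptor)
        tau = relaxation_time_s(da, receptor)
        b = b_eq + (b - b_eq) * math.exp(-dt_s / tau)
        out.append(b)
    return out
```

      Under these assumptions, and at an illustrative concentration of 20 nM, `relaxation_time_s` gives roughly 50 ms for D1R and more than one second for D2R. Changing `kd_nM` or `koff_per_s` shows directly how sensitive the tracking behaviour is to the contested receptor parameters.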

      Second, crucial to the receptor model here is the inference that D1 receptor unbinding is rapid; but this inference is made based on the kinetics of dopamine sensors and is superficially explained - it is unclear why sensor kinetics should let us extrapolate to receptor kinetics, and unclear how safe is the extrapolation of the linear regression by an order of magnitude to get the D1 unbinding rate.

      We chose to use the sensors because it was possible to estimate precise affinities/off-rates from the fluorescent measurements. Although there might be some variation in affinities attributable to the mutations introduced in the sensors, the data clearly separated D1R and D2R, with a D1R affinity of ~1000 nM and a D2R affinity of ~7 nM (Labouesse & Patriarchi, 2021), consistent with earlier predictions of receptor affinities. From our assessment of the literature, we found that this was the most reasonable way to estimate affinities and thereby off-rates. Importantly, the model has been made publicly available, so should new measurements arise, the simulations can be rerun with tweaks to the input parameters.

    1. eLife Assessment

      This potentially valuable study presents claims of evidence for coordinated membrane potential oscillations in E. coli biofilms that can be linked to a putative K+ channel and that may serve to enhance photo-protection. The finding of waves of membrane potential would be of interest to a wide audience from molecular biology to microbiology and physical biology. Unfortunately, a major issue is that it is unclear whether the dye used can act as a Nernstian membrane potential dye in E. coli. The arguments of the authors, who largely ignore previously published contradictory evidence, are not adequate in that they do not engage with the fact that the dye behaves in their hands differently than in the hands of others. In addition, the lack of proper validation of the experimental method including key control experiments leaves the evidence incomplete.

    2. Reviewer #1 (Public Review):

      (1) Significance of the findings:

      Cell-to-cell communication is essential for higher functions in bacterial biofilms. Electrical signals have proven effective in transmitting signals across biofilms. These signals are then used to coordinate cellular metabolisms or to increase antibiotic tolerance. Here, the authors have reported for the first time coordinated oscillation of membrane potential in E. coli biofilms that may have a functional role in photoprotection.

      (2) Strengths of the manuscript:

      - The authors report original data.
      - For the first time, they showed that coordinated oscillations in membrane potential occur in E. coli biofilms.
      - The authors revealed a complex two-phase dynamic involving distinct molecular response mechanisms.
      - The authors developed two rigorous models inspired by 1) the Hodgkin-Huxley model for the temporal dynamics of membrane potential and 2) the Fire-Diffuse-Fire model for the propagation of the electric signal.
      - Since its discovery by comparative genomics, the Kch ion channel has not been associated with any specific phenotype in E. coli. Here, the authors proposed a functional role for the putative voltage-gated K+ ion channel (Kch channel): enhancing survival under photo-toxic conditions.

      (3) Weakness:

      - Contrary to what is stated in the abstract, the group of B. Maier has already reported collective electrical oscillations in the Gram-negative bacterium Neisseria gonorrhoeae (Hennes et al., PLoS Biol, 2023).
      - The data presented in the manuscript are not sufficient to draw conclusions about the photo-protective role of the Kch channel. The authors should perform the appropriate control experiments related to Fig 4D,E, i.e. reproduce these experiments without ThT to rule out possible photo-conversion effects on ThT that would modify its toxicity. In addition, it looks like the data reported in Fig 4E are extracted from Fig 4D. If this is indeed the case, it would be more conclusive to report the percentage of PI-positive cells in the population for each condition. This percentage should be calculated independently for each replicate. The authors should then report the average value and standard deviation of the percentage of dead cells for each condition.
      - Although Fig 4A clearly shows that light stimulation has an influence on the dynamics of the ThT signal in the biofilm, it is important to rule out possible contributions of other environmental variations that occur when the flow is stopped at the onset of light stimulation. I understand that for technical reasons, the flow of fresh medium must be stopped for the sake of imaging. Therefore, I suggest performing control experiments in which the flow is stopped at different time intervals before image acquisition (30 min or 1 h before). If there is no significant contribution from environmental variations due to medium perfusion arrest, the dynamics of the ThT signal must be unchanged regardless of the delay between flow stop and the start of light stimulation.
      - To clarify the role of K+ in the habituation response, I suggest using the ionophore valinomycin at sub-inhibitory concentrations (5 or 10µM). It should abolish the habituation response. In addition, the Kch complementation experiment exhibits a sharp drop after the first peak but on a single point. It would be more convincing to increase the temporal resolution (1min->10s) to show that there are indeed a first and a second peak. Finally, the high concentration (100µM) of CCCP used in this study completely inhibits cell activity. Therefore, it is not surprising that no ThT dynamics were observed upon light stimulation at such a concentration of CCCP.
      - Since the TMRM signal exhibits a linear increase after the first response peak (Supp Fig1D), I recommend tempering the statement at line 78.
      - Electrical signal propagation is an important aspect of the manuscript. However, a detailed quantitative analysis of the spatial dynamics within the biofilm is lacking. At a minimum, I recommend plotting the spatio-temporal diagram of the ThT intensity profile averaged along the azimuthal direction in the biofilm. In addition, it is unclear if the electrical signal propagates within the biofilm during the second peak regime, which is mediated by the Kch channel: I have plotted the spatio-temporal diagram for Video S3 and no electrical propagation is evident at the second peak. In addition, the authors should provide technical details of how R^2(t) is measured in the first regime (Fig 7E).
      - In the series of images presented in supplementary Figure 4A, no wavefront is apparent. Although the microscopy technique used in this figure differs from that of other images (like in Fig2), the wavefront should still be present. In addition, there is no second peak in the confocal images either (Supp Fig4B).
      - Many important technical details are missing (e.g. biofilm size, R^2, curvature and 445nm irradiance measurements). The description of how these quantities are measured should be detailed in the Material & Methods section.
      - Fig 5C: The curve in Fig 5D seems to correspond to the biofilm case. Since the model is made for single cells, the curve obtained by the model should be compared with the average curve presented in Fig 1B (i.e. single cell experiments).
      - For clarity, I suggest indicating on the panels whether the experiments concern single cells or biofilms. Finally, please provide bright-field images associated with the ThT images to locate bacteria.
      - In Fig 7B, the plateau is higher in the simulations than in the biofilm experiments. The authors should add a comment in the paper to explain this discrepancy.

    3. Reviewer #2 (Public Review):

      The authors use ThT dye as a Nernstian potential dye in E. coli. Quantitative measurements of membrane potential using any cationic indicator dye are based on the equilibration of the dye across the membrane according to Boltzmann's law.
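
      As a reference point for the discussion that follows, here is a small sketch of the ideal relationship a passively equilibrating cationic dye is expected to obey. The function names and the default temperature of 25 °C are assumptions for illustration. The sketch describes only the ideal Nernstian case and says nothing about light-dependent loading or efflux pumps.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_vm_mV(c_in, c_out, z=1, temp_K=298.15):
    """Membrane potential (inside minus outside, in mV) implied by an
    equilibrated dye with inside/outside concentrations c_in and c_out."""
    return -1000.0 * (R * temp_K) / (z * F) * math.log(c_in / c_out)


def equilibrium_ratio(vm_mV, z=1, temp_K=298.15):
    """Inside/outside concentration ratio expected for the dye at a given Vm."""
    return math.exp(-z * F * (vm_mV / 1000.0) / (R * temp_K))
```

      For a monovalent cation, each tenfold accumulation inside the cell corresponds to roughly -59 mV at this temperature. A dye whose loading depends on anything other than this equilibrium cannot be read through the formula directly.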

      Ideally, the dye should have high membrane permeability to ensure rapid equilibration. Others have demonstrated that E. coli cells in the presence of ThT do not load unless blue light is present, and that the loading profile does not look as expected for a cationic Nernstian dye. They also show that the loading profile of the dye is different for E. coli cells deleted for the TolC pump. I, therefore, objected to interpreting the signal from ThT as a Vm signal when used in E. coli. Nothing the authors have said suggests that I should change this assessment.

      Specifically, the authors responded to my concerns as follows:

      (1) 'We are aware of this study, but believe it to be scientifically flawed. We do not cite the article because we do not think it is a particularly useful contribution to the literature.' This seems to go against ethical practices when it comes to scientific literature citations. If the authors identified work that handles the same topic they do, which they believe is scientifically flawed, the discussion to reflect that should be included.

      (2) 'The Pilizota group invokes some elaborate artefacts to explain the lack of agreement with a simple Nernstian battery model. The model is incorrect not the fluorophore.'
      It seems the authors object to the basic principle behind the usage of Nernstian dyes. If the authors wish to use ThT according to some other model, and not as a Nernstian indicator, they need to explain and develop that model. Instead, they state 'ThT is a Nernstian voltage indicator' in their manuscript and expect the dye to behave like a passive voltage indicator throughout it.

      (3) 'We think the proton effect is a million times weaker than that due to potassium i.e. 0.2 M K+ versus 10^-7 M H+. We can comfortably neglect the influx of H+ in our experiments.'
      I agree with this statement by the authors. At near-neutral extracellular pH, E. coli keeps near-neutral intracellular pH, and the contribution from the chemical concentration gradient to the electrochemical potential of protons is negligible. The main contribution is from the membrane potential. However, this has nothing to do with the criticism to which this is the response of the authors. The criticism is that ThT has been observed not to permeate the cell without blue light. The blue light has been observed to influence the electrochemical potential of protons (and given that at near-neutral intracellular and extracellular pH this is mostly the membrane potential, as authors note themselves, we are talking about Vm effectively). Thus, two things are happening when one is loading the ThT, not just expected equilibration but also lowering of membrane potential. The electrochemical potential of protons is coupled via the membrane potential to all the other electrochemical potentials of ions, including the mentioned K+.

      (4) 'The vast majority of cells continue to be viable. We do not think membrane damage is dominating.' This was in response to the question of how the authors demonstrated TMRM loading and in which conditions (while reminding them that the TMRM loading profile in E. coli has been demonstrated in Potassium Phosphate buffer). The request was to demonstrate the TMRM loading profile in their condition as well as to show that it does not depend on light. Cells could still be viable, as membrane permeabilisation with light is gradual, but the loading of ThT dye is no longer based on simple electrochemical potential (of the dye) equilibration.

      (5) On the comment on the action of CCCP with references included, authors include a comment that consists of phrases like 'our understanding of the literature' with no citations of such literature. Difficult to comment further without references.

      (6) 'Shielding would provide the reverse effect, since hyperpolarization begins in the dense centres of the biofilms. For the initial 2 hours the cells receive negligible blue light. Neither of the referee's comments thus seem tenable.'
      The authors have misunderstood my comment. I am not advocating shielding (I agree that this is not it) but stating that this is not the only other explanation for what they see (apart from electrical signaling). The other I proposed is that the membrane has changed in composition and/or the effective light power the cells can tolerate. The authors comment only on the light power (not convincingly though, giving the number for that power would be more appropriate), not on the possible changes in the membrane permeability.

      (7) 'The work that TolC provides a possible passive pathway for ThT to leave cells seems slightly niche. It just demonstrates another mechanism for the cells to equilibrate the concentrations of ThT in a Nernstian manner i.e. driven by the membrane voltage.' I am not sure what the authors mean by another mechanism. The mechanism of action of a Nernstian dye is passive equilibration according to the electrochemical potential (i.e. until the electrochemical potential of the dye is 0).

      (8) 'In the 70 years since Hodgkin and Huxley first presented their model, a huge number of similar models have been proposed to describe cellular electrophysiology. We are not being hyperbolic when we state that the HH models for excitable cells are like the Schrödinger equation for molecules. We carefully adapted our HH model to reflect the currently understood electrophysiology of E. coli.'

      I gave a very concrete comment on the fact that in the HH model conductivity and leakage are as they are because this was explicitly measured. The authors state that they have carefully adapted their model based on what is currently understood for E. coli electrophysiology. It is not clear how. HH uses gKn^4 based on Figure 2 here https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1392413/pdf/jphysiol01442-0106.pdf, i.e. measured rise and fall of potassium conductance on msec time scales. I looked at the citation the authors have given and found a resistance of an entire biofilm of a given strain at 3 applied voltages. So why n^4 based on that? Why does the unknown current have the gqz^4 form? Sodium conductance in HH is described by m^3hgNa (again based on detailed conductance measurements), so why is the unknown current in E. coli described by gQz^4? Why is leakage in the form that it is, and based on what measurement?
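
      For readers less familiar with the terms being questioned, the classic Hodgkin-Huxley membrane equation for the squid giant axon can be written as below. This is the standard textbook form, not the authors' E. coli adaptation. The exponents on the gating variables n, m and h were chosen to fit measured conductance time courses, which is the point of the comment above.

```latex
\begin{align}
I &= C_M \frac{dV}{dt}
   + \bar{g}_K\, n^4 (V - V_K)
   + \bar{g}_{Na}\, m^3 h\, (V - V_{Na})
   + \bar{g}_l\, (V - V_l), \\
\frac{dx}{dt} &= \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
   \qquad x \in \{n, m, h\}.
\end{align}
```

      Here the rate functions alpha and beta are empirical fits to voltage-clamp data, so an analogous term such as z^4 for a different current would need comparable conductance measurements to justify it.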

      Throughout their responses, the authors seem to think that collapsing the electrochemical gradient of protons is all about protons, and this is not the case. At near neutral inside and outside pH, the electrochemical potential of protons is simply membrane voltage. And membrane voltage acts on all ions in the cell.

      Authors have started their response to concrete comments on the usage of ThT dye with comments on papers from my group that are not all directly relevant to this publication. I understand that their intention is to discredit a reviewer but given that my role here is to review this manuscript, I will only address their comments to the publications/part of publications that are relevant to this manuscript and mention what is not relevant.

      Publications in the order these were commented on.

      (1) In a comment on the paper that describes the usage of ThT dye as a Nernstian dye authors seem to talk about a model of an entire active cell.
      'Huge oscillations occur in the membrane potentials of E. coli that cannot be described by the SNB model.' The two have nothing to do with each other. Nernstian dye equilibrates according to its electrochemical potential. Once that happens it can measure the potential (under the assumption that not too much dye has entered and thus lowered too much the membrane potential under measurement). The time scale of that is important, and the dye can only measure processes that are slower than that equilibration. If one wants to use a dye that acts under a different model, first that needs to be developed, and then coupled to any other active cell model.

      (2) The part of this paper that is relevant is simply the usage of TMRM dye. It is used as a Nernstian dye, so all of the above applies. The rest is a study of the flagellar motor.

      (3) The authors seem not to understand that the electrochemical potential of protons is coupled to the electrochemical potentials of all other ions via the membrane potential. In the manuscript the authors discuss, PMF~Vm, as DeltapH~0. Other than that, this publication is not relevant to their current manuscript.

      (4) The manuscript in fact states precisely that PMF cannot be generated by protons only and that some other ions need to be moved out for the purpose. For a near-neutral environment it states that these need to be cations (e.g. K+). The model used in this manuscript is a pump-leak model. Neither is relevant for the usage of ThT dye.

      Further comments include, along the lines of:

      'The editors stress the main issue raised was a single referee questioning the use of ThT as an indicator of membrane potential. We are well aware of the articles by the Pilizota group and we believe them to be scientifically flawed. The authors assume there are no voltage-gated ion channels in E. coli and then attempt to explain motility data based on a simple Nernstian battery model (they assume E. coli are unexcitable matter). This in turn leads them to conclude the membrane dye ThT is faulty, when in fact it is a problem with their simple battery model.'

      The only assumption made when using a cationic Nernstian dye is that it equilibrates passively across the membrane according to its electrochemical potential. As it does that, it does lower the membrane potential, which is why as little as possible is added so that this is negligible. The equilibration should be as fast as possible, but at the very least it should be known, as no change in membrane potential can be measured that is faster than that.

      This behaviour should be orthogonal to what the cell is doing, it is a probe after all. If the cell is excitable, a Nernstian dye can be used, as long as it's still passively equilibrating and doing so faster than any changes in membrane potential due to excitations of the cells. There are absolutely no assumptions made on the active system that is about to be measured by this expected behaviour of a Nernstian dye. And there shouldn't be, it is a probe. If one wants to use a dye that is not purely Nernstian that behaviour needs to be described and a model proposed. As far as I can find, authors do no such thing.
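
      As a minimal illustration of the only assumption described above, the sketch below converts an equilibrated inside/outside concentration ratio of a monovalent cationic dye into a membrane potential using the Nernst equation. The function name, the default temperature and the example ratio of 100 are illustrative assumptions, not values from the manuscript or from any cited paper, and in practice the concentration ratio would have to be inferred from calibrated fluorescence.

```python
import math

R = 8.314462618      # gas constant, J mol^-1 K^-1
F = 96485.33212      # Faraday constant, C mol^-1


def nernst_potential_mV(c_in, c_out, z=1, temp_K=298.15):
    """Membrane potential (inside relative to outside, in mV) at which a
    passively equilibrating ion of charge z has the given concentration ratio.

    Hypothetical helper: assumes the dye has fully equilibrated and that
    c_in and c_out are free concentrations in the same units.
    """
    return -1000.0 * (R * temp_K) / (z * F) * math.log(c_in / c_out)


# A cationic dye accumulated 100-fold inside the cell (assumed example value)
vm = nernst_potential_mV(c_in=100.0, c_out=1.0)
print(f"Vm = {vm:.1f} mV")
```

      The calculation says nothing about what the cell is doing; it only holds for changes in membrane potential that are slower than the dye's own equilibration time, which therefore has to be measured separately.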

      There is a comment on the use of the flagellar motor as a readout of PMF, stating that the motor can be stopped by YcgR and citing work from 2023. Indeed, there is a range of references, such as https://doi.org/10.1016/j.molcel.2010.03.001, that demonstrate this (from around 2000-2010 as far as I am aware). The timescale of such a slowdown is hours (see Figure 5 here https://www.cell.com/cell/pdf/S0092-8674(10)00019-X.pdf). Needless to say, the flagellar motor, when used as a probe, needs to remain a valid probe in the conditions used. Thus one should always be on the lookout for any other such proteins that we are not yet aware of and that could slow it down or make the speed no longer proportional to the PMF. In the papers in which my group uses the motor, the changes are fast, often reversible, and occur within an observation window of 30 min. They are also the same in the DeltaYcgR strain, which we have not included because, given the time scales, it seemed obvious, but we certainly can in the future (as well as stay vigilant for any conditions that would render the motor no longer a suitable probe for PMF).

    4. Reviewer #3 (Public Review):

      This manuscript by Akabuogu et al. investigates membrane potential dynamics in E. coli. Membrane potential fluctuations have been observed in bacteria by several research groups in recent years, including in the context of bacterial biofilms where they have been proposed to play a role in cellular communication. Here, these authors investigate membrane potential in E. coli, in both single cells and biofilms. I have reviewed the revised manuscript provided by the authors, as well as their responses to the initial reviews; my opinion about the manuscript is largely unchanged. I have focused my public review on those issues that I believe to be most pressing, with additional comments included in the review to authors. Although these authors are working in an exciting research area, the evidence they provide for their claims is inadequate, and several key control experiments are still missing. In some cases, the authors allude to potentially relevant data in their responses to the initial reviews, but unfortunately these data are not shown. Furthermore, I cannot identify any traveling wavefronts in the data included in this manuscript. In addition to the challenges associated with the use of Thioflavin-T (ThT) raised by the second reviewer, these caveats make the work presented in this manuscript difficult to interpret.

      First, some of the key experiments presented in the paper lack required controls:

      (1) This paper asserts that the observed ThT fluorescence dynamics are induced by blue light. This is a fundamental claim in the paper, since the authors go on to argue that these dynamics are part of a blue light response. This claim must be supported by the appropriate negative control experiment measuring ThT fluorescence dynamics in the absence of blue light- if this idea is correct, these dynamics should not be observed in the absence of blue light exposure. If this experiment cannot be performed with ThT since blue light is used for its excitation, TMRM can be used instead.

      In response to this, the authors wrote that "the fluorescent baseline is too weak to measure cleanly in this experiment." If they observe no ThT signal above noise in their time lapse data in the absence of blue light, this should be reported in the manuscript- this would be a satisfactory negative control. They then wrote that "It appears the collective response of all the bacteria hyperpolarization at the same time appears to dominate the signal." I am not sure what they mean by this- perhaps that ThT fluorescence changes strongly only in response to blue light? This is a fundamental control for this experiment that ought to be presented to the reader.

      (2) The authors claim that a ∆kch mutant is more susceptible to blue light stress, as evidenced by PI staining. The premise that the cells are mounting a protective response to blue light via these channels rests on this claim. However, they do not perform the negative control experiment, conducting PI staining for WT and the ∆kch mutant in the absence of blue light. In the absence of this control it is not possible to rule out effects of the ∆kch mutation on overall viability and/or PI uptake. The authors do include a growth curve for comparison, but planktonic growth is a very different context than surface-attached biofilm growth. Additionally, the ∆kch mutation may have impacts on PI permeability specifically that are not addressed by a growth curve. The negative control experiment is of key importance here.

      Second, the ideas presented in this manuscript rely entirely on analysis of ThT fluorescence data, specifically a time course of cellular fluorescence following blue light treatment. However, alternate explanations for and potential confounders of the observed dynamics are not sufficiently addressed:

      (1) Bacterial cells are autofluorescent, and this fluorescence can change significantly in response to stress (e.g. blue light exposure). To characterize and/or rule out autofluorescence contributions to the measurement, the authors should present time lapse fluorescence traces of unstained cells for comparison, acquired under the same imaging conditions in both wild type and ∆kch mutant cells. In their response to reviewers the authors suggested that they have conducted this experiment and found that the autofluorescence contribution is negligible, which is good, but these data should be included in the manuscript along with a description of how these controls were conducted.

      (2) Similarly, in my initial review I raised a concern about the possible contributions of photobleaching to the observed fluorescence dynamics. This is particularly relevant for the interpretation of the experiment in which catalase appears to attenuate the decay of the ThT signal; this attenuation could alternatively be due to catalase decreasing ThT photobleaching. In their response, the authors indicated that photobleaching is negligible, which would be good, but they do not share any evidence to support this claim. Photobleaching can be assessed in this experiment by varying the light dosage (illumination power, frequency, and/or duration) and confirming that the observed fluorescence dynamics are unaffected.

      Third, the paper claims in two instances that there are propagating waves of ThT fluorescence that move through biofilms, but I do not observe these waves in any case:

      (1) The first wavefront claim relates to small cell clusters, in Fig. 2A and Video S2 and S3 (with Fig. 2A and Video S2 showing the same biofilm.) I simply do not see any evidence of propagation in either case- rather, all cells get brighter and dimmer in tandem. I downloaded and analyzed Video S3 in several ways (plotting intensity profiles for different regions at different distances from the cluster center, drawing a kymograph across the cluster, etc.) and in no case did I see any evidence of a propagating wavefront. (I attempted this same analysis on the biofilm shown in Fig. 2A and Video S2 with similar results, but the images shown in the figure panels and especially the video are still both so saturated that the quantification is difficult to interpret.) If there is evidence for wavefronts, it should be demonstrated explicitly by analysis of several clusters. For example, a figure of time-to-peak vs. position in the cluster demonstrating a propagating wave would satisfy this. Currently, I do not see any wavefronts in this data.

      (2) The other wavefront claim relates to biofilms, and the relevant data is presented in Fig. S4 (and I believe also in what is now Video S8, but no supplemental video legends are provided, and this video is not cited in text.) As before, I cannot discern any wavefronts in the image and video provided; Reviewer 1 was also not able to detect wave propagation in this video by kymograph. Some mean squared displacements are shown in Fig. 7. As before, the methods for how these were obtained are not clearly documented either in this manuscript or in the BioRXiv preprint linked in the initial response to reviewers, and since wavefronts are not evident in the video it is hard to understand what is being measured here- radial distance from where? (The methods section mentions radial distance from the substrate, this should mean Z position above the imaging surface, and no wavefronts are evident in Z in the figure panels or movie.) Thus, clear demonstration of these wavefronts is still missing here as well.
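
      The time-to-peak versus position analysis suggested in point (1) above could be sketched as follows. This is a minimal illustration on synthetic traces only: the helper names, the Gaussian pulse shape, the distances and the 2 s/µm delay are all assumptions made for the example and are not taken from the manuscript's data. A propagating wave should give a clearly non-zero slope of peak time against distance, whereas cells brightening in tandem should give a slope near zero.

```python
import math


def time_to_peak(trace, times):
    """Return the time at which a fluorescence trace reaches its maximum."""
    peak_index = max(range(len(trace)), key=trace.__getitem__)
    return times[peak_index]


def delay_slope(distances, peak_times):
    """Least-squares slope of time-to-peak against distance (time per unit length)."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_t = sum(peak_times) / n
    cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(distances, peak_times))
    var = sum((d - mean_d) ** 2 for d in distances)
    return cov / var


def gaussian_pulse(times, centre, width=5.0):
    """Synthetic single-peak trace used only to exercise the analysis."""
    return [math.exp(-((t - centre) / width) ** 2) for t in times]


times = [float(t) for t in range(0, 101)]     # seconds (assumed sampling)
distances = [0.0, 5.0, 10.0, 15.0, 20.0]      # µm from cluster centre (assumed)

# Synthetic propagating case: the peak is delayed by 2 s per µm
wave = [gaussian_pulse(times, 20.0 + 2.0 * d) for d in distances]
# Synthetic synchronous case: every region peaks at the same time
sync = [gaussian_pulse(times, 20.0) for _ in distances]

wave_slope = delay_slope(distances, [time_to_peak(tr, times) for tr in wave])
sync_slope = delay_slope(distances, [time_to_peak(tr, times) for tr in sync])
print(wave_slope, sync_slope)
```

      On real data the traces would be mean intensities of regions at increasing distance from the cluster centre, and the slope would need an uncertainty estimate across several clusters before any propagation speed is quoted.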

      Fourth, I have some specific questions about the study of blue light stress and the use of PI as a cell viability indicator:

      (1) The logic of this paper includes the premise that blue light exposure is a stressor under the experimental conditions employed in the paper. Although it is of course generally true that blue light can be damaging to bacteria, this is dependent on light power and dosage. The control I recommended above, staining cells with PI in the presence and absence of blue light, will also allow the authors to confirm that this blue light treatment is indeed a stressor- the PI staining would be expected to increase in the presence of blue light if this is so.

      (2) The presence of ThT may complicate the study of the blue light stress response, since ThT enhances the photodynamic effects of blue light in E. coli (Bondia et al. 2021 Chemical Communications). The authors could investigate ThT toxicity under these conditions by staining cells with PI after exposing them to blue light with or without ThT staining.

      (3) In my initial review, I wrote the following: "In Figures 4D - E, the interpretation of this experiment can be confounded by the fact that PI uptake can sometimes be seen in bacterial cells with high membrane potential (Kirchhoff & Cypionka 2017 J Microbial Methods); the interpretation is that high membrane potential can lead to increased PI permeability. Because the membrane potential is largely higher throughout blue light treatment in the ∆kch mutant (Fig. 3[BC]), this complicates the interpretation of this experiment." In their response, the authors suggested that these results are not relevant in this case because "In our experiment methodology, cell death was not forced on the cells by introducing an extra burden or via anoxia." However, the logic of the paper is that the cells are in fact dying due to an imposed external stressor, which presumably also confers an increased burden as the cells try to deal with the stress. Instead, the authors should simply use a parallel method to confirm the results of PI staining. For example, the experiment could be repeated with other stains, or the viability of blue light-treated cells could be addressed more directly by outgrowth or colony-forming unit assays.

      The CFU assay suggested above has the additional advantage that it can also be performed on planktonic cells in liquid culture that are exposed to blue light. If, as the paper suggests, a protective response to blue light is being coordinated at the biofilm level by these membrane potential fluctuations, the WT strain might be expected to lose its survival advantage vs. the ∆kch mutant in the absence of a biofilm.

      Fifth, in several cases the data are presented in a way that is difficult to interpret, or the paper makes claims that are difficult to observe in the data:

      (1) The authors suggest that the ThT and TMRM traces presented in Fig. S1D have similar shapes, but this is not obvious to me- the TMRM curve has very little decrease after the initial peak and only a modest, gradual rise thereafter. The authors suggest that this is due to increased TMRM photobleaching, but I would expect that photobleaching should exacerbate the signal decrease after the initial peak. Since this figure is used to support the use of ThT as a membrane potential indicator, and since this is the only alternative measurement of membrane potential presented in text, the authors should discuss this discrepancy in more detail.

      (2) The comparison of single cells to microcolonies presented in figures 1B and D still needs revision:

      First, both reviewer 1 and I commented in our initial reviews that the ThT traces, here and elsewhere, should not be normalized- this will help with the interpretation of some of the claims throughout the manuscript.

      Second, the way these figures are shown with all traces overlaid at full opacity makes it very difficult to see what is being compared. Since the point of the comparison is the time to first peak (and the standard deviation thereof), histograms of the distributions of time to first peak in both cases should be plotted as a separate figure panel.

      Third, statistical significance tests ought to be used to evaluate the statistical strength of the comparisons between these curves. The authors compare both means and standard deviations of the time to first peak, and there are appropriate statistical tests for both types of comparisons.

      (3) The authors claim that the curve shown in Fig. S4B is similar to the simulation result shown in Fig. 7B. I remain unconvinced that this is so, particularly with respect to the kinetics of the second peak- at least it seems to me that the differences should be acknowledged and discussed. In any case, the best thing to do would be to move Fig. S4B to the main text alongside Fig. 7B so that the readers can make the comparison more easily.

      (4) As I wrote in my first review, in the discussion of voltage-gated calcium channels, the authors refer to "spiking events", but these are not obvious in Figure S3E. Although the fluorescence intensity changes over time, these fluctuations cannot be distinguished from measurement noise. A no-light control could help clarify this.

      (5) In the lower irradiance conditions in Fig. 4A, the ThT dynamics are slower overall, and it looks like the ThT intensity is beginning to rise at the end of the measurement. The authors write that no second peak is observed below an irradiance threshold of 15.99 µW/mm². However, could a more prominent second peak be observed in these cases if the measurement time were extended? Additionally, the end of these curves looks similar to the curve in Fig. S4B, in which the authors write that the slow rise is evidence of the presence of a second peak, in contrast to their interpretation here.
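
      One way to carry out the statistical comparison requested in point (2) above is a permutation test, sketched here for both the mean and the spread of the time to first peak. The function name and all numbers are made up for illustration and are not measurements from the manuscript; other standard tests (for example a rank-based test for location or Levene's test for spread) would serve equally well.

```python
import random
import statistics


def permutation_p_value(a, b, stat, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the difference in `stat` between a and b."""
    rng = random.Random(seed)
    observed = abs(stat(a) - stat(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(stat(pooled[:len(a)]) - stat(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical times to first peak (s) for two conditions; NOT real data
group_a = [12, 35, 20, 48, 8, 27, 40, 15, 31, 22]
group_b = [60, 62, 61, 63, 59, 60, 62, 61, 60, 62]

# Difference in means
p_mean = permutation_p_value(group_a, group_b, statistics.mean)

# Difference in spread: centre each group on its own median first so that
# a shift in location is not mistaken for a change in variability
centred_a = [x - statistics.median(group_a) for x in group_a]
centred_b = [x - statistics.median(group_b) for x in group_b]
p_spread = permutation_p_value(centred_a, centred_b, statistics.stdev)

print(p_mean, p_spread)
```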

      Additional considerations:

      (1) The analysis and interpretation of the first peak, and particularly of the time-to-fire data, is challenging throughout the manuscript because the time resolution of the data set is quite limited. It seems that a large proportion of cells have already fired after a single acquisition frame. It would be ideal to increase the time resolution on this measurement to improve precision. This could be done by imaging more quickly, but that would perhaps necessitate more blue light exposure; an alternative is to do this experiment under lower blue light irradiance where the first spike time is increased (Figure 4A).

      (2) The authors suggest in the manuscript that "E. coli biofilms use electrical signalling to coordinate long-range responses to light stress." In addition to the technical caveats discussed above, I am missing a discussion about what these responses might be. What constitutes a long-range response to light stress, and are there known examples of such responses in bacteria?

      (3) The presence of long-range blue light responses can also be interrogated experimentally, for example, by repeating the Live/Dead experiment in planktonic culture or the single-cell condition. If the protection from blue light specifically emerges due to coordinated activity of the biofilm, the ∆kch mutant would not be expected to show a change in Live/Dead staining in non-biofilm conditions. The CFU experiment I mentioned above could also implicate coordinated long-range responses specifically, if biofilms and liquid culture experiments can be compared (although I know that recovering cells from biofilms is challenging.)

      (4) At the end of the results section, the authors suggest a critical biofilm size of only 4 μm for wavefront propagation (not much larger than a single cell!) The authors show responses for various biofilm sizes in Fig. 2C, but these are all substantially larger (and this figure also does not contain wavefront information.) Are there data for cell clusters above and below this size that could support this claim more directly?

      (5) In Fig. 4C, the overall trajectories of extracellular potassium are indeed similar, but the kinetics of the second peak of potassium are different than those observed by ThT (it rises minutes earlier)- is this consistent with the idea that Kch is responsible for that peak? Additionally, the potassium dynamics also include the first ThT peak- is this surprising given that the Kch channel has no effect on this peak according to the model?

      Detailed comments:

      Why are Fig. 2A and Video S2 called a microcluster, whereas Video S3, which is smaller, is called a biofilm?

      "We observed a spontaneous rapid rise in spikes within cells in the center of the biofilm" (Line 140): What does "spontaneous" mean here?

      "This demonstrates that the ion-channel mediated membrane potential dynamics is a light stress relief process.", "E. coli cells employ ion-channel mediated dynamics to manage ROS-induced stress linked to light irradiation." (Line 268 and the second sentence of the Fig. 4F legend): This claim is not well-supported. There are several possible interpretations of the catalase experiment (which should be discussed); this experiment perhaps suggests that ROS impacts membrane potential but does not indicate that these membrane potential fluctuations help the cells respond to blue light stress. The loss of viability in the ∆kch mutant might indicate a link between these membrane potential experiments and viability, but it is hard to interpret without the no light controls I mention above.

      "The model also predicts... the external light stress" (Lines 338-341): Please clarify this section. Where does this prediction arise from in the modeling work? Second, I am not sure what is meant by "modulates the light stress" or "keeps the cell dynamics robust to the intensity of external light stress" (especially since the dynamics clearly vary with irradiance, as seen in Figure 4A).

      "We hypothesized that E. coli not only modulates the light-induced stress but also handles the increase of the ROS by adjusting the profile of the membrane potential dynamics" (Line 347): I am not sure what "handles the ROS by adjusting the profile of the membrane potential dynamics" means. What is meant by "handling" ROS? Is the hypothesis that membrane potential dynamics themselves are protective against ROS, or that they induce a ROS-protective response downstream, or something else? Later the authors write that changes in the response to ROS in the model agree with the hypothesis, but just showing that ROS impacts the membrane potential does not seem to demonstrate that this has a protective effect against ROS.

      "Mechanosensitive ion channels (MS) are vital for the first hyperpolarization event in E. coli." (Line 391): This is misleading- deleting mechanosensitive ion channels totally ablates membrane potential dynamics; it does not have a specific effect on the first hyperpolarization event. The claim that mechanosensitive ion channels are specifically involved in the first event also appears in the abstract.

      Also, the apparent membrane potential is much lower even at the start of the experiment in these mutants (Fig. 6C-D)- is this expected? This seems to imply that these ion channels also have a blue light-independent effect.

      Throughout the paper, there are claims that the initial ThT spike is involved in "registering the presence of the light stress" and similar. What is the evidence for this claim?

      "We have presented much better quantitative agreement of our model with the propagating wavefronts in E. coli biofilms..." (Line 619): It is not evident to me that the agreement between model and prediction is "much better" in this work than in the cited work (reference 57, Hennes et al. 2023). The model in Figure 4 of ref. 57 seems to capture the key features of their data.

      In methods, "Only cells that are hyperpolarized were counted in the experiment as live" (Line 745): what percentage of cells did not hyperpolarize in these experiments?

      Some indication of standard deviation (error bars or shading) should be added to all figures where mean traces are plotted.

      Video S8 is very confusing- why does the video play first forwards and then backwards? It is easy to misinterpret this as a rise in the intensity at the end of the experiment.

    5. Author response:

      The following is the authors’ response to the current reviews.

      The issue of a control without blue light illumination was raised. Clearly without the light we will not obtain any signal in the fluorescence microscopy experiments, which would not be very informative. Instead, we changed the level of blue light illumination in the fluorescence microscopy experiments (figure 4A) and the response of the bacteria scales with dosage. It is very hard to find an alternative explanation, beyond that the blue light is stressing the bacteria and modulating their membrane potentials.

      One of the referees refuses to see wavefronts in our microscopy data. We struggle to understand whether it is an issue with definitions (Waigh has published a tutorial on the subject in Chapter 5 of his book ‘The physics of bacteria: from cells to biofilms’, T.A.Waigh, CUP, 2024 – figure 5.1 shows a sketch) or something subtler on diffusion in excitable systems. We stand by our claim that we observe wavefronts, similar to those observed by Prindle et al<sup>1</sup> and Blee et al<sup>2</sup> for B. subtilis biofilms.

      The referee is questioning our use of ThT to probe the membrane potential. We believe the Pilizota and Strahl groups are treating the E. coli as unexcitable cells, leading to their problems. Instead, we believe E. coli cells are excitable (containing the voltage-gated ion channel Kch) and we now clearly state this in the manuscript. Furthermore, we have added a section at the end of the supplementary information discussing some of the issues with ThT.

      Related to the previous point, we now cite articles from the Pilizota and Strahl groups in the main text (one from each group). Unfortunately, the space constraints of eLife mean we cannot make a more detailed discussion in the main article and this is left to the supplementary information section.

      In terms of modelling the ion channels, the Hodgkin-Huxley type model proposes that the Kch ion channel can be modelled as a typical voltage-gated potassium ion channel i.e. with a 𝑛<sup>4</sup> term in its conductivity. The literature agrees that Kch is a voltage-gated potassium ion channel based on its primary sequence<sup>3</sup>. The protein has the typical 6 transmembrane helix motif for a voltage-gated ion channel. The agent-based model assumes little about the structure of ion channels in E. coli, other than they release potassium in response to a threshold potassium concentration in their environment. The agent-based model is thus robust to the exact molecular details chosen and predicts the anomalous transport of the potassium wavefronts reasonably well (the modelling was extended in a recent Physical Review E article<sup>4</sup>). Such a description of reaction-anomalous diffusion phenomena has not to our knowledge been previously achieved in the literature<sup>5</sup> and in general could be used to describe other signaling molecules.

      1. Prindle, A.; Liu, J.; Asally, M.; Ly, S.; Garcia-Ojalvo, J.; Süel, G. M., Ion channels enable electrical communication in bacterial communities. Nature 2015, 527, 59.

      2. Blee, J. A.; Roberts, I. S.; Waigh, T. A., Membrane potentials, oxidative stress and the dispersal response of bacterial biofilms to 405 nm light. Physical Biology 2020, 17, 036001.

      3. Milkman, R., An E. coli homologue of eukaryotic potassium channel proteins. PNAS 1994, 91, 3510-3514.

      4. Martorelli, V.; Akabuogu, E. U.; Krasovec, R.; Roberts, I. S.; Waigh, T. A., Electrical signaling in three-dimensional bacterial biofilms using an agent-based fire-diffuse-fire model. Physical Review E 2024, 109, 054402.

      5. Waigh, T. A.; Korabel, N., Heterogeneous anomalous transport in cellular and molecular biology. Reports on Progress in Physics 2023, 86, 126601.

      ———

      The following is the authors’ response to the original reviews.

      Critical synopsis of the articles cited by referee 2:

      (1) ‘Generalized workflow for characterization of Nernstian dyes and their effects on bacterial physiology’, L.Mancini et al, Biophysical Journal, 2020, 118, 1, 4-14.

      This is the central article used by referee 2 to argue that there are issues with the calibration of ThT for the measurement of membrane potentials. The authors use a simple Nernstian battery (SNB) model and unfortunately it is wrong when voltage-gated ion channels occur. Huge oscillations occur in the membrane potentials of E. coli that cannot be described by the SNB model. Instead a Hodgkin-Huxley model is needed, as shown in our eLife manuscript and multiple other studies (see above). Arrhenius kinetics are assumed in the SNB model for pumping with no real evidence and the generalized workflow involves ripping the flagella off the bacteria! The authors construct an elaborate ‘work flow’ to ensure their ThT results can be interpreted using their erroneous SNB model over a limited range of parameters.

      (2) ‘Non-equivalence of membrane voltage and ion-gradient as driving forces for the bacterial flagellar motor at low load’, C.J.Lo, et al, Biophysical Journal, 2007, 93, 1, 294.

      An odd de novo chimeric species is developed using an E. coli chassis which uses Na+ instead of H+ for the motility of its flagellar motor. Its relevance to wild type E. coli is not clear, due to the massive physiological perturbations involved. A SNB model is used to fit the data over a very limited parameter range with all the concomitant errors.

      (3) ‘Single-cell bacterial electrophysiology reveals mechanisms of stress-induced damage’, E.Krasnopeeva, et al, Biophysical Journal, 2019, 116, 2390.

      The abstract says ‘PMF defines the physiological state of the cell’. This statement is hyperbolic. An extremely wide range of molecules contribute to the physiological state of a cell. PMF does not even define the electrophysiology of the cell e.g. via the membrane potential. There are 0.2 M of K+ compared with 0.0000001 M of H+ in E. coli, so K+ is arguably a million times more important for the membrane potential than H+ and thus the electrophysiology!

      Equation (1) in the manuscript assumes no other ions are exchanged during the experiments other than H+. This is a very bad approximation when voltage-gated potassium ion channels move the majority ion (K+) around!

      In our model Figure 4A is better explained by depolarisation due to K+ channels closing than direct irreversible photodamage. Why does the ThT fluorescence increase again for the second hyperpolarization event if the ThT is supposed to be damaged? It does not make sense.

      (4) ‘The proton motive force determines E. coli robustness to extracellular pH’, G.Terradot et al, 2024, preprint.

      This article expounds the SNB model once more. It still ignores the voltage-gated ion channels. Furthermore, it ignores the effect of the dominant ion in E. coli, K+. The manuscript is incorrect as a result and I would not recommend publication.

      In general, an important problem is being researched i.e. how the membrane potential of E. coli is related to motility, but there are serious flaws in the SNB approach and the experimental methodology appears tenuous.

      Answers to specific questions raised by the referees

      Reviewer #1 (Public Review):

      Summary:

      Cell-to-cell communication is essential for higher functions in bacterial biofilms. Electrical signals have proven effective in transmitting signals across biofilms. These signals are then used to coordinate cellular metabolisms or to increase antibiotic tolerance. Here, the authors have reported for the first time coordinated oscillation of membrane potential in E. coli biofilms that may have a functional role in photoprotection.

      Strengths:

      - The authors report original data.

      - For the first time, they showed that coordinated oscillations in membrane potential occur in E. coli biofilms.

      - The authors revealed a complex two-phase dynamic involving distinct molecular response mechanisms.

      - The authors developed two rigorous models inspired by 1) Hodgkin-Huxley model for the temporal dynamics of membrane potential and 2) Fire-Diffuse-Fire model for the propagation of the electric signal.

      - Since its discovery by comparative genomics, the Kch ion channel has not been associated with any specific phenotype in E. coli. Here, the authors proposed a functional role for the putative K+ Kch channel: enhancing survival under photo-toxic conditions.

      We thank the referee for their positive evaluations and agree with these statements.

      Weaknesses:

      - Since the flow of fresh medium is stopped at the beginning of the acquisition, environmental parameters such as pH and RedOx potential are likely to vary significantly during the experiment. It is therefore important to exclude the contributions of these variations to ensure that the electrical response is only induced by light stimulation. Unfortunately, no control experiments were carried out to address this issue.

      The electrical responses occur almost instantaneously when the stimulation with blue light begins, i.e. they are too fast to be caused by a gradual build-up of pH changes. We are not sure what the referee means by redox potential, since it is an attribute of all chemicals that are able to donate or receive electrons. The electrical response to stress appears to be caused by ROS, since the response is removed when ROS scavengers are added, i.e. pH plays at most a very minor role.

      - Furthermore, the control parameter of the experiment (light stimulation) is the same as that used to measure the electrical response, i.e. through fluorescence excitation. The use of the PROPS system could solve this problem.

      >>We were enthusiastic at the start of the project to use the PROPS system in E. coli as presented by J.M.Kralj et al, ‘Electrical spiking in E. coli probed with a fluorescent voltage-indicating protein’, Science, 2011, 333, 6040, 345. However, the people we contacted in the microbiology community said that it had some technical issues, and there have been no subsequent studies using PROPS in bacteria after the initial promising study. The fluorescent protein system recently presented in PNAS seems more promising, ‘Sensitive bacterial Vm sensors revealed the excitability of bacterial Vm and its role in antibiotic tolerance’, X.Jin et al, PNAS, 120, 3, e2208348120.

      - Electrical signal propagation is an important aspect of the manuscript. However, a detailed quantitative analysis of the spatial dynamics within the biofilm is lacking. In addition, it is unclear if the electrical signal propagates within the biofilm during the second peak regime, which is mediated by the Kch channel. This is an important question, given that the fire-diffuse-fire model is presented with emphasis on the role of K+ ions.

      We have presented a more detailed account of the electrical wavefront modelling work, which is currently under review in a physics journal, ‘Electrical signalling in three dimensional bacterial biofilms using an agent based fire-diffuse-fire model’, V.Martorelli, et al, 2024 https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1

      - Since deletion of the kch gene inhibits the long-term electrical response to light stimulation (regime II), the authors concluded that K+ ions play a role in the habituation response. However, Kch is a putative K+ ion channel. The use of specific drugs could help to clarify the role of K+ ions.

      Our recent electrical impedance spectroscopy publication provides further evidence that Kch is associated with large changes in conductivity, as expected for a voltage-gated ion channel (https://pubs.acs.org/doi/10.1021/acs.nanolett.3c04446, 'Electrical impedance spectroscopy with bacterial biofilms: neuronal-like behavior', E.Akabuogu et al, ACS Nanoletters, 2024, in print).

      - The manuscript as such does not allow us to properly conclude on the photo-protective role of the Kch ion channel.

      That Kch has a photoprotective role is our current working hypothesis. The hypothesis fits with the data, but we are not saying we have proven it beyond all possible doubt.

      - The link between membrane potential dynamics and mechanosensitivity is not captured in the equation for the Q-channel opening dynamics in the Hodgkin-Huxley model (Supp Eq 2).

      Our model is agnostic with respect to the mechanosensitivity of the ion channels, although we deduce that mechanosensitive ion channels contribute to ion channel Q.

      - Given the large number of parameters used in the models, it is hard to distinguish between prediction and fitting.

      This is always an issue with electrophysiological modelling (compared with most heart and brain modelling studies we are very conservative in the choice of parameters for the bacteria). In terms of predicting the different phenomena observed, we believe the model is very successful.

      Reviewer #2 (Public Review):

      Summary of what the authors were trying to achieve:

      The authors thought they studied membrane potential dynamics in E.coli biofilms. They thought so because they were unaware that the dye they used to report that membrane potential in E.coli, has been previously shown not to report it. Because of this, the interpretation of the authors' results is not accurate.

      We believe the Pilizota work is scientifically flawed.

      Major strengths and weaknesses of the methods and results:

      The strength of this work is that all the data is presented clearly, and accurately, as far as I can tell.

      The major critical weakness of this paper is the use of ThT dye as a membrane potential dye in E.coli. The work is unaware of a publication from 2020 https://www.sciencedirect.com/science/article/pii/S0006349519308793 [sciencedirect.com] that demonstrates that ThT is not a membrane potential dye in E. coli. Therefore I think the results of this paper are misinterpreted. The same publication I reference above presents a protocol on how to carefully calibrate any candidate membrane potential dye in any given condition.

      We are aware of this study, but believe it to be scientifically flawed. We do not cite the article because we do not think it is a particularly useful contribution to the literature.

      I now go over each results section in the manuscript.

      Result section 1: Blue light triggers electrical spiking in single E. coli cells

      I do not think the title of the result section is correct for the following reasons. The above-referenced work demonstrates the loading profile one should expect from a Nernstian dye (Figure 1). It also demonstrates that ThT does not show that profile and explains why is this so. ThT only permeates the membrane under light exposure (Figure 5). This finding is consistent with blue light peroxidising the membrane (see also following work Figure 4 https://www.sciencedirect.com/science/article/pii/S0006349519303923 [sciencedirect.com] on light-induced damage to the electrochemical gradient of protons-I am sure there are more references for this).

      The Pilizota group invokes some elaborate artefacts to explain the lack of agreement with a simple Nernstian battery model. The model is incorrect not the fluorophore.

      Please note that the loading profile (only observed under light) in the current manuscript in Figure 1B as well as in the video S1 is identical to that in Figure 3 from the above-referenced paper (i.e. https://www.sciencedirect.com/science/article/pii/S0006349519308793 [sciencedirect.com]), and corresponding videos S3 and S4. This kind of profile is exactly what one would expect theoretically if the light is simultaneously lowering the membrane potential as the ThT is equilibrating, see Figure S12 of that previous work. There, it is also demonstrated by the means of monitoring the speed of bacterial flagellar motor that the electrochemical gradient of protons is being lowered by the light. The authors state that applying the blue light for different time periods and over different time scales did not change the peak profile. This is expected if the light is lowering the electrochemical gradient of protons. But, in Figure S1, it is clear that it affected the timing of the peak, which is again expected, because the light affects the timing of the decay, and thus of the decay profile of the electrochemical gradient of protons (Figure 4 https://www.sciencedirect.com/science/article/pii/S0006349519303923 [sciencedirect.com]).

      We think the proton effect is a million times weaker than that due to potassium, i.e. 0.2 M K+ versus 10^-7 M H+. We can comfortably neglect the influx of H+ in our experiments.
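
      The order of magnitude quoted in this response follows from the ratio of the two concentrations. A minimal sketch of that arithmetic, assuming only the two values given above; the function name is illustrative, and the concentration ratio is used here as a simple proxy for relative abundance rather than as a full electrophysiological calculation:

```python
# Concentrations quoted in the response above (mol/L).
K_CONC_M = 0.2    # K+, as stated by the authors
H_CONC_M = 1e-7   # H+ at pH 7, as stated by the authors


def concentration_ratio(major_m: float, minor_m: float) -> float:
    """Return how many times more abundant the major ion is than the minor ion."""
    if minor_m <= 0:
        raise ValueError("minor ion concentration must be positive")
    return major_m / minor_m


ratio = concentration_ratio(K_CONC_M, H_CONC_M)
print(f"K+ / H+ concentration ratio: {ratio:.1e}")
```

      With these inputs the ratio is of order 10^6, which is the "million times" figure referred to in the response.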

      I find Figure S1D interesting. There the authors load TMRM, which is a membrane voltage dye that has been used extensively (as far as I am aware this is the first reference for that and it has not been cited https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1914430 [ncbi.nlm.nih.gov]/). As visible from the last TMRM reference I give, TMRM will only load the cells in Potassium Phosphate buffer with NaCl (and often we used EDTA to permeabilise the membrane). It is not fully clear (to me) whether here TMRM was prepared in rich media (it explicitly says so for ThT in Methods but not for TMRM), but it seems so. If this is the case, it likely also loads because of the damage to the membrane done with light, and therefore I am not surprised that the profiles are similar.

      The vast majority of cells continue to be viable. We do not think membrane damage is dominating.

      The authors then use CCCP. First, a small correction, as the authors state that it quenches membrane potential. CCCP is a protonophore (https://pubmed.ncbi.nlm.nih.gov/4962086 [pubmed.ncbi.nlm.nih.gov]/), so it collapses electrochemical gradient of protons. This means that it is possible, and this will depend on the type of pumps present in the cell, that CCCP collapses electrochemical gradient of protons, but the membrane potential is equal and opposite in sign to the DeltapH. So using CCCP does not automatically mean membrane potential will collapse (e.g. in some mammalian cells it does not need to be the case, but in E.coli it is https://www.biorxiv.org/content/10.1101/2021.11.19.469321v2 [biorxiv.org]). CCCP has also been recently found to be a substrate for TolC (https://journals.asm.org/doi/10.1128/mbio.00676-21 [journals.asm.org]), but at the concentrations the authors are using CCCP (100uM) that should not affect the results. However, the authors then state because they observed, in Figure S1E, a fast efflux of ions in all cells and no spiking dynamics this confirms that observed dynamics are membrane potential related. I do not agree that it does. First, Figure S1E, does not appear to show transients, instead, it is visible that after 50min treatment with 100uM CCCP, ThT dye shows no dynamics. The action of a Nernstian dye is defined. It is not sufficient that a charged molecule is affected in some way by electrical potential, this needs to be in a very specific way to be a Nernstian dye. Part of the profile of ThT loading observed in https://www.sciencedirect.com/science/article/pii/S0006349519308793 [sciencedirect.com] is membrane potential related, but not in a way that is characteristic of Nernstian dye.
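
      The distinction the referee draws between the electrochemical gradient of protons and the membrane potential can be made concrete with the standard chemiosmotic relation PMF = Δψ − Z·ΔpH, where Z = 2.303RT/F. The sketch below is illustrative only: the sign convention (inside minus outside), the temperature, and the function names are assumptions, and the numerical inputs in the usage line are placeholder values, not measurements from the manuscript or from the referee's work.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_slope_mv(temp_k: float = 298.15) -> float:
    """Return Z = 2.303*R*T/F in mV, the voltage equivalent of one pH unit."""
    return 1000.0 * math.log(10) * R * temp_k / F


def pmf_mv(delta_psi_mv: float, delta_ph: float, temp_k: float = 298.15) -> float:
    """Proton motive force in mV, with both differences taken as inside minus outside."""
    return delta_psi_mv - nernst_slope_mv(temp_k) * delta_ph


def delta_psi_at_zero_pmf(delta_ph: float, temp_k: float = 298.15) -> float:
    """Membrane potential (mV) at which the electrical term exactly cancels the pH term."""
    return nernst_slope_mv(temp_k) * delta_ph


# Placeholder inputs chosen only to exercise the functions.
example_pmf = pmf_mv(delta_psi_mv=-140.0, delta_ph=0.6)
```

      In this form, a collapsed PMF (PMF = 0) fixes only the combination Δψ = Z·ΔpH; the membrane potential itself is zero only if ΔpH is also zero, which is the referee's point about CCCP.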

      Our understanding of the literature is CCCP poisons the whole metabolism of the bacterial cells. The ATP driven K+ channels will stop functioning and this is the dominant contributor to membrane potential.

      Result section 2: Membrane potential dynamics depend on the intercellular distance

      In this chapter, the authors report that the time to reach the first intensity peak during ThT loading is different when cells are in microclusters. They interpret this as electrical signalling in clusters because the peak is reached faster in microclusters (as opposed to slower because intuitively in these clusters cells could be shielded from light). However, shielding is one possibility. The other is that the membrane has changed in composition and/or the effective light power the cells can tolerate (with mechanisms to handle light-induced damage, some of which authors mention later in the paper) is lower. Given that these cells were left in a microfluidic chamber for 2h hours to attach in growth media according to Methods, there is sufficient time for that to happen. In Figure S12 C and D of that same paper from my group (https://ars.els-cdn.com/content/image/1-s2.0-S0006349519308793-mmc6.pdf [ars.els-cdn.com]) one can see the effects of peak intensity and timing of the peak on the permeability of the membrane. Therefore I do not think the distance is the explanation for what authors observe.

      Shielding would provide the reverse effect, since hyperpolarization begins in the dense centres of the biofilms. For the initial 2 hours the cells receive negligible blue light. Neither of the referee’s comments thus seems tenable.

      Result section 3: Emergence of synchronized global wavefronts in E. coli biofilms

      In this section, the authors exposed a mature biofilm to blue light. They observe that the intensity peak is reached faster in the cells in the middle. They interpret this as the ion-channel-mediated wavefronts moved from the center of the biofilm. As above, cells in the middle can have different membrane permeability to those at the periphery, and probably even more importantly, there is no light profile shown anywhere in SI/Methods. I could be wrong, but the SI3 A profile is consistent with a potential Gaussian beam profile visible in the field of view. In Methods, I find the light source for the blue light and the type of microscope but no comments on how 'flat' the illumination is across their field of view. This is critical to assess what they are observing in this result section. I do find it interesting that the ThT intensity collapsed from the edges of the biofilms. In the publication I mentioned https://www.sciencedirect.com/science/article/pii/S0006349519308793#app2 [sciencedirect.com], the collapse of fluorescence was not understood (other than it is not membrane potential related). It was observed in Figure 5A, C, and F, that at the point of peak, electrochemical gradient of protons is already collapsed, and that at the point of peak cell expands and cytoplasmic content leaks out. This means that this part of the ThT curve is not membrane potential related. The authors see that after the first peak collapsed there is a period of time where ThT does not stain the cells and then it starts again. If after the first peak the cellular content leaks, as we have observed, then staining that occurs much later could be simply staining of cytoplasmic positively charged content, and the timing of that depends on the dynamics of cytoplasmic content leakage (we observed this to be happening over 2h in individual cells). ThT is also a non-specific amyloid dye, and in starving E. 
coli cells formation of protein clusters has been observed (https://pubmed.ncbi.nlm.nih.gov/30472191 [pubmed.ncbi.nlm.nih.gov]/), so such cytoplasmic staining seems possible.

      >>It is very easy to see if the illumination is flat (Köhler illumination) by comparing the intensity of background pixels on the detector. It was flat in our case. Protons have little to do with our work for reasons highlighted before. Differential membrane permeability is a speculative phenomenon not well supported by any evidence and with no clear molecular mechanism.
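
      The background-pixel comparison mentioned in this response could be quantified along the following lines. This is a hedged sketch, not the procedure actually used: the coefficient-of-variation metric, the 5% tolerance, and the function names are all assumptions introduced for illustration.

```python
from statistics import mean, pstdev


def background_flatness(background_pixels):
    """Coefficient of variation of background intensities (0 means perfectly flat)."""
    values = list(background_pixels)
    if not values:
        raise ValueError("no background pixels supplied")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean background intensity is zero")
    return pstdev(values) / avg


def is_flat(background_pixels, tolerance=0.05):
    """Return True when background variation is within the (assumed) tolerance."""
    return background_flatness(background_pixels) <= tolerance


# Hypothetical background samples taken from the centre to the edge of a field of view.
print(is_flat([100, 101, 99, 100]))   # nearly uniform background
print(is_flat([100, 60, 30, 10]))     # strong centre-to-edge fall-off
```

      Sampling background pixels at several positions across the field of view, rather than in one region, is what makes such a check sensitive to a Gaussian-like beam profile.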

      Finally, I note that authors observe biofilms of different shapes and sizes and state that they observe similar intensity profiles, which could mean that my comment on 'flatness' of the field of view above is not a concern. However, the scale bar in Figure 2A is not legible, so I can't compare it to the variation of sizes of the biofilms in Figure 2C (67 to 280um). Based on this, I think that the illumination profile is still a concern.

      The referee now contradicts themselves and wants a scale bar to be more visible. We have changed the scale bar.

      Result section 4: Voltage-gated Kch potassium channels mediate ion-channel electrical oscillations in E. coli

      First I note at this point, given that I disagree that the data presented thus 'suggest that E. coli biofilms use electrical signaling to coordinate long-range responses to light stress' as the authors state, it gets harder to comment on the rest of the results.

      In this result section the authors look at the effect of Kch, a putative voltage-gated potassium channel, on ThT profile in E. coli cells. And they see a difference. It is worth noting that in the publication https://www.sciencedirect.com/science/article/pii/S0006349519308793 [sciencedirect.com] it is found that ThT is also likely a substrate for TolC (Figure 4), but that scenario could not be distinguished from the one where TolC mutant has a different membrane permeability (and there is a publication that suggests the latter is happening https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2958.2010.07245.x [onlinelibrary.wiley.com]). Given this, it is also possible that Kch deletion affects the membrane permeability. I do note that in video S4 I seem to see more of, what appear to be, plasmolysed cells. The authors do not see the ThT intensity with this mutant that appears long after the initial peak has disappeared, as they see in WT. It is not clear how long they waited for this, as from Figure S3C it could simply be that the dynamics of this is a lot slower, e.g. Kch deletion changes membrane permeability.

      The work showing that TolC provides a possible passive pathway for ThT to leave cells seems slightly niche. It just demonstrates another mechanism for the cells to equilibrate the concentrations of ThT in a Nernstian manner, i.e. driven by the membrane voltage.

      The authors themselves state that the evidence for Kch being a voltage-gated channel is indirect (line 54). I do not think there is a need to claim function from a ThT profile of E. coli mutants (nor do I believe it's good practice), given how accurate single-channel recordings are currently. To know the exact dependency on the membrane potential, ion channel recordings on this protein are needed first.

      We have good evidence from electrical impedance spectroscopy experiments that Kch increases the conductivity of biofilms (https://pubs.acs.org/doi/10.1021/acs.nanolett.3c04446, 'Electrical impedance spectroscopy with bacterial biofilms: neuronal-like behavior', E.Akabuogu et al, ACS Nanoletters, 2024, in print).

      Result section 5: Blue light influences ion-channel mediated membrane potential events in E. coli

      In this chapter the authors vary the light intensity and stain the cells with PI (this dye gets into the cells when the membrane becomes very permeable), and the extracellular environment with K+ dye (I have not yet worked carefully with this dye). They find that different amounts of light influence ThT dynamics. This is in line with previous literature (both papers I have been mentioning: Figure 4 https://www.sciencedirect.com/science/article/pii/S0006349519303923 [sciencedirect.com] and https://ars.els-cdn.com/content/image/1-s2.0-S0006349519308793-mmc6.pdf [ars.els-cdn.com] especially SI12), but does not add anything new. I think the results presented here can be explained with previously published theory and do not indicate that the ion-channel mediated membrane potential dynamics is a light stress relief process.

      The simple Nernstian battery model proposed by Pilizota et al is erroneous in our opinion for reasons outlined above. We believe it will prove to be a dead end for bacterial electrophysiology studies.

      Result section 6: Development of a Hodgkin-Huxley model for the observed membrane potential dynamics

      This results section starts with the authors stating: 'our data provide evidence that E. coli manages light stress through well-controlled modulation of its membrane potential dynamics'. As stated above, I think they are instead observing the process of ThT loading while the light is damaging the membrane and thus simultaneously collapsing the electrochemical gradient of protons. As stated above, this has been modelled before. And then, they observe a ThT staining that is independent from membrane potential.

      This is an erroneous niche opinion. Protons have little say in the membrane potential since there are so few of them. The membrane potential is mostly determined by K+.

      I will briefly comment on the Hodgkin Huxley (HH) based model. First, I think there is no evidence for two channels with different activation profiles as authors propose. But also, the HH model has been developed for neurons. There, the leakage and the pumping fluxes are both described by a constant representing conductivity, times the difference between the membrane potential and Nernst potential for the given ion. The conductivity in the model is given as gK*n^4 for potassium, gNa*m^3*h sodium, and gL for leakage, where gK, gNa and gL were measured experimentally for neurons. And, n, m, and h are variables that describe the experimentally observed voltage-gated mechanism of neuronal sodium and potassium channels. (Please see Hodgkin AL, Huxley AF. 1952. Currents carried by sodium and potassium ions through the membrane of the giant axon of Loligo. J. Physiol. 116:449-72 and Hodgkin AL, Huxley AF. 1952. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117:500-44).
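
      The conductance terms described in this comment can be written out as a short sketch. The default parameter values below are the commonly quoted textbook values for the squid giant axon (conductances in mS/cm^2, potentials in mV, modern sign convention); they are assumptions for illustration and are not the parameters of the authors' E. coli model.

```python
def hh_ionic_currents(v, n, m, h,
                      g_k=36.0, g_na=120.0, g_l=0.3,
                      e_k=-77.0, e_na=50.0, e_l=-54.4):
    """Return (I_K, I_Na, I_L) for the classic Hodgkin-Huxley model.

    Each current is a conductance multiplied by the driving force (V - E_ion):
      I_K  = g_K  * n^4     * (V - E_K)
      I_Na = g_Na * m^3 * h * (V - E_Na)
      I_L  = g_L            * (V - E_L)
    n, m and h are gating variables between 0 and 1.
    """
    i_k = g_k * n ** 4 * (v - e_k)
    i_na = g_na * m ** 3 * h * (v - e_na)
    i_l = g_l * (v - e_l)
    return i_k, i_na, i_l


# At the potassium Nernst potential the potassium current vanishes,
# regardless of how open the channel is.
i_k, i_na, i_l = hh_ionic_currents(v=-77.0, n=0.5, m=0.05, h=0.6)
```

      Writing each current as conductance times (V − E_ion) is the linear driving-force form that the referee's next paragraph ties to a near-equilibrium requirement.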

      In the 70 years since Hodgkin and Huxley first presented their model, a huge number of similar models have been proposed to describe cellular electrophysiology. We are not being hyperbolic when we state that the HH models for excitable cells are like the Schrödinger equation for molecules. We carefully adapted our HH model to reflect the currently understood electrophysiology of E. coli.

      Thus, in applying the model to describe bacterial electrophysiology one should ensure near equilibrium requirement holds (so that (V-VQ) etc terms in authors' equation Figure 5 B hold), and potassium and other channels in a given bacterium have similar gating properties to those found in neurons. I am not aware of such measurements in any bacteria, and therefore think the pump leak model of the electrophysiology of bacteria needs to start with fluxes that are more general (for example Keener JP, Sneyd J. 2009. Mathematical physiology: I: Cellular physiology. New York: Springer or https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000144 [journals.plos.org])

      The reference is to a slightly more modern version of a simple Nernstian battery model. The model will not oscillate and thus will not help modelling membrane potentials in bacteria. We are unsure where the equilibrium requirement comes from (inadequate modelling of the dynamics?)

      Result section 7: Mechanosensitive ion channels (MS) are vital for the first hyperpolarization event in E. coli.

      The results that Msc channels affect the profile of ThT dye are interesting. It is again possible that the membrane permeability of these mutants has changed and therefore the dynamics have changed, so this needs to be checked first. I also note that our results show that the peak of ThT coincides with cell expansion. For this to be understood a model is needed that also takes into account the link between maintenance of electrochemical gradients of ions in the cell and osmotic pressure.

      The evidence for permeability changes in the membranes seems to be tenuous.

      A side note is that the authors state that the Msc responds to stress-related voltage changes. I think this is an overstatement. Mscs respond to predominantly membrane tension and are mostly nonspecific (see how their action recovers cellular volume in this publication https://www.pnas.org/doi/full/10.1073/pnas.1522185113 [pnas.org]). Authors cite references 35-39 to support this statement. These publications still state that these channels are predominantly membrane tension-gated. Some of the references state that the presence of external ions is important for tension-related gating but sometimes they gate spontaneously in the presence of certain ions. Other publications cited don't really look at gating with respect to ions (39 is on clustering). This is why I think the statement is somewhat misleading.

      We have reworded the discussion of Mscs since the literature appears to be ambiguous. We will try to run some electrical impedance spectroscopy experiments on the Msc mutants in the future to attempt to remove the ambiguity.

      Result section 8: Anomalous ion-channel-mediated wavefronts propagate light stress signals in 3D E. coli biofilms.

      I am not commenting on this result section, as it would only be applicable if ThT was membrane potential dye in E. coli.

      Ok, but we disagree on the use of ThT.

      Aims achieved/results support their conclusions:

      The authors clearly present their data. I am convinced that they have accurately presented everything they observed. However, I think their interpretation of the data and conclusions is inaccurate in line with the discussion I provided above.

      Likely impact of the work on the field, and the utility of the methods and data to the community:

      I do not think this publication should be published in its current format. It should be revised in light of the previous literature as discussed in detail above. I believe presenting it in its current form on eLife pages would create unnecessary confusion.

      We believe many of the Pilizota group articles are scientifically flawed and are causing the confusion in the literature.

      Any other comments:

      I note, that while this work studies E. coli, it references papers in other bacteria using ThT. For example, in lines 35-36 authors state that bacteria (Bacillus subtilis in this case) in biofilms have been recently found to modulate membrane potential citing the relevant literature from 2015. It is worth noting that the most recent paper https://journals.asm.org/doi/10.1128/mbio.02220-23 [journals.asm.org] found that ThT binds to one or more proteins in the spore coat, suggesting that it does not act as a membrane potential in Bacillus spores. It is possible that it still reports membrane potential in Bacillus cells and the recent results are strictly spore-specific, but these should be kept in mind when using ThT with Bacillus.

      >>ThT was used successfully in previous studies of normal B. subtilis cells (by our own group and A.Prindle, ‘Spatial propagation of electrical signal in circular biofilms’, J.A.Blee et al, Physical Review E, 2019, 100, 052401, J.A.Blee et al, ‘Membrane potentials, oxidative stress and the dispersal response of bacterial biofilms to 405 nm light’, Physical Biology, 2020, 17, 2, 036001, A.Prindle et al, ‘Ion channels enable electrical communication in bacterial communities’, Nature, 2015, 527, 59-63). The connection to low metabolism spore research seems speculative.

      Reviewer #3 (Public Review):

      It has recently been demonstrated that bacteria in biofilms show changes in membrane potential in response to changes in their environment, and that these can propagate signals through the biofilm to coordinate bacterial behavior. Akabuogu et al. contribute to this exciting research area with a study of blue light-induced membrane potential dynamics in E. coli biofilms. They demonstrate that Thioflavin-T (ThT) intensity (a proxy for membrane potential) displays multiphasic dynamics in response to blue light treatment. They additionally use genetic manipulations to implicate the potassium channel Kch in the latter part of these dynamics. Mechanosensitive ion channels may also be involved, although these channels seem to have blue light-independent effects on membrane potential as well. In addition, there are challenges to the quantitative interpretation of ThT microscopy data which require consideration. The authors then explore whether these dynamics are involved in signaling at the community level. The authors suggest that cell firing is both more coordinated when cells are clustered and happens in waves in larger, 3D biofilms; however, in both cases evidence for these claims is incomplete. The authors present two simulations to describe the ThT data. The first of these simulations, a Hodgkin-Huxley model, indicates that the data are consistent with the activity of two ion channels with different kinetics; the Kch channel mutant, which ablates a specific portion of the response curve, is consistent with this. The second model is a fire-diffuse-fire model to describe wavefront propagation of membrane potential changes in a 3D biofilm; because the wavefront data are not presented clearly, the results of this model are difficult to interpret. 
Finally, the authors discuss whether these membrane potential changes could be involved in generating a protective response to blue light exposure; increased death in a Kch ion channel mutant upon blue light exposure suggests that this may be the case, but a no-light control is needed to clarify this.

      In a few instances, the paper is missing key control experiments that are important to the interpretation of the data. This makes it difficult to judge the meaning of some of the presented experiments.

      (1) An additional control for the effects of autofluorescence is very important. The authors conduct an experiment where they treat cells with CCCP and see that Thioflavin-T (ThT) dynamics do not change over the course of the experiment. They suggest that this demonstrates that autofluorescence does not impact their measurements. However, cellular autofluorescence depends on the physiological state of the cell, which is impacted by CCCP treatment. A much simpler and more direct experiment would be to repeat the measurement in the absence of ThT or any other stain. This experiment should be performed both in the wild-type strain and in the ∆kch mutant.

      ThT is a very bright fluorophore (much brighter than a GFP). It is clear from the images of non-stained samples that autofluorescence provides a negligible contribution to the fluorescence intensity in an image.

      (2) The effects of photobleaching should be considered. Of course, the intensity varies a lot over the course of the experiment in a way that photobleaching alone cannot explain. However, photobleaching can still contribute to the kinetics observed. Photobleaching can be assessed by changing the intensity, duration, or frequency of exposure to excitation light during the experiment. Considerations about photobleaching become particularly important when considering the effect of catalase on ThT intensity. The authors find that the decrease in ThT signal after the initial "spike" is attenuated by the addition of catalase; this is what would be predicted by catalase protecting ThT from photobleaching (indeed, catalase can be used to reduce photobleaching in time lapse imaging).

      Photobleaching was negligible over the course of the experiments. We employed techniques such as reducing sample exposure time and using the appropriate light intensity to minimize photobleaching.

      (3) It would be helpful to have a baseline of membrane potential fluctuations in the absence of the proposed stimulus (in this case, blue light). Including traces of membrane potential recorded without light present would help support the claim that these changes in membrane potential represent a blue light-specific stress response, as the authors suggest. Of course, ThT is blue, so if the excitation light for ThT is problematic for this experiment the alternative dye tetramethylrhodamine methyl ester perchlorate (TMRM) can be used instead.

      Unfortunately the fluorescent baseline is too weak to measure cleanly in this experiment. The collective response of all the bacteria hyperpolarizing at the same time appears to dominate the signal (measurements in the eLife article and new potentiometry measurements).

      (4) The effects of ThT in combination with blue light should be more carefully considered. In mitochondria, a combination of high concentrations of blue light and ThT leads to disruption of the PMF (Skates et al. 2021 BioRXiv), and similarly, ThT treatment enhances the photodynamic effects of blue light in E. coli (Bondia et al. 2021 Chemical Communications). If present in this experiment, this effect could confound the interpretation of the PMF dynamics reported in the paper.

      We think the PMF plays a minor role in determining the membrane potential in E. coli, for the reasons outlined before (H+ is a minority ion in E. coli compared with K+).

      (5) Figures 4D - E indicate that a ∆kch mutant has increased propidium iodide (PI) staining in the presence of blue light; this is interpreted to mean that Kch-mediated membrane potential dynamics help protect cells from blue light. However, Live/Dead staining results in these strains in the absence of blue light are not reported. This means that the possibility that the ∆kch mutant has a general decrease in survival (independent of any effects of blue light) cannot be ruled out.

      Both bacterial strains have similar growth curves and also engaged in membrane potential dynamics for the duration of the experiment. We were interested in bacterial cells that exhibited membrane potential dynamics in the presence of the stress. Bacterial cells need to be alive to engage in membrane potential dynamics (hyperpolarize) under stress conditions. Cells that engaged in membrane potential dynamics and later stained red were only counted after the entire duration. We believe that the wildtype handles the light stress better than the ∆kch mutant, as measured with PI.

      (6) Additionally in Figures 4D - E, the interpretation of this experiment can be confounded by the fact that PI uptake can sometimes be seen in bacterial cells with high membrane potential (Kirchhoff & Cypionka 2017 J Microbial Methods); the interpretation is that high membrane potential can lead to increased PI permeability. Because the membrane potential is largely higher throughout blue light treatment in the ∆kch mutant (Fig. 3AB), this complicates the interpretation of this experiment.

      Kirchhoff & Cypionka 2017 J Microbial Methods, using fluorescence microscopy, suggested that changes in membrane potential dynamics can introduce experimental bias when propidium iodide is used to confirm the viability of the bacterial strains B. subtilis (DSM-10) and Dinoroseobacter shibae that are starved of oxygen (via N2 gassing) for 2 hours. They attempted to support their findings by using CCCP to stop the membrane potential dynamics (but never showed any pictorial or plotted data for this confirmatory experiment). In our experimental methodology, cell death was not forced on the cells by introducing an extra burden or via anoxia. We believe that the accumulation of PI in the ∆kch mutant is not due to high membrane potential dynamics but is attributed to PI unbiasedly showing damaged/dead cells. We think that propidium iodide is good for this experiment. Propidium iodide is a dye that is extensively used in life sciences. PI has also been used in the study of bacterial electrophysiology (https://pubmed.ncbi.nlm.nih.gov/32343961/) and no membrane potential related bias was reported.

      Throughout the paper, many ThT intensity traces are compared, and described as "similar" or "dissimilar", without detailed discussion or a clear standard for comparison. For example, the two membrane potential curves in Fig. S1C are described as "similar" although they have very different shapes, whereas the curves in Fig. 1B and 1D are discussed in terms of their differences although they are evidently much more similar to one another. Without metrics or statistics to compare these curves, it is hard to interpret these claims. These comparative interpretations are additionally challenging because many of the figures in which average trace data are presented do not indicate standard deviation.

      Comparison of small changes in the absolute intensities is problematic in such fluorescence experiments. We mean the shape of the traces is similar and they can be modelled using a HH model with similar parameters.

      The differences between the TMRM and ThT curves that the authors show in Fig. S1C warrant further consideration. Some of the key features of the response in the ThT curve (on which much of the modeling work in the paper relies) are not very apparent in the TMRM data. It is not obvious to me which of these traces will be more representative of the actual underlying membrane potential dynamics.

      In our experiment, TMRM was used to confirm the dynamics observed using ThT. However, ThT appears to be more photostable than TMRM (especially towards the 2nd peak). The most interesting observation is that with both dyes, all phases of the membrane potential dynamics were conspicuous (the first peak, the quiescent period and the second peak). The time periods for these three episodes were also similar.

      A key claim in this paper (that dynamics of firing differ depending on whether cells are alone or in a colony) is underpinned by "time-to-first peak" analysis, but there are some challenges in interpreting these results. The authors report an average time-to-first peak of 7.34 min for the data in Figure 1B, but the average curve in Figure 1B peaks earlier than this. In Figure 1E, it appears that there are a handful of outliers in the "sparse cell" condition that likely explain this discrepancy. Either an outlier analysis should be done and the mean recomputed accordingly, or a more outlier-robust method like the median should be used instead. Then, a statistical comparison of these results will indicate whether there is a significant difference between them.

      The key point is the comparison of standard errors on the standard deviation.
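
      The reviewer's suggestion of an outlier-robust summary can be illustrated with a minimal sketch. The latency values below are hypothetical and chosen only to show the effect; they are not data from the paper, and the function name `summarize_latencies` is an assumption made for illustration.

```python
import statistics

def summarize_latencies(times_min):
    """Return (mean, median, MAD) for time-to-first-peak values in minutes."""
    mean = statistics.mean(times_min)
    median = statistics.median(times_min)
    # Median absolute deviation: a spread estimate that a few
    # late-firing cells barely move, unlike the standard deviation.
    mad = statistics.median([abs(t - median) for t in times_min])
    return mean, median, mad

# Hypothetical latencies (minutes): most cells fire early, two fire very late.
example_times = [2.0, 3.0, 3.0, 4.0, 3.0, 20.0, 25.0]
mean, median, mad = summarize_latencies(example_times)
print(f"mean={mean:.2f} min, median={median:.2f} min, MAD={mad:.2f} min")
```

      In this made-up example the two late cells pull the mean well above the time at which most cells fire, while the median stays with the bulk of the population, which is the kind of discrepancy the reviewer describes between the reported mean and the peak of the average curve.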

      In two different 3D biofilm experiments, the authors report the propagation of wavefronts of membrane potential; I am unable to discern these wavefronts in the imaging data, and they are not clearly demonstrated by analysis.

      The first data set is presented in Figures 2A, 2B, and Video S3. The images and video are very difficult to interpret because of how the images have been scaled: the center of the biofilm is highly saturated, and the zero value has also been set too high to consistently observe the single cells surrounding the biofilm. With the images scaled this way, it is very difficult to assess dynamics. The time stamps in Video S3 and on the panels in Figure 2A also do not correspond to one another although the same biofilm is shown (and the time course in 2B is also different from what is indicated in 2B). In either case, it appears that the center of the biofilm is consistently brighter than the edges, and the intensity of all cells in the biofilm increases in tandem; by eye, propagating wavefronts (either directed toward the edge or the center) are not evident to me. Increased brightness at the center of the biofilm could be explained by increased cell thickness there (as is typical in this type of biofilm). From the image legend, it is not clear whether the image presented is a single confocal slice or a projection. Even if this is a single confocal slice, in both Video S3 and Figure 2A there are regions of "haze" from out-of-focus light evident, suggesting that light from other focal planes is nonetheless present. This seems to me to be a simpler explanation for the fluorescence dynamics observed in this experiment: cells are all following the same trajectory that corresponds to that seen for single cells, and the center is brighter because of increased biofilm thickness.

      We thank the reviewer for this important observation. We have made changes to the figures to address this confusion. The cell cover has no influence on the observed membrane potential dynamics. The entire biofilm was exposed to the same blue light at each time point, so all parts of the biofilm received the same blue light intensity. The membrane potential dynamics were not influenced by cell density (see Fig 2C).

      The second data set is presented in Video S6B; I am similarly unable to see any wave propagation in this video. I observe only a consistent decrease in fluorescence intensity throughout the experiment that is spatially uniform (except for the bright, dynamic cells near the top; these presumably represent cells that are floating in the microfluidic and have newly arrived to the imaging region).

      A visual inspection of Video S6B shows a fast rise, a decrease in fluorescence and a second rise (supplementary figure 4B). The fluorescence data were carefully obtained using the Imaris software. We created a curved geometry on each slice of the confocal stack and analyzed the surfaces of this curved plane along the z-axis. This was carried out in Imaris.

      3D imaging data can be difficult to interpret by eye, so it would perhaps be more helpful to demonstrate these propagating wavefronts by analysis; however, such analysis is not presented in a clear way. The legend in Figure 2B mentions a "wavefront trace", but there is no position information included - this trace instead seems to represent the average intensity trace of all cells. To demonstrate the propagation of a wavefront, this analysis should be shown for different subpopulations of cells at different positions from the center of the biofilm. Data is shown in Figure 8 that reflects the velocity of the wavefront as a function of biofilm position; however, because the wavefronts themselves are not evident in the data, it is difficult to interpret this analysis. The methods section additionally does not contain sufficient information about what these velocities represent and how they are calculated. Because of this, it is difficult for me to evaluate the section of the paper pertaining to wave propagation and the predicted biofilm critical size.

      The analysis is considered in more detail in a more expansive modelling article, currently under peer review in a physics journal, ‘Electrical signalling in three dimensional bacterial biofilms using an agent based fire-diffuse-fire model’, V. Martorelli et al., 2024, https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1

      There are some instances in the paper where claims are made that do not have data shown or are not evident in the cited data:

      (1) In the first results section, "When CCCP was added, we observed a fast efflux of ions in all cells"- the data figure pertaining to this experiment is in Fig. S1E, which does not show any ion efflux. The methods section does not mention how ion efflux was measured during CCCP treatment.

      We have worded this differently to properly convey our results.

      (2) In the discussion of voltage-gated calcium channels, the authors refer to "spiking events", but these are not obvious in Figure S3E. Although the fluorescence intensity changes over time, it's hard to distinguish these fluctuations from measurement noise; a no-light control could help clarify this.

      The calcium transients observed were not due to noise or artefacts.

      (3) The authors state that the membrane potential dynamics simulated in Figure 7B are similar to those observed in 3D biofilms in Fig. S4B; however, the second peak is not clearly evident in Fig. S4B and it looks very different for the mature biofilm data reported in Fig. 2. I have some additional confusion about this data specifically: in the intensity trace shown in Fig. S4B, the intensity in the second frame is much higher than the first; this is not evident in Video S6B, in which the highest intensity is in the first frame at time 0. Similarly, the graph indicates that the intensity at 60 minutes is higher than the intensity at 4 minutes, but this is not the case in Fig. S4A or Video S6B.

      The confusion stated here has now been addressed. It should also be noted that while Fig 2.1 was obtained with an LED light source, Fig S4A was obtained using a laser light source. While obtaining the confocal images (for Fig S4A), the light intensity was controlled to further minimize photobleaching. Most importantly, there is evidence of a slow rise to the 2nd peak in Fig S4B. The first peak, quiescence and slow rise to the second peak are evident.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Scientific recommendations:

      - Although Fig 4A clearly shows that light stimulation has an influence on the dynamics of cell membrane potential in the biofilm, it is important to rule out the contribution of variations in environmental parameters. I understand that for technical reasons, the flow of fresh medium must be stopped during image acquisition. Therefore, I suggest performing control experiments, where the flow is stopped before image acquisition (15min, 30min, 45min, and 1h before). If there is no significant contribution from environmental variations (pH, RedOx), the dynamics of the electrical response should be superimposed whatever the delay between stopping the flow stop and switching on the light.

      In the current study, we focused on how E. coli cells and biofilms react to blue light stress via their membrane potential dynamics. This involved growing the cells and biofilms, stopping the media flow and obtaining data immediately. We believe that stopping the flow not only helped us to manage data acquisition but also helped us reduce the effect of environmental factors. In a future study we will expand the work to include how the membrane potential dynamics evolve in the presence of changing environmental factors, for example those induced by stopping the flow at varied times.

      - Since TMRM signal exhibits a linear increase after the first response peak (Supplementary Figure 1D), I recommend mitigating the statement at line 78.

      - To improve the spatial analysis of the electrical response, I suggest plotting kymographs of the intensity profiles across the biofilm. I have plotted this kymograph for Video S3 and it appears that there is no electrical propagation for the second peak. In addition, the authors should provide technical details of how R^2(t) is measured in the first regime (Figure 7E).

      See the dedicated simulation article for more details. https://www.biorxiv.org/content/10.1101/2023.11.17.567515v1
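
      The kymograph the reviewer proposes can be sketched as follows. This is a minimal illustration, not the reviewer's or the authors' pipeline: it assumes each frame is a 2D grid of intensities, that the biofilm centre is known, and that pixels are averaged in concentric radial bins. The names `radial_kymograph` and `peak_frame_per_bin` are hypothetical.

```python
import math

def radial_kymograph(frames, center, n_bins, max_radius):
    """Build a kymograph: one row per frame, one column per radial bin.

    Each entry is the mean intensity of the pixels whose distance from
    `center` (row, column) falls in that bin. Empty bins are NaN.
    """
    cy, cx = center
    bin_width = max_radius / n_bins
    kymo = []
    for frame in frames:
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                b = int(math.hypot(y - cy, x - cx) / bin_width)
                if b < n_bins:  # ignore pixels beyond max_radius
                    sums[b] += value
                    counts[b] += 1
        kymo.append([s / c if c else float("nan") for s, c in zip(sums, counts)])
    return kymo

def peak_frame_per_bin(kymo):
    """Frame index of maximum intensity in each radial bin (assumes no NaN bins)."""
    n_bins = len(kymo[0])
    return [max(range(len(kymo)), key=lambda t: kymo[t][b]) for b in range(n_bins)]

# Synthetic example: every pixel brightens at the same time (no propagation).
frames = [[[1.0] * 3 for _ in range(3)], [[2.0] * 3 for _ in range(3)]]
kymo = radial_kymograph(frames, center=(1, 1), n_bins=2, max_radius=2.0)
peaks = peak_frame_per_bin(kymo)
```

      Under these assumptions, a propagating wavefront would appear as a peak frame that shifts systematically with radial bin, whereas a spatially uniform response gives the same peak frame in every bin, as in the synthetic example.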

      - Line 152: To assess the variability of the latency, the authors should consider measuring the variance divided by the mean instead of SD, which may depend on the average value.

      We are happy with our current use of standard error on the standard deviation. It shows what we claim to be true.
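
      The distinction the reviewer raises between the SD and a mean-scaled measure of variability can be illustrated with a short sketch. The numbers are hypothetical, and the coefficient of variation (SD divided by mean) is included alongside the reviewer's variance-over-mean ratio purely for comparison; neither is claimed to be the authors' analysis.

```python
import statistics

def dispersion_measures(values):
    """Return the SD, variance/mean ratio and coefficient of variation."""
    mean = statistics.mean(values)
    var = statistics.variance(values)  # sample variance (n - 1 denominator)
    sd = var ** 0.5
    return {"sd": sd, "variance_over_mean": var / mean, "cv": sd / mean}

# Hypothetical latencies (minutes), and the same data with every value doubled.
latencies = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
slower = [2 * t for t in latencies]

a = dispersion_measures(latencies)
b = dispersion_measures(slower)
print(a)
print(b)
```

      Doubling every latency doubles the SD even though the relative spread is unchanged, which is why a raw SD comparison between conditions with different mean latencies can mislead. In this example the coefficient of variation is unchanged by the rescaling, while the variance-over-mean ratio doubles.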

      - Line 154-155: To truly determine whether the amplitude of the "action potential" is independent of biofilm size, the authors should not normalise the signals.

      Good point. We qualitatively compared both normalized and unnormalized data. Recent electrical impedance spectroscopy measurements (unpublished) indicate that the electrical activity is an extensive quantity i.e. it scales with the size of the biofilms.

      - To clarify the role of K+ in the habituation response, I suggest using valinomycin at sub-inhibitory concentrations (10µM). Besides, the high concentration of CCCP used in this study completely inhibits cell activity. Not surprisingly, no electrical response to light stimulation was observed in the presence of CCCP. Finally, the Kch complementation experiment exhibits a "drop after the first peak" on a single point. It would be more convincing to increase the temporal resolution (1min->10s) to show that there is indeed a first and a second peak.

      An interesting experiment for the future.

      - Line 237-238: There are only two points suggesting that the dynamics of hyperpolarization are faster at higher irradiance (Fig 4A). The authors should consider adding a third intermediate point at 17µW/mm^2 to confirm the statement made in this sentence.

      Multiple repeats were performed. We are confident of the robustness of our data.

      - Line 249 + Fig 4E: It seems that the data reported on Fig 4E are extracted from Fig 4D. If this is indeed the case, the data should be normalised by the total population size to compare survival probabilities under the two conditions. It would also be great to measure these probabilities (for WT and ∆kch) in the presence of ROS scavengers.

      - To distinguish between model fitting and model predictions, the authors should clearly state which parameters are taken from the literature and which parameters are adjusted to fit the experimental data.

      - Supplementary Figure 4A: why can't we see any wavefront in this series of images?

      For the experimental data, the wavefront was analyzed using the Imaris software. We systematically created an ROI with a curved geometry within the confocal stack (the biofilm). The ThT fluorescence traced along the surface of the curved geometry was analyzed along the z-axis.

      - Fig 7B: Could the authors explain why the plateau is higher in the simulations than in the biofilm experiments? Could they add noise on the firing activities?

      See the dedicated Martorelli modelling article. In general, adding noise would require stochastic Hodgkin-Huxley modelling, and the fluorescence data (and electrical impedance spectroscopy data) presented do not show extensive noise (owing to collective averaging over many bacterial cells).

      - Supplementary Figure 4B: Why can't we see the second peak in confocal images?

      The second peak is present although not as robust as in Fig 2B. The confocal images were obtained with a laser source. Therefore we tried to create a balance between applying sufficient light stress on the bacterial cells and mitigating photobleaching.

      Editing recommendations:

      The editing recommendations below have been applied where appropriate.

      - Many important technical details are missing (e.g. R^2, curvature, and 445nm irradiance measurements). Error bars are missing from most graphs. The captions should clearly indicate if these are single-cell or biofilm experiments, strain name, illumination conditions, number of experiments, SD, or SE. Please indicate on all panels of all figures in the main text and in the supplements, which are the conditions: single cell vs. biofilm, strains, medium, centrifugal vs centripetal etc..., where relevant. Please also draw error bars everywhere.

      We have now made appropriate changes. We specifically use "cells" when dealing with single cells and "biofilms" when working on biofilms. We describe the strain name either on the panel or in the image description.

      - Line 47-51: The way the paragraph is written suggests that no coordinated electrical oscillations have been observed in Gram-negative biofilms. However, Hennes et al (referenced as 57 in this manuscript) have shown that a wave of hyperpolarized cells propagates in colonies of Neisseria gonorrhoeae, which is a Gram-negative bacterium.

      We are now aware of this work. It was not published when we first submitted our work and the authors claim the waves of activity are due to ROS diffusion NOT propagating waves of ions (coordinated electrical wavefronts).

      - Line 59: "stressor" -> "stress" or "perturbation".

      The correction has been made.

      - Line 153: Please indicate in the Material&Methods how the size of the biofilm is measured.

      The biofilm size was obtained using BiofilmQ, and the step-by-step guide for using BiofilmQ has been stated.

      - Figure 2A: Please provide associated brightfield images to locate bacteria.

      - Line 186: Please remove "wavefront" from the caption. Fig2B only shows the average signal as a function of time.

      This correction has been implemented.

      - Fig 3B,C: Please indicate single cell and biofilm on the panels and also WT and ∆kch.

      - Line 289: I suggest adding "in single cell experiments" to the title of this section.

      - Fig 5A: blue light is always present at regular time intervals during regime I and II. The presence of blue light only in regime I could be misleading.

      - Fig 5C: The curve in Fig 5D seems to correspond to the biofilm case. The curve given by the model, should be compared with the average curve presented in Fig 1D.

      - Fig 6A, B, and C: These figures could be moved to supplements.

      - Line 392: Replace "turgidity" with "turgor pressure".

      - Fig 7C,E: Please use a log-log scale to represent these data and indicate the line of slope 1.

      - Fig 7E: The x-axis has been cropped.

      - Please provide a supplementary movie for the data presented in Fig 7E.

      - Line 455: E. coli biofilms do not express ThT.

      - Line 466: "\gamma is the anomalous exponent". Please remove anomalous (\gamma can equal 1 at this stage).

      - Line 475: Please replace "section" with "projection".

      - Line 476: Please replace "spatiotemporal" with "temporal". There is no spatial dependency in either figure.

      - Line 500: Please define Eikonal approximation.

      - Fig 8 could be moved to supplements.

      - Line 553: "predicted" -> "predict".

      - Line 593: Could the authors explain why their model offers much better quantitative agreement?

      - Line 669: What does "universal" mean in that context?

      - Line 671: A volume can be pipetted but not a concentration.

      - Line 676: Are triplicates technical or biological replicates?

      - Sup Fig1: Please use minutes instead of seconds in panel A.

      - Model for membrane dynamics: "The fraction of time the Q+ channel is open" -> "The dynamics of Q+ channel activity can be written". Ditto for K+ channel...

      - Model for membrane dynamics: "the term ... is a threshold-linear". This function is not linear at all. Why is it called linear? Also, please describe what \sigma is.

      - ABFDF model: "releasing a given concentration" -> "releasing a local concentration" or "a given number" but it's not \sigma anymore. Besides, this \sigma is unlikely related to the previous \sigma used in the model of membrane potential dynamics in single cells. Please consider renaming one or the other. Also, ions are referred to as C+ in the text and C in equation 8. Am I missing something?

      Reviewer #2 (Recommendations For The Authors):

      I have included all my comments as one review. I have done so, despite the fact that some minor comments could have gone into this section, because I decided to review each Result section. I thus felt that not writing it as one review might be harder to follow. I have however highlighted which comments are minor suggestions or where I felt corrections.

      However, while I am happy with all my comments being public, given their nature I think they should be shown to authors first. Perhaps the authors want to go over them and think about it before deciding if they are happy for their manuscript to be published along with these comments, or not. I will highlight this in an email to the editor. I question whether in this case, given that I am raising major issues, publishing both the manuscript and the comments is the way to go as I think it might just generate confusion among the audience.

      Reviewer #3 (Recommendations For The Authors):

      I was unable to find any legends for any of the supplemental videos in my review materials, and I could not open supplemental video 5.

      I made some comments in the public review about the analysis and interpretation of the time-to-fire data. One of the other challenges in this data set is that the time resolution is limited- it seems that a large proportion of cells have already fired after a single acquisition frame. It would be ideal to increase the time resolution on this measurement to improve precision. This could be done by imaging more quickly, but that would perhaps necessitate more blue light exposure; an alternative is to do this experiment under lower blue light irradiance where the first spike time is increased (Figure 4A).

      In the public review, I mentioned the possible impact of high membrane potential on PI permeability. To address this, the experiment could be repeated with other stains, or the viability of blue light-treated cells could be addressed more directly by outgrowth or colony-forming unit assays.

      In the public review, I mentioned the possible combined toxicity of ThT and blue light. Live/dead experiments after blue light exposure with and without ThT could be used to test for such effects, and/or the growth curve experiment in Figure 1F could be repeated with blue light exposure at a comparable irradiance used in the experiment.

      Throughout the paper and figure legends, it would help to have more methodological details in the main text, especially those that are critical for the interpretation of the experiment. The experimental details in the methods section are nicely described, but the data analysis section should be expanded significantly.

      At the end of the results section, the authors suggest a critical biofilm size of only 4 µm for wavefront propagation (not much larger than a single cell!). The authors show responses for various biofilm sizes in Fig. 2C, but these are all substantially larger. Are there data for cell clusters above and below this size that could support this claim more directly?

      The authors mention image registration as part of their analysis pipeline, but the 3D data sets in Video S6B and Fig. S4A do not appear to be registered- were these registered prior to the velocity analysis reported in Fig. 8?

      One of the most challenging claims to demonstrate in this paper is that these membrane potential wavefronts are involved in coordinating a large, biofilm-scale response to blue light. One possible way to test this might be to repeat the Live/Dead experiment in planktonic culture or the single-cell condition. If the protection from blue light specifically emerges due to coordinated activity of the biofilm, the Kch mutant would not be expected to show a change in Live/Dead staining in non-biofilm conditions.

      Line 140: How is "mature biofilm" defined? Also on this same line, what does "spontaneous" mean here?

      Line 151: "much smaller": Given that the reported time for 3D biofilms is 2.73 ± 0.85 min and in microclusters is 3.27 ± 1.77 min, this seems overly strong.

      Line 155: How is "biofilm density" characterized? Additionally, the data in Figure 2C are presented in distance units (µm), but the text refers to "areal coverage"- please define the meaning of these distance units in the legend and/or here in the text (is this the average radius?).

      Lines 161-162: These claims seem strong given the data presented before, and the logic is not very explicit. For example, in the second sentence, the idea that this signaling is used to "coordinate long-range responses to light stress" does not seem strongly evidenced at this point in the paper. What is meant by a long-range response to light stress- are there processes to respond to light that occur at long-length scales (rather than on the single-cell scale)? If so, is there evidence that these membrane potential changes could induce these responses? Please clarify the logic behind these conclusions.

      Lines 235-236: In the lower irradiance conditions, the responses are slower overall, and it looks like the ThT intensity is beginning to rise at the end of the measurement. Could a more prominent second peak be observed in these cases if the measurement time was extended?

      Line 242-243: The overall trajectories of extracellular potassium are indeed similar, but the kinetics of the second peak of potassium are different than those observed by ThT (it rises some minutes earlier)- is this consistent with the idea that Kch is responsible for that peak? Additionally, the potassium dynamics also reflect the first peak- is this surprising given that the Kch channel has no effect on this peak?

      Line 255-256: Again, this seems like a very strong claim. There are several possible interpretations of the catalase experiment (which should be discussed); this experiment perhaps suggests that ROS impacts membrane potential, but does not obviously indicate that these membrane potential fluctuations mitigate ROS levels or help the cells respond to ROS stress. The loss of viability in the ∆kch mutant might indicate a link between these membrane potential experiments and viability, but it is hard to interpret without the no-light control I mention in the public review.

      Lines 313-315: "The model predicts... the external light stress". Please clarify this section. First, where does this prediction arise from in the modeling work? Second, I am not sure what is meant by "modulates the light stress" or "keeps the cell dynamics robust to the intensity of external light stress" (especially since the dynamics clearly vary with irradiance, as seen in Figure 4A).

      Line 322: I am not sure what "handles the ROS by adjusting the profile of the membrane potential dynamics" means. What is meant by "handling" ROS? Is the hypothesis that membrane potential dynamics themselves are protective against ROS, or that they induce a ROS-protective response downstream, or something else? Later in lines 327-8 the authors write that changes in the response to ROS in the model agree with the hypothesis, but just showing that ROS impacts the membrane potential does not seem to demonstrate that this has a protective effect against ROS.

      Line 365-366: This section title seems confusing- mechanosensitive ion channels totally ablate membrane potential dynamics, they don't have a specific effect on the first hyperpolarization event. The claim that mechanosensitive ion channels are specifically involved in the first event also appears in the abstract.

      Also, the apparent membrane potential is much lower even at the start of the experiment in these mutants- is this expected? This seems to imply that these ion channels also have a blue light independent effect.

      Lines 368, 371: Should be VGCCs rather than VGGCs.

      Line 477: I believe the figure reference here should be to Figure 7B, not 6B.

      Line 567-568: "The initial spike is key to registering the presence of the light stress." What is the evidence for this claim?

      Line 592-594: "We have presented much better quantitative agreement..." This is a strong claim; it is not immediately evident to me that the agreement between model and prediction is "much better" in this work than in the cited work. The model in Figure 4 of reference 57 seems to capture the key features of their data. Clarification is needed about this claim.

      Line 613: "...strains did not have any additional mutations." This seems to imply that whole genome sequencing was performed- is this the case?

      Line 627: I believe this should refer to Figure S2A-B rather than S1.

      Line 719: What percentage of cells did not hyperpolarize in these experiments?

      Lines 751-754: As I mentioned above, significant detail is missing here about how these measurements were made. How is "radius" defined in 3D biofilms like the one shown in Video S6B, which looks very flat? What is meant by the distance from the substrate to the core, since usually in this biofilm geometry, the core is directly on the substrate? Most importantly, this only describes the process of sectioning the data- how were these sections used to compute the velocity of ThT signal propagation?

      I also have some comments specifically on the figure presentation:

      Normalization from 0 to 1 has been done in some of the ThT traces in the paper, but not all. The claims in the paper would be easiest to evaluate if the non-normalized data were shown- this is important for the interpretation of some of the claims.

      Some indication of standard deviation (error bars or shading) should be added to all figures where mean traces are plotted.

      Throughout the paper, I am a bit confused by the time axis; the data consistently starts at 1 minute. This is not intuitive to me, because it seems that the blue light being applied to the cells is also the excitation laser for ThT- in that case, shouldn't the first imaging frame be at time 0 (when the blue light is first applied)? Or is there an additional exposure of blue light 1 minute before imaging starts? This is consequential because it impacts the measured time to the first spike. (Additionally, all of the video time stamps start at 0).

      Please increase the size of the scale bars and bar labels throughout, especially in Figure 2A and S4A.

      In Figure 1B and D, it would help to decrease the opacity on the individual traces so that more of them can be discerned. It would also improve clarity to have data from the different experiments shown with different colored lines, so that variability between experiments can be clearly visualized.

      Results in Figure 1E would be easier to interpret if the frequency were normalized to total N. It is hard to tell from this graph whether the edges and bin widths are the same between the data sets, but if not, they should be. Also, it would help to reduce the opacity of the sparse cell data set so that the full microcluster data set can be seen as well.
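
      As a sketch of the requested presentation, the snippet below bins two data sets on one shared set of edges and divides each by its own N. The data are randomly generated stand-ins and the function name is hypothetical; only the NumPy calls are real.

```python
import numpy as np


def normalized_histograms(a, b, n_bins=20):
    """Bin two samples on shared edges and normalize each to its own N."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Shared edges span the pooled range, so bins are directly comparable.
    edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=n_bins)
    counts_a, _ = np.histogram(a, bins=edges)
    counts_b, _ = np.histogram(b, bins=edges)
    return edges, counts_a / a.size, counts_b / b.size


# Hypothetical samples of very different size (e.g. sparse cells vs. microclusters).
rng = np.random.default_rng(0)
sparse = rng.normal(5.0, 1.0, size=40)
cluster = rng.normal(3.0, 0.8, size=400)

edges, frac_sparse, frac_cluster = normalized_histograms(sparse, cluster)
```

      Plotted this way, each bar height is the fraction of cells in that bin, so data sets with different N can be compared directly.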

      Biofilm images are shown in Figures 2A, S3A, and Video S3; these are all of the same biofilm. Why not take the opportunity to show different experimental replicates in these different figures? The same goes for Figure S4A and Video S6B, which again are of the same biofilm.

      Figure 2C would be much easier to read if the curves were colored in order of their size; the same is true for Figure 4A and irradiance.

      The complementation data in Figure S3D should be moved to the main text figure 3 alongside the data about the corresponding knockout to make it easier to compare the curves.

      Figure S3E: Is the Y-axis in this graph mislabeled? It is labeled as ThT fluorescence, but it seems that it is reporting fluorescence from the calcium indicator?

      Video S6B is very confusing - why does the video play first forwards and then backwards? Unless I am looking very carefully at the time stamps it is easy to misinterpret this as a rise in the intensity at the end of the experiment. Without a video legend, it's hard to understand this, but I think it would be much more straightforward to interpret if it only played forward. (Also, why is this video labeled 6B when there is no video 6A?)

    1. eLife Assessment

      This valuable study provides convincing evidence that specific proteins on the surface of cancer cells undergo a particular form of recycling and are redirected toward the cell-cell contact with T cells, a type of immune cell. However, the characterization of the consequences of T cell activation resulting from perturbing the recycling pathway is incomplete. Furthermore, relevant literature has not been sufficiently cited.

    2. Reviewer #1 (Public review):

      Summary:

      This study by Xu et al. focuses on the impact of clathrin-independent endocytosis in cancer cells on T cell activation. In particular, by using a combination of biochemical approaches and imaging, the authors identify ICAM1, the ligand for T cell-expressed integrin LFA-1, as a novel cargo for EndoA3-mediated endocytosis. Subsequently, the authors aim to identify functional implications for T cell activation, using a combination of cytokine assays and imaging experiments.

      They find that the absence of EndoA3 leads to a reduction in T cell-produced cytokine levels. Additionally, they observe slightly reduced levels of ICAM1 at the immunological synapse and an enlarged contact area between T cells and cancer cells. Taken together, the authors propose a mechanism where EndoA3-mediated endocytosis of ICAM1, followed by retrograde transport, supplies the immunological synapse with ICAM1. In the absence of EndoA3, T cells attempt to compensate for suboptimal ICAM1 levels at the synapse by enlarging their contact area, which proves insufficient and leads to lower levels of T cell activation.

      Strengths:

      The authors utilize a rigorous and innovative experimental approach that convincingly identifies ICAM1 as a novel cargo for EndoA3-mediated endocytosis.

      Weaknesses:

      The characterization of the effects of EndoA3 absence on T cell activation appears incomplete. Key aspects, such as surface marker upregulation, T cell proliferation, integrin signalling and most importantly, the killing of cancer cells, are not comprehensively investigated.

      As endo- and exocytosis are intricately linked with the biophysical properties of the cellular membrane (e.g. membrane tension), which can significantly impact T-cell activation and cytotoxicity, the authors should discuss this possibility and ideally test it experimentally to some degree.

      Crucially, key literature relevant to this research, addressing the role of ICAM1 endocytosis in antigen-presenting cells, has not been taken into consideration.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Xu et al. studies the relevance of endophilin A3-dependent endocytosis and retrograde transport of immune synapse components in the activation of cytotoxic CD8 T cells. First, the authors show that ICAM1 and ALCAM, known components of immune synapses, are endocytosed via EndoA3-dependent endocytosis and retrogradely transported to the Golgi. The authors then show that blocking internalization or retrograde trafficking reduces the activation of CD8 T cells. Moreover, this diminished CD8 T cell activation resulted in the formation of an enlarged immune synapse with reduced ICAM1 recruitment.

      Strengths:

      The authors show a novel EndoA3-dependent endocytic cargo and provide strong evidence linking EndoA3 endocytosis to the retrograde transport of ALCAM and ICAM1.

      Weaknesses:

      The role of EndoA3 in the process of T cell activation is shown in a cell that requires exogenous expression of this gene. Moreover, the authors claim that their findings are important for polarized redistribution of cargoes, but failed to show convincingly that the cargoes they are studying are polarized in their experimental system. The statistics of the manuscript also require some refinement.

    4. Reviewer #3 (Public review):

      Summary:

      Shiqiang Xu and colleagues have examined the importance of ICAM-1 and ALCAM internalization and retrograde transport in cancer cells on the formation of a polarized immunological synapse with cytotoxic CD8+ T cells. They find that internalization is mediated by Endophilin A3 (EndoA3) while retrograde transport to the Golgi apparatus is mediated by the retromer complex. The paper builds on previous findings from corresponding author Henri-François Renard showing that ALCAM is an EndoA3-dependent cargo in clathrin-independent endocytosis.

      Strengths:

      The work is interesting as it describes a novel mechanism by which cancer cells might influence CD8+ T cell activation and immunological synapse formation, and the authors have used a variety of cell biology and immunology methods to study this. However, there are some aspects of the paper that should be addressed more thoroughly to substantiate the conclusions made by the authors.

      Weaknesses:

      In Figure 2A-B, the authors show micrographs from live TIRF movies of HeLa and LB33-MEL cells stably expressing EndoA3-GFP and transiently expressing ICAM-1-mScarlet. The ICAM-1 signal appears diffuse across the plasma membrane while the EndoA3 signal is partially punctate and partially lining the edge of membrane patches. Previous studies of EndoA3-mediated endocytosis have indicated that this can be observed as transient cargo-enriched puncta on the cell surface. In the present study, there is only one example of such an ICAM-1 and EndoA3 positive punctate event. Other examples of overlapping signals between ICAM-1 and EndoA3 are shown, but these either show retracting ICAM-1 positive membrane protrusions or large membrane patches encircled by EndoA3. While these might represent different modes of EndoA3-mediated ICAM-1 internalization, any conclusion on this would require further investigation.

      Moreover, in Figure 2C-E, uptake of the previously established EndoA3 endocytic cargo ALCAM is analyzed by quantifying total internal fluorescence in LB33-MEL cells of antibody-labelled ALCAM following both overexpression and siRNA-mediated knockdown of EndoA3, showing increased and decreased uptake respectively. Why has the same quantification not been done for the proposed novel EndoA3 endocytic cargo ICAM-1? Furthermore, if endocytosis of ICAM-1 and ALCAM is diminished following EndoA3 knockdown, the expression level on the cell surface would presumably increase accordingly. This has been shown for ALCAM previously and should also be quantified for ICAM-1.

      In Figure 4A the authors show micrographs from a live-cell Airyscan movie (Movie S6) of a CD8+ T cell incubated with HeLa cells stably expressing HLA-A*68012 and transiently expressing ICAM1-EGFP. From the movie, it seems that some ICAM-1 positive vesicles in one of the HeLa cells are moving towards the T cell. However, the T cell does not appear to have formed a stable immunological synapse, but rather perhaps a motile kinapse. Furthermore, to conclude that the ICAM-1 positive vesicles are transported toward the T cell in a polarized manner, vesicles from multiple cells should be tracked and their overall directionality should be analyzed. It would also strengthen the paper if the authors could show additional evidence for polarization of the cancer cells in response to T-cell interaction.

      Finally, in Figures 4D-G, the authors show that the contact area between CD8+ T cells and LB33-MEL cells is increased in response to siRNA-mediated knockdown of EndoA3 and VPS26A. While this could be caused by reduced polarized delivery of ICAM-1 and ALCAM to the interface between the cells, it could also be caused by other factors such as increased cell surface expression of these proteins due to diminished endocytosis, and/or morphological changes in the cancer cells resulting from disrupted membrane traffic. More experimental evidence is needed to support the working model in Figure 4H.

    1. eLife Assessment

      This study presents valuable insights into the role of two proteins, Rab27A and SYTL5, which control vesicle transport and delivery. While the data is clear, the overall evidence is somewhat incomplete. Strengthening the mechanistic aspect would enhance the study, making it of greater interest to cell biologists studying membrane trafficking and mitochondria.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Ana Lapao et al. investigated the roles of the Rab27 effector SYTL5 in cellular membrane trafficking pathways. The authors found that SYTL5 localizes to mitochondria in a Rab27A-dependent manner. They demonstrated that SYTL5-Rab27A positive vesicles containing mitochondrial material are formed under hypoxic conditions, and thus speculate that SYTL5 and Rab27A play roles in mitophagy. They also found that both SYTL5 and Rab27A are important for normal mitochondrial respiration. Cells lacking SYTL5 undergo a shift from mitochondrial oxygen consumption to glycolysis, a common process in cancer cells known as the Warburg effect. Based on a cancer patient database, the authors noticed that low SYTL5 expression is related to reduced survival for adrenocortical carcinoma patients, indicating SYTL5 could be a negative regulator of the Warburg effect and potentially tumorigenesis.

      Strengths:

      The authors take advantage of multiple techniques and novel methods to perform the experiments.

      (1) Live-cell imaging revealed that stably inducible expression of SYTL5 co-localized with filamentous structures positive for mitochondria. This result was further confirmed by using correlative light and EM (CLEM) analysis and western blotting from purified mitochondrial fraction.

      (2) In order to investigate whether SYTL5 and RAB27A are required for mitophagy in hypoxic conditions, two established mitophagy reporter U2OS cell lines were used to analyze the autophagic flux.

      Weaknesses:

      This study revealed a potential function of SYTL5 in mitophagy and mitochondrial metabolism. However, the mechanistic evidence that establishes the relationship between SYTL5/Rab27A and mitophagy is insufficient. The involvement of SYTL5 in ACC needs more investigation. Furthermore, images and results supporting the major conclusions need to be improved.

    3. Reviewer #2 (Public review):

      Summary:

      The authors provide convincing evidence that Rab27 and SYTL5 work together to regulate mitochondrial activity and homeostasis.

      Strengths:

      The development of models that allow the function to be dissected, and the rigorous approach and testing of mitochondrial activity.

      Weaknesses:

      There may be unknown redundancies in both pathways in which Rab27 and SYTL5 are working which could confound the interpretation of the results.

      Suggestions for revision:

      Given that Rab27A and SYTL5 are members of protein families it would be important to exclude any possible functional redundancies coming from Rab27B expression or one of the other SYTL family members. For Rab27 this would be straightforward to test in the assays shown in Figure 4 and Supplementary Figure 5. For SYTL5 it might be sufficient to include some discussion about this possibility.

      Suggestions for Discussion:

      Both Rab27A and SYTL5 localize to other membranes, including the endolysosomal compartments. How do the authors envisage the mechanism or cellular modifications that allow these proteins, either individually or in complex, to function also to regulate mitochondrial function? It would be interesting to have some views.

    4. Reviewer #3 (Public review):

      Summary:

      In the manuscript by Lapao et al., the authors uncover a role for the RAB27A effector protein SYTL5 in regulating mitochondrial function and turnover. The authors find that SYTL5 localizes to mitochondria in a RAB27A-dependent way and that loss of SYTL5 (or RAB27A) impairs lysosomal turnover of an inner mitochondrial membrane mitophagy reporter but not a matrix-based one. As the authors see no co-localization of GFP/mScarlet tagged versions of SYTL5 or RAB27A with LC3 or p62, they propose that lysosomal turnover is independent of the conventional autophagy machinery. Finally, the authors go on to show that loss of SYTL5 impacts mitochondrial respiration and ECAR and as such may influence the Warburg effect and tumorigenesis. Of relevance here, the authors go on to show that SYTL5 expression is reduced in adrenocortical carcinomas and this correlates with reduced survival rates.

      Strengths:

      There are clearly interesting and new findings here that will be relevant to those following mitochondrial function, the endocytic pathway, and cancer metabolism.

      Weaknesses:

      The data feel somewhat preliminary in that the conclusions rely on exogenously expressed proteins and reporters, which do not always align.

      As the authors note there are no commercially available antibodies that recognize endogenous SYTL5, hence they have had to stably express GFP-tagged versions. However, it appears that the level of expression dictates co-localization from the examples the authors give (though it is hard to tell as there is a lack of any kind of quantitation for all the fluorescent figures). Therefore, the authors may wish to generate an antibody themselves or tag the endogenous protein using CRISPR.

      In relation to quantitation, the authors found that SYTL5 localizes to multiple compartments or potentially a few compartments that are positive for multiple markers. Some quantitation here would be very useful as it might inform on function.

      The authors find that, under hypoxia or hypoxia-like conditions, punctate structures of SYTL5 and RAB27A form that are positive for Mitotracker, and that a very specific mitophagy assay based on the pSu9-Halo system is impaired by siRNA against SYTL5/RAB27A, but another, distinct mitophagy assay (Matrix EGFP-mCherry) shows no change. I think this work would strongly benefit from some measurements with endogenous mitochondrial proteins, both via immunofluorescence and western blot-based flux assays.

      A really interesting aspect is the apparent independence of this mitophagy pathway from the conventional autophagy machinery. However, this is only based on a lack of co-localization of p62 or LC3 with LAMP1 and GFP/mScarlet tagged SYTL5/RAB27A. I would not expect them to greatly colocalize in lysosomes, as both p62 and LC3 will be rapidly degraded, while the eGFP and mScarlet tags are relatively resistant to lysosomal hydrolysis. Experiments with and without a lysosome inhibitor might help here and, ideally, the functional mitophagy assays should be repeated in autophagy KOs.

      The link to tumorigenesis and cancer survival is very interesting but it is not clear if this is due to the mitochondrially-related aspects of SYTL5 and RAB27A. For example, increased ECAR is seen in the SYTL5 KO cells but not in the RAB27A KO cells (Fig.5D), implying that mitochondrial localization of SYTL5 is not required for the ECAR effect. More work to strengthen the link between the two sections in the paper would help with future directions and impact with respect to future cancer treatment avenues to explore.

    1. eLife Assessment

      This important study characterizes the mechanics and stability of bolalipids from archaeal membranes using molecular dynamics simulations. A mesoscale model of bolalipids is presented and evaluated across a series of membrane configurations. The evidence supporting the conclusions is convincing, demonstrating that mixtures of bolalipids and regular bilayer lipids in archaeal membranes enhance fluidity and stability.

    2. Reviewer #1 (Public review):

      Summary:

      Amaral et al. present a study investigating the mesoscale modelling and dynamics of bolalipids.

      Strengths:

      The figures in this paper are exceptional, both those that outline and introduce the lipid types and the plots, whose quality and resolution are excellent. The data also appear to be outstanding and (hopefully) of significant general interest.

      Weaknesses:

      In the introduction, I would like to have read more specifics on the biological role of bolalipids. Archaea are mentioned, but this kingdom is huge - there must be specific species that can be discussed where bolalipids are integral to archaeal life. The authors should go beyond 'extremophiles'. In short, they should unpack why the general audience should be interested in these lipids, within a subset of organisms that are often forgotten about.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to understand the biophysical properties of archaeal membranes made of bolalipids. Bacterial and eukaryotic membranes are made of lipids that self-assemble into bilayers. Archaea, instead, use bolalipids, lipids that have two headgroups and can span the entire bilayer. The authors wanted to determine if the unique characteristics of archaea, which are often extremophiles, are in part due to the fact that their membranes contain bolalipids.

      The authors develop a minimal computational model to compare the biophysics of bilayers made of lipids, bolalipids, and mixtures of the two. Their model enables them to determine essential parameters such as bilayer phase diagrams, mechanical moduli, and the bilayer behavior upon cargo inclusion and remodeling.

      The authors demonstrate that bolalipid bilayers behave as binary mixtures, containing bolalipids organized either in a straight conformation, spanning the entire bilayer, or in a U-shaped one, confined to a single leaflet. This dynamic mixture allows bolalipid bilayers to be very sturdy but also permits remodeling. However, remodeling is energetically more expensive than with standard lipids. The authors speculate that this might be why lipids were more abundant in the evolutionary process.

      Strengths:

      This is a wonderful paper, a very fine piece of scholarship. It is interesting from the point of view of biology, biophysics, and material science. The authors mastered the modeling and analysis of these complex systems. The evidence for their findings is really strong and complete. The paper is written superbly, the language is precise and the reading experience is very pleasant. The plots are very well-thought-out.

      Weaknesses:

      I would not talk about weaknesses, because this is really a nice paper. If I really had to find one, I would have liked to see some clear predictions of the model expressed in such a way that experimentalists could design validation experiments.

    4. Reviewer #3 (Public review):

      Summary:

      The authors have studied the mechanics of bolalipid and archaeal mixed-lipid membranes via comprehensive molecular dynamics simulations. The Cooke-Deserno 3-bead-per-lipid model is extended to bolalipids with 6 beads. Phase diagrams, bending rigidity, mechanical stability of curved membranes, and cargo uptake are studied. Effects such as the formation of U-shaped bolalipids, pore formation in highly curved regions, and changes in membrane rigidity are studied and discussed. The main aim has been to show how the mixture of bolalipids and regular bilayer lipids in archaeal membrane models enhances the fluidity and stability of these membranes.

      Strengths:

      The authors have presented a wide range of simulation results for different membrane conditions and conformations. For the most part, the analyses and their results are presented clearly and concisely. Figures, supplementary information, and movies very well present what has been studied. The manuscript is well-written and is easy to follow.

      Major issues:

      The Cooke-Deserno model, while very powerful for biophysical analysis of membranes at the mesoscale, is very much void of chemical information. It is parameterized such that it is good at producing fluid membranes and predicting values for bending rigidity, compressibility, and even thermal expansion coefficient falling in the accepted range of values for bilayer membranes. But it still represents a generic membrane. Now, the authors have suggested a similar model for the archaeal bolalipids, which are chemically different lipids (the presence of cyclopentane rings for one), and there is no good justification for using the same pairwise interactions between their representative beads in the coarse-grained model. This does not necessarily diminish the worth of all the authors' analyses. What is at risk here is the confusion between "what we observe this model of bolalipid- or mixed-membranes do" and "how real bolalipid-containing archaeal membranes behave at these mechanical and thermal conditions".

      Another more specific, major issue has to do with using the Hamm-Kozlov model for fitting the power spectrum of thermal undulations. The 1/q^2 term can very well be attributed to membrane tension. While a barostat is indeed used, have the authors made absolutely sure that the deviation from 1/q^4 behavior does not correspond to lateral tension? I got more worried when I noticed in the SI that the simulations had been done with combined "fix langevin" and "fix nph" LAMMPS commands. This combination does not result in a proper isothermal-isobaric ensemble. The importance of tilt terms for bolalipids is indeed very interesting, but I believe more care is needed to establish that.
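
      For reference, the two standard forms of the height fluctuation spectrum that are at issue can be written side by side. This is a sketch in a generic notation that may differ from the manuscript's: $\kappa$ is the bending rigidity, $\sigma$ the lateral tension, $\kappa_\theta$ the tilt modulus, $A$ the projected area, and the prefactor depends on the Fourier normalization convention.

```latex
% Helfrich spectrum with lateral tension (no tilt):
\left\langle |h_q|^2 \right\rangle_{\text{tension}}
  = \frac{k_B T}{A \left( \sigma q^2 + \kappa q^4 \right)}

% Tensionless spectrum with a tilt contribution (Hamm-Kozlov form):
\left\langle |h_q|^2 \right\rangle_{\text{tilt}}
  = \frac{k_B T}{A} \left( \frac{1}{\kappa q^4} + \frac{1}{\kappa_\theta q^2} \right)
```

      Both expressions contain a $1/q^2$ contribution, but in different regimes: the tension term dominates at small wavevectors, $q \ll \sqrt{\sigma/\kappa}$, whereas the tilt term dominates at large wavevectors, $q \gg \sqrt{\kappa_\theta/\kappa}$. Showing in which range of $q$ the deviation from $1/q^4$ appears would therefore help distinguish the two explanations.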

      This issue is reinforced when considering Figure 3B. These results suggest that increasing the fraction of regular lipids increases the tilt modulus, with the maximum value achieved for a normal Cooke-Deserno bilayer void of bolalipids. But this is contradictory. For these bilayers, we don't need the tilt modulus in the first place.

      Also, from the SI, I gathered that the authors have neglected the longest wavelength mode because it is not equilibrated. If this is indeed the case, it is a dangerous thing to do, because with a small membrane patch, this mode can very well change the general trend of the power spectrum. As a lot of other analyses in the manuscript rely on these measurements, I believe more elaboration is in order.

      The authors have found that "there is a strong dependency of the bending rigidity on the membrane mean curvature of stiffer bolalipids." The effect is negative, with the membrane becoming less stiff at higher mean curvatures. Why is that? I would assume that with more flexible bolalipids, the possibility of reorganization into U-shaped chains should affect the bending rigidity more (as Figure 2E suggests). While for a stiff bolalipid, not much would change if you increase the mean curvature. This should be either a tilt effect, or have to do with asymmetry between the leaflets. But on the other hand, the tilt modulus is shown to decrease with increasing bolalipid rigidity. The authors get back to this issue only on page 10, when they consider U-shaped lipids in the inner and outer leaflets and write, "this suggested that an additional membrane-curving mechanism must be involved." But then again, in the Discussion, the authors write, "It is striking that membranes made from stiffer bolalipids showed a curvature-dependent bending modulus, which is a clear signature that bolalipid membranes exhibit plastic behavior during membrane reshaping," adding to the confusion.

      This issue is repeated when the authors study nanoparticle uptake. They write: "to reconcile these seemingly conflicting observations we reason that the bending rigidity, similar to Figure 2F, is not constant but softens upon increasing membrane curvature, due to dynamic change in the ratio between bolalipids in straight and U-shaped conformation. Hence, bolalipid membranes show stroking plastic behavior as they soften during reshaping." But the softening effect that they refer to, as shown in Figure 4B, occurs for very stiff bolalipids, for which not much switching to U-shaped conformation should occur.

      Another major issue is with what the authors refer to as the "effective temperature". While plotting phase diagrams for kT/eps value is absolutely valid, I'm not a fan of calling this effective temperature. It is a dimensionless quantity that scales linearly with temperature, but is not a temperature. It is usually called a "reduced temperature". Then the authors refer to their findings as studying the stability of archaeal membranes at high temperatures. I have to disagree because eps is not the only potential parameter in the simulations (there are at least space exclusion and angle-bending stiffnesses) so one cannot identify changing eps with changing the global simulation temperature. This only works when you have one potential parameter, like an LJ gas.

      Minor issues:

      As the authors have noted, the fact that the membrane curvature can change the ratio of U-shaped to straight bolalipids would render the curvature elasticity non-linear (though the term "plastic" should not be used, as this is still structurally reversible when the stress is removed. Technically, it is hypoelastic behavior, possibly with hysteresis.) With this in mind, when the authors use essentially linear elastic models for fluctuation analysis, they should make a comparison of maximum curvatures occurring in simulations with a range that causes significant changes in bolalipid conformational ratios.

      The Introduction section of the manuscript is written with a biochemical approach, with very minor attention to the simulation works on this system. Some molecular dynamics works are only cited as existing previous work, without mentioning what has already been studied in archaeal membranes. Some information, like the binding of ESCRT proteins to archaeal membranes, though interesting, helps little to place the study within the discipline. The Introduction should be revised to show what has already been studied with simulations (as the authors mention in the Discussion) and how the presented research complements it.

      The authors have been a bit loose with using the term "stability". I'd like to see the distinction in each case, as in "chemical/thermal/mechanical/conformational stability".

      In the original Cooke-Deserno model, a so-called "poor man's angle-bending term" is used, which is essentially a bond-stretching term between the first and third particle. However, I notice the authors using the full harmonic angle-bending potential. This should be mentioned.

      The analysis of energy of U-shaped lipids with the linear model E=c_0 + c_1 * k_bola is indeed very interesting. I am curious, can this also be corroborated with mean energy measurements? The minor issue is calling the source of the favorability of U-shaped lipids "entropic", while clearly an energetic contribution is found. The two conformations, for example, might differ in the interactions with the neighboring lipids.

      The authors write in the Discussion, "In any case, our results indicate that membrane remodelling, such as membrane fission during membrane traffic, is much more difficult in bolalipid membranes [34]." Firstly, I'm not sure if studying the dependence of budding behavior on adhesion energy with nanoparticles is enough to make claims about membrane fission. Secondly, why is the 2015 paper by Markus Deserno cited here?

      In the SI, where the measurement of the diffusion coefficient is discussed, the expression for D is missing the power 2 of displacement.
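
      For clarity, the standard Einstein relation that the SI expression presumably intends is given below, with the displacement squared. The notation is assumed, not taken from the SI: $\mathbf{r}(t)$ is the lipid position and $d$ the dimensionality, with $d = 2$ for lateral diffusion in the membrane plane.

```latex
D = \lim_{t \to \infty}
    \frac{\left\langle \left| \mathbf{r}(t) - \mathbf{r}(0) \right|^{2} \right\rangle}{2 d \, t},
\qquad
d = 2 \;\Rightarrow\;
D = \lim_{t \to \infty}
    \frac{\left\langle \left| \mathbf{r}(t) - \mathbf{r}(0) \right|^{2} \right\rangle}{4 t}
```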

      Where cargo uptake is discussed, the term "adsorption energy" is used. I think the more appropriate term would be "adhesion energy".

      Typos:
      Page 1, paragraph 2: Adaption → Adaptation.
      Page 10, paragraph 1: Stroking → Striking.

    1. eLife Assessment

      This important study explores the commander-independent function of COMMD3-Arf1 in endosomal recycling. The evidence supporting the authors' claims is solid; however, the inclusion of additional validation experiments and control conditions would have further strengthened the study. The findings will be of significant interest to cell biologists working on membrane trafficking.

    2. Reviewer #1 (Public review):

      G. Squiers et al. analyzed a previously reported CRISPR genetic screening dataset of engineered GLUT4 cell-surface presentation and identified the Commander complex subunit COMMD3 as being required for endosomal recycling of specific cargo proteins, such as transferrin receptor (TfR), to the cell surface. Through comparison of COMMD3-KO and other Commander subunit-KO cells, they demonstrated that the role of COMMD3 in mediating TfR recycling is independent of the Commander complex. Structural analysis and co-immunoprecipitation followed by mass spectrometry revealed that TfR recycling by COMMD3 relies on ARF1. COMMD3 interacts with ARF1 through its N-terminal domain (NTD) to stabilize ARF1. A mutation in the NTD of COMMD3, which disrupts the NTD-ARF1 interaction, failed to rescue cell surface TfR in COMMD3-KO cells. In conclusion, the authors assert that COMMD3 stabilizes ARF1 in a Commander complex-independent manner, which is essential for recycling specific cargo proteins from endosomes to the plasma membrane.

      The conclusions of this paper are generally supported by data, but some validation experiments and control conditions should be included to strengthen the study.

      (1) Commander-Independent Role of COMMD3:
      While the authors provided evidence to support the Commander-independent role of COMMD3 (such as the absence of other Commander subunits in the CRISPR screen and the undiminished COMMD3 levels in other subunit-KO cells), direct evidence is lacking. The mutation that specifically disrupts the COMMD3-ARF1 interaction could serve as a valuable tool to directly address this question.

      (2) Role of ARF1 in Cargo Selection:
      The Commander-independent function of COMMD3 appears cargo-dependent and relies on ARF1's role in cargo selection. The authors should investigate whether KO/KD of ARF1 reduces cell surface levels of ITGA6 and TfR.

      (3) Impact on TfR Stability:
      Figure 7D suggests that TfR protein levels are reduced in COMMD3-KO cells, potentially due to degradation caused by disrupted recycling. This raises the question of whether the observed reduction in cell surface TfR is due to impaired endosomal recycling or decreased total protein levels. The authors should quantify the ratio of cell surface protein to total protein for TfR, GLUT4-SPR, and ITGA6 in COMMD3-KO cells.
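
      A minimal sketch of the normalization requested in point (3), assuming hypothetical signal intensities (for example, surface staining measured by flow cytometry and total protein measured by immunoblot). The function names and all numbers are illustrative and are not taken from the manuscript:

```python
def surface_fraction(surface: float, total: float) -> float:
    """Return the surface-to-total ratio for one cargo protein."""
    if total <= 0:
        raise ValueError("total protein signal must be positive")
    return surface / total


def relative_to_control(ko_surface: float, ko_total: float,
                        wt_surface: float, wt_total: float) -> float:
    """Surface/total ratio in KO cells, normalized to the WT ratio.

    A value near 1 suggests the drop in surface signal is explained by
    lower total protein; a value well below 1 points to a trafficking
    defect beyond the change in total protein.
    """
    return surface_fraction(ko_surface, ko_total) / surface_fraction(wt_surface, wt_total)


# Hypothetical readings (arbitrary units), not measured data.
# Surface falls to 30% of WT while total falls to 60% of WT.
recycling_defect = relative_to_control(30.0, 60.0, 100.0, 100.0)

# Surface and total both fall to 50% of WT.
explained_by_total = relative_to_control(50.0, 50.0, 100.0, 100.0)
```

      In the first hypothetical case the normalized ratio is 0.5, so half of the expected surface pool is missing even after accounting for reduced total protein; in the second it is 1.0, so the surface reduction is fully accounted for by the lower total level.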

    3. Reviewer #2 (Public review):

      Summary:

      The Commander complex is a key player in endosomal recycling which recruits cargo proteins and facilitates the formation of tubulo-vesicular carriers. Squiers et al. found that COMMD3, a subunit of the Commander complex, can interact directly with ARF1 and regulate endosomal recycling.

      Strengths:

      Overall, this is a nice study that provides some interesting knowledge on the function of the Commander complex.

      Weaknesses:

      Several issues should be addressed.

      (1) All existing data suggest that COMMD3 is a subunit of the Commander complex. Is there any evidence that COMMD3 can exist as a monomer?

      (2) In Figure 9, the authors emphasize COMMD3-dependent cargo and Commander-dependent cargo. Can the authors speculate what distinguishes these two types of cargo? Do they contain sequence-specific motifs?

      (3) What could be the possible mechanism underlying the observation that the knockout of COMMD3 results in larger early endosomes? How is the disruption of cargo retrieval related to the increase in endosome size?

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Squiers and colleagues uncovers a Commander-independent function for COMMD3 in endosomal recycling. The authors identified COMMD3 as a regulator of endosomal recycling for GLUT4-SPR through unbiased genetic screens. Subsequently, the authors performed COMMD3 knockout experiments to assess endosomal morphology and trafficking, demonstrating that COMMD3 regulates endosomal trafficking in a Commander-independent manner. Furthermore, the authors identified and confirmed that the N-terminal domain (NTD) of COMMD3 interacts with the GTPase Arf1. Using structure-guided mutations, they demonstrated that the COMMD3-Arf1 interaction is critical for the Commander-independent function of COMMD3.

      Overall, the manuscript presents compelling evidence for a Commander-independent role of COMMD3, and I agree with the authors' interpretations. The manuscript uses a combination of genetic screening, microscopy, and structural and biochemical approaches to examine and support the conclusions. This is an excellent and intriguing study and I have only a few comments and suggestions to improve the manuscript further.

    1. eLife Assessment

      This manuscript makes a valuable contribution to the field by uncovering a molecular mechanism for miRNA intracellular retention, mediated by the interaction of PCBP2, SYNCRIP, and specific miRNA motifs. These findings advance our understanding of RNA-binding protein-mediated miRNA sorting and provide deeper insights into miRNA dynamics. While the conclusions are supported by solid experimental evidence, additional controls and clarification of the precise intracellular interactions would further strengthen the study.

    2. Reviewer #1 (Public review):

      In this study, Marocco and colleagues perform a deep characterization of the complex molecular mechanism guiding the recognition of a particular CELLmotif previously identified in hepatocytes in another publication. Having miR-155-3p with or without this CELLmotif as the initial focus, the authors identify 21 proteins differentially binding to these two miRNA versions. From there, they decided to focus on PCBP2. They elegantly demonstrate PCBP2 binding to the miR-155-3p WT version but not to the CELLmotif-mutated version. miR-155-3p contains a hEXOmotif identified in a different report, whose recognition is largely mediated by another RNA-binding protein called SYNCRIP. Interestingly, mutation of the hEXOmotif contained in miR-155-3p blunted not only SYNCRIP binding but also PCBP2 binding, despite the maintenance of the CELLmotif. This indicates that SYNCRIP binding is somehow a prerequisite for PCBP2 binding. An EMSA confirms that SYNCRIP is necessary for PCBP2 binding to miR-155-3p, while PCBP2 is not needed for SYNCRIP binding. The authors aim to extend these findings to other miRNAs containing both motifs. For that, they perform small-RNA-Seq of EVs released from PCBP2-knockdown cells versus control cells, identifying a subset of miRNAs whose expression either increases or decreases. The assumption is that miRNAs containing the PCBP2-binding CELLmotif should now be less retained in the cell and be sorted more into extracellular vesicles, thus showing higher EV expression. The specific subset of miRNAs having both the CELLmotif and hEXOmotif (9 miRNAs) whose expression increases in EVs due to PCBP2 reduction is also affected by knocking down SYNCRIP, in the sense that reduction of SYNCRIP leads to lower EV sorting. Further experiments confirm that PCBP2 and SYNCRIP bind to these 9 miRNAs and that knocking down SYNCRIP impairs their EV sorting.

      While the process studied in this work is novel and interesting, there are several aspects of this manuscript that should be improved:

      (1) First of all, the nature of the CELLmotif and the hEXOmotif they are studying is extremely confusing. For the CELLmotif, the authors seem to focus on the Core CELLmotif AUU A/G in some experiments and the extended 7-nucleotide version in others. The fact that these CELLmotif and hEXOmotif are not shown anywhere in the figures (I mean with the full nucleotide variability described in the original publications) but only referred to in the text further complicates the identification of the motifs and the understanding of the experiments. Moreover, I am not convinced that the sequences they highlight in grey correspond to the original CELLmotif in all cases. For instance, in the miR-155-3p sequence, GCAUU is highlighted in grey. However, the original CELLmotif is basically 7 nucleotides long: C, A/U, G/A/C, U, U/A, C/G/A, A/U/C or CAGUUCA in its more abundant version. I can only see clearly the presence of the Core CELLmotif AUUA in miR-155-3p; however, the last A is not highlighted in grey. It is true that there is some nucleotide variability in each position in the originally reported CELLmotif by the authors in ref. 5 and the hEXOmotifs in ref. 7; however, not all nucleotides are equally likely to be found in each position. This fact seems not to have been taken into account by the authors, as they took basically any sequence of any length and almost any sequence combination as a valid CELLmotif. This means that I cannot identify the CELLmotif in many cases among the ones they highlight in grey. Instead, they should really focus on the most predominant CELLmotif sequence or, alternatively, take a reduced subset of "more abundant" CELLmotif versions from the ones that could fit in the originally described CELLmotif. Altogether, the authors need to explain much better what they have considered as the CELLmotif, what is the Core CELLmotif and what is the hEXOmotif in each case, and restrict themselves to the most likely versions of the CELLmotif and hEXOmotif.

      (2) Validation of EV isolation method: first, a large part of Supplementary Figure 2 is not readable. EV markers seem to be enriched in EV isolates; however, more EV and cell markers should be assayed to fulfill MISEV guidelines.

      (3) A key variable is missing in Supplementary Figure 2, which is whether PCBP2 or SYNCRIP knockdowns alter EV secretion rates. A quantification of the number of vesicles released per cell upon knocking down each of these factors would be essential to rule out that any of the effects seen throughout the paper are due to reduced or enhanced EV production rather than miRNA sorting/retention.

      (4) The EMSA experiment is important to support their claims. Given the weak bands that are shown, the authors need to show all their replicates to convince the readers that it is reproducible.

      (5) Although the bindings of SYNCRIP and PCBP2 to miR-155-3p and other miRNAs having both hEXOmotif and CELLmotif seem clear, the need for SYNCRIP binding to allow for PCBP2-mediated cellular retention is counterintuitive. What happens to those miRNAs that only contain a CELLmotif in terms of cellular retention and SYNCRIP dependence for cellular retention? In this regard, a representative miRNA (miR-31-3p) is analyzed in several experiments, showing that PCBP2 does not bind to it unless a hEXOmotif is introduced (Figure 3). However, this type of experiment should definitely be extended to other miRNAs containing only CELLmotif without hEXOmotif.

      (6) Along the same line, I am missing another important experiment: the artificial incorporation of CELLmotif. For example, miR-365-2-5p lacks a CELLmotif but has a hEXOmotif. Does PCBP2 bind to this miRNA upon incorporation of CELLmotif? Does this lead now to enhanced cellular retention of this miRNA?

      (7) What would be the net effect of knocking down both SYNCRIP and PCBP2 at the same time? Would this neutralize each other's effect or would the lack of one impose on the other? That could help in understanding the complex interplay between these two factors for mediating cellular retention and EV sorting.

      (8) The authors have here a great opportunity to shed some light on an unclear aspect of miRNA EV sorting and cellular retention: whether the RBPs go together with the miRNA to the EVs or not. While the original paper describing the hEXOmotif found SYNCRIP in EVs, another publication (Jeppesen et al, Cell 2019; PMID: 30951670) later found this RBP to be very scarce in small EVs compared to cellular bodies or large EVs (Supplementary Tables 3 and 4 in that publication). Can the authors find SYNCRIP and PCBP2 in the EVs? Another important question would be the colocalization of these RBPs in the place where the miRNA selection is supposed to take place: in multivesicular bodies (MVB). Is there a colocalization of these RBPs with MVBs in the cell?

      (9) In Figure 4C, the authors state in the text that the CELLmotif and hEXOmotif are present in the extra-seed region; however, for miR-181d-5p and miR-122-3p this is not true, as their CELLmotifs fall within the seed sequence.

      (10) The authors need to describe how they calculate the EV/cell ratio in gene expression in some experiments (for instance, Figures 1H, 4D, etc). Did they use any housekeeping gene for EV RNA content, the same RNA load, or some other alternative method to normalize EV vs cell RNA content?

      (11) I would suggest that the authors speculate a bit in the discussion section on how the interaction between PCBP2 and SYNCRIP takes place. Do they contain any potential interacting domain? Would the binding of one to the miRNA impose a topological interference on the binding of the other?

    3. Reviewer #2 (Public review):

      Summary:

      The authors of this manuscript aimed to uncover the mechanisms behind miRNA retention within cells. They identified PCBP2 as a crucial factor in this process, revealing a novel role for RNA-binding proteins. Additionally, the study discovered that SYNCRIP is essential for PCBP2's function, demonstrating the cooperative interaction between these two proteins. This research not only sheds light on the intricate dynamics of miRNA retention but also emphasizes the importance of protein interactions in regulating miRNA behavior within cells.

      Strengths:

      This paper makes important progress in understanding how miRNAs are kept inside cells. It identifies PCBP2 as a key player in this process, showing a new role for proteins that bind RNA. The study also finds that SYNCRIP is needed for PCBP2 to work, highlighting how these proteins work together. These discoveries not only improve our knowledge of miRNA behavior but also suggest new ways to develop treatments by controlling miRNA locations to influence cell communication in diseases. The use of liver cell models and thorough experiments ensures the results are reliable and show their potential for RNA-based therapies.

      Weaknesses:

      Despite its strengths, the manuscript has several notable limitations. The study's exclusive focus on hepatocytes limits the applicability of the findings to other cell types and physiological contexts. While the interaction between PCBP2 and SYNCRIP is well-characterized, the manuscript lacks detailed insights into the structural basis of this interaction and the dynamic regulation of their binding. The generalization of the findings to a broader spectrum of miRNAs and RNA-binding proteins (RBPs) remains underexplored, leaving gaps in understanding the full scope of miRNA compartmentalization.

      Furthermore, the therapeutic implications of these findings, though promising, are not directly connected to specific disease models or clinical scenarios, reducing their immediate translational impact. The manuscript would also benefit from a deeper discussion of potential upstream regulators of PCBP2 and SYNCRIP and the influence of cellular or environmental factors on their activity. Additionally, it is important to note that SYNCRIP has already been recognized as a major regulator of miRNA loading in extracellular vesicles (EVs). However, the purity of EVs is a concern, as the authors only performed crude extraction methods without further purification using an iodixanol density gradient. The study also lacks in vivo evidence of PCBP2's role in exosomal miRNA export.

    1. eLife Assessment

      This important study explores the conserved role of IgM in both systemic and mucosal antiviral immunity in teleosts, challenging established views on the differential roles of IgT and IgM. The findings have theoretical and practical implications for immunology and aquaculture. However, the strength of the evidence is incomplete due to insufficient validation of the monoclonal antibodies used to deplete IgM, which limits confidence in the main claims. Addressing these methodological weaknesses would significantly enhance the study's impact.

    2. Joint Public Review:

      In this manuscript, Weiguang Kong et al. investigate the role of immunoglobulin M (IgM) in antiviral defense in the teleost largemouth bass (Micropterus salmoides). The study employs an IgM depletion model, viral infection experiments, and complementary in vitro assays to explore the role of IgM in systemic and mucosal immunity. The authors conclude that IgM is crucial for both systemic and mucosal antiviral defense, highlighting its role in viral neutralization through direct interactions with viral particles. The study's findings have theoretical implications for understanding immunoglobulin function across vertebrates and practical relevance for aquaculture immunology.

      Strengths:

      The manuscript applies multiple complementary approaches, including IgM depletion, viral infection models, and histological and gene expression analyses, to address an important immunological question. The study challenges established views that IgT is primarily responsible for mucosal immunity, presenting evidence for a dual role of IgM at both systemic and mucosal levels. If validated, the findings have evolutionary significance, suggesting the conserved role of IgM as an antiviral effector across jawed vertebrates for over 500 million years. The practical implications for vaccine strategies targeting mucosal immunity in fish are noteworthy, addressing a key challenge in aquaculture.

      Weaknesses:

      Several conceptual and technical issues undermine the strength of the evidence:

      Monoclonal Antibody (MoAb) Validation: The study relies heavily on a monoclonal antibody to deplete IgM, but its specificity and functionality are not adequately validated. The epitope recognized by the antibody is not identified, and there is no evidence excluding cross-reactivity with other isotypes. Mass spectrometry, immunoprecipitation, or Western blot analysis using tissue lysates with varying immunoglobulin expression levels would strengthen the claim of IgM-specific depletion.

      IgM Depletion Kinetics: The rapid depletion of IgM from serum and mucus (within one day) is unexpected and inconsistent with prior literature. Additional evidence, such as Western blot analyses comparing treated and control fish, is necessary to confirm this finding.

      Novelty of Claims: The manuscript claims a novel role for IgM in viral neutralization, despite extensive prior literature demonstrating this role in fish. This overstatement detracts from the contribution of the study and requires a more accurate contextualization of the findings.

      Support for IgM's Crucial Role: The mortality data following IgM depletion do not fully support the claim that IgM is indispensable for antiviral defense. The survival of IgM-depleted fish remains high (75%) compared to non-primed controls (~50%), suggesting that other immune components may compensate for IgM loss.

      Presentation of IgM Depletion Model: The study describes the IgM depletion model as novel, although similar models have been previously published (e.g., Ding et al., 2023). This should be clarified to avoid overstating its novelty.

      While the manuscript attempts to address an important question in teleost immunology, the current evidence is insufficient to fully support the authors' conclusions. Addressing the validation of the monoclonal antibody, re-evaluating depletion kinetics, and tempering claims of novelty would strengthen the study's impact. The findings, if rigorously validated, have important implications for understanding the evolution of vertebrate immunity and practical applications in fish health management.

      This work is of interest to immunologists, evolutionary biologists, and aquaculture researchers. The methodological framework, once validated, could be valuable for studying immunoglobulin function in other non-model organisms and for developing targeted vaccine strategies. However, the current weaknesses limit its broader applicability and impact.

    1. eLife Assessment

      This study reveals that PRMT1 overexpression drives tumorigenesis of acute megakaryocytic leukemia (AMKL) and that targeting PRMT1 is a viable approach for treating AMKL. While the evidence, based largely on one cell line, is convincing, further validation in additional experimental settings will solidify the conclusion. These findings have important implications for the future treatment of AMKL with PRMT1 overexpression.

    2. Reviewer #1 (Public review):

      Summary:

      PRMT1 overexpression is linked to poor survival in cancers, including acute megakaryocytic leukemia (AMKL). This manuscript describes the important role of PRMT1 in metabolic reprogramming in AMKL. In a PRMT1-driven AMKL model, only cells with high PRMT1 expression induced leukemia, which was effectively treated with the PRMT1 inhibitor MS023. PRMT1 increased glycolysis, leading to elevated glucose consumption, lactic acid accumulation, and lipid buildup while downregulating CPT1A, a key regulator of fatty acid oxidation. Treatment with 2-deoxy-glucose (2-DG) delayed leukemia progression and induced cell differentiation, while CPT1A overexpression rescued cell proliferation under glucose deprivation. Thus, PRMT1 enhances AMKL cell proliferation by promoting glycolysis and suppressing fatty acid oxidation.

      Strengths:

      This study highlights the clinical relevance of PRMT1 overexpression with AMKL, identifying it as a promising therapeutic target. A key novel finding is the discovery that only AMKL cells with high PRMT1 expression drive leukemogenesis, and this PRMT1-driven leukemia can be effectively treated with the PRMT1 inhibitor MS023. The work provides significant metabolic insights, showing that PRMT1 enhances glycolysis, suppresses fatty acid oxidation, downregulates CPT1A, and promotes lipid accumulation, which collectively drive leukemia cell proliferation. The successful use of the glucose analogue 2-deoxy-glucose (2-DG) to delay AMKL progression and induce cell differentiation underscores the therapeutic potential of targeting PRMT1-related metabolic pathways. Furthermore, the rescue experiment with ectopic Cpt1a expression strengthens the mechanistic link between PRMT1 and metabolic reprogramming. The study employs robust methodologies, including Seahorse analysis, metabolomics, FACS analysis, and in vivo transplantation models, providing comprehensive and well-supported findings. Overall, this work not only deepens our understanding of PRMT1's role in leukemia progression but also opens new avenues for targeting metabolic pathways in cancer therapy.

      Weaknesses:

      This study, while significant, has some limitations.

      (1) The findings rely heavily on a single AMKL cell line, with no validation in patient-derived samples or even in another type of leukemia line to confirm clinical relevance. Adding a discussion of PRMT1's role in other leukemia types would increase the impact of this work.

      (2) The observed heterogeneity in Prmt1 expression is noted but not further investigated, leaving gaps in understanding its broader implications.

      (3) Some figures and figure legends lack important details or contain mismatched information. For example:
      • Figure 2D, E, F, I (wrong label with D): p-values are not shown. The legend for panel I is missing.
      • Figure 6E, F: p-values are not shown.
      • Lines 272-278: the figures cited should be Figures 7D-F.

      (4) Some wording is not accurate, such as line 80, "the elevated level of PRMT1 maintains the leukemic stem cells": the study uses a cell line, not leukemic stem cells.

      (5) In the disease model, histopathology of blood, spleen, and BM should be shown.

      (6) Can MS023 treatment reverse the metabolic changes in PRMT1-overexpressing AMKL cells?

      (7) It would be helpful if a summary graph were provided at the end of the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript explores the role of PRMT1 in AMKL, highlighting its overexpression as a driver of metabolic reprogramming. PRMT1 overexpression enhances the glycolytic phenotype and extracellular acidification by increasing lactate production in AMKL cells. Treatment with the PRMT1 inhibitor MS023 significantly reduces AMKL cell viability and improves survival in tumor-bearing mice. Intriguingly, PRMT1 overexpression also increases mitochondrial number and mtDNA content. High PRMT1-expressing cells demonstrate the ability to utilize alternative energy sources dependent on mitochondrial energetics, in contrast to parental cells with lower PRMT1 levels.

      Strengths:

      This is a conceptually novel and important finding as PRMT1 has never been shown to enhance glycolysis in AMKL, and provides a novel point of therapeutic intervention for AMKL.

      Weaknesses:

      (1) The manuscript lacks detailed molecular mechanisms underlying PRMT1 overexpression, particularly its role in enhancing survival and metabolic reprogramming via upregulated glycolysis and diminished oxidative phosphorylation (OxPhos). The findings primarily report phenomena without exploring the reasons behind these changes.

      (2) The article shows that PRMT1 overexpression leads to augmented glycolysis and low reliance on OxPhos. However, the manuscript also shows that PRMT1 overexpression leads to increased mitochondrial number and mitochondrial DNA content and an elevated NADPH/NAD+ ratio. Further, these overexpressing cells are better able to survive on alternative energy sources in the absence of glucose than low PRMT1-expressing parental cells. Surprisingly, the Seahorse assay in PRMT1-overexpressing cells showed no further enhancement in the ECAR after adding the mitochondrial uncoupler FCCP, indicating truncated mitochondrial energetics. These results are contradictory and need a more detailed explanation in the discussion.

      (3) How was disease penetrance established following the 6133/PRMT1 transplant before MS023 treatment?

      (4) The 6133/PRMT1 cells show elevated glycolysis compared to parental 6133 cells; why did the authors choose the 6133 cells for the MS023 treatment and ECAR assay (Fig. 3b)? The OCR measurements after inhibitor treatment in 6133 cells are similarly confusing; the figure legend and the description in the results section are inconsistent.

      (5) The discussion is too brief and incoherent and does not adequately address key findings. A comprehensive rewrite is necessary to improve coherence and depth.

      (6) The materials and methods section lacks a description of statistical analysis, and significance is not indicated in several figures (e.g., Figures 1C, D, F; Figures 2D, E, F, I). Statistical significance must be consistently indicated. The methods section requires more detailed descriptions to enable replication of the study's findings.

      (7) Figures are hazy and unclear. They should be replaced with high-resolution images, ensuring legible text and data.

      (8) Correct the labeling in Figure 2I by removing the redundant "D."

    1. eLife Assessment

      This useful study reports analyses of Neuropixels recordings in the medial prefrontal cortex and hippocampus of rats in a spatial navigation task, focusing on classifying prefrontal neurons based on SWR modulation and anatomical location. However, the evidence for claims of a clear link between SWR modulation and neuronal encoding, and the evidence for anatomical organization, is currently incomplete. Further analyses might strengthen the evidence for some conclusions, and some of the strong claims of the paper should likely be moderated.

    2. Reviewer #1 (Public review):

      Summary:

      The authors used high-density probe recordings in the medial prefrontal cortex (PFC) and hippocampus during a rodent spatial memory task to examine functional sub-populations of PFC neurons that are modulated vs. unmodulated by hippocampal sharp-wave ripples (SWRs), an important physiological biomarker that is thought to have a role in mediating information transfer across hippocampal-cortical networks for memory processes. SWRs are associated with the reactivation of representations of previous experiences, and associated reactivation in hippocampal and cortical regions has been proposed to have a role in memory formation, retrieval, planning, and memory-guided behavior. This study focuses on awake SWRs that are prevalent during immobility periods during pauses in behavior. Previous studies have reported strong modulation of a subset of prefrontal neurons during hippocampal SWRs, with some studies reporting prefrontal reactivation during SWRs that have a role in spatial memory processes. The study seeks to extend these findings by examining the activity of SWR-modulated vs. unmodulated neurons across PFC sub-regions, and whether there is a functional distinction between these two kinds of neuronal populations with respect to representing spatial information and supporting memory-guided decision-making.

      Strengths:

      The major strength of the study is the use of Neuropixels 1.0 probes to monitor activity throughout the dorsal-ventral extent of the rodent medial prefrontal cortex, permitting an investigation of functional distinction in neuronal populations across PFC sub-regions. They are able to show that SWR-unmodulated neurons, in addition to having stronger spatial tuning than SWR-modulated neurons as previously reported, also show stronger directional selectivity and theta-cycle skipping properties.

      Weaknesses:

      (1) While the study is able to extend previous findings that SWR-modulated PFC neurons have significantly lower spatial tuning than SWR-unmodulated neurons, the evidence presented does not support the main conclusion of the paper that only the unmodulated neurons are involved in spatial tuning and signaling upcoming choice, implying that SWR-modulated neurons are not involved in predicting upcoming choice, as stated in the abstract. This conclusion makes a categorical distinction between two neuronal populations, namely that SWR-unmodulated neurons are involved and SWR-modulated neurons are not involved in predicting upcoming choice, which requires evidence that clearly shows this absolute distinction. However, in the analyses showing non-local population decoding in PFC for predicting upcoming choice, the results show that SWR-unmodulated neurons have higher firing rates than SWR-modulated neurons, which is not a categorical distinction. Higher firing rates do not imply that only SWR-unmodulated neurons are contributing to the non-local decoding. They may contribute more than SWR-modulated neurons, but there are no follow-up analyses to assess the contribution of the two sub-populations to non-local decoding.

      (2) Further, the results show that during non-local representations of the hippocampus of the upcoming options, SWR-excited PFC neurons were more active during hippocampal representations of the upcoming choice, and SWR-inhibited PFC neurons were less active during hippocampal representations of the alternative choice. This clearly suggests that SWR-modulated neurons are involved in signaling upcoming choice, at least during hippocampal non-local representations, which contradicts the main conclusion of the paper.

      (3) Similarly, one of the analyses shows that PFC non-local representations show no preference for hippocampal SWRs or hippocampal theta phase. However, the examples shown for non-local representations clearly show that these decodes occur prior to the start of the trajectory, or during running on the central zone or start arm. The period of immobility prior to running on the start arm will have a higher prevalence of SWRs, and the running period will have a higher prevalence of theta oscillations and theta sequences, so non-local decoded representations have to be subdivided according to these known local-field potential phenomena for this analysis, which was not done.

      (4) The primary phenomenon that the manuscript relies on is the modulation of PFC neurons by hippocampal SWRs, so it is necessary to perform the PFC population decoding analyses during SWRs (or examine non-local decoding that occurs specifically during SWRs), as reported in previous studies of PFC reactivation during SWRs, to see if there is any distinction between modulated and unmodulated neurons in this reactivation. Even in the case of independent PFC reactivation as reported by one study, this PFC reactivation was still reported to occur during hippocampal SWRs, therefore decoding during SWRs has to be examined. Similarly, the phenomenon of theta cycle skipping is related to theta sequence representations, so decoding during PFC and hippocampal theta sequences has to be examined before coming to any conclusions.

    3. Reviewer #2 (Public review):

      Summary:

      This work by den Bakker and Kloosterman contributes to the vast body of research exploring the dynamics governing the communication between the hippocampus (HPC) and the medial prefrontal cortex (mPFC) during spatial learning and navigation. Previous research showed that population activity of mPFC neurons is replayed during HPC sharp-wave ripple events (SWRs), which may therefore correspond to privileged windows for the transfer of learned navigation information from the HPC, where initial learning occurs, to the mPFC, which is thought to store this information long term. Indeed, it was also previously shown that the activity of mPFC neurons contains task-related information that can inform about the location of an animal in a maze, which can predict the animals' navigational choices. Here, the authors aim to show that the mPFC neurons that are modulated by HPC activity (SWRs and theta rhythms) are distinct from those "encoding" spatial information. This result could suggest that the integration of spatial information originating from the HPC within the mPFC may require the cooperation of separate sets of neurons.

      This observation may be useful to further extend our understanding of the dynamics regulating the exchange of information between the HPC and mPFC during learning. However, my understanding is that this finding is mainly based upon a negative result, which cannot be statistically proven by the failure to reject the null hypothesis. Moreover, in my reading, the rest of the paper mainly replicates phenomena that have already been described, with the original reports not correctly cited. My opinion is that the novel elements should be precisely identified and discussed, while the current phrasing in the manuscript, in most cases, leads readers to think that these results are new. Detailed comments are provided below.

      Major concerns:

      (1) The main claim of the manuscript is that the neurons involved in predicting upcoming choices are not the neurons modulated by the HPC. This is based upon the evidence provided in Figure 5, which is a negative result that the authors employ to claim that predictive non-local representations in the mPFC are not linked to hippocampal SWRs and theta phase. However, it is important to remember that in a statistical test, the failure to reject the null hypothesis does not prove that the null hypothesis is true. Since this claim is so central in this work, the authors should use appropriate statistics to demonstrate that the null hypothesis is true. This can be accomplished by showing that there is no effect above some size that is so small that it would make the effect meaningless (see https://doi.org/10.1177/070674370304801108).

      (2) The main claim of the work is also based on Figure 3, where the authors show that SWR-unmodulated mPFC neurons have higher spatial tuning and higher directional selectivity scores, and that a higher percentage of these neurons show theta skipping. This is used to support the claim that SWR-unmodulated cells encode spatial information. However, it must be noted that in this kind of task, it is not possible to disentangle spatial coding from task variables that involve separate cognitive processes, such as decision-making, attention, and motor control, which always occur at specific locations in the maze. Therefore, the results shown in Figure 3 may relate to other specific processes rather than the encoding of space, and it cannot be unequivocally claimed that mPFC neurons "encode spatial information". This limitation is presented by Mashoori et al (2018), an article that appears to be a major inspiration for this work. Can the authors provide a control analysis/experiment that supports their claim? Otherwise, this claim should be tempered. Also, the authors say that Jadhav et al. (2016) showed that mPFC neurons unmodulated by SWRs are less tuned to space. How do they reconcile it with their results?

      (3) My reading is that the rest of the paper mainly consists of replications or incremental observations of already known phenomena with some not necessarily surprising new observations:

      a) Figure 2 shows that a subset of mPFC neurons is modulated by HPC SWRs and theta (already known), that vmPFC neurons are more strongly modulated by SWRs (not surprising given anatomy), and that theta phase preference is different between vmPFC and dmPFC (not surprising given the fact that theta is a travelling wave).

      b) Figure 4 shows that non-local representations in mPFC are predictive of the animal's choice. This is mostly an increment to the work of Mashoori et al (2018). My understanding is that, in addition to what had already been shown by Mashoori et al, here it is shown how the upcoming choice can be predicted. The authors may want to emphasize this novel aspect.

      c) Figure 6 shows that prospective activity in the HPC is linked to SWRs and theta oscillations. This has been described in various forms since at least the works of Johnson and Redish in 2007, Pastalkova et al 2008, and Dragoi and Tonegawa (2011 and 2013), as well as in earlier literature on splitter cells. These foundational papers on this topic are not even cited in the current manuscript.

      Although some previous work is cited, the current narrative of the results section may lead the reader to think that these results are new, which I think is unfair. Previous evidence of the same phenomena should be cited all along the results and what is new and/or different from previous results should be clearly stated and discussed. Pure replications of previous works may actually just be supplementary figures. It is not fair that the titles of paragraphs and main figures correspond to notions that are well established in the literature (e.g., Figure 2, 2nd paragraph of results, etc.).

      d) My opinion is that, overall, the paper gives the impression of being somewhat rushed and lacking attention to detail. Many figure panels are difficult to understand due to incomplete legends and visualizations with tiny, indistinguishable details. Moreover, some previous works are not correctly cited. I tried to make a list of everything I spotted below.

    4. Author response:

      We thank the reviewers for their thoughtful feedback. Below we provide an initial response to the central concerns that they have raised. In general, as part of our revisions, we plan to perform additional analyses to strengthen our conclusions, tone down more speculative interpretations, and clarify the novel contributions of our work. A full, point-by-point reply will follow alongside the revised manuscript.

      Briefly, the reviewers’ central concerns are that some of the conclusions are not sufficiently supported by the experimental evidence, specifically (1) the involvement of sharp-wave ripple (SWR)-unmodulated PFC neurons in signaling upcoming choice and (2) the absence of SWR time-locking of PFC non-local representations. They further suggest that (3) the spatial tuning in the PFC may reflect other cognitive processes rather than encoding spatial information; and (4) the manuscript is ambiguous as to which results are novel or corroborating previous work.

      (1) SWR-unmodulated PFC neurons signaling upcoming choice

      Reviewer 1 suggests that our finding that SWR-modulated neurons relate to hippocampal non-local representations contradicts the manuscript’s main conclusion. However, in our view, there is no contradiction and the finding highlights the distinction between the two sub-populations, namely the SWR-modulated neurons linked to hippocampal non-local representations, and the SWR-unmodulated neurons that are more active during prefrontal non-local representations.

      We do agree with the reviewer that the observation of higher firing rates of SWR-unmodulated neurons in the expression of non-local representations does not mean that these neurons are the sole or even main contributors to the non-local decoding. To address both comments, we will perform additional analyses to further disentangle the contributions of SWR-modulated and SWR-unmodulated PFC neurons to the non-local representations of upcoming choice.

      (2) Time-locking of PFC non-local representations to hippocampal SWRs

      Reviewer 1 comments that in the analysis of time-locking to hippocampal SWRs and theta phase, the behavior of the animals needs to be taken into account (i.e., immobility or running). We confirm that this was indeed done in our analysis and we will clarify this point in the revised manuscript.

      The reviewer further requested that PFC decoding during SWRs be performed at shorter timescales, as in previous studies. We would like to point out that (1) we found no increase in non-local decoding in the PFC around SWR onset (see Fig 5a), and (2) most of the non-local representations in the PFC occurred during the expression of local representations in the hippocampus (see Fig 4d). These data suggest that the non-local representations in the two brain regions are expressed independently. To further strengthen this idea, we plan to (1) include the result of decoding PFC activity during SWRs at fine timescales, as the reviewer suggested, and (2) look at the firing rates of PFC neurons during non-local representations exclusively when the hippocampus is encoding the actual (local) position.

      Following a suggestion by reviewer 2, we will also add a statistical assessment of how strongly the data supports the absence of time-locking.
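
      One common way to quantify support for the absence of an effect is an equivalence test such as TOST (two one-sided tests). The sketch below is only illustrative and is not the analysis the authors describe: it assumes two independent samples, uses a large-sample normal approximation instead of a t distribution, and the function name `tost_two_sample` and the equivalence `margin` are hypothetical choices made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_two_sample(a, b, margin, alpha=0.05):
    """Large-sample TOST: is mean(a) - mean(b) inside (-margin, +margin)?

    Assumption: samples are independent and large enough for a normal
    approximation. `margin` is the smallest effect considered meaningful
    and must be chosen before looking at the data.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    # One-sided test against H0: diff <= -margin
    p_lower = 1.0 - z.cdf((diff + margin) / se)
    # One-sided test against H0: diff >= +margin
    p_upper = z.cdf((diff - margin) / se)
    p_value = max(p_lower, p_upper)
    return diff, p_value, p_value < alpha


# Hypothetical data: two samples with identical means.
sample_a = [i % 5 for i in range(200)]
sample_b = [(i + 2) % 5 for i in range(200)]
diff, p_wide, equivalent_wide = tost_two_sample(sample_a, sample_b, margin=0.5)
_, p_narrow, equivalent_narrow = tost_two_sample(sample_a, sample_b, margin=0.1)
```

      With the same data, equivalence can be declared for a wide margin but not for a narrow one, which is why the margin has to be justified on scientific grounds rather than tuned after the fact.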

      (3) Spatial tuning in the mPFC

      Reviewer 2 points out that the spatial tuning in the prefrontal cortex may be related to cognitive processes (e.g., attention or decision-making) rather than spatial encoding. However, our results show that decoded mPFC activity reliably differentiates between the two start and goal arms (Fig 4a), rate maps show little evidence of mirroring (Fig 3a), and the activity predicts turns in the cue-based task during which goal arms switch pseudo-randomly (meaning that the non-local representations encode the North and South arm alternatingly and correctly, rather than encoding a general rewarded goal arm; Fig. 4b). While it is likely that mPFC encodes several task-related variables, our data suggest that it also encodes distinct locations.

      The reviewer further claims that the results of Jadhav et al. (2016) contradict our findings because they supposedly showed that mPFC neurons unmodulated by SWRs are less tuned to space. However, this is incorrect, as Jadhav et al. (2016) showed that SWR-unmodulated PFC neurons have lower spatial coverage and consequently are more spatially selective, which is consistent with our observations. We will rephrase this in the text to improve clarity.

      (4) Novelty

      We thank reviewer 2 for pointing out the significance of several novel findings in our work that deserve to be highlighted. This includes the dorsal-ventral profile of SWR-modulation and theta phase locking in the PFC and our observation that the neural representations in the PFC precede the behavioral switch in reversal learning. In our revised manuscript, we will rewrite the text to better emphasize our novel contributions, clearly distinguish new findings from confirmatory observations, and add missing citations where appropriate.

    1. eLife Assessment

      Using electrophysiological recordings in freely moving rats, this valuable study investigates the role of different gamma frequency bands in the development of spatial representations in the hippocampus. Solid evidence supports the idea that gamma-modulated neurons are crucial for generating specific neuronal sequences. These findings will be of interest to neuroscientists studying spatial navigation and neuronal dynamics.

    2. Reviewer #1 (Public review):

      This study presents evidence that a special group of place cells, those tuned to fast-gamma oscillations, play a key role in theta sequence development. How theta sequences are formed and developed during experience is an important question, because these sequences have been implicated in several cognitive functions of place cells, including memory-guided spatial navigation. The revised version of this paper has been significantly improved. Major concerns in the previous round of review on technical and conceptual aspects of the relationship between gamma oscillations and theta sequences are addressed. The main conclusion is supported by the data presented.

    3. Reviewer #2 (Public review):

      This manuscript addresses an important question which has not yet been solved in the field: what is the contribution of different gamma oscillatory inputs to the development of "theta sequences" in the hippocampal CA1 region? Theta sequences have received much attention due to their proposed roles in encoding short-term behavioral predictions, mediating synaptic plasticity, and guiding flexible decision making. Gamma oscillations in CA1 offer a readout of different inputs to this region and have been proposed to synchronize neuronal assemblies and modulate spike timing and temporal coding. However, the interactions between these two important phenomena have not been sufficiently investigated. The authors conducted place cell and local field potential (LFP) recordings in the CA1 region of rats running on a circular track. They then analyzed the phase locking of place cell spikes to slow and fast gamma rhythms, the evolution of theta sequences during behavior, and the interaction between these two phenomena. They found that place cells with the strongest modulation by fast gamma oscillations were the most important contributors to the early development of theta sequences and that they also displayed a faster form of phase precession within slow gamma cycles nested with theta. The results reported are interesting and support the main conclusions of the authors. However, the manuscript needs significant improvement in several aspects regarding data analysis, description of both experimental and analytical methods, and alternative interpretations, as I detail below.

      • The experimental paradigm and recordings should be explained at the beginning of the Results section. Right now, there is no description whatsoever, which makes it harder to understand the design of the study.

      • An important issue that needs to be addressed is the very small fraction of CA1 cells phase-locked to slow gamma rhythms (3.7%). This fraction is much lower than in many previous studies, which typically report it in the range of 20-50%. However, this discrepancy is not discussed by the authors. This needs to be explained and additional analysis considered. One analysis that I would suggest, although there are also other valid approaches, is to compute the phase locking with all LFP frequencies from 25-100 Hz, instead of just analyzing the phase locking in two discrete frequency bands. This will offer a more comprehensive and unbiased view of the gamma modulation of place cell firing. Alternative metrics to mean vector length that are less sensitive to firing rates, such as the pairwise phase consistency index (Vinck et al., Neuroimage, 2010), could be implemented. This may reveal whether the low fraction of phase-locked cells could be due to a low number of spikes entering the analysis.

      • From the methods, it is not clear to me whether the reference LFP channel was consistently selected to be different from the one where the analyzed spikes were recorded. This is the better practice to reduce the contribution of spike leakage, which could substantially inflate the coupling with faster gamma frequencies. These analyses need to be described in more detail.

      • The initial framework of the authors of classifying cells into fast gamma and not fast gamma modulated implies a bimodality that may be artificial. The authors should discuss the nuances and limitations of this framework. For example, several previous works have shown that the same place cell can couple to different gamma oscillations (e.g., Lastoczni et al., Neuron, 2016; Fernandez-Ruiz et al., Neuron, 2017; Sharif et al., Neuron, 2021).

      • It would be useful to provide a more thorough characterization of the physiological properties of FG and NFG cells, as this distinction is the basis of the paper. Only very little characterization of some place cell properties is provided in Figure 5. Important characteristics that should be very feasible to compare include average firing rate, burstiness, estimated location within the layer (i.e., deep vs superficial sublayers) and along the transverse axis (i.e., proximal vs distal), theta oscillation frequency, phase precession metrics (given their fundamental relationship with theta sequences), etc.

      • It is not clear to me how the analysis in Figure 6 was performed. In Fig. 6B I would think that the grey line should connect with the bottom white dot in the third panel, which would affect the interpretation of the results.

      Comments on revisions:

      The authors have conducted new analyses to address the issues that I and the other reviewers raised in our original review. As a result, the revised manuscript has been substantially improved.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Hippocampal place cells display a sequence of firing activities when the animal travels through a spatial trajectory at a behavioral time scale of seconds to tens of seconds. Interestingly, parts of the firing sequence also occur at a much shorter time scale: ~120 ms within individual cycles of theta oscillation. These so-called theta sequences were originally thought to naturally result from the phenomenon of theta phase precession. However, there is evidence that theta sequences do not always occur even when theta phase precession is present, for example, during the early experience of a novel maze. The question is then how they emerge with experience (theta sequence development). This study presents evidence that a special group of place cells, those tuned to fast-gamma oscillations, may play a key role in theta sequence development.

      The authors analyzed place cells, LFPs, and theta sequences as rats traveled a circular maze in repeated laps. They found that a group of place cells were significantly tuned to a particular phase of fast-gamma (FG-cells), in contrast to others that did not show such tuning (NFG-cells). The authors then omitted FG-cells, or the same number of NFG-cells, in their algorithm of theta sequence detection and found that the quality of theta sequences, quantified by a weighted correlation, was worse with the FG-cell omission than with the NFG-cell omission during later laps, but not during early laps. What made the FG-cells special for theta sequences? The authors found that FG-cells, but not NFG-cells, displayed phase precession to slow-gamma (25 - 45 Hz) oscillations (within theta cycles) during early laps (both FG- and NFG-cells showed slow-gamma phase precession during later laps). Overall, the authors conclude that FG-cells contribute to theta sequence development through slow-gamma phase precession during early laps.

      How theta sequences are formed and developed during experience is an important question, because these sequences have been implicated in several cognitive functions of place cells, including memory-guided spatial navigation. The identification of FG-cells in this study is straightforward. Evidence is also presented for the role of these cells in theta sequence development. However, given several concerns elaborated below, whether the evidence is sufficiently strong for the conclusion needs further clarification, perhaps, in future studies.

      We thank the reviewer for these positive comments.

      (1) The results in Figure 3 and Figure 8 seem contradictory. In Figure 8, all theta sequences displayed a seemingly significant weighted correlation (above 0) even in early laps, which was mostly due to FG-cell sequences but not NFG-cell sequences (correlation for NFG-sequences appeared below 0). However, in Figure 3H, omitting FG-cells and omitting NFG-cells did not produce significant differences in the correlation. Conversely, FG-cell and NFG-cell sequences were similar in later laps in Figure 8 (NFG-cell sequences appeared even better than FG-cell sequences), yet omitting NFG-cells produced a better correlation than omitting FG-cells. This confusion may be related to how "FG-cell-dominant sequences" were defined, which is unclear in the manuscript. Nevertheless, the different results are not easy to understand.

      We thank the reviewer for pointing out this important problem. The apparent contradiction can be explained by the different sets of sequences included in Fig. 3 and Fig. 8, as described below.

      (1) In Fig. 3, all sequences decoded without either FG or NFG cells were included, defined as exFG-sequences and exNFG-sequences, so sequence development could not be observed at the early phase and the weighted correlation was therefore low. (2) In Fig. 8, however, only sequences with either FG or NFG cells firing across at least 3 slow gamma cycles were included, defined as FG-cell sequences and NFG-cell sequences. This criterion allows the relationship between sequence development and slow gamma phase precession to be investigated, because these sequences were contributed by cells likely to show slow gamma phase precession. These definitions have been added to the “Theta sequences detection” section of the Methods (Lines 606-619).

      At the early phase, there was still no difference in weighted correlation between FG-cell sequences and NFG-cell sequences (Author response image 1A, Student’s t test, t(65)=0.2, p=0.8, Cohen's D=0.1), but the FG-cell sequences contained a high proportion of slow gamma phase precession (Fig. 8F). At the late phase, both FG-cell sequences and NFG-cell sequences exhibited slow gamma phase precession, and their weighted correlations were high, with no difference between them (Author response image 1B, Student’s t test, t(62)=-1.1, p=0.3, Cohen's D=0.3). This result further indicates that theta sequence development requires slow gamma phase precession, especially for FG cells during the early phase.

      Author response image 1.
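
      For readers unfamiliar with the sequence-quality metric discussed above, the following sketch shows one common definition of a weighted correlation computed on a decoded posterior matrix (time bins by position bins), where each bin is weighted by its decoded probability. This is an illustration under stated assumptions, not necessarily the authors' exact implementation; the function name `weighted_correlation` and the toy matrices are hypothetical.

```python
from math import sqrt


def weighted_correlation(posterior):
    """Correlation between time and position, weighted by decoded probability.

    posterior[t][x] is the decoded probability of position bin x in time
    bin t (assumed non-negative, not all zero, and spread over more than
    one time bin and more than one position bin).
    """
    total = sum(sum(row) for row in posterior)
    # Probability-weighted means of the time index and the position index
    mean_t = sum(p * t for t, row in enumerate(posterior) for p in row) / total
    mean_x = sum(p * x for row in posterior for x, p in enumerate(row)) / total
    cov_tt = cov_xx = cov_tx = 0.0
    for t, row in enumerate(posterior):
        for x, p in enumerate(row):
            cov_tt += p * (t - mean_t) ** 2
            cov_xx += p * (x - mean_x) ** 2
            cov_tx += p * (t - mean_t) * (x - mean_x)
    # The common 1/total factor cancels in the ratio
    return cov_tx / sqrt(cov_tt * cov_xx)


# Toy posteriors: a forward sweep, a reverse sweep, and no structure.
forward = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
reverse = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
r_forward = weighted_correlation(forward)
r_reverse = weighted_correlation(reverse)
r_flat = weighted_correlation(flat)
```

      A value near +1 or -1 indicates that decoded position sweeps consistently forward or backward within the window, while a value near 0 indicates no sequential structure.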

      (2) The different contributions between FG-cells and NFG-cells to theta sequences are supposed not to be caused by their different firing properties (Figure 5). However, Figure 5D and E showed a large effect size (Cohen's D = 0.7, 0.8), although not significant (P = 0.09, 0.06). But the seemingly non-significant P values could be simply due to smaller N's (~20). In other parts of the manuscript, the effect sizes were comparable or even smaller (e.g. D = 0.5 in Figure 7B), but interpreted as positive results: P values were significant with large N's (~480 in Fig. 7B). Drawing a conclusion purely based on a P value while N is large often renders the conclusion only statistical, with unclear physical meaning. Although this is common in neuroscience publications, it makes more sense to at least make multiple inferences using similar sample sizes in the same study.

      We thank the reviewer for this kind suggestion. We made multiple inferences using similar sample sizes wherever possible. In Fig. 7B, we performed the statistical analysis with sessions as samples and found that the result remained significant. These results have been added to the revised manuscript (Lines 269-270), and Fig. 7B has been replaced accordingly.
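
      The reviewer's point about effect size versus sample size can be made concrete with the identity that links Cohen's d to the two-sample Student's t statistic, t = d * sqrt(n1 * n2 / (n1 + n2)). The sketch below is illustrative only; the group sizes are hypothetical and are not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of two samples."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def t_from_d(d, n1, n2):
    """Student's t statistic implied by effect size d and the group sizes."""
    return d * sqrt(n1 * n2 / (n1 + n2))


# Check the identity on toy data against a directly computed t statistic.
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
d = cohens_d(a, b)
pooled_sd = sqrt((4 * variance(a) + 4 * variance(b)) / 8)
t_direct = (mean(a) - mean(b)) / (pooled_sd * sqrt(1 / 5 + 1 / 5))

# Hypothetical group sizes: a larger effect with few samples versus a
# smaller effect with many samples.
t_small_n = t_from_d(0.7, 10, 10)
t_large_n = t_from_d(0.5, 240, 240)
```

      Because t grows with the square root of the sample size, a smaller effect can yield a much larger test statistic when N is large, which is why comparing P values across analyses with very different N can mislead.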

      (3) In supplementary Figure 2 - S2, FG-cells displayed stronger theta phase precession than NFG-cells, which could be a major reason why FG-cells impacted theta sequences more than NFG cells. Although factors other than theta phase precession may contribute to or interfere with theta sequences, stronger theta phase precession itself (without the interference of other factors), by definition, can lead to stronger theta sequences.

      This is a very good point. The finding that FG-cells displayed stronger theta phase precession than NFG-cells is consistent with Guardamagna et al., 2023 Cell Rep, who found that the theta phase precession pattern emerged with strong fast gamma. Since slow gamma phase precession occurs within theta cycles, it is hard to assess the contribution of these factors to theta sequence development without taking theta phase precession into account. It should be noted, however, that theta sequences may fail to develop even when theta phase precession is present from the very beginning of exploration (Feng et al., 2015 J Neurosci). These findings suggest that theta phase precession, together with other factors, affects theta sequence development. However, the weight of each factor and their interactions still need further investigation. We have discussed this possibility in the Discussion section (Lines 361-373).

      (4) The slow-gamma phase precession of FG-cells during early laps is supposed to mediate or contribute to the emergence of theta sequences during late laps (Figure 1). The logic of this model is unclear. The slow-gamma phase precession was present in both early and late laps for FG-cells, but only present in late laps for NFG-cells. It seems more straightforward to hypothesize that the difference in theta sequences between early and later laps is due to the difference in slow-gamma phase precession of NFG cells between early and late laps. Although this is not necessarily the case, the argument presented in the manuscript is not easy to follow.

      We thank the reviewer for pointing this out. Slow gamma phase precession was first reported in my previous publication (Zheng et al., 2016 Neuron), which indicated a temporally compressed mode of coding spatial information related to memory retrieval. In this case, we would expect slow gamma phase precession to occur in all cells during late laps, because spatial information is retrieved once rats have become familiar with the environment. However, during early laps, when novel information is just being encoded, there would be a balance between fast gamma and slow gamma modulation of cells for the upcoming encoding-retrieval transition. One possibility is that FG-cells support this balance by receiving both fast gamma and slow gamma modulation, but with distinct phase-coding modes (fast gamma phase locking and slow gamma phase precession) in a temporally coordinated manner. We have discussed this possibility in the Discussion section (Lines 415-428).

      (5) There are several questions on the description of methods, which could be addressed to clarify or strengthen the conclusions.

      (i) Were the identified fast- and slow-gamma episodes mutually exclusive?

      Yes, the fast- and slow-gamma episodes are mutually exclusive. We have added descriptions in the “Detection of gamma episodes” section in the Methods part (Lines 538-550).

      (ii) Was the task novel when the data were acquired? How many days (from the 1st day of the task) were included in the analysis? When the development of the theta sequence was mentioned, did it mean the development in a novel environment, in a novel task, or purely in a sense of early laps (Lap 1, 2) on each day?

      We thank the reviewer for pointing this out. The task was not novel to the rats in this dataset, because only days with recording quality good enough for sequence decoding were included in this paper, which were about day 2 to day 10 for each rat. However, we still observed the process of sequence formation, because of the rats' exploratory interest during early laps. Thus, when the development of theta sequences is mentioned, it refers to early laps on each day.

      (iii) How were the animals' behavioral parameters equalized between early and later laps? For example, speed or head direction could potentially produce the differences in theta sequences.

      This is a very good point. In terms of the effect of running speed on theta sequences, we quantified the running speeds during theta sequences across trials 1-5. We found that the rats ran at a stable speed, as reported in Fig. 3F. In terms of the effect of head direction on theta sequences, we measured the angle difference between head direction and running direction. We found that the angle difference for each lap was distributed around 0, with no significant difference across laps (Fig. S3, Watson-Williams multi-sample test, F(4,55)=0.2, p=0.9, partial η² = 0.01). These results indicate that the differences in theta sequences across trials cannot be explained by variability in behavioral parameters. We have added these results and the corresponding methods to the revised manuscript (Lines 172-175, Lines 507-511, with a new Fig. S3).

      Reviewer #2 (Public Review):

      This manuscript addresses an important question that has not yet been solved in the field, what is the contribution of different gamma oscillatory inputs to the development of "theta sequences" in the hippocampal CA1 region? Theta sequences have received much attention due to their proposed roles in encoding short-term behavioral predictions, mediating synaptic plasticity, and guiding flexible decision-making. Gamma oscillations in CA1 offer a readout of different inputs to this region and have been proposed to synchronize neuronal assemblies and modulate spike timing and temporal coding. However, the interactions between these two important phenomena have not been sufficiently investigated. The authors conducted place cell and local field potential (LFP) recordings in the CA1 region of rats running on a circular track. They then analyzed the phase locking of place cell spikes to slow and fast gamma rhythms, the evolution of theta sequences during behavior, and the interaction between these two phenomena. They found that place cells with the strongest modulation by fast gamma oscillations were the most important contributors to the early development of theta sequences and that they also displayed a faster form of phase precession within slow gamma cycles nested with theta. The results reported are interesting and support the main conclusions of the authors. However, the manuscript needs significant improvement in several aspects regarding data analysis, description of both experimental and analytical methods, and alternative interpretations, as I detail below.

      • The experimental paradigm and recordings should be explained at the beginning of the Results section. Right now, there is no description whatsoever which makes it harder to understand the design of the study.

      We thank the reviewer for this kind suggestion.  The description of experimental paradigm and recordings has been added to the beginning of the results section (Lines 114-119).

      • An important issue that needs to be addressed is the very small fraction of CA1 cells phase-locked to slow gamma rhythms (3.7%). This fraction is much lower than in many previous studies, which typically report it in the range of 20-50%. However, this discrepancy is not discussed by the authors. This needs to be explained and additional analysis considered. One analysis that I would suggest, although there are also other valid approaches, is to compute the phase locking with all LFP frequencies from 25-100 Hz, instead of just analyzing the phase locking in two discrete frequency bands. This will offer a more comprehensive and unbiased view of the gamma modulation of place cell firing. Alternative metrics to mean vector length that are less sensitive to firing rates, such as the pairwise phase consistency index (Vinck et al., Neuroimage, 2010), could be implemented. This may reveal whether the low fraction of phase-locked cells could be due to a low number of spikes entering the analysis.

      We thank the reviewer for this constructive suggestion. A previous study, also in Long-Evans rats, showed that the proportion of slow gamma phase-locked cells was ~20% during novelty exploration but dropped to ~10% during familiar exploration (Fig. 4E, Kitanishi et al., 2015 Neuron). This suggests that the proportion of slow gamma phase-locked cells may decrease with familiarity with the environment, which is consistent with our data. In addition, we calculated the pairwise phase consistency (PPC) index to assess the effect of spike counts on the mean vector length (MVL). The trends of PPC (Author response image 2A) and MVL (Author response image 2B) across frequency bands were consistent across different subsets of cells, suggesting that the determination of cell subsets by the MVL metric was not biased by a low number of spikes. These results further shed light on the contribution of slow gamma phase precession of place cells to theta sequence development.

      Author response image 2.
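
      The two phase-locking metrics compared above can be sketched as follows. MVL is the length of the mean resultant vector of spike phases, and PPC (Vinck et al., Neuroimage, 2010) is the average cosine of the phase difference over all pairs of spikes. The sketch assumes spike phases are given in radians; the function names and the toy phase list are hypothetical. For N spikes the two are related by PPC = (N * MVL^2 - 1) / (N - 1), which shows how the spike count enters MVL but is corrected for in PPC.

```python
from itertools import combinations
from math import cos, hypot, pi, sin


def mean_vector_length(phases):
    """Length of the mean resultant vector of the spike phases (radians)."""
    n = len(phases)
    return hypot(sum(cos(p) for p in phases), sum(sin(p) for p in phases)) / n


def pairwise_phase_consistency(phases):
    """Average cosine of the phase difference over all distinct spike pairs."""
    n = len(phases)
    pair_sum = sum(cos(a - b) for a, b in combinations(phases, 2))
    return 2.0 * pair_sum / (n * (n - 1))


# Toy spike phases (radians), loosely clustered around 0.5 rad.
phases = [0.1, 0.4, 0.5, 0.7, 1.2, 5.9, 0.3, 2.0]
n = len(phases)
mvl = mean_vector_length(phases)
ppc = pairwise_phase_consistency(phases)
ppc_from_mvl = (n * mvl ** 2 - 1) / (n - 1)

# Two spikes at opposite phases: MVL is 0, PPC is -1.
mvl_opposite = mean_vector_length([0.0, pi])
ppc_opposite = pairwise_phase_consistency([0.0, pi])
```

      The pairwise sum is quadratic in the number of spikes, so for large spike trains the closed-form relation to MVL is the cheaper way to obtain the same value.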

      • From the methods, it is not clear to me whether the reference LFP channel was consistently selected to be different from the one where the analyzed spikes were recorded. This is the better practice to reduce the contribution of spike leakage, which could substantially inflate the coupling with faster gamma frequencies. These analyses need to be described in more detail.

      We thank the reviewer for pointing this out.  In the main manuscript, we used local LFPs, i.e., LFPs recorded from the same tetrode as the cells.  In addition, for each rat we selected an individual tetrode that was located in stratum pyramidale and at the center of the drive bundle.  Using LFPs from this tetrode, we detected a proportion of FG-cells similar to that obtained with local LFPs (Author response image 3A-B, Chi-squared test, χ<sup>2</sup>= 0.9, p=0.4, Cramér's V=0.03).  We further found that the PPC of FG- and NFG-cells differed in the fast gamma band when using central LFPs (Author response image 3D), consistent with the result obtained using local LFPs (Author response image 3C).  Therefore, these results suggest that the findings related to fast gamma were not due to the contribution of spike leakage in the local LFPs.  We have updated the description in the manuscript (Lines 553-557, 566-568).
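
      The comparison of proportions reported above can be illustrated with a minimal sketch of a chi-squared test and Cramér's V on a 2x2 table.  This is an assumption-laden illustration rather than the authors' analysis: the function name and the example counts are hypothetical, no continuity correction is applied, and the p-value formula is valid only for one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic, p-value and Cramér's V for [[a, b], [c, d]].

    No continuity correction is applied; the p-value uses the chi-squared
    survival function for one degree of freedom, erfc(sqrt(chi2 / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    if min(row_totals + col_totals) == 0:
        raise ValueError("every row and column needs at least one count")
    chi2 = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    cramers_v = math.sqrt(chi2 / n)  # min(rows, cols) - 1 equals 1 for a 2x2 table
    return chi2, p_value, cramers_v


# Hypothetical counts: [FG-cells, NFG-cells] with local vs. central LFPs.
print(chi_squared_2x2([[30, 70], [25, 75]]))
```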

      Author response image 3.

      • The initial framework of the authors of classifying cells into fast gamma and not fast gamma modulated implies a bimodality that may be artificial. The authors should discuss the nuances and limitations of this framework. For example, several previous studies have shown that the same place cell can couple to different gamma oscillations (e.g., Lasztóczi et al., Neuron, 2016; Fernandez-Ruiz et al., Neuron, 2017; Sharif et al., Neuron, 2021).

      We thank the reviewer for this kind suggestion.  We have cited these references and discussed the possibility of bimodal phase-locking in the manuscript (Lines 430-433).

      • It would be useful to provide a more thorough characterization of the physiological properties of FG and NFG cells, as this distinction is the basis of the paper. Only very little characterization of some place cell properties is provided in Figure 5. Important characteristics that should be very feasible to compare include average firing rate, burstiness, estimated location within the layer (i.e., deep vs superficial sublayers) and along the transverse axis (i.e., proximal vs distal), theta oscillation frequency, phase precession metrics (given their fundamental relationship with theta sequences), etc.

      We thank the reviewer for this constructive suggestion.  In addition to the characterizations shown in Fig. 5, we also analyzed firing rate, anatomical location and theta modulation to compare the physiological properties of FG- and NFG-cells.

      In terms of the firing properties of both types of cells, we found that the mean firing rate of FG-cells was higher than that of NFG-cells (Fig. 5A, Student's t-test, t(22) = 2.1, p = 0.04, Cohen's d = 0.9), which was consistent with the previous finding that the firing rate was higher during fast gamma than during slow gamma (Zheng et al., 2015 Hippocampus).  However, the spike counts of excluded FG- and NFG-cells for decoding were similar (Fig. 5B, Student's t-test, t(22) = 1.2, p = 0.3, Cohen's d = 0.5), suggesting that the differences found in theta sequences cannot be accounted for by different decoding quality related to spike counts.  In addition, we measured the burstiness based on the distribution of inter-spike intervals, and we found that the bursting probability of spikes was not significantly different between FG and NFG cells (Author response image 4A, Student's t-test, t(22) = 0.6, p=0.5, Cohen's d=0.3).

      In terms of theta modulation of cells, we first compared the theta frequency related to the firing of FG and NFG cells.  We detected the instantaneous theta frequency at each spike timing of FG and NFG cells, and found that it was not significantly different between cell types (Author response image 4B, Student's t-test, t(22) = -0.5, p=0.6, Cohen's d=0.2).  In addition, we found that the proportion of cells with significant theta phase precession was greater in FG-cells than in NFG-cells (Fig. S2E).  However, the slope and starting phase of theta phase precession were not significantly different between FG and NFG cells (Author response image 4C, Student's t-test, t(21) = 0.3, p=0.8, Cohen's d=0.1; Author response image 4D, Watson-Williams test, F(1,21)=0.5, p=0.5, partial η<sup>2</sup>=0.02).

      In terms of the anatomical location of FG and NFG cells, we identified tetrode traces in slices for each cell.  We found that both FG and NFG cells were recorded from the deep layer of dorsal CA1, with no difference in proportions between cell types (Author response image 4E, Chi-squared test, χ<sup>2</sup>=0.5, p=0.5, Cramér's V=0.05).  The distribution of FG-cells and NFG-cells along the transverse axis was also similar (Author response image 4F, χ<sup>2</sup>=0.08, p=0.8, Cramér's V=0.02).

      Author response image 4.

      • It is not clear to me how the analysis in Figure 6 was performed. In Figure 6B I would think that the grey line should connect with the bottom white dot in the third panel, which would be the interpretation of the results.

      We thank the reviewer for raising this good point.  The grey line was intended only as a visual guide, not as a quantitative analysis.  We have removed the grey lines from all heat maps in Fig. 6.

      Reviewer #3 (Public Review):

      [Editors' note: This review contains many criticisms that apply to the whole sub-field of slow/fast gamma oscillations in the hippocampus, as opposed to this particular paper. In the editors' view, these comments are beyond the scope of any single paper. However, they represent a view that, if true, should contextualise the interpretation of this paper and all papers in the sub-field. In doing so, they highlight an ongoing debate within the broader field.]

      Summary:

      The authors aimed to elucidate the role of dynamic gamma modulation in the development of hippocampal theta sequences, utilizing the traditional framework of "two gammas," a slow and a fast rhythm. This framework is currently being challenged, necessitating further analyses to establish and secure the assumed premises before substantiating the claims made in the present article.

      The results are too preliminary and need to integrate contemporary literature. New analyses are required to address these concerns. However, by addressing these issues, it may be possible to produce an impactful manuscript.

      We thank the reviewer for raising these important questions in the hippocampal gamma field.  We have performed a number of new analyses in response to the comments to strengthen our manuscript.

      I. Introduction

      Within the introduction, multiple broad assertions are conveyed that serve as the premise for the research. However, equally important citations that are not mentioned potentially contradict the ideas that serve as the foundation. Instances of these are described below:

      (1) Are there multiple gammas? The authors launched the study on the premise that two different gamma bands are communicated from CA3 and the entorhinal cortex. However, recent literature suggests otherwise, offering that the slow gamma component may be related to theta harmonics:

      From a review by Etter, Carmichael and Williams (2023)

      "Gamma-based coherence has been a prominent model for communication across the hippocampal-entorhinal circuit and has classically focused on slow and fast gamma oscillations originating in CA3 and medial entorhinal cortex, respectively. These two distinct gammas are then hypothesized to be integrated into hippocampal CA1 with theta oscillations on a cycle-to-cycle basis (Colgin et al., 2009; Schomburg et al., 2014). This would suggest that theta oscillations in CA1 could serve to partition temporal windows that enable the integration of inputs from these upstream regions using alternating gamma waves (Vinck et al., 2023). However, these models have largely been based on correlations between shifting CA3 and medial entorhinal cortex to CA1 coherence in theta and gamma bands. In vivo, excitatory inputs from the entorhinal cortex to the dentate gyrus are most coherent in the theta band, while gamma oscillations would be generated locally from presumed local inhibitory inputs (Pernía-Andrade and Jonas, 2014). This predominance of theta over gamma coherence has also been reported between hippocampal CA1 and the medial entorhinal cortex (Zhou et al., 2022). Another potential pitfall in the communication-through-coherence hypothesis is that theta oscillations harmonics could overlap with higher frequency bands (Czurkó et al., 1999; Terrazas et al., 2005), including slow gamma (Petersen and Buzsáki, 2020). The asymmetry of theta oscillations (Belluscio et al., 2012) can lead to harmonics that extend into the slow gamma range (Scheffer-Teixeira and Tort, 2016), which may lead to a misattribution as to the origin of slow-gamma coherence and the degree of spike modulation in the gamma range during movement (Zhou et al., 2019)."

      And from Benjamin Griffiths and Ole Jensen (2023)

      "That said, in both rodent and human studies, measurements of 'slow' gamma oscillations may be susceptible to distortion by theta harmonics [53], meaning open questions remain about what can be attributed to 'slow' gamma oscillations and what is attributable to theta."

      This second statement should be heavily considered as it is from one of the original authors who reported the existence of slow gamma.

      Yet another instance from Schomburg, Fernández-Ruiz, Mizuseki, Berényi, Anastassiou, Christof Koch, and Buzsáki (2014):

      "Note that modulation from 20-30 Hz may not be related to gamma activity but, instead, reflect timing relationships with non-sinusoidal features of theta waves (Belluscio et al., 2012) and/or the 3rd theta harmonic."

      One of this manuscript's authors is Fernández-Ruiz, a contemporary proponent of the multiple gamma theory. Thus, the modulation to slow gamma offered in the present manuscript may actually be related to theta harmonics.

      With the above emphasis from proponents of the slow/fast gamma theory on disambiguating harmonics from slow gamma, our first suggestion to the authors is that they A) address these statements (citing the work of these authors in their manuscript) and B) demonstrably quantify theta harmonics in relation to slow gamma prior to making assertions of phase relationships (methodological suggestions below). As the frequency of theta harmonics can extend as high as 56 Hz (PMID: 32297752), overlapping with the slow gamma range defined here (25-45 Hz), it will be important to establish an approach that decouples the two phenomena using an approach other than an arbitrary frequency boundary.

      We agree with the reviewer that theta harmonics could overlap with higher frequency bands, including slow gamma, as discussed in the reviews cited above.  To rule out the possibility of theta harmonic effects in this study, we added new analyses in this letter (see below).

      (2) Can gammas be segregated into different lamina of the hippocampus? This idea appears to be foundational in the premise of the research but is also undergoing revision.

      As discussed by Etter et al. above, the initial theory of gamma routing was launched on coherence values. However, the values reported by Colgin et al. (2009) lean more towards incoherence (a value of 0) rather than coherence (1), suggesting a weak to negligible interaction. Nevertheless, this theory is coupled with the idea that the different gamma frequencies are exclusive to the specific lamina of the hippocampus.

      Recently, Douchamps et al. (2024) suggested a broader, more nuanced understanding of gamma oscillations than previously thought, emphasizing their wide range and variability across hippocampal layers. This perspective challenges the traditional dichotomy of gamma sub-bands (e.g., slow vs. medium gamma) and their associated cognitive functions based on a more rigid classification according to frequency and phase relative to the theta rhythm. Moreover, they observed all frequencies across all layers.

      Similarly, the current source density plots from Belluscio et al. (2012) suggest that SG and FG can be observed in both the radiatum and lacunosum-moleculare.

      Therefore, if the initial coherence values are weak to negligible and both slow and fast gamma are observed in all layers of the hippocampus, can the different gammas be exclusively related to either anatomical inputs or psychological functions (as done in the present manuscript)? Do these observations challenge the authors' premise of their research? At the least, please discuss.

      We thank the reviewer for raising this point, which we believe remains controversial in this field.  We also thank the reviewer for providing detailed evidence on the forms in which gamma rhythms exist.  The reviewer was considering two aspects of gamma: 1) the validity of dividing slow and fast gamma by specific frequency bands; 2) the existence of gamma across all hippocampal layers, which challenges the functional significance of different types of gamma rhythms.  Although the results in Douchamps et al., 2024 challenged the idea of rigid gamma sub-bands, we could still see separate slow and fast gamma components occurring at distinct times, with the central frequency of slow gamma lower than ~60Hz and the central frequency of fast gamma higher than ~60Hz (Fig.1b of Douchamps et al., 2024).  This was also seen in the rat dataset of this reference (Fig. S3).  Since their behavioral test required both memory encoding and retrieval processes, it was hard to distinguish the roles of different gamma components, as they may dynamically coordinate during complex memory processes.  Thus, although the behavioral performance can be decoded from a broad range of gamma, we still cannot deny the existence of different gamma rhythms and their functional significance during different memory phases.

      (3) Do place cells, phase precession, and theta sequences require input from afferent regions? It is offered in the introduction that "Fast gamma (~65-100Hz), associated with the input from the medial entorhinal cortex, is thought to rapidly encode ongoing novel information in the context (Fernandez-Ruiz et al., 2021; Kemere, Carr, Karlsson, & Frank, 2013; Zheng et al., 2016)".

      Evidence that CA1 place fields remain fairly intact following MEC inactivation includes Ipshita Zutshi, Manuel Valero, Antonio Fernández-Ruiz, and György Buzsáki (2022) - "CA1 place cells and assemblies persist despite combined mEC and CA3 silencing" and Hadas E Sloin, Lidor Spivak, Amir Levi, Roni Gattegno, Shirly Someck, Eran Stark (2024) - "These findings are incompatible with precession models based on inheritance, dual-input, spreading activation, inhibition-excitation summation, or somato-dendritic competition. Thus, a precession generator resides locally within CA1."

      These publications, at the least, challenge the inheritance model by which the afferent input controls CA1 place field spike timing. The research premise offered by the authors is couched in the logic of inheritance, when the effect that the authors are observing could be governed by local intrinsic activity (e.g., phase precession and gamma are locally generated, and the attribution to routed input is perhaps erroneous). Certainly, it is worth discussing these manuscripts in the context of the present manuscript.

      We thank the reviewer for this discussion.  The main purpose of our current study is to investigate the mechanism of theta sequence development during learning, which may or may not depend on theta phase precession of single place cells, as this remains controversial in the field.  In addition, a limitation of this study is that all gamma components were recorded from stratum pyramidale; thus, we cannot draw any conclusion about the origin of the gamma that modulates sequence development.

      II. Results

      (1) Figure 2-

      a. There is a bit of a puzzle here that should be discussed. If slow and fast frequencies modulate 25% of neurons, how can these rhythms serve as mechanisms of communication/support psychological functions? For instance, if fast gamma is engaged in rapid encoding (line 72) and slow gamma is related to the integration processing of learned information (line 84), and these are functions of the hippocampus, then why do these rhythms modulate so few cells? Is this to say 75% of CA1 neurons do not listen to CA3 or MEC input?

      The proportion of ~25% refers to the place cells phase-locked to either slow or fast gamma.  However, one of the main findings in this study was that most cells were modulated by slow gamma, as they fired at precessing slow gamma phases within a theta cycle (Figs 6-8), which would promote information compression for theta sequence development.  Therefore, we didn’t mean that only a small proportion of cells were modulated by gamma rhythms and contributed to this process.

      b. Figure 2. It is hard to know if the mean vector lengths presented are large or small. Moreover, one can expect to find significance due to chance. For instance, it is challenging to find a frequency in which modulation strength is zero (please see Figure 4 of PMID: 30428340 or Figure 7 of PMID: 31324673).

      i. Please construct the histograms of Mean Vector Length as in the above papers, using 1 Hz filter steps from 1-120Hz and include it as part of Figure 2 (i.e., calculate the mean vector length for the filtered LFP in steps of 1-2 Hz, 2-3 Hz, 3-4 Hz,... etc). This should help the authors portray the amount of modulation these neurons have relative to the theta rhythm and other frequencies. If the theta mean vector length is higher, should it be considered the primary modulatory influence of these neurons (with slow and fast gammas as a minor influence)?

      We thank the reviewer for this suggestion.  We measured the mean vector length in 5Hz steps (equivalent to 1Hz steps), and we found that the FG-cells were phase-locked to fast gamma rhythms even more strongly than to theta (Author response image 2B, mean MVL of theta=0.126±0.007, mean MVL of fast gamma=0.175±0.006, paired t-test, t(112)=-5.9, p=0.01, Cohen's d=0.7).  In addition, in some previous studies with significant fast gamma phase locking, the MVL values were around 0.15 when using a broad gamma band (Kitanishi et al., 2015 Neuron, Lasztóczi et al., 2016 Neuron, Tomar et al., 2021 Front Behav Neurosci, and Asiminas et al., 2022 Molecular Autism), which was consistent with the value in this study.  Therefore, we don’t believe that fast gamma was only a minor influence on these neurons.

      ii. It is possible to infer a neuron's degree of oscillatory modulation without using the LFP. For instance, one can create an ISI histogram as done in Figure 1 here (https://www.biorxiv.org/content/10.1101/2021.09.20.461152v3.full.pdf+html; "Distinct ground state and activated state modes of firing in forebrain neurons"). The reciprocal of the ISI values would be "instantaneous spike frequency". In favor of the Douchamps et al. (2024) results, the figure of the bioRxiv paper implies that there is a single gamma frequency modulation, as there is only a single bump in the ISIs in the 10^-1.5 to 10^-2 range. Therefore, to vet the slow gamma results and the premise of two gammas offered in the introduction, it would be worth including this analysis as part of Figure 2.

      Using the suggested method, we calculated the ISI distribution on a log scale for FG-cells and NFG-cells during behavior (Author response image 5).  We observed that the ISI distribution of FG-cells had a bump in the 10<sup>-1.5</sup> to 10<sup>-2</sup> range (black bar), in particular in the fast gamma range (10<sup>-2</sup> to 10<sup>-1.8</sup>).
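
      The ISI analysis discussed in this exchange can be sketched as follows.  This is a minimal illustration under stated assumptions, not the authors' implementation: spike times are assumed to be in seconds, the bin width (10 bins per decade) and the function names are hypothetical, and the exponents are read as log10 of the ISI in seconds, so that 10^-2 s to 10^-1.8 s corresponds to instantaneous rates of roughly 100 Hz down to 63 Hz.

```python
import math


def _positive_isis(spike_times):
    """Inter-spike intervals (s) between consecutive spikes, dropping zeros."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:]) if b > a]
    if not isis:
        raise ValueError("need at least two distinct spike times")
    return isis


def log_isi_histogram(spike_times, lo=-3.0, hi=1.0, bins_per_decade=10):
    """Fraction of ISIs per bin of log10(ISI in seconds); returns (edges, fractions)."""
    isis = _positive_isis(spike_times)
    n_bins = int(round((hi - lo) * bins_per_decade))
    counts = [0] * n_bins
    for isi in isis:
        k = int(math.floor((math.log10(isi) - lo) * bins_per_decade))
        if 0 <= k < n_bins:  # ISIs outside [10^lo, 10^hi) are ignored
            counts[k] += 1
    edges = [lo + i / bins_per_decade for i in range(n_bins + 1)]
    return edges, [c / len(isis) for c in counts]


def fraction_in_band(spike_times, f_low, f_high):
    """Fraction of ISIs whose reciprocal (instantaneous rate, Hz) is in [f_low, f_high)."""
    isis = _positive_isis(spike_times)
    return sum(1 for isi in isis if f_low <= 1.0 / isi < f_high) / len(isis)
```

      A bump in the histogram between -2 and -1.8 would then correspond to a high value of `fraction_in_band(spike_times, 63, 100)`.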

      Author response image 5.

      c. There are some things generally concerning about Figure 2.

      i. First, the raw trace does not seem to have clear theta epochs (it is challenging to ascertain the start and end of a theta cycle). Certainly, it would be worth highlighting the relationship between theta and the gammas and picking a nice theta epoch.

      We thank the reviewer for this suggestion.  We've updated this figure with a nice theta epoch in the revised manuscript.

      ii. Also, in panel A, there looks to be a declining amplitude relationship between the raw, fast, and slow gamma traces, assuming that the scale bars represent 100uV in all three traces. The raw trace is significantly larger than the fast gamma. However, this relationship does not seem to be the case in panel B (in which both the raw and unfiltered examples of slow and fast gamma appear to be equal; the right panels of B suggest that fast gamma is larger than slow, appearing to contradict the A= 1/f organization of the power spectral density). Please explain as to why this occurs. Including the power spectral density (see below) should resolve some of this.

      We thank the reviewer for pointing this out.  The y-axis scales of the LFP traces in Fig. 2B were not consistent, which made the comparison of amplitude between slow and fast gamma misleading.  We have unified the y-axis scales across different gamma types in the revised manuscript.  Moreover, we have also replaced these examples with more typical ones (also see the response below).

      iii. Within the example of spiking to phase in the left side of Panel B (fast gamma example)- the neuron appears to fire near the trough twice, near the peak twice, and somewhere in between once. A similar relationship is observed for the slow gamma epoch. One would conclude from these plots that the interaction of the neuron with the two rhythms is the same. However, the mean vector lengths and histograms below these plots suggest a different story in which the neuron is modulated by FG but not SG. Please reconcile this.

      We thank the reviewer for pointing this out.  We found that the fast gamma phase locking was robust across FG-cells, with the fast gamma peak as the preferred phase.  Therefore, we have replaced these examples with more typical ones, so that the examples are consistent with the group effect.

      iv. For calculating the MVL, it seems that the number of spikes that the neuron fires would play a significant role. Working towards our next point, there may be a bias of finding a relationship if there are too few spikes (spurious clustering due to sparse data) and/or higher coupling values for higher firing rate cells (cells with higher firing rates will clearly show a relationship), forming a sort of inverse Yerkes-Dodson curve. Also, without understanding the magnitude of the MVL relative to other frequencies, it may be that these values are indeed larger than zero, but not biologically significant.

      - Please provide a scatter plot of Neuron MVL versus the Neuron's Firing Rate for 1) theta (7-9 Hz), 2) slow gamma, and 3) fast gamma, along with their line of best fit.

      - Please run a shuffle control where the LFP trace is shifted by random values between 125-1000ms and recalculate the MVL for theta, slow, and fast gamma. Often, these shuffle controls are done between 100-1000 times (see cross-correlation analyses of Fujisawa, Buzsaki et al.).

      - To establish that firing rate does not play a role in uncovering modulation, it would be worth conducting a spike number control, reducing the number of spikes per cell so that they are all equal before calculating the phase plots/MVL.
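
      The shuffle and spike-number controls requested in the list above can be sketched as follows.  This is a hedged illustration, not a description of what was done in the manuscript: it assumes an instantaneous phase series (radians) sampled at `fs` Hz and spike times given as sample indices, and all function names, the number of shuffles, and the seed are hypothetical.

```python
import cmath
import math
import random


def mvl(phases):
    """Mean vector length of a list of phases (radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def shift_shuffle_p(phase_series, spike_idx, fs, n_shuffles=500,
                    min_shift_s=0.125, max_shift_s=1.0, seed=0):
    """Observed MVL and its p-value against a circular-shift null distribution.

    Each shuffle circularly shifts the phase series by a random lag between
    min_shift_s and max_shift_s, which preserves the structure of both the
    spike train and the LFP while breaking their fine temporal alignment.
    """
    rng = random.Random(seed)
    n = len(phase_series)
    observed = mvl([phase_series[i] for i in spike_idx])
    lo = math.ceil(min_shift_s * fs)
    hi = math.floor(max_shift_s * fs)
    exceed = 0
    for _ in range(n_shuffles):
        lag = rng.randint(lo, hi)
        if mvl([phase_series[(i + lag) % n] for i in spike_idx]) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_shuffles + 1)


def subsample_spikes(spike_idx, n_keep, seed=0):
    """Spike-number control: randomly keep n_keep spikes from one cell."""
    return sorted(random.Random(seed).sample(list(spike_idx), n_keep))
```

      Calling `subsample_spikes` with the same `n_keep` for every cell before `shift_shuffle_p` equalizes spike counts across cells, which is the spirit of the third item in the list.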

      We thank the reviewer for raising this point.  Besides the MVL value, we also calculated the pairwise phase consistency (PPC) as suggested by Reviewer #2, which is not sensitive to spike counts.  We found that the phase locking strength to either rhythm (theta or gamma) was comparable between MVL and PPC measurements (Author response image 2).  Moreover, we quantified the relationship between MVL and mean firing rate, as suggested.  We found that the MVL values for theta, slow gamma and fast gamma were negatively correlated with mean firing rate (Author response image 6, Pearson correlation, theta: R<sup>2</sup>= 0.06, Pearson’s r=-0.3, p=1.3×10<sup>-8</sup>; slow gamma: R<sup>2</sup>= 0.1, Pearson’s r=-0.4, p=2.4×10<sup>-17</sup>; fast gamma: R<sup>2</sup>= 0.03, Pearson’s r=-0.2, p=4.3×10<sup>-5</sup>).  These results help us rule out concerns about the effect of spike counts on the phase modulation measurement.

      Author response image 6.

      (2) Something that I anticipated to see addressed in the manuscript was the study from Grosmark and Buzsaki (2016): "Cell assembly sequences during learning are "replayed" during hippocampal ripples and contribute to the consolidation of episodic memories. However, neuronal sequences may also reflect preexisting dynamics. We report that sequences of place-cell firing in a novel environment are formed from a combination of the contributions of a rigid, predominantly fast-firing subset of pyramidal neurons with low spatial specificity and limited change across sleep-experience-sleep and a slow-firing plastic subset. Slow-firing cells, rather than fast-firing cells, gained high place specificity during exploration, elevated their association with ripples, and showed increased bursting and temporal coactivation during postexperience sleep. Thus, slow- and fast-firing neurons, although forming a continuous distribution, have different coding and plastic properties."

      My concern is that much of the reported results in the present manuscript appear to recapitulate the observations of Grosmark and Buzsaki, but without accounting for differences in firing rate. A parsimonious alternative explanation for what is observed in the present manuscript is that high firing rate neurons, more integrated into the local network and orchestrating local gamma activity (PING), exhibit more coupling to theta and gamma. In this alternative perspective, it's not something special about how the neurons are entrained to the routed fast gamma, but that the higher firing rate neurons are better able to engage and entrain their local interneurons and, thus modulate local gamma. However, this interpretation challenges the discussion around the importance of fast gamma routed from the MEC.

      a. Please integrate the Grosmark & Buzsaki paper into the discussion.

      b. Also, please provide data that refutes or supports the alternative hypothesis in which the high firing rate cells are just more gamma modulated as they orchestrate local gamma activity through monosynaptic connections with local interneurons (e.g., Marshall et al., 2002, Hippocampal pyramidal cell-interneuron spike transmission is frequency dependent and responsible for place modulation of interneuron discharge). Otherwise, the attribution to an MEC-routed fast gamma seems tenuous.

      c. It is mentioned that fast-spiking interneurons were removed from the analysis. It would be worth including these cells, calculating the MVL in 1 Hz increments as well as the reciprocal of their ISIs (described above).

      We thank the reviewer for this suggestion.  Because we found that the mean firing rate of FG-cells was higher than that of NFG-cells, it is possible that the FG-cells largely overlap with the fast-firing cells (rigid cells) in Grosmark et al., 2016 Science.  However, in this study we aimed to investigate how fast and slow gamma rhythms modulated neurons dynamically during learning, rather than to define new cell types.  Thus, we don’t think this work was just a replication of the previous publication.  We have added this description in the Discussion (Lines 439-441).  In addition, we don’t have a sufficient number of interneurons to support the analysis between interneurons and place cells.  Therefore, we cannot make any statement in this study about where the fast gamma originated (locally in CA1 or routed from MEC).

      (3) Methods - Spectral decomposition and Theta Harmonics.

      a. It is challenging to interpret the exact parameters that the authors used for their multi-taper analysis in the methods (lines 516-526). Tallon-Baudry et al., (1997; Oscillatory γ-Band (30-70 Hz) Activity Induced by a Visual Search Task in Humans) discuss a time-frequency trade-off where frequency resolution changes with different temporal windows of analysis. This trade-off between time and frequency resolution is well known as the uncertainty principle of signal analysis, transcending all decomposition methods. It is not only a function of wavelet or FFT, and multi-tapers do not directly address this. (The multitaper method, by using multiple specially designed tapers -like the Slepian sequences- smooths the spectrum. This smoothing doesn't eliminate leakage but distributes its impact across multiple estimates). Given the brevity of methods and the issues of theta harmonics as offered above, it is worth including some benchmark trace testing for the multi-taper as part of the supplemental figures.

      i. Please spectrally decompose an asymmetric 8 Hz sawtooth wave showing the trace and the related power spectral density using the multiple taper method discussed in the methods.

      ii. Please also do the same for an elliptical oscillation (perfectly symmetrical waves, but also capable of casting harmonics). Matlab code on how to generate this time series is provided below:

      A = 1; % Amplitude

      T = 1/8; % Period corresponding to 8 Hz frequency

      omega = 2*pi/T; % Angular frequency

      C = 1; % Wave speed

      m = 0.9; % Modulus for the elliptic function (0<m<1 for cnoidal waves)

      x = linspace(0, 2*pi, 1000); % temporal domain

      t = 0; % Time instant

      % Calculate B based on frequency and speed

      B = sqrt(omega/C);

      % Cnoidal wave equation using the Jacobi elliptic function

      u = A .* ellipj(B.*(x - C*t), m).^2;

      % Plotting the cnoidal wave

      figure;

      plot(x./max(x), u);

      title('8 Hz Cnoidal Wave');

      xlabel('time (x)');

      ylabel('Wave amplitude (u)');

      grid on;

      The Symbolic Math Toolbox needs to be installed and accessible in your MATLAB environment to use ellipj. Otherwise, I trust that, rather than plotting a periodic orbit around a circle (sin wave) the authors can trace the movement around an ellipse with significant eccentricity (the distance between the two foci should be twice the distance between the co-vertices).

      We thank the reviewer for this suggestion.  In the main text of the manuscript, we only applied Morlet's wavelet method to calculate the time-varying power of rhythms.  The multitaper method was used for the estimation of power spectra across running speeds, which was shown in the manuscript.  Therefore, we removed the description of the multitaper method and updated the Morlet's wavelet power spectral analysis in the Methods (Lines 541-544).

      As suggested, we estimated the power spectral densities of the 8 Hz sawtooth and elliptical oscillations using these methods, and compared them with the results from the FFT.  We found that both the multitaper and Morlet wavelet methods captured the 8 Hz oscillatory component well (Author response image 7).  However, we observed harmonic components in the FFT spectrum.
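
      The benchmark discussed here can be illustrated with a minimal sketch that generates an asymmetric 8 Hz sawtooth and measures its amplitude at multiples of 8 Hz with a single-bin DFT.  This is not the authors' code: the 1000 Hz sampling rate, the 2 s duration, the 30% rise fraction, and the function names are assumptions chosen so that an integer number of cycles fits the window.

```python
import cmath
import math


def asymmetric_sawtooth(freq_hz, rise_fraction, fs, duration_s):
    """Wave in [-1, 1] that rises for rise_fraction of each cycle, then falls."""
    n = int(round(fs * duration_s))
    wave = []
    for k in range(n):
        cycle_pos = (k * freq_hz / fs) % 1.0
        if cycle_pos < rise_fraction:
            wave.append(-1.0 + 2.0 * cycle_pos / rise_fraction)
        else:
            wave.append(1.0 - 2.0 * (cycle_pos - rise_fraction) / (1.0 - rise_fraction))
    return wave


def amplitude_at(signal, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


fs = 1000  # assumed sampling rate (Hz)
saw = asymmetric_sawtooth(8, 0.3, fs, 2.0)
for harmonic in (8, 16, 24, 32, 40):
    print(harmonic, "Hz:", amplitude_at(saw, harmonic, fs))
```

      Under these assumptions the fourth and fifth harmonics (32 and 40 Hz) are nonzero and fall inside the 25-45 Hz slow gamma band defined in the manuscript, which is the overlap the reviewer is concerned about.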

      Author response image 7.

      iii. Line 522: "The power spectra across running speeds and absolute power spectrum (both results were not shown).". Given the potential complications of multi-taper discussed above, and as each convolution further removes one from the raw data, it would be the most transparent, simple, and straightforward to provide power spectra using the simple fft.m code in Matlab (We imagine that the authors will agree that the results should be robust against different spectral decomposition methods. Otherwise, it is concerning that the results depend on the algorithm implemented and should be discussed. If gamma transience is a concern, the authors should trigger to 2-second epochs in which slow/fast gamma exceeds 3-7 std. dev. above the mean, comparing those resulting power spectra to 2-second epochs with ripples - also a transient event). The time series should be at least 2 seconds in length (to avoid spectral leakage issues and the issues discussed in Tallon-Baudry et al., 1997 above).

      Please show the unmolested power spectra (Y-axis units in mV<sup>2</sup>/Hz, X-axis units as Hz) as a function of running speed (increments of 5 cm/s) for each animal. I imagine three of these PSDs for 3 of the animals will appear in supplemental methods while one will serve as a nice manuscript figure. With this plot, please highlight the regions that the authors are describing as theta, slow, and fast gamma. Also, any issues should be addressed should there be notable differences in power across animals or tetrodes (issues with locations along proximal-distal CA1 in terms of MEC/LEC input and using a local reference electrode are discussed below).

      As suggested, we first estimated the power spectra as a function of running speed in each running lap, and showed them separately for each rat, by using the multitaper spectral analysis (Author response image 8).  In addition, to achieve unmolested power spectra, the short-time Fourier transform (STFT) was used for this analysis at the same frequency resolution (Author response image 9).  We could see that the power spectra were consistent between these two methods.  Notably, there seems to be no significant theta harmonic component in the slow gamma band range.

      The multitaper spectral analysis was performed as follows.  The power spectra were measured across different running speeds as described previously (Ahmed et al., 2012 J Neurosci; Zheng et al., 2015 Hippocampus; Zheng et al., 2016 eNeuro).  Briefly, the absolute power spectrum was calculated for the LFP recordings in each lap with a 0.5s moving window and a 0.2s step size, using the multitaper spectral analysis in the Chronux toolbox (Mitra and Bokil, 2008, http://chronux.org/) and the STFT spectral analysis in the Matlab script stft.m.  In the multitaper method, the time-bandwidth product parameter (TW) was set at 3, and the number of tapers (K) was set at 5.  In the STFT method, the FFT length was set at 2048, which was equivalent to the parameters used in the multitaper method.  Running speed was calculated (see “Estimation of running speed and head direction” section in the manuscript) and averaged within each 0.5s time window corresponding to the LFP segments.  Then, the absolute power at each frequency was smoothed with a Gaussian kernel centered on a given speed bin.  The power spectra as a function of running speed and frequency were plotted in log scale.  Also, the colormap was in log scale, allowing for comparisons across different frequencies that would otherwise be difficult due to the 1/f decay of power in physiological signals.
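
      As a rough illustration of what the multitaper parameters quoted above imply, the sketch below applies the standard relations between window length T, time-bandwidth product TW, spectral half-bandwidth W = TW / T, and the usual upper limit on the number of tapers, K ≤ 2TW - 1. The function name multitaper_resolution is hypothetical and the snippet is written in Python rather than MATLAB; it does not call Chronux.

```python
def multitaper_resolution(window_s, time_bandwidth):
    """Half-bandwidth (Hz) and maximum useful taper count for one window.

    Uses the standard relations W = TW / T and K <= 2 * TW - 1.
    """
    half_bandwidth_hz = time_bandwidth / window_s
    max_tapers = int(2 * time_bandwidth - 1)
    return half_bandwidth_hz, max_tapers


# Values quoted in the response: 0.5 s window, TW = 3.
half_bw, k_max = multitaper_resolution(window_s=0.5, time_bandwidth=3)
print(f"half-bandwidth: {half_bw} Hz, maximum tapers: {k_max}")
```

      Under these relations, the reported K = 5 is the largest taper count normally used with TW = 3, and each spectral estimate is smoothed over a band whose half-width is given by the first returned value.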

      Author response image 8.

      Author response image 9.

      iv. Schomberg and colleagues (2014) suggested that the modulation of neurons in the slow gamma range could be related to theta harmonics (see above). Harmonics can often extend nearly indefinitely as they regress into the 1/f background (contributing to power, but without a peak above the power spectral density slope), making arbitrary frequency limits inappropriate. Therefore, in order to support the analyses and assertions regarding slow gamma, it seems necessary to calculate a "theta harmonic/slow gamma ratio". Aru et al. (2015; Untangling cross-frequency coupling in neuroscience) offer that: "The presence of harmonics in the signal should be tested by a bicoherence analysis and its contribution to CFC should be discussed." Please test both the synthetic signals above and the raw LFP, using temporal windows of greater than 4 seconds (again, the large window optimizes for frequency resolution in the time-frequency trade-off) to calculate the bicoherence. As harmonics are integer multiples of theta coupled to itself and slow gamma is also coupled to theta, a nice illustration and contribution to the field would be a method that uses the bispectrum to isolate and create a "slow gamma/harmonic" ratio.

      We thank the reviewer for suggesting this method for assessing theta harmonics.  We first measured the theta harmonics in the synthesized signal by using the bicoherence method, and we could clearly observe the nonlinear coupling between the theta rhythm and its harmonics (Author response image 10).

      Author response image 10.

      In addition, we also measured the bicoherence on raw traces during slow gamma episodes.  We did not see nonlinear coupling between slow gamma and theta bands in these real data (mean bicoherence=0.1±0.0002) compared with that in the synthesized signal (mean bicoherence=0.7 for elliptical waves and 0.5 for sawtooth waves), suggesting that the slow gamma detected in this study was not a pure theta harmonic (Author response image 11C, F, I, in red boxes).  Therefore, we believe that the contribution of theta harmonics to slow gamma is not significant.

      Author response image 11.
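
      The logic of the bicoherence test can be sketched as follows. This is a minimal Python/NumPy illustration under stated assumptions, not the analysis behind Author response images 10 and 11: the function name bicoherence_at, the sampling rate, the segment length, the noise level, and the choice of normalisation (magnitude of the mean triple product divided by the mean magnitude) are all assumptions, and other normalisations are in common use. A sawtooth, whose harmonics are phase-locked to the fundamental, should give a value near 1 at (8 Hz, 8 Hz); two sinusoids at 8 Hz and 16 Hz with independent random phases should give a value near 0.

```python
import numpy as np


def bicoherence_at(segments, fs, f1, f2):
    """Bicoherence magnitude at (f1, f2), averaged over the rows of `segments`.

    Normalisation: |mean of triple products| / mean of |triple products|,
    which lies between 0 (random biphase) and 1 (constant biphase).
    """
    n = segments.shape[1]
    centred = segments - segments.mean(axis=1, keepdims=True)
    spectra = np.fft.rfft(centred * np.hanning(n), axis=1)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    i1 = int(np.argmin(np.abs(freqs - f1)))
    i2 = int(np.argmin(np.abs(freqs - f2)))
    # Bins are evenly spaced from 0 Hz, so bin i1 + i2 sits at f1 + f2.
    triple = spectra[:, i1] * spectra[:, i2] * np.conj(spectra[:, i1 + i2])
    return np.abs(triple.mean()) / np.abs(triple).mean()


rng = np.random.default_rng(0)
fs, seg_s, n_seg, f0 = 500.0, 4.0, 100, 8.0   # all assumed for illustration
t = np.arange(int(fs * seg_s)) / fs

# Sawtooth segments: harmonics are phase-locked to the 8 Hz fundamental.
shift = rng.uniform(0.0, 1.0 / f0, size=(n_seg, 1))
coupled = 2.0 * ((f0 * (t + shift)) % 1.0) - 1.0
coupled += 0.1 * rng.standard_normal(coupled.shape)

# Independent 8 Hz and 16 Hz sinusoids: fresh random phases in every segment.
phi1 = rng.uniform(0.0, 2 * np.pi, size=(n_seg, 1))
phi2 = rng.uniform(0.0, 2 * np.pi, size=(n_seg, 1))
uncoupled = np.sin(2 * np.pi * f0 * t + phi1) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t + phi2)
uncoupled += 0.1 * rng.standard_normal(uncoupled.shape)

b_coupled = bicoherence_at(coupled, fs, f0, f0)
b_uncoupled = bicoherence_at(uncoupled, fs, f0, f0)
print(f"sawtooth: {b_coupled:.2f}, independent sinusoids: {b_uncoupled:.2f}")
```

      The 4-second segments follow the window length requested by the reviewer; with real LFP the segments would be taken from the recorded episodes instead of being synthesized.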

      (4) I appreciate the inclusion of the histology for the 4 animals. Knierim and colleagues describe a difference in MEC projection along the proximal-distal axis of the CA1 region (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866456/): "There are also differences in their direct projections along the transverse axis of CA1, as the LEC innervates the region of CA1 closer to the subiculum (distal CA1), whereas the MEC innervates the region of CA1 closer to CA2 and CA3 (proximal CA1)". From the histology, it looks like some of the electrodes are in the part of CA1 that would be dominated by LEC input while a few are closer to where the MEC would project.

      a. How do the authors control for these differences in projections? Wouldn't this change whether or not fast gamma is observed in CA1?

      b. I am only aware of one manuscript that describes slow gamma in the LEC which appeared in contrast to fast gamma from the MEC (https://www.science.org/doi/10.1126/science.abf3119). One would surmise that the authors in the present manuscript would have varying levels of fast gamma in their CA1 recordings depending on the location of the electrodes in the Proximal-distal axis, to the extent that some of the more medial tetrodes may need to be excluded (as they should not have fast gamma, rather they should be exclusively dominated by slow gamma). Alternatively, the authors may find that there is equal fast gamma power across the entire proximal-distal axis. However, this would pose a significant challenge to the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz et al. and require reconciliation/discussion.

      c. Is there a difference in neuron modulation to these frequencies based on electrode location in CA1?

      We thank the reviewer for this concern, which was also raised by Reviewer 2.  We aligned the physical location of LFP channels in the proximal-distal axis based on histology.  In our dataset, only 2 rats were recorded from both distal and proximal hippocampus, so we calculated the gamma power from both sites in these rats.  We found that slow gamma power was higher from proximal tetrodes than from distal tetrodes (Author response image 12, repeated measures ANOVA, F(1,7)=10.2, p=0.02, partial η<sup>2</sup>=0.8).  However, fast gamma power was similar between recording sites (F(1,7)=0.008, p=0.9, partial η<sup>2</sup>=0.001).  These results are partially consistent with the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz’s work.  The main reason would be that all LFPs were recorded from tetrodes in stratum pyramidale, deep layer in particular (Author response image 4E), so that it was hard to precisely identify their distance to distal/proximal apical dendrites.

      Author response image 12.

      In terms of the anatomical location of FG and NFG cells, we identified tetrode traces in slices for each cell.  We found that both FG and NFG cells were recorded from the deep layer of dorsal CA1, with no difference in proportions between cell types (Author response image 4E, Chi-squared test, χ<sup>2</sup>=0.5, p=0.5, Cramer V=0.05).  The distribution of FG-cells and NFG-cells along the transverse axis was also similar between cell types (Author response image 4F, χ<sup>2</sup>=0.08, p=0.8, Cramer V=0.02).

      (5) Given a comment in the discussion (see below), it will be worth exploring changes in theta, theta harmonic, slow gamma, and fast gamma power with running speed as no changes were observed with theta sequences or lap number versus. Notably, Czurko et al., report an increase in theta and harmonic power with running speed (1999) while Ahmed and Mehta (2012) report a similar effect for gamma.

      a. Please determine if the power and frequency of the rhythms discussed above change with running speed using the same parameters applied in the present manuscript. The specific concern is that how the authors calculate running speed is not sensitive enough to evaluate changes.

      We thank the reviewer for this suggestion.  The description of running speed quantification has been updated in the Methods (see “Estimation of running speed and head direction” section, Lines 501-511).  Overall, the sampling frequency of running speed was 25 Hz, which would be sensitive enough to evaluate the behavioral changes.

      By measuring the rhythmic power as a function of running speed (Author response image 8 and Author response image 9), we could observe that theta power increased as running speed increased.  Consistent with the results in (Ahmed and Mehta, 2012) and our previous study (Zheng et al., 2015), fast gamma power increased and slow gamma power decreased as running speed increased.

      In addition, we also estimated the rhythmic frequency as a function of running speed in the slow and fast episodes respectively.  We found that fast gamma frequency increased with running speed (Author response image 13, linear regression, R<sup>2</sup>=0.4, corr=0.6, p=9.9×10<sup>-15</sup>), whereas slow gamma frequency decreased with running speed (R<sup>2</sup>=0.2, corr=-0.4, p=8.8×10<sup>-6</sup>).  Although significant correlation was found between gamma frequency and running speed, consistent with the previous studies, the frequency change (~70-75Hz for fast gamma and ~30-28Hz for slow gamma) was not large enough to affect the sequence findings in this study.  In addition, theta frequency was maintained in either slow episodes (R<sup>2</sup>=0.02, corr=-0.1, p=0.1) or fast episodes (R<sup>2</sup>=0.004, corr=0.06, p=0.5), consistent with results in Fig.1G of Kropff et al., 2021 Neuron.

      Author response image 13.

      b. It is astounding that animals ran as fast as they did in what appears to be the first lap (Figure 3F), especially as rats' natural proclivity is thigmotaxis and inquisitive exploration in novel environments. Can the authors expand on why they believe their rats ran so quickly on the first lap in a novel environment and how to replicate this? Also, please include the individual values for each animal on the same plot.

      We thank the reviewer for pointing this out.  The task was not brand new to rats in this dataset, because only days with good enough recording quality for sequence decoding were included in this paper, which were about days 2-10 for each rat.  However, we still observed the process of sequence formation because of the rat’s exploration interest during early laps.  Thus, in terms of exploration behaviors, the rats ran at relatively high speeds across laps (Author response image 14, each gray line represents the running speed within an individual session).

      Author response image 14.

      c. Can the authors explain how the statistics on line 169 (F(4,44)) work? Specifically, it is challenging to determine how the degrees of freedom were calculated in this case and throughout if there were only 4 animals (reported in methods) over 5 laps (depicted in Figure 3F. Given line 439, it looks like trials and laps are used synonymously). Four animals over 5 laps should have a DOF of 16.

      This statistical test was performed with each session/day as a sample (n=12 sessions/days).  The statistics were generated by repeated measures ANOVA on 5 trials in 12 sessions, with a DOF of 44.
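
      The arithmetic behind these degrees of freedom can be checked with a short sketch. It assumes a one-way repeated measures ANOVA with no sphericity correction, for which the effect has k - 1 degrees of freedom and the error term has (k - 1)(n - 1), where k is the number of within-subject levels and n the number of units treated as subjects. The function name rm_anova_dof is hypothetical.

```python
def rm_anova_dof(n_levels, n_subjects):
    """Degrees of freedom (effect, error) for a one-way repeated measures ANOVA."""
    df_effect = n_levels - 1
    df_error = (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error


# 5 laps (trials) as the within-subject factor, 12 sessions as the samples.
df_effect, df_error = rm_anova_dof(n_levels=5, n_subjects=12)
print(f"F({df_effect},{df_error})")
```

      With sessions as the unit of analysis this reproduces the reported F(4,44); calling the same function with a different number of units shows how the error term changes with the choice of sample.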

      (6) Throughout the manuscript, I am concerned about an inflation of statistical power. For example on line 162, F(2,4844). The large degrees of freedom indicate that the sample size was theta sequences or a number of cells. Since multiple observations were obtained from the same animal, the statistical assumption of independence is violated. Therefore, the stats need to be conducted using a nested model as described in Aarts et al. (2014; https://pubmed.ncbi.nlm.nih.gov/24671065/). A statistical consult may be warranted.

      We thank the reviewer for this suggestion.  We have replaced this statistical result with one obtained from a generalized linear mixed model with rat ID as a covariate.  These results have been updated in the revised manuscript (Lines 164-167).
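
      To illustrate why the reviewer's concern about non-independence matters, the sketch below computes an effective sample size from the standard design effect for clustered data, 1 + (m - 1) × ICC, where m is the number of observations per cluster and ICC is the intraclass correlation. All numbers are hypothetical and are not taken from the manuscript; the function name effective_sample_size is likewise an assumption, and equal cluster sizes are assumed.

```python
def effective_sample_size(n_total, cluster_size, icc):
    """Effective N under the design effect 1 + (m - 1) * ICC (equal clusters)."""
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_total / design_effect


# Hypothetical numbers: 4 rats, 1200 observations each, ICC of 0.1.
n_eff = effective_sample_size(n_total=4 * 1200, cluster_size=1200, icc=0.1)
print(f"nominal N = {4 * 1200}, effective N = {n_eff:.1f}")
```

      Even a modest intraclass correlation shrinks the effective sample size far below the nominal count when each animal contributes many observations, which is the situation a nested or mixed model is meant to handle.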

      (7) It is stated that one tetrode served as a quiet recording reference. The "quiet" part is an assumption when often, theta and gamma can be volume conducted to the cortex (e.g., Sirota et al., 2008; This is often why laboratories that study hippocampal rhythms use the cerebellum for the differential recording electrode and not an electrode in the corpus callosum). Generally, high frequencies propagate as well as low frequencies in the extracellular milieu (https://www.eneuro.org/content/4/1/ENEURO.0291-16.2016). For transparency, the authors should include a limitation paragraph in their discussion that describes how their local tetrode reference may be inadvertently diminishing and/or distorting the signal that they are trying to isolate. Otherwise, it would be worth hearing an explanation as to how the authors' approach avoids this issue.

      In terms of the locations of references, we had 2 screws above the cerebellum in the skull connected to the recording drive ground, and 1 tetrode in a quiet area of the cortex serving as the recording reference.  We agree that theta and gamma can be volume conducted to the cortex, which may affect the power of these rhythms in the stratum pyramidale.  However, we didn’t mean to measure or compare the absolute theta or gamma power in this study, as we only cared about the phase modulation of place cells by gamma.  Therefore, we believe the location of the recording reference would not have a significant effect on our conclusions.

      Apologetically, this review is already getting long. Moreover, I have substantial concerns that should be resolved prior to delving into the remainder of the analyses. e.g., the analyses related to Figure 3-5 assert that FG cells are important for sequences. However, the relationship to gamma may be secondary to either their relationship to theta or, based on the Grosmark and Buzsaki paper, it may just be a phenomenon coupled to the fast-firing cells (fast-firing cells showing higher gamma modulation due to a local PING dynamic). Moreover, the observation of slow gamma is being challenged as theta harmonics, even by the major proponents of the slow/fast gamma theory. Therefore, the report of slow gamma precession would come as an unsurprising extension should they be revealed to be theta harmonics (however, no control for harmonics was implemented; suggestions were made above). Following these amendments, I would be grateful for the opportunity to provide further feedback.

      III. Discussion.

      a. Line 330- it was offered that fast gamma encodes information while slow gamma integrates in the introduction. However, in a task such as circular track running (from the methods, it appears that there is no new information to be acquired within a trial), one would guess that after the first few laps, slow gamma would be the dominant rhythm. Therefore, one must wonder why there are so few neurons modulated by slow gamma (~3.7%).

      The ~3.7% figure is the proportion of place cells phase-locked to slow gamma.  However, our aim was to show that slow gamma phase precession of place cells promoted theta sequence development.  We would not expect cells to be phase-locked to slow gamma if phase precession occurred.

      b. Line 375: The authors contend that: "...slow gamma, related to information compression, was also required to modulate fast gamma phase-locked cells during sequence development. We replicated the results of slow gamma phase precession at the ensemble level (Zheng et al., 2016), and furthermore observed it at late development, but not early development, of theta sequences." In relation to the idea that slow gamma may be coupled to - if not a distorted representation of - theta harmonics, it has been observed that there are changes in theta relative to novelty.

      i. A. Jeewajee, C. Lever, S. Burton, J. O'Keefe, and N. Burgess (2008) report a decrease in theta frequency in novel circumstances that disappears with increasing familiarity.

      ii. One could surmise that this change in frequency is associated with alterations in theta harmonics (observed here as slow gamma), challenging the authors' interpretation.

      iii. Therefore, the authors have a compelling opportunity to replicate the results of Jeewajee et al., characterizing changes of theta along with the development of slow gamma precession, as the environment becomes familiar. It will become important to demonstrate, using bicoherence as offered by Aru et al., how slow gamma can be disambiguated from theta harmonics. Specifically, we anticipate that the authors will be able to quantify A) theta harmonics (the number, and their respective frequencies and amplitudes), B) the frequency and amplitude of slow gamma, and C) how they can be quantitatively decoupled. Through this, their discussion of oscillatory changes with novelty-familiarity will garner a significant impact.

      We think we have demonstrated that the slow gamma observed in this study was not purely theta harmonics.  We didn’t focus on the frequency change of slow gamma or theta rhythms in this study.  Further investigation will be carried out on this topic in the future.

      c. Broadly, it is interesting that the authors emphasize the gamma frequency throughout the discussion. Given that the power spectral density of the Local Field Potential (LFP) exhibits a log-log relationship between amplitude and frequency, as described by Buzsáki (2005) in "Rhythms of the Brain," and considering that the LFP is primarily generated through synaptic transmembrane currents (Buzsáki et al., 2012), it seems parsimonious to consider that the bulk of synaptic activity occurs at lower frequencies (e.g., theta). Since synaptic transmission represents the most direct form of inter-regional communication, one might wonder why gamma (characterized by lower amplitude rhythms) is esteemed so highly compared to the higher amplitude theta rhythm. Why isn't the theta rhythm, instead, regarded as the primary mode of communication across brain regions? A discussion exploring this question would be beneficial.

      We thank the reviewer for this thoughtful comment.  In stating our conclusions on gamma rhythms, we didn’t mean to downplay the role of the theta rhythm.  On the contrary, the fast and slow gamma episodes were detected riding on theta rhythms, and we believe that information compression should occur at a finer scale within a theta cycle.  More investigation will be carried out on this topic in the future.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) It is helpful to clearly define "FG-cell sequences" before the relevant results are described in the Results section. More importantly, the seemingly conflicting results between Figure 3 and Figure 8 may need to be clarified.

      The “exFG-sequences and exNFG sequences”, “FG-cell sequences and NFG-cell sequences” have been defined clearly in the revised manuscript.  Moreover, the seemingly conflicting results between Figure 3 and Figure 8 have been interpreted properly.

      (2) It is helpful to clearly state the N and what defines a sample whenever a result is described.

      For each statistical result, the N and what defines a sample have been clarified in the revised manuscript.

      (3) Addressing the questions regarding the methods (#5) would clarify some of the results.

      The questions regarding the Methods have been addressed in the revised manuscript.

      (4) Line #244: "successful" should be "successive"?

      Fixed.

      Reviewer #2 (Recommendations For The Authors):

      - The writing of the manuscript can be substantially improved.

      The manuscript has been substantially revised and updated.

      - I noticed that the last author of the manuscript is not the lead or corresponding and has only provided a limited contribution to this work (according to the detailed author contributions). The second to last author seems to be the main senior intellectual contributor and supervisor, together with the third to last author. This speaks of potential bad academic practices where a senior person whose intellectual contribution to the study is relatively minor takes the last author position, against the standard conventions on authorship worldwide. I strongly suggest that this is corrected.

      We thank the reviewer for raising this problem.  The last author, Dr. Ming, is also a senior author who supervised this project and made a large contribution.  We have specified his role as a co-corresponding author in the revised manuscript.

    1. eLife Assessment

      The Twin Domain model proposed by Liu and Wang, which holds that twin supercoiling domains of DNA emerge during transcription, was first described decades ago, but direct experimental evidence has been challenging to obtain. Here, the authors make a fundamental contribution by directly measuring DNA torsion in cells using a photoactivatable intrastrand cross-linker compared to controls. They gather compelling data using this clever method, which provides direct evidence in support of the twin-supercoiled domain model, for torsional effects at transcription start and end sites, and thereby uncover novel features of higher order structure of chromatin in yeast. These data are exciting, and the tools will be of interest to anyone studying chromosome structure and gene regulation.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Hall et al reports a genome-wide map of supercoiling in yeast using psoralen as a probe that intercalates more effectively into underwound DNA and can then be fixed in place by UV-cross-linking. Sites of cross-linking are revealed by exonuclease digestion and sequencing. Cross-linking is compared with samples that are first fixed with formaldehyde, permeabilized, digested with Dpn II to release unrestrained torsion, and then crosslinked. The authors promote this "zero-torsion" approach as an improvement that corrects for nucleosomes (or binding by other macromolecules) that mask psoralen binding. The investigators then examine patterns of psoralen binding (and hence supercoiling) that are associated with promoter strength, promoter type (sequence-specific transcription factor dependent, insulator associated, or general TFs only) and gene length.

      Strengths:

      This is an interesting paper that reports an approach that reveals some new information about the relationship between torsional stress and gene activity in the yeast genome. The method is logical and interesting and provides evidence that spread of torsional stress through the genome is regulated.

      Weaknesses:

      The analysis is not entirely novel, and I believe that more valuable information can be culled from these datasets than is reported here.

    3. Reviewer #2 (Public review):

      Summary:

      This study describes a novel method for mapping torsional stress in the genome of Saccharomyces cerevisiae using trimethylpsoralen (TMP). It introduces a procedure to establish a zero-torsion baseline while preserving the chromatin state by treating cells with formaldehyde before releasing torsion with restriction enzyme digestion.

      This approach allows for more accurate differentiation between torsional stress effects and accessibility effects in the psoralen signal. The results confirm that psoralen crosslinking is strongly affected by accessibility of the DNA and to a much more limited extent by the torsional stress of the DNA. Subtracting the baseline signal (no torsion) from the total signal allows detecting torsional stress, although TMP accessibility is still affecting the readout. The authors confirm the validity of the method by studying torsional stress in dependence of transcription levels, gene length and relative gene orientation. They propose that torsional stress may play a role in recruiting topoisomerases and regulating 3D genome architecture via cohesin. They also suggest that transcription factor binding might insulate negative supercoiling originating from transcription of neighboring divergent genes.

      Strengths:

      This paper offers a potentially interesting tool for future work.

      Weaknesses:

      The signal-to-background ratio, which represents the torsional fraction, appears to be quite limited relative to the overall signal (roughly 20x less, according to the scales in figs 2a and 2b), raising concerns about the robustness of the conclusions. It is clear from these figures, for instance, that a non-negligible fraction of the remaining signal is still dependent on DNA accessibility, revealing the nucleosome footprints in spite of the fact that subtracting the zero-torsion signal should theoretically hinder the accessibility component. Because of this, some of the conclusions might be flawed, in that what is attributed to torsional stress might in reality be due, partially or fully, to accessibility issues.

      Specific points:

      Lines 226-227: "rotation may be more restricted with a lengthening in the RNA transcript, which is known to be associated with large machinery, such as spliceosomes". This argument is not appropriate to correlate torsional stress with gene length. Spliced genes are rare and generally short in yeast, and are found mostly among ribosomal protein genes.

      Lines 256-257: In discussing that torsional stress must hinder Pol II progression, the authors write: "Pol II has a minimal presence in the intergenic region between divergent genes and is enriched in the intergenic region between convergent genes, consistent with a previous finding that after termination, Pol II tends to remain on the DNA downstream of the terminator". The connection between Pol II distribution and torsional stress is unclear. Pol II is depleted at promoters and is enriched at the 3'-end of convergent genes most likely because this ChIP signal is the sum of signals from the two convergent genes. The fact that positive torsional stress is observed in these regions does not mean that polymerases accumulate because the torsional stress hinders Pol II progression. To claim elongation defects the authors should repeat the same analysis with stranded data (e.g. NET-seq or CRAC) and assess if polymerases transcribing these regions accumulate more when facing convergent genes compared to tandem genes. The claim that after termination the Pol II tends to remain on the DNA appears to be meaningless - the authors probably mean after RNA processing.

      Lines 275-277: "These data provide evidence that the (+) supercoiling generated by transcription may facilitate genome folding in coordination with other participating proteins". This is an overstatement. It is known that cohesins accumulate between convergent genes. The fact that there is torsional stress in the same position does not imply that supercoiling participates in genome folding. These could be independent events, or even, supercoiling might depend on cohesins.

      Lines 289-290: "torsion generated from one gene can impact the expression of its neighboring gene, consistent with previous findings that the expression of these genes is coupled". The existence of negative torsional stress in a common intergenic region for two genes does not imply that torsion is causally associated with gene expression coupling.

      Lines 291-292: "Another large class of S. cerevisiae promoters (termed "TFO") are regulated by insulator ssTFs, such as Reb1 and Abf1, which decouple interactions between neighbouring genes". In these cases and others that depend on an activator binding, the authors detect a region of accessibility interrupted by a valley, which they interpret as a topological insulator. However, the valley might be generated because of decreased TMP accessibility due to TF binding.

    4. Reviewer #3 (Public review):

      Summary:

      The authors describe a new method for measuring DNA torsion in cells using the photoactivatable intrastrand cross-linker trimethyl psoralen (TMP). However, their method differs from previous TMP-based torsion mapping methods by comparing formaldehyde cross-linked and torsionally trapped chromatin to torsion-relieved (zero-torsion) chromatin in parallel. Comparison between the two datasets reveals a very slight difference, but enough to provide extremely high resolution genome-wide maps of torsion in the yeast genome. This direct comparison of the two maps confirms that blockage of TMP intercalation by nucleosomes and some DNA-binding proteins is a major complication of previous methods, and analysis of the data provides a glimpse of chromatin-based processes from within the DNA gyre.

      Strengths:

      In addition to providing direct evidence for the twin-supercoiled domain model and for torsional effects at transcription start (TSS) and end (TES) sites, the authors' analyses reveal some novel features of yeast higher-order structure. These include the cohesin-dependent anchoring of DNA loops at sites of positive supercoiling and the insulation of torsion between closely spaced divergent genes by general transcription factors, which implies that these factors resist free rotation. The fact that the method should be generalizable to complex eukaryotic cells with large genomes, and the implications for understanding how torsion impacts transcription and gene regulation, will be of substantial interest to a broad community.

      Weaknesses:

      No serious weaknesses.

    1. eLife Assessment

      This useful paper uses a quantitative modeling approach to explore a putative mechanism underlying a well-studied behavioral transition in the nematode C. elegans. The premise, that what has been considered a two-state behavior can instead be described as a process whose parameters are smoothly modulated within a single state, is intriguing. However, in the paper's current state, concerns about the model and its fit to empirical data make the support for this idea inadequate.

    2. Reviewer #1 (Public review):

      Summary:

      This paper concerns mechanisms of foraging behavior in C. elegans. Upon removal from food, C. elegans first executes a stereotypical local search behavior in which it explores a small area by executing many random, undirected reversals and turns called "reorientations." If the worm fails to find food, it transitions to a global search in which it explores larger areas by suppressing reorientations and executing long forward runs (Hills et al., 2004). At the population level, the reorientation rate declines gradually. Nevertheless, about 50% of individual worms appear to exhibit an abrupt transition between local and global search, which is evident as a discrete transition from high to low reorientation rate (Lopez-Cruz et al., 2019). This observation has given rise to the hypothesis that local and global search correspond to separate internal states with the possibility of sudden transitions between them (Calhoun et al., 2014). The main conclusion of the paper is that it is not necessary to posit distinct internal states to account for discrete transitions from high to low reorientation rates. On the contrary, discrete transitions can occur simply because of the stochastic nature of the reorientation behavior itself.

      Strengths:

      The strength of the paper is the demonstration that a more parsimonious model explains abrupt transitions in the reorientation rate.

      Weaknesses:

      (1) Use of the Gillespie algorithm is not well justified. A conventional model with a fixed dt and an exponentially decaying reorientation rate would be adequate and far easier to explain. It would also be sufficiently accurate - given the appropriate choice of dt - to support the main claims of the paper, which are merely qualitative. In some respects, the whole point of the paper - that discrete transitions are an epiphenomenon of stochastic behavior - can be made with the authors' version of the model having a constant reorientation rate (Figure 2f).

      (2) In the manuscript, the Gillespie algorithm is very poorly explained, even for readers who already understand the algorithm; for those who do not it will be essentially impossible to comprehend. To take just a few examples: in Equation (1), omega is defined as reorientations instead of cumulative reorientations; it is unclear how (4) follows from (2) and (3); notation in (5), line 133, and (7) is idiosyncratic. Figure 1a does not help, partly because the notation is unexplained. For example, what do the arrows mean, what does "*" mean?

      (3) In the model, the reorientation rate dΩ⁄dt declines to zero but the empirical rate clearly does not. This is a major flaw. It would have been easy to fix by adding a constant to the exponentially declining rate in (1). Perhaps fixing this obvious problem would mitigate the discrepancies between the data and the model in Figure 2d.

      (4) Evidence that the model fits the data (Figure 2d) is unconvincing. I would like to have seen the proportion of runs in which the model generated one as opposed to multiple or no transitions in reorientation rate; in the real data, the proportion is 50% (Lopez). It is claimed that the "model demonstrated a continuum of switching to non-switching behavior" as seen in the experimental data but no evidence is provided.

      (5) The explanation for the poor fit between the model and data (lines 166-174) is unclear. Why would externally triggered collisions cause a shift in the transition distribution?

      (6) The discussion of Levy walks and the accompanying figure are off-topic and should be deleted.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, the authors build a statistical model that stochastically samples from a time-interval distribution of reorientation rates. The form of the distribution is extracted from a large array of behavioral data, and is then used to describe not only the dynamics of individual worms (including the inter-individual variability in behavior), but also the aggregate population behavior. The authors note that the model does not require assumptions about behavioral state transitions, or evidence accumulation, as has been done previously, but rather that the stochastic nature of behavior is "simply the product of stochastic sampling from an exponential function".

      Strengths:

      This model provides a strong juxtaposition to other foraging models in the worm. Rather than evoking a behavioral transition function (that might arise from a change in internal state or the activity of a cell type in the network), or evidence accumulation (which again maps onto a cell type, or the activity of a network) - this model explains behavior via the stochastic sampling of a function of an exponential decay. The underlying model and the dynamics being simulated, as well as the process of stochastic sampling, are well described and the model fits the exponential function (Equation 1) to data on a large array of worms exhibiting diverse behaviors (1600+ worms from Lopez-Cruz et al). The work of this study is able to explain or describe the inter-individual diversity of worm behavior across a large population. The model is also able to capture two aspects of the reorientations, including the dynamics (to switch or not to switch) and the kinetics (slow vs fast reorientations). The authors also work to compare their model to a few others including the Levy walk (whose construction arises from a Markov process) to a simple exponential distribution, all of which have been used to study foraging and search behaviors.

      Weaknesses:

      This manuscript has two weaknesses that dampen the enthusiasm for the results. First, in all of the examples the authors cite where a Gillespie algorithm is used to sample from a distribution, be it the kinetics associated with chemical dynamics, or a Lotka-Volterra Competition Model, there are underlying processes that govern the evolution of the dynamics, and thus the sampling from distributions. In one of their references, for instance, the stochasticity arises from the birth and death rates, thereby influencing the genetic drift in the model. In these examples, the process governing the dynamics (and thus generating the distributions from which one samples) is distinct from the behavior being studied. In this manuscript, the distribution being sampled is the exponential decay function of the reorientation rate (lines 100-102). This appears to be tautological - a decay function fitted to the reorientation data is then sampled to generate the distributions of the reorientation data. That the model performs well and matches the data is commendable, but it is unclear how that could not be the case if the underlying function generating the distribution was fit to the data.

      The second weakness is somewhat related to the first, in that absent an underlying mechanism or framework, one is left wondering what insight the model provides. Stochastic sampling a function generated by fitting the data to produce stochastic behavior is where one ends up in this framework, and the authors indeed point this out: "simple stochastic models should be sufficient to explain observably stochastic behaviors." (Line 233-234). But if that is the case, what do we learn about how the foraging is happening? The authors suggest that the decay parameter M can be considered a memory timescale; which offers some suggestion, but then go on to say that the "physical basis of M can come from multiple sources". Here is where one is left for want: The mechanisms suggested, including loss of sensory stimuli, alternations in motor integration, ionotropic glutamate signaling, dopamine, and neuropeptides are all suggested: these are basically all of the possible biological sources that can govern behavior, and one is left not knowing what insight the model provides. The array of biological processes listed is so variable in dynamics and meaning, that their explanation of what governs M is at best unsatisfying. Molecular dynamics models that generate distributions can point to certain properties of the model, such as the binding kinetics (on and off rates, etc.) as explanations for the mechanisms generating the distributions, and therefore point to how a change in the biology affects the stochasticity of the process. It is unclear how this model provides such a connection, especially taken in aggregate with the previous weakness.

      Providing a roadmap of how to think about the processes generating M, the meaning of those processes in search, and potential frameworks that are more constrained and with more precise biological underpinning (beyond the array of possibilities described) would go a long way to assuaging the weaknesses.

    4. Reviewer #3 (Public review):

      Summary:

      This intriguing paper addresses a special case of a fundamental statistical question: how to distinguish between stochastic point processes that derive from a single "state" (or single process) and more than one state/process. In the language of the paper, a "state" (perhaps more intuitively called a strategy/process) refers to a set of rules that determine the temporal statistics of the system. The rules give rise to probability distributions (here, the probability for turning events). The difficulty arises when the sampling time is finite, and hence, the empirical data is finite, and affected by the sampling of the underlying distribution(s). The specific problem being tackled is the foraging behavior of C. elegans nematodes, removed from food. Such foraging has been studied for decades, and described by a transition over time from 'local'/'area-restricted' search (roughly in the initial 10-30 minutes of the experiments, in which animals execute frequent turns) to 'dispersion', or 'global search' (characterized by a low frequency of turns). The authors propose an alternative to this two-state description - a potentially more parsimonious single 'state' with time-changing parameters, which they claim can account for the full-time course of these observations.

      Figure 1a shows the mean rate of turning events as a function of time (averaged across the population). Here, we see a rapid transient, followed by a gradual 4-5 fold decay in the rate, which then levels off. This picture seems consistent with the two-state description. However, the authors demonstrate that individual animals exhibit different "transition" statistics (Figure 1e) and wish to explain this. They do so by fitting this mean with a single function (Equations 1-3).

      Strengths:

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

      Weaknesses:

      (1) The authors claim that only about half the animals tested exhibit discontinuity in turning rates. Can they automatically separate the empirical and model population into these two subpopulations (with the same method), and compare the results?

      (2) The equations consider an exponentially decaying rate of turning events. If so, Figure 2b should be shown on a semi-logarithmic scale.

      (3) The variables in Equations 1-3 and the methods for simulating them are not well defined, making the method difficult to follow. Assuming my reading is correct, Omega should be defined as the cumulative number of turning events over time (Omega(t)), not as a "turn" or "reorientation", which has no derivative. The relevant entity in Figure 1a is apparently ⟨Omega(t)⟩, i.e. the mean number of events across a population, which can be modelled by an expectation value. The time derivative would then give the expected rate of turning events as a function of time.

      (4) Equations 1-3 are cryptic. The authors need to spell out up front that they are using a pair of coupled stochastic processes, sampling a hidden state M (to model the dynamic turning rate) and the actual turn events, Omega(t), separately, as described in Figure 2a. In this case, the model no longer appears more parsimonious than the original 2-state model. What then is its benefit or explanatory power (especially since the process involving M is not observable experimentally)?

      (5) Further, as currently stated in the paper, Equations 1-3 are only for the mean rate of events. However, the expectation value is not a complete description of a stochastic system. Instead, the authors need to formulate the equations for the probability of events, from which they can extract any moment (they write something in Figure 2a, but the notation there is unclear, and this needs to be incorporated here).

      (6) Equations 1-3 have three constants (alpha and gamma which were fit to the data, and M0 which was presumably set to 1000). How does the choice of M0 affect the results?

      (7) M decays to near 0 over 40 minutes, abolishing omega turns by the end of the simulations. Are omega turns entirely abolished in worms after 30-40 minutes off food? How do the authors reconcile this decay with the leveling of the turning rate in Figure 1a?

      (8) The fit given in Figure 2b does not look convincing. No statistical test was used to compare the two functions (empirical and fit). No error bars were given (to either). These should be added. In the discussion, the authors explain the discrepancy away as experimental limitations. This is not unreasonable, but on the flip side, makes the argument inconclusive. If the authors could model and simulate these limitations, and show that they account for the discrepancies with the data, the model would be much more compelling. To do this, I would imagine that the authors would need to take the output of their model (lists of turning times) and convert them into simulated trajectories over time. These trajectories could be used to detect boundary events (for a given size of arena), collisions between individuals, etc. in their simulations and to see their effects on the turn statistics.

      (9) The other figures similarly lack any statistical tests and by eye, they do not look convincing. The exception is the 6 anecdotal examples in Figure 2e. Those anecdotal examples match remarkably closely, almost suspiciously so. I'm not sure I understood this though - the caption refers to "different" models of M decay (and at least one of the 6 examples clearly shows a much shallower exponential). If different M models are allowed for each animal, this is no longer parsimonious. Are the results in Figure 2d for a single M model? Can Figure 2e explain the data with a single (stochastic) M model?

      (10) The left axes of Figure 2e should be reverted to cumulative counts (without the normalization).

      (11) The authors give an alternative model of a Levy flight, but do not give the obvious alternative models:

      a) the 1-state model in which P(t) = alpha exp (-gamma t) dt (i.e. a single stochastic process, without a hidden M, collapsing equations 1-3 into a single equation).

      b) the originally proposed 2-state model (with 3 parameters, a high turn rate, a low turn rate, and the local-to-global search transition time, which can be taken from the data, or sampled from the empirical probability distributions).

      Why not? The former seems necessary to justify the more complicated 2-process model, and the latter seems necessary since it's the model they are trying to replace. Including these two controls would allow them to compare the number of free parameters as well as the model results. I am also surprised by the Levy model since Levy is a family of models. How were the parameters of the Levy walk chosen?

      (12) One point that is entirely missing in the discussion is the individuality of worms. It is by now well known that individual animals have individual behaviors. Some are slow/fast, and similarly, their turn rates vary. This makes this problem even harder. Combined with the tiny number of events concerned (typically 20-40 per experiment), it seems daunting to determine the underlying model from behavioral statistics alone.

      (13) That said, it's well-known which neurons underpin the suppression of turning events (starting already with Gray et al 2005, which, strangely, was not cited here). Some discussion of the neuronal predictions for each of the two (or more) models would be appropriate.

      (14) An additional point is the reliance entirely on simulations. A rigorous formulation (of the probability distribution rather than just the mean) should be analytically tractable (at least for the first moment, and possibly higher moments). If higher moments are not obtainable analytically, then the equations should be numerically integrable. It seems strange not to do this.

      In summary, while sample simulations do nicely match the examples in the data (of discontinuous vs continuous turning rates), this is not sufficient to demonstrate that the transition from ARS to dispersion in C. elegans is, in fact, likely to be a single 'state', or this (eq 1-3) single state. Of course, the model can be made more complicated to better match the data, but the approach of the authors, seeking an elegant and parsimonious model, is in principle valid, i.e. avoiding a many-parameter model-fitting exercise.

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper concerns mechanisms of foraging behavior in C. elegans. Upon removal from food, C. elegans first executes a stereotypical local search behavior in which it explores a small area by executing many random, undirected reversals and turns called "reorientations." If the worm fails to find food, it transitions to a global search in which it explores larger areas by suppressing reorientations and executing long forward runs (Hills et al., 2004). At the population level, the reorientation rate declines gradually. Nevertheless, about 50% of individual worms appear to exhibit an abrupt transition between local and global search, which is evident as a discrete transition from high to low reorientation rate (Lopez-Cruz et al., 2019). This observation has given rise to the hypothesis that local and global search correspond to separate internal states with the possibility of sudden transitions between them (Calhoun et al., 2014). The main conclusion of the paper is that it is not necessary to posit distinct internal states to account for discrete transitions from high to low reorientation rates. On the contrary, discrete transitions can occur simply because of the stochastic nature of the reorientation behavior itself.

      Strengths:

      The strength of the paper is the demonstration that a more parsimonious model explains abrupt transitions in the reorientation rate.

      Weaknesses:

      (1) Use of the Gillespie algorithm is not well justified. A conventional model with a fixed dt and an exponentially decaying reorientation rate would be adequate and far easier to explain. It would also be sufficiently accurate - given the appropriate choice of dt - to support the main claims of the paper, which are merely qualitative. In some respects, the whole point of the paper - that discrete transitions are an epiphenomenon of stochastic behavior - can be made with the authors' version of the model having a constant reorientation rate (Figure 2f).

      We apologize, but we are not sure what the reviewer means by “fixed dt”. If the reviewer means taking discrete steps in time (dt), and modeling whether a reorientation occurs, we would argue that the Gillespie algorithm is a better way to do this because it provides floating-point precision time resolution, rather than a time resolution limited by dt, which we hopefully explain in the comments below.

      The reviewer is correct that discrete transitions are an epiphenomenon of stochastic behavior, as we show in Figure 2f. However, abrupt stochastic jumps that occur with a constant rate do not produce persistent changes in the observed rate because the rate is, by definition, constant. The theory that there are local and global searches is based on the observation that individual worms often abruptly change their rates, but this observation holds for only a fraction of worms. We argue that abrupt changes are not observed for all, or even most, worms because they result from stochastic sampling, not from a sudden change in search strategy.
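
      To make the contrast with a fixed-step scheme concrete, the following is a minimal sketch of the kind of fixed-dt simulation the reviewer describes, assuming a reorientation rate of the form alpha * exp(-gamma * t). The function name, the parameter values, and the choice of minutes as the time unit are illustrative assumptions, not values from the manuscript.

```python
import math
import random


def simulate_fixed_dt(alpha, gamma, t_end, dt, seed=None):
    """Fixed-step simulation of reorientations with rate alpha * exp(-gamma * t).

    At each step a reorientation is recorded with probability rate * dt,
    so event times can only fall on multiples of dt.
    """
    rng = random.Random(seed)
    times = []
    n_steps = int(round(t_end / dt))
    for k in range(n_steps):
        t = k * dt
        rate = alpha * math.exp(-gamma * t)
        # Bernoulli approximation; only reasonable while rate * dt << 1.
        if rng.random() < rate * dt:
            times.append(t)
    return times


# Illustrative (assumed) parameters: rates per minute, 40-minute window.
events = simulate_fixed_dt(alpha=2.0, gamma=0.1, t_end=40.0, dt=0.01, seed=1)
```

      In this scheme every event time is a multiple of dt and at most one event can occur per step, which is the resolution limit referred to above.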

      (2) In the manuscript, the Gillespie algorithm is very poorly explained, even for readers who already understand the algorithm; for those who do not it will be essentially impossible to comprehend. To take just a few examples: in Equation (1), omega is defined as reorientations instead of cumulative reorientations; it is unclear how (4) follows from (2) and (3); notation in (5), line 133, and (7) is idiosyncratic. Figure 1a does not help, partly because the notation is unexplained. For example, what do the arrows mean, what does "*" mean?

      We apologize for this. You are correct that Ω is cumulative reorientations, and we will edit the text as follows:

      Experimentally, reorientation rate is measured as the number of reorientation events that occurred in an observational window. However, these are discrete stochastic events, so we should describe them in terms of propensity, i.e. the probability of observing a transitional event (in this case, a reorientation) is:

      Here, P(Ω+1,t) is the probability of observing a reorientation event at time t, and a₁ is the propensity for this event to occur. Observationally, the frequency of reorientations observed decays over time, so we can define the propensity as:

      Where α is the initial propensity at t=0.

      We can model this decay as the reorientation propensity coupled to a decaying factor (M):

      Where the propensity of this event (a₂) is:

      Since M is a first-order decay process, when integrated, the cumulative M observed is:

      We can couple the probability of observing a reorientation to this decay by redefining a₁ as:

      So that now:

      A critical detail should be noted. While reorientations are modeled as discrete events, the amount of M at time t=0 is chosen to be large (M₀←1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but due to the inherent stochasticity of reorientations.

      To model both processes, we can create the master equation:

      Since these are both Poisson processes, the probability density function for a state change i occurring in time τ is:

      The probability that an event will not occur in time interval τ is:

      The probability that no events will occur for ALL transitions in this time interval is:

      We can draw a random number (r₁ ∈[0,1]) that represents the probability of no events in time interval τ, so that this time interval can be assigned by rearranging equation 11:

      where:

      This is the time interval for any event (Ω+1 or M-1) happening at t + τ. The probability of which event occurs is proportional to its propensity:

      We can draw a second number (r₂ ∈[0,1]) that represents this probability so that which event occurs at time t + τ is determined by the smallest n that satisfies:

      so that:

      The elegant efficiency of the Gillespie algorithm is two-fold. First, it models all transitions simultaneously, not separately. Second, it provides floating-point time resolution. Rather than drawing a random number, and using a cumulative probability distribution of interval-times to decide whether an event occurs at discrete steps in time, the Gillespie algorithm uses this distribution to draw the interval-time itself. The time resolution of the prior approach is limited by step size, whereas the Gillespie algorithm’s time resolution is limited by the floating-point precision of the random number that is drawn.
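
      For reference, the relations described in words above correspond to the standard direct-method form of the Gillespie algorithm. The block below is a sketch under stated assumptions, not the manuscript's typeset equations: the symbol τ for the waiting time, the total propensity a₀, and in particular the coupling a₁ = αM/M₀ and the decay propensity a₂ = γM are assumed forms chosen to agree with the surrounding definitions (α as the initial reorientation propensity, M as a first-order decay process).

```latex
\begin{align*}
  a_1 &= \alpha\,\frac{M}{M_0}, \qquad a_2 = \gamma M, \qquad a_0 = a_1 + a_2 \\
  p_i(\tau) &= a_i\,e^{-a_i \tau}
    && \text{waiting-time density for transition } i \\
  \Pr(\text{no event } i \text{ in } \tau) &= e^{-a_i \tau} \\
  \Pr(\text{no event at all in } \tau) &= \prod_i e^{-a_i \tau} = e^{-a_0 \tau} \\
  r_1 = e^{-a_0 \tau} \;&\Longrightarrow\; \tau = \frac{1}{a_0}\,\ln\frac{1}{r_1} \\
  \Pr(\text{next event is } i) &= \frac{a_i}{a_0}, \qquad
  \text{chosen event} = \min\Bigl\{\, n : \sum_{i=1}^{n} a_i > r_2\,a_0 \Bigr\}
\end{align*}
```

      Under these assumptions, replacing M by its mean M₀e^{-γt} gives a₁ = αe^{-γt}, which matches the exponentially decaying propensity introduced at the start of the passage.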

      We are happy to add this text to improve clarity.
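
      The procedure described above can also be sketched in code. This is a minimal illustration, not the simulation code used for the manuscript: it assumes the propensities a1 = alpha * M / M0 for a reorientation and a2 = gamma * M for a decay step, and the function name and parameter values are hypothetical.

```python
import math
import random


def gillespie_reorientations(alpha, gamma, m0=1000, t_end=40.0, seed=None):
    """Direct-method Gillespie simulation of two coupled events:
    a reorientation (omega += 1) and a decay step (m -= 1).

    Returns the list of reorientation times.
    """
    rng = random.Random(seed)
    t, m = 0.0, m0
    reorientation_times = []
    while m > 0:
        a1 = alpha * m / m0   # assumed reorientation propensity
        a2 = gamma * m        # assumed first-order decay propensity
        a0 = a1 + a2
        # 1.0 - random() lies in (0, 1], which keeps the logarithm finite.
        r1 = 1.0 - rng.random()
        t += math.log(1.0 / r1) / a0   # waiting time to the next event
        if t > t_end:
            break
        r2 = rng.random()
        if r2 * a0 < a1:
            reorientation_times.append(t)   # omega += 1
        else:
            m -= 1                          # one unit of M decays
    return reorientation_times


# Illustrative (assumed) parameters: rates per minute, 40-minute window.
times = gillespie_reorientations(alpha=2.0, gamma=0.1, seed=1)
```

      Each pass through the loop draws one waiting time for whichever event comes next and then decides which event it was, so the event times are continuous values rather than multiples of a step size.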

      We apologize for the arrow notation confusion. Arrow notation is commonly used in pseudocode to indicate variable assignment, and so we used it to indicate variable assignment updates in the algorithm.

      We added Figure 2a to help explain the Gillespie algorithm for people who are unfamiliar with it, but you are correct, some of the notation, such as the probabilities, was left unexplained. We will address this to improve clarity.

      (3) In the model, the reorientation rate dΩ⁄dt declines to zero but the empirical rate clearly does not. This is a major flaw. It would have been easy to fix by adding a constant to the exponentially declining rate in (1). Perhaps fixing this obvious problem would mitigate the discrepancies between the data and the model in Figure 2d.

      You are correct that the model deviates slightly at longer times, but this result is consistent with Klein et al., who show a continuous decline in reorientations. However, we could add a constant to the model, since an infinite run length is likely not physiological.
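
      One way to write the modification mentioned above, offered as a sketch rather than as the revised model, is to add a constant basal rate to the exponentially decaying term. The symbol β for the basal rate is an assumption introduced here; α and γ are taken to be the amplitude and decay constant of the fitted exponential.

```latex
\begin{align*}
  \frac{d\langle\Omega\rangle}{dt} &= \alpha\,e^{-\gamma t} + \beta, \\
  \langle\Omega(t)\rangle &= \frac{\alpha}{\gamma}\bigl(1 - e^{-\gamma t}\bigr) + \beta t, \\
  \lim_{t\to\infty} \frac{d\langle\Omega\rangle}{dt} &= \beta > 0 .
\end{align*}
```

      With β > 0 the expected interval between reorientations at long times stays finite, approximately 1/β, instead of diverging as the rate approaches zero.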

      (4) Evidence that the model fits the data (Figure 2d) is unconvincing. I would like to have seen the proportion of runs in which the model generated one as opposed to multiple or no transitions in reorientation rate; in the real data, the proportion is 50% (Lopez). It is claimed that the "model demonstrated a continuum of switching to non-switching behavior" as seen in the experimental data but no evidence is provided.

      We should clarify that the 50% proportion cited by López-Cruz et al. was based on an arbitrary difference in slopes and on visual assessment of the data. We sought to avoid this subjective assessment by plotting the distributions of slopes and transition times produced by the method used in López-Cruz et al. We should also clarify what we meant by “a continuum of switching and non-switching” behavior. Neither the transition-time distributions nor the slope-difference distributions appear to be the result of two distributions. This is unlike roaming and dwelling on food, where two distinct distributions of behavioral metrics can be identified based on speed and angular speed (Flavell et al, 2009, Fig S2a). We will add a permutation test to verify that the mean differences in slopes and transition times between the experiment and the model are not significant.
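
      A permutation test of the kind mentioned above could look like the following minimal sketch. It assumes the compared quantities are per-worm values (for example slope differences or transition times) from the experiment and from the model; the function name is hypothetical and the numbers in the usage lines are placeholders, not data from the study.

```python
import random


def permutation_test_mean_diff(x, y, n_perm=10000, seed=None):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference mean(x) - mean(y) and the p-value.
    """
    rng = random.Random(seed)
    x, y = list(x), list(y)
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = x + y
    n_x = len(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_x]) / n_x
                - sum(pooled[n_x:]) / (len(pooled) - n_x))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Placeholder values for illustration only.
experiment = [10.2, 12.5, 9.8, 14.1, 11.0, 13.3]
model = [11.4, 12.0, 10.1, 13.5, 11.6, 12.8]
obs, p = permutation_test_mean_diff(experiment, model, n_perm=2000, seed=0)
```

      A large p-value here would only indicate that the test did not detect a difference in means; it would not by itself show that the two distributions are the same.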

      (5) The explanation for the poor fit between the model and data (lines 166-174) is unclear. Why would externally triggered collisions cause a shift in the transition distribution?

      Thank you, we should rewrite the text to clarify this better. There were no externally triggered collisions; 10 animals were used per experiment. They would occasionally collide during the experiment, but these collisions were excluded from the data that were provided. However, worms are also known to increase reorientations when they encounter a pheromone trail, and it is unknown (from this dataset) which reorientations may have been a result of this phenomenon.

      (6) The discussion of Levy walks and the accompanying figure are off-topic and should be deleted.

      Thank you, we agree that this topic is tangential, and we will remove it.

      Reviewer #2 (Public review):

      Summary:

      In this study, the authors build a statistical model that stochastically samples from a time-interval distribution of reorientation rates. The form of the distribution is extracted from a large array of behavioral data, and is then used to describe not only the dynamics of individual worms (including the inter-individual variability in behavior), but also the aggregate population behavior. The authors note that the model does not require assumptions about behavioral state transitions, or evidence accumulation, as has been done previously, but rather that the stochastic nature of behavior is "simply the product of stochastic sampling from an exponential function".

      Strengths:

      This model provides a strong juxtaposition to other foraging models in the worm. Rather than evoking a behavioral transition function (that might arise from a change in internal state or the activity of a cell type in the network), or evidence accumulation (which again maps onto a cell type, or the activity of a network) - this model explains behavior via the stochastic sampling of a function of an exponential decay. The underlying model and the dynamics being simulated, as well as the process of stochastic sampling, are well described and the model fits the exponential function (Equation 1) to data on a large array of worms exhibiting diverse behaviors (1600+ worms from Lopez-Cruz et al). The work of this study is able to explain or describe the inter-individual diversity of worm behavior across a large population. The model is also able to capture two aspects of the reorientations, including the dynamics (to switch or not to switch) and the kinetics (slow vs fast reorientations). The authors also work to compare their model to a few others including the Levy walk (whose construction arises from a Markov process) to a simple exponential distribution, all of which have been used to study foraging and search behaviors.

      Weaknesses:

      This manuscript has two weaknesses that dampen the enthusiasm for the results. First, in all of the examples the authors cite where a Gillespie algorithm is used to sample from a distribution, be it the kinetics associated with chemical dynamics, or a Lotka-Volterra Competition Model, there are underlying processes that govern the evolution of the dynamics, and thus the sampling from distributions. In one of their references, for instance, the stochasticity arises from the birth and death rates, thereby influencing the genetic drift in the model. In these examples, the process governing the dynamics (and thus generating the distributions from which one samples) is distinct from the behavior being studied. In this manuscript, the distribution being sampled is the exponential decay function of the reorientation rate (lines 100-102). This appears to be tautological - a decay function fitted to the reorientation data is then sampled to generate the distributions of the reorientation data. That the model performs well and matches the data is commendable, but it is unclear how that could not be the case if the underlying function generating the distribution was fit to the data.

      Thank you, we apologize that this was not clearer. In the Lotka-Volterra model, the densities of predators and prey are being modeled, with the underlying assumption that rates of birth and death are inherently stochastic. In our model, the number of reorientations is being modeled, with the assumption (based on the experiments) that the occurrence of reorientations is stochastic, just as the occurrence (birth) of a prey animal is stochastic. However, the decay in M is phenomenological, and we speculate about the nature of M later in the manuscript.

      You are absolutely right that the decay function for M was fitted to the population average of reorientations and then sampled to generate the distributions of the reorientation data. This was intentional to show that the parameters chosen to match the population average would produce individual trajectories with comparable stochastic “switching” as the experimental data. All we’re trying to show really is that observed sudden changes in reorientation that appear persistent can be produced by a stochastic process without resorting to binary state assignments. In Calhoun, et al 2014 it is reported all animals produced switch-like behavior, but in Klein et al, 2017 it is reported that no animals showed abrupt transitions. López-Cruz et al seem to show a mix of these results, which can be easily explained by an underlying stochastic process.

      The second weakness is somewhat related to the first, in that absent an underlying mechanism or framework, one is left wondering what insight the model provides. Stochastic sampling a function generated by fitting the data to produce stochastic behavior is where one ends up in this framework, and the authors indeed point this out: "simple stochastic models should be sufficient to explain observably stochastic behaviors." (Line 233-234). But if that is the case, what do we learn about how the foraging is happening? The authors suggest that the decay parameter M can be considered a memory timescale; which offers some suggestion, but then go on to say that the "physical basis of M can come from multiple sources". Here is where one is left for want: The mechanisms suggested, including loss of sensory stimuli, alternations in motor integration, ionotropic glutamate signaling, dopamine, and neuropeptides are all suggested: these are basically all of the possible biological sources that can govern behavior, and one is left not knowing what insight the model provides. The array of biological processes listed is so variable in dynamics and meaning, that their explanation of what governs M is at best unsatisfying. Molecular dynamics models that generate distributions can point to certain properties of the model, such as the binding kinetics (on and off rates, etc.) as explanations for the mechanisms generating the distributions, and therefore point to how a change in the biology affects the stochasticity of the process. It is unclear how this model provides such a connection, especially taken in aggregate with the previous weakness.

      Providing a roadmap of how to think about the processes generating M, the meaning of those processes in search, and potential frameworks that are more constrained and with more precise biological underpinning (beyond the array of possibilities described) would go a long way to assuaging the weaknesses.

      Thank you, these are all excellent points. We should clarify that López-Cruz et al claim that only 50% of the animals fit a local/global search paradigm. We are simply proposing that there is no need to designate local and global searches if the data don’t really support it. The underlying behavior is stochastic, so the sudden switches sometimes observed can be explained by a stochastic process in which the underlying rate is slowing down, thus producing the persistently slow reorientation rate when an apparent “switch” occurs. What we hope to convey is that foraging doesn’t appear to follow a decision paradigm, but instead a gradual change in reorientations which, for individual worms, can occasionally produce reorientation trajectories that appear switch-like.

      As for M, you are correct, we should be more explicit. A decay in reorientation rate, rather than a sudden change, is consistent with observations made by López-Cruz et al. They found that the neurons AIA and ADE redundantly suppress reorientations, and that silencing either one was sufficient to restore the large number of reorientations during early foraging. The synaptic output of AIA and ADE was inhibited over long timescales (tens of minutes) by presynaptic glutamate binding to MGL-1, a slow G-protein-coupled receptor expressed in AIA and ADE. Their results support a model where sensory neurons suppress the synaptic output of AIA and ADE, which in turn leads to a large number of reorientations early in foraging. As time passes, glutamatergic input from the sensory neurons decreases, which leads to disinhibition of AIA and ADE, and a subsequent suppression of reorientations.

      The sensory inputs into AIA and ADE are sequestered into two separate circuits, with AIA receiving chemosensory input and ADE receiving mechanosensory input. Since the suppression of either AIA or ADE is sufficient to increase reorientations, the decay in reorientations is likely due to the synaptic output of both of these neurons decaying in time. This correlates with an observed decrease in sensory neuron activity as well, so the timescale of reorientation decay could be tied to the timescale of sensory neuron activity, which in turn is influencing the timescale of AIA/ADE reorientation suppression. This implies that our factor “M” is likely the sum of several different sensory inputs decaying in time.

      The molecular basis of which sensory neuron signaling factors contribute to decreased AIA and ADE activity is made more complicated by the observation that the glutamatergic input provided by the sensory neurons was not essential, and that additional factors besides glutamate contribute to the signaling to AIA and ADE. In addition to this, it is not simply the sensory neuron activity that decays in time, but also the sensitivity of AIA and ADE to sensory neuron input. Simply depolarizing sensory neurons after the animals had starved for 30 minutes was insufficient to rescue the reorientation rates observed earlier in the foraging assay. This observation could be due to decreased presynaptic vesicle release, and/or decreased receptor localization on the postsynaptic side.

      In summary, there are two neuronal properties that appear to be decaying in time. One is sensory neuron activity, and the other is the potentiation of presynaptic input onto AIA and ADE. Our factor “M” is a phenomenological manifestation of these numerous decaying factors.

      Reviewer #3 (Public review):

      Summary:

      This intriguing paper addresses a special case of a fundamental statistical question: how to distinguish between stochastic point processes that derive from a single "state" (or single process) and more than one state/process. In the language of the paper, a "state" (perhaps more intuitively called a strategy/process) refers to a set of rules that determine the temporal statistics of the system. The rules give rise to probability distributions (here, the probability for turning events). The difficulty arises when the sampling time is finite, and hence, the empirical data is finite, and affected by the sampling of the underlying distribution(s). The specific problem being tackled is the foraging behavior of C. elegans nematodes, removed from food. Such foraging has been studied for decades, and described by a transition over time from 'local'/'area-restricted' search (roughly in the initial 10-30 minutes of the experiments, in which animals execute frequent turns) to 'dispersion', or 'global search' (characterized by a low frequency of turns). The authors propose an alternative to this two-state description - a potentially more parsimonious single 'state' with time-changing parameters, which they claim can account for the full time course of these observations.

      Figure 1a shows the mean rate of turning events as a function of time (averaged across the population). Here, we see a rapid transient, followed by a gradual 4-5 fold decay in the rate, which then levels off. This picture seems consistent with the two-state description. However, the authors demonstrate that individual animals exhibit different "transition" statistics (Figure 1e) and wish to explain this. They do so by fitting this mean with a single function (Equations 1-3).

      Strengths:

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

      Weaknesses:

      (1) The authors claim that only about half the animals tested exhibit discontinuity in turning rates. Can they automatically separate the empirical and model population into these two subpopulations (with the same method), and compare the results?

      Thank you, we should clarify that the observation that about half the animals exhibit discontinuity was not made by us, but by López-Cruz et al. The observed fraction of 50% was based on a visual assessment of the dual regression method we described. To make the process more objective, we decided to simply plot the distributions of the metrics they used for this assessment to see if two distinct populations could be observed. However, the distributions of slope differences and transition times do not produce two distinct populations. Our stochastic approach, which does not assume abrupt state-transitions, also produces comparable distributions. To quantify this, we will perform permutation tests on the mean and variance differences between experimental and model data.
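
      A permutation test of the kind mentioned above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name `permutation_test`, the number of permutations, and the two-sided form are all assumptions. The statistic is passed in, so the same routine covers differences in means and in variances.

```python
import random
import statistics

def permutation_test(x, y, stat=statistics.mean, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the difference stat(x) - stat(y).

    Hypothetical helper: x and y would be, for example, slope differences
    from experimental and simulated animals.
    """
    rng = random.Random(seed)
    observed = abs(stat(x) - stat(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled samples
        diff = abs(stat(pooled[:n_x]) - stat(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

      Calling it with `stat=statistics.variance` tests the variance difference instead of the mean difference. Note that a large p-value only fails to reject equality of the two distributions in that statistic; it does not by itself establish that the model matches the data.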

      (2) The equations consider an exponentially decaying rate of turning events. If so, Figure 2b should be shown on a semi-logarithmic scale.

      We are happy to add this panel as well.
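
      The reason for the semi-logarithmic scale is that an exponentially decaying rate r(t) = r0 exp(-gamma t) becomes a straight line with slope -gamma when log r is plotted against t. The sketch below fits that line by ordinary least squares; the function name `fit_exponential` and the binned-rate inputs are illustrative assumptions, and bins with zero rate would have to be excluded or handled separately before taking logarithms.

```python
import math

def fit_exponential(times, rates):
    """Least-squares line through (t, log r); returns (r0, gamma).

    Assumes every rate is strictly positive.
    """
    logs = [math.log(r) for r in rates]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope
```

      Systematic curvature of the data away from this line on a semi-log plot, such as a plateau at late times, would indicate a departure from a single exponential.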

      (3) The variables in Equations 1-3 and the methods for simulating them are not well defined, making the method difficult to follow. Assuming my reading is correct, Omega should be defined as the cumulative number of turning events over time (Omega(t)), not as a "turn" or "reorientation", which has no derivative. The relevant entity in Figure 1a is apparently <Omega (t)>, i.e. the mean number of events across a population which can be modelled by an expectation value. The time derivative would then give the expected rate of turning events as a function of time.

      Thank you, you are correct. Please see response to Reviewer #1.

      (4) Equations 1-3 are cryptic. The authors need to spell out up front that they are using a pair of coupled stochastic processes, sampling a hidden state M (to model the dynamic turning rate) and the actual turn events, Omega(t), separately, as described in Figure 2a. In this case, the model no longer appears more parsimonious than the original 2-state model. What then is its benefit or explanatory power (especially since the process involving M is not observable experimentally)?

      Thank you, yes we see how as written this was confusing. In our response to Reviewer #1, we added an important detail:

      While reorientations are modeled as discrete events, which is observationally true, the amount of M at time t = 0 is chosen to be large (M<sub>0</sub>←1,000), so that over the timescale of 40 minutes, the decay in M is practically continuous. This ensures that sudden changes in reorientations are not due to sudden changes in M, but due to the inherent stochasticity of reorientations.

      However, you are correct that if M were chosen to have a binary value of 0 or 1, then this would indeed be the two-state model. Adding this as an additional model would be a good way to compare how it matches the experimental data, and we are happy to add it.
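
      The pair of coupled stochastic processes discussed in comment (4) can be sketched with a Gillespie-style simulation. Equations 1-3 are not reproduced in this response, so the form used here is an assumption: M loses one unit at a time at total rate gamma * M, and reorientations occur at rate alpha * M. The function name `simulate_worm` and the values of alpha and gamma are illustrative only; they were picked to give roughly 30 events in 40 minutes and are not the fitted parameters.

```python
import math
import random

def simulate_worm(m0=1000, alpha=0.003, gamma=0.1, t_end=40.0, rng=None):
    """Return reorientation times (minutes) for one simulated animal.

    Assumed model: M decays one unit at a time at rate gamma * M, and
    reorientations occur at rate alpha * M. Values are illustrative.
    """
    rng = rng or random.Random()
    t, m, turns = 0.0, m0, []
    while m > 0:
        decay_rate = gamma * m
        turn_rate = alpha * m
        total = decay_rate + turn_rate
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= t_end:
            break
        if rng.random() < turn_rate / total:
            turns.append(t)          # a reorientation; M is unchanged
        else:
            m -= 1                   # one unit of M decays
    return turns
```

      With m0 large, M(t) stays close to its smooth mean m0 * exp(-gamma * t), so any abrupt-looking change in a single trajectory comes from the event sampling. With m0 = 1 and alpha rescaled accordingly, the same code reduces to a two-state switch with an exponentially distributed transition time, which is the limiting case described above.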

      (5) Further, as currently stated in the paper, Equations 1-3 are only for the mean rate of events. However, the expectation value is not a complete description of a stochastic system. Instead, the authors need to formulate the equations for the probability of events, from which they can extract any moment (they write something in Figure 2a, but the notation there is unclear, and this needs to be incorporated here).

      Thank you, yes please see our response to Reviewer #1.

      (6) Equations 1-3 have three constants (alpha and gamma which were fit to the data, and M0 which was presumably set to 1000). How does the choice of M0 affect the results?

      Thank you, this is a good question. We will test this down to a binary state of M as mentioned in comment #4.

      (7) M decays to near 0 over 40 minutes, abolishing omega turns by the end of the simulations. Are omega turns entirely abolished in worms after 30-40 minutes off food? How do the authors reconcile this decay with the leveling of the turning rate in Figure 1a?

      Yes, reviewer #1 recommended adding a baseline reorientation rate which is likely more biologically plausible. However, we should also note that in Klein et al they observed a continuous decay over 50 minutes.

      (8) The fit given in Figure 2b does not look convincing. No statistical test was used to compare the two functions (empirical and fit). No error bars were given (to either). These should be added. In the discussion, the authors explain the discrepancy away as experimental limitations. This is not unreasonable, but on the flip side, makes the argument inconclusive. If the authors could model and simulate these limitations, and show that they account for the discrepancies with the data, the model would be much more compelling. To do this, I would imagine that the authors would need to take the output of their model (lists of turning times) and convert them into simulated trajectories over time. These trajectories could be used to detect boundary events (for a given size of arena), collisions between individuals, etc. in their simulations and to see their effects on the turn statistics.

      Thank you, we will add error bars and perform a permutation test on the mean and variance differences between experiment and model over the 40 minute window.

      (9) The other figures similarly lack any statistical tests and by eye, they do not look convincing. The exception is the 6 anecdotal examples in Figure 2e. Those anecdotal examples match remarkably closely, almost suspiciously so. I'm not sure I understood this though - the caption refers to "different" models of M decay (and at least one of the 6 examples clearly shows a much shallower exponential). If different M models are allowed for each animal, this is no longer parsimonious. Are the results in Figure 2d for a single M model? Can Figure 2e explain the data with a single (stochastic) M model?

      Thank you, yes, we will perform permutation tests on the mean and variance differences in the observed distributions in Figure 2d. We certainly don’t want the panels in Figure 2e to be suspicious! These comparisons were drawn by calculating the correlations between all model traces and all experimental traces, and then choosing the top hits. Every time we run the simulation, we arrive at a different set of examples. Since it was recommended that we add a baseline rate, these examples will be a completely different set when we run the simulation again.

      We apologize for the confusion regarding M. Since the worms do not all start out with identical reorientation rates, we drew the initial M value from a distribution centered on M0 and a variance to match the initial distribution of observed experimental rates.
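
      The selection procedure described above for Figure 2e (correlate every model trace with every experimental trace, then keep the top hits) can be sketched as follows. The function names and the choice of Pearson correlation on equal-length cumulative-count traces are assumptions, since the response does not specify them.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def top_matches(model_traces, data_traces, k=6):
    """Return the k best (correlation, model_index, data_index) triples."""
    scored = [
        (pearson(m, d), i, j)
        for i, m in enumerate(model_traces)
        for j, d in enumerate(data_traces)
    ]
    scored.sort(reverse=True)  # highest correlation first
    return scored[:k]
```

      Because the pairs are picked from all model-by-data combinations precisely for their similarity, close visual agreement in the selected examples is expected by construction; such examples illustrate what the model can produce, while the distribution-level tests carry the quantitative comparison.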

      (10) The left axes of Figure 2e should be reverted to cumulative counts (without the normalization).

      Thank you, we will add this. We want to clarify that we normalized it because we chose these examples based on correlation to show that the same types of sudden changes in search strategy can occur with a model that doesn’t rely on sudden rate changes.

      (11) The authors give an alternative model of a Levy flight, but do not give the obvious alternative models:

      a) the 1-state model in which P(t) = alpha exp (-gamma t) dt (i.e. a single stochastic process, without a hidden M, collapsing equations 1-3 into a single equation).

      b) the originally proposed 2-state model (with 3 parameters, a high turn rate, a low turn rate, and the local-to-global search transition time, which can be taken from the data, or sampled from the empirical probability distributions). Why not? The former seems necessary to justify the more complicated 2-process model, and the latter seems necessary since it's the model they are trying to replace. Including these two controls would allow them to compare the number of free parameters as well as the model results. I am also surprised by the Levy model since Levy is a family of models. How were the parameters of the Levy walk chosen?

      Thank you, we will remove this section completely, as it is tangential to the main point of the paper.
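
      For reference, the two control models listed in comment (11) can be sketched as below. This is an illustration under stated assumptions, not code from the paper: the 1-state model is sampled by thinning an inhomogeneous Poisson process with rate alpha * exp(-gamma * t), and the 2-state model is a Poisson process whose rate drops from `high` to `low` at a given transition time. All function and parameter names are hypothetical.

```python
import math
import random

def one_state_turns(alpha, gamma, t_end, rng):
    """1-state model by thinning: rate(t) = alpha * exp(-gamma * t)."""
    t, turns = 0.0, []
    while True:
        t += rng.expovariate(alpha)              # candidate at bounding rate
        if t >= t_end:
            return turns
        if rng.random() < math.exp(-gamma * t):  # accept with rate(t) / alpha
            turns.append(t)

def two_state_turns(high, low, t_switch, t_end, rng):
    """2-state model: rate `high` before t_switch, `low` after (both > 0)."""
    t, turns = 0.0, []
    while True:
        rate = high if t < t_switch else low
        t_next = t + rng.expovariate(rate)
        if t < t_switch <= t_next:
            t = t_switch  # memoryless: restart the clock at the switch
            continue
        t = t_next
        if t >= t_end:
            return turns
        turns.append(t)
```

      The 1-state model has two free parameters (alpha, gamma) and the 2-state model three (high, low, t_switch), which is the parameter-count comparison the reviewer asks for.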

      (12) One point that is entirely missing in the discussion is the individuality of worms. It is by now well known that individual animals have individual behaviors. Some are slow/fast, and similarly, their turn rates vary. This makes this problem even harder. Combined with the tiny number of events concerned (typically 20-40 per experiment), it seems daunting to determine the underlying model from behavioral statistics alone.

      Thank you, yes we should have been more explicit in the reasoning behind drawing the initial M from a distribution (response to comment #9). We assume that not every worm starts out with the same reorientation rate, but that some start out fast (high M) and some start out slow (low M). However, we do assume M decays with the same kinetics, which seems sufficient to produce the observed phenomena.

      (13) That said, it's well-known which neurons underpin the suppression of turning events (starting already with Gray et al 2005, which, strangely, was not cited here). Some discussion of the neuronal predictions for each of the two (or more) models would be appropriate.

      Thank you, yes we will add Gray et al, but also the more detailed response to Reviewer #2.

      (14) An additional point is the reliance entirely on simulations. A rigorous formulation (of the probability distribution rather than just the mean) should be analytically tractable (at least for the first moment, and possibly higher moments). If higher moments are not obtainable analytically, then the equations should be numerically integrable. It seems strange not to do this.

      Thank you for suggesting this, we will add these analyses.
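
      As an indication of what such an analysis might look like, the following is a sketch under assumptions rather than the authors' derivation. Equations 1-3 are not reproduced here, so we assume that M decays exponentially, that M0 is large enough for M(t) to be treated as deterministic, and that the turning rate is proportional to M. The constant beta is a hypothetical baseline rate of the kind recommended by Reviewer #1; setting beta = 0 recovers the model without a baseline.

```latex
\begin{align}
  \frac{dM}{dt} &= -\gamma M, \quad M(0) = M_0
    \;\Longrightarrow\; M(t) = M_0 e^{-\gamma t}, \\
  \lambda(t) &= \alpha M(t) + \beta
    = \alpha M_0 e^{-\gamma t} + \beta, \\
  \Lambda(t) &\equiv \langle \Omega(t) \rangle
    = \int_0^t \lambda(s)\, ds
    = \frac{\alpha M_0}{\gamma}\left(1 - e^{-\gamma t}\right) + \beta t, \\
  P\big(\Omega(t) = k\big) &= \frac{\Lambda(t)^k}{k!}\, e^{-\Lambda(t)},
  \qquad \operatorname{Var}\big[\Omega(t)\big] = \Lambda(t).
\end{align}
```

      Under these assumptions the cumulative count Omega(t) is an inhomogeneous Poisson process, so all moments follow from Lambda(t). With beta = 0 the expected total number of turns saturates at alpha * M0 / gamma, whereas any beta > 0 gives a non-zero turning rate at late times.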

      In summary, while sample simulations do nicely match the examples in the data (of discontinuous vs continuous turning rates), this is not sufficient to demonstrate that the transition from ARS to dispersion in C. elegans is, in fact, likely to be a single 'state', or this (eq 1-3) single state. Of course, the model can be made more complicated to better match the data, but the approach of the authors, seeking an elegant and parsimonious model, is in principle valid, i.e. avoiding a many-parameter model-fitting exercise.

      As a qualitative exercise, the paper might have some merit. It demonstrates that apparently discrete states can sometimes be artifacts of sampling from smoothly time-changing dynamics. However, as a generic point, this is not novel, and so without the grounding in C. elegans data, is less interesting.

      Thank you, we agree that this is a generic phenomenon, which is partly why we did this. The data from López-Cruz et al seem to agree in part with Calhoun et al, who claim that abrupt transitions occur, and in part with Klein et al, who claim that they do not. Since the underlying phenomenon is stochastic, we propose that the mixed observations of sudden and gradual changes in search strategy are simply the result of a stochastic process, which can produce both phenomena for individual observations.

    1. eLife Assessment

      The study presents valuable findings on the molecular mechanisms of glucose-stimulated insulin secretion from pancreatic islets, focusing on the main regulatory elements of the signaling pathway in physiological conditions. While the evidence supporting the conclusions is solid, the study can be strengthened by the use of a beta cell line or knockout mice. The work will be of interest to cell biologists and biochemists working on diabetes.

    2. Reviewer #2 (Public review):

      The authors identified new target elements for prostaglandin E2 (PGE2) through which insulin release can be regulated in pancreatic beta cells under physiological conditions. In vitro extracellular exposure to PGE2 could directly and dose-dependently inhibit the potassium channel Kv2.2. In vitro pharmacology revealed that this inhibition occurs through the EP2/4 receptors, which activate protein kinase A (PKA). By screening specific sites of the Kv2.2 channel, the target phosphorylation site (S448) for PKA regulation was found. The physiological relevance of the described signaling cascade was investigated and confirmed in vivo, using a Kv2.2 knockdown mouse model.

      The strength of this manuscript is the novelty of the (EP2/4-PKA-Kv2.2 channel) molecular pathway described and the comprehensive methodological toolkit the authors have relied upon.

      The introduction is detailed and contains all the information necessary to place the claims in context. Although the dataset is comprehensive and a logical lead is consistently built, there is one important point to consider: to clarify that the described signaling pathway is characteristic of normal physiological conditions and thus differs from pathological changes. It would be useful to carry out basic experiments in a diabetes model (regardless of whether this is in mice or rats).

      Comments on revisions:

      The authors addressed my comments sufficiently. I have no additional questions to clarify.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study investigated the mechanism by which PGE2 inhibits the release of insulin from pancreatic beta cells in response to glucose. The researchers used a combination of cell line experiments and studies in mice with genetic ablation of the Kv2.2 channel. Their findings suggest a novel pathway where PGE2 acts through EP2/EP4 receptors to activate PKA, which directly phosphorylates a specific site (S448) on the Kv2.2 channel, inhibiting its activity and reducing GSIS.

      Strengths:

      - The study elegantly demonstrates a potential pathway connecting PGE2, EP2/EP4 receptors, PKA, and Kv2.2 channel activity, using an embryonic cell line.

      - Additional experiments in INS1 and primary mouse beta cells with altered Kv2.2 function partially support the inhibitory role of PGE2 on GSIS through Kv2.2 inhibition.

      Weaknesses:

      - A critical limitation is the use of HEK293T cells, which are not pancreatic beta cells. Functional aspects can differ significantly between these cell types.

      - The study needs to address the apparent contradiction of PKA activating insulin secretion in beta cells, while also inhibiting GSIS through the proposed mechanism.

      - A more thorough explanation is needed for the discrepancies observed between the effects of PGE2 versus Kv2.2 knockdown/mutation on the electrical activity of beta cells and GSIS.

      Thank you for your positive evaluation and constructive feedback on our study. We appreciate the concern regarding the use of HEK293T cells, which are not pancreatic beta cells and may exhibit functional differences. In response, we have repeated our key experiments using INS1 cells and primary mouse beta cells, which are more representative of the native beta cell environment. These additional experiments confirm our hypothesis and further support the role of Kv2.2 in PGE2-induced inhibition of GSIS. In beta cells, glucose-induced PKA activation is highly localized. As a result, while some PKA pathways promote insulin secretion, others may inhibit it. To directly demonstrate that PGE2-induced PKA phosphorylation of Kv2.2 is involved in the inhibitory effect on GSIS, we overexpressed the S448A mutant Kv2.2 channel in INS-1(832/13) cells. Our results show that Kv2.2-S448A channels significantly attenuate the inhibitory effect of PGE2 on GSIS, further supporting the critical role of Kv2.2 phosphorylation at S448. These data have been added to the revised Figure 7C.

      Reviewer #2 (Public Review):

      The authors identified new target elements for prostaglandin E2 (PGE2) through which insulin release can be regulated in pancreatic beta cells under physiological conditions. In vitro extracellular exposure to PGE2 could directly and dose-dependently inhibit the potassium channel Kv2.2. In vitro pharmacology revealed that this inhibition occurs through the EP2/4 receptors, which activate protein kinase A (PKA). By screening specific sites of the Kv2.2 channel, the target phosphorylation site (S448) for PKA regulation was found. The physiological relevance of the described signaling cascade was investigated and confirmed in vivo, using a Kv2.2 knockdown mouse model.

      The strength of this manuscript is the novelty of the (EP2/4-PKA-Kv2.2 channel) molecular pathway described and the comprehensive methodological toolkit the authors have relied upon.

      The introduction is detailed and contains all the information necessary to place the claims in context. Although the dataset is comprehensive and a logical lead is consistently built, there is one important point to consider: to clarify that the described signaling pathway is characteristic of normal physiological conditions and thus differs from pathological changes. It would be useful to carry out basic experiments in a diabetes model (regardless of whether this is in mice or rats).

      Thank you for your positive evaluation and insightful comment. We have clarified in the Discussion section that our findings pertain specifically to physiological conditions. We acknowledge the importance of investigating the signaling pathway in a pathological context and plan to conduct experiments using a diabetes model in future studies to explore how this pathway may differ under such conditions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure 3A-C: PKA activation regulates different functional aspects in beta cells and HEK293T cells. It is well known that PKA activation enhances insulin secretion in beta cells, therefore the mechanisms that allow the same pathway at the same time to inhibit GSIS are not clear and should be addressed by experiments in beta cells.

      Thank you for your insightful comment. Specificity and versatility in cAMP-PKA signaling are governed by the spatial localization and temporal dynamics of the signal. In beta cells, glucose-induced PKA activation is highly localized (Tengholm and Gylfe, 2017). As a result, while some PKA pathways promote insulin secretion, others may inhibit it. For example, a global increase in cAMP, such as through treatment with Db-cAMP, can simultaneously activate both stimulatory and inhibitory PKA pathways, reflecting a more integrated, complex response. In previous studies, 1 mM Db-cAMP was shown to enhance GSIS in INS-1 cells (Dezaki et al., 2011). We observed that 1 mM Db-cAMP increased GSIS, but a lower concentration (10 μM) decreased GSIS (as shown in Author response image 1). These findings suggest that not all PKA signaling events increase GSIS. To further investigate the role of PGE2-induced PKA phosphorylation of Kv2.2 in the inhibition of GSIS, we overexpressed the S448A mutant of Kv2.2 in INS-1 (832/13) cells. Our results showed that the Kv2.2-S448A mutant significantly attenuated the inhibitory effect of PGE2 on GSIS. These new data have been incorporated into the revised Figure 7C.

      Author response image 1.

      Effect of Db-cAMP on GSIS in INS-1 cells. Statistics for the effect of different concentrations of Db-cAMP on GSIS in INS-1(832/13) cells. One-way ANOVA with Bonferroni post hoc test. *p < 0.05; ***p < 0.001; ****p < 0.0001; n.s., not significant.

      (2) Figure 3G: One would expect that the phospho-mimetic mutation, S448D, will have an opposite effect to S448A and a similar effect as PGE2 or PKA activator in Figure 3B. There is no explanation by the authors for having the same effect in S448A and S448D.

      Thank you for your thoughtful comment. Indeed, the S448D mutation exhibited a similar effect to PGE2 on Kv2.2 channels, as we observed significantly smaller currents compared to wild-type Kv2.2 (Figure 3F). The S448D mutation mimics the phosphorylated state of S448, and since PGE2 regulates Kv2.2 channels by phosphorylating this residue, it has no further effect on the S448D mutant (Figure 3G). In contrast, the S448A mutation prevents phosphorylation at this site, which explains why PGE2 has no effect on the currents of S448A mutant Kv2.2 channels (Figure 3H). These results confirm that PGE2 modulates Kv2.2 channels specifically through phosphorylation of S448, as evidenced by the lack of effect on both the S448A and S448D mutants.

      (3) Figure 4E: Since both PGE2 and Kv2.2 KD inhibit the activity of the channel, it doesn't definitively prove whether PGE2 acts through Kv2.2 in INS-1 cells. A complementary experiment should be done in which overactivation of Kv2.2 rescues the effect of PGE2. For example, with the S448A form of the channel.

      We appreciate your comment and valuable suggestion. Knockdown of Kv2.2 abrogated the inhibitory effect of PGE2 on I<sub>K</sub> currents in INS-1 cells (Figure 4E and F), which strongly indicates that PGE2 acts through Kv2.2. While we agree that the suggested complementary experiment with Kv2.2 overactivation (e.g., using the S448A mutant) could provide additional insights, we believe the current data sufficiently support our conclusion, as the knockdown of Kv2.2 eliminates the observed PGE2 effect, providing direct evidence of the channel's involvement.

      (4) Figure 5C: This result requires further explanation. If PGE2 downregulates Kv2.2 activity and has an inhibitory effect on GSIS, why does Kv2.2 KD have the opposite effect?

      The knockdown of Kv2.2 (Fig. 5C) reduced action potential (AP) firing rates compared to the scramble control (Fig. 5B), which is expected because Kv2.2 is critical for maintaining AP firing. When Kv2.2 is knocked down, the reduced AP firing diminishes the system’s responsiveness to further modulation by PGE2. This is because PGE2 exerts its effects primarily through Kv2.2 channels. Therefore, in the Kv2.2 knockdown condition, PGE2 does not exert an additional inhibitory effect on AP firing rates, as the channels critical for its action are already impaired.

      (5) Figure 5D - The EP1-EP4 receptor antibodies should be validated at least in INS-1(832/13) cells using knockdowns.

      Thank you for your suggestion. We have validated the EP1-EP4 receptor antibodies in INS-1(832/13) cells using knockdown experiments. The validation results, including confirmation of specificity and knockdown efficiency, are provided in Supplemental Figure S2.

      (6) Figure 7B - These experiments don't necessarily prove that PGE2 acts directly through Kv2.2 inhibition. Using the S448A mutation in these experiments could prove this point.

      Thank you for this valuable suggestion. We have now overexpressed the S448A mutant Kv2.2 channels in INS-1(832/13) cells, and the results demonstrate that Kv2.2-S448A channels significantly reduce the inhibitory effect of PGE2 on GSIS. These new data have been incorporated into the revised Figure 7C.

      Reviewer #2 (Recommendations For The Authors):

      (1) Deficiencies and inaccuracies in the description of the methods (animal numbers, name of vendors, abbreviations) and the typos in the figures (axis label) require correction.

      Thank you for pointing this out. We have carefully reviewed the manuscript and the figures, making the necessary corrections to address the deficiencies in the methods section and the typos in the figure axis labels.

      (2) Reducing the number of figures (Figures 7/C-E: knockout mouse line test and Figure1/HEK cell experiments could be part of supplementary) and paragraphs would make the manuscript more compact and powerful. It would also ease its reading for non-experts.

      Thank you for your suggestion. We have moved Figures 7C-E to the supplementary data (Supplemental Figure S1) to streamline the main manuscript.

      (3) Multiple immunostainings for EP receptors in insulinoma cells or pancreatic islets would be representative.

      Because the antibodies (EP1, EP2, EP4) are all rabbit-derived, performing multiple immunostainings on the same samples is not feasible due to potential cross-reactivity. However, the immunohistochemistry images demonstrate that each antibody labels more than 90% of the cells, indicating that β-cells express different subtypes of EP receptors simultaneously.

      (4) The antagonists chosen (AH6809, AH23848) are non-specific. Experiments should be re-run (at least some) under more stringent conditions.

      Thank you for your suggestion. AH6809 and AH23848 are well-documented, widely used antagonists in the literature. To further strengthen our findings, we have included additional, widely-used antagonists: the EP2-specific antagonist TG4155 and the EP4-specific antagonist GW627368. The results obtained with these new antagonists were consistent with those observed using AH6809 and AH23848. These updated data are now included in the revised Figure 4I and 4J.

      (5) It would be very helpful to indeed emphasise that this work is for physiological conditions and that it is (or is not) modified in diabetes. Maybe even irrelevant for diabetes (?). This needs to be clarified and supported by data even if one could assume the authors intend to have a follow-up entirely dedicated to pathological changes, perhaps.

      Thank you for this insightful comment. We have clarified in the Discussion that our findings are specific to physiological conditions. To address this point, we have added the following statement:

      "Importantly, our findings pertain to physiological conditions. While we demonstrate the inhibitory effects of PGE2 on Kv2.2 channels in normal β-cells, the role of this pathway under diabetic conditions remains to be investigated and will be the focus of future studies."

      Dezaki K, Damdindorj B, Sone H, Dyachok O, Tengholm A, Gylfe E, Kurashina T, Yoshida M, Kakei M, Yada T (2011) Ghrelin attenuates cAMP-PKA signaling to evoke insulinostatic cascade in islet beta-cells. Diabetes 60:2315-2324.

      Tengholm A, Gylfe E (2017) cAMP signalling in insulin and glucagon secretion. Diabetes Obes Metab 19 Suppl 1:42-53.

    1. eLife Assessment

      This paper presents miniML, an AI-based framework for the detection of synaptic events. Benchmark results presented in the paper are compelling, demonstrating the superiority of miniML over current state-of-the-art alternatives. The performance of miniML is demonstrated across various experimental paradigms, showing that miniML has the potential to become a valuable tool for the analysis of synaptic signals.

    2. Reviewer #1 (Public review):

      O'Neill et al. have developed a software analysis application, miniML, that enables the quantification of electrophysiological events. They utilize a supervised deep learning-based method to optimize the software. miniML is able to quantify and standardize the analyses of miniature events, using both voltage and current clamp electrophysiology, as well as optically driven events using iGluSnFR3, in a variety of preparations, including in the cerebellum, calyx of Held, Golgi cell, human iPSC cultures, zebrafish, and Drosophila. The software appears to be flexible, in that users are able to hone and adapt the software to new preparations and events. Importantly, miniML is open-source software that is free for researchers to use and enables users to adapt new features using Python.

      Overall this new software has the potential to become widely used in the field and an asset to researchers. Importantly, a new graphical user interface has been generated that enables more user control and a more user-friendly experience. Further, the authors demonstrate how miniML performs relative to other platforms that have been developed, and highlight areas where miniML works optimally. With these revisions, miniML should now be of considerable benefit and utility to a variety of researchers.

    3. Reviewer #2 (Public review):

      Summary:

      This paper presents miniML as a supervised method for detection of spontaneous synaptic events. Recordings of such events are typically of low SNR, where state-of-the-art methods are prone to high false positive rates. Unlike current methods, training miniML requires neither prior knowledge of the kinetics of events nor the tuning of parameters/thresholds.

      The proposed method comprises four convolutional networks, followed by a bi-directional LSTM and a final fully connected layer, which outputs a decision event/no event per time window. A sliding window is used when applying miniML to a temporal signal, followed by an additional estimation of events' time stamps. miniML outperforms current methods for simulated events superimposed on real data (with no events) and presents compelling results for real data across experimental paradigms and species.

      Strengths:

      The authors present a pipeline for benchmarking based on simulated events superimposed on real data (with no events). Compared to five other state-of-the-art methods, miniML leads to the highest detection rates and is most robust to specific choices of threshold values for fast or slow kinetics. A major strength of miniML is the ability to use it for different datasets. For this purpose, the CNN part of the model is held fixed and the subsequent networks are trained to adapt to the new data. This Transfer Learning (TL) strategy reduces computation time significantly and more importantly, it allows for using a substantially smaller data set (compared to training a full model) which is crucial as training is supervised (i.e. uses labeled examples).

      Weaknesses:

      The authors do not indicate how the specific configuration of miniML was set, i.e. number of CNNs, units, LSTM, etc. Please provide further information regarding these design choices, whether they were based on similar models or if chosen based on performance.

      The data for the benchmark system was augmented with equal amounts of segments with/without events. Data augmentation was undoubtedly crucial for successful training.

      (1) Does a balanced dataset reflect the natural occurrence of events in real data? Could the authors provide more information regarding this matter?

      (2) Please provide a more detailed description of this process as it would serve users aiming to use this method for other sub-fields.

      The benchmarking pipeline is indeed valuable and the results are compelling. However, the authors do not provide comparative results for miniML for real data (figures 4-8). TL does not apply to the other methods. In my opinion, presenting the performance of other methods, trained using the smaller dataset would be convincing of the modularity and applicability of the proposed approach.

      Impact:

      Accurate detection of synaptic events is crucial for the study of neural function. miniML has a great potential to become a valuable tool for this purpose as it yields highly accurate detection rates, it is robust, and is relatively easily adaptable to different experimental setups.

      Comments on revisions:

      The revised manuscript presents a compelling framework. The performance of miniML is thoroughly explored and compared to several benchmarks. The training process, along with other technical issues, is now described at a satisfactory level of detail.

      I think the authors did a great job. They answered all claims and concerns raised by me and the other reviewers.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      O’Neill et al. have developed a software analysis application, miniML, that enables the quantification of electrophysiological events. They utilize a supervised deep learning-based method to optimize the software. miniML is able to quantify and standardize the analyses of miniature events, using both voltage and current clamp electrophysiology, as well as optically driven events using iGluSnFR3, in a variety of preparations, including in the cerebellum, calyx of Held, Golgi cell, human iPSC cultures, zebrafish, and Drosophila. The software appears to be flexible, in that users are able to hone and adapt the software to new preparations and events. Importantly, miniML is open-source software that is free for researchers to use and enables users to adapt new features using Python.

      Overall this new software has the potential to become widely used in the field and an asset to researchers. However, the authors fail to discuss or even cite a similar analysis tool recently developed (SimplyFire), and determine how miniML performs relative to this platform. There are a handful of additional suggestions to make miniML more user-friendly, and of broad utility to a variety of researchers, as well as some suggestions to further validate and strengthen areas of the manuscript:

      (1) miniML relative to existing analysis methods: There is a major omission in this study, in that a similar open source, Python-based software package for event detection of synaptic events appears to be completely ignored. Earlier this year, another group published SimplyFire in eNeuro (Mori et al., 2024; doi: 10.1523/eneuro.0326-23.2023). Obviously, this previous study needs to be discussed and ideally compared to miniML to determine if SimplyFire is superior or similar in utility, and to underscore differences in approach and accuracy.

      We thank the reviewer for bringing this interesting publication to our attention. We have included SimplyFire in our benchmarking for comprehensive comparison with miniML. The approach taken by SimplyFire differs from miniML in a number of ways. Our results show that miniML provides higher recall and precision than SimplyFire (revised Figure 3). We appreciate that SimplyFire provides a user interface similar to the commonly used MiniAnalysis software. In addition, the peak-finding-based approach of SimplyFire makes it relatively robust to event shape, which facilitates analysis of diverse data. However, we noted a strong threshold dependence and long run time of SimplyFire (revised Figure 3 and Figure 3—figure supplement 1). In addition, SimplyFire is not robust against various types of noise typically encountered in electrophysiological recordings. Our extended benchmark analysis thus indicates that AI-based event detection is superior to existing algorithmic approaches, including SimplyFire.

      (2) The manuscript should comment on whether miniML works equally well to quantify current clamp events (voltage; e.g. EPSP/mEPSPs) compared to voltage clamp (currents, EPSC/mEPSCs), which the manuscript highlights. Are rise and decay time constants calculated for each event similarly?

      miniML works equally well for current and voltage events (Figure 5, Figure 9). In general, events of opposite polarity can be analyzed by simply inverting the data. Transfer learning models may further improve the detection.

      For each detected event, independent of data/recording type, rise times are calculated as 10–90% times (baseline–peak), and decay times are calculated as time to 50% of the peak. In addition, event decay time constants are calculated from a fit to the event average. With miniML being open-source, researchers can adapt the calculations of event statistics to their needs, if desired. In the revised manuscript, we have expanded the Methods section that describes the quantification of event statistics (Methods, Quantification).
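
      As an illustration of the event statistics described above, the following is a minimal sketch of a 10–90% rise time and a time to 50% decay for a single event. It is not taken from the miniML code base: the function name `event_kinetics`, the linear interpolation between samples, and the choice to measure the decay time from the peak are assumptions made for this example.

```python
def event_kinetics(trace, dt, baseline=0.0):
    """Return (rise_10_90, half_decay) in the time unit of dt.

    Hypothetical helper: `trace` holds the samples of one positive-going
    event (negative-going data would be inverted first).
    """
    amplitudes = [v - baseline for v in trace]
    peak_idx = max(range(len(amplitudes)), key=amplitudes.__getitem__)
    peak = amplitudes[peak_idx]

    def first_crossing(indices, level, rising):
        # Linearly interpolate the time at which the trace crosses `level`.
        for i in indices:
            a, b = amplitudes[i], amplitudes[i + 1]
            crossed = (a < level <= b) if rising else (a >= level > b)
            if crossed:
                return (i + (level - a) / (b - a)) * dt
        return None

    rising_part = range(0, peak_idx)
    falling_part = range(peak_idx, len(amplitudes) - 1)
    t10 = first_crossing(rising_part, 0.1 * peak, True)
    t90 = first_crossing(rising_part, 0.9 * peak, True)
    t50 = first_crossing(falling_part, 0.5 * peak, False)

    rise = None if t10 is None or t90 is None else t90 - t10
    half_decay = None if t50 is None else t50 - peak_idx * dt
    return rise, half_decay
```

      Either value is returned as `None` when the corresponding crossing is not contained in the supplied window, which makes truncated events easy to filter out.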

      (3) The interface and capabilities of miniML appear quite similar to Mini Analysis, the free software that many in the field currently use. While the ability and flexibility for users to adapt and adjust miniML for their own uses/needs using Python programming is a clear potential advantage, can the authors comment, or better yet, demonstrate, whether there is any advantage for researchers to use miniML over Mini Analysis or SimplyFire if they just need the standard analyses?

      Following the reviewer’s suggestion, we developed a graphical user interface (GUI) for miniML to enhance its usability (Figure 2—figure supplement 2), which is provided on the GitHub repository. Our comprehensive benchmark analysis demonstrated that miniML outperforms existing tools such as MiniAnalysis and SimplyFire. The main advantages are (i) increased reliability of results, which eliminates the need for visual inspection; (ii) fast runtime and easy automation; (iii) superior detection performance as demonstrated by higher recall in both synthetic and real data; (iv) open-source Python-based design. We believe that these advantages make miniML a valuable tool for researchers recording various types of synaptic events, offering a more efficient and reliable solution compared to existing methods.

      (4) Additional utilities for miniML: The authors show miniML can quantify miniature electrophysiological events in both current and voltage clamp, as well as optical glutamate transients using iGluSnFR. As the authors mention in the discussion, the same approach could, in principle, be used to quantify evoked (EPSC/EPSP) events using electrophysiology, Ca2+ events (using GCaMP), and AP waveforms using voltage indicators like ASAP4. While I don’t think it is reasonable to ask the authors to generate any new experimental data, it would be great to see how miniML performs when analysing data from these approaches, particularly to quantify evoked synaptic events and/or Ca2+ (ideally postsynaptic Ca2+ signals from miniature events, as the Drosophila NMJ have developed nice approaches).

      In the revised manuscript, we have extended the application examples of miniML. We applied miniML to detect mEPSPs recorded with the novel voltage-sensitive indicator ASAP5 (Figure 9 and Figure 9—figure supplement 1). We performed simultaneous recordings of membrane voltage through electrophysiology and ASAP5 voltage imaging in rat cultured neurons at physiological temperature. Data were analyzed using miniML, with electrophysiology data being used as ground-truth for assessing detection performance in imaging data. Our results demonstrate that miniML robustly detects mEPSPs in current-clamp, and can localize corresponding transients in imaging data. Furthermore, we observed that miniML performs better than template matching and deconvolution on ASAP5 imaging data (Figure 9 and Figure 9—figure supplement 2).

      Reviewer 2 (Public Review):

      This paper presents miniML as a supervised method for the detection of spontaneous synaptic events. Recordings of such events are typically of low SNR, where state-of-the-art methods are prone to high false positive rates. Unlike current methods, training miniML requires neither prior knowledge of the kinetics of events nor the tuning of parameters/thresholds.

      The proposed method comprises four convolutional networks, followed by a bi-directional LSTM and a final fully connected layer which outputs a decision event/no event per time window. A sliding window is used when applying miniML to a temporal signal, followed by an additional estimation of events’ time stamps. miniML outperforms current methods for simulated events superimposed on real data (with no events) and presents compelling results for real data across experimental paradigms and species.

      Strengths:

      The authors present a pipeline for benchmarking based on simulated events superimposed on real data (with no events). Compared to five other state-of-the-art methods, miniML leads to the highest detection rates and is most robust to specific choices of threshold values for fast or slow kinetics. A major strength of miniML is the ability to use it for different datasets. For this purpose, the CNN part of the model is held fixed and the subsequent networks are trained to adapt to the new data. This Transfer Learning (TL) strategy reduces computation time significantly and more importantly, it allows for using a substantially smaller data set (compared to training a full model) which is crucial as training is supervised (i.e. uses labeled examples).

      Weaknesses:

      The authors do not indicate how the specific configuration of miniML was set, i.e. number of CNNs, units, LSTM, etc. Please provide further information regarding these design choices, whether they were based on similar models or if chosen based on performance.

      The data for the benchmark system was augmented with equal amounts of segments with/without events. Data augmentation was undoubtedly crucial for successful training.

      (1) Does a balanced dataset reflect the natural occurrence of events in real data? Could the authors provide more information regarding this matter?

      In a given recording, the event frequency determines the ratio of event-containing vs. non-event-containing data segments. Whereas many synapses have a skew towards non-events, high event frequencies as observed, e.g., in pyramidal cells or Purkinje neurons, can shift the ratio towards event-containing data.

      For model training, we extracted data segments from mEPSC recordings in cerebellar granule cells, which have a low mEPSC frequency (about 0.2 Hz, Delvendahl et al. 2019). Unbalanced training data may complicate model training (Drummond and Holte 2003; Prati et al. 2009; Tyagi and Mittal 2020). We therefore decided to balance the training dataset for miniML by down-sampling the majority class (i.e., non-event segments), so that the final datasets for model training contained roughly equal amounts of events and non-events.
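
      The balancing step described above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' pipeline: the function name `balance_by_downsampling`, the label convention (1 for event, 0 for non-event), and the choice of exactly equal class sizes (the response says "roughly equal") are all hypothetical.

```python
import random


def balance_by_downsampling(segments, labels, seed=0):
    """Down-sample the larger class so that both classes have equal size.

    Hypothetical helper: labels are assumed to be 1 (event) or 0 (non-event).
    Returns a shuffled list of (segment, label) pairs.
    """
    events = [s for s, y in zip(segments, labels) if y == 1]
    non_events = [s for s, y in zip(segments, labels) if y == 0]

    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    n = min(len(events), len(non_events))
    events = rng.sample(events, n)
    non_events = rng.sample(non_events, n)

    balanced = [(s, 1) for s in events] + [(s, 0) for s in non_events]
    rng.shuffle(balanced)
    return balanced
```

      With a low-frequency recording, non-event segments form the majority class, so only that class is actually reduced; the minority class is kept in full.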

      (2) Please provide a more detailed description of this process as it would serve users aiming to use this method for other sub-fields.

      We thank the reviewer for raising this point. In the revised manuscript, we present a systematic analysis of the impact of imbalanced training data on model training (Figure 1—figure supplement 2). In addition, we have revised the description of model training and data augmentation in the Methods section (Methods, Training data and annotation).

      The benchmarking pipeline is indeed valuable and the results are compelling. However, the authors do not provide comparative results for miniML for real data (Figures 4-8). TL does not apply to the other methods. In my opinion, presenting the performance of other methods, trained using the smaller dataset would be convincing of the modularity and applicability of the proposed approach.

      Quantitative comparison of synaptic detection methods on real-world data is challenging because the lack of ground-truth data prevents robust, quantitative analyses. Nevertheless, we compared miniML to common template-based and finite-threshold-based methods on four different types of synapses. We noted that miniML generally detects more events, whereas other methods are susceptible to false positives (Figure 4—figure supplement 1). In addition, we analyzed the performance of miniML on voltage imaging data (Figure 9). Simultaneous recordings of electrophysiological and imaging data allowed a quantitative comparison of detection methods in this dataset. Our results demonstrate that miniML provides higher recall for optical minis recorded using ASAP5 (Figure 9 and Figure 9—figure supplement 2; F1 score, Cohen’s d 1.35 vs. template matching and 5.1 vs. deconvolution).
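
      When ground-truth event times are available, as with the simultaneous electrophysiology mentioned above, precision, recall, and F1 score can be computed by matching detections to ground truth. The sketch below is a generic illustration, not the evaluation code used in the manuscript; the function name `detection_scores`, the greedy nearest-neighbour matching, and the `tolerance` parameter are assumptions.

```python
def detection_scores(detected, truth, tolerance):
    """Return (precision, recall, f1) for detected vs. ground-truth event times.

    Hypothetical helper: each ground-truth event can be matched at most once,
    and a detection counts as a true positive if it lies within `tolerance`
    of a still-unmatched ground-truth event.
    """
    unmatched = sorted(truth)
    true_pos = 0
    for t in sorted(detected):
        # Closest ground-truth event that has not been matched yet, if any.
        best = min(unmatched, key=lambda g: abs(g - t), default=None)
        if best is not None and abs(best - t) <= tolerance:
            unmatched.remove(best)
            true_pos += 1

    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

      Precision penalizes false positives and recall penalizes missed events, so a method that detects many noise transients can reach high recall while its F1 score remains low.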

      Impact:

      Accurate detection of synaptic events is crucial for the study of neural function. miniML has a great potential to become a valuable tool for this purpose as it yields highly accurate detection rates, it is robust, and is relatively easily adaptable to different experimental setups.

      Additional comments:

      Line 73: the authors describe miniML as "parameter-free". Indeed, miniML does not require the selection of pulse shape, rise/fall time, or tuning of a threshold value. Still, I would not call it "parameter-free" as there are many parameters to tune, starting with the number of CNNs, and number of units through the parameters of the NNs. A more accurate description would be that as an AI-based method, the parameters of miniML are learned via training rather than tuned by the user.

      We agree that a deep learning model is not parameter-free, and this term may be misleading. We have therefore changed this sentence in the introduction as follows: "The method is fast, robust to threshold choice, and generalizable across diverse data types [...]"

      Line 302: the authors describe miniML as "threshold-independent". The output trace of the model has an extremely high SNR so a threshold of 0.5 typically works. Since a threshold is needed to determine the time stamps of events, I think a better description would be "robust to threshold choice".

      To detect event localizations, a peak search is performed on the model output, which uses a minimum peak height parameter (or threshold). Extreme values for this parameter do indeed have a small impact on detection performance (Figure 3J). We have changed the description in the introduction and discussion according to the reviewer’s suggestion.
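
      To make the role of the minimum peak height concrete, the following is a minimal sketch of a peak search on a prediction trace. It is a plain-Python illustration rather than miniML's implementation; the function name `find_event_peaks` and the plateau handling are assumptions.

```python
def find_event_peaks(prediction, min_height=0.5):
    """Return indices of local maxima in `prediction` that reach min_height.

    Hypothetical helper: flat-topped peaks are reported at the centre of
    the plateau (rounded down).
    """
    peaks = []
    last = len(prediction) - 1
    i = 1
    while i < last:
        if prediction[i] >= min_height and prediction[i] > prediction[i - 1]:
            # Walk across a possible plateau to find its right edge.
            j = i
            while j < last and prediction[j + 1] == prediction[i]:
                j += 1
            # Accept only if the trace falls again after the plateau.
            if j < last and prediction[j + 1] < prediction[i]:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks
```

      Because the prediction trace is close to 0 or 1 for most samples, a wide range of `min_height` values yields the same peaks, which is the sense in which detection is robust to threshold choice. In practice, a library routine such as SciPy's `find_peaks` with its `height` argument serves the same purpose.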

      Reviewer 3 (Public Review):

      The manuscript presents miniML as a novel supervised deep learning-based method for detecting and analyzing spontaneous synaptic events. The authors demonstrate the advantages of using their methods in comparison with previous approaches. The possibility to train the architecture on different tasks using transfer learning approaches is also an added value of the work. There are some technical aspects that would be worth clarifying in the manuscript:

      (1) LSTM Layer Justification: Please provide a detailed explanation for the inclusion of the LSTM layer in the miniML architecture. What specific benefits does the LSTM layer offer in the context of synaptic event detection?

      Our model design choice was inspired by similar approaches in the literature (Donahue et al. 2017; Islam et al. 2020; Passricha and Aggarwal 2019; Tasdelen and Sen 2021; Wang et al. 2020). Convolutional and recurrent neural networks are often combined for time-series classification problems as they allow learning spatial and temporal features, respectively. Combining the strengths of both network architectures can thus help improve the classification performance. Indeed, a CNN-LSTM architecture proved to be superior in both training accuracy and detection performance (Figure 1—figure supplement 2). Further, this architecture requires fewer free parameters than comparable model designs using fully connected layers instead. The revised manuscript shows a comparison of different model architectures (Figure 1—figure supplement 2), and we added the following description to the text (Methods, Deep learning model architecture):

      "The combination of convolutional and recurrent neural network layers helps to improve the classification performance for time-series data. In particular, LSTM layers allow learning temporal features."

      (2) Temporal Resolution: Can you elaborate on the reasons behind the lower temporal resolution of the output? Understanding whether this is due to specific design choices in the model, data preprocessing, or post-processing will clarify the nature of this limitation and its impact on the analysis.

      When running inference on a continuous recording, we choose to use a sliding window approach with stride. Therefore, the model output has a lower temporal resolution than the raw data, which is determined by the stride length (i.e., how many samples to advance the sliding window). While using a stride is not required, it significantly reduces inference time (cf. Figure 2—figure supplement 1). We recommend a stride of 20 samples, which does not impact the detection of events. Any subsequent quantification of events (amplitude, area, rise times, etc.) is performed on raw data. Based on the reviewer’s comment, we have adapted the code to resample the prediction trace to the sampling rate of the original data. This maintains temporal precision and avoids confusion.
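
      The relationship between stride, output resolution, and resampling can be sketched as follows. This is a simplified illustration, not the miniML code: `score` stands in for the trained classifier, and the function names and the use of linear interpolation are assumptions.

```python
def sliding_window_predict(data, window, stride, score):
    """Score each window of `data`, advancing by `stride` samples.

    `score` is a placeholder for the trained classifier: any callable that
    maps one window (a list of samples) to a single number.
    """
    starts = range(0, len(data) - window + 1, stride)
    return [score(data[s:s + window]) for s in starts]


def resample_linear(values, n_out):
    """Linearly interpolate `values` onto n_out evenly spaced points."""
    if n_out == 1 or len(values) == 1:
        return [values[0]] * n_out
    scale = (len(values) - 1) / (n_out - 1)
    out = []
    for k in range(n_out):
        pos = k * scale
        lo = min(int(pos), len(values) - 2)  # left neighbour index
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


# Example: 100 samples, window of 20, stride of 20 -> 5 predictions,
# which are then stretched back to 100 points.
data = list(range(100))
coarse = sliding_window_predict(data, window=20, stride=20, score=max)
fine = resample_linear(coarse, len(data))
```

      A stride of 1 would give one prediction per sample at a correspondingly higher computational cost; larger strides trade resolution of the prediction trace for speed, and resampling restores a common time axis with the raw data.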

      The Methods now include the following statement:

      "To maintain temporal precision, the prediction trace is resampled to the sampling frequency of the raw data."

      (3) Architecture optimization: how was the architecture CNN+LSTM optimized in terms of a number of CNN layers and size?

      We performed a Bayesian optimization over a defined range of hyperparameters in combination with empirical hyperparameter tuning. We now describe this in the Methods section as follows:

      "To optimise the model architecture, we performed a Bayesian optimisation of hyperparameters. Hyperparameter ranges were chosen for the free parameters of all layers. Optimisation was then performed with a maximum number of trials of 50. Models were evaluated using the validation dataset. Because a higher number of free parameters tended to increase inference times, we then empirically tuned the chosen hyperparameter combination to achieve a trade-off between number of free parameters and accuracy."

      Recommendations For The Authors

      Reviewing Editor (Recommendations For The Authors):

      Overall suggestions to the authors:

      (1) Directly compare miniML with SimplyFire (which was not cited or discussed in the original manuscript), with both idealized and actual data. Discuss the pros/cons of each software.

      We have conducted an extensive comparison between miniML and SimplyFire using both simulated and actual experimental data. This analysis is now presented in the revised Figure 3, Figure 3—figure supplement 1, and Figure 4—figure supplement 1. In addition, we have included relevant citations for SimplyFire in our manuscript. These additions provide a more comprehensive and balanced view of the available tools in the field, positioning our work within the broader context of existing solutions.

      (2) Generate a better user interface akin to MiniAnalysis or SimplyFire.

      We thank the editor and reviewers for the suggestion to improve the user interface. We have created a user-friendly graphical user interface (GUI) for miniML that is available on our GitHub repository. This GUI is now showcased in Figure 2—figure supplement 2 of the manuscript. The new interface allows users to load and analyze data through an intuitive point-and-click system, visualize results in real-time, and adjust parameters easily without coding knowledge. We have incorporated user feedback to refine the interface and improve user experience. These improvements significantly enhance the accessibility of miniML, making it more user-friendly for researchers with varying levels of programming expertise.

      Reviewer 1 (Recommendations For The Authors):

      Related to point (1) of the Public Review, we have taken the liberty to compare electrophysiological data using miniAnalysis, SimplyFire, and miniML. In our comparison, we note the following in our experience:

      (1.1) In contrast to both SimplyFire and miniAnalysis, miniML does not currently have a user-friendly interface where the user can directly control or change the parameters of interest, nor does miniML have a user control center, so the user cannot simply type or select the mini manually. Rather, if any parameter needs to be changed, the user needs to read, understand, and change the original source code to generate the preferred change. This level of "activation energy" and required user coding expertise in computer science, which many researchers do not have, renders miniML much less accessible when directly compared to SimplyFire and miniAnalysis. Hence, unless miniML’s interface can be made more user-friendly, this is a major disadvantage, especially when compared to SimplyFire, which has many of the same features as miniML but with a much easier interface and user controls.

      As suggested by the reviewer, we have created a graphical user interface (GUI) for miniML. The GUI allows easy data loading, filtering, analysis, event inspection, and saving of results without the need for writing Python code. Figure 2—figure supplement 2 illustrates the typical workflow for event analysis with miniML using the GUI and a screenshot of the user interface. Code to use miniML via the GUI is now included in the project’s GitHub repository. The GUI provides a simple and intuitive way to analyze synaptic events, whereas running miniML as a Python script allows for more customization and a high degree of automation.

      (1.2) We compared electrophysiological miniature events between miniML, SimplyFire, and miniAnalysis. All three achieved similar mean amplitudes in "wild type" conditions, and conditions in which mini events were enhanced and diminished, so the overall means and utilities are similar, with miniML and SimplyFire being preferred given the flexibility and much faster analysis. We did note a few differences, however. SimplyFire tends to capture a higher number of mini events than miniML, especially in conditions of diminished mini amplitude (e.g., miniML found 76 events, while SimplyFire found 587). The mean amplitudes, however, were similar. It seems that in data with low SNR, SimplyFire captures many more events as real minis that are probably noise, while miniML is more selective, which might be an advantage of miniML. That being said, we found SimplyFire to be superior in many respects, not least of which is the user interface and experience.

      We appreciate the reviewer’s thorough comparison of miniML, SimplyFire, and MiniAnalysis. While we acknowledge SimplyFire’s user-friendly interface, our study highlights several advantages of AI-based event analysis over conventional algorithmic approaches. Our updated benchmark analysis revealed better detection performance of miniML compared with SimplyFire (revised Figure 3), which had similar performance to deconvolution. As already noted by the reviewer, high false positive rates are a major issue of the SimplyFire approach. Although a minimum amplitude cutoff can partially resolve this problem, detection performance is highly sensitive to threshold setting (revised Figure 3). Another apparent disadvantage of SimplyFire is its relatively slow runtime (Figure 3—figure supplement 1). Finally, we have enhanced miniML’s accessibility by providing a graphical user interface that is easy to use and provides additional functionality.

      Some technical comments:

      (1) Improvements to the dependency versions of miniML: The versions of Python and TensorFlow used in this study and on GitHub need to be clarified. We used Python version 3.8.19 to load the miniML model. However, with Python versions >=3.9, as described on the GitHub page provided, it is difficult to install a matching h5py version. It is also inaccurate to specify Python >=3.9, because the TensorFlow version for this framework needs to be around 2.13, whereas Python >=3.10 only offers TensorFlow 2.16 as the download choice. Therefore, as for any Python framework, the dependency versions need to be specified on GitHub so that researchers can load the model and use the complete framework.

      Thank you for highlighting this issue. We have now included specific version numbers in the requirements to avoid version conflicts and to ensure proper functioning of the code.

      (2) Due to the intrinsic characteristics of the trained model, every model is only suitable for analyzing data with similar attributes. It is hard for researchers without a strong computer science background to train a new model themselves for their specific data. Therefore, it would be preferred if there were more available transfer learning models on GitHub accessible for researchers to adapt to their data.

      We would like to thank the reviewer for this feedback. Trained models (such as the default model) can often be used on different data (see, e.g., Figure 4, where data from four distinct synaptic preparations were analyzed with the base model, and Figure 5—figure supplement 1). However, changes in event waveform and/or noise characteristics may necessitate transfer learning to obtain optimal results with miniML. We have revised the description and tutorial for model training on the project’s GitHub repository to provide more guidance in this process. In addition, we now provide a tutorial on how to use existing models on out-of-sample data with distinct kinetics, using resampling. We hope these updates to the miniML GitHub repository will facilitate the use of the method.

      Following the suggestion by the reviewer, we have provided the transfer learning models used for the manuscript on the project’s GitHub repository to increase the number of available machine learning models for event detection. In addition, users of miniML are encouraged to supply their custom models. We hope that this will facilitate model exchange between laboratories in the future.

      Reviewer 3:

      I congratulate all authors for the convincing demonstration of their methodology, I do not have additional recommendations.

      We would like to thank the reviewer for the positive assessment of our manuscript.

      References

      Delvendahl, I., Kita, K., & Müller, M. (2019). Rapid and sustained homeostatic control of presynaptic exocytosis at a central synapse. Proceedings of the National Academy of Sciences, 116(47), 23783–23789. https://doi.org/10.1073/pnas.1909675116

      Donahue, J., Hendricks, L. A., Rohrbach, M., Venugopalan, S., Guadarrama, S., Saenko, K., & Darrell, T. (2017). Long-term recurrent convolutional networks for visual recognition and description. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 677–691. https://doi.org/10.1109/tpami.2016.2599174

      Drummond, C., & Holte, R. C. (2003). C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. https://api.semanticscholar.org/CorpusID:204083391

      Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for the detection of novel coronavirus (COVID-19) using x-ray images. Informatics in Medicine Unlocked, 20, 100412. https://doi.org/10.1016/j.imu.2020.100412

      Passricha, V., & Aggarwal, R. K. (2019). A hybrid of deep CNN and bidirectional LSTM for automatic speech recognition. Journal of Intelligent Systems, 29(1), 1261–1274. https://doi.org/10.1515/jisys-2018-0372

      Prati, R. C., Batista, G. E. A. P. A., & Monard, M. C. (2009). Data mining with imbalanced class distributions: Concepts and methods. Indian International Conference on Artificial Intelligence. https://api.semanticscholar.org/CorpusID:16651273

      Tasdelen, A., & Sen, B. (2021). A hybrid CNN-LSTM model for pre-miRNA classification. Scientific Reports, 11(1). https://doi.org/10.1038/s41598-021-93656-0

      Tyagi, S., & Mittal, S. (2020). Sampling approaches for imbalanced data classification problem in machine learning. In P. K. Singh, A. K. Kar, Y. Singh, M. H. Kolekar, & S. Tanwar (Eds.), Proceedings of ICRIC 2019 (pp. 209–221). Springer International Publishing.

      Wang, H., Zhao, J., Li, J., Tian, L., Tu, P., Cao, T., An, Y., Wang, K., & Li, S. (2020). Wearable sensor-based human activity recognition using hybrid deep learning techniques. Security and Communication Networks, 2020, 1–12. https://doi.org/10.1155/2020/2132138

    1. eLife Assessment

      This manuscript describes a novel approach for assessing cognitive function in freely moving mice in their home-cage, without human involvement. The authors provide convincing evidence in support of the tasks they developed to capture a variety of complex behaviors and demonstrate the utility of a machine learning approach to expedite the acquisition of task demands. This work is important given its potential utility for other investigators interested in studying mouse cognition. However, additional information (e.g., detailed construction manual, code) is needed to allow other investigators to implement this system independently and use it widely.

    2. Reviewer #1 (Public review):

      Summary:

      This is a new and important system that can efficiently train mice to perform a variety of cognitive tasks in a flexible manner. It is innovative and opens the door to important experiments in the neurobiology of learning and memory.

      Strengths:

      Strengths include: high n's, a robust system, task flexibility, comparison of manual-like training vs constant training, circadian analysis, comparison of varying cue types, long-term measurement, and machine teaching.

      Weaknesses:

      I find no major problems with this report.

      Minor weaknesses:

      (1) Line 219: Water consumption per day remained the same, but the number of trials triggered increased as training continued. First, is this related to manual-type training? Also, I'm trying to understand this result quantitatively, since it seems counter-intuitive: I would assume that with more trials, more water would be consumed since accuracy should go up over training (so more water per average trial). Am I understanding this right? Can the authors give more detail on how more trials can be triggered but no more water is consumed despite training?

      (2) Figure 2J: The X-axis should have some label: at least "training type". Ideally, a legend with colors can be included, although I see the colors elsewhere in the figure. If a legend cannot be added, then the color scheme should be explained in the caption.

      (3) Figure 2K: What is the purple line? I encourage a legend here. The same legend could apply to 2J.

      (4) Supplementary Figure S2 D: I do not think the phrase "relying on" is correct. Instead, I think "predicted by" or "correlating with" might be better.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Yu et al. describes a novel approach for collecting complex and different cognitive phenotypes in individually housed mice in their home cage. The authors report a simple yet elegant design that they developed for assessing a variety of complex and novel behavioral paradigms autonomously in mice.

      Strengths:

      The data are strong, the arguments are convincing, and I think the manuscript will be highly cited given the complexity of behavioral phenotypes one can collect using this relatively inexpensive ($100/box) and high throughput procedure (without the need for human interaction). Additionally, the authors include a machine learning algorithm to correct for erroneous strategies that mice develop which is incredibly elegant and important for this approach as mice will develop odd strategies when given complete freedom.

      Weaknesses:

      (1) A limitation of this approach is that it requires mice to be individually housed for days to months. This should be discussed in depth.

      (2) A major issue with continuous self-paced tasks such as the autonomous d2AFC used by the authors is that the inter-trial intervals can vary significantly. Mice may do a few trials, lose interest, and disengage from the task for several hours. This is problematic for data analysis that relies on trial duration to be similar between trials (e.g., reinforcement learning algorithms). It would be useful to see the task engagement of the mice across a 24-hour cycle (e.g., trials started, trials finished across a 24-hour period) and approaches for overcoming this issue of varying inter-trial intervals.

      (3) Movies - it would be beneficial for the authors to add commentary to the video (hit, miss trials). It was interesting watching the mice but not clear whether they were doing the task correctly or not.

      (4) The strength of this paper (from my perspective) is the potential utility it has for other investigators trying to get mice to do behavioral tasks. However, not enough information was provided about the construction of the boxes, interface, and code for running the boxes. If the authors are not willing to provide this information through eLife, GitHub, or their own website then my evaluation of the impact and significance of this paper would go down significantly.

      Minor concerns:

      Learning rate is confusing for Figure 3 results as it actually refers to trials to reach the criterion, and not the actual rate of learning (e.g., slope).

    4. Reviewer #3 (Public review):

      Summary:

      In this set of experiments, the authors describe a novel research tool for studying complex cognitive tasks in mice, the HABITS automated training apparatus, and a novel "machine teaching" approach they use to accelerate training by algorithmically providing trials to animals that provide the most information about the current rule state for a given task.

      Strengths:

      There is much to be celebrated in an inexpensively constructed, replicable training environment that can be used with mice, which have rapidly become the model species of choice for understanding the roles of distinct circuits and genetic factors in cognition. Lingering challenges in developing and testing cognitive tasks in mice remain, however, and these are often chalked up to cognitive limitations in the species. The authors' findings, however, suggest that instead, we may need to work creatively to meet mice where they live. In some cases, it may be that mice may require durations of training far longer than laboratories are able to invest with manual training (up to over 100k trials, over months of daily testing) but the tasks are achievable. The "machine teaching" approach further suggests that this duration could be substantially reduced by algorithmically optimizing each trial presented during training to maximize learning.

      Weaknesses:

      Cognitive training and testing in rodent models fill a number of roles. Sometimes, investigators are interested in within-subjects questions - querying a specific circuit, genetically defined neuron population, or molecule/drug candidate, by interrogating or manipulating its function in a highly trained animal. In this scenario, a cohort of highly trained animals that have been trained via a method that aims to make their behavior as similar as possible is a strength.

      However, often investigators are interested in between-subjects questions - querying a source of individual differences that can have long-term and/or developmental impacts, such as sex differences or gene variants. This is likely to often be the case in mouse models especially, because of their genetic tractability. In scenarios where investigators have examined cognitive processes between subjects in mice who vary across these sources of individual difference, the process of learning a task has been repeatedly shown to be different. The authors do not appear to have considered individual differences except perhaps as an obstacle to be overcome.

      The authors have perhaps shown that their main focus is highly-controlled within-subjects questions, as their dataset is almost exclusively made up of several hundred young adult male mice, with the exception of 6 females in a supplemental figure. It is notable that these female mice do appear to learn the two-alternative forced-choice task somewhat more rapidly than the males in their cohort.

      Considering the implications for mice modeling relevant genetic variants, it is unclear to what extent the training protocols and especially the algorithmic machine teaching approach would be able to inform investigators about the differences between their groups during training. For investigators examining genetic models, it is unclear whether this extensive training experience would mitigate the ability to observe cognitive differences, or select the animals best able to overcome them - eliminating the animals of interest. Likewise, the algorithmic approach aims to mitigate features of training such as side biases, but it is worth noting that the strategic uses of side biases in mice, as in primates, can benefit learning, rather than side biases solely being a problem. However, the investigators may be able to highlight variables selected by the algorithm that are associated with individual strategies in performing their tasks, and this would be a significant contribution.

      A final, intriguing finding in this manuscript is that animal self-paced training led to much slower learning than "manual" training, by having the experimenter introduce the animal to the apparatus for a few hours each day. Manual training resulted in significantly faster learning, in almost half the number of trials on average, and with significantly fewer omitted trials. This finding does not necessarily argue that manual training is universally a better choice because it leads to more limited water consumption. However, it suggests that there is a distinct contribution of experimenter interactions and/or switching contexts in cognitive training, for example by activating an "occasion setting" process to accelerate learning for a distinct period of time. Limiting experimenter interactions with mice may be a labor-saving intervention, but may not necessarily improve performance. This could be an interesting topic of future investigation, of relevance to understanding how animals of all species learn.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      This is a new and important system that can efficiently train mice to perform a variety of cognitive tasks in a flexible manner. It is innovative and opens the door to important experiments in the neurobiology of learning and memory.

      Strengths:

      Strengths include: high n's, a robust system, task flexibility, comparison of manual-like training vs constant training, circadian analysis, comparison of varying cue types, long-term measurement, and machine teaching.

      Weaknesses:

      I find no major problems with this report.

      (1) Line 219: Water consumption per day remained the same, but the number of trials triggered increased as training continued. First, is this related to manual-type training? Also, I'm trying to understand this result quantitatively, since it seems counter-intuitive: I would assume that with more trials, more water would be consumed since accuracy should go up over training (so more water per average trial). Am I understanding this right? Can the authors give more detail on how more trials can be triggered but no more water is consumed despite training?

      Thanks for the thoughtful comment. We would like to clarify the phenomenon described in Line 219: as training advanced, the number of trials triggered per day gradually decreased (rather than increased, as stated in the comment) for both the manual and autonomous groups of mice (Fig. 2H left). Performance, as you mentioned, improved over time, which increased the probability of obtaining water on each trial and thus kept daily water intake relatively stable (Fig. 2H left). We believe this stable daily intake reflects the minimum amount of water the mice require under autonomous behavioral training.
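      The arithmetic behind this explanation can be made explicit with a small sketch. Daily intake is approximately the number of trials per day times the fraction of rewarded trials times the reward volume, so fewer trials at higher accuracy can yield the same intake. The function name and all numbers below are hypothetical and are not taken from the manuscript.

```python
def daily_water_ul(trials_per_day, fraction_rewarded, reward_ul):
    """Approximate daily water intake in microlitres."""
    return trials_per_day * fraction_rewarded * reward_ul


# Hypothetical values chosen only to illustrate the trade-off
early = daily_water_ul(trials_per_day=1000, fraction_rewarded=0.5, reward_ul=4)
late = daily_water_ul(trials_per_day=625, fraction_rewarded=0.8, reward_ul=4)
# With these made-up numbers, both stages give about the same daily intake
```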

      (2) Figure 2J: The X-axis should have some label: at least "training type". Ideally, a legend with colors can be included, although I see the colors elsewhere in the figure. If a legend cannot be added, then the color scheme should be explained in the caption.

      (3) Figure 2K: What is the purple line? I encourage a legend here. The same legend could apply to 2J.

      (4) Supplementary Figure S2 D: I do not think the phrase "relying on" is correct. Instead, I think "predicted by" or "correlating with" might be better.

      We thank the reviewer for these valuable suggestions. We will address all of these points and make the necessary revisions in the next version of our manuscript.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Yu et al. describes a novel approach for collecting complex and different cognitive phenotypes in individually housed mice in their home cage. The authors report a simple yet elegant design that they developed for assessing a variety of complex and novel behavioral paradigms autonomously in mice.

      Strengths:

      The data are strong, the arguments are convincing, and I think the manuscript will be highly cited given the complexity of behavioral phenotypes one can collect using this relatively inexpensive ($100/box) and high throughput procedure (without the need for human interaction). Additionally, the authors include a machine learning algorithm to correct for erroneous strategies that mice develop which is incredibly elegant and important for this approach as mice will develop odd strategies when given complete freedom.

      Weaknesses:

      (1) A limitation of this approach is that it requires mice to be individually housed for days to months. This should be discussed in depth.

      Thank you for raising this important point. We agree that the requirement for individual housing of mice during the training period is a limitation of our approach, and we appreciate the opportunity to discuss this in more depth. In the revised manuscript, we will add a dedicated section to the Discussion to address this limitation, including the potential impact of individual housing on the mice, the rationale for individual housing in our study, and efforts or alternatives made to mitigate the effects of individual housing.

      (2) A major issue with continuous self-paced tasks such as the autonomous d2AFC used by the authors is that the inter-trial intervals can vary significantly. Mice may do a few trials, lose interest, and disengage from the task for several hours. This is problematic for data analysis that relies on trial duration to be similar between trials (e.g., reinforcement learning algorithms). It would be useful to see the task engagement of the mice across a 24-hour cycle (e.g., trials started, trials finished across a 24-hour period) and approaches for overcoming this issue of varying inter-trial intervals.

      Thank you for your insightful comment regarding the variability in inter-trial intervals and its potential impact on data analysis. We agree that this is an important consideration for continuous self-paced tasks like the autonomous d2AFC paradigm used in our study. In the original manuscript, we showed general task engagement across the 24-hour cycle (Fig. 2K). We also illustrated the distribution of inter-trial intervals (Fig. S3H), which shows that most trials are separated by short intervals, although some intervals are extremely long. We will include a more detailed analysis and discuss the challenges this poses for data analysis.

      Regarding approaches to the issue of varying inter-trial intervals, we will also discuss strategies to account for and mitigate its effects, including trial selection and restricting engagement to a defined period (e.g., making the task available only during a fixed 2-hour window each day).
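      One way the trial-selection strategy could look in practice is sketched below: trials are grouped into engagement bouts wherever consecutive trials are close in time, and isolated trials are dropped. This is an assumption about what such a selection might involve, not the authors' analysis; the function names, the 60 s gap threshold, and the minimum bout length are arbitrary placeholders.

```python
def split_into_bouts(trial_start_s, max_gap_s=60.0):
    """Group trial start times (seconds, ascending) into engagement bouts.

    A new bout starts whenever the gap to the previous trial exceeds
    `max_gap_s`. The default is a placeholder, not a value from the study.
    """
    bouts = []
    for t in trial_start_s:
        if bouts and t - bouts[-1][-1] <= max_gap_s:
            bouts[-1].append(t)
        else:
            bouts.append([t])
    return bouts


def select_engaged_trials(trial_start_s, max_gap_s=60.0, min_bout_len=3):
    """Keep only trials belonging to bouts of at least `min_bout_len` trials."""
    kept = []
    for bout in split_into_bouts(trial_start_s, max_gap_s):
        if len(bout) >= min_bout_len:
            kept.extend(bout)
    return kept


# Toy timestamps: four trials, a long pause, one isolated trial, three more
starts = [0, 20, 45, 70, 7200, 14400, 14415, 14440]
engaged = select_engaged_trials(starts)
```

      Analyses that assume similar trial spacing (e.g., reinforcement learning models) could then be run within bouts rather than across the whole session.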

      (3) Movies - it would be beneficial for the authors to add commentary to the video (hit, miss trials). It was interesting watching the mice but not clear whether they were doing the task correctly or not.

      Thanks for the reminder. We will add subtitles to the videos in the next version.

      (4) The strength of this paper (from my perspective) is the potential utility it has for other investigators trying to get mice to do behavioral tasks. However, not enough information was provided about the construction of the boxes, interface, and code for running the boxes. If the authors are not willing to provide this information through eLife, GitHub, or their own website then my evaluation of the impact and significance of this paper would go down significantly.

      Thanks for this important comment. We would like to clarify that the construction methods, GUI, and code for our system, together with the newly uploaded PCB and CAD files, are already publicly available at https://github.com/Yaoyao-Hao/HABITS. Additionally, we have open-sourced all the code and raw data for all training protocols (https://doi.org/10.6084/m9.figshare.27192897). We will continue to maintain these resources in the future.

      Minor concerns:

      Learning rate is confusing for Figure 3 results as it actually refers to trials to reach the criterion, and not the actual rate of learning (e.g., slope).

      Thanks for pointing this out. We will make the revision in the next version.

      Reviewer #3 (Public review):

      Summary:

      In this set of experiments, the authors describe a novel research tool for studying complex cognitive tasks in mice, the HABITS automated training apparatus, and a novel "machine teaching" approach they use to accelerate training by algorithmically providing trials to animals that provide the most information about the current rule state for a given task.

      Strengths:

      There is much to be celebrated in an inexpensively constructed, replicable training environment that can be used with mice, which have rapidly become the model species of choice for understanding the roles of distinct circuits and genetic factors in cognition. Lingering challenges in developing and testing cognitive tasks in mice remain, however, and these are often chalked up to cognitive limitations in the species. The authors' findings, however, suggest that instead, we may need to work creatively to meet mice where they live. In some cases, it may be that mice may require durations of training far longer than laboratories are able to invest with manual training (up to over 100k trials, over months of daily testing) but the tasks are achievable. The "machine teaching" approach further suggests that this duration could be substantially reduced by algorithmically optimizing each trial presented during training to maximize learning.

      Weaknesses:

      (1) Cognitive training and testing in rodent models fill a number of roles. Sometimes, investigators are interested in within-subjects questions - querying a specific circuit, genetically defined neuron population, or molecule/drug candidate, by interrogating or manipulating its function in a highly trained animal. In this scenario, a cohort of highly trained animals that have been trained via a method that aims to make their behavior as similar as possible is a strength.

      However, often investigators are interested in between-subjects questions - querying a source of individual differences that can have long-term and/or developmental impacts, such as sex differences or gene variants. This is likely to often be the case in mouse models especially, because of their genetic tractability. In scenarios where investigators have examined cognitive processes between subjects in mice who vary across these sources of individual difference, the process of learning a task has been repeatedly shown to be different. The authors do not appear to have considered individual differences except perhaps as an obstacle to be overcome.

      The authors have perhaps shown that their main focus is highly-controlled within-subjects questions, as their dataset is almost exclusively made up of several hundred young adult male mice, with the exception of 6 females in a supplemental figure. It is notable that these female mice do appear to learn the two-alternative forced-choice task somewhat more rapidly than the males in their cohort.

      Thank you for your insightful comments and for highlighting the importance of considering both within-subject and between-subject questions in cognitive training and testing in rodent models.

      We acknowledge that our study primarily focused on highly controlled within-subjects questions. However, the datasets we provided offer some evidence relevant to between-subjects questions: for example, the large variability in learning rates among mice (Fig. 2I), the overall difference in learning rate between male and female subjects (Fig. 2D vs. Fig. S2G, as the reviewer already noted), and the varying nocturnal behavioral patterns (Fig. 2K). We recognize the value of exploring between-subjects differences, and in the revised version we will discuss these points more systematically.

      (2) Considering the implications for mice modeling relevant genetic variants, it is unclear to what extent the training protocols and especially the algorithmic machine teaching approach would be able to inform investigators about the differences between their groups during training. For investigators examining genetic models, it is unclear whether this extensive training experience would mitigate the ability to observe cognitive differences, or select the animals best able to overcome them - eliminating the animals of interest. Likewise, the algorithmic approach aims to mitigate features of training such as side biases, but it is worth noting that the strategic uses of side biases in mice, as in primates, can benefit learning, rather than side biases solely being a problem. However, the investigators may be able to highlight variables selected by the algorithm that are associated with individual strategies in performing their tasks, and this would be a significant contribution.

      Thank you for the insightful comments. We acknowledge that extensive training, particularly with the algorithmic machine teaching approach, could affect the ability to observe cognitive differences between groups of mice carrying relevant genetic variants. However, our study design and findings suggest that this approach can still provide valuable insights into individual differences and the strategies animals use during training. First, the behavioral readouts mentioned above (including learning rate and engagement pattern) can reveal differences among mice. Second, detailed modelling with logistic regression can further dissect the strategies that mice use over the course of training (Fig. S2B). We have highlighted some variables selected by the regression that are associated with individual strategies in performing the tasks (Fig. S2C), and these strategies can differ between the manual and autonomous training groups (Fig. S2D). We will discuss these points further in the next version of the manuscript.
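      To indicate what a logistic regression analysis of choice strategy can look like, here is a minimal sketch on synthetic data. The predictors (a bias term, the stimulus side, and the previous choice), the simulated weights, and the fitting routine are assumptions made for illustration; they are not the regressors or code used in the manuscript.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Fit logistic-regression weights by batch gradient descent on the
    mean log-loss.

    X: list of feature rows (include a constant 1.0 column for the bias).
    y: list of 0/1 choices (here 1 = right, 0 = left).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for row, target in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - target) * row[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic session (not real data): a simulated mouse that mostly follows
# the stimulus and has a mild tendency to repeat its previous choice.
rng = random.Random(0)
true_w = [0.0, 2.0, 0.5]  # bias, stimulus side, previous choice
X, y = [], []
prev_choice = 1.0
for _ in range(500):
    stim = rng.choice([-1.0, 1.0])
    row = [1.0, stim, prev_choice]
    z = sum(a * b for a, b in zip(true_w, row))
    p_right = 1.0 / (1.0 + math.exp(-z))
    choice = 1 if rng.random() < p_right else 0
    X.append(row)
    y.append(choice)
    prev_choice = 1.0 if choice == 1 else -1.0

weights = fit_logistic(X, y)  # [bias, stimulus, previous-choice] weights
```

      Comparing fitted weights across animals or training stages is one way such a model can expose individual strategies, for instance a large previous-choice weight relative to the stimulus weight.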

      (3) A final, intriguing finding in this manuscript is that animal self-paced training led to much slower learning than "manual" training, by having the experimenter introduce the animal to the apparatus for a few hours each day. Manual training resulted in significantly faster learning, in almost half the number of trials on average, and with significantly fewer omitted trials. This finding does not necessarily argue that manual training is universally a better choice because it leads to more limited water consumption. However, it suggests that there is a distinct contribution of experimenter interactions and/or switching contexts in cognitive training, for example by activating an "occasion setting" process to accelerate learning for a distinct period of time. Limiting experimenter interactions with mice may be a labor-saving intervention, but may not necessarily improve performance. This could be an interesting topic of future investigation, of relevance to understanding how animals of all species learn.

      Thank you for your insightful comments. We agree that the finding that manual training led to significantly faster learning than self-paced training is both intriguing and important. One possible reason is that the limited duration of engagement imposed by the experimenter in manual training forced the mice to concentrate more on the trials (hence fewer omitted trials) than in autonomous training. Your suggestion that experimenter interactions might activate an "occasion setting" process is particularly interesting. In the context of our study, we could, for example, introduce a light as a cue that prompts the animals to engage, and make the task inaccessible when the light is off, thereby simulating the manual training situation. We agree that this is an interesting topic for future investigation and that it might create a more conducive environment for learning, thereby accelerating learning.

    1. eLife Assessment

      The authors have undertaken a useful study to update an existing niche model of highly pathogenic avian influenza. However, there are issues regarding the conceptualisation of the ecological niche of highly pathogenic avian influenza transmission that the modelling aims to capture, raising concerns about the strength of evidence used to support the findings. There are a number of modelling assumptions that are incompletely justified. Combined with shortcomings in the communication, this dilutes the strength of the key findings of this work.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to predict ecological suitability for the transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data.

      Strengths:

      The authors follow the established methods of Dhingra et al., 2016 to provide an updated spatial assessment of HPAI transmission suitability for two time periods, pre- and post-2020. They explore further methods of model cross-validation and consider the diversity of the bird species that HPAI has been detected in.

      Weaknesses:

      The precise ecological niche that the authors are modelling here is ambiguous: if we treat the transmission of HPAI in the wild bird population and in poultry populations as separate transmission cycles, linked by spillover events, then these transmission cycles are likely to have fundamentally different ecological niches. While an "index case" in farmed poultry is relevant to the wildlife transmission cycle, further within-farm and farm-to-farm transmission is likely to be contingent on anthropogenic factors, rather than the environment. Similarly, we would expect "index cases" in outbreaks of HPAI in mammals to be relevant to transmission risk in wild birds - this data is not included in this manuscript. Such "index cases" in farmed poultry occur under separate ecological conditions to subsequent transmission in farmed poultry, so should be separated if possible. Some careful editing of the language used in the manuscript may elucidate some of my questions related to model conceptualisation.

      The authors' handling of sampling bias in disease detection data in poultry is possibly inappropriate: one would expect the true spatial distribution of disease surveillance in poultry to be more closely correlated with poultry farming density, in contrast to human population density. This shortcoming in the modelling workflow possibly dilutes a key finding of the Results, that the transmission risk of HPAI in poultry is greatest in areas where poultry farming density is high.
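      For readers unfamiliar with this step, a common way to account for sampling bias in presence-only niche models is to draw background (pseudo-absence) points with probability proportional to a proxy for surveillance effort. The sketch below contrasts two such proxies on a toy grid; the function name, cell labels, and densities are invented for illustration and do not reproduce the authors' workflow.

```python
import random


def sample_background(cells, effort_proxy, n_points, seed=0):
    """Draw background cells with probability proportional to `effort_proxy`.

    cells: list of cell identifiers.
    effort_proxy: non-negative weight per cell (e.g. human population
    density or poultry density). All names here are illustrative.
    """
    if len(cells) != len(effort_proxy):
        raise ValueError("cells and effort_proxy must have the same length")
    rng = random.Random(seed)
    return rng.choices(cells, weights=effort_proxy, k=n_points)


# Toy grid of three cells with made-up densities
cells = ["urban", "poultry_belt", "remote"]
human_density = [900.0, 90.0, 10.0]
poultry_density = [50.0, 900.0, 50.0]

bg_human = sample_background(cells, human_density, n_points=1000)
bg_poultry = sample_background(cells, poultry_density, n_points=1000)
```

      With the human-density proxy, most background points fall in the urban cell; with the poultry-density proxy, they concentrate where farming is dense, which changes what the model treats as unremarkable background and therefore how strongly poultry density can appear as a predictor.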

    3. Reviewer #2 (Public review):

      Summary:

      This study aimed to determine which spatial factors (conceived broadly as environmental, agronomic and socio-economic) explain greater avian influenza case numbers reported since 2020 (2020–2022) by comparing similar models built with data from the period 2015–2020. The authors have chosen an environmental niche modelling approach, where detected infections are modelled as a function of spatial covariates extracted at the location of each case. These covariates are available over the entire world so that the predictions can be projected back to space in the form of a continuous map.

      Strengths:

      The authors use boosted regression trees as the main analytical tool, which always feature among the best-performing models for environmental niche models (also known as habitat suitability models). They run replicate sets of the analysis for each of their model targets (wild/domestic x pathogen variant), which can help produce stable predictions. The authors take steps to ameliorate some forms of expected bias in the detection of cases, such as geographic variation in surveillance efforts, and in general more detections near areas of higher human population density.

      Weaknesses:

      The study is not altogether coherent with respect to time. Data sets for the response (H5N1 or H5Nx case data in domestic or wild birds) are divided into two periods: 2015–2020 and 2020–2022. Each set is modelled using a common suite of covariates that are not time-varying. That suggests that causation is inferred by virtue of cases being in different geographic areas in those two time periods. Furthermore, important predictors such as chicken density appear to be informed (in the areas of high risk) from census data from before 2010. The possibility for increased surveillance effort *through time* is overlooked, as is the possibility that previously high-burden locations may implement practice changes to reduce vulnerability.

    4. Author response:

      Reviewer #1:

      Summary:

      The authors aim to predict ecological suitability for the transmission of highly pathogenic avian influenza (HPAI) using ecological niche models. This class of models identifies correlations between the locations of species or disease detections and the environment. These correlations are then used to predict habitat suitability (in this work, ecological suitability for disease transmission) in locations where surveillance of the species or disease has not been conducted. The authors fit separate models for HPAI detections in wild birds and farmed birds, for two strains of HPAI (H5N1 and H5Nx) and for two time periods, pre- and post-2020. The authors also validate models fitted to disease occurrence data from pre-2020 using post-2020 occurrence data.

      Strengths:

      The authors follow the established methods of Dhingra et al., 2016 to provide an updated spatial assessment of HPAI transmission suitability for two time periods, pre- and post-2020. They explore further methods of model cross-validation and consider the diversity of the bird species that HPAI has been detected in.

      Weaknesses:

      The precise ecological niche that the authors are modelling here is ambiguous: if we treat the transmission of HPAI in the wild bird population and in poultry populations as separate transmission cycles, linked by spillover events, then these transmission cycles are likely to have fundamentally different ecological niches.

      We apologise if this aspect was not clear enough in the previous version of our manuscript. Our analyses do not assume distinct transmission cycles in wild and domestic bird species; these cycles are indeed interconnected by frequent spillover events. We do, however, conduct independent ecological niche modelling analyses to estimate the ecological suitability for local circulation in domestic birds and, separately, in wild birds. This distinction does not imply that the virus circulates exclusively within one of these populations but rather allows us to identify potential differences in the environmental conditions associated with virus occurrences in each context.

      Our results indicate that these two ecological niche models capture distinct environmental patterns. Virus occurrences in wild birds were primarily associated with factors such as open water and proximity to urban areas, while occurrences in domestic birds were more strongly linked to variables like poultry density and cultivated vegetation. This finding supports the existence of two distinct ecological niches for the virus, corresponding to virus circulation in wild and domestic bird populations. We thank the Reviewer for their feedback and we will take this opportunity to further clarify this aspect in the text.

      While an "index case" in farmed poultry is relevant to the wildlife transmission cycle, further within-farm and farm-to-farm transmission is likely to be contingent on anthropogenic factors, rather than the environment. Similarly, we would expect "index cases" in outbreaks of HPAI in mammals to be relevant to transmission risk in wild birds - this data is not included in this manuscript. Such "index cases" in farmed poultry occur under separate ecological conditions to subsequent transmission in farmed poultry, so should be separated if possible. Some careful editing of the language used in the manuscript may elucidate some of my questions related to model conceptualisation.

      We agree, but index cases are particularly difficult to separate from secondary spread in the absence of field investigation. Identification of index cases based on space-time filtering has been investigated previously but depends strongly on the quality of the surveillance: an “apparent” primary case can be a secondary case of previously undetected ones, and surveillance quality cannot be assumed to be constant across countries. Our ecological niche modelling approach is based on HPAI cases reported in the EMPRES-i database, which includes all documented outbreaks without distinguishing primary introductions from subsequent farm-to-farm transmissions. Thus, our ecological niche models are trained on confirmed cases that result from a combination of different transmission dynamics, including introduction events in poultry populations (which can be impacted by ecological factors) and persistence within and between poultry populations (which can be impacted by anthropogenic factors).

      We will revise the manuscript to clarify that, while our study primarily aims to assess the environmental suitability for HPAI occurrences, the dataset does not exclude cases resulting from farm-to-farm spread. This means that our models can capture the environmental variables associated with the risk of cases arising from both primary introductions (e.g., spillover from wild birds) and secondary transmission events within poultry systems, although the latter is also influenced by anthropogenic factors such as biosecurity practices and poultry trade networks. These latter factors are not included in our models, which will be highlighted in the limitations (Discussion section) of the revised manuscript.

      In addition, we note the Reviewer's comment regarding the relevance of “index cases” in mammalian outbreaks to understanding the risk of HPAI transmission in wild birds. Although these data are not included in our current study, we will highlight the potential value of incorporating these cases into future models in order to refine risk predictions, provided that they can be identified with some reasonable level of certainty.

      The authors' handling of sampling bias in disease detection data in poultry is possibly inappropriate: one would expect the true spatial distribution of disease surveillance in poultry to be more closely correlated with poultry farming density, in contrast to human population density. This shortcoming in the modelling workflow possibly dilutes a key finding of the Results, that the transmission risk of HPAI in poultry is greatest in areas where poultry farming density is high.

      The Reviewer raises a valid point that poultry surveillance efforts may be more closely correlated with poultry farm density than with human population density. While human population density can serve as a reasonable proxy for surveillance intensity (given that disease detection is often more active in areas with stronger veterinary notification systems), we acknowledge that poultry disease surveillance can also be influenced by the spatial distribution of poultry farms, as high-density poultry areas could be prioritised for monitoring. Please note that in our study, we followed a previously established approach (Dhingra et al. 2016) and weighted pseudo-absence sampling based on human population density to account for general surveillance biases. However, we do not agree with the Reviewer’s conclusion. In fact, assuming a sampling bias correlated with poultry density would reduce the effect of poultry density as a risk factor; the current approach does not.
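
      The weighting argument above can be illustrated with a minimal sketch. It is not the study's actual workflow: the function name `sample_pseudo_absences`, the four-cell grid, and all density values are hypothetical and chosen only to show how the choice of weighting covariate changes where pseudo-absences fall.

```python
import random


def sample_pseudo_absences(weights, presence_cells, n_samples, seed=0):
    """Draw pseudo-absence cells with probability proportional to `weights`.

    weights        : dict mapping cell id -> non-negative sampling weight
    presence_cells : cell ids with a reported case (never sampled)
    """
    excluded = set(presence_cells)
    candidates = [cell for cell, w in weights.items()
                  if cell not in excluded and w > 0]
    if not candidates:
        raise ValueError("no candidate cell has a positive sampling weight")
    rng = random.Random(seed)
    # random.choices samples with replacement, proportionally to the weights
    return rng.choices(candidates,
                       weights=[weights[cell] for cell in candidates],
                       k=n_samples)


# Hypothetical toy grid (illustrative values only, not data from the study)
human_density = {"a": 500.0, "b": 50.0, "c": 5.0, "d": 0.0}
poultry_density = {"a": 10.0, "b": 20.0, "c": 900.0, "d": 300.0}
presence = {"c"}  # one reported case, in the high-poultry cell


def mean_poultry(cells):
    """Average poultry density over the sampled pseudo-absence cells."""
    return sum(poultry_density[c] for c in cells) / len(cells)


by_human = sample_pseudo_absences(human_density, presence, 1000, seed=1)
by_poultry = sample_pseudo_absences(poultry_density, presence, 1000, seed=1)
print(mean_poultry(by_human), mean_poultry(by_poultry))
```

      In this toy setting, weighting by poultry density places most pseudo-absences in high-poultry cells, which narrows the contrast in poultry density between presences and pseudo-absences. That is the mechanism by which such a weighting would reduce the apparent effect of poultry density.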

      Reviewer #2:

      Summary:

      This study aimed to determine which spatial factors (conceived broadly as environmental, agronomic and socio-economic) explain greater avian influenza case numbers reported since 2020 (2020--2022) by comparing similar models built with data from the period 2015--2020. The authors have chosen an environmental niche modelling approach, where detected infections are modelled as a function of spatial covariates extracted at the location of each case. These covariates are available over the entire world so that the predictions can be projected back to space in the form of a continuous map.

      Strengths:

      The authors use boosted regression trees as the main analytical tool, which consistently feature among the best-performing models for environmental niche models (also known as habitat suitability models). They run replicate sets of the analysis for each of their model targets (wild/domestic × pathogen variant), which can help produce stable predictions. The authors take steps to ameliorate some forms of expected bias in the detection of cases, such as geographic variation in surveillance efforts, and in general more detections near areas of higher human population density.

      Weaknesses:

      The study is not altogether coherent with respect to time. Data sets for the response (H5N1 or H5Nx case data in domestic or wild birds) are divided into two periods: 2015-2020 and 2020-2022. Each set is modelled using a common suite of covariates that are not time-varying. That suggests that causation is inferred by virtue of cases being in different geographic areas in those two time periods. Furthermore, important predictors such as chicken density appear to be informed (in the areas of high risk) from census data from before 2010. The possibility for increased surveillance effort *through time* is overlooked, as is the possibility that previously high-burden locations may implement practice changes to reduce vulnerability.

      We acknowledge the Reviewer's comments regarding the consistency of time periods in our study. Our approach is to divide the HPAI case data into two time periods (2015-2020 and 2020-2022) and to train ecological niche models using a common set of covariates that do not explicitly account for temporal variation. We will further clarify the following aspects in the revised version of our manuscript:

      (1) Our primary objective is to assess changes in ecological suitability over time rather than infer direct causation. By comparing models trained on pre-2020 data with post-2020 occurrences, we evaluate whether pre-2020 environmental conditions can predict recent HPAI suitability. However, we acknowledge that this does not capture dynamic changes in surveillance efforts, biosecurity measures, or host-pathogen interactions over time.

      (2) Regarding predictor variables, we used poultry density data from 2015, rather than pre-2010 data. However, this dataset is not based on a single census year; instead, it represents a median estimate derived from subnational poultry census data collected between 2000 and 2019. This median year approach provides a more stable representation of poultry density than any single-year snapshot. Furthermore, while poultry production systems may exhibit some temporal variation, these changes are generally minor compared to the inter-annual variability observed in HPAI occurrence, which is largely driven by epidemic dynamics. Given the current limitations of global poultry data, distinguishing distributions from different years is not feasible with the available GLW dataset. We will clarify these points in the manuscript.

      (3) We recognise that increased surveillance efforts and adaptive changes in poultry farming practices could influence the observed HPAI case distribution. While our current models do not incorporate time-varying surveillance intensity or biosecurity policies, we will address this limitation in the Discussion section and suggest that future work integrate dynamic surveillance data to improve risk assessments.
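
      As a hedged illustration of the validation idea in point (1), the sketch below scores how well suitability values from a model trained on one period separate occurrences from pseudo-absences in a later period. The response does not state which metric the study uses; the area under the ROC curve (AUC) is assumed here as a common choice, and the function name `auc` and the example scores are hypothetical.

```python
def auc(scores_presence, scores_absence):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen presence location receives
    a higher predicted suitability than a randomly chosen pseudo-absence
    location, with ties counted as one half.
    """
    if not scores_presence or not scores_absence:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


# Hypothetical suitability scores predicted by a pre-2020 model at
# post-2020 occurrence locations and at post-2020 pseudo-absence locations.
post2020_presence_scores = [0.9, 0.4]
post2020_absence_scores = [0.5, 0.1]
print(auc(post2020_presence_scores, post2020_absence_scores))
```

      A value near 1 would indicate that pre-2020 conditions rank recent occurrences above pseudo-absences, while a value near 0.5 would indicate no discrimination. Because the covariates are static, such a score cannot by itself separate ecological change from changes in surveillance effort.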

    1. eLife Assessment

      This study provides valuable findings on the effects of mating experience on sweet taste perception. The data as presented provide solid evidence that the dopaminergic signaling-mediated reward system underlies this mating state-dependent behavioral modulation. The work will interest neuroscientists, particularly those working on neuromodulation and the effects of internal states on behavior.

    2. Reviewer #1 (Public review):

      Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the subesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.

      Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the subesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.

      An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.

      The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.

      (1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.

      (2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose, yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.

      (3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.

      (4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D; a similar problem exists for Figures 6E and F.

      (5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.

      (6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.

      (7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.

    3. Reviewer #2 (Public review):

      Summary:

      The authors exposed naïve male flies to different groups of females, either mated or virgin. Male flies can successfully copulate with virgin females; however, they are rejected by mated females. This rejection reduces sugar preference and sensitivity in males. Investigating the underlying neural circuits, the authors show that dopamine signaling onto GR5a sensory neurons is required for reduced sugar preference. GR5a sensory neurons respond less to sugar exposure when they lack dopamine receptors.

      Strengths:

      The findings add another strong phenotype to the existing dataset about brain-wide neuromodulatory effects of mating. The authors use several state-of-the-art methods, such as activity-dependent GRASP to decipher the underlying neural circuitry. They further perform rigorous behavioral tests and provide convincing evidence for the local labellar circuit.

      Weaknesses:

      The authors focus on the circuit connection between dopamine and gustatory sensory neurons in the male SEZ. Therefore, it is still unknown how mating modulates dopamine signaling and what possible implications on other behaviors might result from a reduced sugar preference.

    4. Reviewer #3 (Public review):

      Summary

      In this work, the authors asked how mating experience impacts reward perception and processing. For this, they employ fruit flies as a model, with a combination of behavioral, immunostaining, and live calcium imaging approaches.

      Their study allowed them to demonstrate that courtship failure decreases the fraction of flies motivated to eat sweet compounds, revealing a link between reproductive stress and reward-related behaviors. This effect is mediated by a small group of dopaminergic neurons projecting to the SEZ. After courtship failure, these dopaminergic neurons exhibit reduced activity, leading to decreased Gr5a+ neuron activity via Dop1R1 and Dop2R signaling, and leading to reduced sweet sensitivity. The authors therefore showed how mating failure influences broader behavioral outputs through suppression of the dopamine-mediated reward system and underscores the interactions between reproductive and reward pathways.

      Concern

      My main concern regarding this study lies in the way the authors chose to present their results. If I understood correctly, they provided evidence that mating failure induces a decrease in the fraction of flies exhibiting PER. However, they also showed that food consumption was not affected (Fig. 1, supplement), suggesting that individuals who did eat consumed more. This raises questions about the analysis and interpretation of the results. Should we consider the group as a whole, with a reduced sensitivity to sweetness, or should we focus on individuals, with each one eating more? I am also concerned about how this could influence the results obtained using live imaging approaches, as the flies being imaged might or might not have been motivated to eat during the feeding assays. I would like the authors to clarify their choice of analysis and discuss this critical point, as the interpretation of the results could potentially be the opposite of what is presented in the manuscript.

    5. Author response:

      Reviewer #1 (Public review):

      Wang et al. investigated how sexual failure influences sweet taste perception in male Drosophila. The study revealed that courtship failure leads to decreased sweet sensitivity and feeding behavior via dopaminergic signaling. Specifically, the authors identified a group of dopaminergic neurons projecting to the suboesophageal zone that interacts with sweet-sensing Gr5a+ neurons. These dopaminergic neurons positively regulate the sweet sensitivity of Gr5a+ neurons via DopR1 and Dop2R receptors. Sexual failure diminishes the activity of these dopaminergic neurons, leading to reduced sweet-taste sensitivity and sugar-feeding behavior in male flies. These findings highlight the role of dopaminergic neurons in integrating reproductive experiences to modulate appetitive sensory responses.

      Previous studies have explored the dopaminergic-to-Gr5a+ neuronal pathways in regulating sugar feeding under hunger conditions. Starvation has been shown to increase dopamine release from a subset of TH-GAL4 labeled neurons, known as TH-VUM, in the suboesophageal zone. This enhanced dopamine release activates dopamine receptors in Gr5a+ neurons, heightening their sensitivity to sugar and promoting sucrose acceptance in flies. Since the function of the dopaminergic-to-Gr5a+ circuit motif has been well established, the primary contribution of Wang et al. is to show that mating failure in male flies can also engage this circuit to modulate sugar-feeding behavior. This contribution is valuable because it highlights the role of dopaminergic neurons in integrating diverse internal state signals to inform behavioral decisions.

      An intriguing discrepancy between Wang et al. and earlier studies lies in the involvement of dopamine receptors in Gr5a+ neurons. Prior research has shown that Dop2R and DopEcR, but not DopR1, mediate starvation-induced enhancement of sugar sensitivity in Gr5a+ neurons. In contrast, Wang et al. found that DopR1 and Dop2R, but not DopEcR, are involved in the sexual failure-induced decrease in sugar sensitivity in these neurons. I wish the authors had further explored or discussed this discrepancy, as it is unclear how dopamine release selectively engages different receptors to modulate neuronal sensitivity in a context-dependent manner.

      Our immunostaining experiments showed that three dopamine receptors, DopR1, Dop2R, and DopEcR, were expressed in Gr5a+ neurons in the proboscis, consistent with previous RT-PCR findings (Inagaki et al 2012). As the reviewer pointed out, we found that DopR1 and Dop2R were required for courtship failure-induced suppression of sugar sensitivity, whereas Marella et al 2012 and Inagaki et al 2012 found that Dop2R and DopEcR were required for starvation-induced enhancement of sugar sensitivity. These results may suggest that different internal states (courtship failure vs. starvation) modulate the peripheral sensory system via different signaling pathways (e.g. different subsets of dopaminergic neurons, different dopamine release mechanisms, and different dopamine receptors). We will further discuss these possibilities in the revised manuscript.

      The data presented by Wang et al. are solid and effectively support their conclusions. However, certain aspects of their experimental design, data analysis, and interpretation warrant further review, as outlined below.

      (1) The authors did not explicitly indicate the feeding status of the flies, but it appears they were not starved. However, the naive and satisfied flies in this study displayed high feeding and PER baselines, similar to those observed in starved flies in other studies. This raises the concern that sexually failed flies may have consumed additional food during the 4.5-hour conditioning period, potentially lowering their baseline hunger levels and subsequently reducing PER responses. This alternative explanation is worth considering, as an earlier study demonstrated that sexually deprived males consumed more alcohol, and both alcohol and food are known rewards for flies. To address this concern, the authors could remove food during the conditioning phase to rule out its influence on the results.

      We think this is a valid concern. We will conduct courtship conditioning in the absence of food and test if courtship failure can still suppress sugar sensitivity in the revised manuscript.

      (2) Figure 1B reveals that approximately half of the males in the Failed group did not consume sucrose, yet Figure 1-S1A suggests that the total volume consumed remained unchanged. Were the flies that did not consume sucrose omitted from the dataset presented in Figure 1-S1A? If so, does this imply that only half of the male flies experience sexual failure, or that sexual failure affects only half of males while the others remain unaffected? The authors should clarify this point.

      Here is a brief clarification of our experimental design and we will further clarify the details in the revised manuscript:

      After the behavioral conditioning, male flies were divided between two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).

      On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found there was no change (Figure 1, S1A). The two experiments are also consistent with each other: in Figure 1C, only 50-60% of Failed males responded to 400 mM sucrose.

      These two experiments in combination suggest that sexual failure suppressed sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumption remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of this present study.
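
      The distinction between the fraction of flies that feed and the volume consumed per feeding fly can be made concrete with a minimal sketch. The responding fractions (about 100% and 50%) are those stated above; the per-fly volume is a purely illustrative placeholder, not a measured value, and the function name `group_mean_intake` is hypothetical.

```python
def group_mean_intake(fraction_feeding, mean_volume_per_feeder):
    """Mean intake over ALL flies in a group, counting non-feeders as zero."""
    if not 0.0 <= fraction_feeding <= 1.0:
        raise ValueError("fraction_feeding must lie between 0 and 1")
    return fraction_feeding * mean_volume_per_feeder


# Placeholder volume per feeding fly (arbitrary units, NOT data from the study);
# it is assumed identical across groups, as reported for flies that fed.
volume_per_feeder = 0.4

naive_group = group_mean_intake(1.0, volume_per_feeder)   # ~100% of flies fed
failed_group = group_mean_intake(0.5, volume_per_feeder)  # ~50% of flies fed

print(failed_group / naive_group)
```

      Under these assumptions, an unchanged volume among feeders is compatible with a lower mean intake across the whole Failed group, because the consumption measurement is conditional on a fly having initiated feeding.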

      In addition, to resolve the potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose in the revised manuscript. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.

      (3) The evidence linking TH-GAL4 labeled dopaminergic neurons to reduced sugar sensitivity in Gr5a+ neurons in sexually failed males could be further strengthened. Ideally, the authors would have activated TH-GAL4 neurons and observed whether this restored GCaMP responses in Gr5a+ neurons in sexually failed males. Instead, the authors performed a less direct experiment, shown in Figures 3-S1C and D. The manuscript does not describe the condition of the flies used in this experiment, but it appears that they were not sexually conditioned. I have two concerns with this experiment. First, no statistical analysis was provided to support the enhancement of sucrose responses following activation of TH-GAL4 neurons. Second, without performing this experiment in sexually failed males, the authors lack direct evidence to confirm that the dampened response of Gr5a+ neurons to sucrose results from decreased activity in TH-GAL4 neurons.

      We think this is also a valid suggestion. We will directly examine whether activating TH+ neurons enhances the sugar responses of Gr5a+ neurons in sexually failed males. We will also add statistical analysis.

      Nevertheless, we would still argue that our current experiments using Naive males (Figure 3, S1C-D) are adequate to show a functional link between TH+ neurons and Gr5a+ neurons. Combined with the results that these neurons form active synapses (Figure 3, S1B) and that the activity of TH+ neurons was dampened in sexually failed males (Figure 3G-I), our current data support the notion that sexual failure suppresses sweet sensitivity via the TH-Gr5a circuitry.

      (4) The statistical methods used in this study are poorly described, making it unclear which method was used for each experiment. I suggest that the authors include a clear description of the statistical methods used for each experiment in the figure legends. Furthermore, as I have pointed out, there is a lack of statistical comparisons in Figures 3-S1C and D; a similar problem exists for Figures 6E and F.

      We will add detailed information of statistical analysis in each figure legend.

      (5) The experiments in Figure 5 lack specificity. The target neurons in this study are Gr5a+ neurons, which are directly involved in sugar sensing. However, the authors used the less specific Dop1R1- and Dop2R-GAL4 lines for their manipulations. Using Gr5a-GAL4 to specifically target Gr5a+ neurons would provide greater precision and ensure that the observed effects are directly attributable to the modulation of Gr5a+ neurons, rather than being influenced by potential off-target effects from other neuronal populations expressing these dopamine receptors.

      We agree with the reviewer that manipulating Dop1R1 and Dop2R genes (Figure 4) and the neurons expressing them (Figure 5) might have broader impacts. In fact, we have also tested the role of Dop1R1 and Dop2R in Gr5a+ neurons by RNAi experiments (Figure 6). As shown by both behavioral and calcium imaging experiments, knocking down either Dop1R1 or Dop2R in Gr5a+ neurons eliminated the dampening effect of sexual failure on sweet sensitivity, further confirming the role of these two receptors in Gr5a+ neurons.

      (6) I found the results presented in Fig. 6F puzzling. The knockdown of Dop2R in Gr5a+ neurons would be expected to decrease sucrose responses in naive and satisfied flies, given the role of Dop2R in enhancing sweet sensitivity. However, the figure shows an apparent increase in responses across all three groups, which contradicts this expectation. The authors may want to provide an explanation for this unexpected result.

      We agree that there might be some potential discrepancies. However, our current data are not adequate to clarify this, because the experiments shown in Figure 6E-F and the apparent control (Figure 3C) were not conducted under identical settings at the same time (which is why we did not directly compare these results). One way to address the issue is to repeat these calcium imaging experiments with a head-to-head comparison with the control group (Gr5a-GCaMP, +/- Dop1R1 and Dop2R RNAi). We will conduct the experiments and present the data in the revised manuscript.

      (7) In several instances in the manuscript, the authors described the effects of silencing dopamine signaling pathways or knocking down dopamine receptors in Gr5a neurons with phrases such as 'no longer exhibited reduced sweet sensitivity' (e.g., L269 and L288), 'prevent the reduction of sweet sensitivity' (e.g., L292), or 'this suppression was reversed' (e.g. L299). I found these descriptions misleading, as they suggest that sweet sensitivity in naive and satisfied groups remains normal while the reduction in failed flies is specifically prevented or reversed. However, this is not the case. The data indicate that these manipulations result in an overall decrease in sweet sensitivity across all groups, such that a further reduction in failed flies is not observed. I recommend revising these descriptions to accurately reflect the observed phenotypes and avoid any confusion regarding the effects of these manipulations.

      We will revise our wording in the revised manuscript. In brief, we think that these manipulations (suppressing Dop1R1<sup>+</sup> and Dop2R<sup>+</sup> neurons) have two consequences: suppressing overall sweet sensitivity and eliminating the effect of sexual failure.

      Reviewer #2 (Public review):

      Summary:

      The authors exposed naïve male flies to different groups of females, either mated or virgin. Male flies can successfully copulate with virgin females; however, they are rejected by mated females. This rejection reduces sugar preference and sensitivity in males. Investigating the underlying neural circuits, the authors show that dopamine signaling onto GR5a sensory neurons is required for reduced sugar preference. GR5a sensory neurons respond less to sugar exposure when they lack dopamine receptors.

      Strengths:

      The findings add another strong phenotype to the existing dataset about brain-wide neuromodulatory effects of mating. The authors use several state-of-the-art methods, such as activity-dependent GRASP to decipher the underlying neural circuitry. They further perform rigorous behavioral tests and provide convincing evidence for the local labellar circuit.

      Weaknesses:

      The authors focus on the circuit connection between dopamine and gustatory sensory neurons in the male SEZ. Therefore, it is still unknown how mating modulates dopamine signaling and what possible implications on other behaviors might result from a reduced sugar preference.

      We agree with the reviewer that in the current study, we did not examine how mating experience suppressed the activity of dopaminergic neurons in the SEZ. The current study mainly focused on the behavioral characterization (sexual failure suppresses sweet sensitivity) and the downstream mechanism (TH-Gr5a pathway). We think that examining the upstream modulatory mechanism may be more suitable for a separate future study.

      We believe that a sustained reduction in sweet sensitivity (not limited to sucrose but extending to other sweet compounds, Figure 1, S1B-C) upon sexual failure suggests a generalized and sustained consequence on reward-related behaviors. Sexual failure may thus resemble a state of “primitive emotion” in fruit flies. We will further discuss this possibility in the revised manuscript.

      Reviewer #3 (Public review):

      Summary

      In this work, the authors asked how mating experience impacts reward perception and processing. For this, they employ fruit flies as a model, with a combination of behavioral, immunostaining, and live calcium imaging approaches.

      Their study allowed them to demonstrate that courtship failure decreases the fraction of flies motivated to eat sweet compounds, revealing a link between reproductive stress and reward-related behaviors. This effect is mediated by a small group of dopaminergic neurons projecting to the SEZ. After courtship failure, these dopaminergic neurons exhibit reduced activity, leading to decreased Gr5a+ neuron activity via Dop1R1 and Dop2R signaling, and leading to reduced sweet sensitivity. The authors therefore showed how mating failure influences broader behavioral outputs through suppression of the dopamine-mediated reward system and underscores the interactions between reproductive and reward pathways.

      Concern

      My main concern regarding this study lies in the way the authors chose to present their results. If I understood correctly, they provided evidence that mating failure induces a decrease in the fraction of flies exhibiting PER. However, they also showed that food consumption was not affected (Fig. 1, supplement), suggesting that individuals who did eat consumed more. This raises questions about the analysis and interpretation of the results. Should we consider the group as a whole, with a reduced sensitivity to sweetness, or should we focus on individuals, with each one eating more? I am also concerned about how this could influence the results obtained using live imaging approaches, as the flies being imaged might or might not have been motivated to eat during the feeding assays. I would like the authors to clarify their choice of analysis and discuss this critical point, as the interpretation of the results could potentially be the opposite of what is presented in the manuscript.

      Here is a brief clarification of our experimental design; we will further clarify the details in the revised manuscript:

      After the behavioral conditioning, male flies were divided between two assays. On the one hand, we quantified PER responses of individual flies. As shown in Figure 1C, Failed males exhibited decreased sweet sensitivity (as demonstrated by the right shift of the response curve).

      On the other hand, we sought to quantify food consumption of individual flies by using the MAFE assay (Qi et al. 2005). When presented with 400 mM sucrose, approximately 100% of the flies in the Naïve and Satisfied groups, and 50% of the flies in the Failed group, extended their proboscis and started feeding (Figure 1B). For these flies, we could quantify the consumed volumes and found that there was no change (Figure 1, S1A). We should also note the consistency of these two experiments, e.g., in Figure 1C, only 50-60% of Failed males responded to 400 mM stimulation.

      These two experiments in combination suggest that sexual failure suppressed the sweet sensitivity of the Failed males. Meanwhile, as long as they still initiated feeding, the volume of food consumed remained unchanged. These results led us to focus on the modulatory effect of sexual failure on the sensory system, the main topic of the present study.

      In addition, to address this potential misunderstanding, we plan to examine food consumption by using 800 mM sucrose instead. As shown in Figure 1C, 800 mM sucrose was adequate to induce feeding in ~100% of the flies.

    1. eLife Assessment

      This study demonstrated that the conditional knockout of afadin disrupts retinal laminar organization and reduces the number of photoreceptors while preserving some of the structure and light responsiveness of retinal ganglion cells. These findings are solid and useful for understanding afadin's role in retinal cell generation, lamination, and functional organization. However, the study provides limited new insights into the relationship between retinal lamination defects and overall retinal function.