  1. Aug 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This work explored intra- and interspecific niche partitioning along the spatial, temporal, and dietary axes between apex carnivores and mesocarnivores in the Qilian Mountain National Park of China, using camera trapping data and DNA metabarcoding sequencing data. The authors conclude that spatial niche partitioning plays a key role in facilitating the coexistence of apex carnivore species, spatial and temporal niche partitioning facilitate the coexistence of mesocarnivore species, and spatial and dietary niche partitioning facilitate the coexistence between apex and mesocarnivore species. The information presented in this study is important for wildlife conservation and will contribute substantially to the current understanding of carnivore guilds and effective conservation management in fragile alpine ecosystems.

      Strengths:

      Extensive fieldwork is evident in the study. Aiming to cover a large percentage of the Qilian Mountain National Park, the study area was subdivided into squares, as a geographical reference to distribute the sampling points where the camera traps were placed and the excreta samples were collected.

      They were able to obtain many records in their camera traps and collected many samples of excreta. This diversity of data allowed them to conduct robust analyses. The data analyses carried out were adequate to obtain clear and meaningful results that enabled them to answer the research questions posed. The conclusions of this paper are mostly well supported by data.

      The study has demonstrated the coexistence of carnivore species in the landscapes of the Qilian Mountains National Park, complementing the findings of previous studies. The information presented in this study is important for wildlife conservation and will contribute substantially to the current understanding of carnivore guilds and effective conservation management in fragile alpine ecosystems.

      Weaknesses:

      It is necessary to better explain the methodology because it is not clear what the total sampling effort is. In the methodology, the authors only claim to have used 280 camera traps, and in the results, they mention that there are 319 sampling sites. However, the total sampling effort (e.g. total time of active camera traps) carried out in the study and at each site is not specified.

      Thanks a lot for this detailed review! We apologize for not describing the overall sampling effort clearly. In this study, we deployed 280 camera traps, each active for approximately 4 to 6 months. We visited each camera 2 to 3 times annually to download photos and check the batteries. When a camera failed to capture any target carnivore, we relocated it. In total, we surveyed 322 camera-trap sites, 3 of which were treated as failed because the cameras were lost. As a result, we analyzed data from 319 camera sites and obtained 14,316 independent detections over 37,192 trap-days.

      We have added this information as follows in lines 132 to 143: “Taking into account the fact that mammalian communities are sensitive to seasonality, we used camera traps to monitor animals with an extensive survey effort from December 2016 to February 2022, covering the activity of animal species in different seasons, which can reflect the overall distribution of carnivores. We placed a total of 280 infrared cameras at the study site, set them to be active for 4 to 6 months, and considered possible relocation to another position based on animal detection in an effort to improve estimates of the occupancy and detection rates for both common and rare species (Figure 1) (Kays et al., 2020). The camera trap was set to record the time and date on a 24 hr clock when triggered, and to record a 15s video and 1 photo with an interval of 2 minutes between any two consecutive triggers. The sum of camera trap effective days was defined by the total amount of trapping effort during the sampling period, which was calculated from the time the camera was placed in operation to the time the last video or photograph was taken. We visited each camera 2 to 3 times a year to download photos and check batteries.” and in lines 228 to 232: “A total of 322 camera trap sites were surveyed after relocating infrared cameras that did not capture any target carnivore species. A total of 3 cameras were considered to have failed due to loss. We analyzed data from 319 camera sites and obtained 14,316 independent detections during a total effort of 37,192 effective camera trap days. We recorded wolf in 26 sites, snow leopard in 109 sites, Eurasian lynx in 36 sites, red fox in 92 sites, and Tibetan fox in 34 sites.”
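      The definition of effective camera trap days quoted above (from the time a camera is placed in operation to the time of its last video or photograph, summed over sites) can be illustrated with a minimal Python sketch. The function names and the two example sites are hypothetical and invented purely for illustration; this is not the authors' code or data.

```python
from datetime import datetime


def effective_trap_days(deployed: datetime, last_record: datetime) -> float:
    """Days from deployment to the last photo or video at one camera site."""
    if last_record < deployed:
        raise ValueError("last record precedes deployment")
    return (last_record - deployed).total_seconds() / 86400.0


def total_effort(sites) -> float:
    """Sum effective trap days over (deployed, last_record) pairs."""
    return sum(effective_trap_days(d, last) for d, last in sites)


# Hypothetical sites; the dates are made up for illustration only.
sites = [
    (datetime(2017, 1, 1), datetime(2017, 5, 1)),    # 120 days
    (datetime(2018, 6, 1), datetime(2018, 10, 29)),  # 150 days
]

print(total_effort(sites))  # 270.0
```

      For scale, the totals reported in the response imply roughly 116.6 effective days per analyzed site (37,192 trap-days over 319 sites).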

      Reviewer #2 (Public Review):

      Summary:

      The study entitled "Different coexistence patterns between apex carnivores and mesocarnivores based on temporal, spatial, and dietary niche partitioning analysis in Qilian Mountain National Park, China" by Cong et al. addresses the compelling topic of carnivores' coexistence in a biodiversity hotspot in China. The study is interesting given it considers all three components affecting sympatric carnivores' distribution and co-occurrence, namely the temporal, the spatial, and the dietary partition within the carnivore guild. The authors have found that spatial co-occurrence is generally low, which represents the major strategy for coexistence, while there is temporal and dietary overlap. I also appreciated the huge sampling effort carried out for this study by the authors: they were able to deploy 280 camera trapping sites (which became 322 in the result section?) and collect a total of 480 scat samples. However, I have some concerns about the study on the non-consideration of the human dimension and potential anthropogenic disturbance that could affect the spatial and temporal distribution of carnivores, the choice of the statistical model to test co-occurrence, and the lack of clearly stated ecological hypotheses.

      Strengths:

      The strengths of the study are the investigation of all three major strategies that can mitigate carnivores' coexistence, therefore, the use of multiple monitoring techniques (both camera trapping and DNA metabarcoding) and the big dataset produced that consists of a very large sampled area with a noteworthy number of camera trap stations and many scat samples for each species.

      Weaknesses:

      I think that some parts of the manuscript should be written better and more clearly. A clear statement of the ecological hypotheses that could affect the partitioning among the carnivore guild is lacking. I think that the human component (thus anthropogenic disturbance) should have been considered more in the spatial analyses given it can influence the use of the environment by some carnivores. Additionally, a multi-species co-occurrence model would have been a more robust approach to test for spatial co-occurrence given it also considers imperfect detection.

      Thank you very much for your valuable comments and suggestions. We have checked and edited the manuscript and believe the English has improved.

      (1) Following your suggestion, we added the competitive exclusion and niche differentiation hypotheses, framed along the spatial, temporal, and dietary axes, to the Introduction to explain co-occurrence relationships among species, as follows: “The competitive exclusion principle dictates that species with similar ecological requirements are unable to successfully coexist (Hardin, 1960; Gause, 1934). Thus, carnivores within a guild occupy different ecological niches based on a combination of three niche dimensions, i.e. spatial, temporal, and trophic (Schoener, 1974). Spatially, carnivore species within the same geographic area exhibit distinct distributions that minimize overlap in resource use and competition. For example, carnivores can partition habitats based on habitat feature preferences and availability of prey (De Satgé et al., 2017; Garrote and Pérez De Ayala, 2019; Gołdyn et al., 2003; Strampelli et al., 2023). Temporally, differences in seasonal or daily activity patterns among sympatric carnivores can reduce competitive interactions and facilitate coexistence. For example, carnivores can exhibit temporal segregation in their foraging behaviors, such as diurnal versus nocturnal activity, to avoid direct competition (Finnegan et al., 2021; Nasanbat et al., 2021; Searle et al., 2021). Trophically, carnivore species can diversify their diets to exploit different prey species or sizes, thereby reducing competition for food resources. For example, carnivores can exhibit dietary specialization to optimize their foraging efficiency and minimize competitive pressures (Steinmetz et al., 2021).”

      (2) In addition to distance from roads, we included the human dimension as a covariate influencing occupancy, measured as the number of independent photos or videos of herders and livestock detected by infrared cameras (termed human disturbance and abbreviated hdis). According to the results of the occupancy models, red fox occupancy probability displayed a significant positive relationship with hdis. Moreover, the detection probability of snow leopard and Eurasian lynx decreased with increasing hdis.

      We have incorporated these results into the Results as follows: “According to the findings derived from single-season, single-species occupancy models, the snow leopard demonstrated a notably higher probability of occupancy compared to other carnivore species, estimated at 0.437 (Table 1). Conversely, the Eurasian lynx exhibited a lower occupancy probability, estimated at 0.161. Further analysis revealed that the occupancy probabilities of the wolf and Eurasian lynx declined with increasing Normalized Difference Vegetation Index (NDVI) (Table 2, Figure 2). Additionally, wolf occupancy probability displayed a negative relationship with roughness index and a positive relationship with prey availability. Snow leopard occupancy probabilities exhibited a negative relationship with distance to roads and NDVI. In contrast, both red fox and Tibetan fox demonstrated a positive relationship with distance to roads. Moreover, red fox occupancy probability increased with higher human disturbance and greater prey availability. The detection probabilities of wolf, snow leopard, red fox, and Tibetan fox exhibited an increase with elevation (Table 2). Moreover, there was a positive relationship between the detection probability of Tibetan fox and prey availability. The detection probabilities of snow leopard and Eurasian lynx declined as human disturbance increased.”

      (3) We appreciate the suggestion to use a multi-species co-occurrence model to test spatial co-occurrence. We attempted multispecies occupancy modeling to analyze the five species in our study, following the method of Rota et al. (2016). Initially, we simplified the candidate models by fitting single-season, single-species occupancy models. We then took the occupancy covariates from the best single-species model for each species and used them to build the multispecies occupancy models. Unfortunately, the final models did not converge. We are investigating potential solutions to resolve this problem.

      Rota CT, Ferreira MAR, Kays RW, Forrester TD, Kalies EL, McShea WJ, Parsons AW, Millspaugh JJ. 2016. A multispecies occupancy model for two or more interacting species. Methods Ecol Evol 7:1164–1173. doi:10.1111/2041-210X.12587
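      For readers less familiar with the single-season, single-species occupancy models referred to in responses (2) and (3), the sketch below shows how such a model separates occupancy (ψ) from detection probability (written p here; the manuscript writes Pr) when detection is imperfect. It assumes constant ψ and p with no covariates, which is a deliberate simplification: the authors fit covariates such as NDVI, elevation, and hdis, presumably with dedicated software. The function names are hypothetical.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's 0/1 detection history.

    history: sequence of 1 (detected), 0 (not detected),
             or None (occasion not surveyed).
    psi:     probability that the site is occupied.
    p:       per-occasion detection probability given occupancy.
    """
    surveys = [y for y in history if y is not None]
    occupied_term = psi
    for y in surveys:
        occupied_term *= p if y == 1 else (1.0 - p)
    if any(y == 1 for y in surveys):
        # A detection means the site is certainly occupied.
        return occupied_term
    # All zeros: occupied but never detected, or truly unoccupied.
    return occupied_term + (1.0 - psi)


def neg_log_likelihood(histories, psi, p):
    """Negative log-likelihood over all sites (to be minimized)."""
    return -sum(math.log(site_likelihood(h, psi, p)) for h in histories)


print(round(site_likelihood([0, 1, 0], psi=0.5, p=0.4), 3))  # 0.072
print(round(site_likelihood([0, 0, 0], psi=0.5, p=0.4), 3))  # 0.608
```

      The all-zero case is what distinguishes this approach from naive presence/absence summaries: a site with no detections still contributes probability mass to being occupied but missed.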

      Temporal and dietary results are solid and this latter in particular highlights a big predation pressure on some prey species such as the pika. This implies important conservation and management implications for this species, and therefore for the trophic chain, given that i) the pika population should be conserved and ii) a potential poisoning campaign against small mammals could be incredibly dangerous also for mesocarnivores feeding on them due to secondary poisoning.

      Thank you for your thoughtful comments. We appreciate your recognition of the temporal and dietary findings, particularly the highlighted predation pressure on prey species like the pika. These observations indeed underscore critical implications for conservation and management. The necessity to conserve the pika population is paramount for its role in maintaining the stability of the trophic chain within its ecosystem. As you rightly pointed out, any disruption to this delicate balance, including through predation or indirect threats like poisoning campaigns, could have far-reaching consequences. Regarding the potential risks associated with poisoning campaigns targeting small mammals, we acknowledge the significant concerns raised about secondary poisoning affecting mesocarnivores. This underscores the need for careful consideration in pest control strategies and the adoption of measures that minimize unintended ecological impacts. Our findings suggest several practical implications for conservation and management. Conservation efforts should focus on vulnerable prey populations such as the pika, while management strategies could include regulatory frameworks and community education to mitigate risks associated with pest control methods. We believe our study contributes valuable insights into the complexities of predator-prey dynamics and the broader implications for ecosystem health. By integrating these findings into conservation practices, we can work towards ensuring the sustainability of natural systems and the species that depend on them.

      Reviewer #1 (Recommendations For The Authors):

      To better explain the methodology and the sampling effort I recommend reviewing e.g. Kays et al. 2020. An empirical evaluation of camera trap study design: How many, how long, and when?. Methods in Ecology and Evolution, 11(6), 700-713. https://besjournals.onlinelibrary.wiley.com/doi/epdf/10.1111/2041-210X.13370.

      Thank you for this valuable suggestion! Following this reference, we have added the following information to explain the methodology and the sampling effort: “Taking into account the fact that mammalian communities are sensitive to seasonality, we used camera traps to monitor animals with an extensive survey effort from December 2016 to February 2022, covering the activity of animal species in different seasons, which can reflect the overall distribution of carnivores. We placed a total of 280 infrared cameras at the study site, set them to be active for 4 to 6 months, and considered possible relocation to another position based on animal detection in an effort to improve estimates of the occupancy and detection rates for both common and rare species (Figure 1) (Kays et al., 2020). The camera trap was set to record the time and date on a 24 hr clock when triggered, and to record a 15s video and 1 photo with an interval of 2 minutes between any two consecutive triggers. The sum of camera trap effective days was defined by the total amount of trapping effort during the sampling period, which was calculated from the time the camera was placed in operation to the time the last video or photograph was taken. We visited each camera 2 to 3 times a year to download photos and check batteries.”

      Reviewer #2 (Recommendations For The Authors):

      I have some concerns about the manuscript.

      I find that the manuscript should be written more clearly: some sentences are not straightforward to understand given the presence of structural errors that make the text hard to read; the paragraphs should be written in a more harmonic way (without logical leaps) with a smoother change of topic between paragraphs, especially in the introduction.

      We appreciate your constructive comments, which have helped us improve the clarity and coherence of the manuscript. We have revised the introduction to provide a clearer outline of the paper's structure and objectives. Specifically, we have rephrased complex sentences and removed ambiguities so that each idea is communicated more directly. We now provide clearer links between ideas and avoid abrupt shifts in topic to ensure smoother transitions between paragraphs.

      I feel like the strength of merging the two techniques (camera trapping and DNA metabarcoding) is not brought up enough, while the disadvantage of this approach is not even mentioned (e.g., the increasing costs).

      Thanks a lot for this valuable comment! We have added this information to the Discussion (L356-L363) as follows: “Our study highlights the effectiveness of combining camera trapping with DNA metabarcoding for detecting and identifying both cryptic and rare species within a sympatric carnivore guild. This integrated approach allowed us to capture a more comprehensive view of species presence and interactions compared to traditional visual surveys. However, it is important to acknowledge the challenges associated with this technique, including the high costs of equipment and the need for specialized training and computational resources to manage and analyze the large volumes of sequence data. Despite these challenges, the benefits of this combined method in improving biodiversity assessments and understanding species coexistence outweigh the drawbacks.”

      The structure of the manuscript does not follow the structure of the journal (Intro, Material and Method, Results, Discussion); instead, it reports the methods at the end of the main manuscript. Most critically, I found that a clear explanation of the research hypothesis is missing: authors should clearly state their ecological hypotheses. What are your hypotheses on the co-occurrence relationship among species? What would specifically affect and change the sympatric relationships among carnivores?

      Thank you for this valuable suggestion! We have revised the manuscript by moving the Methods section into the main body so that it follows the standard structure (Introduction, Materials and Methods, Results, Discussion).

      Our main ecological hypotheses concerning the co-occurrence relationships among carnivore species are based on the niche differentiation hypothesis. We hypothesize that differentiation along one or more niche axes is beneficial for the coexistence of the carnivore guild in the Qilian Mountains. We expected that spatial niche differentiation promotes the coexistence of large carnivores in the Qilian Mountain region, as they are more likely than small carnivores to spatially avoid interspecific competition (Davis et al., 2018). Mesocarnivores may coexist either spatially or temporally due to increased interspecific competition for similar prey (Di Bitetti et al., 2010; Donadio and Buskirk, 2006). Nutritional niche differentiation may be a significant factor for promoting coexistence between large and mesocarnivore species due to differences in body size (Gómez-Ortiz et al., 2015; Lanszki et al., 2019). We have added these ecological hypotheses in lines 101 to 110.

      Another concern is that all pictures with people have been removed from the dataset, but I think that this could be a bit biased as human presence (or also the presence of livestock) could affect the spatial or temporal presence of carnivores, changing their co-occurrence dynamics. On one side, humans can be perceived as a source of disturbance by carnivores and, therefore, can cause a shift in distribution towards locations with lower human presence (or lower anthropogenic disturbance) that could further concentrate the presence of carnivores, increasing competitive interactions. Conversely, mesocarnivores could take advantage of an increasing human presence - following the human shield hypothesis - finding a refugium from larger-bodied carnivores. From this perspective, important information on the potential anthropogenic pressure is lacking in the description of the study area: how effective is the protection effort of the park? How intense is the potential human disturbance in and around the park? Is there poaching? Intensive livestock grazing? Resource extraction? These are all factors that could affect the interactions among carnivores. Do not forget the risk of carnivores being killed in retaliation by humans due to the presence of livestock in the area. I think that incorporating the human dimension is important because it could strongly affect how carnivores perceive and use the environment. Here only the distance to the closest road has been considered. However, for example, recent research (Gorczynski et al 2022, Global Change Biology) has indeed found that co-occurrence of ecologically similar species differed in relation to increasing human density. Therefore, I think that anthropogenic disturbance is an aspect to be reckoned with and more variables as proxy of human disturbance should be considered.

      Thanks a lot for this valuable comment! We acknowledge that humans can act as both a disturbance factor, potentially driving carnivores away from highly populated areas, and as a source of indirect refuge for mesocarnivores, thereby affecting competitive interactions among carnivores. We understand that poaching and resource extraction are prohibited and that livestock grazing is a significant human activity within the study area. Therefore, we added the human dimension as a covariate influencing occupancy, measured as the number of independent photos or videos of herders and livestock detected by infrared cameras (termed human disturbance and abbreviated hdis). According to the results of the occupancy models, red fox occupancy probability displayed a significant positive relationship with hdis. Moreover, the detection probability of snow leopard and Eurasian lynx decreased with increasing hdis.

      In the statistical analyses section, I don't find that the statistical procedure is well described: it is not clear which occupancy model has been used (probably a single-species single-season occupancy model for each target species?), which covariates have been tested for each species and following which hypotheses. Additionally, I think that when modelling the spatial distribution of subordinate species, it should be important to include information on the spatial distribution of apex species given this could affect their occurrence on the territory. This could have been done by using the Relative Abundance Index of the apex predators as a covariate when modelling the distribution of subordinate species. Additionally, why haven't the authors used prey as a covariate for occupancy? I think that prey distribution should affect the occupancy probability more than the detection rate. Also, the authors used the Sørensen similarity index to measure associations between species. However, this association metric has been criticized (see the recent paper of Mainali et al 2022, Science Advances). I am therefore wondering: given the authors are using the occupancy framework, why don't they use a multi-species co-occurrence model that allows them to directly estimate both single-species occupancy and the co-occurrence parameter as a function of covariates (examples are Rota et al. 2016, Methods Ecol. Evol. Or Tobler et al. 2019, Ecology)? For the temporal overlap, I think that adding Figure S2 (pairwise temporal overlap) in the main text would help deliver the results of the temporal analyses more straightforwardly.

      Thanks a lot for this valuable comment!

      (1) The current manuscript uses a single-species, single-season occupancy model for each target species. Additionally, we have added prey and human disturbance as occupancy covariates. We have revised the statistical analyses section in lines 153 to 170 to state this model choice explicitly and to clarify the covariates tested for each species. The details are as follows: “To investigate the spatial distribution of carnivores, as well as the influence of environmental factors on the site occupancy of species in the study area, we performed single-season, single-species occupancy models to estimate carnivores’ occupancy (ψ) and detection (Pr) probability (Li et al., 2022b; MacKenzie, 2018; Moreno-Sosa et al., 2022). To ensure capture independence, only photo or video records at intervals of 30 min were included in the data analysis (Li et al., 2020). We created a matrix recording whether each carnivore species was detected (1) or not (0) across several 30-day intervals (that is 0-30, 31-60, 61-90, 91-120, 121-150, >150 days) for each camera location. Based on the previous studies of habitat use of carnivores (Greenspan and Giordano, 2021; Alexander et al., 2016; Gorczynski et al., 2022), we selected terrain, vegetation, biological factors and disturbance to construct the model. Terrain is a fundamental element of wildlife habitat and closely linked to other environmental factors (Chen et al., 2024). Terrain variables include elevation (ele) and roughness index (rix). Vegetation variables include normalized difference vegetation index (ndvi), and provide information on the level of habitat concealment.
Biological variables include prey abundance (the number of independent photos of their preferred prey based on dietary analysis in this study, wolf and snow leopard: artiodactyla including livestock; Eurasian lynx and Pallas’s cat: lagomorpha; red fox and Tibetan fox: lagomorpha and rodentia) and reflect habitat preference and distribution patterns of carnivores. Disturbance variables include distance to roads (disrd) and human disturbances (hdis, the number of independent photos of herdsman and livestock) and can provide insight into the habitat selection and behavior patterns of carnivores.”

      (2) Thank you for your valuable suggestions. We acknowledge the importance of considering apex species in models of subordinate species' spatial distributions.

      Nonetheless, to keep the covariates consistent across species, and because single-species occupancy models do not represent interspecies interactions, we did not include the Relative Abundance Index of the apex predators as a covariate affecting the occupancy of mesopredators. As you recommended, multi-species occupancy models that account for interspecies interactions are a robust approach. However, when we attempted to use the multispecies occupancy method of Rota et al. (2016), the final models did not converge. Specifically, we took the occupancy covariates from the best single-species model for each species and used them to build the multispecies occupancy models. We are investigating potential solutions to resolve this problem.

      (3) We used the Sørensen similarity index to measure associations between species based on support from previous literature. As counted by Mainali et al., the Sørensen index has been used in more than 700 papers across journals such as Science, Nature, and PNAS. We believe this index holds broad applicability in describing relationships between species.

      (4) We agree that presenting pairwise temporal overlap in the main text would enhance clarity. We revised the manuscript to include Figure S2 in the main text and ensure that the temporal analyses are more straightforwardly presented.
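      The data preparation described in response (1), namely the 30-minute independence rule and the 0/1 matrix over six 30-day intervals, can be sketched in Python as follows. This is an illustrative reading of the quoted text, not the authors' code. Two assumptions are made: a record counts as independent if it falls at least 30 minutes after the last retained record, and day 0 is the deployment day. All names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

INDEPENDENCE_GAP = timedelta(minutes=30)  # from the quoted methods
N_OCCASIONS = 6  # days 0-30, 31-60, 61-90, 91-120, 121-150, >150


def independent_detections(timestamps, gap=INDEPENDENCE_GAP):
    """Keep a record only if it is at least `gap` after the last kept one."""
    kept = []
    for t in sorted(timestamps):
        if not kept or t - kept[-1] >= gap:
            kept.append(t)
    return kept


def occasion_index(deployed, t):
    """Map a record time to one of the six 30-day occasions (0-based)."""
    day = (t - deployed).days
    if day < 0:
        raise ValueError("record precedes deployment")
    return min(max(day - 1, 0) // 30, N_OCCASIONS - 1)


def detection_history(deployed, timestamps):
    """0/1 row for one species at one camera location."""
    row = [0] * N_OCCASIONS
    for t in independent_detections(timestamps):
        row[occasion_index(deployed, t)] = 1
    return row


# Hypothetical records for one species at one camera.
deployed = datetime(2017, 1, 1)
records = [
    datetime(2017, 1, 5, 10, 0),
    datetime(2017, 1, 5, 10, 10),  # 10 min after the previous: dropped
    datetime(2017, 1, 5, 10, 45),  # 45 min after the last kept: kept
    datetime(2017, 3, 15, 2, 0),   # day 73, i.e. the 61-90 day occasion
]

print(len(independent_detections(records)))  # 3
print(detection_history(deployed, records))  # [1, 0, 1, 0, 0, 0]
```

      Stacking one such row per camera location gives the site-by-occasion matrix that an occupancy model takes as input. Occasions a camera never reached would need to be marked as missing rather than 0, which this sketch does not handle.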

      Regarding the sampling collection of the scats, I'm just curious to know why you decided to use silica desiccant instead of keeping the samples frozen. I'm not familiar with this method and I guess it works fine because the environment is generally freezing cold. Yet, I would like to know more. How fresh do scat samples need to be in order to be suitable for DNA metabarcoding analyses? Additionally, what do you mean by "scats were collected within camera trapping area", could you be more specific? Have you specified a buffer around camera stations?

      Thanks a lot for this specific inquiry! We followed the scat collection method of Janečka et al. (2008; 2011). Silica is used to dry the scats to minimize DNA degradation. Because field conditions did not allow samples to be frozen during sampling, the collected scat samples were kept dry and cool in the shade and transferred to the laboratory as soon as possible after sampling. We selected relatively fresh samples based on the color of the scat and broke off small pieces from its outer part, including pieces not directly exposed to the sun. We placed scat material about the size of a pinkie nail in each tube; an overfilled tube is unlikely to dry properly, which leads to DNA degradation.

      The study area was subdivided into sample squares of 25 km2 (5×5 km) as a geographical reference for placing camera survey sites and collecting scat samples. Camera traps were set in areas believed to be important to and heavily used by wildlife, such as the bottoms of cliffs, sides of boulders, valleys and ridges along movement corridors. We also focused on sites with known or suspected carnivore activity to maximize the probability of finding scat samples. Therefore, transects were set around the infrared cameras to collect scat samples. The length of each transect was determined by terrain, amount of scat, and available time. Each transect aimed to yield about 18 samples or cover 5 km of terrain, to avoid uneven representation among transects and to ensure that the team had sufficient time to return to base camp (Janečka et al., 2011).

      Janecka J, Jackson R, Yuquang Z, Li D, Munkhtsog B, Buckley-Beason V, Murphy W. 2008. Population monitoring of snow leopards using noninvasive collection of scat samples: A pilot study. Animal Conservation 11:401–411. doi:10.1111/j.1469-1795.2008.00195.x

      Janečka JE, Munkhtsog B, Jackson RM, Naranbaatar G, Mallon DP, Murphy WJ. 2011. Comparison of noninvasive genetic and camera-trapping techniques for surveying snow leopards. J Mammal 92:771–783. doi:10.1644/10-MAMM-A-036.1

      Kays R, Arbogast BS, Baker‐Whatton M, Beirne C, Boone HM, Bowler M, Burneo SF, Cove MV, Ding P, Espinosa S, Gonçalves ALS, Hansen CP, Jansen PA, Kolowski JM, Knowles TW, Lima MGM, Millspaugh J, McShea WJ, Pacifici K, Parsons AW, Pease BS, Rovero F, Santos F, Schuttler SG, Sheil D, Si X, Snider M, Spironello WR. 2020. An empirical evaluation of camera trap study design: How many, how long and when? Methods Ecol Evol 11:700–713. doi:10.1111/2041-210X.13370

      Regarding the discussion, the authors have information for 1) spatial distribution, 2) temporal overlap, 3) dietary requirement, and they should use this information to support the discussion. Instead, sometimes it feels that authors go by exclusion or make a suggestion. For example: the authors have found dietary and temporal overlap between two apex predators (i.e., wolf and snow leopard), and they said that this suggests that spatial partitioning is responsible for their successful coexistence in this area (lines 195-196). But why "suggesting"? What does the co-occurrence metric say? Another example: "Apex carnivores and mesocarnivores showed substantial overlap in time overall, indicating that spatial and dietary partitioning may play a large role in facilitating their coexistence" (lines 241 - 242). However, this should not be a suggestion: your Sørensen similarity index is low, proving spatial divergence. So, when data supports the hypotheses, the authors should be firmer in their discussion. Generally, when reading the discussion, it felt that a figure summarizing the partitioning would be much needed to digest which type of partitioning strategy the species are using.

      Thank you for your thoughtful comments and suggestions.

      (1) We appreciate your insights on the discussion section, particularly concerning the interpretation of our findings on spatial distribution and on temporal and dietary overlap, and we acknowledge that these findings needed clearer interpretation. We have revised the discussion section to state the supported conclusions more directly. For example, in lines 294-295, we revised the text to “We found dietary and temporal overlap among apex carnivores, showing that spatial partitioning is responsible for their successful coexistence in this area.” In lines 341-342, we revised the text to “Apex carnivores and mesocarnivores exhibited considerable overlap in time overall, showing that spatial and dietary partitioning may play a large role in facilitating their coexistence.”

      (2) We appreciate your suggestion regarding the inclusion of a figure summarizing partitioning strategies among species discussed. In our study, we organized the overlap index of space, time, and diet among carnivores in Table 3, which directly reflects the overlap of carnivore species in these three dimensions by summarizing them in a single table. Additionally, Figure 3 illustrates the activity patterns and overlap among species, while Figure 4 displays the primary prey of carnivores and the frequency of food utilization.

      About lines 228 - 229, just as a side note, the Pallas's cat, as the red fox, selects the environment according to a greater distribution of prey species, while also selecting primarily meadows and natural environment (Greco et al. 2022, Journal of wildlife management) additionally it is not strictly diurnal (Anile et al. 2020, Wildlife Research; Greco et al. 2022, Journal of wildlife management). Regarding the Pallas's cat and its exclusion from the temporal and spatial analyses, can you specify how many independent detection events you had?

      Thanks a lot for this valuable comment!

      (1) We appreciate the references to recent studies highlighting its habitat preferences and activity patterns. We have revised the manuscript to acknowledge these points and provide context regarding its habitat selection strategies. Specifically, we revised the text as follows: “Pallas’s cat hunts during crepuscular and diurnal periods, inhabits meadow with greater prey abundance (Anile et al., 2021; Greco et al., 2022; Ross et al., 2019).”

      (2) The low detection rate of Pallas's cat (0.072) estimated by the single-species occupancy model raised concerns regarding the reliability of the results. The high standard errors estimated for each environmental variable and the wide confidence intervals around the detection rate further indicated potential bias or randomness. Consequently, we decided to exclude the Pallas's cat data from further analysis. Closer examination of the data showed that, out of 319 camera sites surveyed, only 27 detected Pallas's cat. Notably, only 3 of 193 sites in Gansu Province recorded detections, whereas 24 of 126 sites in Qinghai Province did. This skewed distribution likely contributed to the unsatisfactory model outcomes.
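
      The skew described in this response can be checked with a few lines of arithmetic. The sketch below uses only the site counts quoted above; the function name `naive_occupancy` is an illustrative choice. It computes the raw proportion of sites with at least one detection, which is not the model-based detection rate of 0.072 reported by the occupancy model.

```python
# Site counts quoted in the response: (sites with detections, sites surveyed).
sites = {
    "Gansu": (3, 193),
    "Qinghai": (24, 126),
}


def naive_occupancy(detected, surveyed):
    """Raw proportion of surveyed sites with at least one detection."""
    return detected / surveyed


total_detected = sum(d for d, _ in sites.values())
total_surveyed = sum(n for _, n in sites.values())

for province, (detected, surveyed) in sites.items():
    share = naive_occupancy(detected, surveyed)
    print(f"{province}: {detected}/{surveyed} sites = {share:.3f}")

overall = naive_occupancy(total_detected, total_surveyed)
print(f"Overall: {total_detected}/{total_surveyed} sites = {overall:.3f}")
```

      Under these counts, the proportion of sites with detections in Qinghai is roughly an order of magnitude higher than in Gansu, which is the imbalance the authors point to.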

      About the diet and results of scat analyses, have you found any sign of intra-guild predation (i.e., apex predators that kill and sometimes consume subordinate carnivores to reduce competition), this could actually represent proof of competition and spatial overlap.

      Thanks a lot for your thoughtful comments!

      We observed intraguild predation in the diet of wolves and snow leopards. Specifically, we found Pallas’s cat, red fox, and Tibetan fox in the diet of wolves, and Pallas’s cat, Eurasian badger, and Tibetan fox in the diet of snow leopards. However, these intraguild predation events accounted for only 1.89% of the diet composition of apex carnivores. We suggest that the rarity of these observations may be influenced by various factors and does not necessarily provide sufficient evidence of competition and spatial overlap. Therefore, further data collection and in-depth research are needed to better understand this phenomenon.

      Some minor comments: Figure 2 is really nice, while some abbreviations are missing in the caption of Table 2.

      Thank you for your positive comments on Figure 2. Unfortunately, we have removed Figure 2 from the manuscript. The occupancy covariates now include prey abundance and human disturbance, and these variables were derived solely from infrared camera trap data, which do not cover the entire national park. Therefore, we were unable to accurately project carnivore occupancy probability across the national park.

      We apologize for the missing abbreviations in the caption of Table 2. We have added them to the caption as follows: “Abbreviations: Disrd-distance to roads, Ele-elevation, NDVI-normalized difference vegetation index, Rix-roughness index, hdis-human disturbance.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Satoshi Yamashita et al., investigate the physical mechanisms driving tissue bending using the cellular Potts Model, starting from a planar cellular monolayer. They argue that apical length-independent tension control alone cannot explain bending phenomena in the cellular Potts Model, contrasting with the vertex model. However, the evidence supporting this claim is incomplete. They conclude that an apical elastic term, with zero rest value (due to endocytosis/exocytosis), is necessary in constricting cells and that tissue bending can be enhanced by adding a supracellular myosin cable. Notably, a very high apical elastic constant promotes planar tissue configurations, opposing bending.

      Strengths:

      - The finding of the required mechanisms for tissue bending in the cellular Potts Model provides a more natural alternative for studying bending processes in situations with highly curved cells.

      - Despite viewing cellular delamination as an undesired outcome in this particular manuscript, the model's capability to naturally allow T1 events might prove useful for studying cell mechanics during out-of-plane extrusion.

      We thank the reviewer for the careful comments and insightful suggestions.

      Weaknesses:

      - The authors claim that the cellular Potts Model is unable to obtain the vertex model simulation results, but the lack of a substantial comparison undermines this assertion. No references are provided with vertex model simulations, employing similar setups and rules, and explaining tissue bending solely through an increase in a length-independent apical tension.

      Studies cited in a previous paragraph included the simulations employing the increased length-independent apical tension. For the sake of clarity, we added the citation to them as below.

      P4L174: “In contrast to the simulations in the preceding studies (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021), our simulations could not reproduce the apical constriction”.

      We did not copy the parameters of the vertex models in the preceding studies because we also found that the apical, lateral, and basal surface tensions must be balanced, otherwise the epithelial cells could not maintain their integrity (Figure 1—figure supplement 1), and the ratio used in the preceding studies was outside this suitable range.

      - The apparent disparity between the two models is attributed to straight versus curved cellular junctions, with cells with a curved lateral junction achieving lower minimum energies at steady-state. However, a critical discussion on the impact of T1 events, allowing cellular delamination, is absent. Note that some of the cited vertex model works do not allow T1 events while allowing curvature.

      We appreciate the comment and added it to the discussion as suggested.

      P12L301: “Even when the vertex model allowed the curved lateral surface, the model did not assume the cells to be rearranged and change neighbors, limiting the cell delamination (Pérez-González et al., 2021).”

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      - The suggested mechanism for inducing tissue bending in the cellular Potts Model, involving an apical elastic term, has been utilized in earlier studies, including a cited vertex model paper (Polyakov 2014). Consequently, the physical concept behind this implementation is not novel and warrants discussion.

      The reviewer is correct but Polyakov et al. assumed “that the cytoskeletal components lining the inside membrane surfaces of the cells provide these surfaces with springlike elastic properties” without justification. We assumed that the myosin activity generated not the elasticity but the contractility based on Labouesse et al. (2015), and expected that the surface elasticity corresponded with the membrane elasticity. Also, in the physical concept, we clarified how the contractility and the elasticity differently deformed the cells and tissue, and demonstrated why the elasticity was important for the apical constriction. We added it to the discussion as below.

      P12L316: “In the preceding studies, the apically localized myosin was assumed to generate either the contractile force (Sherrard et al., 2010; Conte et al., 2012; Perez-Mockus et al., 2017; Pérez-González et al., 2021) or the elastic force (Polyakov et al., 2014; Inoue et al., 2016; Nematbakhsh et al., 2020). However, the limited cell shape in the vertex model made them similar in terms of the energy change during the apical constriction, i.e., the effective force to decrease the apical surface. In this study, we showed that the contractile force and the elastic force differently deformed the cells and tissue, and demonstrated why and how the elasticity was important for the apical constriction.”

      - The absence of information on parameter values, initial condition creation, and boundary conditions in the manuscript hinders reproducibility. Additionally, the explanation for the chosen values and their unit conversion is lacking.

      We agree with the comment.

      For the initial configuration, we added an explanation to Tissue deformation by increased apical contractility with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the parameter values we added a section “Parameters for the simulations” in the Methods.

      For the parameters unit conversion, we did not measure the surface tension and cell pressure in an actual tissue and thus could not compare the parameters to the actual forces. Instead, we varied the parameters and demonstrated that the apical constriction was reproduced with the wide range of the parameter values. We added it to the discussion as below.

      P12L310: “It succeeded with a wide range of parameter values, indicating a robustness of the model.”

      Reviewer #2 (Public Review):

      Summary:

      In their work, the authors study local mechanics in an invaginating epithelial tissue. The mostly computational work relies on the Cellular Potts model. The main result shows that an increased apical "contractility" is not sufficient to properly drive apical constriction and subsequent tissue invagination. The authors propose an alternative model, where they consider an alternative driver, namely the "apical surface elasticity".

      Strengths:

      It is surprising that despite the fact that apical constriction and tissue invagination are probably most studied processes in tissue morphogenesis, the underlying physical mechanisms are still not entirely understood. This work supports this notion by showing that simply increasing apical tension is perhaps not sufficient to locally constrict and invaginate a tissue.

      We thank the reviewer for recognizing the importance and novelty of our work.

      Weaknesses:

      The findings and claims in the manuscript are only partially supported. With the computational methodology for studying tissue mechanics being so well developed in the field, the authors could probably have done a more thorough job of supporting the main findings of their work.

      We thank the reviewer for the careful assessment and suggestions. However, our simulations were computationally expensive, and describing the epithelium in an analytically tractable form would require substantial additional work, which is beyond the scope of the present study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Reference line 648: Correct the author's name (Pérez-González).

      We thank the reviewer and corrected the reference.

      (2) "Pale" colors are challenging to discern.

      We updated the figures.

      (3) Figure 1j: What does the yellow color in the cellular junction represent?

      We used the apical lateral site colored yellow in Fig. 1e-f’ to simulate the effect of the adherens junction. We updated the figure legend.

      (4) Figure 2c - left: Why is there a red apical junction?

      Our simulation model marked the apical junction in the initial configuration and updated the marking based on connectedness to other surrounding sites marked as apical in the same cell. However, once a cell was delaminated and lost its apical junction, any surface sites not adjacent to other epithelial cells were marked as basal junction because they were not adjacent to the apical junction.

      We added it to Cellular Potts model with partial surface elasticity section in the Methods as below.

      P17L430: “To simulate the differential physical properties of the apical, lateral, and basal surfaces, the subcellular locations are marked automatically, and the marking is updated during the simulation. In each cell, sites adjacent to different cells but not to the medium are marked as lateral.

      At the initial configuration, sites adjacent to the apical ECM are marked as apical, and during the simulation, sites adjacent to medium and other apical sites in the same cell are marked as apical.

      Rest of sites which are adjacent to medium but not marked as apical are marked as basal.

      Therefore, once a cell is delaminated and loses its apical surface, afterwards all sites in the cell adjacent to the medium are marked as basal even if it is adjacent to the apical ECM or the outer body fluid.”

      (5) Figure 4a: The snapshots are not in a steady state but in the middle of deformation. Is the time the same for all snapshots? The motivation to change P_0a is related to endocytosis. However, this could be achieved by decreasing P_0a to a non-zero value. Here, in the more drastic limit, the depth (a measure of bending) is very slight, approximately half of a cell size. What physically limits further invagination? Is it the number of cells or the range of parameters under study?

      The time length was the same for the simulations within each figure, and we added this to the Parameters for the simulations section in the Methods as below.

      P18L466: “In each figure, snapshots of the simulations show deformation by the same time length unless specified.”

      For P_0a, the reviewer is correct that iterated ratcheting may decrease P_0a step by step instead of making it 0 immediately. Still, with P_0a > 0, the energy function and its derivative both increase with the apical width as long as P_a > P_0a, and thus the apical shrinkage would be synchronized, even though the deformation would be smaller. We also ran simulations in which P_0a was decreased to 0.6 times the initial P_a, and observed smaller deformation as expected. On the other hand, the non-zero P_0a made the invagination deeper when it was combined with the effect of the surrounding supracellular myosin cable, possibly due to a resistance of the apical surface against compression. One of the novel and important findings of this study is the synergetic effect of the elasticity-based apical constriction and the surrounding supracellular myosin cable. To demonstrate that the deep invagination was not due to the apical surface resistance against compression, we showed the simulations with P_0a = 0.
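
      The monotonicity argument in this response can be written out explicitly. The sketch below assumes a quadratic apical elasticity term with elastic constant E_s and rest length P_{0a}. The manuscript's Equation 2 is not reproduced here, so this functional form is an assumption for illustration only.

```latex
H_a = E_s\,(P_a - P_{0a})^2, \qquad
\frac{\partial H_a}{\partial P_a} = 2E_s\,(P_a - P_{0a}) > 0
\quad \text{for } P_a > P_{0a}, \qquad
\frac{\partial^2 H_a}{\partial P_a^2} = 2E_s > 0 .
```

      Under this assumed form, a cell with a wider apical surface experiences a stronger shrinking force, so wider cells catch up with narrower ones. A linear contractility term of the form J_a P_a would instead have a constant derivative J_a, independent of the apical width.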

      For the conditions for further invagination, it may include the number of cells, a ratio between the cell height and width (Figure 5—figure supplement 1), interaction with ECM (Figure 5—figure supplement 2), etc. For the parameter, there might be an upper limit (Figure 4). We did not test the number of cells because of its computational cost. Among the conditions we tested, we found the planar compression by surrounding supracellular myosin the most influential rather than the mechanical property of apically constricting cells themselves.

      How each condition and parameter contributes to the invagination shall be studied in future. We added it to the conclusion as below.

      P15L395: “The depth, curvature, and speed of the invagination might be influenced by the cell shape, configuration, and parameters, and how each condition contributes to the invagination shall be studied in future.”

      (6) Figure 6b: What does the cell-surface color represent? If the idea was to represent junction tension, it would be clearer to color the junctions only.

      The junction tension may vary differently in different situations. For example, T1 transition is accompanied by enriched myosin along a shrinking cell-cell junction, and the junction bears higher tension, but other junctions of the same cell do not and thus the cell does not decrease its apical surface. In chick embryo neural tube closure, the junction tension is also polarized, and the cells shrink the apical surface along medial-lateral axis, driving the apical constriction (Nishimura et al., 2012, doi:10.1016/j.cell.2012.04.021). In the case of Drosophila embryo tracheal invagination, the cells shrank their apical surface isotropically (Figure 6a). If the junction tension was responsible for the shrinkage, all junctions of the cell must bear higher tension. Based on this assumption, the junction tension was averaged in each cell to check if the tracheal cells bore the higher average tension than surrounding cells.

      We also plotted stress tensor and calculated nematic order to check if there was radial or encircling tension alignment in the tracheal pit, but there was not.

      (7) Figure 6c: What does the junction color represent here?

      The junction color represents the relative junctional tension. We updated the figure legend.

      (8) Figure 6d-e: It is challenging to understand which error bar corresponds to each dataset.

      We updated the figure.

      (9) What is the definition of relative pressure?

      The geometrical tension inference method assumes that the tissue is in mechanical equilibrium and a sum of the junctional tensions and cell pressures pulling/pushing a vertex (tricellular junction) is 0. Therefore the calculated tensions and pressures are proportional to each other but not absolute values. We added it to the 3D Bayesian tension inference section of Methods as below.

      P24L567: “Since Equation 13 and Equation 14 only evaluate the balance among the forces, it cannot estimate an absolute value but a relative value of the tension and pressure.”
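
      The point that a force-balance condition fixes only relative values can be illustrated with a toy example. The sketch below is not the authors' inference method: it considers a single vertex with three junctions, omits cell pressures for simplicity, and uses hypothetical names (`net_force`, `angles`, `tensions`). It shows that if one set of tensions balances at the vertex, any uniformly scaled set balances as well.

```python
import math


def net_force(tensions, angles_deg):
    """Sum of junction tensions pulling on one vertex, as (fx, fy)."""
    fx = sum(t * math.cos(math.radians(a)) for t, a in zip(tensions, angles_deg))
    fy = sum(t * math.sin(math.radians(a)) for t, a in zip(tensions, angles_deg))
    return fx, fy


# Three junctions meeting at 120-degree angles (illustrative geometry).
angles = [90.0, 210.0, 330.0]

# Equal tensions balance at this vertex.
tensions = [1.0, 1.0, 1.0]

# Multiplying every tension by the same factor keeps the vertex balanced,
# so the balance condition alone cannot determine the absolute scale.
scaled = [5.0 * t for t in tensions]

print(net_force(tensions, angles))
print(net_force(scaled, angles))
```

      Both calls return a net force that is zero up to floating-point error, which is why the inferred tensions and pressures are reported as relative rather than absolute values.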

      (10) In the main text, it is mentioned that a large Es (apical elastic constant) leads to flat surfaces, avoiding bending, but the abstract says "strong apical surface tension," which, according to the rest of the text, would seem to be J_apical. Clarification is needed.

      The surface tension includes both of the surface contractility and the surface elasticity.

      We added it to Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      and

      P4L151: “The energy H included only the terms of the contact energy (Equation 1) and the area constraint (Equation 5), but neither the surface elasticity (Equation 2) nor (Equation 3) was included, and thus the surface tension was determined by the contact energy.”

      Reviewer #2 (Recommendations For The Authors):

      (1) The model used is rather specific and it is rather confusing whether the issue is in the methodology or fundamental biophysics of apical constriction. For instance, one of the main narratives of the manuscript is that the Cellular Potts model better predicts apical constriction and tissue invagination than the vertex model. As I understand it, and as the authors state in p7 (line 210), "the difference between the vertex model and the cellular Potts model results was due to the straight lateral surface...". I assume that if apical constriction and tissue invagination were modelled with a vertex model with curved edges, while also allowing for cell rearrangements out of the tissue plane (some sort of epithelium-to-mesenchyme transition), the vertex model would yield exactly the same results as in the authors' cellular Potts model. If my understanding is correct, the authors should change the narrative of their manuscript and focus more on the comparison of a model with flat vs. curved edges, with "contractility" vs. "surface elasticity", with patterned apical contractility vs. non-patterned contractility (see my comment in point 2 below)... and not on comparison between CPM and VM.

      We appreciate the comments. The reviewer is correct that the vertex model can include curved edges and cell rearrangement, and that it would then reproduce the results of our cellular Potts model simulations. With the cellular Potts model, however, there was no need to specify in advance how the cell surface could curve (in a large arc, a zigzag, or another shape), and that enabled us to find the conditions of delamination and bending.

      We added it to the discussion as below.

      P12L311: “Note that the vertex model could also be extended to incorporate the curved edges and rearrangement of the cells by specifically programming them, and would reproduce the cell delamination. That is, we could find the importance of the balanced pressure because the cellular Potts model intrinsically included a high degree of freedom for the cell shape, the cell rearrangement, and the fluctuation.”

      (2) About physics... and I think this is a really important point: one of the observations in the model was that in the "contractility" model, only "edge cells" shrank their apical surfaces, while inner cells remained quadrilateral. Related to this, the authors say that one of the requirements for proper apical constriction is a mechanism that "simultaneously shrinks the apical surface among cells in a cluster". What would happen if the authors assumed patterned contractility, meaning that cells in the center of the cluster would be most apically-contractile, while those further away from the center would not be contractile? Features like this were investigated in studies of ventral-furrow invagination [see, for instance, Spahn and Reuter PLOS ONE (2013) and Rauzi et al. Nat Commun (2015)-Fig. S13d].

      We thank the reviewer for the critical comment, and ran simulations with the patterned apical contractility. The apical contractility following a gradient of parabola shape succeeded in the simultaneous apical shrinkage. However, it was weak against fluctuations and the cells were delaminated by chance.

      We added it to the Apical constriction by modified apical elasticity section in the Results as below.

      P9L252: “We also tested another model for the simultaneous apical shrinkage, a gradient contractility model (Spahn and Reuter, 2013; Rauzi et al., 2015). If the inner cells bear higher apical surface contractility than the edge cells, the inner cells may shrink their apical surface. To synchronize the apical shrinkage, the apical contractility must follow a parabola shape gradient. Even though the gradient contractility enabled the cells to shrink the apical surface simultaneously, often some of the cells shrank faster than neighbors and were delaminated by chance (Figure 4—figure Supplement 1).”

      (3) The quality of the figures should be improved. Especially, Figure 3 and the related explanation in lines 183-192. This explanation is way too complicated and it is not clear what Figure 3c shows. For instance: if the arrows are indeed showing contractile forces (as written in the caption) then they are not illustrated correctly, but should be tangential to the cell membrane.

      We updated the figure.

      (4) The figures mostly show steady-state cross-sections from simulations. I miss a more dedicated study with model parameters being varied through wider ranges and some phase diagrams being shown etc. Also, some results could probably be supported by analytic calculations. For instance, the condition for stability (discussed in p4 lines 145-151), cells' preferred aspect ratio, cells' preferred "wedgeness" i.e., local curvature etc... I am sure some of these, if not all, could be calculated analytically and then these analytic results could help to interpret the phase diagrams.

      For the simulation results shown in the figures, we were not sure whether the simulations had reached a steady state. We added it to the Tissue deformation by increased apical contractility simulated with cellular Potts model section in the Results as below.

      P4L170: “A simulation started from a flat monolayer of cells beneath the apical ECM, and was continued until resulting deformation of cells and tissue could be evaluated for success or failure of reproducing the apical constriction.”

      For the ranges of parameters, we ran the simulation in wider range and showed results from sub-range. We added it to Parameters for the simulations section in Methods as below.

      P18L464: “The parameters were varied in a range, and the figures showed simulations with parameter values within a sub-range so that the results showed both success and failure in a development of interest.”

      For the analytical calculations, the Figure 3f shows a kind of phase diagram for shapes of a single cell. To clarify this, we rephrased “map of cell shapes” to “Phase diagram of cell shapes” in the figure legend, and added an explanation to the Results section as below.

      P6L207: “For the analysis of the cell shape in motion, we plotted a phase diagram for shapes of a single cell (Figure 3f).”

      For the analytical evaluation of cellular Potts model simulations, a previous study performed a similar analysis, but it concerned a cell of isotropic shape in a steady state (Magno et al., 2015, doi:10.1186/s13628-015-0022-x). Also, our simulation framework is computationally expensive, and we could not vary the parameters at fine resolution. Therefore, we could not include such an analysis in this study.

      (5) I am not sure about the terminology "contractility" vs. "elasticity". In Farhadifar et al. (2007) "contractility" is described by a squared apical-perimeter energy term, while in this work, the authors describe it by a surface-energy-like term.

      In general, elasticity is the ability of a material to resist deformation and to return to its original shape or size. In Farhadifar et al. (2007), the cell apical area was assigned an area elasticity in this sense. Contractility is the ability to decrease size or length, and thus it can be expressed as either a linear or a quadratic term, depending on the model. In this study, we assumed that cell-cell/cell-ECM adhesion and myosin activity generate the surface contractility, and thus employed the linear expression. In Farhadifar et al. (2007), the corresponding linear term was described as a line tension.
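
      For reference, the terms being compared in this exchange can be placed side by side. The expression below is the vertex-model energy function as it is commonly written following Farhadifar et al. (2007); the symbols are the conventional ones and are not taken from the manuscript under review.

```latex
E = \sum_{\alpha} \frac{K_\alpha}{2}\left(A_\alpha - A_\alpha^{(0)}\right)^2
  + \sum_{\langle i,j \rangle} \Lambda_{ij}\,\ell_{ij}
  + \sum_{\alpha} \frac{\Gamma_\alpha}{2}\,L_\alpha^2
```

      Here the first term is the area elasticity of cell α, the second is the line tension on the junction of length ℓ_ij, and the third is the term quadratic in the cell perimeter L_α that the reviewer refers to as "contractility". The linear surface contractility used by the authors is analogous to the second term.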

      We used the terms surface ‘elasticity’ and ‘contractility’ as distinctive elements composing the surface ‘tension’. We added it to the Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L122: “Note that in some studies the tension and the contractility are considered as equivalent, but they are distinguished in this study.”

      (6) It is not entirely clear what are apical, basal, lateral, and cell "perimeters". This is a 2D model, so I assume all P-s are in fact interface lengths. In either case, this needs to be explained more clearly.

      We updated the explanation in Extended cellular Potts model to simulate epithelial deformations section in the Results as below.

      P3L111: “The cell's perimeter was partitioned automatically based on adjacency with other cells, and it was marked as apical, lateral, basal. Also, apico-lateral sites were marked as a location for the adherens junction. This cell representation also cast the vertical section of the cell. Therefore an area of the cell corresponded with a body of the cell, and a perimeter of the cell corresponded with the cell surface. Likewise the apical, lateral, and basal parts of the perimeter corresponded with the apical surface, cell-cell interface, and the basal surface of the cell respectively.”

      (7) The term H_{mc} is not clear at all. Why is this term called potential energy? What is U(i)? What is the exact biophysical interpretation of this term in 2D vs 3D?

      In 3D, the supracellular myosin cable is formed encircling the cells deformed by the apical constriction. Shrinking of the supracellular myosin cable makes the circle small, and it moves the cable toward the center of the circle. To simulate this motion of the supracellular myosin cable in the 2D cross section, we assigned the force exerted on the adherens junction of the boundary cells pulling toward the center, and because the force is relative to the position of the adherens junction and the center, it was expressed by the potential energy in the simulation.

      We updated Extended cellular Potts model to simulate epithelial deformation section in Results and Cellular Potts model with potential energy section in Methods as below.

      P4L140: “The potential energy was defined by a scalar field which made a horizontal gradient decreasing toward the center,”

      and

      P17L449: “In 3D, tension on a circular actomyosin cable would shrink the circle, and the shrinkage would pull the cable toward the center of the circle. In 2D cross section, the cable is pulled horizontally toward the middle line.”

      (8) Highten->increased

      We updated the text.

      (9) "It seems natural to consider that the myosin generates a force proportional to its density but not to the surface width nor the strain". This sentence should be supported by a reference. Also, if the force is proportional to myosin density, then it must depend on surface width, since density, I assume, is the number of motors per area.

      Regarding the myosin density and the generated force, in all preceding studies cited in this manuscript, and in others to the best of our knowledge, the density of myosin and actin filaments visualized by staining or labeling has been assumed to reflect the generated contractility without citing references. Therefore, it appears to be a well-established and shared assumption.

      For the independence from the surface width and strain, the review comment is correct, but the results would be the same. If we presumed that the number of motors on the apical surface was constant in a cell during the apical constriction, then the density would increase when the apical surface was contracted, and thus it would make the apical contractility more unbalanced and promote the delamination. We added it to the results and discussion as below.

      P4L166: “For the sake of simplicity, we ignored an effect of the constriction on the apical myosin density, and discussed it later.”

      P14L328: “In our model, for the sake of simplicity, we ignored an effect of the constriction on the apical myosin density. If we presumed that the apical myosin would be condensed by the shrinkage of the apical surface, it would increase the apical tension in the shrinking cell and is expected to promote the cell delamination further. Therefore it would not change the results.”
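      The argument in this reply can be written out as a short worked relation. The symbols are assumptions introduced here for illustration and do not appear in the manuscript: $N$ is the number of apical myosin motors in a cell (held constant), $w$ is the apical surface width in the 2D cross section, $\rho$ is the myosin density, and $c$ is a proportionality constant between density and tension $T$.

```latex
\rho = \frac{N}{w}, \qquad T = c\,\rho = \frac{c\,N}{w}
\quad\Longrightarrow\quad
\frac{T_1}{T_0} = \frac{w_0}{w_1} > 1 \quad \text{for } w_1 < w_0 .
```

      Under these assumptions, a cell whose apical width shrinks from $w_0$ to $w_1$ gains apical tension by the factor $w_0/w_1$, which is the positive feedback the reply expects to promote delamination further.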

      Reviewing Editor (Recommendations For The Authors):

      Please note also the following excerpts from discussions amongst the reviewers and the Reviewing Editor:

      Regarding Reviewer #2's Point 2:

      I believe the authors have assumed patterned contractility in their simulations, and this is shown by the "pale blue" cell color (see also lines 162-163). However, as Reviewer #2 points out in their point 2), the pale colors are very hard to see and therefore easy to miss.

      We updated the figure coloring and also added a gradient pattern of contractility.

      Regarding Reviewer #2's point 5:

      It is indeed unconventional to call the "J" terms contractility, they are usually called contact energy or adhesive energy.

      In this study, the surface contractility includes both the contact energy of cell-cell/cell-ECM adhesion and actomyosin activity, and we used the “J” term as is conventional in the cellular Potts model.

      On the other hand, due to the parameters chosen for J_apical and J_basal in the pale blue cells, the apical membrane area will tend to shrink and the basal membrane will tend to enlarge. Because the lateral membrane energy J_lateral is constant among all cells (I think?), this will effectively drive cells to apically contract in the center.

      That expectation was an initial motivation of our study, but we found that the differential J alone could not drive the cells to apically contract in the center.

      I agree that extra clarification by the authors would be very helpful here.

      Reviewer #2:

      Regarding the patterned contractility: indeed, I missed this point (the pale blue region is really poorly visible).

      Nevertheless, it seems that contractility in the authors' model changes in a step-like fashion.

      [...] There may be important differences between furrowing under step-like patterning profile versus smooth "bell-like" patterning (see Supplementary Figure 13 in Rauzi et al. Nat Commun 2015). In particular, in the case of a step-like patterning, [there are] constrictions of side cells (similar to what the authors in this manuscript report), whereas in the bell-like patterning, [...] such side constrictions [do not occur].

      As noted in our reply to Reviewer #2's comment (2), we added simulations with gradient-pattern contractility.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Main points:

      (1) We have added data for fructose in Fig. 1

      (2) We have added statistics (red stars and NS) comparing Tp between fed and refed flies.

      (3) We have changed the data points in the figures to small open circles.

      (4) We have moved the data from Fig. S3 to Fig. 2 and 3.

      (5) We have added the schematic diagrams depicting the behavioral assay in Fig. S1.

      (6) We have added heatmaps for WT and Gr64f-Gal4>UAS-CsChrimson flies in Fig. S2.

      (7) We have added Orco1 mutant data in Fig. S4.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents valuable findings that gustation and feeding state influence the preferred environmental temperature in flies. Interestingly, the authors showed that by refeeding starved animals with the non-nutritive sugar sucralose, they are able to tune their preference towards a higher temperature in addition to nutrient-dependent warm preference. The authors show that temperature-sensing and sweet-sensing gustatory neurons (SGNs) are involved in the former but not the latter. In addition, their data indicate that peptidergic signals involved in internal state and clock genes are required for taste-dependent warm preference behavior.

      The authors made an analogy of their results to the cephalic phase response (CPR) in mammals where the thought, sight, and taste of food prepare the animal for the consumption of food and nutrients. They further linked this behavior to core regulatory genes and peptides controlling hunger and sleep in flies having homologues in mammals. These valuable behavioral results can be further investigated in flies with the advantage of being able to dissect the neural circuitry underlying CPR and nutrient homeostasis.

      Strengths: 

      (1) The authors convincingly showed that tasting is sufficient to drive warm temperature preference behavior in starved flies and that it is independent of nutrient-driven warm preference. 

      (2) By using the genetic manipulation of key internal sensors and genes controlling internal feeding and sleep states such as DH44 neurons and the per genes for example, the authors linked gustation and temperature preference behavior control to the internal state of the animal. 

      Weaknesses: 

      (1) The title is somewhat misleading, as the term homeostatic temperature control linked to gustation only applies to starved flies. 

      We agree with the reviewer's suggestion and have changed the title to "Taste triggers a homeostatic temperature control in hungry flies".

      (2) The authors used a temperature preference assay and refeeding for 5 minutes, 10 minutes, and 1 hour.

      Experimentally, it makes a difference if the flies are tested immediately after 10 minutes or at the same time point as flies allowed to feed for 1 hour. Is 10 minutes enough to change the internal state in a nutrition-dependent manner? Some of the authors' data hint at it (e.g. refeeding with fly food for 10 minutes), but it might be relevant to feed for 5/10 minutes and wait for 55/50min to do the assays at comparable time points.

      Thank you for your suggestions. The temperature preference behavioral test itself takes 30 minutes from the time the flies are placed in the apparatus until the final choice is made. This means that after the hungry flies have been refed for 5 minutes, they will determine their preferred temperature within 35 minutes. It has been shown that insulin levels peak at 10 minutes and gradually decline (Tsao, et al., PLoS Genetics 2023). However, it is unclear how subtle changes in insulin levels affect behavior and how quickly the flies are able to consume food. These factors may contribute to temperature preference in flies. Therefore, to minimize "extraneous" effects, we decided to run the behavioral assay immediately after the flies had eaten the food. We have noted in the Materials and Methods section why we chose this condition, based on the duration of the behavioral assay and the insulin effect.

      (3) A figure depicting the temperature preference assay in Figure 1 would help illustrate the experimental approach. It is also not clear why Figure 1E is shown instead of full statistics on the individual panels shown above (the data is the same). 

      We have revised Figure 1A and added statistics in Figure 1BCD. We also added a figure depicting the temperature preference assay (Fig. S1).

      (4) The authors state that feeding rate and amount were not changed with sucralose and glucose. However, the FLIC assay they employed does not measure consumption, so this statement is not correct, and it is unclear if the intake of sucralose and glucose is indeed comparable. This limits some of the conclusions. 

      We agree; we have removed “amount” and revised the MS.

      (5) The authors make a distinction between taste-induced and nutrient-induced warm preference. Yet the statistics in most figures only show the significance between the starved and refed flies, not the fed controls. As the recovery is in many cases incomplete and used as a distinction of nutritive vs nonnutritive signals (see Figure 1E) it will be important to also show these additional statistics to allow conclusions about how complete the recovery is. 

      We agree with the comments and have revised the MS and figures. 

      (6) The starvation period used is ranging from 1 to 3 days, as in some cases no effect was seen upon 1 day of starvation (e.g. with clock genes or temperature sensing neurons). While the authors do provide a comparison between 18-21 and 26-29 hours old flies in Figure S1, a comparison for 42-49 and 66-69 hours of starvation is missing. This also limits the conclusion as the "state" of the animal is likely quite different after 1 day vs. 3 days of starvation and, as stated by the authors, many flies die under these conditions.  

      We mainly used 2 overnights of starvation. Some flies (e.g. Ilp6 mutants) were completely healthy even after 2 overnights of starvation, so we had to starve them for 3 overnights. For example, Ilp6 mutants needed 3 overnights of starvation to show a significant difference in Tp between fed and starved flies. On the other hand, some flies (e.g. w1118 control flies) were very sick after 2 overnights of starvation, so we had to starve them for one overnight. Therefore, the starvation conditions used in this manuscript range from 1 to 3 overnights.

      First, we determined the starvation time by requiring a statistically significant Tp difference between fed and starved flies; as mentioned above, flies prefer lower temperatures when starvation is prolonged (Umezaki et al., Current Biology 2018). Therefore, if Tp was not statistically different between fed and starved flies, we extended the starvation time from 1 to 3 overnights. Importantly, we show in Fig. S3 that the duration of starvation did not affect the recovery effect. Furthermore, since control flies do not survive 42-49 or 66-69 hours of starvation, we cannot test the reviewer's suggestion. We have carefully documented the conditions in the Materials and Methods and figure legends.

      (7) In Figure 2, glucose-induced refeeding was not tested in Gr mutants or silenced animals, which would hint at post-ingestive recovery mechanisms related to nutritional intake. This is only shown later (in Figure S3) but I think it would be more fitting to address this point here. The data presented in Figure S3 regarding the taste-evoked vs nutrient-dependent warm preference is quite important while in some parts preliminary. It would nonetheless be justified to put this data in the main figures. However, some of the conclusions here are not fully supported, in part due to different and low n numbers, which due to the inherent variability of the behavior do not allow statistically sound conclusions. The authors claim that sweet GRNs are only involved in taste-induced warm preference, however, glucose is also nutritive but, in several cases, does not rescue warm preference at all upon removal of GRN function (see Figures S3A-C). This indicates that the Gal4 lines and also the involved GRs are potentially expressed in tissues/neurons required for internal nutrient sensing. 

      Thank you for your suggestion. We have added Figure S3ABC (glucose refeeding using Gr mutants and silenced animals) to Figure 2. There is no low N number since we tested > 5 times, i.e. >100 flies were tested. Tp may have a variation probably due to the effect of starvation on their temperature preference. 

      We did not state that sweet GRNs are only involved in taste-induced warm preference; however, our writing may not have been clear enough. We agree that "...GRs may be expressed in tissues/neurons required for internal nutrient sensing. ..." We have rewritten and revised the section.

      (8) In Figure 4, fly food and glucose refeeding do not fully recover temperature preference after refeeding. With the statistical comparison to the fed control missing, this result is not consistent with the statement made in line 252. I feel this is an important point to distinguish between state-dependent and taste/nutrition-dependent changes.  

      We inserted the statistics and compared between Fed and other conditions. 

      (9) The conclusion that clock genes are required for taste-evoked warm preference is limited by the observation that they ingest less sucralose. In addition, the FLIC assay does not allow conclusions about the feeding amount, only the number of food interactions. Therefore, I think these results do not allow clear-cut conclusions about the impact of clock genes in this assay.  

      We agree; we have removed “amount” and revised the MS. The per01 mutants ate (touched) sucralose more often than glucose. On the other hand, tim01 mutants ate glucose more often than sucralose (Figure S6BC). However, these mutants still showed a similar Tp pattern for sucralose and glucose refeeding (Fig. 5CD). The results suggest that the tim01 flies eat enough sucralose relative to glucose that their food intake does not affect the Tp behavioral phenotype. We have rewritten and revised the section.

      (10) CPR is known to be influenced by taste, thought, smell, and sight of food. As the discussion focused extensively on the CPR link to flies it would be interesting to find out whether the smell and sight of food also influence temperature preference behavior in animals with different feeding states.  

      We have added data from Olfactory receptor co-receptor (Orco1) mutants, which lack olfaction, in Fig. S4. They failed to show the taste-evoked warm preference, but exhibited the nutrient-induced warm preference. Therefore, the data suggest that olfactory detection is also involved in taste-evoked warm preference. On the other hand, "seeing food" is probably more complicated, since light dramatically affects temperature preference behavior and the circadian clock that regulates temperature preference rhythms. Therefore, it is unlikely that a solid conclusion could be drawn from a short set of experiments. We will address this issue in the next study.

      (11) In the discussion in line 410ff the authors claim that "internal state is more likely to be associated with taste-evoked warm preference than nutrient-induced warm preference." This statement is not clear to me, as neuropeptides are involved in mediating internal state signals, both in the brain itself as well as from gut to brain. Thus, neuropeptidergic signals are also involved in nutrient-dependent state changes, the authors might just not have identified the peptides involved here. The global and developmental removal of these signals also limits the conclusions that can be drawn from the experiments, as many of these signals affect different states, circuits, and developmental progression.  

      We agree with the comments. We have removed the sentences and revised the MS.  

      Reviewer #2 (Public Review): 

      Animals constantly adjust their behavior and physiology based on internal states. Hungry animals, desperate for food, exhibit physiological changes immediately upon sensing, smelling, or chewing food, known as the cephalic phase response (CPR), involving processes like increased saliva and gastrointestinal secretions. While starvation lowers body temperature, the mechanisms underlying how the sensation of food without nutrients induces behavioral responses remain unclear. Hunger stress induces changes in both behavior and physiological responses, which in flies (or at least in Drosophila melanogaster) leads to a preference for lower temperatures, analogous to the hunger-driven lower body temperature observed in mammals. In this manuscript, the authors have used Drosophila melanogaster to investigate the issue of whether taste cues can robustly trigger behavioral recovery of temperature preference in starving animals. The authors find that food detection triggers a warm preference in flies. Starved flies recover their temperature preference after food intake, with a distinction between partial and full recovery based on the duration of refeeding. Sucralose, an artificial sweetener, induces a warm preference, suggesting the importance of food-sensing cues. The paper compares the effects of sucralose and glucose refeeding, indicating that both taste cues and nutrients contribute to temperature preference recovery. The authors show that sweet gustatory receptors (Grs) and sweet GRNs (Gustatory Receptor Neurons) play a crucial role in taste-evoked warm preference. Optogenetic experiments with CsChrimson support the idea that the excitation of sweet GRNs leads to a warm preference. The authors then examine the internal state's influence on taste-evoked warm preference, focusing on neuropeptide F (NPF) and small neuropeptide F (sNPF), analogous to mammalian neuropeptide Y. 
Mutations in NPF and sNPF result in a failure to exhibit taste-evoked warm preference, emphasizing their role in this process. However, these neuropeptides appear not to be critical for nutrient-induced warm preference, as indicated by increased temperature preference during glucose and fly food refeeding in mutant flies. The authors also explore the role of hunger-related factors in regulating taste-evoked warm preference. Hunger signals, including diuretic hormone (DH44) and adipokinetic hormone (AKH) neurons, are found to be essential for taste-evoked warm preference but not for nutrient-induced warm preference. Additionally, insulin-like peptides 6 (Ilp6) and Unpaired3 (Upd3), related to nutritional stress, are identified as crucial for taste-evoked warm preference. The investigation then extends into circadian rhythms, revealing that taste-evoked warm preference does not align with the feeding rhythm. While flies exhibit a rhythmic feeding pattern, taste-evoked warm preference occurs consistently, suggesting a lack of parallel coordination. Clock genes, crucial for circadian rhythms, are found to be necessary for taste-evoked warm preference but not for nutrient-induced warm preference. 

      Strengths: 

      A well-written and interesting study, investigating an intriguing issue. The claims, none of which to the best of my knowledge controversial, are backed by a substantial number of experiments. 

      Weakness: 

      The experimental setup used and the procedures for assessing the temperature preferences of flies are rather sparingly described. Additional details and data presentation would enhance the clarity and replicability of the study. I kindly request the authors to consider the following points: 

      i) A schematic drawing or diagram illustrating the experimental setup for the temperature preference assay would greatly aid readers in understanding the spatial arrangement of the apparatus, temperature points, and the positioning of flies during the assay. The drawing should also be accompanied by specific details about the setup (dimensions, material, etc). 

      Thank you for your suggestions. We have added the schematic drawing in Fig. S1.

      ii) It would be beneficial to include a visual representation of the distribution of flies within the temperature gradient on the apparatus. A graphical representation, such as a heatmaps or histograms, showing the percentage of flies within each one-degree temperature bin, would offer insights into the preferences and behaviors of the flies during the assay. In addition to the detailed description of the assay and data analysis, the inclusion of actual data plots, especially for key findings or representative trials, would provide readers with a more direct visualization of the experimental outcomes. These additions will not only enhance the clarity of the presented information but also provide the reader with a more comprehensive understanding of the experimental setup and results. I appreciate the authors' attention to these points and look forward to the potential inclusion of these elements in the revised manuscript. 

      Thank you for the advice. We have added heatmaps for WT and Gr64f-Gal4>CsChrimson flies in Fig. S2.
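      The per-bin distribution the reviewer asks for can be sketched as follows. This is a minimal illustration only: the function name, the temperature range, and the example positions are hypothetical and are not data or code from the study. It assumes each fly's final position has already been converted to a temperature in °C.

```python
import numpy as np

def percent_per_degree_bin(temps_c, t_min, t_max):
    """Percentage of flies in each one-degree bin between t_min and t_max.

    temps_c: temperatures (°C) at the flies' final positions (hypothetical input).
    Returns the bin edges and the percentage of flies in each bin.
    """
    edges = np.arange(t_min, t_max + 1, 1.0)
    counts, _ = np.histogram(temps_c, bins=edges)
    total = counts.sum()
    if total == 0:
        return edges, np.zeros(len(counts), dtype=float)
    return edges, 100.0 * counts / total

# Made-up positions for one trial, for illustration only.
temps = [18.2, 18.7, 21.4, 22.1, 22.9, 24.5, 24.6, 25.3]
edges, percent = percent_per_degree_bin(temps, 18, 26)
for lo, p in zip(edges[:-1], percent):
    print(f"{lo:.0f}-{lo + 1:.0f} °C: {p:.1f}%")
```

      Stacking one such percentage vector per replicate as rows of a 2D array gives the input for a heatmap of the kind suggested, with replicates on one axis and temperature bins on the other.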

      Reviewer #3 (Public Review): 

      Summary: 

      The manuscript by Yujiro Umezaki and colleagues aims to describe how taste stimuli influence temperature preference in Drosophila. Under starvation flies display a strong preference for cooler temperatures than under fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits and peptidergic signalling play a pivotal role in gustation-evoked alteration in temperature preference. 

      The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. 

      Strengths: 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons, peptidergic neurons and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. 

      Weaknesses: 

      In my opinion, the authors have opened many new questions but did not fully answer the initial question - how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermosensation in order to change temperature preference? Before addressing all the following questions of the manuscript the authors should first directly decipher the neuronal interplay between these two types of neurons. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Figure S3D is cited before S2, so please rearrange the numbering.

      Thank you. We have changed the numbering.

      I would also suggest a different color to visualize the data points in Figure S3, as some are barely visible on the dark bars (e.g. on a dark green background). 

      We have revised the figures. The data points were changed to smaller open circles.

      Reviewer #2 (Recommendations For The Authors): 

      *Please, expand on the experimental procedure, and describe the assay in detail. 

      We have added a scheme for the assay in Fig. S1 and also have revised the manuscript and figures.

      *Show the distribution of the gradient data that the preference values are based upon. Not necessarily for all, but for select key experiments. Heatmaps for each replicate (stacked on top of each other) would be a nice way of showing this. Simple histograms would of course work as well. 

      We have added heatmaps for selected key experiments in Fig. S2 and revised the manuscript and figures accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript by Yujiro Umezaki and colleagues aims at describing how taste stimuli influence temperature preference in Drosophila. Under starvation, flies display a strong preference for cooler temperatures than under-fed conditions that can be reversed by refeeding, demonstrating the strong impact of metabolism on temperature preference. In their present study, Umezaki and colleagues observed that such changes in temperature preference are not solely triggered by the metabolic state of the animal but that gustatory circuits play a pivotal role in temperature preference. The study of Umezaki is definitively interesting and the findings in this manuscript will be of interest to a broad readership. However, I would like to draw the authors' attention to some points of concern: 

      The title to me sounds somehow inadequate. The definition of homeostasis (Cambridge Dictionary) is as follows: "the ability or tendency of a living organism, cell, or group to keep the conditions INSIDE it the same despite any changes in the conditions around it, or this state of internal balance". What do the authors mean by homeostatic temperature control? Reading the title not knowing much about poikilotherm insects I would understand that the authors claim that Drosophila can indeed keep a temperature homeostasis as mammals do. As Drosophila is not a homoiotherm animal and thus cannot keep its body temperature stable the title should be amended.  

      Homeostasis means a state of balance between all the body systems necessary for the body to survive and function properly. Drosophila are ectotherms, so the source of temperature comes from the environment, and their body temperature is very similar to that of their environment. However, the flies' temperature regulation is not simply a passive response to temperature. Instead, they actively seek a temperature based on their internal state. We have shown that the preferred temperature increases during the day and decreases during the night, showing a circadian rhythm of temperature preference (TPR). Because their environmental temperature is very close to their body temperature, TPR gives rise to body temperature rhythms (BTR). We have shown that TPR is similar to BTR in mammals (Kaneko et al., Current Biology 2012 and Goda et al., JBR 2023). Similarly, we showed that the hungry flies choose a lower temperature so that the body temperature is also lower. Therefore, our data suggest that the fly maintains its homeostasis by using the environmental temperature to adjust its body temperature to an appropriate temperature depending on its internal state. Therefore, we would like to keep the title as "Taste triggers a homeostatic temperature control in hungry flies". We have added more explanation in the Introduction and Discussion.

      Accordingly, the authors compare the preference of flies to cooler temperatures to the reduced body temperature of mammals (Lines 64 - 65). However, according to the cited literature the reduced body temperature in starved rats is discussed to reduce metabolic heat production (Sakurada et al., 2000). The authors should more rigorously give a short summary of the findings in the cited papers and the original interpretation to help the reader not get confused.

      In flies, it has been shown that a lower temperature means a lower metabolic rate, and a higher temperature means a higher metabolic rate. Therefore, hungry flies choose a lower temperature where their metabolic rate is lower and they do not need as much heat.

      Similarly, in mammals, starvation causes a lower body temperature, hypothermia. Body temperature is controlled by the balance between heat loss and heat production. The starved mammals showed lower heat production. We have added this information to the introduction. 

      The authors show that 5 min fly food refeeding causes a partial recovery of the naïve temperature preference of the flies (Figure 1B) and that feeding of sucralose partially rescues the preference whereas glucose rescues the preference similar to refeeding with fly food would do. As glucose is both sweet and metabolically valuable it would be clearer for the reader if the authors start with the fly food experiment and then show the glucose experiment to show that the altered temperature preference depends on the food component glucose. From there they can further argue that glucose is both sweet (hedonic value) and metabolically valuable. And to disentangle sweetness from metabolism one needs a sugar that is sweet but cannot be metabolized - sucralose. 

      Thank you for your advice. Since the data with sucralose is the one we want to highlight the most, we decided to present it in the order of sucralose, glucose, and fly food.

      In the sucralose experiment the authors omit the 5 min data point and only show the 10 min time point. As Figure 1F indicates that both Glucose and Sucralose elicit the same attractiveness in the flies and that sweetness influences the temperature preference, it is important that the authors show the 5 min temperature preference too to underline the effect of the sweet taste stimulus on the fly behavior independent from the caloric value. Further, the authors should demonstrate not only the cumulative touches but how much sucralose or glucose may already be consumed by the fly in the depicted time frames. 

      It would be interesting to see how much sucralose or glucose the flies consume over the time frames shown. Although the cumulative exposure to sugar is ideally equivalent to the amount of sugar consumed, we need a different method to actually measure that amount. We now emphasize "cumulative touches" rather than "amount of sugar" in the text. In the next study, we will look at how much sucralose or glucose the fly has already consumed.

      Sucralose and Glucose have a similar molecular structure - it would be interesting to see how the sweet taste of a sugar with a different molecular structure like fructose and its receptor Gr43b (Miyamoto & Amrein 2014) may contribute to temperature preferences.  

      Sucralose and Glucose are not structurally similar. That said, we tested fructose refeeding anyway. The hungry flies showed a taste-evoked warm preference after fructose refeeding. We have added data in Figure 1E and F. The data suggest that sweet taste is more important than sugar structure. We also tested Gr43b>CsChrimson. However, the flies do not show the taste-evoked warm preference (data not shown). The data suggest that Gr43b is not the major receptor controlling taste-evoked warm preference. We have revised the manuscript.

      Both sugars appear similarly attractive to the flies (Figure 1F) - are water, sucralose, and glucose presented in a choice assay or are these individually in separate experiments? 

      Water, sucralose, and glucose were individually presented in separate experiments. We clarified it in the figure legend.

      Subsequently, the authors address the question of how sweet taste may influence temperature preferences in flies. To this end, the authors first employ gustatory receptor mutants for Gr5a, Gr64a, and Gr61a and demonstrate that sucralose feeding does not rescue temperature preference in the absence of sweet taste receptors. In an alternative approach, the authors do not use mutants but an expression of UAS:Kir in Gr64F neurons. Taking a closer look at the graph it appears that the Kir expressing flies have an increased (nearly 1 °C) temperature preference than the starved mutant flies. Is this preference change related to the mutation directly and what would be the result if Kir would be conditionally only expressed after development is completed, or is the observed temperature preference related to the Gr64f-Gal4 line? If the latter would be the case perhaps the authors may want to bring the flies to the same genetic background to allow for a more direct comparison of the temperature preferences. 

      The Gr64fGal4>Kir flies show a ~one degree higher preferred temperature under starvation compared to the mutants. However, the phenotype is similar to the controls, Gr64fGal4/+ flies, under starvation. Therefore, this phenotype is not due to either the mutation or the Kir effect. Most importantly, the Gr64fGal4>Kir flies failed to show a taste-evoked warm preference. Together with other mutant data, we concluded that sweet GRNs are required for taste-evoked warm preference.

      Overall, the figure legend for Figure 2 is very cryptic and should be more detailed.

      We have revised the figure legend for Figure 2. 

      To shed light on the mechanisms underlying the changes in temperature preferences through gustatory stimuli the authors next blocked heat and cold sensing neurons in fed and starved flies and found out that TrpA1 expressing anterior cells and R11F02-Gal4 expressing neurons both participate in sweetness-induced alteration of temperature preference in starved animals. At this point, it should be explicitly indicated in the figure that the flies need more than one overnight starvation to display the behavior (Figure 3A).

      We have revised the manuscript.

      The data provided by the authors indicate a kind of push-and-pull mechanism between heat and cold-sensing neurons under starvation that is somehow influenced by sweet taste sensing. Further, the authors demonstrate that TrpA1- as well as R11F02-Gal4 driven Chrimson activation is sufficient to partially rescue temperature preference under starvation. At this point it is unclear why the authors use a tubGal80ts expression system but not for the TrpA1SH-Gal4 driven Chrimson. As the development itself and the conditions under which the animals were raised may have influence on the temperature preference it is important that both groups are equally raised if the authors want to directly compare with each other.

      As we wrote in the Materials and Methods, the R11F02-Gal4>uas-CsChrimson flies died during development. Therefore, we had to use tubGal80ts. On the other hand, the TrpA1-Gal4>CsChrimson flies can survive to adulthood. As we mentioned in the manuscript, all flies were treated with ATR after they had fully developed into adults. This means that both TrpA1-Gal4 and R11F02-Gal4 expressing cells are activated by red light via CsChrimson only in adult stages. We carefully revised the manuscript.

      It is a pity that the authors at this point have decided to not deepen the understanding of the circuitry between thermo-sensation and metabolic homeostasis but subsequently change the focus of their study to investigate how internal state influences taste-evoked warm preference in hungry flies. Using mutants for NPF and sNPF the authors demonstrate that both peptides play a pivotal role in taste-evoked warm preference after sucrose feeding but not for nutrient-induced warm preference. Similarly, they found that DH44, AKH and dILP6, Upd2 and Upd3 neurons are also required for taste-evoked warm preference but not for nutrient-induced warm preference. Here again, the authors do not keep the systems stable and change between inhibition of neurons through Kir and mutants for peptides. For a better comparison, it would be preferable to use always exactly the same technique to inhibit neuron signalling.

      It would be interesting to find the neural circuitry of thermo-sensation and metabolic homeostasis, but we have not had any luck so far. We will continue to look into the neural circuits which control taste-evoked warm preference and nutrient-induced warm preference. Since UAS-Kir is such a strong reporter, it may sometimes kill the flies. So we could not use UAS-Kir for all Gal4 flies.

      DH44 is expressed in the brain and in the abdominal ganglion where they share the expression pattern with 4 Lk neurons per hemisphere. Seeing the impact of Lk signalling in metabolism (AlAnzi et al., 2010) the authors should provide evidence that the observed effect is indeed because of DH44 and not Lk.

      It would be interesting to see if Lk may play a role in taste-evoked warm preference and/or nutrient-induced warm preference. We would like to systematically screen which neuropeptides and receptors are involved in the behavior in the next study. 

      Seeing the results on dILP6 it is interesting that Li and Gong (2015) could show in larvae that cold-sensing neurons directly interact with dILP neurons in the brain. It would be interesting to see whether similar circuitry may exist in adult flies to regulate temperature preferences and these peptidergic neurons. Further, it appears interesting that again these animals need much longer time to display the observed shift in temperature (which again should be clearly indicated in the figure legend too). These observations should be more carefully considered in the discussion part too.

      We have revised the manuscript.

      In the last part of the study, the authors investigate how sensory input from temperature-sensitive cells may transmit information to central clock neurons and how these in turn may influence temperature preference under starvation. The experiments assume that DH44-expressing neurons play a role in the output pathway of the central clock. Using the clock gene null mutants per and tim the authors show that even though the animals display a significant starvation response neither per nor tim mutants exhibited taste-evoked warm preference, indicating a taste but not nutrient-evoked temperature preference regulation. 

      The authors demonstrate interesting new data on how taste input can influence temperature preference during starvation. They propose how gustatory pathways may work together with thermosensitive neurons, peptidergic neurons and finally try to bridge the gap between these neurons and clock genes. The study is very interesting and the data for each experiment alone are very convincing. However, in my opinion, the authors have opened many new questions but did not fully answer the initial question - how do taste-sensing neurons influence temperature preferences? What are the mechanisms underlying this observation? Instead of jumping from gustatory neurons to thermosensitive neurons to peptidergic neurons to clock genes, the authors should have stayed within the one question they were asking at the beginning. How does sugar sensing influence the physiology of thermo-sensation? Before addressing all the following questions of the manuscript the authors should first directly decipher the neuronal interplay between these two types of neurons.

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermo-sensation and metabolic homeostasis. We have tried, but have had no luck so far.

      The authors could e.g., employ Ca or cAMP-imaging in anterior or cold-sensitive cells and see how the responsiveness of these cells may be altered after sugar feeding. Or at least follow the idea of Li and Gong about the thermo-regulation of dILP-expressing neurons.

      Thank you for your suggestion. Since we do not know how dILP-expressing neurons are involved in the temperature response in adult flies, we will focus on these cells using calcium imaging in the next study.

      Anatomical analysis using the GRASP technique may further help to understand the interplay of these neurons and give new insights into the circuitry underlying food preference alteration under starvation. 

      Thank you for your suggestion. It would be interesting to find the neural circuitry of thermo-sensation and metabolic homeostasis. We have tried, but have had no luck so far.

      Minor comments: 

      Line 51: Hungry animals are desperate for food - I think the authors should not anthropomorphize at this point too much but rather strictly describe how the animals change their behavior without any interpretation of the mental state of the animal.

      We have modified the manuscript.

      Line 80: Hunger and satiety dramatically affect animal behavior and physiology and control feeding - please not only cite the papers but also give a short overview of the cited papers on which behaviors are altered and how. 

      We have revised the manuscript. 

      Overall statistic: The authors do comparative statistics always against starved animals throughout but often state in the text a comparison against fed (Line 111: "but did not reach that of the fed flies"). I think the authors should describe the data according to their statistics and keep this constant throughout the paper.

      Sorry for the confusion. We originally had it, but we removed it. We have added the additional statistical analyses.  

      Figure legends: Overall the figure legends could be more developed and more detailed.

      We have revised the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      As adult-born granule neurons have been shown to play diverse roles, both positive and negative, to modulate hippocampal circuitry and function in epilepsy, understanding the mechanisms by which altered neurogenesis contributes to seizures is important for future therapeutic strategies. The work by Jain et al. demonstrates that increasing adult neurogenesis before status epilepticus (SE) leads to a suppression of chronic seizures in the pilocarpine model of temporal lobe epilepsy. This work is potentially interesting because previous studies showed suppressing neurogenesis led to reduced chronic seizures.

      To increase neurogenesis, the authors conditionally delete the pro-apoptotic gene Bax using a tamoxifen-inducible Nestin-CreERT2 which has been previously published to increase proliferation and survival of adult-born neurons by Sahay et al. After 6 weeks of tamoxifen injection, the authors subjected male and female mice to pilocarpine-induced SE. In the first study, at 2 hours after pilocarpine, the authors examine latency to the first seizure, severity and total number of acute seizures, and power during SE. In the second study in a separate group of mice, at 3 weeks after pilocarpine, the authors examine chronic seizure number and frequency, seizure duration, postictal depression, and seizure distribution/cluster seizures. Overall, the study concludes that increasing adult neurogenesis in the normal adult brain can reduce epilepsy in females specifically. However, important BrdU birthdating experiments in both male and female mice need to be included to support the conclusions made by the authors. Furthermore, speculative mechanisms lacking direct evidence reduce enthusiasm for the findings.

      There are two suggestions. First, BrdU birthdating of newborn neurons is important to add to the paper so that there is support for the conclusions. Second, speculative text reduced enthusiasm. In response, we clarified the conclusions. We do not think that the clarified conclusions require BrdU birthdating (discussed further below). We also removed two schematics (and associated text) that we think the reviewer was referring to when speculation was mentioned.

      We also want to point out something minor: the times of injections listed above are not correct.

      a. Seizures were not measured 2 hrs after pilocarpine; that is when the anticonvulsant diazepam was administered to males. 

      b. Seizures were not measured 3 weeks after pilocarpine; the duration of recording was 3 weeks.  

      (1) BrdU birthdating is required for conclusions.

      We think that the Reviewer was suggesting birthdating because we were not clear about our conclusions, and we apologize for the confusion. The Reviewer stated that we concluded: “conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.”  (Note this is a quote from the review).

      However, we did not intend to conclude that. We intended to conclude that conditionally deleting Bax in Nestin-Cre+ mice reduced chronic seizures in the mouse model of epilepsy that we used. Also, that conclusion only pertained to females. Please note we did not conclude that hilar ectopic granule cells led to reduced seizures. We also concluded that Bax deletion increased neurogenesis in female mice. We have revised the text to make the conclusions clear.

      Abstract, starting on line 67:

      The results suggest that selective Bax deletion to increase adult neurogenesis can reduce experimental epilepsy, and the effect shows a striking sex difference.

      Results, starting on line 448:

      Because Cre+ epileptic females had increased numbers of immature neurons relative to Cre- females at the time of SE, and prior studies show that Cre+ females had less neuronal damage after SE (Jain et al., 2019), female Cre+ mice might have had reduced chronic seizures because of high numbers of immature neurons. However, the data do not prove a causal role.

      Starting on line 477:

      ...we hypothesized that female Cre+ mice would have fewer hilar ectopic GCs than female Cre- mice. However, female Cre+ mice did not have fewer hilar ectopic GCs.

      Discussion, starting on line 563:

      The chronic seizures, measured 4-7 weeks after pilocarpine, were reduced in frequency by about 50% in females. Therefore, increasing young adult-born neurons before the epileptogenic insult can protect against epilepsy. However, we do not know if the protective effect was due to the greater number of new neurons before SE or other effects. Past data would suggest that increased numbers of newborn neurons before SE leads to a reduced SE duration and less neuronal damage in the days after SE. That would be likely to lessen the epilepsy after SE. However, there may have been additional effects of larger numbers of newborn neurons prior to SE.

      Conclusions, starting on line 745:

      In the past, suppressing adult neurogenesis before SE was followed by fewer hilar ectopic GCs and reduced chronic seizures. Here, we show that the opposite - enhancing adult neurogenesis before SE and increased hilar ectopic GCs - do not necessarily reduce seizures. We suggest instead that protection of the hilar neurons from SE-induced excitotoxicity was critical to reducing seizures. The reason for the suggestion is that the survival of hilar neurons would lead to persistence of the normal inhibitory functions of hilar neurons, protecting against seizures. However, this is only a suggestion at the present time because we do not have data to prove it. Additionally, because protection was in females, sex differences are likely to have played an important role. Regardless, the results show that enhancing neurogenesis of young adult-born neurons in Nestin-Cre+ mice had a striking effect in the pilocarpine model, reducing chronic seizures in female mice.

      The Reviewer is correct that it would be interesting to know when the increase in adult neurogenesis occurred that was critical to the effect. For example, was it the initial increase following Bax deletion but before pilocarpine-induced SE, the increase in neurogenesis following SE, or increased adult neurogenesis in the chronic stage of epilepsy? It also might be that related aspects of neurogenesis played a role, such as the degree to which maturation was normal in adult-born neurons. We have not pursued the experiments to identify these aspects of neurogenesis because of how much work it would entail. Also, approaches to conclude cause-effect relationships are going to be difficult.

      (2) Speculation.

      We removed the text and supplemental figures with schematics that we think were the overly speculative parts of the paper the Reviewer mentioned.

      Strengths:

      (1) The study is sex-matched and reveals differences in response to increasing adult neurogenesis in chronic seizures between males and females.

      (2) The EEG recording parameters are stringent, and the analysis of chronic seizures is comprehensive. In two separate experiments, the electrodes were implanted to record EEG from the cortex as well as the hippocampus. The recording was done for 10 hours post pilocarpine to analyze acute seizures, and for 3 weeks continuous video EEG recording was done to analyze chronic seizures.

      Weaknesses:

      (1) Cells generated during acute seizures have different properties to cells generated in chronic seizures. In this study, the authors employ two bouts of neurogenesis stimuli (Bax deletion dependent and SE dependent), with two phases of epilepsy (acute and chronic). There are multiple confounding variables to effectively conclude that conditionally deleting Bax in Nestin-Cre+ cells leads to increased neurogenesis and hilar ectopic granule cells, thereby reducing chronic seizures.

      As mentioned above, with a clarification of our conclusions we think we have addressed the concern. We believe that we conditionally deleted Bax in Nestin-expressing cells. We believe we found that female mice had reduced loss of hilar mossy cells and somatostatin-expressing neurons after SE, and fewer chronic seizures after SE. While it makes sense that increased neurogenesis caused the reduced seizures, we acknowledge it was not proved.

      We do not make conclusions about the role of hilar ectopic granule cells. However, we note that they appear to have been similar in number across groups, which suggests they played no role in the results. This is very surprising and therefore adds novelty.

      (2) Related to this is the degree of neurogenesis between Cre+ and Cre- mice and the nature of the sex differences. It is crucial to know the rate/fold change of increased neurogenesis before pilocarpine treatment and whether it is different between male and female mice.

      We agree that sex differences in adult neurogenesis could be shown by a sex difference in rate, fold change, maturation, and other characteristics. However, sex differences can also be shown by a change in doublecortin (DCX), which is what we did. We respectfully submit that we do not think an exhaustive study is critical.

      As a result, we have clarified DCX was studied either before SE or in the period of chronic seizures:

      Results, starting on line 406:

      III. Before and after epileptogenesis, Cre+ female mice exhibited more immature neurons than Cre- female mice but that was not true for male mice.

      Starting on line 446:

      Therefore, elevated DCX occurred after chronic seizures had developed in Cre+ mice but the effect was limited to females.

      Discussion, starting on line 592:

      This study showed that conditional deletion of Bax from Nestin-expressing progenitors increased young adult-born neurons in the DG when studied 6 weeks after deletion and using DCX as a marker of immature neurons.

      (3) The authors observe more hilar Prox1 cells in Cre+ mice compared to Cre- mice. The authors should confirm the source of the hilar Prox1+ cells.

      This is an excellent question but it is unclear that it is critical to the seizures since both sexes showed more hilar Prox1 cells in Cre+ mice but only the females had fewer seizures than Cre- mice. This is the additional text to describe the results (starting on Line 493):

      In past studies, hilar ectopic GCs have been suggested to promote seizures (Scharfman et al., 2000; Jung et al., 2006; Cho et al., 2015). Therefore, we asked if the numbers of hilar ectopic GCs correlated with the numbers of chronic seizures. When Cre- and Cre+ mice were compared (both sexes pooled), there was a correlation with numbers of chronic seizures (Fig. 6D1) but it suggested that more hilar ectopic GCs improved rather than worsened seizures. However, the correlation was only in Cre- mice, and when sexes were separated there was no correlation (Fig. 6D3).

      When seizure-free interval was examined with sexes pooled, there was a correlation for Cre+ mice (Fig. 6D2) but not Cre- mice. Strangely, the correlations of Cre+ mice with seizure-free interval (Fig. 6D2, D4) suggest ectopic GCs shorten the seizure-free interval and therefore worsen epilepsy, opposite of the correlative data for numbers of chronic seizures. In light of these inconsistent results it seems that hilar ectopic granule cells had no consistent effect on chronic seizures.

      (4) The biggest weakness is the lack of mechanism. The authors postulate a hypothetical mechanism to reconcile how increasing and decreasing adult-born neurons in GCL and hilus and loss of hilar mossy and SOM cells would lead to opposite effects - more or fewer seizures. The authors suggest the reason could be due to rewiring or no rewiring of hilar ectopic GCs, respectively, but do not provide clear-cut evidence.

      As we mention above, we removed the supplemental figures with schematics because they probably were what seemed overly speculative.

      We acknowledge that mechanism is not proven by our study. However, we would like to mention that in our view, showing preservation of hilar mossy cells and SOM cells, but not PV cells, does add mechanistic data to the paper. We understand more experiments are necessary.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Jain et al explore whether increasing adult neurogenesis is protective against status epilepticus (SE) and the development of spontaneous recurrent seizures (chronic epilepsy) in a mouse pilocarpine model of TLE. The authors increase adult neurogenesis via conditional deletion of Bax, a pro-apoptotic gene, in Nestin-CreERT2Baxfl/fl mice. Cre- littermates are used as controls for comparisons. In addition to characterizing seizure phenotypes, the authors also compare the abundance of hilar ectopic granule cells, mossy cells, hilar SOM interneurons, and the degree of neuronal damage between mice with increased neurogenesis (Cre+) vs Cre- controls. The authors find less severe SE and a reduction in chronic seizures in female mice with pre-insult increased adult-born neurons. Immunolabeling experiments show these females also have preservation of hilar mossy cells and somatostatin interneurons, suggesting the pre-insult increase in adult neurogenesis is protective.

      Strengths:

      (1) The finding that female mice with increased neurogenesis at the time of pilocarpine exposure have fewer seizures despite having increased hilar ectopic granule cells is very interesting.

      (2) The work builds nicely on the group's prior studies.

      (3) Apparent sex differences are a potentially important finding.

      (4) The immunohistochemistry data are compelling.

      (5) Good controls for EEG electrode implantation effects.

      (6) Nice analysis of most of the SE EEG data.

      Weaknesses:

      (1) In addition to the Cre- littermate controls, a no Tamoxifen treatment group is necessary to control for both insertional effects and leaky expression of the Nestin-CreERT2 transgene.

      About “leaky” expression, we have not found expression to be leaky. We checked by injecting a Cre-dependent virus so that mCherry would be expressed in those cells that had Cre.  The results were published as Supplemental Figure 9 in Jain et al. (2019).

      In the revised manuscript we also mention a study that examined three Nestin-CreERT2 mouse lines (Sun et al., 2014). One of the mouse lines was ours. The leaky expression was not in the mouse line we use. We have added these points to the revised manuscript:

      Methods, section II starting on line 791:

      Although Nestin-Cre-ERT2 mouse lines have been criticized because they can have leaky expression, the mouse line used in the present study did not (Sun et al., 2014), which we confirmed (Jain et al., 2019).

      (2) The authors suggest sex differences; however, experimental procedures differed between male and female mice (as the authors note). Female mice received diazepam 40 minutes after the first pilocarpine-induced seizure onset, whereas male mice did not receive diazepam until 2 hours post-onset. The former would likely lessen the effects of SE on the female mice. Therefore, sex differences cannot be accurately assessed by comparing these two groups, and instead, should be compared between mice with matching diazepam time courses.

      We agree that a shorter delay between pilocarpine and diazepam would be likely to lead to less damage. However, the latency from pilocarpine to SE varied, making the time from the onset of SE to diazepam variable. Most of the variability was in females. By timing the diazepam injection differently in males and females, we could make the time from the onset of SE to diazepam similar between females and males. We have added a supplemental figure to show that our approach led to no significant differences between females and males in the latency to SE, time between SE and diazepam injection, and time between pilocarpine and diazepam injection. We also show that Cre+ females and Cre- females were not different in these times, so it could not be related to the neuroprotection of Cre+ females.

      Additionally, the authors state that female mice that received diazepam 2 hours post-onset had severe brain damage. This is concerning as it would suggest that SE is more severe in the female than in the male mice.

      We regret that our language was misleading. We intended to say females had more morbidity and mortality than males (lack of appetite and grooming, death in the days after SE) when we gave DZP 2 hrs after Pilo. We actually don’t know why because there were no differences in severity of SE. We think the females had a worse outcome when they had a short latency to SE. These females had a longer period of SE before DZP than males, probably leading to a worse outcome. To correct this we gave DZP to females sooner. Then morbidity and mortality were improved in females.

      Interestingly, after we did this we saw females did not always have a short latency to SE. We maintained the same regimen, however, to be consistent. As the new supplemental figure (above) shows, there were no significant sex differences in the latency to SE, time between SE and DZP, and time between pilocarpine and DZP.

      (3) Some sample sizes are low, particularly when sex and genotypes are split (n=3-5), which could cause a type II statistical error.

      We agree and have noted this limitation in the Discussion:

      Additional considerations, starting on line 739:

      This study is limited by the possibilities of type II statistical errors in those instances where we divided groups by genotype and sex, leading to comparisons of 3-5 mice/group.

      (4) Several figures show a datapoint in the sex and genotype-separated graphs that is missing from the corresponding male and female pooled graphs (Figs. 2C, 2D, 4B).

      We are very grateful to the Reviewer for pointing out the errors. They are corrected.

      (5) In Suppl Figs. 1B & 1C, subsections 1c and 2c, the EEG trace recording is described as the end of SE; however, SE appears to still be ongoing in these traces in the form of periodic discharges in the EEG.

      The Reviewer is correct.  It is a misconception that SE actually ends completely. The most intense seizure activity may, but what remains is abnormal activity that can last for days. Other investigators observe the same and have suggested that it argues against the concept of a silent period between SE and chronic epilepsy. We had discussed this in our prior papers and had referenced how we define SE.  In the revised manuscript we add the information to the Methods section instead of referencing a prior study:

      Methods, starting on line 899:

      SE duration was defined in light of the fact that the EEG did not return to normal after the initial period of intense activity. Instead, intermittent spiking occurred for at least 24 hrs, as we previously described (Jain et al., 2019) and has been described by others (Mazzuferi et al., 2012; Bumanglag and Sloviter, 2018; Smith et al., 2018). We therefore chose a definition that captured the initial, intense activity. We defined the end of this time as the point when the amplitude of the EEG deflections was reduced to 50% or less of the peak deflections during the initial hour of SE. Specifically, we selected the time after the onset of SE when the EEG amplitude in at least 3 channels had dropped to approximately 2 times the amplitude of the EEG during the first hour of SE, and remained depressed for at least 10 min (Fig. S2 in Jain et al., 2019). Thus, the duration of SE was defined as the time between the onset and this definition of the "end" of SE.
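
      The amplitude-based definition of the "end" of SE quoted above can be sketched in code. The following is only an illustrative sketch and not the authors' analysis pipeline: it assumes one amplitude value per channel per minute, and the function name `find_se_end` and all of its parameters are hypothetical. It uses the 50%-of-peak criterion from the definition (exposed as the `fraction` parameter) and requires at least 3 channels to stay at or below threshold for 10 consecutive minutes; for simplicity it does not require the same channels to remain low throughout that period.

```python
from typing import Optional, Sequence


def find_se_end(
    amplitudes: Sequence[Sequence[float]],  # amplitudes[channel][minute] (assumed layout)
    onset_min: int = 0,        # minute at which SE starts
    ref_window_min: int = 60,  # "initial hour of SE" used to find the peak
    fraction: float = 0.5,     # "50% or less of the peak deflections"
    min_channels: int = 3,     # "at least 3 channels"
    hold_min: int = 10,        # "remained depressed for at least 10 min"
) -> Optional[int]:
    """Return the minute at which SE 'ends' under these assumptions, or None."""
    n_minutes = min(len(ch) for ch in amplitudes)
    ref_end = min(onset_min + ref_window_min, n_minutes)
    if ref_end <= onset_min:
        return None  # no data after the onset

    # Per-channel threshold: a fraction of the peak amplitude in the first hour.
    thresholds = [fraction * max(ch[onset_min:ref_end]) for ch in amplitudes]

    run_start = None  # first minute of the current run of depressed EEG
    for t in range(onset_min, n_minutes):
        n_low = sum(ch[t] <= thr for ch, thr in zip(amplitudes, thresholds))
        if n_low >= min_channels:
            if run_start is None:
                run_start = t
            if t - run_start + 1 >= hold_min:
                return run_start
        else:
            run_start = None  # amplitude recovered; start over
    return None
```

      Under these assumptions, the duration of SE would be the returned minute minus `onset_min`. A brief dip in amplitude that does not last `hold_min` minutes is ignored, which matches the requirement that the EEG remain depressed.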

      (6) In Results section II.D and associated Fig.3, what the authors refer to as "postictal EEG depression" is more appropriately termed "postictal EEG suppression". Also, postictal EEG suppression has established criteria to define it that should be used.

      We find suppression is typical in studies of ECT or humans (Esmaeili et al., 2023; Gascoigne et al., 2023; Hahn et al., 2023; Kavakbasi et al., 2023; Langroudi et al., 2023; Karl et al., 2024; Vilan et al., 2024; Zhao et al., 2024) and animal research uses the term postictal depression (Kanner et al., 2010; Krishnan and Bazhenov, 2011; Riljak et al., 2012; Singh et al., 2012; Carballosa-Gonzalez et al., 2013; Kommajosyula et al., 2016; Smith et al., 2018; Uva and de Curtis, 2020; Medvedeva et al., 2023). Therefore we think depression is a more suitable term.

      The example traces in Fig. 3A and B should also be expanded to better show this potential phenomenon.

      We expanded traces in Fig. 3 as suggested. They are in Fig 3A.

      (7) In Fig.5D, the area fraction of DCX in Cre+ female mice is comparable to that of Cre- and Cre+ male mice. Is it possible that there is a ceiling effect in DCX expression that may explain why male Cre+ mice do not have a significant increase compared to male Cre- mice?

      We thank the Reviewer for the intriguing possibility. We now mention it in the manuscript:

      Results, starting on line 456:

      It is notable that the Cre+ male mice did not show increased numbers of immature neurons at the time of chronic seizures but Cre+ females did. It is possible that there was a “ceiling” effect in DCX expression that would explain why male Cre+ mice did not have a significant increase in immature neurons relative to male Cre- mice.

      (8) In Suppl. Fig 6, the authors should include DCX immunolabeling quantification from conditional Cre+ male mice used in this study, rather than showing data from a previous publication.

      We have made this revision.

      (9) In Fig 8, please also include Fluorojade-C staining and quantification for male mice.

      The additional data for males have been added to part D.

      (10) Page 13: Please specify in the first paragraph of the discussion that findings were specific to female mice with pre-insult increases in adult-born neurogenesis.

      This has been done.

      Minor:

      (11) In Fig. 1 and suppl. figure 1, please clarify whether traces are from male or female mice.

      We have clarified.

      (12) Please be consistent with indicating whether immunolabeling images are from female or male mice.

      a. Fig 5B images labeled as from "Cre- Females" and "Cre+ Females".

      b. Suppl. Fig 8: Images labeled as "Cre- F" and "Cre+ F".

      c. Fig 6: sex not specified.

      d. Fig. 7: sex only specified in the figure legend.

      e. Fig 8: only female mice were included in these experiments, but this is not clear from the figure title or legend.

      We revised all figures according to the comments.

      (13) Page 4: the last paragraph of the introduction belongs within the discussion section.

      We recognize there is a classic view that any discussion of Results should not be in the Introduction. However, we find that view has faded and more authors make a brief summary statement about the Results at the end of the Introduction. We would like to do so because it allows readers to understand the direction of the study at the outset, which we find is helpful.

      (14) Page 6: The sentence "The data are consistent with prior studies..." is unnecessary.

      We have removed the text.

      (15) Suppl. Fig 6A: Please include representative images of normal condition DCX immunolabeling.

      We have added these data. There is an image of a Cre- female, Cre+ female, Cre- male and Cre+ male in the new figure, Supplemental Figure 6. All mice had tamoxifen at 6 weeks of age and were perfused 6 weeks later. None of the mice had pilocarpine.

      (16) In Suppl. Fig 7C, I believe the authors mean "no loss of hilar mossy and SOM cells" instead of "loss of hilar mossy and SOM cells".

      This Figure was removed because of the input from Reviewer 1 suggesting it was too speculative.

      Reviewer #1 (Recommendations For The Authors):

      (1) The main claim of the study is that increasing adult neurogenesis decreases chronic seizures. However, to quantify adult-born neurons, DCX immunoreactivity is used as the sole metric to determine neurogenesis. This is insufficient as changes in DCX-expressing cells could also be an indicator of altered maturation, survival, and/or migration, not proliferation per se. To claim that increasing adult neurogenesis is associated with a reduction of chronic seizures, the authors should perform a pulse/chase (birth dating) experiment with BrdU and co-labeling with DCX.

      We think that increased DCX does reflect increased adult neurogenesis. However, we agree that one does not know if it was due to increased proliferation, survival, etc. We also note that this mouse line has been studied thoroughly to show there was increased neurogenesis with BrdU, Ki67 and DCX. We mention that paper in the revised text:

      Methods, starting on line 786:

      It was shown that after tamoxifen injection in adult mice there is an increase in dentate gyrus neurogenesis based on studies of bromo-deoxyuridine, Ki67, and doublecortin (Sahay et al., 2011).

      (2) As mentioned above, analysis of DCX staining alone months after TAM injections is limited. Instead, the cells could be labelled by BrdU prior to TAM injection, following which quantification of BrdU+/Prox1+ cells at 6 weeks post TAM injection should be performed in Cre+ and Cre- mice (males and females) to yield the rate of neurogenesis increase.

      We respectfully disagree that birthdating cells is critical. Using DCX staining just before SE, we know the size of the population of cells that are immature at the time of SE. This is what we think is most important because these immature neurons are those that appear to affect SE, as we have already shown.

      (3) To confirm the source of the hilar Prox1+ cells, a dual BrdU/EdU labeling approach would be beneficial. BrdU injection could be given before TAM injection and EdU injection before pilocarpine to label different cohorts of neural stem cells. Co-staining with Prox1 at different time points will help in identifying the origin of hilar ectopic cells.

      We are grateful for the ideas of the Reviewer. We hesitate to do these experiments now because it seems like a new study to find out where hilar granule cells come from.

      REFERENCES

      Bumanglag AV, Sloviter RS (2018) No latency to dentate granule cell epileptogenesis in experimental temporal lobe epilepsy with hippocampal sclerosis. Epilepsia 59:2019-2034.

      Carballosa-Gonzalez MM, Munoz LJ, Lopez-Alburquerque T, Pardal-Fernandez JM, Nava E, de Cabo C, Sancho C, Lopez DE (2013) EEG characterization of audiogenic seizures in the hamster strain gash:Sal. Epilepsy Res 106:318-325.

      Cho KO, Lybrand ZR, Ito N, Brulet R, Tafacory F, Zhang L, Good L, Ure K, Kernie SG, Birnbaum SG, Scharfman HE, Eisch AJ, Hsieh J (2015) Aberrant hippocampal neurogenesis contributes to epilepsy and associated cognitive decline. Nat Commun 6:6606.

      Esmaeili B, Weisholtz D, Tobochnik S, Dworetzky B, Friedman D, Kaffashi F, Cash S, Cha B, Laze J, Reich D, Farooque P, Gholipour T, Singleton M, Loparo K, Koubeissi M, Devinsky O, Lee JW (2023) Association between postictal EEG suppression, postictal autonomic dysfunction, and sudden unexpected death in epilepsy: Evidence from intracranial EEG. Clin Neurophysiol 146:109-117.

      Gascoigne SJ, Waldmann L, Schroeder GM, Panagiotopoulou M, Blickwedel J, Chowdhury F, Cronie A, Diehl B, Duncan JS, Falconer J, Faulder R, Guan Y, Leach V, Livingstone S, Papasavvas C, Thomas RH, Wilson K, Taylor PN, Wang Y (2023) A library of quantitative markers of seizure severity. Epilepsia 64:1074-1086.

      Hahn T et al. (2023) Towards a network control theory of electroconvulsive therapy response. PNAS Nexus 2:pgad032.

      Jain S, LaFrancois JJ, Botterill JJ, Alcantara-Gonzalez D, Scharfman HE (2019) Adult neurogenesis in the mouse dentate gyrus protects the hippocampus from neuronal injury following severe seizures. Hippocampus 29:683-709.

      Jung KH, Chu K, Lee ST, Kim J, Sinn DI, Kim JM, Park DK, Lee JJ, Kim SU, Kim M, Lee SK, Roh JK (2006) Cyclooxygenase-2 inhibitor, celecoxib, inhibits the altered hippocampal neurogenesis with attenuation of spontaneous recurrent seizures following pilocarpine-induced status epilepticus. Neurobiol Dis 23:237-246.

      Kanner AM, Trimble M, Schmitz B (2010) Postictal affective episodes. Epilepsy Behav 19:156-158.

      Karl S, Sartorius A, Aksay SS (2024) No effect of serum electrolyte levels on electroconvulsive therapy seizure quality parameters. J ECT 40:47-50.

      Kavakbasi E, Stoelck A, Wagner NM, Baune BT (2023) Differences in cognitive adverse effects and seizure parameters between thiopental and propofol anesthesia for electroconvulsive therapy. J ECT 39:97-101.

      Kommajosyula SP, Randall ME, Tupal S, Faingold CL (2016) Alcohol withdrawal in epileptic rats - effects on postictal depression, respiration, and death. Epilepsy Behav 64:9-14.

      Krishnan GP, Bazhenov M (2011) Ionic dynamics mediate spontaneous termination of seizures and postictal depression state. J Neurosci 31:8870-8882.

      Langroudi ME, Shams-Alizadeh N, Maroufi A, Rahmani K, Rahchamani M (2023) Association between postictal suppression and the therapeutic effects of electroconvulsive therapy: A systematic review. Asia Pac Psychiatry 15:e12544.

      Mazzuferi M, Kumar G, Rospo C, Kaminski RM (2012) Rapid epileptogenesis in the mouse pilocarpine model: Video-EEG, pharmacokinetic and histopathological characterization. Exp Neurol 238:156-167.

      Medvedeva TM, Sysoeva MV, Sysoev IV, Vinogradova LV (2023) Intracortical functional connectivity dynamics induced by reflex seizures. Exp Neurol 368:114480.

      Riljak V, Maresova D, Jandova K, Bortelova J, Pokorny J (2012) Impact of chronic ethanol intake of rat mothers on the seizure susceptibility of their immature male offspring. Gen Physiol Biophys 31:173-177.

      Sahay A, Scobie KN, Hill AS, O'Carroll CM, Kheirbek MA, Burghardt NS, Fenton AA, Dranovsky A, Hen R (2011) Increasing adult hippocampal neurogenesis is sufficient to improve pattern separation. Nature 472:466-470.

      Scharfman HE, Goodman JH, Sollas AL (2000) Granule-like neurons at the hilar/CA3 border after status epilepticus and their synchrony with area CA3 pyramidal cells: Functional implications of seizure-induced neurogenesis. J Neurosci 20:6144-6158.

      Singh B, Singh D, Goel RK (2012) Dual protective effect of passiflora incarnata in epilepsy and associated post-ictal depression. J Ethnopharmacol 139:273-279.

      Smith ZZ, Benison AM, Bercum FM, Dudek FE, Barth DS (2018) Progression of convulsive and nonconvulsive seizures during epileptogenesis after pilocarpine-induced status epilepticus. J Neurophysiol 119:1818-1835.

      Sun MY, Yetman MJ, Lee TC, Chen Y, Jankowsky JL (2014) Specificity and efficiency of reporter expression in adult neural progenitors vary substantially among nestin-creer(t2) lines. J Comp Neurol 522:1191-1208.

      Uva L, de Curtis M (2020) Activity- and ph-dependent adenosine shifts at the end of a focal seizure in the entorhinal cortex. Epilepsy Res 165:106401.

      Vilan A, Grangeia A, Ribeiro JM, Cilio MR, de Vries LS (2024) Distinctive amplitude-integrated EEG ictal pattern and targeted therapy with carbamazepine in kcnq2 and kcnq3 neonatal epilepsy: A case series. Neuropediatrics 55:32-41.

      Zhao C, Tang Y, Xiao Y, Jiang P, Zhang Z, Gong Q, Zhou D (2024) Asymmetrical cortical surface area decrease in epilepsy patients with postictal generalized electroencephalography suppression. Cereb Cortex 34.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      First, all the experiments are performed in Jurkat T cells that may not recapitulate the regulation of polarization in primary T cells.

      To extend our results in Jurkat cells forming IS to primary cells, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. New sentences dealing with this important issue have been included in the Results and Discussion sections.

      Moreover, all the experiments analyzing the role of PKCdelta are performed in one clone of wt or PKCdelta KO Jurkat cells. This is problematic since clonal variation has been reported in Jurkat T cells.

      Referee is right; this is the reason why we have studied three different control clones (C3, C9, C7) and three PKCdelta-interfered clones (P5, P6 and S4), all derived from the JE6.1 clone, and the results have been previously published (Herranz et al 2019)(Bello-Gamboa et al 2020). All these clones expressed similar levels of the relevant cell surface molecules and formed synaptic conjugates with similar efficiency (Herranz et al 2019). The P5, P6 and S4 clones exhibited a similar defect in MVB/MTOC polarization when compared with the control clones (Herranz et al 2019)(Bello-Gamboa et al 2020). Experiments developed by other researchers using a different clone of Jurkat (JE6.1) and primary CD4+ and CD8+ lymphocytes interfered in FMNL1 (Gomez et al. 2007) showed a defect in MTOC polarization comparable to that found in our control clones when they were transiently interfered in FMNL1 (Bello-Gamboa et al 2020, this manuscript). In this manuscript we have studied, instead of the canonical JE6.1 clone, the C3 and C9 control clones derived from JE6.1, since the puromycin-resistant control clones (containing a scramble shRNA) were isolated by limiting dilution together with the PKCdelta-interfered clones (Herranz et al. 2019); thus the C3 and C9 clones are the best possible controls to compare with the P5 and P6 clones. Please realize that microsatellite analyses, available upon request, support the identity of our C3 clone with JE6.1. Moreover, when GFP-PKCdelta was transiently expressed in the three PKCdelta-interfered clones, MTOC/MVB polarization was recovered to control levels (Herranz et al. 2019). Therefore, the deficient MTOC/MVB polarization in all these clones is exclusively due to the reduction in PKCdelta expression (Herranz et al 2019), and thus clonal variation cannot underlie our results in stable clones.

      We have now included new sentences to address this important point and to mention the inability of FMNL1betaS1086D to revert the deficient MTOC polarization occurring in the P6 PKCdelta-interfered clone, as occurred in the P5 clone. Because we have now included more figures and panels to satisfy the editor’s and referees’ comments, we have not included the dot plot data corresponding to the C9 and P6 clones, to avoid an overly long and repetitive manuscript. Since all the FMNL1 interference and FMNL1 variant re-expression experiments were performed in transient assays (2-4 days after transfection), there was no chance for any clonal variation in these short-time experiments. Moreover, internal controls using untransfected cells or Raji cells unpulsed with SEE were carried out in all these transient experiments.

      Finally, although convincing, the defect in the secretion of vesicles by T cells lacking phosphorylation of FMNL1beta on S1086 is preliminary. It would be interesting to analyze more precisely this defect. The expression of the CD63‑GFP in mutants by WB is not completely convincing. Are other markers of extracellular vesicles affected, e.g. CD3 positive?

      We acknowledge this comment. It is true that the mentioned results do not directly demonstrate the presence of exosomes at the synaptic cleft of the synapses, since the nanovesicles were harvested from the cell culture supernatants from synaptic conjugates and these nanovesicles could be produced by multi-directional degranulation of MVBs. To address this important issue, we have performed STED super-resolution imaging of the immune synapses made by control and FMNL1-interfered cells. Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft between APC and control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (new Fig. 10). New sentences have been included in the Results and Discussion dealing with this important point. Regarding the use of CD3 as a marker of extracellular vesicles, please realize that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes. We have therefore preferred to use the canonical exosome marker CD63 as a general exosome reporter readout for WB and immunofluorescence (MVBs, exosomes), time-lapse of MVBs (Suppl. Video 8) and super-resolution experiments (Fig. 10).

      Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization, and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. This is based on:

      (1) the documented role of FMNL1 proteins in IS formation

      (2) their ability to regulate F-actin dynamics

      (3) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation

      (4) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified.

      They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance, and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      The data on F-actin clearance in Jurkat T cells knocked down for FMNL1 and expressing wild-type FMNL1 or the non-phosphorylatable or phosphomimetic mutants thereof would need to be further strengthened, as this is a key message of the manuscript. Also, the entire work has been carried out on Jurkat cells. Although this is an excellent model easily amenable to genetic manipulation and biochemical studies, the key finding should be validated on primary T cells.

      Referee’s global assessment is right. To extend our results in Jurkat cells forming IS, we have now performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. In addition, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. New sentences have been included in Results and Discussion to address these important points.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This study shows the role of the phosphorylation of FMNL1b on S1086 in the polarity of T lymphocytes, which is a new and interesting finding. It would be important to confirm some of the key results in primary T cells and to analyze in depth the defect in actin remodeling (quantification of the images, analysis of some key actors of actin remodeling). The description of the defect in the secretion of extracellular vesicles would also benefit from a more accurate analysis of the content of vesicles.

      Referee is right. We have now performed experiments using synapses containing Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). These experiments clearly show the presence of FMNL1 at these two different IS classes, similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells, we have changed the title of our manuscript, to be faithful to the main body of our results. Regarding the use of CD63 instead of other markers such as CD3 (as stated by the other referee), please realize that CD3 is neither an enriched nor a specific marker of exosomes, since it is also present in plasma membrane shedding vesicles, molting vesicles from microvilli, apoptotic bodies and small cell fragments, apart from exosomes. We have therefore preferred to use the accepted consensus, canonical extracellular vesicle marker CD63 (International Society of Extracellular Vesicles positioning, Thery et al 2018, doi: 10.1080/20013078.2018.1535750; Alonso et al. 2011) as a general exosome reporter readout for WB, immunofluorescence (MVBs, exosomes) and super-resolution experiments. Accordingly, the GFP-CD63 reporter plasmid was used for exosome secretion in transient expression studies and living cell time-lapse experiments (Suppl. Video 8). Any other exosome marker would also be present in Raji cells and would not allow us to analyse exclusively the secretion of exosomes by the effector Jurkat cells, since B lymphocytes produce a large quantity of exosomes upon MHC-II stimulation by Th lymphocytes (Calvo et al, 2020, doi:10.3390/ijms21072631). To reinforce the exosome data in the context of the immune synapse, STED super-resolution imaging of the immune synapses made by control and FMNL1-interfered cells was performed. Nanosized (100-150 nm) CD63+ vesicles can be found in the synaptic cleft of control cells with polarized MVBs, whereas we could not detect these vesicles in the synaptic cleft from FMNL1-interfered cells that maintain unpolarized MVBs (new Fig. 10).

      Moreover, not all the videos are completely illustrative. For example, in video 2 it would be more appropriate to show only the z plane corresponding to the IS, to see more precisely the F-actin remodeling relative to CD63 labeling.

      Referee is right. It is true that the upper rows in some videos may distract the reader from the main message contained in the lower row, which includes the 90º turn-generated zx plane corresponding to the IS interface. Accordingly, we have maintained the still images of the whole synaptic conjugates in the first row of video 2; this will allow the reader to perceive a general view of the fluorochromes on the whole cell conjugates, as a reference, and to compare precisely the F-actin remodeling relative to CD63 labeling only at the zx interface (lower row). We have now processed videos 1 and 5 following similar criteria.

      The quality of videos 3 and 4 is not good enough. For video 7, it seems that the labeling of phospho-Ser is very broad at the IS, which is expected since it should label all the proteins that are phosphorylated by PKCs. The resolution of microscopy (at best 200 to 300 nm) does not allow us to conclude on the co-localization of FMNL1b with phospho-Ser and is thus not conclusive. Finally, the study would benefit from a more careful statistical analysis. The dot plots showing polarity are presented for one experiment. Yet, the distribution of the polarity is broad. Results of the 3 independent experiments should be shown and a statistical analysis performed on the independent experiments.

      Referee is right, we have amended video settings (brightness/contrast) in videos 3 and 4 to improve this issue. In addition, we would like to remark that the translocation of proteins to cellular substructures in living cells is not a trivial issue, since certain protein localizations are too dynamic to be properly imaged with enough spatial resolution. The equilibrium resulting from the association/dissociation of a certain protein to the membrane, in addition to the protein diffusion naturally occurring in living cells, as well as signal intensity fluctuations inherent to the stochastic nature of fluorescence emission often provide barriers for image quality (Shroff et al, 2024). Thus, additional image blurring is expected when compared with that observed in fixed samples. However, we think it is important to provide the potential readers with a dynamic view of FMNL1 localization, which can only be achieved through real-time videos, in addition to the still frames from the same videos provided in Fig. 6A (the referee did not argue against the inclusion of these frames), together with images from fixed cells in Fig 6B, for comparison. This is the reason why we have preferred to maintain the improved videos to complement the results of some spare frames from the videos, together with images from fixed cells in the same figure (Fig. 6).

      Regarding video 7, we agree that colocalization is limited by the spatial resolution of confocal microscopy, and this fact does not allow us to infer that FMNL1beta is phosphorylated at the IS. However, please realize we have never concluded this in our manuscript. Instead, we claimed that “colocalization of endogenous FMNL1 and YFP-FMNL1βWT with anti-phospho-Ser …is compatible with the idea that both endogenous FMNL1 and YFP-FMNL1βWT are specifically phosphorylated at the cIS”. Moreover, we have now performed colocalization in super-resolved STED microscopy images, which reduces the XY resolution down to 30-40 nm (Suppl. Fig. S12), and the results also support colocalization of endogenous FMNL1 with anti-phospho-Ser PKC at the IS within a 30 nm resolution limit. We have now somewhat softened our conclusion: “Although all these data did not allow us to infer that FMNL1β is phosphorylated at the IS due to the resolution limit of confocal and STED microscopes, the results are compatible with the idea that both endogenous FMNL1 and YFP-FMNL1βWT are specifically phosphorylated at the cIS”.

      Regarding statistical analyses, we agree the dot distribution in the polarity experiments is quite broad, but this is consistent with the end point strategy used by a myriad of research groups (including ourselves) to image an intrinsically stochastic, rapid and asynchronous process such as immune synapse formation and to score MTOC/MVB polarization (Calvo et al 2018, https://doi.org/10.3389/fimmu.2018.00684). Despite this fact, ANOVA analyses have underscored the statistical significance of all the experiments represented by dot plots. We cannot average or perform meta statistical analyses by combining the equivalent cohort results from independent experiments, since we have observed that small variations of certain variables (SEE concentration, cell recovery, time after transfection, etc.) affect synapse formation and PI values among experiments without altering the final outcome in each case. Please note that our manuscript now includes 10 multi-panel figures, 12 multi-panel supplementary figures and 8 videos, and it is already quite large. Thus, we feel the inclusion of redundant, triplicate dot plot figures would dilute the main message of our already comprehensive contribution and distract any potential reader from it. We have now included new sentences in the figure legends to remark that ANOVA analyses were executed separately in each of the 3 independent experiments.

      Reviewer #2 (Recommendations For The Authors):

      (1) The key findings should be validated on primary CD4+ T cells (of which Jurkat is a transformed model).

      Referee is right. However, as commented by the other referee, the data from activating surfaces clearly show that the synaptic actin architecture of the immune synapse from primary CD8+ T cells is essentially indistinguishable from that of Jurkat T cells, but different from that of primary CD4+ cells (Murugesan, 2016). Thus, our data in Jurkat T cells are directly applicable to the synaptic architecture of primary CD8+ cells. In addition, to definitely extend our results in Jurkat cells forming IS, we have performed experiments using synapses established by Raji cells and either primary T cells (TCR-mediated) or primary CAR T cells (CAR-mediated) (new Suppl. Fig. S7). We have preferred to work with mixed CD4+ and CD8+ cells in order to maintain potential interactions in trans between these subpopulations that may affect or influence IS formation. These experiments clearly show the presence of FMNL1 at these two different IS classes (new Suppl. Fig. S7), similar to what was found in Jurkat-Raji synapses. Moreover, since most of the experiments were performed in Jurkat cells as stated by the referee, we have changed the title of our manuscript, to circumscribe our results to the model we have used and to be faithful to the main body of our results.

      (2) The image of wt YFP-FMNL1beta in Figure 4A displays a weak CD63 signal and shows an asymmetric polarization of both the centrosome and MVBs. It should be replaced with a more representative one.

      Referee is right. Accordingly, we have modified the CD63 channel settings (brightness/contrast) in this panel to make it comparable to the other panels in the same figure. In addition, thanks to this referee’s comment, we have realized that the position of the MTOC (yellow dot) in the diagram on the right side of the YFP-FMNL1betaWT panel row appeared mislocated, producing the mentioned apparent asymmetry with respect to the position of the MVBs’ center of mass (green dot). This mistake led to an apparent segregation between the positions of the centers of mass of these organelles, which certainly does not correspond with the real image. We have now amended the scheme and we apologize for this mistake.

      (3) The images showing F-actin clearance at the IS (Figure 8, S4, S5) are not very convincing, also when looking at the MFI along the T cell-APC interface in the en-face views. Since the F-actin signal also includes some signal from the APC, transfecting T cells with an actin reporter to selectively image T cell actin could better clarify this key point.

      Referee’s point is correct. However, we (83), and other researchers using the proposed actin reporter approach in the same Raji/Jurkat IS model (Fig. 4 in ref 84), have already excluded the possibility that the actin cytoskeleton of Raji cells contributes to the measurements of synaptic F-actin. In Materials and Methods, page 37, lines 1048-1055, we included this related sentence: “It is important to remark that MHC-II-antigen triggering on the B cell side of the Th synapse does not induce noticeable F-actin changes along the synapse (i.e. F-actin clearing at the central IS), in contrast to TCR stimulation on T cell side (84) (85) (3). In addition, we have observed that majority of F-actin changes along the IS belongs to the Jurkat cell (83). Thus, the contribution to the analyses of the residual, invariant F-actin from the B cell is negligible using our protocol (83).”

      Thus, we can exclude that this caveat affects our results.

      (4) A similar consideration applies to the MVB distribution in the en‑face images. For example, in Figure S5 the MVB profile, with some peripheral distribution, does not appear very different in cells expressing wt YFP‑tagged FMNL1beta versus the S1086A‑expressing cells.

      The referee's assessment regarding Supp. Figure S5 is valid. Using only the plot profile, the outcomes obtained with YFP-FMNL1βWT may appear comparable to those derived from YFP-FMNL1βS1086A. Nonetheless, this resemblance is attributed to the plot profile's exclusive consideration of the MVBs signal in the interface from the immune synapse region (white rectangle). The upper images (second row), where the whole cell is displayed, illustrate that in YFP-FMNL1βWT, MVB are specifically accumulated within this specific region, in contrast to the scattered distribution observed in YFP-FMNL1βS1086A, where MVB are dispersed throughout the cell without distinction. While MVBs are evident in both instances within the synapse region, the reason behind this observation is different. The YFP-FMNL1βWT transfected cell (third column) shows a pronounced MVB concentration within the synaptic area (white rectangle), which leads to MVB PI=0.52, whereas the YFP-FMNL1βS1086A transfected cell (fourth column), as it presents a scattered distribution of MVB throughout the cell, also exhibits some MVB (but only a small proportion of the total cellular MVB) in the synaptic area, which yields MVB PI=-0.09. Please realise that the position of the center of mass of the distribution of MVB (MVBC) labelled in this figure (white squares) is an unbiased parameter that mirrors MVB center of mass polarization. A new sentence has been included in the figure legend to clarify this important point.

      (5) The image in the first row in Figure 6B does not show a clear accumulation of FMNL1beta at the IS, possibly because the T cell is in contact with two APCs. This image should be replaced.

      Referee is right. Therefore, we have replaced the quoted example with a single cell:cell synapse that shows a clearer and more localized accumulation in the cIS, thereby avoiding the mentioned caveat.

      (6) In Figure 2A the last row shows what appears to be a T:T cell conjugate (with one cell expressing the YFP-tagged protein). The image should be replaced with another showing a T cell-APC (blue) conjugate.

      Referee is right, we have accordingly replaced the mentioned image with a T cell:APC conjugate.

      (7) The Discussion is very long and dispersive. It would benefit from shortening it and making it more focused.

      Referee is right; we have shortened and focused it by eliminating the whole second and third paragraphs of the Discussion. Moreover, a whole paragraph on page 24 has also been deleted.

      We have also focussed the discussion towards the new data in primary T lymphocytes.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway between the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest. 

      Strengths: 

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc. 

      Weaknesses: 

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly studies by Wise, 1973 and returned to popular use by Simms et al 2010 and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010).)

      We appreciate the reviewer’s perspective on the history of the alcohol research field. There are hundreds of papers that could be cited regarding all the numerous different permutations of alcohol drinking paradigms. This study is an eLife “Research Advances” manuscript that is a direct follow-up study to a previously published study in eLife (Haggerty et al., 2022) that focused on the Drinking in the Dark model of binge alcohol drinking. This study must be considered in the context of that previous study (they are linked), and thus we feel that a comprehensive review of the literature is not appropriate for this study.

      The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). Line 154 Theile & Navarro 2014 is a review and not the original demonstration. 

      This is a good recommendation. We have added this citation to Line 33 and changed Line 154.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another. 

      As shown in Figure 1, all measures of intake are reported as g/kg for both water and alcohol to assess intakes across fluids that are controlled by body weights. We do not reference changes in fluid volume or BACs to compare differences in measured lickometry or photometric signals, except in one instance where we suggest that the total volume of water (ml) is greater than the total amount of alcohol (ml) consumed in DID sessions, but this applies generally to all animals, regardless of sex, across all the experimental procedures.

      In Figure 2 – Figure Supplement 1 we show drinking microstructures across single DID sessions, and that males and females drink similarly, but not identically, when assessing drinking measures at the smallest timescale that we have the power to detect with the hardware we used for these experiments. Admittedly, the variability seen in these measures is certainly non-zero, and while we are tempted to assume that there exist at least some singular drinks that occur identically between males and females in the dataset that support the idea that females are simply consuming more volume of fluid per singular drink, we don’t have the sampling resolution to support that claim statistically. Further, even if females did consume more volume per singular drink than males, we do not believe that is enough information to make the claim that such behavior leads to more “intoxication” in females compared to males, as we know that alcohol behaviors, metabolism, and uptake/clearance all differ significantly by sex and are contributing factors towards defining an intoxication state. We have amended the manuscript to remove any language describing these drinking behaviors as identical.

      No conclusions regarding the photometry results can be drawn based on the histology provided. Localization and quantification of viral expression are required at a minimum to verify the efficacy of the dual virus approach (the panel in Supplementary Figure 1 is very small and doesn't allow terminals to be seen, and there is no quantification). Whether these might differ by sex is also necessary before we can be confident about any sex differences in neural activity. 

      We provide hit maps of our fiber placements and viral injection centers, as we and many other investigators regularly do for publication based on histological verification. Figure 1A clearly shows the viral strategy taken to label AIC to DLS projections with GCaMP7s, and a representative image shows green GCaMP-positive terminals below the fiber placement. During the experiments, animals without proper viral expression displayed little or no GCaMP signal, which serves as an expression-based control in addition to the typical histology performed to confirm “hits”. Animals with poor expression or obvious misplacement of the fiber probes were removed as described in the methods. Further, we report our calcium signals as z-scored changes in observed fluorescence, so we are comparing scaled averages of signals across sexes and days, which helps minimize any differences between “low” and “high” viral transduction levels at the terminals directly underneath the tips of the fibers.
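
The scaling argument can be illustrated with a minimal sketch, under stated assumptions: the trace values, the `gain` factor standing in for a higher expression level, and the choice of the first four samples as baseline are all hypothetical, and the authors' actual normalization window is not specified in this response. The only fact relied on is that a z-score is unchanged when a signal is multiplied by a positive constant.

```python
from statistics import mean, pstdev


def zscore(trace, baseline):
    """Z-score every sample of `trace` against the mean and SD of `baseline`."""
    mu = mean(baseline)
    sd = pstdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance")
    return [(x - mu) / sd for x in trace]


# Hypothetical fluorescence trace from a "low expression" animal.
low = [1.0, 1.2, 0.8, 1.0, 2.0]

# The same underlying activity seen through a 5x brighter sensor (assumed gain).
gain = 5.0
high = [gain * x for x in low]

# Baseline is assumed to be the first four samples of each trace.
z_low = zscore(low, low[:4])
z_high = zscore(high, high[:4])

# After z-scoring, the two traces agree up to floating-point error.
max_difference = max(abs(a - b) for a, b in zip(z_low, z_high))
```

Z-scoring removes differences in overall brightness, but it cannot recover a signal that is absent, which is why exclusion on histology and expression is still needed.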

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here. 

      As mentioned above, this article is an eLife Research Advances article that builds on our previous AIC to DLS work published in eLife (Haggerty et al., 2022). Considering that this is a linked article, a justification for why this brain pathway was chosen is superfluous. In addition, an exhaustive review of all the different brain regions and pathways that are affected by binge alcohol consumption to justify this pathway seems more appropriate to a review article than an article such as this.  

      We make no claims that photometric recordings are anything but observational, but we did observe these signals to be different when time-locked to the beginning of drinking behaviors. We describe this link between activity in the pathway and drinking throughout the manuscript. It is indeed correlational, but just because it is not causal does not mean that our findings are invalid or unimportant.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks but I think more caution or detail in the discussion of the correspondence of these events is required. 

      The lickometers do not capture individual licks, but a robust quantification of the information they capture is described in Godynyuk et al. 2019 and referenced in multiple other papers (Flanigan et al. 2023, Haggerty et al. 2022, Grecco et al. 2022, Holloway et al. 2023) where these lickometers have been used. However, individual lick tracking is not a requirement for tracking drinking behaviors more generally. The lickometers used clearly track when the animals are at the bottles, drinking fluids, and we have used the start of that lickometer signal to time-lock our photometry signals to drinking behaviors. We make no claims, nor do we have any data, on how photometric signals may be altered on timescales of single licks. Regarding how AIC to DLS signals change on the timescale of seconds when animals initiate drinking behaviors, we believe we explain these signals with caution and in the context of the behaviors they aim to describe.
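
Time-locking a continuous signal to event onsets can be sketched as follows. This is a generic peri-event averaging routine, not the authors' analysis code: the function names, the sample-index representation of drink onsets, and the window lengths are illustrative assumptions.

```python
def peri_event_windows(signal, onsets, pre, post):
    """Return signal[onset - pre : onset + post] for each onset index.

    Windows that would run past either end of the recording are skipped.
    """
    windows = []
    for t in onsets:
        if t - pre >= 0 and t + post <= len(signal):
            windows.append(signal[t - pre:t + post])
    return windows


def average_windows(windows):
    """Sample-by-sample mean across equally long windows."""
    if not windows:
        raise ValueError("no complete windows to average")
    n = len(windows)
    return [sum(values) / n for values in zip(*windows)]


# Hypothetical photometry trace and drink-onset sample indices.
signal = [0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 0, 0]
onsets = [3, 8, 11]  # the last onset is too close to the end to give a full window

windows = peri_event_windows(signal, onsets, pre=2, post=2)
mean_trace = average_windows(windows)
```

Because the windows are anchored on the start of the lickometer signal, the averaged trace describes activity around drink initiation and says nothing about individual licks within a drink.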

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured. 

      We use the word strategy to describe a plan of action that is executed by some chunking of motor sequences that amounts to a behavioral event, in this case drinking a fluid. We do not mean to imply anything further than this by using this specific word.

      Reviewer #2 (Public Review): 

      Summary: 

      This study looks at sex differences in alcohol drinking behaviour in a well-validated model of binge drinking. They provide a comprehensive analysis of drinking behaviour within and between sessions for males and females, as well as looking at the calcium dynamics in neurons projecting from the anterior insula cortex to the dorsolateral striatum. 

      Strengths: 

      Examining specific sex differences in drinking behaviour is important. This research question is currently a major focus for preclinical researchers looking at substance use. Although we have made a lot of progress over the last few years, there is still a lot that is not understood about sex-differences in alcohol consumption and the clinical implications of this. 

      Identifying the lateralisation of activity is novel, and has fundamental importance for researchers investigating functional anatomy underlying alcohol-driven behaviour (and other reward-driven behaviours). 

      Weaknesses: 

      Very small and unequal sample sizes, especially females (9 males, 5 females). This is probably ok for the calcium imaging, especially with the G-power figures provided, however, I would be cautious with the outcomes of the drinking behaviour, which can be quite variable. 

      For female drinking behaviour, rather than this being labelled "more efficient", could this just be that female mice (being substantially smaller than male mice) just don't need to consume as much liquid to reach the same g/kg. In which case, the interpretation might not be so much that females are more efficient, as that mice are very good at titrating their intake to achieve the desired dose of alcohol. 

      We agree that the “more efficient” drinking language could be bolstered by additional discussion in the text, and thus have added this to the manuscript starting at line 440.

      I may be mistaken, but is ANCOVA, with sex as the covariate, the appropriate way to test for sex differences? My understanding was that with an ANCOVA, the covariate is a continuous variable that you are controlling for, not looking for differences in. In that regard, given that sex is not continuous, can it be used as a covariate? I note that in the results, sex is defined as the "grouping variable" rather than the covariate. The analysis strategy should be clarified. 

      In lines 265-267, we explicitly state that the covariate factor was sex, which is mathematically correct based on the analyses we ran. We made an in-text error where we referred to sex as a grouping variable on Line 352, when it should have been the covariate. Thank you for the catch and we have corrected the manuscript.

      But, to reiterate, we are attempting to determine whether the regression fits differ significantly by sex, which would be reported as a significant covariate. Sex is certainly a categorical variable, but the two measures against which we are comparing it are continuous, so we believe it is valid to run an ANCOVA here.
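
For readers unfamiliar with this kind of comparison, one common formulation is sketched below: the classic equal-slopes ANCOVA, which asks whether allowing a separate intercept per group improves on a single pooled regression line. This is an assumption about the general technique, not a reproduction of the authors' analysis or software; the data, the function names and the group labels are hypothetical, and a fuller comparison of regression fits would also test for a difference in slopes.

```python
def _sums(xs, ys):
    """Centered sums of squares and cross-products for one set of points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def ancova_group_f(groups):
    """F statistic for a group effect under a common-slope model.

    groups maps a label to (xs, ys). Returns (F, df1, df2).
    """
    # Full model: one shared slope, one intercept per group.
    within = [_sums(xs, ys) for xs, ys in groups.values()]
    sxx_w = sum(s[0] for s in within)
    sxy_w = sum(s[1] for s in within)
    syy_w = sum(s[2] for s in within)
    rss_full = syy_w - sxy_w ** 2 / sxx_w

    # Reduced model: a single regression line through all points.
    all_x = [x for xs, _ in groups.values() for x in xs]
    all_y = [y for _, ys in groups.values() for y in ys]
    sxx_t, sxy_t, syy_t = _sums(all_x, all_y)
    rss_reduced = syy_t - sxy_t ** 2 / sxx_t

    k, n = len(groups), len(all_x)
    df1, df2 = k - 1, n - k - 1
    f = ((rss_reduced - rss_full) / df1) / (rss_full / df2)
    return f, df1, df2


# Hypothetical data: same slope in both groups, one group shifted upward by 3.
x = [1, 2, 3, 4, 5]
y_a = [2.1, 3.9, 6.2, 7.8, 10.1]
y_b = [v + 3 for v in y_a]

f_shifted, df1, df2 = ancova_group_f({"a": (x, y_a), "b": (x, y_b)})
f_same, _, _ = ancova_group_f({"a": (x, y_a), "b": (x, y_a)})
```

The F statistic would then be compared against an F distribution with `df1` and `df2` degrees of freedom; the p-value is left out here to keep the sketch free of external dependencies.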

      Reviewer #3 (Public Review): 

      Summary: 

      In this manuscript by Haggerty and Atwood, the authors use a repeated binge drinking paradigm to assess how water and ethanol intake change in male and female mice, and measure changes in anterior insular cortex to dorsolateral striatum terminal activity using fiber photometry. They find that males and females have similar overall water and ethanol intake, but females appear to be more efficient alcohol drinkers. Using fiber photometry, they show that the anterior insular cortex (AIC) to dorsolateral striatum (DLS) projections have sex, fluid, and lateralization differences. The male left circuit was most robust when aligned to ethanol drinking, and water was somewhat less robust. Male right, and female left and right, had essentially no change in photometry activity. To some degree, the changes in terminal activity appear to be related to fluid exposure over time, as well as within-session differences in trial-by-trial intake. Overall, the authors provide an exhaustive analysis of the behavioral and photometric data, thus providing the scientific community with a rich information set to continue to study this interesting circuit. However, although the analysis is impressive, there are a few inconsistencies regarding specific measures (e.g., AUC, duration of licking) that do not quite fit together across analytic domains. This does not reduce the rigor of the work, but it does somewhat limit the interpretability of the data, at least within the scope of this single manuscript. 

      Strengths: 

      - The authors use high-resolution licking data to characterize ingestive behaviors. 

      - The authors account for a variety of important variables, such as fluid type, brain lateralization, and sex. 

      - The authors provide a nice discussion on how this data fits with other data, both from their laboratory and others'. 

      - The lateralization discovery is particularly novel. 

      Weaknesses: 

      - The volume of data and number of variables provided makes it difficult to find a cohesive link between data sets. This limits interpretability.

      We agree there is a lot of data and variables within the study design, but also believe it is important to display the null and positive findings alongside each other to describe the changes we measured holistically across water and alcohol drinking.

      - The authors describe a clear sex difference in the photometry circuit activity. However, I am curious about whether female mice that drink more similarly to males (e.g., less efficiently?) also show increased activity in the left circuit, similar to males. Oppositely, do very efficient males show weaker calcium activity in the circuit? Ultimately, I am curious about how the circuit activity maps to the behaviors described in Figures 1 and 2. 

      In Figure 3C, we show that, across the time window of drinking behaviors, female mice that drink alcohol have a higher baseline calcium activity than water-drinking female mice. We therefore believe there are alcohol-induced changes in AIC to DLS inputs within females, but there remains a lack of engagement (as measured by changes in amplitude) compared to males. So, when comparing consummatory patterns that are similar by sex, we still see the lack of calcium signaling near the drinking bouts, but small shifts in baseline activity that we aren’t truly powered to resolve (using an AUC or similar measurements for quantification) because the shifts are so small. Ultimately, we presume that the AIC to DLS inputs in females aren’t the primary node for encoding this behavior, and some recent work out of David Werner’s group (Towner et al. 2023) suggests that for males who drink, the AIC becomes a primary node of control, whereas in females, the PFC and ACC are more engaged. Thus, the mapping of the circuit activity onto the drinking behaviors more generally represented in Figures 1 and 2 may be sexually dimorphic, and further studies will be needed to resolve how females engage differential circuitry to encode ongoing binge drinking behaviors.

      - What does the change in water-drinking calcium imaging across time in males mean? Especially considering that alcohol-related signals do not seem to change much over time, I am not sure what it means to have water drinking change. 

      The AIC seems to encode many physiologically relevant, interoceptive signals, and the change in water drinking signals in males was puzzling to us as well. Currently, we think it may reflect both the animals becoming more efficient at drinking out of the lickometers in early weeks and signaling changes due to thirst states or taste associated with the fluid. This is speculation, and more in-depth studies are needed to determine how thirst states or taste may modulate AIC to DLS inputs, but we believe that is beyond the scope of the current study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Line 45 - states alcohol use rates are increasing in females across the past half-decade. I thought this trend was apparent over the past half-century? Please consider revising this. 

      According to NIAAA, the gap between rates of alcohol consumption in females and males has been closing for about the past 100 years, but only recently are those trends starting to reverse, with females drinking similar amounts to or more than males.

      Placing more of the null findings into supplemental data would make the long paper more accessible to the reader. 

      With reference to Reviewer #3’s point as well, we present a lot of data, and we hope that others will use these data, both null and positive findings, in their future work. As formatted on eLife’s website, we think it is important to place these findings in-line as well.

      Reviewer #2 (Recommendations For The Authors): 

      In addition to the points raised about analysis and interpretation in the Public Review, I have a minor concern about the written content. I find the final sentence of the introduction "together these findings represent targets for future pharmacotherapies.." a bit unjustified and meaningless. The findings are important for a basic understanding of alcohol drinking behaviour, but it's unclear how pharmacotherapies could target lateralised aic inputs into dls. 

      There are on-going studies (CANON-Pilot Study, BRAVE Lab, Stanford) for targeted therapies that use technologies like TMS and focused ultrasound to activate the AIC to alleviate alcohol cravings and decrease heavy drinking days. The difficulty with these next-generation therapeutics is often targeting, and thus we think this work may be of use to those in the clinic to further develop these treatments. We agree that this data does not support the development of pharmacotherapies in a traditional sense, and thus have removed the word and added text to reference TMS and ultrasound approaches to bolster this statement in lines 101+.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We would like to thank the reviewers for their overall positive assessment of our manuscript. We have used their constructive feedback to substantially improve our manuscript as described below.

      Reviewer #1

      Evidence, reproducibility and clarity

      This study by Reyes et al. is a well-conducted analysis of memory B cell dynamics of Plasmodium falciparum (Pf)-specific B cell populations over the course of reducing Pf prevalence in ten Ugandan adults. The data is presented well and the authors provide compelling evidence that 1. There is an overall loss of Ag specific B cells with reduction in exposure and 2. Different antigens (MSP1/AMA-1 vs CIDRa-1) generate different flavors of long lived responses. However, additional clarity to the reader should be provided on certain topics (listed below).

      Major comments: 1. While the premise of the study (reduced Pf transmission due to the use of indoor residual spraying (IRS)) is an important one, I think the authors must take into consideration that 9/10 subjects had at least one Pf positive episode between Time Points 1 and 2 (Figure 1). Also, it looks from Fig 1 that some samples were collected at a time of Pf positive test (green squares), while in Table S1 none of the subjects have a positive parasite status at TP1.

      We recognize that most individuals had detectable parasitemia before and after time point (TP) 1. In our manuscript, we therefore do not report the time between TP1 and TP2, because we agree that the length of this time interval is not relevant in our study methodology. We only mention the time between the last known P. falciparum infection and collection of blood at the second time point. We use the sample collected at TP1 only as a representative sample obtained during a time with high P. falciparum exposure and do not make any claims based on the time between TP1 and TP2. The occurrence of infections after sample collection at TP1 confirms that parasite transmission was still high at this time. We have added a schematic of the relative levels of parasite transmission to Figure 1 to emphasize this.

      With respect to infection status, none of the donors were blood smear positive at TP1. However, as mentioned in Table S1, parasites were detected in three individuals using the more sensitive LAMP assay. These three individuals are therefore marked as parasite positive in Figure 1. Table S1 has been modified to highlight the parasite status of these three individuals.

      2. Figure S1A: What is trBC? Figure S1B: What is Strep? Are the strep positive cells also CIDR-1 positive and were they gated out? Why is APC used for MZ-1 and one of the MSP1-AMA-1 tetramers? Do these stainings come from multiple panels?

      All abbreviations of B cell populations were defined in the figure legend (for example, trBC stands for transitional B cells). To facilitate the interpretation of Figure S1, we have now included the definitions of these abbreviations in the figure.

      Strep stands for streptavidin, which has now also been clarified in the figure. In our gating strategy, we used the term “strep” to denote cells that bound to both CIDRa1 and MSP1/AMA1 tetramers, which we interpreted as non-specific binding to streptavidin or other components of the antigen tetramers. Only the “non-strep” cells were used to gate on antigen-specific cells. We have added this clarification to the figure legend.

      In panel B, we accidentally used the term MZ (for merozoite) to describe tetramers of the merozoite antigens MSP1 / AMA1. These labels are interchangeable, but to avoid confusion, MZ-1 has been changed to MSP1 / AMA1.

      3. Figure 3A: how many cells does the umap plot represent? Were there a total of 3555 Ag specific B cells that were non-naive (Figure 3E)?

      It is correct that there were a total of 3,555 antigen-specific B cells used for the clustering shown in panel A. This information has been added to Figure 3A.

      4. Could the authors comment on why in Figure 3, Ig isotype expression was not considered for clustering? This would allow for characterization of DN sub populations/ clusters in addition to the CD21-CD27- ABCs? It looks like IgD expression was low across the clusters (Figure 3D). Was this the case for the cells considered in this analysis, or was it excluded? If it was truly low expressed, how were the assessments in Figure 2 made?

      From prior experience, we know that Ig isotype information tends to dominate in the clustering, which would result in major clusters based on IgM, IgD, IgG, and IgA expression, not on expression of other markers. This is illustrated in the example below. The UMAP on the left shows clusters in green and red that consist of IgG+ and IgA+ B cells, respectively. The UMAP on the right shows that switched memory (swM) B cells and DN B cells are found in both IgG and IgA clusters. Because we were mainly interested in identifying different subsets of B cells, irrespective of Ig isotype, we did not include Ig isotype in the clustering. We have clarified in the manuscript that Ig isotypes were excluded from the analysis to prevent these from dominating the clustering:

      “Unsupervised clustering was then performed based on expression of all markers, except for Ig isotypes to prevent these from dominating the clustering.”

      IgD expression among cell clusters shown in Figure 3 was low because only non-naïve B cells were included in the analysis. The majority of non-naïve cells are class-switched memory B cells and DN B cells, which by definition do not express IgD (see gating strategy in Figure S1A). Figure 2 shows all B cell populations, including naïve B cells and non-naïve B cell populations (unswitched memory, switched memory, and DN), that were gated based on IgD and CD27 expression.

      5. Are there differences in these designations / phenotypes of DN populations of atBCs vs CD21-CD27- atBCs?

      In the malaria field, atypical B cells are typically defined as CD21-CD27-. The definition of DN2 B cells comes from the autoimmunity field and is stricter: IgD-CD27-CD21-CD11c+ B cells. In our manuscript, we define atypical B cells in a stricter way than typically done in the malaria field, following published guidelines for the identification of B cell subsets (https://doi.org/10.3389/fimmu.2019.02458). Using these guidelines, atypical B cells and DN2 B cells are phenotypically identical. We have added a reference to these published guidelines in the Results section:

      “Following published guidelines for the identification of B cell populations (21), total CD19+ B cells were divided into naïve B cells (IgD+CD27-), unswitched memory B cells (IgD+CD27+), switched memory B cells (IgD-CD27+), and double negative B cells (IgD-CD27-).”
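
The gating definitions quoted above can be written out as a small lookup. This is only a sketch: it assumes each marker has already been thresholded to positive or negative, and the function names are hypothetical. The DN2 / atypical definition follows the stricter IgD-CD27-CD21-CD11c+ phenotype described in the response.

```python
def classify_b_cell(igd_pos, cd27_pos):
    """Assign a CD19+ B cell to one of four populations from IgD and CD27 status."""
    if igd_pos and not cd27_pos:
        return "naive"
    if igd_pos and cd27_pos:
        return "unswitched memory"
    if not igd_pos and cd27_pos:
        return "switched memory"
    return "double negative"


def is_atypical(igd_pos, cd27_pos, cd21_pos, cd11c_pos):
    """Stricter atypical / DN2 definition: IgD- CD27- CD21- CD11c+."""
    return (not igd_pos) and (not cd27_pos) and (not cd21_pos) and cd11c_pos


# Example calls with already-thresholded marker status (hypothetical cells).
example_population = classify_b_cell(igd_pos=False, cd27_pos=True)
example_atypical = is_atypical(igd_pos=False, cd27_pos=False, cd21_pos=False, cd11c_pos=True)
```

Under these definitions every atypical cell is also double negative, but a CD21-CD27- cell that still expresses IgD or lacks CD11c would not be counted as atypical.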

      6. Lines 258-259: In considering only switched MBCs, what clusters from Figure 3a were included? There seem to be 2588 sw MBCs (Table S3, Figure 4). Do the remaining cells (967 cells) come from clusters 2, 5 and 6 (and excludes the atBC clusters)?

      This analysis did not use the clusters presented in Figure 3, but instead used switched memory B cells gated as shown in Figure S1A. The reason for this is that the clusters in Figure 3 were generated using antigen-specific B cells and cannot be reproduced using non-antigen-specific B cells. Thus, it is not possible to separate all other B cells into the same six clusters. The only way to compare expression of certain markers between antigen-specific and non-antigen specific switched memory B cells is to gate on these populations manually. We have now tried to clarify this in the manuscript as follows:

      “we determined the percentages of CD95+ cells and CD11c+ cells among antigen-specific switched memory B cells and the total population of switched memory B cells (gated manually as shown in Figure S1A).”

      Minor comments: 1. Line 178- 179: Was there a specific measure of rate of decline made for these cells?

      We did not calculate a rate of decline of antigen-specific B cells for several reasons: 1) the time between TP1 and TP2 is not the same for all people in the study, 2) the time between last exposure and TP2 is not the same for all people, and 3) the rate of decline is most likely not linear and cannot accurately be estimated with only two data points. We have changed the wording of this sentence such that we do not use the word “rate”:

      “we did not observe a difference in the percentage of B cells with specificity for merozoite antigens or variant surface antigens that were lost.”

      In addition, we included the percentage of reduction in size in the paragraph before this section:

      “we observed that both populations decreased in size by about 50%, although these differences were not statistically significant.”

      Significance

      General assessment: Strengths: The authors provide evidence that the dynamics of antigen specific cells in humans can vary with exposure and with the nature of the antigen. They have nicely discussed the potential causes for these differences (Discussion), although they should include the findings of Ambegaonkar et al that ABCs in malaria may be restricted to responding specifically to membrane bound antigens (PMCID: PMC7380957)

      As suggested by the reviewer, we have added a paragraph to the Discussion section to discuss the results reported by Ambegaonkar et al. and how the difference between soluble vs. membrane-bound antigens may have an effect on how these antigens are perceived by B cells:

      The difference between soluble and membrane-bound antigens may also have a direct effect on how these antigens are perceived by B cells. Atypical B cells have been shown to be restricted to recognition of membrane-bound antigens (41). The interaction of a B cell with membrane-associated antigen allows the formation of an immunological synapse. Inhibitory receptors expressed by atypical B cells are excluded from this synapse, resulting in B cell receptor signaling and differentiation towards antibody-secreting cells (41). This could explain why atypical B cell subset 1 that expresses the highest levels of the inhibitory receptor FcRL5 is enriched for recognition of the CIDRα1 domain of membrane-bound protein PfEMP1. It should however be noted that soluble antigen can also be presented effectively in membrane-context by conventional dendritic cells, follicular dendritic cells, and subcapsular macrophages in secondary lymphoid organs, especially when it is part of an immune complex (reviewed in (42)). This would provide a route for atypical B cells to also respond to soluble merozoite antigens, such as MSP1 and AMA1.

      Limitations: 1. Outlined above, and as the authors also mention, a small sample size and homogenous population. 2. The evidence for reduced transmission is not clear, and the negative parasite tests for donors shown in Table S1 do not match with Figure 1 data. 3. Lack of IgD expression across clusters (Figure 3D- the authors will need to clarify this point) would require re-analysis of Figure 2 data

      1. We have provided clarification in response to the points raised by the reviewer.

      2. We believe there is clear evidence for reduced transmission, from a median of almost 2 infections per person per year prior to the implementation of IRS to a median parasite-free period of 1.7 years prior to sample collection at TP2. To further emphasize this, we have summarized the number of P. falciparum infections among the ten individuals included in this study (now included in Table S3):

      Year    Pf infections    Comment
      2012    20
      2013    19               TP1
      2014    20               TP1
      2015    8                Start IRS
      2016    0                TP2

      This reduced parasite exposure is reflected in a decrease in immune activation as presented in Figure 2. We have clarified that the data in Table S1 did indeed match those shown in Figure 1.

      3. We have clarified that IgD expression is low in the clusters presented in Figure 3 because naïve B cells were excluded from this analysis.

      Advances: This study highlights the importance of studying antigen specific B cells in humans in the context of natural infection and the use of high-parameter tools such as spectral flow cytometry in assessing a large quantity of data from limited clinical samples. These data are important to inform better vaccine design. Studies in inbred animals can be quite limited or different from human B cell responses.

      Audience: This study will be of interest to malariologists and B cell immunologists. Atypical B cells are relevant to many infectious diseases and autoimmunity, while the dynamics of memory B cells in malaria will be relevant to those interested in vaccine design against blood stage antigens.


      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: In this study, the authors compared long-lived total and antigen (ag)-specific B-cell levels in a cohort of 10 Ugandan malaria patient samples that were collected before and after local reduction of P. falciparum transmission (pre/post-IRS). The focus is on the novel comparison of the two most common malaria antigens: merozoite antigens (MSP1/AMA1) and variant surface antigens (CIDRα1). Using high-parameter spectral flow cytometry, they also characterized the phenotype of the different populations of cells. Their main findings include 1) a decrease in activated but maintenance of resting ag-specific B-cells in the post-IRS sample and 2) CD95 and CD11c, as the only differentially expressed markers between MSP1/AMA1-specific and CIDRα1-specific long-lived memory B cells. Their further phenotypic characterization suggests functional consequences with MSP1/AMA1-specific B-cells being poised for rapid antibody-secreting cell differentiation while CIDRα1-specific B cells were enriched among a subset of atypical B cells that seem poised for antigen presentation (CD86+CD11chi/ AtBC1). Their findings consolidate and further expand our knowledge of long-lived B-cell levels during P. falciparum malaria and report/compare (for the first time to my knowledge) a differential selection of long-lived B-cell levels between these 2 antigen specificities. Overall, the manuscript is straightforward and well-written and the authors did a good job explaining their methodology, findings, and interpretations. I believe the major gap missing in this study is the reconciliation of long-lived antigen-specific B-cell levels with the serum antigen-specific antibody levels of these patients against the same 2 antigens (MSP1/AMA1 and CIDRα1) in the experiments and the discussion. The antibody data would strengthen their main argument and is the main missing piece for characterizing more completely the long-lived antigen-specific humoral responses. 
Below are my suggestions that would help improve the manuscript:

      Major comments: 1. Serum Anti-Pf antibodies: Do the authors have access to the serum/plasma of these patients? It would be important to correlate the total and ag-specific B-cell populations with levels of serum IgG antibodies against those specific Pf antigens (MSP1/AMA1 and CIDRα1) and total IgG levels to strengthen their point about long-lived humoral responses.

      To our understanding, the rationale for such an analysis would be that if IgG levels correlated with the size of a certain B cell population, it would suggest that this B cell population is implicated in the production of IgG against a particular antigen. While a correlation between the percentage of memory B cells and IgG titers has been observed for antigens from several viruses and bacteria (1-4), other studies have reported the absence of such a correlation (4-7). Similarly, for P. falciparum antigens, a moderate correlation between memory B cell abundance and IgG titers has been observed for some merozoite antigens, but not for others (8, 9). The lack of a correlation between the magnitude of the memory B cell and the antibody response fits with the prevailing model that memory B cells and plasma cells are two independently controlled arms of the humoral immune system (10, 11). Given the lack of strong evidence that the levels of IgG titers and memory B cells are interconnected, we do not think this analysis will be informative.

      An alternative analysis would be to study the contribution of B cell subsets to the production of IgG after re-exposure, similar to a recent study that identified T-bet+ memory B cells as the main contributors to antibody responses following influenza virus vaccination (12). Unfortunately, we are unable to perform this analysis in this study population, because only four of the individuals included in this study (spanning calendar years 2012 – 2016) were recruited into a follow up cohort (calendar years 2017 – 2019), and none of these four people were infected during this later time frame.

      We have however added this future direction to the Discussion section:

      To determine the contribution of different memory B cell subsets to the recall response against P. falciparum, it would be interesting to analyze IgG responses upon re-infection. However, none of the individuals included in this study experienced a recorded P. falciparum infection post-IRS, preventing us from performing such an analysis.

      References

      1. Crotty et al., J Immunol (2003), https://doi.org/10.4049/jimmunol.171.10.4969
      2. Quinn et al., J Infect Dis (2004), https://doi.org/10.1086/423937
      3. Cohen et al., Cell Rep Med (2021), https://doi.org/10.1016/j.xcrm.2021.100354
      4. Amanna et al., New England J Med (2007), https://doi.org/10.1056/nejmoa066092
      5. Leyendeckers et al., Eur J Immunol (1999), https://doi.org/10.1002/(sici)1521-4141(199904)29:04%3C1406::aid-immu1406%3E3.0.co;2-p
      6. Nanan et al., Vaccine (2001), https://doi.org/10.1016/s0264-410x(01)00328-0
      7. Goel et al., Science Immunol (2021), https://doi.org/10.1126/sciimmunol.abi6950
      8. Rivera-Correa et al., eLife (2019), https://doi.org/10.7554/elife.48309
      9. Jahnmatz et al., Front Immunol (2021), https://doi.org/10.3389/fimmu.2020.619398
      10. Weisel et al., Immunity (2016), https://doi.org/10.1016/j.immuni.2015.12.004
      11. Shinnakasu et al., Nat Immunol (2016), https://doi.org/10.1038/ni.3460
      12. Nellore et al., Immunity (2023), https://doi.org/10.1016/j.immuni.2023.03.00

      2. Correlation between populations and initial parasite load: Are the levels between any of the populations at any time point correlated significantly in any way? If the statistical power/N allows it, please perform a correlation array between all populations using all samples both total and ag-specific and initial parasite load.

      We agree that this analysis could be very interesting. However, in most recorded infection cases, parasitemia was submicroscopic and parasite load was not reported. Information about parasite density in the blood prior to TP1 is available for only half of the individuals in this study. In these people, the last known parasite density was recorded between three months to two years prior to TP1. Given the small number of individuals for whom these data are available and the large variation in time between parasitemia and sampling, we do not have sufficient data to perform this analysis.

      3. Figure 2: Why were total and ag-specific plasmablasts/plasma cells not included in this figure? Please include to compare levels in these two time points.

      We did not include the levels of total and antigen-specific plasmablasts (PBs) in Figure 2 because the percentages of PBs are relatively low, and very few antigen-specific PBs were detected. We have now included the levels of total PBs in Figure 2A and the percentages of antigen-specific PBs in Supplementary Figure 2. The percentage of PBs among total B cells decreased by about 50% between TP1 and TP2, in line with a decrease in immune activation.

      4. Healthy baseline: The study is missing "healthy" controls as a reference. I presume this is because each patient is its own uninfected control in the post-IRS sample. In methods, they mentioned they used two naïve-USA B-cells as technical controls. It would be important to include and maybe expand (to match age and gender) on that specific data from those controls as supplementary figures to support their findings:
         1. Show negative tetramer staining for these samples (to understand the background).
         2. Show the levels of total B cell populations for all the USA controls, compared to the pre/post-IRS samples, to understand "baseline" or "non-endemic" control levels.

      1. We have included flow cytometry plots of tetramer staining for the non-P. falciparum exposed donors (pooled B cells from two US donors) to show the level of background for these probes. These plots are shown in Figure S1B.

      2. We have used data from P. falciparum-naive US donors (n = 7) that we generated for a prior study to show the average level of total B cell populations in Figure 2, and the percentage of switched memory B cells that express CD95, CD11c, T-bet, and FcRL5 in Figure 4.

      Minor comments: 1. In the gating strategy (S1), please include the percentage of each population of that representative example.

      We have added the percentages for all gated populations to Figure S1.

      2. For Figure 2, since not every panel has the same N, please include the N for each panel in the figure or a supplementary table.

      All panels in Figure 2 show data for all 10 individuals. However, since some data points are overlapping, it may appear that some panels show data from fewer individuals. Specifically, no antigen-specific DN1 cells were detected pre- and post-IRS for four individuals. These data points therefore overlap and are not visible. To avoid confusion, we had mentioned this in the legend to Figure 2 (see text in orange). We have tried to further clarify this by emphasizing in the figure legend that data from all 10 individuals are shown (see text in red):

      Figure 2: Abundance of total and antigen-specific B cell subsets in the circulation during high parasite transmission and in the absence of P. falciparum exposure. The percentage of B cell subsets among circulating B cells is shown for total B cells (A), MSP1/AMA1-specific B cells (B), and CIDRα1-specific B cells (C). For MSP1/AMA1-specific B cells and CIDRα1-specific B cells, the total percentage among all circulating B cells is also shown (right most graphs in each panel). All panels show data for all 10 individuals. In panels B and C, no antigen-specific DN1 cells were detected pre- and post-IRS for four individuals. These data points therefore overlap and are not clearly visible. Differences between groups were evaluated using a Wilcoxon matched-pairs signed-rank test. P values

      3. Please mention the history of past and chronic co-infections of these 10 patients. Particularly if they had any other active or recent infection when the sample was taken.

      Four individuals had active or recent infections in the three months prior to sample collection, with upper respiratory tract infections being the most common. This information has been included in Table S3, with a reference to these data in the Methods section. We have also included a link to ClinEpiDB where additional information about the cohort participants, including medical history, can be found.

      4. Discussion: further discussion with relevant literature on the following points is needed to consolidate cellular and antibody studies: a. Whether the presence of long-lived ag-specific B-cell responses correlates with sustained levels of IgG against Pf antigens. b. The different types of antibodies (protective/pathogenic) that these different B-cell populations have been reported to produce during malaria.

      a. We have added the following paragraph to the Discussion section:

      To determine how these different long-lived B cell subsets contribute to protection against P. falciparum infection, it would be important to analyze the connection between the cellular repertoire and plasma IgG. For P. falciparum antigens, a moderate correlation between memory B cell abundance and IgG titers has been observed for some merozoite antigens, but not for others (28, 44). This is in line with studies for other pathogens, that showed a correlation between the percentage of memory B cells and IgG titers for antigens from several viruses and bacteria (48-51), while other studies have reported the absence of such a correlation (51-54). The lack of a correlation between the magnitude of the memory B cell and the antibody response fits with the prevailing model that memory B cells and plasma cells are two independently controlled arms of the humoral immune system (55, 56). To determine the contribution of different memory B cell subsets to the recall response against P. falciparum, it would be interesting to analyze IgG responses upon re-infection. However, none of the individuals included in this study experienced a recorded P. falciparum infection post-IRS, preventing us from performing such an analysis.

      b. We have added additional discussion about the types of antigens recognized by atypical B cells to the Discussion section:

      Prior studies have shown that while atypical B cells harbor reactivity against P. falciparum antigens (9,18), they are also enriched for autoreactivity (43). Specifically, atypical B cells produce antibodies against the membrane lipid phosphatidylserine, which can induce the destruction of uninfected erythrocytes and contribute to anemia (44).

      Significance

      General assessment:

      Strengths: - Novelty in contrasting two different types of P. falciparum antigen responses at the B-cell level. - The use of tetramers is a cutting-edge technique to assess this question. - Analyses were thorough and found contrasting differences in antigen-specific B-cell populations (atypical vs classical) between these 2 antigens for the first time (to my knowledge). - Well-written manuscript with clear data, methodology, and conclusions

      Limitations: - Missing serum/plasma antibody data to support their claim about long-lived humoral responses and reconciliation of ag-specific B-cell levels and ag-specific antibody levels in experiments and discussion. - Limited N of 10 patients of the same gender (female), some population analyses had even fewer samples. - Missing baseline levels for non-endemic uninfected control for B-cell populations for comparison.

      • We have included a discussion about the correlation between plasma antibody and memory B cell responses in the Discussion section.

      • We have clarified that some data points overlap in Figure 2, giving the impression that data from fewer than 10 individuals were shown.

      • We have included baseline levels of 1) tetramer reactivity (Figure S1), 2) the size of B cell populations (Figure 2), and 3) expression of select markers (Figure 4).

      Advance: The study consolidates antigen-specific responses with the discovery of recently characterized populations (ex. atypical) and finds novel differences between two types of malaria antigen responses at the B-cell level and between specific populations responding differentially to these antigens. The findings are incremental (role of B-cell population in malaria-specific responses), conceptual (contrasting two types of B-cell antigen responses in the same infection), and clinical (finding significant differences in patients).

      Audience: This study will attract basic B-cell immunology scientists, infectious disease clinicians/scientists, vaccinologists, and both basic malaria immunology and clinical audiences.

      Reviewer expertise: Malaria, immunology, antibodies.

      Reviewer #3

      Evidence, reproducibility and clarity: The authors analysed the antigen specificity and phenotypes of B cells during high P falciparum transmission and after a period of successful malaria control with IRS in Uganda. The gap between the two sampling time points is close to two years.

      They use antigen probes for MSP1/AMA1 and CIDRalpha1, two antigens expressed at different stages of the P. falciparum life cycle: merozoites and infected red cells, respectively. While MSP1/AMA1 are involved in the parasite's invasion of red blood cells, CIDRalpha1 is a domain of PfEMP1, a large family of antigenically variant proteins that mediates the sequestration of infected red cells in small blood vessels.

      They found that the percentage of activated antigen-specific memory B cells declined with malaria control. However, detectable frequencies of antigen-specific memory B cells were retained after malaria control, which confirms earlier reports.

      However, they also demonstrate that the phenotypic characteristics of memory B cells are associated with antigen specificity. The retained MSP1/AMA1-specific B cells were mostly CD95+CD11c+ memory B cells and FcRL5-Tbet- atypical B cells. In contrast, the retained CIDRalpha1-specific B cells were enriched among a subpopulation of atypical B cells.

      These findings suggest differences exist in how MSP1/AMA1 and CIDRalpha1 are recognised and processed by the human immune system and how the immune system responds to them upon re-infection with P falciparum.

      Major issues affecting the conclusion: The findings and conclusions of this study, whilst positively exciting and informative, are based on the analyses of very few cells (at times). Even the authors themselves acknowledge this. I expect the authors to address this issue by toning down their reporting and conclusions (where appropriate). Ultimately, we need to have the confidence that these results are reproducible.

      We appreciate the reviewer’s concern about the numbers of antigen-specific cells included in our analyses, which is an inherent limitation of this approach. However, we would like to point out that most analyses included a substantial number of antigen-specific B cells:

      Figure 3D: 158 to 2,038 cells per group

      Figure 4: an average of 26 to 184 cells per donor

      Figure 5B: 55 to 508 cells per group

      Figure 5C: 10 to 334 cells per group*

      * The group with 10 cells is an outlier here. All other groups contain at least 104 cells. Because this one condition had such a small number of cells, we specifically mentioned this number in the text.

      The numbers of cells for analyses shown in Figures 3D and 5B were already included in the figures. All the other numbers were mentioned in Table S3. To further clarify the number of cells included in each analysis, we have added the number of cells to Figures 4 and 5C.

      To tone down our reporting, we have rephrased some of our conclusions, and now present our section headers in past tense to make these statements reflect our observation instead of a definitive conclusion. For example:

      Conclusion: “The loss of MSP1/AMA1-specific and CIDRα1-specific B cells in the circulation was similar, but the phenotype of long-lived MSP1/AMA1-specific and CIDRα1-specific B cells appeared to differ.”

      Section header: “Long-lived MSP1/AMA1-specific and CIDRα1-specific B cells differed in phenotype”

      Finally, in the Discussion section, we have added a statement to our paragraph describing the limitations of our study to stress the importance of reproducing our findings:

      All in all, it will be important to perform additional studies of the phenotype and functionality of long-lived B cells with specificity for P. falciparum antigens to reproduce and extend our findings.

      Minor comments: Figure 2D: I found this figure and its presentation unclear. Notably, using contour plots doesn't allow the reader to appreciate the density of the cells being presented.

      To facilitate the interpretation of this figure, we have changed the plot type to a contour plot with density color gradient, and added the number of cells shown in each plot. (Please note that this panel has been renumbered to C.)

      Figure 4 - label the y-axis.

      The y-axis was labeled with “%”, which we have expanded to “% of B cells expressing marker of interest”.

      Significance: The study design, as outlined, allowed for the analyses of the specificity and phenotypic characteristics of residual P falciparum-specific memory B cells after 1.7 years of little to no P falciparum exposure. The cell phenotyping methods presented are also appropriate. However, antigen-specific cells are rare in blood circulation, and as the authors themselves acknowledge in the discussion, some of the results are based on very few cells. This means we cannot be sure all the results presented are reproducible.

      Previous studies demonstrated that P falciparum memory B cells are maintained long after cessation of antigen exposure. However, few (if any) detailed antigen-specific and phenotypic analyses of the characteristics of P falciparum-specific memory B cells following a long period of no exposure exist. Thus, this study presents an incremental advance in our knowledge. In addition, the association of antigen specificity with cell phenotypes is a new concept in malaria immunology. The research presented will greatly interest infectious disease immunologists and vaccinologists.

      I am an infectious disease immunologist with substantial experience in malaria immunology.

    1. Recognize the difference between casual, formal, and urgent registers. Learn how to use each in the classroom and make your shifts between the registers obvious.

      I think that this is a very important point. Being able to understand the difference between formal and informal lessons and tones, as well as posture and facial expressions, is a very important skill for teachers to have, as it is for anyone. As teachers and educators, we are role models to our students, and we are meant to exemplify what it is to be a positive, contributing member of society. To do that, we need to be able to model both formal and informal ways of communicating and to show when each is appropriate. For instance, if we are doing a lesson on business attire and resumes, the instructor may want to be more formal, but if the instructor is teaching a topic such as fun or games, the lesson may be less formal. It is important for an educator to model both forms of communication, as it allows students to understand that there is more to life than just being formal or informal.

    1. Reviewer #1 (Public Review):

      In this paper, Tompary & Davachi present work looking at how memories become integrated over time in the brain, and relating those mechanisms to responses on a priming task as a behavioral measure of memory linkage. They find that remotely but not recently formed memories are behaviorally linked and that this is associated with a change in the neural representation in mPFC. They also find that the same behavioral outcomes are associated with the increased coupling of the posterior hippocampus with category-sensitive parts of the neocortex (LOC) during a post-learning rest period, again only for remotely learned information. There was also correspondence in rest connectivity (posterior hippocampus-LOC) and representational change (mPFC) such that for remote memories specifically, the initial post-learning connectivity enhancement during rest related to longer-term mPFC representational change.

      This work has many strengths. The topic of this paper is very interesting, and the data provide a really nice package in terms of providing a mechanistic account of how memories become integrated over a delay. The paper is also exceptionally well-written and a pleasure to read. There are two studies, including one large behavioral study, and the findings replicate in the smaller fMRI sample. I do however have two fairly substantive concerns about the analytic approach, where more data will be required before we can know whether the interpretations are an appropriate reflection of the findings. These and other concerns are described below.

      (1) One major concern relates to the lack of a pre-encoding baseline scan prior to recent learning.

      a) First, I think it would be helpful if the authors could clarify why there was no pre-learning rest scan dedicated to the recent condition. Was this simply a feasibility consideration, or were there theoretical reasons why this would be less "clean"? Including this information in the paper would be helpful for context. Apologies if I missed this detail in the paper.

      b) Second, I was hoping the authors could speak to what they think is reflected in the post-encoding "recent" scan. Is it possible that these data could also reflect the processing of the remote memories? I think, though am not positive, that the authors may be alluding to this in the penultimate paragraph of the discussion (p. 33) when noting the LOC-mPFC connectivity findings. Could there be reinstatement of the old memories due to being back in the same experimental context and so forth? I wonder to what extent the authors think the data from this scan can be interpreted as strictly reflecting recent memories, particularly given it is relative to the pre-encoding baseline from before the remote memories as well (and therefore in theory could reflect both the remote + recent). (I should also acknowledge that, if it is the case that the authors think there might be some remote memory processing during the recent learning session in general, a pre-learning rest scan might not have been "clean" either, in that it could have reflected some processing of the remote memories; i.e., perhaps a clean pre-learning scan for the recent learning session related to point 1a is simply not possible.)

      c) Third, I am thinking about how both of the above issues might relate to the authors' findings, and would love to see more added to the paper to address this point. Specifically, I assume there are fluctuations in baseline connectivity profile across days within a person, such that the pre-learning connectivity on day 1 might be different from on day 2. Given that, and the lack of a pre-learning connectivity measure on day 2, it would logically follow that the measure of connectivity change from pre- to post-learning is going to be cleaner for the remote memories. In other words, could the lack of connectivity change observed for the recent scan simply be due to the lack of a within-day baseline? Given that otherwise, the post-learning rest should be the same in that it is an immediate reflection of how connectivity changes as a function of learning (depending on whether the authors think that the "recent" scan is actually reflecting "recent + remote"), it seems odd that they both don't show the same corresponding increase in connectivity, which makes me think it may be a baseline difference. I am not sure if this is what the authors are implying when they talk about how day 1 is most similar to prior investigations on p. 20, but if so it might be helpful to state that directly.

      d) Fourth and very related to my point 1c, I wonder if the lack of correlations for the recent scan with behavior is interpretable, or if it might just be that this is a noisy measure due to imperfect baseline correction. Do the authors have any data or logic they might be able to provide that could speak to these points? One thing that comes to mind is seeing whether the raw post-learning connectivity values (separately for both recent and remote) show the same pattern as the difference scores. However, the authors may come up with other clever ways to address this point. If not, it might be worth acknowledging this interpretive challenge in the Discussion.

      (2) My second major concern is how the authors have operationalized integration and differentiation. The pattern similarity analysis uses an overall correspondence between the neural similarity and a predicted model as the main metric. In the predicted model, C items that are indirectly associated are more similar to one another than are C items that are entirely unrelated. The authors are then looking at a change in correspondence (correlation) between the neural data and that prediction model from pre- to post-learning. However, a change in the degree of correspondence with the predicted matrix could be driven by either the unrelated items becoming less similar or the related ones becoming more similar (or both!). Since the interpretation in the paper focuses on change to indirectly related C items, it would be important to report those values directly. For instance, as evidence of differentiation, it would be important to show that there is a greater decrease in similarity for indirectly associated C items than for unrelated C items (or even a smaller increase) from pre to post, or that C items that are indirectly related are less similar than unrelated C items post- but not pre-learning. Performing this analysis would confirm that the pattern of results matches the authors' interpretation. This would also impact the interpretation of the subsequent analyses that involve the neural integration measures (e.g., correlation analyses like those on p. 16, which may or may not be driven by increased similarity among overlapping C pairs). I should add that given the specificity to the remote learning in mPFC versus recent in LOC and anterior hippocampus, it is clearly the case that something interesting is going on. However, I think we need more data to understand fully what that "something" is.
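
      The ambiguity raised in point (2) can be illustrated with a small sketch. It is not the authors' analysis: the four-item similarity values, the `related` pair labels, and the helper names `pearson` and `summarize` are all hypothetical, chosen only to show that two different underlying changes (related pairs becoming more similar, or unrelated pairs becoming less similar) can produce the same change in correspondence with the predicted model.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def summarize(sim, related):
    """Return the model correspondence plus the two pair-type means."""
    pairs = sorted(sim)
    values = [sim[p] for p in pairs]
    # Predicted model: related pairs coded 1, unrelated pairs coded 0.
    model = [1.0 if p in related else 0.0 for p in pairs]
    return {
        "model_fit": pearson(values, model),
        "related": mean([sim[p] for p in pairs if p in related]),
        "unrelated": mean([sim[p] for p in pairs if p not in related]),
    }


# Hypothetical toy data: four C items, two indirectly related pairs.
related = {(0, 1), (2, 3)}
pre = {(0, 1): 0.20, (0, 2): 0.25, (0, 3): 0.15,
       (1, 2): 0.20, (1, 3): 0.20, (2, 3): 0.20}

# Scenario A: only the related pairs become more similar.
post_a = {p: (v + 0.10 if p in related else v) for p, v in pre.items()}
# Scenario B: only the unrelated pairs become less similar.
post_b = {p: (v if p in related else v - 0.10) for p, v in pre.items()}

s_pre = summarize(pre, related)
s_a = summarize(post_a, related)
s_b = summarize(post_b, related)

for name, s in (("pre", s_pre), ("post A", s_a), ("post B", s_b)):
    print(name, s)
```

      In this toy case the two post-learning scenarios differ by a constant offset, so their correlation with the model is identical even though the related and unrelated means move in different ways. Reporting the two means separately, as the reviewer requests, is what distinguishes them.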

      (3) The priming task occurred before the post-learning exposure phase and could have impacted the representations. More consideration of this in the paper would be useful. Most critically, since the priming task involves seeing the related C items back-to-back, it would be important to consider whether this experience could have conceivably impacted the neural integration indices. I believe it never would have been the case that unrelated C items were presented sequentially during the priming task, i.e., that related C items always appeared together in this task. I think again the specificity of the remote condition is key and perhaps the authors can leverage this to support their interpretation. Can the authors consider this possibility in the Discussion?

      (4) For the priming task, based on the Figure 2A caption it seems as though every sequence contributes to both the control and primed conditions, but (I believe) this means that the control transition always happens first (and they are always back-to-back). Is this a concern? If RTs are changing over time (getting faster), it would be helpful to know whether the priming effects hold after controlling for trial numbers. I do not think this is a big issue because if it were, you would not expect to see the specificity of the remotely learned information. However, it would be helpful to know given the order of these conditions has to be fixed in their design.
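
      One simple way to check the concern in point (4) is to residualize response times on trial number before comparing conditions. The sketch below uses made-up RTs with a pure speed-up over trials and no true priming effect; the variable names and the fixed control-then-primed ordering are assumptions for illustration, not details taken from the paper.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


# Hypothetical design: the control transition always precedes the primed one.
trials = list(range(1, 9))
conditions = ["control", "primed"] * 4
# Hypothetical RTs (ms): a 5 ms speed-up per trial and no priming effect.
rts = [600.0 - 5.0 * t for t in trials]


def condition_gap(values):
    """Mean control value minus mean primed value."""
    control = mean([v for v, c in zip(values, conditions) if c == "control"])
    primed = mean([v for v, c in zip(values, conditions) if c == "primed"])
    return control - primed


raw_gap = condition_gap(rts)

# Remove the linear trial-number trend, then compare conditions again.
slope, intercept = fit_line(trials, rts)
residuals = [rt - (intercept + slope * t) for rt, t in zip(rts, trials)]
adjusted_gap = condition_gap(residuals)

print("raw gap (ms):", raw_gap)
print("gap after removing the trial trend (ms):", adjusted_gap)
```

      Here the raw difference between conditions arises entirely from the fixed ordering and disappears once the trend is removed. Because condition and trial number are correlated in such a design, residualizing first is conservative; a model that includes both predictors at once would be the more complete check.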

      (5) The authors should be cautious about the general conclusion that memories with overlapping temporal regularities become neurally integrated, given that their findings in mPFC are more consistent with overall differentiation (though as noted above, I think we need more data on this to know for sure what is going on).

      (6) It would be worth stating a few more details and perhaps providing additional logic or justification in the main text about how the pre- and post-exposure phases were set up and why. How many times was each object presented pre and post, and how was the sequencing determined (were any constraints put in place, e.g., such that C1 and C2 did not appear close in time)? What was the cover task (I think this is important to the interpretation and so belongs in the main paper)? Were there considerations involving the fact that this is a different sequence of the same objects the participants would later be learning, e.g., interference?
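
      The ambiguity raised in point (2) can be illustrated with a minimal sketch. The similarity values below are invented for illustration and are not data from the study; the function names and the binary related/unrelated coding of the predicted model are assumptions. The sketch shows that the same increase in correspondence with the predicted model can arise either from related pairs becoming more similar or from unrelated pairs becoming less similar, which is why the per-condition changes need to be reported.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def summarize(pre, post, related):
    """Return (change in model fit, mean change for related pairs,
    mean change for unrelated pairs).

    pre, post: pairwise neural similarity, one value per C-item pair.
    related:   True where the pair is indirectly associated.
    """
    # Hypothetical predicted model: related pairs coded 1, unrelated 0.
    model = [1.0 if r else 0.0 for r in related]
    fit_change = pearson(post, model) - pearson(pre, model)
    diffs = [b - a for a, b in zip(pre, post)]
    related_change = mean([d for d, r in zip(diffs, related) if r])
    unrelated_change = mean([d for d, r in zip(diffs, related) if not r])
    return fit_change, related_change, unrelated_change

# Invented example values (not from the study).
related = [True, True, False, False]
pre = [0.10, 0.12, 0.11, 0.09]

# Scenario A: only the related pairs become more similar.
post_a = [p + 0.05 if r else p for p, r in zip(pre, related)]
# Scenario B: only the unrelated pairs become less similar.
post_b = [p if r else p - 0.05 for p, r in zip(pre, related)]

fit_a, rel_a, unrel_a = summarize(pre, post_a, related)
fit_b, rel_b, unrel_b = summarize(pre, post_b, related)
```

      In this toy case the change in model fit is the same in both scenarios, while the related and unrelated changes differ, so the fit metric alone cannot distinguish them.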

    1. We further identified HC-HA/PTX3 as the primary bioactive component responsible for pain inhibition.

      This is such an exciting overall result. I'm wondering if you've tested/identified any other bioactive compounds from the same material in addition to HC-HA/PTX3, and/or whether you think there may be other significant contributors to pain inhibition from human birth tissues.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      Arpin is a negative regulator of Arp2/3 activity. Here the authors investigated the role of arpin in vascular permeability using appropriate cultured human and murine endothelial monolayers and successfully developed an arpin KO mouse. The results clearly show arpin is expressed in blood vessels (not clear about lymphatics but given leaky vessels, one wonders). The data show that arpin is important for vessel barrier function, yet its genetic loss still leads to viable animals in the C57Blk strain, albeit with leaky blood vessels. The data are well presented and controls are included. However, the evidence that arpin loss/knockdown causes increased actin functions independent of Arp2/3 is based on pharmacological data and is indirect. The authors conclude that ROCK1 activity is elevated and is the cause of lost barrier function upon arpin reduction. I do have one suggestion for the authors that involves a new study in these animals, which could strengthen their proposed mechanism that the vascular defects are independent of Arp2/3 activity and rather involve ROCK1 but not ZIPK.

      (1) If arpin is working via ROCK1, as the authors infer, perhaps treatment of arpin-/- mice with ROCK1 inhibitor(s) would attenuate vessel permeability while HS38 treatment would not. This type of study would strengthen the conclusion that ROCK1, but not ZIPK, was involved. CK666, if active in mouse cells, could also be tested.

      To analyze vascular permeability in vivo, we performed Miles assays in arpin+/+ and arpin-/- mice using the inhibitors of ROCK1 (Y27632) and ZIPK (HS38). Both Y27632 and HS38 reduced the permeability caused by absence of arpin (new Figure 8E), thus confirming what we observed before in HUVEC (shown in old Figure 7). CK666 did not change the permeability in arpin-/- mice, thus confirming the conclusion that arpin does not regulate vascular permeability via Arp2/3 but rather via ROCK1/ZIPK-mediated stress fiber formation (page 13).

      (2) Fig 5. Data demonstrate that arpin regulates actin filament formation and permeability in HUVEC, but this does not demonstrate that it occurs in an Arp2/3-independent manner. If I understand your data, this is indirect evidence. One needs more information to reach this conclusion. Can the authors measure Arp2/3 activity directly and then test whether arpin knockdown and CK666 have the same capacity to reduce Arp2/3 activity in vitro?

      Arp2/3 activity cannot be measured directly. The commonly used approach is therefore Arp2/3 inhibition via CK666. Our new in vivo permeability assays (see answer above) together with our HUVEC data in Figure 5 clearly show that CK666 does not have the same effect as arpin knock-down, and neither does CK666 rescue the effects of arpin deficiency in vitro and in vivo. Together, these findings clearly suggest that arpin does not regulate endothelial permeability via Arp2/3.

      Minor issues:

      Fig 2, 3 or other Figs: In presented western blots, all proteins should include appropriate mw labels.

      Thank you. Molecular weights have been added to all Western blots.

      Fig 2. Suggest that like your arpin analysis, amounts of AP1AP and PICK1 at baseline and TNF-treatment by blotting should be included. A minor point is yellow color for labels does not stand out and should be changed to another color - as the authors used in Fig 2C.

      We have included Western blots and quantifications for PICK1 in Figure S1A and S1C. An antibody against AP1AP was unfortunately not available.

      The yellow color has been changed to purple for better visibility.

      Fig 2C. The arpin loss at junctions and actin filaments (Figure 2C) is very minor even though it reached statistical significance. It really is not an obvious loss from your 3 color overlay.

      Thank you. It is indeed hard to see. We have now included magnifications in Figure 2C that better show the loss of arpin at junctions.

      Fig 8, text 303-310 shows in vivo evidence of lung congestion and edema. There also appear to be inflammatory cells present in the images. If these are inflammatory cells, it raises the question of whether these mice have an abnormal complete blood cell count (CBC). Suggest adding CBC data for arpin-/- vs control arpin+/+ mice in Fig 8.

      The pathologist observed the presence of lymphocytes and macrophages, indicating the possibility of a (low level) chronic inflammation in arpin-deficient lungs. However, we now also performed hemograms of the mice (new Table S2) that showed no significant difference in the blood cell count of arpin-/- and arpin+/+ mice. Thus, the presence of lymphocytes and macrophages cannot be explained simply by higher leukocyte counts (page 14).

      Line 289, pg 13, Fig 8: Lung levels of arpin are not shown in Fig 8B. Authors must mean another fig?

      Sorry. Arpin protein levels in lungs are shown in Figure 8C. This has been corrected on page 13.

      Reviewer #2 (Recommendations For The Authors):

      This is a solid piece of work that adds a small amount of additional factual information to our understanding of cell-cell junctions. The experimental work is of good quality and is sufficient to support the aims of the paper. I think the value of the work is to add this small amount of new knowledge to the archive. I do not believe that further experimental work would add to the paper - it's done. But this doesn't have the impact or completeness for this journal. It belongs in a for-the-record journal.

      We appreciate your overall positive evaluation and your comments that our study represents a solid piece of work with good quality experimental work. However, we are not sure what you mean by “it belongs in a for-the-record journal”. Anyway, we agree that our study does not reveal a complete mechanism of how arpin regulates actin stress fibers, but we respectfully disagree that our study only adds a “small amount of additional factual information”. We may not have been very clear about it, but we present in this study several new discoveries and although some are descriptive in nature that does not make them trivial or less important. We provide for the first time experimental evidence that: 1) arpin is expressed in endothelial cells in vitro and in vivo, and downregulated during inflammation; 2) presence of arpin is required for proper endothelial permeability regulation and junction architecture; 3) arpin exerts these functions in an Arp2/3-independent manner; 4) arpin controls actomyosin contractility in a ROCK1- and ZIPK-dependent fashion; 5) arpin knock-out mice are viable and breed and develop normally but show histological characteristics of a vascular phenotype and increased vascular permeability that can be rescued by inhibition of ROCK1 and ZIPK. The fact that arpin fulfills its functions in endothelial cells independently of the Arp2/3 complex is of special relevance as previously the only known function of arpin was the inhibition of the Arp2/3 complex. Thus, we believe that our study adds a significant amount of new information to the literature. Thank you very much.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Summary Responses: Besides the WT allele, equivalent to the mouse TMEM173 gene, the human TMEM173 gene has two common alleles: the HAQ and AQ alleles carried by billions of people. The main conclusions and interpretation, summarized in the Title and Abstract, are i) Different from the WT TMEM173 allele, the HAQ or AQ alleles are resistant to STING activation-induced cell death; ii) STING residue 293 is critical for cell death; iii) HAQ, AQ alleles are dominant to the SAVI allele; iv) One copy of the AQ allele rescues the SAVI disease in mice. We propose that STING research and STING-targeting immunotherapy should consider human TMEM173 heterogeneity. These interpretations and conclusions were based on Data and Logic. We welcome alternative, logical interpretations and collaborations to advance the human TMEM173 research.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Aybar-Torres et al investigated the effect of common human STING1 variants on STING-mediated T cell phenotypes in mice. The authors previously made knock-in mice expressing human STING1 alleles HAQ or AQ, and here they established a new knock-in line Q293. The authors stimulated cells isolated from these mice with STING agonists and found that all three human mutant alleles resist cell death, leading to the conclusion that R293 residue is essential for STING-mediated cell death (there are several caveats with this conclusion, more below). The authors also bred HAQ and AQ alleles to the mouse Sting1-N153S SAVI mouse and observed varying levels of rescue of disease phenotypes with the AQ allele showing more complete rescue than the HAQ allele. The Q293 allele was not tested in the SAVI model. They conclude that the human common variants such as HAQ and AQ have a dominant negative effect over the gain-of-function SAVI mutants.

      Strengths:

      The authors and Dr. Jin's group previously made important observations of common human STING1 variants, and these knock-in mouse models are essential for understanding the physiological function of these alleles.

      Weaknesses:

      However, although some of the observations reported here are interesting, the data collectively does not support a unified model. The authors seem to be drawing two sets of conclusions from in vitro and in vivo experiments, and neither mechanism is clear. Several experiments need better controls, and these knock-in mice need more comprehensive functional characterization.

      (1) In Figure 1, the authors are trying to show that STING agonist-induced splenocyte cell death is blocked by HAQ, AQ and Q alleles. The conclusion at line 134 should be splenocytes, not lymphocytes. Most experiments in this figure were done with a mixed population that may involve cell-to-cell communication. Although TBK1-dependence is likely, a single inhibitor treatment of a mixed population is not sufficient to reach this conclusion.

      We greatly appreciate Reviewer 1's insights. We changed “lymphocytes” to “splenocytes” (line 133) as suggested. We respectfully disagree with Reviewer 1’s comments on TBK1. First, we used two different TBK1 inhibitors: BX795 and GSK8612. Second, because BX795 also inhibits PDK1, we used a PDK1 inhibitor, GSK2334470. Third, both BX795 and GSK8612 completely inhibited diABZI-induced splenocyte cell death (Figure 1B) (lines 128 – 133). The logical conclusion is “TBK1 activation is required for STING-mediated mouse spleen cell death ex vivo” (line 117).

      Our discovery that the common human TMEM173 alleles are resistant to STING activation-induced cell death is a substantial finding. It further strengthens the argument that the HAQ and AQ alleles are functionally distinct from the WT allele 1-3. We wish to underscore the crucial message of this study, that 'STING research and STING-targeting immunotherapy should consider TMEM173 heterogeneity in humans' (line 37), which has been largely overlooked in current STING clinical trials 4.

      Regarding STING cell death, as we stated in the Introduction (lines 65-77): i) STING-mediated cell death is cell type-dependent 5-7 and type I IFNs-independent 5,7,8. ii) The in vivo biological significance of STING-mediated cell death is not clear 7,8. iii) The mechanisms of STING cell death remain controversial. Multiple cell death pathways, i.e., apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis, are proposed 7,9,10. SAVI/HAQ, SAVI/AQ prevented lymphopenia and alleviated SAVI disease in mice. Thus, the manuscript provides some answers to the biological significance of STING cell death in vivo, which is new. Regarding the molecular mechanism, splenocytes from Q293/Q293 mice are resistant to STING cell death. The logical conclusion is that the amino acid 293 is critical for STING cell death (line 29).

      Extensive studies are needed, beyond the scope of this manuscript, on how aa293 and TBK1 mediate STING cell death to resolve the controversies in the STING cell death field (e.g. apoptosis, necroptosis, pyroptosis, ferroptosis, and PANoptosis).

      (2) The Q293 knock-in mouse needs to be characterized and compared to HAQ and AQ. Is this mutant expressed in tissues? Does this mutant still produce IFN and other STING activities? Is the protein expression level altered on Western blot? Is the mutant protein trafficking affected? In the authors' previous publications and some of the Western blots here, expression levels of each of these human STING1 proteins in mice are drastically different. HAQ and AQ also have different effects on metabolism (pmid: 36261171), which could complicate interpretation of the T cell phenotypes.

      These are very important questions that require rigorous investigations that are beyond the scope of this manuscript. This manuscript, titled “The common TMEM173 HAQ, AQ alleles rescue CD4 T cellpenia, restore T-regs, and prevent SAVI (N153S) inflammatory disease in mice”, does not focus on Q293 mice. We have been investigating the common human TMEM173 alleles since 2011, from the discovery 11, mouse model 1,3, and human clinical trial 2 to human genetics studies 3. This manuscript is another step towards understanding these common human TMEM173 alleles with the new discovery that HAQ, AQ alleles are resistant to STING cell death.

      (3) HAQ/WT and AQ/WT splenocytes are protected from STING agonist-induced cell death equally well (Figure 1G). HAQ/SAVI shows less rescue compared to AQ/SAVI. These are interesting observations, but mechanism is unclear and not clearly discussed. E.g., how does AQ protect disease pathology better than HAQ (that contains AQ)? Does Q293 allele also fully rescue SAVI?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knock-in mice, we showed that AQ T-regs have more IL-10 than HAQ T-regs 3. Thus, increased IL-10+ T-regs in AQ mice may contribute to an improved phenotype in AQ/SAVI compared to HAQ/SAVI. However, we are not excluding other contributions (e.g. metabolic difference) (lines 332-335). We are exploring these possibilities.

      (4) Figure 2 feels out of place. First of all, why are the authors using human explant lung tissues? PBMCs should be a better source for lymphocytes. In untreated conditions, both CD4 and B cells show ~30% dying cells, but CD8 cells show 0% dying cells. This raises technical concerns about the CD8 T cell property or gating strategy because in the mouse experiment (Figure 1A) all primary lymphocytes show ~30% cell death at steady-state. Second, in Figure 2C, this type of partial effect needs multiple human donors to confirm. Third, the reconstitution of THP1 cells seems out of place. STING-mediated cell death mechanisms in myeloid and lymphoid cells are likely different. If the authors want to demonstrate cell death in myeloid cells using THP1, then these reconstituted cell lines need to be better validated (expression, IFN signaling, etc.). The parental THP1 cells are HAQ/HAQ; how does that compare to the reconstitutions? There are published studies showing THP1-STING-KO cells reconstituted with human variants do not respond to STING agonists as expected. The authors need to be scientifically rigorous in validation and cautious in their interpretations.

      Figure 2 is necessary because it reveals the difference between mouse and human STING cell death, which is critical to understand STING in human health and diseases (lines 160-161). Figure 2A-2B showed that STING activation killed human CD4 T cells, but not human CD8 T cells or B cells. This observation is different from Figure 1A, where STING activation killed mouse CD4, CD8 T cells, and CD19 B cells, revealing the species-specific STING cell death responses. Regarding human CD8 T cells, as we stated in the Discussion (lines 323-325), human CD8 T cells (PBMC) are not as susceptible as the CD4 T cells to STING-induced cell death 8. We used lung lymphocytes that showed similar observations (Figure 2A). For Figure 2C, we used 2 WT/HAQ and 3 WT/WT individuals (lines 738-739). We generated HAQ, AQ THP-1 cells in STING-KO THP-1 cells (Invivogen, cat no. thpd-kostg) (lines 380-387).

      A recent study found that a new STING agonist SHR1032 induces cell death in STING-KO THP-1 cells expressing WT(R232) human STING 10 (line 182). SHR1032 suppressed THP1-STING-WT(R232) cell growth at GI50: 23 nM while in the parental THP1-STING-HAQ cells, the GI50 of SHR1032 was >103 nM 10. Cytarabine was used as an internal control where SHR1032 killed more robustly than cytarabine in the THP1-STING-WT(R232) cells but much less efficiently than cytarabine in the THP-1-STING-HAQ cells 10. 

      Our manuscript rigorously uses mouse splenocytes, human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo.

      We agree with Reviewer 1 that STING-mediated cell death mechanisms in myeloid and lymphoid cells may be different and likely contribute to the different mechanisms proposed in STING cell death research 7,9,10. Our study focuses on the in vivo STING-mediated T cellpenia.

      (5) Figure 2G, H, I are confusing. AQ is more active in producing IFN signaling than HAQ and Q is the least active. How to explain this?

      We stated in the Introduction that “AQ responds to CDNs and produce type I IFNs in vivo and in vitro 3,12,13” (lines 92-93). We reported that the AQ knock-in mice responded to STING activation 3. We previously showed that there was a negative natural selection on the AQ allele in individuals outside of Africa 3. 28% of Africans are WT/AQ but only 0.6% of East Asians are WT/AQ 3. In contrast, the HAQ allele was positively selected in non-Africans 3. Investigation to understand the mechanisms and biological significance of these naturally selected human TMEM173 alleles has been ongoing in the lab.

      (6) The overall model is unclear. If HAQ, AQ and Q are loss-of-function alleles and Q is the key residue for STING-mediated cell death, then why is AQ the most active in producing IFN signaling, and why does AQ/SAVI rescue disease most completely? If these human variants act as dominant negatives, which would be consistent with the WT/het data, then how do you explain that AQ is more dominant negative than HAQ?

      In this manuscript, Figure 6 shows AQ/SAVI had more T-regs than HAQ/SAVI (lines 251 – 261). In our previous publication on HAQ, AQ knock-in mice, we showed that AQ T-regs have more IL-10 and mitochondria activity than HAQ T-regs 3. Nevertheless, we are not excluding other contributions (e.g. metabolic difference) by the AQ allele (lines 332-335). Last, we used modern human evolution to discover the dominance of these common human STING alleles. In modern humans outside Africa, HAQ was positively selected while AQ was negatively selected 3. However, AQ is likely dominant to HAQ because there are no HAQ/AQ individuals outside Africa. The genetic dominance of common human TMEM173 alleles is a new concept. More investigation is ongoing.

      (7) As a general note, SAVI disease phenotypes involve multiple cell types. Lymphocyte cell death is only one of them. The authors' characterization of SAVI pathology is limited and did not analyze immunopathology of the lung.

      Both radioresistant parenchymal and/or stromal cells and hematopoietic cells influence SAVI pathology in mice 14,15. Nevertheless, the lack of CD4 T cells, including the anti-inflammatory T-regs, likely contributes to the inflammation in SAVI mice and patients 16. We characterized lung function, lung inflammation (Figure 4), lung neutrophils, and inflammatory monocyte infiltration (Figure S5) (lines 232-235).

      (8) Line 281, the discussion on HIV T cell death mechanism is not relevant and over-stretching. This study did not evaluate viral infection in T cells at all. The original finding of HAQ/HAQ enrichment in HIV/AIDS was 2/11 in LTNP vs 0/11 in control, arguably not the strongest statistics.

      Several publications have linked STING to HIV pathogenesis 17-22 (line 271). CD4 T cellpenia is a hallmark of AIDS. The manuscript studies STING activation-induced T cellpenia in vivo. It is not stretching to ask, for example, whether preventing STING T cell death (e.g. HAQ, AQ alleles) can restore CD4 T cell counts and improve care for AIDS patients.

      Reviewer #2 (Public Review):

      Aybar-Torres and colleagues utilize common human STING alleles to dissect the mechanism of SAVI inflammatory disease. The authors demonstrate that these common alleles alleviate SAVI pathology in mice, and perhaps more importantly use the differing functionality of these alleles to provide insight into requirements of SAVI disease induction. Their findings suggest that it is residue A230 and/or Q293 that are required for SAVI induction, while the ability to induce an interferon-dependent inflammatory response is not. This is nicely exemplified by the AQ/SAVI mice that have an intact inflammatory response to STING activation, yet minimal disease progression. As both mutants seem to be resistant to STING-dependent cell death, this manuscript also alludes to the importance of STING-dependent cell death, rather than STING-dependent inflammation, in the progression of SAVI pathology. While I have some concerns, I believe this manuscript makes some important connections between STING pathology mouse models and human genetics that would contribute to the field.

      Some points to consider:

      (1) While the CD4+ T cell counts from HAQ/SAVI and AQ/SAVI mice suggest that these T cells are protected from STING-dependent cell death, an assay that explores this more directly would strengthen the manuscript. This is also supported by Fig 2C, but I believe a strength of this manuscript is the comparison between the two alleles. Therefore, if possible, I would recommend the isolation of T cells from these mice and direct stimulation with diABZI or other STING agonist with a cell death readout.

      Please see the new Figure S3 for cell death by diABZI, DMXAA in splenocytes from WT/WT, WT/HAQ, HAQ/SAVI, AQ/SAVI mice. The HAQ/SAVI and AQ/SAVI splenocytes showed similar partial resistance to STING activation-induced cell death (lines 214-216).

      (2) Related to the above point - further exemplifying that the Q293 locus is essential to disease, even in human cells, would also strengthen the paper. It seems that CD4 T cell loss is a major component of human SAVI. While not completely necessary, repeating the THP1 cell death experiments from Fig 2 with a human T cell line would round out the study nicely.

      We examined HAQ, AQ mouse splenocytes, HAQ human lung lymphocytes, THP-1 reconstituted with HAQ, AQ, and HAQ/SAVI, AQ/SAVI mice, to demonstrate that the common human HAQ, AQ alleles are resistant to STING cell death in vitro and in vivo. Additional human T cell line work does not add too much. We hope to conduct more human PBMC or lung lymphocytes STING cell death experiments from HAQ, AQ individuals as we continue the human STING alleles investigation.

      (3) While I found the myeloid cell counts and BMDM data interesting, I think some more context is needed to fully loop this data into the story. Is myeloid cell expansion exemplified by SAVI patients? Do we know if myeloid cells are the major contributors to the inflammation these patients experience? Why should the SAVI community care about the Q293 locus in myeloid cells?

      This is likely a misunderstanding. We use BMDM for the purpose of comparing STING signaling (TBK1, IRF3, NFkB, STING activation) by WT/SAVI, HAQ/SAVI, AQ/SAVI. Ideally, we would like to compare STING signaling in CD4 T cells from WT/SAVI to HAQ/SAVI, AQ/SAVI mice. However, WT/SAVI has no CD4 T cells. Doing so, we are making the assumption that the basic STING signaling (TBK1, IRF3, NFkB, STING activation) is conserved between T cells and macrophages.

      (4) The functional assays in Figure 4 are exciting and really connect the alleles to disease progression. To strengthen the manuscript and connect all the data, I would recommend additional readouts from these mice that address the inflammatory phenotype shown in vitro in Figure 5. For example, measuring cytokines from these mice via ELISA or perhaps even Western blots looking for NFkB or STING activation would be supportive of the story. This would also allow for some tissue specificity. I believe looking for evidence of inflammation and STING activation in the lungs of these mice, for example, would further connect the data to human SAVI pathology.

      Reviewer 2 suggests looking for evidence of inflammation and STING activation in the lungs of HAQ/SAVI, AQ/SAVI. We would like to elaborate further. First, anti-inflammatory treatments, e.g. steroids, DMARDs, IVIG, Etanercept (TNF), rituximab, Nifedipine, amlodipine, et al., all failed in SAVI patients 23. JAK inhibitors on SAVI had mixed outcomes (lines 55-58). Second, Figure S5 examined lung neutrophils and inflammatory monocyte infiltration. Interestingly, while AQ/SAVI mice had a better lung function than HAQ/SAVI mice (Figure 4D, 4E vs 4H, 4I), HAQ/SAVI and AQ/SAVI lungs had comparable neutrophils and inflammatory monocyte infiltration (Figure S5). Last, SAVI is classified as type I interferonopathy 23, but the lung diseases of SAVI are mainly independent of type I IFNs 24-27. The AQ allele suppresses SAVI in vivo. Understanding the mechanisms by which AQ rescues SAVI may lead to curative care for SAVI patients.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      One suggestion is to streamline this study by focusing on STING-mediated cell death only in CD4 T cells. The authors can use in vitro PBMC isolated human T cells, ex vivo T cells from the knock-in mice, and in vivo T cells from the SAVI breeding. The current manuscript includes myeloid cell death, Tregs, complex SAVI disease pathology, which is too confusing and too complex to explain with the varying effect from the three human STING1 variants.

      We sincerely appreciate Reviewer 1’s suggestion. The goal of our human STING alleles research has always been translational, i.e. improving human health. Even as a monogenic disease, the SAVI pathology is still complex. For example, though classified as a type I interferonopathy, SAVI is largely independent of type I IFNs. Similarly, STING activation-induced cell death, while contributing to SAVI, is not the whole story, as the Reviewer pointed out in Comments 3, 6, and 7. HAQ/SAVI mice still died early and had lung dysfunction (Figure 4). In contrast, AQ/SAVI mice had restored lifespan and lung function. Figure 6 shows different T-regs between AQ/SAVI and HAQ/SAVI mice. In addition, AQ mice had more IL-10+ T-regs than HAQ mice 3. Therefore, we are excited about developing AQ-based curative therapy for SAVI patients (preventing cell death and inducing immune tolerance). Again, we thank the Reviewer for the suggestion. Additional research is ongoing.

      Reviewer #2 (Recommendations For The Authors):

      Minor points

      (1) Generation of THP1 cells with the human STING alleles is missing from methods.

      We added the protocol in the methods (lines 380-387). The THP-1 KO line stably expressing WT STING was first described by Weikang Tao’s group 10.

      (2) Some abbreviations are not expanded (CDA).

      CDA is expanded as cyclic di-AMP (e.g. line 375).

      References.

      (1) Patel, S. et al. The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele. J Immunol 198, 776-787 (2017).

      (2) Sebastian, M. et al. Obesity and STING1 genotype associate with 23-valent pneumococcal vaccination efficacy. JCI Insight 5 (2020).

      (3) Mansouri, S. et al. MPYS Modulates Fatty Acid Metabolism and Immune Tolerance at Homeostasis Independent of Type I IFNs. J Immunol 209, 2114-2132 (2022).

      (4) Sivick, K. E. et al. Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4183-4185 (2017).

      (5) Gulen, M. F. et al. Signalling strength determines proapoptotic functions of STING. Nat Commun 8, 427 (2017).

      (6) Kabelitz, D. et al. Signal strength of STING activation determines cytokine plasticity and cell death in human monocytes. Sci Rep 12, 17827 (2022).

      (7) Murthy, A. M. V., Robinson, N. & Kumar, S. Crosstalk between cGAS-STING signaling and cell death. Cell Death Differ 27, 2989-3003 (2020).

      (8) Kuhl, N. et al. STING agonism turns human T cells into interferon-producing cells but impedes their functionality. EMBO Rep 24, e55536 (2023).

      (9) Li, C., Liu, J., Hou, W., Kang, R. & Tang, D. STING1 Promotes Ferroptosis Through MFN1/2-Dependent Mitochondrial Fusion. Front Cell Dev Biol 9, 698679 (2021).

      (10) Song, C. et al. SHR1032, a novel STING agonist, stimulates anti-tumor immunity and directly induces AML apoptosis. Sci Rep 12, 8579 (2022).

      (11) Jin, L. et al. Identification and characterization of a loss-of-function human MPYS variant. Genes Immun 12, 263-269 (2011).

      (12) Yi, G. et al. Single nucleotide polymorphisms of human STING can affect innate immune response to cyclic dinucleotides. PLoS One 8, e77846 (2013).

      (13) Patel, S. et al. Response to Comment on "The Common R71H-G230A-R293Q Human TMEM173 Is a Null Allele". J Immunol 198, 4185-4188 (2017).

      (14) Gao, K. M. et al. Endothelial cell expression of a STING gain-of-function mutation initiates pulmonary lymphocytic infiltration. Cell Rep 43, 114114 (2024).

      (15) Gao, K. M., Motwani, M., Tedder, T., Marshak-Rothstein, A. & Fitzgerald, K. A. Radioresistant cells initiate lymphocyte-dependent lung inflammation and IFNgamma-dependent mortality in STING gain-of-function mice. Proc Natl Acad Sci U S A 119, e2202327119 (2022).

      (16) Hu, W. et al. Regulatory T cells function in established systemic inflammation and reverse fatal autoimmunity. Nat Immunol 22, 1163-1174 (2021).

      (17) Monroe, K. M. et al. IFI16 DNA sensor is required for death of lymphoid CD4 T cells abortively infected with HIV. Science 343, 428-432 (2014).

      (18) Doitsh, G. et al. Cell death by pyroptosis drives CD4 T-cell depletion in HIV-1 infection. Nature 505, 509-514 (2014).

      (19) Jakobsen, M. R., Olagnier, D. & Hiscott, J. Innate immune sensing of HIV-1 infection. Curr Opin HIV AIDS 10, 96-102 (2015).

      (20) Silvin, A. & Manel, N. Innate immune sensing of HIV infection. Curr Opin Immunol 32, 54-60 (2015).

      (21) Altfeld, M. & Gale, M., Jr. Innate immunity against HIV-1 infection. Nat Immunol 16, 554-562 (2015).

      (22) Krapp, C., Jonsson, K. & Jakobsen, M. R. STING dependent sensing - Does HIV actually care? Cytokine Growth Factor Rev 40, 68-76 (2018).

      (23) Liu, Y. et al. Activated STING in a vascular and pulmonary syndrome. N Engl J Med 371, 507-518 (2014).

      (24) Luksch, H. et al. STING-associated lung disease in mice relies on T cells but not type I interferon. J Allergy Clin Immunol 144, 254-266 e258 (2019).

      (25) Stinson, W. A. et al. The IFN-gamma receptor promotes immune dysregulation and disease in STING gain-of-function mice. JCI Insight 7 (2022).

      (26) Warner, J. D. et al. STING-associated vasculopathy develops independently of IRF3 in mice. J Exp Med 214, 3279-3292 (2017).

      (27) Fremond, M. L. et al. Overview of STING-Associated Vasculopathy with Onset in Infancy (SAVI) Among 21 Patients. J Allergy Clin Immunol Pract 9, 803-818 e811 (2021).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, the authors provide a comprehensive description of transcriptional regulation in Pseudomonas syringae by investigating the binding characteristics of various transcription factors. They uncover the hierarchical network structure of the transcriptome by identifying top-, middle-, and bottom-level transcription factors that govern the flow of information in the network. Additionally, they assess the functional variability and conservation of transcription factors across different strains of P. syringae by studying DNA-binding characteristics. These findings notably expand our current knowledge of the P. syringae transcriptome.

      The findings associated with crosstalk between transcription factors and pathways, and the diversity of transcription factor functions across strains provide valuable insights into the transcriptional regulatory network of P. syringae. However, these results are at times underwhelming as their significance is unclear. This study would benefit from a discussion of the implications of transcription factor crosstalk on the functioning of the organism as a whole. Additionally, the implications of variability in transcription factor functions on the phenotype of the strains studied would further this analysis.

      Overall, this manuscript serves as a key resource for researchers studying the transcriptional regulatory network of P. syringae.

      Thank you for your positive comments.

      Reviewer #2 (Public Review):

      Summary:

      The phytopathogenic bacterium Pseudomonas syringae is comprised of many pathovars with different host plant species and has been used as a model organism to study bacterial pathogenesis in plants. Transcriptional regulation is key to plant infection and adaptation to host environments by this bacterium. However, researchers have focused on a limited number of transcription factors (TFs) that regulate virulence-related pathways. Thus, a comprehensive, systems-level understanding of regulatory interactions between transcription factors in P. syringae has not been achieved.

      This study by Sun et al performed ChIP-seq analysis of 170 out of 301 TFs in P. syringae pv. syringae 1448A and used this unique dataset to infer transcriptional regulatory networks in this bacterium. The network analyses revealed hierarchical interactions between TFs, various network motifs, and co-regulation of target genes by TF pairs, which collectively mediate information flow. As discussed, the structure and properties of the P. syringae transcriptional regulatory networks are somewhat different from those identified in humans, yeast, and E. coli, highlighting the significance of this study. Further, the authors made use of the P. syringae transcriptional regulatory networks to find TFs of unknown functions to be involved in virulence-related pathways. For some of these TFs, their target specificity and biological functions, such as motility and biofilm formation, were experimentally validated. Of particular interest is the finding that despite conservation of TFs between P. syringae pv. syringae 1448A, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a, and P. syringae pv. actinidiae C48, some of the conserved TFs show different repertoires of target genes in these four P. syringae strains.

      Thank you for your positive comments.

      Strengths:

      This study presents a systems-level analysis of transcriptional regulatory networks in relation to P. syringae virulence and metabolism, and highlights differences in transcriptional regulatory landscapes of conserved TFs between different P. syringae strains, and develops a user-friendly database for mining the ChIP-seq data generated in this study. These findings and resources will be valuable to researchers in the fields of systems biology, bacteriology, and plant-microbe interactions.

      Thank you for your positive comments.

      Weaknesses:

      No major weaknesses were found, but some of the results may need to be interpreted with caution. ChIP-seq was performed with bacterial strains overexpressing TFs. This may cause artificial binding of TFs to promoters which may not occur when TFs are expressed at physiological levels. Another caution is applied to the interpretation of the biological functions of TFs. The biological roles of the tested TFs are based on in vitro experiments. Thus, functional relevance of the tested TFs during plant infection and/or survival under natural environmental conditions remains to be demonstrated.

      Thank you for your comments; we agree with the reviewer. To rule out artificial binding caused by TF overexpression, we performed EMSA on the identified targets. The EMSA results confirmed the binding peaks.

      To verify the biological functions of the TFs, we also performed in vivo motility and biofilm production assays (Figures 3b-d). To further examine these functions, we performed a plant infection assay for TF PSPPH2193 under natural environmental conditions (bean leaves). As shown in Figures S6c and g, both the motility and the virulence of the ∆PSPPH2193 strain were significantly reduced compared with the WT strain. These results show that TF PSPPH2193 positively regulates the pathogenicity of P. syringae by modulating bacterial motility.

      Reviewer #3 (Public Review):

      Summary:

      This study aims to understand gene regulation of the plant bacterial pathogen Pseudomonas syringae. Although the function of some TFs has been characterized in this strain, a global picture of the gene regulatory network remains elusive. The authors conducted a large-scale ChIP-seq analysis, covering 170 out of 301 TFs of this strain, and revealed gene regulatory hierarchy with functional validation of some previously uncharacterized TFs.

      Thank you for your positive comments.

      Strengths:

      - This study provides one of the largest ChIP-seq datasets for a single bacterial strain, covering more than half of its TFs. This impressive resource enabled comprehensive systems-level analysis of the TF hierarchy.

      - This study identified novel gene regulation and function with validations through biochemical and genetic experiments.

      - The authors attempted broad analyses including comparisons between different bacterial strains, providing further insights into the diversity and conservation of gene regulatory mechanisms.

      Thank you for your positive comments.

      Weaknesses:

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      Thank you for your comments. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details to the revised manuscript.

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Thank you for your comments, and we are sorry for the confusion. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised legend.

      For Figure S3a, the low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs tended to regulate target genes alone rather than co-regulate them with other top-level TFs. PSPPH4700 is one example of this pattern. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’

      We analyzed the high co-association scores of 125 TFs across the three levels and then determined the co-association patterns. To identify the co-association tendency of these 125 TFs, we classified the co-association patterns into 4 clusters. Bottom-level TFs tended to co-regulate target genes with other TFs. We revised the sentence in the manuscript accordingly.

      For Figure 2b, in C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially other bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in the target genes of MexT were analyzed together with those of the TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (above 60) with more of the top-level TFs. We therefore concluded that MexT had closer co-association relationships with top-level TFs. We added this statement to the revised manuscript.

      For Figure 1a, the hierarchical network contained different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding showed a downward direction of transcriptional regulation in P. syringae. We revised the statement in the manuscript accordingly.

      (3) The Method section lacks depth, especially in data analyses. It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comments. We defined the intergenic region preceding each TF sequence as the promoter region. As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site following the promoter. TF protein expression was driven by the plasmid promoter. Psph 1448A was used for our main ChIP-seq. We added the details to the revised manuscript.

      For Figure S3, we performed GO analysis on genes that were co-bound by TF pairs. We added the details in the revised manuscript.

      We shared our analysis code online (https://github.com/dengxinb2315/PS-PATRnet-code) and listed it in the Data Availability section.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      (1) The specific strain of Pseudomonas syringae used in the study outside of the evolutionary analysis should be specified in the abstract and main text.

      Thank you for your suggestion. We revised the statements in the abstract and main text to specify the strains.

      (2) The language used throughout the manuscript should be revised for clarity, conciseness, and readability.

      Thank you for your suggestion. The language throughout the manuscript has been revised by a scientific editor who is a native speaker of English.

      (2) Line 688: Replace "80C" with "-80C".

      Thank you for your correction. We revised ‘80℃’ to ‘-80℃’. Please see Line 713.

      (3) Line 172 - 173: The abbreviations TT, MM, BB, TM, TB, and MB need to be expanded in the main text before their use.

      Thank you for your suggestion. We added definitions of the abbreviations TT, MM, BB, TM, TB, and MB to the manuscript. Please see Lines 172-174.

      Reviewer #2 (Recommendations For The Authors):

      Major points

      (1) The name of the P. syringae strains used in each experiment/analysis should be explicitly stated (most experiments were carried out with P. syringae strain 1448A). This should also be applied to the introduction where many papers on P. syringae are cited without clear indication of strain names. I think this amendment is essential because target genes and thus biological functions of TFs could be different between P. syringae strains, as shown in the present study.

      Thank you for your suggestion. We specified the P. syringae strains in the citations throughout the manuscript.

      (2) How many TFs were analyzed throughout the study? Most sentences including line 22 in the abstract say 170, but I also found some say 270 (for example, line 106 and line 149). The legend of Figure 1 says 262. More detailed information is required regarding the datasets used for each analysis.

      Thank you for your suggestion. In this study, 170 TFs were analyzed by ChIP-seq; in our previous research, 100 TFs were analyzed by HT-SELEX. The hierarchical analysis integrated the ChIP-seq and HT-SELEX data and therefore included 270 TFs. As 8 TFs did not show hierarchical characteristics, the legend of Figure 1 reports 262 TFs. We added the data sources to the revised manuscript. Please see Lines 104, 147, 160 and 1082.

      (3) Figure 1b: Please define "indirect interaction" and "cooperativity" in the legend as well as in the text. I only found the definition of "direct interaction".

      Sorry for the missing information. We defined ‘indirect interaction’ and ‘cooperativity’ as ‘co-association’ and ‘if the common target of two TFs is from a TF’, respectively. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised legend. Please see Lines 174-176 and 1084-1086.

      (4) I found it very interesting that conserved TFs show different repertoires of target genes in different P. syringae strains. This suggests the rewiring of transcriptional regulatory networks in P. syringae strains, but the underlying mechanism is not explored in the current manuscript. It can be easily tested whether these conserved TFs bind to similar or different motifs by motif enrichment analysis. If they bind to similar motifs, it is possible that the promoter sequences of their target genes have diversified. Addressing or at least discussing these points would provide molecular insights into the diversification of the transcriptional regulatory networks in P. syringae. Similarly, functional enrichment analysis of target genes can be used to test whether the conserved TFs regulate different biological processes.

      Thank you for your suggestion. We added motif analysis and functional enrichment analysis of the target genes of two TFs (PSPPH3122 and PSPPH4127) in different P. syringae strains. We found two different motifs (AGACN4GATCAA and CGGACGN3GATCA) in the 1448A and DC3000 strains, respectively. We also performed GO analysis and found functions specific to PSPPH3122 in Psph 1448A compared with the Pst DC3000 and Pss B728a strains, including recombinase activity and DNA recombination. For PSPPH4127, we found four different motifs in the four P. syringae strains. GO analysis showed its relationship with recombinase activity in the Psph 1448A strain, and with RNA binding, structural constituent of ribosome, translation and ribosome in the Pss B728a strain. These results indicated the high functional diversity of TFs in P. syringae. We added these points to the Results section and added Figures S9-S10 to the revised manuscript. Please see Lines 497-509.
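
      The motif notation in this reply uses N followed by a count for a spacer of unspecified bases (N4 stands for four arbitrary nucleotides). The following is a minimal Python sketch of how such a consensus can be matched against a sequence. It assumes a plain regular-expression scan on one strand, which is not the motif-discovery procedure used in the study. The function names and the two promoter fragments are hypothetical and were invented for illustration only.

```python
import re


def motif_to_regex(motif):
    """Expand a consensus such as 'AGACN4GATCAA' into a regular expression.

    'N' followed by a count becomes a spacer of that many arbitrary bases;
    a bare 'N' matches exactly one base.
    """
    def expand(match):
        count = match.group(1) or "1"
        return "[ACGT]{%s}" % count

    return re.sub(r"N(\d*)", expand, motif.upper())


def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(%s))" % motif_to_regex(motif))
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical promoter fragments, not taken from any genome.
promoter_a = "TTGAGACCTGAGATCAATCC"
promoter_b = "GGCGGACGTTAGATCATTAG"

print(find_motif(promoter_a, "AGACN4GATCAA"))   # motif reported for 1448A
print(find_motif(promoter_b, "CGGACGN3GATCA"))  # motif reported for DC3000
print(find_motif(promoter_b, "AGACN4GATCAA"))   # no match expected here
```

      A scan of this kind only tests for exact consensus matches; real binding sites are usually scored with position weight matrices, which tolerate mismatches.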

      (5) Related to point 4, it would be quite useful if a list of orthologous genes of 1448A TFs in the other tested P. syringae strains were provided. Such information may also enhance the utility of the database developed in this study.

      Thank you for your suggestion. We added the list of orthologous genes of the 301 Psph 1448A TFs in the other tested P. syringae strains as Supplementary Table 5. Please see Line 467 and Supplementary Table 5.

      (6) Lines 243-246: It is unclear how these functional enrichment analyses were performed. Did you use target genes regulated by individual TFs or those coregulated by pairs of TFs? Please add more information for the sake of readers.

      Thank you for your suggestion. We performed the functional enrichment analyses with a hypergeometric test (BH-adjusted p < 0.05), using the target genes regulated by individual TFs. We added the details to the Results section. Please see Lines 248-252, 270, 1194-1195, 1199-1200 and 1205-1206.
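
      For readers unfamiliar with this procedure, the following is a minimal Python sketch of a hypergeometric enrichment test followed by Benjamini-Hochberg (BH) adjustment. It is not the authors' code: the function names, the genome size of 5000 genes, the 20 target genes and the three functional terms are all hypothetical values chosen for illustration.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k).

    N: genes in the genome, K: genes annotated with the term,
    n: target genes of the TF, k: target genes annotated with the term.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical input: (term name, annotated genes in genome, annotated targets).
terms = [("term_small", 3, 1), ("term_medium", 40, 2), ("term_large", 400, 2)]
raw = [hypergeom_pvalue(k, 20, K, 5000) for _, K, k in terms]
adjusted = benjamini_hochberg(raw)
for (name, _, _), p, q in zip(terms, raw, adjusted):
    print(f"{name}: p={p:.4f}, BH-adjusted p={q:.4f}")
```

      With these invented inputs, the term containing a single target gene has a raw p-value of about 0.012, because only 3 genes in the whole genome carry that annotation. This illustrates how a term represented by one gene can still pass a 0.05 threshold when the term itself is small.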

      Minor points

      (1) Lines 167-168: I may not understand correctly, but you might want to say "downward-pointing edges" instead of "upward-pointing edges".

      Thank you for the correction. We revised ‘upward-pointing edges’ to ‘downward-pointing edges’. Please see Line 166.

      (2) Line 174: "physical interactions" should be amended to "direct interactions".

      Thank you for the correction. We revised ‘physical interactions’ to ‘direct interactions’. Please see Line 177.

      (3) Line 224: Could you please explain why bacterial growth in plant tissues is considered an example of "multi-stability"?

      Thank you for your suggestion. We are sorry for the incorrect statement. We intended ‘plant intercellular spaces’ as an example of a multi-stability scenario. We revised the sentence to ‘These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces’. Please see Lines 224-226.

      (4) Line 254-257: Here, the definition of "tether binding" is introduced, but it is not very clear to me. In my understanding, tethered binding is an indirect binding of a TF to a target gene through protein-protein interaction with other TF that directly binds to the promoter of the target gene.

      Thank you for your suggestion, and we agree with you. We referred to the paper published in 2012 (Wang et al., 2012) and revised the statement of ‘tether binding’ to ‘This finding suggested that these TFs indirectly regulated target genes through protein-protein interaction with other TFs that directly binds to the promoters of target genes, a phenomenon defined as tethered binding’. Please see Lines 259-262.

      (5) Lines 341-343: Figure 3b shows qRT-PCR of hopAE1, not hrpR.

      Thank you for your correction. We revised ‘hrpR’ to ‘hopAE1’. Please see Line 349.

      (6) Lines 500 and Figure 6b: It is hard to see edges from module 12 to others. So, it would be better to provide numeric information (number of TFs and target genes) in the text.

      Thank you for your suggestion. Module 12 includes 22 TFs and 318 target genes. We added the statement of numeric information about Module 12 in the revised manuscript. Please see Lines 536-537.

      (7) Line 519: Figure S4b is not the EMSA data for PSPPH3798. Should it be Figure S4e?

      Thank you for your correction. We revised to ‘Figure S4e’. Please see Line 545.

      (8) Line 522: Figure S6b is not relevant to the statement here.

      Thank you for your correction. We deleted the ‘Figure S6b’ here. Please see Line 547.

      (9) Line 593: prokaryotic transcriptional regulatory networks -> eukaryotic transcriptional regulatory networks?

      Thank you for your correction. We revised ‘prokaryotic transcriptional regulatory networks’ to ‘eukaryotic transcriptional regulatory networks’. Please see Line 618.

      (10) Figure S3 requires images of higher resolution. Especially, values for the color codes are not readable or very hard to see.

      Thank you for your suggestion. To make the images clearer, we enlarged them, changed the color codes, and divided the figure into three figures. Please see the revised Figures S3-S5 and the corresponding figure legends at Lines 1191-1206.

      Reviewer #3 (Recommendations For The Authors):

      (1) Some conclusions are not backed by quantitative or statistical analyses, and they are sometimes overinterpreted.

      L221: "Taken together, the simplest and most effective submodule M1 and the coregulatory submodule M13 played crucial roles in the transcriptional regulation of TFs in P. syringae."

      The authors did not provide any evidence supporting the functional importance of any of these submodules. M13 is most enriched within the locked loop, but its size is much smaller than simple loops. What evidence supports the importance of this particular submodule?

      Thank you for your suggestion. In the eukaryote Saccharomyces cerevisiae and the prokaryote Escherichia coli, which have the best-characterized transcriptional regulatory networks, the feed-forward loop (called M13 in this article) appears many times in the networks and performs different biological functions. M1 appeared more frequently than the other modules by an order of magnitude. We revised the sentence to ‘Taken together, the most numerous but simplest submodule M1 played a crucial role in the transcriptional regulation of TFs in P. syringae.’ Please see Lines 222-224.

      L223: "...we found 92 auto-regulators...These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as in plant intercellular spaces where bacteria grow (Figure 1d)(Alon, 2007). These regulators are regarded as bistable switches that further influence the expression of downstream genes."

      Are these claims supported by any evidence?

      Thank you for your suggestion. We referred to the following articles:

      (1) Alon. Nature Reviews Genetics. 2007 (Alon, 2007).

      Negative autoregulation occurs when a transcription factor represses the transcription of its own gene. Such negative autoregulators account for about half of the repressors in E. coli and occur in many eukaryotes. By suppressing its own expression, the repressor controls the concentration of its product, which speeds up the return of the cell to the steady state.

      (2) Becskei et al. Nature. 2000; Rosenfeld et al. Journal of Molecular Biology. 2002 (Becskei & Serrano, 2000; Rosenfeld, Elowitz, & Alon, 2002).

      Fluorescence assays confirmed that the negative autoregulatory module (the negative autoregulator TetR) took less time to reach the log phase than the unregulated group, and that it reduced cell-to-cell fluctuations in the steady-state level of the transcription factor. Several negative autoregulators were shown there, such as LexA, CysB and SrlA-D.

      In our research, we also identified many autoregulators including CysB and LexA2 (annotated as LexA repressor). We revised the sentence to ‘In addition, we found 92 auto-regulators in our hierarchy network. These auto-regulators are important and always act as repressors in scenarios of multi-stability, such as plant intercellular spaces (Figure 1d) (Alon, 2007). For example, LexA and CysB as negative autoregulators were indicated to reduce cell-to-cell fluctuations in the steady-state level of the transcription factor (Becskei & Serrano, 2000; Rosenfeld et al. 2002).’. Please see Lines 224-229.

      L265: "This finding indicated that the bottom-level TFs, which were more easily regulated, tended to cooperate with downstream genes and other intra-level TFs."

      Could the authors provide more explanation to reach this conclusion from the data? Analyzing the number of highly co-accessing TFs does not sufficiently support this conclusion. The clustering of TFs (C1-C4) is incomplete, and each TF level (Top/Middle/Bottom) contains different numbers of TFs. Since the authors calculated all-by-all co-association scores for these 125 TFs, they can group these scores into 6 possible combinations (TT, TM, TB, MM, MB, BB) and show the distribution of co-association scores.

      Thank you for your suggestion. We indicated that the bottom-level TFs preferred to regulate target genes in cooperation with other TFs. To further support this claim, we calculated the proportion of interactions involving bottom-level TFs among all TF-pair interactions and among direct interactions, based on the results in Figure 1B. The interactions of bottom-level TFs accounted for 43% and 49%, respectively. By contrast, the interactions of top-level and middle-level TFs accounted for only 20% and 28%, respectively. We revised the statement to ‘Based on the analysis in Figure 1B, we found that the proportions of bottom-level TF interaction in all the TF pair interactions and direct interaction were 43% and 49%. These results indicated that the bottom-level TFs tended to regulate downstream genes through cooperating with other level TFs.’ in the revised manuscript. Please see Lines 269-272.

      As not every TF showed co-association with other TFs, we collected only the 125 TFs that had co-association scores. To determine the number of TFs at each level, we divided the TFs into three levels according to hierarchy height: heights from -1 to -0.3 represent the bottom level, heights from -0.3 to 0.3 represent the middle level, and heights from 0.3 to 1 represent the top level. The levels were divided by height scores into approximately equal intervals. We suggest that the different numbers of TFs at the three levels reflect a characteristic of transcriptional regulation in P. syringae.
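
      The level assignment described in this paragraph can be sketched in Python as follows. This is an illustration, not the authors' implementation. The reply does not state which level receives a TF whose height falls exactly on -0.3 or 0.3, so placing both boundary values in the middle level is an assumption, and the TF names and heights are hypothetical.

```python
from collections import Counter


def hierarchy_level(height):
    """Map a hierarchy height in [-1, 1] to 'bottom', 'middle' or 'top'.

    Assumption: the boundary values -0.3 and 0.3 are assigned to the
    middle level; the source text does not specify this.
    """
    if not -1.0 <= height <= 1.0:
        raise ValueError("hierarchy height must lie between -1 and 1")
    if height < -0.3:
        return "bottom"
    if height <= 0.3:
        return "middle"
    return "top"


# Hypothetical TF names and heights, invented for illustration.
heights = {"TF_A": 0.85, "TF_B": 0.10, "TF_C": -0.30, "TF_D": -0.75, "TF_E": -0.95}
levels = {tf: hierarchy_level(h) for tf, h in heights.items()}

print(levels)
print(Counter(levels.values()))  # number of TFs per level
```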

      Thank you for your suggestion. As the co-association patterns were determined from the co-association scores of TFs at the same level, we first grouped the co-association scores into 3 possible TF pairs (TT, MM, and BB, in Figures S3a, S4a and S5a). Our results indicated that higher co-association scores tended to occur among bottom-level TFs. We revised the statement in the manuscript accordingly. Please see Lines 244-252.
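
      The grouping of pairwise scores by level combination, covering both the three same-level classes used here and the six classes proposed by the reviewer, can be sketched in Python as follows. This is a hypothetical illustration: the function names, TF names, level assignments and scores are invented, and the scores use an arbitrary 0 to 1 scale.

```python
from collections import defaultdict
from statistics import median

# Fixed ordering so that a pair is always labelled TM, TB or MB,
# never MT, BT or BM.
LEVEL_ORDER = {"T": 0, "M": 1, "B": 2}


def pair_class(level_a, level_b):
    """Return an order-independent label such as 'TM' for two levels."""
    return "".join(sorted((level_a, level_b), key=LEVEL_ORDER.__getitem__))


def group_scores(levels, scores):
    """Group co-association scores by the level combination of each TF pair."""
    grouped = defaultdict(list)
    for (tf_a, tf_b), score in scores.items():
        grouped[pair_class(levels[tf_a], levels[tf_b])].append(score)
    return dict(grouped)


# Hypothetical levels and pairwise scores, invented for illustration.
levels = {"TF_A": "T", "TF_B": "T", "TF_C": "M", "TF_D": "B", "TF_E": "B"}
scores = {
    ("TF_A", "TF_B"): 0.05,
    ("TF_C", "TF_A"): 0.20,
    ("TF_D", "TF_A"): 0.30,
    ("TF_D", "TF_C"): 0.45,
    ("TF_D", "TF_E"): 0.80,
    ("TF_E", "TF_C"): 0.55,
}

grouped = group_scores(levels, scores)
for label, values in sorted(grouped.items()):
    print(label, len(values), median(values))
```

      Comparing the per-class distributions (for example with box plots or a rank-based test) would then show whether high scores concentrate in particular level combinations.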

      (2) Some figures and analyses are not well explained, and I was not able to understand them.

      Figure 1b: The terms "direct," "indirect," and "cooperativity" require further clarification as their definitions in the text (L169-183) are unclear to me. This ambiguity hampers the evaluation of the authors' discussion regarding TF-TF interactions (L561-584), an important theme of this study. The figure includes concepts discussed in later sections (e.g., cooperativity), making it difficult to understand. A diagram explaining these concepts would be highly helpful for readers to understand.

      Sorry for the missing information. We defined ‘indirect interaction’ as ‘co-association’ and ‘cooperativity’ as ‘if the common target of two TFs is from a TF’. We added the definitions of ‘indirect interaction’ and ‘cooperativity’ to the revised manuscript and legend. Please see Lines 174-176 and 1085-1087.

      L253: "Notably, we found that TFs at the top level, without cooperating TFs, exhibited a large number of binding peaks (Figure S3a)."

      I could not understand this sentence. Did the authors mean that top-level TFs with a large number of peaks showed a low level of co-association? If so, does this data suggest that these TFs do not tend to cooperate with other TFs? I was confused by the discussion in L253-L261.

      Thank you for your comment, and we agree with you. The low co-association scores and large peak numbers of these top-level TFs indicated that top-level TFs tended to regulate target genes alone rather than co-regulate them with other top-level TFs.

      Thank you for your comment. In L253-256, PSPPH4700 serves as an example showing that top-level TFs with low co-association scores and large peak numbers tend to regulate target genes alone rather than co-regulate them with other top-level TFs. We revised the sentence to ‘For example, the top-level TF PSPPH4700 yielded over 1,700 peaks, but cooperated with only 24 top-level TFs with low co-association scores about 0.05 (Supplementary Table 2b).’

      In L257-261, we analyzed the high co-association scores of 125 TFs across the three levels and then determined the co-association patterns. To identify the co-association tendency of these 125 TFs, we classified the co-association patterns into 4 clusters. Bottom-level TFs tended to co-regulate target genes with other TFs. We revised the sentence. Please see Lines 262-264, 265-266 and 269-272.

      L287: "The analysis of the peak locations of MexT demonstrated that MexT showed closer co-association relationships with top-level TFs (Figure 2b)."

      I could not reach this conclusion by seeing Figure 2b. Additional explanation and/or data visualization would be appreciated.

      Thank you for your suggestion. In C1, C2 and C4, many bottom-level TFs showed co-association patterns with other TFs, especially other bottom-level TFs (shown in C4). To explore the regulatory pattern in C3, the peak locations in the target genes of MexT were analyzed together with those of the TFs in C3. Seven top-level TFs (PSPPH1435, PSPPH1758, PSPPH2193, PSPPH2454, PSPPH4638, PSPPH4998 and PSPPH3411), three middle-level TFs (PSPPH1100, PSPPH5132 and PSPPH5144) and four bottom-level TFs (PSPPH0700, PSPPH2300, PSPPH2444 and PSPPH2580) were compared with MexT. MexT showed higher co-association scores (above 60) with more of the top-level TFs. We therefore concluded that MexT had closer co-association relationships with top-level TFs. We added this statement to the revised manuscript. Please see Lines 291-296.

      Figure 6cd: What kind of enrichment analysis did the authors perform? Was any statistical test used? The figure only shows the number of genes, and sometimes the number is only 1 for a functional category. Can it be considered as significant enrichment?

      Thank you for your comment. We used a hypergeometric test in this analysis. Although only one gene was enriched in some pathways, the adjusted p-value was less than 0.05. We added the details to the revised manuscript. Please see Lines 533-534.

      L169: "The hierarchical network revealed a downward information flow, suggesting the prioritization of collaboration between different hierarchy levels."

      Can the authors please explain the logic behind this statement more in detail?

      Thank you for your comment. The hierarchical network contained different numbers of TFs at the three levels (54 top-level TFs, 62 middle-level TFs and 147 bottom-level TFs), which indicated that more than half of the TFs (the bottom-level TFs) tend to be regulated by other TFs and then bind directly to target genes. This finding showed a downward direction of transcriptional regulation in P. syringae. We revised the statement in the manuscript accordingly. Please see Lines 167-170.

      (3) The Method section lacks depth, especially on data analyses.

      How did the authors define promoter regions of each gene? How were operons treated in their analyses? Was P. syringae 1448A used for their main ChIP-seq?

      Thank you for your comment. We defined the intergenic region before each TF sequence as the promoter region.

      As the pHM1 plasmid carries its own constitutive promoter (lacZ promoter), we amplified the TF-coding sequence and cloned it into the site following the promoter. TF protein expression was driven by the plasmid promoter.

      P. syringae 1448A was used for our main ChIP-seq. We added the details in the revised manuscript. Please see Lines 705 and 727-730.

      Figure S3: I am not sure how the GO analyses were done. For example, in the case of the top-level TF PSPPH4700, did the authors perform GO analysis on genes that are co-bound by PSPPH4700 and any other top-level TFs?

      Thank you for your comment and we agree with you. We performed GO analysis on genes that were co-bound by TF pairs in the same level. We added the details in the revised manuscript. Please see Lines 248-252.

      The analysis presented in Figure 6a needs more explanation of the methodology employed by the authors.

      Thank you for your comment. We added more details for the analysis in Figure 6a. Please see Lines 514-522.

      It is strongly recommended that the authors share their analysis codes so that others can reproduce the analyses.

      Thank you for your comment. We shared our analysis codes on the website (https://github.com/dengxinb2315/PS-PATRnet-code) in the Data Availability. Please see Lines 800-801.

      (4) Other:

      Figure 3: I suggest putting additional panel labels to facilitate the interpretation of the figure.

      Thank you for your suggestion. We added detailed labels in the revised Figures 3 and 4.

      I spotted several potential errors:

      L106: 170 TFs?

      Thank you for your comment, and we are sorry for the missing details. For the hierarchical network, we integrated the DNA-binding data of 170 TFs in this study and 100 TFs in our previous SELEX research. We added the details in the revised manuscript. Please see Lines 104, 147 and 159-160.

      L592: P. syringae not E. coli?

      Thank you for your comment. Here we discussed the hierarchical characteristics in E. coli. We revised the statement in the revised manuscript. Please see Line 618.

      L593: eukaryotic not prokaryotic?

      Thank you for your correction. Here we discussed the feedforward loops in our study. We revised the statement in the revised manuscript. Please see Line 618.

      References

      Alon, U. (2007). Network motifs: theory and experimental approaches. Nature Reviews Genetics, 8(6), 450-461.

      Becskei, A., & Serrano, L. (2000). Engineering stability in gene networks by autoregulation. Nature, 405(6786), 590-593.

      Rosenfeld, N., Elowitz, M. B., & Alon, U. (2002). Negative autoregulation speeds the response times of transcription networks. Journal of molecular biology, 323(5), 785-793.

      Wang, J., Zhuang, J., Iyer, S., Lin, X., Whitfield, T. W., Greven, M. C., . . . Cheng, Y. (2012). Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome research, 22(9), 1798-1812.

    1. Author response:

      eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

      The reviewers have provided several excellent suggestions and pointed out important shortcomings of our manuscript. We are grateful for their efforts. To address these concerns, we are planning a major revision to the manuscript. In the revision, our goal is to address each of the reviewers' concerns and codify the evidence for resistance- and resource-based control signals in the rat anterior cingulate cortex. We have provided a nonexhaustive list of the issues we plan to address in the point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward.

      Please note that at the time of training and testing the rats were > 4 months old.

      The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Several studies parametrically vary the immediate lever (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183). While most versions of the task will yield qualitatively similar estimates of discounting, the adjusting-amount version is preferred as it provides the most consistent estimates (PMID: 22445576). More specifically, this version of the task avoids the contrast effects that result from changing the delay during the session (PMID: 23963529, 24780379, 19730365, 35661751), which complicate value estimates.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses.

      We are in discussions about how to address this valid concern. This includes simply splitting the data by delay. This approach, however, has conceptual problems that we will also lay out in a full revision.  

      Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes.

      We apologize for not doing a better job of explaining the advantages of this type of model for the present purposes. Nevertheless, given the clear lack of enthusiasm, we felt it was better to simply update the model as suggested by the Reviewers. The straightforward modifications have now been implemented and we are currently in discussion about how the new results fit into the larger narrative.

      The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      We plan to streamline the existing analysis and add statistics, where required, to address this concern.

      Strengths:

      The task is interesting.

      Thank you for the positive comment.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial?

      Animals tend to make more immediate choices as the delay is extended, which is reflected in Figure 1. We will add more detail and additional statistics to address these questions. 

      This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task.

      Human tasks implement a similar task structure (PMID: 26779747). Please note the response above that outlines the benefits of using this task.

      Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      This is a good suggestion. However, rats do not like waiting for rewards, even for short delays. Going from the 4 sec to the 8 sec delay results in more immediate choices, indicating that the rats are more willing to forgo waiting and accept a smaller reinforcer at the 8 sec delay than at the 4 sec delay.
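
      How much a change from 4 s to 8 s should matter depends on the discounting function assumed. The sketch below compares the two standard forms, exponential (V = A * exp(-k * D)) and hyperbolic (V = A / (1 + k * D)). The reward amount and the discount rate k are hypothetical values chosen for illustration, not parameters from the manuscript.

```python
import math


def exponential_value(amount, delay_s, k):
    """Exponentially discounted value: A * exp(-k * D)."""
    return amount * math.exp(-k * delay_s)


def hyperbolic_value(amount, delay_s, k):
    """Hyperbolically discounted value: A / (1 + k * D)."""
    return amount / (1.0 + k * delay_s)


AMOUNT = 6.0  # hypothetical number of pellets on the delayed lever
K = 0.2       # hypothetical discount rate (per second)

for delay in (4, 8):
    v_exp = exponential_value(AMOUNT, delay, K)
    v_hyp = hyperbolic_value(AMOUNT, delay, K)
    print(f"delay={delay}s  exponential={v_exp:.2f}  hyperbolic={v_hyp:.2f}")

# Fraction of the 4 s value that remains at 8 s under each form.
exp_ratio = exponential_value(AMOUNT, 8, K) / exponential_value(AMOUNT, 4, K)
hyp_ratio = hyperbolic_value(AMOUNT, 8, K) / hyperbolic_value(AMOUNT, 4, K)
```

      For the same k, doubling the delay removes a larger share of the value under exponential discounting than under hyperbolic discounting, so the choice of discounting function changes the predicted difference between 4 s and 8 s sessions.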

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      These are excellent suggestions. We are looking into implementing them.

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The strategies the Reviewer mentioned are descriptors of the actual choices the rats made. For example, perseveration means the rat is choosing one of the levers at an excessively high rate whereas alternation means it is choosing the two levers more or less equally, independent of payouts. But the question we are interested in is why? We are arguing that the type of cognitive control determines the choice behavior but cognitive control is an internal variable that guides behavior, rather than simply a descriptor of the behavior. For example, the animal opts to perseverate on the delayed lever because the cognitive control required to track ival is too high. We then searched the neural data for signatures of the two types of cognitive control.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The side bias clearly does not impact performance as the animals prefer the delay lever at shorter delays, which works against this bias.

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.

      These are excellent points and, as stated above, we are in the process of revisiting the group assignments in an effort to allay these criticisms.
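
      One common way to check whether k=3 is a reasonable number of clusters is to compare the within-cluster sum of squares across several values of k (an "elbow" check). The sketch below is a minimal one-dimensional k-means written with the standard library only. The session-level proportions are invented for illustration, and the quantile-based initialisation is an assumption of this sketch, not the procedure used in the manuscript.

```python
def kmeans_1d(values, k, n_iter=100):
    """Lloyd's algorithm on 1-D data with deterministic quantile initialisation.

    Returns (centres, inertia), where inertia is the within-cluster
    sum of squared distances to the nearest centre.
    """
    data = sorted(values)
    n = len(data)
    # Start the centres at evenly spaced quantiles of the sorted data.
    centres = [data[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centres[j]))
            clusters[nearest].append(x)
        # Keep the old centre if a cluster ends up empty.
        new_centres = [
            sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    inertia = sum(min((x - c) ** 2 for c in centres) for x in data)
    return centres, inertia


# Hypothetical proportion of delayed-lever choices in nine sessions.
proportions = [0.10, 0.15, 0.20, 0.45, 0.50, 0.55, 0.85, 0.90, 0.95]

for k in (1, 2, 3, 4):
    centres, inertia = kmeans_1d(proportions, k)
    print(k, [round(c, 2) for c in centres], round(inertia, 3))
```

      If the inertia drops sharply up to some k and only marginally afterwards, that k is a defensible choice. Plotting the raw distribution alongside this curve would address the concern that the groups may not be distinct.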

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)

      Please see our response above. We agree that the approach was not justified, but we do not agree that it is invalid. Simply stated, a softmax approach gives the best fit to the choice behavior, whereas our epsilon-greedy approach attempted to reproduce the choice behavior using a naïve agent that progressively learns the values of the two levers on a choice-by-choice basis. The epsilon-greedy approach can therefore tell us whether it is possible to reproduce the choice behavior by an agent that is only tracking ival. Given our discovery of an ival-tracking signal in ACC, we believed that this was a critical point (although admittedly we did a poor job of communicating it). However, we also appreciate that important insights can be gained by fitting a model to the data as suggested. In fact, we had implemented this approach initially and are currently reconsidering what it can tell us in light of the Reviewers' comments.
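
      The difference between the two choice rules discussed here can be made concrete with a short sketch. The Q-values, epsilon, and inverse temperature beta below are hypothetical. The epsilon-greedy rule shown is one common convention, in which a random choice is drawn uniformly over all options including the best one; the manuscript's exact implementation may differ.

```python
import math


def epsilon_greedy_probs(q_values, epsilon):
    """Choice probabilities under epsilon-greedy.

    With probability epsilon an option is drawn uniformly at random
    (including the best one); otherwise the highest-valued option is chosen.
    """
    n = len(q_values)
    best = max(range(n), key=lambda i: q_values[i])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def softmax_probs(q_values, beta):
    """Choice probabilities under softmax with inverse temperature beta."""
    q_max = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for (immediate lever, delayed lever).
small_gap = [2.0, 3.0]
large_gap = [2.0, 30.0]

print(epsilon_greedy_probs(small_gap, 0.4), epsilon_greedy_probs(large_gap, 0.4))
print(softmax_probs(small_gap, 1.0), softmax_probs(large_gap, 1.0))
```

      Epsilon-greedy choice probabilities depend only on which option is currently best, whereas softmax probabilities scale with the size of the value difference. This is one reason the two rules can lead to different fitted parameters on the same choice data.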

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Exactly. The model results indicated that a naïve agent that relied only on ival tracking would not behave in this manner. Hence it was unlikely that the G1 animals were using an ival-tracking strategy, even though a strong ival-tracking signal was present in ACC.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      While the reviewer is justified in criticizing the clarity of the figures, the statement that “they do not show variability, statistics or conclusive results” is demonstrably false. Each of the figures presented in the manuscript, except Figure 3, is accompanied by statistics and measures of variability. This comment is hyperbolic and not justified.

      Figure 3 was an attempt to show raw neural data to better demonstrate how robust the ivalue tracking signal is.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses?

      We provide several figures describing how neurons change firing rates in response to varying reward. We are unsure what the reviewer means by “traditional analysis”, especially since this is immediately followed by a request for an assessment of neural manifolds. That said, we are developing ways to make the analysis more intuitive and, hopefully, more “traditional”.

      Are there changes in cellular information (both at the individual and ensemble level) over time in the session?

      We provide several analyses of how firing rate changes over trials in relation to ival over time in the session.

      How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      It is not clear to us how this analysis addresses our hypothesis regarding control signals in ACC.

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      Figure 3 will be folded into one of the other figures that contains the summary statistics.

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      This analysis included forced trials. The maximum for a session is 40 choice trials. We will clarify this in the revised manuscript.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well, depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      We plan to revisit this analysis and the RL model.

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press.

      Thank you for the positive comment.

      These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays.

      Provisional analysis indicates that the results hold up over delays, rather than the groupings in the paper. We will address this in a full revision of the manuscript.

      That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      We are unclear what the reviewer means by “this description”.

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      We view the strong evidence for ival tracking presented herein as a potentially critical component of resource-based cognitive effort. We hope to clarify how this task engaged cognitive effort.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers. We contend that enduring something you don’t like takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      We will better clarify how our measure of theta power relates to synchrony. There is a well-developed literature showing that rats and mice do not like waiting for delayed reinforcers.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

      This is proposed as an alternative explanation for the ivalue signal. We provide this as a possibility, never a conclusion. We will clarify this in the revised text.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Thank you for the endorsement of our work.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.

      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.

      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.

      c. What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?

      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      These are excellent suggestions, and we intend to implement each of them, where possible.

      (2) The task is not clear to me.

      a. I wonder if a task schematic and a flow chart of training would help readers.

      Yes, excellent idea, we intend to include this.

      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.

      Indeed, this task has been used in rats in several prior studies. Please see the following references (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183).

      c. How many total sessions were completed with ascending delays? Were there criteria for surgeries? How many total recording sessions per animal (of the 54?)

      Please note that the delay does not change within a session. There were no criteria for surgery. In addition, we will update Table 1 to make the number of recording sessions clearer.

      d. How many trials completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      Every animal in this data set completed 40 trials. We will update the task description to clarify this issue. There are no errors in this task; rather, the task is designed to measure the tendency to make an impulsive choice (smaller reward now). We will clarify this issue in the revision of the manuscript.

      (3) Figure 1 is unclear to me.

      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      We will clarify the colors and look into schemes to graph the data set.

      b. How many animals and sessions go into each data point?

      This information is in Table 1, but this could be clearer, and we will update the manuscript.

      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      Table 1 is accurate, and we can add the number of neurons from each animal.

      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      We will look into ways to incorporate this information.

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are

      b. Was there some objective clustering criteria that defined the clusters?

      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      These are all excellent suggestions and points. We plan to revisit the strategy to assign sessions to groups, which we hope will address each of these points.

      (5) The same applies to neuronal analyses in Fig 3 and 4

      a. What does a single neuron peri-event raster look like? I would include several of these.

      b. What does PC1, 2 and 3 look like for G1, G2, and G3?

      c. Certain PCs are selected, but I'm not sure how they were selected - was there a criteria used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      We will make several updates to enhance the clarity of the neural data analysis, including adding more representative examples. We feel the need to balance the inclusion of representative examples with group statistics given the concerns raised by R1.

      (6) I had questions about the spectral analysis

      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands (delta, 1-4 Hz; theta, 4-7 Hz; and beta, 13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      This designation comes mainly from the hippocampal and ACC literature in rodents. In addition, this range best captured the peak in the power spectrum in our data. Note that we focus our analysis on theta given the literature regarding theta in the ACC as a correlate of cognitive control (references in manuscript). We did interrogate other bands as a sanity check and the results were mostly limited to theta. Given the scope of our manuscript and the concerns raised regarding complexity, we are concerned that adding frequency analyses beyond theta would obfuscate the take-home message. However, we think this is worthwhile, and we will determine whether it can be done in a brief, clear, and effective manner.

      b. Power spectra and time-frequency analyses may justify the authors focus. I would show these (y-axis - frequency, x-axis - time, z-axis, power).

      This is an excellent suggestion that we look forward to incorporating. 

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

      Excellent suggestion. We will look into the phantom oscillation issue. Note that PCA provided a way to classify neurons that exhibited peaks in the autocorrelation at theta frequencies. While spike-field coherence is a rigorous tool, it addresses a slightly different question (LFP entrainment). Notwithstanding, we plan to address this issue.  
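
      For reference, one of the simplest traditional phase-locking measures is the mean resultant length of spike phases relative to the LFP oscillation. The sketch below assumes that the theta phase at each spike time has already been extracted (for example from a band-pass filtered LFP), a step that is not shown here. It is a phase-locking statistic, not spectral spike-field coherence, and the example phases are synthetic.

```python
import cmath
import math


def mean_resultant_length(phases):
    """Mean resultant length of a list of phases (radians).

    Returns a value between 0 (phases spread uniformly around the cycle)
    and 1 (all spikes at the same phase).
    """
    if not phases:
        raise ValueError("at least one spike phase is required")
    resultant = sum(cmath.exp(1j * p) for p in phases)
    return abs(resultant) / len(phases)


# Synthetic examples: perfectly locked spikes vs. evenly spread spikes.
locked_phases = [math.pi / 3] * 20
spread_phases = [2 * math.pi * i / 8 for i in range(8)]

print(mean_resultant_length(locked_phases))
print(mean_resultant_length(spread_phases))
```

      Because the statistic is biased upward for small spike counts, comparisons across neurons or groups would typically use matched spike numbers or a shuffle-based null distribution.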

      Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Thank you for the positive comments.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      This is an important issue that we plan to address with additional analysis in the manuscript update.
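
      One simple way to guard against treating trials or neurons as independent units is to collapse the data to one value per subject and condition before any group comparison. The sketch below shows only this aggregation step. The record layout and the example values are hypothetical, and this is a simpler alternative to a full mixed-effects model, not the analysis planned for the revision.

```python
from collections import defaultdict
from statistics import mean


def subject_level_means(records):
    """Average values within each (subject, condition) pair.

    records: iterable of (subject_id, condition, value) tuples, for example
    one tuple per neuron or per trial.
    """
    grouped = defaultdict(list)
    for subject, condition, value in records:
        grouped[(subject, condition)].append(value)
    return {key: mean(vals) for key, vals in grouped.items()}


# Hypothetical measurements: rat 1 contributes many units, rat 2 only one.
records = [
    ("rat1", "G1", 0.50), ("rat1", "G1", 0.70), ("rat1", "G1", 0.60),
    ("rat2", "G1", 0.20),
    ("rat2", "G2", 0.40), ("rat2", "G2", 0.60),
]

per_subject = subject_level_means(records)
print(per_subject)
```

      After aggregation each subject carries equal weight within a condition, so a subject that contributes many sessions cannot dominate the comparison.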

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by a difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.

      Each of the issues outlined above with the RL model is very important. We are currently re-evaluating the RL modeling approach in light of these comments. Please see our comments to R1 regarding the model, as they are relevant here as well.
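      The more standard model the reviewer describes (a single state, a graded softmax choice rule, and the temporally discounted outcome as the update target) can be sketched as follows. This is a minimal illustration and not the authors' model: the hyperbolic discounting form and the names `discounted_value`, `alpha`, `beta` and `k` are assumptions introduced here.

```python
import math


def discounted_value(magnitude, delay_s, k):
    """Hyperbolically discounted value of a reward (assumed form, not from the paper)."""
    return magnitude / (1.0 + k * delay_s)


def p_choose_delayed(q_immediate, q_delayed, beta):
    """Softmax (logistic) probability of choosing the delayed option."""
    return 1.0 / (1.0 + math.exp(-beta * (q_delayed - q_immediate)))


def negative_log_likelihood(choices, outcomes, alpha, beta):
    """Score a choice sequence under a single-state value-learning model.

    choices  : 0 = immediate option, 1 = delayed option
    outcomes : discounted value actually obtained on each trial
    alpha    : learning rate; beta : softmax inverse temperature
    """
    q = [0.0, 0.0]  # one value per option, shared across all trials (single state)
    nll = 0.0
    for choice, outcome in zip(choices, outcomes):
        p_delayed = p_choose_delayed(q[0], q[1], beta)
        p_choice = p_delayed if choice == 1 else 1.0 - p_delayed
        nll -= math.log(max(p_choice, 1e-12))
        # Only the chosen option is updated, towards the obtained outcome.
        q[choice] += alpha * (outcome - q[choice])
    return nll
```

      Summing this quantity over held-out sessions, with `alpha` and `beta` fitted on the remaining sessions, would give the cross-validated log-likelihood the reviewer suggests for model comparison.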

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.

      This is an astute observation and we plan to address this concern. We agree that cross-validation may provide an appropriate tool here.
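      The reviewer's point about the chance level can be illustrated with pure noise. The sketch below does not model the MCML decoder or the recorded data; the component count, trial count and function names are assumptions. It picks, out of several random components, the one that correlates best with a random target, then compares the correlation on the trials used for selection with the correlation on held-out trials.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def selected_component_correlation(n_trials=200, n_components=20, seed=0):
    """Return (in-sample, held-out) correlation of the best noise component."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_trials)]
    components = [[rng.gauss(0, 1) for _ in range(n_trials)]
                  for _ in range(n_components)]
    half = n_trials // 2
    # Select the component (and its sign) using the first half of the trials only.
    train_r = [pearson_r(c[:half], target[:half]) for c in components]
    best = max(range(n_components), key=lambda i: abs(train_r[i]))
    sign = 1.0 if train_r[best] >= 0 else -1.0
    in_sample = abs(train_r[best])
    # Evaluate the same component, with the same sign, on the untouched half.
    held_out = sign * pearson_r(components[best][half:], target[half:])
    return in_sample, held_out


def mean_over_seeds(n_seeds=50):
    """Average both correlations over independent noise datasets."""
    results = [selected_component_correlation(seed=s) for s in range(n_seeds)]
    mean_in = sum(r[0] for r in results) / n_seeds
    mean_out = sum(r[1] for r in results) / n_seeds
    return mean_in, mean_out
```

      Because selection and evaluation share trials in the in-sample case, its average correlation sits well above zero even though no signal exists, whereas the held-out average stays near zero.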

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      This is also an excellent point that we plan to address in the manuscript update.

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.

      Excellent point and thank you for the notebook. We explored a similar approach previously but did not pursue it to completion. We will re-investigate this issue.
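      The balancing step the reviewer proposes could look roughly like the sketch below, which subsamples trials so that each ival level contributes the same number of trials to both session types. The function name and the use of discrete level labels are assumptions made for illustration; this is not the code from the reviewer's notebook.

```python
import random
from collections import defaultdict


def balance_trials_by_level(levels_a, levels_b, seed=0):
    """Subsample trial indices so both session types share the same level counts.

    levels_a, levels_b : per-trial ival level labels for the two session types
    Returns two sorted lists of kept trial indices, one per session type.
    """
    rng = random.Random(seed)
    by_level_a, by_level_b = defaultdict(list), defaultdict(list)
    for index, level in enumerate(levels_a):
        by_level_a[level].append(index)
    for index, level in enumerate(levels_b):
        by_level_b[level].append(index)

    keep_a, keep_b = [], []
    # Levels present in only one session type cannot be matched and are dropped.
    for level in sorted(set(by_level_a) & set(by_level_b)):
        n_keep = min(len(by_level_a[level]), len(by_level_b[level]))
        keep_a.extend(rng.sample(by_level_a[level], n_keep))
        keep_b.extend(rng.sample(by_level_b[level], n_keep))
    return sorted(keep_a), sorted(keep_b)
```

      Repeating the subsampling with several seeds and averaging the regression fits would reduce the dependence on any one random draw.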

    1. Author response:

      Reviewer #3 (Public Review):

      (1) Conditions on growth and interaction rates for feasibility and stability. The authors approach this using a mean field approximation, and it is important to note that there is no particular temperature dependence assumed here: as far as it goes, this analysis is completely general for arbitrary Lotka-Volterra interactions.

      However, the starting point for the authors' mean field analysis is the statement that "it is not possible to meaningfully link the structure of species interactions to the exact closed-form analytical solution for [equilibria] 𝑥^*_𝑖 in the Lotka-Volterra model."

      I may be misunderstanding, but I don't agree with this statement. The time-independent equilibrium solution with all species present (i.e. at non-zero abundances) takes the form

      x^* = A^{-1}r

      where A is the inverse of the community matrix, and r is the vector of growth rates. The exceptions to this would be when one or more species has abundance = 0, or A is not invertible. I don't think the authors intended to tackle either of these cases, but maybe I am misunderstanding that.

      So to me, the difficulty here is not in writing a closed-form solution for the equilibrium x^*, it is in writing the inverse matrix as a nice function of the entries of the matrix A itself, which is where the authors want to get to. In this light, it looks to me like the condition for feasibility (i.e. that all x^* are positive, which is necessary for an ecologically-interpretable solution) is maybe an approximation for the inverse of A---perhaps valid when off-diagonal entries are small. A weakness then for me was in understanding the range of validity of this approximation, and whether it still holds when off-diagonal entries of A (i.e. inter-specific interactions) are arbitrarily large. I could not tell from the simulation runs whether this full range of off-diagonal values was tested.

      We thank the reviewer for pointing this out and we agree that the language used is imprecise. The GLV model is solvable using the matrix inversion method but as they note, this does not give an interpretable expression in terms of the system parameters. This is important as we aim to build understanding of how these parameters (which in turn depend on temperature) affect the richness in communities. We have made this clearer in lines 372-379.
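      As a small worked instance of the matrix-inversion solution discussed above, the sketch below solves a two-species system directly. It assumes the sign convention dx_i/dt = x_i (r_i - sum_j A_ij x_j), so that the interior equilibrium satisfies A x^* = r; the convention, function names and parameter values are illustrative assumptions, not taken from the manuscript.

```python
def glv_equilibrium_2sp(A, r):
    """Interior equilibrium x* = A^{-1} r for a 2-species system (Cramer's rule)."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("A is not invertible; no unique interior equilibrium")
    x1 = (a22 * r[0] - a12 * r[1]) / det
    x2 = (a11 * r[1] - a21 * r[0]) / det
    return x1, x2


def is_feasible(x):
    """Feasibility: every species has strictly positive equilibrium abundance."""
    return all(value > 0 for value in x)


# Weak interspecific interactions: both abundances are positive.
weak = glv_equilibrium_2sp([[1.0, 0.5], [0.5, 1.0]], [1.0, 1.0])
# One strong off-diagonal entry: species 1 gets a negative abundance.
strong = glv_equilibrium_2sp([[1.0, 1.5], [0.5, 1.0]], [1.0, 1.0])
```

      Even in this tiny case the solution is exact but already a ratio of products of matrix entries, which shows why the inverse becomes hard to interpret in terms of individual parameters as the community grows.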

      With regard to the validity of the approximation, we have significantly increased the detail of the method in the manuscript, including the assumptions it makes (lines 384-393). In general, the method assumes that any individual interaction has a weak effect on abundance. This will fail when the variation in interactions becomes too strong but should be robust to changes in the average interaction strength across the community.

      As a secondary issue here, it would have been helpful to understand whether the authors' feasible solutions are always stable to small perturbations. In general, I would expect this to be an additional criterion needed to understand diversity, though as the authors point out there are certain broad classes of solutions where feasibility implies stability.

      As the reviewer notes, previous work using the GLV model by ? has shown that feasibility almost surely implies stability in the GLV. Thus we expect that our richness estimates derived from feasibility will closely resemble those from stability. We have amended the main text to make this argument clear on lines 321-335.

      (2) I did not follow the precise rationale for selecting the temperature dependence of growth rate and interaction rates, or how the latter could be tested with empirical data, though I do think that in principle this could be a valuable way to understand the role of temperature dependence in the Lotka-Volterra equations.

      First, as the authors note, "the temperature dependence of resource supply will undoubtedly be an important factor in microbial communities"

      Even though resources aren't explicitly modeled here, this suggests to me that at some temperatures, resource supply will be sufficiently low for some species that their growth rates will become negative. For example, if temperature dependence is such that the limiting resource for a given species becomes too low to balance its maintenance costs (and hence mortality rate), it seems that the net growth rate will be negative. The alternative would be that temperature affects resource availability, but never such that a limiting resource leads to a negative growth rate when a taxon is rare.

      On the other hand, the functional form for the distribution of growth rates (eq 3) seems to imply that growth rates are always positive. I could imagine that this is a good description of microbial populations in a setting where the resource supply rate is controlled independently of temperature, but it wasn't clear how generally this would hold.

      We thank the reviewer for their comment. The assumption of positive growth rates is indeed a feature of the Boltzmann-Arrhenius model of temperature dependence. We use the Boltzmann-Arrhenius model due to the dependence of growth on metabolic rate. As metabolic rate is ultimately determined by biochemical kinetics, its temperature dependence is well described by the Boltzmann-Arrhenius model. In addition to this reasoning, there is a wealth of empirical evidence supporting the use of the Boltzmann-Arrhenius model to describe the temperature dependence of growth rate in microbes.

      Ultimately, the temperature dependence of resource supply is not something we can directly consider in our model. As such, we have to assume that resource supply is sufficient to maintain positive growth rates in the community. Note that this assumption only requires that resource supply is sufficient to maintain positive growth rates (i.e. the maximal growth rate of species in isolation), not that resource supply is sufficient to maintain growth in the presence of intra- and interspecific competition. We have updated the manuscript in lines 156-159 to make these assumptions clearer.
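      The positivity property mentioned above follows directly from the exponential form. The sketch below uses a commonly written version of the Boltzmann-Arrhenius relation, B(T) = B0 * exp(-E/k * (1/T - 1/T_ref)); the exact parameterisation in the manuscript may differ, and the reference temperature and activation energy shown are assumed values for illustration only.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def boltzmann_arrhenius(temp_k, b0, e_act, t_ref=293.15):
    """Trait value at temperature temp_k (kelvin).

    b0    : trait value at the reference temperature t_ref
    e_act : activation energy in eV (thermal sensitivity)
    """
    return b0 * math.exp(-e_act / K_BOLTZMANN_EV * (1.0 / temp_k - 1.0 / t_ref))


# Growth rate over a range of temperatures, with hypothetical b0 and e_act.
growth_rates = [boltzmann_arrhenius(t, b0=1.0, e_act=0.65)
                for t in range(273, 314, 5)]
```

      Since the exponential is strictly positive, the modelled rate can become very small at low temperature but never changes sign, which is why negative net growth has to be excluded by assumption rather than produced by the model.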

      Secondly, while I understand that the growth rate in the exponential phase for a single population can be measured to high precision in the lab as a function of temperature, the assumption for the form of the interaction rates' dependence on temperature seems very hard to test using empirical data. In the section starting L193, the authors seem to fit the model parameters using growth rate dependence on temperature, but then assume that it is reasonable to "use the same thermal response for growth rates and interactions". I did not follow this, and I think a weakness here is in not providing clear evidence that the functional form assumed in Equation (4) actually holds.

      The reviewer is correct: it is very difficult to measure interaction coefficients experimentally, and to our knowledge there is little to no data available on their empirical temperature responses. As a best guess, we use the observed variation in thermal physiology parameters for growth rate as a proxy, assuming that interactions must also depend on the metabolic rates of the interacting species (see also response to comment 8).

    1. Author response:

      Reviewer #1 (Public Review):

      The authors conducted cross-species comparisons between the human brain and the macaque brain to disentangle the specific characteristics of structural development of the human brain. Although previous studies had revealed similarities and differences in brain anatomy between the two species by spatially aligning the brains, the authors made the comparison along the chronological axis by establishing models for predicting the chronological ages with the inputting brain structural features. The rationale is actually clear given that brain development occurs over time in both. More interestingly, the model trained on macaque data was better able to predict the age of humans than the human-trained model was at predicting macaque age. This revealed a brain cross-species age gap (BCAP) that quantified the discrepancy in brain development between the two species, and the authors even found this BCAP measure was associated with performance on behavioral tests in humans. Overall, this study provides important and novel insights into the unique characteristics of human brain development. The authors have employed a rigorous scientific approach, reflecting diligent efforts to scrutinize the patterns of brain age models across species. The clarity of the rationale, the interpretability of the methods, and the quality of the presentation all contribute to the strength of this work.

      We are grateful for your helpful and thorough review and for being so positive about our manuscript. Following your recommendations, we have added more analytic details that have strengthened our paper. We would like to thank you for your input.

      Reviewer #2 (Public Review):

      In the current study, Li et al. developed a novel approach that aligns chronological age to a cross-species brain age prediction model to investigate the evolutionary effect. This method revealed some interesting findings, like the brain-age gap of the macaque model in predicting human age will increase as chronological age increases, suggesting an evolutionary alignment between the macaque brain and the human brain in the early stage of development. This study exhibits ample novelty and research significance. However, I still have some concerns regarding the reliability of the current findings.

      We thank you for the positive and appreciative feedback on our work and the insightful comments, which we have addressed below.

      Question 1: Although the authors named their new method a "cross-species" model, the current study only focused on the prediction between humans and macaques. It would be better to discuss whether their method can also generalize to cross-species examination of other species (e.g., C. elegans), which may provide more comprehensive evolutionary insights. Also, other future directions with their new method are worth discussing.

      We appreciate your insightful comment regarding the generalizability of our model to other species. As you note, we performed only a human-macaque cross-species study and did not include other species. We focused on human and macaque because the macaque is considered one of the closest primates to humans apart from chimpanzees and is thus considered the best model for studying human brain evolution. However, our proposed method has limitations that restrict its generalizability to other species, e.g., C. elegans. First, our model was trained using MRI data, which prevents its application to species for which such data are unavailable. This technological requirement is a barrier to broader cross-species application. Second, our current model is based on homologous brain atlases that are available for both humans and macaques. The lack of comparable atlases for other species further restricts the model's generalizability. We have discussed this limitation in the revised manuscript and outlined potential future directions to overcome these challenges. This includes discussing the need for developing comparable imaging techniques and standardized brain atlases across a wider range of species to enhance the model's applicability and broaden our understanding of cross-species neurodevelopmental patterns.

      On page 15, lines 11-18

      “However, the existing limitation should be noted regarding the generalizability of our proposed approach for cross-species brain comparison. Our current model relies on homologous brain atlases, and the lack of comparable atlases for other species restricts its broader applicability. To address this limitation, future research should focus on developing prediction models that do not depend on atlases. For instance, 3D convolutional neural networks could be trained directly on raw MRI data for age prediction. These deep learning models may offer greater flexibility for cross-species applications once the training within species is complete. Such advancements would significantly enhance the model's adaptability and expand its potential for comparative neuroscience studies across a wider range of species.”

      Question 2: Algorithm of prediction model. In the method section, the authors only described how they chose features, but did no description about the algorithm (e.g., supporting vector regression) they used. Please add relevant descriptions to the methods.

      Thank you for your comment. We apologize for not providing sufficient details about the model training process in our initial submission. In our study, we used a linear regression model for prediction. We have provided more details regarding the algorithm of prediction model in our response to Reviewer #1. For your convenience, we have attached them below.

      For details on the algorithm of prediction model:

      “A linear regression model was adopted for intra- and inter-species age prediction. The linear regression model was built including the following three main steps:

      1) Feature selection: a total of two steps are required to extract the final features. The first step is preliminary extraction. First, all the human or macaque participants were divided into 10-fold and 9-fold was used for model training and 1-fold for model test. The preliminary features were chosen by identifying the significantly age-associated features with p < 0.01 during calculating Pearson’s correlation coefficients between all the 260 features and actual ages of the 9-fold subjects. This process was repeated 100 times. Since we obtained not exactly the same preliminary features each time, we thus further analyzed the preliminary features using two methods to determine the final features: common features and minimum mean absolute error (min MAE). Common features are the preliminary features that were selected in all the 100 times during preliminary model training. The min MAE features were the preliminary features that with the smallest MAE value during the 100 times model test for predicting age. After the above feature selections, we obtained two sets of features: 62 macaque features and 225 human features (common features) and 117 macaque features and 239 human features (min MAE). In addition, to further exclude the influences of unequal number of features in human and macaque, we also selected the first 62 features in human and macaque to test the model prediction performances.

      2) Model construction: we conducted age prediction linear model using 10-fold cross-validation based on the selected features for human and macaque separately. The linear model parameters are obtained using the training set data and applied to the test set for prediction. The above process is also repeated 100 times.

      3) Prediction: with the above results, we obtained the optimal linear prediction models for human and macaque. Next, we performed intra-species and inter-species brain age prediction, i.e., human model predicted human age, human model predicted macaque age, macaque model predicted macaque age and macaque model predicted human age. Three sets of features (62 macaque features and 225 human features; 117 macaque features and 239 human features; 62 macaque features and 62 human features) were used to test the prediction models for cross-validation and to exclude effects of different number of features in human and macaque. In the main text, we showed the results of brain age prediction, brain developmental and evolutional analyses based on common features and the results obtained using other two types of features were shown in supplementary materials. The prediction performances were evaluated by calculating the Pearson’s correlation and MAE between actual ages and predicted ages.”
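      The "common features" idea in the feature-selection step can be sketched as below. This is a simplified reading and not the authors' code: it draws one random 9/10 training split per repeat, and it uses a threshold on |r| (`r_threshold`, a hypothetical parameter) as a stand-in for the p < 0.01 criterion, since the p-value computation is not reproduced here.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def common_features(features, ages, r_threshold=0.5, n_repeats=20,
                    n_folds=10, seed=0):
    """Indices of features that pass the age-correlation filter in every repeat.

    features : one list of feature values per subject
    ages     : chronological age per subject
    """
    rng = random.Random(seed)
    n_subjects, n_features = len(ages), len(features[0])
    common = set(range(n_features))
    for _ in range(n_repeats):
        order = list(range(n_subjects))
        rng.shuffle(order)
        held_out = set(order[: n_subjects // n_folds])  # one fold left out
        train = [i for i in order if i not in held_out]
        train_ages = [ages[i] for i in train]
        selected = set()
        for j in range(n_features):
            column = [features[i][j] for i in train]
            if abs(pearson_r(column, train_ages)) >= r_threshold:
                selected.add(j)
        common &= selected  # keep only features selected in all repeats so far
    return sorted(common)
```

      Filtering on the training subjects only, as here, keeps the held-out fold untouched by feature selection, which matters when the held-out error is later used to judge the model.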

      Question 3: Sex difference. The sex difference results are strange to me. For example, in the second row of Figure Supplement 3A, different models show different correlation patterns, but why their Pearson's r is all equal to 0.3939? If they are only typo errors, please correct them. The authors claimed that they found no sex difference. However, the results in Figure Supplement 3 show that, the female seems to have poorer performance in predicting macaque age from the human model. Moreover, accumulated studies have reported sex differences in developing brains (Hines, 2011; Kurth et al., 2021). I think it is also worth discussing why sex differences can't be found in the evolutionary effect.

      Reference:

      Hines, M. (2011). Gender development and the human brain. Annual review of neuroscience, 34, 69-88.

      Kurth, F., Gaser, C., & Luders, E. (2021). Development of sex differences in the human brain. Cognitive Neuroscience, 12(3-4), 155-162.

      It is recommended that the authors explore different prediction models for different species. Maybe macaques are suitable for linear prediction models, and humans are suitable for nonlinear prediction models.

      Thank you for pointing out the typos and for your comments on sex differences. The Pearson’s r values in Figure Supplement 3A were typos, and we have corrected them in the updated Figure 2-figure supplement 3. For details, please see the updated Figure 2-figure supplement 3 and the following figure.

      Regarding gender effects, we acknowledge your point about the importance of gender differences in understanding brain evolution and development. In our study, however, our primary goal was to develop a robust age prediction model by maximizing the number of training samples. To mitigate gender-related effects in our main results, we incorporated gender information as a covariate in the ComBat harmonization process. We conducted the supplementary analysis, in which the data were separated by gender, only to demonstrate the stability of our proposed cross-species age prediction model, not to investigate gender differences. Although our results demonstrated that gender-specific models could still significantly predict chronological age, we refrained from emphasizing these models' performance in gender-specific species comparisons because the predicted gender differences are difficult to interpret. For cross-species prediction, it is not convincing to conclude that a higher Pearson’s r value between actual and predicted age reflects conserved evolution in males or females. In addition, we adopted the same prediction model for human and macaque, rather than different ones, in order to establish a comparable model between species. Generally speaking, a nonlinear model could obtain better prediction accuracy than a linear model, so if different species used different models, the cross-species prediction would not be a fair comparison. Importantly, our study aimed to develop a new index based on the same prediction models to quantify differences in brain evolution, i.e., the brain cross-species age gap (BCAP), instead of relying on traditional statistical analyses. Using different prediction models for different species may introduce bias caused by the prediction methods and thus affect the accuracy of BCAP. We therefore used the linear model with the best intra-species prediction performance for cross-species prediction in this study. Our main goal in this study was to set up a stable cross-species prediction model, and the models built using either male or female subjects performed well in cross-species prediction. However, as you point out, how to characterize evolutionary gender differences in an unbiased way using machine learning approaches needs further investigation, since there are many reports of gender differences in the developing human brain. In fact, whether macaque brains have the same gender differences as humans is an interesting scientific question worth studying. Thus, we have included a discussion on how to use machine learning methods to study evolutionary gender differences in our revised manuscript.

      On page 15, lines 18-23 and page 16, line 1-4

      “Many studies have reported sex differences in developing human brains (Hines, 2011; Kurth, Gaser, & Luders, 2021), however, whether macaque brains have similar sex differences as humans is still unknown. We used machine learning method for cross-species prediction to quantify brain evolution and the established prediction models are stable even when only using male or female data, which may indicate that the proposed cross-species prediction model has no evolutionary sex difference. Although the stable prediction model can be established in either male or female participants for cross-species prediction, this indeed does not mean that there are no evolutionary sex differences due to lack of quantitative comparative analysis. In the future, we need to develop more objective, quantifiable and stable index for studying sex differences using machine learning methods to further identify sex differences in the evolved brain.”

      Reviewer #3 (Public Review):

      The authors identified a series of WM and GM features that correlated with age in human and macaque structural imaging data. The data was gathered from the HCP and WA studies, which was parcellated in order to yield a set of features. Features that correlated with age were used to train predictive intra and inter-species models of human and macaque age. Interestingly, while each model accurately predicted the corresponding species age, using the macaque model to predict human age was more accurate than the inverse (using the human model to predict macaque age). In addition, the prediction error of the macaque model in predicting human age increased with age, whereas the prediction error of the human model predicting macaque age decreased with age.

      After elaboration of the predictive models, the authors classified the features for prediction into human-specific, macaque-specific and common to human and macaque, where they most notably found that macaque-only and common human-macaque areas were located mainly in gray matter, with only a few human-specific features found in gray matter. Furthermore, the authors found significant correlations between BCAP and picture vocabulary (positive correlation) test and visual sensitivity (negative correlation) test. Several white matter tracts (AF, OR, SLFII) were also identified showing a correlation with BCAP.

      Thank you for providing this excellent summary. We appreciate your thorough review and concise overview of our work.

      STRENGTHS AND WEAKNESSES

      The paper brings an interesting perspective on the evolutionary trajectories of human and non-human primate brain structure, and its relation to behavior and cognition. Overall, the methods are robust and support the theoretical background of the paper. However, the overall clarity of the paper could be improved. There are many convoluted sentences and there seems to be both repetition across the different sections and unclear or missing information. For example, the Introduction does not clearly state the research questions, rather just briefly mentions research gaps existing in the literature and follows by describing the experimental method. It would be desirable to clearly state the theoretical background and research questions and leave out details on methodology. In addition, the results section repeats a lot of what is already stated in the methods. This could be further simplified and make the paper much easier to read.

      In the discussion, authors mention that "findings about cortex expansion are inconsistent and even contradictory", a more convincing argument could be made by elaborating on why the cortex expansion index is inadequate and how BCAP is more accurate.

      Thank you for highlighting the interesting aspects of our work. We are sorry for the lack of clarity in certain parts of our manuscript. Following your valuable suggestions, we have revised the manuscript to reduce unnecessary repetition and to state our research question more clearly in the Introduction. Specifically, unlike previous analyses of human and macaque evolution using comparative neuroscience, this study embeds the chronological axis into the cross-species evolutionary analysis process. It constructed a linear prediction model of brain age for humans and macaques, and quantitatively described the degree of evolution. The brain structure based cross-species age prediction model and cross-species brain age differences proposed in this study further eliminate the inherent developmental effects of humans and macaques on cross-species evolutionary comparisons, providing new perspectives and approaches for studying cross-species development. Regarding the repetition in the results section, we have simplified it for clarity. Regarding the comparison between the cortex expansion index and BCAP, we would like to emphasize that the cortex expansion index was derived without fully considering cross-species alignment along the chronological axis. Specifically, this index does not correspond to a specific developmental stage, but rather focuses on a direct comparison between the two species. In contrast, BCAP addresses this limitation by utilizing a prediction model to establish alignment (or misalignment) between species at the individual level. Therefore, BCAP may serve as a more flexible and nuanced tool for cross-species brain comparison.

      STUDY AIMS AND STRENGTH OF CONCLUSIONS

      Overall, the methods are robust and support the theoretical background of the paper, but it would be good to state the specific research questions -even if exploratory in nature- more specifically. Nevertheless, the results provide support for the research aims.

      Thank you for the excellent suggestion. We have revised our Introduction to state the specific research question, as mentioned above.

      IMPACT OF THE WORK AND UTILITY OF METHODS AND DATA TO THE COMMUNITY

      This study is a good first step in providing a new insight into the neurodevelopmental trajectories of humans and non-human primates besides the existing cortical expansion theories.

      Thank you for your encouraging comment.

      ADDITIONAL CONTEXT:

      It should be clearly stated both in the abstract and methods that the data used for the experiment came from public databases.

      Thank you for your suggestion. We have added this information to both the Abstract and the Methods. For details, please see page 2, line 9 in the Abstract section; page 16, lines 10-11 and page 17, lines 6-10 in the Materials and Methods section.

    1. Author response:

      Reviewer #1 - Public Review

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      1.1) This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      We thank Reviewer 1 for appreciating our methods and findings, and we address their suggestions below:

      1.2) The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavioural domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component.

      We appreciate the Reviewer’s comment and have minimized the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). We now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      For completeness, we report the intrinsic correlations between the different modalities in Supplementary file 1c (P.19):

      “Lastly, although the current work aimed to reduce intrinsic correlations between variables within a given modality through running a PCA before the PLS approach, intrinsic correlations between measures and modalities may potentially be a remaining factor influencing the PLS solution. We, thus, provided an additional overview of the intrinsic correlations between the different neuroimaging data modalities in the supporting results (Supplementary file 1c).”

      1.3) Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification.

      We now provide additional context, including a rising body of theoretical and empirical work, that outlines the value of functional gradients and cortical hierarchies in the understanding of brain development and psychopathology. Please see P.26.

      “Initially demonstrated at the level of intrinsic functional connectivity (Margulies et al., 2016), follow up work confirmed a similar cortical patterning using microarchitectural in-vivo MRI indices related to cortical myelination (Burt et al., 2018; Huntenburg et al., 2017; Paquola et al., 2019), post-mortem cytoarchitecture (Goulas et al., 2018; Paquola et al., 2020, 2019), or post-mortem microarray gene expression (Burt et al., 2018). Spatiotemporal patterns in the formation and maturation of large-scale networks have been found to follow a similar sensory-to-association axis; moreover, there is the emerging view that this framework may offer key insights into brain plasticity and susceptibility to psychopathology (Sydnor et al., 2021). In particular, the increased vulnerability of transmodal association cortices in late childhood and early adolescence has been suggested to relate to prolonged maturation and potential for plastic reconfigurations of these systems (Paquola et al., 2019; Park et al., 2022b). Between mid-childhood and early adolescence, heteromodal association systems such as the default network become progressively more integrated among distant regions, while being more differentiated from spatially adjacent systems, paralleling the development of cognitive control, as well as increasingly abstract and logical thinking. [...] This suggests that neurodevelopmental difficulties might be related to alterations in various processes underpinned by sensory and association regions, as well as the macroscale balance and hierarchy of these systems, in line with previous findings in several neurodevelopmental conditions, including autism, schizophrenia, as well as epilepsy, showing a decreased differentiation between the two anchors of this gradient (Hong et al., 2019). In future work, it will be important to evaluate these tools for diagnostics and population stratification. 
In particular, the compact and low-dimensional perspective of gradients may prove beneficial in terms of biomarker reliability as well as phenotypic prediction, as previously demonstrated using typically developing cohorts (Hong et al., 2020). On the other hand, it will be of interest to explore to what extent alterations in connectivity along sensory-to-transmodal hierarchies provide sufficient graduality to differentiate between specific psychopathologies, or whether they, as the current work suggests, mainly reflect risk for general psychopathology and atypical development.”

      1.4) The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain.

      We thank the Reviewer for this comment and have now expressed this limitation in the revised Discussion, P.23.

      “The p factor, internalizing, externalizing, and neurodevelopmental dimensions were each associated with distinct morphological and intrinsic functional connectivity signatures, although these relationships varied in strength.”

      1.5) The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      In the revised Discussion on P.24, we acknowledge that more in-depth analyses of task-based fMRI may have offered additional insights into state-dependent changes in functional architecture.

      “While the current work derived main imaging signatures from resting-state fMRI as well as grey matter morphometry, we could nevertheless demonstrate associations to white matter architecture (derived from diffusion MRI tractography) and recover similar dimensions when using task-based fMRI connectivity. Despite subtle variations in the strength of observed associations, the latter finding provided additional support that the different behavioral dimensions of psychopathology more generally relate to alterations in functional connectivity. Given that task-based fMRI data offers numerous avenues for analytical exploration, our findings may motivate follow-up work assessing associations to network- and gradient-based response strength and timing with respect to external stimuli across different functional states.”

      1.6) Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psycho-pathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

      We thank the Reviewer for recognizing the detail and rigor of our data-driven study and the extensive code and data documentation.

      Reviewer #2 - Public Review

      In "Multi-modal Neural Correlates of Childhood Psychopathology" Kebets et al. integrate multi-modal neuroimaging data using machine learning to delineate dissociable links to diverse dimensions of psychopathology in the ABCD sample. This paper had numerous strengths including a superb use of a large resource dataset, appropriate analyses, beautiful visualizations, clear writing, and highly interpretable results from a data-driven analysis. Overall, I think it would certainly be of interest to a general readership. That being said, I do have several comments for the authors to consider.

      We thank Dr Satterthwaite for the positive evaluation and helpful comments.

      2.1) Out-of-sample testing: while the permutation testing procedure for the PLS is entirely appropriate, without out-of-sample testing the reported effect sizes are likely inflated.

      As discussed in the editorial summary of essential revisions, we agree that out-of-sample prediction indeed provides stronger estimates of generalizability. We assess this by applying the PCA coefficients derived from the discovery cohort imaging data to the replication cohort imaging data. The resulting PCA scores and behavioral data were then z-scored using the mean and standard deviation of the replication cohort. The SVD weights derived from the discovery cohort were applied to the normalized replication cohort data to derive imaging and behavioral composite scores, which were used to recover the contribution of each imaging and behavioral variable to the LCs (i.e., loadings). Out-of-sample replicability of imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings was generally high across LCs 1-5. This analysis is reported in the revised manuscript (P.18).

      “Generalizability of reported findings was also assessed by directly applying PCA coefficients and latent components weights from the PLS analysis performed in the discovery cohort to the replication sample data. Out-of-sample prediction was overall high across LCs1-5 for both imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings.”
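
      The projection described above can be sketched as follows. This is only an illustrative sketch on simulated data, not the authors' code: the function names (`fit_discovery`, `project`, `loadings`), the number of principal components, and the simulated cohorts are all assumptions, and loadings are taken here to be Pearson correlations between the original variables and the composite scores.

```python
import numpy as np

def zscore(a):
    """Column-wise z-score using the sample's own mean and standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def loadings(data, scores):
    """Pearson correlation of every data column with every score column."""
    return zscore(data).T @ zscore(scores) / (len(data) - 1)

def fit_discovery(X, Y, n_pcs):
    """Learn PCA coefficients and PLS (SVD) weights in the discovery cohort."""
    _, _, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    pca_coef = vt[:n_pcs].T                      # features x n_pcs
    Xs, Ys = zscore(X @ pca_coef), zscore(Y)
    U, _, Vt = np.linalg.svd(Xs.T @ Ys, full_matrices=False)
    return pca_coef, U, Vt.T                     # imaging and behavioral weights

def project(X, Y, pca_coef, U, V):
    """Apply discovery-derived coefficients and weights to a cohort."""
    Xs, Ys = zscore(X @ pca_coef), zscore(Y)     # normalized with this cohort's mean/SD
    return Xs @ U, Ys @ V                        # imaging and behavioral composite scores

def simulate(rng, n, wx, wy):
    """Hypothetical cohort: one shared latent factor drives imaging and behavior."""
    latent = rng.normal(size=(n, 1))
    return (latent @ wx + rng.normal(size=(n, wx.shape[1])),
            latent @ wy + rng.normal(size=(n, wy.shape[1])))

rng = np.random.default_rng(0)
wx = np.linspace(-1, 1, 30)[None, :]             # assumed imaging weights
wy = np.linspace(-1, 1, 8)[None, :]              # assumed behavioral weights
X_disc, Y_disc = simulate(rng, 500, wx, wy)
X_rep, Y_rep = simulate(rng, 500, wx, wy)

pca_coef, U, V = fit_discovery(X_disc, Y_disc, n_pcs=10)
_, beh_disc = project(X_disc, Y_disc, pca_coef, U, V)
img_rep, beh_rep = project(X_rep, Y_rep, pca_coef, U, V)

# Compare the behavioral loadings of the first latent component across cohorts.
r_behav = np.corrcoef(loadings(Y_disc, beh_disc)[:, 0],
                      loadings(Y_rep, beh_rep)[:, 0])[0, 1]
print(f"LC1 behavioral loading similarity: r = {r_behav:.3f}")
```

      Because the replication data are z-scored after projection, any constant offset introduced by the discovery-cohort PCA centering drops out, which is why the sketch does not carry the discovery mean forward.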

      2.2) Site/family structure: it was unclear how site/family structure were handled as covariates.

      Only unrelated participants were included in the discovery and replication samples (see P.6). The site variable was regressed out of the imaging and behavioral data prior to the PLS analysis using the residuals from a multiple linear model that also included age, age², sex, and ethnicity. This is now clarified on P.29:

      “Prior to the PLS analysis, effects of age, age², sex, site, and ethnicity were regressed out from the behavioral and imaging data using a multiple linear regression to ensure that the LCs would not be driven by possible confounders (Kebets et al., 2021, 2019; Xia et al., 2018). The imaging and behavioral residuals of this procedure were input to the PLS analysis.”
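
      A minimal sketch of this kind of confound regression is given below, assuming ordinary least squares with an intercept and dummy-coded site. The function name `residualize`, the simulated variables, and the coding choices are illustrative assumptions rather than the authors' implementation; ethnicity would be dummy-coded in the same way as site.

```python
import numpy as np

def residualize(data, confounds):
    """Regress confounds (plus an intercept) out of each column of data."""
    data = np.asarray(data, dtype=float)
    confounds = np.asarray(confounds, dtype=float)
    design = np.column_stack([np.ones(len(confounds)), confounds])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta                  # residuals passed on to the PLS step

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(9, 11, n)
sex = rng.integers(0, 2, n)
site = rng.integers(0, 3, n)
site_dummies = np.eye(3)[site][:, 1:]            # first site is the reference level
confounds = np.column_stack([age, age**2, sex, site_dummies])

imaging = rng.normal(size=(n, 5)) + 0.5 * age[:, None]   # age-contaminated features
imaging_resid = residualize(imaging, confounds)

# After residualization, the features are linearly uncorrelated with age.
r_age = np.corrcoef(age, imaging_resid[:, 0])[0, 1]
```

      The same function would be applied separately to the behavioral matrix, so that both inputs to the PLS analysis are residuals with respect to the same set of covariates.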

      2.3) Anatomical features: I was a bit surprised to see volume, surface area, and thickness all evaluated - and that there were several comments on the correspondence between the SA and volume in the results section. Given that cortical volume is simply a product of SA and CT (and mainly driven by SA), this result may be pre-required.

      As suggested, we reduced the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). Instead, we now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      We also reran the PLS analysis while only including thickness and surface area as our structural metrics, to account for potential redundancy of these measures with volume. This analysis and associated findings are reported on P.36 and P.19:

      “As cortical volume is a result of both thickness and surface area, we repeated our main PLS analysis while excluding cortical volume from our imaging metrics and report the consistency of these findings with our main model.”

      “Third, to account for redundancy within structural imaging metrics included in our main PLS model (i.e., cortical volume is a result of both thickness and surface area), we also repeated our main analysis while excluding cortical volume from our imaging metrics. Findings were very similar to those in our main analysis, with an average absolute correlation of 0.898±0.114 across imaging composite scores of LCs 1-5.”
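
      The consistency metric quoted above (average absolute correlation across composite scores of the LCs) can be computed as in the following sketch. The absolute value is presumably used because the sign of each SVD-derived component is arbitrary; the function name, the use of the sample standard deviation, and the simulated scores are assumptions for illustration only.

```python
import numpy as np

def mean_abs_correlation(scores_a, scores_b):
    """Mean and SD of |r| between matching components of two models."""
    r = np.array([np.corrcoef(scores_a[:, k], scores_b[:, k])[0, 1]
                  for k in range(scores_a.shape[1])])
    return np.abs(r).mean(), np.abs(r).std(ddof=1)

# Hypothetical composite scores for 5 LCs from a main and an alternative model.
rng = np.random.default_rng(1)
main_scores = rng.normal(size=(300, 5))
signs = np.array([1, -1, 1, -1, 1])              # components may flip sign between runs
alt_scores = main_scores * signs

mean_r, sd_r = mean_abs_correlation(main_scores, alt_scores)
```

      In this toy case the two sets of scores differ only by sign flips, so the signed correlations alternate between +1 and -1 while the absolute correlations all equal 1.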

      2.4) Ethnicity: the rationale for regressing ethnicity from the data was unclear and may conflict with current best practices.

      We thank the Reviewer for this comment. In light of recent discussions on including this covariate in large datasets such as ABCD (e.g., Saragosa-Harris et al., 2022), we elaborate on our rationale for including this variable in our model in the revised manuscript on P.30:

      “Of note, the inclusion of ethnicity as a covariate in imaging studies has been recently called into question. In the present study, we included this variable in our main model as a proxy for social inequalities relating to race and ethnicity alongside biological factors (age, sex) with documented effects on brain organization and neurodevelopmental symptomatology queried in the CBCL.”

      We also assess the replicability of our analyses when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models. We report resulting correlations in the revised manuscript (P.37, 19, and 27):

      “We also assessed the replicability of our findings when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models.”

      “Moreover, repeating the PLS analysis while excluding this variable as a model covariate yielded overall similar imaging and behavioral composites scores across LCs to our original analysis. Across LCs 1-5, the average absolute correlations reached r=0.636±0.248 for imaging composite scores, and r=0.715±0.269 for behavioral composite scores. Removing these covariates seemed to exert stronger effects on LC3 and LC4 for both imaging and behavior, as lower correlations across models were specifically observed for these components.”

      “Although we could consider some socio-demographic variables and proxies of social inequalities relating to race and ethnicity as covariates in our main model, the relationship of these social factors to structural and functional brain phenotypes remains to be established with more targeted analyses.”

      2.5) Data quality: the authors did an admirable job in controlling for data quality in the analyses of functional connectivity data. However, it is unclear if a comparable measure of data quality was used for the T1/dMRI analyses. This likely will result in inflated effect sizes in some cases; it has the potential to reduce sensitivity to real effects.

      We agree that data quality was not accounted for in our analysis of T1w- and diffusion-derived metrics. We have now accounted for T1w image quality by adding manual quality control ratings to the regressors applied to all structural imaging metrics prior to performing the PLS analysis, and we report the consistency of this new model with the original findings. See P.36, P.19:

      “We also considered manual quality control ratings as a measure of T1w scan quality. This metric was included as a covariate in a multiple linear regression model accounting for potential confounds in the structural imaging data, in addition to age, age², sex, site, ethnicity, ICV, and total surface area. Downstream PLS results were then benchmarked against those obtained from our main model.”

      “Considering scan quality in T1w-derived metrics (from manual quality control ratings) yielded similar results to our main analysis, with an average correlation of 0.986±0.014 across imaging composite scores.”

      As for diffusion imaging, we also regressed out effects of head motion, in addition to age, age², sex, site, and ethnicity, from FA and MD measures and report the consistency with our original results (P.36, P.19):

      “We tested another model which additionally included head motion parameters as regressors in our analyses of FA and MD measures, and assessed the consistency of findings from both models.”

      “Additionally considering head motion parameters from diffusion imaging metrics in our model yielded consistent results to those in our main analyses (mean r=0.891, S.D.=0.103; r=0.733-0.998).”

      Reviewer #3 - Public Review

      In this study, the authors utilized the Adolescent Brain Cognitive Development dataset to investigate the relationship between structural and functional brain network patterns and dimensions of psychopathology. They identified multiple components, including a general psychopathology (p) factor that exhibited a strong association with multimodal imaging features. The connectivity signatures associated with the p factor and neurodevelopmental dimensions aligned with the sensory-to-transmodal axis of cortical organization, which is linked to complex cognition and psychopathology risk. The findings were consistent across two separate subsamples and remained robust when accounting for variations in analytical parameters, thus contributing to a better understanding of the biological mechanisms underlying psychopathology dimensions and offering potential brain-based vulnerability markers.

      3.1) An intriguing aspect of this study is the integration of multiple neuroimaging modalities, combining structural and functional measures, to comprehensively assess the covariance with various symptom combinations. This approach provides a multidimensional understanding of the risk patterns associated with mental illness development.

      We thank the Reviewer for acknowledging the multimodal approach, and for the constructive suggestions.

      3.2) The paper delves deeper into established behavioral latent variables such as the p factor, internalizing, externalizing, and neurodevelopmental dimensions, revealing their distinct associations with morphological and intrinsic functional connectivity signatures. This sheds light on the neurobiological underpinnings of these dimensions.

      We are happy to hear the Reviewer appreciates the gain in understanding neural underpinnings of dimensions of psychopathology resulting from the current work.

      3.3) The robustness of the findings is a notable strength, as they were validated in a separate replication sample and remained consistent even when accounting for different parameter variations in the analysis methodology. This reinforces the generalizability and reliability of the results.

      We appreciate that the Reviewer found our robustness and generalizability assessment convincing.

      3.4) Based on their findings, the authors suggest that the observed variations in resting-state functional connectivity may indicate shared neurobiological substrates specific to certain symptoms. However, it should be noted that differences in resting-state connectivity between groups can stem from various factors, as highlighted in the existing literature. For instance, discrepancies in the interpretation of instructions during the resting state scan can influence the results. Hence, while their findings may indicate biological distinctions, they could also reflect differences in behavior.

      For the ABCD dataset, resting-state fMRI scans were based on eyes open and passive viewing of a crosshair, and are thus homogenized. We acknowledge, however, that there may still be state-to-state fluctuations contributing to the findings, and this is now discussed in the revised Discussion, on P.28. Note, however, that prior literature has generally also suggested rather modest impacts of cognitive and daily variation on resting-state functional networks, compared to much more dominating inter-individual and inter-group factors.

      “Finally, while prior research has shown that resting-state fMRI networks may be affected by differences in instructions and study paradigm (e.g., with respect to eyes open vs closed) (Agcaoglu et al., 2019), the resting-state fMRI paradigm is homogenized in the ABCD study to be passive viewing of a centrally presented fixation cross. It is nevertheless possible that there were slight variations in compliance and instructions that contributed to differences in associated functional architecture. Notably, however, there is a mounting literature based on high-definition fMRI acquisitions suggesting that functional networks are mainly dominated by common organizational principles and stable individual features, with substantially more modest contributions from task-state variability (Gratton et al. 2018). These findings, thus, suggest that resting-state fMRI markers can serve as powerful phenotypes of psychiatric conditions, and potential biomarkers (Abraham et al., 2017; Gratton et al., 2020; Parkes et al., 2020).”

      3.5) The authors conducted several analyses to investigate the relationship between imaging loadings associated with latent components and the principal functional gradient. They found several associations between principal gradient scores and both within- and between-network resting-state functional connectivity (RSFC) loadings. Assessing the analysis presented here proves challenging due to the nature of relating loadings, which are partly based on the RSFC, to gradients derived from RSFC. Consequently, a certain level of correlation between these two variables would be expected, making it difficult to determine the significance of the authors' findings. It would be more intriguing if a direct correlation between the composite scores reflecting behavior and the gradients were to yield statistically significant results.

      We thank the Reviewer for the comment and agree that investigating gradient-behavior relationships could offer additional insights into the neural basis of psychiatric symptomatology. However, the current analysis pipeline precludes this direct comparison, which is performed on a region-by-region basis across the span of the cortical gradient: the behavioral loadings are defined for each CBCL item, not for cortical regions.

      The Reviewer also raises concerns about potential circularity in our analysis, as we compared imaging loadings, which are partially based on RSFC, with gradient values generated from the same RSFC data. In response to this comment, we cross-validated our findings using an RSFC gradient derived from an independent dataset (HCP), which yielded findings highly consistent with those presented in the manuscript. This correlation is now reported in the Results section, P.15.

      “A similar pattern of findings was observed when cross-validating between- and within-network RSFC loadings to a RSFC gradient derived from an independent dataset (HCP), with strongest correlations seen for between-network RSFC loadings for LC1 and LC3 (LC1: r=0.50, p_spin<0.001; LC3: r=0.37, p_spin<0.001).”

      We furthermore note similar correlations between imaging loadings and T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). These findings are now detailed in the revised Results, P.15-16:

      “Of note, we obtain similar correlations when using T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). Specifically, we observed the strongest association between this microstructural marker of the cortical hierarchy and between-network RSFC loadings related to LC1 (r=-0.43, p_spin<0.001).”

      3.6) Lastly, regarding the interpretation of the first identified latent component, I have some reservations. Upon examining the loadings, it appears that LC1 primarily reflects impulse control issues rather than representing a comprehensive p-factor. Furthermore, it is worth noting that within the field, there is an ongoing debate concerning the interpretation and utilization of the p-factor. An insightful publication on this topic is "The p factor is the sum of its parts, for now" (Fried et al, 2021), which explains that the p-factor emerges as a result of a positive manifold, but it does not necessarily provide insights into the underlying mechanisms that generated the data.

      We thank the Reviewer for this comment and have added greater nuance to the discussion of the association with the p factor. We furthermore discuss some of the ongoing debate about the use of the p factor, and cite the recommended publication on P.27.

      “Other factors have also been suggested to impact the development of psychopathology, such as executive functioning deficits, earlier pubertal timing, negative life events (Brieant et al., 2021), maternal depression, or psychological factors (e.g., low effortful control, high neuroticism, negative affectivity). Inclusion of such data could also help to add further mechanistic insights into the rather synoptic proxy measure of the p factor itself (Fried et al., 2021), and to potentially assess shared and unique effects of the p factor vis-à-vis highly correlated measures of impulse control.”

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers for going through our manuscript and providing valuable feedback. We are grateful to all 3 reviewers for describing our findings as important and valuable, well-designed and robust, and of value to the Parkinson's and Crohn's disease communities studying LRRK2. Below we detail a point-by-point response to the reviewers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper by Dikovskaya and collaborators investigated the activity and expression of LRRK2 in different subtypes of splenic and intestinal immune cells, taking advantage of a novel GFP-Lrrk2 knockin mouse. Interestingly, they found that T-cell-released IL-4 stimulates Lrrk2 expression in B cells. I have a few comments and suggestions for the authors. 1) Figure 1C. LRRK2 KO cells display residual Rab10 phosphorylation. Do the authors have any idea of which kinase other than LRRK2 could be involved in this phosphorylation?

      As far as we are aware, no other kinase is known to phosphorylate Rab10 at T73 in vivo. In vitro, recombinant Rab10 can be phosphorylated by MST3 at this site (Knebel A. et al, protocols.io https://dx.doi.org/10.17504/protocols.io.bvjxn4pn), but its relevance in vivo or in cells has not been shown. It is possible that the residual band recognised by the anti-pT73 Rab10 antibody in splenocytes is non-specific background, as it is mainly seen in LRRK2 KO spleen cells and not in other tissues. To be certain that our assay assesses LRRK2-dependent Rab10 phosphorylation, we have always compared with the MLi-2 control.

      2) Since there are no good antibodies for IF/IHC, as pointed out by the authors, the GFP-Lrrk2 mouse gives the opportunity to check endogenous LRRK2 localization, i.e. in cells untreated or treated with IL-4 or other cytokines. Also, does endogenous GFP-LRRK2 accumulate into filaments/puncta upon MLi2 inhibition? The relocalization into filaments of inhibited LRRK2 has been observed in overexpression but not under endogenous expression. This analysis would be interesting also in light of the observed side effect of type-I inhibitors.

      We thank the reviewer for this suggestion. We will attempt super-resolution microscopy using Airyscan on isolated B-cells treated with cytokine and/or LRRK2 inhibitor to address this question.

      3) Figure 5. The authors need to label more clearly the graphs referring to wt mice versus GFP-Lrrk2 KI mice.

      We have now labelled the panels referring to the WT mice only with "WT mice", to distinguish them from the other panels that incorporate data from both EGFP-Lrrk2 mice and their WT littermates used as a background.

      They should also replace GFP-LRRK2 with GFP-Lrrk2 since they edited the endogenous murine gene.

      Thank you, we have corrected it, and also the other mouse genotypes.

      4) In the material and methods MLi-2 administration in mice is indicated at 60 mg/kg for 2 hr whereas in suppl. figure 5 the indicated dose is 30 mg/kg. Please correct with the actual dose used.

      Thank you, we have corrected the mistake.

      5) The discovery of IL-4 as a Lrrk2 activator in B cells is a very interesting and novel finding. The authors could take advantage of the GFP tag to investigate LRRK2 interactome upon IL-4 stimulation (optional). Also, is the signaling downstream of IL-4 attenuated in Lrrk2 KO cells?

      We thank the reviewer for these interesting suggestions. The role of LRRK2 in IL-4 activated B-cells is currently under active research in the lab.

      Reviewer #1 (Significance (Required)):

      The manuscript is well designed and organized, and the experimental approaches are robust. These results are significant for the field as they add additional layers in the complex regulation and regulatory roles of LRRK2 in immunity, with implication for inflammatory disorders and Parkinson's disease.

      We thank the reviewer for their positive comments and for recognising our efforts to provide some clarity to a complex field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors present a flow cytometry methodology to assess LRRK2 expression and pathway markers in mouse models and explore LRRK2 in splenic and intestinal immune cells. This is a highly valuable study given the emerging understanding that LRRK2 pathway activity in peripheral tissues may be of crucial importance to Parkinson's disease and Crohn's disease. P8 : the authors state that their results indicate 'that the effects of LRRK2-R1441C mutation and inflammation on LRRK2 activity represent two different parallel pathways'. This seems like an overinterpretation as pathway suggests the presence of additional partners in the pathway while R1441C is a LRRK2 intrinsic modification. The results can equally be explained by synergistic effects between both activation mechanisms (mutant and inflammation).

      We agree with the reviewer, and have added this into the text. The sentence now reads "suggesting that the LRRK2-R1441C mutation and inflammation have different impacts on LRRK2 activity, either in parallel or in synergy."

      Methods and experiment descriptions in results : the authors appear to use the terms anti-CD3 stimulation and CD3 stimulation interchangeably, although it is not always clear in the text that these are synonymous. This should be clarified.

      We thank the reviewer for pointing out this error on our part. We have made the necessary changes to always refer to the stimulation as anti-CD3.

      One major observation in this paper is that LRRK2 is not detected in gut epithelial cells as previously has been reported. It would be useful to comment on any differences between the presented protocol and the previous reports, in particular relating to the antigen retrieval step. In order to reinforce the finding, it would be useful to include in situ hybridization data that could further strengthen the observations of which cellular subtypes express LRRK2 and which do not. Indeed, while the KO control shows that there is an unacceptable high non-specific staining, it does not prove absence of expression. Also, can any conclusions be made about expression of LRRK2 in neural cells of the gut? This important information on LRRK2 detection in gut should be mentioned in the abstract and highlighted in the discussion.

      We thank the reviewer for pointing this out. In fact, we think the observation that LRRK2 is not detected in epithelial cells is so important that we have a separate manuscript exploring this point. Please see Tasegian, A. et al., https://doi.org/10.1101/2024.03.07.582590 (2024). In that manuscript we explored the expression of LRRK2 in human and murine intestinal epithelial cells using qPCR. Although we do not have in situ hybridization data, we believe that using both the EGFP-LRRK2 and the pRab10 flow cytometry, as well as qPCR and proteomics on selected cell types, corroborates our findings on the cell types that express LRRK2. We did not analyse LRRK2 expression in the neural cells of the gut, as the focus was on the immune cells; however, we hope that others will use the tools developed here to explore this further.

      The authors mention in the discussion that they 'show for the first time that eosinophils also express active LRRK2 at levels comparable to B-cells and DCs.' The relevance of this finding should be further developed. Why is this important?

      We thank the reviewer for this point. We do not yet know what role LRRK2 plays in these cells. However, as the role of LRRK2 in eosinophils and neutrophils has not yet been explored and both cell types play important roles in IBD, we think it is important to point out. We have now added a sentence to the discussion highlighting the importance of eosinophils in IBD: "Since eosinophils have recently been implicated as key player in intestinal defense and colitis (Gurtner et al, 2022), it will be interesting to evaluate LRRK2 functions in these cells."

      In the isolation of lamina propria cells, what efforts were made to characterize the degree of purification of the lamina propria cells compared to cells of other gut wall layers such as epithelium, muscularis mucosa, or deeper layers? Please specify.

      Isolation of lamina propria cells is a very well-established process (LeFrancois and Lycke, 'Isolation of Mouse Small Intestinal Intraepithelial Lymphocytes, Peyer's Patch, and Lamina Propria Cells.' Curr. Protocols in Immunology 2001), where we extensively wash off the epithelial layer before digesting the tissue for the LP. After the digestion the muscle and wall of the gut are still intact, so we do not get any contamination with other deeper layers. The subsets of cells we find in the LP are in line with isolations from other labs.

      Minor comments

      Figure 5G: for the graphs indicating LRRK2 activity and LRRK2 phosphorylation, the specific measures should be specified in the graph titles to avoid any ambiguity (pT73-Rab10, pS935-LRRK2).

      We have added the specifications to the new version of the figure.

      Suppl figure 1 : please specify the figure label and abbreviation AF568 in the legend. Suppl figure 2 : please specify the figure label and abbreviation anti-rb in the legend

      Thank you, we have added the abbreviations to the legends. The figure labels for both figures were already included at the top of the figure legends.

      Reviewer #2 (Significance (Required)):

      The authors present a flow cytometry methodology to assess LRRK2 expression and pathway markers in mouse models and explore LRRK2 in splenic and intestinal immune cells. This is a highly valuable study given the emerging understanding that LRRK2 pathway activity in peripheral tissues may be of crucial importance to Parkinson's disease and Crohn's disease.

      We thank the reviewer for recognising the value of this study.

      Reviewer #3

      Evidence, reproducibility and clarity

      The paper describes a set of experiments to analyse LRRK2 activity in tissues and, although it has very important findings and technical developments, it is largely descriptive. It looks more like a collection of experiments than a defined hypothesis with experiments to address it.

      We thank the reviewer for recognising the importance of our findings and the technical developments. We agree that the paper's focus is to describe, in a hypothesis-free and unbiased manner, where LRRK2 is expressed in immune cells and in which cells it is active or activated after inflammation. We believe these are important data to share as a resource for the wider LRRK2 community, and we will submit the manuscript as a Resource.

      The flow cytometry assay of the first part is a great technical challenge and represents the establishment of a potentially very useful tool for the field. It would have been important to test other organs, either as controls or for example because of their relevance e.g. lungs. This first part is disconnected from the second part below.

      We thank the reviewer for pointing out that the pRab10 assay would be useful to apply to other organs too. Since we are interested in the role of LRRK2 in IBD, we focused on applying the pRab10 assay to intestinal tissue, with spleens also analysed as a major lymphoid organ and a source of immune cells that can translocate to the gut in inflammation. We hope that the publication of this method will allow other researchers to analyse other tissues in the future.

      The authors generated a new KI mouse expressing EGFP-LRRK2, show data indicating that LRRK2 expression levels are reduced to different degrees across tissues, and established a flow cytometry assay to measure LRRK2 expression by monitoring the GFP signal. Interestingly, they found that expression does not correlate with activity (as measured by phospho-Rabs). I suggest taking this part out as it breaks the flow of the paper. If data using this mouse are included, then microscopy should be included to complement the flow cytometry data. I understand the mice were used later with the anti-CD3 treatment, but it is very confusing that some experiments are done with EGFP-LRRK2 mice and others are not. In general, the mice do not appear to behave like wild types, and this is an important caveat. Without microscopy of the tissues or even cells (Figure 4), it is hard to conclude much from these experiments.

      We thank the reviewer for this point and would like to explain. It is true that in Suppl Figure 5 we show a reduction of LRRK2 signal in the EGFP-Lrrk2-KI mice. However, based on immunoblotting, a significant reduction in EGFP-LRRK2 expression levels was seen only in the brain, and not in the tissues we analysed, that is, the spleen and the intestine. Further, we have shown clearly using proteomics (Fig. 3D and 5E) that the GFP signal in immune cells correlates very well with WT LRRK2 expression. Therefore, we think that the GFP signal in these mice reflects the WT LRRK2 expression pattern. Further, despite the limitations of reduced kinase activity that we thoroughly describe, we think this model is very useful since no antibodies work to stain for LRRK2 in mice. We therefore respectfully disagree with this reviewer that the EGFP-LRRK2 data should be taken out, as the model has proven to be an invaluable tool to measure and track changes in endogenous LRRK2 expression. Moreover, we think the fact that LRRK2 expression does not correlate with levels of activity, that is, that LRRK2 is more active in some immune cells than in others, is a very important finding that provides evidence for cell-specific regulation of LRRK2 activity beyond its expression level.

      We tried but failed to visualize the EGFP-LRRK2 signal using fluorescence microscopy in the tissue. This is most likely due to the low expression of LRRK2 (proteomics data suggests that even neutrophils express less than 9000 copies), confounded further by the high background autofluorescence in tissues, especially in the gut. We now explain the lack of tissue images from the EGFP-LRRK2 mice in the text. However, we can visualize the EGFP-LRRK2 in B cells, and we will provide these images in a revised version of the manuscript.

      We have also added the following paragraph to the discussion:

      "We complemented the pRab10 assay with the development of the EGFP-Lrrk2-KI reporter mouse. Although the reporter was initially designed as a fluorescent tracker for imaging LRRK2 localisation in cells and tissues, the low expression of LRRK2, combined with high and variable autofluorescence in tissues precluded its use for microscopy. Even in neutrophils, which express highest level of LRRK2 among immune cells, there are less than 9000 copies of LRRK2 per cell (Sollberger et al, 2024), making it difficult to identify localization. However, the EGFP signal was sufficient for flow cytometry-based measurements, where background autofluorescence of each cell type was taken into account and subtracted."
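      The per-cell-type background subtraction described in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: it assumes that the background for each cell type is estimated as the median GFP-channel intensity of matched wild-type (non-reporter) cells, and the function name and all intensity values are hypothetical.

```python
from statistics import median


def background_subtracted_signal(reporter, control):
    """Per cell type: median reporter intensity minus median control intensity.

    `reporter` and `control` map a cell-type name to a list of per-cell
    GFP-channel intensities. Using the median of wild-type cells as the
    autofluorescence estimate is an assumption made for this sketch.
    """
    result = {}
    for cell_type, values in reporter.items():
        if cell_type not in control:
            raise KeyError(f"no autofluorescence control for {cell_type!r}")
        result[cell_type] = median(values) - median(control[cell_type])
    return result


# Hypothetical intensities (arbitrary units), invented for illustration only.
reporter_cells = {
    "B cell": [520.0, 540.0, 560.0],
    "neutrophil": [900.0, 950.0, 1000.0],
}
wild_type_cells = {
    "B cell": [200.0, 210.0, 220.0],
    "neutrophil": [400.0, 420.0, 440.0],
}

signal = background_subtracted_signal(reporter_cells, wild_type_cells)
print(signal)
```

      Subtracting a separate background for each cell type matters because autofluorescence differs between cell types, so a single global offset would bias the comparison.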

      Then the authors show that LRRK2 expression and activity is different in different cell types and depends on inflammation. The anti-CD3 strategy to induce inflammation is very different from physiological inflammation such as sepsis and LPS stimulation, so experiments with other stimuli could be important here to contribute to the message of inflammatory trigger of LRRK2 activation and decoupling of cell type.

      We thank the reviewer for this suggestion. We used the anti-CD3 model as it also causes intestinal inflammation and mimics the T-cell cytokine storms that happen in many diseases. However, for the revisions we will also test another model of inflammation as suggested, such as LPS stimulation, to measure how inflammation affects LRRK2 expression and activity.

      The IL-4 data is intriguing but too preliminary. The lack of a strong effect of IFN-gamma is expected, as the promoter of LRRK2 differs between mice and humans, and human cells respond much better with regard to LRRK2 expression after IFN-gamma stimulation.

      We are unsure what the reviewer means by saying the IL-4 data is preliminary. We have shown by flow cytometry, immunoblotting, qPCR and proteomics that IL-4 induces LRRK2 expression in B-cells, so we are uncertain as to how else this can be shown. As to the effect of IFNγ on LRRK2 expression, it may indeed be that human cells respond better than murine cells. Importantly, the ability of IL-4 to induce LRRK2 in B-cells is a novel and important finding, regardless of the effects of IFNγ.

      Reviewer #3 (Significance (Required))

      The paper describes a set of experiments to analyse LRRK2 activity in tissues and, although it has very important findings and technical developments, it is largely descriptive. It looks more like a collection of experiments than a defined hypothesis with experiments to address it.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Revisions Round 1

      Reviewer #1

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review): 

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We have tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors): 

      - remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"   

      We have rephrased this sentence.

      - for same reason, remove "Obviously, " 

      Done

      - What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      - What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minima of the multi-dimensional function that describes the energy of the amyloid are not deep, so that subtle changes to the environment can favor another fold/energy minimum. We have left the sentence because, while it may not be perfect, it is concise and seems to get the point across.

      - "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      - Remove "historically" 

      Done

      - Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      - "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      - Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.  

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      - Rephrase: "is not always 100% faithful"

      Removed “100%”

      - What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      - Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      - "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      - The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      - A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      - Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that pH scale is only applicable to the structures presented in this manuscript. 

      - Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat isolated on its own; we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of Figure 1.

      - Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      - Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.  

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      - Many references are incorrect, containing "Preprint at (20xx)" statements.

      This has been corrected.  

      Reviewer #3 (Public Review): 

      Weaknesses: 

      (1) The authors reveal that the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorphs 3B and 3C, indicating a degree of instability in fibrillar formation. This variability could pose challenges for replicability in subsequent research. In light of these observations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming, and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7, where the largest variety of polymorphs has been observed. However, even identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments, we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of these additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils had already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common a-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrillization), we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However, the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease-relevant structures. Still, it is rare that a single polymorph appears at pH 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample, and the Type 1M sample also contained unidentified double-filament fibrils). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? This could be addressed in a methodologically robust way through a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, is very important for our understanding of the aggregation pathways of alpha-synuclein and is currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not entirely clear from the G51D cross-seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding, since different conditions were used for the seedless wild-type and mutant aggregations. However, it appears that the wild-type without seeds was in Tris pH 7.5 (although at 37 °C the pH could have dropped to approximately 7) and the cross-seeded wild-type was in phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrillizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed-specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns, and that only the former is able to model key aspects of Lewy Body disease, support the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor: 

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

      Revisions Round 2

      Reviewer #2 (Public Review): 

      I do worry that the FSC values of model-vs-map appear to be higher than expected from the corresponding FSCs between the half-maps (e.g. see Fig 13). The implication of this observation is that the atomic models may have been overfitted in the maps, which would have led to a deterioration of their geometry. A table with rmsd on bond lengths, angles, etc would probably show this. In addition, to check for overfitting, the atomic model for each data set could be refined in one of the half-maps, and then that same model could be used to calculate 2 FSC model-vs-map curves: one against the half-map it was refined in and one against the other half-map. Deviations between these two curves are an indication of overfitting. 

      Thank you for the recommendations for model validation. We have added the suggested statistics to Table 2, performed the suggested model fitting to one of the half-maps, and plotted 3 FSC model-vs-map curves: one for each half-map versus the model fit against only one half-map, and one for the model fit against the full map. We feel that the degree of overfitting is reasonable and does not significantly impact the quality of the models.
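      The half-map cross-validation check discussed here compares two FSC model-vs-map curves: one against the half-map used for refinement and one against the other half-map. A minimal sketch of that comparison follows; it assumes the two curves have already been computed on the same resolution shells by the refinement software, and the function name and all FSC values are hypothetical, chosen only to illustrate the arithmetic.

```python
def max_fsc_gap(fsc_work, fsc_free):
    """Return (shell_index, gap) for the largest FSC_work - FSC_free difference.

    fsc_work: model-vs-map FSC against the half-map used in refinement.
    fsc_free: model-vs-map FSC against the other (unused) half-map.
    Both lists must be sampled on the same resolution shells.
    """
    if len(fsc_work) != len(fsc_free):
        raise ValueError("FSC curves must have the same number of shells")
    if not fsc_work:
        raise ValueError("FSC curves must not be empty")
    gaps = [work - free for work, free in zip(fsc_work, fsc_free)]
    shell = max(range(len(gaps)), key=lambda i: gaps[i])
    return shell, gaps[shell]


# Hypothetical curves, ordered from low to high resolution shells.
fsc_work = [0.99, 0.95, 0.80, 0.50, 0.20]
fsc_free = [0.99, 0.94, 0.76, 0.42, 0.15]

shell, gap = max_fsc_gap(fsc_work, fsc_free)
print(f"largest gap {gap:.2f} at shell {shell}")
```

      A small gap across all shells is consistent with limited overfitting; what counts as acceptable is a judgment call that this sketch does not encode.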

      In addition, the sudden drop in the FSC curves in Figure 16 shows that something unexpected has happened to this refinement. Are the authors sure that only the procedures outlined in the Methods were used to create these curves? The unexpected nature of the FSC curve for this type (2A) raises doubts about the correctness of the reconstruction. 

      We thank the reviewer for the attention to detail. We should have caught this mistake. It turns out that in the last round of 3D refinement, the two half-maps became shifted with respect to each other in the z direction. We realigned the two maps using Chimera and then re-ran the postprocessing. The new maps have been deposited in EMD-50850. This mistake motivated us to inspect all of the maps, and we found that the same problem had occurred in the Type 3B maps. This was not noticed by the reviewer because we accidentally plotted the FSC curves from postprocessing from one refinement round before the one deposited in the EMD. We performed the same half-map shifting procedure for the Type 3B data and performed a final round of real-space refinement to produce new maps and models that have been deposited as EMD-50888 and 9FYP (superseding the previous entries).

      Reviewer #3 (Public Review): 

      There are two minor points I recommend the authors to address: 

      (1) In the response to Weakness 1, point (3), the authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      We aim to be as transparent as possible, and this information was included in the main text. However, we did not label the percentage of Type 5 fibrils in Figure 4 because that would make the other percentages ambiguous. The percentages in Figure 4 represent the ratio of helical segments used for each type of refined structure in the dataset (always adding up to 100%), not the percent of all fibrils in the dataset. That is, there are sometimes untwisted or unidentifiable fibrils in datasets, and these were not accounted for in the listed percentages. We have added a sentence to the Figure 4 legend to explain what the percentages refer to.
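      The distinction between the two kinds of percentage can be sketched as follows. This is only an illustration: the type names and segment counts are hypothetical and are not taken from the datasets in the manuscript.

```python
def segment_percentages(segment_counts):
    """Share of refined helical segments per fibril type; the shares sum to 100%."""
    total = sum(segment_counts.values())
    return {name: 100.0 * n / total for name, n in segment_counts.items()}


# Hypothetical segment counts for three refined fibril types.
counts = {"type_a": 6000, "type_b": 3000, "type_c": 1000}
percentages = segment_percentages(counts)

# Hypothetical segments from untwisted or unidentifiable fibrils that never
# enter a refined structure and so are absent from the shares above.
unclassified = 2500
share_of_all = 100.0 * counts["type_a"] / (sum(counts.values()) + unclassified)
```

      With these made-up numbers, the same fibril type has a larger share among the refined segments than among all segments in the dataset, which is why mixing the two kinds of percentage in one figure would be ambiguous.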

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Thank you for reminding us to add the scale bars. This is now done for the 2D classes in Figures 11-17.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      A critical look at the maps and models of the various structures at this stage may prevent the authors from entering suboptimal structures into the databases.  

      We agree. Thank you for suggesting this.

      Reviewer #3 (Recommendations For The Authors): 

      The authors have responded adequately to these critiques in the revised version of the manuscript. There are two minor points. 

      (1) The authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Answered in public comments

    1. Author response:

      eLife assessment

      Cav2 voltage-gated calcium channels play key roles in regulating synaptic strength and plasticity. In contrast to mammals, invertebrates like Drosophila encode a single Cav2 channel, raising questions on how diversity in Cav2 is achieved from a single gene. Here, the authors present convincing evidence that two alternatively spliced isoforms of the Cac gene (cacophony, also known as Dmca1A and nightblindA) enable diverse changes in Cav2 expression, localization, and function in synaptic transmission and plasticity. These valuable findings will be of interest to a variety of researchers.

      We suggest replacing “two alternatively spliced isoforms of the Cac gene” by “two alternatively spliced mutually exclusive exon pairs of the Cac gene”. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the 1st and 2nd set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones and a decrease in neurotransmitter release and the ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and leads to the protein being degraded, rather than any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

      We agree that we need to explain more clearly why IS4B is unlikely to be required for channel stability and instead likely has a unique function at the presynaptic active zone of fast synapses. We will address this by revising the text and by providing additional data. If IS4B were required for evoked release because it supported channel protein stability, then the removal of IS4B should cause protein degradation throughout all sub-neuronal compartments and throughout the CNS, but this is not the case. First, upon removal of IS4B in adult motoneurons (which use cac channels at the presynapse and somatodendritically, Ryglewski et al., 2012), evoked release from axon terminals is abolished (as at the larval NMJ), but somatodendritic cac inward current is present. If IS4B were required for cac channel stability, somatodendritic current should also be abolished. We will add these data to the manuscript. Second, immunohistochemistry for tagged IS4B channels reveals that these are present not only at presynaptic active zones at the NMJ but also throughout the VNC motor neuropils. Excision of IS4B causes the absence of cac channels from the presynaptic active zones at the NMJ and throughout the VNC neuropils (and accordingly this is lethal). By contrast, tagged IS4A channels (with IS4B excised) are not found at the presynaptic terminals of fast synapses but instead in other distinct parts of the CNS. We will also provide data to show this. Together, these data are in line with a unique requirement of IS4B at presynaptic active zones (not excluding additional functions of IS4B), whereas IS4A-containing cac isoforms mediate different functions.

      We appreciate the additional reviewer suggestions to the authors that we will address point by point when revising the ms. 

      Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exons are deleted to determine the functional consequences at the neuromuscular junction. They find alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones. Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function generated through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

      We agree that some additional information on cac isoform localization (in particular for splicing at the IS4 site) will strengthen the manuscript. We will address this by providing additional data and revising text (see responses to reviewers 1 and 3). We are also grateful for the additional reviewer suggestions which we will address point by point when revising the ms.  

      Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of around 50% of CaV2 channels, thereby contributing to around 50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      We understand the reviewer’s comment and will do the following to convincingly demonstrate absence of cac from presynaptic active zones upon IS4B excision. First, we will show selective enlargements of IS4A and IS4B with Brp in presynaptic active zones to show distinct cac label in active zones following excision of IS4A but not following excision of IS4B. Second, we will provide Pearson’s co-localization coefficients of Brp with IS4B and with IS4A, respectively. Third, we will reduce the intensity of the green channels in figures 2C and 2H to the same levels as in 2A and B, and H control to allow a fair comparison of cac intensities following excision of IS4B versus excision of IS4A and control. We had increased intensity to show that following excision of IS4B, no distinct cac label is found in active zones, even at high exaggerated image brightness. However, we agree with the reviewer that the bright background hampers interpretation and thus will show the same intensity in all images that need to be compared.
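      As a minimal sketch of the coefficient mentioned above, Pearson's correlation between two imaging channels can be computed from paired pixel intensities. The intensity values below are invented for illustration, and the function name is hypothetical; a real analysis would typically restrict the pixels to a region of interest and handle background, which this sketch does not attempt.

```python
import math


def pearson_coefficient(channel_a, channel_b):
    """Pearson's r between two equally long lists of paired pixel intensities."""
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    return covariance / math.sqrt(var_a * var_b)


# Invented intensities for one scaffold channel and two channel-label scenarios.
brp = [10.0, 50.0, 200.0, 30.0]
label_following_brp = [5.0, 25.0, 100.0, 15.0]   # scales with the Brp signal
label_avoiding_brp = [90.0, 60.0, 5.0, 80.0]     # high where Brp is low

r_following = pearson_coefficient(brp, label_following_brp)
r_avoiding = pearson_coefficient(brp, label_avoiding_brp)
```

      A coefficient near 1 indicates that the two signals rise and fall together across pixels, whereas a value near or below zero indicates that they do not; the coefficient is insensitive to overall brightness, so it is unaffected by a uniform scaling of one channel.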

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      We will precisely define channel localization, and we will explain why it is highly unlikely that the absence of IS4B channels as well as the lower number of I-IIA channels are simply a consequence of reduced expression, but instead of splice variant specific channel function and localization. For example, upon excision of IS4B no cac channels are found at the presynaptic active zones and these synapses are thus non-functional. The isoforms containing the mutually exclusive IS4A exon are expressed and mediate other functions (see also response to reviewer 1) but cannot substitute IS4B containing isoforms at the presynapse. In fact, our Western blots are in line with reduced cac expression if all isoforms that mediate evoked release are missing, again indicating that the presynapse specific cac isoforms cannot be replaced by other cac isoforms (see also below, response to (3)). Feedback mechanisms that regulate cac expression in the absence of presynapse specific cac isoforms are beyond the scope of this study.

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      We will provide additional evidence that mutually exclusive splicing at the IS4 site results in cac channels that localize to the presynaptic active zone (IS4B) versus cac channels that localize to other brain parts and/or other subneuronal compartments (see response to reviewer 1). In addition, we already show in figure 2J that IS4B is required for normal cac HVA current, and we can add data showing that IS4A is not essential for cac HVA current. Similarly, for I-II we find it unlikely that differential splicing regulates channel numbers; rather, we think it serves splice variant specific functions in different brain parts and different sub-neuronal compartments. To substantiate this interpretation, we will add data from developing adult motoneurons showing that excision of I-IIA causes reduced activity induced calcium influx into dendrites (new data), but it does not reduce channel number at the larval NMJ (figure 4). In our opinion these data are not in line with the idea that splicing regulates cac expression levels, and this in turn, results in specific defects in distinct neuronal compartments. However, we agree that the lack of isoforms with specific functions results in altered overall cac expression levels as indicated by our Western data. If isoforms normally abundantly expressed throughout most neuropils are missing due to exon excision, we indeed find less cac protein in Westerns. By contrast, the lack of isoforms with little abundance has little effect on cac expression levels. This may be the result of unknown feedback mechanisms which are beyond the scope of this study.

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      There are several possibilities to explain this, but as none of the effects are statistically significant, we prefer to not investigate this in depth. However, given that we cannot find IS4A at the presynaptic active zone, IS4A is unlikely to have a direct negative effect on release probability. Nonetheless, given that IS4A containing cac isoforms mediate functions in other neuronal compartments it may regulate release indirectly by affecting action potential shape. We will provide data in response to the more detailed suggestions to authors that will provide additional insight.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      Most importantly, HVA current is mostly abolished upon excision of IS4B (not IS4A, we think the reviewer accidentally mixed up the genotype). This indicates that the cac isoforms that mediate evoked release encode HVA channels. However, the somatodendritic current shown in figure 2J that remains upon excision of IS4B is mediated by IS4A containing cac isoforms. Please note that these never localize to the presynaptic active zone, thus the small inactivating HVA that remains in figure 2J does normally not mediate evoked release. Therefore, the interpretation is that specifically HVA current encoded by IS4B cac isoforms is required for synaptic transmission. Reduced cac current density is not the cause for this phenotype because a specific current component is absent. 

      We agree with the reviewer that a deeper electrophysiological analysis of cac currents mediated by IS4B containing isoforms will be instructive. However, a precise analysis of activation and inactivation voltages and kinetics suffers from space clamp issues in recordings from the soma of such complex neurons (DLM motoneurons of the adult fly). Therefore, we will analyze the currents in a heterologous expression system and present these data to the scientific community as a separate study at a later time point.

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      Max. z-projections would be imprecise because they can artificially suggest close proximity of label that is close in x and y but far away in z. Therefore, the analysis was executed in xy-direction of various planes of entire 3D image stacks. We considered active zones of different orientations (Fig. 4C, D). In fact, we searched the entire z-stacks until we found active zones of all orientations shown in figures 4C1-C6 within the same boutons. The same active zone orientations were analyzed for all exon-out mutants with cac localization in active zones. The distance between cac and brp did not change if viewed from the side.
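      The concern about maximum intensity z-projections can be illustrated with a small sketch. The coordinates are hypothetical (arbitrary units, not measurements from the STED data), and the helper names are illustrative only.

```python
import math


def distance_in_projection(p, q):
    """Distance after a z-projection: only the x and y coordinates survive."""
    return math.dist(p[:2], q[:2])


def distance_in_volume(p, q):
    """Distance using all three coordinates (x, y, z)."""
    return math.dist(p, q)


# Two hypothetical puncta that are close in x and y but far apart in z.
punctum_a = (0.0, 0.0, 0.0)
punctum_b = (30.0, 40.0, 1200.0)

projected = distance_in_projection(punctum_a, punctum_b)
in_volume = distance_in_volume(punctum_a, punctum_b)
```

      Here the projected distance is a small fraction of the distance in the volume, so a projection would wrongly suggest close proximity. Measuring within individual planes of the stack avoids this artifact.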

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      In the top views (planar) we did not find any clear offset in cac orientation to brp between genotypes. This study focuses on cac splice isoform specific localization and function. Possible effects of different cac isoforms on Brp-ring dimensions or other aspects of scaffold structure are not central to our study, in particular given that Brp puncta are clearly present even if cac is absent from the synapse (Fig. 2H), indicating that cac is not instructive for the formation of the Brp scaffold.  

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      We agree that reporting charge in figure 3 will be informative and will do so. We also understand the reviewer’s concern attributing altered PSC kinetics to presynaptic cac channel properties. We will tone down our interpretation in the discussion and list possible alterations in presynaptic AP shape or Cav2 channel kinetics as alternative explanations (not conclusions). Moreover, we will quantify postsynaptic GluRIIA abundance to test whether altered PSC kinetics are caused by altered GluRIIA expression. In our opinion, the latter is more instructive than mini decay kinetic analysis because this depends strongly on the distance of the recording electrode to the actual site of transmission in these large muscle cells.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      We agree that the PP protocol and analyses have to be described more precisely in the methods, and we will do so. PPR values are based on the PPRs of each sweep and then averaged. We are aware of the study of Kim and Alger 2001, but it does not affect our data interpretation because all genotypes were analyzed identically, but only the I-IIB excision resulted in the large data spread shown in figure 5.
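      The two ways of computing a paired-pulse ratio raised in the reviewer's comment can be compared in a short sketch. The amplitudes are invented solely to show how the two calculations can diverge; they are not data from the study.

```python
from statistics import mean


def ppr_mean_of_ratios(first_amplitudes, second_amplitudes):
    """Compute the ratio for each sweep, then average the ratios."""
    return mean(s / f for f, s in zip(first_amplitudes, second_amplitudes))


def ppr_ratio_of_means(first_amplitudes, second_amplitudes):
    """Average the amplitudes across sweeps, then take a single ratio."""
    return mean(second_amplitudes) / mean(first_amplitudes)


# Invented amplitudes: the first response varies a lot, the second does not.
first = [1.0, 2.0, 4.0]
second = [2.0, 2.0, 2.0]

per_sweep = ppr_mean_of_ratios(first, second)
averaged = ppr_ratio_of_means(first, second)
```

      With these numbers the per-sweep calculation exceeds 1 while the ratio of the averaged amplitudes is below 1, because sweeps with a small first response contribute disproportionately large ratios. The discrepancy grows with the variability of the first response, so it need not affect all genotypes equally.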

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      In fact, that the decreased PSC amplitude upon I-IIB excision is caused mainly by reduced channel number is precisely our interpretation (see discussion page 14, last paragraph to page 15, first paragraph). In addition, we are grateful for the reviewer’s suggestion to titrate the external calcium such that the first PSC amplitude in controls matches the one in ΔI-IIB, to test whether altered short term plasticity is solely a function of altered channel number or whether additional causes, such as altered channel properties, also play into this. We will conduct these experiments and include them in the revised manuscript.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      For each animal, the amplitudes of each PSC were plotted over time and fitted with a single exponential. For depression at 1 and 10 Hz, we used one train per animal, and 5-6 animals per genotype (as reflected in the data points in Figs 5H and 5L). Given that the tau values are highly similar between control and excision of I-IIA, but ΔI-IIA tends to have larger single PSC amplitudes, differences in first PSC amplitude do not seem to skew the data (but see also response to comment 10 above). We thank the reviewer for pointing out that tau values in the range of ms are not informative at 1 and 10 Hz stimulations (Figs 5H and 5L). We mis-labeled (or did not label) the axes. The label should read seconds, not milliseconds. We apologize, and this will be corrected accordingly.
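      A single exponential fit of this kind can be sketched as below. This is a simplified illustration, not the fitting routine used in the study: it assumes a decay towards zero without a steady-state plateau, fits by linear regression on log-transformed amplitudes, and uses synthetic amplitudes generated with a known time constant.

```python
import math


def fit_single_exponential(times_s, amplitudes):
    """Fit A(t) = A0 * exp(-t / tau) by least squares on ln(A); returns (A0, tau)."""
    log_amplitudes = [math.log(a) for a in amplitudes]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(log_amplitudes) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, log_amplitudes))
    slope = sxy / sxx                      # equals -1 / tau
    intercept = mean_y - slope * mean_t    # equals ln(A0)
    return math.exp(intercept), -1.0 / slope


# Synthetic train at 10 Hz (one stimulus every 0.1 s), decaying with tau = 0.5 s.
stimulus_times_s = [i * 0.1 for i in range(20)]
synthetic_amplitudes = [100.0 * math.exp(-t / 0.5) for t in stimulus_times_s]

initial_amplitude, tau_s = fit_single_exponential(stimulus_times_s, synthetic_amplitudes)
```

      Because the stimulus times are given in seconds, the fitted time constant is also in seconds. At 1 or 10 Hz the interval between stimuli is 1 s or 0.1 s, so a time constant of a few milliseconds could not be resolved, which is consistent with the corrected axis label.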

      In sum, pending the outcome of additional important control experiments for GluRIIA abundance (see response to comment 8) and titration of control PSC amplitude for the first pulse of paired pulses in ΔI-IIB (see response to comment 10) we will either modify or further support that interpretation.

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

      We will show panels with all three labels merged as suggested by the reviewer. Regarding the size of the puncta: this could reflect different numbers and types of fluorophores on the different antibodies used and thus different point spread, chromatic aberration, different laser and detector intensities etc. We will re-analyze the data to test whether there are systematic differences in size. We do not want to speculate whether the different tags have any effect on localization precision because of the abovementioned reasons as well as artificial differences in localization precision that can be suggested by different antibodies. We prefer to not move the figure because we believe it is informative to show our finding that active zones usually contain both splice variants together with the finding that only one splice variant is required for PHP.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) Please provide more background about Rpgrip1l in the introduction, particularly the past studies of the mammalian homolog of Rpgrip1l, if any. Is there any human disease associated with Rpgrip1l? Do these patients have a scoliosis phenotype?

      • We have added more background on the human ciliopathies caused by RPGRIP1L mutations and on their occasional association with early onset scoliosis (lines 45-54 page 2 in the introduction, see cited references). 

      (2) The allele is a large deficiency of most of the coding region of rpgrip1l, can you give details in the Supplementary data of how you show this by genotyping? It would be good to explain that this mutation is most likely behaving as a null, if you have RNAseq data that supports this please note that. Otherwise, it may be incorrect to assume it is a null allele as your shorthand nomenclature states. If you do not have stronger evidence that the deficiency allele is behaving as a null allele, then please think about using an allele nomenclature as outlined at ZFIN:  

      • We now describe in the results section (Lines 72-76, page 3) the extent of the deletion of rpgrip1l ∆/∆ (22 exons out of 26) that creates an early stop at position 88 of 1256 amino acids. We have submitted to ZFIN our two novel mutant lines: rpgrip1l∆ is recorded as rpgrip1l bps1 and rpgrip1l ex4 as rpgrip1l bps2, and we provide this information in the text. Transcriptomics data confirmed that this allele behaves as a null, as the most down-regulated transcript found in the brain of rpgrip1l ∆/∆ is the rpgrip1l transcript itself (volcano plot in Fig 5A, described in the results, Lines 270-71, page 9).

      • We have also provided in Supplementary Figure 1 A’ a picture of a typical genotyping gel for the rpgrip1l∆ allele. Sequences of both CRISPR guide RNAs and genotyping primers are provided in the Materials and Methods section.

      (3) Throughout the manuscript, the authors refer to zebrafish mutant phenotypes as "juvenile scoliosis". However, scoliosis may not appear until 11 weeks post-fertilization in some animals. After 6-8 weeks of age, it would be more appropriate to describe the phenotype as "late-onset or adult scoliosis" to differentiate between other reported scoliosis mutants (such as hypomorphic or dominant negative alleles of scospondin) that start body curvatures at 3-5 dpf.

      • We think that rpgrip1l-/- scoliosis genuinely qualifies as a “juvenile scoliosis”, as shown by the time course displayed in Fig 1B: rpgrip1l-/- scoliosis develops asynchronously between 4 weeks and 9 weeks (from 0.8 cm/1 cm to 1.6 cm, corresponding to juvenile stages according to Parichy et al, 2009 PMID: 19891001), after which it reaches a plateau. Half of the mutants are already scoliotic by 5 weeks and no scoliosis develops at adult stage, i.e. from 10 weeks on. We have acknowledged the late onset scoliosis on page 3, line 93.

      (4) A more careful demonstration of the individual vertebrae, using magnified high-resolution pictures in Figures 1D-G, should be made to more clearly show no obvious vertebral malformations are present. 

      • We now provide a movie in Sup Data that presents 3D views of controls and mutant spines, which show the intervertebral spaces as well as vertebral shape and size. With these images we could exclude vertebral fusion and the presence of dysmorphic vertebrae.

      (5) On page 5: the authors comment on transgenic expression of RPGRIP1L in foxj1a-lineages as "rescuing" scoliosis. This terminology is confusing, as rescuing a condition could be interpreted as inducing it where it was once absent. "Suppressing" scoliosis may be a more appropriate term. 

      • We agree with the reviewers that the term “rescue” is confusing; we changed it to “suppress” in the title of the paragraph (line 95 page 3) and within the text (line 115 page 3).

      (6) On page 5, lines 155-156: the authors state that "Indeed, no tissue-specific rescue has been performed yet in zebrafish ciliary gene mutants". This is misleading, as ptk7a and katnb1 mutations both disrupt cilia, and transgenic reintroduction of both ptk7a and katnb1 in foxj1a- expressing lineages has previously been shown to suppress cilia defects as well as scoliosis in these models. The statement should be removed for accuracy. 

      • We agree that we were not precise enough in our sentence: when we mentioned “ciliary gene” mutants, we were referring to genes whose products are enriched within cilia and directly affecting ciliogenesis, cilia content and maintenance such as TZ or BBS genes, without encompassing genes like ptk7 and katnb1 whose products perform multiple functions on top of cilia maintenance such as Wnt signalling and remodelling of the whole microtubule network respectively. We have therefore modified our sentence by adding zebrafish ciliary “TZ and BBS” genes (line 104, page 4).

      (7) Figure 2: panels A-B: In the text (line 196) you state that cilia length was increased and that Arl13b content was severely reduced. However, Panel B shows no significant length difference between scoliotic mutants and controls. This statement and graph should be corrected for accuracy. Also, the Arl13b staining is difficult to see in panel A - can channels be split, and/or quantified? 

      • We have now split the Arl13b and glutamylated tubulin channels (Fig 2 A-C”). We think that the reduction of Arl13b staining intensity is now obvious in both straight and scoliotic mutants (compare 2A” with 2B” and 2C”). We were not able to quantify Arl13b staining using ciliary masks from glutamylated tubulin staining since the two stainings only partially overlap along the length of the cilium, Arl13b being more distal than glutamylated tubulin (Fig 2A’).

      • Ciliary length was significantly increased (from 3.4 to 5.3 µm) in straight rpgrip1l-/-, while the values for scoliotic rpgrip1l-/- were heterogeneous (mean 4.1 µm) and therefore not significantly different from controls. This heterogeneity stems from the combined presence of both shorter and longer cilia in scoliotic fish, a finding we interpreted as the potential breakage over time of extra-long and thin cilia observed in scoliotic fish (as in Sup figure 1 H’’’, Sup Fig 2M’ and 2O’).

      • We changed the text to be more accurate: we now state that cilia length increased in straight mutants, and became more heterogeneous than in controls in scoliotic mutants (line 143-144, page 5).

      (8) Figure 3: Page 7, line 206: authors state that SCO-spondin secreting cells varied in number along SCO length. What is the evidence that these cells secrete SCO-spondin? The staining shown in Figure 3L-O appears to demonstrate extracellular accumulation of sspo:GFP. What is the evidence that this staining originated from cells in proximity to it? 

      The claim of SCO-secreting cells in Figure 2E-J is confusing. I assume you are using anatomy to infer the SCO is captured in these sections. This should be done in sspo-GFP animals (as in Figure 3) and/or dual anti-body labeling can be done to show SCO-secreting cells and cilia. 

      • We now show in Supplementary Figure 2 A-D a double staining for Sco-spondin-GFP and cilia (Ac-tub, Glu-Tub). Analyzing GFP staining along SCO length on successive sections, we identified the SCO producing cells on the diencephalic dorsal midline by their position under the posterior commissure (PC, which forms an Acetylated Tubulin positive arch), and counted the nuclei surrounded by cytoplasmic GFP from the most anterior region (24 cells wide, Sup Fig 2A-A’) to the most posterior region (4-8 cells wide, Sup Fig 2 C).

      • Furthermore, the close-ups presented on Fig 2A’ and 2B’ allow detection of the cytoplasmic Sspo-GFP staining around SCO nuclei, above the region presenting primary cilia pointing towards the diencephalic ventricle, both in controls and mutants at scoliosis onset (tail-up mutants), showing that the extracellular staining in B’ very likely originates from these cells. In these tail-up mutants, extracellular Sspo aggregates have not yet filled the whole diencephalic ventricle as in Fig 3 N and Q.

      (9) Figure 5: Is the transcriptome data and proteomic data consistent for any transcripts and encoded protein products? Please highlight those consistent targets in both analyses. 

      • We would like to emphasize that the transcriptomic study was performed at scoliosis onset, at 5 weeks, while the proteomics analysis was performed at adult stage (3 months) so they cannot be directly compared.

      Moreover, low abundance proteins (such as centrosomal proteins and transcription factors like Foxj1a) are not detected by label-free proteomics without a prior subcellular fractionation procedure (Lindemann et al, 2017 PMID: 28282288). The extraction protocol also does not allow purification of short neuropeptides such as Urp1-2.

      Nevertheless, we found four targets in common, now highlighted in red in Fig 5, Panel E: Anxa2, complement proteins C4 and C7a, and Stat3, all related to immune response, a GO term enriched in both studies as explained in the text (Lines 308-311, page 10).

      The absence of many inflammation markers or immune response proteins at adult stage in scoliotic mutants most probably indicates a transient inflammatory episode at scoliosis onset, while astrogliosis, as detected by GFAP staining, increases with scoliosis severity. Along the same lines, the two-fold increase of Lcp1 cells within the tectum is present before axis curvature (in straight mutants) and disappears in scoliotic fish (Graph G in Sup Figure S5) as explained in the text, Lines 378-381, page 12.

      (10) Supplementary Figure 1 F-H: What stage/age samples were used for SEM? It is only stated that they were 'adults'. It is also stated that cilia tufts in straight rpgrip1l-/- fish were morphologically normal but 'less dense'- this was not obvious from the figure. Can density be quantified? (otherwise, data does not support the statement). Similarly, can the statement that "cilia of mono-ciliated ependymal cells showed abnormal irregular structures compared to controls, with either bulged or thinner parts" be supported with measurements/quantification? 

      • The SEM study was performed on 3-month-old fish, 3 controls and 5 mutants. We added this information in the figure legend. We could not quantify the number of ciliary tufts in the brain ventricle of the sole straight mutant that was analyzed. We therefore removed the statement that cilia were less dense in the straight mutant. Along the same lines, we mentioned that we could find mutant cilia of irregular shape, as shown in Supplementary Figure S1 F”, G’’, H’’ and H’’’ (page 4, lines 124-129).

      (11) Supplementary Figure 1D-E is never mentioned in the text. The Supplemental Figure legend also refers to a graph of cilia length that is not in the figure itself. As a result, many of the subsequent panel references are out of register. 

      • We now provide the correct version of the legend and refer to Sup Fig 1D-E in the text (page 3, lines 79-81) and its legend, page 53, lines 1616-1620.

      (12) Supplementary Figure 2A-F: Of interest, in panels C and F, it looks as though sspo:GFP is accumulating on cilia within the ventricles of rpgrip1l mutants. Can this be explored? Is it possible that abnormal aggregation of SSPO on cilia is ultimately leading to cilia loss, as you report for multi-ciliated cells surrounding the subcommissural organ? This could be a very interesting finding and possible mechanism for cilia loss.

      • Our observation of all brain sections led us to conclude that the majority of Sspo-GFP aggregates were floating within the brain ventricles of rpgrip1l-/- fish while a portion of aggregates were stuck on ventricle walls, in close contact with cilia as now shown on Supplementary figure S2 B’, outlined in legend page 54, lines 1634-1637. We agree that the contact between Sspo aggregates and cilia might have damaging consequences, either on cilia maintenance or on immune reaction induction, and we now mention these possibilities in the discussion page 16, lines 524-526. These research lines will be explored in the near future.

      (13) Supplementary Figure 5A-F is not mentioned in the manuscript. Please clarify the role of Anxa2 in neuroinflammation. Is increased Anxa2 expression in rpgrip1l mutant zebrafish reduced after anti-inflammatory drug treatment? What is the expression level of anxa2 in cep290 mutant zebrafish? 

      • We have now added a mention of Supplementary Figure 5A-F in the text page 10 lines 328-331.

      • We unfortunately did not have enough histological material to test Anxa2 staining on NACET treated fish after performing GFAP and Lcp1 staining, nor for dilatation measurement or multiciliated cell quantification. We agree this would have helped to better define which defect might be an indirect consequence of an inflammatory environment.

      • We tested the expression level of Anxa2 in cep290-/- fish. No labelling above control level was detected on cep290-/- brain sections that were positive for GFAP (N = 5). As GFAP staining in 3-4 weeks cep290-/- was not as intense and widespread as in adult rpgrip1l-/- (50% of GFAP + cells compared to 100% in the SCO for example), we concluded that Anxa2 expression may be upregulated after widespread or long-term astrogliosis/inflammation. Alternatively, Anxa2 overexpression could be specific to rpgrip1l-/- fish. 

      (14) A summary diagram at the end would be helpful for understanding the main findings. 

      • We added a Graphical Abstract summarizing the main conclusions and hypotheses of this study. It is mentioned and explained in the Discussion section, p. 16 lines 504-508 and 516-529.

      (15) The sspo-GFP zebrafish line should be listed in the STAR methods section: 

      • The sspo-GFP line is now listed in the STAR methods, Scospondin-GFPut24, (Troutwine et al., 2020 PMID: 32386529), p.43, last line.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth considering and exploring further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new Figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phase relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirp that is barely associated with any decrease in EOD amplitude (0-10% maybe) would cause doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      We thank the Reviewer for the comments and we agree that the approach could have been better detailed. As anticipated by the Reviewer, the Boundary Element Method (BEM) model can be used simply to calculate the electric field and electric image at a specific point in time (instantaneously), regardless of EOD frequency. However, our model allows for the concatenation of consecutive instants and thus is able to render an entire sequence of electric fields - and resulting electric images - incorporating realistic EOD characteristics such as shape, duration, and frequencies (see Pedraja et al., 2014).

      Chirp-triggered EIs were modeled using real chirps produced by interacting fish. Each chirp was thus associated with its duration and peak parameters, as well as the fish positional information (distance and angle).

      However, since we did not know the beat phase at which chirps were produced, we computed electric images for each fish position and chirp scenario by simulating various phases (here referring to the initial offset of the two EODs, set at 4 equally spaced phases). These are intended as phases of the sender EOD and simply refer to the initial offset between the two interacting EODs. In any case, since our simulations were run over a time window of 500 msec, all phases are likely to be covered, with a different temporal order relative to the chirp (always centered within the 500 msec window).
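      The phase sampling described above can be illustrated with a minimal sketch. It assumes two purely sinusoidal EODs; the frequencies, sampling rate and all function names are hypothetical and do not come from the study, and the BEM model itself is not reproduced. The sketch only shows why a 500 msec window spanning several beat cycles covers all beat phases whatever the initial offset.

```python
import math

F_FOCAL = 800.0   # Hz, hypothetical EOD frequency of the chirping (sender) fish
F_OTHER = 810.0   # Hz, hypothetical EOD frequency of the second fish
WINDOW_S = 0.5    # 500 msec simulation window, as stated in the text
N_PHASES = 4      # four equally spaced initial offsets, as stated in the text
FS = 20000        # Hz, hypothetical sampling rate


def initial_offsets(n=N_PHASES):
    """Equally spaced initial phase offsets (radians) between the two EODs."""
    return [2 * math.pi * k / n for k in range(n)]


def summed_eod(offset, fs=FS, window_s=WINDOW_S):
    """Sum of two sinusoidal EODs; `offset` shifts the sender EOD only."""
    n_samples = int(fs * window_s)
    return [
        math.sin(2 * math.pi * F_FOCAL * i / fs + offset)
        + math.sin(2 * math.pi * F_OTHER * i / fs)
        for i in range(n_samples)
    ]


def beat_cycles_in_window(window_s=WINDOW_S):
    """Number of beat cycles spanned by the simulation window."""
    return abs(F_OTHER - F_FOCAL) * window_s
```

      With these assumed values the window spans several full beat cycles, so each initial offset only changes where in the beat cycle the centered chirp falls, not which phases are sampled.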

      The simulation was run maintaining consistent timing for both chirp and non-chirp conditions, across approximately 800 body nodes. At each node, the current flow was calculated from the peak-to-peak of the EOD sum (i.e. the point-to-point difference between the positive and negative envelopes of the beat). Analyzing the EIs over this fixed time window enables us to assess the unitary changes of current flow induced by chirps over units of time (ΔI/Δt). From this, we can calculate a cumulative sum of current flow changes - expressed as delta(EI) - and use it to show the effect of the chirps on the spatiotemporal EI (Figure 7C).

      One can express this cumulative change mapped onto the fish body (keeping the 800 points separated, as in Figure 7C) or further sum the current changes to obtain a single total (as shown in Figure 7D).
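      One possible reading of the two summaries (per-node map versus single total) is sketched below. The definition of delta(EI) used here, the cumulative absolute difference in ΔI/Δt between the chirp and non-chirp runs, is an assumption made for illustration; the function names and toy data are hypothetical and this is not the authors' implementation.

```python
def rate_of_change(current, dt):
    """Discrete dI/dt between consecutive samples of one node's current flow."""
    return [(b - a) / dt for a, b in zip(current, current[1:])]


def cumulative_change(chirp, baseline, dt):
    """Cumulative absolute difference in dI/dt between the chirp and
    non-chirp runs at one node (assumed definition of delta(EI))."""
    pairs = zip(rate_of_change(chirp, dt), rate_of_change(baseline, dt))
    return sum(abs(c - b) for c, b in pairs)


def delta_ei_map(chirp_nodes, baseline_nodes, dt):
    """Per-node cumulative change, nodes kept separate (body map)."""
    return [
        cumulative_change(c, b, dt)
        for c, b in zip(chirp_nodes, baseline_nodes)
    ]


def delta_ei_total(chirp_nodes, baseline_nodes, dt):
    """Single total obtained by summing the per-node values."""
    return sum(delta_ei_map(chirp_nodes, baseline_nodes, dt))


# Toy data: two nodes, three time samples each (hypothetical values).
toy_chirp = [[1.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
toy_baseline = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
toy_map = delta_ei_map(toy_chirp, toy_baseline, dt=1.0)
toy_total = delta_ei_total(toy_chirp, toy_baseline, dt=1.0)
```

      In the toy data only the first node changes during the "chirp", so the map is non-zero at that node alone and the total equals that node's value.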

      One can check this by considering, for example, a sum over a set of 500/800 points (judging from the size of the blue areas in C, not all 800 points have a detectable change), each valued at 0.1 to 0.3 mA/s: one would get circa 100 mA/s, which is what is shown in D. (is this what is happening?)
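      The order-of-magnitude check above can be written out explicitly. The sketch assumes that 500 of the roughly 800 nodes contribute and that each contributes between 0.1 and 0.3 mA/s, as in the example; the function name is hypothetical.

```python
def total_range(n_points, low, high):
    """Bounds and midpoint of a sum of n_points values lying in [low, high]."""
    lower = n_points * low
    upper = n_points * high
    midpoint = (lower + upper) / 2
    return lower, upper, midpoint


# 500 contributing nodes, each between 0.1 and 0.3 mA/s (values from the text).
lower, upper, midpoint = total_range(500, 0.1, 0.3)
```

      Under these assumptions the total lies between about 50 and 150 mA/s, with a midpoint near 100 mA/s, which matches the scale quoted for panel D.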

      We do not know why chirps of different types triggered similar effects. Since EI measurements are pooled over several chirps produced at different angles and distances, it is possible that, when only a small number of chirps is available for a given type (as in the case of rises, which were very few), these measurements fail to highlight more marked differences among types. In a publication we are currently working on, we are considering a larger dataset to better assess these results.

      The methods section has been edited to clarify the approach (not yet).

      Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation.

      Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that could have a great impact on the field.

      We thank the Reviewer for the extensive and constructive comments. We would like to add that, while it is true that many detailed studies have been published on the anatomy and physiology of the circuits implicated in the production and modulation of “electric chirps”, most of this research assumed, and focused exclusively on, their possible role in communication. In addition, most behavioral studies did the same, and a meta-analysis of the existing literature on chirping allows one to trace the communication idea back mainly to two studies: Hagedorn and Heiligenberg, 1985 (“Court and spark: electric signals in the courtship and mating of gymnotoid fish”) and Hopkins, 1974 (“Electric Communication: Functions in the Social Behavior of Eigenmannia Virescens”). Importantly, in these studies only contextual observations were made (no playback experiments or other attempts to analyze more quantitatively the correlation of chirping with other behaviors).

      The authors do provide convincing evidence that chirps may function in homeoactive sensing. However, their evidence arguing against a role for chirps in communication is not as strong, and fails to sufficiently consider the evidence from a large body of existing research. Ultimately, the manuscript presents very interesting data that is sure to stimulate discussion and follow-up studies, but it suffers from dismissing evidence in support of, or consistent with, a communicative function for chirps.

      Although the tone of some statements present in our earlier draft may suggest otherwise, through our revisions we have made an effort to clarify that we do not intend to dismiss a function of chirps in communication. We only intend to debate and discuss valid alternative hypotheses, advanced from reasonable considerations.

      Before writing this manuscript, we attempted to survey literally all the existing literature on chirps (including studies focused on behavior, peripheral sensory physiology as well as brain physiology). Although it is not unlikely that some studies have eluded our attention, an effort at a comprehensive review was made. Based on this survey we realized that none of the studies provided a clear and unambiguous piece of evidence to support the communication hypothesis (we refer here to the weak points highlighted in the discussion and mentioned in the previous comment), which in fact does not come without its weak points and contradictions (see later comments).

      Below is a summary of the mentions of the communication theory in the different sections of the manuscript, including several edits we have applied in response to the Reviewer’s concern:

      In the abstract we clearly state that we are considering an alternative that is only hypothetically, not certainly, complementary. Nonetheless, we have identified a couple of instances that could sound dismissive of the “communication hypothesis” in the following section.

      In the introduction we write in fact about the possibility of interference between communication signals and conspecific electrolocation cues, as they are both detected as beat perturbations. We did not mean to use “interference” here as “reciprocal canceling”; rather, we intended it as “partial or more or less conspicuous overlap” in the responses triggered in electroreceptors.

      Hoping to convey a clearer message, we have edited the related statement and changed it to “both types of information are likely to overlap and interact in highly variable ways”.

      We have also removed the statement: “According to this idea, beats and chirps are not only detected through the same input channel, but also used for the same purpose.” as at this point in the manuscript it may be too strong.

      In the results section we do not include statements that might be seen as dismissive of the communication hypothesis but only statements in support of the “probing with chirps” idea (which is the central hypothesis of the study).

      In the discussion paragraphs we elaborate on why the current functional view is either flawed or incomplete (first paragraph, “existing functional hypotheses”). Namely: 1) multiple triggering factors implied in chirp responses covary and need to be disentangled (for example DF/sex); 2) findings on brown ghosts and a few other gymnotiforms have been used to advance the hypothesis of “communication through chirps” in all weakly electric fish (including pulse species); 3) social encounters - in which chirps are recorded - also imply other behaviors (such as probing) which have not been considered so far (this point is related to the first one on covariates); 4) most studies referring to big chirps as courtship chirps were not done in reproductive animals (added now); and 5) no causal evidence has been provided so far to justify a role of chirps in social communication.

      We are discussing these points as challenges to the communication hypothesis, not to dismiss the hypothesis, but rather to motivate future studies addressing these challenges.

      We do not want to appear dismissive of the communication hypothesis and had therefore previously edited the manuscript to avoid the impression of exclusivity of the probing hypothesis. We have now gone over the manuscript once more and edited several sentences. Nevertheless, we want to point out again that - despite the large consensus - the communication hypothesis has, until now, never been investigated with the kind of rigor applied here.

      The authors do acknowledge that chirps could function as both a communication and homeoactive sensing signal, but it seems clear they wish to argue against the former and for the latter, and the evidence is not yet there to support this.

      In both rounds of revision we have made an effort to convey a more inclusive interpretation of our findings. We tried our best to express our ideas as hypothetical, not as proof that communication through chirps does not exist. The aim of this study is to propose an alternative view, and this cannot be done without underlining the weak points of an existing hypothesis while providing and supporting reasonable arguments in favor of the alternative we advance. The actual evidence for a role of chirping in communication is much less strong than appears from the pure number of articles that have discussed chirps in this context.

      Regarding the weak evidence against communication, here we can list a few additional important points related to the proposed interpretations of chirp function (more specific than those made earlier):

      (1) A formally sound assessment of signal value/meaning - as typically done in animal communication studies - should involve:

      a) the isolation of a naturally occurring signal and determination of the context in which it is produced 

      b) the artificial replication of the signal

      c) the observation that such mimic is capable of triggering reliable and stereotyped responses in a group of individuals (identified by sex and/or species) under the same conditions (conditioned, unconditioned, state-dependent, etc.). As discussed for instance in Bradbury and Vehrencamp, 2011; Laidre and Johnstone, 2013; Wyatt, 2015; Rutz et al., 2023.

      This approach has so far not been applied to weakly electric fish. The initial purpose of the present study was in fact to conduct this type of validation.

      (2) The hypothesis of chirps used for DF-sign discrimination - for “social purposes” - although plausible in the face of theoretical considerations, does not seem to be reasonable in practice when one considers emission rates of 150 chirps per minute. We do find a strong correlation of chirp type with DF, which is often very abrupt and sudden (as if the fish were tracking beat frequency to guess its value), but the consideration made above on chirp rates seems to discourage this interpretation.

      (3) The hypothesis of chirp-patterning (i.e. chirping may have meaning based on the sequence of chirps of different types, a bit like syllables in birdsongs) - assessed by only one study conducted in our group - has not been sufficiently substantiated by replication. We have surveyed all possible combinations of chirps produced by interacting pairs in different behavioral conditions using different values for chirp sequence size: 2, 3, ..., 8 chirps (both considering the sender alone as well as sender+receiver together). In all cases we found no evidence for a context-dependent “modulation” of chirp types (i.e. no specific chirp type sequence in specific contexts).

      (4) The hypothesized role of “large chirps” as courtship signals could be easily criticized by noting the symmetrical distribution of these events around a DF of 0 Hz. One could invoke a failure to discriminate DF sign to explain this well-known pattern. However, we know from Walter Heiligenberg’s work and physiological considerations that such a task can be solved easily through t-units and, in principle, even just by motion (which would change the EOD phase in frequency-dependent ways, thus potentially revealing the DF sign).
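      The sequence survey mentioned in point (3) can be pictured as counting every run of consecutive chirp types of a given length. The sketch below is a minimal illustration under that assumption; the function names, the integer type labels and the toy sequence are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def chirp_ngrams(chirp_types, n):
    """Count every run of n consecutive chirp types in one recording."""
    return Counter(
        tuple(chirp_types[i:i + n])
        for i in range(len(chirp_types) - n + 1)
    )


def ngram_table(chirp_types, sizes=range(2, 9)):
    """Counts for each sequence size from 2 to 8 chirps."""
    return {n: chirp_ngrams(chirp_types, n) for n in sizes}


# Toy sequence of chirp type labels (hypothetical).
toy_sequence = [2, 2, 1, 2, 2, 3]
toy_table = ngram_table(toy_sequence)
```

      Comparing such tables between behavioral conditions is one way to ask whether particular type sequences are specific to particular contexts.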

      Overall, these considerations made us think that chirping certainly occurs in a social context, but it is the meaning of this behavior that remains elusive. We noticed that environmental factors are also strongly implied. We then formulated an alternative hypothesis to explain chirping, but we do so without dismissing the communication idea.

      All this seems to us just a careful way to critically discuss our results and those of other studies, without considering the issue resolved.

      In the introduction, the authors state, "Since both chirps and positional parameters (such as size, orientation or motion) can only be detected as perturbations of the beat, and via the same electroreceptors, the inputs relaying both types of information are inevitably interfering." I disagree with this statement, which seems to be a key assumption. Both of these features certainly modulate the activity of electroreceptors, but that does not mean those modulations are ambiguous as to their source. You do not know whether the two types of modulations can be unambiguously decoded from electroreceptor afferent population activity.

      We thank the Reviewer for noting this imprecision. We have addressed the Reviewer’s concern in another reply (see above).

      My biggest issue with this manuscript is that it is much too strong in dismissing evidence that chirping correlates with context. In your behavioral observations, you found sex differences in chirping as well as differences between freely interacting and physically separated fish. Chirps tended to occur in close proximity to another fish. Your model of chirp variability found that environmental experience, social experience, and beat frequency (DF) are the most important factors explaining chirp variability. Are these not all considered behavioral or social context? Beat frequency (DF) in particular is heavily downplayed as being a part of "context" but it is a crucial part of the context, as it provides information about the identity of the fish you're interacting with. The authors show quite convincingly that the types of chirps produced do not vary with these contexts, but chirp rates do.

      We believe the “perceived claim” may be an issue of unclear writing. We have now tried to better clarify that “context” affects chirp rates, but it does not affect chirp types as much (except when beat frequency is high).  

      We have edited two statements possibly susceptible to misinterpretation: 

      (1) In the results: “It also indicates that chirp parameters such as duration and FM do not seem to be associated with any particular context in a meaningful way, other than being affected by beat frequency.”

      (2) In the discussion: the statement

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context (Figure S2) although the variance of chirp parameters appears to be significantly affected by this factor (Figure 2). This may suggest that the effect of behavioral context is mainly detectable in the number of chirps produced (Figure S1), rather than the type (Figure S2).”

      has been changed to:

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context, except for those cases characterized by higher beat frequencies  (Figure S2). This suggests that the effect of behavioral context highlighted in our factor analysis (Figure 2) is mainly due to the number of chirps produced (Figure S1), rather than their type (Figure S2).”

      Finally, in the results we emphasize the relatively higher impact of previously unexplored factors on chirp variance: “The plot of individual chirps (Figure 2C) shows the presence of clustering around different categorical variables and it reveals that experience levels or swimming conditions are important factors affecting chirp distribution (note for instance the large central “breeding” cluster in which fish are divided and the smaller ones in which fish are free). Sender or receiver identity does not individuate any clear clustering relative to either sex (see the overlap of male_s/male_r and female_s/female_r) or social status (dominant/subordinate). Chirps labeled based on tank experience (i.e. resident vs intruder) are instead clearly separated.”

      Further, in your playback experiments, fish responded differently to small vs. large DFs, males chirped more than females, type 2 chirps became more frequent throughout a playback, and rises tended to occur at the end of a playback. These are all examples of context-dependent behavior.

      We do note that male brown ghosts chirp more than females. But we do also say - and show in figure 8 - that males move more in proximity to and around conspecifics. We do acknowledge that chirp time-course may be different during playbacks in a type-dependent manner. But how this can support the communication hypothesis - or other alternatives - is unclear. This result could equally imply the use of different chirp types for different probing needs. Since we cannot be sure about either, we do not want to put too much emphasis on it. Finally, the fact that “context” (here meant broadly to define different experimental situations in which social but also physical and environmental parameters are altered) affects chirping is undeniable: cluttered and non-cluttered environments do represent different contexts which differently affect chirping in conspicuous ways.

      In the results, the authors state, "Overall, the majority of chirps were produced by male subjects, in comparable amounts regardless of environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) or social experience (novel or experienced; Figure S1D)." This is not what is shown in Figure S1. S1A shows clear differences between resident vs. intruder males, S1B shows clear differences between dominant vs. subordinate males, and S1D shows clear differences between naïve and experienced males. The analysis shown in Figure 2 would seem to support this. Indeed, the authors state, "Overall, this analysis indicated that environmental and social experience, together with beat frequency (DF) are the most important factors explaining chirp variability."

      The Reviewer is right in pointing out this imprecise reference and we are grateful to the Reviewer for spotting this incongruence. The text probably refers to an earlier version of the figure in which data were grouped and analyzed differently. We have now edited the text and changed it to: “Overall, the majority of chirps were produced by male subjects, at rates that seemed affected by environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) and social experience (novel or experienced; Figure S1D).”

      The choice of chirp type varied widely between individuals but was relatively consistent within individuals across trials of the same experiment. The authors interpret this to mean that chirping does not vary with internal state, but is it not likely that the internal states of individuals are stable under stable conditions, and that individuals may differ in these internal states across the same conditions? Stable differences in communication signals between individuals are frequently interpreted as reflecting differences between those individuals in certain characteristics, which are being communicated by these signals.

      It seems here we have been unclear in the writing: while it is true that behavioral states are stable and can imply stable chirp patterning (if the two are related), since chirp types vary abruptly and in a reliable DF-dependent manner, different types of chirps are unlikely to be matched to different internal states following the same temporal order in such a reliable way (similarly repeated through consecutive trials).

      This would imply the occurrence of different internal states in rapid sequence, reliably triggered by repeated EOD ramps, regardless of whether the playback is 20 sec long or 180 sec long.

      We have edited this paragraph to better explain this: “The reliability by which the chirping response adapts to both the rate and direction of beat frequency is variable across individuals but rather stable across trials (relative to a given subject), further suggesting that chirp type variations may not reflect changes in internal states or in the animal motivation to specific behavioral displays (which are presumably subject to less abrupt variations and stereotypical patterning based on DF).”

      I am not convinced of the conclusion drawn by the analysis of chirp transitions. The transition matrices show plenty of 1-2 and 2-1 transitions occurring.

      The only groups in which 1-2 and 2-1 transitions are as frequent as 1-1 and 2-2 (1 and 2 being the numerical IDs of the two interacting fish) are F-F pairs. This is because chirp rates in females are so low that within-fish correlations end up being as low as between-fish correlations. We believe the impression of the Reviewer could be due to the fact that these are normalized maps (see legend of Figure 5A-B).

      Further, the cross-correlation analysis only shows that chirp timing between individuals is not phase-locked at these small timescales. It is entirely possible that chirp rates are correlated between interacting individuals, even if their precise timing is not.

      We agree with the Reviewer, this is a possibility. To address this point, we did edit the results section to acknowledge that what we see may be related to the time window chosen (i.e. 4 sec):

      “More importantly, they show that - at least in the social conditions analyzed here and within small-sized time windows - chirp time series produced by different fish during paired interactions are consistently independent of each other.”
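      The distinction raised by the Reviewer, between precise chirp timing and slower co-variation in chirp rate, can be illustrated with a minimal sketch. It assumes chirp timestamps in seconds for each fish and bins them at a width much coarser than the 4 sec window; the function names, the 30 sec bin width and the example timestamps are hypothetical and are not taken from the manuscript.

```python
import math


def binned_counts(chirp_times, duration, bin_width):
    """Count chirps in consecutive bins of width `bin_width` (seconds)."""
    n_bins = math.ceil(duration / bin_width)
    counts = [0] * n_bins
    for t in chirp_times:
        if 0 <= t < duration:
            counts[int(t // bin_width)] += 1
    return counts


def pearson(x, y):
    """Pearson correlation between two equally long count series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Illustrative timestamps only (not real data): no chirp of fish A coincides
# with a chirp of fish B, yet both fish chirp more in the same coarse bins.
fish_a = [1.0, 2.5, 3.0, 31.0, 62.0, 63.5, 64.0, 65.2]
fish_b = [4.1, 5.7, 8.0, 40.0, 70.3, 71.1, 72.9]

counts_a = binned_counts(fish_a, duration=90.0, bin_width=30.0)
counts_b = binned_counts(fish_b, duration=90.0, bin_width=30.0)
rate_correlation = pearson(counts_a, counts_b)
```

      A high value of rate_correlation alongside a flat short-lag cross-correlogram would correspond to the scenario the Reviewer describes; the sketch does not imply that this is what the recordings show.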

      Further, it is not clear to me how "transitions" were defined. The methods do not make this clear, and it is not clear to me how you can have zero chirp transitions between two individuals when those two individuals are both generating chirps throughout an interaction.

      We thank the Reviewer for bringing up this unclear point. We have now clarified how transitions were calculated in the method section: “The number of chirp transitions present in each recording (dataset used for Figures 1, 2, 5) was measured by searching in a string array containing the 4 chirp types per fish pair, all their possible pairwise permutations (i.e. all possible permutations of 4+4=8 elements are: 1-1, 1-2, 1-3 … 7-6, 7-7, 7-8; considering the following legend 1 = fish1 type 1, 2 = fish 1 type 2, 3 = fish1 type 3 … 6 = fish2 type 2, 7 = fish2 type 3 and 8 = fish2 rise).”.

      Zero transitions are possible if two fish (or groups of fish) do not produce chirps of all types. Only transitions of produced types can be counted.
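      The counting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes that a transition is a pair of consecutive chirps in the time-ordered sequence of a recording, and that labels 4 and 5 (elided in the quoted legend) stand for the fish 1 rise and fish 2 type 1. The function name and the example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Labels assumed from the quoted legend:
# 1-4 = fish 1 (type 1, type 2, type 3, rise), 5-8 = fish 2 (same order).
LABELS = range(1, 9)


def count_transitions(sequence):
    """Count each ordered pair of consecutive chirp labels in a sequence."""
    observed = Counter(zip(sequence, sequence[1:]))
    # List all 8 x 8 ordered pairs so that unobserved transitions show up as 0.
    return {pair: observed.get(pair, 0) for pair in product(LABELS, repeat=2)}


# Illustrative sequence (not real data): fish 1 produces types 1 and 2,
# fish 2 produces only type 2 (label 6).
example = [1, 1, 2, 6, 6, 1, 2]
transitions = count_transitions(example)
```

      In this example every pair involving a label that never occurs (for instance 1-5) has a count of zero, which is the situation described in the response: only transitions between chirp types that were actually produced can be counted.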

      In the results, "Although all chirp types were used during aggressive interactions, these seemed to be rather less frequent in the immediate surround of the chirps (Figure 6A)." A lack of precise temporal correlation on short timescales does not mean there is no association between the two behaviors. An increased rate of chirping during aggression is still a correlation between the two behaviors, even if chirps and specific aggressive behaviors are not tightly time-locked.

      The Reviewer is right in pointing out the limited temporal scaling of our observations/analysis. We have now edited the last paragraph of the results related to figure 6 to include the possibility mentioned by the Reviewer: “The significantly higher extent of chirping during swimming and locomotion, consistently confirmed by 4 different approaches (PSTH, TM, CN, MDS), suggests that - although chirp-behavior correlations may exist at time-scales larger than those here considered - chirping may be linked more strongly with scanning and environmental exploration than with a particular motivational state, thus confirming findings from our playback experiments.”

      The Reviewer raises an important point here; yet, due to space limitations, we have considered only a short time scale. Most playback experiments in weakly electric fish have involved the use of EOD mimics for a few tens of seconds - to avoid habituation in the fish behavioral responses - while inter-chirp intervals usually range from a few hundred milliseconds to seconds (depending on how often a fish chirps). This suggested to us that a 4 second time window may not be a bad choice to start with.

      In summary, it is simply too strong to say that chirping does not correlate with context, or to claim that there is convincing evidence arguing against a communication function of chirps. Importantly, however, this does not detract from your exciting and well-supported hypothesis that chirping functions in homeoactive sensing. A given EOD behavior could serve both communication and homeoactive sensing. I actually suspect this is quite common in electric fish (both gymnotiforms and mormyrids), and perhaps in other actively sensing species such as echolocating animals. The two are not mutually exclusive.

      We agree with the Reviewer that context - broadly speaking - does affect chirping (as we mentioned above). We hope we have improved the writing and clarified that we do not dismiss communication functions of chirping, but we do lean towards electrolocation based on the considerations made above and our results.

      We do conclude the manuscript remarking that communication and electrolocation are not mutually exclusive: “probing cues could function simultaneously as proximity signals to signal presence, deter approaches, or coordinate behaviors like spawning, if properly timed (Henninger et al., 2018).” (see the conclusion paragraph of the discussion).

      Therein, we further add “These findings aim to stir the pot and initiate a discussion on possible alternative functions of chirps beyond their presumed communication role.”.

      With this, we hope we’ve made it clear how we intend our manuscript to be read.

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

      We thank the reviewer for the kind assessment.

      Weaknesses:

      My main criticism is that the alternative putative role for chirps as probe signals that optimize beat detection could be better developed. The paper could be clearer as to what that means precisely, especially since beating - and therefore detection of some aspects of beating due to the proximity of a conspecific - most often precedes chirping. One meaning the authors suggest, tentatively, is that the chirps could enhance electrosensory responses to the beat, for example by causing beat phase shifts that remediate blind spots in the electric field of view.

      We agree with the Reviewer that a better and more detailed explanation of how beat processing for conspecific electrolocation may be positively affected by chirps would be important to provide. We are currently working on a follow-up manuscript in which we intend to include these aspects. For space limitations and readability we had to discard from the current manuscript a lot of results that could further clarify these issues.

      A second criticism is that the study links the beat detection to underwater object localization. The paper does not significantly develop that line of thought given their data - the authors tread carefully here given the speculative aspect of this link. It is certainly possible that the image on the fish's body of an object in the environment will be slightly modified by introducing a chirp on the waveform, as this may enhance certain heterogeneities of the object in relation to its environment. The thrust of this argument derives mainly from the notion of Fourier analysis with pulse type fish EOD waveforms (see above, and radar theory more generally), where higher temporal frequencies in the beat waveform induced by the chirp will enable a better spatial resolution of objects. It remains to be seen whether experiments can show this to be significant.

      Perhaps the Reviewer refers to the last discussion paragraph before the conclusions in which we mention the performance of pulse or wave-type EODs in electrolocation (referring here to ideas illustrated in a recent review by Crampton, 2019). We added to this paragraph a statement which could better clarify that we do not propose that chirping could enhance object electrolocation. What we mean is that, in a context in which object electrolocation occurs through wave-type EODs - given the generally lower performance of such narrow-band signals in resolving the spatial features of any object, even a 3D electric field - chirping could improve beat detection during social encounters by increasing the amount of information obtained by the fish.

      The edited paragraph now reads: “While broadband pulse signals may be useful to capture highly complex environments rich in foliage, roots and other structures common in vegetation featuring the more superficial habitats in which pulse-type fish live, wave-type EODs may be a better choice in the relatively simpler river-bed environments in which many wave-type fish live (e.g., the benthic zone of deep river channels; Crampton, 2019). In this case, achieving a good spatial resolution is critical during social encounters, especially considering the limited utility of visual cues in these low-light conditions. In such habitats, social encounters may “electrically” be less “abrupt”, but spatially less “conspicuous” or blurred (as a 3D electric field may be). In such a scenario, chirps could serve as a means to supplement the spatial information acquired via the beat, accentuating these cues during periods of reduced resolution.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      None, my points in the original review have been properly addressed in this resubmission.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I agree with the authors that a focus on ecological vs laboratory variables is a good one, although it might have been useful to reflect that in the title.

      I am happy to see that the authors included additional analyses using different definitions of FP and DLPFC in the supplementary material. As I said in my earlier review, the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital.

      We thank the reviewer for these positive remarks and for these very useful suggestions on the previous version of this article.

      I am sorry the authors are so dismissive of the idea of looking at models where brain size and area size are directly compared in the model, rather preferring to run separate models on brain size and area size. This seems to me a sensible suggestion.

      We agree with reviewer 1, and the response of reviewer 3 also made it clear to us why this was an important issue. We have therefore addressed it more thoroughly this time.

      First, we have added a new analysis, with whole brain volume included as covariate in the model accounting for regional volumes, together with the socio-ecological variables of interest. As expected given the very strong correlation across all brain measures (>90%), the effects of all socio-ecological factors disappear for both FP and DLPFC volumes when ‘whole brain’ is included as covariate. This is coherent with our previous analysis showing that the same combination of socio-ecological variables could account for the volume of FP, DLPFC and the whole brain. Nevertheless, the interpretation of these results remains difficult, because of the hidden assumptions underlying the analysis (see below).

      Second, we have clarified the theoretical reasons that made us choose absolute vs relative measures of brain volumes. In short, we understand the notion of specificity associated with relative measures, but 1) the interpretation of relative measures is confusing and 2) we have alternative ways to evaluate the specificity of the effects (which are complementary to the idea of adding whole brain volume as covariate). 

      Our goal here was to evaluate the influence of socio-ecological factors on specific brain regions, based on their known cognitive functions in laboratory conditions (working memory for the DLPFC and metacognition for the frontal pole). Thus, the null hypothesis is that socio-ecological challenges supposed to mobilize working memory and metacognition do not affect the size of the brain regions associated with these functions (respectively DLPFC and FP). This is what our analysis is testing, and from that perspective, it seems to us that direct measures are better, because within regions (across species), volumes provide a good index of neural counts (since densities are conserved), which are indicative of the amount of computational resources available for the region. This is not the case when using relative measures, or when using the whole brain as covariate, since densities are heterogeneous across brain regions (e.g. Herculano-Houzel, 2011; 2017, but see below for further details on this).

      Quantitatively, the theoretical level of specificity of the relation between brain regions and socio-ecological factors is difficult to evaluate, given that our predictions are based on the cognitive functions associated with DLPFC and FP, namely working memory and metacognition, and that each of these cognitive functions also involves other brain regions. We would actually predict that other brain regions associated with the same cognitive functions as DLPFC or FP also show a positive influence of the same socioecological variables. Given that the functional mapping of cognitive functions in the brain remains debated, it is extremely difficult to evaluate quantitatively how specific the influence of the socio-ecological factors should be on DLPFC and FP compared to the rest of the brain, in the frame of our hypothesis.

      Critically, given that FP and DLPFC show a differential sensitivity to population density, a proxy for social complexity, and that this difference is in line with laboratory studies showing a stronger implication of the FP in social cognition, we believe that there is indeed some specificity in the relation between specific regions of the PFC and socioecological variables. Thus, our results as a whole seem to indicate that the relation between prefrontal cortex regions and socio-ecological variables shows a small but significant level of specificity. We hope that the addition of the new analysis and the corresponding modifications of the introduction and discussion section will clarify this point.

      Similarly, the debate about whether area volume and number of neurons can be equated across the regions is an important one, of which they are a bit dismissive.

      We are sorry that the reviewer found us a bit dismissive on this issue, and there may have been a misunderstanding.

      Based on the literature, it is clearly established that for a given brain region, area volume provides a good proxy for the number of neurons, and it is legitimate to generalize this relation across species if neuronal densities are conserved for the region of interest (see for example Herculano-Houzel 2011, 2017 for review). It seems to be the case across primates because cytoarchitectonic maps are conserved for FP and DLPFC, at least in humans and laboratory primates (Petrides et al, 2012; Sallet et al, 2013; Gabi et al, 2016; Amiez et al, 2019). But we make no claim about the difference in number of neurons between FP and DLPFC, and we never compared regional volumes across regions (we only compared the influence of socio-ecological factors on each regional volume), so their difference in cellular density is not relevant here. As long as the neuronal density is conserved across species but within a region (DLPFC or FP), the difference in volume for that region, across species, does provide a reliable proxy for the influence of the socioecological regressor of interest (across species) on the number of neurons in that region.

      Our claims are based on the strength of the relation between 1) cross-species variability in a set of socio-ecological variables and 2) cross-species variability in neural counts in each region of interest (FP or DLPFC). Since the effects of interest relate to inter-specific differences, within a region, our only assumption is that the neural densities are conserved across distinct species for a given brain region. Again (see previous paragraph), there is reasonable evidence for that in the literature. Given that assumption, regional volumes (across species, for a given brain region) provide a good proxy for the number of neurons. Thus, the influence of a given socio-ecological variable on the interspecific differences in the volume of a single brain region provides a reliable estimate of the influence of that socio-ecological variable on the number of neurons in that region (across species), and potentially of the importance of the cognitive function associated with that region in laboratory conditions. None of our conclusions are based on direct comparison of volumes across regions, and we only compared the influence of socioecological factors (beta weights, after normalization of the variables).
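      The comparison of beta weights after normalization can be written out as a short sketch. The notation is an assumption introduced here for illustration: n is the number of species, each variable is z-scored across species, and \Sigma is the n x n covariance matrix implied by the phylogeny. The expression is the standard generalized least squares estimator on which PGLS is built; the authors' exact implementation is not specified in this response.

```latex
\begin{align}
  z_{i} &= \frac{x_{i} - \bar{x}}{s_{x}}, \qquad i = 1, \dots, n
    && \text{(z-scoring of each variable across species)} \\
  \hat{\beta} &= \left( Z^{\top} \Sigma^{-1} Z \right)^{-1} Z^{\top} \Sigma^{-1} y_{z}
    && \text{(GLS estimate of the standardized beta weights)}
\end{align}
```

      Here Z collects the z-scored socio-ecological predictors (plus an intercept column) and y_z is the z-scored regional volume. Because all variables share the same scale, the entries of \hat{\beta} can be compared across predictors and across regions without comparing the regional volumes themselves.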

      Note that this is yet another reason for not using relative measures and not including whole brain as covariate in the regression model: Given that whole brain and any specific region have a clear difference in density, and that this difference is probably not conserved across species, relative measures (or covariate analysis) cannot be used as proxies for neuronal counts (e.g. Herculano-Houzel, 2011). In other words, using the whole brain to rescale individual brain regions relies upon the assumption that the ratios of volumes (specific region/whole brain) are equivalent to the ratios of neural counts, which is not valid given the differences in densities.
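      The argument about relative measures can be made explicit with a short derivation. The symbols are introduced here for illustration only: N, V and \rho denote neuron number, volume and neuronal density, r indexes a region (FP or DLPFC), wb the whole brain and s the species.

```latex
\begin{align}
  N_{r,s} &= \rho_{r,s}\, V_{r,s} \\
  \rho_{r,s} = \rho_{r}
    \;\Rightarrow\; N_{r,s} &= \rho_{r}\, V_{r,s} \;\propto\; V_{r,s} \\
  \frac{V_{r,s}}{V_{\mathrm{wb},s}}
    &= \frac{\rho_{\mathrm{wb},s}}{\rho_{r}} \cdot \frac{N_{r,s}}{N_{\mathrm{wb},s}}
\end{align}
```

      The second line is the assumption stated above (density conserved across species within a region), under which regional volume is proportional to neuron number. The third line shows that a volume ratio tracks the corresponding ratio of neuron numbers only if the mean whole-brain density \rho_{\mathrm{wb},s} is also constant across species, which is the assumption questioned in the response.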

      Nevertheless, I think this is an important study. I am happy that we are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

      We really thank the reviewer for these positive remarks, and we hope that this study will indeed stimulate others using a similar approach.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum national d'Histoire naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience. But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We are sorry that the reviewer still believes that these two points are major weaknesses.

      - We have added a point on lissencephalic species in the discussion. In short, we acknowledge that our work may not be applied to lissencephalic species because they cannot be studied with our method, but on the other hand, based on laboratory data there is no evidence showing that the functional organization of the DLPFC and FP in lissencephalic primates is radically different from that of other primates (Dias et al, 1996; Roberts et al, 2007; Dureux et al, 2023; Wong et al, 2023). Therefore, there is no a priori reason to believe that not including lissencephalic primates prevents us from drawing conclusions that are valid for primates in general. Moreover, as explained in the discussion, including lissencephalic primates would require using invasive functional studies, only possible in laboratory conditions, which would not be compatible with the number of species (>15) necessary for phylogenetic studies (in particular PGLS approaches). Finally, as pointed out by the reviewer, our study is also relevant for understanding human brain evolution, and as such, including lissencephalic species should not be critical to this understanding.

      - In response to the remarks of reviewer 1 on the first version of the manuscript, we had included a new analysis in the previous version of the manuscript, to evaluate the validity of our functional maps given another set of boundaries between FP and DLPFC. But one should keep in mind that our objective here is not to provide a definitive definition of what the regions usually referred to as DLPFC and FP should be from an anatomical point of view. Rather, as our study aims at taking into account the phylogenetic relations across primate species, we chose landmarks that enable a comparison of the volume of cortex involved in metacognition (FP) and working memory (DLPFC) across species. We have also updated the discussion accordingly.

      We agree that this is a difficult point and we have always acknowledged that this was a clear limitation in our study. In the light of the functional imaging literature in humans and non-human primates, as well as the neurophysiological data in macaques, defining the functional boundary between FP and DLPFC remains a challenging issue even in very well controlled laboratory conditions. As mentioned by reviewer 1, “the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital”. Again, an additional analysis using different boundaries for FP and DLPFC was included in the supplementary material to address that issue. We are not aware of solid evidence showing that the boundaries that we chose for DLPFC vs FP were wrong, and we believe that the comparison between 2 sets of measures as well as the discussion on this topic should be sufficient for the reader to assess both the strength and the limits of our conclusion. That being said, if the reviewer has any reference in mind showing better ways to delineate the functional boundary between FP and DLPFC in primates, we would be happy to include it in our manuscript.

      - The question of development, which is an important question per se,  is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, major studies in the field do not mention development (e.g. Byrne, 2000; Kaas, 2012; Barton, 2012). De Casien et al (2022) even showed that developmental constraints are largely irrelevant (see Claim 4 of their article): [« The functional constraints hypothesis […] predicts more complex, ‘mosaic’ patterns of change at the network level, since brain structure should evolve adaptively and in response to changing environments. It also suggests that ‘concerted’ patterns of brain evolution do not represent conclusive evidence for developmental constraints, since allometric relationships between developmentally linked or unlinked brain areas may result from selection to maintain functional connectivity. This is supported by recent computational modeling work [81], which also suggests that the value of mosaic or concerted patterns may fluctuate through time in a variable environment and that developmental coupling may not be a strong evolutionary constraint. Hence, the concept of concerted evolution can be decoupled from that of developmental constraints »].

      Finally, when studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021). Therefore, development does not seem to be a critical issue, neither for our article nor for the field.

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis).

      We thank the reviewer for his/her remarks, and for the clarification of his/her criticism regarding the use of relative measures. We are sorry to have missed the importance of this point in the first place. We also thank the reviewer for the cited references, which were very interesting and which we have included in the discussion. As Reviewer 1 also shared these concerns, we wrote a detailed response above to explain how we addressed the issue.

      First, we did run a supplementary analysis in which whole brain volume was added as a covariate, together with socio-ecological variables, to account for the volume of FP or DLPFC. As expected given the very high correlation across all 3 brain measures, none of the socio-ecological variables remained significant. We have added a long paragraph in the discussion to tackle that issue. In short, we agree with the reviewer that the specificity of the effects (on a given brain region vs the rest of the brain) is a critical issue, and we acknowledge that since this is a standard in the field, it was necessary to address the issue and run this extra analysis. But we also believe that specificity could be assessed by other means: given the differential influence of ‘population density’ on FP and DLPFC, in line with laboratory data, we believe that some of the effects that we describe do show specificity. Also, we prefer absolute measures to relative measures because they provide a better estimate of the corresponding cognitive operation, and because standard allometric rules (i.e., body size or whole brain scaling) may not apply to the scaling and evolution of FP and DLPFC in primates. Indeed, given that we use these measures as proxies of functions (metacognition for FP and working memory for DLPFC), it is clear that other parts of the brain should show the same effect since these functions are supported by entire networks that include not only our regions of interest but also other cortical areas in the parietal lobe. Thus, the extent to which the relation with socio-ecological variables should be stronger in regions of interest vs the whole brain depends upon the extent to which other regions are involved in the same cognitive function as our regions of interest, and this is clearly beyond the scope of this study.
More importantly, volumetric measures are taken as proxies for the number of neurons, but this is only valid when comparing data from the same brain region across species, not across brain regions, since neural densities are not conserved. Thus, using relative measures (scaling with the whole brain volume) would only work if densities were conserved across brain regions, which is not the case. From that perspective, the interpretation of absolute measures seems more straightforward, and we hope that the specificity of the effects can be evaluated using the comparison between the 3 measures (FP, DLPFC and whole brain) as well as the analysis suggested by the reviewer. We hope that the additional analysis and the updated discussion will be sufficient to cover that question, and that the reader will have all the information necessary to evaluate the level of specificity and the extent to which our findings can be interpreted.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In my previous review of the present manuscript, I pointed out the fact that defining parts, modules, or regions of the primate cerebral cortex based on macroscopic landmarks across primate species is problematic because it prevents comparisons between gyrencephalic and lissencephalic primate species. The authors have rephrased several paragraphs in their manuscript to acknowledge that their findings do apply to gyrencephalic primates.

      I also said that "Contemporary developmental biology has showed that the selection of morphological brain features happens within severe developmental constrains. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC during development. Otherwise, the claims form the mosaic brain and modularity lack fundamental support". I insisted that the authors should clarify their concept of homology of cerebral cortex parts, modules, or regions across species (in the present manuscript, the frontal pole and the dorsolateral prefrontal cortex). Those are not trivial questions because any phylogenetic explanation of brain region expansion in contemporary phylogenetic and evolutionary biology must be rooted in evolutionary developmental biology. In this regard, the authors could have discussed their findings in the frame of contemporary studies of cerebral cortex evolution and development, but, instead, they have rejected my criticism just saying that they are "not relevant here" or "clearly beyond the scope of this paper".

      The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, the major studies in the field do not mention development and some even showed that developmental constraints were not relevant (see De Casien et al., 2022 and details in our response to the public review). When studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al., 2022; Smaers et al., 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al., 2012; Mars et al., 2018, 2021).

      If the other reviewers agree, the authors are free to publish in eLife their correlations in a vacuum of evolutionary developmental biology interpretation. I just disagree. Explanations of neural circuit evolution in primates and other mammalian species should tend to standards like the review in this link: https://royalsocietypublishing.org/doi/full/10.1098/rstb.2020.0522

      In this article, Paul Cisek (a brilliant neurophysiologist) speculates on potential evolutionary mechanisms for some primate brain functions, but there is surprisingly little reference to the existing literature on primate evolution and cognition. There is virtually no mention of studies that involve a large enough number of species to address evolutionary processes and/or a comparison with fossils and/or an evaluation of specific socio-ecological evolutionary constraints. Most of the cited literature refers to laboratory studies on brain anatomy of a handful of species, and their relevance for evolution remains to be evaluated. These ideas are very interesting and they could definitely provide an original perspective on evolution, but they are mostly based on speculations from laboratory studies, rather than from extensive comparative studies. This paper is interesting for understanding developmental mechanisms and their constraints on neurophysiological processes in laboratory conditions, but we do not think that it would fit in the framework of our paper as it goes far beyond our main topic.

      Reviewer #3 (Recommendations For The Authors):

      Yes, I am suggesting that the authors also include analyses with brain size (rather than body size) as a covariate to evaluate the effects of other variables in the model over and above the effect on brain size. In a very simplified theoretical scenario: two species have the same body sizes, but species A has a larger brain and therefore a larger FP. In this case, species A has a larger FP because of brain allometric patterns, and models including body size as a covariate would link FP size and socioecological variables characteristic of species A (and others like it). However, perhaps the FP of species A is actually smaller than expected for its brain size, while the FP of species B is larger than expected for its brain size.
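The scenario above can be made concrete with a toy calculation. The sketch below is illustrative only: all volumes are invented, the helper names `fit_line` and `residual` are hypothetical, and a plain least-squares fit on log-transformed values stands in for whatever phylogenetically informed model an actual analysis would use.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residual(x, y, slope, intercept):
    """Observed value minus the value expected from the fitted line."""
    return y - (slope * x + intercept)

# Hypothetical reference species: (whole brain volume, FP volume), arbitrary units.
reference = [(10.0, 0.10), (20.0, 0.20), (40.0, 0.40), (80.0, 0.80), (160.0, 1.60)]
log_brain = [math.log(brain) for brain, _ in reference]
log_fp = [math.log(fp) for _, fp in reference]
slope, intercept = fit_line(log_brain, log_fp)

# Hypothetical species A and B, assumed to have the same body size.
species_a = (100.0, 0.80)  # larger brain, larger FP in absolute terms
species_b = (50.0, 0.60)   # smaller brain, smaller FP in absolute terms

res_a = residual(math.log(species_a[0]), math.log(species_a[1]), slope, intercept)
res_b = residual(math.log(species_b[0]), math.log(species_b[1]), slope, intercept)

print(f"A: absolute FP = {species_a[1]:.2f}, residual vs brain size = {res_a:+.3f}")
print(f"B: absolute FP = {species_b[1]:.2f}, residual vs brain size = {res_b:+.3f}")
```

With these invented numbers, species A has the larger FP in absolute terms but a negative residual relative to brain size, while species B shows the opposite pattern, which is the distinction the reviewer is drawing.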

      As explained in our response to the public review, we did run this analysis and we agree with the reviewer’s point from a practical point of view: it is important to know the extent to which the relation with a set of socio-ecological variables is specific to the region of interest, vs less specific and present for other brain regions. Again, we are sorry not to have understood that earlier, and we acknowledge that since it is a standard in the field, it needs to be addressed thoroughly.

      We understand the scaling intuition, and the need to get a reference point for volumetric measures, but here the volume of each brain region is taken as a proxy for the number of neurons and therefore for the region’s computational capacities. Since, for a given brain region (FP or DLPFC), the neural densities seem to be well conserved across species, comparing regional volumes across species provides a good proxy for the contrast (across species) in neural counts for that region. All we predicted was that for a given brain region, associated with a given cognitive operation, the volume (number of neurons) would be greater in species for which socio-ecological constraints potentially involving that specific cognitive operation were greater. We do not understand how or why the rest of the brain would change this interpretation (of course, as discussed just above, beyond the question of specificity). And using whole brain volume as a scaling measure is problematic because the whole brain density is very different from the density of these regions of the prefrontal cortex (see above for further details). Again, we acknowledge that allometric patterns exist, and we understand how they can be interpreted, but we do not understand how they could prove or disprove our hypothesis (brain regions involved in specific cognitive operations are influenced by a specific set of socio-ecological variables). When using volumes as a proxy for computational capacities, the theoretical implications of scaling procedures might be problematic. For example, scaling implies that the computational capacities of a given brain region are scaled by the rest of the brain. All other things being equal, the computational capacities of a given brain region, taken as the number of neurons, should decrease when the size of the rest of the brain increases. But to our knowledge there is no evidence for that in the literature.
Clearly these are very challenging issues, and our position was to take absolute measures because they do not rely upon hidden assumptions regarding allometric relations and their consequence on cognition.
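The arithmetic behind this argument can be sketched as follows. The numbers are invented for illustration, and the function names `neuron_estimate` and `relative_volume` are hypothetical; the only assumption carried over from the text is that regional volume multiplied by neuronal density approximates the number of neurons in that region.

```python
def neuron_estimate(volume_mm3, density_per_mm3):
    """Approximate neuron count as volume times neuronal density."""
    return volume_mm3 * density_per_mm3

def relative_volume(region_mm3, whole_brain_mm3):
    """Region volume expressed as a fraction of whole brain volume."""
    return region_mm3 / whole_brain_mm3

# Hypothetical values: the same region in two species whose brains differ in size.
region_mm3 = 100.0        # identical regional volume in both species
density = 50_000.0        # identical regional density (neurons per mm^3)
small_brain_mm3 = 1_000.0
large_brain_mm3 = 2_000.0

count = neuron_estimate(region_mm3, density)
ratio_small = relative_volume(region_mm3, small_brain_mm3)
ratio_large = relative_volume(region_mm3, large_brain_mm3)

print(f"estimated neurons in region (both species): {count:.0f}")
print(f"relative volume, smaller brain: {ratio_small:.3f}")
print(f"relative volume, larger brain:  {ratio_large:.3f}")
```

In this toy case the estimated neuron count is identical in both species, yet the relative measure is halved in the larger brain, which is the behavior of scaled measures that the response questions.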

      But since we definitely understand that scaling is a reference in the field, we have not only completed the corresponding analysis (including the whole brain as a covariate, together with socio-ecological variables) but also expanded the discussion to address this issue in detail. We hope that between this new analysis and the comparison of effects between non-scaled measures of FP, DLPFC and the whole brain, the reader will be able to judge the specificity of the effect.

      Models including brain (instead of body) size would instead link FP size and socioecological variables characteristic of species B (and others like it). This approach is supported by a large body of literature linking comparative variation in the relative size of specific brain regions (i.e., relative to brain size) to behavioral variation across species - e.g., relative size of visual/olfactory brain areas and diurnality/nocturnality in primates (Barton et al. 1995), relative size of the hippocampus and food caching in birds (Krebs et al. 1989).

      Barton, R., Purvis, A., & Harvey, P. H. (1995). Evolutionary radiation of visual and olfactory brain systems in primates, bats and insectivores. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 348(1326), 381-392.

      Krebs, J. R., Sherry, D. F., Healy, S. D., Perry, V. H., & Vaccarino, A. L. (1989). Hippocampal specialization of food-storing birds. Proceedings of the National Academy of Sciences, 86(4), 1388-1392. 

      We are grateful to the reviewer for mentioning these very interesting articles, and more generally for helping us to understand this issue and clarify the related discussion. Again, we understand the scaling principle but the fact that these methods provide interesting results does not make other approaches (such as ours) wrong or irrelevant. Since we have used both our original approach and the standard version as requested by the reviewer, the reader should be able to get a clear picture of the measures and of their theoretical implications. We sincerely hope that the present version of the paper will be satisfactory, not only because it is clearer, but also because it might stimulate further discussion on this complex question.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity:

      In this work, Anandi et al. propose an ex vivo model that can be used to recapitulate the in vivo structure of the tumor microenvironment, which allows the observation of morphological and functional changes in tumor cells in a 3D context. Due to the ability of cancer cells to induce hypoxic condition within the TME, authors propose this model to tackle the study of metastasis initiation in vitro. The proposed system successfully displays an ischemic gradient with cells accessing nutrients at different rates, similarly to what happens in solid tumors in vivo. Moreover, in line with the literature, tumor cell migration and invasiveness were promoted by hypoxic conditions. Authors also show that the system could be used to study cell-cell interaction, as co-cultures of macrophages and cancer cells were successfully cultured in the system and studied in the context of tumor hypoxia.

      The proposed study is interesting and timely, as cancer cell invasion remains an important area of tumor biology that needs further exploration. The methodology is well explained and presented in a linear flow. However, the work could benefit from some improvements and changes, as well as from additional experiments. On an important note, the authors do not properly refer to the current literature: several 3D culture systems/chambers have already been developed and used to investigate the tumor microenvironment, but they are neither cited nor referred to in the manuscript. The authors should refer to such literature and explain how this system is different and adds to it.

      Major comments:

      1. The authors propose this method to study the TME in 3D. When culturing cells with different ECM (collagen vs. matrigel+collagen I), the authors should take into consideration the effect of these materials on different cell types. Collagen and matrigel are known to differentially influence the polarization and phenotype of stromal cells (particularly fibroblasts, major components of solid cancers; e.g., PMID 21029367); therefore, these points should be addressed at least in the discussion.

      We completely agree with the reviewer so we added this point (and reference) to our manuscript's introduction (lines 45-46) and discussion (lines 442-445).

      1. In addition to the previous comment, matrigel and collagen are also known to alter cancer cell phenotype (e.g., PMID 21029367) and this point should be taken into account.

      We completely agree with the reviewer so we added this point (and reference) to our discussion in the main text (lines 442-445).

      1. Novel 3D systems to study different aspects of the TME in vitro/ex vivo are certainly needed; however, they are not nonexistent. The authors should address this in the text, as the current literature has already started to propose 3D models (including models involving matrigel/collagen in combination with other materials). 3D chambers (of different materials, and with different aims) are being designed and used, and can be found in the literature. These works are not cited in the current study at all. For instance, Anguiano 2017; Cavo et al. 2018; Anguiano et al., 2020; Sodek et al. 2008, etc.

      We agree so we have now added those references to the main text (line 56-57).

      1. Even though the focus is on hypoxia and the achievement of an ischemic gradient in the chamber to allow resemblance of an in vivo tumor, the authors write in line 123 (and also in other parts of the text) that: "these results show that consumer cells in the 3MIC form ischemic gradients that can influence the local metabolic microenvironment experienced by neighboring tumor spheroids". The use of the PDMS membrane partly supports the claim; however, it would be interesting to check whether this is indeed true, by measuring for example the levels of certain metabolites (e.g., glucose, glutamine, glutamate, lactate, aspartate) reached with the system, or pH levels, etc., in the presence or absence of the hypoxic gradient/consumer cells.

      This is an insightful question, and defining the exact composition of this complex ischemic microenvironment is a major ambition of our lab, so we completely agree with the reviewer's comment. However, as the 3MIC was designed specifically for microscopy, measuring specific metabolites is unfortunately outside its capabilities.

      Having said that, and following the spirit of the reviewer's comments, we used microscopy to measure additional signs of metabolic stress. Specifically, we used fluorescent probes to detect changes in intracellular pH (pHrodo, Molecular Probes), in redox status (CellROX, Molecular Probes), and in glucose (2-NBDG, a fluorescent D-glucose analog). As we explain below, we found exciting results from our pH measurements, which led us to additional functional experiments. We are very excited about these new results, and we thank the reviewer for encouraging these experiments. These new results also provide evidence that other parameters in ischemia, and not just hypoxia, change along the 3MIC and can have an impact on tumor cells.

      1. When looking at the references presented in the manuscript, authors quote too many review articles, rather than scientific articles. Given the extremely wide literature on cancer metastasis, more of these works should be quoted in this context. For example: in the introduction - text lines 27-38 - only 4 references are research articles, out of 14 references presented in that paragraph.

      The reviewer is correct in pointing this out. Our intention was to use reviews on topics that are well established where citing primary research could be unfair to other contributions. But again, we agree with the reviewer, so we replaced reviews with primary research articles in multiple locations along the manuscript.

      1. As authors showed successfully how macrophages and cancer cells can interact in the chamber, recapitulating cell interactions in an in vivo context, it would be very interesting to see whether different consumer cells would induce similar or different changes to the spheroids and the ischemic gradient (for instance using stromal cells or non-tumor cell lines as consumers, instead of cancer cells only), as we know how tumors are a multitude of cell subsets, each contributing to nutrient production, oxygen consumption, etc.

      This is a great point. We thought about that very same point and conducted several experiments to test the combinatorial effects of different consumer cells. In broad terms, we did not observe major differences when using different consumer cells. However, we agree that this system may provide compelling opportunities to test the effect of different cell types on each other. Still, for consistency and ease, we conducted most of our experiments using the same cells in both consumers and in spheroids.

      In the resubmitted version, we added an experiment where we looked at the sprouting of SVEC endothelial cells using the same cells or Lung KPs as consumers (Fig. S6A).

      Minor comments:

      1. Studying early metastatic development/seeding remains a timely quest; however, the authors should refer to several new studies in which various mouse models are used to study metastasis from different points of view (e.g., PMID 25822788; PMID 36991128; PMID 25171411; PMID 25633981; PMID 34632412; PMID 35921474; etc). In line 41, three reviews are quoted (refs 27-29), whilst there are several works that could be quoted on metabolism in solid tumors also in the context of metastasis (e.g., PMID 36522548; PMID 26719539; PMID 34303764). This comment applies to the rest of the text.

      We thank the reviewer for their help in processing this vast literature. We were aware of most of those works but some were new to us so thanks again! We have now added these references.

      1. The order of the references is not properly presented. In the introduction, the first reference is n. 4 (text line 22), instead of it being reference 1. Moreover, the subsequent literature ref. is number 12 and not number 2. Please revise the order of the references, and position them within the bibliography from first cited to last cited in the text.

      We apologize for this confusion. We have now revised all the references and we hope they are correctly formatted and numbered. The origin of this confusion may have been that we had references in the abstract, so their numbering started there rather than from the introduction. To avoid further confusion, we removed all references from the abstract.

      1. Lines 98-104. It would be helpful to the reader to define here what these consumer cells are. Even though it is explained in the methods that the consumer cells are cancer cells, it is important to make it clear in the text, as it could be misleading at times.

      We agree with the reviewer, although we did not mean to be misleading. As mentioned above, we chose to use the same cells for both consumers and spheroids, and we have now added a new figure to illustrate this point (Fig S6A). Following the advice, we are also including additional text to make the message clearer (lines 107-109).

      1. The English grammar and spelling should be revised in some parts, as well as typos and missing words throughout the text (e.g., Line 38, the word "interraction" is misspelled and should be corrected with "interaction". Line 49, the first sentence seems incomplete. Lines 68-69 should be revised as the sentences do not flow well together, probably due to a missing word. In line 77 it should be "presents". Line 341 should be "cannot be explained").

      We apologize for these typos and mistakes. We have tried our best to avoid these types of errors in the new manuscript version.

      Referees cross-commenting

      I find the comments from the other reviewers to be in line with one another as well as with my general assessment. The major comments of all reviewers should be addressed. The minor comments should be taken into account as well, as they would render the text and the figures more precise. I suggest that 3-6 months to complete the revision process is an appropriate time frame for the authors.

      Finally, I strongly encourage the authors to add in the discussion the points and questions raised by all reviewers, as well as to improve the bibliography in terms of organisation, linearity, and state of the art.

      Significance:

      General assessment:

      The work by Anandi et al. offers an additional tool to tackle the issue of studying the tumor microenvironment, in a 3D culture system.

      The authors show a model that can be used to study tumor hypoxia in 3D, offering the possibility to study the TME in a more in vivo-like manner without turning to mouse models. The development of new tools to study the TME while avoiding the excessive use of animals is definitely a timely quest. In addition, the system has the potential to be applied to tackle different biological questions, as the methodology is well explained and could be suitable to many other fields of cancer biology (e.g., drug resistance or uptake). The work is overall presented in a clear way, the methodology is explained thoroughly, and it has the potential to be a useful tool for the study of cancer hypoxia.

      However, authors should address how their method could differently impact other cells when applied to other systems. As one major claim is the potential use of this methodology to study the TME, it should be taken into consideration how stromal cells are strongly affected by the ECM, and how certain settings or features of the system may impact such cell populations. In addition, the work does not properly refer to the current state of the art. As other studies started to propose 3D systems for the study of TME and cell-cell interactions - besides organoids - the authors should cite these works and frame their own study in a more appropriate context, pointing out differences with the current 3D chambers available, the advantages of one vs the other, and so on.

      Advance: the study adds to the current literature as the study of tumor hypoxia in 3D remains a complicated issue. The interesting co-culture settings with macrophages suggest potential uses of this model to study cell-cell interactions.

      Audience: the study is very methodological and offers a tool that could be used by cancer biologists - and maybe by other biology fields.

      Reviewer #2

      Evidence, reproducibility and clarity:

      Summary

      Anandi and colleagues present a manuscript describing a nice assay for exploring the progressive effect of metabolic depletion of the nutrients and oxygen on the invasion of cancer cells. This builds upon and extends a device that they previously described - MEMIC - and now enables 3D analysis of small numbers of cells. The key to their method is the inclusion of a layer of consumer cells that deplete oxygen and nutrients. Using this tool, they demonstrate that depleted environments promote invasive behavior and lower cell-cell adhesion. This is related to the nutrient-deprived and hypoxic environments found in the center of many tumors. Cellular Potts Modelling is used to explore ideas around the cooperation between reduced cell-cell adhesion and increase ECM adhesion in promoting invasion. Overall, this is a well-constructed manuscript that will be of interest to cell biologists and cancer biologists.

      Major comments

      I realize this work is submitted to review commons and this complicates the recommendation regarding publication. My view is that the 'more prestigious' journals would require greater mechanistic insight, but that the work could find a suitable place in other members of the review commons stable. My comments are divided into those essential for any journal and those that might be journal dependent.

      We hope that the mechanistic experiments added to our new manuscript version will appeal to the reviewer and merit publication in any of the Review Commons journals.

      Essential regardless of journal

      1. Many of the figures lack information about the number of spheroids analyzed and from how many biological repeats they are derived.

      We have now added this information to all our experiments. This information can be found in the figures and in the figure legends.

      1. The authors need to provide citations for their assertion that only gases can cross the PDMS, but not other small metabolites. They should also comment on whether the build-up of CO2 might be relevant.

      We have now added the original reference where they describe PDMS's properties (Cox and Dunn, 1986).

      The point raised about CO2 is very interesting, but we do not expect a buildup of this gas. When using PDMS, CO2 would not accumulate, as PDMS membranes are permeable to gases, including CO2. When using glass covers, the lack of oxygen should minimize CO2 production, as hypoxic cells will not be able to conduct oxidative phosphorylation and will produce lactic acid instead.

      1. The data on the directionality of migration when consumers are present are not significant and do not warrant the speculation in lines 186-189.

      Following the reviewer's advice we have removed this speculation.

      1. The ECM degradation in Figure 3 should be quantified.

      We agree. We added additional quantifications for the gelatin degradation assay. We also highlight the quantification we already had of the ECM degradation assessed via DQ collagen. Those data can be found in the new figures 4 and S4, respectively.

      1. Do the authors have evidence that the hypoxia-exposed cells are more adhesive to ECM. This is central to their Potts model and I could not locate the supporting experimental data. If not, then the Potts model should include matrix proteolysis, which they do have data about.

      Again, this is a very insightful observation, and we completely understand this confusion. We think that this may be part of the inherent challenge of trying to condense biological problems into analogies or "metaphors" when using physical/mathematical models.

      The algorithm in a Cellular Potts model (CPM) tries to minimize the energy of the system (the entire group of cells/ECM that we are modelling). This global energy reduction is achieved by minimizing local energies in the cell-cell and cell-ECM interactions. The algorithm executes this minimization by always (probability p = 1) accepting a configuration that decreases the energy, while restricting the configurations that lead to higher energies, which are accepted with a probability of p = e^(-ΔH/T), where ΔH is the difference between the current and previous energy and T is the temperature parameter of the model.

      So, the only thing the model is really doing is to increase the likelihood that cells are in a more "comfortable" environment - i.e. that the energy from the interactions with their neighboring cells and ECM is as low as possible. For example, if cell 1 and cell 2 adhere strongly but not to cell 3, in a CPM this is modelled as a low ΔH between cell 1 and 2 and a higher ΔH with cell 3. Conversely, when people model cells better at "invading" into a new "territory" they choose a lower energy between that cell type and that type of substratum.

      In other words, our CPM does not "care" whether ischemic cells invade the ECM because they create space through increased proteolysis or because they are more adherent to the ECM. These two scenarios are the same in a CPM and it is consistent with previous CPM models of similar scenarios (e.g.: PMID: 18835895, 33933478, 26436883, 23596570).

      We have now reworded the description of the model on the main text, and we added an illustration hoping to make this aspect of the model clearer (Fig. S4F).
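The acceptance rule described in this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' simulation code: the function name `acceptance_probability`, the temperature value, and the two energy changes are hypothetical, chosen only to show that a lower energy cost for moving into the ECM makes that move more likely, whatever biological mechanism the lower cost stands for.

```python
import math
import random

def acceptance_probability(delta_h, temperature):
    """Metropolis-style rule: energy decreases are always accepted,
    increases are accepted with probability exp(-delta_h / temperature)."""
    if delta_h <= 0:
        return 1.0
    return math.exp(-delta_h / temperature)

def accept(delta_h, temperature, rng=random):
    """Draw a random number to decide whether a proposed change is kept."""
    return rng.random() < acceptance_probability(delta_h, temperature)

# Hypothetical energy costs for a cell extending into the ECM (arbitrary units).
temperature = 1.0
delta_h_well_fed = 2.0   # assumed higher cost for well-nurtured cells
delta_h_ischemic = 0.5   # assumed lower cost for ischemic cells

p_well_fed = acceptance_probability(delta_h_well_fed, temperature)
p_ischemic = acceptance_probability(delta_h_ischemic, temperature)

print(f"well-fed cell extends into ECM with p = {p_well_fed:.3f}")
print(f"ischemic cell extends into ECM with p = {p_ischemic:.3f}")
```

The rule only sees the energy difference, so a lower cost attributed to stronger ECM adhesion and one attributed to increased proteolysis are indistinguishable at this level, which matches the point made in the reply.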

      1. Is the down-regulation of E-cadherin transcriptional - i.e. is the mRNA level reduced?

      This is a great question. After the reviewer posed this question, we looked at our data and we concluded that E-cad's downregulation is transcriptional. Assessing local mRNA levels in the 3MIC is challenging. However, our E-Cad reporter (pHAGE-E-cadherin-RFP, addgene #79603) is a red fluorescent protein driven by the CDH1 (E-Cad) promoter. RFP levels decrease with ischemia, indicating that this regulation occurs at the promoter/transcriptional level. We now added this point to the revised manuscript (lines 259-261). We thank the reviewer for this insight!

      1. The title of figure 6 is misleading. The authors do not demonstrate chemoresistance in terms of cell survival or cell proliferation, which is how the term is normally used. The authors should measure cell number, proliferation, and cell viability. The data presented in the Supplementary Figure are inadequate, with no quantification. The FUCCI reporter cells would be a good tool for this. Also, why use 150nM paclitaxel when the IC50 is 817nM? This seems bizarre. Lastly, there is a typo in the figure that suggests 150mM drug was used.

      We apologize if these experiments caused confusion. Our intention was to look at the anti-migratory effects of Taxol-related drugs. As such, we first determined the concentrations at which the drug was lethal to our cells (this is the LD50 of ~800nM). Then, we tested whether lower concentrations - which we knew were not lethal - would affect cell migration, protrusions, etc. Hence the 30-150nM range we used in our experiments.

      We have now completely rewritten this section and hope that our approach is now clearer. We have also changed the title of the section and the figure legend to clarify that we are studying the effects of Taxol as an anti-motility drug rather than its effects on cell survival and proliferation (now Fig. 7). Finally, we have fixed the 150mM/150nM typo in the figure legend.

      Journal dependent

      1. The authors have not excluded that either changes in nutrients, or even a pro-invasive factor, produced by consumer cells are necessary for the increased invasion. They have only shown that they are not sufficient. The authors should perform a series of experiments comparing hypoxic conditions with normal media and normoxic conditions with nutrient depleted/condition media by prior culturing of KP cancer cells.

      This is a great point. We actually do not want or propose to exclude this possibility. So, we have now added text to clarify this issue (lines 431-435).

      In fact, we would be thrilled if there were a pro-invasive factor. If that were the case, our results indicate that this factor is only effective under ischemia, because the same consumer cells do not have an effect on the same type of tumor spheroids in well-nurtured environments. In addition, our new pH measurements and perturbation experiments agree with this reviewer's intuition about additional factors being key in the increased invasion (see new Figure 2). We are very excited about these new results, and we hope this reviewer will be excited too.

      1. What is the oxygen sensor for increased invasion? PHD1-3 would be a good place to start looking. Is the PHD2-HIF axis important? Do VHL mutant cells still show responses to the consumer cells?

      Following the reviewer's feedback, we generated isogenic HIF1A KO cell lines to study whether HIF1A was directly needed for the invasion of tumor spheroids within the 3MIC. We complemented these loss-of-function experiments with HIF1A gain-of-function experiments using pharmacological interventions that stabilize HIF1A under normal oxygen levels (CoCl2 and DMOG).

      As shown in the new figure 2, these experiments mirrored our hypoxia experiments: HIF1A activity was not sufficient, but it was required to drive the invasion of ischemic spheroids. We think that these new results are particularly interesting when taken together with our new pH-perturbation experiments. Briefly, our new experimental results show that, in addition to the requirement for hypoxia/HIF1A, media acidity also has a strong effect on spheroid invasion. More excitingly, a drop in pH is sufficient to dramatically increase invasion - even in control well-nurtured spheroids. We think that the effects of pH and hypoxia are linked. HIF1A activation and hypoxia then increase glycolysis and thus lactic acid secretion. We speculate that this glycolytic switch is where hypoxia is important, but it is not sufficient because under well-perfused conditions (e.g. healthy tissue or large culture media volume) lactic acid levels may not build up enough to significantly lower the extracellular pH. In contrast, under poorly perfused conditions (3MIC and solid tumors) or if we flood cell cultures with lactic acid, the media's pH drops dramatically (Fig. 2).

      1. If they include both spheroids of endothelial cells and cancer cells, will the resulting protrusions in hypoxia grow towards each other? Would macrophages enhance this process?

      We agree with the reviewer that this is an interesting question, and we have anecdotally observed this effect. In the manuscript, we used chimeric endothelial/tumor spheroids rather than separate ones (Fig. 6E). We did not find strong evidence that their protrusions grew towards each other, but this is something that we would like to explore in more detail in the future.

      Significance:

      The main advance is technical, as many previous studies have related hypoxia to increased cancer cell invasion, which the authors correctly acknowledge and cite. It is a scholarly study, which will be of interest to many readers, and the method reported is likely to be adopted by several groups.

      Reviewer #3

      Evidence, reproducibility and clarity:

      In this work, Anandi et al., developed a cell culture system to live image the initial transformation of cells in deprivation of oxygen and nutrients in a 3D context. Using this system, 3MIC, they were able to create oxygen and nutrient gradients to simulate ischemic conditions that arise deep within tumors and that typically precede metastasis. With the 3MIC system they validated that ischemia triggers cell migration and invasion of tumor cells. In addition, 3MIC also allowed them to study the interaction of tumor spheroids with stromal cells such as macrophages and endothelial cells. Interestingly, the authors showed that co-culturing tumor spheroids with stromal cells increased the pro-metastatic features induced by ischemia conditions. Lastly, using 3MIC allowed the authors to discern that a poor paclitaxel response in ischemic-like cells is driven by intrinsic cellular resistance rather than due to lower drug concentration.

      Overall, the work is very well-written, and the results are consistent, convincing and support the conclusions. The methods are clear and complete and allow the reproducibility of the experiments. The experiments are adequately replicated and statistical analyses are well described. However, I have a few suggestions to improve the impact of the manuscript:

      1. The authors conclude that 3MIC results in the accumulation of lactic acid and nutrient deprivation in an increasing manner when moving far from the opening site. Is there a way to actually show this? So far, the authors employ a hypoxia sensor only. A sensor for internal pH or other method for nutrient deprivation would help to support the conclusion and further validate the model.

      This is an excellent point. Following the reviewer's feedback, we tested additional sensors including for extra- and intra-cellular pH. As mentioned above, we observed dramatic changes in extracellular pH levels. We followed up these observations with a series of experiments that showed a key functional role for media acidification in driving invasion (Figure 2).

      1. According to figure S3E, the main cell line used by the authors is already quite mesenchymal. It would be good to know if the results showed here are consistent in cells with a more basal epithelial phenotype. Do epithelial cells need stronger ischemic conditions to undergo phenotypic changes?

      This is a great catch. To explore this further, we ran a Western Blot analysis to compare the epithelial and mesenchymal markers expressed by the main cells we used here (Lung KPs) with their levels in a stereotypical epithelial (MCF-7) and a mesenchymal (MDA-MB-231) cell line (new Fig. S4D). As the reviewer correctly points out, we do see that E-Cad and Vimentin are co-expressed in KP cells.

      So far, our observations in a range of cell lines show a consistent decrease in E-Cad levels with no significant effects on vimentin levels - regardless of the basal levels of this protein.

      Interestingly, a recent study[1] demonstrated in triple-negative breast cancer models, that an EMT hybrid phenotype - including the presence of Vimentin - is required for metastasis. A compelling hypothesis then is that ischemia in the tumor microenvironment may favor these hybrid phenotypes. We briefly discuss this topic in the revised version of this manuscript.

      1. The number of replicates should be included in each figure legend and not only in the methods section. From data presented it is not clearly stated what do points mean in boxplots (e.g, Fig1H, 2B,G...). How many cells/spheroids did the authors count in each experiment?

      We have now added this information to all our experiments. This information can be found in the figures and on the figure legends.

      1. Figure 3B is not mentioned in the main text.

      We apologize for this error, and we thank the reviewer for catching this issue, which we have now corrected.

      1. Line 295: "In the absence of macrophages, clusters of endothelial cells remained mostly rounded, even in the presence of consumer cells and regardless of their location along the ischemic gradient (Fig. 5A; Video S6)." However, in Video S6, both images show endothelial cells co-cultured with macrophages. I consider that Video S6 should be not referenced here.

      The reviewer is correct, so we have removed that reference.

      1. References style should be homogeneous (e.g, in Ref 13 appears "Nature Reviews Cancer" whereas in Ref 14 "Nat Rev Cancer"). Also, in Ref 25, the journal is missing.

      We apologize for this oversight, and we have now tried to be more consistent in our references.

      1. In plots where distance to open chamber site is not specified (e.g. 6B), at what distance were the data recorded? Please, indicate in the figure legend.

      We have now added this information to our figures.

      1. In the experiment shown in Fig 4, the sorting strategy would include stromal cells such as fibroblasts and endothelial cells in the GFP- population (as only CD45+ cells are removed). These cells will likely also grow in the 3MIC system and have an effect in migration. Can the authors rule out this confounding effect?

      The reviewer is correct. We still think that the possibility of fibroblast contamination is low. First, the fluorescence of HRE-GFP cells under normoxia is still higher than the autofluorescence of cells not expressing this construct (such as fibroblasts). This is quite normal, as most sensors/reporters have some leakage and thus there is a small amount of transcription. Second, intradermal and subcutaneous tumors are quite poor in fibroblasts. In fact, to study the role of fibroblasts in these tumors, they are usually co-injected with tumor cells (PMID: 20138012). Third, in the process of tumor dissociation and in vitro establishment, non-transformed cells tend to die more. Since these are more technical points, we moved the cell sorting details to the materials and methods section.

      1. In Fig 5C the panel of proximal + macrophages is missing

      We apologize for this mistake, and we have corrected it in the new version of the manuscript.

      1. In Fig. 5, Linifanib is used to study the effect of blocking VEGF. Linifanib can also interact with RTKs and PDGF. This fact should be acknowledged.

      We agree with this point. Following the reviewer's advice, we now acknowledge the potential off-target effects of these inhibitors (lines 354-355).

      Significance

      This is a very interesting work with the development of a simple and cost-effective system that allows to continuously monitor biological processes in 3D cultures under nutrient-modified conditions. In general, these data would be broadly interesting to cancer community in general, as 3MIC is a very versatile system, where several aspects can be studied and precisely discerned.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We would first like to thank the reviewers for their careful reading and thoughtful feedback.

      We have substantially revised the manuscript and included additional experimental evidence on O-GlcNAc and OGT/OGA protein levels in the placenta of embryos bearing the OGT-Y851A hypomorphic mutation.

      Overall, we believe our improved manuscript provides compelling evidence that the glycosyltransferase activity of OGT, and thus the O-GlcNAc modification itself, plays a sexually dimorphic function in placental development and the developmental repression of retrotransposons in the developing embryo.

      We have addressed each of the reviewers' comments below. The original comments (C) are in italic, our responses (R) in Roman font.

      Reviewer #1

      Evidence, reproducibility and clarity

      C1: Formichetti et al. developed mice with OGT catalytic dead mutations and then studied their function during early embryogenesis. Not surprisingly, dramatic reduction in OGT activity failed to produce embryos; however, mild reduction in OGT did produce animals. The authors then use the T931 animals that have a mild reduction in activity to further characterize the function in the early embryo. Not surprisingly, male mice showed changes in gene expression, implantation sub-lethality, and an uptick in loss of retrotransposon silencing. The authors also show that an even milder reduction in OGT activity (Y851A) affects male placenta function and chromatin remodeling. Finally, the authors make a less stable OGT transgene within the mouse and again found embryogenesis issues in the males and alterations in numerous gene families including mTOR signaling and p53 function. All in all, this is an interesting study that tracks functions of OGT in early embryonic development. The studies are well-controlled and rigorous.

      R1: We thank the reviewer for their clear understanding and their appreciation of the rigor and impact of this work.

      Significance

      C2: This is a good study and novel. Not only is it of interest to reproductive biologists, but it echoes themes found in O-GlcNAc biology.

      R2: We are pleased that the reviewer underlined the novelty of the study and its impact across fields.

      Reviewer #2

      Evidence, reproducibility and clarity

      Comments to authors

      C3: To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development. The study represents a substantial advance in our understanding of OGT and O-GlcNAcylation in mammalian development. The creation of novel murine models and inducible systems is an important contribution, providing powerful tools for future research in this field. The insights into the role of OGT's catalytic activity and its involvement in epigenetic regulation during embryonic development are noteworthy, opening new avenues for research.

      R3: We thank the reviewer for their insightful comments. We are grateful for the supporting statements. Please find below detailed responses to all your comments.

      However, there are a few considerations and concerns:

      Major:

      C4: 1. An assumption of the study is that different mutations cause different levels of O-GlcNAcylation rather than alterations in substrate specificity. It might be important to test, at least in cultured cells, that the different mutations do not change the preference of OGT to modify certain proteins rather than others, which can provide alternative explanations for their findings.

      R4: Thank you for asking this question; it helped us to better explain the rationale behind the choice of the Ogt amino-acid substitutions.

      This is a critical point that we carefully considered in the design of the single amino-acid substitutions. Two lines of evidence support that the precise mutations created impact the catalytic rate without modifying the substrate specificity:

      First, as explained in the text, the choice of the single amino-acid substitutions was driven by previous structural and enzymology knowledge. The impact of the four selected point mutations on OGT protein stability and on the Michaelis-Menten kinetic values had previously been determined experimentally (Fig. 1A legend and Martinez-Fleites, C. et al. Nature Structural & Molecular Biology 2008; https://doi.org/10.1038/nsmb.1443).

      There is a second important rationale that we added in the revised manuscript: the four selected point mutations are all located in the catalytic domain (specifically, H568A in the N-Cat domain and Y851A, T931A and Q849A in the C-Cat domain), while substrate recognition is mediated by two other domains, namely the intervening domain (Int-D; https://doi.org/10.1038/s41589-023-01422-2) and the tetratricopeptide repeat (TPR) superhelix (10.1021/jacs.7b13546; https://doi.org/10.1073/pnas.2303690120). Therefore, for both these reasons, it is extremely unlikely that these mutations could influence the substrate specificity.

      C5.1: 2. In Fig 1D and 1H, the thresholds to define a gene or TE as differentially expressed are not strong. According to the figure legends, "any" change in terms of log2Fc was considered as DE and colored. I think the figures should illustrate better that the changes are subtle, by for example adding a dotted line (at least) in the value 0.5 of the y-axis. These subtle transcriptional changes should be reflected better in certain paragraphs where the expression of TEs are presented/and discussed as a hallmark of the absence of O-GlcNAcylation in the OGT-mutants. The same happens with Suppl Fig 3C (changes are very minor). Applying a stronger threshold, among the upregulated genes, only Xist will be significantly overexpressed. If a gentle threshold needs to be applied to this data, authors should at least justify the reasons behind doing so. Same for Fig2D.

      R5.1: The reviewer means Figure 2D for the MA plot of gene expression and Figure 2H for retrotransposon expression. These figures now include a dashed line to indicate log2FC = 0.5 (as do all MA plots).

      The text is explicit on the subtle changes in transcription; it reads "with 2/3 of the genes downregulated and 90% of the significant changes below 1 log2FC"; "most of the OgtT931del/Y embryos showed a low magnitude upregulation of retrotransposons".

      The revised text states "Notably, most of the OgtT931del/Y embryos showed a low magnitude (log2FC < 1) upregulation of retrotransposons".

      We expand on this topic in the next response (R5.2) noting that changes in gene expression upon O-GlcNAc perturbation in different systems were previously characterized as subtle and widespread. We suggest that this phenotype may arise from the scarcely understood pleiotropic function of O-GlcNAc in fine-tuning gene expression; this phenotype could have a biological significance.

      C5.2: If a gentle threshold needs to be applied to this data, authors should at least justify the reasons behind doing so. Same for Fig2D.

      R5.2: Previous studies in different systems reported that O-GlcNAc perturbation causes a widespread change in gene expression of low magnitude (https://doi.org/10.1101/2024.01.22.576677, https://www.pnas.org/doi/10.1073/pnas.2218332120). To call differentially expressed genes, we use the same thresholds as a recent functional Ogt study in ES cells, specifically: p<0.05 (Wald test), any FC (Li et al. PNAS 2023, https://www.pnas.org/doi/10.1073/pnas.2218332120). The p value threshold is standard; the absence of an FC threshold is dictated by the insufficient knowledge of the significance of the low magnitude changes observed across many transcripts.
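
      To make the two calling rules discussed here concrete, the following is a minimal sketch that contrasts "p<0.05, any FC" with the stricter alternative of additionally requiring |log2FC| >= 0.5. It is not taken from our analysis code: the `GeneResult` type, the `call_de` function, the gene names and all values are hypothetical placeholders chosen only for illustration.

```python
from typing import List, NamedTuple

class GeneResult(NamedTuple):
    gene: str
    log2fc: float
    pvalue: float  # p value from the test (e.g. a Wald test)

def call_de(results: List[GeneResult], alpha: float = 0.05,
            min_abs_log2fc: float = 0.0) -> List[GeneResult]:
    """Keep genes passing the p value cutoff and, optionally, an FC cutoff.

    With min_abs_log2fc=0.0 this is the "any FC" rule.
    """
    return [r for r in results
            if r.pvalue < alpha and abs(r.log2fc) >= min_abs_log2fc]

# Hypothetical toy table (not real data).
toy = [
    GeneResult("geneA", 0.2, 0.01),
    GeneResult("geneB", -0.3, 0.04),
    GeneResult("geneC", 1.4, 0.001),
    GeneResult("geneD", 2.0, 0.2),
]

any_fc = call_de(toy)                      # p < 0.05, any fold change
strict = call_de(toy, min_abs_log2fc=0.5)  # additionally |log2FC| >= 0.5
```

      In this toy example the stricter rule discards the low-magnitude but significant changes, which is the class of changes whose biological significance is currently not well understood.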

      C6: 3. In Figure 2B, the T931del allele was recovered in the blastocyst population with a very high frequency, even higher than the male WT group (T931del: 10; WT: 3). This observation suggests that the T931del allele did not significantly affect blastocyst survival. Further clarification or additional experiments might be necessary to understand the implications of this finding on early developmental stages.

      R6: This is only a hint, as the numbers of blastocysts recovered were too small to perform statistics on Mendelian distribution. Thus, more experiments are needed to perform these statistical tests. These experiments are onerous because the low frequency of germline transmission is incompatible with maintaining this mutation by breeding heterozygous animals. Because of this, a new mouse line would need to be created by CRISPR-HDR targeting in the zygote in order to compute statistics on Mendelian ratios. Importantly, this question - does T931del affect blastocyst survival? - is peripheral, and the results of these experiments would not affect our conclusions in any way.
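
      As an illustration of the kind of test that would be applied to such counts, the sketch below computes an exact two-sided binomial test using only the Python standard library. It uses the two counts quoted by the reviewer (10 and 3) and assumes, purely for illustration, a 1:1 expected ratio between the two classes; both the function names and the 1:1 expectation are assumptions of this sketch, not an analysis reported in the manuscript.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def exact_binomial_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Two-sided exact p value: sum the probabilities of all outcomes
    that are no more likely than the observed one."""
    observed = binom_pmf(k, n, p)
    # Small relative tolerance guards against floating-point ties.
    total = sum(binom_pmf(i, n, p) for i in range(n + 1)
                if binom_pmf(i, n, p) <= observed * (1 + 1e-9))
    return min(1.0, total)

# Counts quoted in the reviewer's comment; 1:1 expectation is an assumption.
mutant, wild_type = 10, 3
p_value = exact_binomial_two_sided(mutant, mutant + wild_type)
```

      With so few embryos, only very extreme deviations from the expected ratio can reach conventional significance, which is why a larger cohort would be required.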

      C7: 4. Similarly, in Figure 2G, there is an apparent higher expression of TE expression in the T931A/Y embryos group than in the T931del/Y group, which combined with the higher frequency of blastocyst generated in this latest group it may indicate a deeper molecular consequence after the deletion of the T931. A comparison of the transcriptome between these two cell lines would help to address this possibility. Also, the authors should compare the O-GlcNAc levels of WT, T931A, and T931del mutant blastocysts by immunostaining, similar to what was done in Figure S5F.

      R7: We agree that a direct comparison between the two mutations of the T931 residue would be interesting; however, this comment is very difficult to address experimentally for the reasons outlined below:

      Firstly, it is not possible to perform a statistical comparison of the transcriptome of T931A/Y vs. T931del/Y with the data generated because the number of hemizygous T931A/Y embryos (n=2) is too small. Hence, it cannot be ruled out that the seemingly milder retrotransposon reactivation in one of the T931A/Y embryos could have occurred by chance.

      Secondly, considering the low magnitude effect on gene expression changes upon O-GlcNAc genetic perturbation, to statistically assess the penetrance of the molecular phenotype and perform the differential expression analysis, numerous (>>3) hemizygous blastocysts of each genotype would be needed. Because females heterozygous for the T931 mutations transmit the mutant allele at very low frequency, these experiments require numerous de novo CRISPR injection sessions.

      Thirdly, for the immunostaining of O-GlcNAc to be semi-quantitative, a large number of hemizygous blastocysts for each genotype would be required (note that in Figure S5F, 29 morulae per condition were imaged), thus requiring numerous CRISPR injection experiments as discussed above. Moreover, O-GlcNAc changes could be subtler than expected based on the strong reduction of OGT activity, since as a compensatory mechanism Ogt expression is upregulated in the OgtT931A/del blastocysts (Fig. S2D), making a quantification even more challenging despite a high number of stained embryos.

      In sum, these in vivo experiments are difficult and require sacrificing many animals (about 20 females per CRISPR injection experiment). Because the results would refine the study but would not change our conclusions, we suggest that the cost outweighs the benefit.

      C8: 5. In Boulard et al. 2019 O-GlcNAcylation was shown to be sufficient to modulate expression of DNA methylation-dependent TEs. It would be interesting to know (or at least discuss) if the changes in TE expression observed in OGT-mutant embryos in this study involve changes in DNA methylation. Ideally, some DNA methylation measurement optimized for low input numbers of cells would be useful.

      R8: Thank you for making the link with our previous study. In the PNAS paper, we report that targeted removal of O-GlcNAc at proteins bound to specific TEs (e.g. IAPez) causes their full-blown reactivation without detectable changes in DNA methylation, thus suggesting a role of the O-GlcNAc modification for the silencing of methylated TEs downstream or independent of DNA methylation. We agree that it would be informative to quantify DNA methylation in the T931-mutant blastocysts to test if the in vitro result is the same in vivo, but this would require performing onerous microinjection sessions as explained above.

      C9: 6. The data related with the OGT-degron system in MEFs seem disconnected with the rest of the manuscript. While the developmental models (blastocyst, etc) elegantly assess the contribution of O-GlcNAcylation to the control of cell survival and gene expression through the use of different OGT mutants, the degron system is a system of graded depletion that unfortunately was only possible to be used in MEFs (instead of embryos). Thus, the results obtained with the degron system in MEFs are difficult to intersect with the data from the use of OGT-mutants in embryos. Even though there are obvious interesting questions that one may want to know about this OGT degron MEF system, none of them would demonstrate a direct role for O-GlcNAcylation in cellular function, the major point addressed in the developmental system. Using the degron system in embryonic stem cells might have provided a more parallel comparison. The authors should discuss this point in more detail and either use ESC instead of MEFs or provide a stronger justification for the use of MEFs over ESC.

      R9: We thank the reviewer for their clear understanding of the system. The choice of primary MEFs as an in vitro model was imposed by technical limitations we encountered during the study. We fully agree that ES cells are the model of choice for preimplantation embryos; thus we initially derived ES cells and obtained only one male clone bearing the AID degron system. Upon auxin addition to the culture media, OGT's level remained unchanged in ES cells. Thus, the ES cell model was not usable. To test the AID degron in a different cell type, we then derived MEFs and showed its effectiveness (Figures 4C and S4C-E), which also allowed us to collect functional data on OGT's cellular function (Figures 4D-F). We took the comment on board and clarified the rationale for studying MEFs in the revised manuscript. We agree that it remains to be verified that the OGT-dependent pathways uncovered in MEFs are relevant in the preimplantation embryo. Despite this caveat, we feel the mouse model for the endogenous OGT-degron, as well as the negative results in vivo and the conclusions in MEFs, should be shared with the community, which could take advantage of our results to refine the system.

      Minor:

      C10: 7. In Fig 2C the color and shape codes are confusing to understand - there are some colors/shapes that are not represented in the PCA plot. The same in Fig 3H, where in the PCA plot there are pink triangles that do not match with the code legends.

      R10: We apologize for the confusion with the legends of Figures 2C and 3H, which we have made unambiguous in the revised version (as well as those of Figures S2B,C and S3C).

      C11: 8. In the figure legends of Figures 2D, 2E, 2F, and 2H, the notation should be corrected from "OgtT931A/Y" to "OgtT931del/Y".

      R11: This has been corrected; many thanks for bringing it to our attention.

      Significance

      C12: To investigate the function of OGT at specific developmental stages, the authors perturbed OGT's function in vivo by creating a murine allelic series featuring four single amino acid substitutions that variably reduced OGT's catalytic activity. The goal was to identify the direct effect of O-GlcNAcylation, using a sophisticated collection of genetic mutants to evaluate in vivo the role of this modification at early stages of development. Overall, the severity of embryonic lethality correlated with the extent of catalytic impairment of OGT, demonstrating that the O-GlcNAc modification is essential for early development.

      R12: We thank the reviewer for their clear understanding of our work and their appreciation of the biological importance of the findings.

      Reviewer #3

      Evidence, reproducibility and clarity

      C13: This is a conceptually interesting paper that attempts to leverage the knowledge of OGT catalysis to begin to dissect OGT function. The evidence is presented in a straightforward fashion and is in general well documented. The breeding strategies are well informed and the paper draws heavily on previous work carried out in the mouse.

      R13: We greatly appreciate the overall supportive review. However, we fail to understand what the reviewer means by "the paper draws heavily on previous work carried out in the mouse". This comment may stem from a misunderstanding because this work is not based on any previously published study. Specifically, neither the seven murine alleles presented and analyzed nor the single embryo-transcriptomic data sets on which our conclusions are based have been published elsewhere.

      To put this work into context, before our study there were two seminal studies published two decades ago that reported the essential role of Ogt for mouse development, but no molecular profiling was performed (10.1073/pnas.100471497, 10.1128/mcb.24.4.1680-1690.2004). The two Ogt loss-of-function alleles studied in these papers were deemed unsuitable for interrogating molecular phenotypes because they caused cell death, which confounds molecular profiling, and embryonic lethality at implantation, which prevents study of the sexually-dimorphic role of Ogt in the placenta. To overcome this long-standing problem, we created seven new murine alleles, which allowed us to tease apart molecular phenotypes at key stages of mouse embryonic development, focusing on the blastocyst and the placenta.

      Significance

      C14: The paper describes tools which will help dissect the many potential roles of O-GlcNAc addition in early development. As it stands, this is a descriptive manuscript that will lead to hypothesis generation and testing and this should not be undervalued. The biological reagents produced and characterized will be of general interest to the field. Most of the findings presented represented a verification of existing ideas in the field but this is not meant as a criticism since part of the motivation for the approach was to generate a reproducible system for analyzing the biological phenomena.

      R14: We thank the reviewer for their appreciation of the importance of experimentally testing ideas shared in the field without direct evidence.

      However, we must respectfully disagree with the qualification of "descriptive manuscript". This qualification may stem from the particularly difficult challenge of accessing the molecular details of how the O-GlcNAc modification exerts the biological functions we report. We are fully cognizant of the limitations of the study, which we discuss in the discussion section and in R20.2. However, we feel that the adjective "descriptive" is not a fair qualification because we provide substantial novel functional evidence. Specifically, we introduce two novel orthogonal in vivo perturbations for endogenous Ogt that allowed us to interrogate for the first time its function in the developing mouse embryo. These perturbations allow us to draw causative conclusions (not descriptive) on the essential role of the O-GlcNAc modification itself for preimplantation development, its sexually-dimorphic role in the placenta and its requirement in vivo for the stable repression of retrotransposons.

      C15: There are perhaps some bioinformatic shortcuts taken that may need to be corrected upon thorough review. These do not lessen the overall impact of the contribution.

      R15: All the code written for the bioinformatic analyses performed in this study is publicly available: https://github.com/boulardlab/Ogt_mouse_models_Formichetti2024. The reviewer needs to specify which bioinformatic analysis they suggest could be improved.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      C16: O-GlcNAcylation is the fundamental post-translational modification of numerous nuclear and cytosolic proteins. OGT is the sole enzyme catalyzing O-GlcNAc addition onto the proteins. The essentiality of OGT for early development and cellular viability has been established by using OGT-KO mice and cell lines. However, it remains to be elucidated whether the catalytic activity of OGT is required for the early development, and if the catalytic activity of OGT is required what are the functions of OGT or O-GlcNAcylation in early development due to a lack of appropriate mouse models. To overcome the technical difficulty of manipulating the levels of O-GlcNAcylation in early embryos, Formichetti et al. created a series of four mouse models (OgtY851A, OgtT931A, OgtQ849N, and OgtH568A) with different OGT activity by introducing single amino acid substitutions in the catalytic domain. By analyzing the inheritance of the hypomorphic OGT alleles and the lethality of mouse embryos, they discovered OGT activity is a critical factor for early development. Subsequently, RNA-seq analyses with two mouse models showing the maternal inheritance of the hypomorphic OGT alleles indicated that severe hypo-OGT activity altered transcription and silencing of retrotransposons in preimplantation development while mild reduction of OGT's activity affected placental development in a sexually dimorphic manner rather than preimplantation development. Furthermore, to study the function of OGT at specific developmental stages, they developed a mouse model bearing endogenously AID-tagged OGT for acute degradation of OGT. Although the degron system wasn't efficient in preimplantation embryos, they discovered quick transcriptional changes upon OGT deletion in MEFs. The quality of the manuscript is good because the question to be solved was appropriately set, the approach was well designed, and their findings were interesting, although their writing was sometimes hard to understand as I raised in my following comments. Nevertheless, there are several points to be fixed before being published.

      R16: We thank the reviewer for their clear understanding of our work and their appreciation of the biological importance of the findings. The comprehensive review of the manuscript and the questions raised were extremely helpful in improving the manuscript and fully addressing its limitations. Below, we respond to the comments in full; we have also revised the manuscript to improve clarity and have included novel results.

      Major Comments

      C17: 1. Although the authors showed in vitro activity of each mutant of OGT used in this manuscript by referencing the previous literature, they never showed the levels of global O-GlcNAcylation (and OGT itself) in their established mouse embryos. Although it could be impossible to determine O-GlcNAc levels in OgtQ849N and OgtH568A embryos because of the lack of germline transmission and founder line, respectively, they could do that in OgtY851A and OgtT931A embryos. Given that Y851A and T931A mutants had similar VMAX/KM with different VMAX, it is possible that their activity is comparable or Y851A has even lower activity in vivo depending on the concentration of UDP-GlcNAc in embryos. Therefore, it is critical to assess whether in vivo OGT activity is correlated with that in vitro as expected to conclude that severity of sub-Mendelian inheritance is proportional to the reduction of activity of OGT in vivo. Moreover, since the authors developed the elegant system to deplete OGT, the activity of Q849N and H568A mutant OGT can be examined at least in cells by expressing them in MEFs with OGT-degron system. Thus, I propose determination of global O-GlcNAc levels compensated by OGT levels by western blotting in OgtY851A, and OgtT931A embryos or MEFs with the OGT degron system re-expressing the individual four mutant OGTs. If the protein amount is insufficient for western blotting in the embryos because of the sizes of the earlier stages of embryos, I believe the author could address this by utilizing immunofluorescence as shown in Figure S5.

      R17: We fully agree that this is an important point that requires revision. The only mutation for which the level of O-GlcNAc and OGT can be assessed by western blot in vivo is Y851A, the other mutations resulting in embryonic lethality before the blastocyst stage.

      We have included in the revised manuscript western blot analyses of protein expression for OGT, OGA and O-GlcNAc levels in the placenta of the OgtY851A mutants (new Figures 3C,D). The new data show that OGT is upregulated at the protein level in homozygous females, in good agreement with our transcriptomic analysis. Furthermore, O-GlcNAc levels were slightly reduced in homozygous and hemizygous placentae thus showing the impact of the point mutation on global O-GlcNAc levels in the placentae. Moreover, the analysis of OGA protein level unexpectedly revealed the enrichment of a previously uncharacterized OGA fast migrating isoform in hemizygous and homozygous placentae.

      We agree that it would be informative to compare O-GlcNAc levels in OgtT931A versus OgtY851A embryos. A comparison implies performing the experiment at the same developmental stage, which has to be the blastocyst stage or earlier because T931A/Y embryos die around implantation. Because the blastocyst is made of approximately 140 cells, many single blastocysts would have to be pooled to obtain the necessary protein input for western blot. We are not aware of another study performing western blot with pooled blastocysts. An additional major challenge for this experiment is the necessity to genotype and sex the blastocysts before pooling. Thus, the feasibility of this experiment is uncertain.

      As an alternative, the reviewer suggests measuring O-GlcNAc levels in the degron MEFs after introduction of OGT transgenes bearing the mutation studied. This experiment would not be conclusive because of residual O-GlcNAc after OGT degradation (Figure S4E). Furthermore, the O-GlcNAc proteome is dynamic during development (as shown in the developing brain by Liu et al. https://doi.org/10.1371/journal.pone.0043724), therefore the MEFs results would have limited value to explain our results in the early embryo.

      In sum, available technologies to quantify O-GlcNAc (e.g. western blot, mass spectrometry) are inadequate for low-input samples such as the early embryo. However, our series of hypomorphic alleles, backed up by in vitro enzymology measurements, brings indirect evidence to this question. Specifically, the qualitative correlation between the OGT activity measured in vitro and the developmental phenotype indicates that the resulting relative levels of O-GlcNAc are consistent with the in vitro measurements.

      C18.1 : 2. I didn't understand why the authors couldn't find any founder lines of the OgtH568A mutant. Was that because mosaic mice with OgtH568A mutation are lethal?

      R18.1: To answer this question, it is important to recall two key features of the biological system:

      1) The mutation H568A was reported to disrupt the glycosyltransferase activity completely (10.1038/nsmb.1443). Hence, OGT-H568A is catalytically dead.

      2) We performed the CRISPR-HDR targeting in the 1-cell embryo.

      Based on these premises, the absence of F0 with the OgtH568A mutation (0/31) suggests that introducing this mutation causes embryonic lethality in both males and females. This hypothesis is consistent with the previously reported lethality of Ogt-null alleles around implantation (10.1128/mcb.24.4.1680-1690.2004). It is possible that the sgRNA is very efficient and results in homozygous mutations in all female zygotes injected (as we have not obtained heterozygous females bearing these mutations). High efficiency of the targeted mutagenesis in the zygote results in mutants where all or the majority of cells bear the mutation (no or low mosaicism). The high number of microinjections performed (416 embryos over the 3 injection sessions) allows us to make these claims.
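
      The weight of the 0/31 observation can be gauged with a simple binomial calculation. The sketch below is illustrative only: the per-founder carrier probabilities `p` are hypothetical values, not rates measured in this study, and founders are assumed to be independent.

```python
# Probability of recovering zero carriers among n founders if each founder
# independently carried the allele with probability p.
# The values of p below are hypothetical and chosen only for illustration.
def prob_no_carriers(p: float, n: int = 31) -> float:
    return (1.0 - p) ** n


for p in (0.05, 0.10, 0.20):
    print(f"p = {p:.2f}: P(0/31 carriers) = {prob_no_carriers(p):.4f}")
```

      Under these assumptions, the probability of seeing no carrier at all drops quickly as the assumed carrier rate rises, which is why a complete absence of founders is informative only when the expected targeting efficiency is reasonably high.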

      C18.2 : Also, I believe there was no explanation why the OgtQ849N allele showed no maternal inheritance. Was that because Q849N possesses enough activity for sustaining mosaic embryos, but not oocytes? The authors should better explain these points in the manuscript text.

      R18.2: Thanks for this comment, we agree that this maternal effect phenotype demands further explanation.

      The phenotype observed suggests two possibilities: either the oocyte cannot mature, or the cleavage-stage embryo cannot develop with the resulting lower levels of O-GlcNAc. The cleavage-stage embryo does not transcribe a catalytically active OGT before the 8-cell stage and thus relies on the OGT protein inherited from the oocyte until this stage (https://doi.org/10.1101/2024.01.22.576677).

      We added this interpretation of the result to the text:

      "The lack of maternal transmission of the Q849N allele from seemingly mosaic founder females is likely explained by the reliance of the cleavage stage embryo onto the oocyte payload of OGT and O-GlcNAc modified proteins. Specifically, Ogt's exons encoding for the catalytic domains are not detectable before the 8-cell stage, while OGT full-length protein is present and thus maternally inherited (Formichetti et al, 2024)."

      C19: 3. The authors serendipitously found a T931del-allele in the "WT" allele of the OgtT931A line, and suggested that T931del had milder activity loss, although the lethality of embryos was greatly mitigated. Nevertheless, transcriptome analyses in male blastocysts revealed that 120 genes' expression was changed in T931del/Y males. This raised the question about which mutant OGT has higher activity, Y851A or T931del. I think comparing the activity of Y851A and T931del mutants in MEFs with OGT-degron system is important to confirm the proportional relationship between activity and phenotypic severity.

      R19: We agree that it is a limitation that the effect of the T931del mutation on OGT activity has not been biochemically characterized. However, the important point here is that our assessment of phenotypic severity based on maternal inheritance of the mutant allele and embryonic lethality is based on the point mutations for which the catalytic activity has been determined, namely Y851A, T931A, Q849N and H568A, but not T931del.

      We studied the serendipitously discovered T931del mutation to obtain transcriptional insights in the blastocyst. Because the deleted residue T931 is key for the binding to the donor substrate, we can reasonably assume that this mutation affects the catalytic activity, albeit to an undetermined level.

      Hence, our conclusions regarding the requirement of O-GlcNAcylation for development are unaffected by the lack of biochemical knowledge on T931del.

      C20.1: 4. Regarding transcriptomes of T931del/Y, the authors found the upregulation of proteasomal activity and stress granules along with the downregulation of amino acid metabolism, mitochondrial respiration, and so on. To validate the results, the authors should perform qPCR on several up- or down-regulated genes.

      R20.1 : We agree that, in principle, qPCR validation is suitable. However, this validation experiment is particularly expensive in this case because of the requirement of numerous CRISPR zygote pronuclear injection sessions.

      The conclusions of the RNA-seq analysis are strongly supported by a high number of biological replicates (n=10). This high number of biological replicates was essential to obtain sufficient statistical power to quantify with a high level of confidence transcriptional changes of low magnitudes (below 2-fold change, see R5.1 and R5.2).
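
      The link between replicate number and the ability to detect changes below 2-fold can be illustrated with a rough power calculation. This is a minimal sketch, not the analysis used in the study: it relies on a normal approximation to a two-group comparison of log2 expression, ignores multiple-testing correction, and the effect size, standard deviation and per-group sample sizes are hypothetical values.

```python
from statistics import NormalDist


def approx_power(log2_fc: float, sd: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test on log2 expression.

    log2_fc, sd and n_per_group are illustrative inputs, not values from the study.
    The (negligible) contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    standard_error = sd * (2.0 / n_per_group) ** 0.5
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(abs(log2_fc) / standard_error - z_crit)


# Hypothetical 1.5-fold change (log2 of about 0.58) with an assumed SD of 0.4.
for n in (3, 5, 10):
    print(f"n = {n:2d} per group: power is roughly {approx_power(0.58, 0.4, n):.2f}")
```

      With these assumed numbers, power rises steeply between small and larger group sizes, which is the general reason why many replicates are needed to call low-magnitude changes reliably.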

      Therefore, the qPCR validation experiment would require repeating the CRISPR zygote pronuclear injection sessions with the same high number of animals. This represents a major investment in experimental work and the sacrifice of about 40 animals. Importantly, the RNA-seq results presented are authoritative because of the high number of biological replicates and the high number of sequencing reads per sample. Thus, we argue that qPCR validation is not essential and that the high cost of this experiment is difficult to justify.

      C20.2: In addition, according to Figure S2E, the authors pointed out that at least for genes upregulated in OgtT931A embryos, the changes were not explained by a developmentally delayed transcriptome, suggesting that upregulation of these genes was the cause of developmental delay. Therefore, I strongly encourage them to discuss in the manuscript text how up-regulated genes could contribute to developmental delay.

      R20.2: Throughout the manuscript, we have been cautious to avoid establishing causal relationships between the differentially expressed genes uncovered and the developmental phenotypes (e.g. delayed development). There are two main obstacles which we believe prevent us from establishing causality with the data available. Firstly, it is not possible to disentangle differentially expressed genes and developmental delay (in other words, we have no way to tell which is the cause and which is the consequence). Secondly, O-GlcNAc modifies over 5000 proteins and the developing embryo is a particularly dynamic system; thus we cannot know whether the differentially expressed promoters are direct targets of O-GlcNAc modified proteins (or alternatively secondary effect of another molecular alteration, for example of the proteome). We discuss this limitation of the study in the discussion section.

      C21: 5. Regarding the transcriptome in OgtY851A mice, Y851A/Y male mice had huge transcriptomic differences, while Y851A/Y851A female mice barely had any. Although it seems to agree with the number of Ogt alleles, I wonder whether other X-linked genes expressed higher in female placenta as shown in Figure 3C could attenuate the effects of decreased OGT activity. I don't think this possibility can be excluded, unless the authors further decrease OGT activity in Y851A/Y851A female placenta and obtain the similar results as for male placenta. Or if they compared the levels of global O-GlcNAcylation between Y851A/Y and Y851A/Y851A mouse placentas and discovered they had similar levels of O-GlcNAcylation, then the authors could conclude that the number of Ogt alleles was not the reason of sexual-dimorphism. The authors should determine the levels of O-GlcNAcylation in Y851A/Y and Y851A/Y851A mouse placentas and/or at least discuss the above possibilities in the manuscript text.

      R21: Thank you for the thoughtful feedback. We agree that the most likely explanation for the higher sensitivity of male placentae, compared with female placentae, to reduced OGT activity is the difference in Ogt copy number, especially because Ogt escapes X-chromosome inactivation in the placenta (new Figure S3A).

      Western blot quantification of global O-GlcNAc levels has now been performed (new Figures 3C,D). We measured similar levels of O-GlcNAc in Y851A/Y and Y851A/Y851A placentas (lower than in WT males in both cases), but we cannot exclude that the western blot lacks the dynamic range required to detect a subtle difference. In fact, homozygous females were expected to have an intermediate level between WT males and hemizygous males, and the difference between the two male genotypes (also considering sample-to-sample variability) is already small when quantified from the blot (new Figure 3D). If O-GlcNAc levels are indeed identical in homozygous females and hemizygous males, it is possible that an X-linked modifier attenuates the impact of hypo-O-GlcNAcylation in the female mutant placenta. We thank the reviewer for this idea, which we included in the revised manuscript:

      "Of note, the lower sensitivity of the homozygous females' transcriptome to Ogt disruption (Fig. 3F,I and S3B) seems difficult to reconcile with their O-GlcNAc level, which is comparable to that of the hemizygous males and lower than in WT (Fig. 3C). It is possible that the western blot technique is not sensitive enough to detect subtle differences in O-GlcNAcylation. An alternative hypothesis, if O-GlcNAc levels were truly identical between Y851A/Y and Y851A/Y851A, could be the existence of a modifier in females that could be an XCI escapee."

      C22: 6. In terms of the transcriptome in OgtY851A mice, similar to comment 4, the authors should confirm their transcriptomics data shown as Figure 3D by qPCR. In addition, the authors should describe the potential mechanisms by which the differentiation of precursor cells of LaTPs and JZPs were disrupted. Were master regulators of the differentiation known to be O-GlcNAcylated and loss of O-GlcNAcylation perturbed the function?

      R22: As for the whole embryo discussed in R20.2, we also interpret cautiously the gene expression phenotype observed in the placenta. Specifically, we state in the manuscript that it could be caused either by an impact of lower O-GlcNAcylation on placental differentiation or by a general delay in placentation or in the development of the embryo as a whole. The hypothesis of a general delay (of the whole embryo and/or of placental formation specifically) is supported by the downregulation of essentially all markers of more differentiated cell types and the upregulation of the precursor marker. We favor this hypothesis because it is consistent with what was observed with the T931 mutants and also with the enzymatic removal of O-GlcNAc in the zygote (Formichetti et al. 2024, BioRxiv). Because of the thousands of O-GlcNAcylated proteins present in the cell, it is impossible to know which molecular mechanism is responsible, and this mechanism could even start at much earlier stages.

      Minor Comments

      C23: 1. Regarding DFP461-463 mutant, I couldn't understand the point of this figure because the results had no difference, and the meaning of the mutation was quite different from the others. Thus, the figure was awkward and a little confusing to me. If the authors still want to include the figures, I would suggest that they should reorganize the position of the figure (maybe after figure 3 is better to show you had tried to investigate the effects of nuclear localization of OGT on the changes of transcriptomes) and add some results. Since WT OGT seems to be localized mainly in the cytosol at steady state (Figure S1B and S1C), the effect of mutation on its nuclear localization should not be obvious. Therefore, it is difficult to conclude the mutation had no effect on the nuclear localization unless the ratio of nuclear and cytosol localization is quantified. Also, I wonder whether the O-GlcNAc levels of nuclear and cytosolic proteins in the mutant cells were comparable to those in WT cells. If this is the case, the results would also support the authors' conclusion.

      R23: We took the comments on board and made it clearer that the rationale for the DFP461-463 mutant was an attempt to separate OGT's nuclear and cytosolic functions. We fully agree that these results are peripheral, and thus we presented these results in Supplementary Figure 1 (not in the main figure).

      The biochemical evidence presented in Fig S1C shows that the genetic substitution of DFP to AAA on endogenous OGT has no detectable impact on its nuclear localization in primary MEFs. This result is far more authoritative than the evidence provided by Seo et al. 2016 (doi: 10.1038/srep34614), which is based on the overexpression of OGT transgenes in HeLa cells. Importantly, Seo et al. 2016 did not assess the impact of their mutations on endogenous OGT.

      We believe that the negative results we obtained with the DFP461-463 mouse model will be extremely valuable for the field. Firstly, science can move forward only if both negative and positive results are shared. In this specific case, we found that mutation of endogenous OGT in MEFs yielded a different result from the previously reported overexpression of the same mutant construct in HeLa cells. Secondly, we want to make the Ogt-NLS- mouse model available for further investigations.

      C24: 2. Since OGT or O-GlcNAcylation regulates chromatin status, the authors analyzed the gene expression profiles of retrotransposons in T931del/Y or T931A/Y mice. Is it possible to investigate if the release of gene silencing is also seen in non-retrotransposon genes? I assumed retrotransposons might be a well-established system to analyze gene silencing status, however, if the authors could find similar effects on genes other than retrotransposons, that would be highly valuable.

      R24: This is an interesting idea. This notion refers to the activation of promoters that are normally epigenetically repressed (i.e. silent despite the presence of all trans-acting factors required for their expression). Epigenetically repressed promoters include retrotransposons, imprinted genes and germline-specific genes that are normally expressed in germ cells and maintained in a repressed state in somatic cells (10.1038/s41580-019-0159-6). Testing mono-allelic expression of imprinted genes requires F1 hybrids. Thus, we assessed whether well-studied germline-specific genes could be released from silencing in T931del/Y or T931A/Y blastocysts and found no evidence for it (see dot plot below). The unbiased transcriptomic analysis presented in the manuscript shows that the products of upregulated genes are enriched in mRNA processing (Figure 2E), but these genes are not normally epigenetically repressed. Thus, contrary to retrotransposons, the role of O-GlcNAc at cellular gene promoters appears not to be linked to epigenetic silencing. This could be explained by the many different protein substrates for O-GlcNAc.

      C25: 3. OgtY851A mice with milder OGT activity loss didn't exhibit impaired preimplantation development, but did display postimplantation development such as placental development, suggesting that O-GlcNAcylation of proteins required for preimplantation and postimplantation development relies on different degrees of OGT activity. I wonder whether global O-GlcNAc levels in embryos in preimplantation and postimplantation developmental stages are different or not. This might include both the pattern of blotting and intensities. The results would give the authors an explanation why the dependency on OGT activity was different in two developmental stages. Can the authors provide data? If not, then the authors should at least describe hypotheses in the manuscript to address these questions.

      R25: We recently reported that the subcellular patterns of O-GlcNAc are highly dynamic during preimplantation development (Formichetti et al. 2024, BioRxiv). The most striking O-GlcNAc remodeling we observed is the enrichment of nuclear O-GlcNAc as compared to cytoplasmic O-GlcNAc that is concomitant to embryonic genome activation (Formichetti et al. 2024, BioRxiv). We quantified the ratio of the nuclear/cytoplasmic signal by immunofluorescence, but absolute quantification is not possible with this method. Due to the limited number of cells of the preimplantation embryo, this analysis cannot be performed by western blot. Hence, there is no appropriate method to quantitatively compare O-GlcNAc levels between preimplantation and postimplantation embryos.

      C26: 4. The authors' AID-degron system elegantly worked in MEFs but was inefficient in preimplantation embryos. I wonder if this was because of the high expression of the shorter isoform of OGT detected as OGTp78 in the author's western blot. Is it possible to examine this possibility in the embryos? Either way, the authors should describe a potential explanation for why the efficiency in the embryos was low. In addition, the authors should describe why they inserted the AID tag only into the longest OGT isoform.

      R26: This is a good point. The smallest isoform OGTp78 bears the catalytic domain and thus can partially compensate for the degradation of OGTp110. Note that the level of OGTp78 is low and does not increase upon OGTp110 degradation; thus a compensation can only be partial (Figures S4A and S4D). Alternative hypotheses for the ineffectiveness of the degron system in embryos grown ex vivo include: i) an expression level of OsTIR that may be too low in the early embryo (the Rosa26 promoter not being activated at EGA), ii) a possible steric hindrance of the N-terminal AID tag in these cells, iii) a likely suboptimal concentration of auxin, which is limited by its toxicity to the embryo. Testing these possibilities is very difficult in preimplantation embryos.

      It is unclear how the OGTp78 isoform is produced; it was hypothesized to originate from an alternative transcription start site (https://doi.org/10.1007/s00335-001-2108-9). We initially attempted to target both isoforms by inserting the AID tag at the C-terminus, but we were unsuccessful in producing this mouse model. It is possible that the C-terminus that is near the catalytic site cannot tolerate the AID knock-in.

      C27: 5. In Figure S1C, is the band detected right below OGTp78 in nuclei fractions non-specific or do both bands correspond to OGTp78 ?

      R27: To answer this question, a knockout control would be needed. Because OGTp78 is not targeted by our AID-degron, we cannot test the specificity of these bands using our perturbation toolkit.

      C28: 6. Figure 1D top row third column: hemizgous -> hemizygous

      R28: Many thanks; the embarrassing typo has been corrected.

      C29: 7. Figure 1D second row third column: hemyzygous -> hemizygous

      R29: Thanks for bringing this other typo to our attention, it is now corrected.

      Reviewer #4 (Significance (Required)):

      General assessment: strengths and limitations

      C30: Strength: This manuscript elegantly revealed the requirement of OGT in mammalian development by taking advantage of knock-in mouse models with different OGT activity. In addition, the manuscript provided interesting and important transcriptomics data in both pre- and post-implantation embryos of OGT mutant mice. These data sets could explain detailed mechanisms of how OGT or O-GlcNAcylation regulates mammalian development in the future. Furthermore, development of the AID-tagged OGT system would be a useful tool for other researchers studying OGT function.

      Limitation: Although they found interesting changes in terms of transcriptomes in developing mice with different OGT activity, they lack the data showing how these changes caused the observed phenotypes. In other words, there are fewer mechanistic insights behind the developmental problems seen in mice with different OGT activity.

      In addition, although I agree the question about whether OGT activity itself is crucial for the early development of mammals has not been completely solved for a long time, I assume people thought OGT activity is actually important for mammalian development through the observation of OGT-linked congenital disorders of glycosylation.

      Therefore, I would say the novelty of the manuscript is a little less impactful. Furthermore, although AID-tagged OGT system revealed fundamental questions regarding the transcriptional changes upon acute depletion of OGT in cellular levels, the system was inefficient in mouse embryos. So, they showed nothing about developmental-stage specific requirements of OGT.

      Advance: The manuscript can fill a current gap regarding requirement of OGT in mammalian development. Also, the manuscript developed a series of mutant mice with different OGT activity and an AID-tagged OGT mouse line. These mice provide technical advances.

      Audience: The manuscript will be of interest to researchers in specific fields such as glycobiology, developmental biology, and clinical fields.

      Describe your expertise: Biochemistry, Glycobiology, Cell biology

      R30: We are thankful for the constructive and supportive review.

      We fully agree with the limitations of the study and discussed them in the manuscript. Our in vivo approach revealed the most phenotypically relevant transcriptional phenotypes resulting from OGT catalytic impairment during embryonic development. We make the mouse models created for this study available to the community to facilitate follow-up studies aiming at exploring the underlying molecular details.

      As pointed out in the comments, the requirement of OGT glycosyltransferase activity for mammalian development was widely assumed by the field, but this belief was without direct experimental evidence. This study provides the first in vivo evidence for this important conclusion.

      Conclusion: The reviewers' comments were tremendously useful in improving the clarity of the manuscript and adding important new in vivo evidence. We note that none of the reviewers provided any reason to doubt our important conclusions:

      • The demonstration that the enzymatic activity of Ogt, thus the O-GlcNAc modification itself, is essential for preimplantation development.
      • The finding that a mild reduction of OGT's activity is sufficient to perturb the silencing of multiple families of retrotransposons in the growing embryo.
      • The indication, from transcriptomes of hypo-O-GlcNAcylated embryos, of a developmental retardation upon a mild O-GlcNAc perturbation.

      • The discovery that OGT's rapid depletion in vitro downregulates basal cellular function, including translation. This result provides mechanistic support to the embryonic growth delay resulting from decreasing O-GlcNAc in vivo.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      Thank you for your review and pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the introduction to make it more clear which species the experiments which motivate our investigation were performed in.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) is used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch, as often found in monkey M1 (e.g. in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results, especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).
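
      To illustrate why the linear setting is convenient, here is a minimal sketch of a finite-horizon linear-quadratic regulator solved exactly by a backward Riccati recursion. It assumes discrete-time dynamics x[t+1] = A x[t] + B u[t] and a generic regulation cost (sum of x'Qx + u'Ru plus a terminal term x'Qf x). The matrices, dimensions, and cost below are hypothetical placeholders, not the networks or the reaching objective used in the manuscript.

```python
import numpy as np

def lqr_gains(A, B, Q, R, Qf, T):
    """Backward Riccati recursion for x[t+1] = A x[t] + B u[t] over T steps."""
    P = Qf
    gains = []
    for _ in range(T):
        # Optimal feedback u[t] = -K x[t], given the cost-to-go matrix P at t+1.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    # Gains are returned in forward order (t = 0..T-1); P is the cost-to-go at t = 0.
    return gains[::-1], P

def rollout_cost(A, B, Q, R, Qf, x0, gains):
    """Simulate the closed loop and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for K in gains:
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost + x @ Qf @ x

# Hypothetical example system (illustration only).
rng = np.random.default_rng(0)
n, m, T = 4, 2, 20
A = 0.8 * rng.standard_normal((n, n)) / np.sqrt(n)
B = rng.standard_normal((n, m))
Q, R, Qf = np.eye(n), 0.1 * np.eye(m), np.eye(n)
x0 = rng.standard_normal(n)

gains, P0 = lqr_gains(A, B, Q, R, Qf, T)
optimal_cost = rollout_cost(A, B, Q, R, Qf, x0, gains)
```

      For linear dynamics and a quadratic cost, the rollout cost under these gains equals x0' P0 x0, so a single backward pass yields the global optimum; with nonlinear dynamics no such guarantee holds, which is consistent with the local minima mentioned above.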

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which requires differentiating through the iLQR optimizer; this is possible but very costly), but the question inevitably arises of what exactly C should be optimized for, and under what constraints (e.g. fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.
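
      The link between observability and the size of the required initial condition can be sketched numerically. The example below assumes continuous-time linear dynamics dx/dt = A x with readout y = C x, and uses the observability Gramian Q, which solves A'Q + QA + C'C = 0, so that the output energy produced from x(0) = x0 with no input is x0' Q x0. The matrices are hypothetical, and the comparison is between directions of a single random readout rather than between a random and an optimized readout.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n, n_out = 50, 2

# Hypothetical stable linear dynamics dx/dt = A x (not the networks from the paper).
A = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n) - np.eye(n)
assert np.linalg.eigvals(A).real.max() < 0, "A must be stable for the Gramian to exist"

# Random Gaussian readout y = C x.
C = rng.standard_normal((n_out, n)) / np.sqrt(n)

# solve_continuous_lyapunov(a, q) solves a X + X a^H = q,
# so passing a = A.T and q = -C.T C gives A^T Q + Q A = -C^T C.
Q_obs = solve_continuous_lyapunov(A.T, -C.T @ C)
Q_obs = 0.5 * (Q_obs + Q_obs.T)  # remove tiny numerical asymmetry

def output_energy(x0):
    """Integral of ||C x(t)||^2 over t >= 0 for x(0) = x0 and zero input."""
    return float(x0 @ Q_obs @ x0)

def norm_needed(direction, target_energy):
    """Norm of an initial condition along `direction` that yields `target_energy`."""
    unit = direction / np.linalg.norm(direction)
    return float(np.sqrt(target_energy / output_energy(unit)))

eigvals, eigvecs = np.linalg.eigh(Q_obs)
top_direction = eigvecs[:, -1]          # most observable direction
random_direction = rng.standard_normal(n)

norm_top = norm_needed(top_direction, target_energy=1.0)
norm_random = norm_needed(random_direction, target_energy=1.0)
```

      The most observable direction needs the smallest initial condition for a given output energy, which is the sense in which greater observability allows the same movement to be driven from a smaller preparatory state.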

      Additional comments:

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.
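
      As a brief worked note on why the absolute-value cost is awkward for a method built on local quadratic approximations, compare the curvature of the two penalties with respect to each input component u_i (standard calculus, not a result from the manuscript):

```latex
\[
\ell_2(u) = \sum_i u_i^2 \;\Rightarrow\; \frac{\partial^2 \ell_2}{\partial u_i^2} = 2,
\qquad
\ell_1(u) = \sum_i |u_i| \;\Rightarrow\; \frac{\partial^2 \ell_1}{\partial u_i^2} = 0 \ \text{for } u_i \neq 0,\ \text{undefined at } u_i = 0 .
\]
```

      The quadratic model of the absolute-value cost therefore carries no curvature in the inputs wherever it is defined, whereas a penalty on the squared difference of successive inputs remains quadratic in the inputs.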

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the author's inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g. the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g. one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the author's cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g. by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g. what brain region would that other RNN represent?).

      (5)  Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information which could encompass some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.
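
      The state augmentation mentioned above can be sketched for a discrete-time linear system. This is a standard construction and an assumption about the general approach, not the authors' implementation: the previous input is appended to the state, z[t] = [x[t]; u[t-1]], and the new control is the increment du[t] = u[t] - u[t-1], so that the penalty on successive input differences becomes an ordinary quadratic cost on du. The function name and example matrices are hypothetical.

```python
import numpy as np

def augment_for_input_smoothness(A, B):
    """Augmented system with state z = [x; u_prev] and control du = u - u_prev."""
    n, m = B.shape
    A_aug = np.block([[A, B], [np.zeros((m, n)), np.eye(m)]])
    B_aug = np.vstack([B, np.eye(m)])
    return A_aug, B_aug

# Hypothetical example system and slowly varying inputs (illustration only).
rng = np.random.default_rng(1)
n, m, T = 5, 2, 30
A = 0.9 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
u = np.cumsum(0.1 * rng.standard_normal((T, m)), axis=0)

# Original system driven directly by u.
x = np.zeros(n)
xs = []
for t in range(T):
    x = A @ x + B @ u[t]
    xs.append(x)
xs = np.array(xs)

# Augmented system driven by input increments, with u[-1] taken to be zero.
A_aug, B_aug = augment_for_input_smoothness(A, B)
du = np.diff(u, axis=0, prepend=np.zeros((1, m)))
z = np.zeros(n + m)
zs = []
for t in range(T):
    z = A_aug @ z + B_aug @ du[t]
    zs.append(z)
zs = np.array(zs)

# The temporal-complexity penalty is now a plain quadratic cost on the control.
smoothness_cost = float(np.sum(du ** 2))
```

      The augmented state has dimension n + m rather than n, and penalizing higher-order differences would require carrying further copies of past inputs, which gives one plausible picture of why the approach becomes slower as the order increases.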

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by e.g. combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback and inter-area loops, which we summarize in the connectivity, chosen so that the resulting dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

      Round 2 of reviews

      Reviewer 3:

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."

      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream. In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.
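
      The constraint b'd = 0 described above can be illustrated with a minimal sketch. It assumes a single muscle driven by a linear readout, and the weights and rate changes are hypothetical values chosen for illustration, not quantities from the manuscript. The sketch shows two ways of satisfying the constraint: the reviewer's example in which all rates go up but half the weights are negative, and an explicit projection that removes the output-potent component of an arbitrary d.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def project_to_output_null(b, d):
    """Remove from d its component along b, so that b'd = 0 afterwards."""
    scale = dot(b, d) / dot(b, b)
    return [di - scale * bi for bi, di in zip(b, d)]


# Case 1 (hypothetical): every PT neuron increases its rate,
# but half of the readout weights are negative, so the net drive cancels.
b_signed = [1.0, -1.0, 1.0, -1.0]
d_all_up = [1.0, 1.0, 1.0, 1.0]
net_drive_case1 = dot(b_signed, d_all_up)

# Case 2 (hypothetical): arbitrary weights and rate changes;
# projecting d onto the null space of b leaves activity that is
# clearly nonzero yet produces no net muscle drive.
b = [0.5, -0.2, 0.8, 0.1]
d = [1.0, 2.0, 0.5, 3.0]
d_null = project_to_output_null(b, d)
net_drive_case2 = dot(b, d_null)

print(net_drive_case1, d_null, net_drive_case2)
```

      With a single readout vector and N neurons, the null space has N - 1 dimensions, which is one way to see why preparing in it need not be difficult to engineer.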

      That is a good point – we have now reformulated this sentence to instead say “to avoid triggering premature movement, any pre-movement activity in the motor and dorsal premotor (PMd) cortices must engage the pyramidal tract neurons in a way that ensures their activity patterns will not lead to any movement”.

      (2) Line 167: 'near-autonomous internal dynamics in M1'.

      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      That is a good point: we have now edited the text after line 170 to make it clear that the underlying dynamics may not be confined to M1, and have referenced the later discussion there.

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)" Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al., we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.

      This is a nice suggestion, and we thank the reviewer for pointing us to the Haith and Krakauer paper. We have now added this reference and extended the paragraph following line 815 to briefly discuss the possible decoupling between preparation and movement initiation that is shown in the Haith paper, emphasizing how this may affect the interpretation of the internal delay and comparisons with behavioral experiments.

    1. Abstract

      Background: Visualization is an indispensable facet of genomic data analysis. Despite the abundance of specialized visualization tools, there remains a distinct need for tailored solutions. However, their implementation typically requires extensive programming expertise from bioinformaticians and software developers, especially when building interactive applications. Toolkits based on visualization grammars offer a more accessible, declarative way to author new visualizations. Nevertheless, current grammar-based solutions fall short in adequately supporting the interactive analysis of large data sets with extensive sample collections, a pivotal task often encountered in cancer research.

      Results: We present GenomeSpy, a grammar-based toolkit for authoring tailored, interactive visualizations for genomic data analysis. Users can implement new visualization designs with little effort by using combinatorial building blocks that are put together with a declarative language. These fully customizable visualizations can be embedded in web pages or end-user-oriented applications. The toolkit also includes a fully customizable but user-friendly application for analyzing sample collections, which may comprise genomic and clinical data. Findings can be bookmarked and shared as links that incorporate provenance information. A distinctive element of GenomeSpy’s architecture is its effective use of the graphics processing unit (GPU) in all rendering. GPU usage enables a high frame rate and smoothly animated interactions, such as navigation within a genome. We demonstrate the utility of GenomeSpy by characterizing the genomic landscape of 753 ovarian cancer samples from patients in the DECIDER clinical trial. Our results expand the understanding of the genomic architecture in ovarian cancer, particularly the diversity of chromosomal instability. We also show how GenomeSpy enabled the discovery of clinically actionable genomic aberrations.

      Conclusions: GenomeSpy is a visualization toolkit applicable to a wide range of tasks pertinent to genome analysis. It offers high flexibility and exceptional performance in interactive analysis. The toolkit is open source with an MIT license, implemented in JavaScript, and available at https://genomespy.app/.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giae040), where the paper and peer reviews are published openly under a CC-BY 4.0 license. These peer reviews were as follows:

      Reviewer 1: Andrea Sboner

      In this manuscript, the authors present GenomeSpy, a visualization toolkit geared toward the rapid and interactive exploration of genomic features. They demonstrate how this tool can help investigators explore a large cohort of 753 ovarian cancers sequenced by whole-genome sequencing (WGS). By using the tool, they were able to identify outliers in the dataset and refine their diagnosis. The tool is inspired by Vega-Lite, a high-level grammar for interactive graphics, and extends it for genomic applications.

      The manuscript is clearly written, and the authors provide links to the application itself, tutorials, and examples. I want to commend them for doing this. This is a tool that would nicely complement others and has the specific advantage of using high-performance GPUs that are now common in modern computers.

      The only concern that I have is about a couple of claims that may not be fully supported by the data provided: 1. Claim: users can implement new visualization designs easily. While the grammar certainly enables the users to define new designs, I do not think that this is necessarily easy, as the authors themselves recognize in the discussion section when they suggest providing templates to reduce the learning curve. Indeed, the example in Figure 2 is still quite verbose and would need some time for anyone to understand the syntax and the style. The playground web application facilitates testing it, though. 2. Claim: the grammar-based approach allows to be mixed and matched. I did not find any specific example of how to do this. It would have been quite interesting to see the intersection between the DNA representation of structural variants and RNA-seq data (if this is what it means as "mix and match").

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this manuscript, by using simulation, in vitro and in vivo electrophysiology, and behavioral tests, Peng et al. nicely showed a new approach for the treatment of neuropathic pain in mice. They found that terahertz (THz) waves increased Kv conductance and decreased the frequency of action potentials in pyramidal neurons in the ACC region. Behaviorally, terahertz (THz) waves alleviated neuropathic pain in the mouse model. Overall, this is an interesting study. The experimental design is clear, the data is presented well, and the paper is well-written. I have a few suggestions.

      (1) The authors provide strong theoretical and experimental evidence for the impact of terahertz wave frequency on voltage-gated potassium channels. However, the modulation of action potentials also relies on non-voltage-dependent ion channels. For example, I noticed that the RMP was affected by THz application as well (Figure 3F). As the RMP is largely regulated by the leak potassium channels (tandem-pore potassium channels), I would suggest testing whether terahertz wave photons also have any impact on the Kleak channels.

      Thank you for your positive comment and for providing us with this valuable suggestion. After testing the leak K+ current with and without HFTS on the SNI model, we observed a notable increase in the leak K+ current with HFTS when the holding potential surpassed -40 mV (please see the revised Figs. 2m and n). This finding prompted us to delve deeper into the shifts in the resting membrane potential (RMP). The data, along with statistical analysis, are detailed in Tables S1-3.

      (2) The activation curves of the Kv currents in Figure 2h seem to be not well-fitted. I would suggest testing a higher voltage (>100 mV) to collect more data to achieve a better fitting.

      Thanks for your advice. We repeated the experiment while maintaining the voltage of patched neurons at a higher level (>100 mV) to collect ample data for better fitting. The outcomes are illustrated in the revised Figs. 2g-j. The data reveal a significant increase in K+ conductance in the HFTS group as compared to the SNI group. We have integrated these findings into the revised manuscript, replacing the earlier results.

      (3) In the part of behavior tests, the pain threshold increased after THz application and lasted within 60 mins. I suggest conducting prolonged tests to determine the end of the analgesic effect of terahertz waves.

      Thank you for your insightful comment. We echo your curiosity about the duration of the HFTS effect. In the process of revising our work, we conducted a comparative analysis of the analgesic duration resulting from 10-minute and 15-minute applications of HFTS. The findings are visualized in the revised Fig. 5c. Our observations indicate that after 160 minutes, the PWMT value for the 15-minute HFTS group decreased to a level comparable to that of the SNI group. Meanwhile, the analgesic effects persisted for 140 minutes in the case of the 10-minute HFTS application. These results imply a direct correlation between the duration of HFTS application and the duration of analgesia.

      (4) Regarding in vivo electrophysiological recordings, the post-HFTS recordings were acquired from a time window of up to 20 min. It seems that the HFTS effect lasted for minutes, but this was not tested in vitro where they looked at potassium currents. This long-lasting effect of HFTS is interesting. Can the authors discuss it and its possible mechanisms, or test it in slice electrophysiological experiments?

      Thank you for your comment. Based on the results from in vivo electrophysiological recordings, it was observed that the effect of HFTS can endure for a minimum of 20 minutes, and this duration was even more extended in behavioral assessments. Taking your advice, we employed slice electrophysiological recording for further testing. Following a 15-minute application of HFTS, we evaluated the K+ current at 5 and 20 minutes after incubation. Our observations clearly indicated a substantial and lasting increase in K+ current, with the effect persisting for at least 20 minutes (refer to Fig. 2l). This provides confirmation of the long-lasting influence of HFTS. The relevant data and statistical analysis are documented in Table S1-2.

      (5) How did the authors arrange the fiber for HFTS delivery and the electrode for in vivo multi-channel recordings? Providing a schematic illustration in Figure 4 would be useful.

      Thank you for your comment. To enhance the reader's understanding of the HFTS delivery device during multi-channel recording, we have included a schematic illustration in Fig. 4a in the revised manuscript. The top portion of Fig. 4a depicts a quantum cascade laser (QCL) with a center frequency located at approximately 36 THz. This laser is then connected to the recording electrode via a PIR fiber. The left section illustrates the detailed structure of the recording electrode.

      (6) Some grammatical errors should be corrected.

      Thank you for your thorough review. We have carefully checked and corrected grammar errors we found throughout the entire text to ensure that readers can better comprehend the content of the article.

      Reviewer #2 (Public Review):

      In this manuscript, Peng et al. reported that 36 THz high-frequency terahertz stimulation (HFTS) can suppress the activity of pyramidal neurons by enhancing the conductance of voltage-gated potassium channels. The authors also demonstrated the effectiveness of using 36 THz HFTS for treating neuropathic pain.

      Strengths:

      The manuscript is well written and the conclusions are supported by robust results. This study highlighted the potential of using 36 THz HFTS for neuromodulation.

      Weaknesses:

      More characterization of HFTS is needed, so the readers can have a better assessment of the potential usage of HFTS in their own applications.

      Thank you for your suggestion. We have created schematic diagrams illustrating the HFTS delivery (Fig. 4a and Fig. 5a in the revised manuscript). Fig. 4a presents the structure designed for in vivo multi-channel recording. Fig. 5a shows the structure used in the behavior test, in which the recording electrode is replaced by a hollow metal tube, allowing the PIR fiber to pass through the tube and target the ACC region of the mice.

      (1) It would be very helpful to estimate the volume of tissue that can be influenced by HFTS. It is not clear how 15 mins HFTS was chosen for this functional study. Does a longer time have a stronger effect? A better characterization of the relationship between the stimulus duration of HFTS and its beneficial effects would be very useful.

      Thank you for your feedback. The degree of tissue influence is directly related to the size of the spot emerging from the fiber outlet. In our experiment, we used a PIR fiber with a 630 nm inner core diameter to propagate high-frequency THz waves. This core features a refractive index of 2.15 and has an effective numerical aperture (NA) of 0.35 ± 0.05.
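
      As a rough illustration of how the fiber parameters quoted above relate to spot size, the following sketch applies the standard relation NA = n sin(θ) to obtain the divergence half-angle and a purely geometric cone for the spot diameter. It is an assumption-laden estimate and not the authors' calculation: the surrounding refractive index `n_medium`, the distances, and the cone model itself are illustrative choices, and absorption and scattering in tissue are ignored, so the result describes lateral spread only, not penetration depth.

```python
import math


def half_angle_deg(numerical_aperture, n_medium=1.0):
    """Divergence half-angle in degrees from NA = n_medium * sin(theta)."""
    return math.degrees(math.asin(numerical_aperture / n_medium))


def spot_diameter(core_diameter, distance, numerical_aperture, n_medium=1.0):
    """Geometric spot diameter at a given distance from the fiber tip.

    core_diameter and distance must be given in the same length unit.
    """
    theta = math.asin(numerical_aperture / n_medium)
    return core_diameter + 2.0 * distance * math.tan(theta)


# NA range reported in the response: 0.35 +/- 0.05 (medium assumed to be air).
for na in (0.30, 0.35, 0.40):
    print(na, round(half_angle_deg(na), 1))

# Hypothetical example: core diameter 630 and distance 100, same length unit.
print(spot_diameter(630.0, 100.0, 0.35))
```

      In a medium with a refractive index above 1, the same NA corresponds to a smaller half-angle, so the value for air is an upper bound on the geometric spread.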

      Our decision to apply HFTS for 15 minutes in the behavioral study was primarily based on observations from in vivo multi-channel recordings. Specifically, we noticed a considerable reduction in the average firing rate of PYR cells after 15 minutes of HFTS exposure. To further investigate the correlation between the duration of HFTS stimulation and its effects, we conducted a comparative study using a 10-minute HFTS session. The results, depicted in revised Fig. 5c, reveal that the PWMT value decreased to the level seen in the SNI group after approximately 160 minutes following 15 minutes of HFTS, and after about 140 minutes with 10 minutes of HFTS. This suggests a direct relationship between the length of HFTS application and its beneficial outcomes.

      (2) How long does the behavioral effect last after 15 minutes of HFTS? Figure 5b only presents the behavioral effect for one hour, but the pain level is still effectively reduced at this time point. The behavioral measurement should last until pain sensitization drops back to pre-stim level.

      Thank you for your feedback. A similar question was also raised by Reviewer 1. As depicted in Fig. 5c, the analgesic effects lasted for 140-160 min after a 10-15 minute application of HFTS. Based on these findings, we conclude that in the SNI model, targeting the ACC brain region with HFTS for 10-15 minutes results in an analgesic effect that lasts for roughly 140-160 minutes. This provides valuable insights into the potential clinical applications and the duration of relief that can be achieved through HFTS treatment.

      (3) Although the manuscript only tested the ACC, it would also be useful to demonstrate the neural modulation effect in other brain regions. Would 36 THz HFTS also robustly modulate activity in other brain regions? Or are different frequencies needed for different brain regions?

      Thank you for your comment. We hypothesize that light waves at a frequency of approximately 36 THz effectively modulate neuronal activities in various brain regions, primarily due to their impact on K channels. Additionally, we speculate that the application of THz waves at different frequencies may influence other channels, such as Na and Ca channels, potentially facilitating or inhibiting neuronal activities. We believe this is a fascinating and significant area of research to explore in the future.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript by Peng et al. presents intriguing data indicating that high-frequency terahertz stimulation (HFTS) of the anterior cingulate cortex (ACC) can alleviate neuropathic pain behaviors in mice. Specifically, the investigators report that terahertz (THz) frequency stimulation widens the selectivity filter of potassium channels thereby increasing potassium conductance and leading to a reduction in the excitability of cortical neurons. In voltage clamp recordings from layer 5 ACC pyramidal neurons in acute brain slice, Peng et al. show that HFTS enhances K current while showing minimal effects on Na current. Current clamp recording analyses show that the spared nerve injury model of neuropathic pain decreases the current threshold for action potential (AP) generation and increases evoked AP frequency in layer 5 ACC pyramidal neurons, which is consistent with previous studies. Data are presented showing that ex-vivo treatment with HFTS in slice reduces these SNI-induced changes to excitability in layer 5 ACC pyramidal neurons. The authors also confirm that HFTS reduces the excitability of layer 5 ACC pyramidal neurons via in vivo multi-channel recordings from SNI mice. Lastly, the authors show that HFTS is effective at reducing mechanical allodynia in SNI using both the von Frey and Catwalk analyses. Overall, there is considerable enthusiasm for the findings presented in this manuscript given the need for non-pharmacological treatments for pain in the clinical setting.

      Strengths:

      The authors use a multifaceted approach that includes modeling, ex-vivo and in-vivo electrophysiological recordings, and behavioral analyses. Interpretation of the findings is consistent with the data presented. This preclinical work in mice provides new insight into the potential use of directed high-frequency stimulation to the cortex as a primary or adjunctive treatment for chronic pain.

      Weaknesses:

      There are a few concerns noted that if addressed, would significantly increase enthusiasm for the study.

      (1) The left Na current trace for SNI + HFTS in Figure 2B looks to have a significant series resistance error. Time constants (tau) for the rate of activation and inactivation for Na currents would be informative.

      Thank you for your feedback. We have carefully considered your comments and made several adjustments in the revised Figs. 2b-f to improve clarity and accuracy. Firstly, we have conducted a comparison of the time constants (tau) between the SNI group and the SNI+HFTS group. These time constants represent the latency of Na current activation or inactivation relative to the half-activated/inactivated voltage. Our analysis reveals that there is no statistically significant difference in tau between the two groups for either the activation or the inactivation curves. Secondly, we have updated the sample traces in Fig. 2b of the revised manuscript. These new traces illustrate that tau does not significantly differ between the SNI and SNI+HFTS groups, providing a visual representation of our findings. We believe that these modifications strengthen the presentation of our study's details and results, making the data more accessible and understandable for readers.

      (2) It is unclear why an unpaired t-test was performed for paired data in Figure 2. Also, statistical methods and values for non-significant data should be presented.

      Thank you for your comment. We believe you are referring to the results in Fig. 3. We agree that one-way ANOVA is the appropriate analysis, since more than two groups are compared. We therefore re-analyzed the data in Figs. 3g-k using one-way ANOVA and have included detailed statistical methods and P values in the revised manuscript.
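
      For readers unfamiliar with the test, a minimal sketch of the one-way ANOVA F statistic follows. The group labels and numbers are hypothetical placeholders, not data from the manuscript, and the sketch stops at the F statistic and its degrees of freedom; obtaining a P value additionally requires the F distribution, for example through a statistics library such as SciPy's `scipy.stats.f_oneway`.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups (e.g., sham, SNI, SNI+HFTS).
sham = [1.0, 2.0, 3.0]
sni = [5.0, 6.0, 7.0]
sni_hfts = [2.0, 3.0, 4.0]

f_stat, df_b, df_w = one_way_anova_f([sham, sni, sni_hfts])
print(f_stat, df_b, df_w)
```

      A significant F only indicates that at least one group mean differs; identifying which pairs differ requires a post hoc comparison.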

      (3) It would seem logical to perform HFTS on ACC-Pyr neurons in acute slices from sham mice (i.e. Figure 3 scenario). These experiments would be informative given the data presented in Figure 4.

      Thank you for your valuable advice. During the revision process, we performed HFTS on ACC-PYR neurons in acute slices obtained from sham mice. The findings from this experiment have been integrated into the updated Fig. 3, where the sham group is represented by the green line and histogram (the revised Fig. 3 in the manuscript). It is noteworthy that a significant decrease in spike frequency was observed in the sham mice following HFTS.

      (4) As the data are presented in Figure 4g, it does not seem as if SNI significantly increased the mean firing rate for ACC-Pyr neurons, which is observed in the slice. The data were analyzed using a paired t-test within each group (sham and SNI), but there is no indication that statistical comparisons across groups were performed. If the argument is that HFTS can restore normal activity of ACC-Pyr neurons following SNI, this is a bit concerning if no significant increase in ACC-Pyr activity is observed in in-vivo recordings from SNI mice.

      Thank you for highlighting the inaccuracies in the analysis. After reviewing the data, we re-analyzed them using alternative statistical methods. In the revised version, since the data did not follow a normal distribution, we employed Wilcoxon matched-pairs signed rank tests within the sham and SNI groups, and Mann-Whitney tests between the sham and SNI groups.

      Upon comparing the statistical outcomes across the groups, we found that the mean firing rate of 130 ACC neurons in SNI mice was significantly higher compared to that of 108 ACC neurons in sham mice (P = 0.0447, Mann-Whitney test). Notably, the mean firing rate of ACC-PYR exhibited a more pronounced increase with a P value of 0.0274 in SNI pre-HFTS versus sham pre-HFTS, while the mean firing rate of ACC-INT did not display a significant change across the groups. These findings align with the observations we made in the slice, reinforcing the validity of our results.
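
      The between-group comparison described above rests on the Mann-Whitney U statistic, which can be sketched directly from its definition as a count of pairwise wins. The firing rates below are hypothetical and much smaller than the samples of 130 and 108 neurons in the response; the sketch computes only U, not the associated P value.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x against sample y.

    Counts the pairs (xi, yj) with xi > yj; ties contribute 0.5 each.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical mean firing rates (Hz) for two independent groups.
sham_rates = [2.1, 3.4, 1.8, 2.9, 3.0]
sni_rates = [3.9, 4.2, 2.7, 5.1, 4.4]

u_sni = mann_whitney_u(sni_rates, sham_rates)
u_sham = mann_whitney_u(sham_rates, sni_rates)

# The two U values always sum to the number of pairs, len(x) * len(y).
print(u_sni, u_sham, len(sni_rates) * len(sham_rates))
```

      Because U depends only on the ordering of the observations, the test does not require the normality assumption that the authors found to be violated.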

      (5) The authors indicate that the effects of HFTS are due to changes in Kv1.2. However, they do not directly test this. A blocking peptide or dendrotoxin could be used in voltage clamp recordings to eliminate Kv1.2 current and then test if this eliminates the effects of HFTS. If K current is completely blocked in VC recordings then the authors can claim that currents they are recording are Kv1.1 or 1.2.

      Thank you for your kind suggestion. In our research, we employed the Kv1.2 structure as a model to determine the response frequency of terahertz waves. Through both in vitro and in vivo experiments, we were able to demonstrate that the frequency of approximately 36 THz affects the Kv channel and its corresponding spike frequency. Upon analyzing the action potential waveform, we observed a notable variance in the resting membrane potential (RMP). This RMP is predominantly controlled by leak potassium channels, specifically the Tandem-pore potassium channels. In accordance with the recommendation of reviewer 1, we have addressed this particular aspect of our experimentation in the revised manuscript.

      We agree that blocking peptides or dendrotoxin should be used to eliminate the Kv1.2 current. However, we encountered problems with the purchase and delivery of these drugs. We have therefore added an explanation to the Discussion emphasizing the value of this pharmacological experiment, which could confirm this point in future work.

      (6) The ACC is implicated in modulating the aversive aspect of pain. It would be interesting to know whether HFTS could induce conditioned place preference in SNI mice via negative reinforcement (i.e. alleviation of spontaneous pain due to the injury). This would strengthen the clinical relevance of using HFTS in treating pain.

      Thank you for this valuable advice. We share your interest in this experiment, and we fully recognize the importance and potential of further exploring this area. At present, however, our equipment and platform limitations prevent us from conducting the necessary tests. We remain committed to pursuing relevant research opportunities in the future.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      (1) The study suggests that the effects of their tumor models on mouse behavior are largely non-specific to the tumor, as most behaviors are rescued by analgesic treatment. So, most of the changes were likely due to site-specific pain and not a unique signal from the tumor.

      The tumor generates pain at the site where it is implanted, and the pain is likely amplified by the oral activities that tumor-bearing mice have to engage in. As there is no pain in the absence of the tumor, the pain is, by definition, caused by the tumor, not by the site. Concerning the relationship between pain and behavior, the behavioral assays undertaken in our study (nesting, cookie test, wheel running) were very limited in scope. Two of these assays (nesting, cookie test) require use of the oral cavity. Only nesting and wheel running were assessed in the context of treatment for pain. Nesting behavior was completely restored with carprofen and buprenorphine treatment, suggesting that in the absence of pain, mice were able to make perfect nests. Consistent with this, carprofen- and buprenorphine-treated animals also gained weight, indicating that eating (another activity dependent on the oral cavity) was also restored. Wheel running, an activity that does not rely on the oral cavity, was only partially restored with drug treatment. While additional behavioral tests are necessary to confirm this finding, the data suggest that there is pain-independent information relayed to the brain which accounts for this decline in wheel running.

      Reviewer #2:

      (1) The main claim is that tumor-infiltrating nerves underlie cancer-induced behavioral alterations, but the experimental interventions are not specific enough to support this. For example, all TRPV1 neurons, including those innervating the skin and internal organs, are ablated to examine sensory innervation of the tumor. Within the context of cancer, behavioral changes may be due to systemic inflammation, which may alter TRPV1 afferents outside the local proximity of tumor cells. A direct test of the claims of this paper would be to selectively inhibit/ablate nerve fibers innervating the tumor or mouth region.

      We agree with the reviewer that a direct test of the hypothesis would require selectively inhibiting the nerve fibers innervating the tumor and assessing the impact on behavior. Studies in the lab are on-going using pharmacological interventions to do this. These studies are beyond the scope of this current manuscript.

      (2) Behavioral results from TRPV1 neuron ablation studies are in part confounded by differing tumor sizes in ablated versus control mice. Are the differences in behavior potentially explained by the ablated animals having significantly smaller tumors? The differences in tumor sizes are not negligible. One way to examine this possibility might be to correlate behavioral outcomes with tumor size.

      As suggested by the reviewer, we have graphed nesting scores and time-to-interact (cookie test) relative to tumor volume. In both cases, we used simple linear regression to fit the data and analyzed the slopes of the lines. In the case of nesting, there was no significant difference between the slopes. This is now included as Supplemental Figure 4A. In the case of the cookie test, there was a significant difference between the slopes. This is now included as Supplemental Figure 4B. Graphing the data in this way allows one to look at any given tumor volume and infer the nesting score and the time-to-interact for the two groups of mice. The linear regression model fits the time to interact with the cookie reasonably well; thus, from this graph, we can see that at any given tumor volume the time to interact with the cookie was generally shorter in TRPV1cre::DTAfl/wt animals as compared to C57BL/6 mice. Unfortunately, the linear regression does not fit the nesting data very well, and thus it is more difficult to compare nesting scores at a given tumor volume.

      The following text has been added to the results section.

      Given the impact of nociceptor neuron ablation on tumor growth, we wondered whether differences in tumor volume contributed to the behavioral differences we noted. Thus, the behavior data were graphed as a function of tumor volume (Supplemental Fig 4A, B). A simple linear regression model was used to fit the data. In the case of nesting scores, the linear regression did not fit the data points very well making it difficult to assess nesting scores at a given tumor volume (Supplemental Fig 4A). However, the linear regression model fit the time to interact data better. Here, the graph suggests that tumor volume did not influence behavior as at any given tumor volume the time to interact with the cookie is generally smaller in TRPV1-Cre::Floxed-DTA animals as compared to C57BL/6 animals (Supplemental Fig 4B).
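
      The slope comparison described in this response can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes an ordinary least-squares fit per group and a pooled-residual t test for equality of slopes, and the tumor volumes and latencies are hypothetical numbers invented for the example. The names `fit_line` and `compare_slopes` are likewise illustrative.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class LineFit:
    slope: float
    intercept: float
    sxx: float        # sum of squared deviations of x from its mean
    ss_resid: float   # residual sum of squares around the fitted line
    n: int            # number of data points


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_resid = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return LineFit(slope, intercept, sxx, ss_resid, n)


def compare_slopes(fit_a, fit_b):
    """Return (t statistic, degrees of freedom) for H0: the two slopes are equal.

    Assumption: both groups share a common residual variance, so the
    residual sums of squares are pooled across the two fits.
    """
    df = fit_a.n + fit_b.n - 4
    pooled_var = (fit_a.ss_resid + fit_b.ss_resid) / df
    se_diff = sqrt(pooled_var / fit_a.sxx + pooled_var / fit_b.sxx)
    return (fit_a.slope - fit_b.slope) / se_diff, df


# Hypothetical numbers for illustration only (not data from the study).
volume_mm3 = [50, 100, 200, 400, 600]
latency_control = [40, 65, 110, 205, 290]   # seconds to interact with the cookie
latency_ablated = [30, 38, 52, 85, 118]     # seconds to interact with the cookie

fit_control = fit_line(volume_mm3, latency_control)
fit_ablated = fit_line(volume_mm3, latency_ablated)
t_stat, df = compare_slopes(fit_control, fit_ablated)
print(f"control slope={fit_control.slope:.3f}, ablated slope={fit_ablated.slope:.3f}")
print(f"t={t_stat:.2f} on {df} degrees of freedom")
```

      The t statistic would then be compared against a t distribution with the reported degrees of freedom. A poor fit, as reported for the nesting data, inflates the pooled residual variance and makes the slope comparison less informative.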

      Reviewer #3:

      (1) The authors mention in their Discussion the need for additional experiments. Could they also include / comment on the potential impact on the anti-tumor immune system in their model?

      The following text has been added to the discussion:

      Neuro-immune interactions have been studied in the context of a variety of conditions including, but not limited to, infection 109, inflammation 110,111, homeostasis in the gut 112-114, as well as neurological diseases 115,116. Neuro-immune communications in the context of cancer and behavior have also been studied (e.g., sickness behavior, depression) 117-119; however, these studies did not assess these interactions at the tumor bed. Investigations into neuro-immune interactions occurring within primary malignancies which harbor nerves have shed light on these critical communications. In the context of melanoma, which is innervated by sensory nerves, we identified that release of the neuropeptide calcitonin gene-related peptide (CGRP) induces immune suppression. This effect is mediated by CGRP binding to its receptor, RAMP1, which is expressed on CD8+ T cells 49. A study utilizing a different syngeneic model of oral cancer similarly found an immune suppressive role for CGRP 120-122. These studies demonstrate that neuro-immune interactions occur at the tumor bed. Our current findings indicating that tumor-infiltrating nerves connect to a circuit that includes regions within the brain suggest that neuro-immune interactions within the peripheral malignancy may contribute to the behavioral alterations we studied.

      (2) The authors mention the importance of inflammation contributing to pain in cancer but do not clearly highlight how this may play a role in their model. Can this be clarified?

      The following text has been added to the discussion section of the manuscript.

      Moreover, given that carprofen and buprenorphine decrease inflammation 104, their ability to restore normal nesting and cookie test behaviors (which require the use of the oral cavity where the tumor is located) suggests that inflammation at the tumor site contributed to the decline in these behaviors in vehicle-treated animals. Since both drugs were given systemically and each only partially restored wheel running, it suggests that systemic inflammation alone cannot fully account for the decline in wheel running seen in vehicle-treated animals. We posit that the inflammation- and pain-independent component of this behavioral decline is mediated via the transcriptional and functional alterations in the cancer-brain circuit.

      (3) The tumor model apparently requires isoflurane injection prior to tumor growth measurements. This is different from most other transplantable types of tumors used in the literature. Was this treatment also given to control (i.e., non-tumor) mice at the same time points? If not, can the authors comment on the impact of isoflurane (if any) in their model?

      Mice in all groups (tumor and non-tumor) were treated with isoflurane. This important detail has been added to the methods section.

      (4) The authors emphasize in several places that this is a male mouse model. They mention this as a limitation in the Discussion. Was there an original reason why they only tested male mice?

      The following text has been added in the discussion section:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males that do not smoke and drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist between males and females.

      Therefore, another project in the lab has been to compare disease characteristics (including innervation and behavior) in male and female mice. The findings from this second study are the topic of a separate manuscript.

      Recommendations For The Authors:

      Reviewing editor:

      (1) Tumors can communicate with the brain via blood-borne agents from the tumor itself or immune cells that are activated by the tumor in addition to neurons that invade the tumor. The xia and malaise that accompanies some tumors can be mediated by direct innervation and/or the humoral factors because both can activate the same parabrachial pathway. This paper makes the case for the direct innervation being important but ignores the possibility of both being involved. The interesting observation that innervation supports tumor growth (perhaps via substance P) is troublesome because the slower appearance of behavioral consequences (Figures 4 & 5) could be attributed to the smaller tumor size. A nice control for humoral effects would be to implant the tumor cells someplace in the body where innervation does not occur (if possible) and then examine behavioral outcomes.

      In the course of several projects, we have implanted different tumor cell lines in different locations in mice (oral cavity, hind limb, flank, peritoneal cavity). In each location, tumor innervation occurs. This is not a phenomenon found only in mice as we completed an immunohistological survey of human cancers from different sites and found they are all innervated (PMID 34944001). These data are consistent with tumor and locally-released factors that recruit nerves to the tumor bed (PMID: 30327461)(PMID: 32051587)(PMID: 27989802). Thus, an implantation site that does not result in tumor innervation is currently unknown and likely does not exist.

      (2) The authors should address whether there is an inflammatory component in this tumor model.

      MOC2-7 tumors have been characterized as non-inflamed and poorly immunogenic 129-131.

      This information has been added to the methods section.

      (3) The RTX experiment in Figure 5 would be more compelling if the drug was injected directly into the tumor rather than injecting it in the flank, thus ablating all TRPV1-expressing neurons as in the genetic approach.

      While we agree with the reviewer that ablating the TRPV1-expressing neurons at the tumor site directly would be ideal, RTX treatment takes approximately one week for ablation to occur, and a significant amount of inflammation is associated with it. Therefore, we wait a total of 4 weeks for the inflammation to resolve. By this time, tumors have generally reached sacrifice criteria. Thus, this approach would not enable the question to be answered. Moreover, we are not aware of any studies in which RTX has been injected in the oral cavity or face. While RTX is utilized clinically to treat pain, it is typically administered intrathecally, epidurally or intra-ganglionically (PMID: 37894723).

      (4) The authors address affective aspects of pain but do not adequately address the sensory aspects, e.g., sensitivity to touch, heat and/or cold. They attribute the decrease in food disappearance (consumption) and nest building to oral pain, but it could be due to anhedonia and anorexia that can accompany tumor progression.

      Assaying for touch and heat/cold sensitivity in the oral cavity is a critical aspect of studying head and neck cancer that needs to be addressed. However, in rodents these assays are not trivial given that any touch/heat/cold in the area of the tumor (oral cavity) impacts the sensitive whiskers in that region which directly influence these assays. Thus, we have been refining assays (e.g., OPAD, facial von Frey) to address these important questions. The findings from these studies are beyond the scope of this manuscript.

      The reviewer makes a good point about anhedonia and anorexia. The following text has been added to the results section:

      Pain-induced anhedonia is mediated by changes in the reward pathway. Specifically, in the context of pain, dopaminergic neurons in the ventral tegmental area (VTA) become less responsive to pain and release less serotonin. This decreased serotonin results in disinhibition of GABA release; the resulting increased GABA promotes an increased inhibitory drive leading to anhedonia 82 and, when extreme, anorexia. Carprofen and buprenorphine treatments completely reversed nesting behavior and significantly improved eating. Inflammation 83 and opioids 84 directly influence reward processing and, though our tracing studies did not indicate that the tumor-brain circuit includes the VTA, this brain region may be indirectly impacted by tumor-induced pain in the oral cavity. Thus, an alternative interpretation of the data is that the effects of carprofen and buprenorphine treatments on nesting and food consumption may be due to inhibition of anhedonia (and anorexia) rather than, or in addition to, relieving oral pain.

      (5) Comment on why only males were used in this study.

      Please see response to public reviews.

      Reviewer #1:

      (1) Please provide a justification for the use of exclusively male mice and expand in the discussion if there is potential for these findings to be directly applicable to female mice as well.

      Please see response to public reviews.

      The following text has been added to the discussion:

      Head and neck cancer is predominantly a cancer in males; it occurs in males three times more often than in females 123, and this disparity increases in certain parts of the world. While smoking cigarettes and drinking alcohol are risk factors for HPV-negative head and neck squamous cell carcinoma, even males that do not smoke and drink have a higher susceptibility to this cancer than females 124,125. Thus, our studies used only male mice. However, we do recognize that females also get this cancer. In fact, female patients with head and neck cancer, particularly oral cancer, report more pain than their male counterparts 126,127. These findings suggest that differences in tumor innervation exist between males and females.

      (2) When discussing the results shown in Figure 2, please include some mention of Fus, since it was the highest expressed transcript.

      The following text has been added to the results section regarding Fus.

      The gene demonstrating the highest increase in expression, Fus, was of particular interest; it increases in expression within DRG neurons following nerve injury and contributes to injury-induced pain 51,52. Of note, we purposefully used whole trigeminal ganglia rather than FACS-sorted tracer-positive dissociated neurons to avoid artificially imposing injury and altering the transcript levels of these cells 53,54. Thus, significantly elevated expression of Fus by ipsilateral TGM neurons from tumor-bearing animals suggests the presence of neuronal injury induced by the malignancy. This is consistent with our previous findings 55 and those of others 56 showing that tumor-infiltrating nerves harbor higher expression of nerve-injury transcripts and neuronal sensitization.

      (3) In line 197 please clarify the mice used. Were all mice tumor-bearing and some had nociceptors ablated, or was there a control (no tumor) group as well?

      Line 197 refers to Figure 4D. In this figure, panels B-D show quantification of cFos and ΔFosB in the spinal nucleus of the TGM (SpVc), the parabrachial nucleus (PBN) and the central nucleus of the amygdala (CeA). These data are from C57BL/6 and TRPV1cre::DTAfl/wt animals, all of which had tumors. Supplementary Figure 3C also shows quantification of cFos and ΔFosB, but these data are from control, non-tumor-bearing animals. The fact that controls are non-tumor-bearing has been added to the supplemental figure legend and the text of the results section has been clarified as follows.

      While Fos expression was similar between non-tumor-bearing mice of the two genotypes (Supplemental Fig. 3C-E), the absence of nociceptor neurons in tumor-bearing animals decreases cFos and ΔFosB in the PBN, and ΔFosB in the SpVc (Fig. 4B, C).

      (4) Overall it would improve the readability of the figures if the colors for the IHC channels were on the image itself and not exclusively in the figure legend.

      The colors for all the staining have been added to each panel.

      (5) It is not a problem that complete cartography was not done, but please include a justification for why the brain regions that were focused on were chosen.

      In order to ensure that our neural tracing technique captured only nerves present within the tumor bed, we restricted the injection of tracer to only 2 µl. We demonstrated that this small volume did not leak out of the tumor (Figure 1) and thus any tracer-labeled neurons we identified were deemed as being connected in a circuit to nerves in the tumor bed. While we acknowledge that this calculated technical approach restricted our ability to tracer-label all neurons in the tumor bed (as well as those they share circuitry with), it ensured no tracer leakage and no inadvertent labeling of non-tumoral nerves. In non-tumor animals injected with 10 µl of tracer, labeled regions in the brain included the spinal nucleus of the trigeminal, the parabrachial nucleus, the central amygdala, the facial nucleus and the motor nucleus of the trigeminal. The regions that were tracer-positive when the tumor was injected were limited to the spinal nucleus of the trigeminal, the parabrachial nucleus and the central amygdala. Thus, the regions in the brain that we focused on were the areas that became tracer-positive following injection of tracer into the tumor.

      (6) Were the cells that were injected cultured in media with 10% fetal calf serum? If so was any inflammatory response seen? If not please state in the methods section the media that cells for injection were cultured in.

      The cells injected into animals were cultured in media containing 10% fetal calf serum. When cells are harvested for tumor injections, they are first washed two times with PBS and then trypsinized to detach the cells from the plate. Cells are collected, washed again with PBS and resuspended with DMEM without serum; this is what is injected into animals. We harvest cells in this way in order to eliminate any serum being injected into mice. This information has been added to the Methods section.

      (7) Would any of the differences in drug treatment (Carprofen vs Buprenorphine) be due to the differing routes of administration and metabolism of the drugs?

      Since carprofen and buprenorphine each resulted in similar behavioral impacts (nesting and wheel running), their different routes of administration seem to play a minor or no role in the behaviors assessed.

      (8) Please include in the methods section the specific approach and software that was used for processing calcium imaging data and calculating a relative change in fluorescence.

      The specific approach used for processing calcium imaging data and calculating relative change in fluorescence as well as the software used are all included in the methods section. Please see below:

      Ca2+ imaging. TGM neurons from non-tumor and tumor-bearing animals (n=4-6 mice/condition) were imaged on the same day. Neurons were incubated with the calcium indicator, Fluo-4AM, at 37°C for 20 min. After dye loading, the cells were washed, and Live Cell Imaging Solution (Thermo-Fisher) with 20 mM glucose was added. Calcium imaging was conducted at room temperature. Changes in intracellular Ca2+ were measured using a Nikon scanning confocal microscope with a 10x objective. Fluo-4AM was excited at 488 nm using an argon laser with intensity attenuated to 1%. The fluorescence images were acquired in the confocal frame (1024 × 1024 pixels) scan mode. After 1 min of baseline measure, capsaicin (300 nM final concentration) was added. Ca2+ images were recorded before, during and after capsaicin application. Image acquisition and analysis were achieved using NIS-Elements imaging software. Fluo-4AM responses were standardized and shown as percent change from the initial frame. Data are presented as the relative change in fluorescence (ΔF/F0), where F0 is the basal fluorescence and ΔF = F - F0, with F being the measured intensity recorded during the experiment. Calcium responses were analyzed only for neurons responding to ionomycin (10 µM, positive control) to ensure neuronal health. Treatment with the cell-permeable Ca2+ chelator, BAPTA (200 µM), served as a negative control.
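
      The relative change in fluorescence described above, (F - F0)/F0, can be sketched in a few lines. This is an illustrative calculation only, not the NIS-Elements workflow: the function name `delta_f_over_f0` is hypothetical, and taking F0 as the mean of the pre-stimulus baseline frames is an assumption (setting the baseline to a single frame reproduces normalization to the initial frame). The trace values are invented.

```python
def delta_f_over_f0(trace, n_baseline_frames):
    """Return (F - F0) / F0 for every frame of a fluorescence trace.

    Assumption: F0 is the mean intensity of the first `n_baseline_frames`
    frames, i.e. the frames recorded before the stimulus is added.
    """
    if n_baseline_frames < 1 or n_baseline_frames > len(trace):
        raise ValueError("n_baseline_frames must be between 1 and len(trace)")
    f0 = sum(trace[:n_baseline_frames]) / n_baseline_frames
    if f0 == 0:
        raise ValueError("baseline fluorescence is zero; cannot normalize")
    return [(f - f0) / f0 for f in trace]


# Hypothetical intensities for one neuron (arbitrary units), with the
# stimulus added after the third frame.
trace = [100.0, 100.0, 100.0, 150.0, 220.0, 180.0, 120.0]
relative = delta_f_over_f0(trace, n_baseline_frames=3)
percent = [round(100 * r, 1) for r in relative]
print(percent)
```

      Multiplying by 100 gives the percent change that the methods text refers to.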

      (9) Suggestions for Figure 1:

      - In Figures 1C, D, E, include labels for the days of tumor harvest.

      - Please make the size of the labels the same for 1K and 1L and align them.

      - Microscopy image in Figure 1L for SpVc looks like it may be at a different magnification.

      - If possible, include (either in the figure or the supplement) IHC images staining for Dcx and tau, which would complement the western blot data.

      The requested changes to the figures have been made. Unfortunately, we do not have Dcx and tau IHC staining of the day 4, 10 and 20 tumors.

      (10) Suggestions for Figure 2:

      - Include directly onto the graph in Figure 2a the legend for tumor-bearing (red) and non-tumor bearing (blue).

      - Keep consistent between Figure 2G and 2H/I if the tumor/nontumor will be labeled as T/N or Tumor/Control.

      The requested changes to the figures have been made.

      (11) Suggestions for Figure 3:

      - An example trace of calcium signal would complement Figure 3G, H well.

      Example tracings of calcium signal are already provided in Supplementary Figure 3A and B.

      Reviewer #2:

      (1) While the use of male mice is acknowledged, there is not a rationale for why female mice were not included in the study.

      Please see the response to Reviewer #1 (first question).

      (2) Criteria for euthanasia should be described in the Methods. This is especially needed for interpreting the survival curve in Figure 4H.

      Criteria for euthanasia in our IACUC approved protocol include:

      - maximum tumor volume of 1000 mm³

      - edema

      - extended period of weight loss progressing to emaciation

      - impaired mobility or lesions interfering with eating, drinking or ambulation

      - rapid weight loss (>20% in 1 week)

      - weight loss at or more than 20% of baseline

      In addition to tumor size and weight loss, we use the body condition score to evaluate the state of animals and to determine euthanasia.  These details have been added to the Methods section.

      (3) At what stage in cancer progression were the Fos studies conducted for Figure 4A-D?

      The brains used for Fos staining (Fig 4B-D) were harvested at week 5 post-tumor implantation.

      (4) For Fos counts, what are the bregma coordinates for the sections that were quantified?

      SpVc: -7.56 to -8.24 mm

      PBN: -4.96 to -5.52 mm

      CeA: -0.82 to -1.94 mm

      (5) Statistics are needed for the claim in Lines 171-173.

      The statistical analyses of Fos staining from tumor-bearing and non-tumor-bearing brains are included in Figure 3D-F. The statistical analyses of ex vivo Ca2+ imaging of brains from tumor-bearing and non-tumor-bearing animals are included in Figure 3I and J.

      (6) How long was the baseline period for weight and food intake measurements? How long were the animals single-housed before taking the baseline measurements?  

      The baseline period for weight and food intake measurements was 2 weeks, and animals were singly housed for 2 weeks before baseline measurements began (a total of 4 weeks).

      Minor:

      (7) The authors might consider rewording the sentence on lines 59-62, given that it is abundantly clear from rodent studies that both the tumor and chemotherapy are associated with adverse behavioral outcomes.

      We have reworded the sentence as follows:  The association of cancer with impaired mental health is directly mediated by the disease, its treatment or both; these findings suggest that the development of a tumor alters brain functions.

      (8) Line 212 needs a space between the two sentences.

      This has been fixed.

      (9) Font size in Figure 2 is not consistent with the other figures.

      This has been fixed.

      (10) "DAPI" is more conventional than "DaPi".

      This has been fixed.

      Editorial Comments and Suggestions:

      (1) The Abstract would be better if it were more concise, e.g. ~175 words.

      The abstract has been shortened as requested and now reads:

      Cancer patients often experience changes in mental health, prompting an exploration into whether nerves infiltrating tumors contribute to these alterations by impacting brain functions. Using a mouse model for head and neck cancer and neuronal tracing we show that tumor-infiltrating nerves connect to distinct brain areas. The activation of this neuronal circuitry altered behaviors (decreased nest-building, increased latency to eat a cookie, and reduced wheel running). Tumor-infiltrating nociceptor neurons exhibited heightened calcium activity and brain regions receiving these neural projections showed elevated cFos and delta FosB as well as increased calcium responses compared to non-tumor-bearing counterparts. The genetic elimination of nociceptor neurons decreased brain Fos expression and mitigated the behavioral alterations induced by the presence of the tumor. While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running indicating that pain is not the exclusive driver of such behavioral shifts. Unraveling the interaction between the tumor, infiltrating nerves, and the brain is pivotal to developing targeted interventions to alleviate the mental health burdens associated with cancer.

      (2) Lines 28, 104, 258, 486, 521, and many other places, "utilized" should be "used" because the former refers to an application for which it is not intended, e.g. a hammer was utilized as a doorstop.

      The requested changes have been made.

      (3) Lines 32 and 73, it is not clear whether the basal activity is heightened or whether excitability is increased. "manifest" might be better than "harbor" on line 73.

      We have changed the wording in the abstract to be clearer. Moreover, our finding that TGM neurons from tumor-bearing animals have increased expression of the σ1 receptor and phosphorylated TRPV1 (Fig 2G-I) indicates that these neurons have increased excitability.

      (4) Line 34 and elsewhere, it would be better to refer to Fos because there is no need to distinguish cellular, cFos, from viral, vFos, in this context.

      The requested changes have been made.

      (5) Line 38, It would be better to refer to what was actually measured rather than "oral movements".

      The requested changes have been made. The sentence now reads: “While analgesic treatment restored nesting and cookie test behaviors, it did not fully restore voluntary wheel running.”

      (6) Line 84, CXCR3-null mouse on a C57BL/6 background.

      The requested change has been made.

      (7) Lines 86,129 wild-type, male mice.

      The requested change has been made.

      (8) Lines 114-115, the brackets are not necessary.

      The requested change has been made.

      (9) Lines 118, 384, 409, 527, 589, 971, 974 always leave a space between numbers and units. Use Greek u for micro.

      The requested change has been made.

      (10) Lines 123-124, it is not clear that there is meaningful labeling within the CeA.

      We have replaced this image with a more representative one of the CeA from a tumor-bearing animal with clear tracer labeling.

      (11) Lines 125, 138, and 246 transcription was not measured, only transcript levels were measured.

      The requested changes have been made.

      (12) Line 133, I think >4 fold is meant.

      Thank you for catching that. I have fixed it to >4 fold.

      (13) Line 165, single-time-point assessment (add hyphens).

      The requested change has been made.

      (14) Line 181 and elsewhere including figure, the superscripts refer to alleles of the genes; hence approved gene names should be used in italics (as in Methods), TRPV1-Cre:: Floxed-DTA (without italics) would be acceptable.

      The requested changes have been made.

      (15) Line 182, nociceptor-neuron-ablated mice (add hyphens).

      The requested changes have been made.

      (16) Line 197, It is not clear that the "speed" of food disappearance was measured or that it is due to oral pain vs loss of appetite.

      The reviewer makes a good point. We have changed the sentence to read:

      To evaluate the effects of this disruption on cancer-induced behavioral changes, we assessed the animals’ general well-being through nesting behavior 32 and anhedonia using the cookie test 76,77, as well as body weight and food disappearance as surrogates for oral pain and/or loss of appetite.

      (17) Line 199, The reduced tumor growth after ablation could account for most of the changes in the other parameters that were measured.

      We have graphed the nesting scores and time-to-interact with the cookie as a function of tumor volume.  These data are now included as Supplemental Figure 4 and suggest that at the same tumor volume, nesting scores and times-to-interact with the cookie are different between the groups.

      (18) Line 204 TPVP1 spelling. Is the TGN smaller after ablation of half of the neurons?

      The requested change has been made.

      (19) Line 235, "now" is not necessary.

      The requested change has been made.

      (20) Lines 238-239 and elsewhere, a few references as to why the TGN-SpVc-PBN-CeA circuit is relevant would be helpful.

      The following references have been added regarding the relevance of this circuit to behavior:

      Molecular Brain 14: 94 (2021) (PMID 34167570)

      Neuropharmacology 198: 108757 (2021) (PMID 34461068)

      Frontiers in Cellular Neuroscience 16: 997360 (2022)  (PMID 36385947)

      Neuropsychopharmacology  49(3): 508-520 (2024) (PMID 37542159)

      (21) Lines 371, 434 and Figures, gm should be g or grams in scientific usage. Include JAX lab stock numbers for these mouse lines.

      The requested changes have been made.

      (22) Line 432, removing food for one hour is not a fast.

      The sentence has been reworded as follows: One hour prior to testing, mouse food is removed and the animals are acclimated to the brightly lit testing room.

      (23) Line 476, 5-um sections (add hyphen).

      The hyphen has been added.

      (24) Lines 988, and 1023, DAPI are usually shown this way.

      The requested change has been made.

      (25) Figure 1K, add Bregma levels to figures.

      SpVc: -8.12 mm

      PBN: -5.34 mm

      CeA: -1.34 mm

      (26) Figure 3 line 1033, "area under the curve" What curve was examined?

      The curve examined was the change in fluorescence over time. This curve has been added as Supplemental Figure 3C.
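
      For readers unfamiliar with the measure, the area under a fluorescence-change-over-time curve can be approximated as sketched below. The source does not state which integration method was used, so the trapezoidal rule here is an assumption, the function name `area_under_curve` is hypothetical, and the time points and values are invented for illustration.

```python
def area_under_curve(times, values):
    """Approximate the area under a sampled curve with the trapezoidal rule.

    `times` must be strictly increasing; `values` holds the signal
    (for example, the relative change in fluorescence) at each time point.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time/value samples")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of the trapezoid between consecutive samples.
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


# Hypothetical response: time in seconds, signal as relative fluorescence change.
times_s = [0, 10, 20, 30, 40]
signal = [0.0, 0.5, 1.2, 0.8, 0.2]
print(area_under_curve(times_s, signal))
```

      A larger area reflects a response that is stronger, longer lasting, or both, which is why it is often reported alongside peak amplitude.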

      (27) Figure 3B, the circled area is the lateral PBN. At first glance, I thought scp was meant as the label for the circled area.

      Scp is noted in the figure legend as a landmark.

    1. Reviewer #3 (Public Review):

      Pipes and Nielsen propose a valuable new computational method for assigning individual Next Generation Sequencing (NGS) reads to their taxonomic group of origin, based on comparison with a dataset of reference metabarcode sequences (i.e. using an existing known marker sequence such as COI or 16S). The underlying problem is an important one, with broad applications such as identifying species of origin of smuggled goods, identifying the composition of metagenomics/ microbiomics samples, or detecting the presence of pathogen variants of concern from wastewater surveillance samples. Pipes and Nielsen propose (and make available with open source software) new computational methods, apply those methods to a series of exemplar data analyses mirroring plausible real-life scenarios, and compare the new method's performance to that of various field-leading alternative methods.

      In terms of methodology, the manuscript presents novel computational analyses inspired by standard existing probabilistic phylogenetic models for the evolution of genome sequences. These form the basis for comparisons of each NGS read with a reference database of known examples spanning the taxonomic range of interest. The evolutionary aspects of the models are used (a) to statistically represent knowledge about the reference organisms (and uncertainty about their common ancestors) and their evolutionary relationships; and (b) to derive inferences about the relationship of the sample NGS reads that may be derived from reference organisms or from related organisms not represented in the reference dataset. This general approach has been considered previously and, while expected to be powerful in principle, the reliance of those methods on likelihood computations over a phylogenetic tree structure means they are slow to the point of being useless on modern-sized problems that may have many thousands of reference sequences and many millions of NGS reads. Alternative methods that have been devised to be computationally feasible have had to sacrifice the phylogenetic approach, with a consequent loss of statistical power.

      Pipes and Nielsen's methodology contribution in this manuscript is to make a series of approximations to the 'ideal' phylogenetic likelihood analysis, aimed at saving computational time and keeping computer memory requirements acceptable whilst retaining as much as possible of the expected power of phylogenetic methods. Their description of their novel methods is solid; as they are largely approximations to other existing methods, their value ultimately will rest with the success of the method in application.

      Regarding the application of the new methods, to compare the accuracy of their method with a selection of existing methods the authors use 1) simulated datasets and 2) previously published mock community datasets to query sequencing reads against appropriate reference trees. The authors show that Tronko has a higher success at assigning query reads (at the species/genus/family level) than the existing tools with both datasets. In terms of computational performance, the authors show Tronko outperforms another phylogenetic tool, and is still within reasonable limits when compared with other 'lightweight' tools.

      As a demonstration of the power of phylogeny-based methods for taxonomic assignment, this ms. could gain added importance by refocusing the community towards explicitly phylogenetic methods. We agree with the authors that this would be likely to give rise to the most powerful possible methods.

      Strengths of this ms. are 1) the focus on phylogenetic approaches; 2) the reduction of a consequently difficult computational problem to a practical method (with freely available software); 3) the reminder that these approaches work well and are worthy of continued interest and development; and ultimately, most importantly, 4) the creation of a powerful tool for taxonomic assignment that seems to be at least as good as any other and generally better.

      Weaknesses of the manuscript at present are 1) lack of consideration of some other existing methods and approaches, as it would be interesting to know if other ideas had been tried and rejected, or were not compatible with the methods created; 2) some over-simplifications in the description of new methods, with some aspects difficult or impossible to reproduce and some claims unsubstantiated. Further, 3) we are not convinced enough weight has been given to the complexity of 'pre-processing' the reference dataset for each metabarcode (e.g. gene) of interest, which may give the impression that the method is easier to apply to new reference datasets than we think would be the case. Lastly, 4) we encountered some difficulties getting the software installed and running on our computers. It was not possible to resolve every issue in the time available to us to perform our review, and some processing options remain untested.

      Overall, the methods that Pipes and Nielsen propose represent an important contribution that both creates a computational resource that is immediately valuable to the community, and emphasises the benefits of phylogenetic methods and provides encouragement for others to continue to work in this area to create still-better methods.

    1. Abstract: Platalea minor, the black-faced spoonbill (Threskiornithidae) is a wading bird that is confined to coastal areas in East Asia. Due to habitat destruction, it has been classified by The International Union for Conservation of Nature (IUCN) as globally endangered species. Nevertheless, the lack of its genomic resources hinders our understanding of their biology, diversity, as well as carrying out conservation measures based on genetic information or markers. Here, we report the first chromosomal-level genome assembly of P. minor using a combination of PacBio SMRT and Omni-C scaffolding technologies. The assembled genome (1.24 Gb) contains 95.33% of the sequences anchored to 31 pseudomolecules. The genome assembly also has high sequence continuity with scaffold length N50 = 53 Mb. A total of 18,780 protein-coding genes were predicted, and high BUSCO score completeness (93.7% of BUSCO metazoa_odb10 genes) was also revealed. A total of 6,155,417 bi-allelic SNPs were also revealed from 13 P. minor individuals, accounting for ∼5% of the genome. The resource generated in this study offers the new opportunity for studying the black-faced spoonbill, as well as carrying out conservation measures of this ecologically important spoonbill species.

      This work is part of a series of papers presenting outputs of the Hong Kong Biodiversity Genomics project (https://doi.org/10.46471/GIGABYTE_SERIES_0006). It has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.130), and the reviews have been published under the same license. These are as follows.

      Reviewer 1. Richard Flamio Jr.

      Is the language of sufficient quality?

      No. There are some grammatical errors and spelling mistakes throughout the text.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      Yes. The authors did a phenomenal job at detailing the methods and data-processing steps.

      Additional Comments:

      Very nice job on the paper. The methods are sound and the statistics regarding the genome assembly are thorough. My only two comments are: 1) I think the paper could be improved by the correction of grammatical errors, and 2) I am interested in a discussion about the number of chromosomes expected for this species (or an estimate) based on related species and whether the authors believe all of the chromosomes were identified. For example, is the karyotype known, or can the researchers make any inferences about the number of microchromosomes in the assembly? Please see a recent paper I wrote on microchromosomes in the wood stork assembly (https://doi.org/10.1093/jhered/esad077) for some ideas in defining the chromosome architecture of the spoonbill and/or comparing this architecture to related species.

      Re-review:

      The authors incorporated the revisions nicely and have produced a quality manuscript. Well done.

      Minor revisions:

      Line 46: A comma is needed after (Threskiornithidae).

      Line 47: “The” should not be capitalized.

      Line 48: This should read “as a globally endangered species.”

      Line 49: “However, the lack of genomic resources for the species hinders the understanding of its biology…”

      Line 56: Consider changing “also revealed” to “identified” to avoid repetition from the previous sentence.

      Line 65: Insert “the” before “bird’s.”

      Lines 69-70: Move “locally” higher in the sentence – “and it is protected locally…”

      Line 72: Replace “as of to date” with “prior to this study”.

      Lines 78-79: Pluralize “part.”

      Line 86: Replace “proceeded” with “processed.”

      Line 133: “…are listed in Table 1.”

      Line 158: “accounted”

      Line 159: “Variant calling was performed using…”

      Line 161: “Hard filtering was employed…”

      Lines 200-201: “The heterozygosity levels… from five individuals were comparable to previous reports on spoonbills – black-faced spoonbill … and royal spoonbill … (Li et al. 2022).”

      Line 202: New sentence. “The remaining heterozygosity levels observed…”

      Line 206: “…genetic bottleneck in the black-faced spoonbill…”

      Lines 208-209: “These results highlight the need…”

      Lines 213-214: “…which are useful and precious resources for future population genomic studies aimed at better understanding spoonbill species numbers and conservation.”

      Line 226: Missing a period after “heterozygosity.”

      For references, consider adding DOIs. Some citations have them but most citations would benefit from this addition.

      Reviewer 2. Phred Benham

      Is the language of sufficient quality?

      Generally yes, the language is sufficiently clear. However, a number of places could be refined and extra words removed.

      Are all data available and do they match the descriptions in the paper?

      Additional data is available on figshare.

      I do not see any of the tables that are cited in the manuscript and contain legends. Am I missing something? Also, there is no legend for the GenomeScope profile in Figure 3.

      The assembly appears to be on GenBank as a scaffold-level assembly. Can you list this accession info in the data availability section in addition to the project number?

      Is there sufficient data validation and statistical analyses of data quality?

      Overall fine, but some additional analyses would aid the paper. Comparison of the spoonbill genome to other close relatives using a synteny plot would be helpful.

      It would also be useful to put heterozygosity and inbreeding coefficients into context by comparing to results from other species.

      Additional Comments:

      Hui et al. report a chromosome-level genome for the black-faced spoonbill, an endangered species of coastal wetlands in East Asia. This genome will serve as an important resource for understanding the biology of and conserving this species.

      Generally, the methods are sound and appropriate for the generation of genomic sequence.

      Major comments: This is a highly contiguous genome in line with metrics for Vertebrate Genomics Project genomes and other consortia. The authors argue that they have assembled 31 Pseudo-molecules or chromosomes. It would be nice to see a plot showing synteny of these 31 chromosomes and a closely related species with a chromosome level assembly (e.g. Theristicus caerulescens; GCA_020745775.1)

      The tables appear to be missing from the submitted manuscript?

      Minor comments: Line 49: delete its

      Line 49-51: This sentence is a little awkward, please revise.

      Line 64: delete 'the'

      Line 67: replace 'with' with 'the spoonbill as a'

      Line 68: delete 'Interestingly'

      Line 70: can you be more specific about what kind of genetic methods had previously been performed?

      Line 79: can you provide any additional details on the necessary permits and/or institutional approval

      Line 78: what kind of tissue? or were these blood samples?

      Line 110: do you mean movies?

      Line 143: replace data with dataset

      Line 163: it may be worth applying some additional filters in vcftools, e.g. minor allele freq., min depth, max depth, what level of missing data was allowed?, etc.

      Line 171: delete 'resulted in'

      Line 172: do you mean scaffold L50 was 8?

      Line 191-195: some context would be useful here, how does this level of heterozygosity and inbreeding compare to other waterbirds?

      Line 217: why did you use the Metazoan database and not the Aves_odb10 database for Busco?

      Figure 1b: Number refers to what, scaffolds? Be consistent with capitalization for Mb. It seems like the order of scaffold N50 and L50 were reversed.

      Figure 3 is missing a legend.

      Re-review:

      I previously reviewed this manuscript and overall the authors have done a nice job addressing all of my comments.

      I appreciate that the authors include the MCscan analysis that I suggested. However, the alignment of the P. minor assembly and annotations to other genomes suggests rampant mis-assembly or translocations. Birds have fairly high synteny and I would expect Pmin to look more similar to the comparison between T. caerulescens and M. americana in the MCscan plot. For instance, parts of the largest scaffold in the Pmin assembly map to multiple different chromosomes in the Tcae assembly. Similarly, the Z in Tcae maps to 11 different scaffolds in the Pmin assembly and there does not appear to be a single large scaffold in the Pmin assembly that corresponds to the Z chromosome.

      The genome seems to be otherwise of strong quality, so I urge the authors to double-check their MCscan synteny analysis. If this pattern remains, can you please add some comments about it to the end of the Data Validation and Quality Control section? I think other readers will also be surprised at the low levels of synteny apparent between the spoonbill and ibis assemblies.
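
      Reviewer 2's questions about scaffold N50 and L50 turn on how the two statistics differ: N50 is a length, whereas L50 is a count of scaffolds. The following is a minimal Python sketch of the usual definitions, using hypothetical toy scaffold lengths rather than the P. minor assembly; the helper name `n50_l50` is chosen here only for illustration.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50: length of the scaffold at which the running total, taken from the
         longest scaffold downwards, first reaches half the assembly size.
    L50: number of scaffolds needed to reach that point.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy assembly (lengths in Mb), not the spoonbill data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} Mb, L50 = {l50} scaffolds")
```

      Under these definitions a statement such as "N50 = 53 Mb" reports a length and "L50 = 8" reports a scaffold count, so swapping the two labels in a figure yields values of the wrong kind.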

    1. Popular/Well-known Name

      This is a great addition. I think it would actually be better to use the popular names in the above RA plot as well.

      We may also have some data source internally at SB that has a mapping from NCBI taxonomic virus name to common name.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Odenwald and colleagues show that mutant biotin ligases used to perform proximity-dependent biotin identification (TurboID) can be used to amplify signal in fluorescence microscopy and to label phase-separated compartments that are refractory to many immunofluorescence approaches. Using the parasite Trypanosoma brucei, they show that fluorescent methods such as expansion microscopy and CLEM, which require bright signals for optimal detection, benefit from the elevated signal provided by TurboID fusion proteins when coupled with labeled streptavidin. Moreover, they show that phase-separated compartments, where many antibody epitopes are occluded due to limited diffusion and potential sequestration, are labeled reliably with biotin deposited by a TurboID fusion protein that localizes within the compartment. They show successful labeling of the nucleolus, likely phase-separated portions of the nuclear pore, and stress granules. Lastly, they use a panel of nuclear pore-TurboID fusion proteins to map the regions of the T. brucei nuclear pore that appear to be phase-separated by comparing antibody labeling of the protein, which is susceptible to blocking, to the degree of biotin deposition detected by streptavidin, which is not. 

      Strengths: 

      Overall, this study shows that TurboID labelling and fluorescent streptavidin can be used to boost signal compared to conventional immunofluorescence in a manner similar to tyramide amplification, but without having to use antibodies. TurboID could prove to be a viable general strategy for labeling phase-separated structures in cells, and perhaps as a means of identifying these structures, which could also be useful. 

      Weaknesses: 

      However, I think that this work would benefit from additional controls to address if the improved detection that is being observed is due to the increased affinity and smaller size of streptavidin/biotin compared to IgGs, or if it has to do with the increased amount of binding epitope (biotin) being deposited compared to the number of available antibody epitopes. I also think that using the biotinylation signal produced by the TurboID fusion to track the location of the fusion protein and/or binding partners in cells comes with significant caveats that are not well addressed here, mostly due to the inability to discern which proteins are contributing to the observed biotin signal. 

      To dissect the contributions of the TurboID fusion to elevating signal, anti-biotin antibodies could be used to determine if the abundance of the biotin being deposited by the TurboID is what is increasing detection, or if streptavidin is essential for this.

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or perhaps from both. However, this question is very hard to answer, as antibodies differ massively in their affinity to the antigen, which further depends on the respective IF conditions, and are therefore not directly comparable. Even if anti-biotin gives a better signal than anti-HA, this can be caused either by the increase in antigen number (more biotin than HA-tag), by the higher binding affinity, or by a combination of both, and the two are thus hard to distinguish. Nevertheless, we have tested monoclonal mouse anti-biotin targeting the (non-phase-separated) NUP158. We found the signal from the biotin antibody to be much weaker than from anti-HA, indicating that at least this particular biotin antibody is not a very good binder in IF. 

      Alternatively, HaloTag or CLIP tagging could be used to see if diffusion of a small molecule tag other than biotin can overcome the labeling issue in phase-separated compartments. There are Halo-biotin substrates available that would allow the conjugation of 1 biotin per fusion protein, which would allow the authors to dissect the relative contributions of the high affinity of streptavidin from the increased amount of biotin that the TurboID introduces. 

      This is a very good idea, as in this case the signals are both from streptavidin and are directly comparable. We expressed NUP158 with HaloTag and added PEG-biotin as a Halo ligand. However, PEG-biotin is poorly cell-permeable and is in general only used on lysates. In trypanosomes, cell permeability is particularly restricted, and even Halo ligands that are considered highly cell-penetrant give only a weak signal. Even after overnight incubation, we could not get any signal with PEG-biotin. Our control, the TMR-ligand 647, gave a weak nuclear pore staining, confirming the correct expression and function of the HaloTag-NUP158.

      The idea of using the biotin signal from the TurboID fusion as a means to track the changing localization of the fusion protein or the location of interacting partners is an attractive idea, but the lack of certainty about what proteins are carrying the biotin signal makes it very difficult to make clear statements. For example, in the case of TurboID-PABP2, the appearance of a biotin signal at the cell posterior is proposed to be ALPH1, part of the mRNA decapping complex. However, because we are tracking biotin localization and biotin is being deposited on a variety of proteins, it is not formally possible to say that the posterior signal is ALPH1 or any other part of the decapping complex. For example, the posterior labeling could represent a localization of PABP2 that is not seen without the additional signal intensity provided by the TurboID fusion. There are also many cytoskeletal components present at the cell posterior that could be being biotinylated, not just the decapping complex. Similar arguments can be made for the localization data pertaining to MLP2 and NUP65/75. I would argue that the TurboID labeling allows you to enhance signal on structures, such as the NUPs, and effectively label compartments, but you lack the capacity to know precisely which proteins are being labeled.  

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is confirmed by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      Reviewer #2 (Public Review): 

      Summary: 

      The authors noticed that there was an enhanced ability to detect nuclear pore proteins in trypanosomes using a streptavidin-biotin-based detection approach in comparison to conventional antibody-based detection, and this seemed particularly acute for phase-separated proteins. They explored this in detail for both standard imaging but also expansion microscopy and CLEM, testing resolution, signal strength, and sensitivity. An additional innovative approach exploits the proximity element of biotin labelling to identify where interacting proteins have been as well as where they are. 

      Strengths: 

      The data is high quality and convincing and will have obvious application, not just in the trypanosome field but also more broadly where proteins are tricky to detect or inaccessible due to phase separation (or some other steric limitations). It will be of wide utility and value in many cell biological studies and is timely due to the focus of interest on phase separation, CLEM, and expansion microscopy. 

      Thank you! We are glad you liked it.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors aimed to investigate the effectiveness of streptavidin imaging as an alternative to traditional antibody labeling for visualizing proteins within cellular contexts. They sought to address challenges associated with antibody accessibility and inconsistent localization by comparing the performance of streptavidin imaging with a TurboID-HA tandem tag across various protein localization scenarios, including phase-separated regions. They aimed to assess the reliability, signal enhancement, and potential advantages of streptavidin imaging over antibody labeling techniques. 

      Overall, the study provides a convincing argument for the utility of streptavidin imaging in cellular protein visualization. By demonstrating the effectiveness of streptavidin imaging as an alternative to antibody labeling, the study offers a promising solution to issues of accessibility and localization variability. Furthermore, while streptavidin imaging shows significant advantages in signal enhancement and preservation of protein interactions, the authors must consider potential limitations and variations in its application. Factors such as the fact that tagging may sometimes impact protein function, background noise, non-specific binding, and the potential for off-target effects may impact the reliability and interpretation of results. Thus, careful validation and optimization of streptavidin imaging protocols are crucial to ensure reproducibility and accuracy across different experimental setups. 

      Strengths: 

      - Streptavidin imaging utilizes multiple biotinylation sites on both the target protein and adjacent proteins, resulting in a substantial signal boost. This enhancement is particularly beneficial for several applications with diluted antigens, such as expansion microscopy or correlative light and electron microscopy. 

      - This biotinylation process enables the identification and characterization of interacting proteins, allowing for a comprehensive understanding of protein-protein interactions within cellular contexts. 

      Weaknesses: 

      - One of the key advantages of antibodies is that they label native, endogenous proteins, i.e. without introducing any genetic modifications or exogenously expressed proteins. This is a major difference from the approach in this manuscript, and it is surprising that this limitation is not really mentioned, let alone expanded upon, anywhere in the manuscript. Tagging proteins often impacts their function (if not their localization), and this is also not discussed.

      - Given that BioID proximity labeling encompasses not only the protein of interest but also its entire interacting partner history, ensuring accurate localization of the protein of interest poses a challenge. 

      - The title of the publication suggests that this imaging technique is widely applicable. However, the authors did not show the ability to track the localization of several distinct proteins on the same sample, which could be an additional factor demonstrating the outperformance of streptavidin imaging compared with antibody labeling. Similarly, the work focuses only on small 2D samples. It would have been interesting to be able to compare this with 3D samples (e.g. cells encapsulated in an extracellular matrix) or to tissues.  

      Recommendations for the authors:

      To enhance the assessment from 'incomplete' to 'solid', the reviewers recommend that the following major issues be addressed: 

      Major issues: 

      (1) Anti-biotin antibodies in combination with TurboID labeling should be used to compare the signal/labelling penetrance to streptavidin results. That would show if elevated biotin deposition matters, or if it is really the smaller size, more fluors, and higher affinity of streptavidin that's making the difference. 

      We agree with the reviewer that it would be very interesting to distinguish whether the increase in signal comes from the multiple biotinylation sites, from streptavidin being a very good binder, or perhaps from both, and whether the size matters (IgG versus streptavidin). However, this question is very hard to answer, as antibodies differ massively in their affinity to the antigen. Thus, even if anti-biotin gave a better signal than anti-HA, this could be caused either by the increase in antigen number (more biotin than HA-tag), by the better binding affinity, or by a combination of both, and it would not allow us to truly answer the question. We have now tested anti-biotin antibodies, also in response to reviewer 1, and got a much poorer signal in comparison to anti-HA or streptavidin.

      Please note that we made another attempt using nanobodies to target phase-separated proteins, to see whether size matters (Fig. 2I). The nanobody did not stain Mex67 at the nuclear pores, but gave a weak nucleolar signal for NOG1, which may suggest that the nanobody can penetrate slightly better than IgG, but it does not rule out that the nanobody simply binds with higher affinity. Reviewer 1 has suggested using the HaloTag with PEG-biotin: this would indeed allow a direct comparison of the streptavidin signal caused by the TurboID with that of a single biotin added by the HaloTag. Unfortunately, the PEG-biotin does not penetrate trypanosome cells. In conclusion, we are not aware of a method that would allow us to establish why streptavidin but not IgGs can penetrate phase-separated areas. We therefore prefer not to overinterpret our data, but to stick to what is supported by the data: “the inability to label phase-separated areas is not restricted to anti-HA but applies to other antibodies”.

      (3) Figure 4 A-B. The validity of claiming the correct localization demonstrated by streptavidin imaging comes into question, especially when endogenous fluorescence, via the fusion protein, remains undetectable (as indicated by the yellow arrow at apex). 

      In this figure, the streptavidin imaging does NOT show the correct localisation of the bait protein, but it does show proteins from historic interactions that have a distinct localisation to the bait. We had therefore introduced this chapter with the paragraph below, to make sure the reader is aware of the limitations (which we also see as an opportunity, if properly controlled):

      “We found that in most cases, streptavidin labelling faithfully reflects the steady state localisation of a bait protein, e.g., the localisation resembles those observed with immunofluorescence or direct fluorescence imaging of GFP-fusion proteins. For certain bait proteins, this is not the case, for example, if the bait protein or its interactors have a dynamic localisation to distinct compartments, or if interactions are highly transient. It is thus essential to control streptavidin-based de novo localisation data by either antibody labelling (if possible) or by direct fluorescence of fusion-proteins for each new bait protein.”

      In particular, on lines 450-460, there's a fundamental issue with the argument put forward here. It is not possible to formally know that the posterior labeling is ALPH1 vs. another part of the decapping complex that was associated with PABP2-Turbo, or if the higher detection capacity of the Turbo-biotin label is uncovering a novel localization of the PABP2. While it is likely that it is ALPH1, it is not possible to rule out other possibilities with this approach. These issues should be discussed here and more generally the possibility of off-target labeling with this approach should be addressed in the discussion. 

      We fully agree with the reviewer that tracking proteins by streptavidin imaging alone is problematic, because it cannot distinguish which protein is biotinylated. We therefore used words like “likely” in the description of the data. However, we still think it is a valid method, as long as it is backed up by an orthogonal method. We have added this paragraph to the end of this chapter:

      “Importantly, tracking of proteins by streptavidin imaging requires orthogonal controls, as the imaging alone does not provide information about the nature of the biotinylated proteins. These can be proximity ligation assay, mass spectrometry or specific tagging visualisation of protein suspects by fluorescent tags. Once these orthogonal controls are established for a specific tracking, streptavidin imaging is an easy and cheap and highly versatile method to monitor protein interactions in a specific setting.”

      (4) More discussion and acknowledgment of the general limitations in using tagged proteins are needed to balance the manuscript, especially if the hope is to draw a comparison with antibody labeling, which works on endogenous proteins (not requiring a tag). For example: (a) tagging proteins requires genetic/molecular work ahead of time to engineer the constructs and/or cells if trying to tag endogenous proteins; (b) tagged proteins should technically be validated in rescue experiments to confirm the tag doesn't disrupt function in the cell/tissue/context of interest; and (c) exogenous tagged proteins compete with endogenous untagged proteins, which can complicate the interpretation of data.  

      We have added this paragraph to the first paragraph of the discussion part:

      “Like many methods that are frequently used in cell- and molecular biology, streptavidin imaging is based on the expression of a genetically engineered fusion protein: it is essential to validate both, function and localisation of the TurboID-HA tagged protein by orthogonal methods. If the fusion protein is non-functional or mis-localised, tagging at the other end may help, but if not, this protein cannot be imaged by streptavidin imaging. Likewise, target organisms not amenable to genetic manipulation, or those with restricted genetic tools,  are not or less suitable for this method.”

      We would also like to point out that for non-mainstream organisms such as trypanosomes, antibodies are not commercially available, and genetic manipulation is often more time-efficient and cheaper than producing an antiserum against the target protein.

      Also, the introduction would ideally be more general in scope and introduce the pros and cons of antibody labeling vs biotin/streptavidin, which are mentioned briefly in the discussion. The fact that the biotin-streptavidin interaction is ~100-fold higher affinity than an IgG binding to its epitope is likely playing a key role in the results here. The difference in size between IgG and streptavidin, the likelihood that the tetrameric streptavidin carries more fluors than a IgG secondary, and the fact that biotin can likely diffuse into phase-separated environments should be clearly stated. The current introduction segues from a previous paper that a more general audience may not be familiar with. 

      We have now added this paragraph to the introduction:

      “It remains unclear, why streptavidin was able to stain biotinylated proteins within these antibody inaccessible regions, but possible reasons are: (i) tetrameric streptavidin is smaller and more compact than IgGs (60 kDa versus a tandem of two IgGs, each with 150 kDa) (ii) the interaction between streptavidin and biotin is ~100 fold stronger than a typical interaction between antibody and antigen and (iii) streptavidin contains four fluorophores, in contrast to only one per secondary IgG.”

      Minor issues: 

      The copy numbers of the HA and Ty1 epitope tags vary depending on the construct being used. For example, Ty1 is found as a single copy tag in the TurboID tag, but on the mNeonGreen tag there are 6 copies of the epitope. It makes it hard to know if differences in detection are due to variations in copies of the epitope tags. Line 372-374: can the authors explain why they chose to use nanobodies in this case? It would be great to show the innate mNeonGreen signal in 2K to compare to the Ty1 labeling. The presence of 6 copies of the Ty1 epitope could be essential to the labeling seen here.

      We agree with the reviewer that these data are a bit confusing. We have now removed Figure 3K, as it is the only construct with six copies of Ty1 instead of one, and it does not add to the conclusions (the mNeonGreen signal is entirely in the nucleolus, as shown by TrypTag). We have also added an explanation of why we used nanobodies (“The absence of a nanobody signal rules out that it is simply the size of IgGs that prevents the staining of Mex67 at the nuclear pores, as nanobodies are smaller than (tetrameric) streptavidin”). However, as stated above, we prefer not to overinterpret the data, as signals from different antibody/nanobody-antigen combinations are not comparable. What was important to us was to stress that the absence of signal in phase-separated areas is NOT restricted to the anti-HA antibody, which is clearly supported by the data.

      What is the innate streptavidin background labeling look like in cells that are not carrying a TurboID fusion, from the native proteins that are biotinylated? That should be discussed. 

      We have now included the controls without the TurboID fusions for trypanosomes and HeLa cells: “Wild-type cells of both trypanosomes and humans showed only a very low streptavidin signal, indicating that the signal from naturally biotinylated proteins is negligible (Figure S8 in supplementary material).”

      Line 328-331: This is likely to be dependent on whether or not the protein moves to different localizations within the cell. 

      True, we agree, and we have added this paragraph:

      “The one exception are very motile proteins that produce a “biotinylation trail” distinct to the steady state localisation; these exceptions, and how they can be exploited to understand protein interactions, are discussed in chapter 4 below.”

      Line 304-305: Does biotin supplementation not matter at all? 

      No, we never saw any increase in biotinylation when we added extra biotin to trypanosomes. The 0.8 µM biotin concentration in the medium was sufficient.

      Line 326-327: Was the addition of biotin checked for enhancement in the case of the mammalian NUP98? I would argue that there is a significant number of puncta in Figure 1D that are either green or magenta, not both. The amount of extranuclear puncta in the HA channel is also difficult to explain. Biotin supplementation to 500 µM was used in mammalian TurboID experiments in the original Nature Biotech paper- perhaps nanomolar levels are too low. 

      We have now tested HeLa cells with 500 µM biotin and saw an increase in signal, but also in background; because of the increased background, we conclude that low biotin concentrations are more suitable. We have also repeated the experiment using 4HA tags instead of 1HA, and we found a minor improvement in the antibody signal for NUP88 (while the phase-separated NUP54 was still not detectable). We have replaced the images in Figure 1D (NUP88) and in Figure 2F (NUP54) with improved images obtained using 4HA tags. However, we would like to note that single nuclear pore resolution is beyond what can be expected of light microscopy.

      Line 371: In 2I, I see a signal that looks like the nucleus, similar to the Ty1 labeling in 2G, so I don't think it's accurate to say that that Mex67 was "undetectable". Does the serum work for blotting? 

      Thank you, yes, “undetectable” was not the correct phrase here. Mex67 localises to the nuclear pores, to the nucleoplasm and to the nucleolus (GFP-tagging or streptavidin). Antibodies, either to the tag or to the endogenous proteins, fail to detect Mex67 at the nuclear pores and also don’t show any particular enrichment in the nucleolus. They do, however, detect Mex67 in the (not-phase-separated) area of the nucleoplasm. We have changed the text to make this clearer. The Mex67 antiserum works well on a western blot (see for example: Pozzi, B., Naguleswaran, A., Florini, F., Rezaei, Z. & Roditi, I. The RNA export factor TbMex67 connects transcription and RNA export in Trypanosoma brucei and sets boundaries for RNA polymerase I. Nucleic Acids Res. 51, 5177–5192 (2023)).

      Line 477: "lacked" should be "lagged".

      Thank you, corrected.

      Line 468-481: My previous argument holds here - how do you know that the difference in detection here is just a matter of much higher affinity/quantity of binding partner for the avidin?

      See answer to the second point of (3), above.

      483-491: Same issue - without certainty about what the biotin is on, this argument is difficult to make. 

      See answer to the second point of (3), above.

      Line 530: "bone-fine" should be "bonafide"

      Thank you, corrected.

      Line 602: biotin/streptavidin labeling has been used for expansion microscopy previously (Sun, Nature Biotech 2021; PMID: 33288959). 

      Thank you, we had overlooked this! We have now included this reference and describe the differences to our approach more clearly in the discussion part:

      “Fluorescent streptavidin has been previously used in expansion microscopy to detect biotin residues in target proteins produced by click chemistry (Sun et al., 2021). However, to the best of our knowledge, this is the first report that employs fluorescent streptavidin as a signal enhancer in expansion microscopy and CLEM, by combining it with multiple biotinylation sites added by a biotin ligase. Importantly, for both CLEM and expansion, streptavidin imaging is the only alternative approach to immunofluorescence, as denaturing conditions associated with these methods rule out direct imaging of fluorescent tags.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      This study presents valuable framework and findings to our understanding of the brain as a fractal object by observing the stability of its shape property within 11 primate species and by highlighting an application to the effects of aging on the human brain. The evidence provided is solid but the link between brain shape and the underlying anatomy remains unclear. This study will be of interest to neuroscientists interested in brain morphology, whether from an evolutionary, fundamental or pathological point of view, and to physicists and mathematicians interested in modeling the shapes of complex objects.

      We have now clarified the outstanding question of whether our model outputs can be related to actual primate brain anatomy, which we believe arose mainly from comments on the validity of our output of apparently thicker cortices than nature can produce.

      We address this point in more detail in the point-by-point response below, but want to address this misunderstanding directly here: Our algorithm does not produce thicker cortices with increasing coarse-graining scales; in fact, the cortical thickness never exceeds the actual cortical thickness in our outputs, but rather thins with each coarse-graining scale. In other words, we believe that our outputs are fully in line with neuroanatomy across species.

      Reviewer #2 (Public Review): 

      In this manuscript, the authors analyze the shapes of cerebral cortices from several primate species, including subgroups of young and old humans, to characterize commonalities in patterns of gyrification, cortical thickness, and cortical surface area. The authors state that the observed scaling law shares properties with fractals, where shape properties are similar across several spatial scales. One way the authors assess this is to perform a "cortical melting" operation that they have devised on surface models obtained from several primate species. The authors also explore differences in shape properties between brains of young (~20 year old) and old (~80) humans. A challenge the authors acknowledge struggling with in reviewing the manuscript is merging "complex mathematical concepts and a perplexing biological phenomenon." This reviewer remains a bit skeptical about whether the complexity of the mathematical concepts being drawn from are justified by the advances made in our ability to infer new things about the shape of the cerebral cortex. 

      To allow scientists from all backgrounds to adopt these complex ideas, we have made our code to “melt” the brains and for further downstream analysis publicly available. We have now also provided a graphical user interface, to allow users without substantial coding experience to run the analysis. We also believe that the algorithmic concepts are easy to understand due to the similarity to the coarse-graining procedures found in long-standing and well-accepted box-counting algorithms.

      Beyond the theoretical insight of the fractal nature of cortices and providing an explicit and crucial link between vastly different brains that are gyrified and those that are not, we believe that the advance gained by our methods for future applications is clearly demonstrated in our proof-of-principle with a four-fold increase in effect size. For reference, an effect size of 8 would translate to an almost perfect separation of groups, i.e. an ideal biomarker with near 100% sensitivity and specificity.
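      To make the link between effect size and group separation concrete, the following is a minimal sketch rather than the calculation used in the paper. It assumes two normally distributed groups with equal variance and a decision threshold placed midway between the group means, in which case sensitivity and specificity both equal Φ(d/2) for a standardised effect size d. The function names and the two example values of d are illustrative assumptions.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def midpoint_accuracy(d: float) -> float:
    """Sensitivity (equal to specificity) of a midpoint threshold.

    Assumes two equal-variance normal groups whose means differ by
    d standard deviations (a Cohen's-d style effect size).
    """
    return normal_cdf(d / 2.0)


if __name__ == "__main__":
    # Illustrative effect sizes only; these are not values from the paper.
    for d in (2.0, 8.0):
        print(f"d = {d:.0f}: sensitivity = specificity = {midpoint_accuracy(d):.5f}")
```

      Under these assumptions an effect size of 2 gives roughly 84% sensitivity and specificity, whereas an effect size of 8 gives more than 99.99%, which is the sense in which such a measure would separate the groups almost perfectly.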

      (1) The series of operations to coarse-grain the cortex illustrated in Figure 1 produces image segmentations that do not resemble real brains.

      As re-iterated in our Methods and Discussion: “Note, of course, that the coarse-grained brain surfaces are an output of our algorithm alone and are not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Fig. 1 therefore serves as an explanation to the reader on the algorithmic outputs, but each melted brain is not supposed to be directly/visually compared to actual brains. Similar to algorithms measuring the fractal dimension, or the exposed surface area of a given brain, the intermediate outputs of these algorithms are not supposed to represent any biologically observed brain structures, but rather serve as an abstraction to obtain meaningful morphometrics.

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained and voxelised versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects/voxelisations themselves.

      The process to assign voxels in downsampled images to cortex and white matter is biased towards the former, as only 4 corners of a given voxel are needed to intersect the original pial surface, but all 8 corners are needed to be assigned a white matter voxel. The reason for introducing this bias (and to the extent that it is present in the authors' implementation) is not provided.

      This detail was in the Supplementary, and we have now added additional clarification on this specific point to our Supplementary:

      “In detail, we assign all voxels in the grid with at least four corners inside the original pial surface to the pial voxelization. This process allows the exposed surface to remain approximately constant with increasing voxel sizes. A constant exposed surface is desirable, as we only want to gradually ‘melt’ and fuse the gyri, but not grow the bounding/exposed surface as well. We want the extrinsic area to remain approximately constant as we decrease the intrinsic area via coarse-graining; it is like generating iterates of a Koch curve in reverse, from more to less detailed, by increasing the length of smallest line segment.

      We then assign voxels with all eight corners inside the original white matter surface to the white matter voxelization. This is to ensure integrity of the white matter, as otherwise white matter voxels in gyri may become detached from the core white matter, and thus artificially increase white matter surface area. Indeed, the main results of the paper are not very sensitive to this decision using all eight corners, vs. e.g. only four corners, as we do not directly use white matter surface area for the scaling law measurements. However, we still maintained this choice in case future work wants to make use of the white matter voxelisations or derivative measures.”
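      The corner-counting rule described above can be sketched as follows. This is a toy illustration under stated assumptions, not our released code: the pial and white matter surfaces are replaced by two concentric spheres with a simple point-inside test, whereas the actual pipeline works on surface meshes. All function names, the grid origin and the radii are hypothetical.

```python
from itertools import product


def corners_inside(inside, ix, iy, iz, origin, size):
    """Count how many of the 8 corners of voxel (ix, iy, iz) lie inside a surface."""
    return sum(
        inside((origin[0] + (ix + dx) * size,
                origin[1] + (iy + dy) * size,
                origin[2] + (iz + dz) * size))
        for dx, dy, dz in product((0, 1), repeat=3)
    )


def voxelise(pial_inside, white_inside, origin, size, shape):
    """Assign voxels to the pial and white matter voxelisations.

    Pial: at least 4 corners inside the pial surface.
    White matter: all 8 corners inside the white matter surface.
    """
    pial, white = set(), set()
    for idx in product(*(range(n) for n in shape)):
        if corners_inside(pial_inside, *idx, origin, size) >= 4:
            pial.add(idx)
        if corners_inside(white_inside, *idx, origin, size) == 8:
            white.add(idx)
    return pial, white


def sphere(radius):
    """Toy stand-in for a closed surface: point-inside test for a centred sphere."""
    return lambda p: p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2


# Hypothetical example: "pial" sphere of radius 10, "white matter" sphere of radius 7.
pial, white = voxelise(sphere(10.0), sphere(7.0),
                       origin=(-12.0, -12.0, -12.0), size=2.0, shape=(12, 12, 12))
print(len(pial), len(white))
```

      In this sketch every white matter voxel is also a pial voxel, because eight corners inside the inner surface imply eight corners inside the outer one; voxels in the pial set but not in the white matter set play the role of the cortical ribbon.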

      On the point of white matter integrity, note that if both grey and white matter voxelisations required all 8 corners to be inside the respective mesh, some voxels at the grey/white matter interface would be assigned to neither, causing potential downstream issues.

      We further acknowledge:

      “Of course, our proposed procedure is not the only conceivable way to erase shape details below a given scale; and we are actively working on related algorithms that are also computationally cheaper. Nevertheless, the current version requires no fine-tuning, is computationally feasible and conceptually simple, thus making it a natural choice for introducing the methodology and approach.”

      The authors provide an intuitive explanation of why thickness relates to folding characteristics, but ultimately an issue for this reviewer is, e.g., for the right-most panel in Figure 2b, the cortex consists of several 4.9-sided voxels and thus a >2 cm thick cortex. A structure with these morphological properties is not consistent with the anatomical organization of typical mammalian neocortex. 

      We assume the reviewer refers to Fig. 1B with the panel on scale=4.9mm. We would like to point out that Fig. 1 serves as an explanation of the voxelisation method. For the actual analysis and Results, we are using re-scaled brains (see Fig. 2 with the ever decreasing brain sizes). The rescaling procedure is now expanded as below:

      “Morphological properties, such as cortical thicknesses measured in our ‘melted’ brains are to be understood as a thickness relative to the size of the brain. Therefore, to analyse the scaling behaviour of the different coarse-grained realisations of the same brain, we apply an isometric rescaling process that leaves all dimensionless shape properties unaffected (more details in Suppl. S3.1). Conceptually, this process fixes the voxel size, and instead resizes the surfaces relative to the voxel size, which ensures that we can compare the coarse-grained realisations to the original cortices, and test if the former, like the latter, also scale according to Eqn. (1). Resizing, or more precisely, shrinking the cortical surface is mathematically equivalent to increasing the box size in our coarse-graining method. Both achieved an erasure of folding details below a certain threshold. After rescaling, as an example, the cortical thickness also shrinks with increasing levels of coarse-graining, and never exceeds the thickness measured at native scale.”
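      Why an isometric rescaling leaves dimensionless shape properties unaffected can be written out in one line. The symbols below are illustrative notation assumed for this sketch and are not quoted from the manuscript: λ is the rescaling factor, T the cortical thickness, and A_t and A_e the total and exposed surface areas. Lengths scale with λ and areas with λ², so any ratio in which the length dimensions cancel is unchanged:

```latex
\[
T \mapsto \lambda T, \qquad A_t \mapsto \lambda^{2} A_t, \qquad A_e \mapsto \lambda^{2} A_e
\quad\Longrightarrow\quad
\frac{A_t}{A_e} \mapsto \frac{\lambda^{2} A_t}{\lambda^{2} A_e} = \frac{A_t}{A_e},
\qquad
\frac{T}{\sqrt{A_e}} \mapsto \frac{\lambda T}{\sqrt{\lambda^{2} A_e}} = \frac{T}{\sqrt{A_e}} .
\]
```

      The absolute thickness, by contrast, does change with λ, which is why thickness in the rescaled brains should be read relative to brain size.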

      We additionally added a note to the caption of Fig. 1 to clarify this point:

      “Note that the actual size of the brains for analysis are rescaled (see Methods and Fig. 3); we display all brains scaled at an equal size here for the ease of visualisation of the method.”

      Finally, we also edited the entire paper for terminology to clearly distinguish the terms of (1) the cortex as a 3D object, (2) coarse-grained versions thereof, and (3) summary morphological measures derived from the former. When we invite comparisons in our paper between real brains and coarse-grained brains, this is always at the level of summary morphological measures, not at the level of the 3D objects themselves and their detailed anatomical features.

      (2) For the comparison between 20-year-old and 80-year-old brains, a well-documented difference is that the older age group possesses more cerebrospinal fluid due to tissue atrophy, and the distances between the walls of gyri become greater. This difference is borne out in the left column of Figure 4b. It seems this additional spacing between gyri in 80-year-olds requires more extensive down-sampling (larger scale values in Figure 4a) to achieve a similar shape parameter K as for the 20-year-olds. The authors assert that K provides a more sensitive measure (associated with a large effect size) than currently used ones for distinguishing brains of young vs. old people. A more explicit, or elaborate, interpretation of the numbers produced in this manuscript, in terms of brain shape, might make this analysis more appealing to researchers in the aging field.

      We had already removed the main results relating to K and aging in our last revision to avoid confusion. These now appear only in the supplementary analysis, and our claim that K is a more sensitive measure of age and ageing, whilst still true, will be presented in more detail in a series of upcoming papers.

      (3) In the Discussion, it is stated that self-similarity, operating on all length scales, should be used as a test for existing and future models of gyrification mechanisms. Given the lack of association between the abstract mathematical parameters described in this study and explicit properties of brain tissue and its constituents, it is difficult to envision how the coarse-graining operation can be used to guide development of "models of cortical gyrification."

      We have clarified in more detail what we originally meant in the Discussion:

      “Finally, this dual universality is also a more stringent test for existing and future models of cortical gyrification mechanisms at relevant scales, and one that moreover is applicable to individual cortices. For example, any models that explicitly simulate a cortical surface as an output could be directly coarse-grained with our method and the morphological trajectories can be compared with those of actual human and primate cortices. The simulated cortices would only be ‘valid’ in terms of the dual universality, if it also produces the same morphological trajectories.”

      However, we agree with the reviewer that our paper could be misread as demanding direct comparisons of each coarse-grained brain with an actual brain, and we have now added the following text to clarify that this is not our intention for the proposed method or outputs.

      “Note, we do not suggest to directly compare coarse-grained brain surfaces with actual biological brain surfaces. As we noted earlier, the coarse-grained brain surfaces are an output of our algorithm alone and not to be directly/naively likened to actual brain surfaces, e.g. in terms of the location or shape of the folds. Our comparisons here between coarse-grained brains and actual brains is purely on the level of morphometrics across the whole cortex.”

      Indeed, the dual universality imposes restrictive constraints on the possible shapes of real cortices, but does not fully specify them. Presumably, the location of individual folds in different individuals and species will depend on their respective evolutionary histories, so there is no reason to expect a match in fold location between the ‘melted’ cortices of more gyrified species, on one hand, and the cortex of a less-gyrified one, on the other, even if their global morphological parameters and global mechanism of folding coincide.

      (4) There are several who advocate for analyzing cortical mid-thickness surfaces, as the pial surface over-represents gyral tips compared to the bottoms of sulci in the surface area. The authors indicate that analyses of mid-thickness representations will be taken on in future work, but this seems to be a relevant control for accepting the conclusions of this manuscript.

      In the context of some applications and methods, we agree that the mid-surface is a meaningful surface to analyse. However, in our work, the mid-surface is not. The fractal estimation rests on the assumption that the exposed area hugs the object of interest (hence convex hull of the pial surface), as the relationship between the extrinsic and intrinsic areas across scales determine the fractal relationship (Eq. 2). If we used the mid-surface instead of the pial surface for all estimation, this would not represent the actual object of interest, and it is separated from the convex hull. Estimating a new convex hull based on the mid surface would be the equivalent of asking for the fractal dimension of the mid-surface, not of the cortical ribbon. In other words, it would be a different question, bound to yield a different answer.

      Hence, we indicated in our original response that we only have a provisional answer, but more work beyond the scope of this paper is required to answer this question, as it is a separate question. The mid-surface, as a morphological structure in its own right, will have its own scaling properties, and our provisional understanding is that these also yield a scaling law parallel to those of the cortical ribbon with the same or a similar fractal dimension. But more systematic work is required to investigate this question at native scale and across scales.

      Reviewer #3 (Public Review):

      Summary: Through a rigorous methodology, the authors demonstrated that within 11 different primates, the shape of the brain followed a universal scaling law with fractal properties. They enhanced the universality of this result by showing the concordance of their results with a previous study investigating 70 mammalian brains, and the discordance of their results with other folded objects that are not brains. They incidentally illustrated potential applications of this fractal property of the brain by observing a scale-dependant effect of aging on the human brain. 

      Strengths: 

      - New hierarchical way of expressing cortical shapes at different scales derived from previous report through implementation of a coarse-graining procedure 

      - Investigation of 11 primate brains and contextualisation with other mammals based on prior literature 

      - Proposition of tool to analyse cortical morphology requiring no fine tuning and computationally achievable 

      - Positioning of results in comparison to previous works reinforcing the validity of the observation. 

      - Illustration of scale-dependance of effects of brain aging in the human. 

      Weaknesses: 

      - The notion of cortical shape, while being central to the article, is not really defined, leaving some interpretation to the reader 

      - The organization of the manuscript is unconventional, leading to mixed contents in different sections (sections mixing introduction and method, methods and results, results and discussion...). As a result, the reader discovers the content of the article along the way, it is not obvious at what stages the methods are introduced, and the results are sometimes presented and argued in the same section, hindering objectivity. 

      To improve the document, I would suggest a modification and restructuring of the article such that: 1) by the end of the introduction the reader understands clearly what question is addressed and the value it holds for the community, 2) by the end of the methods the reader understands clearly all the tools that will be used to answer that question (not just the new method), 3) by the end of the results the reader holds the objective results obtained by applying these tools on the available data (without subjective interpretations and justifications), and 4) by the end of the discussion the reader understands the interpretation and contextualisation of the study, and clearly grasps the potential of the method depicted for the better understanding of brain folding mechanisms and properties. 

      We thank this reviewer again for their attention to detail and constructive comments. We have followed the detailed suggestions provided to us in the Recommendations For The Authors, and summarise the main changes here:

      - We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      -  We have now clarified the concept of “cortical shape”, as we use it in our paper in several places, by distinguishing clearly the object of study, and the morphological properties measured from it.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): None 

      Reviewer #3 (Recommendations For The Authors): 

      I once again compliment the authors for their elegant work. I am happy with the way they covered my first feedback. My second review takes into account some comments made by other reviewers with which I agree. 

      We thank this reviewer again for their attention to detail and constructive comments.

      Recommendations for clarifications: 

      General comments: The purpose of the article could be made clearer in the introduction. When I differentiate results from discussion, I think of results as objective measures or observations, while discussion will relate to the interpretation of these results (including comparison with previous literature, in most cases). 

      We have restructured all sections to be more clearly following Introduction, Methods, Results, and Discussion; by using subsections, we believe the structure is now more accessible to readers.

      - l.39: define or discuss "cortical shape" 

      We have gone through the entire paper and corrected for any ambiguities. We specifically distinguish between the cortex as a structure overall, shape measures derived from this structure, and coarse-grained versions of the structure.

      - l.48-74: this would match either an introduction or a discussion rather than a methods section. 

      Done

      - l.98-106: this would match a discussion rather than a methods section. 

      Done

      - l.111: here could be a good spot to discuss the 4 vs 8 corners for inclusion of pial vs white matter voxelization 

      We have discussed this in the more detailed Supplementary section now, as after restructuring, this appears to be the more suitable place.

      - l.140-180: it feels that this section mixes methods, results and discussion of the results 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      - l.183-217: mix of results and discussion 

      We agree and we have resolved this by removing sentences and re-arranging sections.

      Small cosmetic suggestions: 

      - l.44: conservation of 'some' quantities: vague 

      Changed to conservation of morphological relationships across evolution

      - l.66: order of citations ([24, 22,23]) 

      Will be fixed at proof stage depending on format of references.

      - l.77: delete space between citation and period 

      Done

      - l.77: I would delete 'say' 

      Done

      - l.86: 'but to also analyse' -> 'to analyse' 

      Done

      - l.105: remove 'we are encouraged that' 

      Done

      - l.111: 'also see' -> 'see also' 

      Done

      - l.164: 'remarkable': subjective 

      Done

      - l.189: define approx. abbreviation 

      Done

      - l.190: 'approx' -> 'approx.' 

      Revised

      - l.195: 'dramatic': subjective 

      removed

      -l. 246: 'much' -> vague 

      explained

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Answers to reviewers


      Reviewer #1

      Sagia et al. present a manuscript using A. nidulans as model to study different transport routes of membrane proteins from the ER to the plasma membrane. They showed in earlier work that apparently at least two different transport routes exist, one involving the classical ER-ERES-ERGIC-Golgi route, one bypassing the Golgi. Unpolarized membrane proteins use the former, apically sorted membrane proteins the latter route. The study here confirms their earlier findings, uses a better model (co-expression of representatives for both routes in the same cell) and provides additional mechanistic insights about the roles of rabs, SNARES and other important proteins of the secretory pathway. The study is thoroughly done, figures are of high quality, data and methods well described and adequately replicated.

      Thank you for your positive comments

      I do have, however, a number of comments that could help to improve the manuscript.

      -I suggest using the term polarized or apical rather than polar. Polar alone to me refers more to physico-chemical properties like water-solubility.

      Amended in most parts of the revised text.

      -introduction and discussion: I don’t think the literature about unconventional secretion bypassing the Golgi is complete, for example studies about TMED10 like Zhang, M. et al. Cell 181, 637-652 e615 (2020) or Zhang et al. Elife 4 (2015) are missing, there might be others. Is UapA a leader-less cargo that could be inserted via TMED10 translocation?

      Thank you for letting us know; we had missed these articles. More references on UPS are now added, including the Zhang et al. publications. UapA, like all transporters, is a multispan transmembrane protein with no leader peptide. In fact, we have checked the role of p24 family proteins (homologous to TMED10) in UapA trafficking. The knock-out of key p24 proteins does not affect UapA sorting to the PM (please consider this as confidential unpublished results).

      -Fig. 1C. Can these intracellular structures be characterized in more detail?

      As explained briefly to the handling editor above, and following the reviewer’s suggestion, we performed new experiments to better characterize the identity of the cargo-labeled fluorescent puncta. To do so, we used co-expression of a standard ERES marker, Sec16, in cells expressing either UapA or SynA, tagged with different fluorescent tags. More specifically, we constructed and analyzed strains co-expressing UapA-GFP/Sec16-mCherry or GFP-SynA/mCherry-Sec16 in the sec31ts genetic background, which allows synchronization and better analysis of ER exit, as described in our text. The new findings appear as Figure 5C in the revised manuscript. Notice that sec16-mCherry, introduced in the native sec16 locus by standard knock-in reverse genetics of A. nidulans (see Materials and methods), does not affect Aspergillus growth or secretion. Experiments depicted in Figure 5C show that both cargoes, UapA and SynA, co-localize significantly (PCC ≈ 0.6) with Sec16, suggesting that most of these puncta are indeed ERES structures. Given that the puncta marked with UapA or SynA are clearly distinct (see Figures 1C, 2A, 3A, 5B), this new experiment strongly suggests that there are indeed two distinct ERES populations, one populated mostly by UapA and the other by SynA. Notice that, as we already outlined in our response to the editor above, a three-color approach using Sec16-BFP (or Sec13-BFP) to show directly the existence of these two populations of cargo-specific ERES in the same cell failed, as the BFP signal was problematic for colocalization studies.

      Where is the Golgi localized in A. nidulans, is it decentralized like in yeast?

      Yes, as in S. cerevisiae, A. nidulans Golgi cisternae are individually scattered throughout the cytoplasm, also similarly to other filamentous fungi. Notice that in A. nidulans Golgi structures are moderately polarized (Pantazopoulou and Penalva 2009).

      Is the UapA at the time points shown in Fig. 1C in some sub-PM structures? To me the distribution at or near the PM is more punctate than in the steady state image shown in 1B

      The punctate appearance of PM transporters at the periphery of fungal cells is a common theme when these do not reach high, steady-state levels of accumulation. In fact, several transporters mark specific subdomains of the PM, more evident before achieving their steady-state levels. For example, in yeast several amino acid and nucleobase transporters mark punctate structures that colocalize with eisosome markers (caveolin-like PM subdomains), while the proton pump ATPase Pma1 marks distinct punctate domains. Similarly, UapA and other solute transporters mark punctate structures before reaching their steady-state accumulation in the PM. Figure 1C shows the de novo synthesis of cargoes after 100 min of transcription, while Figure 1B depicts the steady-state localization of UapA and SynA after 4h. In the latter case, the PM is ‘saturated’ with UapA molecules and thus the fluorescent signal of distinct puncta ‘fuses’, creating continuous fluorescent labeling. Notice also that in several cases in our work we have also performed UapA transport assays, which provide a direct tool to test and confirm the presence of UapA in the PM (see Figures 4D or 6C).

      -Fig. 3A. To me it looks like there is actually a lot of colocalization of UapA and SynA, especially at or near the PM, where there is quite some white, punctate staining. The green fluorescence is just much stronger, overlaying the violet. Can you show separate channels and explain?

      We think the reviewer means Figure 2A, which compares UapA and SynA (Figure 3A compares UapA with Golgi markers). If so, we have quantified the colocalization and performed statistical analysis (PCC), which indicates that this visually apparent colocalization is not significant (right panel in Figure 2A). Notice also that we cannot totally exclude very minimal colocalization of UapA and SynA signals, as both cargoes mark very proximal early secretory domains (i.e., ERES or ERGIC), especially in fungal cells. In any case, in the revised Figure 2 we also added a panel depicting separate channels, as the reviewer asks.
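For readers unfamiliar with the PCC values quoted in this response, the following is a minimal sketch of how a Pearson's correlation coefficient can be computed from paired pixel intensities of two fluorescence channels. It is an illustration only and not the analysis pipeline used for the manuscript; the function name `pearson_cc` and the flat lists of intensities are assumptions made for this example.

```python
import math


def pearson_cc(channel_a, channel_b):
    """Return Pearson's correlation coefficient for two equally sized
    sequences of pixel intensities (hypothetical helper, for illustration)."""
    if len(channel_a) != len(channel_b) or len(channel_a) < 2:
        raise ValueError("channels must have the same length (at least 2)")
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    # Covariance and variances are left unnormalized: the 1/n factors cancel.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("PCC is undefined for a channel of constant intensity")
    return cov / math.sqrt(var_a * var_b)


# Made-up intensities for the same pixels in two channels.
green = [10, 40, 35, 5, 60]
magenta = [12, 38, 30, 8, 55]
print(round(pearson_cc(green, magenta), 3))
```

Values close to 1 indicate that the two signals rise and fall together across pixels, values close to 0 indicate no linear relationship, and negative values indicate mutual exclusion.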

      -Fig. 3: In my opinion the statement that UapA "is probably sorted from an early secretory compartment, ultimately bypassing the need for Golgi maturation" is too strong at that point. You say for both UapA and SynA you don’t get significant colocalization with early Golgi/ERGIC marker, then you cannot conclude that one takes the conventional route via early-late Golgi and the other does not. What you can say is that UapA is apparently not going through late Golgi.

      The reviewer is in principle correct. However, significant colocalization with the late Golgi marker, as SynA shows, strongly suggests that this cargo has passed via the early Golgi compartment. The fact that we failed to detect significant colocalization of any cargo tested with early Golgi/ERGIC markers (e.g., SedV) is very probably due to very rapid passage of cargoes through these compartments, which conventional widefield or confocal microscopy cannot detect. To achieve this, ultra-fast fluorescence microscopy, such as Lattice Light Sheet Microscopy (LLSM), should be used. In fact, we are currently initiating these studies, which will appear elsewhere in the near future.

      -Fig. 4C: UapA does not seem to accumulate in the ER in the Sec24 and 13 mutants but in punctate structures. This for me is unexpected, any explanations? Can you characterize that punctate staining?

      This is an interesting observation. Notice that UapA is a large homodimeric protein (e.g., 28 transmembrane domains) that oligomerizes further upon translocation into the ER membrane. Repression of Sec24, and to a lesser extent of Sec13, leads to inability to exit the ER properly. Consequently, this will lead to UapA overaccumulation in the ER, which might in turn lead to ER stress and turnover, reflected in UapA aggregates. In line with this, we have previously shown that specific mutants of UapA unable to exit the ER are indeed degraded by selective autophagy (Evangelinos et al., 2016). In contrast to UapA, SynA partitions in the entire ER without forming aggregates when sec24 or sec13 are repressed. This might be due to the fact that it is a single-pass, much smaller membrane protein compared to UapA, and one that is not known to form oligomers. Thus, its overaccumulation in the ER might not lead to aggregation, allowing it to diffuse laterally in the membrane of the ER. A note on this is included in the Figure legend of the revised manuscript.

      -Fig. 6D: You state that BFA "has only a very modest effect on UapA translocation to the PM". To me the PM (or very near PM) staining of UapA looks very different in the PFA treated cells, more uneven/punctate. Is there an explanation for that?

      Our explanation is the following. When BFA is added, conventional secretion is blocked and the Golgi collapses. We believe that this might also have a moderate indirect effect on cargoes bypassing the late Golgi/TGN, such as UapA (i.e., lower levels of UapA present in the PM). This is based on the fact that UapA, like conventional cargoes, requires the Q-SNARE complex SsoA/Sec9 to translocate to the PM. SsoA, being a membrane protein cargo itself, also needs to traffic to the PM. Interestingly, we have previously obtained evidence suggesting that SsoA traffics to the PM by both conventional and Golgi-bypass routes (Dimou et al 2020). Thus, UapA translocation to the PM might indeed be partially impeded or delayed due to reduced delivery of proteins, such as SsoA (and probably Sec9), needed for its final integration into the PM bilayer. Importantly, in line with an indirect effect of BFA on the levels of UapA localized in the PM, notice that, unlike SynA, UapA was never trapped in brefeldin bodies (i.e., Golgi aggregates).

      Reviewer #1 (Significance):

      One strength of the study is the use of a model organism, A. nidulans, not cell cultures. Also, the use of both reporters, UapA and SynA, in the same cell is an advantage over previous studies using different lines and different promoters. A limitation of the study might be that it remains unclear to what extent the basic mechanism (UapA and SynA are transported to the PM in different carriers and via different routes) can be generalized to other polarized (apically?) membrane proteins versus non-polarized membrane proteins in A. nidulans and whether a similar mechanism exists in other organisms. Some of the basic findings of the study are not new but were published by the same group. However, as the authors point out, the current study uses improved assays and extends their previous studies, advancing our understanding of the mechanistics of transport in the conventional secretory pathway and novel alternative routes. The study will be of interest for basic researchers in the trafficking field. My own expertise is transport through the secretory pathway in mammalian cells, many years ago more post-Golgi, now mostly ER-Golgi and ER itself.

      We thank the reviewer for his positive comments.

      Reviewer #2

      The idea that transmembrane proteins of the plasma membrane move from the ER to the Golgi and then to the cell surface is firmly entrenched, and the mechanisms and components of this secretory pathway have been extensively characterized. Secretory vesicles are often delivered from the Golgi to sites of polarized growth. This paper builds on previous work by the same group to provide evidence that in Aspergillus nidulans, some non-polarly localized plasma membrane proteins follow a very different pathway, which bypasses components of the conventional secretory machinery such as SNAREs that have been implicated in secretion as well as the exocyst. In particular, they systematically compare the trafficking of the SNARE SynA, which follows the conventional secretory pathway, with that of the purine transporter UapA, which apparently does not. The two proteins were co-expressed in the same cells using the same promoter. A variety of genetic and microscopy methods are used to support the conclusion that UapA reaches the plasma membrane by a route distinct from that followed by SynA.

      In my view, the authors present a convincing case. The individual experimental results are sometimes ambiguous, but the combined results favor the conclusion that UapA follows a novel pathway to the plasma membrane. I have only a few relatively minor comments.

      Thank you for your positive comments

      1. In the Introduction and elsewhere: to my knowledge, there is no clear evidence that AP-1-containing clathrin-coated vesicles carry cargoes from the Golgi to the plasma membrane. On the contrary, as recently reported by Robinson (https://pubmed.ncbi.nlm.nih.gov/38578286/), AP-1-containing vesicles likely mediate retrograde traffic in the late secretory pathway.

      Thank you for this comment and the relevant reference. We are aware that AP-1 is likely to also mediate retrograde traffic in the late secretory pathway and/or intra-Golgi recycling, as also reported by the group of Benjamin Glick. Thus, in the revised version we added a short comment on this plus relevant references. Along this line, our previous work has shown that transcriptional repression of AP-1 arrests the polar localization of several apical markers in A. nidulans, and we reported that this might be due to an effect on both anterograde and retrograde trafficking. Please see “Secretory Vesicle Polar Sorting, Endosome Recycling and Cytoskeleton Organization Require the AP-1 Complex in Aspergillus nidulans”. Martzoukou O, Diallinas G, Amillis S. Genetics. 2018 Aug;209(4):1121-1138. Overall, the fact that AP-1 was found to be absolutely dispensable for UapA trafficking further strengthens our conclusion that UapA bypasses the Golgi.

      2. In Figure 2, is there any known significance to the presence of UapA in "cytoplasmic oscillating thread structures decorated by pearl-like foci as well as a very faint vesicular/tubular network"?

      At present we cannot answer this question. In order to understand what these structures represent and what their role is, we will need to employ super-resolution and ultra-fast microscopy and additional markers, which we plan to do. We suspect that they might be tubular networks, but this extends beyond the present work.

      3. SynA is related to S. cerevisiae Snc1/2, which are known to be present in late Golgi compartments due to repeated rounds of endocytosis to the Golgi and exocytosis to the plasma membrane. The SynA shown here to colocalize with PHosbp is probably present in a similar recycling loop rather than being en route to the plasma membrane for the first time. Therefore, the differential colocalization of UapA and SynA with PHosbp does not by itself provide "strong evidence that the two cargoes studied traffic via different routes" as stated in the text but might instead indicate that only SynA undergoes frequent endocytosis. The text should be amended accordingly.

      The reviewer is in principle correct. However, given that colocalization of SynA and PHosbp occurred all over the cytoplasm of hyphae and not only at the apical region, and because we record colocalization of cargoes before their steady-state accumulation in the PM, thus at a stage where recycling must be minimal, the recorded colocalization should reflect anterograde transport rather than recycling. We added this reasoning to the revised text.

      4. A missing piece of the story is a test of whether the puncta visualized for the two cargoes in Figure 5B are indeed distinct populations of COPII-containing ER exit sites. The relevant experiment would involve co-labeling of the cargoes together with a COPII marker. Three-color labeling would presumably be needed.

      This point was also raised by reviewer 1 (and reviewer 3), and we thus performed new experiments to better characterize the identity of the cargo-labeled fluorescent puncta. To do so, we used co-expression of a standard ERES marker, Sec16, in cells expressing either UapA or SynA, tagged with different fluorescent tags. More specifically, we constructed and analyzed strains co-expressing UapA-GFP/Sec16-mCherry or GFP-SynA/Sec16-mCherry in the sec31ts genetic background, which allows synchronization and better analysis of ER exit, as described in our text. The new findings appear as Figure 5C in the revised manuscript. Notice that sec16-mCherry, introduced in the native sec16 locus by standard knock-in reverse genetics of A. nidulans (see Materials and methods), does not affect Aspergillus growth or secretion. Experiments depicted in Figure 5C show that both cargoes, UapA and SynA, co-localize significantly (PCC ≈ 0.6) with Sec16, suggesting that most of these puncta are indeed ERES structures. Given that the puncta marked with UapA or SynA are clearly distinct (see Figures 1C, 2A, 3A, 5B), this new experiment strongly suggests that there are indeed two distinct ERES populations, one populated mostly by UapA and the other by SynA. Notice that, as we already outlined in our response to the editor above, a three-color approach using Sec16-BFP (or Sec13-BFP) to show directly the existence of these two populations of cargo-specific ERES in the same cell failed, as the BFP signal was problematic for colocalization studies.

      Reviewer #2 (Significance):

      This study provides compelling evidence that in the fungus Aspergillus nidulans, some transmembrane transporter proteins reach the plasma membrane by a pathway that bypasses much of the conventional machinery associated with the Golgi apparatus and secretory vesicles. Although previous publications pointed toward a similar conclusion, the present work tackles the problem in a more rigorous and systematic way. These findings are important for cell biologists who study membrane traffic; it remains to be determined how prevalent this type of non-canonical secretion might be in other organisms.

      We thank the reviewer for his positive comments

      Reviewer #3

      The manuscript by Sagia et al compares the trafficking of a polarized (SynA) with a non-polarized (UapA) transmembrane protein. In agreement with previous work of the same lab, they find that UapA reaches the plasma membrane through a Golgi-bypass route, which they characterize to some extent. Overall, the data are of good quality and the story is interesting and timely. Understanding trafficking routes that bypass the Golgi is highly interesting. Nevertheless, there are several points of criticism that I have and below is a list where I combine major and minor points together:

      Thank you for your positive comments

      Major Comments:

      1- Is it possible that the polarized phenotype of SynA is caused by selective removal, i.e. SynA is delivered to the entire plasma membrane, but endocytosed rapidly from all areas except the tip of the hyphae. This would also result in a polarized distribution.

      This is in principle possible, but here this is not the case. SynA is polarized due to rapid local endocytosis and immediate recycling at the subapical region, known as the subapical collar. Please see:

      Taheri-Talesh N, Horio T, Araujo-Bazán L, Dou X, Espeso EA, Peñalva MA, Osmani SA, Oakley BR. The tip growth apparatus of Aspergillus nidulans. Mol Biol Cell. 2008 Apr;19(4):1439-49. doi: 10.1091/mbc.e07-05-0464.

      Hernández-González M, Bravo-Plaza I, Pinar M, de Los Ríos V, Arst HN Jr, Peñalva MA. Endocytic recycling via the TGN underlies the polarized hyphal mode of life. PLoS Genet. 2018;14(4):e1007291. Published 2018 Apr 2. doi:10.1371/journal.pgen.1007291

      This applies to all apical markers; they remain polarized by continuous local recycling after they diffuse laterally to the subapical collar.

      2- The authors describe the distribution of SynA and UapA in cells deficient of various COPII/ERES proteins. However, these data are not shown, and it is not clear how they were quantified. It would be important to add quantitative data here.

      Quantitative data are included in Figure 4C, displaying the percentages of cells with UapA either retained in the ER or reaching the PM for each background deficient in a COPII protein. Repression of SarA and Sec31 resulted in UapA retention in the ER in all analyzed cells (100%). However, repression of Sec12, Sec24, or Sec13 had a differential effect across the cell population, with UapA reaching the PM in some cells, while remaining trapped in the ER in others. To quantify these data and determine which cargo localization pattern prevails, we measured the number of cells in each category and represented them as percentages. A similar approach was used to examine the role of Golgi proteins in the trafficking of UapA and SynA (Figure 6).
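The per-cell scoring described above can be summarized with a short calculation. The sketch below assumes that each analyzed cell has already been assigned one localization label by eye; the labels "PM" and "ER", the function name `localization_percentages`, and the example counts are hypothetical and are not data from the manuscript.

```python
from collections import Counter


def localization_percentages(cell_labels):
    """Convert one localization label per scored cell into the percentage
    of cells in each category (hypothetical helper, for illustration)."""
    counts = Counter(cell_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one scored cell is required")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Made-up scoring of four cells: cargo at the PM in three, retained in the ER in one.
scored_cells = ["PM", "PM", "PM", "ER"]
print(localization_percentages(scored_cells))
```

Reporting the number of cells scored alongside each percentage helps readers judge how robust a partial phenotype is.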

      3- on page 8, the authors discuss the discrepancy regarding the role of Sec13. They offer as an explanation that the previous studies have been performed in strains that separately expressed the two cargoes. However, I am unable to see why and how this would be a valid explanation.

      Given that Sec13 has a variable/partial effect on UapA, we were previously biased towards images that showed the expected effect on localization, and considered that the lack of an effect might have been due to inefficient repression in a fraction of cells. In our new system, we were able to directly compare UapA to SynA and found that while SynA was always affected under our conditions, the effect on UapA was still variable. Thus, the partial effect of Sec13 on UapA is physiologically valid and not a matter of insufficient repression in a fraction of cells. This shows the importance of our new improved system, where we follow the synchronous expression of two cargoes in the same cells.

      4- Why is the effect of Sec24 depletion so much stronger than of Sec12 depletion? Sec12 is the GEF for SarA, without which Sec24 should not be recruited to ERES. The explanation offered is that low amounts of Sec12 are still present and sufficient to carry out the role of this protein. What is the evidence for that?

      Sec24 is the principal receptor of cargoes responsible for their recruitment to ERES. Sec12 is the catalytic effector for SarA required for the initiation of COPII vesicle formation. The question of the reviewer is thus logical.

      However, Sec12 is indeed present at extremely low levels when expressed from its native promoter under the conditions of our experiment (minimal media). This is supported by our recent proteomic analysis, performed under similar conditions, which failed to detect the Sec12 protein, unlike all other COPII components (see Dimou et al., 2021, doi: 10.3390/jof7070560), but also by cellular studies of the group of M.A. Peñalva, who failed to detect Sec12 tagged with GFP (Bravo-Plaza et al., 2019, doi: 10.1016/j.bbamcr.2019.118551). Additionally, in yeast, immune detection of Sec12 has been possible only in cells harboring sec12 on a multicopy plasmid, suggesting its low abundance in wild-type cells (Nakano et al., 1988, doi: 10.1083/jcb.107.3.851).

      The fact that repression of sec12 transcription via the thiAp promoter still allows 68% of cells to secrete both SynA and UapA normally, while 32% of cells are blocked in the trafficking of both cargoes, suggests that in most cells either SarA can catalyze the exchange of GDP for GTP without Sec12, maybe through a cryptic guanine nucleotide exchange factor (GEF), or that very small amounts of Sec12 remaining after repression are sufficient for significant SarA activation. Whichever scenario is true, Sec12, similarly to SarA, is not critical for distinguishing Golgi-dependent from Golgi-independent routes, as both cargoes are affected similarly. In the revised text we added a note on this issue.

      5- In Figure 5, it would help readers who are not so familiar with Aspergillus organelle morphology to explain the figure a bit better. This might appear trivial for experts, but anyone from outside this field is slightly lost.

      In the revised manuscript we added a figure panel depicting a schematic representation of A. nidulans key secretory compartments.

      6- The authors write that not seeing UapA in Golgi membranes is evidence that it does not pass through this organelle. However, when they write that SynA is never seen in cis-Golgi elements, they do not conclude that SynA bypasses the cis-Golgi.

      The fact that SynA, unlike UapA, colocalized significantly with late-Golgi/TGN and follows conventional secretion in general strongly suggests that SynA also passes through the early-Golgi. Cargo traffic through the Golgi is mediated by cisternal maturation, where an individual cisterna gradually changes its nature from an earlier to a later one, while the cargo remains inside. UapA, unlike SynA, never colocalized with any Golgi marker used and was not affected by BFA. We agree with the reviewer that we did not have direct proof for passage of UapA or SynA through the early-Golgi in the wt background, which allows for the alternative, but rather unlikely, hypothesis that neither of the two cargoes is sorted to the early Golgi and that SynA traffics directly to late-Golgi/TGN. Our inability to detect sorting of any cargo to the early-Golgi is seemingly due to ultra-fast passage of cargoes through very early secretory compartments, such as ERGIC/early-Golgi. In fact, we have obtained evidence of this using Lattice Light Sheet microscopy (results in progress, to appear elsewhere).

      7- Figure 5C: the authors claim that CopA and ArfA affect trafficking of UapA and SynA from the ER to the plasma membrane and assign CopA and ArfA as regulators for anterograde trafficking. I think this interpretation is not justified by the data. Depletion of CopA and ArfA will affect the Golgi apparatus in structure and function. The more straightforward interpretation is that repression of the COPI machinery results in a defect in Golgi exit and therefore retention in pre-Golgi compartments (including the ER and maybe the ERGIC, should it exist in Aspergillus). The same is true for BFA treatment, where there are also negative effects on ER export, which are rather indirect consequences of alterations of Golgi function and integrity. Likewise, the interpretation of the papers by Weigel et al and Shomron et al is not correct. It is more likely that COPI is recruited to the growing ERES-derived tubule (or ERGIC) to recycle proteins back to the ER. This is not necessarily a proof that COPI regulates anterograde trafficking.

      This is a highly debatable issue which our work cannot address. However, we amended the text accordingly.

      8- Figure 6: The images look like in Figure 5, yet here you don't call them ER-associated.

      The two images are not alike. In Figure 5, upon activation of Sec31 (permissive temperature) we detect mostly punctate structures resembling ERES, whereas at the nonpermissive temperature we detect a membranous network typical of the ER. Upon repression of CopA we also detect punctate structures similar to ERES. In Figure 6, we mostly detect an effect on SynA. Repression of early secretory steps (SedV, GeaA) leads to collapse of SynA in the entire ER network. Repression at later stages of Golgi maturation and post-Golgi secretion (RabO, HypB, RabE, AP-1) leads to the appearance of punctate structures, most probably Golgi aggregates.

      9- Figure 6D: How long was the BFA treatment. I am surprised that the pool of SynA preexisting at the plasma membrane seems to also be sensitive to BFA.

      Cells were grown overnight under repressed conditions for both UapA and SynA. After 12-14h, cells were shifted to derepressed conditions using fructose as carbon source. BFA was added after 90min of cargo derepression, while both cargoes were still in cytoplasmic structures, so there was no preexisting SynA or UapA at the PM (see also Figure 1C). Subcellular localization of both cargoes was studied for 60min after BFA treatment.

      10- This might be beyond the scope of this study, but as far as I know UapA is not N-glycosylated. Would the introduction of an N-glycosylation site shift it towards the Golgi-based route?

      Thank you for this suggestion. We have performed this experiment, adding a glycosylation site on UapA, based on the glycosylation sites found in its mammalian homologues. We did not detect any effect on the UapA trafficking route or its activity. As the reviewer recognizes, this goes beyond the scope of this study and thus we did not include it in the manuscript. Differential cargo glycosylation is, however, an important issue with respect to different trafficking routes, and we plan to investigate it systematically.

      Minor Comments

      1- This might be just a personal preference, but I think that the term polar is misleading, because it implies something about the polarity of the amino acids. I think "polarized" might be the more common term. Anyway, this is just a minor point and just a suggestion from my side.

      Amended in the revised text.

      2- The paper by the Saraste lab should be mentioned and discussed (PMID: 16421253), which I think is very relevant to the current story.

      We thank the reviewer for pointing out this important publication. In that case, the Rab1 GTPase defined a pathway connecting a pre-Golgi intermediate compartment with the PM in mammalian nerve cells. Thus, the Saraste lab publication is indeed along the lines of findings supporting that Golgi-independent unconventional cargo trafficking routes initiate at very early secretory compartments. Notice, however, that RabO, the A. nidulans homologue of Rab1, which in their case was essential for direct cargo sorting from the ERES/ERGIC to the PM, was dispensable for Golgi bypass in our system. The Saraste lab article is now mentioned and discussed.

      3- Having worked with ERES for over two decades, I find it strange to see it written ERes. I see no reason why ER exit sites in Aspergillus should be abbreviated differently from all other types of cells (yeast, drosophila, worms, mammals). I think that the entire acronym should be capitalized.

      Amended in the text

      4- When discussing the data about the partial effect of Sec13, it would be good to refer to a previous paper by the Stephens lab that showed that silencing Sec13/31 results in a defect in trafficking of collagen, but not of VSVG (PMID: 18713835).

      We thank the reviewer for also pointing out the publication of the Stephens lab, now mentioned in the revised text. Noticeably, in that case silencing of both Sec13 and Sec31 has no effect on the trafficking of specific cargoes, whereas in our case Sec31 is still absolutely needed for both conventional and Golgi-independent secretion of SynA and UapA, respectively.

      Reviewer #3 (Significance):

      Overall, the data are of good quality and the story is interesting and timely. Understanding trafficking routes that bypass the Golgi is highly interesting. The main weakness is the lack of mechanistic understanding of the Golgi-bypass pathway. In addition, the study is limited to two proteins as representatives of polarized vs. non-polarized proteins. The main target audience for this paper are scientists working in the area of secretion and trafficking in the secretory pathway.

      We thank the reviewer for his positive comments.

      We are aware that the mechanistic details of Golgi bypass are missing, and dissecting them is our next goal, via various genetic and biochemical approaches and the employment of super-resolution and ultra-fast microscopy.

      Reviewer #4

      In this study, Sagia et al investigate the trafficking of different secretory cargo in Aspergillus nidulans under conditions that repress expression of transport factors or block stages in membrane trafficking. The primary approach is to conduct dual live-cell imaging of GFP-tagged UapA (plasma membrane localized purine transporter) and SynA (plasma membrane R-SNARE) after their simultaneous derepression to monitor trafficking routes. In germlings, both secretory proteins are detected in non-overlapping intracellular compartments and puncta after 60-90 min of derepression. After 4-6 hrs, SynA localizes to hyphal tips whereas UapA localizes to non-polar regions of the PM. Colocalization studies do not show UapA overlap with Golgi markers (SedV, PH-OSBP) during its biogenesis whereas SynA displays significant co-localization. Repression of COPII and COPI components generally block transport of both cargos to the PM and cause accumulation in ER compartments, although there are some differential effects on UapA and SynA localization. Finally, repression of other transport factors (ER-Golgi SNAREs, Golgi transport factors, and exocytic machinery) had differential effects on UapA and SynA localization over time with UapA reaching the plasma membrane in many instances and SynA accumulating in intracellular compartments.

      Based on these observations, the authors conclude that UapA and SynA follow distinct trafficking routes to the plasma membrane where SynA uses a canonical SNARE-dependent secretory pathway route and UapA follows a non-canonical route that may bypass Golgi compartments. The study is extensive and supports the model that biogenesis of SynA and UapA follow distinct processes. However, there are some complexities that may limit interpretation. First, the cargo studied are targeted to the ER differently. UapA is a multispanning transmembrane protein that is likely dependent on the Sec61 translocon for co-translational membrane insertion and will involve ER chaperones and quality control machinery for its biogenesis. SynA will depend on the tail-anchored machinery (GET/TRC pathway) for insertion into the ER and is processed by cytosolic factors/chaperones. Therefore, the sites of ER insertion and the rates of biogenesis of these cargoes will be different. In addition, the repression of trafficking machinery used in this study appears to be variable and may exert partial blocks on intracellular transport stages. Regardless, the study clearly documents that SynA and UapA follow distinct biogenesis and transport processes when co-expressed in cells under experimentally controlled conditions.

      Thank you for your positive comments.

      To our knowledge, there is no evidence suggesting that SynA translocates via the tail-anchored machinery (GET/TRC pathway) rather than through the translocase. Despite this, we agree with the reviewer that translocation to the ER, as well as exit from it, might be cargo-dependent, especially for proteins that differ greatly in size, structure and oligomerization. Thus, the rates of biogenesis of UapA and SynA are probably quite different. However, this does not undermine our basic conclusion that the two cargoes follow distinct routes to the PM. The ‘problem’ of variable transcriptional repression of some trafficking-related proteins is solved by comparing the relative effect on the two cargoes in the same cells, and this is in fact the advantage of our new system. Importantly, notice that we took care to use conditions of repression where SynA trafficking by the conventional path was totally abolished and compared it to UapA.

      1. It was not clear if the translation, ER insertion and folding of UapA and SynA are fully synchronous. Is it possible that the rate of UapA synthesis and transport to the plasma membrane is substantially faster than for SynA? The imposition of transport blocks could trap SynA and not UapA if this cargo was at later transport stages.

      As already discussed above, translation, ER insertion and folding of UapA and SynA might indeed be different. This might somehow affect the trafficking path followed, but this issue is beyond the scope of this work. Notice, however, that the transcription of both cargoes is kept fully repressed during establishment of repression of secretion. Only when repression and blocking of secretion is established (12-14 h germination), as verified by Western blot analysis, do we derepress the transcription of UapA and SynA, expressed from the same promoter, and follow their dynamic subcellular localization. Hence, this system ensures that both cargoes start from the earliest transport stage, the ER, upon imposition of transport blocks.

      2. In repressing transport factors (e.g., SarA, Sec12, Sec24, Sec13, SedV, RabE), it is clear that under thiamine repressing conditions these cells do not grow or have greatly reduced growth rates. However, it was not clear if proteins are depleted to the same extent in cells after repression for 12-14 hr or 16-22 hr, as mentioned in the methods. Indeed, in some cases depleted cells display different cargo localization patterns, for example 67% of cells show normal localization of UapA and SynA after sec12 repression and 33% show ER accumulation of both cargoes. There is differential localization of UapA and SynA in many cases where transport factors are repressed, but this could be due to partial inhibition and not complete blocks. It would be helpful to clearly indicate the time points and conditions in each of the figure legends as in points 3-5 below.

      In the revised manuscript we did our best to clearly indicate the time points and conditions in each of the figure legends. Differential localization of UapA and SynA in many cases where trafficking factors are repressed is indeed an interesting outcome. Inefficient repression was ruled out based on the lack of colony growth (see relative growth tests of SarA, Sec24, Sec13, Sec31, SedV, GeaA, RabO, RabE, Ykt6, Sft1, SsoA and Sec9), but also by western blots (e.g., Sec24, Sec13, Sec31 or Sec9 shown in the present manuscript, or other trafficking proteins studied previously; Martzoukou et al., 2018; Dimou et al., 2020). Repression of Sec12 and HypB, and to a lesser degree AP-1, allowed formation of small and/or compact colonies, but even in these cases the respective proteins could not be detected in western blots, confirming efficient repression.

      3. In Fig 4A immunoblot, HA-tagged proteins are not detected after thiamine repression. Please state the time of thiamine repression used before protein extraction and blot. Is this for the same length of time as for cells shown in panel 4C? It would also be helpful to state the time of cargo derepression before capturing images in 4C. The methods section mentions 12-14 hr or 16-22 hr of growth, presumably with thiamine in the culture, and then 1-8 hr or 60 min to 4 hr of cargo derepression before imaging. Please specify.

      The time of thiamine repression before protein extraction was 16-18h. The same repression time was used for experiments shown in Figures 4C and 6C (ER/COPII and Golgi/post-Golgi repression respectively). More specifically, for microscopy experiments cells were grown in the presence of glucose and thiamine for 12-14h (repressed UapA/SynA and thiAp expressed gene). After this time, cells were shifted to fructose and thiamine for 4h (derepression of UapA/SynA and repression of thiAp expressed gene). In both cases (protein extraction and microscopy experiments) the total time of thiamine repression was 16-18h.

      4. For the thiA-copA and thiA-arfA repression experiments (Fig 5C), the methods section states that thiamine was not added ab initio in the culture, but after an 8 h time window without thiamine at the start of spore incubation. This is interpreted to mean that repression was for a shorter period of time than the 12-14 hr overnight growth. However, the figure legend states that De novo synthesis of cargos takes place after full repression of CopA and ArfA is achieved (>16 hr). Please clarify.

      We think that the reviewer confused repression of cargo synthesis (via alcAp + glucose) with repression of trafficking proteins (via thiAp + thiamine). Please see Materials and methods. We also clarify our protocol here:

      For the thiAp-copA and thiAp-arfA repression experiments, addition of thiamine ab initio in the culture leads to total arrest of spore germination and germling formation. Thus, we added an 8-hour time window without thiamine to allow conidiospores to germinate until the stage of young germlings, under conditions where cargo expression via the alcAp was repressed by glucose. Subsequently, thiamine was added to the media (16-18 h) to repress CopA and ArfA, while cargo expression remained glucose-repressed. The transcriptional repression of the cargoes UapA and SynA was therefore maintained for a longer period (24-26 h) than in other repression experiments, but longer repression of cargoes does not make any difference, as full repression is already achieved at 12 h. De novo cargo trafficking was followed the next day by eliciting derepression, via a shift to fructose media, while still maintaining thiamine to repress CopA or ArfA.
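      As a quick consistency check on the timings quoted in these responses, the durations can be summed as (min, max) ranges. This is only a minimal Python sketch and not part of the authors' protocol: the helper `add_ranges` and all variable names are hypothetical, and the hour values are simply those stated in the text.

```python
# Illustrative arithmetic only; names are hypothetical, hours are taken from the responses.
def add_ranges(*ranges):
    """Sum (min, max) duration ranges given in hours."""
    return (sum(r[0] for r in ranges), sum(r[1] for r in ranges))

# Standard protocol: 12-14 h glucose + thiamine, then 4 h fructose + thiamine.
standard_thiamine_repression = add_ranges((12, 14), (4, 4))

# copA/arfA protocol: 8 h without thiamine, then 16-18 h with thiamine,
# with cargo expression glucose-repressed throughout.
copa_arfa_cargo_repression = add_ranges((8, 8), (16, 18))

print(standard_thiamine_repression)  # (16, 18)
print(copa_arfa_cargo_repression)    # (24, 26)
```

      Under these assumptions, the sums reproduce the 16-18 h total thiamine repression and the 24-26 h cargo repression quoted in the responses.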

      5. In Fig 6D, BFA treatment is shown to trap SynA in Golgi aggregates while UapA still reaches the plasma membrane. Please state the time of BFA treatment before collecting these images. Do longer treatments with BFA before cargo derepression cause accumulation of UapA in intracellular compartments?

      As mentioned above (response to Reviewer #3’s comment 9), cells were grown overnight under repressed conditions for both UapA and SynA. After 12-14 h, cells were shifted to derepressed conditions using fructose as carbon source. BFA was added after 90 min of cargo derepression, while both cargoes were still in cytoplasmic structures, so there was no preexisting SynA or UapA at the PM (see also Figure 1C). We have not noticed any different effect on UapA trafficking after a maximum of 1 h of BFA treatment.

      6. A minor point, but on page 21 the methods state that "cells were shifted down to the permissive temperature (25 C), to restore the secretory block...". Suggest changing to "to reverse the secretory block..."

      Modified accordingly

      Reviewer #4 (Significance):

      This manuscript nicely builds on a developing line of investigation in the Aspergillus nidulans model that specific plasma membrane proteins are efficiently delivered to the cell surface in a pathway that is distinct from the canonical secretory pathway. Previous work from this lab has suggested that a subpopulation of COPII carriers can bypass the Golgi for delivery of specific cargo to the plasma membrane. The current study uses dual expression of UapA-GFP and mCherry-SynA to provide further support for this model. Molecular definition of a direct ER to PM transport pathway for secretory cargo would be a significant advance to a broad audience. This study provides additional depth and support that such a pathway exists but does not define how COPII vesicles or related intermediates are transported to the PM.

      Again, thank you for your positive comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to reviewers (minor points):

      We thank all reviewers for their very helpful suggestions and greatly appreciate their positive evaluation of our work.

      Reviewer #1:

      Ad 1) The reviewer states: Fig 5 While the data very nicely show that CPX and Syt1 have interdependent interactions in the chromaffin neurons, this seems to be not the case in neurons, where the loss of complexins and synaptotagmins have additive effects, suggesting independent mechanisms (eg Xue et al., 2010). This would be a good opportunity to discuss some possible differences between secretion in endocrine cells vs neurons.

      We greatly appreciate the insightful suggestion by the reviewer. To accommodate the reviewer’s suggestion, we now discuss this issue on page 21, line 486-491: “In murine hippocampal neurons, loss of CpxI and Syt1 has additive effects on fast synchronous release, suggesting independent mechanisms (Xue et al., 2010). On the other hand, the same study also showed that Syt1 heterozygosity fails to reduce release probability in wild-type neurons, but does so in the absence of Cpx, again suggesting that Cpx and Syt1 may functionally interact in Ca2+-triggered release.”

      Ad 2) The reviewer states: Fig 8 Shows an apparent shift in Ca sensitivity in N-terminal mutants suggesting a modification of Ca sensitivity of Syt1. Could there be also an alternative mechanism, that explains this phenotype which is based on a role of the n-term lowering the energy barrier for fusion, that in turn shifts corresponding fusion rates to take place at lower Ca saturation levels?

      We fully agree with the reviewer. While our data indicate that Cpx and Syt1 act in a dependent manner in accelerating exocytosis, they do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby, lowering the energy barrier of Ca2+-triggered fusion.”

      Reviewer #2:

      Ad 1) The reviewer states: The authors provide a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. With the exception of mammalian renal ribbon synapses (and some earlier RNAi knockdown studies that had off-target effects), there is very little evidence for a "fusion-clamp"-like function of Cplxs in mammalian synapses. At conventional mammalian synapses, genetic loss of Cplx (i.e. KO) consistently decreases AP-evoked release, and generally either also decreases spontaneous release rates or does not affect spontaneous release, which is inconsistent with a "fusion-clamp" theory. This is in stark contrast to invertebrate (D. m. and C. e.) synapses where genetic Cplx loss is generally associated with strong upregulation of spontaneous release, providing support for Cplx acting as a "fusion-clamp".

      We agree with the reviewer that it is difficult to reconcile contradictory findings regarding the role of Cpx in membrane fusion in vertebrates and invertebrates or between murine hippocampal neurons and neuroendocrine cells. On the other hand, we respectfully disagree with the statement that we provide a "chromaffin cell-centric" view of the function of mammalian Cplx in vesicle fusion. In fact, a large number of model systems (in vitro and in vivo studies) support a scenario where complexin takes center stage in clamping of premature vesicle release. For example, in vitro analyses using a liposome fusion assay (Schaub et al., 2006, Nat Struct Mol Biol 13, 748; Schupp et al., 2016) or HeLa cells that ectopically express “flipped” SNAREs on their cell surface (Giraudo et al., 2008, JBC 283, 21211) showed that complexin can inhibit the SNARE-driven fusion machinery. Likewise, several studies boosting complexin action by either genetic overexpression or peptide supplementation have provided evidence for the complexin clamp function in neuronal and nonneuronal cells (e.g. Itakura et al., 1999, BBRC 265, 691; Liu et al., 2007, Biochemistry 72, 439; Abderrahmani et al., 2004, J Cell Sci 117, 2239; Archer et al., 2002, JBC 277, 18249; Tang et al., 2006, Cell 126, 1175; Vaithianathan et al., 2013, J Neurosci 33, 8216; Roggero et al., 2007, JBC 282, 26335).

      In addition, chromaffin cells enable the investigation of secretion on the background of a well-defined intracellular calcium concentration. Indeed, CplxII knock-out in chromaffin cells demonstrated an enhanced tonic release which is evident at elevated levels of [Ca]i (>100nM), but absent at low resting [Ca]i (Dhara et al., 2014). Given this observation, it is tempting to speculate that variations in [Ca]i among the different preparations may contribute to the deviating expression of the complexin null phenotype in different preparations.

      Ad 2) The reviewer states: The authors use a Semliki Forest virus-based approach to express mutant proteins in chromaffin cells. This strategy leads to a strong protein overexpression (~7-8 fold, Figure 3 Suppl. 1). Therefore, experimental findings under these conditions may not necessarily be identical to findings with normal protein expression levels.

      As shown in Fig. 4, we use the secretion response of wt cells as a control so that we can assess the specificity and quality of the rescue approach in our experiments. In addition, the comparative analysis of the CpxII mutants was performed with respect to the equally overexpressed CpxII wt protein (Fig. 3 Suppl. 1), which we used as a control to determine the standard response under these conditions.

      Ad 3) The reviewer states: Measurements of delta Cm in response to Ca2+ uncaging by ramping [Ca2+] from resting levels up to several µM over a time period of several seconds were used to establish changes in the release rate vs [Ca2+]i relationship. It is not clear to this reviewer if and how concurrently occurring vesicle endocytosis together with a possibly Ca2+-dependent kinetics of endocytosis may affect these measurements.

      By infusing bovine chromaffin cells with 50 µM free Ca2+, Smith and Betz showed that the total capacitance increase is dominated by exocytosis and that significant endocytosis only sets in after 3 minutes (Smith and Betz, 1996, Nature, 380, 531). Along the same lines, we previously showed that mouse chromaffin cells (infused with 19 µM free calcium over 2 minutes) responded with a robust increase in membrane capacitance, which strongly correlated with the number of simultaneously recorded amperometric events monitoring fusion of single vesicles (Dhara et al., 2014, Fig. 5B). Thus, capacitance alterations recorded under tonic intracellular Ca2+ increase in chromaffin cells are solely due to exocytosis and are not contaminated by significant endocytosis. As our Ca2+ ramp experiments were carried out for 6 seconds and the intracellular free [Ca]i did not exceed 19 µM, the observed phenotypical differences between the experimental groups are most likely due to changes in exocytosis rather than endocytosis.

      Ad 4) The reviewer states: It should be pointed out that an altered "apparent Ca2+ affinity" or "apparent Ca2+ binding rate" does not necessarily reflect changes at Ca2+-binding sites (e.g. Syt1).

      We fully agree with the reviewer’s comment. As pointed out also in the response to reviewer 1, our experiments do not provide decisive evidence that the NTD of CpxII directly modulates the Ca2+ affinity of Syt1, an issue that we discuss on page 23, lines 523-529: “The results favor a model wherein the CpxII NTD either directly regulates the biophysical properties of the Ca2+-sensor by increasing the apparent forward rate of Ca2+-binding or indirectly affects SytI-SNARE or SytI-membrane interactions, thereby, lowering the energy barrier of Ca2+-triggered fusion.”

      Ad 5) There are alternative models on how Cplx may "clamp" vesicle fusion (see Bera et al. 2022, eLife) or how Cplx may achieve its regulation of transmitter release without mechanistically "clamping" fusion (Neher 2010, Neuron). Since the data presented here cannot rule out such alternative models (in this reviewer's opinion), the authors may want to mention and briefly discuss such alternative models.

      The study by Bera et al. reiterates the model proposed by the Rothman group, which attributes the clamping function of Cpx to its accessory alpha helix hindering progressive SNARE complex assembly. We explicitly stated this issue in the original version of the manuscript (page 19, line 425): “As the accessory helix of Cpx has been found to bind to membrane proximal cytoplasmic regions of SNAP-25 and SybII (Malsam et al., 2012; Bykhovskaia et al., 2013; Vasin et al., 2016), an attractive scenario could be that both domains of CpxII, the CTD and the accessory helix, synergistically cooperate to stall final SNARE assembly”. In this context, we will now also cite the study by Bera et al.

      A related view of the function of complexin suggested that it may act as an allosteric adaptor for sytI (Neher 2010, Neuron). Here, rather than postulating independent "clamp" and "trigger" functions for the dual action of complexin, these were explained as facets of a simple allosteric mechanism by which complexin modulates the Ca2+ dependence of release. Yet, this interpretation appears to be difficult to reconcile with the observations from our and other laboratories, showing that the fusion-promoting and clamping effects are separable (e.g. Dhara et al., 2014; Lai et al., 2014; Makke et al., 2018; Bera et al., 2022).

      Some parts of the Discussion are quite general and not specifically related to the results of the present study. The authors may want to consider shortening those parts.

      Considering the contrary findings in the field of SNARE-regulating proteins, the authors hope that the reviewer will agree that it is necessary to discuss the new observations in a broader context, as also acknowledged by the first reviewer.

      Last but not least, the presentation of the results could be improved to make the data more accessible to non-specialists, this concerns providing necessary background information, choice of colors, and labeling of diagrams.

      Done

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      Regarding figures: 

      (1) Please use clearly distinct colors in diagrams. For example, in Figure 2 Suppl. 3, four different shades of red (or reddish) are used to color the traces and the respective bars. These different shades of red are difficult to discriminate. In Figure 5 Suppl. 1, the two greens are nearly indistinguishable.  

      Done

      (2) RRP size and SRP size on the one hand, and SR rate on the other represent different quantities which are measured in different units. Please use a separate y-axis for the SR (a rate measured in fF/s) and do not combine with RRP and SRP (pool sizes measured in fF). This would also automatically alleviate the need for axis breaks in the plots of RRP size and SRP size. In general, please do not use axis breaks which make interpretation of data unnecessarily more complicated.  

      In order to clarify the display, we now define the different units together with the quantified parameter (e.g. RRP [fF], SRP [fF], SR [fF/s]) allowing us to omit a second axis in those subpanels.

      (3) When plotting bar graphs showing mean tau_RRP, mean tau_SRP, and mean delay, please always use the correct y-axis labels, i.e. use "tau_RRP", "tau_SRP" and "delay" as y-axis labels as it was done for example in Figure 4D, and do not use "tau_RRP", "tau_SRP" and "delay" as x-axis labels as it was done for example in Figure 1D and many other figure panels.  

      We have standardized the figure display. Yet, we would prefer to keep our way of subpanel labelling, which states the parameter underneath the bar graph and thereby makes the results more accessible.

      (4) Are the asterisks indicating statistical significance perhaps missing in Figure 4D, middle panel (tau_SRP)?

      There was no statistically significant difference (wt vs cpxIIko+CpxII EA, P=0.0826, Kruskal-Wallis with Dunn’s post hoc test).

      (5) According to the Results section (pages 12 to 13), I assume that in Figures 6 and 7 the labels "+Cplx XYZ" are used by the authors to identify an overexpression of Cplx XYZ in a Cplx WT background. The legend text reads however " ... cells expressing either Cplx2 wt or the mutant ...", which would not be correct. Please check.

      We have changed the formulations to “overexpression” accordingly.

      (6) The x-axis unit in Figure 8C is likely "µM" and not "M".

      Done.

      (7) The abbreviations "CplxII LL-EE" and "CplxII LL-WW", and "CplxII LLEE" and "CplxII LLWW" are very similar but refer to different mutants. Could you please think of a more specific and unambiguous abbreviation? Perhaps "CplxII L124E-L128E"?  

      We have changed the abbreviations accordingly (i.e. CpxII L124E-L128E).

      Regarding the manuscript text:  

      Line 65: "prevents" instead of "impairs"? 

      done

      Line 67: why "in vivo"? 

      We changed the formulation to ‘Several’

      Line 83: "in addition to the clamping function ..." This is misleading. Many of the studies listed here did not provide evidence for enhanced spontaneous release following Cplx loss and often observed the opposite, reduced spontaneous release. The enhanced delayed release was observed by Strenzke et al 2009 J.Neurosci. and by Chang et al. 2015 J.Neurosci. (which the authors may want to cite). However, that enhanced delayed release occurred despite reduced spontaneous release indicating that it is not simply the result of a missing "fusion clamp". 

      To accommodate the reviewer’s suggestion, we have changed the formulation to “Independent of the clamping function of Cpx….”

      Line 104: "speeds up exocytosis that is controlled by the forward rate of Ca2+ binding" This is difficult to understand without context.  

      We have now added the corresponding citations (Voets et al., 2001; Sorensen et al., 2003), which showed that exocytosis timing in chromaffin cells is largely determined by the kinetics of Ca2+-binding to SytI.

      Line 116: "Cplx2 knock out ..." Please provide (here or earlier in the manuscript) information to the reader about which Cplx paralogs are expressed in chromaffin cells.  

      We now state on line 111 that “CpxII is the only Cpx isoform expressed in chromaffin cells (Cai et al., 2008)”

      Line 118: "=~" either "=" or "~". 

      done

      Line 120: "instead" seems superfluous.

      done

      Line 272: "calcium binding rates" should perhaps better read "apparent calcium binding rates". 

      done

      Line 290: "enhancing SytI's Ca2+ affinity" should perhaps better be "enhancing the apparent Ca2+ affinity of the release machinery". Ca2+ binding kinetics is never directly assayed here.

      We agree and have phrased the sentence accordingly.

      Line 300: "Expression of Cplx ... in Syt1 R233Q ki cells, ..." Perhaps better "Overexpression of Cplx ... in Syt1 R233Q ki/Cplx2 wt cells, ..." for clarification?

      done

      Lines 313ff: What is assayed here is the apparent Ca2+ binding kinetics and apparent KD values of the release machinery. Ca2+ binding to Syt1 is never directly measured!  

      We agree and have changed the wording accordingly to “CpxII NTD supports the forward rate of calcium binding to SytI in accelerating exocytosis”

      Line 347: "Complexin plays a dual role ..." This is partially misleading. It does so in chromaffin cells and D.m. and C.e. NMJs but not at conventional mammalian synapses. 

      We agree and have changed the formulation to “In many secretory systems, Complexin plays a dual role in the regulation of SNARE-mediated vesicle fusion”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors): 

      The authors should perform experiments to answer this question: does Cav3 transcription increase in the G369i-KI, or is there instead some post-transcriptional modulation that permits surface expression of functional Cav3-containing channels in the absence of typical HVA Ca conductances? Also, the authors should determine whether G369i-KI can mediate Ca2+ release from intracellular stores and whether release from stores is upregulated as Cav3-containing channel expression (or function) is increased. 

      We performed transcriptomic (drop-seq) analysis to test whether a Cav3 subtype is upregulated in cones of G369i KI mice. These experiments show that, consistent with previous studies (PMID 35803735, 26000488), Cacna1h appears to be the primary Cav3 subtype expressed in mouse cones. However, as shown in new Supp.Fig.S3, there was no significant difference in the levels of Cacna1h transcripts in WT and G369i KI cones. Therefore, we propose that there may be some post-transcriptional modification, or alteration in a pathway that regulates channel availability, that enables the contribution of Cav3 channels to the whole-cell Ca2+ current in the absence of functional Cav1.4 channels in cones.

      We also performed Ca2+ imaging experiments in WT vs G369i KI cone terminals to assess whether the diminutive Cav3 current in G369i KI cone terminals may be compensated by upregulation of a Ca2+ signal such as from intracellular stores. Arguing against this possibility, depolarization-evoked Ca2+ signals in G369i KI cones were dramatically reduced compared to WT cones (new Fig.9). 

      Reviewer #2 (Recommendations For The Authors): 

      Major points- 

      (1) It is stated in too many places that cone features in the Cav1.4 knock-in are "intact", preserved, or spared, but this representation is not accurate. There are two instances in this study that qualify as intact when comparing KI to WT: 1) the photopic a-waves in the Cav1.4 knock-in (also demonstrated in Maddox et al 2020) and 2) latency to the platform (current MS, Figure 7f). However, in the numerous instances listed below, the authors compared the Cav1.4 knock-in to the Cav1.4 knock-out, and then referred to the KI as exhibiting intact responses. The reference point for intactness needs to be wildtype, as appropriately done for Figures 2 and 3, and when comparing the KI to the KO the phrasing should be altered; for example: "the KI was spared from the extensive degeneration witnessed in the KO....". 

      In most cases, we clearly note that there are key differences between the WT and the G369i KI cone synapses, which highlight the importance of Cav1.4-specific Ca2+ signals for certain aspects of the cone synapse. We disagree with the reviewer on the point that we did not often use the WT as a reference since most of our experiments involved comparisons of only WT and G369i KI (Figs. 3-6) or WT, G369i KI, and Cav1.4 KO (Figs. 1, 7; in these cases, comparisons specifically between WT and G369i KI mice were included). We used “intact” as a descriptor for G369i KI cone synapses since these are actually present, albeit abnormal, in the G369i KI retina, whereas cone synapses are completely absent in the Cav1.4 KO retina. To avoid confusion, we modified our use of “intact” and “preserved” where appropriate.

      A. Abstract, line 34 to 35: ".......preserved in KI but not in KO.". 

      Abstract was rewritten and this line was removed.

      B. Line 36: "....synaptogenesis remains intact". The MS documents many differences in the morphology of KI and WT cones (immunofluorescence and electron microscopy data), which is counter to an intact phenotype. 

      The sentence was: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the Ca2+-independent role of Cav1.4 in cone synaptogenesis remains intact.”

      Here the meaning of “intact” refers to the Ca2+ -independent role of Cav1.4, not synapses. Thus, we have left the sentence unchanged.

      C. This strikes the right balance, lines 67 to 68: "....although greatly impaired.....". 

      D. Line 149, "Cone signaling to a postsynaptic partner is intact in G369i KI mice". This description is inaccurate. Here there is only WT and KI, and the text reads as follows in line 162: "terminals (Figure 6b). The ON and OFF components of EPSCs in G369i KI HCs were measurable, although lower in amplitude than in WT (Figure 6a,b)." Neither "measurable" nor "lower in amplitude" meet the definition of "intact", and actual numerical values are lacking in the text. 

      We have added results showing that there are no light responses in the Cav1.4 KO horizontal cells and have modified the sentence to: “Cone synaptic responses are present in horizontal cells of G369i KI but not Cav1.4 KO mice”. 

      We have modified discussion of these results as (line 210-213): “Consistent with the lack of mature ribbons and abnormal cone pedicles (Fig.1), HC light responses were negligible in Cav1.4 KO mice (Fig.8a,b). In contrast, the ON and OFF responses were present in G369i KI HCs although significantly lower in amplitude than in WT HCs (Fig. 8a,b).”

      E. Please add a legend to Figure 6a to indicate the intensities. The shape of the KI responses is different from the control which is worthy of discussion: i) there is no clear cessation of HC EPSCs in the KI during the light ON period (when release stops, Im fluctuations should be minimal), and ii) the "peaked" appearances of the initial 500ms of the On and Off periods are very similar in shape for the KI (hard to interpret in the same fashion as a control response). How were the On and Off amplitudes analyzed? Furthermore, the OFF current is not summarized in Figure 6D, but should not this be when Cav3 should be opening and triggering release: Off response-EPSC? Lastly, Figure 6b,d shows a ~70% reduction in On-current in the KI, and the KI example of 6b an 80% reduction in Off current compared to WT. Yet, the only place asterisks are used to indicate sig diff is the DNQX data within each genotype in Fig 6d. These data cannot be described as showing "intact" KI responses, and the absence of numerical and statistical values needs to be addressed. 

      New Fig.8a depicting the horizontal cell light responses has been modified to include the legend indicating light intensities. The ON and OFF amplitudes were analyzed as the peak current amplitudes. This information has been added to the legend.

      The reviewer is correct in that the OFF response represents the EPSC whereas the ON response represents the decrease in the EPSC with light. To avoid confusion, we changed the y axis label for the averaged data to read ON or OFF “response” rather than “current” in new Fig.8b.

      As the reviewer suggests, the more transient nature of the KI response during the light ON period could result from aberrant continuation of vesicular release during the light-induced hyperpolarization of cones in the KI mice, in contrast to the prolonged suppression of release by light which is evident in the WT responses. We speculated on this difference as follows (lines 237-241):

      “In addition to its smaller amplitude, the transient nature of the ON response in G369i KI HCs suggested inadequate cessation of cone glutamate release by light (Fig.8b). Slow deactivation of Cav3 channels and/or their activation at negative voltages20 could give rise to Ca2+ signals that support release following light-induced hyperpolarization of G369i KI cones.”

      We added asterisks to new Fig.8b,d indicating statistical differences and a description of the tests in the legend.

      F. line 168 the section titled "Light responses of bipolar cells and visual behavior is spared in G369i KI but not Cav1.4 KO mice". 

      Changed to: “Light responses of bipolar cells and visual behavior are present in G369i KI but not Cav1.4 KO mice”

      Last sentence of ERG results, 189-190: "These results suggest that cone-to-CBC signaling is intact in G369i KI mice.". "Spared and intact" are not accurate descriptions. The ERG data presented here shows massive differences between WT and the KI, except in the instance of a-waves. 

      This sentence was removed.

      As for Figure 6, the results text related to Figure 7a-d does not present real numbers for ERG responses, and there is no indication of significant differences there or in the Figure panels. For instance, in Figure 7b, b-waves in KI are comparable to KO, except at the two highest-intensity flashes that show KI responses ~20% the amplitude of WT. Presentation of KI and KO data on a 6- to 10-fold expanded scale higher than WT can be misleading: a quick read of these Figure panels might make one incorrectly conclude that the KI is intact while the KO is impaired when compared to WT. The Methods section needs more details on the ERG analysis (e.g. any filtering out of oscillatory potentials when measuring b-wave, and what was the allowable range of time-to-peak for b-wave amplitude, etc..). 

      The vertical scaling of the ERG results in new Fig.10c,d has been changed so as to reflect clearly diminished responses of the KO and KI vs the WT. Further details regarding the ERG analysis were added to the Methods section.

      G. Can you point to other studies that have used the "visible platform swim test" used in Figure 7e, f, and specify further how mice were dark/light adapted prior to the recordings? 

      As referenced in the Methods, original line 674, the methods we used for the swim test were described in our previous study (PMID 29875267). Other studies that have used this assay include PMIDs: 28262416, 26402607.

      (2) The Maddox et al 2020 study does not safely address whether rods have a residual T-type Ca2+ current in the Cav 1.4 KO or KI. The study showed that membrane currents measured from rods in the KI and KO retina were distinct from WT, supporting their claim that L-type Ca2+ current is absent in the KI and KO. However, the recordings had shortcomings that challenge the analysis of Ca2+ currents: i) collected at room temp (22-24°C), ii) at an unknown distance from the terminal (uncertain voltage clamp), iii) with a very slow voltage ramp rate that is not suitable for probing T-type currents (Figure 1d Maddox 2020, 140 mV over 1 sec: 7msec/1mV), and iv) at a signal-to-noise that does not allow to resolve a membrane current under 1 pA (avg wt rod Ca2+ current was -3.5 pA, and line noise ~1pA peak-to-peak in Maddox 2020). Suggestion: say T-type currents were not probed in Maddox et al 2020, but Davison et al 2022 did not find PCR signal for Cav3.2 in rods. 

      We disagree that recordings in the Maddox 2020 study were not sufficient to uncover a T-type current. The voltage ramps in that study were not much slower than that of the Davison et al. 2022 study (they used 0.19 mV/ms). Moreover, in new Supp. Fig.S1, we show that like the slower voltage ramp (0.15 mV/ms) used in the prior study of G369i KI rods, the voltage ramps we used in the present study (0.5 mV/ms), which clearly evoke currents with T-type properties in G369i KI cones (Fig.2a,b, Fig.3a,b) do not evoke currents in WT or G369i KI rods.  

      Minor comments. 

      (1) Suggestion: add an overview panel to Figure 1 that shows the rod terminals in the KI. The problem is that cropping out the ribbon and active zone signals from rods, to highlight cones, can give the impression that the cones are partially spared in the KI, and the rods are not spared at all. (yet you nicely clarify this in Figure 4 and in the legend and text, etc.). 

      We chose to modify the legend with this information as in Fig.4 rather than modify the figure.

      (2) Mouse wt cone Ca2+ currents look like L-type currents, as do your monkey and squirrel cone recordings, and also much like those of mouse rods (see Figure S5, Hagiwara et al., 2018 or Grabner and Moser 2021). Your pharm data from mice and squirrels further supports your conclusion, and certainly took much effort. Davison et al 2022 J Neurosci showed PCR results that support their claim that a Cav3 current exists in wt cones. Questions: 1) have you tried PCR? 2) Can you offer more details on what Cav3 KO you tried and what antibodies failed to confirm the KO? As the authors know, one complication is that the deletion of one Cav can be compensated for by the expression of a new Cav. There are 3 types of Cav3s and removal of one type may be compensated for by another Cav3. 

      We have included drop-seq data (new Supp.Fig.S3) implicating Cav3.2 as the main Cav3 subtype in cones and have modified our discussion of these results accordingly. These experiments did not reveal any changes in Cav3 subtype expression in G369i KI vs WT cones.

      (3) Lines 95/96- onward, spend more time telling the story. When working out the biophysical and pharmacological behavior of the Ca2+ currents, you might want to initially refer to the membrane current as a membrane current, and then state how your voltage protocols, intra- and extra-cell solutions, and drugs helped you verify 1) L-type and 2) T-type Ca2+ currents. 

      We have modified the text with more detail.

      (4) If data is in hand, add a ramp I-V to Figure S2, which shows the response of the ground squirrel cone. The steps in S2a are excellent for making your point that a transient current is missing, and the bipolar is a great control to illustrate ML218 works. However, a comparison of a squirrel cone ramp to a bipolar ramp response could complete the figure. 

      See Response to #5 below.

      (5) Consider moving Supplementary Figures S2 and S3 to the main text; these are highly relevant to the story, novel, and well-executed. 

      Fig.S2 and S3 were added as new Figs.4,5. The new Fig.4 includes voltage ramps in ground squirrel cones (panel a) to compare with the bipolar data (panel f).

      (6) The nice electron microscopy reconstructions are not elaborated on in any detail, and there is no mention of ribbon size. Is the resolution sufficient to estimate ribbon size, the number of synaptic vesicles around the ribbon and in the adjacent cytosol? The images indicate major changes in the morphology of the terminals. Is the glial envelope similar in WT and KI? 

      Since ribbons were quantified extensively in the confocal analyses in Fig.6, we felt it unnecessary to add this to the EM analysis which focused mainly on aspects of 3D structure (i.e., arrangement of ribbons, postsynaptic wiring, cone pedicle morphology). We added further discussion of the change in morphology of the G369i KI cone pedicle (lines 200-203): “Compared to WT, ribbons in G369i KI pedicles appeared disorganized and were often parallel rather than perpendicular to the presynaptic membrane (Fig.7a-c). Consistent with our confocal analyses (Fig.1), G369i KI cone pedicles extended telodendria in multiple directions rather than just apically (Fig. 7a).”

      While we did not opt to characterize the glial envelope in WT cones, we did add an analysis of synaptic vesicles around ribbons to Table 2.

      (7) Discussion line 250: "we found no evidence for a functional contribution of Cav3 in our recordings of cones in WT mice (Figures. 2,3), ground squirrels, or macaque (Supplementary Figures S2 and S3).". I would not use "functional" in this context because when comparing your work to Davison et al 2022, they defined functional as a separate response component driven by Cav3. For instance, they examined the influence of their T-type current on exocytosis (by membrane capacitance) and other features like spiking Ca2+ transients. Suggestion: substitute functional with "detectable", and say "we found no detectable Cav currents". Or if you had T-type staining, but not T-type Ca2+ currents, then say "no functional current even though there is staining...". 

      We have modified the text as (lines 336-338): “However, in contrast to recordings of WT mouse cone pedicles in a previous study21, we found no evidence for Cav3-mediated currents in somatic recordings of cones in WT mice (Figs.2,3).”

      We propose an alternative interpretation of the results in the Davison et al study concerning the conclusion that Cav3 channels contribute to Ca2+ spikes and exocytosis. That study used 100 µM Ni2+ to block a “T-type” contribution to spike activity in cones. In their Figs.4,5, the spikes are suppressed by 100 µM Ni2+ and 10 µM nifedipine, a Cav1 antagonist, and spared by the T-type selective drug Z944. This is problematic for several reasons. First, as shown by the authors (their Fig.2A1,A2) and others (PMID: 15541900), 100 µM Ni2+ inhibits Cav1-type currents in photoreceptors. Second, Z944 potentiates Cav1 current in their mouse cones (their Fig.2C1,C2). Thus, both reagents are suboptimal for dissecting the contribution of either Cav subtype to spiking activity. With respect to Cav3 channels and exocytosis, these authors interpreted a reduction in exocytosis upon holding at -39 mV compared to at -69 mV as indicating a loss of a T-type driven component of release. However, Cav1 channel inactivation (PMID: 12473074) could lead to the observed reduction in exocytosis at -30 mV.

      (8) Additional literature related to your Intro and Discussion. Regarding CSNB2, related mutations of active zone proteins, and what happens to Ca2+ currents when ribbons are deleted, you might want to consider the following studies that measure Ca2+ currents from rods: conditional KO of RIM1/2 (Grabner et al 2015 JN), KO of ELKS1/2 (Hagiwara et al, 2018 JCB), and KO of Ribeye (Grabner and Moser eLife 2021). In these studies, the Cav currents were absent in rods of the ELKS1/2 DKO, strongly reduced (80%) in the RIM1/2DKO, but altered in more subtle ways (activation-inactivation) without significantly changing steady-state Ca2+ current in the Ribeye KO. This does not seem to support some of the arguments you have made in the Introduction and Discussion regarding ribbon size and Ca2+ currents, yet the suggested literature is related to the topic at hand. 

      A description of these synaptic proteins as potential mediators of the effect of Cav1.4 on ribbon morphogenesis was added to the Discussion, lines 325-327.

      (9) Line 129: "Along with the major constituents of the ribbon, CtBP2, and RIBEYE", for clarity Ribeye has two domains, one that is identical to CtBP2 (B-domain) and the unique Ribeye domain (A-domain) that is only expressed at ribbon synapses. And, Piccolino is also embedded in the ribbon (Brandstaetter lab, Wichmann/Moser labs). In other words, Ribeye and Piccolino are the major constituents of the ribbon. 

      To avoid confusion, we simply mention Ctbp2 and RIBEYE in the context of the corresponding antibodies that were used to label ribbons.

      (10) Abstract: consider to rephrase "Ca2+-independent role of Cav1.4" by "Ca2+-permeation-independent role of Cav1.4" or alike 

      Sentence changed to: “In CSNB2, we propose that Cav3 channels maintain cone synaptic output provided that the nonconducting role of Cav1.4 in cone synaptogenesis remains intact.”

      Reviewer #3 (Recommendations For The Authors): 

      Cav1.4 voltage-gated calcium channels play an important role in neurotransmission at mammalian photoreceptor synapses. Mutations in the CACNA1f gene lead to congenital stationary night blindness that particularly affects the rod pathway. Mouse Cav1.4 knockout and Cav1.4 knockin models suggest that Cav1.4 is also important for the cone pathway. Deletion of Cav1.4 in the knockout models leads to signaling malfunctions and to abundant morphological re-arrangements of the synapse suggesting that the channel not only has a role in the influx of Ca2+ but also in the morphological organization of the photoreceptor synapse. Of note, also additional Cav-channels have been previously detected in cone synapses by different groups, including L-type Cav1.3 (Wu et al., 2007; pmid; Kersten et al., 2020; pmid), and also T-type Cav3.2 (Davison et al., 2021; pmid 35803735). 

      In order to study a conductivity-independent role of Cav1.4 in the morphological organization of photoreceptor synapses, the authors generated the knockin (KI) mouse Cav1.4 G369i in a previous study (Maddox et al., eLife 2020; pmid 32940604). The Cav1.4 G369i KI channel no longer works as a Ca2+-conducting channel due to the insertion of a glycine in the pore-forming unit (Maddox et al., eLife 2020; pmid 32940604). In this previous study (Maddox et al., eLife 2020; pmid 32940604), the authors analyzed Cav1.4 G369i in rod photoreceptor synapses. In the present study, the authors analyzed cone synapses in this KI mouse. 

      For this purpose, the authors performed a comprehensive set of experimental methods including immunohistochemistry with antibodies (also with quantitative analyses), electrophysiological measurements of presynaptic Ca2+ currents from cone photoreceptors in the presence/absence of inhibitors of L-type- and T-type- calcium channels, electron microscopy (FIB-SEM), ERG recordings and visual behavior tests of the Cav G369i KI in comparison to the Cav1.4 knockout and wild-type control mice. 

      The authors found that the non-conducting Cav channel is properly localized in cone synapses and demonstrated that there are no gross morphological alterations (e.g., sprouting of postsynaptic components that are typically observed in the Cav1.4 knockout). These findings demonstrate that cone synaptogenesis relies on the presence of Cav1.4 protein but not on its Ca2+ conductivity. This result, obtained at cone synapses in the present study, is similar to the previously reported results observed for rod synapses (Maddox et al., eLife 2020, pmid 32940604). No further mechanistic insights or molecular mechanisms were provided that demonstrated how the presence of the Cav channels could orchestrate the building of the cone synapse. 

      We respectfully disagree regarding the mechanistic advance of our study. As indicated by Reviewer 2, a major advance of our study is in providing a mechanism that can explain the longstanding conundrum that congenital stationary night blindness type 2 mutations that would be expected to severely compromise Cav1.4 function do not produce complete blindness. Our study provides an important contrast to the Maddox et al 2020 study in showing that rods and cones respond differentially to loss of Cav1.4 function, which is also relevant to the visual phenotypes of CSNB2. How the presence of Cav1.4 orchestrates cone synaptogenesis is an important topic that is outside the scope of our present study.

      In the present study, the authors also propose a homeostatic switch from L-type to (newly occurring) T-type calcium channels in the Cav1.4 G369i KI mouse as a consequence of the deficient calcium channel conductivity in the Cav1.4 G369i KI mouse. In cones of the Cav1.4 G369i, the high-voltage activated, L-type Ca2+-entry was abolished, in agreement with their previous paper (Maddox et al., eLife 2020, pmid 32940604). The authors found a low-voltage activated Ca2+ current instead that they assigned to T-type Ca2+-currents based on pharmacological inhibitor experiments. T-type Ca2+-currents/channels were already previously identified in other studies by independent groups and independent techniques (electrophysiology, RT-PCR, single-cell sequencing) in cones of wild-type mice (Davison et al., 2021, pmid 35803735; Macosko et al., 2015, pmid 26000488; Williams et al., 2022, pmid 35650675). In the present manuscript (Figures 3a/b), the authors also observed a low-voltage activated, T-type like current in cones of wild-type mice, that is isradipine-resistant and affected by the T-type inhibitor ML218. This finding appears compatible with a T-type-like current in wild-type cones and is consistent with the published data mentioned above, although the authors interpret this data in a different way in the discussion. 

      Due to the noise inherent in whole cell voltage clamp measurements and some crossover effects in the pharmacology, we cannot completely exclude the presence of a T-type current in WT mouse cones. However, our results very clearly support a conclusion opposite to that stated by the reviewer. Namely, if WT mouse cones have T-type Ca currents, then they are far smaller than those in the Cav1.4 G369i KI and KO cones. In particular, while we identified message for Cav3.2 in WT mouse cones, we were unable to identify a functional T-type current by either voltage clamp measurements or pharmacology. See below for a detailed rebuttal.

      This proposal of a homeostatic switch is not convincingly supported in this reviewer's opinion (for further details, please see below). Furthermore, no data on possible molecular mechanisms were provided that would support such a proposal of a homeostatic switch of calcium channels. No mechanistic/molecular insights were provided for a proposed homeostatic switch between L-type to T-type channels that the authors propose to occur between wild-type and Cav1.4 G369i as a consequence of conduction-deficient Cav1.4 G369i channels. Is this e.g. based on posttranslational modifications that switch on T-type channels or regulation at the transcriptional level inducing expression of T-type calcium channel or on other mechanisms? The authors remain descriptive with their central hypotheses. No molecular mechanisms/signaling pathways were provided that would support the idea of such a homeostatic switch. 

      Homeostatic plasticity refers to the maintenance of neuronal function in response to some perturbation in neuronal activity and can result from changes in the expression of ion channel genes (PMID: 36377048, 32747440, 19778903) or regulatory pathways that modulate ion channels (PMID: 15051886, 32492405). We present multiple lines of evidence showing that Cav3 currents appear in cones upon genetically induced Cav1.4 loss of function and can support cone synaptic responses and visual behavior if cone synapse structure is maintained. Our new transcriptomic studies show no difference between levels of Cav3 channel transcripts in WT and G369i KI cones, suggesting that the appearance of the Cav3 currents in G369i KI cones does not result from an increase in Cav3 gene expression. We are currently investigating our transcriptomic dataset to determine if Cav3 regulatory pathways are upregulated in G369i KI cones and will present this in a follow-up study.

      The authors show residual photopic signaling in the non-conducting Cav1.4 G369i KI mouse as judged by the recording of postsynaptic currents, ERG recordings and visual behavior tests though in a reduced manner. The residual cone-based signaling could be based on the non-affected T-type Ca2+ channel conductivity in cone synapses. Given that the L-type current through Cav1.4 is gone in the Cav1.4 G369i KI as previously shown (Maddox et al., 2020, pmid 32940604), the T-type calcium current will remain. However as discussed above, this does not necessarily support the idea of a homeostatic switch. 

      A major point which we highlighted with new results is that despite the expression of Cav3 transcripts in WT mouse cones, Cav3 channels do not contribute to the cone Ca2+ current. This is at odds with the Davison et al study (PMID: 35803735, see our response to Reviewer 2, pt 7 for caveats of this study), but our results convincingly show that the Cav3 current appears only when Cav1.4 is genetically inactivated. Pharmacological or electrophysiological methods that should reveal the presence of Cav3 currents do not change the properties of the Ca2+ current in cones of WT mice, ground squirrel, or macaque:

      • Figs.2-4: Voltage steps to -40 mV (Fig 2e) that activate a sizeable T-current in G369i KI mouse cones produce a negligible transient at pulse onset in WT mouse cones. Similarly, transient currents that are obvious in G369i KI mouse cones during the final step to -30 mV are absent in WT cones.  When we block Cav1.4 with isradipine either in cones of WT mice or ground squirrel, the current that remains does not resemble a Cav3 current but rather a scaled down version of the L-type current. ML218, which readily blocks Cav3 channels in HEK293T cells and in G369i KI cones, has only minor effects in cones of WT mice and ground squirrel; these effects of ML218 can be attributed to non-specific actions on Cav1.4 (new Supp.Fig.S2). New Fig.4 (moved from the supplementary data to the main article) clearly shows that the ML218-sensitive current in ground squirrel cones exhibits properties of Cav1.4 not Cav3 channels. 

      • Figs.2,5: Holding voltages that inactivate Cav3 channels have no effect on the Ca2+ current in cones of WT mice or macaque (recordings of macaque cones were moved from the supplement to the main article as new Fig.5).

      In Figure 4 the authors measured an increase in the size of the active zone (as judged by the size of the bassoon cluster) and of the synaptic ribbons in the Cav1.4 G369i. A mechanistic explanation for this phenomenon was not provided and the underlying molecular mechanisms were not unraveled. 

      The FIB-SEM data uncover some ultrastructural alteration/misalignments of the synaptic ribbons and misalignments of the regular arrangement of the postsynaptic dendrites in the G369i KI mice. Also concerning this observation, the study remains descriptive and does not reveal the underlying mechanisms as it would be expected for eLife. 

      We respectfully disagree on the descriptive nature of our study and the need for a full characterization of the molecular mechanism underlying the cone synaptic defects in the G369i KI mouse.   

      An important study in the field (Zanetti et al., Sci. Rep. 2021; pmid 33526839) should be also cited that used a gain-of-function mutation of Cav1.4 to analyze its functional and structural role in the cone pathway. 

      We have added citation of this paper to the Discussion (lines 354-356).

      In conclusion, the study has been expertly performed but remains descriptive without deciphering the underlying molecular mechanisms of the observed phenomena, including the proposed homeostatic switch of synaptic calcium channels. Furthermore, a relevant part of the data in the present paper (presence of T-type calcium channels in cone photoreceptors) has already been identified/presented by previous studies of different groups (Macosko et al., 2015; pmid 26000488; Davison et al., 2021; pmid 35803735; Williams et al., 2022; pmid 35650675). The degree of novelty of the present paper thus appears limited. I think that the study might be better suited in a more specialized journal than eLife. 

      We thank the reviewer for acknowledging the rigor of our study but disagree with their evaluation regarding the novelty of our work as outlined in our responses above.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      My comments are largely limited to suggestions to make the manuscript easier to read and digest.

      In the abstract they say RNA sequencing highlights changes in innate...

      Could they be more specific? Innate immune system up or down? They do not indicate actual findings in the abstract.

      We thank the reviewer for the comment and we have revised the abstract accordingly.  

      Their use of non‐intuitive abbreviations is often confusing. Perhaps they can add a table in methods listing all the abbreviations so that the reader can follow the data better. mNGA, vmHT....etc.

      As suggested, we have now included a list of the abbreviations used in the paper.

      There are mis‐spellings in the manuscript.

      We have gone through the manuscript and corrected the mis-spellings.   

      Has the SPR RNAi line been validated?

      The SPR RNAi line that we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript and added these statements in the results section concerning SPR RNAi.  

      In the figures showing the Climbing Index vs time, can they abbreviate seconds as sec vs s? At least I think it is seconds. At first, I thought it was Time or Times, and was confused about what they were indicating on those types of graphs (Figures 1D‐F).

      We have revised the figure as suggested by the reviewer.

      In Figure 3F, they have a significance indicated in an unclear manner. It looks like they are comparing neuropil to the cortex, but I think they really mean to compare the cortex of sham to cortex of D31?

      The reviewer was correct. We have revised figure 3F to make this clear.     

      In Figure 4B, what is the y‐axis? Percentage of what? Is that percentage of total flies?

      The reviewer was correct. We have revised the figure to make this clear. 

      In a figure like SF3 B, what is the y‐axis? "Norm. Accum. CI" Can they explain the abbreviation?

      We have revised the Y-axis label to be “Normalized accumulative CI”.  We have also made this clear in the legend.   

      In the methods, what does this mean: "Regions devoid of Hoechst and phalloidin signal in non‐physiologically appropriate areas were considered vacuoles"? What are non‐physiologically appropriate areas? To me, that would mean outside of the brain. I would have thought the areas should be physiologically appropriate (aka neuropil and cortex)? This is confusing.

      We have revised the Methods section to be more specific. In the Drosophila brain, there are structures such as the esophagus that are devoid of both Hoechst and phalloidin staining, which were excluded from our vacuole quantification.

      Reviewer #2 (Recommendations For The Authors):

      Since I use mammalian systems, my comment about the confirmation of siRNA should be removed if this is not possible in the Drosophila system.

      We have revised the figures to include total N values when appropriate. Including individual n values for each experimental assay and condition will inevitably crowd the figure legends, so specific values are available upon request. 

      Regarding RNAi knockdown of sex peptide receptors (SPRs), we agree that confirmation of the knockdown by IHC or qRT-PCR will further strengthen our findings. It should be noted, however, that the RNAi line we used has been extensively validated by Yapici et al., 2007 and several subsequent publications. Importantly, the effectiveness of SPR knockdown is evident in female flies as they exhibit dramatically reduced egg laying and, importantly, lack the typical post-mating behaviors (such as rejection of male flies after initial mating) observed in the wild type mated female flies. In fact, female flies with RNAi-mediated SPR knockdown behave identically to females mated with SP-null male flies, confirming the effective disruption of the SP-SPR signaling pathway. We have revised the manuscript to include these statements in the results concerning the SPR RNAi knockdown.    

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figures 1 and 2, the authors found that females have a lower climbing index in the acute phase in D17 injury, not due to neurodegeneration as shown no significant changes of brain vacuolation and other markers. However, in Figure 3, the authors found that female flies have a lower climbing index, more brain vacuolation, and neurodegeneration in the late phase. It's not very convincing that having a lower climbing index at the late phase is due to neurodegeneration. Is it possible that females suffered from more severe acute effects, at least in D17 injury?

      We thank the reviewer for this point. Female flies injured on D17 displayed acute climbing deficits at 90 minutes post-injury. Since we did not observe significant structural changes in the brain at this time, we believe that this short-term functional deficit is not due to acute neuronal death. Here it is important to note that males did not display any acute climbing deficits when injured on D17, which suggests that the females suffered from more severe acute effects than males. However, these injured female flies recovered fully at 24 hours post-injury and displayed no climbing deficits. At two weeks post-injury, we observe climbing deficits and increased vacuole formation as a direct result of the injuries on D17 (see Supplemental Figure 3). When we assessed sensorimotor behavior and brain vacuolation on D45, we found that the injured females had significantly lower climbing indices and more brain vacuolation than the non-injured females of the same age. In this case, the concurrent observance of decreased climbing ability and increased brain vacuolation suggests chronic neurodegeneration in aged, injured females. This is not to be confused with the acute neuronal death observed by other groups using injury models of stronger severity. Overall, our data are consistent with the current view that in many neurodegenerative diseases, functional deficits often precede observable brain degeneration, which may take years to manifest.

      (2) The authors determined late‐life brain deficits and neurodegeneration purely based on climbing index and vacuole formation. These phenotypes are not really specific to TBI‐related neurodegeneration and the significance and mechanisms of vacuole formation are not clear. Indeed, in Figures 3 A and B, male flies especially D31inj tend to have a much larger variation than any other groups. What could be the reasons? The authors should perform additional analyses on TBI‐related neurodegeneration in flies, which have been shown before, such as retinal degeneration and loss, neuronal degeneration, and loss, neuromuscular junction abnormalities, etc (Genetics. 2015 Oct; 201(2): 377‐402).

      We thank the reviewer for the thorough evaluation of our manuscript. The reviewer raised a very important question: whether the neurodegeneration observed in our model is specific to TBI. As the reviewer rightly pointed out, the neurodegenerative phenotypes are unlikely to be specific to TBI-related neurodegeneration. Throughout the manuscript, we have tried to convey the notion that mild physical impacts to the head represent one form of environmental insult, which in combination with other risk factors such as aging can lead to the emergence of neurodegenerative conditions. It should be noted that the negative geotaxis assay and vacuolation quantification are two well-established approaches to assess sensorimotor deficits and frank brain degeneration in fly brains.

      It is important to emphasize that the head-specific impacts delivered to the flies in our study are much milder than those used in previous studies. As we showed in Figure 1, this very mild form of head trauma (referred to as vmHT) did not cause any death, nor did it affect the lifespan of the injured flies. Our supplemental data also show very minimal structural neuronal damage and no acute or chronic apoptosis induced by vmHT exposure. Consistently, we did not observe any exoskeletal or eye damage immediately following injuries, nor did we observe any retinal degeneration or pseudopupil loss at the chronic stage in these flies. We have incorporated these important points in the revised manuscript.

      (3) In Figure 4, it would be important to perform the behavior test fly speed and directional movement in the acute phase as well to determine whether the females have reduced performance at the acute phase.

      We thank the reviewer for this suggestion. Please note that our modified NGA has already improved the spatiotemporal resolution over the classic NGA.  The data presented in Fig.3 show that there are no acute deficits for young cohorts.  Therefore, we do not believe that the detailed analysis of the direction and speed of these flies is essential.  

      Unfortunately, the current setup for the AI-based analysis requires manual corrections of tracking errors, which are time-consuming and tedious.  We are building a newly designed AI-based NGA (NGA.ai) that will allow automatic tracking and quantification with minimal manual interventions. Once it is completed, we will perform some of the analyses that the reviewer suggested.  

      (4) In Figure 8, the authors performed an RNA‐seq analysis and identified some dysregulated gene expressions. However, it is really surprising to see so few DEGs even in wild‐type males and mated females, and to see that none of DEGs overlap among groups or related to the SP‐signaling. This raises questions about the validity of the RNAseq analysis. It is critical to independently verify their RNA‐sequencing results and to add some more molecular evidence to support their conclusion.

      We agree that future studies are needed to independently validate our RNA sequencing results. We believe that the small number of DEGs is likely due to two unique features of our study: (1) the very mild nature of our injury paradigm and (2) the chronic examination timepoint, which was long after the head injury and SP exposure. These features distinguish our study from previous fly TBI studies. As pointed out in the manuscript, our study aimed to understand how early-life exposure to repetitive head traumatic insults could lead to the late-life onset of neurodegenerative conditions. We hope to further validate our results in our next phase of experiments using single-cell RNA sequencing and RT-qPCR.

      (5) The current results raise a series of interesting questions: what implication of female fly mating and its associated Sex Peptide signaling would be to mammalians or humans? Would mammalian female animals mating with wild‐type or sex hormone‐null male animals have different effects on their post‐injury behavior tests or neuropathological changes? What are the mechanisms underlying the sexual dimorphism?

      As the reviewer pointed out, it would be very interesting to explore the possible roles of sex peptide signaling in other animals and humans. As far as we know, there is no known mammalian ortholog of the insect sex peptide, so it would be difficult to study SP or an SP-like molecule in mammalian models. However, we believe that prolonged post-mating changes associated with reproduction in female fruit flies contribute to their elevated vulnerability to neurodegeneration. In this regard, drastic reproduction-associated changes in the biology of female mammals could potentially lead to vulnerability to neurodegeneration. We agree that this demands further study, which may be done with future collaborators using rodent or large animal models. We have discussed this point in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank you very much for reviewing our manuscript and express our sincere appreciation for the valuable and thoughtful comments that led us to significantly improve the manuscript on Fshr-ZsGreen reporter mice. We have taken your comments seriously and made a major revision of the manuscript. Here is a summary of the revision:

      (1) New data on Fshr expression have been added to the revised manuscript:

      a. Fshr expression in the testis and adipose tissues (WAT and BAT) of B6 mice;

      b. Fshr expression in the testis of B6 by RNA-smFISH;

      c. Comparison of Fshr expression in the testis and ovary between Fshr-ZsGreen and B6 mice by ddRT-PCR to prove that Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector;

      d. Reduction of Fshr expression in osteocytes within the femoral sections from DMP1-CreERT2:Fshrfl/fl mice;

      e. Fshr expression in an established Leydig cell line, TM3, by immunofluorescence and ddRT-PCR, which also shows that Fshr is located in the nuclei of TM3 cells;

      f. Fshr expression at scRNA-seq level from 5 public single cell portals as Supplementary Data 3 to support our findings of the widespread expression pattern of Fshr, particularly in Leydig cells.

      (2) Re-organization of Figure 2 with a new legend.

      (3) A new paragraph is added to the Discussion section of the revised MS to explain the function of the P2A peptide in the generation of GFP reporter mice and why Fshr expression is not interrupted by the P2A-ZsGreen insertion in the Fshr-ZsGreen reporter.

      (4) Deletion of Figure 1-D-c, as it is not necessary.

      (5) Replacement of Figure 8-A (the left panel) with an image taken at a reduced exposure time.

      (6) Amended parts of the revised MS are labeled in red.

      A point by point response to the Reviewers’ comments:

      Reviewer 1:

      One of the shocking observations in this manuscript is the expression of FSHR in Leydig cells. Other observations are in the osteoblasts and endothelial cells as well as epithelial cells in different organs. The expression of ZsGreen in these tissues seems high and one shall start questioning if there are other mechanisms at play here.

      First, the turnover of fluorescent proteins is long, longer than 48h, which means that they accumulate at a different speed than the endogenous FSHR. This means that ZsGreen will accumulate in time while the FSHR receptor might be degraded almost immediately. This correlated with mRNA expression (by the authors) but does not with the results of other studies in single-cell sequencing (see below).

      The expression of ZsGreen in Leydig cells seems much higher than in Sertoli cells, this is "disturbing" to put it mildly. This is visible in both the ZsGreen expression and the FISH assay (Figure 2 B-D).

      Thank you for these valuable comments. We added new data demonstrating the presence of Fshr in Leydig cells of B6 mice, detected by immunofluorescence staining, RNA-smFISH and ddRT-PCR, as well as in TM3 cells, a Leydig cell line isolated from a male mouse, in the revised MS (Fig. 2E, F and G). These data demonstrate that normal Fshr expression is not interrupted by insertion of the P2A-ZsGreen vector into the locus between exon 10 and the stop codon. We use ZsGreen as an indicator of active Fshr promoter status rather than as a measure of Fshr expression, which is quantified by ddRT-PCR. These data are shown in Figure 2G of the revised MS.

      In addition, we provide scRNA-seq based evidence on Fshr expression in human Leydig cells from two single cell portals (DISCO and BioGPS) as shown in Supplementary Data 3 in the revised MS. We also cited a recent report on scRNA-seq analysis of Fshr expression in Hu sheep in the revised MS as Reference 65 (PMID: 37541020) 1, which also clearly showed Fshr expression in Leydig cells at single cell level in Hu Sheep.

      We believe that the lack of Fshr expression in some single-cell databases may be due to degradation of the Fshr transcript during the preparation of single-cell populations. In our laboratory, we spent more than 6 months optimizing methods and reagents to preserve mRNA integrity above 8 for RNA-seq.

      The expression in WAT and BAT is also questionable as the expression of ZsGreen is high everywhere. That makes it difficult to believe that the images are truly informative. For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.

      FISH expression (for FSHR) in WT mice is missing.

      Also, the tissue sections were stained with the IgG only (neg control) but in practice both the KI and the WT tissues should be stained with the primary and secondary antibodies. The only control that I could think of to truly get a sense of this would be a tagged receptor (N-terminal) that could then be analysed by immunohistochemistry.

      Reply 2 and 3: Thank you for these comments. New data on Fshr expression in WAT and BAT of B6 mice by immunofluorescence staining, and in the testis of B6 mice by immunofluorescence staining and RNA-smFISH, are added to the revised MS (Fig. 2D and E, and Fig. 4G), showing patterns similar to those of Fshr-ZsGreen mice. Furthermore, we provide more evidence as Supplementary Data 3 on Fshr expression obtained from 4 public single-cell portals, showing FSHR expression in a wide range of organs and tissues (including different fractions of adipose cells) of humans, mice and rats at the single-cell level. Please also check the Fshr expression pattern in adipose tissues shown by immunostaining for Fshr in previous reports (Fig. 3a of PMID: 28538730 and Fig. 2 of PMID: 25754247) 2 3, which is similar to our finding. These data should address your concerns about Fshr expression in WAT, BAT and other organs/tissues.

      Regarding the comment “For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen”: we believe that you are referring to the image of the aorta in Supplementary Data 2. However, please see the images of the aorta in Figure 5-C, in which the layer of ‘elastin and collagen fibres’ stains positively for EMCN and a-SMA, colocalized with Fshr expression and DAPI staining at 1000X magnification. This indicates that endothelial cells and cellular membranes are present in this layer, not just ‘elastin and collagen’.

      The authors also claim:

      To functionally prove the presence of FSHR in osteoblasts/osteocytes, we also deleted FSHR in osteocytes using an inducible model. The conditional knockout of FSHR triggered a much more profound increase in bone mass and decrease in fat mass than blockade by FSHR antibodies (unpublished data).

      This would be a good control for all their images. I think it is necessary to make the large claim of extragonadal expression, as well as intragonadal such as Leydig cells.

      Thank you for this very encouraging comment. As you suggested, we added a result showing reduced Fshr expression in osteocytes from DMP1-CreERT2+:Fshrfl/fl mice treated with tamoxifen to the revised MS (Figure 3D), demonstrating the presence of Fshr in osteocytes and the specificity of the Fshr antibody. Furthermore, we incorporated your advice on making the ‘large claim of extragonadal and intragonadal expression of Fshr’ into the revised MS in red.

      Claiming that the under-developed Leydig cells in FSHR KO animals are due to a direct effect of the FSHR, and not via a cross-talk between Sertoli and Leydig cells, is too much of a claim. It might be speculated to some degree but as written at the moment it suggests this is "proven".

      Thank you for pointing out this incorrect claim, for which we apologize. We have deleted this claim in the revised MS.

      We also do not know if this FSHR expressed is a spliced form that would also result in the expression of ZsGreen but in a non-functional FSHR, or whether the FSHR is immediately degraded after expression. The insertion of the ZsGreen might have disturbed the epigenetics, transcription, or biosynthesis of the mRNA regulation.

      Thanks for this comment. In the revised MS, we added a new section to explain the function of the P2A peptide in the generation of a GFP reporter by sgRNA-guided site-specific knock-in of the P2A-ZsGreen vector through CRISPR/Cas9. We also provide a new result comparing Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, showing equivalent Fshr expression between the two (Figure 2G), which indicates that Fshr expression is not interrupted by the insertion of the P2A vector.

      The authors should go through single-cell data of WT mice to show the existence of the FSHR transcript(s). For example here: https://www.nature.com/articles/sdata2018192

      Thank you so much for the valuable comment. We took your critical advice and checked Fshr expression through 4 single-cell portals, including DISCO, GTEx, BioGPS and the Human single cell portal. The collected data are presented as Supplementary Data 3 in the revised MS and strongly support our findings of wider Fshr expression. In particular, Fshr expression in Leydig cells is supported by scRNA-seq studies of human cells from DISCO and BioGPS, as well as by a recent study in Hu sheep (PMID: 37541020) 1, which we cite in the revised MS.

      Reviewer 2:

      Is the FSHR expression pattern affected by the knockin mice (no side-by-side comparison between wt and GSGreen mice, using in situ hybridization and ddRTPCR, at least in the gonads, is provided)?

      Thanks for the comment. In the revised MS, we provide a set of new data on Fshr expression in the testis, ovary, WAT and BAT of B6 mice by immunofluorescence staining and by RNA-smFISH, showing similar expression patterns. Additionally, we performed ddRT-PCR to compare Fshr expression in the testes and ovaries of Fshr-ZsGreen and B6 mice, demonstrating equivalent Fshr expression between the two. Interestingly, we also observed significantly higher Fshr expression in the testis than in the ovary (more than 30-fold).

      Is the splicing pattern of the FSHR affected in the knockin compared to wt mice, at least in the gonads?

      Thanks for the question. Please see our reply to Reviewer 1 regarding the function of the P2A peptide used for the generation of GFP reporters. Although we did not directly assess the splicing pattern, we provide a comparison of Fshr expression in Figure 2F of the revised MS, which indirectly indicates no change in the splicing pattern. In the future, we will assess the splicing pattern of Fshr, which has been neglected in the field.

      Are there any additional off-target insertions of GSGreen in these mice? Are similar results observed in separate founder mice?

      Thanks for the questions. As described in detail in the Methods section of the MS, the Fshr-ZsGreen reporter was produced by site-specific long ssDNA recombination of the P2A-ZsGreen targeting vector into the locus between exon 10 and the stop codon by CRISPR/Cas9, guided by a site-specific single guide RNA (sgRNA). The results of Southern blot, DNA sequencing and site-specific PCR, shown in Figure 1, prove the site-specific insertion of P2A-ZsGreen. Because the recombination is site-specific, only one founder line is required for the study and there are no additional off-target insertions.

      How long is GSGreen half-life? Could a very long half-life be a major reason for the extremely large expression pattern observed?

      Thanks for the question. The half-life of ZsGreen, also called ZsGreen1, is at least 26 h in mammalian cells, or slightly longer due to its tetrameric structure, in contrast with the monomeric configuration of other well-known fluorescent proteins (PMID: 17510373) 4. The rationale for using this GFP protein is that ZsGreen is an exceptionally bright green fluorescent protein, up to 4X brighter than EGFP, and is ideally suited for whole-cell labelling and promoter-reporter studies, considering the high turnover and rapid degradation of the Fshr transcript. In this study, we used ZsGreen as a monitor or indicator of the active endogenous Fshr promoter, rather than as a means of measuring promoter activity. Therefore, regardless of whether it accumulates, ZsGreen driven by the Fshr promoter indicates the presence of an active Fshr promoter in the defined cells. Instead, we used ddRT-PCR to measure the level of Fshr expression in this study. In addition, we provide single-cell sequencing-based evidence from 4 public single-cell portals to support our findings of wide Fshr expression. Please see Supplementary Data 3 in the revised MS.

      References:

      (1) Su J, Song Y, Yang Y, et al. Study on the changes of LHR, FSHR and AR with the development of testis cells in Hu sheep. Anim Reprod Sci. Sep 2023;256:107306. doi:10.1016/j.anireprosci.2023.107306

      (2) Liu P, Ji Y, Yuen T, et al. Blocking FSH induces thermogenic adipose tissue and reduces body fat. Nature. Jun 1 2017;546(7656):107-112. doi:10.1038/nature22342

      (3) Liu XM, Chan HC, Ding GL, et al. FSH regulates fat accumulation and redistribution in aging through the Galphai/Ca(2+)/CREB pathway. Aging Cell. Jun 2015;14(3):409-20. doi:10.1111/acel.12331

      (4) Bell P, Vandenberghe LH, Wu D, Johnston J, Limberis M, Wilson JM. A comparative analysis of novel fluorescent proteins as reporters for gene transfer studies. J Histochem Cytochem. Sep 2007;55(9):931-9. doi:10.1369/jhc.7A7180.2007

    1. Author response:

      eLife assessment

      This useful study examines the neural activity in the motor cortex as a monkey reaches to intercept moving targets, focusing on how tuned single neurons contribute to an interesting overall population geometry. The presented results and analyses are solid, though the investigation of this novel task could be strengthened by clarifying the assumptions behind the single neuron analyses, and further analyses of the neural population activity and its relation to different features of behaviour.

      Thanks for recognizing the content of our research, and please stay tuned for our follow-up studies on neural dynamics during interception.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study addresses the question of how task-relevant sensory information affects activity in the motor cortex. The authors use various approaches to address this question, looking at single units and population activity. They find that there are three subtypes of modulation by sensory information at the single unit level. Population analyses reveal that sensory information affects the neural activity orthogonally to motor output. The authors then compare both single unit and population activity to computational models to investigate how encoding of sensory information at the single unit level is coordinated in a network. They find that an RNN that displays similar orbital dynamics and sensory modulation to the motor cortex also contains nodes that are modulated similarly to the three subtypes identified by the single unit analysis.

      Strengths:

      The strengths of this study lie in the population analyses and the approach of comparing single-unit encoding to population dynamics. In particular, the analysis in Figure 3 is very elegant and informative about the effect of sensory information on motor cortical activity. The task is also well designed to suit the questions being asked and well controlled.

      We appreciate these kind comments.

      It is commendable that the authors compare single units to population modulation. The addition of the RNN model and perturbations strengthen the conclusion that the subtypes of individual units all contribute to the population dynamics. However, the subtypes (PD shift, gain, and addition) are not sufficiently justified. The authors also do not address that single units exhibit mixed modulation, but RNN units are not treated as such.

      We’re sorry for not providing sufficient grounds to introduce the subtypes. We chose PD shift, gain, and addition as the pertinent subtypes based on the classical cosine tuning model (Georgopoulos et al., 1982) and with reference to gain modulation studies (e.g. Pesaran et al. 2010, Bremner and Andersen, 2012). Here, we applied this subtype analysis as a criterion to identify the modulation in the neuronal population rather than to sort neurons into distinct cell types. We will update the Methods in the revised version of the manuscript.
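
      The three subtypes can be illustrated with a minimal sketch, which is not the authors' analysis code. It assumes a cosine tuning model r(θ) = b + g·cos(θ − θ_PD) fitted separately per target-speed condition, so that a change in θ_PD reads as a PD shift, a change in g as gain modulation, and a change in b as an additive offset. The function name `fit_cosine_tuning`, the eight evenly spaced reach directions and the synthetic firing rates are all hypothetical.

```python
import math

def fit_cosine_tuning(directions, rates):
    """Fit r = b + g*cos(theta - pd) to rates sampled at evenly spaced directions.

    Assumption: directions are evenly spaced around the full circle (>= 3 of them),
    so the constant, cosine and sine regressors are orthogonal and the
    least-squares fit reduces to simple projections.
    """
    n = len(directions)
    b = sum(rates) / n
    a_cos = 2.0 / n * sum(r * math.cos(t) for t, r in zip(directions, rates))
    a_sin = 2.0 / n * sum(r * math.sin(t) for t, r in zip(directions, rates))
    gain = math.hypot(a_cos, a_sin)
    pd = math.atan2(a_sin, a_cos) % (2.0 * math.pi)
    return b, gain, pd

# Hypothetical reach directions (8 evenly spaced) and synthetic tuning curves.
directions = [2.0 * math.pi * k / 8 for k in range(8)]

def synth(b, g, pd):
    return [b + g * math.cos(t - pd) for t in directions]

static = fit_cosine_tuning(directions, synth(10.0, 5.0, math.pi / 4))  # static target
moving = fit_cosine_tuning(directions, synth(12.0, 7.0, math.pi / 2))  # moving target

addition = moving[0] - static[0]    # change in baseline (additive modulation)
gain_ratio = moving[1] / static[1]  # change in modulation depth (gain modulation)
# Wrap the PD difference into [-180, 180) degrees.
pd_shift = math.degrees((moving[2] - static[2] + math.pi) % (2.0 * math.pi) - math.pi)

print(f"addition={addition:.2f} Hz, gain ratio={gain_ratio:.2f}, PD shift={pd_shift:.1f} deg")
```

      Because all three parameters can change at once for a single neuron, a fit of this kind naturally yields a continuum of mixed modulation rather than discrete classes.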

      Weaknesses:

      The main weaknesses of the study lie in the categorization of the single units into PD shift, gain, and addition types. The single units exhibit clear mixed selectivity, as the authors highlight. Therefore, the subsequent analyses looking only at the individual classes in the RNN are a little limited. Another weakness of the paper is that the choice of windows for analyses is not properly justified and the dependence of the results on the time windows chosen for single-unit analyses is not assessed. This is particularly pertinent because tuning curves are known to rotate during movements (Sergio et al. 2005 Journal of Neurophysiology).

      The mixed selectivity, or more precisely the mixed modulation, is indeed a significant feature of the neuronal population in the present study. The purpose of the subtype analysis was to serve as a criterion for the potential modulation mechanisms. However, the results appear to form a spectrum rather than clusters. The analysis still offers some insight into the distribution of modulation, and we will refine the description in the next version. In the current version, we observed single-unit tuning and the population neural state with sliding windows, focusing on the period around movement onset (MO) because of the emergence of a ring-like structure. We will clarify the choice of windows and assess the dependence of the results on them in the next version. It is a great suggestion to consider the role of rotating tuning curves in neural dynamics during interception.

      This paper shows sensory information can affect motor cortical activity whilst not affecting motor output. However, it is not the first to do so and fails to cite other papers that have investigated sensory modulation of the motor cortex (Stavisky et al. 2017 Neuron, Pruszynski et al. 2011 Nature, Omrani et al. 2016 eLife). These studies should be mentioned in the Introduction to capture better the context around the present study. It would also be beneficial to add a discussion of how the results compare to the findings from these other works.

      Thanks for the reminder. We will introduce the relevant research in the next version of manuscript.

      This study also uses insights from single-unit analysis to inform mechanistic models of these population dynamics, which is a powerful approach, but is dependent on the validity of the single-cell analysis, which I have expanded on below.

      I have clarified some of the areas that would benefit from further analysis below:

      (1) Task:

      The task is well designed, although it would have benefited from perhaps one more target speed (for each direction). One monkey appears to have experienced one more target speed than the others (seen in Figure 3C). It would have been nice to have this data for all monkeys.

      Great suggestion! However, it’s hard to implement as the implanted arrays have been removed.

      (2) Single unit analyses:

      In some analyses, the effects of target speed look more driven by target movement direction (e.g. Figures 1D and E). To confirm target speed is the main modulator, it would be good to compare how much more variance is explained by models including speed rather than just direction. More target speeds may have been helpful here too.

      Nice suggestion! The goodness of fit of the simple model (reach direction only) is much lower than that of the complex model (including target speed). We will update the results in the next version.
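
      The kind of comparison the reviewer asks for can be sketched as a pair of nested regressions. This is an illustrative example on synthetic data, not the authors' fitting procedure: the speed terms chosen for the fuller model (a speed-dependent offset plus speed-by-direction interactions), the normalised speed values and the simulated neuron are all assumptions.

```python
import numpy as np

def r_squared(X, y):
    """Coefficient of determination of an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
n_trials = 400
theta = rng.uniform(0.0, 2.0 * np.pi, n_trials)         # reach direction per trial
v = rng.choice([-1.0, -0.5, 0.0, 0.5, 1.0], n_trials)   # hypothetical normalised target speed

# Synthetic neuron whose baseline and gain both depend on target speed.
rate = 10.0 + 2.0 * v + (5.0 + 1.5 * v) * np.cos(theta - 0.5) + rng.normal(0.0, 1.0, n_trials)

ones = np.ones(n_trials)
X_dir = np.column_stack([ones, np.cos(theta), np.sin(theta)])               # direction only
X_full = np.column_stack([X_dir, v, v * np.cos(theta), v * np.sin(theta)])  # plus speed terms

r2_dir = r_squared(X_dir, rate)
r2_full = r_squared(X_full, rate)
print(f"R^2 direction only: {r2_dir:.3f}, direction + speed: {r2_full:.3f}")
```

      Since the fuller model always fits at least as well in-sample, a real comparison would also need a nested F-test or cross-validated R² before concluding that target speed adds explanatory power.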

      The choice of the three categories (PD shift, gain addition) is not completely justified in a satisfactory way. It would be nice to see whether these three main categories are confirmed by unsupervised methods.

      A good point. We will have a try with unsupervised methods. 

      The decoder analyses in Figure 2 provide evidence that target speed modulation may change over the trial. Therefore, it is important to see how the window considered for the firing rate in Figure 1 (currently 100ms pre - 100ms post movement onset) affects the results.

      Thanks for the suggestion and close reading. We will test the decoder in other epochs.

      (3) Decoder:

      One feature of the task is that the reach endpoints tile the entire perimeter of the target circle (Figure 1B). However, this feature is not exploited for much of the single-unit analyses. This is most notable in Figure 2, where the use of a SVM limits the decoding to discrete values (the endpoints are divided into 8 categories). Using continuous decoding of hand kinematics would be more appropriate for this task.

      This is a very reasonable suggestion. In this study, we discretized the reach direction as in previous studies (Li et al., 2018&2022) and considered that discrete decoding was already sufficient to show the interaction of sensory and motor variables. In future studies, we will try continuous decoding of hand kinematics.

      (4) RNN:

      Mixed selectivity is not analysed in the RNN, which would help to compare the model to the real data where mixed selectivity is common. Furthermore, it would be informative to compare the neural data to the RNN activity using canonical correlation or Procrustes analyses. These would help validate the claim of similarity between RNN and neural dynamics, rather than allowing comparisons to be dominated by geometric similarities that may be features of the task. There is also an absence of alternate models to compare the perturbation model results to.

      Thank you for these helpful suggestions. We will perform decoding analysis on RNN units to verify if there is interaction of sensory and motor variables as in real data, as well as the canonical correlation or Procrustes analysis.
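
      For readers unfamiliar with the Procrustes comparison mentioned here, the following is a minimal sketch under stated assumptions, not the analysis planned by the authors. It assumes both trajectories have the same number of matched samples (conditions by time points) and the same dimensionality, for instance after PCA. The function `procrustes_disparity` and the synthetic "neural" and "RNN" rings are hypothetical.

```python
import numpy as np

def procrustes_disparity(A, B):
    """Residual mismatch (0 = identical shape) between two (samples x dims) point sets
    after optimal translation, isotropic scaling and orthogonal transformation."""
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    A0 /= np.linalg.norm(A0)  # unit Frobenius norm
    B0 /= np.linalg.norm(B0)
    # Singular values of the cross-covariance give the best achievable alignment.
    sing = np.linalg.svd(A0.T @ B0, compute_uv=False)
    return 1.0 - sing.sum() ** 2

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)

# Synthetic tilted ring standing in for neural states in a 3D subspace.
neural = np.column_stack([np.cos(t), np.sin(t), 0.3 * np.cos(t)])

# Stand-in for RNN states: the same ring, rotated, scaled and shifted.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
rnn = 2.5 * neural @ Q + np.array([1.0, -2.0, 0.5])

# Control: break the sample correspondence by shuffling the RNN states.
shuffled = rnn[rng.permutation(len(rnn))]

d_matched = procrustes_disparity(neural, rnn)
d_shuffled = procrustes_disparity(neural, shuffled)
print(f"matched disparity: {d_matched:.3g}, shuffled control: {d_shuffled:.3g}")
```

      A shuffled or otherwise mismatched control of this kind gives a reference level against which the neural-to-RNN disparity can be judged, which speaks to the reviewer's concern about similarities that merely reflect task geometry.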

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Zhang et al. examine neural activity in the motor cortex as monkeys make reaches in a novel target interception task. Zhang et al. begin by examining the single neuron tuning properties across different moving target conditions, finding several classes of neurons: those that shift their preferred direction, those that change their modulation gain, and those that shift their baseline firing rates. The authors go on to find an interesting, tilted ring structure of the neural population activity, depending on the target speed, and find that (1) the reach direction has consistent positioning around the ring, and (2) the tilt of the ring is highly predictive of the target movement speed. The authors then model the neural activity with a single neuron representational model and a recurrent neural network model, concluding that this population structure requires a mixture of the three types of single neurons described at the beginning of the manuscript.

      Strengths:

      I find the task the authors present here to be novel and exciting. It slots nicely into an overall trend to break away from a simple reach-to-static-target task to better characterize the breadth of how the motor cortex generates movements. I also appreciate the movement from single neuron characterization to population activity exploration, which generally serves to anchor the results and make them concrete. Further, the orbital ring structure of population activity is fascinating, and the modeling work at the end serves as a useful baseline control to see how it might arise.

      Thank you for recognizing our work.

      Weaknesses:

      While I find the behavioral task presented here to be excitingly novel, I find the presented analyses and results to be far less interesting than they could be. Key to this, I think, is that the authors are examining this task and related neural activity primarily with a single-neuron representational lens. This would be fine as an initial analysis since the population activity is of course composed of individual neurons, but the field seems to have largely moved towards a more abstract "computation through dynamics" framework that has, in the last several years, provided much more understanding of motor control than the representational framework has. As the manuscript stands now, I'm not entirely sure what interpretation to take away from the representational conclusions the authors made (i.e. the fact that the orbital population geometry arises from a mixture of different tuning types). As such, by the end of the manuscript, I'm not sure I understand any better how the motor cortex or its neural geometry might be contributing to the execution of this novel task.

      The present study shows the sensory modulation of motor tuning in single units and of the neural state during the motor execution period. We acknowledge that the findings are limited to certain time windows. We are still working on this topic and hope to address related questions in our follow-up studies.

      Main Comments:

      My main suggestions to the authors revolve around bringing in the computation through a dynamics framework to strengthen their population results. The authors cite the Vyas et al. review paper on the subject, so I believe they are aware of this framework. I have three suggestions for improving or adding to the population results:

      (1) Examination of delay period activity: one of the most interesting aspects of the task was the fact that the monkey had a random-length delay period before he could move to intercept the target. Presumably, the monkey had to prepare to intercept at any time between 400 and 800 ms, which means that there may be some interesting preparatory activity dynamics during this period. For example, after 400ms, does the preparatory activity rotate with the target such that once the go cue happens, the correct interception can be executed? There is some analysis of the delay period population activity in the supplement, but it doesn't quite get at the question of how the interception movement is prepared. This is perhaps the most interesting question that can be asked with this experiment, and it's one that I think may be quite novel for the field--it is a shame that it isn't discussed.

      Great idea! We are on the way, and close to completing the puzzle.

      (2) Supervised examination of population structure via potent and null spaces: simply examining the first three principal components revealed an orbital structure, with a seemingly conserved motor output space and a dimension orthogonal to it that relates to the visual input. However, the authors don't push this insight any further. One way to do that would be to find the "potent space" of motor cortical activity by regression to the arm movement and examine how the tilted rings look in that space (this is actually fairly easy to see in the reach direction components of the dPCA plot in the supplement--the rings will be highly aligned in this space). Presumably, then, the null space should contain information about the target movement. dPCA shows that there's not a single dimension that clearly delineates target speed, but the ring tilt is likely evident if the authors look at the highest variance neural dimension orthogonal to the potent space (the "null space")--this is akin to PC3 in the current figures, but it would be nice to see what comes out when you look in the data for it.

      Nice suggestion. Target-speed modulation mainly influences PC3, which is consistent with the ‘null space’ hypothesis. We will try other dimensionality-reduction methods (e.g., dPCA, Manopt) to determine the potent and null spaces.
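
      As a minimal sketch of the potent/null-space decomposition discussed above, assume a linear readout from neural activity to hand velocity fitted by least squares. The columns of the readout matrix then span the potent space, and their orthogonal complement is the null space. The function names, array shapes, and the use of plain least squares are illustrative assumptions, not the analysis used in the study.

```python
import numpy as np

def potent_null_split(X, V):
    """Split neural state space into potent and null subspaces.

    X: (samples, neurons) neural activity (hypothetical input).
    V: (samples, outputs) hand velocity (hypothetical input).
    Returns orthonormal bases: potent (neurons, k), null (neurons, neurons - k).
    """
    Xc = X - X.mean(axis=0)
    Vc = V - V.mean(axis=0)
    # Least-squares readout W with Xc @ W ~ Vc; W has shape (neurons, outputs).
    W, *_ = np.linalg.lstsq(Xc, Vc, rcond=None)
    # Left singular vectors of W: the first k span its column space (potent),
    # the remaining ones span the orthogonal complement (null).
    U, s, _ = np.linalg.svd(W, full_matrices=True)
    k = int(np.sum(s > 1e-10 * s.max()))
    return U[:, :k], U[:, k:]

def top_null_dimension(X, null):
    """Highest-variance direction of the activity inside the null space."""
    Xc = X - X.mean(axis=0)
    Z = Xc @ null                       # coordinates within the null space
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return null @ Vt[0]                 # unit vector in neuron space
```

      In practice the activity would usually be reduced first (e.g., with PCA) so that the regression is well conditioned; the tilt of the rings would then be examined along the direction returned by `top_null_dimension`.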

      (3) RNN perturbations: as it's currently written, the RNN modeling has promise, but the perturbations performed don't provide me with much insight. I think this is because the authors are trying to use the RNN to interpret the single neuron tuning, but it's unclear to me what was learned from perturbing the connectivity between what seems to me almost arbitrary groups of neurons (especially considering that 43% of nodes were unclassifiable). It seems to me that a better perturbation might be to move the neural state before the movement onset to see how it changes the output. For example, the authors could move the neural state from one tilted ring to another to see if the virtual hand then reaches a completely different (yet predictable) target. Moreover, if the authors can more clearly characterize the preparatory movement, perhaps perturbations in the delay period would provide even more insight into how the interception might be prepared.

      We are sorry that we didn’t clarify the definition of the “none” type, which can be misleading. The 43% of unclassified nodes includes inactive ones; when only active (task-related) nodes are included, the proportion of unclassified nodes is much lower. By perturbing the connectivity, we intended to explore the interaction between different modulations.

      Thank you for the great advice. We tried moving neural states from one ring to another without changing the directional cluster, but this perturbation didn’t have a significant influence on network performance as expected. We will check this result again and try perturbations in the delay period.

      Reviewer #3 (Public Review):

      Summary:

      This experimental study investigates the influence of sensory information on neural population activity in M1 during a delayed reaching task. In the experiment, monkeys are trained to perform a delayed interception reach task, in which the goal is to intercept a potentially moving target.

      This paradigm allows the authors to investigate how, given a fixed reach endpoint (which is assumed to correspond to a fixed motor output), the sensory information regarding the target motion is encoded in neural activity.

      At the level of single neurons, the authors found that target motion modulates the activity in three main ways: gain modulation (scaling of the neural activity depending on the target direction), shift (shift of the preferred direction of neurons tuned to reach direction), or addition (offset to the neural activity).

      At the level of the neural population, target motion information was largely encoded along the 3rd PC of the neural activity, leading to a tilt of the manifold along which reach direction was encoded that was proportional to the target speed. The tilt of the neural manifold was found to be largely driven by the variation of activity of the population of gain-modulated neurons.

      Finally, the authors studied the behaviour of an RNN trained to generate the correct hand velocity given the sensory input and reach direction. The RNN units were found to similarly exhibit mixed selectivity to the sensory information, and the geometry of the « neural population » resembled that observed in the monkeys.

      Strengths:

      - The experiment is well set up to address the question of how sensory information that is directly relevant to the behaviour but does not lead to a direct change in behavioural output modulates motor cortical activity.

      - The finding that sensory information modulates the neural activity in M1 during motor preparation and execution is non-trivial, given that this modulation of the activity must occur in the nullspace of the movement.

      - The paper gives a complete picture of the effect of the target motion on neural activity, by including analyses at the single neuron level as well as at the population level. Additionally, the authors link those two levels of representation by highlighting how gain modulation contributes to shaping the population representation.

      Thanks for your recognition.

      Weaknesses:

      - One of the main premises of the paper is the fact that the motor output for a given reach point is preserved across different target motions. However, as the authors briefly mention in the conclusion, they did not record muscle activity during the task, but only hand velocity, making it impossible to directly verify how preserved muscle patterns were across movements. While the authors highlight that they did not see any difference in their results when resampling the data to control for similar hand velocities across conditions, this seems like an important potential caveat of the paper whose implications should be discussed further or highlighted earlier in the paper.

      Thanks for the suggestion. We will highlight the resampling results as an important control in the next version of the manuscript.

      - The main takeaway of the RNN analysis is not fully clear. The authors find that an RNN trained given a sensory input representing a moving target displays modulation to target motion that resembles what is seen in real data. This is interesting, but the authors do not dissect why this representation arises, and how robust it is to various task design choices. For instance, it appears that the network should be able to solve the task using only the motion intention input, which contains the reach endpoint information. If the target motion input is not used for the task, it is not obvious why the RNN units would be modulated by this input (especially as this modulation must lie in the nullspace of the movement hand velocity if the velocity depends only on the reach endpoint). It would thus be important to see alternative models compared to true neural activity, in addition to the model currently included in the paper. Besides, for the model in the paper, it would therefore be interesting to study further how the details of the network setup (eg initial spectral radius of the connectivity, weight regularization, or using only the target position input) affect the modulation by the motion input, as well as the trained population geometry and the relative ratios of modulated cells after training.

      Great suggestions. It is regrettable that, in the current version, we did not dissect why this representation forms or which factors influence it. We previously tried several combinations of inputs: in the network that received only motor intention and GO inputs, there were rings but no tilt related to target speed; in the network that received only target location and GO inputs, there were ring-like structures but no clear directional clusters. We will check these results and try alternative models in the next version. In future studies, we will examine the influence of network setup details.

      - Additionally, it is unclear what insights are gained from the perturbations to the network connectivity the authors perform, as it is generally expected that modulating the connectivity will degrade task performance and the geometry of the responses. If the authors wish to make claims about the role of the subpopulations, it could be interesting to test whether similar connectivity patterns develop in networks that are not initialized with an all-to-all random connectivity or to use ablation experiments to investigate whether the presence of multiple types of modulations confers any sort of robustness to the network.

      Thank you for the great suggestions. With the perturbations, we intended to explore the contribution of interactions between certain subpopulations. We tried ablation experiments, but the result was not significant. This is probably because most units showed mixed selectivity, so units with only a single type of modulation were too few for bootstrapping, or because random sampling from a single subpopulation (bearing mixed selectivity) could draw the same units repeatedly. We will consider these suggestions carefully in the revised version.

      - The results suggest that the observed changes in motor cortical activity with target velocity result from M1 activity receiving an input that encodes the velocity information. This also appears to be the assumption in the RNN model. However, even though the input shown to the animal during preparation is indeed a continuously moving target, it appears that the only relevant quantity to the actual movement is the final endpoint of the reach. While this would have to be a function of the target velocity, one could imagine that the computation of where the monkeys should reach might be performed upstream of the motor cortex, in which case the actual target velocity would become irrelevant to the final motor output. This makes the results of the paper very interesting, but it would be nice if the authors could discuss further when one might expect to see modulation by sensory information that does not directly affect motor output in M1, and where those inputs may come from. It may also be interesting to discuss how the findings relate to previous work that has found behaviourally irrelevant information is being filtered out from M1 (for instance, Russo et al, Neuron 2020 found that in monkeys performing a cycling task, context can be decoded from SMA but not from M1, and Wang et al, Nature Communications 2019 found that perceptual information could not be decoded from PMd)?

      How and where sensory information modulates M1 are very interesting and open questions. We will discuss this topic further in the next version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course, the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      We appreciate the reviewer’s perspective.  In our revised version of the manuscript, we have attempted to address these concerns by more adequately explaining the limitations of the study and by more thoroughly discussing the context of the findings.  We are not able to associate the findings with specific clinical outcomes for individual study participants but we speculate about the overall biological meaning of these associations across the cohort.  We cannot disagree with the reviewer, but we find the associations statistically significant, potentially reflecting real biological associations, and forming the basis for future hypothesis testing research. 

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting-edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Sharing the data could create a useful repository for more specific analyses.

      We thank the reviewer for this assessment.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for the variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3, (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though sometimes categorical variables are challenging to interpret.

      We agree with the reviewer that individual parameters are only weakly correlated with the HIV reservoir, likely reflecting the complex and multi-factorial nature of reservoir/immune cell interactions.  Nevertheless, these associations are statistically significant and form the basis for functional testing in viral persistence.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance. On associations: the false discovery rate correction was set at 5%, but data appear underdetermined with fewer observations than variables (144 variables > 115 participants), and it isn't always clear if/when variables are related (e.g., inverses of one another, for instance, %CD4 and %CD8).

      When deriving a list of cell populations whose frequency would be correlated with the reservoir, we focused on well-defined cell types for which functional validation exists in the literature to consider them as distinct cell types.  For many of the populations, gating based on combinations of multiple markers leads to recovery of very few cells, and so we excluded some potential combinations from the analysis.  We are also making our raw data available for others to examine and find associations not considered by our manuscript.
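
      To make the 5% false discovery rate threshold concrete, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. That Benjamini-Hochberg was the correction used is an assumption; the text only states that the false discovery rate was set at 5%. The function name and any p-values passed to it are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is declared significant.

    Assumes the Benjamini-Hochberg step-up rule: find the largest rank k with
    p_(k) <= (k / m) * alpha and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected
```

      The guarantee of this procedure holds for independent or positively dependent tests, which is one reason the relationship between variables (such as %CD4 and %CD8) matters when many immune parameters are tested at once.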

      The modeling of reservoir size was unusual; typically, intact and defective HIV DNA are analyzed on a log10 scale (both for decays and predicting rebound). Also, sometimes in this analysis levels are normalized (presumably to max/min?, e.g. S5), and given the large within-host variation in levels we see in other works, it is not trivial to predict any downstream impact of normalization across population vs within-person.

      We have repeated the analysis using log10 transformed data and the new figures are shown in Figure 1 and S2-S5.
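
      The difference between the two scalings raised by the reviewer can be sketched as follows. This is an illustration only: the function names are hypothetical, and the `floor` used to avoid taking the logarithm of zero or of values below a detection limit is an assumption, not a detail reported in the study.

```python
import math

def log10_transform(values, floor=1.0):
    """Log10-transform reservoir measurements, clamping values below `floor`."""
    return [math.log10(max(v, floor)) for v in values]

def min_max_normalize(values):
    """Rescale values to [0, 1] using the cohort-wide minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]
```

      For measurements of 10, 100, and 1000 copies, the log10 scale spaces them evenly, whereas cohort-wide min-max scaling places the middle value close to the minimum.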

      Also, the qualitative characterization of low/high reservoir is not standard and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data, it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.

      Our ML models included time before ART as a variable in the analysis, and this was not found to be a significant driver of the reservoir size associations, except for the percentage of intact proviruses (see Figure 2C). Furthermore, we analyzed whether any of the reservoir correlated immune variables were associated with time on ART and found that, although some immune variables are associated with time on therapy, this was not the case for most of them (Table S4). We agree that it is challenging to translate above or below median into clinical meaning for this cohort, but we emphasize that this study is primarily a hypothesis generating approach requiring additional validation for the associations observed.  We attempted to predict reservoir size as a continuous variable using the data and this approach was not successful (Figure S13). We believe that a significantly larger cohort will likely be required to generate a ML model that can accurately predict the reservoir as a continuous variable.  We have added additional discussion of this to the manuscript.

      Lastly, the work is comprehensive and appears solid, but the code was not shared to see how calculations were performed.

      We now provide a link to the code used to perform the analyses in the manuscript, https://github.com/lesiasemenova/ML_HIV_reservoir.

      Reviewer #2 (Public Review):

      Summary:

      Semenova et al. performed a cross-sectional analysis of host immunophenotypes (using flow cytometry) and the peripheral CD4+ T cell HIV reservoir size (using the Intact Proviral DNA Assay, IPDA) from 115 people with HIV (PWH) on ART. The study mostly highlights the machine learning methods applied to these host and viral reservoir datasets but fails to interpret these complex analyses into (clinically, biologically) interpretable findings. For these reasons, the direct translational take-home message from this work is lost amidst a large list of findings (shown as clusters of associated markers) and sentences such as "this study highlights the utility of machine learning approaches to identify otherwise imperceptible global patterns" - lead to overinterpretation of their data.

      We have addressed the reviewer’s concern by modifications to the manuscript that enhance the interpretation of the findings in a clinical and biological context.

      Strengths:

      Measurement of host immunophenotyping measures (multiparameter flow cytometry) and peripheral HIV reservoir size (IPDA) from 115 PWH on ART.

      Major Weaknesses:

      (1) Overall, there is little to no interpretability of their machine learning analyses; findings appear as a "laundry list" of parameters with no interpretation of the estimated effect size and directionality of the observed associations. For example, Figure 2 might actually give an interpretation of each X increase in immunophenotyping parameter, we saw a Y increase/decrease in HIV reservoir measure.

      We have added additional text to the manuscript in which we attempt to provide more immunological and clinical interpretation of the associations.  We also have emphasized that these associations are still speculative and will require additional validation.  Nevertheless, our data should provide a rich source of new hypotheses regarding immune system/reservoir interaction that could be tested in future work.

      (2) The correlations all appear to be relatively weak, with most Spearman R in the 0.30 range or so.

      We agree with the reviewer that the associations are mostly weak, consistent with previous studies in this area.  This likely is an inherent feature of the underlying biology – the reservoir is likely associated with the immune system in complex ways and involves stochastic processes that will limit the predictability of reservoir size using any single immune parameter. We have added additional text to the manuscript to make this point clearer.
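
      For reference, a Spearman coefficient is the Pearson correlation computed on ranks, which can be sketched as follows. The helper names are hypothetical, tied values receive average ranks, and the code assumes neither input is constant.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

      Because only ranks enter the calculation, any strictly monotonic relationship gives a coefficient of 1 or -1, so a value near 0.3 indicates a weak monotonic trend between one immune parameter and the reservoir measure.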

      (3) The Discussion needs further work to help guide the reader. The sentence: "The correlative results from this present study corroborate many of these studies, and provide additional insights" is broad. The authors should spend some time here to clearly describe the prior literature (e.g., describe the strength and direction of the association observed in prior work linking PD-1 and HIV reservoir size, as well as specify which type of HIV reservoir measures were analyzed in these earlier studies, etc.) and how the current findings add to or are in contrast to those prior findings.

      We have added additional text to the manuscript to help guide the readers through the possible biological significance of the findings and the context with respect to prior literature.

      (4) The most interesting finding is buried on page 12 in the Discussion: "Uniquely, however, CD127 expression on CD4 T cells was significantly inversely associated with intact reservoir frequency." The authors should highlight this in the abstract, and title, and move this up in the Discussion. The paper describes a very high dimensional analysis and the key takeaways are not clear; the more the author can point the reader to the take-home points, the better their findings can have translatability to future follow-up mechanistic and/or validation studies.

      We appreciate the reviewer’s comment.  We have increased the emphasis on this finding in the revised version of the manuscript.

      (5) The authors should avoid overinterpretation of these results. For example in the Discussion on page 13 "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy." It is highly unlikely that future studies will be performing the breadth of parameters resulting here and then use these directly for optimizing therapy.

      Our analyses indicate that membership of study participants in cluster 1 or cluster 2 can be fairly accurately determined by a small number of individual parameters (KLRG1, etc.; Figure 4F), and measuring the cells of PWH with the degree of breadth used in this paper would not be necessary to classify PWH into these clusters.  As such, we feel that it is not unrealistic to speculate that this finding could turn out to be clinically useful, if it becomes clear that the clusters are biologically meaningful.

      (6) There are only TWO limitations listed here: cross-sectional study design and the use of peripheral blood samples. (The subsequent paragraph notes an additional weakness which is misclassification of intact sequences by IPDA). This is a very limited discussion and highlights the need to more critically evaluate their study for potential weaknesses.

      We have expanded on the list of limitations discussed in the manuscript. In particular, we now address the size of the cohort, the composition with respect to different genders and demographics, lack of information for the timing of ART and the lack of information regarding intracellular transcriptional pathways.

      (7) A major clinical predictor of HIV reservoir size and decay is the timing of ART initiation. The authors should include these (as well as other clinical covariate data - see #12 below) in their analyses and/or describe as limitations of their study.

      All of the participants that make up our cohort were treated during chronic infection, and the precise timing of ART initiation is unclear in most of these cases.  We have added additional information to explain this in the manuscript and include this in the list of limitations.

      Reviewer #3 (Public Review):

      Summary:

      This valuable study by Semenova and colleagues describes a large cross-sectional cohort of 115 individuals on ART. Participants contributed a single blood sample which underwent IPDA, and 25-color flow with various markers (pre and post-stimulation). The authors then used clustering, decision tree analyses, and machine learning to look for correlations between these immunophenotypic markers and several measures of HIV reservoir volume. They identified two distinct clusters that can be somewhat differentiated based on total HIV DNA level, intact HIV DNA level, and multiple T cell cellular markers of activation and exhaustion.

      The conclusions of the paper are supported by the data but the relationships between independent and dependent variables in the models are correlative with no mechanistic work to determine causality. It is unclear in most cases whether confounding variables could explain these correlations. If there is causality, then the data is not sufficient to infer directionality (ie does the immune environment impact the HIV reservoir or vice versa or both?). In addition, even with sophisticated and appropriate machine learning approaches, the models are not terribly predictive or highly correlated. For these reasons, the study is very much hypothesis-generating and will not impact cure strategies or HIV reservoir measurement strategies in the short term.

      We appreciate the reviewer’s comments regarding the value of our study.  We fully acknowledge that the causal nature and directionality of these associations are not yet clear and agree that the study is primarily hypothesis generating in nature.  Nevertheless, we feel that the hypotheses generated will be valuable to the field.  We have added additional text to the manuscript to emphasize the hypothesis generating nature of this paper.

      Strengths:

      The study cohort is large and diverse in terms of key input variables such as age, gender, and duration of ART. Selection of immune assays is appropriate. The authors used a wide array of bioinformatic approaches to examine correlations in the data. The paper was generally well-written and appropriately referenced.

      Weaknesses:

      (1) The major limitation of this work is that it is highly exploratory and not hypothesis-driven. While some interesting correlations are identified, these are clearly hypothesis-generating based on the observational study design.

      We agree that the major goal of this study was hypothesis generating and that our work is exploratory in nature. Performing experiments with mechanism testing goals in human participants with HIV is challenging.  Additionally, before such mechanistic studies can be undertaken, one must have hypotheses to test. As such we feel our study will be useful for the field in helping to identify hypotheses that could potentially be tested.

      (2) The study's cross-sectional nature limits the ability to make mechanistic inferences about reservoir persistence. For instance, it would be very interesting to know whether the reservoir cluster is a feature of an individual throughout ART, or whether this outcome is dynamic over time.

      We agree with the reviewer’s comment. Longitudinal studies are challenging to carry out with a study cohort of this size, and addressing questions such as the one raised by the reviewer would be of great interest. We believe our study nevertheless has value in identifying hypotheses that could be tested in a longitudinal study.

      (3) A fundamental issue is that I am concerned that binarizing the 3 reservoir metrics in a 50/50 fashion is for statistical convenience. First, by converting a continuous outcome into a simple binary outcome, the authors lose significant amounts of quantitative information. Second, the low and high reservoir outcomes are not actually demonstrated to be clinically meaningful: I presume that both contain many (?all) data points above levels where rebound would be expected soon after interruption of ART. Reservoir levels would also have no apparent outcome on the selection of cure approaches. Overall, dividing at the median seems biologically arbitrary to me.

      The reviewer raises a valid point that the clinical significance of above or below median reservoir metrics is unclear, and that the size of the reservoir has potentially little relation to rebound and cure approaches.  In the manuscript, we attempted to generate models that can predict reservoir size as a continuous variable in Figure S13 and find that this approach performs poorly, while a binarized approach was more successful. As such we have included both approaches in the manuscript.  It is possible that future studies with larger sample sizes and more detailed measurements will perform better for continuous variable prediction.  While this is a fairly large study (n=115) by the standards of HIV reservoir analyses, it is a small study by the standards of the machine learning field, and accurate predictive ML models for reservoir size as a continuous variable will likely require a much larger set of samples/participants.  Nevertheless, we feel our work has value as a template for ML approaches that may be informative for understanding HIV/immune interactions and generates novel hypotheses that could be validated by subsequent studies.
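
      The information lost by a median split can be illustrated with a minimal sketch. The function name is hypothetical, and any values passed to it (for example, log10 reservoir measurements) are made up for illustration rather than taken from the cohort.

```python
import statistics

def median_split(values):
    """Label each value as high (True) if it lies strictly above the median."""
    med = statistics.median(values)
    return [v > med for v in values], med
```

      With the hypothetical values 1.0, 2.4, 2.5, 2.6, and 4.0, the median is 2.5: participants at 2.5 and 2.6 fall into different classes although they are nearly identical, while 2.6 and 4.0 share a class despite differing by more than a log.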

      (4) The two reservoir clusters are of potential interest as high total and intact with low % intact are discriminated somewhat by immune activation and exhaustion. This was the most interesting finding to me, but it is difficult to know whether this clustering is due to age, time on ART, other co-morbidity, ART adherence, or other possible unmeasured confounding variables.

      We agree that this finding is one of the more interesting outcomes of the study. We examined a number of these variables for association with cluster membership, and these data are reported in Figure S8A-D.  Age, years of ART and CD4 Nadir were all clearly different between the clusters.   The striking feature of this clustering, however, is the clear separation between the two groups of participants, as opposed to a continuous gradient of phenotypes.  This could reflect a bifurcation of outcomes for people with HIV, dynamic changes in the reservoir immune interactions over time, or different levels of untreated infection.  It is certainly possible that some other unmeasured confounding variables contribute to this outcome and we have attempted to make this limitation clearer.

      (5) At the individual level, there is substantial overlap between clusters according to total, intact, and % intact between the clusters. Therefore, the claim in the discussion that these 2 cluster phenotypes may require different therapeutic approaches seems rather speculative. That said, the discussion is very thoughtful about how these 2 clusters may develop with consideration of the initial insult of untreated infection and / or differences in immune recovery.

      We agree with the reviewer that this claim is speculative, and we have attempted to moderate the language of the text in the revised version.

      (6) The authors state that the machine learning algorithms allow for reasonable prediction of reservoir volume. It is subjective, but to me, 70% accuracy is very low. This is not a disappointing finding per se. The authors did their best with the available data. It is informative that the machine learning algorithms cannot reliably discriminate reservoir volume despite substantial amounts of input data. This implies that either key explanatory variables were not included in the models (such as viral genotype, host immune phenotype, and comorbidities) or that the outcome for testing the models is not meaningful (which may be possible with an arbitrary 50/50 split in the data relative to median HIV DNA volumes: see above).

      We acknowledge that the predictive power of the models generated from these data is modest and we have clarified this point in the revised manuscript. As the reviewer indicates, this may result from the influence of unmeasured variables and possible stochastic processes.  The data may thus demonstrate a limit to the predictability of reservoir size which may be inherent to the underlying biology.  As we mention above, this study size (n=115) is fairly small for the application of ML methods, and an increased sample size will likely improve the accuracy of the models. At this stage, the models we describe are not yet useful as predictive clinical tools, but are nonetheless useful as tools to describe the structure of the data and identify reservoir-associated immune cell types.

      (7) The decision tree is innovative and a useful addition, but does not provide enough discriminatory information to imply causality, mechanism, or directionality in terms of whether the immune phenotype is impacting the reservoir or vice versa or both. Tree accuracy of 80% is marginal for a decision tool.

      The reviewer is correct about these points.  In the revised manuscript, we have attempted to make it clear that we are not yet advocating using this approach as a decision tool, but simply a way to visualize the data and understand the structure of the dataset.  As we discuss above, the models will likely need to be trained on a larger dataset and achieve higher accuracy before use as a decision tool.

      (8) Figure 2: this is not a weakness of the analysis but I have a question about interpretation. If total HIV DNA is more predictive of immune phenotype than intact HIV DNA, does this potentially implicate a prior high burden of viral replication (high viral load &/or more prolonged time off ART) rather than ongoing reservoir stimulation as a contributor to immune phenotype? A similar thought could be applied to the fact that clustering could only be detected when applied to total HIV DNA-associated features. Many investigators do not consider defective HIV DNA to be "part of the reservoir" so it is interesting to speculate why these defective viruses appear to have more correlation with immunophenotype than intact viruses.

      We agree with the reviewer that this observation could reflect prior viral burden and we have added additional text to make this clearer.  Even so, we cannot rule out a model in which defective viral DNA is engaged in ongoing stimulation of the immune system during ART, leading to the stronger association between total DNA and the immune cell phenotypes. We hypothesize that the defective proviruses could potentially be triggering innate immune pattern recognition receptors via viral RNA or DNA, and a higher burden of the total reservoir leads to a stronger apparent association with the immune phenotype.  We have included text in the discussion about this hypothesis.

      (9) Overall, the authors need to do an even more careful job of emphasizing that these are all just correlations. For instance, HIV DNA cannot be proven to have a causal effect on the immunophenotype of the host with this study design. Similarly, immunophenotype may be affecting HIV DNA or the correlations between the two variables could be entirely due to a separate confounding variable

      We have revised the text of the manuscript to emphasize this point, and we acknowledge that any causal relationships are, at this point, simply speculation. 

      (10) In general, in the intro, when the authors refer to the immune system, they do not consistently differentiate whether they are referring to the anti-HIV immune response, the reservoir itself, or both. More specifically, the sentence in the introduction listing various causes of immune activation should have citations. (To my knowledge, there is no study to date that definitively links proviral expression from reservoir cells in vivo to immune activation as it is next to impossible to remove the confounding possible imprint of previous HIV replication.) Similarly, it is worth mentioning that the depletion of intact proviruses is quite slow such that proviral expression can only be stimulating the immune system at a low level. Similarly, the statement "Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" seems hard to dissociate from the persistence of immune cells that were reactive to viremia.

      We updated the text of the manuscript to address these points and have added additional citations as per the reviewer’s suggestion.

      (11) Given the many limitations of the study design and the inability of the models to discriminate reservoir volume and phenotype, the limitations section of the discussion seems rather brief.

      We have now expanded the limitations section of the discussion and added additional considerations. We now include a discussion of the study cohort size, composition and the detail provided by the assays.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A few specific comments:

      "This pattern is likely indicative of a more profound association of total HIV DNA with host immunophenotype relative to intact HIV DNA."

      Most studies I have seen (e.g. single cell from Lichterfeld/Yu group) show intact proviruses are generally more activated/detectable/susceptible to immune selection, so I have a hard time thinking defective proviruses are actually more affected by immunotype.

      We hypothesize that this association is actually occurring in the opposite direction: that the defective proviruses are having a greater impact on the immune phenotype, due to their greater number and potential ability to engage innate or adaptive immune receptors. We have clarified this point in the manuscript.

      "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy."

      I find this a bit of a reach, given that the definition of 2 categories depended on the total size.

      We have modified the language of this section to reduce the level of speculation.

      "This study is cross-sectional in nature and is primarily observational, so caution should be used interpreting findings associated with time on therapy".

      I found this an interesting statement because ultimately time on ART shows up throughout the analysis as a significant predictor, do you mean something about how time on ART could indicate other confounding variables like ART regimen or something?

      We have rephrased this comment to avoid confusion.  We were simply trying to make the point that we should avoid speculating about longitudinal dynamics from cross sectional data.

      "As expected, the plots showed no significant correlation for intact HIV DNA versus years of ART (Figure 1B), while total reservoir size was positively correlated with the time of ART (Figure 1A, Spearman r = 0.31)."

      Is this expected? Studies with longitudinal data almost uniformly show intact decay, at least for the first 10 or so years of ART, and defective/total stability (or slight decay). Also probably "time on ART" to not confuse with the duration of infection before ART.

      We have updated the language of this section to address this comment.  We have avoided comparing our data with respect to time on ART to longitudinal studies for reasons given above.
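
      For readers less familiar with the statistic quoted in the comment above, the following is a minimal sketch of how a Spearman rank correlation can be computed. It is not the analysis code used in the study; the function names and the example values are illustrative assumptions only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values (NOT study data): years on ART and a reservoir measure.
years_on_art = [1, 2, 3, 4, 5]
reservoir = [1, 8, 27, 64, 125]
rho = spearman(years_on_art, reservoir)
```

      Because the statistic depends only on ranks, it captures monotonic association without assuming linearity. It remains a cross-sectional summary and says nothing about within-person change over time.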

      On dimensionality reduction, as this PaCMAP seems a relatively new technique (vs tSNE and UMAP which are more standard, but absolutely have their weaknesses), it does seem important to contextualize. I think it would still be useful to show PCA and assess the % variance of each additional dimension to assess the effective dimensionality. It would be helpful to show a plot of % variance by # components to see if there is a cutoff somewhere, and if PaCMAP is really picking this up to determine the 2 dimensions/2 clusters is ideal. Figure 4B ultimately shows a lot of low/high across those clusters, and since low/high is defined categorically it's hard to know which of those dots are very close to the other categories.

      We have added this analysis to the manuscript – found in Figure S9. The PCA plot indicates that members of the two clusters also separate on PCA although this separation is not as clear as for the PaCMAP plot.
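
      The "% variance by # components" check requested above can be sketched as follows. This is a minimal illustration, assuming the eigenvalues of the covariance matrix (the per-component variances) have already been obtained from any PCA routine. The function names and eigenvalues are hypothetical and are not taken from the study.

```python
def explained_variance(eigenvalues):
    """Return per-component and cumulative fractions of total variance."""
    total = sum(eigenvalues)
    ratios = [v / total for v in sorted(eigenvalues, reverse=True)]
    cumulative = []
    running = 0.0
    for r in ratios:
        running += r
        cumulative.append(running)
    return ratios, cumulative


def components_needed(eigenvalues, threshold=0.9):
    """Smallest number of components whose cumulative variance meets threshold."""
    _, cumulative = explained_variance(eigenvalues)
    for k, c in enumerate(cumulative, start=1):
        if c >= threshold - 1e-12:  # small tolerance for floating-point sums
            return k
    return len(cumulative)


# Hypothetical eigenvalues (NOT study data).
eigenvalues = [5.0, 3.0, 1.0, 1.0]
ratios, cumulative = explained_variance(eigenvalues)
k80 = components_needed(eigenvalues, threshold=0.8)
```

      A sharp drop in the per-component ratios after the first few components would suggest a low effective dimensionality; a gradual decline would suggest that two dimensions are a convenient rather than a natural choice.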

      Minor comments on writing etc:

      Intro

      -Needs some references on immune activation sequelae paragraph.

      We have added some additional references to this section.

      -"promote the entry of recently infected cells into the reservoir" -- that is only one possible mechanistic explanation, it's not unreasonable but it seems important to keep options open until we have more precise data that can illuminate the mechanism of the overabundance.

      We have modified the text to discuss additional hypotheses.

      -You might also reference Pankau et al Ppath for viral seeding near the time of ART.

      We have added this reference.

      -"Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" - this was unclear to me, do you mean HIV-specific cells that act against HIV during ART? I think most studies show immunity against HIV (CD8 and CD4) wanes over time during ART.

      The Goonetilleke lab has recently generated data indicating that antiviral T cell responses are remarkably stable over time on ART, but we agree with the reviewer that the idea that ongoing antigen expression in the reservoir maintains these cells is speculative.  We have modified the text to make this point clearer.

      -Overall I think the introduction lacked a little bit of definitional precision: i.e. is the reservoir intact vs replication competent vs all HIV DNA and whether we are talking about PWH on long-term ART and how long we should be imagining? The first years of ART are certainly different than later, in terms of dynamics. The ultimate implications are likely specific for some of these categorizations.

      -"persistent sequelae of the massive disruptions to T cell homeostasis and lymphoid structures that occur during untreated HIV infection" needs a lot more context/referencing. For instance, Peter Hunt showed a decrease in activation after ART a long time ago.

      -Heather Best et al show T cell clonality stays perturbed after ART.

      We have updated the text of the introduction and added references to address the reviewer’s comments.

      Results

      -It would be important to mention the race of participants and any information about expected clades of acquired viruses, this gets mentioned eventually with reference to the Table but the breakdown would be helpful right away.

      We have added this information to the results section.

      -"performed Spearman correlations", may be calculated or tested?

      We have corrected the language for this sentence.

      Comments on figures:

      -Figure 1 data on linear scale (re discussion above) -- hard to even tell if there is a decay (to match with all we know from various long-term ART studies).

      -Figure 4 data is shown on ln (log_e) scale, which is hard to interpret for most people.

      -Figures 4 C,D, and E should have box plots to visually assess the significance.

      -Figure 4B legend says purple/pink but I think the colors are different in the plot, could be about transparency

      -Figure 5 it is now not clear if log_e(?).

      -Figure 6 "HIV reservoir characteristics" might be better to make this more explicit. Do you mean for instance in the 6B title Total HIV DNA per million CD4+ T cells I think?

      We have made these modifications.

      Reviewer #2 (Recommendations For The Authors):

      Minor Weaknesses:

      (1) The Introduction is too long and much of the text is not directly related to the study's research question and design.

      We have streamlined the introduction in the revised manuscript.

      (2) While no differences were seen by age or race, according to the authors, this is unlikely to be useful since the numbers are so small in some of these subcategories. Results from sensitivity analyses (e.g., excluding these individuals) may be more informative/useful.

      We agree that the lower numbers of participants for some subgroupings make it challenging to know for sure if there are any differences based on these variables. We have added text to clarify this. We have added age, race and gender to the LOCO analysis and to the variable inflation importance analysis (Table S5).

      (3) For Figure 4, based on what was described in the Results section of the manuscript, the authors should clarify that the figures show results for TOTAL HIV DNA only (not intact DNA): "Dimension reduction machine learning approaches identified two robust clusters of PWH when using total HIV DNA reservoir-associated immune cell frequencies (Figure 4A), but not for intact or percentage intact HIV DNA (Figure 4B and 4C)".

      We have added this information.

      (4) The statement on page 5, first paragraph, "Interestingly, when we examined a plot of percent intact proviruses versus time on therapy (Figure 1C), we observed a biphasic decay pattern," is not new (Peluso JCI Insight 2020, Gandhi JID 2023, McMyn JCI 2023). Prior studies have clearly demonstrated this biphasic pattern and should be cited here, and the sentence should be reworded with something like "consistent with prior work", etc.

      We have added citations to these studies and rephrased this comment.

      (5) The Cohort and sample collection sections are somewhat thin. Further details on the cohort details should include at the very minimum some description of the timing of ART initiation (is this mostly a chronic-treated cohort?) and important covariate data such as nadir CD4+ T cell count, pre-ART viral load, duration of ART suppression, etc.

      The cohort was treated during chronic infection, and we have clarified this in the manuscript.  Information regarding CD4 nadir and years on ART are included in Table 1.  Unfortunately, pre-ART viral load was not available for most members of this cohort, so we did not use it for analyses. The partial pre-ART viral load data is included with the dataset we are making publicly available.

      Reviewer #3 (Recommendations For The Authors):

      Minor points:

      (1) What is meant by CD4 nadir? Is this during primary infection or the time before ART initiation?

      We have clarified this description in the manuscript.  This term refers to the lowest CD4 count recorded during untreated infection.

      (2) The authors claim that determinants of reservoir size are starting to emerge but other than the timing of ART, I am not sure what studies they are referring to.

      We have updated the language of this section. We intended to refer to studies looking at correlates of reservoir size, and feel that this is a more appropriate term than ‘determinants’.

      (3) The discussion does not tie in the model-generated hypotheses with the known mechanisms that sustain the reservoir: clonal proliferation balanced by death and subset differentiation. It would be interesting to tie in the proposed reservoir clusters with these known mechanisms.

      We have added additional text to the manuscript to address these mechanisms.

      (4) Figure 1: Total should be listed as total HIV DNA.

      We have updated this in the manuscript.

      (5) Figure 1C: Worth mentioning the paper by Reeves et al which raises the possibility that the flattening of intact HIV DNA at 9 years may be spurious due to small levels of misclassification of defective as intact.

      We have added this reference.

      (6) "Total reservoir frequency" should be "total HIV DNA concentration"

      We respectfully feel that “frequency” is a more accurate term than “concentration”, since we are expressing the reservoir as a fraction of the CD4 T cells, while “concentration” suggests a denominator of volume.
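
      The distinction can be illustrated with a short sketch: a frequency normalizes copies to a number of cells, not to a volume. The function name and the numbers are hypothetical and are not drawn from the study.

```python
def copies_per_million_cells(hiv_dna_copies, cd4_cells_assayed):
    """Express HIV DNA as a frequency: copies per 10^6 CD4 T cells assayed."""
    if cd4_cells_assayed <= 0:
        raise ValueError("cd4_cells_assayed must be positive")
    return hiv_dna_copies * 1_000_000 / cd4_cells_assayed


# Hypothetical example (NOT study data): 50 copies detected in 200,000 cells.
frequency = copies_per_million_cells(50, 200_000)
```

      The denominator here is a cell count, so the result is unchanged by how much buffer the cells were suspended in, which is why "frequency" fits better than "concentration".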

      (7) Figure S2-5: label y-axis total HIV DNA.

      We have updated this figure.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Rebuttal_ Preprint- #RC-2023-02144

      First of all, we would like to thank the three reviewers for their constructive and positive comments and suggestions, and for the time spent reviewing our manuscript. Their suggestions and comments have helped improve our manuscript. We feel the manuscript is much strengthened by this revision.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: The manuscript by Dabsan et al builds on earlier work of the Igbaria lab, who showed that ER-luminal chaperones can be refluxed into the cytosol (ERCYS) during ER stress, which constitutes a pro-survival pathway potentially used by cancer cells. In the current work, they extend these observations and propose a role for DNAJB12&14 in ERCYS. The work is interesting and the topic is novel and of great relevance for the proteostasis community. I have a number of technical comments:

      We thank the reviewer for his/her positive comments on our manuscript.


      Major and minor comments:

      1- In the description of Figure 2, statistics are only shown to compare the untreated condition with those treated with Tg or Tm, but there is no comparison between conditions and different proteins. As such, the statement made by the authors "...DNAJB14-silenced cells were only affected in AGR2 but not in DNAJB11 or HYOU1 cytosolic accumulation" cannot be made.

      Answer: We fully agree with Reviewer #1. The aim of this figure is to show that during ER stress, a subset of ER proteins is refluxed to the cytosol. This happens in cells expressing DNAJB12 and DNAJB14. We are not comparing the identity of the expelled proteins between DNAJB12-KD cells and DNAJB14-KD cells. This is not within the scope of this paper; as such, the statement was removed.

      2- Figure S2C: D11 seems to increase in the cytosolic fraction after Tm and Tg treatment. However, this is not reflected in the text. The membrane fraction also increases in the DKO. Is the increase of D11 in both cytosol and membrane an indication for a transcriptional induction of this protein by Tm/Tg? Again, the authors are not reflecting on this in their text.

      Answer: We performed qPCR experiments in control, DNAJB12-KD, DNAJB14-KD and DNAJB12/DNAJB14 double-knockdown cells (in both A549 and PC3 cells) to follow the mRNA levels of DNAJB11. As shown in Figure S2F-S2N, there is no increase in the mRNA levels of DNAJB11, AGR2 or HYOU1 in the different cells under normal (unstressed) conditions. Upon ER stress with tunicamycin or thapsigargin, there is a slight increase in the mRNA levels of HYOU1 and AGR2 but not of DNAJB11. We also performed western blot analysis and did not detect any difference between the different knockdown cells when we analyzed the levels of DNAJB11 relative to GAPDH. Those data are now added as Figure S2F-S2N.

      We must note that although AGR2 and HYOU1 are induced at the mRNA level as a result of ER stress, the data with the overexpression of DNAJB12 and DNAJB14 are important as control experiments, because overexpression of DNAJB12 does not induce ER stress (Figure S3C-S3D). Under those conditions, there is an increase in the cytosolic accumulation of AGR2, HYOU1 and DNAJB11 even though there was no induction of AGR2, HYOU1 or DNAJB11 (Figure 3C and Figure 3E, Figure S3, Figure 4, and Figure S4). Those results argue against the idea that the reflux is a result of protein induction and an increase in total protein levels.

      3- Figure 2D: Only p21 is quantified. phospho-p53 and p53 levels are not quantified.


      Answer: We added the quantification of phospho-p53 and p53 levels to Figure 2E-G. Additional blots of p21, phospho-p53 and p53 are now added to Figure S2O.

      4- Figure 2D: There appears to be a labelling error

      Answer: Yes, the labelling error was corrected.

      5- Are there conditions where DNAJB12 would be higher?

      Answer: In some cancer types, DNAJB12, DNAJB14 and SGTA expression levels are higher and are associated with poor prognosis and reduced survival (new Figure S6E-M). The following was added to the manuscript: “Finally, we tested the effect of DNAJB12, DNAJB14, and SGTA expression levels on the survival of cancer patients. A high copy number of DNAJB12 is an unfavorable marker in colorectal cancer and in head and neck cancer because it is associated with poor prognosis in those patients (Figure S6E). A high copy number of DNAJB12, DNAJB14, and SGTA is associated with poor prognosis in many other cancer types, including colon adenocarcinoma (COAD), acute myeloid leukemia (LAML), adrenocortical carcinoma (ACC), mesothelioma (MESO), and Pheochromocytoma and paraganglioma (PCPG) (Figure S6F-M). In uveal melanoma (UVM), a high copy number of the three tested genes, DNAJB12, DNAJB14, and SGTA, are associated with poor prognosis and poor survival (Figure S6I, S6J, and S6M). The high copy number of DNAJB12, DNAJB14, and SGTA is also associated with poor prognosis in many other cancer types but with low significant scores. More data is needed to make significant differences (TCGA database). We suggest that the high expression of DNAJB12/14 and SGTA in those cancer types may account for the poor prognosis by inducing ERCYS and inhibiting pro-apoptotic signaling, increasing cancer cells' fitness.”

      6- What do the authors mean by "just by mass action"?

      Answer: Mass action means increasing the amount of the protein (overexpression). We corrected this in the main text to overexpression.

      7- Figure 3C: Should be labelled to indicate membrane and cytosolic fraction. The AGR2 blot in the left part is not publication quality and should be replaced.

      Answer: We added labelling to Figure 3C to indicate the cytosolic and membrane fractions. We re-blotted AGR2, and the new AGR2 blot was added.

      8- What could be the reason for the fact that DNAJB12 is necessary and sufficient for ERCYS, while DNAJB14 is only necessary?

      Answer: Because of their very high homology, we speculate that the two proteins are partially redundant. The redundancy is only partial because we believe that some of the roles of DNAJB12 cannot be carried out by DNAJB14 in its absence. Although they are highly homologous, we expect that they have different affinities in recruiting other factors that are necessary for the reflux of proteins.

      We have elaborated on this point in the discussion and the main text.

      9- Figure 5A: Is the interaction between SGTA and JB12 UPR-independent? HSC70 seems to show only background binding. The interaction of JB12 with SGTA is not convincing. A better blot is needed.

      Answer: Under the conditions of Figure 5A, we did not observe any induction of the UPR (Figure S3C-D). Thus, we concluded that under these overexpression conditions, DNAJB12 interacts with SGTA in a UPR-independent manner.

      We repeated this experiment another 3 times with a very high number of cells (2X15cm2 culture dishes for each condition), and instead of co-immunoprecipitating with DNAJB12 antibodies we immunoprecipitated with FLAG beads. The results are very clear, as shown in the new Figure 5A compared to Figure S5A.

      10- Figure 5B: the expression of DNAJB14 was induced by Tg50, but not by Tg25 or Tm. However, the authors have not commented on this. This should be mentioned in the text and discussed.

      Answer: In most of the experiments we did not see an increase in DNAJB14 upon ER stress, except in this replicate. To be sure, we examined DNAJB14 levels upon ER stress at the protein level and by qPCR, as shown in the input lanes of the new Figure 5 and Figure S5 and in Figure S5H-I. We also added new IP experiments in Figure 5 and Figure S5.

      11- Figure 6A: Why is a double knockdown important at all? DNAJB14 does not seem to do much at all (neither in overexpression nor with single knockdown).

      Answer: The data show that DNAJB12 can compensate for the lack of DNAJB14, while DNAJB14 can only partially compensate for some of the DNAJB12 functions. DNAJB12 could have a higher affinity for recruiting other factors needed for the reflux process, and thus the impact of DNAJB12 is greater. In summary, neither DNAJB12 nor DNAJB14 is essential in the single knockdowns, which means that they compensate for each other. In the overexpression experiment, endogenous DNAJB14 is enough for DNAJB12 activity. When DNAJB14 is overexpressed at very high levels, we believe that it binds to some factors that are needed for proper DNAJB12 activity (Figure 4 shows that WT-DNAJB14 inhibits ER stress-induced ER protein reflux when overexpressed). We believe that DNAJB14 is important because only when we knock down both DNAJB12 and DNAJB14 do we see an effect on ER protein reflux. DNAJB14 is part of a complex with DNAJB12, HSC70 and SGTA.

      (DNAJB12 is sufficient while DNAJB14 is not; please refer to point #8 above.)

      **Referees cross-commenting**

      I agree with the comments raised by reviewer 1 about the manuscript. I also agree with the points written in this consultation session. In my opinion, the comments of reviewer 2 are phrased in a harsh tone and thus the reviewer reaches the conclusion that there are "serious" problems with this manuscript. However, I think that the authors could address many of the points of this reviewer in a matter of 3 months easily. For instance, it is easy to control for the expression levels of exogenous wild type and mutant D12 and compare it to the endogenous one (point 3). This is a very good point of this reviewer and I agree with this experiment. Likewise, it is easy to provide data about the levels of AGR2 to address the concern whether its synthesis is affected by D12 and D14 overexpression. Again, an excellent suggestion, but no reason for rejecting the story. As for not citing the literature, I think this can also easily be addressed and I am sure that this is just an oversight and no ill intention by the authors. Overall, I am unable to see why the reviewer reaches such a negative verdict about this work. With proper revisions that might take 3 months, I think the points of all reviewers can be addressed.

      Reviewer #1 (Significance (Required)):

      Significance: The strength of the work is that it provides further mechanistic insight into a novel cellular phenomenon (ERCYS). The functions for DNAJB12&14 are unprecedented and therefore of great interest for the proteostasis community. Potentially, the work is also of interest for cancer researchers, who might capitalize on ERCYS to establish DNAJB12/14 as novel therapeutic targets. The major weaknesses are as follows: (i) the work is limited to a single cell line. To better probe the cancer relevance, the work should have used at least a panel of cell lines from one (or more) cancer entity. Ideally even data from patient derived samples would have been nice. Having said this, I also appreciate that the work is primarily in the field of cell biology and the cancer-centric work could be done by others. Certainly, the current work could inspire cancer specialists to explore the relevance of ERCYS. (ii) No physiological or pathological condition is shown where DNAJB12 is induced or depleted.

      Answer: We previously showed that ERCYS is conserved in many different cell lines including A549, MCF7, GL-261, U87, HEK293T, MRC5 and others and is also conserved in murine models of GBM (GL-261 and U87 derived tumors) and human patients with GBM (Sicari et al. 2021). Here, we tested the reflux process and the IP experiments in many different cell lines including A549, MCF-7, PC3 and Trex-293 cells. We also added new fractionation experiments in DNAJB12- and DNAJB14-depleted MCF-7, PC3 and A549 cells. We added all those data to the revised version.

      We also added survival curves from the TCGA database showing that a high copy number of DNAJB12, DNAJB14 and SGTA is associated with poor prognosis compared to conditions where DNAJB12, DNAJB14, and SGTA are at low copy number (Figure S6E-M). Finally, we included immunofluorescence experiments to show that the interaction between the refluxed AGR2 and the cytosolic SGTA occurs in tumors collected from patients with colorectal cancer (Figure S5F-G) but not in non-cancerous tissue.

      This study is highly significant and is relevant not only to cancer but also to other pathways that may behave in a similar manner. For instance, DNAJB12 and DNAJB14 are part of the mechanism that is used by non-enveloped viruses to escape from the ER to the cytosol. Thus, the role of those DNAJB proteins seems to be mainly in the reflux of functional (not misfolded) proteins from the ER to the cytosol. We reported earlier that UDP-Glucose-Glucosyl Transferase 1 (UGGT1) is also expelled during ER stress. UGGT1 is important because it is redeployed to the cytosol during enterovirus A71 (EA71) infection to help viral RNA synthesis (Huang et al, 2017). This redeployment of UGGT1 is similar to what happens during the reflux process because, on the one hand, UGGT1 exits the ER by an ER stress-mediated process (Sicari et al. 2021) and, on the other hand, it is functional in the cytosol as a protein that helps viral RNA synthesis (Huang et al, 2017). All those data show that there is more to DNAJB12, DNAJB14, DNAJC14, DNAJC30 and DNAJC18 that still needs to be explored beyond what is published. Thus, we suggest that viruses hijacked this evolutionarily conserved machinery and use it to escape from the ER to the cytosol in a manner that depends on all the components needed for ER protein reflux.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors present a study in which they ascribe a role for a complex containing DNAJB12/14-Hsc70-SGTA in facilitating reflux of AGR2 from the ER to cytosol during ER-stress. This function is proposed to inhibit wt-P53 during ER-stress.

      Concerns: 1. The way the manuscript is written gives the impression that this is the first study about mammalian homologs of yeast HLJ1, while there are instead multiple published papers on mammalian orthologs of HLJ1. Section 1 and Figure 1 of the results section is redundant with a collection of previously published manuscripts and reviews. The lack of proper citation and discussion of previous literature prevents the reader from evaluating the results presented here, compared to those in the literature.

      Answer: We highly appreciate the reviewer’s comments. The purpose of this paper is not to show that DNAJB12 and DNAJB14 are the orthologues of HLJ-1, but rather to show that DNAJB12 and DNAJB14 are part of a mechanism that we recently discovered and called ERCYS, which causes proteins to be refluxed out of the ER and which is regulated by HLJ-1 in yeast. ERCYS is an adaptive and pro-survival mechanism that results in increased chemoresistance and survival in cancer cells. The papers that Reviewer #2 refers to are the ones reporting that DNAJB12 can replace some of the ER-Associated Degradation (ERAD) functions of HLJ-1 in the degradation of membrane proteins such as CFTR. These two mechanisms are totally different, and the role of the yeast HLJ-1 in degradation of CFTR is not needed for ERCYS. This is because we previously showed that the role of the yeast HLJ-1, and probably its orthologues, in ERCYS is independent of their activity in ERAD (Igbaria et al. 2019). Surprisingly, the role of HLJ-1 in refluxing the ER proteins is not only independent of the reported ERAD functions of HLJ-1 and the mammalian DNAJBs but proceeds more vigorously when ERAD is crippled (Igbaria et al. 2019). This role of DNAJBs is unique in cancer cells and is responsible for regulating the activity of p53 during treatment with DNA-damaging agents.

      In our current manuscript we show by similarity, functionality, and topological orientation that DNAJB12 and DNAJB14 may be part of a well-conserved mechanism to reflux proteins from the ER to the cytosol, a mechanism that is independent of DNAJB12/14’s reported activity in ERAD (Grove et al. 2011; Yamamoto et al. 2010; Youker et al. 2004). In addition, DNAJB12 and DNAJB14 facilitate the escape of non-enveloped viruses from the ER to the cytosol in a way similar to the reflux process (Goodwin et al. 2011; Igbaria et al. 2019; Sicari et al. 2021). All those data show that the reported function of HLJ-1 may be only the beginning of our understanding of the roles that those orthologues carry out, which are different from what is known about their ERAD function.

      Action: We added the references to the main text and discussed the differences between the reported DNAJB12 and HLJ-1 functions and the function of DNAJB12, DNAJB14 and the other DNAJ proteins in the reflux process. We also elaborated on this in the discussion.

      The conditions used to study DNAJB12 and DNAJB14 function in AGR2 reflux from the ER do not appear to be of physiological relevance. As seen below they involve two transfections and treatment with two cytotoxic drugs over a period of 42 hours. The assay for ERCYS is accumulation of lumenal ER proteins in a cytosolic fraction. Yet, there is no data or controls that describe the path taken by AGR2 from the ER to cytosol. It seems like pleiotropic damage to the ER due to the experimental conditions and accompanying cell death could account for the reported results?

      A. Transfection of cells with siRNA for DNAJB12 or DNAJB14 with a subsequent 24-hour growth period.

      B. Transfection of cells with a p53-luciferase reporter.

      C. Treatment of cells with etoposide for 2 hours to inhibit DNA synthesis and induce p53.

      D. Treatment of cells for 16 hours with tunicamycin to inhibit addition of N-linked glycans to secretory proteins and cause ER stress.

      E. Subcellular fractionation to determine the localization of AGR2, DNAJB11, and HYOU1.

      KD of DNAJB12 or DNAJB14 has modest if any impact on AGR2 accumulation in the cytosol. There is an effect of the double KD of DNAJB12 and DNAJB14 on AGR2 accumulation in the cytosol. Yet there are no western blots showing AGR2 levels in the different cells, so it is possible that AGR2 is not synthesized in cells lacking DNAJB12 and DNAJB14. The lack of controls showing the impact of single and double KD of DNAJB12 and DNAJB14 on cell viability and ER homeostasis makes it difficult to interpret the results presented. How many control versus siRNA KD cells survive the protocol used in these assays?


      Answer: Despite the long protocol, we see differences between the control cells and the DNAJB-silenced cells in the quantity of proteins refluxed to the cytosol. The luciferase construct was used to assess the activity of p53, so the second transfection was used only in experiments where we assayed the p53-luciferase activity. The rest of the experiments, especially those where we tested p53 and p21 levels, were performed with one transfection. Moreover, all the subcellular protein fractionation experiments were performed after one transfection, without the second transfection of the p53-luciferase reporter. Finally, the subcellular protein fractionation protocol first requires trypsinizing the cells to lift them from the plates; at the time of the experiment the cells were at almost 70-80% confluency and showed the right morphology under the microscope.

      Here, we performed an XTT assay and a Caspase-3 assay to assess cell death at the end of the experiment and before the fractionation assay. We did not observe any differences at this stage between the different cell lines (Figure RV1, for reviewers only). This can be explained by the fact that we use low concentrations of Tm and Tg for a short time of 16 hours after the pulse of etoposide.

      Finally, the claim that ER-membrane damage results in mixing of the ER and cytosolic components is not true, for the following reasons: (1) in the case of mixing we would expect GAPDH levels in the membrane fraction to increase, which we do not see, and (2) we used our previously described transmembrane-eroGFP (TM-eroGFP), which harbors a transmembrane domain and is attached to the ER membrane facing the ER lumen. The TM-eroGFP was found to be oxidized in all conditions tested. Those data argue against a rupture of the ER membrane, which would mix the highly reducing cytosolic environment with the highly oxidizing ER environment through the passage of the tripeptide GSH from the cytosol to the ER. All those data argue against (1) cell death and (2) rupture of the ER membrane. Figure RV1 Reviewers Only.

      Moreover, as shown in Figure S2, AGR2 is found in the membrane fraction in all four knockdowns; thus it is synthesized in all of them. We also assayed the mRNA levels of AGR2 in all the knockdowns and saw that they are at the same level in all 4 conditions, and that AGR2 mRNA levels still increase upon ER stress in all 4 knockdown cells in different backgrounds (Figure S2F-N).

      In Figure 3 the authors overexpress WT-D12 and H139Q-D12 and examine induction of the p53-reporter. There are no western blots showing the expression levels of WT-D12 and H139Q-D12 relative to endogenous DNAJB12. HLJ1 stands for high-copy lethal DnaJ1 as overexpression of HLJ1 kills yeast. The authors present no controls showing that WT-D12 and H139Q-D12 are not expressed at toxic levels, so the data presented is difficult to evaluate.

      Answer: The expression levels of overexpressed DNAJB12 and DNAJB14 were presented in the initial submission of the manuscript as Figure S3A and S3B. The data showing the relationship between the degree of expression and viability were also included in the initial submission as Figure S3C (now S3H).

      There is no mechanistic data used to help explain the putative role of DNAJB12 and DNAJB14 in ERCY? In Figure 4, why does H139Q JB12 prevent accumulation of AGR2 in the cytosol? There are no westerns showing the level to which DNAJB12 and DNAJB14 are overexpressed.


      Answer: The data showing the levels of DNAJB12 compared to the endogenous protein were presented in the initial submission as Figure S3A and S3B.

      We suggest a mechanism by which DNAJB12 and DNAJB14 interact (Figure 5 and Figure S5) and oligomerize to expel those proteins, in a way similar to the expulsion of non-enveloped viruses to the cytosol. Thus, the effect of expressing the DNAJB12 H139Q mutant may indicate that the J-domain dead mutant can still be part of the complex but impairs the J-domain activity in this oligomer and thus inhibits ER-protein reflux. In other words, we showed that H139Q exhibits a dominant negative effect when overexpressed. Moreover, here we added another IP experiment in the D12/D14-DKD cells to show that in the absence of DNAJB12 and DNAJB14, SGTA cannot bind the ER-lumenal proteins because they are not refluxed (Figure 5 and Figure S5). Those data indicate that for SGTA to bind the refluxed proteins, they have to pass through DNAJB12 and DNAJB14, and in their absence this interaction does not occur. This explanation was also present in the discussion of the initial submission.

      Mechanistically, we show that AGR2 interacts with DNAJB12/14, which are necessary for its reflux. This mechanism involves the functionality of cytosolic HSP70 chaperones and their cochaperone (SGTA) proteins, which are recruited by DNAJB12 and DNAJB14. This mechanism is conserved from yeast to mammals. Moreover, using the AlphaFold prediction tools, we found that AGR2 is predicted to interact with SGTA in the cytosol through an interaction between the cysteines of SGTA and AGR2 in a redox-dependent manner.

      **Referees cross-commenting**

      I appreciate the comments of the other reviewers. I agree that the authors could revise the manuscript. Yet, based on my concerns about the physiological significance of the process under study and lack of scholarship in the original draft, I would not agree to review a revised version of the paper.

      Answer: Regarding the physiological relevance, we showed in our previous study (Sicari et al. 2021) how relevant ERCYS is in human patients with GBM and in a murine model of GBM. ERCYS is conserved from yeast to human and is constitutively active in the GL-261 GBM model, the U87 GBM model and human patients with GBM (Sicari et al. 2021). Here, we extended that to other tumors and showed that high levels of DNAJB12, DNAJB14 and SGTA are associated with poor prognosis in many cancer types (Figure S6). We also added data showing the interaction of SGTA with AGR2 in CRC samples obtained from human patients compared to healthy tissue (Figure S5). This study is highly significant and is relevant not only to cancer but also to other pathways that may behave in a similar manner. For instance, DNAJB12 and DNAJB14 are part of the mechanism that is used by non-enveloped viruses to escape the ER to the cytosol. Thus, the role of those DNAJB proteins seems to be mainly in the reflux of functional (not misfolded) proteins from the ER to the cytosol. We reported earlier that UDP-Glucose-Glucosyl Transferase 1 (UGGT1) is also expelled during ER stress. UGGT1 is important because it is redeployed to the cytosol during enterovirus A71 (EA71) infection to help viral RNA synthesis (Huang et al. 2017). This redeployment during EA71 infection is similar to what happens during the reflux process because, on the one hand, UGGT1 exits the ER by an ER stress-mediated process (Sicari et al. 2021) and, on the other hand, it is functional in the cytosol as a protein that helps viral RNA synthesis (Huang et al. 2017). All those data show that there is more to DNAJB12, DNAJB14, DNAJC14, DNAJC30 and DNAJC18 that still needs to be explored in addition to what is published. We suggest that viruses hijacked this evolutionarily conserved machinery and succeeded in using it in order to escape.

      We appreciate the time spent reviewing our paper and we are sorry that the reviewer reached such a verdict, which is also not understood by the other reviewers. Most of the points raised by reviewer #2 were already addressed and explained in the initial submission; nevertheless, we appreciate the time and the comments of reviewer #2 on our manuscript.

      Reviewer #2 (Significance (Required)):

      Overall, there are serious concerns about the writing of this paper as it gives the impression that it is the first study on higher eukaryotic and mammalian homologs of yeast HLJ1. The reader is not given the ability to compare the presented data to related published work. There are also serious concerns about the quality of the data presented and the physiological significance of the process under study. In its present form, this work does not appear suitable for publication.

      Answer: Again we thank reviewer #2 for giving us the opportunity to explain how significant this manuscript is, especially for people who are less expert in this field. The significance of this paper lies in (1) showing the unique role of DNAJB12 and DNAJB14 in the molecular mechanism of the reflux process in mammalian cells (not their role in ERAD), (2) showing the implication of other cytosolic chaperones in the process, including HSC70 and SGTA, (3) our AlphaFold prediction showing that this process may be redox dependent, implicating the cysteines of SGTA in extracting the ER proteins, (4) showing that overexpression of WT DNAJB12 is sufficient to drive this process, (5) showing that mutation in the HPD motif prevents the reflux process, probably by preventing binding to the cytosolic chaperones, and (6) showing that both DNAJB12 and DNAJB14 are needed for the interaction between the refluxed ER proteins and the cytosolic chaperones to occur.

      In summary, this study is highly significant in terms of physiology: we previously reported that ERCYS is conserved in mammalian cells and is constitutively active in human and murine tumors (Sicari et al. 2021). Moreover, DNAJB12 and DNAJB14 are part of the mechanism that is used by non-enveloped viruses to escape the ER to the cytosol, in a mechanism that is similar to the reflux process (Goodwin et al. 2011; Goodwin et al. 2014). Thus, the role of those DNAJB proteins seems to be mainly in the reflux of functional proteins from the ER to the cytosol; viruses used this evolutionarily conserved machinery and succeeded in using it in order to escape. This paper does not deal with the functional orthologues of HLJ-1 in ERAD but rather suggests a mechanism by which soluble proteins exit the ER to the cytosol.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Reflux of ER based proteins to the cytosol during ER stress inhibits wt-p53. This is a pro-survival mechanism during ER stress, but as ER stress is high in many cancers, it also promotes survival of cancer cells. Using A549 cells, Dabsan et al. demonstrate that this mechanism is conserved from yeast to mammalian cells, and identify DNAJB12 and DNAJB14 as putative mammalian orthologues of yeast HLJ1.

      This paper shows that DNAJB12 and 14 are likely orthologues of HLJ1 based on their sequences, and their behaviour. The paper develops the pathway of ER-stress > protein reflux > cytosolic interactions > inhibition of p53. The authors demonstrate this nicely using knock downs of DNAJB12 and/or 14 that partially blocks protein reflux and p53 inhibition. Overexpression of WT DNAJB12, but not the J-domain inactive mutant, blocks etoposide-induced p53 activation (this is not replicated with DNAJB14) and ER-resident protein reflux. The authors then show that DNAJB12/14 interact with refluxed ER-resident proteins and cytosolic SGTA, which importantly, they show interacts with the ER-resident proteins AGR2, PRDX4 and DNAJB11. Finally, the authors show that inducing ER stress in cancer cell lines can increase proliferation (lost by etoposide treatment), and that this is partially dependent on DNAJB12/14.

      This is a very interesting paper that describes a nice mechanism linking ER-stress to inhibition of p53 and thus survival in the face of ER-stress, which is a double edged sword regarding normal v cancerous cells. The data is normally good, but the conclusions drawn oversimplify the data that can be quite complex. The paper opens a lot of questions that the authors may want to develop in more detail (non-experimentally) to work on these areas in the future, or alternatively to develop experimentally and develop the observations further. There are only a few experimental comments that I make that I think should be done to publish this paper, to increase robustness of the work already here, the rest are optional for developing the paper further.

      We thank the reviewer for his/her positive comments. His/her comments helped make our manuscript stronger.

      Major comments:

      1. Number of experimental repeats must be mentioned in the figure legends. Figures and annotations need to be aligned properly

      Answer: All experiments were repeated at least 3 times. We added the number of repeats for each figure in the figure legends.

      Results section 2:

      No intro to the proteins you've looked at for relocalization. Would be useful to have some info on why you chose AGR2. Apart from them being ER-localized, do they all share another common characteristic? Does ability to inhibit p53 vary in potency?

      Answer: We previously showed that AGR2 is refluxed from the ER to the cytosol to bind and inhibit wt-p53 (Sicari et al. 2021). Here, we used AGR2 because (1) we know that AGR2 is refluxed from the ER to the cytosol, and (2) we know which novel functions it gains in the cytosol, so we are able to measure and provide a physiological significance for those novel functions when the levels of DNAJB12 and DNAJB14 are altered. Moreover, we used the DNAJB11 (41 kDa) and HYOU1 (150 kDa) proteins to show that alteration of DNAJB12 or DNAJB14 prevents the reflux of small, medium and large proteins. We added a sentence in the discussion stating that DNAJB12/14 are responsible for the reflux of ER-resident proteins independently of their size. We also added in the results section that we are looking at proteins of different sizes and activities.


      What are the roles of DNAJB12/14 if overexpression can induce reflux? Does it allow increased binding of an already cytosolic protein, causing an overall increase in an interaction that then causes inhibition of p53? What are your suggested mechanisms?

      Answer: It was previously reported that overexpressed DNAJB12 and DNAJB14 tend to form membranous structures within cell nuclei, which were designated DJANGOS, for DNAJ-associated nuclear globular structures (Goodwin et al. 2014). Because those structures, which contain both DNAJB12 and DNAJB14, also form on the ER membrane (Goodwin et al. 2014), we speculate that during stress DNAJB12/14 overexpression may facilitate ERCYS. Interestingly, those structures contain Hsc70 and markers of the ER lumen and of the ER and nuclear membranes (Goodwin et al. 2014).

      The discussion was edited accordingly to further strengthen and clarify this point.

      Fig3: A+B show overexpression of individual DNAJs but not combined. As you go on to discuss the effect of the combination on AGR2 reflux, it would be useful to include this experimentally here.

      Answer: This is a great idea, and we tried to do it for a long time. Unfortunately, when we used cells overexpressing DNAJB12 under the doxycycline promoter and transfected them with a plasmid expressing DNAJB14 under the CMV promoter, most of the cells floated within 24 hours, compared to cells transfected with the empty vector alone or with DNAJB14-H136Q. We also overexpressed DNAJB14 in cells with conditional DNAJB12 expression, and this was also lethal in Trex293T cells and A549 cells.

      Fig 3C: Subfractionation of cells shows AGR2 in the cytosol of A549 cells. The quality of the data is good but the bands are very high on the blot. For publication is it possible to show this band more centralized so that we are sure that we are not missing bands cut off in the empty and H139Q lanes?

      Also, you have some nice immunofluorescence in the 2021 EMBO reports paper, is it possible to show this by IF too? It is not essential for the story, but it would enrich the figure and support the biochemistry nicely. Also it is notable that the membrane fraction of the refluxed proteins doesn't appear to have a decrease in parallel (especially for AGR2). Is this because the % of the refluxed protein is very small? Is there a transcriptional increase of any of them (the treatments are 12+24 h so it would be enough time)? This could be a nice opportunity to discuss the amount of protein that is refluxed, whether this response is a huge emptying of the ER or more like a gentle release, and also the potency of the gain of function and effect on p53 vs the amount of protein refluxed. This latter part isn't essential but it would be a nice element to expand upon.

      Answer: We re-blotted AGR2 and added the new blot. More blots were also added in Figure S2, and the text was edited accordingly.

      In the new Figure S5 we added an immunofluorescence experiment on tumor and non-tumor tissues obtained from colorectal cancer (CRC) patients, showing that the interaction between SGTA and the refluxed AGR2 also occurs in more physiological settings. It also emphasizes that the suggested mechanism implicating SGTA is valid in CRC tumors.

      We performed qPCR experiments in control, DNAJB12-KD, DNAJB14-KD and DNAJB12/DNAJB14 double knockdown cells (in both A549 and PC3 cells) to follow the mRNA levels of DNAJB11. As shown in Figure S2F-N, there is no increase in the mRNA levels of DNAJB11, AGR2 or HYOU1 in the different cells in normal (unstressed) conditions. Upon ER stress with tunicamycin or thapsigargin there is a small increase in the mRNA levels of HYOU1 and AGR2 but not in DNAJB11 mRNA levels. We also performed western blot analysis and did not detect any difference between the different knockdown cells when we analyzed the levels of DNAJB11 compared to GAPDH. Those data are now added to Figure S2F-N. We must note that AGR2 and HYOU1 are induced at the mRNA level as a result of ER stress. The data with the overexpression of DNAJB12 and DNAJB14 are an important control experiment in which we show reflux when DNAJB12 is overexpressed without inducing ER stress (Figure 3, Figure 4, and Figure S3). In those conditions no induction of AGR2, HYOU1 or DNAJB11 was observed. Those results argue against reflux being a result of protein induction and an increase in protein levels.

      The overall protein levels at steady state are a function of how much protein is made, degraded and probably secreted outside the cell. We do see in Figure S2 that under ER stress there are some differences in the mRNA levels; moreover, in our work in yeast we showed that the expelled proteins have a very long half-life in the cytosol (Igbaria et al. 2019). Because it is difficult to assay how much of the mRNA is translated, how much of the protein is stable or degraded, and the stability of the cytosolic fraction versus the ER fraction, it is hard to draw conclusions about the stability and the levels of the proteins.

      Those data are now added to the manuscript, the text is edited accordingly.

      You still mention DNAJB12 and 14 as orthologues, even though DNAJB14 has no effect on p53 activity when overexpressed. Do you think that this piece of data diminishes this statement?

      Answer: The fact that DNAJB12 and DNAJB14 are highly homologous and that only the double knockdown has a great effect on the reflux process may indicate that they are redundant. Moreover, the fact that only DNAJB12 is sufficient may indicate that some DNAJB12 functions cannot be carried out by DNAJB14. On the one hand they share common activities, as shown in the double knockdown, and on the other hand DNAJB12 has a unique function that may not be compensated for by DNAJB14 when overexpressed.

      Fig 3D/F: Overexpression of DNAJB14 induces reflux of DNAJB11 at 24h, what does this suggest? Does this indicate having the same role as DNAJB12 but less potently? What's your hypothesis?

      Answer: ERCYS is a new and interesting phenomenon, and the redistribution of proteins to the cytosol has recently been documented by many groups. Despite that, we still do not know the specificity of DNAJB12 and DNAJB14 for the refluxed proteins. DNAJB11 is a glycosylated protein, and we are now testing whether other glycosylated proteins prefer the DNAJB14 pathway or not. These data are beyond the scope of this paper.

      "This suggests that the two proteins may have different functions when overexpressed, despite their overlapping and redundant functions" What does it suggest about their dependence on each other? If overexpression of WT DNAJB12 inhibits Tg induced reflux, is it also blocking the ability of DNAJB14 to permit flux?

      Answer: We hypothesize that it is all about the stoichiometry and the ratios between proteins. When we overexpress DNAJB14 (the one that is not sufficient to cause reflux), it may hijack common components and factors by binding to them non-specifically. Those factors may be needed for DNAJB12 to function properly (like the dominant negative effect of the DNAJB12-HPD mutant, for instance). On the other hand, DNAJB12 may have higher affinity for some cytosolic partners and thus can do the job when overexpressed. Here, we deal with DNAJB12/DNAJB14 as essential components of the reflux process, yet we still need to identify the interactome of each of the proteins during stress and the role of the other DNAJ proteins that also share some of the topological and structural similarity to DNAJB12, DNAJB14 and HLJ-1 (DNAJC30, DNAJC14, and DNAJC18). We edited the text accordingly and integrated this into the discussion.

      Fig 4: PDI shown in blots but not commented on in text. Then included in the schematics. Please comment in the text.

      Answer: We commented on PDI in the text.

      Fig 4F: Although the quantifications of the blots look fine, the blot shown does not convincingly demonstrate this data for AGR2. The other proteins look fine, but again it could be useful to see the individual means for each experiment, or the full gels for all replicates in a supplementary figure.

      Answer: The other two repeats are in Figure S4.

      Results section 3

      Fig 5A, As there is obviously a difference between DNAJB12/14 it would be useful to do the pulldown with DNAJB14 too. Re. HSC70 binding to DNAJB12 and 14, the abstract states that DNAJB12/14 bind HSC70 and SGTA through their cytosolic J domains. Fig 5 shows pulldowns of DNAJB12 with an increased binding of SGTA in FLAG-DNAJB12 induced conditions, but the HSC70 band does not seem to be enriched in any of the conditions, including after DNAJB12 induction. This doesn't support the statement that DNAJB12 binds HSC70. In fact, in the absence of a good negative control, this would suggest that the HSC70 band seen is not specific. There is also no data to show that DNAJB14 binds HSC70. I recommend including a negative condition (ie beads only) and the data for DNAJB14 pulldown.

      Answer: In Figure 5A we used the Flp-In T-REx-293 cells, as it is easier to control and to tune the expression levels of DNAJB12 and DNAJB14 up and down. According to the new Figure S5A, DNAJB12 binds HSC70 at basal levels all the time. It was also surprising to us not to see differences upon overexpression, and we attribute that to all the HSC70 being saturated with DNAJB12. In order to better assay that, we repeated the IP in Figure 5A, but instead of the IP with DNAJB12 antibodies, we IP-ed with FLAG antibodies to selectively IP the transfected DNAJB12. As shown in the new Fig 5A, the increase of DNAJB12-FLAG is accompanied by an increase in the binding of HSC70.

      We further tested the interaction between DNAJB12, DNAJB14 and HSC70 during ER stress in cancer cells. In those cells we found that DNAJB12 and DNAJB14 bind to HSC70 and recruit SGTA upon stress. We also tested the binding between DNAJB12 and DNAJB14: in unstressed conditions there was basal binding between the two, and this interaction was stronger during ER stress. Those data are now added to Figure 5 and Figure S5, and the discussion was edited accordingly.

      The binding of DNAJB12 to SGTA under stress conditions in Fig5B looks much more convincing than SGTA to DNAJB12 in Fig 5A. Bands in all blots need to be quantified from 3 independent experiments, and repeated if not already n=3. If this is solely a technical difference, please explain in the text.

      The conclusions drawn from this interaction data are important and should be elaborated upon to support the claims made in the paper. The authors may also choose to expand the pulldowns to demonstrate their claims made on oligomerisation of DNAJB12 and 14 here. It is also clear that the interaction data of the SGTA with ER-resident proteins AGR2, PRDX4 and DNAJB11 is strong. The authors may want to draw on this in their hypotheses of the mechanism. I would imagine a complex such as DNAJB14/DNAJB12 - SGTA - AGR2/PRDX4/DNAJB11 would be logical. Have any experiments been performed to prove if complexes like this would form?

      Answer: In Figure 5A we used the Flp-In T-REx-293 cells, as it is easier to control and to tune the expression levels of DNAJB12 and DNAJB14 up and down. T-REx-293 cells are highly sensitive to ER stress; they do not die (as we did not observe elevated apoptosis markers) but they float and can regrow after the stress is gone. In Figure 5B we are using ER stress without the need to express DNAJB12 in the A549 cell line. In order to further verify those data, we repeated the IP in another cell line as well to confirm the data in 5B. We also repeated the IP in 5A with anti-FLAG antibody to improve the IP and to specifically map the interaction with the overexpressed FLAG-DNAJB12 (discussed above). All experiments were done in triplicate and added to Figure 5 and Figure S5.

      We agree with the reviewer on the complex between the refluxed proteins and SGTA. We believed that SGTA may form a complex with other refluxed ER proteins, but we were unable to see an interaction between AGR2 and DNAJB11 or between AGR2 and PRDX4 in the cytosolic fraction in the conditions tested. We could not do this in the whole cell lysate because those proteins bind each other in the ER. Finally, our structural prediction using AlphaFold suggests that the interaction between SGTA and the refluxed AGR2 (and probably others) is redox dependent and that it requires a disulfide bridge between cysteine 81 on AGR2 and cysteine 153 on SGTA. Thus, we hypothesize that SGTA binds one refluxed protein at a time.

      We repeated the figure with improvement: (1) using more cells in order to increase the amount of IP-ed proteins and to overcome the problem of the faint bands, (2) performing the IP with the FLAG antibodies instead of the DNAJB12 endogenous antibodies.

      Fig 5B: It is clear that DNAJB12 interacts with SGTA. The authors state that DNAJB14 also interacts with SGTA under normal and stress conditions, but the band in 25/50 Tg is very faint. Why would there be stronger binding at the 2 extremes than during low stress induction? In the input, there is a much higher expression of DNAJB14 in 50 Tg. What does this say about the interaction? Is there an effect of ER stress on DNAJB14 expression? A negative control should be included to show any background binding, such as a "beads only" control.

      Answer: DNAJB14 does not change with ER stress, as shown in the IPs (input) and in the qPCR experiment in Figure S5I. We added a beads-only control, and we also added new IPs to assess the binding between DNAJB14 and DNAJB12 and between DNAJB14 and SGTA. All the new IPs and controls are now added as Figure 5 and Figure S5.

      Fig 5C data is sound, although a negative control should be included.

      Answer: Negative control was added in Figure S5.

      Results section 4

      Fig 6A-B: Given that there is the complexity of overexpression v KD of DNAJB12 v 14 causing similar effects on p53 activity (Fig 2 v 3), it would be interesting to see whether the effect of overexpression mirrors the results in Fig 6A. Is it known what SGTA overexpression does (optional)?

      Answer: In the overexpression system, cells overexpressing DNAJB12 start to die between 24-48 hours as shown in Figure S3C. Thus, it is difficult to assay the proliferation of these cells in those conditions. On the other hand, overexpression of Myc-tagged SGTA in A549 cells, MCF7 or T-ReX293 did not show any reflux of ER-proteins to the cytosol and it didn’t show any significant changes in the proliferation index (Figure Reviewers only RV2).

      Fig 6D: resolution very low

      Answer: Figure 6D was changed

      Fig 6C-D: There is an interesting difference though between the proposed cytosolic actions of the refluxed proteins. You show that AGR2, PRDX4 and DNAJB11 all bind to SGTA in stress conditions, but in the schematics you show: DNAJB11 binding to HSC70 through SGTA (not shown in the paper), then also PDIA1, PDIA3 binding to SGTA and AGR2 binding to SGTA. What role does SGTA have in these varied reactions? Sometimes it is depicted as an intermediate, sometimes a lone binder, what is its role as a binder? It should be clarified which interactions are demonstrated in the paper (or before) and which are hypothesized in a graphical way (eg. for hypotheses dotted outlines or no solid fill etc). The schematics also suggest that DNAJB14 binding to HSC70 and SGTA is inducible in stress conditions, as is PDIA3, which is not shown in the paper. Discussion "In cancer cells, DNAJB12 and DNAJB14 oligomerize and recruit cytosolic chaperones and cochaperones (HSC70 and SGTA) to reflux AGR2 and other ER-resident proteins and to inhibit wt-p53 and probably different proapoptotic signaling pathways (Figure 5, and Figure 6C-6D)." You haven't shown oligomerisation between DNAJB12/14. Modify the text to make it clear that it is a hypothesis.

      Answer: We removed “oligomerize” from the text and stated that it is a hypothesis. Figures 6C-D were also changed to be consistent with the text.

      Minor comments:

      It would be useful to have page or line numbers to help with document navigation, please include them. Typos and inconsistency in how some proteins are named throughout the manuscript

      Answer: Page numbers and line numbers are added. Typos are corrected.

      Title: Include reference to reflux. Suggest: "chaperone complexes (?proteins) reflux from the ER to cytosol..." I presume it would be more likely that the proteins go separately rather than in complex. Do you have any ideas on the size range of proteins that can undergo this process?

      Answer: This is true; proteins may cross the ER membrane separately and then form a complex with cytosolic chaperones. The title is changed accordingly. As discussed earlier, the proteins we chose were of different sizes to show that they are refluxed independently of their size. Moreover, our previous work showed that the proteins that were refluxed are of different sizes, most importantly UGGT1 (around 180 kDa), which is reported to deploy to the cytosol upon viral infection (Huang et al. 2017; Sicari et al. 2020). In this study we used AGR2 (around 19 kDa) and HYOU1 (150 kDa).

      ERCY in abstract, ERCYS in intro. There are typos throughout, could be a formatting problem, please check

      Answer: Checked and corrected

      What about the selection of refluxed proteins? Is this only a certain category of proteins? Could it be anything? Have you looked at other cargo / ER resident proteins?

      Answer: In our previous study (Sicari, Pineau et al. 2020) we looked at many other proteins, especially glycoproteins from the ER. In that study we used mass spectrometry to identify new refluxed proteins, and we found 26 new glycoproteins that are refluxed in cells treated with an ER stressor and in human tissues obtained from GBM patients (Sicari, Pineau et al. 2020).

      We previously showed that AGR2 is refluxed from the ER to the cytosol to bind and inhibit p53 (Sicari, Pineau et al. 2020). Here, we selected AGR2 because (1) we know that it is refluxed, and (2) we know which novel functions it acquires in the cytosol, so we are able to measure those functions and show their physiological significance when the levels of DNAJB12 and DNAJB14 are altered. Moreover, we selected DNAJB11 (41 kDa) and HYOU1 (150 kDa) to show that altering DNAJB12 or DNAJB14 prevents the reflux of small, medium, and large proteins (independently of their size). We also showed earlier by mass spectrometry analysis that the refluxed proteins range from small to very large proteins such as UGGT1; thus, we believe that soluble ER proteins can be substrates of ERCYS independently of their size. In the discussion, we added a note that the reflux by the cytosolic and ER chaperones operates on different proteins independently of their size.

      "Their role in ERCYS and cells' fate determination depends..." Suggest change to "Their role in ERCYS and determination of cell fate..."

      Answer: changed and corrected

      I think that the final sentence of the intro could be made stronger and more concise. There's a repeat of ER and cytosol. Instead could you comment on the reflux permitting new interactions between proteins otherwise spatially separated, then the effect on wt-p53 etc.

      Answer: The sentence was rephrased as suggested to: “In this study, we found that HLJ1 is conserved through evolution and that mammalian cells have five putative functional orthologs of the yeast HLJ1. Those five DNAJ proteins (DNAJB12, DNAJB14, DNAJC14, DNAJC18, and DNAJC30) reside within the ER membrane with a J-domain facing the cytosol (Piette et al. 2021; Malinverni et al. 2023). Among those, we found that DNAJB12 and DNAJB14, which are strongly related to the yeast HLJ1 (Grove et al. 2011; Yamamoto et al. 2010), are essential and sufficient for determining cells' fate during ER stress by regulating ERCYS. Their role in ERCYS and determining cells' fate depends on their HPD motif in the J-domain. Downregulation of DNAJB12 and DNAJB14 increases cell toxicity and wt-p53 activity during etoposide treatment. Mechanistically, DNAJB12 and DNAJB14 interact and recruit cytosolic chaperones (HSC70/SGTA) to promote ERCYS. This latter interaction is conserved in human tumors including colorectal cancer.

      In summary, we propose a novel mechanism by which ER-soluble proteins are refluxed from the ER to the cytosol, permitting new inhibitory interactions between spatially separated proteins. This mechanism depends on cytosolic and ER chaperones and cochaperones, namely DNAJB12, DNAJB14, SGTA, and HSC70. As a result, the refluxed proteins gain new functions to inhibit the activity of wt-p53 in cancer cells.”

      Figure legends:

      In some cases the authors state the number of replicates, but this should be stated for all experiments. If experiments don't already include 3 independent repeats, this should be done. Check text for typos, correct letter capitalisation, spaces and random bold text (some of this could be from incompatibility when saving as PDF).

      Answer: All experiments were repeated at least three times. The number of repeats is now indicated in the figure legend of each experiment. Typos and capitalization have been corrected as well.

      Fig2E: scrambled not scramble siRNA

      Answer: corrected

      Fig 3: "to expel" is a term not used in the rest of the paper for reflux. Useful to remain consistent with terminology where possible

      Answer: Rephrased and corrected

      Results section 1:

      "Protein alignment of the yeast HLJ1p showed high amino acids similarity to the mammalian..."

      Answer: Rephrased to: “Comparing the amino acid sequences revealed significant similarity between the yeast protein HLJ1p and the mammalian proteins DNAJB12 and DNAJB14.”

      Fig 1C: state in legend which organism this is from (presumably human)

      Answer: The Figure 1C legend states: “the HPD motif within the J-domain is conserved in HLJ-1 and its putative human orthologs DNAJB12, DNAJB14, DNAJC14, DNAJC18, and DNAJC30.”

      Results Section 2

      "Test the two strongest hits DNAJB12/14" Add reference to previous paper showing this

      Answer: the references were added.

      "In the WT and J-protein-silenced A549 cells, there were no differences in the cytosolic enrichment of the three ER resident proteins AGR2, DNAJB11, and HYOU1 in normal and unstressed conditions (Figure 2A-C and Figure S2C)." I think that this is an oversimplification, and in your following discussion, you show that it's more subtle than this.

      Answer: We expanded on this both in the discussion and the results section.

      The text here isn't so clear: normal and unstressed conditions? Do you mean stressed? Please be careful in your phrases: "DNAJB12-silenced cells are slightly affected in AGR2 and DNAJB11 cytosolic accumulation but not HYOU1." This is the wrong way around. DNAJB12 silencing affects AGR2, not that AGR2 affects the cells (which is how you have written it). This also occurs again in the next para:

      Answer: Normal cells are non-cancer cells; unstressed conditions means without ER stress. The sentence was rephrased to: “In the absence of ER stress, the cytosolic levels of the three ER-resident proteins (AGR2, DNAJB11, and HYOU1) were similar in wild-type and J-protein-silenced A549 cells.”

      "During stress, DNAJB12/DNAJB14 double knockdown was highly affected in the cytosolic..." I think you mean it highly affected the cytosolic accumulation, not that it was affected by the cytosolic accumulation. Please change in the text

      Answer: The sentence is now rephrased to: “During stress, double knockdown of DNAJB12 and DNAJB14 highly affected the cytosolic accumulation of all three tested proteins.”

      "DNAJB12 and DNAJB14 are strong hits of the yeast HLJ1" Not clear, I presume you mean they are likely orthologues? Top candidates for being closest orthologues?

      Answer: This is correct; the sentence has been rephrased and corrected.

      Fig 2D: typos in WB labelling? I think Tm should be - - +, not - + + as it is now (if it's not a typo, you need more controls, e.g. Eto alone).

      Answer: The labeling is now corrected.

      Fig 2D-E-F typos for DKD? D12/D12 or D12/14?

      Answer: This is correct, thank you for pointing this out. The labeling is corrected.

      "We assayed the phosphorylation state of wt-p53 and p21 protein expression levels (a downstream target of p53 signaling) during etoposide treatment." What are the results of this? Explain what Fig 2D-E shows, then build on this with the +Tm results. Results should be explained didactically to be clear.

      Answer: The paragraph was edited and we explained the results: In these conditions, we saw an increase in the phosphorylation of wt-p53 in the control cells and in cells knocked-down with DNAJB12, DNAJB14 or both. This phosphorylation increased the protein levels of p21 as well (Figure 2D-G). Tm addition to cells treated with etoposide resulted in a reduction in wt-p53 phosphorylation, and as a consequence, the p21 protein levels were also decreased (Figure 2D-G and Figure S2O). Cells lacking DNAJB12 or DNAJB14 have partial protection in wt-p53 phosphorylation and p21 protein levels. Silencing both proteins in A549 and MCF7 cells rescued wt-p53 phosphorylation and p21 levels (Figure 2D-G and Figure S2D). Moreover, similar results were obtained when we assayed the transcriptional activity of wt-p53 in cells transfected with a luciferase reporter under the p53-DNA binding site (Figure 2H). These data confirm that DNAJB12 and DNAJB14 are involved in ER protein reflux and the inhibition of wt-p53 activity during ER stress.


      "(Figure 2D- E). Cells lacking DNAJB12 and or DNAJB14 have partial protection in wt-p53 phosphorylation and p21 protein levels."

      Answer: This sentence is now removed

      You comment on p53 phosphorylation, but you haven't quantified this. This should be done, normalized to p53 levels, if you want to draw these conclusions, especially as total p53 varies between conditions. Does Eto increase p53 txn? Does Tm alone increase p53 activity/phospho-p53? These are shown in the Sicari EMBO reports paper in 2021, you should briefly reference those.

      Answer: The blots are now quantified and a new blot is added to Figure S2D. The paragraph was edited and now references our previous paper (Sicari et al. 2021): “We then wanted to examine whether the gain of function of AGR2 and the inhibition of wt-p53 depends on the activity of DNAJB12 and DNAJB14. We assayed the phosphorylation state of wt-p53 and p21 protein expression levels (a downstream target of wt-p53 signaling) during etoposide treatment. In these conditions, there was an increase in the phosphorylation of wt-p53 in the control cells and in cells knocked down with DNAJB12, DNAJB14, or both. This phosphorylation also increased protein levels of p21 (Figure 2D-G and Figure S2O). Tm addition to cells treated with etoposide resulted in a reduction in wt-p53 phosphorylation, and as a consequence, the p21 protein levels were also decreased (Figure 2D-G and Figure S2O). Silencing DNAJB12 and DNAJB14 in A549 and MCF-7 cells rescued wt-p53 phosphorylation and p21 levels (Figure 2D-G and Figure S2O). Moreover, similar results were obtained when we assayed the transcriptional activity of wt-p53 in cells transfected with a luciferase reporter under the p53-DNA binding site (Figure 2H). In the latter experiment, etoposide treatment increased the luciferase activity in all the cells tested. Adding ER stress to those cells decreased the luciferase activity except in cells silenced with DNAJB12 and DNAJB14.

      These data confirm that DNAJB12 and DNAJB14 are involved in the reflux of ER proteins in general and AGR2 in particular. Inhibition of DNAJB12 and DNAJB14 prevented the inhibitory interaction between AGR2 and wt-p53 and thus rescued wt-p53 phosphorylation and its transcriptional activity as a consequence.”

      Fig3A: overexpression of DNAJB12 decreases Eto induced p53 but not at steady state. Is this because at steady state the activity is already basal? Or is there another reason?

      Answer: yes, at steady state the activity is already basal

      Switch Figs S3D and S3C as they are not referred to in order. Also Fig S3C: vary colour (or add pattern) on bars more between conditions

      Answer: The figures are now cited in order in the new version. Colors are now added to Figure S3C.

      Need to define HLJ1 at first mention

      Answer: Defined as “HLJ1 (High-copy Lethal J-protein), an ER-resident tail-anchored HSP40 cochaperone.”

      Results section 3

      HSC70 cochaperone (SGTA) defined twice

      Answer: the second one was removed

      "These data are important because SGTA and the ER-resident proteins (PRDX4, AGR2, and DNAJB11) are known to be expressed in different compartments, and the interaction occurs only when those ER-resident proteins localize to the cytosol." Is there a reference for this?

      Answer: Peroxiredoxin 4 is the only peroxiredoxin that is expressed in the ER. AGR2 and DNAJB11 are also ER luminal proteins that are known to be solely expressed in the ER. SGTA is part of the cytosolic quality control system and is expressed in the cytosol. The references are added in the main text.

      Results section 4

      "by almost two folds"

      Answer: corrected

      Fig 6A: It seems strange that the difference between purple and blue bars in scrambled, and D14-KD are very significant but D12-KD is only significant. Why is this? The error bars don't look that different. It would be interesting to see the individual means for each different replicate.

      Answer: Thank you for pointing this out. The two asterisks were aligned in the middle as one during figure alignment. In D14-KD, the purple bar has a lower error bar, which changes the significance when compared to the blue bar, while in D12-KD the error bars in both the eto and the eto-Tm treatments are slightly higher. Graphs of the three different replicates are now added in Figure S6. Each of the three biological replicates was repeated in three technical repeats (averaged in the graphs).

      Figures: Fig 6A: Scale bars not well placed. Annotation on final set should be D12/D14 DKD?

      Answer: Both were corrected.

      Discussion

      47. The authors mention that they want to use DNAJB12/4-HSC70/SGTA axis to impair cancer cell fitness: What effect would this have though in a non cancer model? Would this be a viable approach? Although it is obviously early days, which approach would the authors see as potentially favorable?


      Answer: In our previous study we targeted AGR2 in the cytosol because the reflux of AGR2 occurs only in cancer cells and not in normal cells. In that study we used an scFv that targets AGR2 and is expressed in the cytosol, so it targets only cytosolic AGR2, which occurs only in cancer. Here, we suggest targeting the interaction between the refluxed proteins and their new partners in the cytosol, or targeting the mechanism that causes their reflux to the cytosol, for instance by inhibiting the interaction between SGTA and DNAJB proteins.


      Second para: Should be "Here we present evidences"

      Answer: we replaced with “Here we present evidences”

      "DNAJB12 overexpression was also sufficient to promote ERCYS by refluxing AGR2 and inhibit wt-p53 signaling in cells treated with etoposide" Suggest:

      Answer: DNAJB12 overexpression is also sufficient to promote ERCYS by refluxing AGR2 and inhibit wt-p53 signaling in cancer cells treated with etoposide (Figure 3). This suggests that it is enough to increase the levels of DNAJB12 without inducing the unfolded protein response in order to activate ERCYS. Moreover, the downregulation of DNAJB12 and DNAJB14 rescued the inhibition of wt-p53 during ER stress (Figure 2). Thus, wt-p53 inhibition is independent of the UPR activation but depends on the inhibitory interaction of AGR2 with wt-p53 in the cytosol.


      DNAJB12 overexpression was also sufficient to promote ERCYS by increasing reflux of AGR2 and inhibition of wt-p53 signaling in cells treated with etoposide

      Answer: This sentence is repeated twice and was removed

      "Moreover, DNAJB12 was sufficient to promote this phenomenon and cause ER protein reflux by mass action without causing ER stress (Figure 3, Figure 4, and Figure S3)." You don't look at induction of ER stress here, please change the text or explain in more depth with refs if suitable.

      Answer: In the initial submission and in the revised version we assayed the activation of the UPR by looking at the levels of spliced Xbp1 and Bip in the different conditions when DNAJB12 and DNAJB14 are overexpressed (Figure S3C and S3D). Our data show that although DNAJB12 overexpression induces ERCYS, there was no UPR activation.

      The mention of viruses is sparse in this paper. If it is a main theory, put it more centrally to the concept, and explain in more detail. As it is, its appearance in the final sentence is out of context.

      Answer: DNAJB12 and DNAJB14 were reported to facilitate the escape of nonenveloped viruses from the endoplasmic reticulum to the cytosol. The mechanism of nonenveloped virus penetration is highly similar to the reflux of proteins from the ER to the cytosol. Interestingly, this mechanism takes place when DNAJB12 and DNAJB14 form a complex with chaperones from both the ER and the cytosol, including HSC70, SGTA and BiP (Walczak et al. 2014; Goodwin et al. 2011; Goodwin et al. 2014).

      Moreover, UGGT1, which was independently found in our previous mass spectrometry analysis of the digitonin fraction obtained from HEK293T cells treated with the ER stressor thapsigargin and from isolated human GBM tumors (Sicari et al. 2020), is known to deploy to the cytosol upon viral infection (Huang et al. 2017; Sicari et al. 2020). We therefore hypothesized that the same machinery that is known to allow viruses to escape the ER and penetrate the cytosol may play an important role in the reflux of ER proteins to the cytosol.

      Because ER protein reflux and the penetration of viruses from the ER to the cytosol behave similarly, we speculate that viruses hijacked an evolutionarily conserved machinery, ER protein reflux, to penetrate the cytosol. This is key because it was also reported that during nonenveloped virus penetration, large, intact and glycosylated viral particles are able to penetrate the ER membrane on their way to the cytosol (Inoue and Tsai 2011).

      Action: We developed the discussion around this point and clarified it because we believe it is central to show that viruses hijacked this conserved mechanism.

      **Referees cross-commenting**

      I agree with the comments from Reviewer 1.

      Reviewer 2 also is correct in many ways, but I think that they have somewhat overlooked the relevance of the ER-stress element and treatments. The authors do need to reference past papers more to give a full story; as this includes the group's own papers, I don't think that it is an ethical problem but rather an oversight in the writing. Regarding reviewer 2's concerns about overexpression levels and cell death, the authors do use an inducible cell line and show the levels of DNAJB12 induced (could CRISPR also be considered?). This could be used to further address reviewer 2's concerns. It would also be useful to see data on cell death in the conditions used in the paper. Re concerns about ER integrity, this could be addressed by using IF (or EM) to show a secondary ER marker that remains ER-localised, and this would also be of interest regarding my comment on which categories of proteins can undergo reflux. If everything is relocalised, then reviewer 2's point would be validated.

      Reviewer #3 (Significance (Required)):

      Significance

      General assessment: This paper robustly shows that the yeast system of ER to cytosol reflux of ER-resident proteins is conserved in mammalian cells, and it describes clearly the link between ER stress, protein reflux and inhibition of p53 in mammalian cells. The authors have the tools to delve deeper into this mechanism and robustly explore this pathway; however, the mechanistic elements, where not instantly clear from the results, have been somewhat over-interpreted. The results have been oversimplified in their explanations, and some points and complexities of the study need to be addressed further to make the most of them. These are often some of the more interesting concepts of the paper, for example the differences in DNAJB12/14 and how the proteins orchestrate in the cytosol to play their cytosol-specific effects. I think that many points can be addressed in the text, by the authors being clear and concise with their reporting, while other experiments would turn this paper from an observational one into a very interesting mechanistic one.

      Advance: This paper is based on previous nice papers from the group. It is a nice progression from yeast, to basic mechanism, to physiological model. But as mentioned, without a strong mechanistic improvement, the paper would remain observational.

      Audience: This paper is interesting to cell biologists (homeostasis, quality control and trafficking) as well as cancer cell biologists (fitness of cancer cells and homeostasis) and it is a very interesting demonstration of a process that is a double edged sword, depending on the environment of the cells.

      My expertise: cell biology, trafficking, ER homeostasis

      Answer: We would like to thank the reviewer for his/her positive feedback on our manuscript. All the comments of the three reviewers are now addressed and the manuscript has been strengthened. We put more emphasis on the mechanistic aspect with more IPs and knockdowns. We also added data to show that it is physiologically relevant. We hope that the revised version addresses all the concerns raised by the reviewers.

      Goodwin, E. C., A. Lipovsky, T. Inoue, T. G. Magaldi, A. P. Edwards, K. E. Van Goor, A. W. Paton, J. C. Paton, W. J. Atwood, B. Tsai, and D. DiMaio. 2011. 'BiP and multiple DNAJ molecular chaperones in the endoplasmic reticulum are required for efficient simian virus 40 infection', MBio, 2: e00101-11.

      Goodwin, E. C., N. Motamedi, A. Lipovsky, R. Fernandez-Busnadiego, and D. DiMaio. 2014. 'Expression of DNAJB12 or DNAJB14 causes coordinate invasion of the nucleus by membranes associated with a novel nuclear pore structure', PLoS One, 9: e94322.

      Grove, D. E., C. Y. Fan, H. Y. Ren, and D. M. Cyr. 2011. 'The endoplasmic reticulum-associated Hsp40 DNAJB12 and Hsc70 cooperate to facilitate RMA1 E3-dependent degradation of nascent CFTRDeltaF508', Mol Biol Cell, 22: 301-14.

      Huang, P. N., J. R. Jheng, J. J. Arnold, J. R. Wang, C. E. Cameron, and S. R. Shih. 2017. 'UGGT1 enhances enterovirus 71 pathogenicity by promoting viral RNA synthesis and viral replication', PLoS Pathog, 13: e1006375.

      Igbaria, A., P. I. Merksamer, A. Trusina, F. Tilahun, J. R. Johnson, O. Brandman, N. J. Krogan, J. S. Weissman, and F. R. Papa. 2019. 'Chaperone-mediated reflux of secretory proteins to the cytosol during endoplasmic reticulum stress', Proc Natl Acad Sci U S A, 116: 11291-98.

      Inoue, T., and B. Tsai. 2011. 'A large and intact viral particle penetrates the endoplasmic reticulum membrane to reach the cytosol', PLoS Pathog, 7: e1002037.

      Malinverni, D., S. Zamuner, M. E. Rebeaud, A. Barducci, N. B. Nillegoda, and P. De Los Rios. 2023. 'Data-driven large-scale genomic analysis reveals an intricate phylogenetic and functional landscape in J-domain proteins', Proc Natl Acad Sci U S A, 120: e2218217120.

      Piette, B. L., N. Alerasool, Z. Y. Lin, J. Lacoste, M. H. Y. Lam, W. W. Qian, S. Tran, B. Larsen, E. Campos, J. Peng, A. C. Gingras, and M. Taipale. 2021. 'Comprehensive interactome profiling of the human Hsp70 network highlights functional differentiation of J domains', Mol Cell, 81: 2549-65 e8.

      Sicari, D., F. G. Centonze, R. Pineau, P. J. Le Reste, L. Negroni, S. Chat, M. A. Mohtar, D. Thomas, R. Gillet, T. Hupp, E. Chevet, and A. Igbaria. 2021. 'Reflux of Endoplasmic Reticulum proteins to the cytosol inactivates tumor suppressors', EMBO Rep: e51412.

      Sicari, Daria, Raphael Pineau, Pierre-Jean Le Reste, Luc Negroni, Sophie Chat, Aiman Mohtar, Daniel Thomas, Reynald Gillet, Ted Hupp, Eric Chevet, and Aeid Igbaria. 2020. 'Reflux of Endoplasmic Reticulum proteins to the cytosol yields inactivation of tumor suppressors', bioRxiv.

      Walczak, C. P., M. S. Ravindran, T. Inoue, and B. Tsai. 2014. 'A cytosolic chaperone complexes with dynamic membrane J-proteins and mobilizes a nonenveloped virus out of the endoplasmic reticulum', PLoS Pathog, 10: e1004007.

      Yamamoto, Y. H., T. Kimura, S. Momohara, M. Takeuchi, T. Tani, Y. Kimata, H. Kadokura, and K. Kohno. 2010. 'A novel ER J-protein DNAJB12 accelerates ER-associated degradation of membrane proteins including CFTR', Cell Struct Funct, 35: 107-16.

      Youker, R. T., P. Walsh, T. Beilharz, T. Lithgow, and J. L. Brodsky. 2004. 'Distinct roles for the Hsp40 and Hsp90 molecular chaperones during cystic fibrosis transmembrane conductance regulator degradation in yeast', Mol Biol Cell, 15: 4787-97.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Reflux of ER based proteins to the cytosol during ER stress inhibits wt-p53. This is a pro-survival mechanism during ER stress, but as ER stress is high in many cancers, it also promotes survival of cancer cells. Using A549 cells, Dabsan et al. demonstrate that this mechanism is conserved from yeast to mammalian cells, and identify DNAJB12 and DNAJB14 as putative mammalian orthologues of yeast HLJ1.

      This paper shows that DNAJB12 and 14 are likely orthologues of HLJ1 based on their sequences, and their behaviour. The paper develops the pathway of ER-stress > protein reflux > cytosolic interactions > inhibition of p53. The authors demonstrate this nicely using knock downs of DNAJB12 and/or 14 that partially blocks protein reflux and p53 inhibition. Overexpression of WT DNAJB12, but not the J-domain inactive mutant, blocks etoposide-induced p53 activation (this is not replicated with DNAJB14) and ER-resident protein reflux. The authors then show that DNAJB12/14 interact with refluxed ER-resident proteins and cytosolic SGTA, which importantly, they show interacts with the ER-resident proteins AGR2, PRDX4 and DNAJB11. Finally, the authors show that inducing ER stress in cancer cell lines can increase proliferation (lost by etoposide treatment), and that this is partially dependent on DNAJB12/14.

      This is a very interesting paper that describes a nice mechanism linking ER-stress to inhibition of p53 and thus survival in the face of ER-stress, which is a double edged sword regarding normal v cancerous cells. The data is normally good, but the conclusions drawn oversimplify the data that can be quite complex. The paper opens a lot of questions that the authors may want to develop in more detail (non-experimentally) to work on these areas in the future, or alternatively to develop experimentally and develop the observations further. There are only a few experimental comments that I make that I think should be done to publish this paper, to increase robustness of the work already here, the rest are optional for developing the paper further.

      Major comments:

      1. Number of experimental repeats must be mentioned in the figure legends. Figures and annotations need to be aligned properly

      Results section 2:

      2. No intro to the proteins you've looked at for relocalisation. Would be useful to have some info on why you chose AGR2. Apart from them being ER-localised, do they all share another common characteristic? Does ability to inhibit p53 vary in potency?

      3. What are the roles of DNAJB12/14 if overexpression can induce reflux? Does it allow increased binding of an already cytosolic protein, causing an overall increase in an interaction that then causes inhibition of p53? What are your suggested mechanisms?

      4. Fig3: A+B show overexpression of individual DNAJs but not combined. As you go on to discuss the effect of the combination on AGR2 reflux, it would be useful to include this experimentally here.

      5. Fig 3C: Subfractionation of cells shows AGR2 in the cytosol of A549 cells. The quality of the data is good but the bands are very high on the blot. For publication is it possible to show this band more centralized so that we are sure that we are not missing bands cut off in the empty and H139Q lanes? Also, you have some nice immunofluorescence in the 2021 EMBO reports paper, is it possible to show this by IF too? It is not essential for the story, but it would enrich the figure and support the biochemistry nicely. Also it is notable that the membrane fraction of the refluxed proteins doesn't appear to have a decrease in parallel (especially for AGR2). Is this because the % of the refluxed protein is very small? Is there a transcriptional increase of any of them (the treatments are 12+24 h so it would be enough time)? This could be a nice opportunity to discuss the amount of protein that is refluxed, whether this response is a huge emptying of the ER or more like a gentle release, and also the potency of the gain of function and effect on p53 vs the amount of protein refluxed. This latter part isn't essential but it would be a nice element to expand upon.

      6. You still mention DNAJB12 and 14 as orthologues, even though DNAJB14 has no effect on p53 activity when overexpressed. Do you think that this piece of data diminishes this statement?

      7. Fig 3D/F: Overexpression of DNAJB14 induces reflux of DNAJB11 at 24h, what does this suggest? Does this indicate having the same role as DNAJB12 but less potently? What's your hypothesis?

      8. "This suggests that the two proteins may have different functions when overexpressed, despite their overlapping and redundant functions" What does it suggest about their dependence on each other? If overexpression of WT DNAJB12 inhibits Tg induced reflux, is it also blocking the ability of DNAJB14 to permit flux?

      9. Fig 4: PDI shown in blots but not commented on in text. Then included in the schematics. Please comment in the text.

      10. Fig 4F: Although the quantifications of the blots look fine, the blot shown does not convincingly demonstrate this data for AGR2. The other proteins look fine, but again it could be useful to see the individual means for each experiment, or the full gels for all replicates in a supplementary figure.

      Results section 3:

      11. Fig 5A: As there is obviously a difference between DNAJB12/14 it would be useful to do the pulldown with DNAJB14 too. Re. HSC70 binding to DNAJB12 and 14, the abstract states that DNAJB12/14 bind HSC70 and SGTA through their cytosolic J domains. Fig 5 shows pulldowns of DNAJB12 with an increased binding of SGTA in FLAG-DNAJB12 induced conditions, but the HSC70 band does not seem to be enriched in any of the conditions, including after DNAJB12 induction. This doesn't support the statement that DNAJB12 binds HSC70. In fact, in the absence of a good negative control, this would suggest that the HSC70 band seen is not specific. There is also no data to show that DNAJB14 binds HSC70. I recommend including a negative condition (i.e. beads only) and the data for DNAJB14 pulldown.

      12. The binding of DNAJB12 to SGTA under stress conditions in Fig5B looks much more convincing than SGTA to DNAJB12 in Fig 5A. Bands in all blots need to be quantified from 3 independent experiments, and repeated if not already n=3. If this is solely a technical difference, please explain in the text. The conclusions drawn from this interaction data are important and should be elaborated upon to support the claims made in the paper. The authors may also choose to expand the pulldowns to demonstrate their claims made on oligomerisation of DNAJB12 and 14 here. It is also clear that the interaction data of the SGTA with ER-resident proteins AGR2, PRDX4 and DNAJB11 is strong. The authors may want to draw on this in their hypotheses of the mechanism. I would imagine a complex such as DNAJB14/DNAJB12 - SGTA - AGR2/PRDX4/DNAJB11 would be logical. Have any experiments been performed to prove if complexes like this would form?

      13. Fig 5B: It is clear that DNAJB12 interacts with SGTA. The authors state that DNAJB14 also interacts with SGTA under normal and stress conditions, but the band in 25/50 Tg is very faint. Why would there be stronger binding at the 2 extremes than during low stress induction? In the input, there is a much higher expression of DNAJB14 in 50 Tg. What does this say about the interaction? Is there an effect of ER stress on DNAJB14 expression? A negative control should be included to show any background binding, such as a "beads only" control.

      14. Fig 5C data is sound, although a negative control should be included.

      Results section 4:

      15. Fig 6A-B: Given that there is the complexity of overexpression v KD of DNAJB12 v 14 causing similar effects on p53 activity (Fig 2 v 3), it would be interesting to see whether the effect of overexpression mirrors the results in Fig 6A. Is it known what SGTA overexpression does (optional)?

      16. Fig 6D: resolution very low.

      17. Fig 6C-D: There is an interesting difference though between the proposed cytosolic actions of the refluxed proteins. You show that AGR2, PRDX4 and DNAJB11 all bind to SGTA in stress conditions, but in the schematics you show: DNAJB11 binding to HSC70 through SGTA (not shown in the paper), then also PDIA1, PDIA3 binding to SGTA and AGR2 binding to SGTA. What role does SGTA have in these varied reactions? Sometimes it is depicted as an intermediate, sometimes a lone binder; what is its role as a binder? It should be clarified graphically which interactions are demonstrated in the paper (or before) and which are hypothesized (e.g. dotted outlines or no solid fill for hypotheses). The schematics also suggest that DNAJB14 binding to HSC70 and SGTA is inducible in stress conditions, as is PDIA3, which is not shown in the paper. Discussion: "In cancer cells, DNAJB12 and DNAJB14 oligomerize and recruit cytosolic chaperones and cochaperones (HSC70 and SGTA) to reflux AGR2 and other ER-resident proteins and to inhibit wt-p53 and probably different proapoptotic signaling pathways (Figure 5, and Figure 6C-6D)." You haven't shown oligomerisation between DNAJB12/14. Modify the text to make it clear that it is a hypothesis.

      Minor comments:

      18. It would be useful to have page or line numbers to help with document navigation, please include them. There are typos and inconsistencies in how some proteins are named throughout the manuscript.

      19. Title: Include reference to reflux. Suggest: "chaperone complexes (?proteins) reflux from the ER to cytosol..." I presume it would be more likely that the proteins go separately rather than in complex. Do you have any ideas on the size range of proteins that can undergo this process?

      20. ERCY in abstract, ERCYS in intro. There are typos throughout, could be a formatting problem, please check.

      21. What about the selection of refluxed proteins? Is this only a certain category of proteins? Could it be anything? Have you looked at other cargo / ER resident proteins?

      22. "Their role in ERCYS and cells' fate determination depends..." Suggest change to "Their role in ERCYS and determination of cell fate..."

      23. I think that the final sentence of the intro could be made stronger and more concise. There's a repeat of ER and cytosol. Instead could you comment on the reflux permitting new interactions between proteins otherwise spatially separated, then the effect on wt-p53 etc.

      Figure legends:

      1. In some cases the authors state the number of replicates, but this should be stated for all experiments. If experiments don't already include 3 independent repeats, this should be done. Check text for typos, correct letter capitalisation, spaces and random bold text (some of this could be from incompatibility when saving as PDF)
      2. Fig2E: scrambled not scramble siRNA
      3. Fig 3: "to expel" is a term not used in the rest of the paper for reflux. Useful to remain consistent with terminology where possible

      Results section 1:

      1. "Protein alignment of the yeast HLJ1p showed high amino acids similarity to the mammalian..."
      2. Fig 1C: state in legend which organism this is from (presumably human)

      Results section 2:

      3. "Test the two strongest hits DNAJB12/14" Add reference to previous paper showing this
      4. "In the WT and J-protein-silenced A549 cells, there were no differences in the cytosolic enrichment of the three ER resident proteins AGR2, DNAJB11, and HYOU1 in normal and unstressed conditions (Figure 2A-C and Figure S2C)." I think that this is an oversimplification, and in your following discussion, you show this it's more subtle than this.
      5. The text here isn't so clear: normal and unstressed conditions? Do you mean stressed? Please be careful in your phrases: "DNAJB12-silenced cells are slightly affected in AGR2 and DNAJB11 cytosolic accumulation but not HYOU1." This is the wrong way around. DNAJB12 silencing affects AGR2, not that AGR2 affects the cells (which is how you have written it). This also occurs again in the next para:
      6. "During stress, DNAJB12/DNAJB14 double knockdown was highly affected in the cytosolic..." I think you mean it highly affected the cytosolic accumulation, not that it was affected by the cytosolic accumulation. Please change in the text
      7. "DNAJB12 and DNAJB14 are strong hits of the yeast HLJ1" Not clear, I presume you mean they are likely orthologues? Top candidates for being closest orthologues?
      8. Fig 2D: typos in WB labelling? I think Tm should be - - +, not - + + as it is now (if it's not a typo, you need more controls, e.g. Eto alone).
      9. Fig 2D-E-F typos for DKD? D12/D12 or D12/14?
      10. "We assayed the phosphorylation state of wt- p53 and p21 protein expression levels (a downstream target of p53 signaling) during etoposide treatment." What are the results of this? Explain what Fig 2D-E shows, then build on this with the +Tm results. Results should be explained didactically to be clear.
      11. "(Figure 2D- E). Cells lacking DNAJB12 and or DNAJB14 have partial protection in wt-p53 phosphorylation and p21 protein levels."
      12. You comment on p53 phosphorylation, but you haven't quantified this. This should be done, normalized to p53 levels, if you want to draw these conclusions, especially as total p53 varies between conditions. Does Eto increase p53 txn? Does Tm alone increase p53 activity/phospho-p53? These are shown in the Sicari EMBO reports paper in 2021, you should briefly reference those.
      13. Fig3A: overexpression of DNAJB12 decreases Eto induced p53 but not at steady state. Is this because at steady state the activity is already basal? Or is there another reason?
      14. Switch Figs S3D and S3C as they are not referred to in order. Also Fig S3C: vary colour (or add pattern) on bars more between conditions
      15. Need to define HLJ1 at first mention

      Results section 3:

      16. HSC70 cochaperone (SGTA) defined twice
      17. "These data are important because SGTA and the ER-resident proteins (PRDX4, AGR2, and DNAJB11) are known to be expressed in different compartments, and the interaction occurs only when those ER-resident proteins localize to the cytosol." Is there a reference for this?

      Results section 4:

      18. "by almost two folds"
      19. Fig 6A: It seems strange that the difference between purple and blue bars in scrambled, and D14-KD are very significant but D12-KD is only significant. Why is this? The error bars don't look that different. It would be interesting to see the individual means for each different replicate.
      20. Figures: Fig 6A: Scale bars not well placed. Annotation on final set should be D12/D14 DKD?

      Discussion:

      21. The authors mention that they want to use the DNAJB12/14-HSC70/SGTA axis to impair cancer cell fitness: What effect would this have though in a non-cancer model? Would this be a viable approach? Although it is obviously early days, which approach would the authors see as potentially favourable?
      22. Second para: Should be "Here we present evidences"
      23. "DNAJB12 overexpression was also sufficient to promote ERCYS by refluxing AGR2 and inhibit wt-p53 signaling in cells treated with etoposide" Suggest:
      24. DNAJB12 overexpression was also sufficient to promote ERCYS by increasing reflux of AGR2 and inhibition of wt-p53 signaling in cells treated with etoposide
      25. "Moreover, DNAJB12 was sufficient to promote this phenomenon and cause ER protein reflux by mass action without causing ER stress (Figure 3, Figure 4, and Figure S3)." You don't look at induction of ER stress here, please change the text or explain in more depth with refs if suitable
      26. The mention of viruses is sparse in this paper. If it is a main theory, put it more centrally to the concept, and explain in more detail. As it is, its appearance in the final sentence is out of context.

      Referees cross-commenting

      I agree with the comments from Reviewer 1. Reviewer 2 also is correct in many ways, but I think that they have somewhat overlooked the relevance of the ER-stress element and treatments. The authors do need to reference past papers more to give a full story; as this includes the group's own papers, I don't think that it is an ethical problem but rather an oversight in the writing. Regarding reviewer 2's concerns about overexpression levels and cell death, the authors do use an inducible cell line and show the levels of DNAJB12 induced (could CRISPR also be considered?). This could be used to further address reviewer 2's concerns. It would also be useful to see data on cell death in the conditions used in the paper. Re concerns about ER integrity, this could be addressed by using IF (or EM) to show a secondary ER marker that remains ER-localised, and this would also be of interest regarding my comment on which categories of proteins can undergo reflux. If everything is relocalised, then reviewer 2's point would be validated.

      Significance

      General assessment: This paper robustly shows that the yeast system of ER to cytosol reflux of ER-resident proteins is conserved in mammalian cells, and it describes clearly the link between ER stress, protein reflux and inhibition of p53 in mammalian cells. The authors have the tools to delve deeper into this mechanism and robustly explore this pathway; however, the mechanistic elements, where not instantly clear from the results, have been somewhat overinterpreted. The results have been oversimplified in their explanations and some points and complexities of the study need to be addressed further to make the most of them. These are often some of the more interesting concepts of the paper, for example the differences in DNAJB12/14 and how the proteins orchestrate in the cytosol to play their cytosol-specific effects. I think that many points can be addressed in the text, by the authors being clear and concise with their reporting, while other experiments would turn this paper from an observational one into a very interesting mechanistic one.

      Advance: This paper is based on previous nice papers from the group. It is a nice progression from yeast, to basic mechanism, to physiological model. But as mentioned, without a strong mechanistic improvement, the paper would remain observational.

      Audience: This paper is interesting to cell biologists (homeostasis, quality control and trafficking) as well as cancer cell biologists (fitness of cancer cells and homeostasis) and it is a very interesting demonstration of a process that is a double-edged sword, depending on the environment of the cells.

      My expertise: cell biology, trafficking, ER homeostasis

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02491

      Corresponding author(s): Gilbert, Vassart

      1. General Statements [optional]

      We thank referees 1 and 2 for their in-depth analysis of our manuscript. They see interest in our study, with questions to be answered. Referee 3 is essentially negative, considering that there is nothing new ("novel finding is missing"). We respectfully disagree with him/her, comforted by the opinion of referee 2 that "the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field and ... the manuscript should attract a significant amount of attention in the intestinal field" and we provide evidence in our answers that he/she did not read the manuscript with the same attention as referees 1 and 2 (see in particular answer to his/her question 5).

      Here is a summary of the main reason why we consider that our study represents valuable new information in the field of intestinal regeneration.

      It is based on the serendipitous observation that dissociation of adult intestinal tissue by collagenase generates stably replatable spheroids upon culture in matrigel. Surprisingly and contrary to canonical EDTA-generated intestinal organoids and fetal spheroids, these spheroids are not traced in Rosa26Tomato mice harboring a VilCre transgene, despite robustly expressing endogenous Villin. Our interpretation is that adult intestinal spheroids originate from a cell lineage, distinct from the main developmental intestinal lineage, in which the VilCre transgene is unexpectedly not expressed, probably due to the absence of cis regulatory sequences required for expression in this lineage.

      Adult spheroid transcriptome shares a gene signature with the YAP/TAZ signature commonly expressed in models of intestinal regeneration. This led us to look for VilCre negative crypts in the regenerating intestine of Lgr5/DTR mice in which Lgr5-positive stem cells have been ablated by diphtheria toxin. Numerous VilCre negative clones were observed, identifying a novel lineage of stem cells implicated in intestinal regeneration.

      FACS purification and scRNAseq analysis of the rare VilCre negative cells present at homeostasis identified a population of cells with characteristics of quiescent stem cells.

      In sum, we believe that our study demonstrates the existence of a hitherto undescribed stem cell lineage involved in intestinal regeneration. It points to the existence of a hierarchical model of intestinal regeneration in addition to the well-accepted plasticity model.

      2. Description of the planned revisions

      See section 3 below.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Here is a point-by-point reply to the queries of the three referees, with indication of the revisions introduced in the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      • *In this manuscript, Marefati et al report an Lgr5-independent lineage in the regenerating intestine using in vitro organoids and an in vivo injury-coupled lineage tracing model. In organoids, collagenase/dispase dissociation resulted in "immortal spheroids" that maintain a cystic and undifferentiated phenotype in the absence of standard growth factors (Rspondin/Noggin/EGF). Bulk RNAseq of spheroids demonstrates downregulation of classical CBC signatures and upregulation of fetal spheroid, mesenchymal, inflammation and regenerative signatures. In mice, Villin-Cre lineage tracing revealed some Villin-negative progenies that lack reporter tracing throughout crypt-villus ribbons after injury.

      *The authors proposed that there is an Lgr5-independent population supporting the regenerative response upon CBC depletion. A major caveat of this study is that the identification of this population is based on the absence of VilCre expression.*

      We respectfully disagree. It is precisely this characteristic that makes our study interesting. Whereas mosaicism of transgene expression is widespread and usually of little significance, our study shows that the rare VilCre-negative cells in the intestinal epithelium do not show this phenotype at random: they specifically give rise to what we call adult spheroids and regenerating crypts, which cannot be due to chance. The absence of VilCre expression allows tracing these cells from the zygote stage of the various VilCre/Rosa26 reporter mice. We have modified our text to emphasize this point.

      *It is surprising that there is no characterisation of Lgr5 expression throughout the manuscript whilst claiming an Lgr5-independent lineage.*

      We understand the perplexity of the referee not to see direct Lgr5 expression data in our manuscript, given our title. However, our point is that it is the cells at the origin of adult spheroids and the regenerating crypts we have identified that are Lgr5-negative, not the spheroids or the regenerated crypts themselves. Those are downstream offspring that may, and indeed have, gained some Lgr5 expression (e.g. figure 3F). We believe that our data showing that VilCre-negative spheroids are not traced in Lgr5-CreERT2/Rosa reporter mice convincingly demonstrate absence of Lgr5 expression in the cells at the origin of adult spheroids (figure 4G). We think that this experiment is better evidence than attempts to show absence of two markers (Tom and Lgr5) in the rare "white" cells present in the epithelium. Regarding the Lgr5 status of cells at the origin of the regenerating "white" crypts that we have identified, the early appearance of these crypts following ablation of CBC (i.e. Lgr5+ve) cells is a strong argument that they originate from Lgr5-negative cells. Regarding the scRNAseq experiment, Lgr5 transcripts are notoriously low and difficult to measure reliably in CBCs (Haber et al 2017). However, blowing up the pertinent regions of the merged UMAP allows showing some Lgr5 transcripts in clusters 5,6 and none in cluster 1 of figure 8GH. Given the very low level of detection, we had chosen not to include these data in the manuscript, but we hope they may help answer the point of the referee (see portion of UMAP below, with Olfm4 as a control, together with the corresponding violin plot). Several markers that gave significant signals in the CBC cluster (Smoc2, Axin2, Slc12a2) were virtually undetectable in the Olfm4-low /Tom-negative cluster of our scRNAseq data (figure 8I) supporting our conclusion.

      Although the research question is potentially interesting, the concept of epithelial reprogramming upon injury is well documented in the field. The data generated in this manuscript also seem to be preliminary and lack detailed characterisation. Below are specific comments.

      We do not question the existence of epithelial reprogramming upon injury. We believe our data show, in addition to this well demonstrated phenomenon, the existence of rare cells traced by absence of VilCre expression that are at the origin of a developmental cell lineage distinct from Lgr5+ stem cells and also implicated in regeneration.

      • Expression of Lgr5 should be properly characterised throughout the manuscript in both organoid models and injury-induced regeneration in vivo.
      • *

      See above for a detailed answer to this point.

      • An important question is the origin of these "Lgr5-independent" adult spheroids. They look and appear like fetal organoids, which could be induced by injury (e.g. upon collagenase/dispase dissociation). Have the authors tried to culture fetal spheroids in BCM over an extensive period of time? Do they behave the same? This would be a great way to directly compare the collagenase/dispase-derived organoids with those of fetal origin.

      Fetal spheroids require ENR for survival and die in BCM. We have chosen to illustrate this point in Fig 2A by showing that, contrary to adult spheroids, they die even when only Rspondin is missing.

      • Fig 1C: Why is the replating spheroid culture time different between mesenchymal cells and conditioned medium?

      We took the earliest time showing convincingly the return to the organoid phenotype. This timing difference does not modify the conclusion that EDTA organoids becoming spheroid-like when exposed to factors originating from mesenchymal cells revert to the organoid phenotype when returned to ENR medium without mesenchymal influence.

      • *It is unclear how the bulk RNA-seq data in Fig. 3 were compared. How long were the adult organoids and spheroids cultured for (how many passages)? Were they cultured in the same conditions or were they in ENR vs BCM?*

      Both EDTA organoids and spheroids displaying a stable phenotype were used in this experiment. Organoids were collected at passage 4, day 5; spheroids were collected at passage 9, day 3.

      As stated in the legend to the figure: "...to allow pertinent comparison spheroids and organoids were cultured in the same ENR-containing medium...".

      This is important information to consider when interpreting the results. For instance, is Ptgs1 & Ptgs2 expression in adult spheroids the same in ENR vs BCM? Are the gene signatures (regenerative, fetal and YAP) changed in adult spheroids cultured in ENR vs BCM?

      We did compare bulk RNAseq of EDTA organoids to ENR-cultured spheroids, short term (passage 6, day 6) BCM-cultured spheroids and long term BCM-cultured (passage 26, day 6) spheroids. To avoid overloading the manuscript these data were not shown in the original manuscript. In summary the BCM-cultured spheroids display a similar phenotype as those cultured in ENR, but with further de-differentiation. See in revision plan folder the results for PTGS, some differentiation markers and fetal regenerative markers including YAP induced genes.

      We have included a brief description of these data in the new version of the manuscript and added an additional supplementary file (Suppl table 2) presenting the whole data set.

      • It is stated: "In agreement with their aptitude to grow indefinitely, adult spheroids express a set of upregulated genes overlapping significantly with an "adult tissue stem cell module" [159/721 genes; q value 2.11 e-94) (Fig.S2F)].". What is the definition of "indefinitely"? Are they referring to Fig 1B, where spheroids were passaged to P10? The authors should avoid the term "indefinitely" and use a more specific time scale, e.g. passages, months etc.

      We agree that the term indefinitely should be avoided, as it is vague. We have introduced the maximum number of passages during which we have maintained the stable spheroid phenotype (26 passages). Also worth noting, the spheroids could be frozen and cultured repeatedly over many months.

      SuppFig 3D: Row Z-Score is missing the "e" in Score.

      Corrected

      • Fig 4E: Figure legend says QNRQ instead of CNRQ.

      Corrected

      • Fig 4G: The brightfield image of adult spheroids 5 days after 3x TAM injections doesn't look like a spheroid. It seems to be differentiating.

      True, the choice was not the best as the spheroids started to darken. When further replated, however, the offspring of these spheroids showing a clear phenotype remain negative 30 days after tamoxifen administration as shown on the figure. We are sorry, but for reasons explained in section 4 below, we cannot redo the experiment to get a better picture.

      • Fig 4: Most mouse model data are missing the number of mice & their respective age used for organoid isolation.

      We have introduced these data in the legend.

      • *Fig 4A-D, H-G: How was fluorescent signal of organoids quantified? *

      The settings of fluo imaging or time of LacZ staining were the same for organoids and spheroid pictures. This has been added to the material and methods of the figure and an example is shown below for Rosa26Tomato.

      *How many images?*

      2 per animal per condition.

      *Were there equal numbers of organoids? *

      No, see number of total elements counted added to the figure

      This all needs to be included in methods/figure legends.

      We have introduced additional pertinent information in the material and methods section.

      • Figure 4B-D, G-H: Which culturing conditions were used for adult spheroids? Original method or sandwich method?

      These data were obtained with the original protocol.

      • Fig 6D-E: Please add the timepoint after DT administration these samples are from. It is not listed in text or figure legend.

      These samples were those obtained from mice sacrificed at the end of the 5 day period as indicated in panel A. This has been emphasized in the legend of the figure.

      • SuppFig 6D: again timepoint is missing.

      In this experiment all samples were untreated as indicated. This has been emphasized in the legend of the figure.

      • SuppFig 6: How were the crypts of these mice (DT WT & DT HE) isolated? Was this via EDTA?

      This was RNA extracted from total uncultured EDTA-released material (crypts). This has been emphasized in the legend of the figure.

      Also, what is the timepoint for isolation for these samples? Even if untreated, the timepoint adds context to the data. Please add more context to describing these different experiments, either in the figure legends or methods section.

      All these experiments were from 2 month old animals. We have indicated this in the legend of the figure.

      • SuppFig 6E: The quality of the heatmap resolution is too poor to read gene names.

      We have improved the resolution of the figure and hope the names of the genes are readable now.

      • Fig 5-7: Are the regenerating crypt-villus units fully differentiated or are they maintained in the developmental state? Immunostaining of markers for stem cells (Lgr5), differentiated lineages (Alpi, Muc2, Lyz, ChgA etc.) and fetal state (Sca1, Trop2 etc) should be analysed in those "white" unrecombined crypt-villus units.

      The differentiation phenotype is shown by the clear presence of morphologically-identified Paneth and Goblet cells. We agree that specific immunostainings could have been performed to further explore this point. Regarding the fetal state, Clu expression was shown during the regeneration period (see figure 7D,E).

      Unfortunately, for reasons explained in section 4 below, we are not in a position to perform these additional experiments.

      • The following text needs clarification: "The kinetics of appearance of newly formed un-recombined ("white") crypts was studied after a single pulse of DT (Fig.7A). This demonstrated an increase at 48 hours, with further increase at day 10 and stable maintenance at day 30. The presence of newly formed white crypts one month after toxin administration indicates that the VilCre-negative lineage is developmentally stable and does not turn on the transgene during differentiation of the various epithelial lineages occurring after regeneration (Fig.7B).

      *Comment: The "newly formed" is an overstatement, the data doesn't conclude that those are "new" crypts. *

      Unless we have misunderstood the point, we think we can write that a fraction of "white" crypts must be "newly formed", since they are in excess of those present in untreated animals at the same time point.

      *The end of the sentence states that these "white" crypts form developmentally stable lineages, thus these white crypts at day 30 could originate from the initial injury. *

      As stated above, we consider that crypts found in excess of those present in untreated animals result from the initial injury.

      *There was no characterisation of the various epithelial lineages. Are they fully differentiated? *

      See above the point related to Paneth cells and Goblet cells.

      Is Lgr5 expressed one month after toxin administration? Can the VilCre neg lineage give rise to CBCs?

      We have tried hard to show presence or absence of Lgr5 in white crypts at the various times following DT administration. We tried double RFP / Lgr5-RNA scope labeling and double GFP/RFP immunolabeling. Unfortunately, we could not get these methods to produce convincing specific labeling of CBCs in homeostatic crypts, which explains why we could not reach a conclusion regarding the white crypts.

      However, there is an indirect indication that "chronic" white crypts (i.e. those caused by DTR expression in CBC, plus those observed 30 days after DT administration) do not express Lgr5. Indeed, acute regeneration indicated by Clu expression at day 5 in Fig.7C is lower in white crypts than in red ones, strongly suggesting that white crypts preexisting DT administration (the "chronic" ones) do not express Lgr5DTR.

      The relationship between white crypt generation and appearance of Clu-positive revival cells (Ayyaz et al., 2019) was then explored. In agreement with others and similar to what happens in the irradiation model, (Ayyaz et al., 2019; Yuan et al., 2023) Clu-positive cells were rare in crypts of untreated mice and their number transiently increased forty-eight hours after a single pulse of DT, and more so after three pulses of DT (Fig.7C,D).

      Comment: Comparing 1 pulse at day 2 vs 3 pulses at day 5 makes the data hard to interpret. How is the Clu ISH level for 1 pulse at day 5? Are they equivalent?

      After a single pulse of DT, Clu is only transiently increased. As shown by Ayyaz et al, it is back to the starting point at day 5 (supplementary figure 4 of Ayyaz et al).

      Clu-positive cells were less frequently observed in white crypts (see "Total" versus "White" in Fig.7C). This fits with the hypothesis that Clu expression marks acutely regenerating crypts and that a proportion of the white crypts are chronically regenerating due to DTR expression in CBCs."

      *Comment: I believe the authors suggested that the discrepancy of less Clu expression in white crypts is due to the ectopic expression of DTR in CBCs causing low grade injury without DT administration. This means that some white crypts could have been formed before the administration of DT, and thus are on a different regenerative timeline compared to the white crypts formed from DT administration. *

      Yes, this is our interpretation. We have clarified it in the text.

      Is there any proof of the chronic regeneration? Immunostaining of chronic regenerative markers such as Sca1, Anxa1 or Yap1 nuclear localization would support the claim. It'd be important to show only the white crypts, but not the RFP+ ones, show regenerative markers.

      We think that the higher steady-state number of white crypts in untreated Lgr5-DTR animals, compared to wild type siblings, indicates chronic low-grade regeneration, which is supported by the RNAseq data (Suppl fig6). It must be noted, however, that this phenotype is mild compared to the well described fetal-like regeneration phenotype described in most injury models. Since these white crypts were made at undetermined earlier stages, the great majority of them are not expected to show markers of acute regeneration like Clu, Sca1....

      Fig 7D-E: What are the timepoints of harvest for HE-WT-HE 1 pulse DT mice and HE-HE-HE PBS injected mice?

      We have added this information in the figure.

      • *Fig 8-9: Regarding the CBC-like Olfm4 low population, what is the status of Lgr5? This should be shown in the figure since the argument is that this is an Lgr5-independent lineage.*

      See response to the second point.

      And what about the regenerative, Yap, mesenchymal and inflammatory signatures? Are they enriched in the white crypts similar to the in vitro spheroids?

      In a portion of white crypts, those we believe are newly formed after CBC ablation (see above), there is a transient increase in Clu, which may be considered a marker of Yap activation. In the CBC-like Olfm4 low cells, as seen by scRNAseq, there is nothing like an actively regenerating phenotype. This is expected, since these cells are coming from homeostatic untreated VilCre/Rosa26Tom animals and are supposed to be quiescent "awaiting to be activated".

      Reviewer #1 (Significance (Required)):

      Strengths: The study employed a range of in vitro and in vivo models to test the hypothesis.

      • *

      *Limitations: Unfortunately, the models chosen did not provide sufficient evidence to draw the conclusions. Injury induced reprogramming, both in vivo and in vitro, has been well documented in the field. The new message here is to show that such reprogrammed state is continuous rather than transient; instead of regenerating Lgr5+ stem cells, it can continue to differentiate to all cell lineages in Lgr5-independent manner.

      *

      We respectfully disagree with this analysis of our results. What we show is not "that such reprogrammed state is continuous rather than transient; instead of regenerating Lgr5+ stem cells, it can continue to differentiate to all cell lineages in Lgr5-independent manner", but that a quiescent stem cell line, not previously identified, is activated to regenerate a portion of crypts following CBC ablation. These cells are not reprogrammed; they correspond to a developmental lineage waiting to be activated and keep their VilCre-negative state for at least 30 days. We believe that their "by default tracing" (VilCre negative from the zygote stage) is as strong evidence for the existence of such a lineage as positive lineage tracing would be. The increase in crypts originating from this lineage after CBC ablation indicates that it is implicated in regeneration. We do not question the well-demonstrated plasticity-associated reprogramming taking place during regeneration; we simply suggest that this would coexist with the involvement of the quiescent VilCre-negative lineage we have identified.

      *However, through the manuscript, there was no immunostaining of Lgr5 and other differentiation markers. The conclusion is an overstatement without solid proof.* We have provided the best answer we could to this point in our answer to the referee's second question above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Marefati et al. developed a novel approach to generate spheroids from adult intestinal epithelium using a collagenase/dispase based protocol. Adult spheroids were found to be distinct from classic budding-type organoids normally generated from EDTA based release of the crypt epithelium. Transcriptional profiling indicated that adult spheroids were undifferentiated and similar to regenerating crypts or fetal spheroids. To identify the cell of origin that generates adult spheroids, the authors labelled epithelial cells with VilCreERT-LSL-Tom, VilCre-LSL-GFP and Lgr5CreERT-LSLTom mice. From these experiments the authors conclude that spheroids are only generated from Vil-Cre negative and Lgr5 negative cells. Next the authors deleted the anti-apoptotic gene Mcl1 using Vil-CreERT mice. This led to a strong apoptotic response throughout the crypt epithelium and tissues processed from knockout mice readily generated spheroids, and in vivo, replenishment of the gut epithelium was mediated by unrecombined cells. In a second model, CBCs were ablated using Lgr5DTR mice and VilCre negative cells were found again to contribute to regeneration of the crypt epithelium. Finally based on the absence of Vil-Cre reporter activity, the authors were able to sort out and perform scRNAseq to profile VilCre negative cells. These cells were found to be quiescent, express the stem cell marker Olfm4 and were also abundant in ribosomal gene expression.


      The fact that the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field. Previous reports have shown that treatment with various small molecule inhibitors can revert budding organoids into a spheroid morphology, but this manuscript demonstrates that spheroids can also be generated from otherwise untreated cells. This new methodology will provide new tools to dissect the molecular determinants of fetal/regenerative cells in the gut. Based on this, the manuscript should attract a significant amount of attention in the intestinal field.


      As pointed out by the authors themselves the study has important limitations that diminish enthusiasm. The primary issue relates to the inability of the team to identify markers of VilCre neg cells other than the fact that these cells are Olfm4+ and quiescent. Nonetheless, for the reasons stated above the manuscript should reach the target audience within the research community, if the authors can address the specific points below related to issues with methodology as well as defining more precisely the characteristics and growth requirements of adult spheroid cultures.

      Thank you for this positive analysis of our study.

      Major comments

      The main conclusion of the study is that Vil-Cre neg cells are rare quiescent Olfm4+ crypt cells. If this is the case, then standard EDTA treatment should release these cells as well. Consequently, spheroids should also emerge from isolated crypts grown in the absence of ENR. If this is not the case how do the authors explain this?

      We have tried hard to generate spheroids by culturing EDTA organoids in medium lacking ENR and by treating EDTA organoids with collagenase/dispase, without success. Therefore, we are left with the conclusion that spheroid-generating cells must be more tightly attached to the matrix than those released by EDTA, and that it is their release from this attachment by collagenase that triggers a regeneration-like phenotype. This hypothesis is supported by several models of regeneration in other tissues as indicated in our references (Gilbert et al., 2010; Machado et al., 2021; Montarras et al., 2005).

      From the text the authors appear to suggest that growth of adult spheroids is dependent initially on "material" released by collagenase/dispase treatment. An obvious candidate would be mesenchymal cells, which are known to secrete factors such as Wnts and PGE2 that drive spheroid morphology. To test this, the authors should treat spheroid cultures with Porcupine and/or PGE2 inhibitors.

      We followed similar reasoning, considering that spheroids strongly express Ptgs1 and Ptgs2 (Figure 3A). We thought their phenotype might be maintained by autocrine prostaglandin action. We tested aspirin, a Ptgs inhibitor, which was without effect on the spheroid phenotype. Besides, we explored a wide variety of conditions to test whether they would affect the spheroid phenotype [aspirin (see above), cAMP agonists/antagonists, YapTaz inhibitors (verteporfin and CA3), valproic acid, Notch inhibitors (DAPT, DBZ, LY511455), all-trans retinoic acid, NFkB inhibitors (TCPA, BMS), TGFbeta inhibitor (SB431542)]. As these results were negative, we did not include them in the manuscript.

      *If these inhibitors block growth then this would suggest that either stromal cells or autocrine signalling involving these pathways is important. Overall, more in-depth analysis of the growth requirements of adult spheroids is required.*

      Figure 1d indicates that adult spheroids can be propagated for at least 10 passages. The abstract mentions they are "immortal". The text itself does not address this issue. More precise information as to how long spheroids can be propagated is required. If these cultures can be propagated for 10 passages or more it becomes important to determine what nutrients/mitogens in the basal media are driving growth. Alternatively, what is the evidence that spheroid cultures are completely devoid of mesenchymal cells? The text only mentions that "Upon replating, these spheroids could be stably cultured free of mesenchymal cells (Fig.1B)". No validation is shown to support this.

      We agree that "immortal" is not a good way to characterize our spheroids, as also pointed out by referee 1. We have changed that in the text, indicating that the maximal number of replatings we tested was 26 and replacing "immortal" by "stably replatable". Of note, the spheroids could be frozen/thawed and recultured many times.

      Related to the question whether mesenchymal cells could still contaminate the spheroid cultures, we can provide the following answers:

      • No fibroblasts could be seen in replated cultures and multiple spheroids could be repeatedly propagated from a single starting spheroid.
      • The bulk RNAseq experiment comparing organoids to ENR- or BCM-cultured spheroids shows, despite expression of several mesenchymal markers (see matrisome in Fig3), absence of significant expression of Pdgfra (see in revision plan folder for CP20Millions results from the raw data of new suppl table 2, with Clu, Tacstd2 and Alpi shown as controls).
      • Regarding the nutrients/mitogens in the medium driving spheroid growth, we did not explore the point further than showing that they grow in basal medium (i.e. advanced DMEM), given that the presence of Matrigel makes it difficult to pinpoint what is really needed.

      In Figure 2, the authors describe the growth requirements for adult spheroids and indicate that spheroids grown in ENR or EN became dark and shrink. The representative images showing this are clear, but this analysis should be quantified.

      Added to the manuscript.

      In SF3, the gene expression profile of organoids from the sandwich method only partially overlaps with that of organoids from the old protocol. What are the gene expression differences between the 2 culture systems? Secondly, the sandwich method appears to sustain growth of Tom+ spheroids based on RNAseq and the IF images. This suggest that Vil-Cre negative cells are not necessarily the only source of adult spheroids and thus this experiment seems to indicate that any cell may be converted to grow as a spheroid under the right conditions. These points should be addressed.

      Looking back at our data in order to answer the point raised by the referee, we realized that we had inadvertently compared organoids to ENR-cultured spheroids generated by the first protocol to BCM-cultured spheroids generated by the sandwich method. We have corrected this error in a new version of suppl fig3. This shows increased correspondence between genes up- or downregulated in the spheroids obtained in the two protocols (from 49/48% to 57/57%; see the Venn diagram in the new figure). We agree that, even after this correction, the spheroids obtained with the two protocols present sizeable differences in their transcriptome. However, considering the very different way these spheroids were obtained and cultured initially, we do not believe this to be unexpected. The important point in our opinion is that the core of the up- and down-regulated genes typical of the de-differentiation phenotype of adult spheroids is very similar, as shown in the heatmap (which was made with the correct samples!). Also, a key observation is that both kinds of spheroids survive and can be replated in basal medium. As already stated, this characteristic is only seen in rare cases [spheroids obtained from rare FACS-purified cells (Smith et al., 2018) or helminth-infected intestinal tissue (Nusse et al., 2018)]. Together with the observation that the majority of them are not traced by VilCre, this constitutes what we consider the hallmark of the spheroids described in our study. As shown in figure 4E (old protocol) and Suppl Fig.3 (sandwich protocol), both red and white spheroids were extremely low in VilCre expression. As stated in the text, the fact that some spheroids are nevertheless red is most probably related to the extreme sensitivity of the Rosa26Tom marker to recombination (Liu et al., 2013), but this does not mean that there are two phenotypically different kinds of spheroids. It means that the arbitrary threshold of Rosa26Tom recombination introduces an artificial subdivision of spheroids with no phenotypical significance.

      Regarding the point made by the referee that "any cell may be converted to grow as a spheroid under the right conditions", we agree and have shown with others that organoids indeed acquire a spheroid phenotype when cultured, for instance, in fibroblast-conditioned medium (see suppl fig1B and (Lahar et al., 2011; Roulis et al., 2020) quoted in the manuscript). However, these spheroids cannot be propagated in basal medium, and revert to an organoid phenotype when put back in ENR (Suppl fig1B).

      *In Figure 4, the authors conclude that spheroids do not originate from Lgr5 cell derived clones even after 30days post Tam induction. Does this suggest that in vivo and under homeostatic conditions VilCre neg cells are derived from a distinct stem cell pool or are themselves a quiescent stem cell? Given the rarity of VilCre neg cells, the latter seems unlikely.*

      Despite their rarity, we believe VilCre-negative cells observed under homeostatic conditions are themselves quiescent stem cells. Indeed, if they were derived from a larger stem cell pool, this pool should also be VilCre-negative, and we do not see such a larger number of VilCre-neg cells under homeostatic conditions.

      The problem with the original assertion is that Lgr5-CreERT mice are mosaic and therefore not all Lgr5+ cells are labelled in this model. "White" spheroids may thus derive from cells that in turn derive from these unlabelled Lgr5 cells.

      We had considered the possibility that mosaicism [very low for VilCre (Madison et al., 2002); in the 40-50% range for Lgr5CreERT2 (Barker & Clevers. Curr Protoc Stem Cell Biol. 2010 Chapter 5)] could explain our data. We think, however, that we can exclude this possibility on the basis that spheroids do not conform to the expected ratio of unrecombined cells, given the observed level of mosaicism. Indeed, for VilCre, a few percent, at most, of unrecombined cells in the epithelium translates into almost 100% unrecombined spheroids. For Lgr5CreERT2 mice, the mosaicism level is in the range of 40%, which is what we observe for EDTA organoids (Figure 4G), while the vast majority of spheroids were unrecombined.

      We have included a discussion about the possible role of mosaicism in the new version.
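
      The mosaicism argument above can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: it treats each spheroid as an independent draw from the epithelium and asks how likely it would be to see almost exclusively unrecombined spheroids if mosaicism alone were responsible. The spheroid counts are hypothetical placeholders, not data from the study; only the 40% mosaicism level is taken from the text.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Fraction of cells expected to be unrecombined through mosaicism alone
# (about 40% for Lgr5CreERT2, as quoted in the text).
mosaic_fraction = 0.40

# HYPOTHETICAL counts, chosen for illustration only (not measured values).
n_spheroids = 50      # spheroids scored
n_unrecombined = 48   # spheroids found unrecombined

# Probability of seeing at least this many unrecombined spheroids
# if spheroids arose at random with respect to recombination status.
p_value = binomial_tail(n_spheroids, n_unrecombined, mosaic_fraction)
print(f"P(>= {n_unrecombined}/{n_spheroids} unrecombined | mosaicism only) = {p_value:.2e}")
```

      With counts of this kind the tail probability is vanishingly small, which is the quantitative form of the statement that spheroids do not conform to the ratio expected from mosaicism. An exact binomial tail is used here simply because it needs no external library; any equivalent test would serve.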

      ATACseq experiments were briefly mentioned in the manuscript but unfortunately little information was extracted from this experiment. What does this experiment reveal about the chromatin landscape of adult spheroids relative to normal organoids?

      We only performed this experiment to search for an explanation for the paradoxical absence of expression of the VilCre transgene in spheroids, despite robust expression of endogenous villin (Suppl Fig.4). We chose to show the chromatin landscape of a gene equally expressed in both organoids and spheroids (Krt19), a gene specifically expressed in spheroids (Tacstd2) and the endogenous Villin gene also expressed in both. We believe that the observation of a clear difference in the pattern of chromatin accessibility around the endogenous villin gene in organoids and spheroids provides an explanation for the observed results. The cis regulatory sequences needed for expression of the endogenous villin gene seem to be different in organoids and spheroids, which may explain why the regulatory sequences present in the transgene (only 12.4kb) might not allow expression of the transgene in spheroids. We have added a sentence in the manuscript clarifying this point. What is obviously missing is the chromatin landscape around the VilCre transgene, but this is beyond reach in this kind of experiment.

      Reviewer #2 (Significance (Required)):

      The fact that the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field. Previous reports have shown that treatment with various small molecule inhibitors can revert budding organoids into a spheroid morphology, but this manuscript demonstrates that spheroids can also be generated from otherwise untreated cells. This new methodology will provide new tools to dissect the molecular determinants of fetal/regenerative cells in the gut. Based on this, the manuscript should attract a significant amount of attention in the intestinal field.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): CR-2024-02491

      An Lgr5-independent developmental lineage is involved in mouse intestinal regeneration

      Marefati et al.

      Homeostatic maintenance of the intestinal epithelium has long been thought to rely upon Wnt signaling responsive Lgr5-expressing stem cells that reside at the crypt base.

      However, myriad mechanisms or populations have been reported to underlie epithelial regeneration after injury. Many groups have reported that reacquisition of a fetal-like intestinal phenotype is an important part of the regenerative response, however the originating cell type has not been definitively identified. Herein, the authors demonstrate that cells from adult homeostatic intestine can generate immortal spheroids that resemble fetal spheroids and are derived independent of Lgr5+ intestinal stem cells (ISCs). The authors then draw the conclusion that this indicates that a hierarchical stem cell model applies to regeneration of the intestinal epithelium, in addition to the plasticity model.


      Comments:

      1. Please indicate what species is used for studies in Fig 1.

      All experiments were performed in Mus musculus.

      Please clarify if Figure 2 studies utilize Matrigel or not.

      Yes

      RNA-seq analyses of adult intestinal generated spheroids lack the granularity of single cell analyses and thus it is unclear if this is a homogeneous population or if the population has diversity across it (i.e., enteroids/organoids have a high level of diversity). Many of the conclusions from the RNA-seq study are broad and generalized-for example Fig 3F indicates that markers of the +4 ISC populations (Bmi1, tert, lrig1, hopx) were all expressed similarly in adult spheroids as compared to adult organoids. However, while this may be true in the bulk-RNA-seq analyses, clearly scRNA-seq would provide a better foundation to make this statement, as enteroids/organoids are comprised of heterogeneous subpopulations ... and it might indicate that these +4 markers have only very low expression in the spheroids. Based upon these concerns, misconclusions are likely to be drawn.

      We agree that it would be worthwhile to perform scRNAseq of adult spheroid populations in future studies, in order to explore their possible heterogeneity. We nevertheless believe that our scRNAseq performed on homeostatic intestinal tissue from VilCre/Rosa26Tom mice identifies Olfm4-low VilCre-neg cells that are likely at the origin of adult spheroids and display a quite homogeneous phenotype.

      *The language around Figure 4 results is confusing. Please define "white" and "red". It might be simpler to designate recombined versus not recombined lineage.*

      We have clarified this in the figure.

      The hypothesis that collagenase/dispase solution acts as a proxy for injury is not demonstrated and backed by data. Thus, it is difficult to make the conclusion that this approach could represent a "stable avatar" of intestinal regenerating cells. It is clear that subpopulations of crypt-based cells generate spheroids in culture without collagenase/dispase (see the cited reference Smith et al, 2018).

      Smith et al. clearly demonstrate the possibility of obtaining spheroids with properties probably similar to ours from EDTA-derived intestinal crypt cells. However, they need to prepurify them by FACS. Besides, Nusse et al. describe spheroids similar to ours after infection of the intestine by helminths (Nusse et al., 2018). In our case, and for most labs preparing enteroids with the EDTA protocol, the result is close to 100% organoids. Even if we treat EDTA organoids with collagenase, we do not obtain spheroids. This brought us to the conclusion that spheroid-generating cells must be more tightly attached to the matrix than CBCs and that it is their release from the matrix that activates the spheroid regeneration-like phenotype. This hypothesis is supported by several models of regeneration in other tissues as indicated in our references (Gilbert et al., 2010; Machado et al., 2021; Montarras et al., 2005).

      A study based on the absence of recombination in a VilCre lineage tracing scenario is not well established as a strong experimental approach, as there are many reasons why cells may not be recombined and lineage marked. In order to use this system as the authors intend, they first need to demonstrate that villin is not expressed in the discrete cell population that they are targeting. For the presented observational studies, this would be difficult to do. While they do demonstrate differences in chromatin accessibility between cells from organoids versus spheroids (fig s4), some of these differences could merely be due to the bulk analytical nature of the study and the lack of comparing stem cell populations from spheroids to stem cell populations from organoids-since the spheroids are likely homogenous versus the organoids that only have a small fraction of stem cells-and thus represent a mix of stem cell and differentiated cell populations. The authors do not demonstrate that villin protein expression varies in these cells.

      If it were found that villin is not expressed in their "novel" population, then one would expect that the downstream use of villin-based recombination would demonstrate the same recombination potential (i.e., Mcl1 would not be recombined). Both recombination studies in Fig 6 are difficult to interpret, and thus it is not clear if these studies support the stated conclusions. Quantification of number of crypts that are negative should be reported as a percentage of recombined crypts.

      We are sorry, but there seems to be a complete misunderstanding of our data regarding the point raised by the referee. The important point of our initial observation is that despite robust expression of villin in spheroids, the VilCre transgene is not expressed (see figure 4E). This, in our opinion, makes absence of VilCre expression (or of Rosa marker recombination) a reliable marker of a new developmental lineage. All the data in figure 4 constitute an answer.

      *The reasoning about heterogeneity of cell type in organoids versus probable homogeneity of spheroids is well taken. However, as the endogenous villin gene is expressed in all cells of both organoids and spheroids, it is highly significant that only spheroids do not express the transgene.*

      We performed the ATACseq experiment to search for an explanation for the paradoxical absence of expression of the VilCre transgene in spheroids, despite robust expression of endogenous villin (Suppl Fig.4). We chose to show the chromatin landscape of a gene equally expressed in both organoids and spheroids (Krt19), a gene specifically expressed in spheroids (Tacstd2) and the endogenous Villin gene also expressed in both. We believe that the observation of a clear difference in the pattern of chromatin accessibility around the endogenous villin gene in organoids and spheroids provides an explanation for the observed results. The cis regulatory sequences needed for expression of the endogenous villin gene seem to be different in organoids and spheroids, which may explain why the regulatory sequences present in the transgene (only 12.4kb) might not allow expression of the transgene in spheroids. We have added a sentence in the manuscript clarifying this point. What is obviously missing is the chromatin landscape around the VilCre transgene, but this is beyond reach in this kind of experiment.

      *Figure 8 indicates that the cell population identified by scRNA-seq may be quiescent. Companion IF or IHC should be conducted to confirm this finding, as well as other conclusions from the informatics conducted.*

      We agree that additional experiments could be performed to support this point. We are unfortunately not in a position to perform these experiments (see section 4 below).

      Clearly the data is intriguing, however, the conclusion is strong and is an overinterpretation of the presented data. There are a number of validation or extension data that would enhance the overall interpretation of the study: 1. validation of scRNA-seq or bulk RNA-seq concepts by protein staining of intestinal tissues in the damage model will serve as a secondary observation. 2. identification of the ISC that they are defining is critical and important. There is already the notion that this cell type exists and it has been shown with various different markers. 3. expand the analyses of the fetal-like expression profiling to injured intestines to demonstrate that the lineage negative cells indeed express fetal-like proteins. 4. expand the discussion of the Clu+ cell type. Is this cell the previously described revival cell? If so, how does this body of work provide unique aspects to the field?

      We agree that all these suggested experiments could be performed and would be of interest. However, we consider that they would not modify the main message of our study and would only constitute an expansion of the present work. As already stated, we are not in the position to perform them (see section 4).

      *There is some level of conflicting data, with the stem population being proliferative in culture stimulated by the stromal cells, but quiescent in vivo and also based upon scRNA-seq data in Fig 9.*

      We do not see any conflict in our observation regarding this point. The observation that cells that are quiescent in vivo become proliferative when subjected to culture (with or without addition of stromal cells) is routinely made in a multitude of cell culture systems. In particular, it has been shown that intestinal tissue dissociation activates the Yap/Taz pathway, resulting in proliferation (Yu et al. Hippo Pathway Regulation of Gastrointestinal Tissues. Annual Review of Physiology, 2015 Volume 77, 201-227).

      Many of the findings have been previously reported: Population that grows as spheroids (Figure 2), Population that is Wnt independent (Figure 2), Lgr5 independent regenerative growth of the intestine (figure 3F, Figure 4), Clu+ ISCs drive regeneration (Figure 7).

      Whereas these individual findings have indeed been reported, it was in a different context. We strongly disagree with the underlying suggestion that our study would not bring new information. We have identified here a developmental lineage involved in intestinal regeneration that has not been described up to now.

      Minor comments:

        • The statement that spheroids must originate from collagenase/dispase digested material might be an overstatement, as spheroid generation from EDTA treated intestines has been previously reported (Smith et al, 2018).

      See answer to point 4 above.

      *Overall, while the study includes an extensive amount of work and different approaches, a foundationally supported novel finding is missing. Many of the statements have already been demonstrated by others in the field. In addition, one of the most intriguing aspects of the study is that the stromal population impacts this stem cell population, however, interactions and factors stimulating the crosstalk are not addressed.*

      Reviewer #3 (Significance (Required)):

      Overall, while the study includes an extensive amount of work and different approaches, a foundationally supported novel finding is missing. Many of the statements have already been demonstrated by others in the field. In addition, one of the most intriguing aspects of the study is that the stromal population impacts this stem cell population, however, interactions and factors stimulating the crosstalk are not addressed.

      We can only disagree.

      4. Description of analyses that authors prefer not to carry out


      We have answered most questions raised by the referees by explaining our view, by clarifying individual points and, in several cases, by providing additional information that was not included in the original manuscript.

      In a limited number of cases when additional experiments were suggested, we were unfortunately obliged to write that we are not in a position to perform them. This is because my lab is closing after more than fifty years of uninterrupted activity. There will unfortunately be nobody to perform additional experiments.

      Nevertheless, as written by referees 1 and 2, we believe that the revised manuscript, as it stands, contains data that will be of interest to people in the field and may be the basis for future developments. We hope the editors will find interest in publishing it.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2023-02306

      Corresponding author(s): John, Yates


      1. General Statements [optional]

      We greatly appreciate the reviewers taking time from their busy scientific careers to evaluate our manuscript. We were elated to read all the positive comments, such as “the conclusions are well-supported and convincing”, “should contribute to a more nuanced understanding of SCZ pathogenesis”; “The potential implications for drug development underscore the broader significance of the study in advancing our knowledge of neurobiology and its relevance to neurological disorders like schizophrenia”, and “The study is informative, and has great potential to enrich the specific literature of this field”. We also found the constructive criticism very helpful for improving our manuscript. We performed additional experiments and bioinformatic analyses, as requested. We modified the manuscript to answer the reviewers’ questions. Due to its complexity, it is difficult to describe the different and sometimes conflicting hypotheses of SCZ pathogenesis in a single manuscript. This complexity is reflected in the conflicting requests from the reviewers. One reviewer requested we investigate and highlight the role of non-neuronal cells in SCZ while another reviewer suggested we did not focus enough on synaptic proteins. We believe we have achieved a balance to represent the intricacy of SCZ biology and the different opinions of the reviewers.

      Thanks again.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate). In this manuscript, McClatchy and colleagues used a conventional approach combining immunoprecipitation (IP) of endogenous target proteins (baits) followed by liquid chromatography mass spectrometry (MS) analysis of the co-immunoprecipitating proteins to map protein-protein interaction (PPI). This interaction network is centered around baits that had been annotated as susceptibility factors for schizophrenia (SCZ). A variety of previous studies have identified thousands of such SCZ susceptibility factors. Mostly based on the availability of antibodies, 8 bait proteins were selected in this study. The authors reasoned that immunoprecipitating endogenous proteins from tissues using specific antibodies was a more accurate view of physiological conditions than epitope tagging followed by affinity purification (AP) from cells in culture. The model system from which proteins were extracted was the hippocampus dissected from mice that had been treated or not by phencyclidine (PCP), a drug that has been shown to induce SCZ symptoms in humans and animals. By comparing the proteins identified and quantified from the PCP-treated samples against control IPs and/or saline-injected mouse controls, a large number of PPI were deemed statistically significant. Most of these potential interactors were not present in PPI databases (BioGRID), most likely because such databases are populated with large-scale APMS datasets from cell cultures, with very few studies using brain tissue. Strikingly, many of the co-immunoprecipitated proteins were also known as SCZ susceptibility factors, which lends weight to the hypothesis that these factors form a large protein interaction network, localized at the synapses.

      Major comments: - Are the key conclusions convincing? Overall, the conclusions drawn from the experimental design, data analysis, and corroboration with existing literature are well-supported and convincing. When selecting the SCZ susceptibility factors, the authors clearly state their goal, the databases used for gene selection, and the rationale for choosing proteins with synaptic localization. The inclusion of evidence from genetic studies and previous publications strengthens the credibility of the selected genes. The methodology used to establish the novel SCZ PPI network is mostly well-described (see minor comments below). The use of a 15N internal standard also adds rigor to the quantitation of PPI. The GO enrichment analysis provides valuable insights into the biological functions and cellular components associated with the SCZ PPI network. The annotation of identified proteins using the SynGo synaptic database and the distribution of annotated synaptic proteins among different baits further support the biological relevance of this PPI network. The cross-referencing of the PPI network with published genetic studies on SCZ susceptibility genes adds robustness to the findings. Specifically, the observation that 68% of protein interactors have evidence of being potential SCZ risk factors is a strong corroboration of the prevailing hypothesis in the field. Finally, the significant changes induced by PCP that were identified for all baits except Syt1, along with the comparison of altered proteins with SAINT-identified PPI, add depth to the understanding of PCP modulation.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No, but note that APMS/IPMS has been around for more than a decade (Introduction page 3).

      We agree and did not mean to imply that IP-MS is a new technology. Our point was that IP-MS studies of the PPI of endogenous proteins in brain tissue make up a small percentage of all published PPI MS studies.

      We added the following to the Conclusions to clarify this point: “Although IP-LC-MS technology has been employed for more than a decade, quantitation of proteins using this strategy in mammalian tissue is scarce in the literature.”

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. One piece of data that is missing are Western blots using the 8 selected antibodies against the proteins extracted from their experimental samples to validate the antibodies recognize 1 protein of the expected size from these tissue extracts.

      We took your suggestion and performed immunoblots with our 8 IP antibodies using the starting material (i.e. rat brain hippocampus). All antibodies recognized a single band of the approximate molecular weight of the target except for Gsk3b, which produced a doublet instead of a single band. This pattern is similar to what has been observed with the phosphorylation of Gsk3b (Krishnankutty, Kimura et al. 2017, Vainio, Taponen et al. 2021). To provide evidence that the additional band observed for Gsk3b is the phosphorylated target protein, we searched our Gsk3b IP dataset for a differential phosphorylation (i.e. 79.9663) on S, T, or Y. Even though we did not perform phosphorylation enrichment, we identified S389 as abundantly phosphorylated in all Sal and PCP samples, consistent with our immunoblot. Images of these immunoblots are now Supplementary Figure 1.

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Running SDS-PAGE and Western blotting should be straightforward and cheap.

      - Are the data and the methods presented in such a way that they can be reproduced? Yes

      - Are the experiments adequately replicated and statistical analysis adequate? Yes

      Minor comments: - Specific experimental issues that are easily addressable. The rationale for the short duration between PCP injection and animal sacrifice is only explained in the discussion section (page 17). The fact that this short treatment of less than 30 min should prevent any change in transcription or translation should be introduced earlier (in the experimental procedures).

      We agree this is an important aspect of the study and that it suggests that the effect of PCP is independent of changes in transcription and translation as stated in the Discussion.

      We added the following to the Introduction:

      “PCP was administered for less than 30 min, which precluded any changes in transcription or translation and allowed us to focus on PPI.”

      Note that the duration is written as 26 min on page 4 and 25 min on page 9. Please reconcile these numbers.

      We have corrected this typo. It was 26 min.

      Is there any biological significance for this SCZ study that the mice were maintained on a reverse day-night cycle?

      Rats are nocturnal animals, i.e. they are active at night and sleep during the day. In this study, rats were housed on a reverse day-night cycle so that the response to PCP could be assessed during their active phase. This is not specific to SCZ research and is the routine protocol for behavioral testing in the Powell laboratory.

      It is not clear from reading the Experimental Procedures/Bioinformatic Analysis section (page 6) if normalized N14/N15 protein ratios measured in the bait-IPs and control-IPs were used for the SAINT analysis. Or did the authors use label-free quantitation with spectral counts?

      We apologize for not making the methods clearer. In the results, it is stated that the N14 identifications are used in the SAINT analysis, and we state in the Discussion that SAINT uses spectral counts. We modified the Experimental Procedures/Bioinformatic Analysis section (page 6) to state: The input for SAINT was only the 14N identifications.

      - Are prior studies referenced appropriately? Yes

      • Are the text and figures clear and accurate? Fig1C: The workflow is a little too simple, the authors might want to add more details.

      We revised Fig1C with more details as suggested.

      FigS1C: Please add x-axis title (spectral counts) directly to the figure.

      “Spectral counts” was added to the x-axis. FigS1C is now FigS2C, with the addition of the immunoblots you suggested.

      Fig2B-D: The color scale bar should have number values to denote lower and upper limits in % (as opposed to "lowest" and "highest").

      Numerical values were added to replace the upper and lower limits.

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No

      Reviewer #1 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. In this study, the authors have drastically expanded the protein interaction landscape around 8 known SCZ susceptibility factors by using a conventional IPMS approach. Performing the IPs on protein extracted from hippocampus dissected from mice treated with phencyclidine to model SCZ increases the biological significance of such lists of proteins. Furthermore, the co-immunoprecipitation of many other SCZ susceptibility factors along with the 8 selected baits supports the hypothesis that these proteins of varied functions are part of large interaction networks. Overall, the integration of experimental data with in silico networks, along with the quantification of PPI changes in response to PCP, should contribute to a more nuanced understanding of SCZ pathogenesis. The potential implications for drug development underscore the broader significance of the study in advancing our knowledge of neurobiology and its relevance to neurological disorders like schizophrenia.

      • Place the work in the context of the existing literature (provide references, where appropriate). Overall, this study contributes to the existing literature by providing experimental data on in vivo PPI networks related to SCZ risk factors. Not only do the authors validate 124 known interactions but also they identify many novel PPI, due to a gap in the existing literature regarding the comprehensive mapping of PPI directly from tissue extracts, especially brain tissue. The authors advocate for more IPMS studies in mammalian tissues to generate robust tissue-specific in silico networks, which agrees with the growing understanding of the importance of tissue-specific networks for identifying disease mechanisms and potential drug targets. Furthermore, the SCZ PPI network reported here is enriched in proteins previously associated with SCZ, which aligns with the existing literature emphasizing the involvement of certain proteins and pathways in the pathogenesis of SCZ [References: 78-85]. The authors also investigate the response of the SCZ network to PCP treatment, hence providing insights into the potential effects of post-translational modifications, protein trafficking, and PPI alterations in a model of schizophrenia, which adds to existing knowledge about the impact of PCP on the molecular processes associated with SCZ [References: 88, 89, 92].

      • State what audience might be interested in and influenced by the reported findings. Overall, the findings reported in this manuscript have implications for both basic research in molecular biology and potential translational applications in the development of targeted therapies for neurological disorders, particularly schizophrenia. The study delves into in vivo protein-protein interaction (PPI) networks related to genes implicated in schizophrenia (SCZ) risk factors. Researchers in neuroscience, molecular biology, and psychiatry would find the information valuable for understanding the molecular basis of SCZ. The study highlights the potential for identifying disease "hubs" that could be drug targets. Pharmacologists and drug developers interested in targeting protein complexes for drug development, especially in the context of neurological disorders, may find the study relevant.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Technical Expertise | biochemistry, liquid chromatography mass spectrometry, proteomics, computational biology, protein engineering, protein interaction networks, post-translational modifications, protein crosslinking, proximity labeling, limited proteolysis, thermal shift assay, label-free and isotope-labeled quantitation. Biological Applications | human transcriptional complexes, apicomplexan parasites, viruses, nuclear envelope, ubiquitin ligases, non-model organisms.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: McClatchy, Powell and Yates aimed at identifying a protein interactome associated with schizophrenia. For that, they treated rats (N14 and N15) with PCP, which disturbs glutamatergic transmission, as a model for the disease and co-immunoprecipitated hippocampal proteins, which were further analyzed by standard LC-MS.

      The study is new, considering not much has been done in this direction in the field of schizophrenia. This justifies its publication. On the other hand, a major flaw of the study is the lack of information on the level of interaction of the so-called protein interactome. Meaning, we cannot distinguish, as the study was performed, which proteins are directly interacting with the targets of interest from proteins which are interacting with the targets' interactors. The different shells of interaction are crucial information in protein interactomics.

      Major: most of what I am pointing out below must be at least discussed or better presented in the paper, as it may not be solvable considering how the study has been conducted.

      1) The study fails in defining the level of interaction of the protein interactome with the considered targets. This has been shortly mentioned in the discussion, but must be more explicit to readers, for instance, in the abstract, introduction and in the methods sections.

      We agree this is crucial information that is absent from our dataset. As we explained in the Discussion, we cannot distinguish between PPI that are direct interactors with the target protein and PPI that reside in a multi-protein complex that includes the protein (i.e. indirect). This is an inherent problem with any IP-MS study. We amended the Introduction to highlight the ambiguity of the interaction data produced by the IP-MS approach, as you suggested.

      Text added to the Introduction:

      “Regardless of whether Ab or tagged proteins are employed to identify PPI from a biological sample, it cannot be determined if the identified interactor binds directly to the target protein or reside in a complex of proteins that includes the target protein (i.e. indirect).”

      Since this important information is routinely missing from IP-MS studies, we decided to try to determine the level of interaction by using the artificial intelligence algorithm AlphaFold3 (AF3). We believe it is not yet optimized for PPI, but AF3 is a big leap forward in the field of structural biology. For example, we observed that AF3 did not predict highly confident structures for our large membrane target proteins and was unable to validate known direct PPI of these targets. In addition, analyzing data with AF3 is currently not automated or streamlined, so with ~1600 PPI identified in our dataset, we chose to look at one target protein, Ppp1ca. AF3 identified many known direct binding proteins in our Ppp1ca PPI dataset, which gives high confidence to the novel PPI predicted to be direct interactors. The AF3 data is encompassed in an additional Figure 6.

      The following was added to the Results Section:

      “A disadvantage of IP-MS studies is that they cannot distinguish between a PPI that binds directly to the target protein, and a PPI in which the interactor and target protein reside in the same multiprotein complex (i.e. indirect). We sought to predict which PPI may be directly interacting with its target protein by using the artificial intelligence algorithm AlphaFold3 (AF3) (Abramson, Adler et al. 2024). First, we analyzed the predicted AF3 structure of the targets using the pTM score and the fraction of each structure calculated to be disordered (Figure 6A and Supplementary Table 7). Our reasoning was that if targets have poorly resolved structures, it will be difficult to screen them for direct PPI. A pTM score >0.5 suggests that the structure may be correct (the highest confidence score is 1). Undefined or disordered regions hinder the accuracy of the prediction. All targets possessed a pTM score > 0.5 except Syt1. The disordered fraction negatively correlated with the pTM score, as expected. Gsk3b, Ppp1ca, and Map2k1 had the highest pTM scores and were also the smallest of our target proteins (Figure 6B). Ppp1ca had the most confident structure (i.e. pTM 0.9) and the smallest disordered fraction (i.e. 0.07). Next, we determined the AF3 prediction of previously reported direct interactions of the targets. We used the iPTM score to determine interaction confidence. An iPTM score >0.8 is considered a highly confident direct interaction, whereas 0.8. These eight PPI have all previously been reported to form a direct interaction with Ppp1ca, except Phactr3 (Zhang, Zhang et al. 1998, Terrak, Kerff et al. 2004, Hurley, Yang et al. 2007, Marsh, Dancheck et al. 2010, Ragusa, Dancheck et al. 2010, Ferrar, Chamousset et al. 2012, Choy, Srivastava et al. 2024, Xu, Sadleir et al. 2024). Phactr3 is structurally similar to, but less studied than, the reported direct interactor Phactr1. 
These interactors are all inhibitors of PP1 except Ppp1r9b, which targets Ppp1ca to specific subcellular compartments. Nine PPI were assigned a score

      The following has been added to the Discussion:

      Our SCZ PPI network consists of two types of PPI: direct physical interactions and “co-complex” or indirect interactions. Typically, the nature of the interaction cannot be distinguished in IP-MS studies. We decided to employ the new AF3 algorithm to screen the PPI of Ppp1ca to provide evidence for direct interactors. We chose to examine the PPI assigned to Ppp1ca, because its structure was the most confident among our target proteins and AF3 correctly predicted a known direct interactor with high confidence. Ppp1ca is a catalytic subunit of the phosphatase PP1, which is required to associate with regulatory subunits to create holoenzymes (Li, Wilmanns et al. 2013). Eighteen PPI were predicted to be directly interacting with Ppp1ca using a 0.6 or higher iPTM filter. This filter may be too conservative and generate false negatives, because another study employed a 0.3 filter followed by additional interrogation to screen for direct PPI (Weeratunga, Gormal et al. 2024). Forty-four percent of these predictions were confirmed by previous publications. Most of the validated direct interactions are inhibitors of the phosphatase, but one, Ppp1r9b (aka spinophilin), is known to target Ppp1ca to dendritic spines to enhance its activity to specific substrates (Allen, Ouimet et al. 1997, Salek, Claeboe et al. 2023). This high correlation with the literature provides substantial confidence in the novel PPI predicted to be direct Ppp1ca interactors. The AF3 screen predicted that NDRG2 directly interacts with Ppp1ca. This protein is known to regulate many phosphorylation-dependent signaling pathways by directly interacting with other phosphatases including Pp1ma and PP2A (Feng, Zhou et al. 2022, Lee, Lim et al. 2022). The actin-binding protein Capza1 was also predicted to directly interact with Ppp1ca, and Ppp1ca interacts with actin and its binding proteins to maintain optimal localization for efficient activity to specific substrates (Foley, Ward et al. 2023). 
Hsp1e is a heat shock protein predicted to directly interact with Ppp1ca. Although there is no direct connection to Ppp1ca, other heat shock proteins have been reported to regulate Ppp1ca (Mivechi, Trainor et al. 1993, Flores-Delgado, Liu et al. 2007, Qian, Vafiadaki et al. 2011). We also observed that many of these direct PPI were altered with PCP treatment. One direct interactor, Ppp1r1b (aka DARPP-32), is phosphorylated at Thr34 by PKA in the brain upon PCP treatment. This phosphorylation event converts Ppp1r1b to a potent inhibitor of Ppp1ca (Svenningsson, Tzavara et al. 2003). Importantly, manipulation of Thr34 attenuated the behavioral effects of PCP. Consistent with this report, the Ppp1r1b-Ppp1ca interaction was only observed with PCP in our study. Further investigation is needed to determine if our novel direct interactors regulate the PCP phenotype. We conclude that AF3 can provide important structural insights into the nature of PPI obtained from large scale IP-MS studies.
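
      The iPTM screening logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the label strings, and the interactor names and scores are hypothetical placeholders, and only the two cutoffs (0.8 for a highly confident direct interaction, 0.6 as the screening filter) come from the text.

```python
def classify_by_iptm(scores, direct_cutoff=0.6, high_cutoff=0.8):
    """Label each candidate interactor by its iPTM score.

    Cutoffs follow the response text: iPTM > 0.8 is treated as a highly
    confident direct interaction, and iPTM >= 0.6 as the screening filter.
    """
    calls = {}
    for name, iptm in scores.items():
        if iptm > high_cutoff:
            calls[name] = "high-confidence direct"
        elif iptm >= direct_cutoff:
            calls[name] = "predicted direct"
        else:
            calls[name] = "not called direct"
    return calls


# Hypothetical scores for illustration only; these are not values from the study.
example_scores = {"InteractorA": 0.85, "InteractorB": 0.65, "InteractorC": 0.30}
calls = classify_by_iptm(example_scores)
```

      Lowering `direct_cutoff` (for example to the 0.3 used in the other study cited above) would admit more candidates at the cost of needing further interrogation, which is the trade-off the response describes.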

      2) Considering the protein extraction protocol, it is fair to mention that only the most soluble proteins are being considered here. I am bringing this up since the importance of membrane receptors is clear in the studied context.

      This is an interesting point. It has been predicted that transmembrane proteins constitute 25-30% of the proteome (Dobson, Remenyi et al. 2015). Thus, we would predict our dataset will have more soluble proteins than membrane proteins. Half of our target proteins were transmembrane proteins, so in designing the protocol for this study we ensured that these membrane proteins could be significantly enriched compared to the control IPs (Supplementary Figure 2C). In addition, compared to soluble proteins, membrane proteins are notoriously difficult to identify by bottom-up proteomics (Savas, Stein et al. 2011). We decided to investigate how many of our protein interactors were transmembrane proteins. Using Uniprot, 199 (20%) of our protein interactors were determined to have a transmembrane domain. Therefore, this data does not support the statement that only the most soluble proteins are being considered in our study. We added this percentage of transmembrane proteins in our network to the text of the Results section.

      3) It is not clear from the methods description if antibodies from all 8 targets were all together in one Co-IP or have been incubated separately in 8 different hippocampi samples. It seems the first, given how results have been presented. If so, this maximizes the major issue raised above (in 1).

      We apologize for not clearly describing our experimental design. All the targets were immunoprecipitated separately and analyzed separately on the mass spectrometer. With all the biological replicates and two conditions (i.e. Saline and PCP), we performed 48 individual, separate IPs. There were an additional 48 individual, separate IPs run in parallel that were the control IPs.

      We modified the schematic of our experimental design in Figure 1C to clarify that the 8 targets IPs were analyzed separately. In addition, we modified the Results to read:

      “In total, 96 (48 bait and 48 control) IPs were performed, and each was analyzed separately by LC-MS analysis.”
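
      The IP counts quoted above can be checked with a short Python sketch. The variable names are hypothetical, and the replicate count is not stated in the response; it is only what the quoted totals imply if every target had the same number of replicates per condition.

```python
# Figures taken from the response text.
n_targets = 8      # bait proteins, each immunoprecipitated separately
n_conditions = 2   # saline and PCP
bait_ips = 48      # individual bait IPs reported
control_ips = 48   # control IPs run in parallel

# Implied replicates per target per condition (an inference, assuming
# an equal number of replicates for every target and condition).
replicates_per_condition = bait_ips // (n_targets * n_conditions)

# Total number of IPs analyzed separately by LC-MS.
total_ips = bait_ips + control_ips
```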

      4) Definitely, results here are not representing a "SCZ PPI network". PCP-treated animals, as any other animal model, are rather limited models to schizophrenia. As a complex multifactorial disease, synaptic deficits, which is the focus of this study, can no longer be considered "the pivot" of the disease. Synaptic dysfunction is only one among many other factors associated to schizophrenia.

      We do agree that synaptic dysfunction is only one factor associated with SCZ and we will discuss this more in our response to your next comment.

      We understand the limitations of PCP as an animal model of SCZ. It is quite difficult to model a specific human complex multifactorial neurological disease in rodents and we would contend that there is no single universal SCZ model that everyone agrees with. We addressed this by adding the following to the Introduction:

      “Since many SCZ symptoms are uniquely human, there is no single animal model that truly replicates all the complex human SCZ phenotypes (Winship, Dursun et al. 2019). In this respect, all SCZ animal models can be considered limited.”

      We respectfully disagree, however, with the objection to the term SCZ PPI network. This study is focused on SCZ by choosing proteins implicated in SCZ, quantitating how the PPI change in a SCZ model, and discussing how our findings are relevant to SCZ pathogenesis. So, it seems logical to call our dataset a SCZ PPI network. We do concede that without further experimentation we do not know if these PPI play a causal role in SCZ. Furthermore, our novel PPI may involve biological pathways that are unrelated to SCZ and have relevance to other biological conditions.

      We added the following statement to the Discussion to address this comment:

      “Even though our network was constructed in the context of SCZ, our dataset has relevance to other neurological diseases where our targets have been implicated in the pathogenesis.”

      5) Authors should look for protein interactions that might be happening also in glial cells. They are not the majority in hippocampus, but are present in the type of tissue analyzed here. Thus, some of the interactions observed might be more abundantly present in those cells. Maybe enriching using bioinformatics tools the PPI network to different cell types.

      As mentioned above, we agree that synaptic dysfunction is just one of the hypotheses of SCZ pathogenesis, and emerging evidence suggests that dysfunction in astrocytes and microglia is also a factor. Since these non-neuronal cells can regulate synapses, these hypotheses are not mutually exclusive and suggest that at the cellular level SCZ etiology involves multiple cell types.

      We addressed your query by comparing our PPI network to an RNA-seq analysis of different cell types in the rodent brain (Zhang, Chen et al. 2014). First, we analyzed our target proteins and found that they were expressed in all cell types to varying degrees, except Syngap, which was not in the RNA-seq database. This data is now represented in Figure 3E. We then determined the RNA abundance distribution of all the protein interactors, which is represented in Figure 3D as a heatmap. From a bird’s eye view, it suggests that some PPI exist in non-neuronal cells. Next, we determined how many of our protein interactors were enriched in one cell type, which is shown in Figure 3F. We defined an enriched protein as having >50% of the RNA signal in one cell type. We identified 175 proteins that were enriched in one cell type, compared to 4008 enriched proteins in the entire RNA-seq dataset. In the entire RNA-seq dataset, 24% of the enriched proteins were in neurons, whereas 47% of our enriched protein interactors were in neurons. This is consistent with the enrichment of synaptic proteins in our network. There was also an increased percentage of astrocytes (19%) and oligodendrocytes (6%) in our network compared to the entire database (i.e. astrocytes-11% and oligodendrocytes-4%). In other cell types, such as microglia, there was less protein enrichment in our network compared to the database. We have added this cell type analysis to our manuscript and concluded that a portion of our PPI network may occur in non-neuronal cells. We also created a supplementary table of our network with its associated RNA-seq data.
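
      The enrichment rule described in this response (a protein counts as enriched when more than 50% of its RNA signal falls in one cell type) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name, the protein names, and the expression values are hypothetical.

```python
def enriched_cell_type(expression, threshold=0.5):
    """Return the cell type carrying more than `threshold` of the total
    RNA signal for one protein, or None if no cell type dominates."""
    total = sum(expression.values())
    if total <= 0:
        return None  # protein not detected in any cell type
    cell_type, signal = max(expression.items(), key=lambda kv: kv[1])
    return cell_type if signal / total > threshold else None


# Hypothetical expression values for illustration only.
example = {
    "ProteinA": {"neuron": 80.0, "astrocyte": 10.0, "microglia": 10.0},
    "ProteinB": {"neuron": 30.0, "astrocyte": 40.0, "microglia": 30.0},
}
calls = {name: enriched_cell_type(expr) for name, expr in example.items()}
```

      Because the cutoff is a strict majority, at most one cell type can qualify per protein, so each interactor is counted in a single enrichment category or in none.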

      Text added to the Results:

      “Non-synaptic proteins represented 59% of our network, suggesting that some PPI may occur in non-neuronal cells. To investigate this possibility, we annotated our network with a transcriptome rodent brain database of eight cell types (Zhang, Chen et al. 2014). All the targets were detected in all cell types but there was obvious enrichment in specific cell types for some targets (Figure 3E). Syngap1 was not in the database. We also observed a large variation of cellular distributions for the interactors (Figure 3D). Next, we sought to determine how many interactors are enriched in a particular cell type by defining cell enrichment as a protein having >50% RNA signal in one cell type. We identified 175 protein interactors enriched in one cell type, whereas the entire database had 4008 proteins enriched (Figure 3F). Consistent with our synaptic enrichment, 47% of the enriched protein interactors were in neurons whereas only 24% of the enriched proteins in the entire database were in neurons. We also observed an increase in protein interactors enriched in astrocytes compared to the database. Overall, this analysis provides evidence that our identified PPI may occur in non-neuronal cells.”

      Text added to the Discussion:

      “The exact etiology of SCZ, however, remains unclear and synaptic dysfunction is only one hypothesis (Misir and Akay 2023). There is evidence for the involvement of non-neuronal cell types, including endothelial cells, astrocytes, and microglia (Tarasov, Svistunov et al. 2019, Rodrigues-Neves, Ambrosio et al. 2022, Stanca, Rossetti et al. 2024). Although we observed an enrichment of synaptic proteins in our SCZ network, we provided evidence that a portion of our network may occur in non-neuronal cells. Since non-neuronal cells can regulate synapses (Vilalta and Brown 2018, Bauminger and Gaisler-Salomon 2022), synaptic dysfunction and perturbations in non-neuronal cells in SCZ etiology are not mutually exclusive. Our data corresponds with emerging evidence that pathogenesis is multifaceted, involving dysfunction in multiple cell types.”

      Minor: 1) in the abstract, it is not clear if 90% of the PPI are novel to brain tissue in general or specifically schizophrenia.

      We apologize for the confusing sentence. 90% are novel, meaning the PPI have not been reported in any study. We changed the abstract to read:

      “Over 90% of the PPI have not been previously reported.”

      2) authors refer to LC-MS-based proteomics as "MS" all across the text. Who am I to say this to Yates et al, but I think it is rather simplified to use "Mass Spectrometry Analysis", when this is a typical LC-MS type of analysis.

      We agree with you. We have replaced MS analysis with LC-MS analysis in the manuscript.

      3) Several references used to construct the hypothesis of the paper are rather outdated: several from 10-15 years ago. It would be interesting to provide to the reader up to date references, given the rapid pace science has been progressing.

      We agree many of the references are 10-15 years old. Many of the hypotheses and biological mechanisms we discussed are supported by more studies than space allows us to cite. If we could, we would. We also agree that there are many more recent studies that have confirmed and added more details to the original discovery or hypothesis cited. We cite the first study to support our conclusions because it deserves the most credit.

      4) "UniProt rat database". Please, state the version and if reviewed or unreviewed.

      This information was added to the Methods section. UniProt reviewed rat database with isoforms 03-25-2014.

      Reviewer #2 (Significance (Required)):

      The study is informative, and has great potential to enrich the specific literature of this field. But should tone down some arguments, given the experimental limitations of the PPI network (as described above) and should state PCP-treated rats as a limited model to schizophrenia.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      It is now widely accepted that schizophrenia is a polygenic disorder in which a large fraction of the genetic risk is in variants affecting the expression of synaptic proteins. Moreover, it is known that these synaptic proteins are found in multiprotein complexes and that many proteins encoded by schizophrenia risk genes interact directly or indirectly in these complexes. It is also known that some drugs including phencyclidine, which binds to NMDA receptors and to Dopamine D2 receptors (not mentioned by the authors) can induce schizophreniform psychosis. The authors have set out to advance on this position by performing proteomic mass spectrometry studies on proteins identified as encoded by schizophrenia risk genes. They target 8 proteins for immunoprecipitation from rat brain and identify coisolated proteins and perform various network analyses. In the most interesting part of the paper they ask if PCP-treatment altered protein interactions and report various changes.

      Major comments:

      1. Choice of target proteins. It was not until the first paragraph of the results section that the authors first name the 8 synaptic proteins that they have chosen to study. This information should be in the abstract.

      This information was added to the abstract as requested.

      The authors then use figure 1A and 1B as evidence that these 8 "baits" are schizophrenia-relevant proteins. Figure 1A does not provide any evidence at all and Figure 1B is about as weak a line of evidence imaginable - a histogram of the number of papers that have the search term "schizophrenia" and the protein name. I tried this search for Grin2B and almost immediately found papers that reported no association between Grin2B and schizophrenia (e.g. PMID: 33237434). Figure 1B should be scrapped.

      The purpose of Figure 1A was not to demonstrate that there is evidence that our proteins are involved in SCZ. The purpose of this figure is to show that these proteins are diverse in function and structure (blue = membrane proteins; yellow = soluble proteins), and that there are published studies reporting physical and functional interactions between these 8 proteins. This suggests that a more extensive network may exist.

      We agree that Figure 1B does not specifically describe how each protein is related to SCZ but demonstrates how many papers investigating their connection to SCZ have been published. We understand how by itself, this can be considered weak. We still think it is important to show that multiple laboratories have published papers connecting these proteins to SCZ. Instead of scrapping this figure, we have moved it to the Supplementary Figure 2A.

      We read PMID: 33237434 and interpret their findings quite differently than you. This report examined whether one single nucleotide mutation (SNV) in Grin2b is associated with the cognitive dysfunction in SCZ but did not examine if this mutation is associated with the other major SCZ phenotypes (i.e. psychotic and emotional). Specifically, the study selected 117 “patients in whom cognitive dysfunctions are present despite effective antipsychotic treatment of other schizophrenia symptoms.” The study concluded that the Grin2B SNV was not associated with this subset of patients but concluded that they need to search for other NMDAR variants and study their association with SCZ. We would argue that the only reason this group performed these experiments was the well-known association between Grin2b and SCZ. Many studies have found SNVs in Grin2B that are associated with SCZ, but there are conflicting reports. It is unclear if the discrepancies are connected to different cohorts, complexity of the SCZ phenotype, or small sample sizes. Regardless of whether Grin2B mutations are significantly associated with SCZ, there are several lines of evidence that Grin2B is involved in SCZ. Most importantly, Grin2b is a component of the NMDAR, which is a key player in the SCZ hypo-glutamate hypothesis and the receptor that binds PCP. By immunoprecipitating Grin2b, we are analyzing the PPI network of NMDAR, which is arguably the most studied complex in SCZ research.

      The remaining part of paragraph 1 of the results does not provide an adequate, let alone systematic, justification for the use of the 8 baits. It would be appropriate to construct a table with the 8 proteins and cite relevant papers and identify the basis for why they are implicated in schizophrenia (is it a direct mutation or some other evidence?). What makes these 8 proteins better than many others that are cited as synaptic schizophrenia relevant proteins?

      We apologize for not clearly and thoroughly describing the reasons for choosing our baits. As stated in the first paragraph of the Results, we chose the proteins that had evidence of being a SCZ risk factor in SCZ databases that included a plethora of human genomic studies. This criterion by itself results in ~5000 genes. To further narrow our candidates, we chose targets that were synaptic and were observed to have phosphorylation changes in response to PCP in an SCZ animal model. Since protein-protein interactions (PPI) are often dependent on phosphorylation, we believe this is an important criterion for quantitation of PPI in response to PCP. These requirements still resulted in a list of hundreds of proteins. So, what makes these better than any other SCZ relevant protein? As stated in the manuscript, the major limiting criterion was identifying commercial antibodies that can efficiently immunoprecipitate their target in brain tissue. Since there are many reports associating our targets with SCZ, we directed the reader to SCZ databases that compile large genomic association studies. We understand, however, the request for more specific information regarding the biological connection between these proteins and SCZ. We took your suggestion and constructed a table with our 8 targets, and it is now Figure 1A. In this table, we selected references to indicate if the target has reported changes in expression and/or activity in SCZ samples (i.e. human and animal model) or genetic association with SCZ in human studies.

      The methods of protein extraction are particularly concerning. The postsynaptic density of excitatory synapses (which contains several of the target proteins in this study) has been notoriously difficult to solubilise unless one uses high pH (9) and harsh detergent extraction (1% deoxycholate). The authors use pH 7 and weak detergent conditions, which are likely to be inefficient for solubilising at least several of the target proteins. Nowhere do the authors report how much of the total of their target protein is being solubilised. Indeed, there are no figures showing biochemical conditions at all. What if only a small percentage of the target protein is being immunoprecipitated - what does this mean for the interaction data? How do we know if the fraction being immunoprecipitated is from the synapse? (why did they not use synaptosomes).

      The absence of this kind of data undermines the reader's confidence in the findings.

      We apologize for not clearly explaining our experimental design. We were not interested in identifying the PPI of the PSD. All these proteins have been localized to the synapse, but they are also localized to other neuronal compartments and non-neuronal cell types. Synaptic dysfunction is one hypothesis of SCZ pathogenesis, but there is also evidence implicating other cell types, including astrocytes, microglia, and oligodendrocytes (Kerns, Vong et al. 2010, Ma, Abazyan et al. 2013, Goudriaan, de Leeuw et al. 2014, Park, Noh et al. 2020). For these reasons, we chose an unbiased approach to identifying PPI.

      The Results have been amended to read: “All the targets are localized to the synapse, but also localized to non-synaptic compartments and expressed in non-neuronal cells. Thus, since there is also evidence for non-synaptic perturbations contributing to SCZ pathogenesis, we chose to perform an unbiased analysis in unfractionated brain tissue (Tarasov, Svistunov et al. 2019, Rodrigues-Neves, Ambrosio et al. 2022, Stanca, Rossetti et al. 2024).”

      Why did we choose a specific solubilization strategy? Harsh detergents can disrupt PPI and prevent efficient enrichment of the target by disrupting the target-antibody interaction (Pankow, Bamberger et al. 2015). For this reason, mild detergent conditions are typically employed in PPI studies. We used a combination of “weak” detergents (i.e. 0.5% NP-40, 0.5% Triton, and 0.01% Deoxycholate) to help prevent non-specific PPI while still allowing efficient enrichment of the target proteins. We do agree that with our conditions the targets were not completely solubilized. It is a balancing act to find the correct conditions for IP-MS analysis. Since we are unable to immunoprecipitate all the target protein, we did not identify all the PPI for each target, and we did not make this claim. Importantly, we did identify known interactions for all our targets. Our mild detergent protocol is similar to other PPI studies and our results validate results reported in previous studies. It is more important to significantly enrich the target protein over control than to achieve complete solubilization (Supplementary Figure 2D). This allows us to use control IPs to successfully employ the SAINT algorithm to determine which proteins are confident PPI using a 5% FDR.

      How do we know proteins are being immunoprecipitated from the synapse? As we show in Figures 2B and 3A, multiple proteins are annotated to the synapse by different databases, Gene Ontology and SynGO. Well-known synaptic PPI were also observed, such as Grin2B-Dlg4 (i.e. PSD-95), providing further evidence that proteins are being immunoprecipitated from the synapse. Besides validating over a hundred published PPI, we also identified many reciprocal interactions between the target datasets, demonstrating the reproducibility of our protocol. Thus, we respectfully disagree with you and assert that our PPI network is of high confidence.

      The immunoprecipitation protocol is unusual in that the homogenates were incubated overnight (twice), which is a very long period compared to most published protocols. This is a concern because spurious protein interactions could form during this long incubation.

      There are many different immunoprecipitation protocols in the literature. The IP conditions depend upon the target protein and the antibody employed. Specifically, the abundance of the target and the affinity of the antibody to the target will dictate the IP conditions. We routinely perform overnight incubation for our IP-MS studies (Pankow, Bamberger et al. 2016, McClatchy, Yu et al. 2018). In our experience with brain tissue, this results in the highest enrichment of the target protein and the best reproducibility between biological replicates compared to IP protocols with shorter incubation times. Many other laboratories use overnight incubations (Lin and Lai 2017, Iqbal, Akins et al. 2018, Lagundzin, Krieger et al. 2022), so we do not consider our protocol unusual. We do find that IPs with tagged proteins in cell culture are more amenable to short incubation times. We have no evidence that overnight incubation causes spurious protein interactions, nor could we find any in the literature. Non-specific interactions are a concern with IP-MS experiments regardless of the incubation time. We took multiple steps to keep non-specific PPI from affecting our dataset. The first overnight incubation was a preclearing step: the brain lysate was incubated with agarose beads linked to IgGs to remove “sticky” non-specific interactors that bind to IgGs and the beads. In addition, control IPs with IgG crosslinked to beads were incubated with brain lysate in parallel to each target IP. We computationally compared the non-specific control IPs with the target IPs using the SAINT algorithm to generate a confident list of PPI with a stringent 5% FDR. Therefore, our pipeline is specifically designed to prevent spurious PPI.

      In the section "Biological interpretation of scz PPI network". Surprisingly the authors found that synaptic proteins that are exclusively postsynaptic (Grin2B, SynGAP) or exclusively presynaptic (Syt1) show very high percentages of their interacting proteins are from the synaptic compartments where the target protein is not expressed. The authors offer no explanation for this paradox. One explanation for this could be that spurious PPIs have formed in the protein extraction/immunoprecipitation protocol. These findings need validation by biochemical fractionation of synapses into pre and post synaptic fractions and immunohistochemistry to demonstrate the subsynaptic localisation of the proteins.

      Grin2b is traditionally described as exclusively post-synaptic, but there is evidence for other localizations, including presynaptic (Berretta and Jones 1996, Sjostrom, Turrigiano et al. 2003, Bouvier, Larsen et al. 2018) and expression in astrocytes (Serrano, Robitaille et al. 2008, Lee, Ting et al. 2010, Lalo, Koh et al. 2021, Kim, Choi et al. 2024). Syngap has been localized to non-synaptic sites and glia expression in addition to its heavily studied role at the post synapse (Moon, Sakagami et al. 2008, Araki, Zeng et al. 2015, Birtele, Del Dosso et al. 2023). Syt1 is commonly used as a presynaptic marker, but along with other proteins previously reported to be exclusively presynaptic (such as SNAP-25), it has been localized to the postsynapse (Selak, Paternain et al. 2009, Tomasoni, Repetto et al. 2013, Hussain, Egbenya et al. 2017, Madrigal, Portales et al. 2019, Sumi and Harada 2023). Similarly, SynGo database assigns both post-synaptic and pre-synaptic localizations to Grin2b as stated in the manuscript. Thus, our data is not paradoxical, but supports the emerging evidence against the canonical exclusivity of the pre- and post-synaptic compartments. Determining subsynaptic localization of a protein is a huge undertaking and requires expertise we do not possess. This is why we relied on synaptic databases and the literature for our interpretation of our data, as other publications have done.

      We added the following to the Discussion to address this issue:

      “Using the SynGo database, 418 proteins (i.e. 41% of our network) were identified as synaptic proteins consistent with the targets having a synaptic localization. Defining the synaptic proteome is inherently difficult because the synapse is an “open organelle”, and many synaptic proteins also have non-synaptic localizations and are expressed in non-neuronal cells. We further attempted to define our synaptic PPI by differentiating between pre- and post- synaptic compartments via SynGo. Half of our targets were annotated to both compartments and all targets had PPI that were annotated to both. This data supports the emerging evidence against the canonical localization exclusivity of the pre and post synapse(Bouvier, Larsen et al. 2018, Madrigal, Portales et al. 2019).”

      My concerns about spurious interactions are raised again because the authors say that 92% of their interactions are novel (I note that the authors have not compared their interaction data of the NMDA receptor with published datasets from Dr Seth Grant's laboratory). BioGrid itself is good but not enough for comparison; at this point it may be worth using String, which accumulates several sources of PPIs, and selecting just the direct PPIs.

      Since the IP-MS experiments in our study have never been performed before, we are not surprised by the extent of novel data we produced. As described above, we took many steps to prevent spurious PPI from entering our final dataset, including the use of detergents, preclearing and stringent bioinformatic filtering. Our entire dataset is very large, so the 8% of PPI that we replicated from other studies represents 124 interactions. We believe this to be an impressive number that supports the confidence of our data. Providing more confidence, we identified many reciprocal PPI where shared protein interactors between target proteins were identified in both target protein datasets.
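The replication figure above comes from intersecting our interaction list with previously reported interactions. A minimal sketch of that kind of comparison is given below, assuming interactions are treated as unordered, case-insensitive pairs of gene symbols. The function names are illustrative, `ProteinX` and `ProteinY` are hypothetical placeholders, and the toy lists are not our data or BioGrid's.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Pair = FrozenSet[str]


def normalize(pairs: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """Turn (target, interactor) tuples into unordered, upper-cased pairs.

    Self-pairs are dropped so that every element has exactly two members.
    """
    return {
        frozenset((a.upper(), b.upper()))
        for a, b in pairs
        if a.upper() != b.upper()
    }


def replicated_fraction(observed, reference):
    """Return the observed pairs also present in the reference, and their share."""
    obs = normalize(observed)
    ref = normalize(reference)
    replicated = obs & ref
    fraction = len(replicated) / len(obs) if obs else 0.0
    return replicated, fraction


# Toy input only: ProteinX and ProteinY are hypothetical placeholders.
observed = [
    ("Grin2b", "Dlg4"),
    ("Grin2b", "ProteinX"),
    ("Ppp1ca", "Ppp1r1b"),
    ("Syt1", "ProteinY"),
]
reference = [("DLG4", "GRIN2B"), ("PPP1CA", "PPP1R9B")]

replicated, fraction = replicated_fraction(observed, reference)
print(len(replicated), f"{fraction:.0%}")
```

Treating pairs as unordered means an interaction reported as Dlg4-Grin2b in the reference still counts as a replicate of Grin2b-Dlg4.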

      The PPI described for our targets in BioGrid encompassed 713 publications. Two of the BioGrid datasets that were compared to our Grin2b PPI data were from the laboratory of Seth Grant. Arbuckle et al (2010) is a low-throughput paper that describes a Grin2b and DLG4 PPI (that we also identified) and Husi et al (2000) is a seminal paper using high-throughput LC-MS to identify PPI in the PSD of mouse brain. There were many differences between Husi et al and our pipeline. Husi et al employed the C-terminal Grin2b peptide to pull down interactors from the PSD fraction whereas we employed Grin2b antibody to enrich Grin2b and its interactors from unfractionated brain tissue. Despite these differences, our studies found 8 proteins in common.

      We took your suggestion and compared our data to String which includes direct PPI and functional PPI. Our input was the high confidence PPI identified by SAINT with 5% FDR as with the BioGrid comparison. The PPI network for each target protein had a more significant enrichment (p

      We think the problem you suggest with SynGO is more of an inherent problem with characterizing the synaptic proteome. The synaptic proteome is difficult to define since it is an “open organelle” with proteins transporting in and out. In addition, most synaptic proteins, such as mitochondrial and translational proteins, also have non-synaptic localizations. It is not possible to isolate a contaminant-free “pure” synaptic preparation by biochemical fractionation. Recently, SynGO was used in a meta-analysis of previously published PSD datasets (Kaizuka, Hirouchi et al. 2024). Kaizuka et al. found 123 proteins identified in 20 PSD datasets. SynGo annotated proteins with post-synaptic localization from this list. To a lesser extent they also identified presynaptic localizations, but it is unclear if the presynaptic proteins are novel localizations. Kaizuka et al. continued the investigation and identified a novel PSD protein, thus demonstrating that our knowledge of pre- and post- synaptic proteomes is incomplete.

      Minor comments

      1. A number of papers have reported protein interactions of native NMDA receptor complexes and their associated proteins isolated from rodent brain, and these are not referenced in this paper. It would be relevant to compare these published datasets with the Grin2B IP datasets.

      We employed BioGrid as a reference of reported PPI for each of our target proteins. For Grin2B, the PPI came from 142 different publications. For our eight target proteins, we decided BioGrid was the best resource for determining the novelty of our PPI because it is routinely used for large-scale unbiased PPI analysis. To determine the novelty of our network, we compared our PPI network to 713 publications via BioGrid. We are unsure whether the papers you are referring to are included in the BioGrid database. To make it easier for readers with similar queries, we added an additional supplementary table (TableS4) including all the publications (i.e. PMID numbers) included in the BioGrid comparison for each target protein.

      We amended the Results with the following sentence so that readers can appreciate the extent of the BioGrid comparison analysis:

      “There were 713 publications in BioGrid that describe at least one interaction with one of our targets (Supplementary Table4).”

      The use of the term "bait" in purification experiments typically refers to a protein and not an antibody. I suggest removing the word bait to avoid ambiguity and simply use the word target.

      We took your suggestion and used “target” instead of “bait” to avoid ambiguity.

      26 mins of treatment gives a completely different set of PPIs between PCP and saline, which is very interesting, so both networks should be included in the Supplementary. Also, it would be useful to have a list of modulated (phosphorylated in their case, but also ubiquitinated etc) proteins, which is not presented.

      Table S1 lists the PPI for each target, and we designated whether the interactors were for Sal, PCP, or both. Phosphorylated and ubiquitinated proteins are very hard to reproducibly identify without an additional enrichment step. Since we did not perform this enrichment step, we did not search for these modifications and do not have any modified proteins to report.

      As they say, their final network is composed of "direct physical" and "co-complex" interactors and they cannot distinguish between them. This is particularly bad for the postsynapse, where all the PSD components can be co-IP-ed in different combinations. It can explain Figure 5C, where most of the proteins have FDR = 1, which means they do not reproduce.

      Figure 5C represents the intersection of 15N quantification and SAINT analysis. The x-axis is the FDR reported for SAINT analysis, and the y-axis is the significant proteins from the N15 analysis. This figure demonstrates that some proteins that were significantly different with PCP via N15 quantification also were annotated as PPI by SAINT (i.e. 5%. As stated in the Discussion, we concluded that the SAINT analysis and N15 quantitation are complementary in identifying PPI and that the quantification of a biological perturbation may aid the identification of PPI. Figure 5C is not related to whether our PPI are direct physical or "co-complex" interactors. Distinguishing between direct physical and co-complex interactors is an inherent problem for all IP studies. Since another reviewer also highlighted this deficit in our manuscript, we decided to analyze our PPI dataset with the artificial intelligence algorithm AlphaFold 3 (AF3). The AF3 data is encompassed in Figure 6.

      The following AF3 data was added to the Results Section:

      “A disadvantage of IP-MS studies is that they cannot distinguish between a PPI that binds directly to the target protein, and a PPI in which the interactor and target protein reside in the same multiprotein complex (i.e. indirect). We sought to predict which PPI may be directly interacting with its target protein by using the artificial intelligence algorithm AlphaFold3 (AF3) (Abramson, Adler et al. 2024). First, we analyzed the predicted AF3 structure of the targets using the pTM score, and determined the fraction of each structure that was calculated to be disordered (Figure 6A and Supplementary Table7). Our reasoning was that if our targets have poorly resolved structures then it will be difficult to screen for direct PPI. A pTM score >0.5 suggests that the structure may be correct, with the highest confidence equaling 1. Undefined or disordered regions hinder the accuracy of the prediction, and all our targets possessed a pTM score > 0.5 except Syt1. The fraction disordered negatively correlated with the pTM score, as expected. Gsk3b, Ppp1ca, and Map2k1 were the target proteins with the highest pTM scores and were also the smallest of our targets (Figure 6B). Ppp1ca had the most confident structure (i.e. pTM 0.9) and the least fraction disordered (i.e. 0.07). Next, we determined the AF3 prediction of previously reported direct interactions of the targets. We used the iPTM score to determine an interaction confidence. An iPTM score >0.8 is a highly confident direct interaction, whereas 0.8. These eight PPI have all previously been reported to form a direct interaction with Ppp1ca, except Phactr3 (Zhang, Zhang et al. 1998, Terrak, Kerff et al. 2004, Hurley, Yang et al. 2007, Marsh, Dancheck et al. 2010, Ragusa, Dancheck et al. 2010, Ferrar, Chamousset et al. 2012, Choy, Srivastava et al. 2024, Xu, Sadleir et al. 2024). Phactr3 is structurally similar to, but less studied than, the reported direct interactor, Phactr1. These interactors are all inhibitors of PP1 except for Ppp1r9b which targets Ppp1ca to specific subcellular compartments. Nine PPI were assigned a score

      The following AF3 interpretation was added to the Discussion:

      “Our SCZ PPI network consists of two types of PPI: direct physical interactions and “co-complex” or indirect interactions. Typically, the nature of the interaction cannot be distinguished in IP-MS studies. We decided to employ the new AF3 algorithm to screen the PPI of Ppp1ca to provide evidence for direct interactors. We chose to examine the PPI assigned to Ppp1ca, because its structure was the most confident among our target proteins and AF3 correctly predicted a known direct interactor with high confidence. Ppp1ca is a catalytic subunit of the phosphatase PP1, which is required to associate with regulatory subunits to create holoenzymes (Li, Wilmanns et al. 2013). Eighteen PPI were predicted to be directly interacting with Ppp1ca using a 0.6 or higher iPTM filter. This filter may be too conservative and may generate false negatives, because another study employed a 0.3 filter followed by additional interrogation to screen for direct PPI (Weeratunga, Gormal et al. 2024). Forty-four percent of these predictions were confirmed by previous publications. Most of these validated direct interactions are inhibitors of the phosphatase, but one, Ppp1r9b (aka spinophilin), is known to target Ppp1ca to dendritic spines (Allen, Ouimet et al. 1997, Salek, Claeboe et al. 2023). This high correlation with the literature provides substantial confidence to the novel PPI predicted to be direct Ppp1ca interactors. The AF3 screen predicted that NDRG2 directly interacts with Ppp1ca. This protein is known to regulate many phosphorylation dependent signaling pathways by directly interacting with other phosphatases including Ppm1a and PP2A (Feng, Zhou et al. 2022, Lee, Lim et al. 2022). Actin binding protein Capza1 was also predicted to directly interact with Ppp1ca, and Ppp1ca interacts with actin and its binding proteins to maintain optimal localization for efficient activity to specific substrates (Foley, Ward et al. 2023). Hsp1e is a heat shock protein predicted to directly interact with Ppp1ca. Although it has no reported direct connection to Ppp1ca, other heat shock proteins have been reported to regulate Ppp1ca (Mivechi, Trainor et al. 1993, Flores-Delgado, Liu et al. 2007, Qian, Vafiadaki et al. 2011). We also observed that many of the direct PPI were altered with PCP treatment. One direct interactor, Ppp1r1b (aka DARPP-32), is phosphorylated at Thr34 by PKA in the brain upon PCP treatment. This phosphorylation event converts Ppp1r1b to a potent inhibitor of Ppp1ca (Svenningsson, Tzavara et al. 2003). Importantly, the manipulation of Thr34 attenuated the behavioral effects of PCP. Consistent with this report, Ppp1r1b-Ppp1ca interaction was only observed with PCP in our study. Further investigation is needed to determine if our novel direct interactors regulate the PCP phenotype. We conclude that AF3 can provide important structural insights into the nature of PPI obtained from large scale IP-MS studies.”
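The iPTM screen described in the Discussion text above can be sketched as a simple threshold filter. The example below is only an illustration under stated assumptions: the `Prediction` record, the function names, the interactor names and the scores are all hypothetical placeholders rather than values from our dataset. Only the 0.6 and 0.3 cutoffs are taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """One predicted target-interactor complex (hypothetical record layout)."""
    interactor: str
    iptm: float                # interface confidence score, between 0 and 1
    previously_reported: bool  # True if a direct interaction is in the literature


def predicted_direct(preds: List[Prediction], cutoff: float = 0.6) -> List[Prediction]:
    """Keep predictions whose iPTM score is at or above the cutoff."""
    return [p for p in preds if p.iptm >= cutoff]


def confirmed_fraction(preds: List[Prediction]) -> float:
    """Share of the retained predictions already supported by the literature."""
    if not preds:
        return 0.0
    return sum(p.previously_reported for p in preds) / len(preds)


# Toy scores for illustration only.
toy = [
    Prediction("InteractorA", 0.85, True),
    Prediction("InteractorB", 0.65, False),
    Prediction("InteractorC", 0.45, False),
    Prediction("InteractorD", 0.25, False),
]

strict = predicted_direct(toy, cutoff=0.6)   # the filter used in our screen
lenient = predicted_direct(toy, cutoff=0.3)  # the looser filter cited above

print([p.interactor for p in strict], confirmed_fraction(strict))
print([p.interactor for p in lenient], confirmed_fraction(lenient))
```

Lowering the cutoff retains more candidates, so fewer true direct interactors are missed, but the retained set then needs additional interrogation before any candidate is treated as a direct interactor.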

      The way PPI data is reported can be improved so that it does not have to be extracted from Table 1 and 2. It would be good if they provided a simple two-column PPI list, with names or IDs, plus PCP/saline/both conditions in a third column, for ease of comparison with other sources and building the graph. They can add it as another spreadsheet to Table 2.

      We generated this table (TableS2) as you requested.
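For readers who want to rebuild a table of this shape from per-condition interactor lists, a minimal sketch follows. It assumes each condition is stored as a mapping from target to a set of interactors. The function names, column headers and the `InteractorA`/`InteractorB` entries are hypothetical. The Ppp1r1b row simply mirrors the PCP-only interaction mentioned in the Discussion text.

```python
import csv
import io
from typing import Dict, List, Set, Tuple


def edge_list(sal: Dict[str, Set[str]], pcp: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Merge per-condition interactor sets into (target, interactor, condition) rows."""
    rows = []
    for target in sorted(set(sal) | set(pcp)):
        in_sal = sal.get(target, set())
        in_pcp = pcp.get(target, set())
        # The union gives one non-redundant row per target-interactor pair.
        for interactor in sorted(in_sal | in_pcp):
            if interactor in in_sal and interactor in in_pcp:
                condition = "both"
            elif interactor in in_sal:
                condition = "Sal"
            else:
                condition = "PCP"
            rows.append((target, interactor, condition))
    return rows


def to_tsv(rows: List[Tuple[str, str, str]]) -> str:
    """Serialize the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["target", "interactor", "condition"])
    writer.writerows(rows)
    return buf.getvalue()


# Toy input: InteractorA and InteractorB are hypothetical placeholders.
sal = {"Ppp1ca": {"InteractorA", "InteractorB"}}
pcp = {"Ppp1ca": {"InteractorB", "Ppp1r1b"}}

rows = edge_list(sal, pcp)
print(to_tsv(rows))
```

A flat three-column layout like this can be loaded directly into most network tools as an edge list with the condition as an edge attribute.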

      Is Figure 2 built for Sal or PCP conditions? as they have only 23% interactions in common (Figure 4A) the Figure 2 should be pretty different for two conditions. Are the 1007 interactors combined from SAL and PCP?

      Figure 2 contains ALL the unique PPI for each target regardless of Sal or PCP conditions. The 1007 protein interactors shown in Figure 2A were obtained by combining the Sal and PCP datasets to generate a non-redundant list of proteins for each target.

      We amended the Results to make this clearer:

      “When the PCP and SAL datasets were combined, there were 1007 unique proteins.”

      This sentence was added to Figure 2A:

      “For this comparison, Sal and PCP PPI were combined into a unique PPI list for each target.”

      Figure 1F is mentioned but no figure is shown.

      We apologize for this oversight, and we have corrected the manuscript.

      8. Overall the paper could be edited and made more concise, especially the introduction and discussion.

      We extensively edited the manuscript to be more concise.

      Reviewer #3 (Significance (Required)):

      General assessment

      Proteomic mass spectrometry of immunoprecipitated complexes from synapses has been extensively studied since the first study of NMDA receptor and AMPA receptor complexes by Husi et al (2000). Since then, a wide variety of methods have been employed to purify synaptic protein complexes including peptide affinity, tandem-affinity purification of endogenous proteins tagged with FLAG and Histidine-affinity tags amongst other methods. Purification of protein complexes and the postsynaptic density from the postsynaptic terminal of mammalian excitatory synapses has been crucial for establishing that schizophrenia is a polygenic disorder affecting synapses (e.g. Fernandez et al, 2009; Kirov et al, 2012; Purcell et al, 2014, Fromer et al, 2014 etc). Network analyses of the postsynaptic proteome have described networks of schizophrenia interacting proteins (e.g. Pocklington et al, 2006; Fernandez et al, 2009) and other neuropsychiatric disorders.

      Hundreds of synaptic protein complexes have been identified (Frank et al, 2016), but very few have been characterised using proteomic mass spectrometry. This paper has chosen 8 protein targets for such analysis and identified many proteins that are putative interactors of the target protein. At this level the current manuscript does not represent a conceptual advance and the value of the data lies in its utility as a resource that may be used in future studies.

      The findings from the 8 target proteins from normal adult rat brain were used for a secondary study that describes the effects that PCP has on the interaction networks. Interestingly, this work shows that 26 minutes of drug treatment leads to considerable changes in the interactomes of the target proteins. These descriptive data could be used in future studies to understand the cell biological mechanisms that mediate these rapid changes in the proteome. PCP and drugs that interact with NMDA receptors are known to induce changes in synaptic proteome phosphorylation including modifications in protein-protein interaction sites, which may explain the PCP effects.

      The study would benefit from validation of experimental protocols for solubilisation and immunoprecipitation and validation of described interactions using orthogonal biochemical or localisation experiments.

      Audience

      Specialists in synapse proteins and mechanisms of schizophrenia.

      Expertise

      The reviewers' expertise is in molecular biology of synapses including synapse proteomics, protein interaction and network analysis, and genetics of schizophrenia and other brain disorders.

      Abramson, J., J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, S. W. Bodenstein, D. A. Evans, C. C. Hung, M. O'Neill, D. Reiman, K. Tunyasuvunakool, Z. Wu, A. Zemgulyte, E. Arvaniti, C. Beattie, O. Bertolli, A. Bridgland, A. Cherepanov, M. Congreve, A. I. Cowen-Rivers, A. Cowie, M. Figurnov, F. B. Fuchs, H. Gladman, R. Jain, Y. A. Khan, C. M. R. Low, K. Perlin, A. Potapenko, P. Savy, S. Singh, A. Stecula, A. Thillaisundaram, C. Tong, S. Yakneen, E. D. Zhong, M. Zielinski, A. Zidek, V. Bapst, P. Kohli, M. Jaderberg, D. Hassabis and J. M. Jumper (2024). "Accurate structure prediction of biomolecular interactions with AlphaFold 3." Nature 630(8016): 493-500.

      Allen, P. B., C. C. Ouimet and P. Greengard (1997). "Spinophilin, a novel protein phosphatase 1 binding protein localized to dendritic spines." Proc Natl Acad Sci U S A 94(18): 9956-9961.

      Anschuetz, A., K. Schwab, C. R. Harrington, C. M. Wischik and G. Riedel (2024). "A Meta-Analysis on Presynaptic Changes in Alzheimer's Disease." J Alzheimers Dis 97(1): 145-162.

      Araki, Y., M. Zeng, M. Zhang and R. L. Huganir (2015). "Rapid dispersion of SynGAP from synaptic spines triggers AMPA receptor insertion and spine enlargement during LTP." Neuron 85(1): 173-189.

      Bauminger, H. and I. Gaisler-Salomon (2022). "Beyond NMDA Receptors: Homeostasis at the Glutamate Tripartite Synapse and Its Contributions to Cognitive Dysfunction in Schizophrenia." Int J Mol Sci 23(15).

      Berretta, N. and R. S. Jones (1996). "Tonic facilitation of glutamate release by presynaptic N-methyl-D-aspartate autoreceptors in the entorhinal cortex." Neuroscience 75(2): 339-344.

      Birtele, M., A. Del Dosso, T. Xu, T. Nguyen, B. Wilkinson, N. Hosseini, S. Nguyen, J. P. Urenda, G. Knight, C. Rojas, I. Flores, A. Atamian, R. Moore, R. Sharma, P. Pirrotte, R. S. Ashton, E. J. Huang, G. Rumbaugh, M. P. Coba and G. Quadrato (2023). "Non-synaptic function of the autism spectrum disorder-associated gene SYNGAP1 in cortical neurogenesis." Nat Neurosci 26(12): 2090-2103.

      Bouvier, G., R. S. Larsen, A. Rodriguez-Moreno, O. Paulsen and P. J. Sjostrom (2018). "Towards resolving the presynaptic NMDA receptor debate." Curr Opin Neurobiol 51: 1-7.

      Choy, M. S., G. Srivastava, L. C. Robinson, K. Tatchell, R. Page and W. Peti (2024). "The SDS22:PP1:I3 complex: SDS22 binding to PP1 loosens the active site metal to prime metal exchange." J Biol Chem 300(1): 105515.

      Dobson, L., I. Remenyi and G. E. Tusnady (2015). "The human transmembrane proteome." Biol Direct 10: 31.

      Feng, D., J. Zhou, H. Liu, X. Wu, F. Li, J. Zhao, Y. Zhang, L. Wang, M. Chao, Q. Wang, H. Qin, S. Ge, Q. Liu, J. Zhang and Y. Qu (2022). "Astrocytic NDRG2-PPM1A interaction exacerbates blood-brain barrier disruption after subarachnoid hemorrhage." Sci Adv 8(39): eabq2423.

      Ferrar, T., D. Chamousset, V. De Wever, M. Nimick, J. Andersen, L. Trinkle-Mulcahy and G. B. Moorhead (2012). "Taperin (c9orf75), a mutated gene in nonsyndromic deafness, encodes a vertebrate specific, nuclear localized protein phosphatase one alpha (PP1alpha) docking protein." Biol Open 1(2): 128-139.

      Flores-Delgado, G., C. W. Liu, R. Sposto and N. Berndt (2007). "A limited screen for protein interactions reveals new roles for protein phosphatase 1 in cell cycle control and apoptosis." J Proteome Res 6(3): 1165-1175.

      Foley, K., N. Ward, H. Hou, A. Mayer, C. McKee and H. Xia (2023). "Regulation of PP1 interaction with I-2, neurabin, and F-actin." Mol Cell Neurosci 124: 103796.

      Goudriaan, A., C. de Leeuw, S. Ripke, C. M. Hultman, P. Sklar, P. F. Sullivan, A. B. Smit, D. Posthuma and M. H. Verheijen (2014). "Specific glial functions contribute to schizophrenia susceptibility." Schizophr Bull 40(4): 925-935.

      Hemmings, H. C., Jr., P. Greengard, H. Y. Tung and P. Cohen (1984). "DARPP-32, a dopamine-regulated neuronal phosphoprotein, is a potent inhibitor of protein phosphatase-1." Nature 310(5977): 503-505.

      Hurley, T. D., J. Yang, L. Zhang, K. D. Goodwin, Q. Zou, M. Cortese, A. K. Dunker and A. A. DePaoli-Roach (2007). "Structural basis for regulation of protein phosphatase 1 by inhibitor-2." J Biol Chem 282(39): 28874-28883.

      Hussain, S., D. L. Egbenya, Y. C. Lai, Z. J. Dosa, J. B. Sorensen, A. E. Anderson and S. Davanger (2017). "The calcium sensor synaptotagmin 1 is expressed and regulated in hippocampal postsynaptic spines." Hippocampus 27(11): 1168-1177.

      Iqbal, H., D. R. Akins and M. R. Kenedy (2018). "Co-immunoprecipitation for Identifying Protein-Protein Interactions in Borrelia burgdorferi." Methods Mol Biol 1690: 47-55.

      Kaizuka, T., T. Hirouchi, T. Saneyoshi, T. Shirafuji, M. O. Collins, S. G. N. Grant, Y. Hayashi and T. Takumi (2024). "FAM81A is a postsynaptic protein that regulates the condensation of postsynaptic proteins via liquid-liquid phase separation." PLoS Biol 22(3): e3002006.

      Kaizuka, T., T. Suzuki, N. Kishi, K. Tamada, M. W. Kilimann, T. Ueyama, M. Watanabe, T. Shimogori, H. Okano, N. Dohmae and T. Takumi (2024). "Remodeling of the postsynaptic proteome in male mice and marmosets during synapse development." Nat Commun 15(1): 2496.

      Kerns, D., G. S. Vong, K. Barley, S. Dracheva, P. Katsel, P. Casaccia, V. Haroutunian and W. Byne (2010). "Gene expression abnormalities and oligodendrocyte deficits in the internal capsule in schizophrenia." Schizophr Res 120(1-3): 150-158.

      Kim, H., S. Choi, E. Lee, W. Koh and C. J. Lee (2024). "Tonic NMDAR Currents in the Brain: Regulation and Cognitive Functions." Biol Psychiatry.

      Koopmans, F., P. van Nierop, M. Andres-Alonso, A. Byrnes, T. Cijsouw, M. P. Coba, L. N. Cornelisse, R. J. Farrell, H. L. Goldschmidt, D. P. Howrigan, N. K. Hussain, C. Imig, A. P. H. de Jong, H. Jung, M. Kohansalnodehi, B. Kramarz, N. Lipstein, R. C. Lovering, H. MacGillavry, V. Mariano, H. Mi, M. Ninov, D. Osumi-Sutherland, R. Pielot, K. H. Smalla, H. Tang, K. Tashman, R. F. G. Toonen, C. Verpelli, R. Reig-Viader, K. Watanabe, J. van Weering, T. Achsel, G. Ashrafi, N. Asi, T. C. Brown, P. De Camilli, M. Feuermann, R. E. Foulger, P. Gaudet, A. Joglekar, A. Kanellopoulos, R. Malenka, R. A. Nicoll, C. Pulido, J. de Juan-Sanz, M. Sheng, T. C. Sudhof, H. U. Tilgner, C. Bagni, A. Bayes, T. Biederer, N. Brose, J. J. E. Chua, D. C. Dieterich, E. D. Gundelfinger, C. Hoogenraad, R. L. Huganir, R. Jahn, P. S. Kaeser, E. Kim, M. R. Kreutz, P. S. McPherson, B. M. Neale, V. O'Connor, D. Posthuma, T. A. Ryan, C. Sala, G. Feng, S. E. Hyman, P. D. Thomas, A. B. Smit and M. Verhage (2019). "SynGO: An Evidence-Based, Expert-Curated Knowledge Base for the Synapse." Neuron 103(2): 217-234 e214.

      Krishnankutty, A., T. Kimura, T. Saito, K. Aoyagi, A. Asada, S. I. Takahashi, K. Ando, M. Ohara-Imaizumi, K. Ishiguro and S. I. Hisanaga (2017). "In vivo regulation of glycogen synthase kinase 3beta activity in neurons and brains." Sci Rep 7(1): 8602.

      Lagundzin, D., K. L. Krieger, H. C. Law and N. T. Woods (2022). "An optimized co-immunoprecipitation protocol for the analysis of endogenous protein-protein interactions in cell lines using mass spectrometry." STAR Protoc 3(1): 101234.

      Lalo, U., W. Koh, C. J. Lee and Y. Pankratov (2021). "The tripartite glutamatergic synapse." Neuropharmacology 199: 108758.

      Lee, B. H., F. Schwager, P. Meraldi and M. Gotta (2018). "p37/UBXN2B regulates spindle orientation by limiting cortical NuMA recruitment via PP1/Repo-Man." J Cell Biol 217(2): 483-493.

      Lee, K. W., S. Lim and K. D. Kim (2022). "The Function of N-Myc Downstream-Regulated Gene 2 (NDRG2) as a Negative Regulator in Tumor Cell Metastasis." Int J Mol Sci 23(16).

      Lee, M. C., K. K. Ting, S. Adams, B. J. Brew, R. Chung and G. J. Guillemin (2010). "Characterisation of the expression of NMDA receptors in human astrocytes." PLoS One 5(11): e14123.

      Li, X., M. Wilmanns, J. Thornton and M. Kohn (2013). "Elucidating human phosphatase-substrate networks." Sci Signal 6(275): rs10.

      Lin, J. S. and E. M. Lai (2017). "Protein-Protein Interactions: Co-Immunoprecipitation." Methods Mol Biol 1615: 211-219.

      Ma, T. M., S. Abazyan, B. Abazyan, J. Nomura, C. Yang, S. Seshadri, A. Sawa, S. H. Snyder and M. V. Pletnikov (2013). "Pathogenic disruption of DISC1-serine racemase binding elicits schizophrenia-like behavior via D-serine depletion." Mol Psychiatry 18(5): 557-567.

      Madrigal, M. P., A. Portales, M. P. SanJuan and S. Jurado (2019). "Postsynaptic SNARE Proteins: Role in Synaptic Transmission and Plasticity." Neuroscience 420: 12-21.

      Marsh, J. A., B. Dancheck, M. J. Ragusa, M. Allaire, J. D. Forman-Kay and W. Peti (2010). "Structural diversity in free and bound states of intrinsically disordered protein phosphatase 1 regulators." Structure 18(9): 1094-1103.

      McClatchy, D. B., N. K. Yu, S. Martinez-Bartolome, R. Patel, A. R. Pelletier, M. Lavallee-Adam, S. B. Powell, M. Roberto and J. R. Yates (2018). "Structural Analysis of Hippocampal Kinase Signal Transduction." ACS Chem Neurosci 9(12): 3072-3085.

      Misir, E. and G. G. Akay (2023). "Synaptic dysfunction in schizophrenia." Synapse 77(5): e22276.

      Mivechi, N. F., L. D. Trainor and G. M. Hahn (1993). "Purified mammalian HSP-70 KDA activates phosphoprotein phosphatases in vitro." Biochem Biophys Res Commun 192(2): 954-963.

      Moon, I. S., H. Sakagami, J. Nakayama and T. Suzuki (2008). "Differential distribution of synGAP alpha1 and synGAP beta isoforms in rat neurons." Brain Res 1241: 62-75.

      Pankow, S., C. Bamberger, D. Calzolari, A. Bamberger and J. R. Yates, 3rd (2016). "Deep interactome profiling of membrane proteins by co-interacting protein identification technology." Nat Protoc 11(12): 2515-2528.

      Pankow, S., C. Bamberger, D. Calzolari, S. Martinez-Bartolome, M. Lavallee-Adam, W. E. Balch and J. R. Yates, 3rd (2015). "∆F508 CFTR interactome remodelling promotes rescue of cystic fibrosis." Nature 528(7583): 510-516.

      Park, G. H., H. Noh, Z. Shao, P. Ni, Y. Qin, D. Liu, C. P. Beaudreault, J. S. Park, C. P. Abani, J. M. Park, D. T. Le, S. Z. Gonzalez, Y. Guan, B. M. Cohen, D. L. McPhie, J. T. Coyle, T. A. Lanz, H. S. Xi, C. Yin, W. Huang, H. Y. Kim and S. Chung (2020). "Activated microglia cause metabolic disruptions in developmental cortical interneurons that persist in interneurons from individuals with schizophrenia." Nat Neurosci 23(11): 1352-1364.

      Partiot, E., A. Hirschler, S. Colomb, W. Lutz, T. Claeys, F. Delalande, M. S. Deffieu, Y. Bare, J. R. E. Roels, B. Gorda, J. Bons, D. Callon, L. Andreoletti, M. Labrousse, F. M. J. Jacobs, V. Rigau, B. Charlot, L. Martens, C. Carapito, G. Ganesh and R. Gaudin (2024). "Brain exposure to SARS-CoV-2 virions perturbs synaptic homeostasis." Nat Microbiol.

      Qian, J., E. Vafiadaki, S. M. Florea, V. P. Singh, W. Song, C. K. Lam, Y. Wang, Q. Yuan, T. J. Pritchard, W. Cai, K. Haghighi, P. Rodriguez, H. S. Wang, D. Sanoudou, G. C. Fan and E. G. Kranias (2011). "Small heat shock protein 20 interacts with protein phosphatase-1 and enhances sarcoplasmic reticulum calcium cycling." Circ Res 108(12): 1429-1438.

      Ragusa, M. J., B. Dancheck, D. A. Critton, A. C. Nairn, R. Page and W. Peti (2010). "Spinophilin directs protein phosphatase 1 specificity by blocking substrate binding sites." Nat Struct Mol Biol 17(4): 459-464.

      Rodrigues-Neves, A. C., A. F. Ambrosio and C. A. Gomes (2022). "Microglia sequelae: brain signature of innate immunity in schizophrenia." Transl Psychiatry 12(1): 493.

      Salek, A. B., E. T. Claeboe, R. Bansal, N. F. Berbari and A. J. Baucum, 2nd (2023). "Spinophilin-dependent regulation of GluN2B-containing NMDAR-dependent calcium influx, GluN2B surface expression, and cleaved caspase expression." Synapse 77(3): e22264.

      Savas, J. N., B. D. Stein, C. C. Wu and J. R. Yates, 3rd (2011). "Mass spectrometry accelerates membrane protein analysis." Trends Biochem Sci 36(7): 388-396.

      Selak, S., A. V. Paternain, M. I. Aller, E. Pico, R. Rivera and J. Lerma (2009). "A role for SNAP25 in internalization of kainate receptors and synaptic plasticity." Neuron 63(3): 357-371.

      Serrano, A., R. Robitaille and J. C. Lacaille (2008). "Differential NMDA-dependent activation of glial cells in mouse hippocampus." Glia 56(15): 1648-1663.

      Sjostrom, P. J., G. G. Turrigiano and S. B. Nelson (2003). "Neocortical LTD via coincident activation of presynaptic NMDA and cannabinoid receptors." Neuron 39(4): 641-654.

      Stanca, S., M. Rossetti, L. Bokulic Panichi and P. Bongioanni (2024). "The Cellular Dysfunction of the Brain-Blood Barrier from Endothelial Cells to Astrocytes: The Pathway towards Neurotransmitter Impairment in Schizophrenia." Int J Mol Sci 25(2).

      Sumi, T. and K. Harada (2023). "Muscarinic acetylcholine receptor-dependent and NMDA receptor-dependent LTP and LTD share the common AMPAR trafficking pathway." iScience 26(3): 106133.

      Svenningsson, P., E. T. Tzavara, R. Carruthers, I. Rachleff, S. Wattler, M. Nehls, D. L. McKinzie, A. A. Fienberg, G. G. Nomikos and P. Greengard (2003). "Diverse psychotomimetics act through a common signaling pathway." Science 302(5649): 1412-1415.

      Tarasov, V. V., A. A. Svistunov, V. N. Chubarev, S. S. Sologova, P. Mukhortova, D. Levushkin, S. G. Somasundaram, C. E. Kirkland, S. O. Bachurin and G. Aliev (2019). "Alterations of Astrocytes in the Context of Schizophrenic Dementia." Front Pharmacol 10: 1612.

      Terrak, M., F. Kerff, K. Langsetmo, T. Tao and R. Dominguez (2004). "Structural basis of protein phosphatase 1 regulation." Nature 429(6993): 780-784.

      Tokizane, K., C. S. Brace and S. I. Imai (2024). "DMH(Ppp1r17) neurons regulate aging and lifespan in mice through hypothalamic-adipose inter-tissue communication." Cell Metab 36(2): 377-392 e311.

      Tomasoni, R., D. Repetto, R. Morini, C. Elia, F. Gardoni, M. Di Luca, E. Turco, P. Defilippi and M. Matteoli (2013). "SNAP-25 regulates spine formation through postsynaptic binding to p140Cap." Nat Commun 4: 2136.

      Vainio, L., S. Taponen, S. M. Kinnunen, E. Halmetoja, Z. Szabo, T. Alakoski, J. Ulvila, J. Junttila, P. Lakkisto, J. Magga and R. Kerkela (2021). "GSK3beta Serine 389 Phosphorylation Modulates Cardiomyocyte Hypertrophy and Ischemic Injury." Int J Mol Sci 22(24).

      van Oostrum, M., T. M. Blok, S. L. Giandomenico, S. Tom Dieck, G. Tushev, N. Furst, J. D. Langer and E. M. Schuman (2023). "The proteomic landscape of synaptic diversity across brain regions and cell types." Cell 186(24): 5411-5427 e5423.

      Vilalta, A. and G. C. Brown (2018). "Neurophagy, the phagocytosis of live neurons and synapses by glia, contributes to brain development and disease." FEBS J 285(19): 3566-3575.

      Weeratunga, S., R. S. Gormal, M. Liu, D. Eldershaw, E. K. Livingstone, A. Malapaka, T. P. Wallis, A. T. Bademosi, A. Jiang, M. D. Healy, F. A. Meunier and B. M. Collins (2024). "Interrogation and validation of the interactome of neuronal Munc18-interacting Mint proteins with AlphaFold2." J Biol Chem 300(1): 105541.

      Winship, I. R., S. M. Dursun, G. B. Baker, P. A. Balista, L. Kandratavicius, J. P. Maia-de-Oliveira, J. Hallak and J. G. Howland (2019). "An Overview of Animal Models Related to Schizophrenia." Can J Psychiatry 64(1): 5-17.

      Xu, Z., L. Sadleir, H. Goel, X. Jiao, Y. Niu, Z. Zhou, G. de Valles-Ibanez, G. Poke, M. Hildebrand, N. Lieffering, J. Qin and Z. Yang (2024). "Genotype and phenotype correlation of PHACTR1-related neurological disorders." J Med Genet 61(6): 536-542.

      Zhang, J., L. Zhang, S. Zhao and E. Y. Lee (1998). "Identification and characterization of the human HCG V gene product as a novel inhibitor of protein phosphatase-1." Biochemistry 37(47): 16728-16734.

      Zhang, Y., K. Chen, S. A. Sloan, M. L. Bennett, A. R. Scholze, S. O'Keeffe, H. P. Phatnani, P. Guarnieri, C. Caneda, N. Ruderisch, S. Deng, S. A. Liddelow, C. Zhang, R. Daneman, T. Maniatis, B. A. Barres and J. Q. Wu (2014). "An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex." J Neurosci 34(36): 11929-11947.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: McClatchy, Powell and Yates aimed to identify a protein interactome associated with schizophrenia. To that end, they treated rats (N14 and N15) with PCP, which disturbs glutamatergic transmission, as a model of the disease and co-immunoprecipitated hippocampal proteins, which were further analyzed by standard LC-MS.

      The study is new, considering that not much has been done in this direction in the field of schizophrenia. This justifies its publication. On the other hand, a major flaw of the study is the lack of information on the level of interaction within the so-called protein interactome. That is, as the study was performed, we cannot distinguish proteins that interact directly with the targets of interest from proteins that interact with the targets' interactors. The different shells of interaction are crucial information in protein interactomics.

      Major: most of what I point out below must at least be discussed or better presented in the paper, as it may not be solvable considering how the study was conducted.

      1. The study fails to define the level of interaction of the protein interactome with the considered targets. This has been briefly mentioned in the discussion, but must be made more explicit to readers, for instance in the abstract, introduction and methods sections.
      2. Considering the protein extraction protocol, it is fair to mention that only the most soluble proteins are being considered here. I am bringing this up since the importance of membrane receptors is clear in the studied context.
      3. It is not clear from the methods description whether antibodies against all 8 targets were combined in one Co-IP or were incubated separately with 8 different hippocampal samples. It seems to be the former, given how the results have been presented. If so, this magnifies the major issue raised above (in 1).
      4. The results here definitely do not represent a "SCZ PPI network". PCP-treated animals, like any other animal model, are rather limited models of schizophrenia. Schizophrenia is a complex multifactorial disease, and synaptic deficits, which are the focus of this study, can no longer be considered "the pivot" of the disease. Synaptic dysfunction is only one among many factors associated with schizophrenia.
      5. The authors should also look for protein interactions that might be happening in glial cells. Glial cells are not the majority in the hippocampus, but they are present in the type of tissue analyzed here. Thus, some of the interactions observed might be more abundant in those cells. One option is to use bioinformatics tools to test the PPI network for enrichment in different cell types.
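
      One way to act on point 5 is a simple over-representation test of the interactome against cell-type marker lists. The sketch below is only an illustration under stated assumptions: the gene identifiers are placeholders, the marker lists would have to come from a published cell-type expression resource, and the function names are hypothetical. It is not a method used in the manuscript.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N items are drawn from M, of which n are marked."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


def cell_type_enrichment(interactors, markers_by_cell_type, background):
    """Return {cell_type: (overlap, p_value)} for a one-sided enrichment test.

    All three inputs are assumptions of this sketch: `background` is every
    protein that could have been detected, `interactors` is the Co-IP hit
    list, and `markers_by_cell_type` maps a cell type to its marker set.
    """
    background = set(background)
    interactors = set(interactors) & background
    results = {}
    for cell_type, markers in markers_by_cell_type.items():
        markers = set(markers) & background
        overlap = len(interactors & markers)
        p_value = hypergeom_sf(
            overlap, len(background), len(markers), len(interactors)
        )
        results[cell_type] = (overlap, p_value)
    return results


# Toy data with placeholder identifiers (not real gene symbols).
background = [f"g{i}" for i in range(20)]
markers = {
    "astrocyte": ["g0", "g1", "g2", "g3", "g4"],
    "neuron": ["g10", "g11", "g12", "g13", "g14"],
}
hits = ["g0", "g1", "g2", "g3"]

for cell_type, (overlap, p) in cell_type_enrichment(hits, markers, background).items():
    print(f"{cell_type}: overlap={overlap}, p={p:.4g}")
```

      With real data, the p-values would need correction for the number of cell types tested, and the choice of background strongly affects the result.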

      Minor:

      1. In the abstract, it is not clear whether 90% of the PPIs are novel to brain tissue in general or specifically to schizophrenia.
      2. The authors refer to LC-MS-based proteomics as "MS" throughout the text. Who am I to say this to Yates et al., but I think it is an oversimplification to use "Mass Spectrometry Analysis" when this is a typical LC-MS type of analysis.
      3. Several references used to construct the hypothesis of the paper are rather outdated: several are from 10-15 years ago. It would be interesting to provide the reader with up-to-date references, given the rapid pace at which science has been progressing.
      4. "UniProt rat database". Please state the version and whether reviewed or unreviewed entries were used.

      Significance

      The study is informative and has great potential to enrich the literature of this field. However, the authors should tone down some arguments, given the experimental limitations of the PPI network (as described above), and should present PCP-treated rats as a limited model of schizophrenia.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      The work by Przanowska et al. sought to understand the role of ORC2 in murine development and, further, its role in liver endo-reduplication. The methods used are sufficient to address this role, but the results and data provided are not conclusive, as elaborated in the comments below.

      Major Comments:

      1. The major issue of the paper is how well ORC2 is depleted in perinatal liver (Fig. 2C). This is not clear from the data, as all the western blots are at very low exposure levels and the bands are very weak (weak bands are still seen). There are good ORC2 antibodies that can be used for IHC staining to address the extent of ORC2 depletion.

      We have now shown that ORC2 protein is significantly decreased in the hepatocytes of the Orc2 KO and DKO livers (New Fig. 2C and 6D). The decrease is consistent, with 4-5 mice examined, and all showing the depletion. We have been unable to do immunohistochemistry on tissue sections of the mouse livers with the anti-ORC antibodies we have tried, and this could be a reflection of the low level of the proteins. On hepatocytes in culture we have obtained a faint signal with the anti-ORC2 antibody in WT cells, and this is clearly absent in 100% of the hepatocytes. See Fig. R1 below.

      Reviewer Fig R1:


      A) Immunofluorescence of hepatocytes in culture from livers of WT and two DKO mice.

      B) Quantitation of A) from counting 70-100 cells from each specimen.

      However, the calculations in the methods and the discussion are very compelling that at least the last 6-9 cell divisions in normal development start with 2n nuclei in the livers at baseline (Fig. 3B-G and 6I).

      Why, in Fig. 2C, does mouse M2 show a level of ORC2 protein equivalent to that of mouse M1 with no Cre expression (compare lane 1 and lane 5)? The results are thus based on one mouse, which I do not think is sufficient to support the conclusion. The authors need to add more data from different mice for statistical significance. Please use IHC to show the depletion of ORC2 protein in the liver sections.

      We had used total liver and had pointed out that residual ORC2 protein will be seen from stromal cells (endothelia, blood vessels and blood cells). We have therefore removed the figure which measured ORC2 levels in total liver and have now shown that when hepatocytes are isolated from five animals there was a massive depletion of ORC2 in all five animals (new Fig. 3C).

      As nicely demonstrated in the previous paper by Okano-Uchida et al., 2018, ORC1 depletion in the liver shows a DNA ploidy effect from 6 weeks onwards. The authors need to demonstrate in this paper, too, when the 16N phenotype is observed, from week 1 to 12 months.

      Based on the results from our previous paper (Okano-Uchida et al., 2018), we decided to measure the 16N phenotype at 6 weeks of age. The endoreduplication occurs at a stage when ORC2 protein is undetectable during normal development or during regeneration.

      In the double knockout experiments (ORC1 and ORC2), the authors have not demonstrated how much both proteins are actually depleted from the cells, so the results obtained from these mouse experiments are not conclusive or explanatory.

      We have performed immunoblotting of isolated hepatocytes and immunohistochemistry of livers for ORC1 and ORC2. Our data show that both proteins are depleted in all four mice tested (New Fig. 6D).

      Minor points:

      1. Why are scale bars missing in the right panels of Fig. 2G, Fig. 6D and Supp. Fig. 2B (KO studies)? The authors need to confirm through IHC H&E staining that all the large nuclei have no or significantly less ORC2 protein.

      The scale bars are missing from the right panels to avoid redundancy. We have added “Both panels are at the same scale.” in the figure legend, according to https://doi.org/10.1371/journal.pbio.3001161.

      1. Please explain why EYFP in Fig. 5G is cytoplasmic, whereas in Fig. 4C it is nuclear.

      We consistently see this variability, and it was there in our previous results (Okano-Uchida et al., 2018), where EYFP was cytoplasmic in tissues, but was nuclear (and some cytoplasmic) in hepatocytes in culture.

      We do not know the reason for this difference but consistently see this difference. We now say in the text: “We did not explore why the EYFP protein is mostly nuclear in hepatocytes in culture (Fig. 4C) and mostly cytoplasmic in hepatocytes in the liver tissue (Fig. 5G, 7G), but speculate that differences in signaling pathways or fixation techniques between the two conditions contribute to this difference.”

      Are the authors using the same genotype of Alb-Cre mice as used by Okano-Uchida et al., 2018? I do not find the reference to Schuler et al., 2004 (PMID:15282742).

      We have been using two independent Alb-Cre animals. This is now described in the Methods.


      Significance

      The article closely follows their previously published paper, but instead of ORC1 the authors were interested in dissecting the role of ORC2. Although they have discussed that CDC6 may replace ORC1 in the KO mice to rescue the extensive DNA replication in endoreduplication, instead of pursuing the role of CDC6 in endoreduplication they checked the effect of ORC2, which lowers the overall impact of the paper.

      We studied ORC2 conditional KO mice in a similar manner to the previously published ORC1 conditional KO in order (1) to ensure that the lack of effect in the Orc1 KO was not because ORC1 can theoretically be substituted for by CDC6 and (2) to establish the double KO of Orc1 and Orc2. To the best of our knowledge, this is the first description of the removal of two subunits of the ORC complex at once in a mouse model. Moreover, in light of the rising recognition of sex as a biological variable, we report sex-dependent effects, which are very intriguing.

      We have not attempted knocking out CDC6 to uncover novel mechanisms of DNA replication, because we first needed to make sure that the mice can truly endo-reduplicate without two of the six subunits of ORC. Note that our published results in cancer cell lines (Shibata, 2016) show that CDC6 is still essential in the ORC KO cell lines, so a future experiment will likely reveal that CDC6 is still essential for endoreduplication in the ORC KO mice in vivo.

      Reviewer #2

      Evidence, reproducibility and clarity

      It has been reported that in the absence of ORC1, liver cells can still endoreduplicate, and it has been speculated that this might occur if CDC6 can replace, at least partially, the function of ORC1. Here, the authors evaluate whether this is also true in the absence of ORC2 and find that ORC2 is required for cell proliferation in mouse hepatocytes but not for endoreduplication. This is also the case after combining the conditional mutations of ORC1 and ORC2. They propose that a mechanism must exist to load sufficient MCM2-7 to support DNA replication in the absence of these two ORC subunits. Some of the conclusions need further experimental support. The rationale for testing the requirement of ORC2, with or without ORC1, for endoreduplication is valid. However, a key point is that the endoreduplication level seems to be higher in the absence of ORC2 or of both ORC1 and ORC2, and this is not properly addressed. Also, mechanistic details on how this could be triggered are absent from this study. As indicated below, almost every figure in this manuscript contains weak points.

      We now discuss the following: “One possible explanation of the greater endoreduplication in both our papers is that mitosis may be arrested earlier in development by G2 DNA damage checkpoints activated by incomplete licensing and replication of the genome in the absence of ORC. As a result, endoreduplication cycles could begin earlier in development resulting in greater endoreduplication.”

      Major 1. Fig 1G, needs a detailed comment and justification.

      We have added the following to the text: “The proliferation rate of the MEF was measured by MTT assays. Even in the Orc2+/+ MEF, the infection with adeno-Cre decreased proliferation a little (the orange line compared to the blue line in Fig. 1G). However, for Orc2f/f MEF, infection with adeno-Cre impairs proliferation even further (yellow line compared to black line in Fig. 1G).”

      Note that Adeno-Cre has been reported to be toxic for cell proliferation (citations 1, 2, 3), and so we included Adeno-Cre expression in ORC2+/+ (WT) as a background control.

      Citation:

      1. Pfeifer A, Brandon EP, Kootstra N, Gage FH, Verma IM: Delivery of the Cre recombinase by a self deleting lentiviral vector: Efficient gene targeting in vivo. Proc Natl Acad Sci USA. 2001, 98: 11450-11455. 10.1073/pnas.201415498.
      2. Loonstra A, Vooijs M, Beverloo HB, Allak BA, Drunen EV, Kanaar R, Berns A, Jonkers J: Growth inhibition and DNA damage induced by Cre recombinase in mammalian cells. Proc Natl Acad Sci USA. 2001, 98: 9209-9214. 10.1073/pnas.161269798.
      3. Schmidt EE, Taylor DS, Prigge JR, Barnet S, Capecchi R: Illegitimate Cre-dependent chromosome rearrangements in transgenic mouse spermatids. Proc Natl Acad Sci USA. 2000, 97: 13702-13707. 10.1073/pnas.240471297.

      Fig 2D-F. Is this conclusion applicable to other endoreplicating tissues? Have the authors considered analyzing body weight and liver weight measurements after normalization with similar data from a non-affected organ?

      The conditional KO was performed specifically in the liver. ORC is intact in other tissues in these animals. As a future direction, our lab plans to study cardiac-specific conditional KO of ORC subunits to test whether other endo-reduplicating tissues can also synthesize DNA in the absence of ORC subunits.

      Fig 3 shows inconsistent results or results that lack proper justification in the text. The 2C peak is missing in Fig 3E (yellow line, positive control). However, 2n nuclei appear in Fig 3F-H. Also, the blue and yellow peaks do not coincide in the flow cytometry profiles, in particular for 8C and 16C.

      There was an error in the plotting of the former Fig. 3E. The information is better presented in the former Fig. 3F-H (now Fig. 3E-G), and so we have removed the former Fig. 3E from the paper.

      Fig 4. Shorter EdU pulses could be more informative of the actual amount of S-phase cells. Thus, the use of a 2h EdU pulse needs a clear justification.

      The half-life of EdU incorporation differs slightly between in vivo and in vitro conditions. In vivo, slower cell proliferation requires a longer time, approximately 4 hours. However, in vitro, liver cells grow faster, and a 2-hour EdU pulse with 20 µM is sufficient for detection compared to a 3-hour pulse with 10 µM BrdU (Okano-Uchida et al., 2018). Several publications also use a 2-hour EdU incubation time (https://doi.org/10.1098/rsob.150172).

      Fig 5. EYFP is cytoplasmic, in contrast with results shown in Fig 4C

      We consistently see this variability and it was there in our previous results (Okano-Uchida et al., 2018), where EYFP was cytoplasmic in tissues, but was nuclear (and some cytoplasmic) in hepatocytes in culture.

      We do not know the reason for this difference but consistently see this difference. We now say in the text: “We did not explore why the EYFP protein is mostly nuclear in hepatocytes in culture (Fig. 4C) and mostly cytoplasmic in hepatocytes in the liver tissue (Fig. 5G, 7G), but speculate that differences in signaling pathways or fixation techniques between the two conditions contribute to this difference.”

      Fig 6. Results obtained with the double mutant are poorly described.

      We have split the figure into two figures (New Fig. 6 and 7) and edited the results section to ensure that they are easily comprehended by the readers. We have also included Westerns from hepatocyte cell lysates of four DKO mice to show that ORC1 and ORC2 proteins are reproducibly decreased (New Fig. 6D).

      What are the levels of other pre-RC components in the mutants used in this study? This could be easily evaluated by Western blotting.

      Despite the technical difficulty of not having antibodies that recognize all the mouse initiation proteins, we have now measured mouse ORC1, ORC2, ORC3, ORC5, ORC6, CDC6 and the MCM2 and MCM3 subunits of MCM2-7. The results do not show a consistent decrease or increase of any of these proteins in individual mice of the two genotypes, Orc2-/- or DKO (New Fig. 2D and 6E).

      How do authors justify their claim that a very limited amount of ORC are sufficient to load a vast excess of MCM2-7 hexamers?

      The rationale is stated in the introduction from data from cancer cell lines: “Given that WT cells have about 150,000 molecules of ORC2, even if this truncated protein is functional ORC2, ~150 molecules of the protein would be expected to load MCM2-7 double hexamers on at least 50,000 origins of replication.” Experimentally, we show in Shibata, 2020 (Fig. 7C) that, although ORC subunits are undetectable on Westerns, MCM2-7 association with the chromatin is unchanged. Also, we do not say “vast excess” of MCM2-7, just sufficient MCM2-7 to fire 50,000 origins.
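
      The arithmetic behind this argument can be made explicit. The following is a minimal sketch that uses only the three figures quoted above; the variable names are chosen here for illustration, and the calculation assumes, for simplicity, that every residual molecule is functional and that the origins are shared evenly among them.

```python
# Figures quoted in the response above.
wt_orc2_molecules = 150_000       # ORC2 molecules per WT cell
residual_orc2_molecules = 150     # upper bound on truncated ORC2 per cell
origins_to_license = 50_000       # origins that must receive MCM2-7

# Fraction of the WT ORC2 pool that the residual protein represents.
residual_fraction = residual_orc2_molecules / wt_orc2_molecules

# Origins each ORC2 molecule would have to license in each case.
origins_per_wt_molecule = origins_to_license / wt_orc2_molecules
origins_per_residual_molecule = origins_to_license / residual_orc2_molecules

print(f"Residual ORC2 fraction: {residual_fraction:.1%}")
print(f"Origins per ORC2 molecule (WT): {origins_per_wt_molecule:.2f}")
print(f"Origins per ORC2 molecule (residual): {origins_per_residual_molecule:.0f}")
```

      Under these assumptions, each residual molecule would have to act at several hundred origins per cell cycle, whereas in WT cells ORC2 molecules outnumber origins.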

      Minor 1. The titles of the Results section could be more informative of the main conclusion rather than simply descriptive

      We updated our Results titles to be more informative.

      The Discussion is too long

      We have shortened the discussion by moving our calculations to the Results section and abbreviating some of the discussion on endoreduplication. However, we had to insert new items brought forth by the reviewers. Due to the controversy of this topic in our field, we had to include extensive discussion of current literature and put our results in their proper context.

      Significance

      The topic is relevant and the hypothesis tested is reasonable, although the conceptual advance is limited (see also below). The major limitation is the absence of mechanistic details addressing the occurrence of extra endoreduplication cycles (compared to controls) in the ORC1 and ORC2 mutants.

      Reviewer #3

      Evidence, reproducibility and clarity:

      The origin recognition complex (ORC) is an essential loading factor for the replicative Mcm2-7 helicase complex. Despite ORC's critical role in DNA replication, there have been instances where the loss of specific ORC subunits has still seemingly supported DNA replication in cancer cells, endocycling hepatocytes, and Drosophila polyploid cells. Critically, all tested ORC subunits are essential for development and proliferation in normal cells. This presents a challenge, as conditional knockouts need to be generated, and a skeptic can always claim that there were limiting but sufficient ORC levels for helicase loading and replication in polyploid or transformed cells. That being said, the authors have consistently pushed the system to demonstrate replication in the absence or extreme depletion of ORC subunits.

      Here, the authors generate conditional ORC2 mutants to counter a potential argument with prior conditional ORC1 mutants that Cdc6 may substitute for ORC1 function based on homology. They also generate a double ORC1 and ORC2 mutant, which is still capable of DNA replication in polyploid hepatocytes. While this manuscript provides significantly more support for the ability of select cells to replicate in the absence or near absence of select ORC subunits, it does not shed light on a potential mechanism. While a mechanistic understanding of how these cells proliferate in the absence or extreme depletion of ORC subunits is outside the scope of the current manuscript, it would have been beneficial to see more functional analyses to help guide the field. For example, is there a delay or impairment in Mcm2-7 loading in G1 (FACS-based loading assay from the Cook Lab (Matson et al., eLife. 2017)) in primary hepatocytes with the ORC2 conditional deletion? Is copy number maintained as cells increase polyploidy in the absence of ORC subunits, or are some regions of the genome more sensitive to ORC depletion (CGH arrays or sequencing of the flow-sorted polyploid cells)?

      We thank the reviewer for recognizing the main point of these experiments: to dispel the argument that CDC6 can substitute for ORC1 in the six-subunit ORC (although no one has demonstrated this, the argument is made on the basis of close sequence homology between CDC6 and ORC1). The second point, also appreciated by the reviewer is to show that it is possible to find cells that replicate in the absence or near absence of two ORC subunits.

      The mechanistic questions raised are important, and we will address them here:

      Is there a delay or impairment of MCM2-7 loading in G1? The hepatocytes in culture are fragile and not immortalized, so this question can be addressed much more easily in the cancer cell lines we have made that are missing several ORC subunits; we will do that in a later paper. Note, however, the surprising lack of change in MCM2-7 association in cell lines where both ORC2 and ORC5 are deleted (Shibata, 2020, Fig. 7C).

      Are some regions of the genome more sensitive to ORC deletion during the polyploidization? We could not find any paper where people have investigated whether the whole genome is uniformly polyploidized in livers. In other words, the baseline conditions in WT livers have not been established. We therefore have postponed experiments to answer this question for a later paper. Note that in unpublished data from mapping SNS-seq origins in WT and ORC deletion cell lines there does not appear to be selective firing of certain origins over others in the deletion cell lines.

      Additional points: I didn't understand how the numbers were derived in Table 2. Was there really a 20-fold decrease in nuclear density for female ORC1 and ORC2 double-deletion hepatocytes? The differences in Figure S2 are dramatic, but not 20-fold dramatic.

      We measure the relative nuclear density by counting the number of plump nuclei (hepatocytes) per field as described for Fig. 5F and 7F, now in the Methods section. The reviewer is correct in that we overestimated the decrease of nuclear density in the female DKO mice by two-fold. The revised calculations suggest that 6 cell divisions occur in the female DKO mice after the ORC proteins have decreased to at least

      Significance:

      The strengths of this manuscript are the mouse genetics and the generation of conditional alleles of Orc2 and the rigorous assessment of phenotypes resulting from limiting amounts of specific ORC subunits. It also builds on prior work with ORC1 to rule out Cdc6 complementing the loss of ORC1. The weakness is that it is a very hard task to resolve the fundamental question of how much ORC is enough for replication in cancer cells or hepatocytes. Clearly, there is a marked reduction in specific ORC subunits that is sufficient to impact replication during development and in fibroblasts, but the devil's advocate can always claim limiting levels of ORC remaining in these specialized cells. The significance of the work is that the authors keep improving their conditional alleles (and combining them), thus making it harder and harder (but not impossible) to invoke limiting but sufficient levels of ORC. At this point, the investigators and the field are well-positioned to attempt future functional CRISPR screens to identify other factors that may modulate the response to the loss of ORC subunits. This work will be of interest to the DNA replication, polyploidy, and genome stability communities.

      We thank the reviewer for getting the important point of this paper: “making it harder and harder (but not impossible) to invoke limiting but sufficient levels of ORC….” In other words, either ORC is completely dispensable for loading MCM2-7 in certain cancer cell lines and hepatocytes, or it is highly catalytic and one molecule of ORC can load a few hundred MCM2-7 doublets so that most origins in the genome are licensed and capable of firing. We are trying the CRISPR screens in cancer cell lines that the reviewer envisages.

    1. Author response:

      “Overall, the paper has several strengths, including leveraging large-scale, multi-modal datasets, using computational reasonable tools, and having an in-depth discussion of the significant results.”

      We thank the reviewer for the very supportive comments.

      Based on the comments and questions, we have grouped the concerns and corresponding responses into three categories.

      (1) The scope and data selection

      “The results are somewhat inconclusive or not validated.

      The overall results are carefully designed, but most of the results are descriptive. While the authors are able to find additional evidence either from the literature or explain the results with their existing knowledge, none of the results have been biologically validated. Especially, the last three result sections (signaling pathways, eQTLs, and TF binding) further extended their findings, but the authors did not put the major results into any of the figures in the main text.”

      The goal of this manuscript is to provide a list of putative childhood obesity target genes to yield new insights and help drive further experimentation. Moreover, the outputs from signaling pathways, eQTLs, and TF binding, although noteworthy and supportive of our method, were not particularly novel. In our manuscript we placed our focus on the novel findings from the analyses. We did, however, report the part of the eQTLs analysis concerning ADCY3, which brought new insight to the pathology of obesity, in Figure 4C.

      “The manuscript would benefit from an explanation regarding the rationale behind the selection of the 57 human cell types analyzed. it is essential to clarify whether these cell types have unique functions or relevance to childhood development and obesity.”

      We elected to comprehensively investigate the GWAS-informed cellular underpinnings of childhood development and obesity. By including a diverse range of cell types from different tissues and organs, we sought to capture the multifaceted nature of cellular contributions to obesity-related mechanisms, and open new avenues for targeted therapeutic interventions.

      There are clearly cell types that are already established as being key to the pathogenesis of obesity when dysregulated: adipocytes for energy storage, immune cell types regulating inflammation and metabolic homeostasis, hepatocytes regulating lipid metabolism, pancreatic cell types intricately involved in glucose and lipid metabolism, skeletal muscle for glucose uptake and metabolism, and brain cell types in the regulation of appetite, energy expenditure, and metabolic homeostasis.

      While it is practical to focus on cell types already proven to be associated with or relevant to obesity, this approach has its limitations. It confines our understanding to established knowledge and rules out the potential for discovering novel insights from new cellular mechanisms or pathways that could play significant roles in the pathogenesis of obesity. Therefore, it was essential to reflect known biology against the unexplored cell types to expand our overall understanding and potentially identify innovative targets for treatment or prevention.

      “I wonder whether the used epigenome datasets are all from children. Although the authors use literature to support that body weight and obesity remain stable from infancy to adulthood, it remains uncertain whether epigenomic data from other life stages might overlook significant genetic variants that uniquely contribute to childhood obesity.”

      The datasets utilized in our study were derived from a combination of sources, both pediatric and adult. We recognize that epigenetic profiles can vary across different life stages but our principal effort was to characterize susceptibility BEFORE disease onset.

      “Given that the GTEx tissue samples are derived from adult donors, there appears to be a mismatch with the study's focus on childhood obesity. If possible, identifying alternative validation strategies or datasets more closely related to the pediatric population could strengthen the study's findings.” 

      We thank the reviewer for raising this important point. We acknowledge that the GTEx tissue samples are derived from adult donors, which might not perfectly align with the study's focus on childhood obesity. The ideal strategy would be a longitudinal design that follows individuals from childhood into adulthood to bridge the gap between pediatric and adult data, offering systematic insights into how early-life epigenetic markers influence obesity later in life. In future work, we aim to carry out such efforts, which will represent a substantial time and financial commitment.

      Along the same lines, the Developmental Genotype-Tissue Expression (dGTEx) Project is a new effort to study development-specific genetic effects on gene expression at 4 developmental windows spanning from infant to post-puberty (0-18 years). Donor recruitment began in August 2023 and remains ongoing. Tissue characterization and data production are underway. We hope that with the establishment of this resource, our future research in the field of pediatric health will be further enhanced.

      “Figure 1B: in subplots c and d, the results are either from Hi-C or capture-C. Although the authors use different colors to denote them, I cannot help wondering how much difference between Hi-C and capture-C brings in. Did the authors explore the difference between the Hi-C and capture-C?”

      Thank you for your comment. It is not within the scope of our paper to explore the differences between the Hi-C and Capture-C methods. In the context of our study, both methods serve the same purpose of detecting chromatin loops that bring putative enhancers to sometimes genomically distant gene promoters. Consequently, our focus was on utilizing these methods to identify relevant chromatin interactions rather than comparing their technical differences.

      (2) Details on defining different categories of the regions of interest

      “Some technical details are missing.

      While the authors described all of their analysis steps, a lot of the time, they did not mention the motivation. Sometimes, the details were also omitted.”

      We will add a section to the revision to address the rationale behind different OCRs categories.

      “Line 129: should "-1,500/+500bp" be "-500/+500bp"?”

      A gene promoter was defined as a region 1,500 bases upstream to 500 bases downstream of the TSS. Most transcription factor binding sites are distributed upstream (5’) of the TSS, and the assembly of transcription machinery occurs up to 1,000 bases 5’ of the TSS. Given our interest in SNPs that can potentially disrupt transcription factor binding, this defined promoter length allowed us to capture such SNPs in our analyses.
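
      As an illustration only, the promoter definition above can be sketched as a small helper. The strand handling, the 1-based inclusive coordinates, and the function names are assumptions made for this sketch; they are not taken from the manuscript's pipeline.

```python
from typing import Tuple


def promoter_window(tss: int, strand: str,
                    upstream: int = 1500, downstream: int = 500) -> Tuple[int, int]:
    """Return the (start, end) of a promoter window around a TSS.

    Coordinates are treated as 1-based and inclusive (an assumption).
    "Upstream" is 5' of the TSS, so it lies at lower coordinates on the
    "+" strand and at higher coordinates on the "-" strand.
    """
    if strand == "+":
        return max(1, tss - upstream), tss + downstream
    if strand == "-":
        return max(1, tss - downstream), tss + upstream
    raise ValueError("strand must be '+' or '-'")


def snp_in_promoter(pos: int, tss: int, strand: str) -> bool:
    """True if a SNP position falls inside the -1,500/+500 window."""
    start, end = promoter_window(tss, strand)
    return start <= pos <= end
```

      With a hypothetical TSS at position 10,000, a SNP at 8,600 falls inside the window on the "+" strand but outside it on the "-" strand, which shows why the asymmetric window has to follow gene orientation.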

      “How did the authors define a contact region?”

      Chromatin contact regions identified by Hi-C or Capture-C assays are always reported as pairs of chromatin regions. The Supplementary eMethods provide details on the method of processing and interaction calling from the Hi-C and Capture-C data.

      “The manuscript would benefit from a detailed explanation of the methods used to define cREs, particularly the process of intersecting OCRs with chromatin conformation data. The current description does not fully clarify how the cREs are defined.”

      “In the result section titled "Consistency and diversity of childhood obesity proxy variants mapped to cREs", the authors introduced the different types of cREs in the context of open chromatin regions and chromatin contact regions, and TSS. Figure 2A is helpful in some way, but more explanation is definitely needed. For example, it seems that the authors introduced three chromatin contacts on purpose, but I did not quite get the overall motivation.”

      We apologize for the confusion. Our definition of cREs is consistent throughout the study. Figure 2A will become Figure 1A in the revision in order to aid the reader.

      The 3 representative chromatin loops illustrate different ways the chromatin contact regions (pairs of blue regions under blue arcs) can overlap with OCRs (yellow regions under yellow triangles – ATAC peaks) and gene promoters.

      [1] The first chromatin loop has one contact region that overlaps with OCRs at one end and with the gene promoter at the other. This satisfies the formation of cREs; thus, the area under the yellow ATAC-peak triangle is green.

      [2] The second loop overlaps with an OCR at one end only, and there is no gene promoter nearby, so it does not qualify as a cRE.

      [3] The third chromatin loop has OCR and promoter overlapping at one end. We defined this as a special cRE formation; thus, the area under the yellow ATAC-peak triangle is green.

      To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.
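
      The three cases above can be sketched as a small classifier. This is a hypothetical reading of the figure description, not the authors' code: intervals are (chromosome, start, end) tuples with half-open coordinates, the labels are invented for illustration, and case [3] is interpreted as an OCR and a promoter both overlapping the same loop end.

```python
from typing import List, Optional, Tuple

Interval = Tuple[str, int, int]          # (chromosome, start, end), half-open
Loop = Tuple[Interval, Interval]         # the two contact regions of one loop


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_loop(loop: Loop, ocrs: List[Interval],
                  promoters: List[Interval]) -> Optional[str]:
    """Label a chromatin loop following cases [1]-[3] above.

    Returns "distal" when an OCR at one end contacts a promoter at the
    other end (case 1), "promoter" when an OCR and a promoter sit at the
    same end (case 3), and None when no cRE is formed (case 2).
    """
    end_a, end_b = loop
    for near, far in ((end_a, end_b), (end_b, end_a)):
        if not any(overlaps(near, ocr) for ocr in ocrs):
            continue  # no open chromatin at this end
        if any(overlaps(far, prom) for prom in promoters):
            return "distal"
        if any(overlaps(near, prom) for prom in promoters):
            return "promoter"
    return None
```

      Only the location of each OCR is used here, which matches the statement that peak intensity was not considered once a peak passed the significance threshold.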

      “Figure 2A: The authors used triangles filled differently to denote different types of cREs but I wonder what the height of the triangles implies. Please specify.”

      The triangles are illustrations for ATAC-seq peaks, and the yellow chromatin regions under them are OCRs. The different heights of ATAC-seq peaks are usually quantified as intensity values for OCRs. However, in our study, when an ATAC-seq peak passed the significance threshold from the data pipeline, we only considered their locations, regardless of their intensities. To avoid further confusion for the reader, we will eliminate this variation in the new illustration for the revised manuscript.

      “Figure 1B-c. the title should be "OCRs at putative cREs". Similarly in Figure 1B-d.”

      cREs are a subset of OCRs.

      “In the section "Cell type specific partitioned heritability", the authors used "4 defined sets of input genomic regions". Do they correspond to the four types of regions in Figure 2A?”

      Figure 2A will become Figure 1A in the revision and will be modified to showcase how we define OCRs and cREs.

      “It seems that the authors described the 771 proxies in "Genetic loci included in variant-to-genes mapping" (ln 154), and then somehow narrowed down from 771 to 94 (according to ln 199) because they are cREs. It would be great if the authors could describe the selection procedure together, rather than isolated, which made it quite difficult to understand.”

      In the Methods section entitled “Genetic loci included in variant-to-genes mapping," we described the process of LD expansion to include 771 proxies from 19 sentinel signals significantly associated with obesity. Not all of these proxies are located within our defined cREs. Figure 2B, now Figure 2A in the revision, illustrates the proportions of these proxies located within different types of regions, reducing the proxy list to 94 located within our defined cREs.
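
      The narrowing step described above amounts to keeping only the proxies whose position falls inside a region of interest. A minimal sketch follows; the SNP identifiers, coordinates, and half-open interval convention are made up for illustration, and the real analysis may use different tooling.

```python
from typing import List, Tuple

Proxy = Tuple[str, str, int]       # (snp_id, chromosome, position)
Region = Tuple[str, int, int]      # (chromosome, start, end), half-open


def proxies_in_regions(proxies: List[Proxy], regions: List[Region]) -> List[str]:
    """Return the IDs of proxies located inside at least one region."""
    kept = []
    for snp_id, chrom, pos in proxies:
        if any(c == chrom and start <= pos < end for c, start, end in regions):
            kept.append(snp_id)
    return kept


# Toy input: three hypothetical proxies, one hypothetical cRE.
toy_proxies = [("snpA", "chr2", 120), ("snpB", "chr2", 900), ("snpC", "chr5", 120)]
toy_cres = [("chr2", 100, 200)]
print(proxies_in_regions(toy_proxies, toy_cres))
```

      Applying the same filter with progressively stricter region sets (any contact region, then cREs) is one way to picture how a list of 771 proxies shrinks to the subset located within cREs.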

      “Figure 2. What's the difference between the 771 and 758 proxies?”

      13 out of 771 proxies did not fall within any defined regions. The remaining 758 were located within contact regions of at least one cell type regardless of chromatin state.

      (3) Typos

      “In the paragraph "Childhood obesity GWAS summary statistics", the authors may want to describe the case/control numbers in two stages differently. "in stage 1" and "921 cases" together made me think "1,921" is one number.”

      This will be amended in the revision.

      “Hi-C technology should be spelled as Hi-C. There are many places, it is miss-spelled as "hi-C". In Figure 1, the author used "hiC" in the legend. Similarly, Capture-C sometime was spelled as "capture-C" in the manuscript.”

      “At the end of the fifth row in the second paragraph of the Introduction section: "exisit" should be "exist".”

      “In Figure 2A: "Within open chromatin contract region" should be "Within open chromatin contact region".”

      These typos and terminology inconsistencies will be amended in the revision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Komarova et al. investigate the clinical prognostic ability of cell-level metabolic heterogeneity quantified via the fluorescence lifetime characteristics of NAD(P)H. Fluorescence lifetime imaging microscopy (FLIM) has been studied as a minimally invasive approach to measure cellular metabolism in live cell cultures, organoids, and animal models. Its clinical translation is spearheaded through macroscopic implementation approaches that are capable of large sampling areas and enable access to otherwise constrained spaces but lack cellular resolution for a one-to-one transition with traditional microscopy approaches, making the interpretation of the results a complicated task. The merit of this study primarily lies in its design by analyzing with the same instrumentation and approach colorectal samples in different research scenarios, namely in vitro cells, in vivo animal xenografts, and tumor tissue from human patients. These conform to a valuable dataset to explore the translational interpretation hurdles with samples of increasing levels of complexity. For human samples, the study specifically investigates the prediction ability of NAD(P)H fluorescence metrics for the binary classification of tumors of low and advanced stage, with and without metastasis, and low and high grade. They find that NAD(P)H fluorescence properties have a strong potential to distinguish between high- and low-grade tumors and a moderate ability to distinguish advanced-stage tumors from low-stage tumors. This study provides valuable results contributing to the deployment of minimally invasive optical imaging techniques to quantify tumor properties and potentially migrate into tools for human tumor characterization and clinical diagnosis.

      Strengths:

      The investigation of colorectal samples under multiple imaging scenarios with the same instrument and approach conforms to a valuable dataset that can facilitate the interpretation of results across the spectrum of sample complexity.

      The manuscript provides a strong discussion reviewing studies that investigated cellular metabolism with FLIM and the metabolic heterogeneity of colorectal cancer in general.

      The authors do a thorough acknowledgement of the experimental limitations of investigating human samples ex vivo, and the analytical limitation of manual segmentation, for which they provide a path forward for higher throughput analysis.

      Weaknesses:

      To substantiate the changes in fluorescence properties at the examined wavelength range (associated with NAD(P)H fluorescence) in relationship to metabolism, the study would strongly benefit from additional quantification of metabolic-associated metrics using currently established standard methods. This is especially interesting when discussing heterogeneity, which is presumably high within and between patients with colorectal cancer, and could help explain the particularities of each sample leading to a more in-depth analysis of the acquired valuable dataset.

      In order to address this issue, we have performed immunohistochemical staining of the available tumor samples for the two standard metabolic markers GLUT3 and LDHA.

      The results are included in the Supplementary Material (Fig. S4). The Discussion has been extended.

      Additionally, NAD(P)H fluorescence does not provide a complete picture of the cell/tissue metabolic characteristics. Including, or discussing the implications of including fluorescence from flavins would comprise a more compelling dataset. These additional data would also enable the quantification of redox metrics, as briefly mentioned, which could positively contribute to the prognosis potential of metabolic heterogeneity.

      We agree with the Reviewer that fluorescence from flavins could be helpful to obtain more complete data on cellular metabolic states. However, we were unable to detect sufficiently intense emission from flavins in colorectal cancer cells and tissues. A paragraph about flavins was added to the Discussion, and representative images were added to the Supplementary Material (Figure S5).

      In the current form of the manuscript, there is a diluted interpretation and discussion of the results obtained from the random forest and SHAP analysis regarding the ability of the FLIM parameters to predict clinicopathological outcomes. This is, not only the main point the authors are trying to convey given the title and the stated goals, but also a novel result given the scarce availability of these type of data, which could have a remarkable impact on colorectal cancer in situ diagnosis and therapy monitoring. These data merit a more in-depth analysis of the different factors involved. In this context, the authors should clarify how is the "trend of association" quantified (lines 194 and 199).

      We thank the Reviewer for this suggestion. The section has been updated with SHAP analysis using different parameters (dispersion D of t2, a1, tm and bimodality index BI of t2, a1, tm). It is now clearer that D-a1 is more strongly associated with clinicopathological outcomes than the other variables. We have also added some biological interpretation of these results in the Discussion.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Metabolic heterogeneity of colorectal cancer as a prognostic factor: insights gained from fluorescence lifetime imaging" by Komarova et al., the authors used fluorescence lifetime imaging and quantitative analysis to assess the metabolic heterogeneity of colorectal cancer. Generally, this work is logically well-designed, including in vitro and in vivo animal models and ex vivo patient samples. However, since the key parameter presented in this study, the BI index, is already published in a previous paper by this group (Shirshin et al., 2022), and the quantification method of metabolic heterogeneity has already been well (and even better) described in previous studies (such as the one by Heaster et al., 2019), the novelty of this study is doubted. Moreover, I am afraid that the way of data analysis and presentation in this study is not well done, which will be mentioned in detail in the following sections.

      Strengths:

      (1) Solid experiments are performed and well-organized, including in vitro and in vivo animal models and ex vivo patient samples.

      (2) Attempt and efforts to build the association between the metabolic heterogeneity and prognosis for colorectal cancer.

      Weaknesses:

      (1) The human sample number (from 21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      An additional 8 samples of patients’ tumors, collected while the manuscript was under review, were added to the present data. We agree that the number is still too limited to draw conclusions about the prognostic value of cell-level metabolic heterogeneity. At this point, however, we expect that this parameter will become a metric for prognosis. We will continue this study to collect more samples of colorectal tumors and expand the approach to different cancer types.

      (2) The BI index or similar optical metrics have been well established by this and other groups; therefore, the novelty of this study is doubted.

      The purpose of this research was to quantify and compare cellular metabolic heterogeneity across systems of different complexity (commercial cell lines, tumor xenografts and patients’ tumors) using previously established FLIM-based metrics. For the first time, using FLIM, it was shown that the heterogeneity of patients’ samples is much higher than that of laboratory models and that it is associated with clinical characteristics of the tumors, namely the stage and the grade. In addition, this study provides evidence that bimodality (BI) in the distribution of metabolic features in the cell population is less important than the width of the spread (the dispersion value D).

      Some corrections have been made in the text on this point.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following comments should be addressed to strengthen the rigor and clarity of the manuscript.

      (1) The ethical committee that approved the human studies should also be mentioned in the methods section, as was done with the animal studies.

      Information about the ethics committee has been added in the Manuscript.

      The study with the use of patients’ material was approved by the ethics committee of the Privolzhsky Research Medical University (approval № 09 from 30.06.2023).

      (2) The captions in Figures 2 and 3 must be revised. In Figure 2, it seems the last 2 sentences for the description of (C) do not belong there, and instead, the last sentence in the description of (D) may need to be included in (C) instead. Figure 3 is similar.

      The captions were revised.

      (3) From supplement Figure S2 it seems that EpCam and vimentin staining were only done in two of the mouse tumor types. No further mention is made in the results or methods section. Is there any reason this was not performed in the other tumor types? Were the histology and IHC protocols the same for the mouse and human tumors?

      The data on other tumor types and patients’ tumors have been added in Figure S3. Discussion was extended with the following paragraph.

      One of the possible reasons for metabolic heterogeneity could be the presence of stromal cells or diversity of epithelial and mesenchymal phenotypes of cancer cells within a tumor. Immunohistochemical staining of tumors for EpCam (epithelial marker) and vimentin (mesenchymal marker) showed that the fraction of epithelial, EpCam-positive, cells was more than 90% in tumor xenografts and on average 76±10% in patients’ tumors (Figure S3). However, the ratio of EpCam- to vimentin-positive cells in patients’ samples correlated with neither D-a1 nor BI-a1, which means that the presence of cells with mesenchymal phenotype did not contribute to the metabolic heterogeneity of tumors identified by NAD(P)H FLIM.

      (4) Clarify the design of the experiments: The results come from 50 - 200 cells in each sample (except 30 in the CaCo2 cell culture) that were counted from 5 - 10 images acquired from each sample. There were 21 independent human samples. How many independent samples were included in the cell culture experiments and the mouse tumor models? Why is there an order of magnitude fewer cells included in the CaCo2 group compared to the other groups (Figure 1)? From the image (Figure 1A - CaCo2), it seems to be a highly populated type of sample, yet only 30 cells were quantified. What prevents the inclusion of the same number of cells to be quantified in each group for a more systematic evaluation?

      We thank the Reviewer for this comment.

      Cell culture experiments included two independent replicates for each cell line, the data from which were then combined. In animal experiments, measurements were made in three mice (numbered 1-3 in Figure 2C) for each tumor type. We have made calculations for more than 100 additional cells of the CaCo2 cell line. In the revised version, the number of CaCo2 cells is 146.

      The text of the Manuscript was revised accordingly.

      (5) Regarding references: Some claims throughout the text would benefit from an additional reference. For example: line 70 "Metabolic heterogeneity [...] is believed to have prognostic value"; line 121 " [...] the uniformity of cell metabolism in a culture, which is consistent with the general view on standard cell lines [...]". The clinical translational aspect (i.e., paragraph in line 255) warrants the inclusion of the efforts already done with FLIM imaging in the clinical setting both in vivo and ex vivo with point-spectroscopy and macroscopy imaging (e.g., Jo Lab, Marcu Lab, French Lab, and earlier work by Mycek and Richards-Kortum in colorectal cancer to name a few).

      Additional references were added.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the Introduction, line 85, the authors mention that "Specifically, the unbound state of NAD(P)H has a short lifetime (~0.4 ns) and is associated with glycolysis, while the protein-bound state has a long lifetime (~1.7-3.0 ns) and is associated with OXPHOS". I do not think this claim is appropriate. One cannot simply say that the unbound state is associated with glycolysis, nor that the bound state is associated with OXPHOS; both unbound and bound state are associated with almost all the metabolic pathways. Instead, the expression of "glycolytic/ OXPHOS shift", as authors used in other sections of this manuscript, is a more appropriate one in this case.

      The text of the Introduction was revised.

      (2) What are the biological implications of the bimodality index (BI)? Please provide specific insights.

      A bimodal distribution indicates that there are two separate and independent peaks in the population data. In the metabolic FLIM data, this indicates that there are two sub-populations of cells with different metabolic phenotypes. Previously, we observed a bimodal distribution in the population of chemotherapy-treated cancer cells, where one sub-population was responsive (shifted metabolism) and the second was non-responsive (unchanged metabolism) [Shirshin et al., PNAS, 2022]. In the naive tumor, a number of factors have an impact on cellular metabolism, including genetic features and microenvironment, so it is difficult to determine which ones resulted in bimodality. Our data on the correlation of bimodality (BI) with clinical characteristics of the tumors show that there are no associations between them. What really matters is the width of the parameter spread in the population. The early-stage tumors (T1, T2) were metabolically more heterogeneous than the late-stage ones (T3, T4). The degree of heterogeneity was also associated with differentiation state, a stage-independent prognostic factor in colorectal cancer where a lower grade correlates with a better prognosis. The early-stage (T1, T2) and high-grade (G3) tumors had significantly higher dispersion of NAD(P)H-a1 than the late-stage (T3, T4) and low-grade (G1, G2) ones. From the point of view of the biological significance of heterogeneity, this means that in the stressful and unfavorable conditions to which tumor cells are exposed, the spread of the parameter distribution in the population, rather than the presence of several distinct clusters (modes), matters for adaptation and survival. The high diversity of cellular metabolic phenotypes provided a survival advantage, and so was observed in the more aggressive (undifferentiated or poorly differentiated) and the least advanced tumors.

      The discussion has been expanded on this account.

      (3) Have you run statistics in Figure 1B? If yes, do you find any significance? The same question also applies to Figures 2C and 3C.

      We performed statistical analysis to compare different cell lines in in vitro and in vivo models; the results are presented in Table S4.

      (4) Line 119, why is the BI threshold set at 1.1?

      When setting the BI threshold at 1.1, we relied on the work by Wang et al., Cancer Informatics, 2009. The authors recommended the 1.1 cutoff as more reliable for selecting bimodally expressed genes. Further, we validated this BI threshold for identifying chemotherapy-responsive and non-responsive sub-populations of cancer cells (Shirshin et al., PNAS, 2022).
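
      For orientation, the bimodality index as we understand it from Wang et al. (2009) is defined for a two-component normal mixture with a shared standard deviation: BI = sqrt(π(1 − π)) · |μ1 − μ2| / σ, where π is the proportion of cells in one component. The sketch below only evaluates that formula from given mixture parameters; the parameter values are made up, and the mixture-fitting step and the exact implementation used for the FLIM data are not shown in this response.

```python
import math

BI_THRESHOLD = 1.1  # cutoff cited in the response above


def bimodality_index(mu1: float, mu2: float, sigma: float, pi: float) -> float:
    """Bimodality index of a two-component normal mixture.

    mu1, mu2: component means; sigma: shared standard deviation;
    pi: fraction of the population in the first component (0 < pi < 1).
    """
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    delta = abs(mu1 - mu2) / sigma           # standardized distance between modes
    return math.sqrt(pi * (1.0 - pi)) * delta


def is_bimodal(mu1: float, mu2: float, sigma: float, pi: float) -> bool:
    """True if the index exceeds the 1.1 threshold."""
    return bimodality_index(mu1, mu2, sigma, pi) > BI_THRESHOLD
```

      Because the index is dimensionless, the same threshold can in principle be applied to different decay parameters (for example a1, t2 or tm), which is the property the response relies on when comparing heterogeneity across parameters.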

      (5) Line 123, what does the high BI of mean lifetime stand for? Please provide biological implications and insights.

      The sentence was removed because inclusion of additional CaCo2 cells (n=146) for quantification of the NAD(P)H FLIM data showed no bimodality in this cell culture.

      (6) In the legend for Figure 2C, the authors mention that "the bimodality index (BI-a1) is shown above each box"; however, I do not see such values. It is also true for Figure 3C.

      The legends for Fig. 2 and 3 were corrected.

      (7) In Figure 2, t1-t3 were not explained and mentioned in the main text. What do they mean? Do they mean different time points or different tumors?

      t1-t3 denote different tumors in a group. The figure has been changed: individual tumors are now indicated by numbers.

      (8) In Figure 3, what do p13, p15 and p16 mean? It is not clearly explained. If they just represent patients numbered 13, 15, and 16, then why are these patients chosen as representatives? Do they represent different stages or are they just chosen randomly?

      Figure 3 was revised. Representative images were changed and a short description for each representative sample was included. In the revised version, representatives have been selected to show different stages and grades.

      (9) In Figure 3, instead of showing the results for each patient, I would suggest that authors show representative results from tumors at different stages; or, at least, clearly indicate the specific information for each patient. I do not think that providing the patient number only without any patient-specific information is helpful.

      Figure 3 was revised.

      (10) The sample number (21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      Eight additional samples were added. The text, figures and tables were revised accordingly.

      (11) In Discussion, it would be helpful to compare the BI index used in this study with the previously developed OMI-index (Line 275).

      We believe that the BI index and the OMI index describe different things and, therefore, it is hard to compare them. While the BI index is used to describe the degree of metabolic heterogeneity, the OMI index is an integral parameter that includes the redox ratio and the mean fluorescence lifetimes of NAD(P)H and FAD, and rather indicates the metabolic state of a cell. In this sense it is more relevant to compare it with the conventional redox ratio or the Fluorescence Lifetime Redox Ratio (FLIRR) (H. Wallrabe et al., Segmented cell analyses to measure redox states of autofluorescent NAD(P)H, FAD & Trp in cancer cells by FLIM, Sci. Rep. 2018; 8: 79). The assessment of the heterogeneity of the FLIM parameters has been previously reported using the weighted heterogeneity (wH) index (Amy T. Shah et al., In Vivo Autofluorescence Imaging of Tumor Heterogeneity in Response to Treatment, Neoplasia 17, pp. 862–870 (2015)). To the best of our knowledge, this is the only metric to quantify metabolic heterogeneity on the basis of FLIM data to date. A comparison of BI with the wH-index showed that the wH-index provides results similar to BI in the heterogeneity evaluation, as demonstrated in our earlier paper (E.A. Shirshin et al., Label-free sensing of cells with fluorescence lifetime imaging: The quest for metabolic heterogeneity, PNAS 119 (9) e2118241119 (2022)). Yet, the BI provides a dimensionless estimate of the inherent heterogeneity of a sample, and therefore it can be used to compare heterogeneity assessed by different decay parameters and FLIM data analysis methods. The limitation of using the OMI index for FLIM data analysis is the low intensity of the FAD signal, which was the case in our experiments.

    1. Reviewer #1 (Public Review):

      The authors developed an extension to the pairwise sequentially Markov coalescent model that allows the simultaneous analysis of multiple types of polymorphism data. In this paper, they focus on SNPs and DNA methylation data. Since methylation markers mutate at a much faster rate than SNPs, this potentially gives the method better power to infer size history in the recent past. Additionally, they explored a model where there are both local and regional epimutational processes.

      Integrating additional types of heritable markers into SMC is a nice idea which I like in principle. However, a major caveat to this approach seems to be a strong dependence on knowing the epimutation rate. In Fig. 6 it is seen that, when the epimutation rate is known, inferences do indeed look better; but this is not necessarily true when the rate is not known. (See also major comment #1 below about the interpretation of these plots.) A roughly similar pattern emerges in Supp. Figs. 4-7; in general, results when the rates have to be estimated don't seem that much better than when focusing on SNPs alone. This carries over to the real data analysis too: the interpretation in Fig. 7 appears to hinge on whether the rates are known or estimated, and the estimated rates differ by a large amount from earlier published ones.

      Overall, this is an interesting research direction, and I think the method may hold more promise as we get more and better epigenetic data, and in particular better knowledge of the epigenetic mutational process.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors developed an extension to the pairwise sequentially Markov coalescent model that allows the simultaneous analysis of multiple types of polymorphism data. In this paper, they focus on SNPs and DNA methylation data. Since methylation markers mutate at a much faster rate than SNPs, this potentially gives the method better power to infer size history in the recent past. Additionally, they explored a model where there are both local and regional epimutational processes. Integrating additional types of heritable markers into SMC is a nice idea which I like in principle. However, a major caveat to this approach seems to be a strong dependence on knowing the epimutation rate. In Fig. 6 it is seen that, when the epimutation rate is known, inferences do indeed look better; but this is not necessarily true when the rate is not known. (See also major comment #1 below about the interpretation of these plots.) A roughly similar pattern emerges in Supp. Figs. 4-7; in general, results when the rates have to be estimated don't seem that much better than when focusing on SNPs alone. This carries over to the real data analysis too: the interpretation in Fig. 7 appears to hinge on whether the rates are known or estimated, and the estimated rates differ by a large amount from earlier published ones.

      Overall, this is an interesting research direction, and I think the method may hold more promise as we get more and better epigenetic data, and in particular better knowledge of the epigenetic mutational process. At the same time, I would be careful about placing too much emphasis on new findings that emerge solely by switching to SNP+SMP analysis.

      Major comments:

      - For all of the simulated demographic inference results, only plots are presented. This allows for qualitative but not quantitative comparisons to be made across different methods. It is not easy to tell which result is actually better. For example, in Supp. Fig. 5, eSMC2 seems slightly better in the ancient past, and times the trough more effectively, while SMCm seems a bit better in the very recent past. For a more rigorous approach, it would be useful to have accompanying tables that measure e.g. mean-squared error (along with confidence intervals) for each of the different scenarios, similar to what is already done in Tables 1 and 2 for estimating $r$.

      We believe this comment was addressed in the previous revision (Supp. Tables 6-10) by adding Root Mean Square Errors for the demographic estimates (and RMSE for recent versus past portions of the demography).
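      For readers who want to reproduce this kind of quantitative comparison, a minimal sketch of an RMSE between a true and an inferred population-size trajectory is given below, with an optional log10 transform and a split into recent and ancient time points. The function names, the cutoff and the example values are hypothetical; the exact definition used in the supplementary tables should be taken from the manuscript.

```python
import math


def rmse(true_vals, est_vals, log_scale=False):
    """Root mean square error between paired values, optionally on a log10 scale."""
    if len(true_vals) != len(est_vals) or not true_vals:
        raise ValueError("inputs must be non-empty and of equal length")
    if log_scale:
        true_vals = [math.log10(v) for v in true_vals]
        est_vals = [math.log10(v) for v in est_vals]
    squared = [(t - e) ** 2 for t, e in zip(true_vals, est_vals)]
    return math.sqrt(sum(squared) / len(squared))


def split_rmse(times, true_ne, est_ne, cutoff, log_scale=True):
    """RMSE for time points more recent than `cutoff` and for the remaining ones.

    Both segments must contain at least one time point.
    """
    recent = [(t, e) for g, t, e in zip(times, true_ne, est_ne) if g < cutoff]
    ancient = [(t, e) for g, t, e in zip(times, true_ne, est_ne) if g >= cutoff]
    return (
        rmse([t for t, _ in recent], [e for _, e in recent], log_scale),
        rmse([t for t, _ in ancient], [e for _, e in ancient], log_scale),
    )


if __name__ == "__main__":
    generations = [10, 100, 1000, 10000]   # time points, in generations
    true_ne = [1e4, 1e4, 1e5, 1e5]         # simulated population sizes
    inferred_ne = [2e4, 1e4, 5e4, 1e5]     # made-up estimates from some method
    print(split_rmse(generations, true_ne, inferred_ne, cutoff=500))
```

      Stating in each table caption whether the error is computed on raw or log10-transformed values avoids the kind of unit ambiguity raised later for Supp. Table 7.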

      - 434: The discussion downplays the really odd result that inputting the true value of the mutation rate, in some cases, produces much worse estimates than when they are learned from data (SFig. 6)! I can't think of any reason why this should happen other than some sort of mathematical error or software bug. I strongly encourage the authors to pin down the cause of this puzzling behaviour. (Comment addressed in revision. Still, I find the explanation added at 449ff to be somewhat puzzling -- shouldn't the results of the regional HMM scan only improve if the true mutation rate is given?)

      We do understand that our results and explanation can appear counter-intuitive. As acknowledged by the reviewer, in the previous round of revision we clarified at length that this puzzling behaviour stems from a discrepancy in how methylation regions are assessed by the HMM method, which differs from the HMM used for the SMC inference. We are happy to clarify further in response to the new question of Reviewer #1:

      If Reviewer #1 means the SNP mutations (e.g. A → T), knowing the true mutation rate does not help the HMM to recover the region level methylation status.

      If Reviewer #1 means the epimutations (whether at the region level, the site level or both), knowing the true epimutation rates could theoretically help the HMM to recover the region level methylation status. However, at present, our method does not leverage information from epimutation rates to infer the region level methylation status. Because inferring the epimutation rates is one of the goals of the SMC inference in this study, and because the region level methylation status is required to infer those rates, we suspect that using epimutation rates to infer the region level methylation status could be statistically inappropriate (generating some kind of circular estimation). Instead, our HMM uses only the proportion of methylated and unmethylated sites (estimated from the genome) to determine whether a region is most likely to be methylated or unmethylated. We now state this explicitly in the description of the HMM for methylation regions in the Methods section.

      We acknowledge that our HMM to infer region level methylation status could be improved, but this would be a complete project and study on its own (due to the underlying complexity of the finite site model and the lack of a consensus model for epimutations at evolutionary time scale). We believe our HMM was the best compromise between what was known about methylation and our goals when the study was conducted, and future work on the estimation of the methylation regions is definitely worth conducting.

      - As noted at 580, all of the added power from integrating SMPs/DMRs should come from improved estimation of recent TMRCAs. So, another way to study how much improvement there is would be to look at the true vs. estimated/posterior TMRCAs. Although I agree that demographic inference is ultimately the most relevant task, comparing TMRCA inference would eliminate other sources of differences between the methods (different optimization schemes, algorithmic/numerical quirks, and so forth). This could be a useful addition, and may also give you more insight into why the augmented SMC methods do worse in some cases. (Comment addressed in revision via Supp. Table 7.).

      - A general remark on the derivations in Section 2 of the supplement: I checked these formulas as best I could. But a cleaner, less tedious way of calculating these probabilities would be to express the mutation processes as continuous time Markov chains. Then all that is needed is to specify the rate matrices; computing the emission probabilities needed for the SMC methods reduces to manipulating the results of some matrix exponentials. In fact, because the processes are noninteracting, the rate matrix decomposes into a Kronecker sum of the individual rate matrices for each process, which is very easy to code up. And this structure can be exploited when computing the matrix exponential, if speed is an issue.

      We believe this comment was acknowledged in the previous revision (line 649), and we thank the reviewer for this interesting insight.
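      The reviewer's Kronecker-sum remark can be checked numerically. The sketch below assumes, purely for illustration, that each marker type evolves as an independent two-state continuous-time Markov chain; it then verifies that the matrix exponential of the Kronecker sum of the two rate matrices equals the Kronecker product of the individual transition matrices. The rates, state spaces and function names are hypothetical and are not those of the manuscript.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker product: row index i * len(b) + k, column index j * len(b[0]) + l."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_sum(a, b):
    """Kronecker sum: kron(A, I) + kron(I, B), the generator of two independent chains."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]


def expm_taylor(q, t, terms=40):
    """exp(Q t) by a truncated Taylor series; adequate only for small |Q| * t."""
    result = identity(len(q))
    term = identity(len(q))
    for order in range(1, terms + 1):
        term = [[v * t / order for v in row] for row in matmul(term, q)]
        result = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(result, term)]
    return result


def two_state_rate_matrix(forward, backward):
    return [[-forward, forward], [backward, -backward]]


def two_state_transition(forward, backward, t):
    """Closed-form exp(Q t) for the two-state chain above."""
    total = forward + backward
    decay = math.exp(-total * t)
    return [
        [(backward + forward * decay) / total, forward * (1.0 - decay) / total],
        [backward * (1.0 - decay) / total, (forward + backward * decay) / total],
    ]


if __name__ == "__main__":
    t = 0.7                                      # branch length, arbitrary units
    q_slow = two_state_rate_matrix(0.3, 0.1)     # slowly mutating marker (made-up rates)
    q_fast = two_state_rate_matrix(2.0, 1.5)     # fast (de)methylation process (made-up rates)
    joint = expm_taylor(kron_sum(q_slow, q_fast), t)
    product = kron(two_state_transition(0.3, 0.1, t), two_state_transition(2.0, 1.5, t))
    gap = max(abs(x - y) for r1, r2 in zip(joint, product) for x, y in zip(r1, r2))
    print("largest absolute difference:", gap)
```

      The identity holds because the two terms of the Kronecker sum commute, so the joint emission probabilities never require exponentiating the full product-space matrix; only the small per-process exponentials are needed.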

      - Most (all?) of the SNP-only SMC methods allow for binning together consecutive observations to cut down on computation time. I did not see binning mentioned anywhere, did you consider it? If the method really processes every site, how long does it take to run?

      We believe this comment was addressed in the previous revision and was added to the manuscript in the Methods section (subsection: SMC optimization function).

      - 486: The assumed site and region (de)methylation rates listed here are several OOM different from what your method estimated (Supp. Tables 5-6). Yet, on simulated data your method is usually correct to within an order of magnitude (Supp. Table 4). How are we to interpret this much larger difference between the published estimates and yours? If the published estimates are not reliable, doesn't that call into question your interpretation of the blue line in Fig. 7 at 533? (Comment addressed in revision.)

      Reviewer #2 (Public Review):

      A limitation in using SNPs to understand recent histories of genomes is their low mutation frequency. Tellier et al. explore the possibility of adding hypermutable markers to SNP based methods for better resolution over short time frames. In particular, they hypothesize that epimutations (CG methylation and demethylation) could provide a useful marker for this purpose. Individual CGs in Arabidopsis tend to be either close to 100% methylated or close to 0%, and are inherited stably enough across generations that they can be treated as genetic markers. Small regions containing multiple CGs can also be treated as genetic markers based on their cumulative methylation level. In this manuscript, Tellier et al. develop computational methods to use CG methylation as a hypermutable genetic marker and test them on theoretical and real data sets. They do this both for individual CGs and small regions. My review is limited to the simple question of whether using CG methylation for this purpose makes sense at a conceptual level, not at the level of evaluating specific details of the methods. I have a small concern in that it is not clear that CG methylation measurements are nearly as binary in other plants and other eukaryotes as they are in Arabidopsis. However, I see no reason why the concept of this work is not sound. Especially in the future as new sequencing technologies provide both base calling and methylation calling capabilities, using CG methylation in addition to SNPs could become a useful and feasible tool for population genetics in situations where SNPs are insufficient.

      We again thank Reviewer #2 for the positive comments.

      Reviewer #3 (Public Review):

      I very much like this approach and the idea of incorporating hypervariable markers. The method is intriguing, and the ability to e.g. estimate recombination rates, the size of DMRs, etc. is a really nice plus. I am not able to comment on the details of the statistical inference, but from what I can evaluate it seems reasonable and in principle the inclusion of highly mutable sites is a nice advance. This is an exciting new avenue for thinking about inference from genomic data. I remain a bit concerned about how well this will work in systems where much less is understood about methylation.

      The authors include some good caveats about applying this approach to other systems, but I think it would be helpful to empiricists outside of thaliana or perhaps mammalian systems to be given some indication of what to watch out for. In maize, for example, there is a nonbimodal distribution of CG methylation (35% of sites are greater than 10% and less than 90%) but this may well be due to mapping issues. The authors solve many of the issues I had concerns with by using gene body methylation, but this is only briefly mentioned on line 659. I'm assuming the authors' hope is that this method will be widely used, and I think it worth providing some guidance to workers who might do so but who are not as familiar with this kind of data.

      We thank Reviewer #3 for the positive comments, and we agree with Reviewer #3 that our approach needs to be carefully thought through before it is applied to data. Our results clearly show that methylation processes are not yet understood well enough to apply our approach as we initially (maybe naively) designed it. Further investigations need to be conducted and appropriate theoretical models need to be developed before reliable results can be obtained, and we hope that our discussion points this out. However, our approach, the theoretical models and the additional tools contained in this study can help researchers investigate whether or not to use different genomic markers to build a common (potentially more reliable) ancestral history. We enhanced the discussion in this second revision by also clarifying the use of methylation from genic regions to avoid confusion (lines 700-731).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      In added Supp. Table 7, I don't think these are in log10 units as stated in the caption.

      Well spotted! Indeed, the RMSE is not on a log10 scale; we corrected the caption. We also added that the TMRCA used for the RMSE calculations is in units of generations to avoid potential confusion.

      Reviewer #3 (Recommendations for The Authors):

      I very much appreciate the authors' attention to previous questions. I would ask that a bit more is spent in the discussion on concerns/approaches empiricists should keep in mind -- I am wary of this being uncritically applied to data from non-model species. It was not clear to me, for example (only mentioned on line 659 in the discussion) that the thaliana data is only using gene-body methylation. This poses potential issues with background selection that the authors acknowledge appropriately, but also assuages many of my concerns about using genome-wide data. I think text with recommendations for data/filtering/etc or at least cautions of assumptions empiricists should be aware of would help.

      We apologize for the confusion at line 659. As written in the other section of the manuscript we meant CG sites in genic regions (and not only gene body methylated regions).

      Due to the manuscript’s structure, the data from Arabidopsis thaliana is only described at the very end of the manuscript (line 900+). A brief description can also be found at lines 291-296, and we added a sentence in the introduction (line 128) for clarity.

      We agree with the comment made by Reviewer #3 concerning the application to data. In the discussion, we pointed out the risk of applying our approach to ill-understood (or ill-prepared) data and stressed the current need for studies of epimutation processes at evolutionary time scale (i.e. at the Ne time scale) (lines 700-703).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The development of effective computational methods for protein-ligand binding remains an outstanding challenge to the field of drug design. This impressive computational study combines a variety of structure prediction (AlphaFold2) and sampling (RAVE) tools to generate holo-like protein structures of three kinases (DDR1, Abl1, and Src kinases) for binding to type I and type II inhibitors. Of central importance to the work is the conformational state of the Asp-Phe-Gly "DFG motif" where the Asp points inward (DFG-in) in the active state and outward (DFG-out) in the inactive state. The kinases bind to type I or type II inhibitors when in the DFG-in or DFG-out states, respectively.

      It is noted that while AlphaFold2 can be effective in generating ligand-free apo protein structures, it is ineffective at generating holo-structures appropriate for ligand binding. Starting from the native apo structure, structural fluctuations are necessary to access holo-like structures appropriate for ligand binding. A variety of methods, including reduced multiple sequence alignment (rMSA), AF2-cluster, and AlphaFlow may be used to create decoy structures. However, those methods can be limited in the diversity of structures generated and lack a physics-based analysis of Boltzmann weight critical to their relative evaluation.

      To address this need, the authors combine AlphaFold2 with the Reweighted Autoencoded Variational Bayes for Enhanced Sampling (RAVE) method, to explore metastable states and create a Boltzmann ranking. With that variety of structures in hand, grid-based docking methods Glide and Induced-Fit Docking (IFD) were used to generate protein-ligand (kinase-inhibitor) complexes.

      The authors demonstrate that using AlphaFold2 alone, there is a failure to generate DFG-out structures needed for binding to type II inhibitors. By applying the AlphaFold2 with rMSA followed by RAVE (using short MD trajectories, SPIB-based collective variable analysis, and enhanced sampling using umbrella sampling), metastable DFG-out structures with Boltzmann weighting are generated enabling protein-ligand binding. Moreover, the authors found that the successful sampling of DFG-out states for one kinase (DDR1) could be used to model similar states for other proteins (Abl1 and Src kinase). The AF2RAVE approach is shown to result in a set of holo-like protein structures with a 50% rate of docking type II inhibitors.

      Overall, this is excellent work and a valuable contribution to the field that demonstrates the strengths and weaknesses of state-of-the-art computational methods for protein-ligand binding. The authors also suggest promising directions for future study, noting that potential enhancements in the workflow may result from the use of binding site prediction models and free energy perturbation calculations.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the utility of AlphaFold2 (AF2) and the authors' own AF2-RAVE method for drug discovery. As has been observed elsewhere, the predictive power of docking against AF2 structures is quite limited, particularly for proteins like kinases that have non-trivial conformational dynamics. However, using enhanced sampling methods like RAVE to explore beyond AF2 starting structures leads to a significant improvement.

      Strengths:

      This is a nice demonstration of the utility of the authors' previously published RAVE method.

      Weaknesses:

      My only concern is the authors' discussion of induced fit. I'm quite confident the structures discussed are present in the absence of ligand binding, consistent with conformational selection. It seems the authors' own data also argue for an important role for conformational selection. It would be nice to acknowledge this instead of going along with the common practice in drug discovery of attributing any conformational changes to induced fit without thoughtful consideration of conformational selection.

      The reviewer is correct. We aim to highlight the significant role of conformational selection. To clarify this, we have expanded the discussion on conformational selection in the introduction.

      Reviewer #3 (Public Review):

      In this manuscript, the authors aim to enhance AlphaFold2 for protein conformation-selective drug discovery through the integration of AlphaFold2 and physics-based methods, focusing on improving the accuracy of predicting protein structure ensembles and small molecule binding of metastable protein conformations to facilitate targeted drug design.

      The major strength of the paper lies in the methodology, which includes the innovative integration of AlphaFold2 with all-atom enhanced sampling molecular dynamics and induced fit docking to produce protein ensembles with structural diversity. Moreover, the generated structures can be used as reliable crystal-like decoys to enrich metastable conformations of holo-like structures. The authors demonstrate the effectiveness of the proposed approach in producing metastable structures of three different protein kinases and perform docking with their type I and II inhibitors. The paper provides strong evidence supporting the potential impact of this technology in drug discovery. However, limitations may exist in the generalizability of the approach across other structures, especially complex structures such as protein-protein or DNA-protein complexes.

      Proteins undergo thermodynamic fluctuations and can occasionally reach metastable configurations. It can be assumed that other biomolecules, such as proteins and DNA, stabilize these metastable states when forming protein-protein or protein-DNA complexes. Since our method has the potential to identify these metastable states, it shows promise for designing drugs targeting proteins in allosteric configurations induced by other biomolecules.

      The authors largely achieved their aims by demonstrating that the AF2RAVE-Glide workflow can generate holo-like structure candidates with a 50% successful docking rate for known type II inhibitors. This work is likely to have a significant impact on the field by offering a more precise and efficient method for predicting protein structure ensembles, which is essential for designing targeted drugs. The utility of the integrated AF2RAVE-Glide approach may streamline the drug discovery process, potentially leading to the development of more effective and specific medications for various diseases.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions

      (1) The computational protocol is found to be insufficient to generate precise values of the relative free energies between structures generated. The authors note in the Conclusion that an enhancement in the workflow might result from the addition of free energy calculations. Can the authors comment on the prospects for generating more accurate estimates of the free energy that might be used to qualitatively evaluate poses and the free energy landscape surrounding putative metastable states? What are the principal challenges and what might help overcome them? What would the most effective computational protocol be?

      More accurate estimates of the free energy can theoretically be achieved by increasing the number of umbrella sampling windows and extending the simulation length until the PMF converges. However, there is always a trade-off between PMF accuracy and computational costs, so we have chosen to stick with the current setup. Metadynamics is another method to obtain a more accurate free energy profile, which we have used in previous versions of AlphaFold2-RAVE, but for the specific systems we investigated, it had issues in achieving back and forth movement given the high entropic nature of the activation loop. Research in enhanced sampling methods and dimensionality reduction techniques for reaction coordinates is continually evolving and will play a critical role in alleviating this problem.
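      To illustrate why the accuracy of the free energy estimate matters for ranking, the sketch below converts relative free energies of metastable states into normalised Boltzmann weights, w_i proportional to exp(-F_i / kT). This is a generic textbook relation, not the authors' protocol; the state labels, energies and temperature are made-up values.

```python
import math

# Gas constant in kcal/(mol K), so free energies are expected in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(free_energies, temperature=300.0):
    """Normalised Boltzmann weights for a list of relative free energies (kcal/mol)."""
    beta = 1.0 / (R_KCAL_PER_MOL_K * temperature)
    lowest = min(free_energies)  # shift by the minimum for numerical stability
    unnormalised = [math.exp(-beta * (f - lowest)) for f in free_energies]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]


def rank_states(labels, free_energies, temperature=300.0):
    """Return (label, weight) pairs sorted from most to least populated."""
    weights = boltzmann_weights(free_energies, temperature)
    return sorted(zip(labels, weights), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical free energies read off a PMF at three metastable basins.
    states = ["DFG-in", "DFG-out (classical)", "DFG-inter"]
    pmf_minima = [0.0, 1.5, 3.0]
    for label, weight in rank_states(states, pmf_minima):
        print(f"{label}: {weight:.3f}")
```

      Since kT is roughly 0.6 kcal/mol at 300 K, an error of about 1 kcal/mol in an unconverged PMF changes a relative population several-fold, which is why the trade-off between accuracy and cost described above affects the ranking directly.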

      (2) I was surprised that there was not more correlation of a funnel-like shape in Figures S16 and S18, showing a stronger correlation between low RMSD and better docking score. This is true for both the ponatinib and imatinib applications in DDR1 and Abl1. That also seems true for the trimmed results for Src kinase in Figure S19. I was also surprised that there are structures with very large RMSD but docking scores comparable to the best structures of the lowest RMSD. Might something be done to make the docking score a more effective discriminator?

      The docking algorithm and docking score are used to filter out highly improbable docking poses. False positives in predicted docking poses are a common issue across all docking methods as described for instance in:

      Fan, Jiyu, Ailing Fu, and Le Zhang. "Progress in molecular docking." Quantitative Biology 7 (2019): 83-89.

      Ferreira, R.S., Simeonov, A., Jadhav, A., Eidam, O., Mott, B.T., Keiser, M.J., McKerrow, J.H., Maloney, D.J., Irwin, J.J. and Shoichet, B.K., 2010. "Complementarity between a docking and a high-throughput screen in discovering new cruzain inhibitors." Journal of medicinal chemistry, 53(13), pp.4891-4905.

      Moreover, there is always a trade-off between docking accuracy and computational cost. While employing more accurate docking methods may decrease false positives, it can also be resource-intensive. In such scenarios, our approach to enriching holo-structures can be impactful by reducing the number of pocket structures in the input ensembles and significantly enhancing docking efficiency.

      (3) I think that it is fine to identify one structure as "IFD winner" but also feel that its significance is overstressed, especially given that it can be identified only in a retrospective analysis rather than through de novo prediction.

      We agree with the reviewer. We did not intend to emphasize the specific structure "IFD winner". Rather, we aimed to demonstrate that our method can enrich promising candidates for holo-structures. We verified this by showing that our holo-structure candidates performed well in retrospective docking using IFD, which we previously referred to as "IFD winner". We have now revised this term to "holo-model".

      Minor Points

      p. 3 "DymanicBind" should be "DynamicBind"

      p. 3 Change "We chosen" to "We have chosen" or "we chose."

      p. 3 In identifying the Schrödinger software Glide and IFD, I recommend removing the subjective modifier "industry-leading."

      Modifications done.

      Reviewer #2 (Recommendations For The Authors):

      In the view of this reviewer, the writing is 'choppy'.

      We have tried to improve the writing.

      Reviewer #3 (Recommendations For The Authors):

      (1) In Figure 1, the workflow labels (i) to (iv) are not shown on the figures, making it difficult for readers to follow. Consider adding these labels to the figures.

      Modifications done.

      (2) Explain how Boltzmann ranks were calculated based on unbiased MD simulations to guide the enrichment of holo-like structures in metastable states.

      The Methods section is now updated for clarification.

      (3) The authors could clarify how the classical DFG-out decoys in the DDR1 rMSA AF2 ensemble are transferred to Abl1 kinase in the Methods section.

      The Methods section is now updated for clarification.

      (4) The authors can clarify the methodology section by providing more detailed explanations about how the unbiased MD simulations are performed, including which MD simulation software was used and whether energy minimization and equilibrium steps were needed as in conventional MD simulations, and other setup details.

      The Methods section is now updated for clarification.

      (5) The validation of the proposed approach in this work used three kinase proteins. The authors can enhance the discussion section by addressing other types of protein structure prediction that can use the proposed approach in drug discovery, beyond the three kinase proteins tested.

      The proposed approach is theoretically applicable to other types of proteins, such as GPCRs, where both conformational selection and the induced-fit effect are crucial. We have expanded the discussion on the generalization of our protocol in the Conclusion section.

      (6) The authors should add appropriate citations for the software and tools used in the manuscript. For example, a reference should be added for the Glide XP docking experiments that utilized the Maestro software. Double-check all related software citations.

      We have now updated the citations for the docking experiments following the instructions in the Maestro Glide user manual and the IFD user manual.

      (7) The authors should consider offering a comprehensive list of software tools and databases utilized in the study to assist in replicating the experiments and further validating the results.

      We have now added a summary of tools used in the Methods section.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors present evidence suggesting that MDA5 can substitute as a sensor for triphosphate RNA in a species that naturally lacks RIG-I. The key findings are potentially important for our understanding of the evolution of innate immune responses. Compared to an earlier version of the paper, the strength of evidence has improved but it is still partially incomplete due to a few key missing experiments and controls.

      We would like to thank the editorial team for their positive comments and constructive suggestions on improving our manuscript. We have made further improvements based on the valuable suggestions of the reviewers, and we are pleased to send you the revised manuscript now. After revising the manuscript and further supplementing with experiments, we think that our existing data can support our claims.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study offers valuable insights into host-virus interactions, emphasizing the adaptability of the immune system. Readers should recognize the significance of MDA5 in potentially replacing RIG-I and the adversarial strategy employed by 5'ppp-RNA SCRV in degrading MDA5 mediated by m6A modification in different species, further indicating that m6A is a conserved process in the antiviral immune response.

      However, caution is warranted in extrapolating these findings universally, given the dynamic nature of host-virus dynamics. The study provides a snapshot into the complexity of these interactions, but further research is needed to validate and extend these insights, considering potential variations across viral species and environmental contexts. Additionally, it is noted that the main claims put forth in the manuscript are only partially supported by the data presented.

      After meticulous revisions of the manuscript, including adjustments to the title, abstract, results, and discussion, the main claim of our study is now the arms race between the MDA5 receptor and the SCRV virus in a lower vertebrate fish, M. miiuy. This comprises two parts. First, the MDA5 of M. miiuy can detect virus invasion and initiate the host immune response by recognizing the triphosphate structure of SCRV. Second, as an adversarial strategy, the 5’ppp-RNA SCRV virus can exploit the m6A mechanism to degrade MDA5 in M. miiuy. Based on the reviewer's suggestions, we have supplemented the critical experiments (Figure 3F-3G, Figure 4D, Figure 5G) and provided a more detailed and accurate explanation of the experimental conclusions. We believe that the current manuscript supports our main claims. In addition, because virus-host coevolution complicates the derivation of universal conclusions, we will further expand our insights in future research.

      Reviewer #2 (Public Review):

      This manuscript by Geng et al. aims to demonstrate that MDA5 compensates for the loss of RIG-I in certain species, such as teleost fish miiuy croaker. The authors use Siniperca chuatsi rhabdovirus (SCRV) and poly(I:C) to demonstrate that these RNA ligands induce an IFN response in an MDA5-dependent manner in M. miiuy-derived cells. Furthermore, they show that MDA5 requires its RD domain to directly bind to SCRV RNA and to induce an IFN response. They use in vitro synthesized RNA with a 5'triphosphate (or lacking a 5'triphosphate as a control) to demonstrate that MDA5 can directly bind to 5'-triphosphorylated RNA. The second part of the paper is devoted to m6A modification of MDA5 transcripts by SCRV as an immune evasion strategy. The authors demonstrate that the modification of MDA5 with m6A is increased upon infection and that this causes increased decay of MDA5 and consequently a decreased IFN response.

      One critical caveat in this study is that it does not address whether ppp-SCRV RNA induces IRF3-dimerization and type I IFN induction in an MDA5 dependent manner. The data demonstrate that mmiMDA5 can bind to triphosphorylated RNA (Fig. 4D). In addition, triphosphorylated RNA can dimerize IRF3 (4C). However, a key experiment that ties these two observations together is missing.

      Specifically, although Fig. 4C demonstrates that 5'ppp-SCRV RNA induces dimerization (unlike its dephosphorylated or capped derivatives), this does not prove that this happens in an MDA5-dependent manner. This experiment should have been done in WT and siMDA5 MKC cells side-by-side to demonstrate that the IRF3 dimerization that is observed here is mediated by MDA5 and not by another (unknown) protein. The same holds true for Fig. 4J.

      Thank you for the referee's professional suggestions. In fact, we have transfected SCRV RNA into WT and si-MDA5 MKC cells, and subsequently assessed the dimerization of IRF3 and the IFN response (Figure 2P-2Q). The results indicated that knockdown of MDA5 prevents immune activation of SCRV RNA. However, considering the potential for SCRV RNA to activate immunity independent of the triphosphate structure, this experimental observation does not comprehensively establish the MDA5-dependent induction of IRF3 dimer by 5’ppp-RNA. Accordingly, in accordance with the referee's recommendation, we proceeded to investigate the inducible activity of 5'ppp-SCRV on IRF3 dimerization in WT and si-MDA5 MKC cells, revealing that 5'ppp-SCRV indeed elicits immunity in an MDA5-dependent manner (Figure 4D). Additionally, poly(I:C)-HMW, a known ligand for MDA5, demonstrated a residual, albeit attenuated, activation of IRF3 following MDA5 knockdown, potentially attributed to its capacity to stimulate immunity through alternative pathways such as TLR3.

      - Fig 1C-D: these experiments are not sufficiently convincing, i.e. the difference in IRF3 dimerization between VSV-RNA and VSV-RNA+CIAP transfection is minimal.

      We have reconstituted the necessary materials and repeated the pertinent experiments depicted in Fig 1C-1D. The results demonstrate that SCRV-RNA+CIAP and VSV-RNA+CIAP exhibit a mitigating effect on the induction activity of SCRV-RNA and VSV-RNA on IRF3 dimerization, albeit without complete elimination (Figure 1C and 1D). These findings suggest the presence of receptors within M. miiuy and G. gallus capable of recognizing the viral triphosphate structure; however, it is worth noting that RNA derived from SCRV and VSV viruses does not exclusively depend on the triphosphate structure to activate the host's antiviral response.

      Fig. 2N and 2O: why did the authors decide to use overexpression of MDA5 to assess the impact of STING on MDA5-mediated IFN induction? This should have been done in cells transfected with SCRV or polyIC (as in 2D-G) or in infected cells (as in 2H-K). In addition, it is a pity that the authors did not include an siMAVS condition alongside siSTING, to investigate the relative contribution of MAVS versus STING to the MDA5-mediated IFN response. Panel O suggests that the IFN response is completely dependent on STING, which is hard to envision.

      In our previous laboratory investigations, we have substantiated the induction effect of STING on IFN under SCRV infection or poly(I:C) stimulation, as documented in the relevant literature (10.1007/s11427-020-1789-5), which we have referenced in our manuscript (lines 177-178). While we did assess the impact of STING on MDA5-mediated IFN induction in SCRV-infected cells, as indicated in the figure legends, we have revised Figure 2N-2O for improved clarity, and similarly, Figure 1H-1I has also been updated. Furthermore, considering that RNA virus infection can activate the cGAS/STING axis (10.3389/fcimb.2023.1172739) and the significant role of MAVS in sensing RNA virus invasion in the NLR pathway (10.1038/ni.1782), it is challenging to ascertain the respective contributions of STING and MAVS to the immune signaling cascade mediated by MDA5 during RNA virus infection. We intend to explore this aspect further in future research endeavors.

      Fig. 3F and 3G: where are the mock-transfected/infected conditions? Given that ectopic expression of hMDA5 is known to cause autoactivation of the IFN pathway, the baseline ISG levels should be shown (i.e., in the absence of a stimulus or infection). Normalization of the data does not reveal whether this is the case and is therefore misleading.

      Based on the reviewer's suggestions, we have rerun the experiment. We examined the effects of MDA5 and MDA5-ΔRD on antiviral factors in uninfected, SCRV-infected, and poly(I:C)-HMW-stimulated MKC cells. The results showed that overexpression of either MDA5 or MDA5-ΔRD stimulated the expression of antiviral genes in unstimulated cells. However, when cells were infected with SCRV or stimulated with poly(I:C)-HMW, only the overexpression of MDA5, not MDA5-ΔRD, significantly increased the expression of antiviral genes (Figure 3F-3I).

      Fig. 4F and 4G: can the authors please indicate in the figure which area of the gel is relevant here? The band that runs halfway the gel? If so, the effects described in the text are not supported by the data (i.e. the 5'OH-SCRV and 5'pppGG-SCRV appear to compete with Bio-5'ppp-SCRV as well as 5'ppp-SCRV).

      Apologies for any confusion. The relevant areas in the gel pertaining to the experimental findings were denoted with asterisks and elaborated upon in the figure legends (Figure 4G, 4H, and 4M). The findings indicated that 5'ppp-SCRV, in contrast to 5'OH-SCRV and 5'pppGG-SCRV, demonstrated the ability to compete with bio-5'ppp-SCRV.

      My concerns about Fig. 5 remain unaltered. The fact that MDA5 is an ISG explains its increased expression and increased methylation pattern. The authors should at the very least mention in their text that MDA5 is an ISG and that their observations may be partially explained by this fact.

      First, as our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition, normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, changes in the expression level of MDA5 can partially explain the increase in total m6A modification across all MDA5 mRNA in cells, but they cannot account for changes in m6A modification on each MDA5 transcript. We have supplemented the calculation method process in the manuscript and cited relevant literature (Lines 606-608). In addition, we have elaborated on the fact that MDA5 is an ISG in the experimental results (lines 260-261), and emphasized its compatibility with enhanced m6A modification of MDA5 in the discussion section (lines 405-409).
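
      The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual pipeline: the function names, the Ct-based percent-of-input formula (a common qPCR convention that assumes a doubling per cycle), and all numeric values are hypothetical.

```python
def percent_of_input(ct_input, ct_ip, input_fraction):
    """Percent of input recovered in the m6A immunoprecipitation.

    Assumption: a common qPCR convention with ~100% amplification
    efficiency; input_fraction is the share of the sample kept as input.
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_ip)


def relative_m6a(target_pct_input, spike_in_pct_input):
    """Target percent of input normalized to the positive-control spike-in."""
    return target_pct_input / spike_in_pct_input


def fold_change_vs_mock(relative_levels, mock_key="mock"):
    """Rescale every condition so that the mock sample equals 1."""
    mock = relative_levels[mock_key]
    return {cond: level / mock for cond, level in relative_levels.items()}


# Hypothetical percent-of-input values, for illustration only.
example = fold_change_vs_mock({
    "mock": relative_m6a(4.0, 8.0),      # target 4% of input, spike-in 8%
    "SCRV_24h": relative_m6a(6.0, 8.0),  # target 6% of input, spike-in 8%
})
```

      Because each transcript is first expressed as a fraction of its own input, a rise in total MDA5 mRNA raises input and immunoprecipitated signal together, which is why the resulting ratio is read as a per-transcript change rather than an expression change.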

      Reviewer #3 (Public Review):

      In this manuscript, the authors explored the interaction between the pattern recognition receptor MDA5 and 5'ppp-RNA in the Miiuy croaker. They found that MDA5 can serve as a substitute for RIG-I in detecting 5'ppp-RNA of Siniperca cheilinus rhabdovirus (SCRV) when RIG-I is absent in Miiuy croaker. Furthermore, they observed MDA5's recognition of 5'ppp-RNA in chickens (Gallus gallus), a species lacking RIG-I. Additionally, the authors documented that MDA5's functionality can be compromised by m6A-mediated methylation and degradation of MDA5 mRNA, orchestrated by the METTL3/14-YTHDF2/3 regulatory network in Miiuy croaker during SCRV infection. This impairment compromises the innate antiviral immunity of fish, facilitating SCRV's immune evasion. These findings offer valuable insights into the adaptation and functional diversity of innate antiviral mechanisms in vertebrates.

      We extend our sincere appreciation for your professional comments and insightful suggestions on our manuscript, as they have significantly contributed to enhancing its quality.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The interpretation of Figures 1H and I, along with the captions, seems unclear. Particularly, understanding the meaning of the X-axis in Figure I is challenging. Additionally, the designation of "H2O = 1" on the Y-axis in Figure 1E lacks clarity. It would be helpful if the author could revise and clarify these figures for better comprehension.

      We appreciate your reminder and have corrected and clarified these figures and figure legends (lines 768-772). We have replaced the Y-axis of Figure 1I with "Relative mRNA expression" instead of "Relative IFN-1 expression" (Figure 1I). In addition, we have added an explanation of "H2O=1" in the legend of Figure 1E.

      (2) The interpretation of Figure 5 in section 2.5 seems incomplete. The author mentioned that both m6A levels and MDA5 expression levels are increased (lines 256-257), prompting questions about the relationship between m6A and MDA5 expression. If higher m6A levels typically lead to MDA5 mRNA instability and lower MDA5 expression, observing both increasing simultaneously appears contradictory. Considering the dynamic changes shown in Figure 5, it would be more appropriate to propose an alteration in both m6A levels and MDA5 expression levels. Given the fluctuating nature of these changes, definitively labeling them as solely "increased" is challenging. Therefore, offering a nuanced interpretation of the results and clarifying this aspect would bolster the study's conclusions.

      While changes in m6A modification and the expression of m6A-modified transcripts are biologically relevant, identifying bona fide m6A alterations during viral infection will allow us to understand how m6A modification of cellular mRNA is regulated. As our m6A change analysis pipeline controls for changes in gene expression, these data should represent true changes in m6A modification rather than changes in the expression of m6A-modified transcripts (10.1038/s41598-020-63355-3). Similar studies demonstrated that m6A modification of RIOK3 and CIRBP mRNAs is altered following Flaviviridae infection (10.1016/j.molcel.2019.11.007). The specific calculation method is as follows: the relative m6A level for each transcript was calculated as the percent of input in each condition, normalized to that of the respective positive control spike-in. Fold change of enrichment was calculated with mock samples normalized to 1. Therefore, the upregulation of MDA5 expression can partially explain the increase in total m6A modification across all MDA5 mRNA in cells, but it cannot account for changes in m6A modification on each MDA5 transcript. We have supplemented the calculation method process in the manuscript and cited relevant literature. We hope for your understanding.

      In addition, although higher m6A levels often lead to unstable MDA5 mRNA and lower MDA5 expression, SCRV can affect MDA5 expression through multiple pathways. For example, since MDA5 is an interferon-stimulated gene, SCRV infection can cause strong expression of interferon and indirectly induce high-level expression of MDA5. Therefore, the increase in MDA5 expression is not contradictory to the simultaneous increase in MDA5 m6A modification (24 h). To further strengthen our experimental conclusions, we supplemented the dual fluorescence experiment. The results indicate that SCRV infection can inhibit the fluorescence activity of MDA5-exon1 reporter plasmids that contain m6A sites but not the promoter sequence of the MDA5 gene, and that this inhibitory effect can be counteracted by cycloleucine (CL, an amino acid analogue that can inhibit m6A modification) (Figure 5G). This further indicates that SCRV can reduce the expression of MDA5 through the m6A pathway.

      Finally, in light of the fluctuations in MDA5 expression levels, we have changed the subheadings of Results 2.5 section and provided a more comprehensive and precise elucidation of the experimental outcomes. We are grateful for your valuable feedback.

      (3) In the discussion section, it would indeed be advantageous for the author to explore the novelty of this work more comprehensively, moving beyond merely acknowledging the widespread loss of RIG-I and suggesting MDA5 as a compensatory mechanism. Considering the well-established roles of MDA5 and m6A in host-virus interactions, the findings of this study may seem familiar in light of previous research. To enhance the discussion, it would be valuable for the author to delve into the implications of this evolutionary model. For instance, does the compensation or loss of RIG-I impact a species' susceptibility to specific types of viruses? Exploring such questions would provide insight into the broader significance of this compensation model and its potential effects on host-virus interactions, thus adding depth to the study's contribution.

      We appreciate the expert advice provided by the referee. In response, we have expanded our discussion in the relevant section, addressing the potential influence of RIG-I deficiency and MDA5 compensation on the antiviral immune system in vertebrates (lines 371-376). Furthermore, we underscore the significance of exploring the impact of SCRV infection on MDA5 m6A modification, considering its compatibility with MDA5 as an ISG gene, in elucidating the host response to viral infection (lines 405-409).

      (4) To improve the manuscript, it would be beneficial if the editors could aid the author in refining the language. Many descriptions in the article are overly redundant, and there should be appropriate differentiation between experimental methods and results.

      We appreciate the reviewer’s comment. We have carefully revised the manuscript and removed redundant descriptions in the experimental results and methods.

      Reviewer #3 (Recommendations For The Authors):

      The authors have addressed all of my concerns.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer 1

      R1 Cell profiling is an emerging field with many applications in academia and industry. Finding better representations for heterogeneous cell populations is important and timely. However, unless convinced otherwise after a rebuttal/revision, the contribution of this paper, in our opinion, is mostly conceptual, but in its current form - not yet practical. This manuscript combined two concepts that were previously reported in the context of cell profiling, weakly supervised representations. Our expertise is in computational biology, and specifically applications of machine learning in microscopy.

      In our revised manuscript, we have aimed to better clarify the practical contributions of our work by demonstrating the effectiveness of the proposed concepts on real-world datasets. We hope that these revisions and our detailed responses address your concerns and highlight the potential impact of our approach.

      R1.1a. CytoSummaryNet is evaluated in comparison to aggregate-average profiling, although previous work has already reported representations that capture heterogeneity and self-supervision independently. To argue that both components of contrastive learning and sets representations are contributing to MoA prediction we believe that a separate evaluation for each component is required. Specifically, the authors can benchmark their previous work to directly evaluate a simpler population representation (PMID: 31064985, ref #13) - we are aware that the authors report a 20% improvement, but this was reported on a separate dataset. The authors can also compare to contrastive learning-based representations that rely on the aggregate (average) profile to assess and quantify the contribution of the sets representation.

      We agree that evaluating the individual contributions of the contrastive learning framework and single-cell data usage is important for understanding CytoSummaryNet's performance gains.

      To assess the impact of the contrastive formulation independently, we applied CytoSummaryNet to averaged profiles from the cpg0004 dataset. This isolated the effect of contrastive learning by eliminating single-cell heterogeneity. The experiment yielded a 32% relative improvement in mechanism of action retrieval, compared to the 68% gain achieved with single-cell data. These findings suggest that while the contrastive formulation contributes significantly to CytoSummaryNet's performance, leveraging single-cell information is crucial for maximizing its effectiveness. We have added a discussion of this experiment to the Results section:

      “We conducted an experiment to determine whether the improvements in mechanism of action retrieval were due solely to CytoSummaryNet's contrastive formulation or also influenced by the incorporation of single-cell data. We applied the CytoSummaryNet framework to pre-processed average profiles from the 10 μM dose point data of Batch 1 (cpg0004 dataset). This approach isolated the effect of the contrastive architecture by eliminating single-cell data variability. We adjusted the experimental setup by reducing the learning rate by a factor of 100, acknowledging the reduced task complexity. All other parameters remained as described in earlier experiments.

      This method yielded a less pronounced but still substantial improvement in mechanism of action retrieval, with an increase of 0.010 (32% enhancement - Table 1). However, this improvement was not as high as when the model processed single-cell level data (68% as noted above). These findings suggest that while CytoSummaryNet's contrastive formulation contributes to performance improvements, the integration of single-cell data plays a critical role in maximizing the efficacy of mechanism of action retrieval.”

      We don't believe comparing with PMID: 31064985 is useful: while the study showcased the usefulness of modeling heterogeneity using second-order statistics, its methodology is limited in scalability due to the computational burden of computing pairwise similarities for all perturbations, particularly in large datasets. Additionally, the study's reliance on similarity network fusion, while expedient, introduces complexity and inefficiency. We contend that this comparison does not align with our objective of testing the effectiveness of heterogeneity in isolation, as it primarily focuses on capturing second and first-order information. Thus, we do not consider this study a suitable baseline for comparison.

      R1.1b. The evaluation metric of mAP improvement in percentage is misleading, because a tiny improvement for a MoA prediction can lead to huge improvement in percentage, while a much larger improvement in MoA prediction can lead to a small improvement in percentage. For example, in Fig. 4, MEK inhibitor mAP improvement of ~0.35 is measured as ~50% improvement, while a much smaller mAP improvement can have the same effect near the origins (i.e., very poor MoA prediction).

      We agree that relying solely on percentage improvements can be misleading, especially when small absolute changes result in large percentage differences.

      However, we would like to clarify two key points regarding our reporting of percentage improvements:

      • We calculate the percentage improvement by first computing the average mAP across all compounds for both CytoSummaryNet and average profiling, and then comparing these averages. This approach is less susceptible to the influence of outlier improvements compared to calculating the average of individual compound percentage improvements.
      • We report percentage improvements alongside their corresponding absolute improvements. For example, the mAP improvement for Stain4 (test set) is reported as 0.052 (60%). To further clarify this point, we have updated the caption of Table 1 to explicitly state how the percentage improvements are calculated:

      The improvements are calculated as mAP(CytoSummaryNet)-mAP(average profiling). The percentage improvements are calculated as (mAP(CytoSummaryNet)-mAP(average profiling))/mAP(average profiling).
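
      The difference between the two ways of reporting a percentage can be made concrete with a small sketch. The function names and the per-compound values below are hypothetical and chosen only to illustrate the point; they are not results from the manuscript.

```python
def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def improvement_of_means(model_ap, baseline_ap):
    """Absolute and relative improvement computed on the averaged mAPs,
    i.e. (mAP_model - mAP_baseline) and that difference / mAP_baseline."""
    base = mean(baseline_ap)
    absolute = mean(model_ap) - base
    return absolute, absolute / base


def mean_of_relative_improvements(model_ap, baseline_ap):
    """Average of per-compound relative improvements (the variant that is
    sensitive to compounds whose baseline score is close to zero)."""
    return mean((m - b) / b for m, b in zip(model_ap, baseline_ap))


# Hypothetical per-compound average precision values.
baseline = [0.1, 0.5]
model = [0.2, 0.5]

absolute, relative = improvement_of_means(model, baseline)
per_compound = mean_of_relative_improvements(model, baseline)
```

      With these illustrative numbers the ratio of means gives roughly a 17% gain, whereas averaging the per-compound percentages gives 50%, driven entirely by the compound with the low baseline.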

      R1.1b. (Subjective) visual assessment of this figure does not show a convincing contribution of CytoSummaryNet representations of the average profiling on the test set (3.33 uM). This issue might also be relevant for the task of replicate retrieval. All in all, the mAP improvement reported in Table 1 and throughout the manuscript (including the Abstract), is not a proper evaluation metric for CytoSummaryNet contribution. We suggest reporting the following evaluations:

      1. Visualizing the results of cpg0001 (Figs. 1-3) similarly to cpg0004 (Fig. 4), i.e., plotting the matched mAP for CytoSummaryNet vs. average profile.

      2. In Table 1, we suggest referring to the change in the number of predictable MoAs (MoAs that pass a mAP threshold) rather than the improvement in percentages. Another option is showing a graph of the predictability, with the X axis representing a threshold and Y-axis showing the number of MoAs passing it. For example see (PMID: 36344834, Fig. 2B) and (PMID: 37031208, Fig. 2A), both papers included contributions from the corresponding author of this manuscript.

      Regarding the suggestion to visualize the results for compound group cpg0001 similarly to cpg0004, unfortunately, this is not feasible due to the differences in data splitting between the two datasets. In cpg0001, an MoA might have one compound in the training set and another in the test or validation set. Reporting a single value per MoA would require combining these splits, which could be misleading as it would conflate performance across different data subsets.

      However, we appreciate the suggestion to represent the number of predictable MoAs that surpass a certain mAP threshold, as it provides another intuitive measure of performance. To address this, we have created a graph that visualizes the predictability of MoAs across various thresholds, similar to the examples provided in the referenced papers (PMID: 36344834, Figure 2B and PMID: 37031208, Figure 2A). This graph, with the x-axis depicting the threshold and the y-axis showing the number of MoAs meeting the criterion, has been added to Supplementary Material K.
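
      A minimal sketch of how such a predictability curve could be tabulated is shown below. The function name, the choice of "greater than or equal to" as the passing criterion, and the MoA labels and scores are assumptions made for illustration, not the procedure used for Supplementary Material K.

```python
def count_predictable(map_by_moa, thresholds):
    """For each threshold, count the MoAs whose mAP is at least that value.

    Returns a list of (threshold, count) pairs, ready to be plotted with
    the threshold on the x-axis and the count on the y-axis.
    """
    return [
        (t, sum(1 for score in map_by_moa.values() if score >= t))
        for t in thresholds
    ]


# Hypothetical mAP per mechanism of action.
scores = {"moa_a": 0.1, "moa_b": 0.4, "moa_c": 0.8}
curve = count_predictable(scores, [0.0, 0.25, 0.5, 0.9])
```

      Computing the same curve for two profiling methods and overlaying them shows, at every threshold, how many additional MoAs one method makes retrievable.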

      R1.1c.i. "a subset of 18 compounds were designated as validation compounds" - 5 cross-validations of 18 compounds can make the evaluation complete. This can also enhance statistical power in figures 1-3.

      We appreciate your suggestion and acknowledge the potential benefits of employing cross-validation, particularly in enhancing statistical power. While we understand the merit of cross-validation for evaluating model performance and generalization to unseen data, we believe the results as presented already highlight the generalization characteristics of our methods.

      Specifically, (the new) Figure 3 demonstrates the model's improvement over average profiling in both training and validation plates, supporting its ability to generalize to unseen compounds (but not to unseen plates).

      While cross-validation could potentially enhance our analysis, retraining five new models solely for different validation set results may not substantially alter our conclusions, given the observed trends in Suppl Figure A1 and (the new) Figure 4, both of which show results across multiple stain sets (but a single train-test-validation split).


      R1.1c.ii. Clarify if the MoA results for cpg0001 are drawn from compounds from both the training and the validation datasets. If so, describe how the results differ between the sets in text and graphs.

      We confirm that the Mechanism of Action (MoA) retrieval results for cpg0001 are derived from all available compounds. It's important to note that the training and validation dataset split for the replicate retrieval task is different from the MoA prediction task. For replicate retrieval, we train using all available compounds and validate on a held-out set (see Figure 2). For MoA prediction, we train using the replicate retrieval task as the objective on all available compounds but validate using MoA retrieval, which is a distinct task. We have added a brief clarification in the main text to highlight the distinction between these tasks and how validation is performed for each:

      “We next addressed a more challenging task: predicting the mechanism of action class for each compound at the individual well level, rather than simply matching replicates of the exact same compound (Figure 5). It's also important to note that mechanism of action matching is a downstream task on which CytoSummaryNet is not explicitly trained. Consequently, improvements observed on the training and validation plates are more meaningful in this context, unlike in the previous task where only improvements on the test plate were meaningful. For similar reasons, we calculate the mechanism of action retrieval performance on all available compounds, combining both the training and validation sets. This approach is acceptable because we calculate the score on so-called "sister compounds" only—that is, different compounds that have the same mechanism of action annotation. This ensures there is no overlap between the mechanism of action retrieval task and the training task, maintaining the integrity of our evaluation. ”

      R1.1c.iii. "Mechanism of action retrieval is evaluated by quantifying a profile's ability to retrieve the profile of other compounds with the same annotated mechanism of action.". It was unclear to us if the evaluation of mAP for MoA identification can include finding replicates of the same compound. That is, whether finding a close replicate of the same compound would be included in the AP calculation. This would provide CytoSummaryNet with an inherent advantage as this is the task it is trained to do. We assume that this was not the case (and thus should be more clearly articulated), but if it was - results need to be re-evaluated excluding same-compound replicates.

      The evaluation excludes replicate wells of the same compound and only considers wells of other compounds with the same MoA. This methodology ensures that the model's performance on the MoA prediction task is not inflated by its ability to find replicates of the same compound, which is the objective of the replicate retrieval task. Please see the explanation we have added to the main text in our response to R1.1c.ii. Additionally, we have updated the Methods section to clearly describe this evaluation procedure:

      “Mechanism of action retrieval is evaluated by quantifying a profile’s ability to retrieve the profile of different compounds with the same annotated mechanism of action.”
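
      The exclusion rule can be sketched as follows. This is an illustrative implementation under stated assumptions, not the evaluation code used in the manuscript: the record layout (`compound`, `moa`, `features`), the use of cosine similarity for ranking, and the example profiles are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def moa_average_precision(query, profiles):
    """Average precision for retrieving same-MoA profiles of OTHER compounds.

    Profiles of the query's own compound (replicates, and the query itself)
    are removed before ranking, so replicate matching cannot inflate the score.
    Returns None when no other compound shares the query's MoA.
    """
    candidates = [p for p in profiles if p["compound"] != query["compound"]]
    ranked = sorted(
        candidates,
        key=lambda p: cosine(query["features"], p["features"]),
        reverse=True,
    )
    hits, precisions = 0, []
    for rank, profile in enumerate(ranked, start=1):
        if profile["moa"] == query["moa"]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else None


# Hypothetical well-level profiles.
profiles = [
    {"compound": "c1", "moa": "X", "features": [1.0, 0.0]},   # query
    {"compound": "c1", "moa": "X", "features": [1.0, 0.0]},   # replicate, excluded
    {"compound": "c2", "moa": "X", "features": [0.9, 0.1]},   # sister compound
    {"compound": "c3", "moa": "Y", "features": [1.0, 0.05]},  # different MoA
    {"compound": "c4", "moa": "X", "features": [0.0, 1.0]},   # sister compound
]
ap = moa_average_precision(profiles[0], profiles)
```

      Averaging this quantity over the query profiles gives a mean average precision. Without the filter, the identical replicate of `c1` would rank first and raise the score for a reason unrelated to mechanism of action.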



      R1.2a. The description of Stain2-5 was not clear for us at first (and second) read. The information is there, but more details will greatly enhance the reader's ability to follow. One suggestion is explicitly stating that these "stains" partitioning was already defined in ref 26. Another suggestion is laying out explicitly a concrete example on the differences between two of these stains. We believe highlighting the differences between stains will strengthen the claim of the paper, emphasizing the difficulty of generalizing to the out-of-distribution stain.

      We appreciate your feedback on the clarity of the Stain2-5 dataset descriptions; we certainly struggled to balance detail and concepts in describing these. We have made the following changes:

      • Explicitly mentioned that the partitioning of the Stain experiments was defined in https://pubmed.ncbi.nlm.nih.gov/37344608/: “The partitioning of the Stain experiments has been defined and explained previously [21].”
      • Moved an improved version of (now) Figure 2 from the Methods section to the main text to help visually explain how the stratification is done early on.
      • Added a new section in the Experimental Setup: Diversity of stain sets, which includes a concrete example highlighting the differences between Stain2 and Stain5 to emphasize the diversity in experimental setups within the same dataset: “Stain2-5 comprise a series of experiments which were conducted sequentially to optimize the experimental conditions for image-based cell profiling. These experiments gradually converged on the most optimal set of conditions; however, within each experiment, there were significant variations in the assay across plates. To illustrate the diversity in experimental setups within the dataset, we will highlight the differences between Stain2 and Stain5.

      Stain2 encompasses a wide range of nine different experimental protocols, employing various imaging techniques such as Widefield and Confocal microscopy, as well as specialized conditions like multiplane imaging and specific stains like MitoTracker Orange. This subset also includes plates acquired with strong pixel binning instead of default imaging and plates with varying concentrations of dyes like Hoechst. As a result, Stain2 exhibits greater variance in the experimental conditions across different plates compared to Stain5.

      In contrast, Stain5, the last experiment in the series, follows a more systematic approach, consistently using either confocal or default imaging across three well-defined conditions. Each condition in Stain5 utilizes a lower cell density of 1,000 cells per well compared to Stain2's 2,500 cells per well. Being the final experiment in the series, Stain5 had the least variance in experimental conditions.

      For training the models, we typically select the data containing the most variance to capture the broadest range of experimental variation. Therefore, we chose Stain2-4 for training, as they represented the majority of the data and captured the most experimental variation. We reserved Stain5 for testing to evaluate the model's ability to generalize to new experimental conditions with less variance.

      All StainX experiments were acquired in different passes, which may introduce additional batch effects.”

      These changes aim to provide a clearer understanding of the dataset's complexity and the challenges associated with generalizing to out-of-distribution data.

      R1.2b. What does each data point in Figures 1-3 represent? Is it the average mAP for the 18 validation compounds, using different seeds for model training? Why not visualize the data similarly to Fig. 4 so the improvement per compound can be clearly seen?

      The data points in (the new) Figures 3, 4, and 5 represent the average mAP for each plate: we first compute the mAP for each compound and then average across compounds on that plate. We have updated the figure captions to clarify this:

      "... (each data point is the average mAP of a plate) ..."

      While visualizing the mAP per compound, similar to (the new) Figure 6 for cpg0004, could provide insights into compound-level improvements, it would require creating numerous additional figures or one complex figure to adequately represent all the stratifications we are analyzing (plate, compound, Stain subset). By averaging the data per plate across different stratifications, we aim to provide a clearer and more comprehensible overview of the trends and improvements while allowing us to draw conclusions about generalization.
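
      The per-plate aggregation described in this response can be sketched as follows. This is a minimal illustration, not the code used in the manuscript: the cosine-similarity ranking, the data layout (one profile per well), and the names `plate_map`, `average_precision`, and `cosine` are assumptions introduced here.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_precision(ranked_hits):
    """AP of a ranked list of booleans (True = replicate of the query compound)."""
    hits, precisions = 0, []
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def plate_map(profiles, compounds):
    """Average mAP of one plate: AP per well, mean per compound, mean over compounds."""
    per_compound = defaultdict(list)
    for i, query in enumerate(profiles):
        # Rank every other well on the plate by similarity to the query well.
        others = [j for j in range(len(profiles)) if j != i]
        others.sort(key=lambda j: cosine(query, profiles[j]), reverse=True)
        ranked_hits = [compounds[j] == compounds[i] for j in others]
        per_compound[compounds[i]].append(average_precision(ranked_hits))
    compound_map = {c: sum(v) / len(v) for c, v in per_compound.items()}
    return sum(compound_map.values()) / len(compound_map)


# Toy plate: two compounds, two replicate wells each (hypothetical values).
profiles = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
compounds = ["A", "A", "B", "B"]
toy_score = plate_map(profiles, compounds)
```

      Each value returned by `plate_map` would correspond to one data point in the figures, under the assumptions stated above.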

      Please note: this comment is related to the comment R1.1b (Subjective)

      R1.2.c [On the topic of enhancing clarity and readability:] Justification and interpretation of the evaluation metrics.

      Please refer to our response to comment R1.1b, where we have addressed your concerns regarding the justification and interpretation of the evaluation metrics.

      R1.2d. Explicitly mentioning the number of MoAs for each datasets and statistics of number of compounds per MoA (e.g., average\median, min, max).

      We have added the following to the Experimental Setup: Data section:

      “A subset of the data was used for evaluating the mechanism of action retrieval task, focusing exclusively on compounds that belong to the same mechanism class. The Stain plates contained 47 unique mechanisms of action, with each compound replicated four times. Four mechanisms had only a single compound; the four mechanisms (and corresponding compounds) were excluded, resulting in 43 unique mechanisms used for evaluation. In the LINCS dataset, there were 1436 different mechanisms, but only 661 were used for evaluation because the remaining had only one compound.”
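
      The exclusion rule in the quoted passage (dropping mechanisms annotated to a single compound) can be sketched as below. The function name `evaluable_moas`, the mapping layout, and the example annotations are hypothetical and only illustrate the filtering step.

```python
from collections import defaultdict


def evaluable_moas(compound_to_moa, min_compounds=2):
    """Keep only mechanisms annotated to at least `min_compounds` distinct compounds."""
    moa_to_compounds = defaultdict(set)
    for compound, moa in compound_to_moa.items():
        moa_to_compounds[moa].add(compound)
    return {
        moa: sorted(members)
        for moa, members in moa_to_compounds.items()
        if len(members) >= min_compounds
    }


# Hypothetical annotations, for illustration only.
annotations = {
    "cpd_1": "MEK inhibitor",
    "cpd_2": "MEK inhibitor",
    "cpd_3": "HSP inhibitor",  # single-compound mechanism, excluded below
}
kept = evaluable_moas(annotations)
```

      A mechanism with only one compound has no other compound to retrieve, which is why such mechanisms cannot contribute to the retrieval metric.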

      R1.2e. The data split in general is not easily understood. Figure 8 is somewhat helpful, however in our view, it can be improved to enhance understanding of the different splits. Specifically, the training and validation compounds need to be embedded and highlighted within the figure.

      Thank you for highlighting this. We have completely revised the figure (now Figure 2), which we hope conveys the data split strategy more clearly.

      Please note: this comment is related to the comment R1.2a.





      R1.3a. Why was stain 5 used for the test, rather than the other stains?

      Stain2-5 were part of a series of experiments aimed at optimizing the experimental conditions for image-based cell profiling using Cell Painting. These experiments were conducted sequentially, gradually converging on the most optimal set of conditions. However, within each experiment, there were significant variations in the assay across plates, with earlier iterations (Stain2-4) having more variance in the experimental conditions compared to Stain5. As Stain5 was the last experiment in the series and consisted of only three different conditions, it had the least variance. For training the models, we typically select the data containing the most variance to capture the broadest range of experimental variation. Therefore, Stain2-4 were chosen for training, while Stain5 was reserved for testing to evaluate the model's ability to generalize to new experimental conditions with less variance.

      We have now clarified this in the Experimental Setup: Diversity of stain sets section. Please see our response to comment R1.2a. for the full citation.

      R1.3b How were the 18 validation compounds selected?

      20% of the compounds (n=18) were randomly selected and designated as validation compounds, with the remaining compounds assigned to the training set. We have now clarified this in the Results section:

      “Additionally, 20% of the compounds (n=18) were randomly selected and designated as validation compounds, with the remaining compounds assigned to the training set (Supplementary Material H).”
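
      A random compound-level split of this kind might look like the sketch below. The seed, the identifiers, and the total of 90 compounds (inferred here from 18 being 20%) are assumptions, not values taken from the manuscript.

```python
import random


def split_compounds(compounds, validation_fraction=0.2, seed=0):
    """Randomly assign a fraction of compounds to validation; the rest go to training."""
    rng = random.Random(seed)  # the seed value is an assumption for reproducibility
    shuffled = sorted(compounds)  # sort first so the result does not depend on input order
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# 90 hypothetical compound identifiers; 20% of 90 gives 18 validation compounds.
all_compounds = [f"cpd_{i:02d}" for i in range(90)]
train, val = split_compounds(all_compounds)
```

      Splitting at the compound level, rather than the well level, keeps all replicates of a validation compound out of training.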

      R1.3c. For cpg0004, no justification for the specific doses selected (10uM - train, 3.33 uM - test) for the analysis in Figure 4. Why was the data split for the two dosages? For example, why not perform 5-fold cross validation on the compounds (e.g., of the highest dose)?

      We chose to use the 10 μM dose point as the training set because we expected this higher dosage to consist of stronger profiles with more variance than lower dose points, making it more suitable for training a model. We decided to use a separate test set at a different dose (3.33 μM) to assess the model's ability to generalize to new dosages. While cross-validation on the highest dose could also be informative, our approach aimed to balance the evaluation of the model's generalization capability with its ability to capture biologically relevant patterns across different dosages.

      This explanation has been added to the text:

      “We chose the 10 μM dose point for training because we expected this high dosage to produce stronger profiles with more variance than lower dose points, making it more suitable for model training.”

      “The multiple dose points in this dataset allowed us to create a separate hold-out test set using the 3.33 μM dose point data. This approach aimed to evaluate the model's performance on data with potentially weaker profiles and less variance, providing insights into its robustness and ability to capture biologically relevant patterns across dosages. While cross-validation on the 10 μM dose could also be informative, focusing on lower dose points offers a more challenging test of the model's capacity to generalize beyond its training conditions, although we do note that all compounds’ phenotypes would likely have been present in the 10 μM training dataset, given the compounds tested are the same in both.”

      R1.3d. A more detailed explanation on the logic behind using a training stain to test MoA retrieval will help readers appreciate these results. In our first read of this manuscript we did not grasp that, we did in a second read, but spoon-feeding your readers will help.

      This comment is related to the rationale behind training on one task and testing on another, which is addressed in our responses to comments R1.1.cii and R1.1.ciii.

      R1.4 Assessment of interpretability is always tricky. But in this case, the authors can directly confirm their interpretation that the CytoSummaryNet representation prioritizes large uncrowded cells, by explicitly selecting these cells, and using their average profile re

      We progressively filtered out cells based on a quantile threshold for Cells_AreaShape features (MeanRadius, MaximumRadius, MedianRadius, and Area), which were identified as important in our interpretability analysis, and then computed average profiles using the remaining cells before determining the replicate retrieval mAP. In the exclusion experiment, we gradually left out cells as the threshold increased, while in the inclusion experiment, we progressively included larger cells from left to right.

      The results show that using only the largest cells does not significantly increase the performance. Instead, it is more important to include the large cells rather than only including small cells. The mAP saturates after a threshold of around 0.4, indicating that larger cells define the profile the most, and once enough cells are included to outweigh the smaller cell features, the profile does not change significantly by including even larger cells.

      These findings support our interpretation that CytoSummaryNet prioritizes large, uncrowded cells. While this approach could potentially be used as a general outlier removal strategy for cell profiling, further investigation is needed to assess its robustness and generalizability across different datasets and experimental conditions.
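
      The filtering experiment described above can be approximated by the following sketch. The quantile definition, the `keep` switch, and the feature name `Intensity` are assumptions made for illustration; only the use of Cells_AreaShape size features as the filter comes from the response.

```python
import statistics


def quantile_cutoff(values, q):
    """Value at quantile q (0 <= q <= 1) using a simple lower-rank rule (assumed)."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def average_profile(cells, size_key, threshold, keep="large"):
    """Mean feature vector over the cells at or above (or below) the size cutoff."""
    cutoff = quantile_cutoff([c[size_key] for c in cells], threshold)
    if keep == "large":
        kept = [c for c in cells if c[size_key] >= cutoff]
    else:
        kept = [c for c in cells if c[size_key] < cutoff]
    if not kept:
        return None  # nothing left to average at this threshold
    return {k: statistics.fmean(c[k] for c in kept) for k in kept[0]}


# Hypothetical well with four cells and two features.
well = [
    {"Cells_AreaShape_Area": 100.0, "Intensity": 1.0},
    {"Cells_AreaShape_Area": 200.0, "Intensity": 2.0},
    {"Cells_AreaShape_Area": 300.0, "Intensity": 3.0},
    {"Cells_AreaShape_Area": 400.0, "Intensity": 4.0},
]
large_only = average_profile(well, "Cells_AreaShape_Area", 0.5, keep="large")
```

      Sweeping `threshold` from 0 to 1 and recomputing replicate retrieval mAP on the resulting profiles would reproduce the shape of the analysis, though not necessarily its exact numbers.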

      We have created Supplementary Material L to report these findings and we additionally highlight them in the Results:

      “To further validate CytoSummaryNet's prioritization of large, uncrowded cells, we progressively filtered cells based on Cells_AreaShape features and observed the impact on replicate retrieval mAP (Supplementary Material L). The results support our interpretation and highlight the key role of larger cells in profile strength.”

      R1.5. Placing this work in context of other weakly supervised representations. Previous papers used weakly supervised labels of proteins / experimental perturbations (e.g., compounds) to improve image-derived representations, but were not discussed in this context. These include PMID: 35879608, https://www.biorxiv.org/content/10.1101/2022.08.12.503783v2 (from the same research groups and can also be benchmarked in this context), https://pubs.rsc.org/en/content/articlelanding/2023/dd/d3dd00060e , and https://www.biorxiv.org/content/10.1101/2023.02.24.529975v1. We believe that a discussion explicitly referencing these papers in this specific context is important.

      While these studies provide valuable insights into improving cell population profiles using representation learning, our work focuses specifically on the question of single-cell aggregation methods. We chose to use classical features for our comparisons because they are the current standard in the field. This approach allows us to directly assess the performance of our method in the context of the most widely used feature extraction pipeline in practice. However, we see the value in incorporating them in future work and have mentioned them in the Discussion:

      “Recent studies exploring image-derived representations using weakly supervised and self-supervised learning [35][36] could inspire future research on using learned embeddings instead of classical features to enhance model-aggregated profiles.”

      R1.minor1. "Because the improved results could stem from prioritizing certain features over others during aggregation, we investigated each cell's importance during CytoSummaryNet aggregation by calculating a relevance score for each" - what is the relevance score? Would be helpful to provide some intuition in the Results.

      We have included more explanation of the relevance score in the Results section, following the explanation of sensitivity analysis (SA) and critical point analysis (CPA):

      “SA evaluates the model's predictions by analyzing the partial derivatives in a localized context, while CPA identifies the input cells with the most significant contribution to the model's output. The relevance scores of SA and CPA are min-max normalized per well and then combined by addition. The combination of the two is again min-max normalized, resulting in the SA and CPA combined relevance score (see Methods for details).”
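
      The normalization and combination steps in the quoted passage can be written out as a short sketch. The per-cell SA and CPA scores are taken as given inputs here; the function names and the handling of a constant score list are assumptions.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_relevance(sa_scores, cpa_scores):
    """Combine the per-cell SA and CPA scores of one well into one relevance score."""
    sa_norm = min_max(sa_scores)    # normalize SA within the well
    cpa_norm = min_max(cpa_scores)  # normalize CPA within the well
    summed = [s + c for s, c in zip(sa_norm, cpa_norm)]
    return min_max(summed)          # normalize the sum again


# Hypothetical scores for three cells in one well.
relevance = combined_relevance([1.0, 2.0, 3.0], [10.0, 30.0, 20.0])
```

      Normalizing per well before adding puts SA and CPA on the same scale, so neither score dominates the sum merely because of its units.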

      R1.minor2. Figure 1:

      1. Colors of the two methods too similar
      2. The dots are too close. It will be more easily interpreted if they were further apart.
      3. What do the dots stand for?
      4. We recommend considering moving this figure to the supp. material (the most important part of it is the results on the test set and it appears in Fig.2).
      1. We chose a lighter and darker version of the same color as a theme to simplify visualization, as this theme is used throughout (the new) Figures 3,4,5.
      2. We agree; we have now redrawn the figure to fix this.
      3. Each data point is the average mAP of a plate. Please see our answer for R1.2b as well.
      4. We believe that (the new) Figures 3,4,5 serve distinct purposes in testing various generalization hypotheses. We have added the following text to emphasize that the first figures are specifically about generalization hypothesis testing: “We first investigated CytoSummaryNet’s capacity to generalize to out-of-distribution data: unseen compounds, unseen experimental protocols, and unseen batches. The results of these investigations are visualized in Figures 3, 4, and 5, respectively.”

      R1.minor3 Figure 4: It is somewhat misleading to look at the training MoAs and validation MoAs embedded together in the same graph. We recommend showing only the test MoAs (train MoAs can move to SI).

      We addressed this comment in R1.1c.ii. To reiterate briefly, there are no training, validation, or test MoAs because these are not used as labels during the training process. There is an option to split them based on training and validation compounds, which is addressed in R1.1c.ii.


      R1.minor4 Figure 5

      1. Why only Stain3? What happens if we look at Stains 2,3 and 4 together? Stain 5?

      2. Should validation compounds and training compounds be analyzed separately?

      3. Subfigure (d): it is expected that the data will be classified by compound labels as it is the training task, but for this to be persuasive I would like to see this separately on the training compounds first and then and more importantly on the validation compounds.

      4. For subfigures (b) and (d): it appears there are not enough colors for d, which makes it partially not understandable. For example, the pink label in (d) shows a single compound which appears to represent two different MoAs. This is probably not the case, and it has two different compounds, but it cannot be inferred when they are represented by the same color.

      5. For the Subfigure (e) - only 1 circle looks justified (in the top left). And for that one, is it not a case of an outlier plate that would perhaps need to be removed from analysis? Is it not good that such a plate will be identified?

      1. We have addressed this point in the text, stating that the results are similar for Stain2 and Stain4. Stain5 represents an out-of-distribution subset because of a very different set of experimental conditions (see Experimental Setup: Diversity of stain sets). To improve clarity, we have revised the figure caption to reiterate this information:

      “... Stain2 and Stain4 yielded similar results (data not shown). …”

      2. For replicate retrieval, analyzing validation and training compounds separately is appropriate. However, this is not the case for MoA retrieval, as discussed in our responses to R1.1c.ii and R1.1c.i.
      3. We have created the requested plot (below) but ultimately decided not to include it in the manuscript because we believe that (the new) Figures 3 and 4 are more effective for making quantitative comparative claims.

      [Please see the full revision document for the figures]

      Top: training compounds (validation compounds grayed out); not all compounds are listed in the legend.

      Bottom: validation compounds (training compounds grayed out).

      Left: average profiling; Right: CytoSummaryNet

      4. We agree with your observation and have addressed this issue by labeling the center mass as a single class (gray) and highlighting only the outstanding pairs in color. Please refer to the updated figure and our response to R3.6 for more details.

      5. In the updated figure, we have revised the figure caption to focus solely on the annotation of same mechanism of action profile clusters, as indicated by the green ellipses. The annotation of isolated plate clusters has been removed (Figures 7e and 7f) to maintain consistency and avoid potential confusion. Despite being an outlier for Stain3, the plate (BR00115134bin1) clusters with Stain4 plates (Supplementary Figure F1, green annotated square inside the yellow annotated square), indicating it is not merely a noisy outlier and can provide insights into the out-of-sample performance of our model.

      R1.minor5a. Discussion: "perhaps in part due to its correction of batch effects" - is this statement based on Fig. 5F - we are not convinced.

      We appreciate the reviewer's scrutiny regarding our statement about batch effect correction. Upon reevaluation, we agree that this claim was not adequately substantiated by empirical data. We quantified the batch effects using comparison mean average precision for both average profiles and CytoSummaryNet profiles, and the statistical analysis revealed no significant difference between these profiles in terms of batch effect correction. Therefore, we have removed this theoretical argument from the manuscript entirely to ensure that all claims are strongly supported by the data presented.

      R1.minor5b. "Overall, these results improve upon the ~20% gains we previously observed using covariance features" - this is not the same dataset so it is hard to reach conclusions - perhaps compare performance directly on the same data?

      We have now explicitly clarified this is a different dataset. Please see our response to R1.1a for why a direct comparison was not performed. The following clarification can be found in the Discussion:

      “These results improve upon the ~20% gains previously observed using covariance features [13] albeit on a different dataset, and importantly, CytoSummaryNet effectively overcomes the challenge of recomputation after training, making it easier to use.”

      Reviewer 2

      R2.1 The authors present a well-developed and useful algorithm. The technical motivation and validation are very carefully and clearly explained, and their work is potentially useful to a varied audience.

      That said, I think the authors could do a better job, especially in the figures, of putting the algorithm in context for an audience that is unfamiliar with the cell painting assay. (a) For example, a figure towards the beginning of the paper with example images might help to set the stage. (b) Similarly a schematic of the algorithm earlier in the paper would provide a graphical overview. (c) For the sake of a biologically inclined audience, I would consider labeling the images in the caption by cell type and label.

      Thank you for your valuable suggestions on improving the accessibility of our figures for readers unfamiliar with the Cell Painting assay. We have made the following changes to address your comments:

      (a) and (b) To provide visual context and a graphical overview of the algorithm, we have moved the original Figure 7 to Figure 1. This figure now includes example images that help readers new to the Cell Painting assay.
      (c) We have added relevant details to the example images in (the new) Figure 1.

        R2.2 The interpretability results were intriguing. The authors might consider further validating these interpretations by removing weakly informative cells from the dataset and retraining. Are the cells so uninformative that the algorithm does better without them, or are they just less informative than other cells?

      Please see our responses to R1.4 and R3.0

      R2.3 As far as I can tell, the authors only obliquely state whether the code associated with the manuscript is openly available. Posting the code is needed for reproducibility. I would provide not only a github, but a doi linked to the code, or some other permanent link.

      We have now added a Code Availability and Data Availability section, clearly stating that the code and data associated with the manuscript are openly available.

      R2.4 Incorporating biological heterogeneity into machine-learning driven problems is a critical research question. Replacing means/modes and such with a machine learning framework, the authors have identified a problem with potentially wide significance. The application to cell painting and related assays is of broad enough significance for many journals. However, the authors could further broaden the significance by commenting on other possible cell biology applications. What other applications might the algorithm be particularly suited for? Are there any possible roadblocks to wider use? What sorts of data has the code been tested on so far?

      We have added the following paragraph to discuss the broader applicability of CytoSummaryNet:

      “The architecture of CytoSummaryNet holds significant potential for broader applications beyond image-based cell profiling, accommodating tabular, permutation-invariant data and enhancing downstream task performance when applied to processed population-level profiles. Its versatility makes it valuable for any omics measurements where downstream tasks depend on measuring similarity between profiles. Future research could also explore CytoSummaryNet's applicability to genetic perturbations, expanding its utility in functional genomics.”

      Reviewer 3

      R3.0 The authors have done a commendable job discussing the method, demonstrating its potential to outperform current models in profiling cell-based features. The work is of considerable significance and interest to a wide field of researchers working on the understanding of cell heterogeneity's impact on various biological phenomena and practical studies in pharmacology.

      One aspect that would further enhance the value of this work is an exploration of the method's separation power across different modes of action. For instance, it would be interesting to ascertain if the method's performance varies when dealing with actions that primarily affect size, those that affect marker expression, or compounds that significantly diminish cell numbers.

      Thank you for the encouraging comments!

      We have added the following to Results: Relevance scores reveal CytoSummaryNet's preference for large, isolated cells:

      “Statistical t-tests were conducted to identify the features that most effectively differentiate mechanisms of action from negative controls in average profiles, focusing on the three mechanisms of action where CytoSummaryNet demonstrates the most significant improvement and the three mechanisms where it shows the least. Consistent with our hypothesis that CytoSummaryNet emphasizes larger, more sparse cells, the important features for the CytoSummaryNet-improved mechanisms of action (Supplementary Material I1) often involve the radial distribution for the mitochondria and RNA channels. These metrics capture the fraction of those stains near the edge of the cell versus concentric rings towards the nucleus, which are more readily detectable in larger cells compared to small, rounded cells.

      In contrast, the important features for mechanisms of action not improved by CytoSummaryNet (Supplementary Material I) predominantly include correlation metrics between brightfield and various fluorescent channels, capturing spatial relationships between cellular components. Some of these mechanisms of action included compounds that were not individually distinguishable from negative controls, and CytoSummaryNet did not overcome the lack of phenotype in these cases. This suggests that while CytoSummaryNet excels in identifying certain cellular features, its effectiveness is limited when dealing with mechanisms of action that do not exhibit pronounced phenotypic changes.”

      We have also added supplementary material to support (I. Relevant features for CytoSummaryNet improvement).

      R3.0 Another test on datasets that are not concerned with chemical compounds, but rather genetic perturbations would greatly increase the reach of the method into the functional genomics community and beyond. This additional analysis could provide valuable insights into the versatility and applicability of the proposed method.

      We agree that testing the method's behavior on genetic perturbations would be interesting and could provide insights into its versatility. However, the efficacy of the methodology may vary depending on the specific properties of different genetic perturbation types.

      For example, the penetrance of phenotypes may differ between genetic and chemical perturbations. In some experimental setups, a selection agent ensures that nearly all cells receive a genetic perturbation (though not all may express a phenotype due to heterogeneity or varying levels of the target protein). Other experiments may omit such an agent. Additionally, different patterns might be observed in various classes of reagents, such as overexpression, CRISPR-Cas9 knockout (CRISPRn), CRISPR-interference (CRISPRi), and CRISPR-activation (CRISPRa).

      We believe that selecting a single experiment with one of these technologies would not adequately address the question of versatility. Instead, we propose future studies that may conclusively assess the method's performance across a variety of genetic perturbation types. This would provide a more comprehensive understanding of CytoSummaryNet's applicability in functional genomics and beyond. We have updated the Discussion section to reflect this:

      “Future research could also explore CytoSummaryNet's applicability to genetic perturbations, expanding its utility in functional genomics.”

      R3.1. The datasets were stratified based on plates and compounds. It would be beneficial to clarify the basis for data stratification applied for compounds. Was the data sampled based on structural or functional similarity of compounds? If not, what can be expected from the model if trained and validated using structurally or functionally diverse and non-diverse compounds?

      Thank you for raising the important question of data stratification based on compound similarity. In our study, the data stratification was performed by randomly sampling the compounds, without considering their structural or functional similarity.

      This approach may limit the generalizability of the learned representations to new structural or functional classes not captured in the training set. Consequently, the current methodology may not fully characterize the model’s performance across diverse compound structures.

      In future work, it would be valuable to explore the impact of compound diversity on model performance by stratifying data based on structural or functional similarity and comparing the results to our current random stratification approach to more thoroughly characterize the learned representations.

      R3.2. Is the method prioritizing a particular biological reaction of cells toward common chemical compounds, such as mitotic failure? Could this be oncology-specific, or is there more utility to it in other datasets?

      Our analysis of CytoSummaryNet's performance in (the new) Figure 6 reveals a strong improvement in MoAs targeting cancer-related pathways, such as MEK, HSP, MDM, dehydrogenase, and purine antagonist inhibitors. These MoAs share a common focus on cellular proliferation, survival, and metabolic processes, which are key characteristics of cancer cells.

      Given the composition of the cpg0004 dataset, which contains 1,258 unique MoAs with only 28 annotated as oncology-related, the likelihood of randomly selecting five oncology-related MoAs that show strong improvement is extremely low. This suggests that the observed prioritization is not due to chance.
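
      To put a rough number on "extremely low", one can compute the chance that five MoAs drawn uniformly at random, without replacement, are all oncology-related. This is only a back-of-the-envelope sketch: uniform random selection is an assumption made here, and it ignores that the five MoAs were identified after the fact as the most improved.

```python
from math import comb


def prob_all_from_subset(total, subset, draws):
    """Probability that `draws` items sampled without replacement all fall in `subset`."""
    return comb(subset, draws) / comb(total, draws)


# Counts quoted in the response: 1,258 MoAs, 28 oncology-related, 5 improved MoAs.
p_chance = prob_all_from_subset(total=1258, subset=28, draws=5)
```

      Under these assumptions the probability is on the order of a few in a billion, which is consistent with the qualitative claim but should not be read as a formal significance test.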

      Furthermore, the prioritization cannot be solely attributed to the frequency of oncology-related MoAs in the dataset. Other prevalent disease areas, such as neurology/psychiatry, infectious disease, and cardiology, do not exhibit similar improvements despite having higher MoA counts.

      While these findings indicate a potential prioritization of oncology-related MoAs by CytoSummaryNet, further research is necessary to fully understand the extent and implications of this bias. Future work should involve conducting similar analyses across other disease areas and cell types to assess the method's broader utility and identify areas for refinement and application. However, given the speculative nature of these observations, we have chosen not to update the manuscript to discuss this potential bias at this time.

      R3.3 Figures 1 and 2 demonstrate that the CytoSummaryNet profiles outperform average-aggregated profiles. However, the average profiling results seem more consistent when compared to CytoSummaryNet profiling. What further conditions or approaches can help improve CytoSummaryNet profiling results to be more consistent?

      The observed variability in CytoSummaryNet's performance is primarily due to the intentional technical variance in our datasets, where each plate tested different staining protocol variations. It's important to note that this level of technical variance is not typical in standard cell profiling experiments. In practice, the variance across plates would be much lower. We want to emphasize that while a model capable of generalizing across diverse experimental conditions might seem ideal, it may not be as practically useful in real-world scenarios. This is because such non-uniform conditions are uncommon in typical cell profiling experiments. In normal experimental settings, where technical variance is more controlled, we expect CytoSummaryNet's performance to be more consistent.

      R3.4 Can the poor performance on unseen data (in the case of stain 5) be overcome? If yes, how? If no, why not?

      We believe that the poor performance on unseen data, such as Stain 5, can be overcome depending on the nature of the unseen data. As shown in Figure 4 (panel 3), the model improves upon average profiling for unseen data when the experimental conditions are similar to the training set.

      The issue lies in the different experimental conditions. As explained in our response to R3.3, this could be addressed by including these experimental conditions in the training dataset. As long as CytoSummaryNet is trained (seen) and tested (unseen) on data generated under similar experimental conditions, we are confident that it will improve or perform as well as average profiling.

      It's important to note that the issue of generalization to vastly different experimental conditions was considered out of scope for this paper. The main focus is to introduce a new method that improves upon average profiling and can be readily used within a consistent experimental setup.

      R3.5 It needs to be mentioned how the feature data used for CytoSummaryNet profiling was normalized before training the model. What would be the impact of feature data normalization before model training? Would the model still outperform if the skewed feature data is normalized using square or log transformation before model training?

      We have clarified in the manuscript that we standardized the feature data on a plate-by-plate basis to achieve zero mean and unit variance across all cells per feature within each plate. We have added the following statement to improve clarity:

      “The data used to compute the average profiles and train the model were standardized at the plate-level, ensuring that all cell features across the plate had a zero mean and unit variance. The negative control wells were then removed from all plates."
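
      As an illustration of the plate-level standardization described above, here is a minimal sketch using only the Python standard library. The function name `standardize_per_plate` and the `(plate_id, feature_vector)` input layout are assumptions made for this example and are not taken from the CytoSummaryNet code base. Constant features are left at zero rather than divided by a zero standard deviation, which is also an assumption.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def standardize_per_plate(cells):
    """Z-score every feature within each plate.

    cells: list of (plate_id, feature_vector) tuples, one per cell.
    Returns a list in the same order with standardized feature vectors.
    """
    # Group the feature vectors by plate (hypothetical layout).
    by_plate = defaultdict(list)
    for plate_id, features in cells:
        by_plate[plate_id].append(features)

    # Compute the per-feature mean and population standard deviation per plate.
    stats = {}
    for plate_id, rows in by_plate.items():
        columns = list(zip(*rows))
        means = [fmean(col) for col in columns]
        # A constant feature has zero spread; fall back to 1.0 to avoid division by zero.
        stds = [pstdev(col) or 1.0 for col in columns]
        stats[plate_id] = (means, stds)

    # Apply the plate-specific statistics to every cell.
    standardized = []
    for plate_id, features in cells:
        means, stds = stats[plate_id]
        scaled = [(x - m) / s for x, m, s in zip(features, means, stds)]
        standardized.append((plate_id, scaled))
    return standardized
```

      In this sketch, removal of negative control wells would happen after the call, matching the order given in the quoted statement.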

      We chose standardization over square or log transformations to maintain a balanced scale across features while preserving the biological and morphological information inherent in the data. While transformations can reduce skewness and are useful for data spanning several orders of magnitude, they might distort biological relevance by compressing or expanding data ranges in ways that could obscure important cellular variations.

      Regarding the potential impact of square or log transformations on skewed feature data, these methods could improve the model's learning efficiency by making the feature distribution more symmetrical. However, the suitability and effectiveness of these techniques would depend on the specific data characteristics and the model architecture.

      Although not explored in this study, investigating various normalization techniques could be a valuable direction for future research to assess their impact on the performance and adaptability of CytoSummaryNet across diverse datasets and experimental setups.

      R3.6. In Figure 5 b and c, MoAs often seem to be represented by singular compounds and thus, the test (MoA prediction) is very similar to the training (compound ID). Given this context, a discussion about the extent this presents a circular argument supported by stats on the compound library used for training and testing would be beneficial.

      Clusters in (the new) Figure 7 that contain only replicates of a single compound would not yield an improved performance on the MoA task unless they also include replicates of other compounds sharing the same MoA in close proximity. Please see our response to R1.1c.iii. for details. To improve visual clarity and avoid misinterpretation, we have recomputed the colors for (the new) Figure 7 and grayed out overlapping points.

      R3.7 Can you estimate the minimum amount of supervision (fuzzy/sparse labels, often present in mislabeled compound libraries with dirty compounds and polypharmacology being present) that is needed for it to be efficiently trained?

      It's important to note that the metadata used by the model is only based on identifying replicates of the same compound. Mechanism of action (MoA) annotations, which can be erroneous due to dirty compounds, polypharmacology, and incomplete information, are not used in training at all. MoA annotations are only used in our evaluation, specifically for calculating the mAP for MoA retrieval.
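
      To make the evaluation step concrete, the following is a minimal sketch of mean average precision (mAP) for label retrieval. It assumes cosine similarity between profiles and treats every other profile sharing the query's label as a positive; the function names and this exact formulation are assumptions for illustration and may differ from the evaluation code used in the study.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_average_precision(profiles, labels):
    """Mean of per-query average precision over all profiles.

    Each profile is used as a query; the remaining profiles are ranked by
    cosine similarity, and profiles with the same label count as hits.
    """
    average_precisions = []
    for q in range(len(profiles)):
        others = [i for i in range(len(profiles)) if i != q]
        ranked = sorted(others, key=lambda i: cosine(profiles[q], profiles[i]), reverse=True)
        n_positive = sum(labels[i] == labels[q] for i in others)
        if n_positive == 0:
            continue  # a label with a single member cannot be retrieved
        hits = 0
        precision_sum = 0.0
        for rank, i in enumerate(ranked, start=1):
            if labels[i] == labels[q]:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / n_positive)
    return sum(average_precisions) / len(average_precisions) if average_precisions else 0.0
```

      In such a setup the labels passed to the metric would be MoA annotations, while training itself would only ever see compound identities.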

      We have successfully trained CytoSummaryNet on 72 unique compounds with 4 replicates each. This is the current empirical minimum, but it is possible that the model could be trained effectively with even fewer compounds or replicates.

      Determining the absolute minimum amount of supervision required for efficient training would require further experimentation and analysis. Factors such as data quality, feature dimensionality, and model complexity could influence the required level of supervision.

      R3.minor1 Figure 5: The x-axis and y-axis tick values are too small, and image resolution/size needs to be increased.

      We have made the following changes to address the concerns:

      • Increased the image resolution and size to improve clarity and readability.
      • Removed the x-axis and y-axis tick values, as they do not provide meaningful information in the context of UMAP visualizations.

      We believe these modifications enhance the visual presentation of the data and make it easier for readers to interpret the results.

      R3.minor2 The methods applied to optimize hyperparameters in supplementary data need to be included.

      We added the following to Supplementary Material D:

      “We used the Weights & Biases (WandB) sweep suite in combination with the BOHB (Bayesian Optimization and HyperBand) algorithm for hyperparameter sweeps. The BOHB algorithm [47] combines Bayesian optimization with bandit-based strategies to efficiently find optimal hyperparameters.

      Additionally, Table D1 provides an overview of all tunable hyperparameters and their chosen values based on a BOHB hyperparameter optimization.”

      R3.minor3 Figure 5(c, d): The names of compound 2 and Compound 5 need to be included in the labels.

      These compounds were obtained from external companies and are proprietary, necessitating their anonymization in our study. This has now been added in the caption of (the new) Figure 7:

      “Note that Compound2 and Compound5 are intentionally anonymized.”

      R3.minor4 Table C1: Plate descriptions need to be included.

      *Table C1: The training, validation, and test set stratification for Stain2, Stain3, Stain4, and Stain5. Five training, four validation, and three test plates are used for Stain2, Stain3, and Stain4. Stain5 contains six test set plates only.*

      | Split (Stain2 to Stain4) | Stain2 | Stain3 | Stain4 | Stain5 (test plates only) |
      | --- | --- | --- | --- | --- |
      | Training | BR00113818 | BR00115128 | BR00116627 | BR00120532 |
      | Training | BR00113820 | BR00115125highexp | BR00116631 | BR00120270 |
      | Training | BR00112202 | BR00115133highexp | BR00116625 | BR00120536 |
      | Training | BR00112197binned | BR00115131 | BR00116630highexp | BR00120530 |
      | Training | BR00112198 | BR00115134 | 200922_015124-Vhighexp | BR00120526 |
      | Validation | BR00112197standard | BR00115129 | BR00116628highexp | BR00120274 |
      | Validation | BR00112197repeat | BR00115133 | BR00116629highexp | |
      | Validation | BR00112204 | BR00115128highexp | BR00116627highexp | |
      | Validation | BR00112201 | BR00115127 | BR00116629 | |
      | Test | BR00112199 | BR00115134bin1 | 200922_044247-Vbin1 | |
      | Test | BR00113819 | BR00115134multiplane | 200922_015124-V | |
      | Test | BR00113821 | BR00115126highexp | BR00116633bin1 | |

      We have added a reference to the metadata file in the description of Table C1: https://github.com/carpenter-singh-lab/2023_Cimini_NatureProtocols/blob/main/JUMPExperimentMasterTable.csv

      R3.minor5 Figure F1: Does the green box (stain 3) also involve training on plates from stain 4 (BR00116630highexp) and 5 (BR00120530) mentioned in Table C1? Please check the figure once again for possible errors.

      We have carefully re-examined Figure F1 and Table C1 to ensure their accuracy and consistency. Upon double-checking, we can confirm that the figure is indeed correct. We intentionally omitted the training and validation plates from Figure F1 to maintain clarity and readability, as including them resulted in a cluttered and difficult-to-interpret figure.

      Regarding the specific plates mentioned:

      • BR00116630highexp (Stain4) is used for training, as correctly stated in Table C1. This plate is considered an outlier within the Stain4 dataset and happens to cluster with the Stain3 plates in Figure F1.
      • BR00120530 (Stain5) is part of the test set only and correctly falls within the Stain5 cluster in Figure F1.

      To improve the clarity of the training, validation, and test split in Table C1, we have added a color scheme that visually distinguishes the different data subsets. This should make it easier for readers to understand the distribution of plates across the various splits.
  2. Jul 2024
    1. Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, however, selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with five datasets characterised by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.

      This work has been peer reviewed in GigaScience (see paper), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer name: **Pierre Cauchy**

      Kariotis et al present Omada, a tool dedicated to automated partitioning of large-scale, cohort-based RNA-Sequencing data such as TCGA. A great strength for the manuscript is that it clearly shows that Omada is capable of performing partitioning from PanCan into BRCA, COAD and LUAD (Fig 5), and datasets with no known groups (PAH and GUSTO), which is impressive and novel. I would like to praise the authors for coming up with such a tool, as the lack of a systematic tool dedicated to partitioning TCGA-like expression data is indeed a shortcoming in the field of medical genomics. Overall, I believe the tool will be very valuable to the scientific community and could potentially contribute to meta-analysis of cohort RNA-Seq data. I only have a few comments regarding the methodology and manuscript. I also think that it should be more clearly stated that Omada is dedicated to large datasets (e.g. TCGA) and not differential expression analysis. I would also suggest benchmarking Omada to comparable tools via ROC curves if possible (see below).

      Methods: This section should be a bit more homogeneous between text descriptive and mathematical descriptive. It should specify what parts are automated and what part needs user input and refer to the vignette documentation. I also could not find the Omada github repository.

      Sample and gene expression preprocessing: To me, this section lacks methods/guidelines and only loosely describes the steps involved. "numerical data may need to be normalised in order to account for potential misdirecting quantities" - which kind of normalisation? "As for the number of genes, it is advised for larger genesets (>1000 genes) to filter down to the most variable ones before the application of any function as genes that do not vary across samples do not contribute towards identifying heterogeneity" What filtering is recommended? Top 5% variance? 1%? Based on what metric?
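
      For readers unfamiliar with the kind of filter the reviewer is asking about, one common choice is to rank genes by their variance across samples and keep the top `n_keep`. The sketch below is purely illustrative: the function name, the input layout, and the use of population variance as the metric are assumptions, and it does not describe what Omada itself does.

```python
from statistics import pvariance


def top_variable_genes(expression, n_keep):
    """Return the names of the n_keep genes with the highest variance.

    expression: dict mapping gene name -> list of expression values,
    one value per sample (hypothetical layout for this example).
    """
    # Rank genes from most to least variable across samples.
    ranked = sorted(expression, key=lambda gene: pvariance(expression[gene]), reverse=True)
    return ranked[:n_keep]
```

      A percentage threshold such as the top 5% can be expressed through the same function by setting `n_keep` to the corresponding fraction of the gene count.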
      Determining clustering potential: To me, it was not clear if this is automatically performed by Omada and how the feasibility score is determined.

      Intra-method Clustering Agreement: Is this from normalised data? Because affinity matrix will be greatly affected whether it's normalised or non-normalised data as the matrix of exponential(-normalised gene distance)^2.

      Spectral clustering step 2: "Define D to be the diagonal matrix whose (i, i)-element is the sum of A's i-th row": please also specify that A(i,j) is 0 in this diagonal matrix. Please also confirm which matrix multiplication method is used, product or Cartesian product? Also if there are 0 values, NAs will be obtained in this step.

      Hierarchical clustering step 5: "Repeat Step 3 a total of n − 1 times until there is only one cluster left." This is a valuable addition as this merges identical clusters. The methods should emphasise the benefits of this clustering reduction method in helping to partition data, i.e. that this minimises the number of redundant clusters.

      Stability-based assessment of feature sets: "For each dataset we generate the bootstrap stability for every k within range". Here it should be mentioned that this is carried out by clusterboot, and the full arguments should be given for documentation. "The genes that comprise the dataset with the highest stability are the ones that compose the most appropriate set for the downstream analysis" - is this the single highest or a gene list in the top n datasets? Please specify.

      Choosing k number of clusters: "This approach prevents any bias from specific metrics and frees the user from making decisions on any specific metric and assumptions on the optimal number of clusters." Out of consistency with the cluster reduction method in the "intra-clustering agreement" section which I believe is a novelty introduced by Omada, and within the context of automated analysis, the package should also ideally have an optimized number of k-clusters. K-means clustering analysis is often hindered due to the output often resulting in redundant, practically identical clusters which often requires manual merging. While I do understand the rationale described there and in Table 3, in terms of biological information and especially for deregulated genes analysis (e.g. row z-score clustering), should maximum k also not be determined by the number of conditions, i.e. 2^n, e.g. when n=2, kmax=4; n=3, kmax=8?

      Test datasets and Fig 6: Please expand on how the number of features 300 was determined. While this number of genes corresponds to a high stability index, is this number fixed or can it be dynamically estimated from a selection (e.g. from 100 to 1000)?

      Results: Overall this section is well written and informative. I would just add the following if applicable:

      Figure 3: I think this figure could additionally include benchmarking, ROC curves of Omada vs e.g. previous TCGA clustering analyses (PMID 31805048).

      Figure 4: I think it would be useful to compare Omada results to previous TCGA clustering analyses, e.g. PMID 35664309.

      Figure 6: swap C and D. Why is cluster 5 missing on D)?
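
      The reviewer's remarks on the affinity and degree matrices in spectral clustering can be illustrated with a short sketch. It assumes a Gaussian affinity with a zero diagonal, as in the commonly used Ng, Jordan and Weiss formulation; the `sigma` parameter and both function names are hypothetical, and the manuscript's own definition may differ.

```python
import math


def affinity_matrix(points, sigma=1.0):
    """Gaussian affinity A with A[i][i] = 0 (assumed formulation)."""
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                squared_distance = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                A[i][j] = math.exp(-squared_distance / (2 * sigma ** 2))
    return A


def degree_matrix(A):
    """Diagonal matrix D whose (i, i) element is the sum of row i of A."""
    n = len(A)
    return [[sum(A[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
```

      Because the affinity depends exponentially on the squared distance, rescaling the input (for example, normalised versus raw expression values) changes A and D substantially, which is why the normalisation status of the data matters for this step.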

    1. Today, in order to bridge an emerging chasm, African-American writers may seek to initiate and sustain a greater dialogue between activists and academics. Analyzing the relationship between commentary and organizing strengthens critical writing, research, and activism. Or, as Cornel West notes: "Local activists must become more and more at the center of how we think about the condition for the possibility of social motion and social movement."52 This seems particularly true in interracial rape cases where racism and sexism violently converge and mythology shapes cultural meanings and social and legal prosecution

      !!!!

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02491

      Corresponding author(s): Gilbert, Vassart

      1. General Statements [optional]

      We thank referees 1 and 2 for their in-depth analysis of our manuscript. They see interest in our study, with questions to be answered. Referee 3 is essentially negative, considering that there is nothing new ("novel finding is missing"). We respectfully disagree with him/her, comforted by the opinion of referee 2 that "the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field and ... the manuscript should attract a significant amount of attention in the intestinal field" and we provide evidence in our answers that he/she did not read the manuscript with the same attention as referees 1 and 2 (see in particular answer to his/her question 5).

      Here is a summary of the main reason why we consider that our study represents valuable new information in the field of intestinal regeneration.

      It is based on the serendipitous observation that dissociation of adult intestinal tissue by collagenase generates stably replatable spheroids upon culture in matrigel. Surprisingly, and contrary to canonical EDTA-generated intestinal organoids and fetal spheroids, these spheroids are not traced in Rosa26Tomato mice harboring a VilCre transgene, despite robustly expressing endogenous Villin. Our interpretation is that adult intestinal spheroids originate from a cell lineage, distinct from the main developmental intestinal lineage, in which the VilCre transgene is unexpectedly not expressed, probably due to the absence of cis regulatory sequences required for expression in this lineage.

      Adult spheroid transcriptome shares a gene signature with the YAP/TAZ signature commonly expressed in models of intestinal regeneration. This led us to look for VilCre negative crypts in the regenerating intestine of Lgr5/DTR mice in which Lgr5-positive stem cells have been ablated by diphtheria toxin. Numerous VilCre negative clones were observed, identifying a novel lineage of stem cells implicated in intestinal regeneration.

      FACS purification and scRNAseq analysis of the rare VilCre negative cells present at homeostasis identified a population of cells with characteristics of quiescent stem cells.

      In sum, we believe that our study demonstrates the existence of a hitherto undescribed stem cell lineage involved in intestinal regeneration. It points to the existence of a hierarchical model of intestinal regeneration in addition to the well-accepted plasticity model.

      2. Description of the planned revisions

      See section 3 below.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Here is a point-by-point reply to the queries of the three referees, with indication of the revisions introduced in the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      • *In this manuscript, Marefati et al report an Lgr5-independent lineage in the regenerating intestine using in vitro organoids and in vivo injury-coupled lineage tracing model. In organoids, collagenase/dispase dissociated resulted in "immortal spheroids" that maintain a cystic and undifferentiated phenotype in the absence of standard growth factors (Rspondin/Noggin/EGF). Bulk RNAseq of spheroids demonstrates downregulation of classical CBC signatures and upregulation of fetal spheroid, mesenchymal, inflammation and regenerative signatures. In mice, Villin-Cre lineage tracing revealed some Villin- negative progenies that lack reporter tracing throughout crypt-villus ribbons after injury.

      *The authors proposed that there is Lgr5-independent population support the regenerative response upon CBC depletion. A major caveat of this study is the identification of this population is based on absence of VilCre expression. *

      We respectfully disagree. It is precisely this characteristic that gives our study its interest. Whereas mosaicism of transgene expression is widespread and usually of little significance, our study shows that the rare VilCre-negative cells in the intestinal epithelium do not show this phenotype at random: they specifically give rise to what we call adult spheroids and regenerating crypts, which cannot be due to chance. The absence of VilCre expression allows tracing these cells from the zygote stage of the various VilCre/Rosa26 reporter mice. We have modified our text to emphasize this point.

      *It is surprising that there is no characterisation of Lgr5 expression throughout the manuscript whilst claiming of a Lgr5- independent lineage. *

      We understand the perplexity of the referee not to see direct Lgr5 expression data in our manuscript, given our title. However, our point is that it is the cells at the origin of adult spheroids and the regenerating crypts we have identified that are Lgr5-negative, not the spheroids or the regenerated crypts themselves. Those are downstream offspring that may, and indeed have, gained some Lgr5 expression (e.g. figure 3F). We believe that our data showing that VilCre-negative spheroids are not traced in Lgr5-CreERT2/Rosa reporter mice convincingly demonstrate absence of Lgr5 expression in the cells at the origin of adult spheroids (figure 4G). We think that this experiment is better evidence than attempts to show absence of two markers (Tom and Lgr5) in the rare "white" cells present in the epithelium. Regarding the Lgr5 status of cells at the origin of the regenerating "white" crypts that we have identified, the early appearance of these crypts following ablation of CBC (i.e. Lgr5+ve) cells is a strong argument that they originate from Lgr5-negative cells. Regarding the scRNAseq experiment, Lgr5 transcripts are notoriously low and difficult to measure reliably in CBCs (Haber et al 2017). However, blowing up the pertinent regions of the merged UMAP allows showing some Lgr5 transcripts in clusters 5,6 and none in cluster 1 of figure 8GH. Given the very low level of detection, we had chosen not to include these data in the manuscript, but we hope they may help answer the point of the referee (see portion of UMAP below, with Olfm4 as a control, together with the corresponding violin plot). Several markers that gave significant signals in the CBC cluster (Smoc2, Axin2, Slc12a2) were virtually undetectable in the Olfm4-low /Tom-negative cluster of our scRNAseq data (figure 8I) supporting our conclusion.

      Although the research question is potentially interesting, the concept of epithelial reprogramming upon injury is well documented in the field. The data generated in this manuscript also seem to be preliminary and lack of detailed characterisation. Below are specific comments.

      We do not question the existence of epithelial reprogramming upon injury. We believe our data show, in addition to this well demonstrated phenomenon, the existence of rare cells traced by absence of VilCre expression that are at the origin of a developmental cell lineage distinct from Lgr5+ stem cells and also implicated in regeneration.

      • Expression of Lgr5 should be properly characterised throughout the manuscript in both organoid models and injury-induced regeneration in vivo.

      See above for a detailed answer to this point.

      • An important question is the origin of these "Lgr5-independent" adult spheroids. They look and appear like fetal organoids, which could be induced by injury (e.g. upon collagenase/dispase dissociation). Have the authors tried to culture fetal spheroids in BCM over extensive period of time? Do they behave the same? This would be a great way to directly compare the collagenase/dispase-derived organoids with fetal origin.

      Fetal spheroids require ENR for survival and die in BCM. We have chosen to illustrate this point in Fig2A by showing that, contrary to adult spheroids, they die even when only Rspondin is missing.

      • Fig 1C, Why is the replating spheroid culture time different between mesenchymal cells and conditioned medium?

      We took the earliest time showing convincingly the return to the organoid phenotype. This timing difference does not modify the conclusion that EDTA organoids becoming spheroid-like when exposed to factors originating from mesenchymal cells revert to the organoid phenotype when returned to ENR medium without mesenchymal influence.

      • *It is unclear how the bulk RNA-seq data in Fig. 3 were compared. How long were the adult organoids and spheroids cultured for (how many passages)? Were they cultured in the same condition or were they in ENR vs BCM?*

      Both EDTA organoids and spheroids displaying a stable phenotype were used in this experiment. Organoids were collected at passage 4, day 5; spheroids were collected at passage 9, day 3.

      As stated in the legend to the figure: "...to allow pertinent comparison spheroids and organoids were cultured in the same ENR-containing medium...".

      These are important information to consider when interpreting the results. For instance, are Ptgs1 & Ptgs2 expression in adult spheroids the same in ENR vs BCM? Are the gene signatures (regenerative, fetal and YAP) changed in adult spheroids culturing in ENR vs BCM?

      We did compare bulk RNAseq of EDTA organoids to ENR-cultured spheroids, short term (passage 6, day 6) BCM-cultured spheroids and long term BCM-cultured (passage 26, day 6) spheroids. To avoid overloading the manuscript these data were not shown in the original manuscript. In summary the BCM-cultured spheroids display a similar phenotype as those cultured in ENR, but with further de-differentiation. See in revision plan folder the results for PTGS, some differentiation markers and fetal regenerative markers including YAP induced genes.

      We have included a brief description of these data in the new version of the manuscript and added an additional supplementary file (Suppl table 2) presenting the whole data set.

      • It is stated: "In agreement with their aptitude to grow indefinitely, adult spheroids express a set of upregulated genes overlapping significantly with an "adult tissue stem cell module" [159/721 genes; q value 2.11 e-94) (Fig.S2F)].". What is the definition of "indefinitely"? Are they referring to the Fig 1B where spheroid were passaged to P10? The authors should avoid the term "indefinitely" but use a more specific time scale, e.g. passages, months etc.

      We agree that the term indefinitely should be avoided, as it is vague. We have introduced the maximum number of passages during which we have maintained the stable spheroid phenotype (26 passages). Also worth noting, the spheroids could be frozen and cultured repeatedly over many months.

      SuppFig 3D: Row Z-Score is missing the "e" in Score.

      Corrected

      • Fig 4E: Figure legend says QNRQ instead of CNRQ.

      Corrected

      • Fig 4G: The brightfield image of adult spheroids 5 days after 3x TAM injections doesn't look like a spheroid. It seems to be differentiating.

      True, the choice was not the best as the spheroids started to darken. When further replated, however, the offspring of these spheroids showing a clear phenotype remain negative 30 days after tamoxifen administration as shown on the figure. We are sorry, but for reasons explained in section 4 below, we cannot redo the experiment to get a better picture.

      • Fig 4: Most mouse model data are missing the number of mice & their respective age used for organoid isolation.

      We have introduced these data in the legend.

      • *Fig 4A-D, H-G: How was fluorescent signal of organoids quantified? *

      The settings of fluo imaging or time of LacZ staining were the same for organoids and spheroid pictures. This has been added to the material and methods of the figure and an example is shown below for Rosa26Tomato.

      *How many images?*

      2 per animal per condition.

      *Were there equal numbers of organoids? *

      No; see the total number of elements counted, which has been added to the figure.

      This all needs to be included in methods/figure legends.

      We have introduced additional pertinent information in the material and methods section.

      • Figure 4B-D, G-H: Which culturing conditions were used for adult spheroids? Original method or sandwich method?

      These data were obtained with the original protocol.

      • Fig 6D-E: Please add the timepoint after DT administration these samples are from. It is not listed in text or figure legend.

      These samples were obtained from mice sacrificed at the end of the 5 day period as indicated in panel A. This has been emphasized in the legend of the figure.

      • SuppFig 6D: again timepoint is missing.

      In this experiment all samples were untreated as indicated. This has been emphasized in the legend of the figure.

      • SuppFig 6: How were the crypts of these mice (DT WT & DT HE) isolated? Was this via EDTA?

      This was RNA extracted from total uncultured EDTA-released material (crypts). This has been emphasized in the legend of the figure.

      Also, what is the timepoint for isolation for these samples? Even if untreated, the timepoint adds context to the data. Please add more context to describing these different experiments, either in the figure legends or methods section.

      All these experiments were from 2 month old animals. We have indicated this in the legend of the figure.

      • SuppFig 6E: The quality of the heatmap resolution is too poor to read gene names.

      We have improved the resolution of the figure and hope the gene names are now readable.

      • 5-7, are the regenerating crypt-villus units fully differentiated or are they maintained in the developmental state? Immunostaining of markers for stem cells (Lgr5), differentiated lineages (Alpi, Muc2, Lyz, ChgA etc.) and fetal state (Sca1, Trop2 etc) should be analysed in those "white" unrecombined crypt-villus units.

      The differentiation phenotype is shown by the clear presence of morphologically-identified Paneth and Goblet cells. We agree that specific immunostainings could have been performed to further explore this point. Regarding the fetal state, Clu expression was shown during the regeneration period (see figure 7D,E).

      Unfortunately, for reasons explained in section 4 below, we are not in a position to perform these additional experiments.

      • The following text needs clarification: "The kinetics of appearance of newly formed un-recombined ("white") crypts was studied after a single pulse of DT (Fig.7A). This demonstrated an increase at 48 hours, with further increase at day 10 and stable maintenance at day 30. The presence of newly formed white crypts one month after toxin administration indicates that the VilCre-negative lineage is developmentally stable and does not turn on the transgene during differentiation of the various epithelial lineages occurring after regeneration (Fig.7B).

      *Comment: The "newly formed" is an overstatement, the data doesn't conclude that those are "new" crypts. *

      Except if we do not understand the point, we think we can write that a fraction of "white" crypts must be "newly formed", since they are in excess of those present in untreated animals at the same time point.

      *The end of the sentence states that these "white" crypts form developmentally stable lineages, thus these white crypts at day 30 could originate from the initial injury. *

      As stated above, we consider that crypts found in excess of those present in untreated animals result from the initial injury.

      *There was no characterisation of the various epithelial lineages. Are they fully differentiated?*

      See above the point related to Paneth cells and Goblet cells.

      Is Lgr5 expressed one month after toxin administration? Can the VilCre neg lineage give rise to CBCs?

      We have tried hard to show presence or absence of Lgr5 in white crypts at the various times following DT administration. We tried double RFP / Lgr5-RNA scope labeling and double GFP/RFP immunolabeling. Unfortunately, we could not get these methods to produce convincing specific labeling of CBCs in homeostatic crypts, which explains why we could not reach a conclusion regarding the white crypts.

      However, there is an indirect indication that "chronic" white crypts (i.e. those caused by DTR expression in CBCs, plus those observed 30 days after DT administration) do not express Lgr5. Indeed, acute regeneration, indicated by Clu expression at day 5 in Fig.7C, is lower in white crypts than in red ones, strongly suggesting that white crypts preexisting DT administration (the "chronic" ones) do not express Lgr5DTR.

      The relationship between white crypt generation and appearance of Clu-positive revival cells (Ayyaz et al., 2019) was then explored. In agreement with others and similar to what happens in the irradiation model, (Ayyaz et al., 2019; Yuan et al., 2023) Clu-positive cells were rare in crypts of untreated mice and their number transiently increased forty-eight hours after a single pulse of DT, and more so after three pulses of DT (Fig.7C,D).

      Comment: Comparing 1 pulse at day 2 vs 3 pulses at day 5 makes the data hard to interpret. How is the Clu ISH level for 1 pulse at day 5? Are they equivalent?

      After a single pulse of DT, Clu is only transiently increased. As shown by Ayyaz et al., it is back to the starting point at day 5 (supplementary figure 4 of Ayyaz et al.).

      Clu-positive cells were less frequently observed in white crypts (see "Total" versus "White" in Fig.7C). This fits with the hypothesis that Clu expression marks acutely regenerating crypts and that a proportion of the white crypts are chronically regenerating due to DTR expression in CBCs."

      Comment: I believe the authors suggested that the discrepancy of less Clu expression in white crypts is due to the ectopic expression of DTR in CBCs causing low grade injury without DT administration. This means that some white crypts could have been formed before the administration of DT, and thus are on a different regenerative timeline compared to the white crypts formed from DT administration.

      Yes, this is our interpretation. We have clarified it in the text.

      Is there any proof of the chronic regeneration? Immunostaining of chronic regenerative markers such as Sca1, Anxa1 or Yap1 nuclear localization would support the claim. It'd be important to show only the white crypts, but not the RFP+ ones, show regenerative markers.

      We think that the higher steady-state number of white crypts in untreated Lgr5-DTR animals, compared to wild-type siblings, indicates chronic low-grade regeneration, which is supported by the RNAseq data (Suppl fig6). It must be noted, however, that this phenotype is mild compared to the well-described fetal-like regeneration phenotype seen in most injury models. Since these white crypts were made at undetermined earlier stages, the great majority of them are not expected to show markers of acute regeneration such as Clu or Sca1.

      Fig 7D-E: What are the timepoints of harvest for HE-WT-HE 1 pulse DT mice and HE-HE-HE PBS injected mice?

      We have added this information in the figure.

      Fig 8-9: Regarding the CBC-like Olfm4 low population, what is the status of Lgr5? This should be shown in the figure since the argument is that this is an Lgr5-independent lineage.

      See response to the second point.

      And what about the regenerative, Yap, mesenchymal and inflammatory signatures? Are they enriched in the white crypts similar to the in vitro spheroids?

      In a portion of white crypts, those we believe are newly formed after CBC ablation (see above), there is a transient increase in Clu, which may be considered a marker of Yap activation. In the CBC-like Olfm4 low cells, as seen by scRNAseq, there is nothing like an actively regenerating phenotype. This is expected, since these cells are coming from homeostatic untreated VilCre/Rosa26Tom animals and are supposed to be quiescent "awaiting to be activated".

      Reviewer #1 (Significance (Required)):

      Strengths: The study employed a range of in vitro and in vivo models to test the hypothesis.

      Limitations: Unfortunately, the models chosen did not provide sufficient evidence to draw the conclusions. Injury induced reprogramming, both in vivo and in vitro, has been well documented in the field. The new message here is to show that such reprogrammed state is continuous rather than transient; instead of regenerating Lgr5+ stem cells, it can continue to differentiate to all cell lineages in Lgr5-independent manner.

      We respectfully disagree with this analysis of our results. What we show is not "that such reprogrammed state is continuous rather than transient; instead of regenerating Lgr5+ stem cells, it can continue to differentiate to all cell lineages in Lgr5-independent manner", but that a quiescent stem cell lineage, not previously identified, is activated to regenerate a portion of crypts following CBC ablation. These cells are not reprogrammed; they correspond to a developmental lineage waiting to be activated, and they keep their VilCre-negative state for at least 30 days. We believe that their "by default tracing" (VilCre negative from the zygote stage) is as strong evidence for the existence of such a lineage as positive lineage tracing would be. The increase in crypts originating from this lineage after CBC ablation indicates that it is implicated in regeneration. We do not question the well-demonstrated plasticity-associated reprogramming taking place during regeneration; we simply suggest that this would coexist with the involvement of the quiescent VilCre-negative lineage we have identified.

      However, through the manuscript, there was no immunostaining of Lgr5 and other differentiation markers. The conclusion is an overstatement without solid proof.

      We have provided the best answer we could to this point in our answer to the second question of the referee above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Marefati et al. developed a novel approach to generate spheroids from adult intestinal epithelium using a collagenase/dispase based protocol. Adult spheroids were found to be distinct from classic budding-type organoids normally generated from EDTA based release of the crypt epithelium. Transcriptional profiling indicated that adult spheroids were undifferentiated and similar to regenerating crypts or fetal spheroids. To identify the cell of origin that generates adult spheroids, the authors labelled epithelial cells with VilCreERT-LSL-Tom, VilCre-LSL-GFP and Lgr5CreERT-LSLTom mice. From these experiments the authors conclude that spheroids are only generated from Vil-Cre negative and Lgr5 negative cells. Next the authors deleted the anti-apoptotic gene Mcl1 using Vil-CreERT mice. This led to a strong apoptotic response throughout the crypt epithelium and tissues processed from knockout mice readily generated spheroids, and in vivo, replenishment of the gut epithelium was mediated by unrecombined cells. In a second model, CBCs were ablated using Lgr5DTR mice and VilCre negative cells were found again to contribute to regeneration of the crypt epithelium. Finally based on the absence of Vil-Cre reporter activity, the authors were able to sort out and perform scRNAseq to profile VilCre negative cells. These cells were found to be quiescent, express the stem cell marker Olfm4 and were also abundant in ribosomal gene expression.

      The fact that the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field. Previous reports have shown that treatment with various small molecule inhibitors can revert budding organoids into a spheroid morphology, but this manuscript demonstrates that spheroids can also be generated from otherwise untreated cells. This new methodology will provide new tools to dissect the molecular determinants of fetal/regenerative cells in the gut. Based on this, the manuscript should attract a significant amount of attention in the intestinal field.

      As pointed out by the authors themselves the study has important limitations that diminish enthusiasm. The primary issue relates to the inability of the team to identify markers of VilCre neg cells other than the fact that these cells are Olfm4+ and quiescent. Nonetheless, for the reasons stated above the manuscript should reach the target audience within the research community, if the authors can address the specific points below related to issues with methodology as well as defining more precisely the characteristics and growth requirements of adult spheroid cultures.

      Thank you for this positive analysis of our study.

      Major comments

      The main conclusion of the study is that Vil-Cre neg cells are rare quiescent Olfm4+ crypt cells. If this is the case, then standard EDTA treatment should release these cells as well. Consequently, spheroids should also emerge from isolated crypts grown in the absence of ENR. If this is not the case how do the authors explain this?

      We have tried hard to generate spheroids by culturing EDTA organoids in medium lacking ENR and by treating EDTA organoids with collagenase/dispase, without success. Therefore, we are left with the conclusion that spheroid-generating cells must be more tightly attached to the matrix than those released by EDTA, and that it is their release from this attachment by collagenase that triggers a regeneration-like phenotype. This hypothesis is supported by several models of regeneration in other tissues as indicated in our references (Gilbert et al., 2010; Machado et al., 2021; Montarras et al., 2005).

      From the text the authors appear to suggest that growth of adult spheroids is dependent initially on "material" released by collagenase/dispase treatment. An obvious candidate would be mesenchymal cells, which are known to secrete factors such as Wnts and PGE2 that drive spheroid morphology. To test this, the authors should treat spheroid cultures with Porcupine and/or PGE2 inhibitors.

      We followed similar reasoning, considering that spheroids strongly express Ptgs1 and Ptgs2 (Figure 3A). We thought their phenotype might be maintained by autocrine prostaglandin action. We tested aspirin, a Ptgs inhibitor, which was without effect on the spheroid phenotype. In addition, we explored a wide variety of conditions to test whether they would affect the spheroid phenotype [aspirin (see above), cAMP agonists/antagonists, YapTaz inhibitors (verteporfin and CA3), valproic acid, Notch inhibitors (DAPT, DBZ, LY511455), all-trans retinoic acid, NFkB inhibitors (TCPA, BMS), TGFbeta inhibitor (SB431542)]. As these results were negative, we did not include them in the manuscript.

      If these inhibitors block growth then this would suggest that either stromal cells or autocrine signalling involving these pathways is important. Overall, more in-depth analysis of the growth requirements of adult spheroids is required.

      Figure 1d indicates that adult spheroids can be propagated for at least 10 passages. The abstract mentions they are "immortal". The text itself does not address this issue. More precise information as to how long spheroids can be propagated is required. If these cultures can be propagated for 10 passages or more it becomes important to determine what nutrients/mitogens in the basal media are driving growth? Alternatively, what is the evidence that spheroid cultures are completely devoid of mesenchymal cells. The text only mentions that "Upon replating, these spheroids could be stably cultured free of mesenchymal cells (Fig.1B)". No validation is shown to support this.

      We agree that "immortal" is not a good way to characterize our spheroids, as also pointed out by referee 1. We have changed this in the text, indicating that the maximal number of replatings we tested was 26 and replacing "immortal" by "stably replatable". Of note, the spheroids could be frozen, thawed and recultured many times.

      Related to the question whether mesenchymal cells could still contaminate the spheroid cultures, we can provide the following answers:

      • No fibroblasts could be seen in replated cultures and multiple spheroids could be repeatedly propagated from a single starting spheroid.
      • The bulk RNAseq experiment comparing organoids to ENR- or BCM-cultured spheroids shows, despite expression of several mesenchymal markers (see matrisome in Fig3), absence of significant expression of Pdgfra (see in revision plan folder for CP20Millions results from the raw data of new suppl table 2, with Clu, Tacstd2 and Alpi shown as controls).
      • Regarding the nutrients/mitogens in the medium driving spheroid growth, we did not explore the point further than showing that they grow in basal medium (i.e. advanced DMEM), given that the presence of Matrigel makes it difficult to pinpoint what is really needed.

      In Figure 2, the authors describe the growth requirements for adult spheroids and indicate that spheroids grown in ENR or EN became dark and shrink. The representative images showing this are clear, but this analysis should be quantified.

      Added to the manuscript.

      In SF3, the gene expression profile of organoids from the sandwich method only partially overlaps with that of organoids from the old protocol. What are the gene expression differences between the 2 culture systems? Secondly, the sandwich method appears to sustain growth of Tom+ spheroids based on RNAseq and the IF images. This suggests that Vil-Cre negative cells are not necessarily the only source of adult spheroids and thus this experiment seems to indicate that any cell may be converted to grow as a spheroid under the right conditions. These points should be addressed.

      Looking back at our data in order to answer the point raised by the referee, we realized that we had inadvertently compared organoids to ENR-cultured spheroids generated by the first protocol to BCM-cultured spheroids generated by the sandwich method. We have corrected this error in a new version of suppl fig3. This shows increased correspondence between genes up- or downregulated in the spheroids obtained in the two protocols (from 49/48% to 57/57%; see the Venn diagram in the new figure). We agree that, even after this correction, the spheroids obtained with the two protocols present sizeable differences in their transcriptome. However, considering the very different ways in which these spheroids were obtained and cultured initially, we do not believe this to be unexpected. The important point in our opinion is that the core of the up- and down-regulated genes typical of the de-differentiation phenotype of adult spheroids is very similar, as shown in the heatmap (which was made with the correct samples!). Also, a key observation is that both kinds of spheroids survive and can be replated in basal medium. As already stated, this characteristic is only seen in rare cases [spheroids obtained from rare FACS-purified cells (Smith et al., 2018) or helminth-infected intestinal tissue (Nusse et al., 2018)]. Together with the observation that the majority of them are not traced by VilCre, this constitutes what we consider the hallmark of the spheroids described in our study.

      As shown in figure 4E (old protocol) and Suppl Fig.3 (sandwich protocol), both red and white spheroids were extremely low in VilCre expression. As stated in the text, the fact that some spheroids are nevertheless red is most probably related to the extreme sensitivity of the Rosa26Tom marker to recombination (Liu et al., 2013), but this does not mean that there are two phenotypically different kinds of spheroids. It means that the arbitrary threshold of Rosa26Tom recombination introduces an artificial subdivision of spheroids with no phenotypical significance.

      Regarding the point made by the referee "that any cell may be converted to grow as a spheroid under the right conditions", we agree and have shown, with others, that organoids indeed acquire a spheroid phenotype when cultured, for instance, in fibroblast-conditioned medium (see suppl fig1B and (Lahar et al., 2011; Roulis et al., 2020) quoted in the manuscript). However, these spheroids cannot be propagated in basal medium, and revert to an organoid phenotype when put back in ENR (Suppl fig1B).

      In Figure 4, the authors conclude that spheroids do not originate from Lgr5 cell derived clones even after 30 days post Tam induction. Does this suggest that in vivo and under homeostatic conditions VilCre neg cells are derived from a distinct stem cell pool or are themselves a quiescent stem cell? Given the rarity of VilCre neg cells, the latter seems unlikely.

      Despite their rarity, we believe VilCre-negative cells observed under homeostatic conditions are themselves quiescent stem cells. If they were derived from a larger stem cell pool, this pool should also be VilCre-negative, and we do not see such a larger number of VilCre-neg cells under homeostatic conditions.

      The problem with the original assertion is that Lgr5-CreERT mice are mosaic and therefore not all Lgr5+ cells are labelled in this model. "White" spheroids may thus derive from cells that in turn derive from these unlabelled Lgr5 cells.

      We had considered the possibility that mosaicism [very low for VilCre (Madison et al., 2002); in the 40-50% range for Lgr5CreERT2 (Barker & Clevers. Curr Protoc Stem Cell Biol. 2010 Chapter 5)] could explain our data. We think, however, that we can exclude this possibility on the basis that spheroids do not conform to the expected ratio of unrecombined cells, given the observed level of mosaicism. Indeed, for VilCre, a few percent, at most, of unrecombined cells in the epithelium translates into almost 100% unrecombined spheroids. For Lgr5CreERT2 mice, the mosaicism level is in the range of 40%, which is what we observe for EDTA organoids (Figure 4G), while spheroids were in their vast majority unrecombined.

      We have included a discussion about the possible role of mosaicism in the new version.
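
      The mosaicism argument above can be illustrated with a small calculation. The sketch below is not part of the manuscript's analysis: it assumes that, if spheroids arose at random from epithelial cells, the number of unrecombined spheroids would follow a binomial distribution with the epithelial fraction of unrecombined cells as its parameter. The spheroid counts (50 scored, 48 unrecombined) and the 3% figure standing in for "a few percent" are hypothetical placeholders; only the roughly 40% figure for Lgr5CreERT2 comes from the text.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts, for illustration only (not data from the study).
n_spheroids = 50      # spheroids scored
n_unrecombined = 48   # spheroids found unrecombined ("white")

# Assumed fraction of unrecombined cells in the epithelium for each driver.
# 0.03 stands in for "a few percent" (assumption); 0.40 is the approximate
# Lgr5CreERT2 mosaicism level quoted in the response.
drivers = {"VilCre": 0.03, "Lgr5CreERT2": 0.40}

for label, p_unrecombined in drivers.items():
    expected = n_spheroids * p_unrecombined
    tail = binomial_tail(n_spheroids, n_unrecombined, p_unrecombined)
    print(
        f"{label}: expected {expected:.1f} unrecombined spheroids by chance, "
        f"P(>= {n_unrecombined}) = {tail:.2e}"
    )
```

      Under these assumptions, a vast majority of unrecombined spheroids is far outside what random sampling of a mosaic epithelium would produce, which is the sense in which spheroids "do not conform to the expected ratio".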

      ATACseq experiments were briefly mentioned in the manuscript but unfortunately little information was extracted from this experiment. What does this experiment reveal about the chromatin landscape of adult spheroids relative to normal organoids?

      We only performed this experiment to search for an explanation for the paradoxical absence of expression of the VilCre transgene in spheroids, despite robust expression of endogenous villin (Suppl Fig.4). We chose to show the chromatin landscape of a gene equally expressed in both organoids and spheroids (Krt19), a gene specifically expressed in spheroids (Tacstd2) and the endogenous Villin gene, also expressed in both. We believe that the clear difference in the pattern of chromatin accessibility around the endogenous villin gene in organoids and spheroids provides an explanation for the observed results. The cis regulatory sequences needed for expression of the endogenous villin gene seem to be different in organoids and spheroids, which may explain why the regulatory sequences present in the transgene (only 12.4kb) might not allow expression of the transgene in spheroids. We have added a sentence in the manuscript clarifying this point. What is obviously missing is the chromatin landscape around the VilCre transgene, but this is beyond reach in this kind of experiment.

      Reviewer #2 (Significance (Required)):

      The fact that the authors developed a protocol to reproducibly generate fetal-like spheroids from adult tissue is an important advancement in the field. Previous reports have shown that treatment with various small molecule inhibitors can revert budding organoids into a spheroid morphology, but this manuscript demonstrates that spheroids can also be generated from otherwise untreated cells. This new methodology will provide new tools to dissect the molecular determinants of fetal/regenerative cells in the gut. Based on this, the manuscript should attract a significant amount of attention in the intestinal field.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): CR-2024-02491

      An Lgr5-independent developmental lineage is involved in mouse intestinal regeneration

      Marefati et al.

      Homeostatic maintenance of the intestinal epithelium has long been thought to rely upon Wnt signaling responsive Lgr5-expressing stem cells that reside at the crypt base.

      However, myriad mechanisms or populations have been reported to underlie epithelial regeneration after injury. Many groups have reported that reacquisition of a fetal-like intestinal phenotype is an important part of the regenerative response; however, the originating cell type has not been definitively identified. Herein, the authors demonstrate that cells from adult homeostatic intestine can generate immortal spheroids that resemble fetal spheroids and are derived independent of Lgr5+ intestinal stem cells (ISCs). The authors then draw the conclusion that this indicates that a hierarchical stem cell model applies to regeneration of the intestinal epithelium, in addition to the plasticity model.

      Comments:

      1. Please indicate what species is used for studies in Fig 1.

      All experiments were performed in Mus musculus.

      Please clarify if Figure 2 studies utilize Matrigel or not.

      Yes

      RNA-seq analyses of adult intestinal generated spheroids lack the granularity of single cell analyses and thus it is unclear if this is a homogeneous population or if the population has diversity across it (i.e., enteroids/organoids have a high level of diversity). Many of the conclusions from the RNA-seq study are broad and generalized-for example Fig 3F indicates that markers of the +4 ISC populations (Bmi1, tert, lrig1, hopx) were all expressed similarly in adult spheroids as compared to adult organoids. However, while this may be true in the bulk-RNA-seq analyses, clearly scRNA-seq would provide a better foundation to make this statement, as enteroids/organoids are comprised of heterogeneous subpopulations. . .and it might indicate that these +4 markers have only very low expression in the spheroids. Based upon these concerns, misconclusions are likely to be drawn.

      We agree that scRNAseq of adult spheroid populations would be worthwhile in future studies, to explore their possible heterogeneity. We nevertheless believe that our scRNAseq, performed on homeostatic intestinal tissue from VilCre/Rosa26Tom mice, identifies Olfm4-low VilCre-neg cells that are likely at the origin of adult spheroids and display a quite homogeneous phenotype.

      The language around Figure 4 results is confusing. Please define "white" and "red". It might be simpler to designate recombined versus not recombined lineage.

      We have clarified this in the figure.

      The hypothesis that collagenase/dispase solution acts as a proxy for injury is not demonstrated and backed by data. Thus, it is difficult to make the conclusion that this approach could represent a "stable avatar" of intestinal regenerating cells. It is clear that subpopulations of crypt-based cells generate spheroids in culture without collagenase/dispase (see the cited reference Smith et al, 2018).

      Smith et al. clearly demonstrate the possibility of obtaining spheroids, with properties probably similar to ours, from EDTA-derived intestinal crypt cells. However, they need to prepurify them by FACS. In addition, Nusse et al. describe spheroids similar to ours after infection of the intestine by helminths (Nusse et al., 2018). In our case, and for most labs preparing enteroids with the EDTA protocol, the result is close to 100% organoids. Even if we treat EDTA organoids with collagenase, we do not obtain spheroids. This brought us to the conclusion that spheroid-generating cells must be more tightly attached to the matrix than CBCs and that it is their release from the matrix that activates the spheroid regeneration-like phenotype. This hypothesis is supported by several models of regeneration in other tissues, as indicated in our references (Gilbert et al., 2010; Machado et al., 2021; Montarras et al., 2005).

      A study based on the absence of recombination in a VilCre lineage tracing scenario is not well established as a strong experimental approach, as there are many reasons why cells may not be lineage marked. In order to use this system as the authors intend, they first need to demonstrate that villin is not expressed in the discrete cell population that they are targeting. For the presented observational studies, this would be difficult to do. While they do demonstrate differences in chromatin accessibility between cells from organoids versus spheroids (fig s4), some of these differences could merely be due to the bulk analytical nature of the study and the lack of comparing stem cell populations from spheroids to stem cell populations from organoids, since the spheroids are likely homogenous versus the organoids that only have a small fraction of stem cells and thus represent a mix of stem cell and differentiated cell populations. The authors do not demonstrate that villin protein expression varies in these cells.

      If it were found that villin is not expressed in their "novel" population, then one would expect that the downstream use of villin-based recombination would demonstrate the same recombination potential (i.e., Mcl1 would not be recombined). Both recombination studies in Fig 6 are difficult to interpret, and thus it is not clear if these studies support the stated conclusions. Quantification of number of crypts that are negative should be reported as a percentage of recombined crypts.

      We are sorry, but there seems to be a complete misunderstanding of our data regarding the point raised by the referee. The important point of our initial observation is that, despite robust expression of villin in spheroids, the VilCre transgene is not expressed (see figure 4E). In our opinion, this makes the absence of VilCre expression (or of Rosa marker recombination) a reliable marker of a new developmental lineage. All the data in figure 4 constitute an answer.

      The reasoning about heterogeneity of cell type in organoids versus probable homogeneity of spheroids is well taken. However, as the endogenous villin gene is expressed in all cells of both organoids and spheroids, it is highly significant that only spheroids do not express the transgene.

      We performed the ATACseq experiment to search for an explanation for the paradoxical absence of expression of the VilCre transgene in spheroids, despite robust expression of endogenous villin (Suppl Fig.4). We chose to show the chromatin landscape of a gene equally expressed in both organoids and spheroids (Krt19), a gene specifically expressed in spheroids (Tacstd2) and the endogenous Villin gene, also expressed in both. We believe that the clear difference in the pattern of chromatin accessibility around the endogenous villin gene in organoids and spheroids provides an explanation for the observed results. The cis regulatory sequences needed for expression of the endogenous villin gene seem to be different in organoids and spheroids, which may explain why the regulatory sequences present in the transgene (only 12.4kb) might not allow expression of the transgene in spheroids. We have added a sentence in the manuscript clarifying this point. What is obviously missing is the chromatin landscape around the VilCre transgene, but this is beyond reach in this kind of experiment.

      Figure 8 indicates that the cell population identified by scRNA-seq may be quiescent. Companion IF or IHC should be conducted to confirm this finding, as well as other conclusions from the informatics conducted.

      We agree that additional experiments could be performed to support this point. We are unfortunately not in a position to perform these experiments (see section 4 below).

      Clearly the data is intriguing, however, the conclusion is strong and is an over interpretation of the presented data. There are a number of validation or extension data that would enhance the overall interpretation of the study: 1. validation of scRNA-seq or bulk RNA-seq concepts by protein staining of intestinal tissues in the damage model will serve as a secondary observation. 2. identification of the ISC that they are defining is critical and important. There is already the notion that this cell type exists and it has been shown with various different markers. 3. expand the analyses of the fetal-like expression profiling to injured intestines to demonstrate that the lineage negative cells indeed express fetal-like proteins. 4. expand the discussion of the Clu+ cell type. Is this cell the previously described revival cell? If so, how does this body of work provide unique aspects to the field?

      We agree that all these suggested experiments could be performed and would be of interest. However, we consider that they would not modify the main message of our study and would only constitute an expansion of the present work. As already stated, we are not in the position to perform them (see section 4).

      There is some level of conflicting data, with the stem population being proliferative in culture stimulated by the stromal cells, but quiescent in vivo and also based upon scRNA-seq data in Fig 9.

      We do not see any conflict in our observation regarding this point. The observation that cells that are quiescent in vivo become proliferative when subjected to culture (with or without addition of stromal cells) is routinely made in a multitude of cell culture systems. In particular, it has been shown that intestinal tissue dissociation activates the Yap/Taz pathway, resulting in proliferation (Yu et al. Hippo Pathway Regulation of Gastrointestinal Tissues. Annual Review of Physiology, 2015 Volume 77, 201-227).

      Many of the findings have been previously reported: Population that grows as spheroids (Figure 2), Population that is Wnt independent (Figure 2), Lgr5 independent regenerative growth of the intestine (figure 3F, Figure 4), Clu+ ISCs drive regeneration (Figure 7).

      Whereas these individual findings have indeed been reported, it was in a different context. We strongly disagree with the underlying suggestion that our study would not bring new information. We have identified here a developmental lineage involved in intestinal regeneration that has not been described up to now.

      Minor comments:

        • The statement that spheroids must originate from collagenase/dispase digested material might be an overstatement. As spheroids generation from EDTA treated intestines have been previously reported (Smith et al, 2018).

      See answer to point 4 above.

      Overall while the study includes an extensive amount of work and different approaches, a foundationally supported novel finding is missing. Many of the statements have already been demonstrated by others in the fields. In addition, one of the most intriguing aspects of the study is that the stromal population impacts this stem cell population, however, interactions and factors stimulating the crosstalk are not addressed.

      Reviewer #3 (Significance (Required)):

      Overall while the study includes an extensive amount of work and different approaches, a foundationally supported novel finding is missing. Many of the statements have already been demonstrated by others in the fields. In addition, one of the most intriguing aspects of the study is that the stromal population impacts this stem cell population, however, interactions and factors stimulating the crosstalk are not addressed.

      We can only disagree.

      4. Description of analyses that authors prefer not to carry out

      We have answered most questions raised by the referees by explaining our view, by clarifying individual points and, in several cases, by providing additional information that was not included in the original manuscript.

      In a limited number of cases when additional experiments were suggested, we were unfortunately obliged to write that we are not in a position to perform them. This is because my lab is closing after more than fifty years of uninterrupted activity. There will unfortunately be nobody to perform additional experiments.

      Nevertheless, as written by referees 1 and 2, we believe that the revised manuscript, as it stands, contains data that will be of interest to people in the field and may be the basis for future developments. We hope the editors will find interest in publishing it.

    1. Reviewer #2 (Public Review):

      Summary:

      Schmidlin & Apodaca et al. aim to distinguish mutants that resist drugs via different mechanisms by examining fitness tradeoffs across hundreds of fluconazole-resistant yeast strains. They barcoded a collection of fluconazole-resistant isolates and evolved them in different environments with a view to having relevance for evolutionary theory, medicine, and genotype-phenotype mapping.

      Strengths:

      There are multiple strengths to this paper, the first of which is pointing out how much work has gone into it; the quality of the experiments (the thought process, the data, the figures) is excellent. Here, the authors seek to induce mutations in multiple environments, which is a really large-scale task. I particularly like the attention paid to isolates that are resistant to low concentrations of FLU. So often these are overlooked in favour of those conferring MIC values >64/128 etc. What was seen is different genotype and fitness profiles. I think there's a wealth of information here that will actually be of interest to more than just the fields mentioned (evolutionary medicine/theory).

      Weaknesses:

      Not picking up low fitness lineages - which the authors discuss and provide a rationale as to why. I can completely see how this has occurred during this research, and whilst it is a shame I do not think this takes away from the findings of this paper. Maybe in the next one!

      In the abstract the authors focus on 'tradeoffs' yet in the discussion they say the purpose of the study is to see how many different mechanisms of FLU resistance may exist (lines 679-680), followed up by "We distinguish mutants that likely act via different mechanisms by identifying those with different fitness tradeoffs across 12 environments". Whilst I do see their point, and this is entirely feasible, I would like a bit more explanation around this (perhaps in the intro) to help lay-readers make this jump. The remainder of my comments on 'weaknesses' are relatively fixable, I think:

      In the introduction I struggle to see how this body of research fits in with the current literature, as the literature cited is a hodge-podge of bacterial and fungal evolution studies, which are very different! For example, the authors state "previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms" (lines 129-131) and then cite three papers, only one of which is a fungal research output. However, the next sentence focuses solely on literature from fungal research. Citing bacterial work as a foundation is fine, but as you're using yeast for this I think tailoring the introduction more to what is and isn't known in fungi would be more appropriate. It would also be great to then circle back around and mention monotherapy vs combination drug therapy for fungal infections as a rationale for this study. The study seems to be focused on FLU-resistant mutants, which is the first-line drug of choice, but many (yeast) infections have acquired resistance to this and combination therapy is the norm.

      Methods: Line 769 - which yeast? I haven't even seen mention of which species is being used in this study; different yeast employ different mechanisms of adaptation for resistance, so could greatly impact the results seen. This could help with some background context if the species is mentioned (although I assume S. cerevisiae). In which case, should aneuploidy be considered as a mechanism? This is mentioned briefly on line 556, but with all the sequencing data acquired this could be checked quickly?

      I think the authors could be bolder and try and link this to other (pathogenic) yeasts. What are the implications of this work on say, Candida infections?

    1. Author response:

      We thank you for the opportunity to provide a concise response. The criticisms are accurately summarized in the eLife assessment:

      the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      The essence of our study is to propose the adoption of the Haldane model of genetic drift, based on the branching process, in lieu of the Wright-Fisher (WF) model, based on sampling, usually binomial.  In addition to some extensions of the Haldane model, we present 4 paradoxes that cannot be resolved by the WF model. The reviews suggest that some of the paradoxes could be resolved by the WF model, if we engage prior literature sufficiently.

      We certainly could not review all the literature on genetic drift, as there must be thousands of papers. Nevertheless, the literature we do not cover is based on the WF model, which has the general properties that all modifications of the WF model share. (We should note that all such modifications share the sampling aspect of the WF model. To model such sampling, N is imposed from outside of the model, rather than self-generating within the model. Most importantly, these modifications are mathematically valid but biologically untenable, as will be elaborated below. Thus, in concept, the WF and Haldane models are fundamentally different.)

      In short, our proposal is general with the key point that the WF model cannot resolve these (and many other) paradoxes.  The reviewers disagree (apparently only partially) and we shall be specific in our response below.

      We shall first present the 4th paradox, which is about multi-copy gene systems (such as rRNA genes and viruses, see the companion paper). Viruses evolve both within and between hosts. In both stages, there are severe bottlenecks. How does one address the genetic drift in viral evolution? How can we model the effective population sizes both within and between hosts? The inability of the WF model to deal with such multi-copy gene systems may explain the difficulties in accounting for the SARS-CoV-2 evolution. Given the small number of virions transmitted between hosts, drift is strong, as we have shown by using the Haldane model (Ruan, Luo, et al. 2021; Ruan, Wen, et al. 2021; Hou, et al. 2023).

      As the reviewers suggest, it is possible to modify the WF model to account for some of these paradoxes. However, the modifications are often mathematically convenient but biologically dubious. Much of the debate is about the progeny number, K. (We shall use a haploid model for this purpose, but diploidy does not pose a problem, as stated in the main text.) The modifications relax the constraint of V(k) = E(k) inherent in the WF sampling. One would then ask how V(k) can be different from E(k) in the WF sampling even though it is mathematically feasible (but biologically dubious)? Kimura and Crow (1963) may be the first to offer a biological explanation. If one reads it carefully, Kimura's modification is to make the WF model like the Haldane model. Then, why don't we use the Haldane model in the first place by having two parameters, E(k) and V(k), instead of the one-parameter WF model?
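      The constraint V(k) = E(k) mentioned above can be checked numerically. The following is a minimal sketch, not taken from the manuscript, which assumes the textbook haploid WF scheme in which each of the N offspring picks its parent uniformly at random, so that one parent's offspring number is Binomial(N, 1/N). The function name `wf_offspring_moments` is hypothetical.

```python
def wf_offspring_moments(n: int) -> tuple[float, float]:
    """Exact mean and variance of one parent's offspring number K under
    Wright-Fisher sampling, assuming K ~ Binomial(n, 1/n)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    p = 1.0 / n
    ratio = p / (1.0 - p)
    prob = (1.0 - p) ** n          # P(K = 0)
    mean = 0.0
    second_moment = 0.0
    for k in range(n + 1):
        mean += k * prob
        second_moment += k * k * prob
        # Binomial recurrence: P(K = k + 1) from P(K = k)
        prob *= (n - k) / (k + 1) * ratio
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        mean, var = wf_offspring_moments(n)
        print(f"N={n}: E(k)={mean:.6f}  V(k)={var:.6f}")
```

      Under this assumption E(k) = 1 and V(k) = 1 - 1/N, so the two agree up to a term of order 1/N, which is the one-parameter behaviour that the response contrasts with the two-parameter Haldane model.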

      The Haldane model is conceptually simpler. It allows the variation in population size, N, to be generated from within the model, rather than artificially imposed from outside of the model.  This brings us to the first paradox, the density-dependent Haldane model. When N is increasing exponentially as in bacterial or yeast cultures, there is almost no drift when N is very low and drift becomes intense as N grows to near the carrying capacity.  We do not see how the WF model can resolve this paradox, which can otherwise be resolved by the Haldane model.

      The second and third paradoxes are about how much mathematical models of population genetics can be detached from biological mechanisms. The second paradox about sex chromosomes is rooted in the realization of V(k) ≠ E(k). Since E(k) is the same between sexes but V(k) is different, how does the WF sampling give rise to V(k) ≠ E(k)? We are asking a biological question that troubled Kimura and Crow (1963), as alluded to above. The third paradox is acknowledged by two reviewers. Genetic drift manifested in the fixation probability of an advantageous mutation is 2s/V(k). It is thus strange that the fundamental parameter of drift in the WF model, N (or Ne), is missing. In the Haldane model, drift is determined by V(k) with N being a scaling factor; hence 2s/V(k) makes perfect biological sense.
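      The 2s/V(k) expression can be compared with an exact branching-process calculation. The sketch below is an illustration under stated assumptions rather than the authors' method: it treats "fixation" as long-term survival of a single mutant lineage in a supercritical branching process with E(k) = 1 + s, and obtains the extinction probability as the smallest fixed point of the offspring probability generating function. The two offspring distributions (Poisson and geometric) and all function names are choices made here for illustration.

```python
import math


def survival_probability(pgf, tol=1e-14, max_iter=1_000_000):
    """Survival probability of a branching process started from one
    individual. The extinction probability q is the smallest root of
    q = pgf(q), reached by iterating from q = 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return 1.0 - q


def poisson_pgf(mean):
    # Poisson offspring: variance equals the mean
    return lambda z: math.exp(mean * (z - 1.0))


def geometric_pgf(mean):
    # Geometric offspring on {0, 1, 2, ...}: variance = mean + mean**2
    return lambda z: 1.0 / (1.0 + mean * (1.0 - z))


def compare(s):
    """Return {name: (exact survival, 2s/V(k))} for both distributions."""
    mean = 1.0 + s
    cases = {
        "poisson": (poisson_pgf(mean), mean),
        "geometric": (geometric_pgf(mean), mean + mean ** 2),
    }
    return {
        name: (survival_probability(pgf), 2.0 * s / var)
        for name, (pgf, var) in cases.items()
    }


if __name__ == "__main__":
    for name, (exact, approx) in compare(0.01).items():
        print(f"{name}: exact={exact:.6f}  2s/V(k)={approx:.6f}")
```

      For small s the exact survival probability and 2s/V(k) are close in both cases, and roughly doubling V(k) roughly halves the survival probability, with no N appearing anywhere in the calculation.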

      We now answer the obvious question: If the model is fundamentally about the Haldane model, why do we call it the WF-Haldane model? The reason is that most results obtained by the WF model are pretty good approximations and the branching process may not need to constantly re-derive the results.  At least, one can use the WF results to see how well they fit into the Haldane model. In our earlier study (Chen, et al. (2017); Fig. 3), we show that the approximations can be very good in many (or most) settings.

      We would like to use the modern analogy of gas-engine cars vs. electric-motor ones. The Haldane model and the WF model are as fundamentally different concepts as the driving mechanisms of gas-powered vs electric cars.  The old model is now facing many problems and the fixes are often not possible.  Some fixes are so complicated that one starts thinking about simpler solutions. The reservations are that we have invested so much in the old models which might be wasted by the switch. However, we are suggesting the integration of the WF and Haldane models. In this sense, the WF model has had many contributions which the new model gratefully inherits. This is true with the legacy of gas-engine cars inherited by EVs.

      The editors also issue the instruction: while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      We are thankful to the editors and reviewers for the thoughtful comments and constructive criticisms. We also appreciate the publishing philosophy of eLife that allows exchanges, debates and improvements, which are the true spirits of science publishing.

      References for the provisional author responses

      Chen Y, Tong D, Wu CI. 2017. A New Formulation of Random Genetic Drift and Its Application to the Evolution of Cell Populations. Mol. Biol. Evol. 34:2057-2064.

      Hou M, Shi J, Gong Z, Wen H, Lan Y, Deng X, Fan Q, Li J, Jiang M, Tang X, et al. 2023. Intra- vs. Interhost Evolution of SARS-CoV-2 Driven by Uncorrelated Selection-The Evolution Thwarted. Mol. Biol. Evol. 40.

      Kimura M, Crow JF. 1963. The measurement of effective population number. Evolution:279-288.

      Ruan Y, Luo Z, Tang X, Li G, Wen H, He X, Lu X, Lu J, Wu CI. 2021. On the founder effect in COVID-19 outbreaks: how many infected travelers may have started them all? Natl. Sci. Rev. 8:nwaa246.

      Ruan Y, Wen H, He X, Wu CI. 2021. A theoretical exploration of the origin and early evolution of a pandemic. Sci Bull (Beijing) 66:1022-1029.

      Review comments

      eLife assessment 

      This study presents a useful modification of a standard model of genetic drift by incorporating variance in offspring numbers, claiming to address several paradoxes in molecular evolution.

      It is unfortunate that the study fails to engage prior literature that has extensively examined the impact of variance in offspring number, implying that some of the paradoxes presented might be resolved within existing frameworks.

      We do not believe that the paradoxes can be resolved.

      In addition, while the modified model yields intriguing theoretical predictions, the simulations and empirical analyses are incomplete to support the authors' claims. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors present a theoretical treatment of what they term the "Wright-Fisher-Haldane" model, a claimed modification of the standard model of genetic drift that accounts for variability in offspring number, and argue that it resolves a number of paradoxes in molecular evolution. Ultimately, I found this manuscript quite strange.

      The notion of effective population size as inversely related to the variance in offspring number is well known in the literature, and not exclusive to Haldane's branching process treatment. However, I found the authors' point about variance in offspring changing over the course of, e.g. exponential growth fairly interesting, and I'm not sure I'd seen that pointed out before.

      Nonetheless, I don't think the authors' modeling, simulations, or empirical data analysis are sufficient to justify their claims. 

      Weaknesses: 

      I have several outstanding issues. First of all, the authors really do not engage with the literature regarding different notions of an effective population. Most strikingly, the authors don't talk about Cannings models at all, which are a broad class of models with non-Poisson offspring distributions that nonetheless converge to the standard Wright-Fisher diffusion under many circumstances, and to "jumpy" diffusions/coalescents otherwise (see e.g. Mohle 1998, Sagitov (2003), Der et al (2011), etc.). Moreover, there is extensive literature on effective population sizes in populations whose sizes vary with time, such as Sano et al (2004) and Sjodin et al (2005).

      Of course in many cases here the discussion is under neutrality, but it seems like the authors really need to engage with this literature more. 

      The most interesting part of the manuscript, I think, is the discussion of the Density Dependent Haldane model (DDH). However, I feel like I did not fully understand some of the derivation presented in this section, which might be my own fault. For instance, I can't tell if Equation 5 is a result or an assumption - when I attempted a naive derivation of Equation 5, I obtained E(K_t) = 1 + r/c*(c-n)*dt. It's unclear where the parameter z comes from, for example. Similarly, is equation 6 a derivation or an assumption? Finally, I'm not 100% sure how to interpret equation 7. Is that a variance effective size at time t? Is it possible to obtain something like a coalescent Ne or an expected number of segregating sites or something from this? 

      Similarly, I don't understand their simulations. I expected that the authors would do individual-based simulations under a stochastic model of logistic growth, and show that you naturally get variance in offspring number that changes over time. But it seems that they simply used their equations 5 and 6 to fix those values. Moreover, I don't understand how they enforce population regulation in their simulations---is N_t random and determined by the (independent) draws from K_t for each individual? In that case, there's no "interaction" between individuals (except abstractly, since logistic growth arises from a model that assumes interactions between individuals). This seems problematic for their model, which is essentially motivated by the fact that early during logistic growth, there are basically no interactions, and later there are, which increases variance in reproduction. But their simulations assume no interactions throughout! 

      The authors also attempt to show that changing variance in reproductive success occurs naturally during exponential growth using a yeast experiment. However, the authors are not counting the offspring of individual yeast during growth (which I'm sure is quite hard). Instead, they use an equation that estimates the variance in offspring number based on the observed population size, as shown in the section "Estimation of V(K) and E(K) in yeast cells". This is fairly clever, however, I am not sure it is right, because the authors neglect covariance in offspring between individuals. My attempt at this derivation assumes that I_t | I_{t-1} = \sum_{i=1}^{I_{t-1}} K_{i,t-1} where K_{i,t-1} is the number of offspring of individual i at time t-1. Then, for example, E(V(I_t | I_{t-1})) = E(V(\sum_{i=1}^{I_{t-1}} K_{i,t-1})) = E(I_{t-1})V(K_{t-1}) + E(I_{t-1}(I_{t-1}-1))*Cov(K_{i,t-1},K_{j,t-1}). The authors have the first term, but not the second, and I'm not sure the second can be neglected (in fact, I believe it's the second term that's actually important, as early on during growth there is very little covariance because resources aren't constrained, but at carrying capacity, an individual having offspring means that another individual has to have fewer offspring - this is the whole notion of exchangeability, also neglected in this manuscript). As such, I don't believe that their analysis of the empirical data supports their claim. 
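      The two-term decomposition in the paragraph above follows from the variance of a sum of exchangeable random variables. The worked step below is a sketch under assumptions that are not stated in the manuscript: conditional on I_{t-1} = n, the offspring numbers share a common variance V(K) and a common pairwise covariance C, and neither depends on n.

```latex
% Assumptions (illustrative): K_1, ..., K_n exchangeable,
% Var(K_i) = V(K), Cov(K_i, K_j) = C for i \neq j, and n \geq 2.
\begin{align}
\mathrm{Var}\!\left(I_t \mid I_{t-1} = n\right)
  &= \mathrm{Var}\!\left(\sum_{i=1}^{n} K_i\right) \\
  &= \sum_{i=1}^{n} \mathrm{Var}(K_i)
     + \sum_{i \neq j} \mathrm{Cov}(K_i, K_j) \\
  &= n\,V(K) + n(n-1)\,C .
\end{align}
% Averaging over I_{t-1} gives the expression in the text:
\begin{equation}
\mathrm{E}\!\left[\mathrm{Var}(I_t \mid I_{t-1})\right]
  = \mathrm{E}[I_{t-1}]\,V(K)
  + \mathrm{E}\!\left[I_{t-1}(I_{t-1}-1)\right] C .
\end{equation}
% Limiting case: if the population is held exactly at size n, the sum is
% constant, its variance is zero, and therefore
\begin{equation}
C = -\frac{V(K)}{n-1} .
\end{equation}
```

      The limiting case illustrates why the covariance term matters near carrying capacity: with the total held fixed, the negative covariance exactly cancels the n V(K) term, whereas with unconstrained growth C is close to zero and only the first term remains.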

      Thus, while I think there are some interesting ideas in this manuscript, I believe it has some fundamental issues:

      first, it fails to engage thoroughly with the literature on a very important topic that has been studied extensively. Second, I do not believe their simulations are appropriate to show what they want to show. And finally, I don't think their empirical analysis shows what they want to show. 

      References: 

      Möhle M. Robustness results for the coalescent. Journal of Applied Probability. 1998;35(2):438-447. doi:10.1239/jap/1032192859 

      Sagitov S. Convergence to the coalescent with simultaneous multiple mergers. Journal of Applied Probability. 2003;40(4):839-854. doi:10.1239/jap/1067436085 

      Der, Ricky, Charles L. Epstein, and Joshua B. Plotkin. "Generalized population models and the nature of genetic drift." Theoretical population biology 80.2 (2011): 80-99 

      Sano, Akinori, Akinobu Shimizu, and Masaru Iizuka. "Coalescent process with fluctuating population size and its effective size." Theoretical population biology 65.1 (2004): 39-48 

      Sjodin, P., et al. "On the meaning and existence of an effective population size." Genetics 169.2 (2005): 1061-1070 

      Reviewer #2 (Public Review): 

      Summary: 

      This theoretical paper examines genetic drift in scenarios deviating from the standard Wright-Fisher model. The authors discuss Haldane's branching process model, highlighting that the variance in reproductive success equates to genetic drift. By integrating the Wright-Fisher model with the Haldane model, the authors derive theoretical results that resolve paradoxes related to effective population size. 

      Strengths: 

      The most significant and compelling result from this paper is perhaps that the probability of fixing a new beneficial mutation is 2s/V(K). This is an intriguing and potentially generalizable discovery that could be applied to many different study systems. 

      The authors also made a lot of effort to connect theory with various real-world examples, such as genetic diversity in sex chromosomes and reproductive variance across different species. 

      Weaknesses: 

      One way to define effective population size is by the inverse of the coalescent rate. This is where the geometric mean of Ne comes from. If Ne is defined this way, many of the paradoxes mentioned seem to resolve naturally. If we take this approach, one could easily show that a large N population can still have a low coalescent rate depending on the reproduction model. However, the authors did not discuss Ne in light of the coalescent theory. This is surprising given that Eldon and Wakeley's 2006 paper is cited in the introduction, and the multiple mergers coalescent was introduced to explain the discrepancy between census size and effective population size, superspreaders, and reproduction variance - that said, there is no explicit discussion or introduction of the multiple mergers coalescent. 

      The Wright-Fisher model is often treated as a special case of the Cannings 1974 model, which incorporates the variance in reproductive success. This model should be discussed. It is unclear to me whether the results here have to be explained by the newly introduced WFH model, or could have been explained by the existing Cannings model. 

      The abstract makes it difficult to discern the main focus of the paper. It spends most of the space introducing "paradoxes". 

      The standard Wright-Fisher model makes several assumptions, including hermaphroditism, non-overlapping generations, random mating, and no selection. It will be more helpful to clarify which assumptions are being violated in each tested scenario, as V(K) is often not the only assumption being violated. For example, the logistic growth model assumes no cell death at the exponential growth phase, so it also violates the assumption about non-overlapping generations. 

      The theory and data regarding sex chromosomes do not align. The fact that \hat{alpha'} can be negative does not make sense. The authors claim that a negative \hat{alpha'} is equivalent to infinity, but why is that? It is also unclear how theta is defined. It seems to me that one should take the first principle approach e.g., define theta as pairwise genetic diversity, and start with deriving the expected pair-wise coalescence time under the MMC model, rather than starting with assuming theta = 4Neu. Overall, the theory in this section is not well supported by the data, and the explanation is insufficient. 

      {Alpha and alpha' can both be negative.  X^2 = 0.47 would yield x = -0.7}

      Reviewer #3 (Public Review): 

      Summary: 

      Ruan and colleagues consider a branching process model (in their terminology the "Haldane model") and the most basic Wright-Fisher model. They convincingly show that offspring distributions are usually non-Poissonian (as opposed to what's assumed in the Wright-Fisher model), and can depend on short-term ecological dynamics (e.g., variance in offspring number may be smaller during exponential growth). The authors discuss branching processes and the Wright-Fisher model in the context of 3 "paradoxes": (1) how Ne depends on N might depend on population dynamics; (2) how Ne is different on the X chromosome, the Y chromosome, and the autosomes, and these differences do not match the expectations based on simple counts of the number of chromosomes in the populations; (3) how genetic drift interacts with selection. The authors provide some theoretical explanations for the role of variance in the offspring distribution in each of these three paradoxes. They also perform some experiments to directly measure the variance in offspring number, as well as perform some analyses of published data. 

      Strengths: 

      (1) The theoretical results are well-described and easy to follow. 

      (2) The analyses of different variances in offspring number (both experimentally and analyzing public data) are convincing that non-Poissonian offspring distributions are the norm. 

      (3) The point that this variance can change as the population size (or population dynamics) change is also very interesting and important to keep in mind. 

      (4) I enjoyed the Density-Dependent Haldane model. It was a nice example of the decoupling of census size and effective size. 

      Weaknesses: 

      (1) I am not convinced that these types of effects cannot just be absorbed into some time-varying Ne and still be well-modeled by the Wright-Fisher process. 

      (2) Along these lines, there is well-established literature showing that a broad class of processes (a large subset of Cannings' Exchangeable Models) converge to the Wright-Fisher diffusion, even those with non-Poissonian offspring distributions (e.g., Mohle and Sagitov 2001). E.g., equation (4) in Mohle and Sagitov 2001 shows that in such cases the "coalescent Ne" should be (N-1) / Var(K), essentially matching equation (3) in the present paper. 

      (3) Beyond this, I would imagine that branching processes with heavy-tailed offspring distributions could result in deviations that are not well captured by the authors' WFH model. In this case, the processes are known to converge (backward-in-time) to Lambda or Xi coalescents (e.g., Eldon and Wakeley 2006 or again in Mohle and Sagitov 2001 and subsequent papers), which have well-defined forward-in-time processes. 

      (4) These results that Ne in the Wright-Fisher process might not be related to N in any straightforward (or even one-to-one) way are well-known (e.g., Neher and Hallatschek 2012; Spence, Kamm, and Song 2016; Matuszewski, Hildebrandt, Achaz, and Jensen 2018; Rice, Novembre, and Desai 2018; the work of Lounès Chikhi on how Ne can be affected by population structure; etc...) 

      (5) I was also missing some discussion of the relationship between the branching process and the Wright-Fisher model (or more generally Cannings' Exchangeable Models) when conditioning on the total population size. In particular, if the offspring distribution is Poisson, then conditioned on the total population size, the branching process is identical to the Wright-Fisher model. 

      (6) In the discussion, it is claimed that the last glacial maximum could have caused the bottleneck observed in human populations currently residing outside of Africa. Compelling evidence has been amassed that this bottleneck is due to serial founder events associated with the out-of-Africa migration (see e.g., Henn, Cavalli-Sforza, and Feldman 2012 for an older review - subsequent work has only strengthened this view). For me, a more compelling example of changes in carrying capacity would be the advent of agriculture ~11kya and other more recent technological advances. 
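      The statement in point (5), that a Poisson branching process conditioned on the total population size is the Wright-Fisher model, can be written out in a few lines. This is a standard textbook step added here as a sketch; the symbol λ for the Poisson mean is introduced for illustration and does not appear in the review.

```latex
% Assumption (illustrative): K_1, ..., K_N are i.i.d. Poisson(\lambda),
% so that \sum_i K_i is Poisson(N\lambda).
\begin{align}
\Pr\!\left(K_1 = k_1, \dots, K_N = k_N \,\middle|\, \sum_{i=1}^{N} K_i = N\right)
  &= \frac{\prod_{i=1}^{N} e^{-\lambda} \lambda^{k_i} / k_i!}
          {e^{-N\lambda} (N\lambda)^{N} / N!} \\
  &= \frac{N!}{\prod_{i=1}^{N} k_i!} \left(\frac{1}{N}\right)^{N},
  \qquad \sum_{i=1}^{N} k_i = N .
\end{align}
% This is the multinomial distribution with N trials and equal cell
% probabilities 1/N, i.e. Wright-Fisher sampling; \lambda cancels.
% Each K_i is then Binomial(N, 1/N), so
\begin{equation}
\mathrm{Var}(K) = 1 - \frac{1}{N} = \frac{N-1}{N},
\qquad
\frac{N-1}{\mathrm{Var}(K)} = N .
\end{equation}
```

      The last line connects to point (2): for Wright-Fisher sampling the expression (N-1)/Var(K) returns the census size N, and any offspring distribution with larger variance gives a smaller value.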

      Recommendations for the authors: 

      Reviewing Editor Comments: 

      The reviewers recognize the value of this model and some of the findings, particularly results from the density-dependent Haldane model. However, they expressed considerable concerns with the model and overall framing of this manuscript.

      First, all reviewers pointed out that the manuscript does not sufficiently engage with the extensive literature on various models of effective population size and genetic drift, notably lacking discussion on Cannings models and related works.

      Second, there is a disproportionate discussion on the paradoxes, yet some of the paradoxes might already be resolved within current theoretical frameworks. All three reviewers found the modeling and simulation of the yeast growth experiment hard to follow or lacking justification for certain choices. The analysis approach of sex chromosomes is also questioned. 

      The reviewers recommend a more thorough review of relevant prior literature to better contextualize their findings. The authors need to clarify and/or modify their derivations and simulations of the yeast growth experiment to address the identified caveats and ensure robustness. Additionally, the empirical analysis of the sex chromosome should be revisited, considering alternative scenarios rather than relying solely on the MSE, which only provides a superficial solution. Furthermore, the manuscript's overall framing should be adjusted to emphasize the conclusions drawn from the WFH model, rather than focusing on the "unresolved paradoxes", as some of these may be more readily explained by existing frameworks. Please see the reviewers' overall assessment and specific comments. 

      Reviewer #2 (Recommendations For The Authors): 

      In the introduction -- "Genetic drift is simply V(K)" -- this is a very strong statement. You can say it is inversely proportional to V(K), but drift is often defined based on changes in allele frequency. 

      Page 3 line 86. "sexes is a sufficient explanation."--> "sex could be a sufficient explanation" 

      The strongest line of new results is about 2s/V(K). Perhaps, the paper could put more emphasis on this part and demonstrate the generality of this result with a different example. 

      The math notations in the supplement are not intuitive, e.g., using i_k and j_k as probabilities. I also recommend using E[X] and V[X] for expectation and variance rather than \italic{E(X)} to improve the readability of many equations. 

      Eq A6, A7, While I manage to follow, P_{10}(t) and P_{10} are not defined anywhere in the text. 

      Supplement page 7, the term "probability of fixation" is confusing in a branching model. 

      Eq. A28. It is unclear whether Eq. A1 could be used here directly. Some justification would be nice. 

      Supplement page 17. "the biological meaning of negative..". There is no clear justification for this claim. As a reader, I don't have any intuition as to why that is the case.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment:

      Franke et al. explore and characterize the color response properties in the mouse primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The data is solid; however, the evidence supporting some conclusions is incomplete. In its current form, the paper makes a useful contribution to how color is coded in mouse V1. Significance would be enhanced with some additional analyses and a clearer discussion of the limitations of the data presented.

      We thank the reviewers for appreciating our manuscript. We have rewritten the conclusions of the paper to be more conservative and now more explicitly focus on color processing in mouse V1, rather than comparing V1 to the retina. Additionally, we discuss the limitations of our approach in detail in the Discussion section. Finally, we have addressed all comments from the reviewers below.

      Referee 1 (Remarks to the Author):

      In this study, Franke et al. explore and characterize color response properties across primary visual cortex, revealing specific color opponent encoding strategies across the visual field. The authors use awake 2P imaging to define the spectral response properties of visual interneurons in layer 2/3. They find that opponent responses are more pronounced at photopic light levels, and that diversity in color opponent responses exists across the visual field, with green ON/ UV OFF responses more strongly represented in the upper visual field. This is argued to be relevant for the detection of certain features that are more salient when using chromatic space, possibly due to noise reduction. In the revised version, Franke et al. have addressed the potential pitfalls in the discussion, which is an important point for the non-expert reader. Thus, this study provides a solid characterization of the color properties of V1 and is a valuable addition to visual neuroscience research.

      My remaining concerns are based more on the interpretation. I’m still not convinced by the statement "This type of color-opponency in the receptive field center of V1 neurons was not present in the receptive field center of retinal ganglion cells and, therefore, is likely computed by integrating center and surround information downstream of the retina." and I would suggest rewording it in the abstract.

      As discussed previously and now nicely added to the discussion, it is difficult to make a direct comparison given the different stimulus types used to characterize the retina and V1 recordings and the different levels of adaptation in both tissues. I will leave this point to the discussion, which allows for a more nuanced description of the phenomenon. Why do I think this is important? In the introduction, the authors argue that "the discrepancy [of previous studies] may be due to differences in stimulus design or light levels." However, while different light levels can be tested in V1, this cannot be done properly in the retina with 2P experiments. To address this, one would have to examine color-opponency in RGC terminals in vivo, which is beyond the scope of this study. Addressing these latter points directly in the discussion would, in my opinion, only strengthen the study.

      We thank the reviewer for the feedback. We removed the sentence mentioned by the reviewer from the abstract, as well as from the summary of our results in the Introduction. Additionally, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Minor:

      In the abstract, the second sentence says that we already know the mechanisms in primates.

      Unfortunately, I do not think this is true. First, primates refers to an order with several species, which might have adaptations to their color-processing. Second, I’m aware of several characterizations in "primates" that have led to convincing models (as referenced), but in my opinion, this is far from a true understanding of the mechanisms, especially since very little is known about foveal color processing due to the difficulties of these experiments. Similarly in the introduction. "Primates" is indirectly defined as a species. Perhaps some rewording is needed here as well, since we know how different cone distributions can be in rodents (see Peichl’s work).

      Thanks. We have reworded the Abstract and Introduction towards indicating that many studies have been performed in primate species, without suggesting that the mechanisms are described.

      The legend in Fig. 2 has a "Fig. ???"

      Fixed.

      Referee 2 (Remarks to the Author):

      Franke et al. characterize the representation of color in the primary visual cortex of mice, highlighting how this changes across the visual field. Using calcium imaging in awake, head-fixed mice, they characterize the properties of V1 neurons (layer 2/3) using a large center-surround stimulation where green and ultra-violet colors were presented in random combinations. Clustering of responses revealed a set of functional cell-types based on their preference to different combinations of green and UV in their center and surround. These functional types were demonstrated to have different spatial distributions across V1, including one neuronal type (Green-ON/UV-OFF) that was much more prominent in the posterior V1 (i.e. upper visual field). Modelling work suggests that these neurons likely support the detection of predator-like objects in the sky.

      Strengths: The large-scale single-cell resolution imaging used in this work allows the authors to map the responses of individual neurons across large regions of the visual cortex. Combining this large dataset with clustering analysis enabled the authors to group V1 neurons into distinct functional cell types and demonstrate their relative distribution in the upper and lower visual fields. Modelling work demonstrated the different capacity of each functional type to detect objects in the sky, providing insight into the ethological relevance of color opponent neurons in V1.

      We thank the reviewer for appreciating our study.

      Weaknesses: While the study presents convincing evidence about the asymmetric distribution of color-opponent neurons in V1, the paper would greatly benefit from a more in-depth discussion of the caveats related to the conclusions drawn about their origin. This is particularly relevant regarding the conclusion drawn about the contribution of color opponent neurons in the retina. The mismatch between retinal color opponency and V1 color opponency could imply that this feature is not solely inherited from the retina, however, there are other plausible explanations that are not discussed here. Direct evidence for this statement remains weak.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      In addition, the paper would benefit from adding explicit neuron counts or percentages to the quadrants of each of the density plots in Figures 2-5. The variance explained by the principal components does not capture the percentage of color opponent cells. Additionally, there appear to be some remaining errors in the figure legend and labels that have not been addressed (e.g. ’??’ in Fig 2 legend).

      Thank you for this suggestion. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels. Additionally, we have fixed the broken reference in the legend of Fig. 2.

      Overall, this study will be a valuable resource for researchers studying color vision, cortical processing, and the processing of ethologically relevant information. It provides a useful basis for future work on the origin of color opponency in V1 and its ethological relevance.

      General Suggestions:

      -  Please add possible caveats of using ETA method to the discussion section. For example, it is unclear to what extent ON/OFF cells are being overlooked by using ETA method.

      We now discuss the limitations of the ETA approach in the Discussion section.

      - The caveats of using the percentage of variance explained in the retina as evidence against V1 solely inheriting color-opponency from retinal output neurons are not adequately addressed. For example, could the mismatch in explained variance of the color axis between V1 and RGCs be explained by a subset of non-color opponent RGCs projecting elsewhere (not dLGN-V1) or that color opponent cells project to a larger number of neurons in V1 than non-color opponent cells? We suggest adding a paragraph to the discussion to address this issue.

      We have removed these conclusions from the paper. We now interpret the retinal results more carefully and mention that comparing ex-vivo retina data with in-vivo cortical data is challenging.

      - Please clarify how the different response types shown in Figure 5e-f lead to differences in noise detection and thereby differences in predator discriminability. For example, why does Gon/UVoff not respond to the noise scene while Goff/UVoff does?

      We added this to the Results section.

      - Please clarify the relationship between ETA amplitude, neural response probability, and neural response amplitude. For example, do color-opponent cells have equal absolute neural response amplitudes to the different colors?

      Thank you for bringing up this point. The ETA is obtained by summing the stimulus sequences that elicit an event (i.e., response), weighted by the amplitude of the response. Consequently, the absolute amplitude of the ETA correlates with the calcium amplitude. Importantly, the ETA amplitudes of different stimulus conditions are comparable because they were estimated on the same normalized calcium trace. Therefore, comparing the absolute amplitudes of ETAs of color-opponent neurons reveals the response magnitude of the cells to different colors. We have now included this information in the Results section.
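      The amplitude-weighted sum described above can be illustrated with a minimal sketch. It assumes a single stimulus condition stored as one value per frame, a normalized response trace of the same length, and a simple threshold to define events; the function name, the `window` parameter, and the thresholding rule are hypothetical and are not taken from the manuscript.

```python
def event_triggered_average(stimulus, response, window, threshold=0.0):
    """Amplitude-weighted sum of the stimulus preceding each event.

    stimulus  : list of floats, one value per frame (one stimulus condition)
    response  : list of floats, same length (normalized activity trace)
    window    : number of frames up to and including the event frame
    threshold : responses above this value count as events (assumed rule)
    """
    eta = [0.0] * window
    for t in range(window - 1, len(response)):
        amplitude = response[t]
        if amplitude <= threshold:
            continue  # no event at this frame
        # Stimulus frames leading up to and including the event frame.
        segment = stimulus[t - window + 1 : t + 1]
        for i, value in enumerate(segment):
            eta[i] += amplitude * value  # weight by response amplitude
    return eta


# Toy example: two events, with amplitudes 2 and 1.
stim = [1, -1, 1, 1, -1, 1]
resp = [0, 0, 2, 0, 0, 1]
eta = event_triggered_average(stim, resp, window=2)
```

      Because each stimulus segment is weighted by the response amplitude and no further normalization is applied in this sketch, doubling the response trace doubles the ETA, which is the sense in which ETA amplitude tracks response amplitude.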

      Abstract: - "more than a third of neurons in mouse V1 are color-opponent in their receptive field center". It is unclear what data supports this statement. Can you please provide a statement in the manuscript that supports this directly using the number of neurons?

      We added the following sentence to the Results section: Nevertheless, a substantial fraction of neurons (33.1%) preferred color-opponent stimuli and scattered along the off-diagonal in the upper left and lower right quadrants, especially for the RF center.

      Figure 2: - There is a ?? in the figure legend. Which figure should this refer to? - please provide explicit neuron counts/percentages for each quadrant in b.

      We fixed the figure reference. We believe that adding the numbers or percentages to the figure panels would make them too crowded. Instead, we have now mentioned in the Results section and the legends that the percentages of variance explained by the color (off-diagonal) and luminance axis (diagonal) correlate with the number of neurons located in the color (top left and bottom right) and luminance contrast quadrants (top right and bottom left), respectively. Together with the number of neurons in each plot stated in the legends and the scale bar indicating the number of neurons per gray level, we hope this approach provides clarity for the reader to interpret the panels.

      Figure 3: - Fig 3: Color scheme makes it very difficult to differentiate the different conditions, especially when printed.

      Thanks, we changed the color scheme.

      - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 4: - Add explicit neuron counts/percentages for each quadrant in b.

      See above.

      Figure 5: - Add explicit neuron counts/percentages for each quadrant in c.

      See above.

      Methods: - "we modeled each response type to have a square RF with 10 degrees visual angle in diameter". There appears to be a mismatch between this statement and Figure 5e where 18 degrees is reported.

      Thanks, we fixed that.

      Referee 3 (Remarks to the Author):

      This paper studies chromatic coding in mouse primary visual cortex. Calcium responses of a large collection of cells are measured in response to a simple spot stimulus. These responses are used to estimate chromatic tuning properties - specifically sensitivity to UV and green stimuli presented in a large central spot or a larger still surrounding region. Cells are divided based on their responses to these stimuli into luminance or chromatic sensitive groups. The results are interesting and many aspects of the experiments and conclusions are well done; several technical concerns, however, limit the support for several main conclusions.

      Limitations of stimulus choice: The paper relies on responses to a large (37.5 degree diameter) modulated spot and surround region. This spot is considerably larger than the receptive fields of both V1 cells and retinal ganglion cells (it is twice the area of the average V1 receptive field). As a result, the spot itself is very likely to strongly activate both center and surround mechanisms, and responses of cells are likely to depend on where the receptive fields are located within the spot (and, e.g., how much of the true neural surround samples the center spot vs the surround region). Most importantly, the surrounds of most of the recorded cells will be strongly activated by the central spot. This brings into question statements in the paper about selective activation of center and surround (e.g. page 2, right column). This in turn raises questions about several subsequent analyses that rely on selective center and surround activation.

      Thank you for this comment. A similar point was raised by a reviewer in the first round of revision. We agree with the reviewers that it is critical to discuss both the rationale behind our stimulus design and its limitations to facilitate better interpretation by the reader.

      To be able to record from many V1 neurons simultaneously, we used a stimulus size of 37.5 degree visual angle in diameter, which is slightly larger than center RFs of single V1 neurons (between 20 - 30 degrees visual angle depending on the stimulus, see here). The disadvantage of this approach is that the stimulus is only roughly centered on the neurons’ center RFs. To reduce the impact of potential stimulus misalignment on our results, we used the following steps:

      - For each recording, we positioned the monitor such that the mean RF across all neurons lies within the center of the stimulus field of view.

      - We confirmed that this procedure results in good stimulus alignment for the large majority of recorded neurons within individual recording fields by using a sparse noise stimulus (Suppl. Fig. 1a-c). Specifically, we found that for 83% of tested neurons, more than two thirds of their center RF, determined by the sparse noise stimulus, overlapped with the center spot of the color noise stimulus.

      - For analysis, we excluded neurons without a significant center STA, which may be caused by misalignment of the stimulus.

      Together, we believe these points strongly suggest that the center spot and the surround annulus of the noise stimulus predominantly drive center (i.e. classical RF) and surround (i.e. extraclassical RF), respectively, of the recorded V1 neurons. This is further supported by the fact that color response types identified using an automated clustering method were robust across mice (Suppl. Fig. 6c), indicating consistent stimulus centering.

      Nevertheless, we cannot exclude the possibility that the stimulus was misaligned for a subset of the recorded neurons used in our analysis. We agree with the reviewer that such misalignment might have caused the center stimulus to partially activate the surround. To further address this issue beyond the controls we have already implemented, one could compare the results of our approach with an approach that centers the stimulus on individual neurons. However, we believe that performing these additional experiments is beyond the scope of the current study.

      To acknowledge the experimental limitations of our study and the concerns brought up by the reviewer, we have added the steps we perform to reduce the effects of stimulus misalignment in the Results section and discuss the problem of stimulus alignment in the Discussion in a separate section. With this, we believe our manuscript explains both the rationale behind our stimulus design as well as important limitations of the approach.

      Comparison with retina: A key conclusion of the paper is that the chromatic tuning in V1 is not inherited from retinal ganglion cells. This conclusion comes from comparing chromatic tuning in a previously-collected data set from retina with the present results. But the retina recordings were made using a considerably smaller spot, and hence it is not clear that the comparison made in the paper is accurate. For example, the stimulus used for the V1 experiments almost certainly strongly stimulates both center and surround of retinal ganglion cells. The text focuses on color opponency in the receptive field centers of retinal ganglion cells, but center-surround opponency seems at least as relevant for such large spots. This issue needs to be described more clearly and earlier in the paper.

      Thanks for this comment. We removed the retinal findings from the abstract, as well as from the summary of our results in the Introduction. In addition, we now phrase the interpretation of the retinal results more conservatively and specifically highlight in the Discussion that comparing ex-vivo retinal to in-vivo cortical data is challenging. With these changes, we believe that the focus of the paper is explicitly defined to be on the neuronal representation of color in mouse visual cortex, rather than on the comparison of retinal and cortical color processing.

      Limitations associated with ETA analysis: One of the reviewers in the previous round of reviews raised the concern that the ETA analysis may not accurately capture responses of cells with nonlinear receptive field properties such as On/Off cells. This possibility and whether it is a concern should be discussed.

      Thanks for this comment. We now discuss the limitation of using an ETA analysis in the Discussion section.

      Discrimination performance poor: Discriminability of color or luminance is used as a measure of population coding. The discrimination performance appears to be quite poor - with 500-1000 neurons needed to reliably distinguish light from dark or green from UV. Intuitively I would expect that a single cell would provide such discrimination. Is this intuition wrong? If not, how do we interpret the discrimination analyses?

      Thank you for raising this point. The plots in Fig. 2c (and Figs. 3-5) show discriminability in bits, with the discrimination accuracy in % highlighted by the dotted horizontal lines. For 500 neurons, the discriminability is approx. 0.8 bits, corresponding to 95% accuracy. Even for 50 neurons, the accuracy is significantly above chance level. We now mention in the legends that the dotted lines indicate decoding accuracy in %.

    1. Author response:

      The following is the authors’ response to the current reviews.

      (1) Though we cannot survey all mutants, our observation that 774 genetically diverse adaptive mutants converge at the level of phenotype is important. It adds to growing evidence (see PMID33263280, PMID37437111, PMID22282810, PMID25806684) that the phenotypic basis of adaptation is not as diverse as the genetic basis. This convergence could make evolution more predictable.

      (2) Previous fitness competitions using this specific barcode system have been run for greater than 25 generations (PMID33263280, PMID27594428, PMID37861305). We measure fitness per cycle, rather than per generation, so our fitness advantages are comparable to those in the aforementioned studies, including Venkataram and Dunn et al. (PMID27594428).

      (3) Our results remain the same upon removing the ~150 lineages with the noisiest fitness inferences, including those the reviewer mentions (see Figure S7).

      (4) We agree that there are likely more than the 6 clusters that we validated with follow-up studies (see Discussion). The important point is that we see a great deal of convergence in the behavior of diverse adaptive mutants.

      (5) The growth curves requested by the reviewer were included in our original manuscript; several more were added in the revision (see Figures 5D, 5E, 7D, S11B, S11C).


      The following is the authors’ response to the original reviews.

      Public Reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In their manuscript, Schmidlin, Apodaca, et al. try to answer fundamental questions about the evolution of new phenotypes and the trade-offs associated with this process. As a model, they use yeast resistance to two drugs, fluconazole and radicicol. They use barcoded libraries of isogenic yeasts to evolve thousands of strains in 12 different environments. They then measure the fitness of evolved strains in all environments and use these measurements to examine patterns in fitness trade-offs. They identify only six major clusters corresponding to different trade-off profiles, suggesting the vast genotypic landscape of evolved mutants translates to a highly constrained phenotypic space. They sequence over a hundred evolved strains and find that mutations in the same gene can result in different phenotypic profiles.

      Overall, the authors deploy innovative methods to scale up experimental evolution experiments, and in many aspects of their approach tried to minimize experimental variation. 

      We thank the reviewer for this positive assessment of our work. We are happy that the reviewer noted what we feel is a unique strength of our approach: we scaled up experimental evolution by using DNA barcodes and by exploring 12 related selection pressures. Despite this scaling up, we still see phenotypic convergence among the 774 adaptive mutants we study.

      Weaknesses: 

      (1) One of the objectives of the authors is to characterize the extent of phenotypic diversity in terms of resistance trade-offs between fluconazole and radicicol. To minimize noise in the measurement of relative fitness, the authors only included strains with at least 500 barcode counts across all time points in all 12 experimental conditions, resulting in a set of 774 lineages passing this threshold. This corresponds to a very small fraction of the starting set of ~21 000 lineages that were combined after experimental evolution for fitness measurements. 

      This is a misunderstanding that we clarified in this revision. Our starting set did not include 21,000 adaptive lineages. The total number of unique adaptive lineages in this starting set is much lower than 21,000 for two reasons. 

      First, ~21,000 represents the number of single colonies we isolated in total from our evolution experiments. Many of these isolates possess the same barcode, meaning they are duplicates. Second, and perhaps more importantly, most evolved lineages do not acquire adaptive mutations, meaning that many of the 21,000 isolates are genetically identical to their ancestor. In our revised manuscript, we explicitly stated that these 21,000 isolated lineages do not all represent unique, adaptive lineages. We changed the word “lineages” to “isolates” where relevant in Figure 2 and the accompanying legend. And we have added the following sentence to the figure 2 legend (line 212), “These ~21,000 isolates do not represent as many unique, adaptive lineages because many either have the same barcode or do not possess adaptive mutations.”

      More broadly speaking, several previous studies have demonstrated that diverse genetic mutations converge at the level of phenotype and have suggested that this convergence makes adaptation more predictable (PMID33263280, PMID37437111, PMID22282810, PMID25806684). Most of these studies survey fewer than 774 mutants. Further, our study captures mutants that are overlooked in previous studies, such as those that emerge across subtly different selection pressures (e.g., 4 μg/ml vs. 8 μg/ml flu) and those that are undetectable in evolutions lacking DNA barcodes. Thus, while our experimental design misses some mutants (see next comment), it captures many others. We therefore feel that “our work – showing that 774 mutants fall into a much smaller number of groups” is important because it “contributes to growing literature suggesting that the phenotypic basis of adaptation is not as diverse as the genetic basis” (lines 176 - 178).

      As the authors briefly remark, this will bias their datasets for lineages with high fitness in all 12 environments, as all these strains must be fit enough to maintain a high abundance. 

      We now devote 19 lines of text to discussing this bias (on lines 160 - 162, 278-284, and in more detail on 758 - 767).

      We walk through an example of a class of mutants that our study misses. On lines 759 - 763, we say, “our study is underpowered to detect adaptive lineages that have low fitness in any of the 12 environments. This is bound to exclude large numbers of adaptive mutants. For example, previous work has shown some FLU resistant mutants have strong tradeoffs in RAD (Cowen and Lindquist 2005). Perhaps we are unable to detect these mutants because their barcodes are at too low a frequency in RAD environments, thus they are excluded from our collection of 774.”

      In our revised version, we added more text earlier in the manuscript that explicitly discusses this bias. Lines 278 – 283 now read, “The 774 lineages we focus on are biased towards those that are reproducibly adaptive in multiple environments we study. This is because lineages that have low fitness in a particular environment are rarely observed >500 times in that environment (Figure S4). By requiring lineages to have high-coverage fitness measurements in all 12 conditions, we may be excluding adaptive mutants that have severe tradeoffs in one or more environments, consequently blinding ourselves to mutants that act via unique underlying mechanisms.”
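      The inclusion rule discussed here, and the bias it introduces, can be sketched in a few lines. This is a minimal illustration only: it assumes read counts are summed over time points within each condition before comparison with the threshold, and the data layout, function name, and toy barcode names are hypothetical rather than taken from the manuscript.

```python
def filter_by_coverage(counts, min_reads=500):
    """Keep lineages whose barcode passes the read threshold in every condition.

    counts    : dict mapping lineage -> dict mapping condition -> list of
                read counts, one per time point (hypothetical layout)
    min_reads : minimum summed reads required in each condition
    """
    kept = []
    for lineage, by_condition in counts.items():
        # A single low-coverage condition is enough to exclude the lineage.
        if all(sum(reads) >= min_reads for reads in by_condition.values()):
            kept.append(lineage)
    return kept


# Toy data: bc2 is abundant in FLU but nearly absent in RAD, so it is
# dropped even though it may carry a strongly adaptive mutation.
toy_counts = {
    "bc1": {"FLU": [300, 250], "RAD": [400, 200]},
    "bc2": {"FLU": [900, 800], "RAD": [10, 5]},
}
kept = filter_by_coverage(toy_counts, min_reads=500)
```

      In this toy case only `bc1` is retained, which mirrors the concern that mutants with a severe tradeoff in any one environment fall out of the analyzed set.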

      Note that while we “miss” some classes of mutants, we “catch” other classes that may have been missed in previous studies of convergence. For example, we observe a unique class of FLU-resistant mutants that primarily emerged in evolution experiments that lack FLU (Figure 3). Thus, we think that the unique design of our study, surveying 12 environments, allows us to make a novel contribution to the study of phenotypic convergence.

      One of the main observations of the authors is phenotypic space is constrained to a few clusters of roughly similar relative fitness patterns, giving hope that such clusters could be enumerated and considered to design antimicrobial treatment strategies. However, by excluding all lineages that fit in only one or a few environments, they conceal much of the diversity that might exist in terms of trade-offs and set up an inclusion threshold that might present only a small fraction of phenotypic space with characteristics consistent with generalist resistance mechanisms or broadly increased fitness. This has important implications regarding the general conclusions of the authors regarding the evolution of trade-offs. 

      We agree and discussed exactly the reviewer’s point about our inclusion threshold in the 19 lines of text mentioned previously (lines 160 - 162, 278-284, and 758 - 767). To add to this discussion, and avoid the misunderstanding the reviewer mentions, we added the following strongly-worded sentence to the end of the paragraph on lines 749 – 767 in our revised manuscript: “This could complicate (or even make impossible) endeavors to design antimicrobial treatment strategies that thwart resistance”. 

      More generally speaking, we set up our study around Figure 1, which depicts a treatment strategy that works best if there exists but a single type of adaptive mutant. Despite our inclusion threshold, we find there are at least 6 types of mutants. This diminishes hopes of designing simple multidrug strategies like Figure 1. Our goal is to present a tempered and nuanced discussion of whether and how to move forward with designing multidrug strategies, given our observations. On one hand, we point out how the phenotypic convergence we observe is promising. But on the other hand, we also point out how there may be less convergence than meets the eye for various reasons including the inclusion threshold the reviewer mentions (lines 749 - 767).

      We have made several minor edits to the text with the goal of providing a more balanced discussion of both sides. For example, we added the words, “may yet” to the following sentences on lines 32 – 36 of the abstract: “These findings, on one hand, demonstrate the difficulty in relying on consistent or intuitive tradeoffs when designing multidrug treatments. On the other hand, by demonstrating that hundreds of adaptive mutations can be reduced to a few groups with characteristic tradeoffs, our findings may yet empower multidrug strategies that leverage tradeoffs to combat resistance.”

      (2) Most large-scale pooled competition assays using barcodes are usually stopped after ~25 generations to avoid noise due to the emergence of secondary mutations.

      The rate at which new mutations enter a population is driven by various factors such as the mutation rate and population size, so choosing an arbitrary threshold like 25 generations is difficult. 

      We conducted our fitness competition following previous work using the Levy/Blundell yeast barcode system, in which the number of generations reported varies from 32 to 40 (PMID33263280, PMID27594428, PMID37861305, see PMID27594428 for detailed calculation of the fraction of lineages biased by secondary mutations in this system). 

      The authors measure fitness across ~40 generations, which is almost the same number of generations as in the evolution experiment. This raises the possibility of secondary mutations biasing abundance values, which would not have been detected by the whole genome sequencing as it was performed before the competition assay. 

      Previous work has demonstrated that in this evolution platform, most mutations occur during the transformation that introduces the DNA barcodes (Levy et al. 2015). In other words, these mutations are already present and do not accumulate during the 40 generations of evolution. Therefore, the observation that we collect a genetically diverse pool of adaptive mutants after 40 generations of evolution is not evidence that 40 generations is enough time for secondary mutations to bias abundance values.

      We have added the following sentence to the main text to highlight this issue (lines 247 - 249): “This happens because the barcoding process is slightly mutagenic, thus there is less need to wait for DNA replication errors to introduce mutations (Levy et al. 2015; Venkataram et al. 2016).”

      We also elaborate on this in the method section entitled, “Performing barcoded fitness competition experiments,” where we added a full paragraph to clarify this issue (lines 972 - 980).

      (3) The approach used by the authors to identify and visualize clusters of phenotypes among lineages does not seem to consider the uncertainty in the measurement of their relative fitness. As can be seen from Figure S4, the inter-replicate difference in measured fitness can often be quite large. From these graphs, it is also possible to see that some of the fitness measurements do not correlate linearly (ex.: Med Flu, Hi Rad Low Flu), meaning that taking the average of both replicates might not be the best approach. Because the clustering approach used does not seem to take this variability into account, it becomes difficult to evaluate the strength of the clustering, especially because the UMAP projection does not include any representation of uncertainty around the position of lineages. This might paint a misleading picture where clusters appear well separated and well defined but are in fact much fuzzier, which would impact the conclusion that the phenotypic space is constricted.

      Our noisiest fitness measurements correspond to barcodes that are the least abundant and thus suffer the most from stochastic sampling noise. These are also the barcodes that introduce the nonlinearity the reviewer mentions. We removed these from our dataset by increasing our coverage threshold from 500 reads to 5,000 reads. The clusters did not collapse, which suggests that they were not capturing this noise (Figure S7B).

      More importantly, we devoted 4 figures and 200 lines of text to demonstrating that the clusters we identified capture biologically meaningful differences between mutants (and not noise). We have modified the main text to point readers to figures 5 through 8 earlier, such that it is more apparent that the clustering analysis is just the first piece of our data demonstrating convergence at the level of phenotype.

      (4) The authors make the decision to use UMAP and a Gaussian mixture model to cluster and represent the different fitness landscapes of their lineages of interest. Their approach has many caveats. First, compared to PCA, the axes do not provide any information about the actual dissimilarities between clusters. Using PCA would have allowed a better understanding of the amount of variance explained by components that separate clusters, as well as more interpretable components.

      The components derived from PCA are often not interpretable. It’s not obvious that each one, or even the first one, will represent an intuitive phenotype, like resistance to fluconazole.  Moreover, we see many non-linearities in our data. For example, fitness in a double drug environment is not predicted by adding up fitness in the relevant single drug environments. Also, there are mutants that have high fitness when fluconazole is absent or abundant, but low fitness when mild concentrations are present. These types of nonlinearities can make the axes in PCA very difficult to interpret, plus these nonlinearities can be missed by PCA, thus we prefer other clustering methods. 

      Still, we agree that confirming our clusters are robust to different clustering methods is helpful. We have included PCA in the revised manuscript, plotting PC1 vs PC2 as Figure S9 with points colored according to the cluster assignment in figure 4 (i.e. using a gaussian mixture model). It appears the clusters are largely preserved.

      Second, the advantages of dimensional reduction are not clear. In the competition experiment, 11/12 conditions (all but the no drug, no DMSO conditions) can be mapped to only three dimensions: concentration of fluconazole, concentration of radicicol, and relative fitness. Each lineage would have its own fitness landscape as defined by the plane formed by relative fitness values in this space, which can then be examined and compared between lineages. 

      We worry that the idea stems from a priori notions of what the important dimensions should be. The biology of our system is unfortunately not intuitive. For example, it seems like this idea would miss important nonlinearities such as our observation that low fluconazole behaves more like a novel selection pressure than a dialed down version of high fluconazole. 

      Third, the choice of 7 clusters as the cutoff for the multiple Gaussian model is not well explained. Based on Figure S6A, BIC starts leveling off at 6 clusters, not 7, and going to 8 clusters would provide the same reduction as going from 6 to 7. This choice also appears arbitrary in Figure S6B, where BIC levels off at 9 clusters when only highly abundant lineages are considered. 

      We agree. We did not rely on the results of BIC alone to make final decisions about how many clusters to include. We also considered follow-up genotyping and phenotyping studies that confirm biologically meaningful differences between the mutants in each cluster (Figures 5 – 8). We now state this explicitly. Here is the modified paragraph where we describe how we chose a model with 7 clusters, from lines 436 – 446 of the revised manuscript:

      “Beyond the obvious divide between the top and bottom clusters of mutants on the UMAP, we used a gaussian mixture model (GMM) (Fraley and Raftery, 2003) to identify clusters. A common problem in this type of analysis is the risk of dividing the data into clusters based on variation that represents measurement noise rather than reproducible differences between mutants (Mirkin, 2011; Zhao et al., 2008). One way we avoided this was by using a GMM quality control metric (BIC score) to establish how splitting out additional clusters affected model performance (Figure S6). Another factor we considered were follow-up genotyping and phenotyping studies that demonstrate biologically meaningful differences between mutants in different clusters (Figures 5 – 8). Using this information, we identified seven clusters of distinct mutants, including one pertaining to the control strains, and six others pertaining to presumed different classes of adaptive mutant (Figure 4D). It is possible that there exist additional clusters, beyond those we are able to tease apart in this study.”
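
      For readers less familiar with the metric, the standard textbook definition of the BIC score referred to in this exchange is given below. The symbols are not taken from the manuscript: here k is the number of free parameters in the mixture model, n is the number of mutants being clustered, and L-hat is the maximized likelihood.

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```

      With this sign convention, lower values indicate a better balance between fit and model complexity. Some software reports the negated quantity, in which case higher is better, so the direction in which the curve in Figure S6 "levels off" depends on the convention used.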

      This directly contradicts the statement in the main text that clusters are robust to noise, as a more stringent inclusion threshold appears to increase and not decrease the optimal number of clusters. Additional criteria to BIC could have been used to help choose the optimal number of clusters or even if mixed Gaussian modeling is appropriate for this dataset. 

      We are under the following impression: If our clustering method was overfitting, i.e. capturing noise, the optimal number of clusters should decrease when we eliminate noise. It increased. In other words, the observation that our clusters did not collapse (i.e. merge) when we removed noise suggests these clusters were not capturing noise. 

      Most importantly, our validation experiments, described below, provide additional evidence that our clusters capture meaningful differences between mutants (and not noise).  

      (5) Large-scale barcode sequencing assays can often be noisy and are generally validated using growth curves or competition assays. 

      Some types of bar-seq methods, in particular those that look at fold change across two time points, are noisier than others that look at how frequency changes across multiple timepoints (PMID30391162). Here, we use the less noisy method. We also reduce noise by using a stricter coverage threshold than previous work (e.g., PMID33263280), and by excluding batch effects by performing all experiments simultaneously, since we found this to be effective in our previous work (PMID37237236). 
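
      The difference between the two kinds of estimate can be sketched as follows. This is a toy illustration under stated assumptions, not the fitness inference procedure used in the manuscript: the trajectory is hypothetical, fitness is taken to be the slope of log frequency over time, and noise is applied to a single timepoint only.

```python
import math

def log_freq_slope(times, freqs):
    """Least-squares slope of ln(frequency) against time, using all timepoints."""
    logs = [math.log(f) for f in freqs]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx

def two_point_estimate(times, freqs):
    """Slope of ln(frequency) using only the first and last timepoints."""
    return (math.log(freqs[-1]) - math.log(freqs[0])) / (times[-1] - times[0])

# Hypothetical trajectory: exact exponential growth at 0.1 per generation.
times = [0, 8, 16, 24, 32]
true_rate = 0.1
clean = [1e-4 * math.exp(true_rate * t) for t in times]

# Perturb only the final timepoint to mimic one noisy measurement.
noisy = clean[:-1] + [clean[-1] * 1.5]

err_multi = abs(log_freq_slope(times, noisy) - true_rate)
err_two = abs(two_point_estimate(times, noisy) - true_rate)
```

      In this toy case the multi-timepoint estimate moves less than the two-point estimate when one measurement is off, because the error is shared across all points in the fit.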

      Perhaps also relevant is that the main assay we use to measure fitness has been previously validated (PMID27594428) and no subsequent study using this assay validates using the methods suggested above (see PMID37861305, PMID33263280, PMID31611676, PMID29429618, PMID37192196, PMID34465770, PMID33493203). Similarly, bar-seq has been used, without the suggested validation, to demonstrate that the way some mutants’ fitness changes across environments is different from other mutants (PMID33263280, PMID37861305, PMID31611676, PMID33493203, PMID34596043). This is the same thing that we use bar-seq to demonstrate. 

      For all of these reasons above, we are hesitant to confirm bar-seq itself as a valid way to infer fitness. It seems this is already accepted as a standard in our field. However, please see below.

      Having these types of results would help support the accuracy of the main assay in the manuscript and thus better support the claims of the authors. 

      While we don’t agree that fitness measurements obtained from this bar-seq assay generally require validation, we do agree that it is important to validate whether the mutants in each of our 6 clusters indeed are different from one another in meaningful ways.

      Our manuscript has 4 figures (5 - 8) and over 200 lines of text dedicated to validating whether our clusters capture reproducible and biologically meaningful differences between mutants. In the revised manuscript, we added additional validation experiments, such that three figures (Figures 5, 7 and S11) now involve growth curves, as the reviewer requested. 

      Below, we walk through the different types of validation experiments that are present in our manuscript, including those that were added in this revision.

      (1) Mutants from different clusters have different growth curves: In our original manuscript, we measured growth curves corresponding to a fitness tradeoff that we thought was surprising. Mutants in clusters 4 and 5 both have fitness advantages in single drug conditions. While mutants from cluster 4 also are advantageous in the relevant double drug conditions, mutants from cluster 5 are not! We validated these different behaviors by studying growth curves for a mutant from each cluster (Figures 7 and S11), finding that mutants from different clusters have different growth curves. In the revised manuscript, we added growth curves for 6 additional mutants (3 from cluster 1 and 3 from cluster 3), demonstrating that only the cluster 1 mutants have a tradeoff in high concentrations of fluconazole (see Figure 5D & 5E). In sum, this work demonstrates that mutants from different clusters have predictable differences in their growth phenotypes.

      (2) Mutants from different clusters have different evolutionary origins: In our original manuscript, we came up with a novel way to ask whether the clusters capture different types of adaptive mutants. We asked whether the mutants in each cluster originate from different evolution experiments. They often do (see pie charts in Figures 5, 6, 7, 8). In the revised manuscript, we extended this analysis to include mutants from cluster 1. Cluster 1 is defined by high fitness in low fluconazole that declines with increasing fluconazole. In our revised manuscript, we show that cluster 1 lineages were overwhelmingly sampled from evolutions conducted in our lowest concentration of fluconazole (see pie chart in new Figure 5A). No other cluster’s evolutionary history shows this pattern (compare to pie charts in figures 6, 7, and 8).

      These pie charts also provide independent confirmation supporting the fitness tradeoffs observed for each cluster in figure 4E. For example, mutants in cluster 5 appear to have a tradeoff in a particular double drug condition (HRLF), and the pie charts confirm that they rarely originate from that evolution condition. This differs from cluster 4 mutants, which do not have a fitness tradeoff in HRLF, and are more likely to originate from that environment (see purple pie slice in figure 7). Additional cases where results of evolution experiments (pie charts) confirm observed fitness tradeoffs are discussed in the manuscript on lines 320 – 326, 594 – 598, 681 – 685.

      (3) Mutants from each cluster often fall into different genes: We sequenced many of these mutants and show that mutants in the same gene are often found in the same cluster. For example, all 3 IRA1 mutants are in cluster 6 (Fig 8), both GPB2 mutants are in cluster 4 (Figs 7 & 8), and 35/36 PDR mutants are in either cluster 2 or 3 (Figs 5 & 6). 

      (4) Mutants from each cluster have behaviors previously observed in the literature: We compared our sequencing results to the literature and found congruence. For example, PDR mutants are known to provide a fitness benefit in fluconazole and are found in clusters that have high fitness in fluconazole (lines 485 - 491). Previous work suggests that some mutations to PDR have different tradeoffs than others, which corresponds to our finding that PDR mutants fall into two separate clusters (lines 610 - 612). IRA1 mutants were previously observed to have high fitness in our “no drug” condition and are found in the cluster that has the highest fitness in the “no drug” condition (lines 691 - 696). Previous work even confirms the unusual fitness tradeoff we observe where IRA1 and other cluster 6 mutants have low fitness only in low concentrations of fluconazole (lines 702 - 704).

      (5) Mutants largely remain in their clusters when we use alternate clustering methods:  In our original manuscript, we performed various different re-clustering and/or normalization approaches on our data (Fig 6, S5, S7, S8, S10). The clusters of mutants that we observe in figure 4 do not change substantially when we re-cluster the data. In our revised manuscript, we added another clustering method: principal component analysis (PCA) (Fig S9).  Again, we found that our clusters are largely preserved.

      While these experiments demonstrate meaningful differences between the mutants in each cluster, important questions remain. For example, a long-standing question in biology centers on the extent to which every mutation has unique phenotypic effects versus the extent to which scientists can predict the effects of some mutations from other similar mutations. Additional studies on the clusters of mutants discovered here will be useful in deepening our understanding of this topic and more generally of the degree of pleiotropy in the genotype-phenotype map.

      Reviewer #2 (Public Review): 

      Summary: 

      Schmidlin & Apodaca et al. aim to distinguish mutants that resist drugs via different mechanisms by examining fitness tradeoffs across hundreds of fluconazole-resistant yeast strains. They barcoded a collection of fluconazole-resistant isolates and evolved them in different environments with a view to having relevance for evolutionary theory, medicine, and genotype-phenotype mapping. 

      Strengths: 

      There are multiple strengths to this paper, the first of which is pointing out how much work has gone into it; the quality of the experiments (the thought process, the data, the figures) is excellent. Here, the authors seek to induce mutations in multiple environments, which is a really large-scale task. I particularly like the attention paid to isolates which are resistant to low concentrations of FLU. So often these are overlooked in favour of those conferring MIC values >64/128 etc. What was seen is different genotype and fitness profiles. I think there's a wealth of information here that will actually be of interest to more than just the fields mentioned (evolutionary medicine/theory). 

      We are grateful for this positive review. This was indeed a lot of work! We are happy that the reviewer noted what we feel is a unique strength of our manuscript: that we survey adaptive isolates across multiple environments, including low drug concentrations.  

      Weaknesses: 

      Not picking up low fitness lineages - which the authors discuss and provide a rationale as to why. I can completely see how this has occurred during this research, and whilst it is a shame I do not think this takes away from the findings of this paper. Maybe in the next one! 

      We thank the reviewer for these words of encouragement and will work towards catching more low fitness lineages in our next project.

      In the abstract the authors focus on 'tradeoffs' yet in the discussion they say the purpose of the study is to see how many different mechanisms of FLU resistance may exist (lines 679-680), followed up by "We distinguish mutants that likely act via different mechanisms by identifying those with different fitness tradeoffs across 12 environments". Whilst I do see their point, and this is entirely feasible, I would like a bit more explanation around this (perhaps in the intro) to help lay-readers make this jump. The remainder of my comments on 'weaknesses' are relatively fixable, I think: 

      We have expanded the introduction, in particular lines 129 – 157 of the revised manuscript, to walk readers through the connection between fitness tradeoffs and molecular mechanisms. For example, here is one relevant section of new text from lines 131 - 136: “The intuition here is as follows. If two groups of drug resistant mutants have different fitness tradeoffs, it could mean that they provide resistance through different underlying mechanisms. Alternatively, both could provide drug resistance via the same mechanism, but some mutations might also affect fitness via additional mechanisms (i.e. they might have unique “side-effects” at the molecular level) resulting in unique fitness tradeoffs in some environments.”

      In the introduction I struggle to see how this body of research fits in with the current literature, as the literature cited is a hodge-podge of bacterial and fungal evolution studies, which are very different! For example, the authors state "previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms" (lines 129-131) and then cite three papers, only one of which is a fungal research output. However, the next sentence focuses solely on literature from fungal research. Citing bacterial work as a foundation is fine, but as you're using yeast for this I think tailoring the introduction more to what is and isn't known in fungi would be more appropriate. It would also be great to then circle back around and mention monotherapy vs combination drug therapy for fungal infections as a rationale for this study. The study seems to be focused on FLU-resistant mutants, which is the first-line drug of choice, but many (yeast) infections have acquired resistance to this and combination therapy is the norm. 

      We ourselves are broadly interested in the structure of the genotype-phenotype-fitness map (PMID33263280, PMID32804946). For example, we are interested in whether diverse mutations converge at the level of phenotype and fitness. Figure 1A depicts a scenario with a lot of convergence in that all adaptive mutations have the same fitness tradeoffs.

      The reason we cite papers from yeast, as well as bacteria and cancer, is that we believe general conclusions about the structure of the genotype-phenotype-fitness map apply broadly. For example, the sentence the reviewer highlights, “previous work suggests that mutants with different fitness tradeoffs may affect fitness through different molecular mechanisms” is a general observation about the way genotype maps to fitness. So, we cited papers from across the tree of life to support this sentence.  And in the next sentence, where we cite 3 papers focusing solely on fungal research, we cite them because they are studies about the complexity of this map. Their conclusions, in theory, should also apply broadly, beyond yeast.

      On the other hand, because we study drug resistant mutations, we hope that our dataset and observations are of use to scientists studying the evolution of resistance. We use our introduction to explain how the structure of the genotype-phenotype-fitness map might influence whether a multidrug strategy is successful (Figure 1).

      We are hesitant to rework our introduction to focus more specifically on fungal infections as this is not our primary area of expertise.

      Methods: Line 769 - which yeast? I haven't even seen mention of which species is being used in this study; different yeast employ different mechanisms of adaptation for resistance, so could greatly impact the results seen. This could help with some background context if the species is mentioned (although I assume S. cerevisiae). 

      In the revised manuscript, we have edited several lines (line 95, 186, 822) to state the organism this work was done with is Saccharomyces cerevisiae. 

      In which case, should aneuploidy be considered as a mechanism? This is mentioned briefly on line 556, but with all the sequencing data acquired this could be checked quickly? 

      We like this idea and we are working on it, but it is not straightforward. The reviewer is correct in that we can use the sequencing data that we already have. But calling aneuploidy with certainty is tough because its signal can be masked by noise. In other words, some regions of the genome may be sequenced more than others by chance.

      Given this is not straightforward, at least not for us, this analysis will likely have to wait for a subsequent paper. 
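
      The read-depth idea discussed in this exchange can be sketched as follows. This is a simplified illustration under stated assumptions, not an analysis performed in the manuscript: the depth values are hypothetical, the function names and the tolerance of 0.25 are made up for the example, and real data would need correction for the uneven coverage described above.

```python
from statistics import median

def chromosome_copy_ratios(depth_by_chrom):
    """Divide each chromosome's median depth by the median across chromosomes."""
    chrom_medians = {c: median(d) for c, d in depth_by_chrom.items()}
    baseline = median(chrom_medians.values())
    return {c: m / baseline for c, m in chrom_medians.items()}

def flag_candidates(ratios, tolerance=0.25):
    """Flag chromosomes whose ratio deviates from 1 by more than the
    (hypothetical) tolerance."""
    return sorted(c for c, r in ratios.items() if abs(r - 1.0) > tolerance)

# Hypothetical per-window read depths (illustrative only, not real data).
depth = {
    "chrI":   [98, 102, 100, 97, 103],
    "chrII":  [101, 99, 100, 104, 96],
    "chrIII": [198, 205, 200, 195, 202],  # roughly doubled depth
    "chrIV":  [100, 103, 99, 98, 101],
    "chrV":   [102, 100, 97, 100, 99],
}

ratios = chromosome_copy_ratios(depth)
candidates = flag_candidates(ratios)
```

      The choice of tolerance is where the difficulty lies: if chance variation in coverage is comparable to the expected shift in depth, a threshold of this kind will produce false calls in both directions.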

      I think the authors could be bolder and try and link this to other (pathogenic) yeasts. What are the implications of this work on say, Candida infections? 

      Perhaps because our background lies in general study of the genotype-phenotype map, we are hesitant about making bold assertions about how our work might apply to pathogenic yeasts. We are hopeful that our work will serve as a stepping-stone such that scientists from that community can perhaps make (and test) such statements.   

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I found the ideas and the questions asked in this manuscript to be interesting and ambitious. The setup of the evolution and fitness competition experiments was well poised to answer them, but the analysis of the data is not currently enough to properly support the claims made. I would suggest revising the analysis to address the weaknesses raised in the public review and if possible, adding some more experimental validations. As you already have genome sequencing data showing the causal mutation for many mutants across the different clusters, it should be possible for you to reconstruct some of the strains and test validate their phenotypes and cluster identity. 

      Yes, this is possible. We added more validation experiments (see figure 5). We already had quite a few validation experiments (figures 5 - 8 and lines 479 - 718), but we did not clearly highlight the significance of these analyses in our original manuscript. Therefore, we modified the text in our revised manuscript in various places to do so. For example, we now make clearer that we jointly use BIC scores as well as validation experiments to decide how many clusters to describe (lines 436 - 446). We also make clearer that our clustering analysis is only the first step towards identifying groups of mutants with similar tradeoffs by using words and phrases like, “we start by” (line 411) and “preliminarily” (line 448) when discussing the clustering analysis.  We also point readers to all the figures describing our validation experiments earlier (line 443), and list these experiments out in the discussion (lines 738 - 741).

      Also, please deposit your genome sequencing data in a public database (I am not sure I saw it mentioned anywhere). 

      We have updated line 1088 of the methods section to include this sentence: “Whole genome sequences were deposited in GenBank under SRA reference PRJNA1023288.”

      Reviewer #2 (Recommendations For The Authors):

      I don't think the figures or experiments can be improved upon, they are excellent. There are a few times I feel things are written in a rather confusing way and could be explained better, but also I feel there are places the authors jump from one thing to another really quickly and the reader (who might not be an expert in this area) will struggle to keep up. For example: 

      Explaining what RAD is - it is introduced in the methods, but what it is, is not really explained. 

      Since the introduction is already very long, we chose not to explain radicicol’s mechanism of action here. Instead, we bring this up later on lines 614 – 621 when it becomes relevant.

      More generally, in response to this advice and that from reviewer 1, we also added text to various places in the manuscript to help explain our work more clearly. In particular, we clarified the significance of our validation experiments and various important methodological details (see above). We also better explained the connection between fitness tradeoffs and mechanisms (see above) and added more details about the potential use cases of our approach (lines 142 – 150).

      The abstract states "some of the groupings we find are surprising. For example, we find some mutants that resist single drugs do not resist their combination, and some mutants to the same gene have different tradeoffs than others". Firstly, this sentence is a bit confusing to read but if I've read it as intended, then is it really surprising? It's difficult for organisms (bacteria and fungi) to develop multiple beneficial mutations conferring drug resistance on the same background, hence why combination antifungal drug therapy is often used to treat infections. 

      This is a place where brevity got in the way of clarity. We added a bit of text to make clear why we were surprised. Specifically, we were surprised because not all mutants behave the same. Some resist single drugs AND their combination. Some resist single drugs but not their combination. The sentence in the abstract now reads, “For example, we find some mutants that resist single drugs do not resist their combination, while others do. And some mutants to the same gene have different tradeoffs than others.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study is convincing because they performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, as with the activity measurements in solution in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation" because time-resolved X-ray study often observes transient metal ions which are important for catalysis but are not predictable in previous studies with static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate of proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without the direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases. 

      We appreciate the reviewer for the positive assessment as well as all the comments and suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, use only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. By use of time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine. 

      Strengths: 

      This work performs mechanistic studies, including the number and roles of metal ion, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Cao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach. 

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6. 

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general. 

      Thank you very much for your comments and suggestions.

      Weaknesses: 

      Two relatively minor issues are raised here for consideration: 

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....". 

      We appreciate the clarification regarding the description of our experimental approach. We agree that our structures do not represent reaction intermediates but rather mixtures of substrate and product states within the enzyme-bound environment. We have revised the text accordingly to more accurately reflect our methodology.

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn. 

      Thank you for your insightful comments. We recognize the importance of visualizing metal ion density alongside product density data. To address this, we have included Figure S4, which presents the Mg2+/Mn2+ and product densities concurrently.

      Reviewer #1 (Recommendations For The Authors): 

      (1) Figure 6. I understand that pre-reaction state (left panel) and Metal-binding state (two middle panels) are in equilibrium. But can we state that the Metal-binding state (two middle panels) and the product state (right panel) are in equilibrium and connected by two arrows? 

      Thank you for your comments. We agree that the DNA hydrolysis reaction process may not be reversible within the I-PpoI active site. To clarify, we removed the backward arrows between the metal-binding state and product state. In addition, we thank the reviewer for giving a name to the middle state and agree that it would be better to label it. We added the metal-binding state label in the revised Figure 6 and also added “on the other hand, optimal alignment of a deprotonated water and Mg2+ within the active site, labeled as metal-binding state, leads to irreversible bond breakage (Fig. 6a)” within the text.

      (2) The section on DNA hydrolysis assay (Materials and Methods) is not well described. In this section, the authors should summarize the methods for the experiments in Figure 4 AC, Figure 5BC, Figure S3C, Figure S4EF, and Figure S6AB. The authors presented some graphs for the reactions. For clarity, the author should state in the legends which experiments the results are from (in crystallo or in solution). Please check and modify them. 

      Thank you for the suggestion. We have added four paragraphs to detail the experimental procedures for experiments in these figures. In addition, we have checked all of the figure legends and labeled them as “in crystallo” or “in solution.” To clarify, we also added “in crystallo” or “solution” in the corresponding panels.

      (3) The authors showed the anomalous signals of Mn2+ and Tl+. The authors should mention which wavelength of X-rays was used in the data collections to calculate the anomalous signals. 

      Thank you for the suggestion. We have included the wavelength of the X-ray in the figure legends that include anomalous maps, which were all determined at an X-ray wavelength of 0.9765 Å.

      (4) The full names of "His-Me" and "HNH" are necessary for a wide range of readers. 

      Thank you for the suggestion. We have included the full nomenclature for His-Me (histidine-metal) nucleases and HNH (histidine-asparagine-histidine) nuclease.

      (5) The authors should add the side chain of Arg61 in Figure 1E because it is mentioned in the main text. 

      Thank you for the suggestion. We have added Arg61 to Figure 1E.

      (6) Figure 5D. For clarity, the electron densities should cover the Na+ ion. The same request applies to WatN in Figure S3B.

      Thank you for catching this detail. We have added the electron density for the Na+ ion in Figure 5D and WatN in Figure S3B.

      (7) At line 269 on page 8, what is "previous H98A I-PpoI structure with Mn2+"? Is the structure 1CYQ? If so, it is a complex with Mg2+. 

      Thank you for catching this detail. We have edited the text to “previous H98A I-PpoI structure with Mg2+.”

      (8) At line 294 on page 9, "and substrate alignment or rotation in MutT (66)." I think "alignment of the substrate and nucleophilic water" is preferred rather than "substrate alignment or rotation". 

      Thank you for the suggestion. We have edited the text to “alignment of the substrate and nucleophilic water.”

      (9) At line 305 on page 9, "Second, (58, 69-71) single metal ion binding is strictly correlated with product formation in all conditions, at different pH and with different mutants (Figure 3a and Supplementary Figure 4a-c) (58)". The references should be cited in the correct positions. 

      Thank you for catching this typo. We have removed the references.

      (10) At line 347 on page 10, "Grown in a buffer that contained (50 g/L glucose, 200 g/L α-lactose, 10% glycerol) for 24 hrs." Is this sentence correct? 

      Thank you for catching this detail. We have corrected the sentence.

      (11) At line 395 on page 11, "The His98Ala I-PpoI crystals of first transferred and incubated in a pre-reaction buffer containing 0.1M MES (pH 6.0), 0.2 M NaCl, 1 mM MgCl2 or MnCl2, and 20% (w/v) PEG3350 for 30 min." In the experiments using this mutant, does a pre-reaction buffer contain MgCl2 or MnCl2? 

      Thank you for bringing this to our attention. We have performed two sets of experiments: 1) metal ion soaking in 1 mM Mn2+, which was performed as for WT and did not include Mn2+ in the pre-reaction buffer; 2) imidazole soaking, for which 1 mM Mn2+ was included in the pre-reaction buffer. We reasoned that Mn2+ would not bind or promote reaction with His98Ala I-PpoI, but that pre-incubation might help populate Mn2+ within the lattice for better imidazole binding. However, neither Mn2+ nor imidazole was observed. We have added experimental details for both experiments with His98Ala I-PpoI.

      (12) In the figure legends of Figure 1, is the Fo-Fc omit map shown in yellow not in green? Please remove (F) in the legends. 

      We have changed the Fo-Fc map to be shown in violet. We have also removed (f) from the figure legends.

      (13) I found descriptions of "MgCl". Please modify them to "MgCl2". 

      Thank you for catching these details. We have modified all “MgCl” to “MgCl2.”

      (14) References 72 and 73 are duplicated. 

      We have removed the duplicated reference.

      Reviewer #2 (Recommendations For The Authors): 

      p. 9, first paragraph, last three lines: "Thus, we suspect that the metal ion may play a crucial role in the chemistry step to stabilize the transition state and reduce the electronegative buildup of DNA, similar to the third metal ion in DNA polymerases and RNaseH." This point is significant but the statement seems a little uncertain. You are saying that the single metal plays the role of two metals in polymerase, in both the ground state and the transition state. I believe the sentence can be stronger and more explicit. 

      Thank you for raising this point. We suspect the single metal ion in I-PpoI is different from the A-site or B-site metal ion in DNA polymerases and RNaseH, but similar to the third metal ion in DNA polymerases and nucleases. As we stated in the text,

      (1) the metal ion in I-PpoI is not required for substrate alignment. The water molecule and substrate can be observed in place even in the absence of the metal ion. In contrast, the A-site and B-site metal ions in DNA polymerases and RNaseH are required for aligning the substrates.

      (2) Moreover, the appearance of the metal ion is strictly correlated with product formation, similar to the third metal ion in DNA polymerases and RNaseH.

      To emphasize our point, we have revised the sentence as

      “Thus, similar to the third metal ion in DNA polymerases and RNaseH, the metal ion in I-PpoI is not required for substrate alignment but is essential for catalysis. We suspect that the single metal ion helps stabilize the transition state and reduce the electronegative buildup of DNA, thereby promoting DNA hydrolysis.”

      Minor typos: 

      p. 2, line 4 from bottom: due to the relatively low resolution... 

      Thank you for catching this. We have edited the text to “due to the relatively low resolution.”

      Figure 4F: What is represented by the pink color? 

      The structures are color-coded as 320 s at pH 6 (violet), 160 s at pH 7 (yellow), and 20 s at pH 8 (green). We have included the color information in the figure legend and made the labeling clearer in the panel.

      p. 9, first paragraph, last line: ...similar to the third... 

      Thank you for catching this. We have edited the text.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This important study explores the potential influence of physiologically relevant mechanical forces on the extrusion of vesicles from C. elegans neurons. The authors provide compelling evidence to support the idea that uterine distension can induce vesicular extrusion from adjacent neurons. The work would be strengthened by using an additional construct (preferably single-copy) to demonstrate that the observed phenotypes are not unique to a single transgenic reporter. Overall, this work will be of interest to neuroscientists and investigators in the extracellular vesicle and proteostasis fields. 

      We now include supporting data using a single copy alternate fluorescent reporter expressed in touch neurons (Fig. 3H).

      In brief, we examined the induction of exophergenesis in an alternative single-copy transgene strain that expresses mKate fluorescent protein specifically in touch receptor neurons. As compared to the multi-copy transgene that is broadly used in this study and expresses mCherry fluorescent protein specifically in touch receptor neurons, the mKate single-copy transgene is associated with a much lower frequency of exophergenesis. However, increasing uterine distension via blocking egg-laying can increase the exophergenesis of the mKate single-copy transgenic line from 0% to approximately 60% on adult day 1, indicating that the observed response is not tied to a single reporter.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors sought to understand the stage-dependent regulation of exophergenesis, a process thought to contribute to promoting neuronal proteostasis in C. elegans. Focusing on the ALMR neuron, they show that the frequency of exopher production correlates with the timing of reproduction. Using many genetic tools, they dissect the requirements of this pathway to eventually find that occupancy of the uterus acts as a signal to induce exophergenesis. Interestingly, the physical proximity of neurons to the egg zone correlates with exophergenesis frequency. The authors conclude that communication between the uterus and proximal neurons occurs through the sensing of mechanic forces of expansion normally provided by egg occupancy to coordinate exophergenesis with reproduction. 

      Strengths: 

      The genetic data presented is thorough and solid, and the observation is novel. 

      Weaknesses: 

      The main weakness of the study is that the detection of exophers is based on the overexpression of a fluorescent protein in touch neurons, and it is not clear whether this process is actually stimulated in wild-type animals, or if neurons have accumulated damaged proteins in relatively young day 2 animals. 

      We now include data using a single copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (new Fig. 3H), supporting that uterine distention, rather than reporter identity, is associated with early life exopher elevation. Data also add to our observations indicating that high protein-expressing strains generally produce higher baseline levels of exophers in early adulthood (for example, Melentijevic et al. (PMID 28178240) documented that mCherry RNAi knockdown in the strain primarily studied here can lower exopher levels).

      The second point raised here, regarding the occurrence and physiological role of early-adult exophers in “native” non-stressed neurons is a fascinating question that we are beginning to address in continuing experiments. Readers will appreciate that quantifying relatively rare, “invisible” touch receptor neuron exophergenesis accurately without expressing a fluorescent reporter is technically challenging. Our speculation, outlined now a bit more clearly in the Discussion here, is that certain molecular and organelle debris that cannot readily be degraded in cells during larval development may be stored until release to more capable degradative neighbors or to the coelomocytes for later management, as one component of the early adult transition in proteostasis (see J. Labbadia and R. I. Morimoto, PMID 24592319). Receiving cells may be primed for this at a particular timepoint, possibly analogous to the “bulky garbage” collection of over-sized difficult-to-dispose-of household items that a town will address with specialized action only at specific times. The prediction is that we should be able to detect some mass protein aggregation through early development, and at least partial elimination by adult day 3; this elimination should be impaired when eggs are eliminated. Initial testing is underway.

      Reviewer #2 (Public Review): 

      Summary: 

      This paper reports that mechanical stress from egg accumulation is a biological stimulus that drives the formation of extruded vesicles from the neurons of C. elegans ALMR touch neurons. Using powerful genetic experiments only readily available in the C. elegans system, the authors manipulate oocyte production, fertilization, embryo accumulation, and egg-laying behavior, providing convincing evidence that exopher production is driven by stretch-dependent feedback of fertilized, intact eggs in the adult uterus. Shifting the timing of egg production and egg laying alters the onset of observed exophers. Pharmacological manipulation of egg laying has the predicted effects, with animals retaining fewer eggs having fewer exophers and animals with increased egg accumulation having more. The authors show that egg production and accumulation have dramatic consequences for the viscera, and moving the ALMR process away from eggs prevents the formation of exophers. This effect is not unique to ALMR but is also observed in other touch neurons, with a clear bias toward neurons whose cell bodies are adjacent to the filled uterus. Embryos lacking an intact eggshell with reduced rigidity have impaired exopher production. Acute injection into the uterus to mimic the stretch that accompanies egg production causes a similar induction of exopher release. Together these results are consistent with a model where stretch caused by fertilized embryo accumulation, and not chemical signals from the eggs themselves or egg release, underlies ALMR exopher production seen in adult animals. 

      Strengths: 

      Overall, the experiments are very convincing, using a battery of RNAi and mutant approaches to distinguish direct from indirect effects. Indeed, these experiments provide a model generally for how one would methodically test different models for exopher production. The paper is well-written and easy to understand. I had been skeptical of the origin and purpose of exophers, concerned they were an artefact of imaging conditions, caused by deranged calcium activity under stressful conditions, or as evidence for impaired animal health overall. As this study addresses how and when they form in the animal using otherwise physiologically meaningful manipulations, the stage is now set to address at a cellular level how exophers like these are made and what their functions are. 

      Weaknesses: 

      Not many. The experiments are about as good as could be done. Some of the n's on the more difficult-to-work strains or experiments are comparatively low, but this is not a significant concern because of the number of different, complementary approaches used. The microinjection experiment in Figure 7 is very interesting, but there are missing details that would confirm whether this is a sound experiment. 

      We expanded the description of the microinjection experiment in both the figure legend and the Methods section to enhance clarity and substantiate the approach.

      Reviewer #3 (Public Review): 

      Summary: 

      In this paper, the authors use the C. elegans system to explore how already-stressed neurons respond to additional mechanical stress. Exophers are large extracellular vesicles secreted by cells, which can contain protein aggregates and organelles. These can be a way of getting rid of cellular debris, but as they are endocytosed by other cells can also pass protein, lipid, and RNA to recipient cells. The authors find that when the uterus fills with eggs or otherwise expands, a nearby neuron (ALMR) is far more likely to secrete exophers. This paper highlights the importance of the mechanical environment in the behavior of neurons and may be relevant to the response of neurons exposed to traumatic injury. 

      Strengths: 

      The paper has a logical flow and a compelling narrative supported by crisp and clear figures. 

      The evidence that egg accumulation leads to exopher production is strong. The authors use a variety of genetic and pharmacological methods to show that increasing pressure leads to more exopher production, and reducing pressure leads to lower exopher production. For example, egg-laying defective animals, which retain eggs in the uterus, produce many more exophers, and hyperactive egg-laying is accompanied by low exopher production. The authors even inject fluid into the uterus and observe the production of exophers. 

      Weaknesses: 

      The main weakness of the paper is that it does not explore the molecular mechanism by which the mechanical signals are received or responded to by the neuron, but this could easily be the subject of a follow-up study. 

      We agree that the molecular mechanisms operative are of considerable interest, and our initial pursuit suggests that a comprehensive study will be required for satisfactory elaboration of how mechanical signals are received or responded to by the neuron.

      I was intrigued by this paper, and have many questions. I list a few below, which could be addressed in this paper or which could be the subject of follow-up studies. 

      - Why do such a low percentage of ALMR neurons produce exophers (5-20%)? Does it have to do with the variability of the proteostress? 

      We do not yet understand why some ALMR neurons of the same genotype will produce exophers and some will not. We know that in addition to the uterine occupation we report here, proteostasis compromise, feeding status, oxidative stress, and osmotic stress can elevate exopher numbers (PMID 34475208); cell autonomous influences on exopher levels include aggresome-associated biology (PMID 37488107) and expression levels of the mCherry protein (PMID 28178240). Turek reports that social interaction on plates can influence muscle exopher levels (PMID 34288362). Thus, although variable proteostress experienced by neurons is likely a factor, we have not yet experimentally defined specific trigger rules. We suspect the summation of internal proteostasis crisis and environmental conditions, including particular force vectors/frequency, will underlie the variable exopher production phenomenon.

      - Why does the production of exophers lag the peak in progeny production by 24-48 hours? Especially when the injection method produces exophers right away?

      The progeny production can track well with exopher production (Fig. 1B), although the nature of egg counts (permanent, one time events) vs. exophers (which are slowly degraded) can skew the peak scores apart. We synchronized animals at the L4 stage. 24 hours later was adult day 1, and we measured then and every subsequent 24 hours. The daily progeny count reflects the total number of progeny produced every 24 hours; exopher events were scored once a day, but exophers can persist such that the daily exopher count can partially reflect slow degradation, with some exophers being counted on two days. We now explain our scoring details better in the Methods section.

      The rapid appearance of exophers, as early as ~10 minutes after sustained injection, is fascinating and probably holds mechanistic implications for exopher biology. For one thing, we can infer that in the mCherry Ag2 background, touch neurons can be poised to extrude exophers, but that the pressure/push acts to trigger or license final expulsion. It is interesting that we found we needed to administer sustained injection of two minutes to find exopher increase (now better emphasized in the expanded Methods section). We speculate that multiple pressure events, or a sustained force vector, might be critical (like an egg slowly passing through). Minimally, this assay may help us assign molecular roles to pathway components as we identify them moving forward.

      - As mentioned in the discussion, it would be interesting to know if PEZO-1/PIEZO is required for uterine stretching to activate exophergenesis. pezo-1 animals accumulate crushed oocytes in the uterus. 

      We have begun to test the hypothesis that PEZO-1 is a signaling component for ALMR exophergenesis, initially using the N and C terminal pezo-1 deletion mutants as in Bai et al. (PMID 32490809). These pezo-1 mutants have a mild decrease in ALMR exophergenesis under normal conditions. However, vulva-less conditions in pezo-1N and pezo-1C increased ALMR exophergenesis from approximately 10% to 60%, similar to the response of wild-type worms to high mechanical stress, data that suggest PEZO-1 is not a required player in mediating mechanical force-induced ALMR exophergenesis. We are currently testing genetic requirements for other known mechanosensors. We intend a comprehensive investigation of the molecular mechanisms of mechanical signaling in a future study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      -The study would be significantly strengthened by the addition of data detecting regulation of exophergenesis by uterine forces in a more physiological context, in the absence of overexpression of a toxic protein. In other words, is this a process that occurs naturally during reproduction, or is it specific to proteotoxic stress induced by overexpression? Perhaps the authors could repeat key experiments using a single copy transgene, and challenge the animals with exogenous proteotoxic stress if necessary.

      We now include data using a single copy alternate fluorescent reporter expressed in touch neurons. Although baseline exopher levels are low in this strain, we demonstrate that inducing egg retention in this background markedly increases exopher generation from a baseline of near zero to ~60% (Fig. 3H), supporting that uterine distention, rather than reporter identity or over-expression alone, drives early life exopher elevation.

      Also noteworthy is that we find exophergenesis in the single-copy transgenic line is only approximately 0.3% on adult day 2 (average in three trials, data not shown), which is much lower than the 5-20% exophergenesis rate typically observed in the multi-copy high expression mCherry transgenic line. Therefore, consequences of overexpression of mCherry likely potentiate exophergenesis.

      -The authors mention that exophergenesis has been described in muscle cells. Is this also dependent on the proximity to the uterus? It would have been interesting to include data on other cell types in the vicinity of the reproductive system.

      Yes, in interesting work on exophers produced by muscle, Turek et al. reported that muscle exopher events are mostly located in a region proximal to the uterus. Moreover, this work also documented that sterile hermaphrodites are associated with approximately 0% muscle exophergenesis, and egg retention in the uterus strongly increases muscle exophergenesis (PMID: 34288362).  

      -Is exophergenesis also induced by other forms of mechanical stress? For example, swimming.

      We have looked at crude treatments such as centrifugation or vortexing without observing changes in exopher levels. Our preliminary work indicates that swimming can increase exophergenesis, and this effect depends on the presence of eggs in the uterus. We appreciate the question, and expect to include documentation of alternative pressure screening in our planned future paper on molecular mechanisms.

      -In Figure 1E, the profile of exopher production for the control condition at 25°C is very similar to the profile observed at 20°C in Figure 1B. However, the profile of progeny production at 25°C is known to have an earlier peak of progeny production. Perhaps egg retention is differently correlated with progeny production at this temperature? The authors could easily test this.

      Overall, exophers (which degrade with time) and progeny counts (a fixed number) have slightly different temporal features, anchored in part by how long exophers or their “starry night” debris persist. Most exophers start to degrade within 1-6 hours (PMID: 36861960), but exopher debris can persist for more than 24 hours. An exopher event observed on day 1 may thus also be recorded at the day 2 time point, which leads to a higher frequency of exopher events on day 2 as compared to day 1.

      We have previously published on the impact of temperature on exopher number (Supplemental Figure 2 in PMID 34475208). In brief, increasing culture temperature for animals that are raised at a constant lifetime temperature modestly increases exopher number; a greater increase in exophers is observed under conditions in which animals were switched to a higher temperature in adult life, suggesting that changes in temperature (a mandatory part of the ts mutant studies) engage complex biology that modulates exopher production. Our previous data show that in a temperature shift to 25°C, the peak of exophers was at adult day 1. Here, Fig. 1B is constant temperature, 20°C; Fig. 1E has a temperature shift 15-25°C. That egg retention might be temperature-influenced is a plausible hypothesis, but given the complexities of temperature shifts for some mutants, we elected to defer drill-down on the temperature-exopher-egg relationship.

      -It is not clear how to compare panels A and B in Figure 3. In panel A the males are present throughout the adult life of the hermaphrodites whereas in panel B the males are added in later life. Therefore, the effect of later-life mating on progeny production is not shown and the title of panel A in the legend is misleading. The authors need to perform a progeny count in the same conditions of mating presented in Figure 3B to allow direct comparison.

      As Reviewer 1 suggested, we performed a new progeny count now presented in new Fig. 3A, which more appropriately matches the study presented in Fig. 3B; legends adjusted.

      -On page 12, the authors state that the baseline of exophergenesis in rollers is 71%, but then attribute the 71% in Figure 4F to exophergenesis specifically in ALMR that is posterior to AVM. The authors need to clarify this point.

      Good catch on our error. The baseline of exophergenesis in rollers is ~40%, and we corrected the main text.

      -Considering the conclusion of Figure 2 that blocking embryonic events past the 4-cell stage does not impact exopher production, it would have been interesting to compare the uterine length for emb-8 and for mex-3, since it is quite intriguing that the former suppresses exopher production while the latter has no effect.

      We repeated the emb-8 and mex-3 RNAi for these studies and encountered variability in outcome for 2 cell stage disruption via emb-8 RNAi, which is consistent with the range of published endpoints for emb-8 RNAi. We elected to include these emb-8 findings in the figure legend 2G, but removed the RNAi data from the main text figure. mex-3 uterine measures are added to revised panels 5H, 6I.

      Reviewer #2 (Recommendations For The Authors): 

      -Leaving the worms in halocarbon oil for too long (e.g. 10 min) can desiccate and kill them. Did the authors take them out of the oil before analyzing exopher production? The authors refer to these as 'sustained injections' without much description beyond that. As the worms are very small, the flow rate needed for a sustained injection over 2 minutes must be very low - so low that the needle is in danger of being clogged. Do the authors have an estimate of how much fluid was injected or the overall flow rate? I realize the flow rate measured outside of the worm may not compare directly to that of a pressurized worm, but such estimates would be instructive, particularly if they can be related to the relative volume of the eggs the injection is trying to mimic.

      After injection or mock injection, we removed the animal from the oil and flipped it if necessary to observe the ALMR neuron on the NGM-agar plate. We have now expanded the description of the experimental details of injection, including the estimated flow rate, in the revised Methods section.

      - The authors describe the ALMR neurons as "proteostressed", but I am not clear on whether these neurons were treated in a unique procedure to induce such a state or if the authors are merely building on other observations that egg-laying adults are dedicating significant resources to egg production, so they must be proteostressed. If they are not inducing a proteostressed state in their experiments, the authors should refrain from describing their neurons and effects as depending on such a state.

      We revised the text to feature more explicitly the published evidence that the ALMR neurons we track with mCherryAg2 bz166 are likely proteostressed. Overexpression of mCherry in bz166 is associated with enlargement of lysosomes and formation of large mCherry foci that often correspond to LAMP::GFP-positive structures in ALMR neurons (PMID: 28178240; PMID: 37488107). Marked changes in ultrastructure reflect TN stress in this background. These cellular features are not seen in wild type animals. We previously published that over-expression of mCherry, polyQ74, polyQ128, or Aβ1-42 (each of which enhances proteostress) increases exophers (PMID: 28178240). Likewise, most genetic compromises of different proteostasis branches (heat shock chaperones, proteasome, and autophagy) promote exophergenesis, supporting exophergenesis as a response to proteostress. In sum, the mCherryAg2 bz166 neurons appear markedly more stressed than those of a non-over-expressing line and produce more exophers. RNAi knockdown of the mCherry lowers exopher levels (PMID: 28178240).

      In response to reviewer comment, we added a study with a single copy mKate reporter (new data Fig. 3H). We find a very low baseline of exophers in this background. This would support that high autonomous compromise associated with over-expression influences exopher levels. Interestingly, however, we found that ALMR neurons expressing mKate under a single-copy transgene still exhibit excessive exopher production (>60%) under high mechanical stress (Fig. 3H). These data are consistent with ideas that mechanical stresses can enhance exopher production, and may markedly lower the threshold for exophergenesis in close-to-native stress level neurons.

      - The authors should include more details on the source and use of the RNAi, for example, if the clones were from the Ahringer RNAi library, made anew for this study, or both.

      We now add this information in the methods section.

      - I would be curious if the authors would similarly see an induction in exopher production after acute vulval muscle silencing with histamine. I'm not suggesting this experiment, but it may offer a way to induce exophers in a more controlled manner.

      This is a great suggestion that we will try in future studies.

      - I am not sure if Figure 5 needs to be a main figure in the paper or if it would be more appropriate as a supplement.

      We considered this suggestion, but we think that the strikingly strong correlation of uterus length and exopher levels is a major point of the story, and these data establish a metric that we will use moving forward to distinguish whether an exopher modulation disruption is more likely to act by modulation of reproduction or modulation of touch neuron biology. For this reason we elected to keep Figure 5 in the main text.

      Reviewer #3 (Recommendations For The Authors): 

      -The Statistics section in the methods should be expanded to describe the statistics used in the experiments that aren't nominal, of which there are many.

      We have updated and expanded the statistics section.

      -P.2 Line 49 spelling 'que' should be queue (I remember this by the useless queue of letters lined up after the 'q').

      Corrected 

      -The introduction has a bit too much information about oocyte maturation, not relevant to the study.

      We agree that the information about oocyte maturation is not critical for laying out the related experiments and cut this section to improve focus.

      -p.3 line 22: Some exophers are seen on Day 3, so this should be restated for accuracy.

      Corrected

      -p.3 line 26. Explain here why sperm is necessary (oocytes don't mature or ovulate effectively without sperm).

      We added this clarifying explanation.

      -p.3 line 44 Clarify in the spe-44 the oocytes are in the oviduct (not the uterus). Might be helpful to include a DIC image to accompany the helpful diagram in Figure 1D. 

      We added a sentence describing the impact of sperm absence on oocyte maturation, progression into the uterus, and retention in the gonad, with reference to PMID: 17472754. We were able to add a DIC image to the tightly packed Figure 1.

      In Supplemental Figure 6, we now include a field picture of oocyte retention in the sem-2 mutant and upon treatment of lin-39(RNAi).

      -p.5 line 3 in the Figure 1D legend; recommend delete 'light with' which is confusing and just refer to the sperm as dark dots. 

      Corrected

      -p.6 line 22-24 Check for alignment of the statements with Figure 2 (2F is cited, but it should be 2G).

      Corrected

      -p12 line 13-15; Many ALMRs not in the egg zone (70%) did not produce exophers - this is still quite a lot. It would be good to state this section in a more straightforward way (less leading the reader) and if possible to give a possible explanation.

      We modified the text to be less leading: “Thus, although ALMR soma positioning in the egg zone does not guarantee exophergenesis in the mCherryAg2 strain, the neurons that did make exophers were nearly always in the egg zone.”

      -p.15 paragraph 3 - clarify how uterine length was controlled for the overall body length of the worm.

      We did not systematically measure body length, but rather focused on uterine distention. It would be of interest to determine if length of the body correlates with uterine size, and then address how that relationship translates to exopher production but here our attention came to rest on the striking correlation of uterine length and number of exophers.

      -p.17 line 23-25; Could be stated more simply. 

      We adjusted the text: “Moreover, the oocyte retention was similarly efficacious in elevating exopher production to egg retention, increasing ALMR exophergenesis to approximately 80% in the sem-2(rf) mutant (Fig. 6C)”.

      -p.23 Line 4. I think by the time the reader reaches this sentence, the egg-coincident exophergenesis will not be 'puzzling'. 

      Agreed, corrected.

      -p.26, Line 22, Male 'mating', not 'matting'.

      Corrected.

      -Throughout, leave space between number and unit (this is not required for degree or percent, but be consistent). 

      Corrected.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

      We thank the reviewers for their insights and helpful suggestions on the manuscript. Based on these, we have prepared a revision plan for this manuscript, which is outlined below. We believe these revisions will improve the overall quality of the manuscript.

      2. Description of the planned revisions

      Insert here a point-by-point reply that explains what revisions, additional experimentations and analyses are planned to address the points raised by the referees.


      Reviewer #1

      (Evidence, reproducibility and clarity (Required)):

      Summary:

      This study builds on previous work from the same group, where they use Drosophila photoreceptors as a model system to investigate the role of ER-plasma membrane contact sites in an in vivo setting. The authors recently described a role of the ER-PM contact site protein dEsyt in regulating photoreceptor function in Drosophila. In this follow-up study, they explore whether this function of dEsyt is connected to Ca2+ signaling downstream of photoreceptor activation. Using a dEsyt mutant that should be unable to bind Ca2+, they find that Ca2+ to some extent is required for dEsyt localization, membrane contact site formation and photoreceptor function.

      Major comments:

      The use of photoreceptor cells in Drosophila is an elegant model system that enables studies of membrane contact sites and associated proteins in a native condition. The data presented by the authors clearly show that these structures are important for photoreceptor function, and that dEsyt plays a role at these sites. However, this was already known from previous studies by the same group. When it comes to whether these contacts are sensing Ca2+ changes and if these changes are acting through dEsyt, which is the focus of the current manuscript, the results are unclear to me and would need to be clarified by the authors both in text and with new experiments.

      1) What is the role of cellular Ca2+ signaling in the regulation of dEsyt function? There are several aspects here that needs to be clarified. 1) How is WT dEsyt localization regulated by Ca2+? This could for example be evaluated in the mutant flies used in Fig. 1 (trpl302; trp343), where lack of light-induced Ca2+ influx would be predicted to result in a localization of dEsyt that resembles that observed for dEsytCaBM. 2) Is Ca2+ important for dEsyt localization, lipid exchange or both? The authors express a version of dEsyt with mutation made in all three C2 domains. In mammalian E-Syts, Ca2+ binding to the C2A domain is important for lipid exchange while binding to C2C (in E-Syt1) is important for interactions with lipids in the plasma membrane. Using more carefully designed mutants will allow the authors to determine how Ca2+ regulates dEsyt function in vivo. In addition, the authors must show experimentally that the mutant dEsytCaBM is unable to bind Ca2+ (could e.g. be done by acute Ca2+ changes in the cell-based model used in Fig. 3). Writing that "This transgene carrying a total of nine mutations should render the protein unable to bind calcium" (p. 6, line 173) is not sufficient.

      1) How is WT dEsyt localization regulated by Ca2+?

      We agree that further experimental evidence would be helpful in establishing the significance of cellular Ca2+ signaling in the control of dEsyt function. As suggested by the reviewer, the localization of wild type dEsyt will be examined in the mutants: norpAP24 (PLC null mutant) and trpl302; trp343 (protein null mutants of TRPL and TRP channels respectively) in which light induced calcium influx is eliminated. These data will be included in the revision.

      2) Is Ca2+ important for dEsyt localization, lipid exchange or both?

      We have already performed experiments to address the question of how important calcium binding to dEsyt is for lipid transport at the ER-PM interface in Drosophila photoreceptors. These results indicate a previously unexpected role for lipid exchange and will be included in the revision.

      3). Writing that "This transgene carrying a total of nine mutations should render the protein unable to bind calcium" (p. 6, line 173) is not sufficient.

      We concur with the reviewers that at present we do not have experimental data to demonstrate that dEsytCaBM can't bind Ca2+. However, as Reviewer 4 pointed out, it will be challenging to demonstrate this experimentally. A direct proof would only come from measurements of the calcium binding affinity of dEsyt (which involves protein purification that is beyond the scope of the current work). An indirect demonstration would be any cellular or in vivo experiment. In addition to the in silico analysis already included in Fig 2 C-F, we propose the following to provide additional evidence to strengthen our in silico analysis: Use AlphaFold model to demonstrate that the arrangement of the calcium binding residues in the C2 domain of dEsyt is compatible with Ca2+ binding.

      2) The localization of dEsyt shown in Fig. 3B is a bit confusing. First of all, I would recommend including markers of the ER and the plasma membrane, because without these it is difficult to make statements about the localization of dEsyt to these structures.

      As suggested, to better appreciate the localization of dEsyt in photoreceptors, we will perform colocalization of dEsyt with markers of the PM (Rhabdomere) and ER (Sub Microvillar Cisternae).

      Second, it appears that WT dEsyt localizes to the reticular ER, and that the CaBM version localizes to the plasma membrane. This is somewhat opposite to mammalian ESyts, where mutations that prevent Ca2+ binding either had no effect (for ESyt2) or prevented (for ESyt1) the interaction with the plasma membrane. It also appears different from the localization in vivo (Fig. 3C). Clarifying this will be important. It will also be important to connect this localization to changes in Ca2+ and not just to the localization of a mutant that may or may not be deficient in Ca2+ binding (see comment above).

      In considering this comment, we need to bear in mind the following:

      • Mammalian cells have three genes that encode for Esyt: Esyt 1, 2 and 3 whereas the Drosophila genome encodes only a single gene for Esyt.
      • In terms of sequence similarity and structure, dEsyt and hEsyt2 are very similar. However, in contrast to hEsyt2 and hEsyt3, which localize to the plasma membrane (PMID: 17360437), dEsyt acts like hEsyt1 and localizes to the ER-PM junctions.
      • A single study (PMID: 27065097) has shown that the SMP domain of Esyt1 can transfer lipids in an in vitro assay. In our studies, we have noted an unexpected function for the SMP domain of dEsyt for in vivo function as measured through phenotypes in the eye (data will be presented in the revised ms).
      • While knocking out the single dEsyt in Drosophila photoreceptor neurons results in phenotypes (Nath et al., PMID: 32716137), to date, knocking out all three Esyts in mammalian cell culture models or mice has not revealed an in vivo phenotype.

      Bearing these points in mind, it may not be reasonable to expect every observation on mammalian Esyt to be recapitulated in the fly system or vice versa.

      3) I don't fully understand the time course of events. The authors show that dEsytCaBM is mislocalized already at day 1 in dark-reared flies (Fig. 3C) but this mislocalization is not accompanied by a change in MCS density or gap distance, and consistently does not influence the localization of RDGB. The authors next expose the flies to constant light illumination to trigger Ca2+ dependent signaling, and this leads to mislocalization of RDGB, perhaps indicating changes in MCS (this is not shown). From these results it is difficult to know what the role of dEsyt is. It would be necessary to also show a control where Ca2+ signaling is not induced, e.g. a parallel dark-control (same number of days but no illumination).

      It is important to remember that even complete loss of Esyt does not result in altered MCS or mislocalization of RDGB on day 1 post eclosion. This has been published by us previously (Nath et al., PMID: 32716137). Since we show in this manuscript that dEsytCaBM exerts a dominant negative effect when expressed in wild type and phenocopies dEsytKO, one might expect expression of dEsytCaBM to also lead to altered MCS density and mislocalization of RDGB by 6D constant light.

      Bearing this in mind, we will incorporate the following data in the manuscript:

      • Addition of MCS density in dEsytKO photoreceptors at Day 1 in Figure 3C.
      • Electron Microscopy to check MCS density in Rh1>dEsytCaBM at Day 6CL with appropriate control genotypes.
      • Confocal Imaging: RDGB staining in Rh1>dEsytCaBM flies reared to Day 6CD, with appropriate control genotypes, as a dark control in which only reduced Ca2+ signaling is induced, due to dark noise or spontaneous PLC activation.

      This is particularly important given that the authors show in Fig. 1 that preventing Ca2+ influx had a dramatic impact on MCS density even at day 1 (which is in sharp contrast to dEsytCaBM-expressing flies, that show normal morphology at day 1, which rather implies that dEsyt is not a major Ca2+ effector).

      In thinking about this comment, it is important to bear in mind the details of the experimental paradigm in use in each of the experiments while drawing comparisons between the observed results. It is to be noted that throughout the manuscript dEsytCaBM is expressed selectively in photoreceptors using the Rhodopsin enhancer, which drives expression of the transgene during late eye development. By contrast, in germ line mutant strains such as trpl302;trp343 the channels are blocked throughout development. Thus the phenotypes of trpl302;trp343 might be broader than those of expressing dEsytCaBM. Therefore, mutating the calcium binding residues of dEsyt and expressing it using the Rh1 enhancer in a specific developmental time window might not have the same impact on contact site density as completely blocking the major calcium permeable channels, TRP and TRPL, that are important to sustain the ongoing phototransduction cascade throughout development.

      4) The experiments done in dEsyt KO flies are important, and here the authors show that dEsyt1 could to some extent rescue all phenotypes. Some results are a bit puzzling. For example, dEsyt1CaBM localization in dEsyt1 KO flies is identical to that of WT dEsyt (Fig. 5C), which is in sharp contrast to the data shown in Fig. 3C. What is the reason for this? I would have anticipated the opposite (i.e. that in WT flies, dEsytCaBM can form dimers with endogenous dEsyt through SMP-domain interactions which may have an impact on its localization and the function of endogenous dEsyt, but that in the dEsyt KO cells, dEsytCaBM would show a different localization due to the lack of endogenous dEsyt to interact with). It is important to clarify as one of the major observations here is that dEsytCaBM no longer localizes to MCS. Since the CaBM version of dEsyt could rescue, to some extent, MCS density and delay photoreceptor degeneration, this implies that Ca2+ may not be required for regulation of dEsyt function or that the mutant is still able to partially bind to Ca2+.

      The localization shown in Fig 5C is not of dEsytCaBM in dEsytKO photoreceptors but the localization of RDGB in Rh1>dEsytCaBM; dEsytKO at Day 1 (Figure 5C i) and as a function of age and illumination- Day 6CL (Figure 5C ii).

      One experiment that would help the authors determine the function of dEsyt in vivo would be to use a mutant that lacks functional SMP domain (ideally also with and without mutations in the C2-domains).

      There is information available to address the question of how the lipid binding module, SMP is important to render dEsyt functional at the ER-PM interface in Drosophila photoreceptors. The same will be included in the revision.

      5) PLC activation typically couples to rapid signaling and involves hydrolysis of PIP2 and release of Ca2+ from the ER. Mammalian Esyts also require PIP2 for plasma membrane binding (through interactions with C2-domains), so constitutive PLC activity would be expected to impair ESyt localization to MCS. Here, the authors expose flies for days of constant illumination. How does this influence plasma membrane PIP2 levels and could this be of relevance for how data is interpreted?

      This is an interesting question from the reviewer. However, we would like to clarify the fact that constitutive activation of PLC is different from constant activation of PLC during illumination. Flies have robust mechanisms for controlling PLC turnover and PIP2 levels during continuous illumination and Ca2+ is a key regulator of this process; the underlying mechanisms have been described by Raghu and Hardie in multiple past papers (PMID: 11343651, PMID: 15355960). This is why, apart from adaptation, flies grown in constant light for many days do not show electrophysiological defects and neither do they undergo retinal degeneration. We will however measure the kinetics of PIP2 resynthesis in (i) wild type (Day 1 vs Day 6CD vs Day 6CL) and (ii) Control, Rh1>dEsyt and Rh1>dEsytCaBM (Day 1 vs Day6CL). This might reveal some interesting insight into the mutants.

      Do the authors know whether the CaBM mutant has reduced affinity for PIP2?

      The ability of wild type dEsyt to bind PIP2 has not been determined. We will test this and if it does so, the impact of CaBM on PIP2 binding can be tested.

      Minor comments:

      • The overexpression of WT dEsyt had a dramatic impact on MCS density and gap distance, while expression of dEsytCaBM did not. If these contacts are important for photoreceptor function, is it not surprising that such a dramatic change in photoreceptor structure was without effect on function? This should be further discussed.

      The establishment of more contact sites and reduction in contact site distance in Rh1>dEsyt::GFP photoreceptors is likely indicative of the proposed tethering role of the protein at the ER-PM MCS. An increase in contact site density or a reduction in distance need not directly parallel an increase in the levels of MCS proteins that are expressed at these contact sites to enhance the ongoing signal transduction. We will test this idea proposed by the reviewer and include the following data in a revision to strengthen our statement:

      • RDGB levels in control vs Rh1>dEsyt::GFP - Western blot
      • Electroretinograms from the genotypes indicated above as a functional readout of the ongoing signaling cascade.
      • PIP2 kinetics in control vs Rh1>dEsyt::GFP to understand if establishing more contact sites can enhance the replenishment of the lipid at the PM.

      2) How is quantification of MCS density and gap distance influenced by retinal degeneration (e.g. induced by dEsyt KO)?

      Wherever we have analyzed MCS density or gap distance, these experiments have been done in flies at ages prior to the onset of retinal degeneration defined as collapse of the microvilli of the rhabdomere. Therefore, our measurements of MCS density and gap in this paper are not affected by retinal degeneration.

      3) The graphical abstract is a bit confusing. It seems to suggest that changes in dEsyt is a consequence of ageing and does not show any role of this protein in photoreceptor function. I think that the abstract could be improved to more clearly highlight the findings in the manuscript. For example, it doesn't at all show the difference in localization between WT and CaBM.

      We will modify the graphical abstract.

      4) P. 5, line 135 the authors state that "The tethering and lipid transfer activity of mammalian Esyts are reported to be influenced by Ca2+". This is a massive understatement. Ca2+ is a critical regulator of Esyt function in mammalian cells.

      The statement will be modified.

      5) In figure legend 1B and C: correct µM to µm.

      Changes will be incorporated as per the suggestion.

      6) In figure legend 2A: should be red rectangles and not black rectangles.

      Changes will be incorporated as per the suggestion.

      7) In Fig. 2B: specify which isoform of human ESyt that is shown.

      Changes will be incorporated as per the suggestion.

      8) In Fig. 2C: do the authors mean D374 or D384 (as indicated in Fig. 2A)?

      Changes will be incorporated as per the suggestion; the residue is D374.

      Significance

      Light-induced signal transduction in photoreceptor cells involves Ca2+ influx and signaling and also depends on correct formation of ER-plasma membrane contact sites. In mammalian cells, the Esyts (esp. Esyt1 and Esyt2) localize to ER-PM contacts in a Ca2+-dependent manner, and the ion has dual effects in both enriching the protein at the membrane contact sites and in promoting lipid transport. Mammalian Esyts form homo- and heterodimers, and the properties of the dimers depend on their composition (PMID: 26202220). Drosophila only have one Esyt (dEsyt) which is structurally most similar to mammalian Esyt2, and the authors have previously shown how this protein is required for photoreceptor function (PMID: 32716137), although the role of Ca2+ was not investigated in that study. However, an earlier study has shown that mutations of all Ca2+-coordinating residues in dEsyt impair protein function in Drosophila neurons (PMID: 28882990), so a similar Ca2+-dependence in the retina would be expected. The results from the present study confirm the requirement of Ca2+ signaling for dEsyt function, and extend this Ca2+-dependent regulation to also involve photoreceptor-induced Ca2+ signaling, which corroborates many other studies showing the requirement of Ca2+ signaling for the regulation of Esyt function in mammalian cells (e.g. PMID: 23791178; PMID: 27065097; PMID: 29222176; PMID: 26202220; PMID: 24183667; PMID: 30589572). As such, the results from this study represent an incremental step towards understanding Esyt function in vivo. These results would be of greatest interest to researchers working on photoreceptor function, and of some interest to a broader audience working on membrane contact sites and signal transduction. My own background is in mammalian cell biology, with a focus on lipid and Ca2+ signaling and inter-organelle communication. I have limited understanding of the model system used here (Drosophila photoreceptor cells).


      We would like to provide an alternative perspective on the reviewer’s view that “As such, the results from this study represent an incremental step towards understanding Esyt function in vivo.”

      We are well aware of the content in several studies of Esyt in mammalian cells including the ones cited by the reviewer (e.g. PMID: 23791178; PMID: 27065097; PMID: 29222176; PMID: 26202220; PMID: 24183667; PMID: 30589572). These have been cited in our manuscript. However, it is important to recognize that each of these studies is an analysis of the properties of mammalian Esyt as a molecule in the context of Ca2+. However, none of these studies addresses the key question of whether the regulation of Esyt by Ca2+ is important for cellular function or to support cell physiology. The reason for this is quite straightforward and well known in the field. To date, there is no cellular or physiological phenotype that is reported to depend on endogenous Esyt function in mammalian cellular or animal models. As an illustrative example, deletion of all three mammalian Esyt does not affect cell signalling (PMID 23791178) including Ca2+ signalling and a triple knockout of all three Esyt in mice (PMID: 27348751) has no discernable phenotype.

      By contrast, deletion of the single Esyt gene in Drosophila results in robust phenotypes in adult photoreceptors (PMID: 32716137). Using these phenotypes, in this manuscript we study the importance of Ca2+ dependent regulation of cellular functions mediated by dEsyt. Therefore, this study fills an important gap in establishing the mechanism by which dEsyt proteins regulate cellular functions in vivo, in a Ca2+ dependent manner. We respectfully ask that this not be caricatured as an incremental step.


      Reviewer #2

      Evidence, reproducibility and clarity

      Esyt is a C domain (a Ca2+ binding domain) containing protein that localizes to the ER-MCS, playing a role in ER-mitochondria tethering and lipid transfer. At the same time, proteins at the ER-MCS are well-positioned to sense changing levels of Ca2+. Previous studies reported that loss of Esyt in Drosophila causes a loss of ER-PM integrity and retinal degeneration. Here, the authors report the consequence of disrupting the Esyt C domain in Drosophila photoreceptor cells. They used in-silico strategies to identify the Ca2+ contacting residues within the C domain and generated transgenic flies containing either the wild type or the Esyt-CaBM mutants. They show that the wild type transgene rescues several Esyt KO phenotypes in the Drosophila photoreceptors. In some cases, they report dominant negative effects of Esyt-CaBM overexpression.

      This is a straightforward structure-function analysis of the Esyt C domain. Overall, the experiments are well executed. At the same time, a few aspects of the manuscript could be further improved. For example, the authors analyze multiple aspects of photoreceptor integrity. In some cases, they show that the mutant Esyt transgene shows dominant negative effects. In others, there is no evidence or even a partial function. Clarifying these points could be helpful. Below are a few specific points for the authors' consideration:

      Major Comments

      1. RDGB is a protein that localizes to the ER-MCS. Esyt-CABM-GFP expression causes RDGB mis-localization even in the presence of wild type Esyt expression, suggestive of a dominant negative effect (Fig. 4C). But Esyt CaBM-GFP expression doesn't seem to have a dominant negative effect on contact site distance (Fig. 4D). Are the authors not seeing a dominant negative effect because they didn't examine older flies? Or, is there a distinct effect of Esyt CaBM on RDGB localization and contact site distance? If there is a distinct effect, what is the reason?

      As the reviewer correctly mentions, we are not seeing a dominant negative effect of dEsytCaBM::GFP expression on contact site distance because we didn't examine older flies.

      A dominant negative effect of dEsytCaBM on the wild type protein is observed in all phenotypes analyzed. The contact site distance analysis shown in the paper is done on day 1 old constant dark reared flies. Contact site distance exhibited by dEsytCaBM is like that of dEsytKO photoreceptors at day 1 post eclosion. dEsyt deprived photoreceptors are comparable to their wild type counterparts at Day 1 in all aspects of phototransduction (PMID: 32716137). But as a function of age and illumination, the dEsytKO photoreceptors exhibit progressive loss in contact site integrity, followed by induction of retinal degeneration and RDGB mis-localisation (PMID: 32716137). These observations are consistent in dEsytCaBM.

      During the revision, the following experiments will be included to strengthen this statement:

      • Add the MCS density and gap distance in dEsytKO photoreceptors at Day1 in Figure 3C.
      • Electron Microscopy to check MCS density and distance in Rh1>dEsytCaBM at Day 6CL with appropriate control genotypes.

      Esyt-CABM-GFP partially rescues the Esyt KO phenotype in retinal degeneration (Fig 6). This is surprising since cellular assays in Fig 4 show a failure of Esyt-CaBM to localize to ER-MCS. The results here contrast with earlier data showing that Esyt-CABM has dominant negative effects. How will the authors interpret the results? Is it possible that Esyt-CABM still has some residual Ca2+ binding activity? Alternatively, does this result imply that Esyt can still function (albeit at lower capacity) without binding Ca2+? Is there Esyt function unrelated to ER-MCS site maintenance when it comes to its role in retinal degeneration? A reasonable explanation is warranted.

      Partial rescue of dEsytKO phenotypes in Rh1>dEsytCaBM; dEsytKO photoreceptors indicates that apart from calcium sensing, there might be another function for dEsyt at the ER-PM interface which is yet to be discovered.


      Minor Comments:

      Figure legends refer to "SMC" (I am guessing they are referring to Sub microvillar cisternae) without defining it in the text.

      Changes will be incorporated as per the suggestion.


      Significance

      This study will be of interest to those generally interested in the ER mitochondria contact sites. The main significance here is in dissecting the role of the C-domain within the Esyt protein. The authors demonstrate a physiological role using Drosophila photoreceptors as a model.

      We thank the reviewer for appreciating the significance of our study, which seeks to show the significance of the Ca2+ regulation of dEsyt for in vivo function.

      Reviewer #3

      (Evidence, reproducibility and clarity (Required)):

      Summary

      In the present work, the authors explore the role of Ca2+ binding to Esyt in the regulation of ER-PM contact sites using drosophila photoreceptors as a model system. By expressing in wild type or in EsytKO flies a mutated version of dEsyt which is predicted to lose Ca2+ binding, they highlight a potential role of Ca2+ binding to Esyt in the regulation of ER-PM contact sites density and the development of rhabdomeres. The data clearly show the effect of Esyt mutant during development of photoreceptors in Drosophila. However, as discussed below, one essential missing point is the experimental proof that the mutant has indeed lost its ability to bind Ca2+, and that PIP2 binding is not perturbed.

      Major comments

      1. One major comment is the lack of experimental proof that the EsytCABM mutant is indeed unable to bind Ca2+. The MIB tool only gives a prediction and it is not sufficient to prove their statements throughout the manuscript on the requirement of Ca2+ binding for the regulation of MCS.

      We understand the reviewer’s comment that this manuscript does not contain experimental data demonstrating that dEsytCaBM does not bind Ca2+. However, as Reviewer 4 pointed out, it will be challenging to demonstrate this experimentally. A direct proof would likely come from measurements of the calcium binding affinity of dEsyt (which involves protein purification that is beyond the scope of this work). An indirect demonstration would be any cellular or in vivo experiment or any additional in silico analysis. To provide additional indirect evidence to address this question, we will:

      • Use the AlphaFold model to demonstrate that the arrangement of the calcium binding residues in dEsyt is compatible with Ca2+ binding.
      • Evaluate if the wild type dEsyt is mislocalized in the photoreceptors upon eliminating the calcium entry to these specialized sensory neurons. The localization of wild type dEsyt will be examined in the mutants: norpAP24 (PLC null mutant) and trpl302; trp343 (protein null mutant of TRPL and TRP channels respectively) in which light induced calcium influx is eliminated.

      Moreover, they should check experimentally the potential differences in the capacity of EsytCABM mutant to bind PI(4,5)P2, which can potentially perturb its subcellular localization.

      As recommended by the reviewer, it is important to determine the PIP2 binding capacity of dEsytCaBM. The ability of wild type dEsyt to bind PIP2 has not been determined. We will test this and if it does so, the impact of CaBM on PIP2 binding can be tested.

      Figure 1A: the legend on the right side of the scheme is missing. On the left, RDGB and dEsyt don't associate with the PM.

      Changes will be incorporated as per the suggestion.

      line 125: the authors should describe more precisely the Trp mutant that they used.

      The text will be modified.

      Concerning the quantification of MCS density done throughout the paper, can the authors mention what they considered as an MCS, in other words, what distance they defined as the maximal distance between the ER and the PM.

      We used fixation methods that allow enhanced membrane preservation and better visualization of membranes and MCS (PMID: 2496206). Such images allowed us to quantify the fraction of SMC that are present at the base of the microvilli in each ultrathin section of a photoreceptor. The MCS is the dark stretch that can be seen at the base of the rhabdomere in each TEM image (PMID: 32716137). Contact site distance measured is the absolute distance between the visible demarcation of the PM and SMC as indicated by the yellow arrows in Figure 4D iii, vi, and ix.

      Figure 3: the localization of Esyt and EsytCABM in S2R cells and in vivo is not precisely analyzed: a co-staining with PM and ER markers should be added in order to state the localization at ER-PM MCS or at apical PM.

      As suggested, to better understand the compartmental localization of dEsyt in photoreceptors, we will use markers of PM (Rhabdomere) and ER (Sub Microvillar Cisternae) and conduct co-localization assays.

      line 181: the authors should specify in which membrane compartments Esyt is localized.

      The text will be modified.

      line 185-187: the conclusion here doesn't seem to fit the data, as the EsytCABM mutant looks enriched at ER-PM contact sites.

      As previously answered, we will remark on whether there is an enrichment of dEsytCaBM at the ER-PM contact sites following the co-localization experiment that is recommended in Q5.

      a paragraph on the production of Drosophila transgene mutants should be added to the Mat et Med section.

      The text will be added as suggested.

      considering the phenotypes observed for the EsytCABM mutant in vivo, the authors should provide an analysis of the level of expression of the exogenous proteins Esyt and EsytCABM by western blot in the different backgrounds. EsytCABM seems to be expressed at lower levels in Figure 3C.

      As per the suggestion, western blot analysis will be conducted and better representative confocal images depicting the protein levels will be added in the manuscript.

      Fig 4D: considering the perturbation of RDGB localization observed at Day 6, the authors should analyze the organization of MCS by TEM at Day 6, in addition to Day 1.

      We agree that to support the observation of RDGB mis-localization, the decrease in contact site integrity as a function of age and illumination (Day6CL) should be evaluated in Rh1>dEsytCaBM photoreceptors. The manuscript revision will include data from this experiment.

      the EsytCABM mutant exhibits strong dominant negative effects, but rescues completely or partially some of the phenotypes of Esyt KO: could the authors discuss and provide some hypothesis on this apparent discrepancy?

      We are unsure what the reviewer means by “apparent discrepancy”. When dEsytCaBM is expressed in wild type photoreceptors, it exhibits a strong dominant negative effect presumably by inhibiting the function of wild type dEsyt protein.

      dEsytKO is a protein null allele. Therefore, when dEsytCaBM is expressed in the dEsytKO background it does not exert a dominant negative effect as there is no wild type protein to interact with. The partial rescue of dEsytKO phenotypes by Rh1>dEsytCaBM; dEsytKO photoreceptors likely indicates that calcium binding is not the sole factor affecting dEsyt function at the ER-PM interface.

      lines 230-233: the sentence is not clear. I don't see any consistency between data in Figure 5B, showing only very partial rescue by EsytCABM, and the data in Figure 5C (ii) showing complete rescue of RDGB localization by EsytCABM.

      The time point at which RDGB localization was analyzed (six days of continuous light exposure following eclosion) is critical to this comment. The degeneration kinetics in Figure 5B show that neurodegeneration begins on Day 8 post-eclosion in both dEsytKO and Rh1>dEsytCaBM; before that, on Day 6, RDGB is already mislocalized from the base. In Rh1>dEsytCaBM; dEsytKO, however, the onset of degeneration is delayed: the photoreceptors retain an intact structure until Day 8 or Day 10, and measurable retinal degeneration begins on Day 12. This may be why RDGB remains correctly localized in Rh1>dEsytCaBM; dEsytKO at Day 6CL.

      Figure 6D: could the authors comment the increase of MCS density observed in Esyt-GFP expressing flies.

      Esyt is proposed to function as a tether that connects the ER and PM (PMID: 23791178; PMID: 27065097; PMID: 29222176), bringing them closer together. Based on this idea, perhaps by expressing dEsyt::GFP we are drawing the membranes together thus establishing more MCS.

      on several TEM images, some pictures illustrating different conditions look very similar, as if they were serial cuts: Fig 1B (Day 1 and Day 14), Fig 4D (Rh1 and Rh1>dEsytCABM::GFP), Fig 6B Day 1 and Day 14 and Fig 6C Day 1. Could the authors check if there was a mistake with these pictures?

      The images are not taken from serial sections of the same TEM block as is evident from the arrangement of nucleus of each photoreceptor cell. As mentioned in the figure legends, all experiments are carried out using 3 independent blocks (N=3 fly heads) prepared from each genotype and 10 photoreceptors from each block/ fly retinae are used for quantification of contact site density/ contact site distance. Aside from the arrangement of the accessory cells and cellular nuclei, the TEM images will appear very similar since Drosophila photoreceptor neurons are symmetrically arranged, with around 700–800 ommatidia per eye each comprising 8 photoreceptors.

      Minor comments:

      • lines 84-88: the sentence is not clear. Besides, the authors should specify what they mean by "extra-cellular Ca2+ influx enhance ER-PM contact sites". Which parameter exactly has been shown to be regulated by Ca2+?

      The paper by Idevall-Hagren et al. proposes that following store operated Ca2+ influx, Esyt1 translocates to ER-PM junctions and the number of ER-PM contact sites increases. Please refer to this section of the publication from Idevall-Hagren et al. (2015) (PMID: 26202220):

      “As detected by TIRF microscopy, the depletion of Ca2+ from the lumen of the ER occurring under these conditions led to a progressive accumulation of ER‐anchored STIM1 at the PM, where it activates Orai Ca2+ channels (Fig 4C). Subsequent addition of 1–10 mM Ca2+ to the extracellular medium, either in the absence or in the presence of SERCA inhibitors, caused a massive increase in cytosolic Ca2+ (SOCE) through the activated Ca2+ channels (Figs 4A and EV4D–G). Such increase induced a very robust translocation of E‐Syt1 to the PM (Figs 4B and EV4D–G), which, in the absence of SERCA inhibition (i.e., when a reversible inhibitor of the SERCA pump had been washed out), preceded the dissociation of STIM1 and the inactivation of SOCE (Fig 4D). Inspection of TIRF microscopy images during the manipulation showed that E‐Syt1 does not form new contacts but populates and expands contacts previously occupied by STIM1.”

      • lines 108-110: can you give the reference?

      Reference for the localization of dEsyt to ER-PM MCS: Nath et al. (PMID: 32716137).

      Reference for the localization of TRP and TRPL at the microvillar plasma membrane: numerous primary research papers have shown this; for example, see review PMID: 11557987, PMID: 22487656.

      • line 189: the authors should summarize the findings in one sentence. "Functional activity" would refer to lipid transfer.

      The text will be modified as per the suggestion.

      Reviewer #3 (Significance (Required)):

      General assessment

      The work relies on a model system that enables the exploration of the role of Esyt in vivo, in a fundamental process highly regulated during development. The data clearly show the effect of Esyt mutant during development of photoreceptors in Drosophila but as discussed before, some experimental evidences are missing to completely prove the statements.

      Advance

      This work brings new insights in the functional role of lipid transfer during development and explores how the dialog between lipid transfer and Ca2+ flux can influence MCS organization. The interesting points that could be explored in the paper are the effects of a Ca2+ influx on Esyt and EsytCABM localization, and on their lipid transfer activity.

      Audience

      This work would be of interest for the membrane contact sites community and for the Developmental biology community.

      We thank the reviewer for highlighting the significance of our work and the clarity of the data. Additional data to address the points they have raised will be provided.

      Reviewer #4

      (Evidence, reproducibility and clarity (Required)):

      In this study, Nath et al., aim at understanding the role of dESyt Ca2+ binding activity on ER-PM MCS in D. melanogaster photoreceptors. Using a combination of transmission electron microscopy and fluorescence microscopy, the authors explore the ability of a dESyt mutant, supposedly unable to bind Ca2+ (based on homology with the human ortholog hESyt2), to recapitulate the function of the wild type version of the protein in establishing ER-PM MCS and modulating their density.

      Findings:

      1) MCS density depends on the activity of TRP and TRPL channels in aging photoreceptors.

      2) Mutation of dESyt Ca2+ binding residues (dEsytCaBM::GFP) leads to a gross mis-localization of the protein, even in the presence of the endogenous protein.

      3) Overexpression of the mutant affects the structure of photoreceptors upon constant illumination.

      4) After 6 days of continuous illumination, RDGB is mis-localized in cells overexpressing dEsytCaBM::GFP.

      5) Overexpressed dEsytCaBM::GFP fails to reduce the distance between ER and PM, meaning it fails to establish ER-PM contact sites, while overexpressed dEsyt::GFP shows reduced MCS distance. Overexpressed dEsyt::GFP also leads to a 10% increase in MCS density compared to WT or cells expressing dEsytCaBM::GFP.

      6) dEsytCaBM::GFP is not able to rescue the light dependent retinal degeneration of dESytKO, although it slightly delays the onset, but is able to rescue RDGB localization at day 6 of constant illumination.

      7) Examining MCS density in dESytKO cells, rescues with dEsyt::GFP and dEsytCaBM::GFP show a slightly higher MCS density than dESytKO at day 1. At day 14, ER-PM MCS were non-existent in dESytKO, unchanged in dEsyt::GFP and reduced by 20% in dEsytCaBM::GFP compared to day1.

      Specific comments:

      My field of expertise is biochemistry and structural biology (including cellular cryo-electron tomography), but I have no experience with drosophila biology, so I am not able to judge the drosophila work per se.

      While I find the confocal microscopy experiments compelling, I have some reservations regarding the quantification of the TEM images (MCS distances and density) as it was done manually, and therefore, to some extent subjective, especially, when differences between conditions are in the order of 10%. I would have found the quantification more convincing if done systematically, i.e. segmenting the MCS and computationally measuring distances and densities. Otherwise, the authors could expand a little bit on how their methodology is accurate.

      As the reviewer correctly notes, the quantification would be more convincing if done systematically, i.e. by segmenting the MCS and computationally measuring distances and densities. For MCS measurements, we have experimented with segmentation using ImageJ and Imaris. As mentioned in the answer to Q4 of reviewer 3, we used fixation methods that allow enhanced membrane preservation and better visualization of membranes and MCS (Matsumoto‐Suzuki et al., 1989). However, this staining method does not selectively stain the ER that forms part of the MCS; it stains all ER. As a result, automated segmentation poses significant challenges.

      The primary drawback of the segmentation method is that, in the process of training the software to predict/detect distinct cellular compartments, it recognizes all ER membranes, including SMC as well as the ER that is not part of the MCS. As a result, the software's minimum distance calculation may be between PM and SMC or PM and generic ER, which does not help the analysis we wish to perform. Similarly, to determine the contact site distance in images with obscure ER and PM boundaries, the software uses the border it can identify, which is typically inside the rhabdomere rather than at its edge. For the contact site density measurements, the software cannot distinguish between ER and pigment granules close to the rhabdomere, as the grayscale values of these two compartments are comparable.

      Advantages of manual approach:

      To account for potential effects of photoreceptor depth on contact site density and distance, we analyzed TEM sections obtained directly from the nuclear plane of the photoreceptors to calculate both contact site density and distance. Additionally, using the freehand line tool, manual analysis enables us to trace the length of each short segment of the MCS and the base of the rhabdomere. The total length of the MCS at the base is then calculated by summing the segments. An illustration of how the manual analysis is done will be included as part of the Methods in the revision.
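
      The summation described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' analysis pipeline: the coordinates are made up, the function names are hypothetical, and the definition of contact site density as total MCS length divided by the length of the rhabdomere base is an assumption that the response does not state explicitly.

```python
import math


def polyline_length(points):
    """Length of a freehand trace given as (x, y) vertices, in the image's units."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def mcs_density(mcs_segments, base_trace):
    """Fraction of the rhabdomere base covered by MCS (assumed definition)."""
    total_mcs = sum(polyline_length(seg) for seg in mcs_segments)
    base = polyline_length(base_trace)
    if base == 0:
        raise ValueError("base trace has zero length")
    return total_mcs / base


# Hypothetical traces for one photoreceptor (arbitrary units).
base = [(0, 0), (3, 4), (6, 8)]                      # base length: 5 + 5
segments = [[(0, 0), (3, 4)], [(4.5, 6), (6, 8)]]    # MCS lengths: 5 and 2.5
density = mcs_density(segments, base)
```

      With real data, each trace would be exported from the image analysis software as a list of vertices, and the per-photoreceptor densities would then be pooled across blocks.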

      Another point is whether the levels of expression of dESyt proteins (dESyt-GFP and dESytCABM-GFP) are comparable. In the overexpression experiments, what are the expression levels of the constructs compared to the endogenous protein? The authors should provide e.g. a Western blot.

      As per the suggestion, western blot analysis will be conducted to compare the expression levels of the constructs utilized to the endogenous protein.

      Concerning the modelling, while I do think that the identification of dESyt Ca2+ binding residues is correct (the sequence alignment is convincing and the sequence identity is very high), and that most likely the structural arrangement will be conserved, homology modelling (using MODELLER with a single reference) leads to models highly similar to the input reference (in particular when the sequence identity is very high). Therefore, rmsd will necessarily be low and the side chain arrangement of conserved residues will be identical. This is unlikely to happen, as protein structures will not be identical despite high sequence conservation. In addition, a crystal structure is a snapshot of a protein conformation that is favorable for crystal formation. It would have been more interesting to use an AlphaFold model and show that the arrangement on the residues is compatible with Ca2+ binding (i.e., the C positions are similar).

      We agree with the reviewer that the data presented to demonstrate the inability of dEsytCaBM to bind Ca2+ are inadequate, as other reviewers have also pointed out. It will be crucial to establish this using multiple approaches. As suggested, an AlphaFold model will be used to address this point.

      Minor comments:

      Line 102: indicate what PI and PA stand for (I don't think that there is a need for acronyms when they are not reused in the text later on).

      Changes will be incorporated as per the suggestion.

      Line 217-219: "When the same experimental set was examined for MCS density, we discovered that the density enhanced by 10% in Rh1>dEsyt::GFP while being comparable between wild type and dEsytCaBM::GFP flies." The authors don't comment on this finding. Does that imply that increase in the protein levels leads to increase in MCS density?

      Yes. Increase in wild type dEsyt protein levels can establish more contact sites as well as reduce the contact site distance which further elucidates the protein's role in functional tethering as mentioned in line 215 as proposed by previous studies in other models (PMID: 23791178; PMID: 27065097; PMID: 29222176).

      Lines 298-302: "...implying that dEsytCaBM exerts a dominant negative effect on wild type dEsyt. One possible mechanism for the phenotypes exhibited by dEsytCaBM expression in wild type cells is suggested by the findings of a structural and mass spectrometry investigation of hEsyt2 that reveals that the SMP domain dimerizes to create a 90Å long cylinder to facilitate the transfer of lipids (Schauder et al., 2014)." It is not clear to me what the authors suggest here: because of the dimerisation between wild type and mutant, the mutant has a negative effect or that the SMP dimerization is somehow impaired in dEsytCaBM?

      The SMP domains of Esyt proteins have previously been shown to dimerize (PMID: 23791178, PMID: 24847877). They are known to form either homodimers or heterodimers in mammalian systems, where three genes code for the protein (Esyt1, 2 and 3). In Drosophila, since only one gene codes for the protein, our hypothesis is that the product of the functional wild type gene dimerizes with the CaBM mutant, thereby rendering the wild type gene product nonfunctional.

      Line 304-305: "...protein expression was restricted to the cell body rather than the presynaptic terminals...". I am not sure that this is correct. The fact that a protein is localizing to a compartment does not mean that its expression is restricted to that compartment (one should measure mRNA levels to conclude this).

      The statement is based on the findings made by Kikuma et al, 2017 (PMID: 28882990) when they tried to understand the role of dEsyt at the NMJs.

      In figure 1B legend, indicate what SMC stands for (the acronym should be indicated in figure 1A legend).

      The text will be added as suggested.

      In figure 2A legend Ca binding in black box but in red boxes in figure.

      Changes will be incorporated as per the suggestion.

      **Referees cross-commenting**

      I agree with the other reviewers that one of the premises of this study relies on the loss of calcium binding by the dESyt mutant and this is not experimentally proven by the authors. However, I find that this will be difficult to prove in vivo. Only measurements of dESyt calcium binding affinity would constitute a direct proof (which requires protein purification). Any in vivo or cellular experiment would be an indirect proof. I believe that based on the high sequence conservation with ESyt proteins, the calcium binding residues have been correctly identified.

      Reviewer #4 (Significance (Required)):

      ESyt proteins are known ER-PM tethers involved in lipid transfer at MCS in a Ca2+ dependent manner. Contrary to yeast and mammals, that have several ESyt orthologs, D. melanogaster has only one ESyt, making it an ideal model to study ESyt function in vivo. It has been previously shown that proper localization of ESyt at MCS depends on Ca2+ concentration: ESyts are anchored to the ER but translocate to the PM in response to elevation of Ca2+ levels in the cytosol (Fernández-Busnadiego et al., 2015). The finding that an ESyt mutant unable to bind calcium is not localized properly is therefore not surprising. The link between RDGB, a protein known to localize at MCS, and ESyt has been shown before but to my knowledge Nath et al., show for the first time that RDGB localization at MCS is directly dependent on the Ca2+ binding activity of ESyt. In addition, the authors convincingly demonstrate that the Ca2+ binding activity of dESyt is necessary to maintain the structure of aging photoreceptors.

      The main finding of this study is that the Ca2+ binding activity of dESyt regulates the density of ER-PM MCS in photoreceptors. If true (see my comment below), that would be a novel finding, although the authors don't propose any mechanistic explanation for this.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      We haven't made any changes to the manuscript yet. However, we will be able to implement the changes mentioned in the pointwise response to reviewers above.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.


      We feel that experiments to directly determine the calcium binding of dEsyt and the loss of this in dEsytCaBM are beyond the scope of this study, because of the considerable work required to heterologously express and purify the protein. We have proposed alternative ways to strengthen this conclusion.

    1. Author response:

      The following is the response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript by Nitta et al. titled "Drosophila model to clarify the pathological significance of OPA1 in autosomal dominant optic atrophy", the novelty lies in the use of human OPA1 (hOPA1) to try to rescue the phenotype of an OPA1 +/- Drosophila DOA model (dOPA). The authors then use this model to investigate the differences between dominant-negative and haploinsufficient OPA1 variants. The value of this paper lies in the study of DN/HI variants rather than the establishment of the Drosophila model per se as this has existed for some time and does have some significant disadvantages compared to existing models, particularly in the extra-ocular phenotype which is common with some OPA1 variants but not in humans. I judge the findings of this paper to be valuable with regards to significance and solid with regards to the strength of the evidence.

      Suggestions for improvements:

      (1) Stylistically the results section appears to have significant discussion/conclusion/inferences in section with reference to existing literature. I feel that this information would be better placed in the separate discussion section. E.g. lines 149-154.

      We appreciate the reviewer’s suggestion to relocate the discussion, conclusions, and inferences, particularly those that reference existing literature, to a separate discussion section. For lines 149–154, we placed them in the discussion section (lines 343–347) as follows. “Our established fly model is the first simple organism to allow observation of degeneration of the retinal axons. The mitochondria in the axons showed fragmentation of mitochondria. Former studies have observed mitochondrial fragmentation in S2 cells (McQuibban et al., 2006), muscle tissue (Deng et al., 2008), segmental nerves (Trevisan et al., 2018), and ommatidia (Yarosh et al., 2008) due to the LOF of dOPA1.”

      For lines 178–181, we also placed them in the discussion section (lines 347–351) as follows. “Our study presents compelling evidence that dOPA1 knockdown instigates neuronal degeneration, characterized by a sequential deterioration at the axonal terminals and extending to the cell bodies. This degenerative pattern, commencing from the distal axons and progressing proximally towards the cell soma, aligns with the paradigm of 'dying-back' neuropathy, a phenomenon extensively documented in various neurodegenerative disorders (Wang et al., 2012). ”

      For lines 213–217, 218–220, and 222–223, we also placed them in the discussion section (lines 363– 391) as follows. “To elucidate the pathophysiological implications of mutations in the OPA1 gene, we engineered and expressed several human OPA1 variants, including the 2708-2711del mutation, associated with DOA, and the I382M mutation, located in the GTPase domain and linked to DOA. We also investigated the D438V and R445H mutations in the GTPase domain and correlated with the more severe DOA plus phenotype. The 2708-2711del mutation exhibited limited detectability via HA-tag probing. Still, it was undetectable with a myc tag, likely due to a frameshift event leading to the mutation's characteristic truncated protein product, as delineated in prior studies (Zanna et al., 2008). Contrastingly, the I382M, D438V, and R445H mutations demonstrated expression levels comparable to the WT hOPA1. However, the expression of these mutants in retinal axons did not restore the dOPA1 deficiency to the same extent as the WT hOPA1, as evidenced in Figure 5E. This finding indicates a functional impairment imparted by these mutations, aligning with established understanding (Zanna et al., 2008). Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). 
      A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (2) I do think further investigation is needed as to why a reduction of mitochondria was noticed in the knockdown. There are conflicting reports on this in the literature. My own experience of this is fairly uniform mitochondrial number in WT vs OPA1 variant lines but with an increased level of mitophagy presumably reflecting a greater turnover. There are a number of ways to quantify mitochondrial load e.g. mtDNA quantification, protein quantification for tom20/hsp60 or equivalent. I feel the reliance on ICC here is not enough to draw conclusions. Furthermore, mitophagy markers could be checked at the same time either at the transcript or protein level. I feel this is important as it helps validate the Drosophila model as we already have a lot of experimental data about the number and function of mitochondria in OPA+/- human/mammalian cells.

      We thank the reviewer for the insightful comments and suggestions regarding our study on the impact of mitochondrial reduction in a knockdown model. We concur with the reviewer’s observation that our initial results did not definitively demonstrate a decrease in the number of mitochondria in retinal axons. Furthermore, we measured mitochondrial quantity by conducting western blotting using anti-COXII and found no reduction in mitochondrial content with the knockdown of dOPA1 (Figure S4A and B). Consequently, we have revised our manuscript to remove the statement “suggesting a decreased number of mitochondria in retinal axons. However, whether this decrease is due to degradation resulting from a decline in mitochondrial quality or axonal transport failure remains unclear.” Instead, we have refocused our conclusion to reflect our electron microscopy findings, which indicate reduced mitochondrial size and structural abnormalities. The reviewer’s observation of consistent mitochondrial numbers in WT versus mutant variant lines and elevated mitophagy levels prompted us to evaluate mitochondrial turnover as a significant factor in our study. Regarding verifying mitophagy markers, we incorporated the mito-QC marker in our experimental design. In our experiments, mito-QC was expressed in the retinal axons of Drosophila to assess mitophagy activity upon dOPA1 knockdown. We observed a notable increase in mCherry positive but GFP negative puncta signals one week after eclosion, indicating the activation of mitophagy (Figure 2D–H). This outcome strongly suggests that dOPA1 knockdown enhances mitophagy in our Drosophila model. The application of mito-QC as a quantitative marker for mitophagy, validated in previous studies, offers a robust approach to analyzing this process. Our findings elucidate the role of dOPA1 in mitochondrial dynamics and its implications for neuronal health.
      These results have been incorporated into Figure 2, with the corresponding text updated as follows (lines 159–167): “Given that an increase in mitophagy activity has been reported in mouse RGCs and nematode ADOA models (Zaninello et al., 2022; Zaninello et al., 2020), the mito-QC marker, an established indicator of mitophagy activity, was expressed in the photoreceptors of Drosophila. The mito-QC reporter consists of a tandem mCherry-GFP tag that localizes to the outer membrane of mitochondria (Lee et al., 2018). This construct allows the measurement of mitophagy by detecting an increase in the red-only mCherry signal when the GFP is degraded after mitochondria are transported to lysosomes. Post dOPA1 knockdown, we observed a significant elevation in mCherry positive and GFP negative puncta signals at one week, demonstrating an activation of mitophagy as a consequence of dOPA1 knockdown (Figure 2D–H).”
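
      The red-only readout of the mito-QC reporter can be illustrated with a minimal sketch. The response does not say how puncta were scored, so everything below is assumed for illustration: the intensity values, the two thresholds, and the function name are hypothetical, and a real analysis would take per-punctum intensities from segmented images.

```python
def count_red_only_puncta(puncta, red_min, green_max):
    """Count puncta that are mCherry-positive and GFP-negative.

    puncta: iterable of (mcherry_intensity, gfp_intensity) pairs.
    red_min, green_max: hypothetical intensity thresholds.
    """
    return sum(
        1 for red, green in puncta
        if red >= red_min and green <= green_max
    )


# Hypothetical per-punctum intensities (arbitrary units).
puncta = [(900, 50), (850, 700), (100, 40), (1200, 80)]
n_red_only = count_red_only_puncta(puncta, red_min=500, green_max=100)
```

      Puncta that keep both signals are treated as mitochondria outside lysosomes, while puncta that have lost GFP but retain mCherry are counted as mitophagy events, which mirrors the logic of the reporter described above.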

      (3) Could the authors comment on the failure of the dOPA1 rescue to return their biomarker, axonal number, to control levels? In Figure 4D, is there significance between the control and rescue? Presumably so, as there is between the mutant and rescue and the difference looks less.

      As the reviewer correctly pointed out, there is a significant difference between the control and rescue groups, which we have now included in the figure. Additionally, we have incorporated the following comments in the discussion section (lines 329–342) regarding this significant difference: “In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.”

      (4) The authors have chosen an interesting if complicated missense variant to study, namely the I382M with several studies showing this is insufficient to cause disease in isolation and appears in high frequency on gnomAD but appears to worsen the phenotype when it appears as a compound het. I think this is worth discussing in the context of the results, particularly with regard to the ability for this variant to partially rescue the dOPA1 model as shown in Figure 5.

      As the reviewer pointed out, the I382M mutation is known to act as a disease modifier. However, in our system, as suggested by Figure 5, I382M appears to retain more activity than DN mutations. Considering previous studies, we propose that I382M represents a mild hypomorph. Consequently, while I382M alone may not exhibit a phenotype, it could exacerbate severity in a compound heterozygous state. We have incorporated this perspective in our revised discussion (lines 375-391).

      “Notably, while the 2708-2711del and I382M mutations exhibited limited functional rescue, the D438V and R445H mutations did not show significant rescue activity. This differential rescue efficiency suggests that the former mutations, particularly the I382M, categorized as a hypomorph (Del Dotto et al., 2018), may retain partial functional capacity, indicative of a LOF effect but with residual activity. The I382M missense mutation within the GTPase domain of OPA1 has been described as a mild hypomorph or a disease modifier. Intriguingly, this mutation alone does not induce significant clinical outcomes, as evidenced by multiple studies (Schaaf et al., 2011; Bonneau et al., 2014; Bonifert et al., 2014; Carelli et al., 2015). A significant reduction in protein levels has been observed in fibroblasts originating from patients harboring the I382M mutation. However, mitochondrial volume remains unaffected, and the fusion activity of mitochondria is only minimally influenced (Kane et al., 2017; Del Dotto et al., 2018). This observation is consistent with findings reported by de la Barca et al. in Human Molecular Genetics 2020, where a targeted metabolomics approach classified I382M as a mild hypomorph. In our current study, the I382M mutation preserves more OPA1 function compared to DN mutations, as depicted in Figures 5E and F. Considering the results from our Drosophila model and previous research, we hypothesize that the I382M mutation may constitute a mild hypomorphic variant. This might explain its failure to manifest a phenotype on its own, yet its contribution to increased severity when it occurs in compound heterozygosity.”

      (5) I feel the main limitation of this paper is the reliance on axonal number as a biomarker for OPA1 function and ultimately rescue. I have concerns because a) this is not a well validated biomarker within the context of OPA1 variants b) we have little understanding of how this is affected by over/under expression and c) if it is a threshold effect e.g. once OPA1 levels reach <x% pathology develops but develops normally when opa1 expression is >x%. I think this is particularly relevant when the authors are using this model to make conclusions on dominant negativity/HI with the authors proposing that if expression of a hOPA1 transcript does not increase opa1 expression in a dOPA1 KO then this means that the variant is DN. The authors have used other biomarkers in parts of this manuscript e.g. ROS measurement and mito trafficking but I feel this would benefit from something else particularly in the latter experiments demonstrated in figure 5 and 6.

      The reviewer raised concerns regarding the adequacy of axonal count as a validated biomarker in the context of OPA1 mutants. In response, we corroborated its validity using markers such as MitoSOX, Atg8, and COXII. Experiments employing MitoSOX revealed that the augmented ROS signals resulting from dOPA1 knockdown were mitigated by expressing human OPA1. Conversely, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate these effects, paralleling the phenotype of axonal degeneration observed. These findings are documented in Figure 5F, and we have incorporated the following text into section lines 248–254 of the results:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8-1 to Atg8-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to section lines 255–263 of the results:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knock down dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, we measured ROS levels, the ratio of Atg8-1 to Atg8-2, and the quantity of COXII protein, yet observed no significant differences (Figure S6). This limitation of the fly model is mentioned in the results, which note that the axonal degeneration phenotype was observed but alterations in ROS signaling, autophagy activity, or mitochondrial quantity were not, as follows (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      The reviewer also inquired about the effects of overexpressing and underexpressing OPA1 on axonal count and whether these effects are subject to a threshold. In response, we expressed both wild-type and variant forms of human OPA1 in Drosophila in vivo and assessed their protein levels using Western blot analysis. The results showed no significant differences in expression levels between the wild-type and variant forms in the OPA1 overexpression experiments, suggesting the absence of a variation threshold effect. These findings have been newly documented as quantitative data in Figure 5C. Furthermore, we have included a statement in the results section for Figure 6A, clarifying that overexpression of hOPA1 exhibited no discernible impact, as detailed on lines 274–276.

      “The results presented in Figure 5C indicate that there are no significant differences in the expression levels among the variants, suggesting that variations in expression levels do not influence the outcomes.”

      (6) Could the authors clarify which exons in Figure 5 are included in their transcript? My understanding is that transcript NM_015560.3 contains exons 4 and 4b but not 5b. According to Song 2007 this transcript invariably produces s-OPA1 as it contains the exon 4b cleavage site. If this is true, this is a critical limitation in this study and in my opinion significantly undermines the likelihood of the proposed explanation of the findings presented in Figure 6. The primary functional location of OPA1 is the IMM, and l-OPA1 is probably the only OPA1 isoform that localizes there, as the additional AA act as an IMM anchor. Given that this is where the GTPase likely oligomerizes, expression of s-OPA1 alone is unlikely to interact in any way with the native protein. I am not aware of any evidence that s-OPA1 is involved in oligomerization. Therefore I don't consider this method, and specifically expression of a hOPA1 transcript that only makes s-OPA1, to be a reliable indicator of dominant negativity/interference with WT protein function. This could be checked by blotting UAS-hOPA1 protein with an OPA1 antibody specific to human OPA1 and not to dOPA1. There are several available on the market, and if the authors see only s-OPA1 then it confirms they are not expressing l-OPA1 with their hOPA1 construct.

      As suggested by the reviewer, we performed a Western blot using a human OPA1 antibody to determine if the expressed hOPA1 was producing the l-OPA1 isoform, as shown in band 2 of Figure 5D. The results confirmed the presence of both l-OPA1 and what appears to be s-OPA1 in bands 2 and 4, respectively. These findings are documented in the updated Figure 5D, with a detailed description provided in the manuscript at lines 224-226. Additionally, the NM_015560.3 refers to isoform 1, which includes only exons 4 and 5, excluding exons 4b and 5b. This isoform can express both l-OPA1 and s-OPA1 (refer to Figure 1 in Song et al., J Cell Biol. 2007). We have updated the schematic diagram in the figure to include these exons. The formation of s-OPA1 through cleavage occurs at the OMA1 target site located in exon 5 and the Yme1L target site in exon 5b of OPA1. Isoform 1 of OPA1 is prone to cleavage by OMA1, but a homologous gene for OMA1 does not exist in Drosophila. Although a homologous gene for Yme1L is present in Drosophila, exon 5b is missing in isoform 1 of OPA1, leaving the origin of the smaller band resembling s-OPA1 unclear at this point.

      Reviewer #2 (Public Review):

      The data presented support and extend some previously published data using Drosophila as a model to unravel the cellular and genetic basis of human autosomal dominant optic atrophy (DOA). In humans, mutations in OPA1, a mitochondrial dynamin-like GTPase (amongst others), are the most common cause of DOA. By using Drosophila loss-of-function mutations, RNAi-mediated knockdown and overexpression, the authors could recapitulate some aspects of the disease phenotype, which could be rescued by the wild-type version of the human gene. Their assays allowed them to distinguish between mutations causing human DOA, affecting the optic system and supposed to be loss-of-function mutations, and those mutations supposed to act as dominant negative, resulting in DOA plus, in which other tissues/organs are affected as well. Owing to the lack of information in the Materials and Methods section and in several figure legends, it was not in all cases possible to follow the conclusions of the authors.

      We appreciate the reviewer's constructive feedback and the emphasis on enhancing clarity in our manuscript. We recognize the concerns raised about the lack of detailed information in the Materials and Methods section and several figure legends, which may have obscured our conclusions. In response, we have appended the detailed genotypes of the Drosophila strains used in each experiment to a supplementary table. Additionally, we realized that the description of 'immunohistochemistry and imaging' was too brief, previously referenced simply as “immunohistochemistry was performed as described previously (Sugie et al., 2017).” We have now expanded this section to include comprehensive methodological details. Furthermore, we have revised the figure legends to provide clearer and more thorough descriptions.

      Similarly, how the knowledge gained could help to "inform early treatment decisions in patients with mutations in hOPA1" (line 38) cannot be followed.

      To address the reviewer's comments, we have refined our explanation of the clinical relevance of our findings as follows. We believe this revision succinctly articulates the practical application of our research, directly responding to the reviewer’s concerns about linking the study's outcomes to treatment decisions for patients with hOPA1 mutations. By underscoring the model’s value in differential diagnosis and its influence on initiating treatment strategies, we have clarified this connection explicitly, within the constraints of the abstract’s word limit. The revised sentence now reads: "This fly model aids in distinguishing DOA from DOA plus and guides initial hOPA1 mutation treatment strategies."

      Reviewer #3 (Public Review):

      Nitta et al. establish a fly model of autosomal dominant optic atrophy, of which hundreds of different OPA1 mutations are the cause with wide phenotypic variance. It has long been hypothesized that missense OPA1 mutations affecting the GTPase domain, which are associated with more severe optic atrophy and extra-ophthalmic neurologic conditions such as sensorineural hearing loss (DOA plus), impart their effects through a dominant negative mechanism, but no clear direct evidence for this exists particularly in an animal model. The authors execute a well-designed study to establish their model, demonstrating a clear mitochondrial phenotype with multiple clinical analogs including optic atrophy measured as axonal degeneration. They then show that hOPA1 mitigates optic atrophy with the same efficacy as dOPA1, setting up the utility of their model to test disease-causing hOPA1 variants. Finally, they leverage this model to provide the first direct evidence for a dominant negative mechanism for 2 mutations causing DOA plus by expressing these variants in the background of a full hOPA1 complement.

      Strengths of the paper include well-motivated objectives and hypotheses, overall solid design and execution, and a generally clear and thorough interpretation of their results. The results technically support their primary conclusions with caveats. The first is that both dOPA1 and hOPA1 fail to fully restore optic axonal integrity, yet the authors fail to acknowledge that this only constitutes a partial rescue, nor do they discuss how this fact might influence our interpretation of their subsequent results.

      As the reviewer rightly points out, neither dOPA1 nor hOPA1 achieve a complete recovery. Therefore, we acknowledge that this represents only a partial rescue and have added the following explanations regarding this partial rescue in the results and discussion sections.

      Result:

      Significantly → partially (lines 207 and 228)

      Discussion (lines 329–342):

      In our study, expressing dOPA1 in the retinal axons of dOPA1 mutants resulted in significant rescue, but it did not return to control levels. There are three possible explanations for this result. The first concerns gene expression levels. The Gal4-line used for the rescue experiments may not replicate the expression levels or timing of endogenous dOPA1. Considering that the optimal functionality of dOPA1 may be contingent upon specific gene expression levels, attaining a wild-type-like state necessitates the precise regulation of these expression levels. The second is a non-autonomous issue. Although dOPA1 gene expression was induced in the retinal axons for the rescue experiments, many retinal axons were homozygous mutants, while other cell types were heterozygous for the dOPA1 mutation. If there is a non-autonomous effect of dOPA1 in cells other than retinal axons, it might not be possible to restore the wild-type-like state fully. The third potential issue is that only one isoform of dOPA1 was expressed. In mouse OPA1, to completely restore mitochondrial network shape, an appropriate balance of at least two different isoforms, l-OPA1 and s-OPA1, is required (Del Dotto et al., 2017). This requirement implies that multiple isoforms of dOPA1 are essential for the dynamic activities of mitochondria.

      The second caveat is that their effect sizes are small. Statistically, the results indeed support a dominant negative effect of DOA plus-associated variants, yet the data show a marginal impact on axonal degeneration for these variants. The authors might have considered exploring the impact of these variants on other mitochondrial outcome measures they established earlier on. They might also consider providing some functional context for this marginal difference in axonal optic nerve degeneration.

      In response to the reviewer’s comment regarding the modest effect sizes observed, we acknowledge that the magnitude of the reported changes is indeed small. To explore the impact of these variants on additional mitochondrial outcomes as suggested, we employed markers such as MitoSOX, Atg8, and COXII for validation. However, we could not detect any significant effects of the DOA plus-associated variants using these methods. We apologize for the redundancy, but to address Reviewer #1's fifth question, we present experimental results showing that while the increased ROS signals observed upon dOPA1 knockdown were rescued by expressing human OPA1, the mutant variants 2708-2711del, D438V, and R445H did not ameliorate this effect. This outcome mirrors the axonal degeneration phenotype and is documented in Figure 5F. The following text has been added to the results section lines 248–254:

      “Furthermore, we assessed the potential for rescuing ROS signals. Similar to its effect on axonal degeneration, wild-type hOPA1 effectively mitigated the phenotype, whereas the 2708-2711del, D438V, and R445H mutants did not (Figure 5F). Importantly, the I382M variant also reduced ROS levels comparably to the wild type. These findings demonstrate that both axonal degeneration and the increase in ROS caused by dOPA1 downregulation can be effectively counteracted by hOPA1. Although I382M retains partial functionality, it acts as a relatively weak hypomorph in this experimental setup.”

      Moreover, utilizing mito-QC, we observed elevated mitophagy in our Drosophila model, with these results now included in Figure 2D–H. Given the complexity of the genetics involved and the challenges in establishing lines, autophagy activity was quantified by comparing the ratio of Atg8-1 to Atg8-2 via Western blot analysis. However, no significant alterations were detected across any of the genotypes. Additionally, mitochondrial protein levels derived from COXII confirmed consistent mitochondrial quantities, showing no considerable variance following knockdown. These insights affirm that retinal axon degeneration and mitophagy activation are present in the Drosophila DOA model, although the Western blot analysis revealed no significant changes in autophagy activation. Such findings necessitate caution as this model may not fully replicate the molecular pathology of the corresponding human disease. These Western blot findings are presented in Figure S4, with the following addition made to section lines 255–263 of the results:

      “We also conducted Western blot analyses using anti-COXII and anti-Atg8a antibodies to assess changes in mitochondrial quantity and autophagy activity following the knockdown of dOPA1. Mitochondrial protein levels, indicated by COXII quantification, were evaluated to verify mitochondrial content, and the ratio of Atg8a-1 to Atg8a-2 was used to measure autophagy activation. For these experiments, Tub-Gal4 was employed to systemically knock down dOPA1. Considering the lethality of a whole-body dOPA1 knockdown, Tub-Gal80TS was utilized to repress the knockdown until eclosion by maintaining the flies at 20°C. After eclosion, we increased the temperature to 29°C for two weeks to induce the knockdown or expression of hOPA1 variants. The results revealed no significant differences across the genotypes tested (Figure S4A–D).”

      In assessing the effects of dominant negative mutations, we measured ROS levels, the ratio of Atg8-1 to Atg8-2, and the quantity of COXII protein, yet observed no significant differences (Figure S6). This limitation of the fly model is mentioned in the results, which note that the axonal degeneration phenotype was observed but alterations in ROS signaling, autophagy activity, or mitochondrial quantity were not, as follows (lines 287–290):

      “We investigated the impacts of dominant negative mutations on mitochondrial oxidation levels, mitochondrial quantity, and autophagy activation levels; however, none of these parameters showed statistical significance (Figure S6).”

      Despite these caveats, the authors provide the first animal model of DOA that also allows for rapid assessment and mechanistic testing of suspected OPA1 variants. The impact of this work in providing the first direct evidence of a dominant negative mechanism is understated considering how important this question is in the development of genetic treatments for DOA. The authors discuss important points regarding the potential utility of this model in clinical science. Comments on the potential use of this model to investigate variants of unknown significance in clinical diagnosis require further discussion of whether there is indeed precedent for this in other genetic conditions (since the model is nevertheless so evolutionarily removed from humans).

      As suggested by the reviewer, we have expanded the discussion in our study to emphasize in greater detail the significance of the fruit fly model and the MeDUsA software we have developed, elaborating on the model's potential applications in clinical science and its precedents in other genetic disorders. Our text is as follows (lines 299–318):

      “We have previously utilized MeDUsA to quantify axonal degeneration, applying this methodology extensively to various neurological disorders. The robust adaptability of this experimental system is demonstrated by its application in exploring a wide spectrum of genetic mutations associated with neurological conditions, highlighting its broad utility in neurogenetic research. We identified a novel de novo variant in Spliceosome Associated Factor 1, Recruiter of U4/U6.U5 Tri-SnRNP (SART1). The patient, born at 37 weeks with a birth weight of 2934g, exhibited significant developmental delays, including an inability to support head movement at 7 months, reliance on tube feeding, unresponsiveness to visual stimuli, and development of infantile spasms with hypsarrhythmia, as evidenced by EEG findings. Profound hearing loss and brain atrophy were confirmed through MRI imaging. To assess the functional impact of this novel human gene variant, we engineered transgenic Drosophila lines expressing both wild type and mutant SART1 under the control of a UAS promoter.

      Our MeDUsA analysis suggested that the variant may confer a gain-of-toxic-function (Nitta et al., 2023). Moreover, we identified heterozygous loss-of-function mutations in DHX9 as potentially causative for a newly characterized neurodevelopmental disorder. We further investigated the pathogenic potential of a novel heterozygous de novo missense mutation in DHX9 in a patient presenting with short stature, intellectual disability, and myocardial compaction. Our findings indicated a loss of function in the G414R and R1052Q variants of DHX9 (Yamada et al., 2023). This experimental framework has been instrumental in elucidating the impact of gene mutations, enhancing our ability to diagnose how novel variants influence gene function.”

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall I enjoyed reading this paper. It is well presented and represents a significant amount of well-executed work. I feel it further characterizes a poorly understood model of OPA1 variants, one which displays significant differences from the human phenotype. However, I feel the authors' experiments with this model are not enough to validate it as a screening tool for dominant negativity. I have therefore suggested the above experiments as a way both to further validate the mitochondrial dysfunction in this model and to ensure that the expressed transcript is able to affect oligomerization, as this is a prerequisite to the authors' conclusions.

      We assessed the extent to which our model reflects mitochondrial dysfunction using COXII, Atg8, and MitoSOX markers. Unfortunately, neither COXII levels nor the ratio of Atg8a-1 to Atg8a-2 showed significant variations across genotypes that would clarify the impact of dominant negative mutations. Nonetheless, MitoSOX and mito-QC results revealed that mitochondrial ROS levels and mitophagy are increased in Drosophila following intrinsic knockdown of dOPA1. These findings are documented in Figures 2, 5, and S6.

      Regarding oligomer formation, the specifics remain elusive in this study. However, the expression of dOPA1K273A, identified as a dominant negative variant in Drosophila, significantly disrupted retinal axon organization, as detailed in Figure S7. From these observations, we hypothesize that oligomerization of wild-type and dominant negative forms in Drosophila results in axonal degeneration. Conversely, co-expression of Drosophila wild-type with human dominant negative forms does not induce degeneration, suggesting that they likely do not interact.

      Reviewer #2 (Recommendations For The Authors):

      Materials and Methods:

      The authors used GMR-Gal4 to express OPA1-RNAi. I) GMR is expressed in most cells in the developing eye behind the morphogenetic furrow. So the defects observed can be due to knock-down in support cells rather than in photoreceptor cells.

      We have added the following sentences in the results (lines 194–196): “The GMR-Gal4 driver does not exclusively target Gal4 expression to photoreceptor cells. Consequently, the observed retinal axonal degeneration could potentially be secondary to abnormalities in support cells external to the photoreceptors.”

      OPA1-RNAi: how complete is the knock-down? Have the authors tested more than one RNAi line?

      We conducted experiments with an additional RNAi line, and similarly observed degeneration in the retinal axons (Figure S2 A and B; lines 178–179).

      The loss-of-function allele, induced by a P-element insertion, gives several eye phenotypes when heterozygous (Yarosh et al., 2008). Does RNAi expression lead to the same phenotypes?

      A previous report indicated that the compound eyes of homozygous mutations of dOPA1 displayed a glossy eye phenotype (Yarosh et al., 2008). Upon knocking down dOPA1 using the GMR-Gal4 driver, we also observed a glossy eye-like rough eye phenotype in the compound eyes. These findings have been added to Figure S3 and lines 192–194.

      There is no description on the way the somatic clones were generated. How were mutant cells in clones distinguished from wild-type cells (e. g. in Fig. 4).

      In the Methods section, we described the procedure for generating clones and their genotypes as follows (lines 502–505): "The dOPA1 clone analysis was performed by inducing flippase expression in the eyes using either ey-Gal4 with UAS-flp or ey3.5-flp, followed by recombination at the chromosomal location FRT42D to generate a mosaic of cells homozygous for dOPA1s3475." Furthermore, we have created a table detailing these genotypes. In these experiments, it was not possible to differentiate between the clone and WT cells. Accordingly, we have noted in the Results section (lines 201–203): "Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.”

      Why were flies kept at 29°C? This is rather unusual.

      Increased temperature was demonstrated to induce elevated expression of GAL4 (Kramer and Staveley, Genet. Mol. Res., 2003), which in turn led to an enhanced expression of the target genes. Therefore, experiments involving knockdown assays or Western blotting to detect human OPA1 protein were exclusively conducted at 29°C. However, all other experiments were performed at 25°C, as described in the methods sections: “Flies were maintained at 25°C on standard fly food. For knockdown experiments (Figures 1C–E, 1F–H, 2A–H, 3B–K, 5F, S1, S2 A and B, and S6A), flies were kept at 29°C in darkness.” Furthermore, “We regulated protein expression temporally across the whole body using the Tub-Gal4 and Tub-GAL80TS system. Flies harboring each hOPA1 variant were maintained at a permissive temperature of 20°C, and upon emergence, females were transferred to a restrictive temperature of 29°C for subsequent experiments.”

      Legends:

      It would be helpful to have a description of the genotypes of the flies used in the different experiments. This could also be included as a table.

      We have created a table detailing the genotypes. Additionally, in the legend, we have included a note to consult the supplementary table for genotypes.

      Results:

      Line 141: It is not clear what they mean by "degradation", is it axonal degeneration? And if so, what is the argument for this here?

      In the manuscript, we addressed the potential for mitochondrial degradation; however, recognizing that the expression was ambiguous, the following sentence has been omitted: "Nevertheless, the degradation resulting from mitochondrial fragmentation may have decreased the mitochondrial signal.”

      Fig. 2: Axons of which photoreceptors are shown?

      We have added "a set of the R7/8 retinal axons" to the legend of Figure 2.

      Line 167: The authors write that axonal degeneration is more severe after seven days than after eclosion. Is this effect light-dependent? The same question concerns the disappearance of the rhabdomere (Fig. 3G–J).

      We conducted the experiments in darkness, ensuring that the observed degeneration is not light-dependent. This condition has been added to the methods section to clarify the experimental conditions.

      Line 178/179: Based on what results do they conclude that there is degeneration of the "terminals" of the axons?

      Quantification via MeDUsA has enabled us to count the number of axonal terminals, and a noted decrease has led us to conclude that the axonal terminals degenerate. We have published two papers on these findings. We have added the following description to the results section to clarify how we defined degeneration (lines 174–176): “We have assessed the extent of their reduction from the total axonal terminal count, thereby determining the degree of axonal terminal degeneration (Richard JNS 2022; Nitta HMG 2023).”

      Line 189: They write: ".. we observed dOPA1 mutant axons...". How did they distinguish the mutant from the controls?

      Fig. 5 and Fig. 6: How did they distinguish genetically mutant cells from genetically control cells in the somatic clones?

      Mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them. Accordingly, this point has been added to lines 201–203, “Note that the mutant clone analysis was conducted in a context where mutant and heterozygous cells coexist as a mosaic, and it was not possible to distinguish between them.” and the text in the results section has been modified as follows:

      Before: “To determine if dOPA1 is responsible for axon neurodegeneration, we observed the dOPA1 mutant axons by expressing full-length versions of dOPA1 in the photoreceptors at one day after eclosion and found that dOPA1 expression significantly rescued the axonal degeneration.”

      After: “To determine if dOPA1 is responsible for axon neurodegeneration, we quantified the number of axons in the dOPA1 eye clone fly with the expression of dOPA1 at one day after eclosion and found that dOPA1 expression partially rescued the axonal degeneration.”

      Line 225/226: It is not clear to me how their approach "can quantitatively measure the degree of LOF".

      To address the reviewer's question and clarify how our approach quantitatively measures the degree of loss of function (LOF), we revised the statement (lines 238–247):

      "Our methodology distinctively facilitates the quantitative evaluation of LOF severity by comparing the rescue capabilities of various mutations. Notably, the 2708-2711del and I382M mutations demonstrated only partial rescue, indicative of a hypomorphic effect with residual activity. In contrast, the D438V and R445H mutations failed to show significant rescue, suggesting a more profound LOF. The correlation between the partial rescue by the 2708-2711del and I382M mutations and their classification as hypomorphic is significant. Moreover, the observed differences in rescue efficacy correspond to the clinical severities associated with these mutations, namely in DOA and DOA plus disorders. Thus, our results substantiate the model’s ability to quantitatively discriminate among mutations based on their impact on protein functionality, providing an insightful measure of LOF magnitude.”

      Discussion:

      Line 251, 252 and line 358: What is "the optic nerve" in the adult Drosophila?

      In humans, the axons of retinal ganglion cells (RGCs) are referred to as the optic nerve, and we posit that the retinal axons in flies are similar to this structure. In the introduction section, where it is described that the visual systems of flies and humans bear resemblance, we have appended the following definition (lines 107–108): “In this study, we defined the retinal axons of Drosophila as analogous to the human optic nerve.”

      Line 344: These bands appear only upon overexpression of the hOPA1 constructs, so this part is very speculative.

      Confirmation was achieved using anti-hOPA1, demonstrating that myc is not nonspecific. These results have been added to Figure 5D. Furthermore, the phrase “The upper band was expected as” has been revised to “From a size perspective, the upper band was inferred to represent the full-length hOPA1 including the mitochondria import sequence (MIS).” (lines 464–465)

      I was missing a discussion about the increase of ROS upon loss/reduction of dOPA1 observed by others and described here. Is there an increase of ROS upon expression of any of the constructs used?

      We demonstrated that not only axonal degeneration but also ROS can be suppressed by expressing human OPA1 in the genetic background of dOPA1 knockdown. Additionally, rescue was not possible with any variants except for I382M. Furthermore, we assessed whether there were changes in ROS in the evaluation of dominant negatives, but no significant differences were observed in this experimental system. These findings have been added to the discussion section as follows (lines 318–328): “Our research established that dOPA1 knockdown precipitates axonal degeneration and elevates ROS signals in retinal axons. Expression of human OPA1 within this context effectively mitigated both phenomena; it partially reversed axonal degeneration and nearly completely normalized ROS levels. These results imply that factors other than increased ROS may drive the axonal degeneration observed post-knockdown. Furthermore, while differences between the impacts of DN mutations and loss-of-function mutations were evident in axonal degeneration, they were less apparent when using ROS as a biomarker. The extensive use of transgenes in our experiments might have mitigated the knockdown effects. In a systemic dOPA1 knockdown, assessments of mitochondrial quantity and autophagy activity revealed no significant changes, suggesting that the cellular consequences of reduced OPA1 expression might vary across different cell types.”

      Reviewer #3 (Recommendations For The Authors):

      Consider being more explicit regarding literature that has tested, or has failed to test, a direct dominant negative effect by expressing the variant in question in the background of a full OPA1 complement. My understanding is that this is the first direct evidence for this widely held hypothesis. This lends support to the main claim promoting the utility of the fly as a model in general. The authors might also outline this in the introduction as a knowledge gap they fill through this study.

      In the introduction, we have incorporated a passage that highlights precedents capable of distinguishing between LOF and DN effects, and we note the absence of models capable of dissecting these distinctions within an in vivo organism. This study aims to address this gap, proposing a model that elucidates the differential impacts of LOF and DN within the context of a living model organism, thereby contributing to a deeper understanding of their roles in disease pathology. We added the following sentences in the introduction (lines 71–80).

      “In the quest to differentiate between LOF and DN effects within the context of genetic mutations, precedents exist in simpler systems such as yeast and human fibroblasts. These models have provided valuable insights into the conserved functions of OPA1 across species, as evidenced by studies in yeast models (Del Dotto et al., 2018) and fibroblasts derived from patients harboring OPA1 mutations (Kane et al., 2017). However, the ability to distinguish between LOF and DN effects in an in vivo model organism, particularly at the structural level of retinal axon degeneration, has remained elusive. This gap underscores the necessity for a more complex model that not only facilitates molecular analysis but also enables the examination of structural changes in axons and mitochondria, akin to those observed in the actual disease state.”

      The authors should clarify the language used in the abstract and introduction on the effect of hOPA1 DOA and DOA plus on the dOPA1- phenotype. It is currently written as "none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired," but presumably the authors mean that these variants failed to rescue the dOPA1-deficient phenotype.

      We thank the reviewer for the constructive feedback. We acknowledge the need for clarity in our description of the effects of hOPA1 DOA and DOA plus mutations on the dOPA1- phenotype in both the abstract and the introduction. The current phrasing, "none of the previously reported mutations known to cause DOA or DOA plus were rescued, their functions seem to be impaired," may indeed be confusing. To address your concern, we have revised this statement to more accurately reflect our findings: "Previously reported mutations failed to rescue the dOPA1 deficiency phenotype." In the Abstract, we have changed “we could not rescue any previously reported mutations known to cause either DOA or DOA plus” to “mutations previously identified did not ameliorate the dOPA1 deficiency phenotype.”

      DOA plus is associated with a multiple sclerosis-like illness; as written it suggests that the pathogenesis of sporadic multiple sclerosis and that associated with DOA plus share an underlying pathogenic mechanism. Please use the qualifier "-like illness."

      We have added the term “multiple sclerosis-like illness” wherever “multiple sclerosis” is mentioned.

    1. Studies of the motions of the most remote globular clusters and the small galaxies that orbit our own show that the total mass of the Galaxy is at least 2 × 10^12 M_Sun, which is about twenty times greater than the amount of luminous matter. Moreover, the dark matter (as astronomers have come to call the invisible material) extends to a distance of at least 200,000 light-years from the center of the Galaxy. Observations indicate that this dark matter halo is almost but not quite spherical. The obvious question is: what is the dark matter made of? Let’s look at a list of “suspects” taken from our study of astronomy so far. Since this matter is invisible, it clearly cannot be in the form of ordinary stars. And it cannot be gas in any form (remember that there has to be a lot of it). If it were neutral hydrogen gas, its 21-cm wavelength spectral-line emission would have been detected as radio waves. If it were ionized hydrogen, it should be hot enough to emit visible radiation. If a lot of hydrogen atoms out there had combined into hydrogen molecules, these should produce dark features in the ultraviolet spectra of objects lying beyond the Galaxy, but such features have not been seen. Nor can the dark matter consist of interstellar dust, since in the required quantities, the dust would significantly obscure the light from distant galaxies.

      Dark matter is an interesting concept in astrophysics precisely because we have absolutely no idea what it is. Because of how it affects gravity, we assume it's some form of matter, but in truth we are unsure. Dark matter is more grounded in our current model of the universe than dark energy is, due to its effect on gravity, but it still follows the same habit of physicists encountering an unknown and labeling it "dark something" to account for the discrepancy in their model. It goes to show that there's still a lot more to learn about the universe, and that our current model may not be as correct as we think it is.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study of human intelligence has been the focus of cognitive neuroscience research, and finding some objective behavioral or neural indicators of intelligence has been an ongoing problem for scientists for many years. Melnick et al, 2013 found for the first time that the phenomenon of spatial suppression in motion perception predicts an individual's IQ score. This is because IQ is likely associated with the ability to suppress irrelevant information. In this study, a high-resolution MRS approach was used to test this theory. In this paper, the phenomenon of spatial suppression in motion perception was found to be correlated with the visuo-spatial subtest of gF, while both variables were also correlated with the GABA concentration of MT+ in the human brain. In addition, there was no significant relationship with the excitatory transmitter Glu. At the same time, SI was also associated with MT+ and several frontal cortex FCs.

      Strengths:

      (1) 7T high-resolution MRS is used.

      (2) This study combines the behavioral tests, MRS, and fMRI.

      Weaknesses:

      (1) In the intro, it seems to me that the multiple-demand (MD) regions are the key in this study. However, I didn't see any results associated with the MD regions. Did I miss something?

      We thank the reviewer for pointing this out. After careful consideration, we agree. According to Melnick et al. (2013), motion surround suppression (SI) and the time thresholds of small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. This suggests that hMT+ has the potential to be the core of the MD system. However, because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated through the frontal cortex”, they are not yet sufficient to prove that hMT+ is the core node of the MD system, so we have adjusted the explanatory logic of the article. Briefly, we emphasize the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while de-emphasizing the significance of hMT+ in the MD system.

      (2) How was the sample size determined? Is it sufficient?

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which ensures that our study has reasonable power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 datasets to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a sufficiently large dataset.
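      As a rough cross-check of this kind of power calculation, the sketch below estimates the sample size needed to detect a correlation using the Fisher z approximation. It is not G*Power's exact routine, so its answer can differ from G*Power's by a few participants. The function name `required_n` and the `tails` parameter are illustrative assumptions; the response does not state whether a one- or two-tailed test was specified.

```python
import math
from statistics import NormalDist


def required_n(r, alpha=0.05, power=0.80, tails=2):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation (an assumption of this sketch),
    not the exact bivariate-normal calculation used by G*Power.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / tails)  # critical value for the test
    z_beta = norm.inv_cdf(power)               # quantile for the desired power
    effect = math.atanh(abs(r))                # Fisher z transform of r
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


if __name__ == "__main__":
    # r = 0.47 is the effect size quoted from Melnick et al. (2013).
    for tails in (1, 2):
        print(f"tails={tails}: n = {required_n(0.47, tails=tails)}")
```

      Under this approximation, r = 0.47 with power 0.8 and alpha 0.05 requires 27 participants for a one-tailed test and 34 for a two-tailed test, so the reported figure of 26 is closest to a one-tailed specification.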

      (3) In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can we justify the different results here?

      We thank the reviewer for pointing this out. There are several differences between the two studies:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we used 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. The reasons we focus on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS sequence in Schallmo et al. (2018) used a 3 cm isotropic voxel, whereas we used a 2 cm isotropic voxel. This helps us locate hMT+ more precisely and exclude more white matter signal.

      (4) Basically this study contains the data of SI, BDT, GABA in MT+ and V1, and Glu in MT+ and V1, six measurements in all. There should be 6x5/2 = 15 pairwise correlations. However, not all of these results are included in Figure 1 and supplementary 1-3. I understand that it is not necessary to include all figures. But I suggest reporting all values in one Table.

      We thank the reviewer for the good suggestion. We have made a correlation matrix that reports all values in Figure Supplementary 9.
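      The reviewer's count of 15 pairwise correlations can be checked by enumerating every unordered pair of the six measurements. This is a minimal sketch; the labels in `measures` are illustrative shorthand rather than variable names from the manuscript.

```python
from itertools import combinations

# Illustrative labels for the six measurements listed by the reviewer.
measures = ["SI", "BDT", "GABA_hMT+", "GABA_V1", "Glu_hMT+", "Glu_V1"]

# Every unordered pair corresponds to one cell in the upper triangle
# of a correlation matrix.
pairs = list(combinations(measures, 2))

if __name__ == "__main__":
    print(f"{len(pairs)} pairwise correlations")
    for first, second in pairs:
        print(f"{first} vs {second}")
```

      A full correlation matrix, as provided in the supplementary figure, covers all of these pairs in one place.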

      (5) In Melnick (2013), the IQ scores were measured by the full set of WAIS-III, including all subtests. However, this study only used the visual spatial domain of gF. I wonder why only the visuo-spatial subtest was used rather than the full WAIS-III?

      We thank the reviewer for pointing this out. The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well-established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.

      (6) In the functional connectivity part, there is no explanation as to why only the left MT+ was set to the seed region. What is the problem with the right MT+?

      We thank the reviewer for pointing this out. The main reason is that our MRS ROI is the left hMT+, and we wanted the ROI to be consistent across the different analyses. The use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      (7) In Melnick (2013), the authors also reported the correlation between IQ and absolute duration thresholds of small and large stimuli. Please include these analyses as well.

      We thank the reviewer for the good advice. Including these results does help readers compare our findings with those of Melnick et al. (2013). We have added the corresponding figures in the revised version (Figure 3f, g).

      Reviewer #2 (Public Review):

      Summary:

      Recent studies have identified specific regions within the occipito-temporal cortex as part of a broader fronto-parietal, domain-general, or "multiple-demand" (MD) network that mediates fluid intelligence (gF). According to the abstract, the authors aim to explore the mechanistic roles of these occipito-temporal regions by examining GABA/glutamate concentrations. However, the introduction presents a different rationale: investigating whether area MT+ specifically, could be a core component of the MD network.

      Strengths:

      The authors provide evidence that GABA concentrations in MT+ and its functional connectivity with frontal areas significantly correlate with visuo-spatial intelligence performance. Additionally, serial mediation analysis suggests that inhibitory mechanisms in MT+ contribute to individual differences in a specific subtest of the Wechsler Adult Intelligence Scale, which assesses visuo-spatial aspects of gF.

      Weaknesses:

      (1) While the findings are compelling and the analyses robust, the study's rationale and interpretations need strengthening. For instance, Assem et al. (2020) have previously defined the core and extended MD networks, identifying the occipito-temporal regions as TE1m and TE1p, which are located more rostrally than MT+. Area MT+ might overlap with brain regions identified previously in Fedorenko et al., 2013; however, those authors attribute these activations to attentional enhancement of visual representations in the more difficult conditions of their tasks. For the aforementioned reasons, it is unclear why the authors chose MT+ as their focus. A stronger rationale for this selection is necessary, along with an account of how it fits with the core/extended MD networks.

      We appreciate the reviewer’s comments. We focus on hMT+ for the following reasons. According to Melnick et al. (2013), motion surround suppression (SI) and the time thresholds of small and large gratings, which represent hMT+ functionality, are correlated with the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with high correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. In addition, in Fedorenko et al. (2013), the averaged MD activity region appears to overlap with hMT+. Based on these findings, we assumed that hMT+ has the potential to be the core of the MD system.

      (2) Moreover, although the study links MT+ inhibitory mechanisms to a visuo-spatial component of gF, this evidence alone may not suffice to position MT+ as a new core of the MD network. The MD network's definition typically encompasses a range of cognitive domains, including working memory, mathematics, language, and relational reasoning. Therefore, the claim that MT+ represents a new core of MD needs to be supported by more comprehensive evidence.

      We thank the reviewer for pointing this out. After careful consideration, we agree. Because our results address only visuo-spatial intelligence, they are not yet sufficient to prove that hMT+ is the core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while de-emphasizing the significance of hMT+ in the MD system.

      Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to understand the role of GABA-ergic inhibition in the human MT+ region in predicting visuo-spatial intelligence through a combination of behavioral measures, fMRI (for functional connectivity measurement), and MRS (for GABA/glutamate concentration measurement). While this is a commendable goal, it becomes apparent that the authors lack fundamental understanding of vision, intelligence, or the relevant literature. As a result, the execution of the research is less coherent, dampening the enthusiasm of the review.

      Strengths:

      (1) Comprehensive Approach: The study adopts a multi-level approach, i.e., neurochemical analysis of GABA levels, functional connectivity, and behavioral measures to provide a holistic understanding of the relationship between GABA-ergic inhibition and visuo-spatial intelligence.

      (2) Sophisticated Techniques: The use of ultra-high field magnetic resonance spectroscopy (MRS) technology for measuring GABA and glutamate concentrations in the MT+ region is a recent development.

      Weaknesses:

      Study Design and Hypothesis

      (1) The central hypothesis of the manuscript posits that "3D visuo-spatial intelligence (the performance of BDT) might be predicted by the inhibitory and/or excitation mechanisms in MT+ and the integrative functions connecting MT+ with the frontal cortex." However, several issues arise:

      (1.1) The Suppression Index depicted in Figure 1a, labeled as the "behavior circle," appears irrelevant to the central hypothesis.

      We thank the reviewer for pointing this out. In our study, the inhibitory mechanisms in hMT+ are conceptualized through two models: the neurotransmitter model and the behavioral model. The Suppression Index is essential for elucidating the local inhibitory mechanisms within the behavioral model. However, we acknowledge that our initial presentation in the introduction may not have clearly articulated our hypothesis, potentially leading to misunderstandings. We have revised the introduction to better clarify these connections and ensure the relevance of the Suppression Index is comprehensively understood.

      (1.2) The construct of 3D visuo-spatial intelligence, operationalized as the performance in the Block Design task, is inconsistently treated as another behavioral task throughout the manuscript, leading to confusion.

      We thank the reviewer for pointing this out. We acknowledge that our manuscript may have inconsistently presented this construct across different sections, causing confusion. To address this, we now describe 3D visuo-spatial intelligence consistently in both the introduction and the discussion sections. However, we retained “Block Design task score” in the results section so that readers can see which subtest we used.

      (1.3) The schematics in Figure 1a and Figure 6 appear too high-level to be falsifiable. It is suggested that the authors formulate specific and testable hypotheses and preregister them before data collection.

      We thank the reviewer for pointing this out. We have revised Figure 1a to make it less abstract and more logical. The schematic in Figure 6 represents our theoretical framework of how hMT+ contributes to 3D visuo-spatial intelligence. We believe the elements within this framework are grounded in related theories and supported by evidence discussed in our results and discussion sections, making them specific and testable.

      (2) Central to the hypothesis and design of the manuscript is a misinterpretation of a prior study by Melnick et al. (2013). While the original study identified a strong correlation between WAIS (IQ) and the Suppression Index (SI), the current manuscript erroneously asserts a specific relationship between the block design test (from WAIS) and SI. It should be noted that in the original paper, WAIS comprises Similarities, Vocabulary, Block design, and Matrix reasoning tests in Study 1, while the complete WAIS is used in Study 2. Did the authors conduct other WAIS subtests other than the block design task?

      Thank you for pointing this out. Reviewer #1 also asked this question, so we reproduce our answer here: “The decision was informed by Melnick’s findings, which indicated high correlations between surround suppression (SI) and the Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed Indexes, with correlation coefficients of 0.69, 0.47, 0.49, and 0.50, respectively. It is well-established that the hMT+ region of the brain is a sensory cortex involved in visual perception processing (3D perception). Furthermore, motion surround suppression (SI), a specific function of hMT+, aligns closely with this region's activities. Given this context, the Perceptual Reasoning sub-ability was deemed to have the clearest mechanism for further exploration. Consequently, we selected the most representative subtest of Perceptual Reasoning, the Block Design Test, which primarily assesses 3D visual intelligence.”

      (3) Additionally, there are numerous misleading references and unsubstantiated claims throughout the manuscript. As an example of misleading reference, "the human MT ... a key region in the multiple representations of sensory flows (including optic, tactile, and auditory flows) (Bedny et al., 2010; Ricciardi et al., 2007); this ideally suits it to be a new MD core." The two references in this sentence are claims about plasticity in the congenitally blind with sensory deprivation from birth, which is not really relevant to the proposal that hMT+ is a new MD core in healthy volunteers.

      Thank you for pointing this out. We have carefully read the corresponding references, considered the corresponding theories, and agree with these comments. Because our results address only “the GABA-ergic inhibition in human MT predicts visuo-spatial intelligence mediated by reverberation with frontal cortex”, they are not yet sufficient to prove that hMT+ is the core node of the MD system. We will adjust the explanatory logic of the article, emphasizing the de-redundancy function of hMT+ in visuo-spatial intelligence and the improvement of information processing efficiency, while de-emphasizing the significance of hMT+ in the MD system. In addition, regarding the potential central role of hMT+ in the MD system, we agree with your view that research on hMT+ as a multisensory integration hub mainly focuses on developmental processes. Meanwhile, in adults, the MST region of hMT+ is considered a multisensory integration area for visual and vestibular inputs, which potentially supports the role of hMT+ in multitasking multisensory systems (Gu et al., J. Neurosci., 26(1), 73–85, 2006; Fetsch et al., Nat. Neurosci., 15, 146–154, 2012). Further research could explore how other intelligence sub-abilities, such as working memory and language comprehension, are facilitated by hMT+'s features.

      Another example of unsubstantiated claim: the rationale for selecting V1 as the control region is based on the assertion that "it mediates the 2D rather than 3D visual domain (Born & Bradley, 2005)". That's not the point made in the Born & Bradley (2005) paper on MT. It's crucial to note that V1 is where the initial binocular convergence occurs in cortex, i.e., inputs from both the right and left eyes to generate a perception of depth.

      Thank you for pointing this out. We acknowledge the inappropriate citation of "Born & Bradley, 2005," which focuses solely on the structure and function of the visual area MT. However, we believe that choosing hMT+ as the domain for 3D visual analysis and V1 as the control region is justified. Cumming and DeAngelis (Annu. Rev. Neurosci., 24, 203–238, 2001) state that binocular disparity provides the visual system with information about the three-dimensional layout of the environment, and that the link between perception and neuronal activity is stronger in the extrastriate cortex (especially MT) than in the primary visual cortex. This supports our choice and emphasizes the relevance of hMT+ in our study. We have corrected the citation in the revised version.

      Results & Discussion

      (1) The missing correlation between SI and BDT is crucial to the rest of the analysis. The authors should discuss whether they replicated the pattern of results from Melnick et al. (2013) despite using only one WAIS subtest.

      We thank the reviewer for the suggestion. We have placed it in the main text (Figure 3e).

      (2) ROIs: can the authors clarify if the results are based on bilateral MT+/V1 or just those in the left hemisphere? Can the authors plot the MRS scan area in V1? I would be surprised if it's precise to V1 and doesn't spread to V2/3 (which is fine to report as early visual cortex).

      We thank the reviewer for the suggestion. We have drawn the V1 ROI MRS scanning area (Figure supplement 1). Using the template, we checked the coverage of V1, V2, and V3. Although the MRS overlap regions extend to V2 (3%) and V3 (32%), the major coverage of the MRS scanning area is in V1, with 65% overlap across subjects.

      (3) Did the authors examine V1 FC with either the frontal regions and/or whole brain, as a control analysis? If not, can the author justify why V1 serves as the control region only in the MRS but not in FC (Figure 4) or the mediation analysis (Figure 5)? That seems a little odd given that control analyses are needed to establish the specificity of the claim to MT+.

      We thank the reviewer for the suggestion. We have performed the V1 FC-behavior analysis as a control analysis (Figure supplement 7). Only positive correlations in the frontal area were detected, suggesting that in the 3D visuo-spatial intelligence task, V1 plays a role in feedforward information processing. In contrast, hMT+, which showed specific negative correlations in the frontal cortex, is involved in the inhibition mechanism. These results further emphasize the de-redundancy function of hMT+ in 3D visuo-spatial intelligence.

      Regarding the mediation analysis, since GABA/Glu concentration in V1 shows no correlation with BDT score, a mediation analysis is not warranted.

      (4) It is not clear how to interpret the similarity or difference between panels a and b in Figure 4.

      We thank the reviewer for pointing this out. We have further interpreted the difference between panels a and b in the revised version. Panel a shows the hMT+ FC that correlates with BDT score, which clearly involves the frontal cortex, whereas panel b shows the hMT+ FC that correlates with SI, which covers relatively fewer regions. The overlapping region is what we are interested in, and it explains how local inhibitory mechanisms work in 3D visuo-spatial intelligence. In addition, we have revised Figure 4 to mark the overlapping region.

      (5) SI is not relevant to the authors' a priori hypothesis, but is included in several mediation analyses. Can the authors do model comparisons between the ones in Figure 5c, d, and Figure S6? In other words, is SI necessary in the mediation model? There seem to be discrepancies between the necessity of SI in Figures 5c/S6 vs. Figure 5d.

      We thank the reviewer for highlighting this point. The relationship between the Suppression Index (SI) and our a priori hypotheses is elaborated in the response to reviewer 3, section (1). SI plays a crucial role in explicating how local inhibitory mechanisms, on the psychological level, function within the context of the 3D visuo-spatial task. Additionally, Figure 5c illustrates the interaction between the frontal cortex and hMT+, showing how the effects from the frontal cortex (BA46) on the Block Design Task are fully mediated by SI. This further underscores the significance of SI in our model.

      (6) The sudden appearance of "efficient information" in Figure 6, referring to the neural efficiency hypothesis, raises concerns. Efficient visual information processing occurs throughout the visual cortex, starting from V1. Thus, it appears somewhat selective to apply the neural efficiency hypothesis to MT+ in this context.

      We thank the reviewer for highlighting this point. There is no doubt that V1 is involved in efficient visual information processing. However, in our results, V1 GABA shows no significant correlation with BDT score, suggesting that efficient processing in V1 might not sufficiently account for individual differences in 3D visuo-spatial intelligence. Additionally, we will clarify our use of the neural efficiency hypothesis by incorporating it into the introduction of our paper to better frame our argument.

      Transparency Issues:

      (1) Don't think it's acceptable to make the claim that "All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary information". It is the results or visualizations of data analysis, rather than the raw data themselves, that are presented in the paper/supp info.

      We thank the reviewer for pointing this out. We realize that this statement could lead to confusion, and we have deleted it.

      (2) No GitHub link has been provided in the manuscript to access the source data, which limits the reproducibility and transparency of the study.

      We thank the reviewer for pointing this out. We have attached the GitHub link in the revised version.

      Minor:

      "Locates" should be replaced with "located" throughout the paper. For example: "To investigate this issue, this study selects the human MT complex (hMT+), a region located at the occipito-temporal border, which represents multiple sensory flows, as the target brain area."

      We thank the reviewer for pointing this out. We have revised it.

      Use "hMT+" instead of "MT+" to be consistent with the term in the literature.

      We thank the reviewer for pointing this out. We agree and now use hMT+ throughout, consistent with the literature.

      "Green circle" in Figure 1 should be corrected to match its actual color.

      We thank the reviewer for pointing this out. We have revised it.

      The abbreviation for the Wechsler Adult Intelligence Scale should be "WAIS," not "WASI."

      We thank the reviewer for pointing this out. We have revised it.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The figures and tables should be substantially improved.

      We thank the reviewer for pointing this out. We have improved the quality of some of the figures.

      (2) Please explain the sample size, and the difference between Schallmo eLife 2018, and Melnick, 2013.

      We thank the reviewer for pointing this out. These questions are answered in the public review, and we reproduce those answers here.

      (2.1) How was the sample size determined? Is it sufficient?

      We thank the reviewer for pointing this out. We used G*Power to determine our sample size. Melnick et al. (2013) reported a medium effect between SI and the Perceptual Reasoning sub-ability (r = 0.47). We used this r value as the correlation coefficient (ρ H1), setting the power at the commonly used threshold of 0.8 and the alpha error probability at 0.05. The required sample size was calculated to be 26, which ensures that our study has adequate power to yield valid statistical results. Furthermore, compared with earlier within-subject studies such as Schallmo et al. (2018), which used 22 subjects to examine GABA levels in MT+ and the early visual cortex (EVC), our study includes a sufficiently large dataset.

      (2.2) In Schallmo et al. (eLife, 2018), there was no correlation between GABA concentration and SI. How can we justify the different results here?

      Thank you to the reviewer for pointing this out. There are several differences between the two studies, ours and theirs:

      a. While the earlier study by Schallmo et al. (2018) employed 3T MRS, we utilize 7T MRS, enhancing our ability to detect and measure GABA with greater accuracy.

      b. Schallmo et al. (2018) used the bilateral hMT+ as the MRS measurement region, whereas we used the left hMT+. The reasons we focus on the left hMT+ are described in our response to Reviewer #1, point (6). Briefly, the use of left MT/V5 as a target was motivated by studies demonstrating that left MT/V5 TMS is more effective at causing perceptual effects (Tadin et al., 2011).

      c. The MRS sequence in Schallmo et al. (2018) used a 3 cm isotropic voxel, whereas we used a 2 cm isotropic voxel. This helps us locate hMT+ more precisely and exclude more white matter signal.

      (3) Table 1 and Table Supplementary 1-3 contain many correlation results. But what are the main points of these values? Which values do the authors want to highlight? Why are only p-values shown with significance symbols in Table Supplementary 2?

      (3.1) what are the main points of these values?

      Thank you to the reviewer for pointing this out. These correlations represent the relationship between behavioral task performance (SI/BDT) and resting-state functional connectivity. They indicate that left hMT+ is involved in the efficient information integration network during the BDT task. In addition, surround suppression in left hMT+ is associated with several hMT+-frontal connections. Furthermore, the overlapping regions between the two tasks indicate a shared underlying mechanism.

      (3.2) Which values do the authors want to highlight?

      Table 1 and Table Supplementary 1-3 present the preliminary analysis results for Table 2 and Table Supplementary 4-6, so we report all values there. In contrast, in Table 2 and Table Supplementary 4-6, we highlight in bold the significant correlations that survived correction for multiple comparisons.

      (3.3) Why are only p-values shown with significance symbols in Table Supplementary 2?

      Thank you for pointing this out; it was a mistake. We have revised the table and deleted the significance symbols.

      (4) Line 27, it is unclear to me what is "the canonical theory".

      We thank the reviewer for pointing this out. We have revised “the canonical theory" to “the prevailing opinion”.

      (5) Throughout the paper, the authors use "MT+", I would suggest using "hMT+" to indicate the human MT complex, and to be consistent with the human fMRI literature.

      We thank the reviewer for pointing this out. We have revised them and used "hMT+" to be consistent with the human fMRI literature.

      (6) At the beginning of the results section, I suggest including the total number of subjects. It is confusing what "31/36 in MT+, and 28/36 in V1" means.

      We thank the reviewer for pointing this out. We have included the total number of subjects at the beginning of the Results section.

      (7) Line 138, "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area". This sentence is strange because it is a well-established finding in numerous human fMRI papers. I think the authors should be more specific about what this finding implies.

      We thank the reviewer for pointing this out. We have deleted the inappropriate sentence "This finding supports the hypothesis that motion perception is associated with neural activity in MT+ area".

      (8) There are no unit labels for all x- and y-axes in Figure 1. I only see the unit for Conc is mmol per kg wet weight.

      We thank the reviewer for pointing this out. Figure 1 is a schematic and workflow chart, so labels for x- and y-axes are not needed. We believe this comment may pertain to Figure 3. In Figures 3a and 3b, the MRS spectrum does not have a standard y-axis unit because it varies with the individual physical conditions of the scanner; it is widely accepted that no y-axis unit is used. The x-axis unit is ppm, which indicates the chemical shift of different metabolites. In Figure 3c, the BDT represents IQ scores, which do not have a standard unit. Similarly, in Figures 3d and 3e, the Suppression Index does not have a standard unit.

      (9) Although the correlations are not significant in Figure Supplement 2&3, please also include the correlation line, 95% confidence interval, and report the r values and p values (i.e., similar format as in Figure 1C).

      We thank the reviewer for pointing this out. We have revised them.

      (10) There is no need to separate different correlation figures into Figure Supplementary 1-4. They can be combined into the same figure.

      We thank the reviewer for the suggestion. However, each correlation figure in the supplementary figures has its own specific topic and conclusion. The correlation figures in Supplementary Figure 1 indicate that GABA in V1 does not show any correlation with BDT and SI, illustrating that inhibition in V1 is unrelated to both 3D visuo-spatial intelligence and motion suppression processing. The correlations in Supplementary Figure 2 indicate that the excitation mechanism, represented by Glutamate concentration, does not contribute to 3D visuo-spatial intelligence in either hMT+ or V1. Supplementary Figure 3 validates our MRS measurements. Supplementary Figure 4 addresses potential concerns regarding the impact of outliers on correlation significance. Even after excluding two “outliers” from Figures 3d and 3e, the correlation results remain stable.

      (11) Line 213, as far as I know, the study (Melnick et al., 2013) is a psychophysical study and did not provide evidence that the spatial suppression effect is associated with MT+.

      We thank the reviewer for pointing this out. It was a mistake to use this reference, and we have revised it accordingly.

      (12) At the beginning of the results, I suggest providing more details about the motion discrimination tasks and the measurement of the BDT.

      We thank the reviewer for pointing this out. We have included a brief description of the tasks at the beginning of the Results section.

      (13) Please include the absolute duration thresholds of the small and large sizes of all subjects in Figure 1.

      We thank the reviewer for the suggestion. We have included these results in Figure 3.

      (14) Figure 5 is too small. The items in plots a and b are barely visible.

      We thank the reviewer for pointing this out. We have increased the size and resolution of Figure 5.

      Reviewer #2 (Recommendations For The Authors):

      Recommendations for improving the writing and presentation.

      I highly recommend editing the manuscript for readability and the use of the English language. I had significant difficulties following the rationale of the research due to issues with the way language was used.

      We thank the reviewer for pointing this out. We apologize for any shortcomings in our initial presentation. We have invited a native English speaker to revise our manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:  

      Reviewer #1 (Public Review):  

      Summary:  

      Heer and Sheffield used 2 photon imaging to dissect the functional contributions of convergent dopamine and noradrenaline inputs to the dorsal hippocampus CA1 in head-restrained mice running down a virtual linear path. Mice were trained to collect water rewards at the end of the track and on test days, calcium activity was recorded from dopamine (DA) axons originating in the ventral tegmental area (VTA, n=7) and noradrenaline axons from the locus coeruleus (LC, n=87) under several conditions. When mice ran laps in a familiar environment, VTA DA axons exhibited ramping activity along the track that correlated with distance to reward and velocity to some extent, while LC input activity remained constant across the track, but correlated invariantly with velocity and time to motion onset. A subset of recordings taken when the reward was removed showed diminished ramping activity in VTA DA axons, but no changes in the LC axons, confirming that DA axon activity is locked to reward availability. When mice were subsequently introduced to a new environment, the ramping to reward activity in the DA axons disappeared, while LC axons showed a dramatic increase in activity lasting 90 s (6 laps) following the environment switch. In the final analysis, the authors sought to disentangle LC axon activity induced by novelty vs. behavioral changes induced by novelty by removing periods in which animals were immobile and established that the activity observed in the first 2 laps reflected novelty-induced signal in LC axons.  

      Strengths:  

      The results presented in this manuscript provide insights into the specific contributions of catecholaminergic input to the dorsal hippocampus CA1 during spatial navigation in a rewarded virtual environment, offering a detailed analysis of the resolution of single axons. The data analysis is thorough and possible confounding variables and data interpretation are carefully considered.  

      Weaknesses:  

      Aspects of the methodology, data analysis, and interpretation diminish the overall significance of the findings, as detailed below.  

      The LC axonal recordings are well-powered, but the DA axonal recordings are severely underpowered, with recordings taken from a mere 7 axons (compared to 87 LC axons).

      Additionally, 2 different calcium indicators with differential kinetics and sensitivity to calcium changes (GCaMP6S and GCaMP7b) were used (n=3, n=4 respectively) and the data pooled. This makes it very challenging to draw any valid conclusions from the data, particularly in the novelty experiment. The surprising lack of novelty-induced DA axon activity may be a false negative. Indeed, at least 1 axon (axon 2) appears to be showing a novelty-induced rise in activity in Figure 3C. Changes in activity in 4/7 axons are also referred to as a 'majority' occurrence in the manuscript, which again is not an accurate representation of the observed data.  

      We appreciate the reviewer's detailed feedback regarding the analysis of VTA axons in our dataset. The relatively low sample size for VTA axons is due to their sparsity in the dCA1 region of the hippocampus and the inherent difficulty in recording from these axons. VTA axons are challenging to capture due to their low baseline fluorescence and long-range axon segments, resulting in a typical yield of only a single axon per field of view (FOV) per animal. In contrast, LC axons are more abundant in dCA1.

      To address the disparity in sample sizes between LC and VTA axons, we down-sampled the LC axons to match the number of VTA axons, repeating this process 1000 times to create a distribution. However, we acknowledge the reviewer's concern that the relatively low sample size for VTA axons might result in insufficient sampling of this population. Increasing the baseline expression of GCaMP to record from VTA axons requires several months, limiting our ability to quickly expand the sample size.
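      The down-sampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the per-axon values, the choice of the mean as the summary statistic, and the function names are hypothetical, and the constraint of drawing only one LC axon per imaging session is omitted for brevity.

```python
import random
from statistics import mean


def downsampled_means(lc_values, n_vta, n_iter=1000, seed=0):
    """Draw n_vta LC axons without replacement, n_iter times.

    Returns the mean of each draw, giving a distribution that reflects
    what the LC population looks like at the VTA sample size.
    """
    rng = random.Random(seed)
    return [mean(rng.sample(lc_values, n_vta)) for _ in range(n_iter)]


def fraction_at_or_above(null_values, observed):
    """Fraction of down-sampled LC means that reach the observed VTA mean."""
    return sum(value >= observed for value in null_values) / len(null_values)


# Hypothetical per-axon values (for example, a position-activity correlation),
# generated only so that the sketch runs; they are not data from the study.
demo_rng = random.Random(1)
lc_demo = [demo_rng.gauss(0.0, 0.1) for _ in range(87)]
vta_demo = [demo_rng.gauss(0.3, 0.1) for _ in range(9)]

null_means = downsampled_means(lc_demo, n_vta=len(vta_demo))
p_like = fraction_at_or_above(null_means, mean(vta_demo))
print(len(null_means), p_like)
```

      Comparing the observed VTA statistic against the full distribution, rather than against a single down-sampled draw, avoids conclusions that depend on which LC axons happened to be selected.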

      In response to the reviewer's comments, we have added recordings from 2 additional VTA axons, increasing the sample size from 7 to 9. We re-analyzed all data from the familiar environment with n=9 VTA axons, comparing them to down-sampled LC axons as previously described. However, the additional axons were not recorded in the novel environment. We agree with the reviewer that the lack of novelty-induced DA axon activity may be a false negative. To address this, we have revised the description of our results to include the following sentence:

      “However, 1 VTA ROI showed an increase in activity immediately following exposure to novelty, indicating heterogeneity across VTA axons in CA1, and the lack of a novelty signal on average may be due to a small sample size.”

      Regarding the use of two different GCaMP constructs, we understand the reviewer's concern. We used GCaMP6s and GCaMP7b variants to determine if one would improve the success rate of recording from VTA axons. Given the long duration of these experiments and the low yield, we pooled the data from both GCaMP variants to increase statistical power. However, we recognize the importance of verifying that there are no differences in the signals recorded with these variants.

      With the addition of 2 VTA DA axons expressing GCaMP6s, we now have n=5 GCaMP6s and n=4 GCaMP7b VTA DA axons. This allowed us to compare the activity of the two sensors in the familiar environment. As shown in new Supplementary Figure 2, both sets of axons responded similarly to the variables measured: position in VR, time to motion onset, and animal velocity (although the GCaMP6s expressing axons showed stronger correlations). Since all LC axons recorded expressed GCaMP6s, we also specifically compared VTA GCaMP6s axons to LC GCaMP6s axons (Supp Fig. 3). Our conclusions remained consistent when comparing this subset of VTA axons to LC axons.

      Overall, our paper now includes comparisons of combined VTA axons (n=9) and separately the GCaMP6s-expressing VTA axons (n=5) with LC axons. Both datasets support our initial conclusions that VTA axons signal proximity to reward, while LC axons encode velocity and motion initiation in familiar environments.

      The authors conducted analysis on recording data exclusively from periods of running in the novelty experiment to isolate the effects of novelty from novelty-induced changes in behavior. However, if the goal is to distinguish between changes in locus coeruleus (LC) axon activity induced by novelty and those induced by motion, analyzing LC axon activity during periods of immobility would enhance the robustness of the results.  

      We appreciate the reviewer's insightful suggestion to analyze LC axon activity during periods of immobility to distinguish between changes induced by novelty and those induced by motion. This additional analysis would indeed strengthen our conclusions regarding the LC novelty signal.

      In response to this suggestion, we performed the same analysis as before, but focused on periods of immobility. Our findings indicate that following exposure to novelty, there was a significant increase in LC activity specifically during immobility. This supports the idea that LC axons produce a novelty signal that is independent of novelty-induced behavioral changes. The results of this analysis are now presented in new Supplementary Figure 5b.
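      The logic of restricting the comparison to immobile periods can be sketched as below. This is an illustration only: the function name, the toy traces, and the threshold value are hypothetical, and the immobility threshold is left as a required argument because it is not restated in this response.

```python
from statistics import mean


def split_activity_by_mobility(activity, velocity_cm_s, immobility_threshold_cm_s):
    """Split frame-wise activity into immobile and running frames.

    A frame counts as immobile when its velocity is below the threshold.
    """
    if len(activity) != len(velocity_cm_s):
        raise ValueError("activity and velocity must have the same length")
    immobile, running = [], []
    for value, speed in zip(activity, velocity_cm_s):
        (immobile if speed < immobility_threshold_cm_s else running).append(value)
    return immobile, running


# Toy traces, made up for illustration; the threshold of 1.0 cm/s is hypothetical.
familiar_activity = [1.0, 2.0, 3.0, 4.0]
familiar_velocity = [0.0, 10.0, 0.0, 20.0]
novel_activity = [2.0, 5.0, 4.0, 6.0]
novel_velocity = [0.0, 15.0, 0.0, 12.0]

familiar_immobile, _ = split_activity_by_mobility(familiar_activity, familiar_velocity, 1.0)
novel_immobile, _ = split_activity_by_mobility(novel_activity, novel_velocity, 1.0)

# A positive difference means higher activity in the novel environment
# even when only immobile frames are compared.
novelty_effect_immobile = mean(novel_immobile) - mean(familiar_immobile)
print(novelty_effect_immobile)
```

      Because both environments are compared within the same behavioral state, a difference in this measure cannot be attributed to a change in the amount of running.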

      The authors attribute the ramping activity of the DA axons to the encoding of the animals' position relative to reward. However, given the extensive data implicating the dorsal CA1 in timing, and the remarkable periodicity of the behavior, the fact that DA axons could be signalling temporal information should be considered.  

      This is an insightful comment regarding the potential role of VTA DA axons in signaling temporal information. We agree that VTA DA axons could indeed be encoding temporal information, as previous work from our lab has shown that these axons exhibit ramping activity when averaged by time to reward (Krishnan et al., 2022).

      To address this, we have now examined DA axon activity relative to time to reward, as shown in new Supplementary Figure 4. Our analysis confirms that these axons ramp up in activity relative to time to reward. Given the periodicity of our mice's behavior in these experiments, as the reviewer correctly points out, we are unable to distinguish between spatial proximity to reward and time to reward. We have added a sentence to our paper highlighting this limitation and stating that further experiments are necessary to differentiate these two variables.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      The authors should explain and justify the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments.  

      We appreciate the reviewer's insightful comment regarding the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments. The choice of a 3m track for LC axon recordings was made to align with a previous experiment from our lab (Dong et al., 2021), in which mice were exposed to a novel 3m track while CA1 pyramidal cell populations were recorded. In that study, we detailed the time course of place field formation within the novel track. Our current hypothesis is that LC axons signal novelty, and we aimed to investigate whether the time course of LC axon activity aligns with the time course of place field formation. This hypothesis, and the potential role of LC axons in facilitating plasticity for new place field formation, is further discussed in the Discussion section of our paper.

      For the VTA axon recordings, we utilized a 2m track, consistent with another recent study from our lab (Krishnan et al., 2022), where reward expectation was manipulated, and CA1 pyramidal cell populations were recorded. By matching the track length to this prior study, we aimed to explore how VTA dopaminergic inputs to CA1 might influence CA1 population dynamics along the track under conditions of varying reward expectations.

      We acknowledge that using different track lengths for LC and VTA recordings introduces a variable that could potentially confound direct comparisons. To address this, we normalized the track lengths for our LC versus VTA comparison analysis. This normalization allowed us to directly compare patterns of activity across the two types of axons by adjusting the data to a common scale, thereby ensuring that any observed differences or similarities are attributable to the intrinsic properties of the axons rather than differences in track lengths. By doing so, we could assess relative changes in activity levels at matched spatial bins.
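      The track-length normalization can be illustrated with a short sketch. The bin count and function names are assumptions for illustration, not parameters taken from the paper; the idea is only that position is expressed as a fraction of track length before binning, so that bin k covers the same relative portion of the 2m and 3m tracks.

```python
def normalized_bin(position_m, track_length_m, n_bins=50):
    """Map a position in metres to a spatial bin on a length-normalized track."""
    if not 0 <= position_m <= track_length_m:
        raise ValueError("position must lie on the track")
    fraction = position_m / track_length_m
    # The end of the track (fraction == 1.0) is assigned to the last bin.
    return min(int(fraction * n_bins), n_bins - 1)


def binned_mean_activity(positions_m, activity, track_length_m, n_bins=50):
    """Average activity within each normalized bin (None where a bin is empty)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for position, value in zip(positions_m, activity):
        index = normalized_bin(position, track_length_m, n_bins)
        sums[index] += value
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# The midpoint of each track lands in the same bin after normalization.
print(normalized_bin(1.0, 2.0, n_bins=10), normalized_bin(1.5, 3.0, n_bins=10))
```

      Normalizing in this way aligns relative position but not absolute distance, which is why an additional comparison by un-normalized distance from reward is a useful complement.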

      Although the experiences of the animals on the different track lengths are not identical, our observations suggest that LC and VTA axon signals are not substantially influenced by variations in track length. LC axons are associated with velocity and a pre-motion initiation signal, neither of which is affected by track length. VTA axons, which also correlate with velocity, can be compared to LC axon velocity signals because mice reach maximal velocity very quickly along the track, well before the end of the 2m track. The range of velocities is therefore captured on both track lengths. While VTA axons exhibit ramping activity as they approach the reward zone (a signal potentially modulated by track length), LC axons do not show such ramping to reward signals. Thus, a comparison across different track lengths is justified for this aspect of our analysis.

      To further enhance the rigor of our comparisons between axon dynamics recorded on 2m and 3m tracks, we conducted an additional analysis plotting axon activity by time to reward and actual (un-normalized) distance from reward (Supplementary Figure 4). This analysis revealed very similar signals between the two sets of axons, supporting our initial conclusions.

      We thank the reviewer for raising this important point and hope that our detailed explanation and additional analysis address their concern.

      Krishnan, L.S., Heer, C., Cherian, C., Sheffield, M.E. Reward expectation extinction restructures and degrades CA1 spatial maps through loss of a dopaminergic reward proximity signal. Nat Commun 13, 6662 (2022).

      Dong, C., Madar, A. D. & Sheffield, M.E. Distinct place cell dynamics in CA1 and CA3 encode experience in new environments. Nat Commun 12, 2977 (2021).

      Reviewer #2 (Public Review):  

      Summary:  

      The authors used 2-photon Ca2+-imaging to study the activity of ventral tegmental area (VTA) and locus coeruleus (LC) axons in the CA1 region of the dorsal hippocampus in head-fixed male mice moving on linear paths in virtual reality (VR) environments.  

      The main findings were as follows:  

      - In a familiar environment, the activity of both VTA axons and LC axons increased with the mice's running speed on the Styrofoam wheel, with which they could move along a linear track through a VR environment.  

      - VTA, but not LC, axons showed marked reward position-related activity, showing a ramping-up of activity when mice approached a learned reward position.  

      - In contrast, the activity of LC axons ramped up before the initiation of movement on the Styrofoam wheel.  

      - In addition, exposure to a novel VR environment increased LC axon activity, but not VTA axon activity.  

      Overall, the study shows that the activity of catecholaminergic axons from VTA and LC to dorsal hippocampal CA1 can partly reflect distinct environmental, behavioral, and cognitive factors. Whereas both VTA and LC activity reflected running speed, VTA, but not LC axon activity reflected the approach of a learned reward, and LC, but not VTA, axon activity reflected initiation of running and novelty of the VR environment.  

      I have no specific expertise with respect to 2-photon imaging, so cannot evaluate the validity of the specific methods used to collect and analyse 2-photon calcium imaging data of axonal activity.  

      Strengths:  

      (1) Using a state-of-the-art approach to record separately the activity of VTA and LC axons with high temporal resolution in awake mice moving through virtual environments, the authors provide convincing evidence that the activity of VTA and LC axons projecting to dorsal CA1 reflect partly distinct environmental, behavioral and cognitive factors.  

      (2) The study will help a) to interpret previous findings on how hippocampal dopamine and norepinephrine or selective manipulations of hippocampal LC or VTA inputs modulate behavior and b) to generate specific hypotheses on the impact of selective manipulations of hippocampal LC or VTA inputs on behavior.  

      Weaknesses:  

      (1) The findings are correlational and do not allow strong conclusions on how VTA or LC inputs to dorsal CA1 affect cognition and behavior. However, as indicated above under Strengths, the findings will aid the interpretation of previous findings and help to generate new hypotheses as to how VTA or LC inputs to dorsal CA1 affect distinct cognitive and behavioral functions.  

      (2) Some aspects of the methodology would benefit from clarification.  

      First, to help others to better scrutinize, evaluate, and potentially to reproduce the research, the authors may wish to check if their reporting follows the ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines for the full and transparent reporting of research involving animals (https://arriveguidelines.org/). For example, I think it would be important to include a sample size justification (e.g., based on previous studies, considerations of statistical power, practical considerations, or a combination of these factors). The authors should also include the provenance of the mice. Moreover, although I am not an expert in 2-photon imaging, I think it would be useful to provide a clearer description of exclusion criteria for imaging data.

      We thank the reviewer for helping us formalize the scientific rigor of our study. There are ten ARRIVE Guidelines and we have addressed most of them in our study already. However, there is an opportunity to add detail. We have listed below all ten points and how we have addressed each one (and point out any new additions):

      (1) Experimental design - we go into great depth explaining the experimental set-up, how we used the autofluorescent blebs as imaging controls, how we controlled for different sample sizes between the two populations, and the statistical tests used for comparisons. We also carefully accounted for animal behavior when quantifying and describing axon dynamics both in the familiar and novel environments.

      (2) Sample size - we state both the number of ROIs and mice for each analysis. We have now also added the number of mice we observed specific types of activity in. 

      (3) Inclusion/exclusion criteria - The following has now been added to the Methods section: Out of the 36 NET-Cre mice injected, 15 were never recorded from, either because they failed to reach behavioral criteria or because of a lack of visible expression in axons. Out of the 54 DAT-Cre mice injected, imaging was never conducted in 36 of them for lack of expression or failure to reach behavioral criteria. Out of the remaining 21 NET-Cre mice, 5 were excluded for heat bubbles, z-drift, or bleaching, while 10 DAT-Cre mice were excluded for the same reasons. This was determined by visually assessing imaging sessions, followed by using the registration metrics output by suite2p. This registration metric conducted a PCA on the motion-corrected ROIs and plotted the first PC. If the PC drifted substantially, to the point where no activity was apparent, the video was excluded from analysis. 

      (4) Randomization - Already included in the paper is a description of random downsampling of LC axons to make statistical comparisons with VTA axons. LC axons were selected pseudo-randomly (only one axon per imaging session) to match VTA sampling statistics. This randomization was repeated 1000 times and comparisons were made against this random distribution. 

      (5) Blinding-masking - no blinding/masking was conducted as no treatments were given that would require this. We will include this statement in the next version. 

      (6) Outcomes - We defined all outcomes measured, such as those related to animal behavior and axon signaling. 

      (7) Statistical methods - None of the reviewers had any issues regarding our description of statistical methods, which we described in great detail in this version of the paper. 

      (8) Experimental animals - We have now described that DAT-Cre mice were obtained through JAX labs, and NET-Cre mice were obtained from the Tonegawa lab (Wagatsuma et al. 2017). This was absent in the initial version of the paper.

      (9) Experimental procedure - Already listed in great detail in Methods section.

      (10) Results - Rigorously described in detail for behaviors and related axon dynamics.

      Wagatsuma, Akiko, Teruhiro Okuyama, Chen Sun, Lillian M. Smith, Kuniya Abe, and Susumu Tonegawa. “Locus Coeruleus Input to Hippocampal CA3 Drives Single-Trial Learning of a Novel Context.” Proceedings of the National Academy of Sciences 115, no. 2 (January 9, 2018): E310–16. https://doi.org/10.1073/pnas.1714082115.

      Second, why were different linear tracks used for studies of VTA and LC axon activity (from line 362)? Could this potentially contribute to the partly distinct activity correlates that were found for VTA and LC axons?  

      We thank the reviewer for pointing this out and giving us a chance to address it directly. A detailed response to this is written above for a similar comment from reviewer 1.

      Third, the authors seem to have used two different criteria for defining immobility. Immobility was defined as moving at <5 cm/s for the behavioral analysis in Figure 3a, but as <0.2 cm/s for the imaging data analysis in Figure 4 (see legends to these figures and also see Methods, from line 447, line 469, line 498)? I do not understand why, and it would be good if the authors explained this.  

      This is a typo leftover from before we converted velocity from rotational units of the treadmill to cm/s. This has now been corrected.

      (3) In the Results section (from line 182) the authors convincingly addressed the possibility that less time spent immobile in the novel environment may have contributed to the novelty-induced increase of LC axon activity in dorsal CA1 (Figure 4). In addition, initially (for the first 2-4 laps), the mice also ran more slowly in the novel environment (Figure 3aIII, top panel). Given that LC and VTA axon activity were both increasing with velocity (Figure 1F), reduced velocity in the novel environment may have reduced LC and VTA axon activity, but this possibility was not addressed. Reduced LC axon activity in the novel environment could have blunted the novelty-induced increase. More importantly, any potential novelty-induced increase in VTA axon activity could have been masked by decreases in VTA axon activity due to reduced velocity. The latter may help to explain the discrepancy between the present study and previous findings that VTA neuron firing was increased by novelty (see Discussion, from line 243). It may be useful for the authors to address these possibilities based on their data in the Results section, or to consider them in their Discussion.  

      We appreciate the reviewer's insightful comment regarding the potential impact of decreased velocity on novelty responses in LC and VTA axons. The decreased velocity in the novel environment could lead to a diminished novelty response in LC axons and could mask a subtle novelty signal in VTA axons. We have now included the following points in our discussion:

      “In addition, as noted above, on average we did observe a velocity associated signal in VTA axons. When mice were exposed to the novel environment their velocity initially decreased. This would be expected to reduce the average signal across the VTA axon population relative to the higher velocity in the familiar environment. It is possible that this decrease could somewhat mask a subtle novelty induced signal in VTA axons. Therefore, additional experiments should be conducted to investigate the heterogeneity of these axons and their activity under different experimental conditions during tightly controlled behavior.”

      “As discussed above, the slowing down of animal behavior in the novel environment could have decreased LC axon activity and reduced the magnitude of the novelty signal we detected during running. The novelty signal we report here may therefore be an underestimate of its magnitude under matched behavioral settings.”

      However, it is important to note that although VTA axons, on average, showed activity modulated by velocity in a familiar rewarded environment, this relationship was largely due to the activity of two VTA axons that were strongly modulated by velocity, indicating heterogeneity within the VTA axon population in dCA1. We have highlighted this point in the discussion. We also discuss that:

      “It is possible that some VTA DA inputs to dCA1 respond to novel environments, and the small number of axons recorded here are not representative of the whole population.”

      (4) Sensory properties of the water reward, which the mice may be able to detect, could account for reward-related activity of VTA axons (instead of an expectation of reward). Do the authors have evidence that this is not the case? Occasional probe trials, intermixed with rewarded trials, could be used to test for this possibility.  

      Mice receive their water reward through a water spout that is immobile and positioned directly in front of their mouth. Water delivery is triggered by a solenoid when the mice reach the end of the virtual track. Therefore, because the water spout is immobile and the water reward is not delivered until they reach the end of the track, there is nothing for the mice to detect during their run. We have added clarifications about the water spout to the Methods and Results sections, along with appropriate discussion points.

      Additionally, we note that the ramping activity of VTA axons is still present on the initial laps with no reward (Krishnan et al., 2022), indicating that this activity is not directly related to the presence or absence of water but is instead associated with the animal’s reward expectation.

      We thank the reviewer for raising this point and hope that these clarifications address their concern.

      Reviewer #3 (Public Review):  

      Summary:  

      Heer and Sheffield provide a well-written manuscript that clearly articulates the theoretical motivation to investigate specific catecholaminergic projections to dorsal CA1 of the hippocampus during a reward-based behavior. Using 2-photon calcium imaging in two groups of cre transgenic mice, the authors examine the activity of VTA-CA1 dopamine and LC-CA1 noradrenergic axons during reward seeking in a linear track virtual reality (VR) task. The authors provide a descriptive account of VTA and LC activities during walking, approach to reward, and environment change. Their results demonstrate LC-CA1 axons are activated by walking onset, modulated by walking velocity, and heighten their activity during environment change. In contrast, VTA-CA1 axons were most activated during the approach to reward locations. Together the authors provide a functional dissociation between these catecholamine projections to CA1. A major strength of their approach is the methodological rigor of 2-photon recording, data processing, and analysis approaches. These important systems neuroscience studies provide solid evidence that will contribute to the broader field of learning and memory. The conclusions of this manuscript are mostly well supported by the data, but some additional analysis and/or experiments may be required to fully support the author's conclusions.  

      Weaknesses:  

      (1) During teleportation between familiar to novel environments the authors report a decrease in the freezing ratio when combining the mice in the two experimental groups (Figure 3aiii). A major conclusion from the manuscript is the difference in VTA and LC activity following environment change, given VTA and LC activity were recorded in separate groups of mice, did the authors observe a similar significant reduction in freezing ratio when analyzing the behavior in LC and VTA groups separately?  

      In response to the comment regarding the freezing ratios during teleportation between familiar and novel environments, we have analyzed the freezing ratios and lap velocities of DAT-Cre and NET-Cre mice separately (Fig. 3Aiii). Our analysis shows that the mean lap velocities of both groups overlap in the familiar environment and significantly decrease on the first lap of the novel environment (Fig. 3Aiii, top). For subsequent laps, the velocities in both groups are not significantly different from the familiar environment lap velocities.

      Freezing ratios also show a statistically significant decrease on the first lap of the novel environment compared to the familiar environment in both groups (Fig. 3Aiii, bottom). In the NET-Cre mice, the freezing ratios remain significantly lower in subsequent laps, while in the DAT-Cre mice, the following laps show a similar trend but without statistical significance. This lack of statistical significance in the DAT-Cre mice is likely due to their already lower freezing ratios in the familiar environment. Overall, the data demonstrate similar behavioral responses in the two groups of mice during the switch from the familiar to the novel environment.

      (2) The authors satisfactorily apply control analyses to account for the unequal axon numbers recorded in the LC and VTA groups (e.g. Figure 1). However, given the heterogeneity of responses observed in Figures 3c, 4b and the relatively low number of VTA axons recorded (compared to LC), there are some possible limitations to the author's conclusions. A conclusion that LC-CA1 axons, as a general principle, heighten their activity during novel environment presentation, would require this activity profile to be observed in some of the axons recorded in most all LC-CA1 mice.

      We agree with the reviewer’s point. To address this issue, when downsampling LC axons to compare to VTA axons, we matched the sampling statistics of the VTA axons/mice by only selecting one LC axon from each mouse to match the VTA dataset.
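      The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the helper name `one_axon_per_mouse`, the `(mouse_id, axon_id)` data layout, and the random choice within each mouse are all assumptions, since the response only states that one LC axon was selected from each mouse.

```python
import random
from collections import defaultdict


def one_axon_per_mouse(axons, seed=None):
    """Pick one axon ROI per mouse (hypothetical helper).

    `axons` is an iterable of (mouse_id, axon_id) pairs. Choosing at random
    within each mouse is an assumption; the source only says that one LC
    axon was selected per mouse.
    """
    rng = random.Random(seed)
    by_mouse = defaultdict(list)
    for mouse_id, axon_id in axons:
        by_mouse[mouse_id].append(axon_id)
    # Sort mouse IDs so the result is reproducible for a given seed.
    return {mouse: rng.choice(ids) for mouse, ids in sorted(by_mouse.items())}


# Made-up example data: three mice with unequal numbers of recorded axons.
lc_axons = [("m1", "a1"), ("m1", "a2"), ("m2", "a3"), ("m3", "a4"), ("m3", "a5")]
subsample = one_axon_per_mouse(lc_axons, seed=0)
print(subsample)
```

      Repeating the draw with different seeds would give a distribution of downsampled results, which is one common way to check that a conclusion does not depend on which axon happened to be chosen.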

      Additionally, we have now included the number of recording sessions and the number of mice in which we observed each type of activity. This information has been added to further clarify and support our conclusions.

      Additionally, if the general conclusion is that VTA-CA1 axons ramp activity during the approach to reward, it would be expected that this activity profile was recorded in the axons of most all VTA-CA1 mice. Can the authors include an analysis to demonstrate that each LC-CA1 mouse contained axons that were activated during novel environments and that each VTA-CA1 mouse contained axons that ramped during the approach to reward?  

      As above, we have now added the number of mice that had each activity type we report in the paper here.  

      (3) A primary claim is that LC axons projecting to CA1 become activated during novel VR environment presentation. However, the experimental design did not control for the presentation of a familiar environment. As I understand, the presentation order of environments was always familiar, then novel. For this reason, it is unknown whether LC axons are responding to novel environments or environmental change. Did the authors re-present the familiar environment after the novel environment while recording LC-CA1 activity?  

      While we did not vary the presentation order of familiar and novel environments, we recorded the activity of LC axons in some mice when exposed to a dark environment (no VR cues) prior to exposure to the familiar environment. Our analysis of this data demonstrates that LC axons are also active following abrupt exposure to the familiar environment.

      We have added a new figure showing this response (Supplementary Figure 5A) and expanded on our original discussion point that LC axon activity generally correlates with arousal, as this result also supports that interpretation.

      We thank the reviewer for highlighting this important consideration. It certainly helps with the interpretation regarding what LC axons generally encode.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      In addition to what has been described in the public review, I have the following recommendations:  

      The sample size of DA axon recordings should be increased with the use of a single GCaMP for valid conclusions to be made about the lack of novelty-inducted activity in these axons.  

      We have increased the n of VTA GCaMP6s axons in the familiar environment by including two axons that were recorded in the familiar rewarded condition. We have also conducted an analysis comparing GCaMP6s versus GCaMP7b, which is discussed in detail above.

      Regarding the concerns about valid conclusions of novelty-induced activity in VTA axons, we have added a comment in the discussion to tone down our conclusions regarding the lack of a novelty signal in the VTA axons. This valid concern is discussed in detail above.  

      The title is currently very generic, and non-informative. I recommend the use of more specific language in describing the type of behavior under investigation. It is not clear to the reviewer why 'learning' is included here.  

      Original title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during behavior and learning”

      To make it more specific to the experiments conducted here, we have changed the title to this:

      New title: “Distinct catecholaminergic pathways projecting to hippocampal CA1 transmit contrasting signals during navigation in familiar and novel environments”

      Error noted in Figure 4C legend - remove reference to VTA ROIs.  

      The reference to VTA ROIs has been removed from the figure legend.

      Reviewer #2 (Recommendations For The Authors):  

      (1) The concluding sentence of the Abstract could be more specific: which distinct types of information are reflected/'signaled'/'encoded' by LC and VTA inputs to dorsal CA1?  

      The abstract has been adjusted accordingly. The new sentence is more specific: “These inputs encode unique information, with reward information in VTA inputs and novelty and kinematic information in LC inputs, likely contributing to differential modulation of hippocampal activity during behavior and learning.”

      (2) Line 46/47: The study by Mamad et al. (2017) did not quite show that VTA dopamine input to dorsal CA1 'drives place preference'. To my understanding, the study showed that suppression of VTA dopamine signaling in a specific place caused avoidance of this place and that VTA dopamine signaling modulated hippocampal place-related firing. So, please consider rephrasing.  

      Corrected, thanks for pointing this out.

      (3) Legend to Figure 3AIII: 'Each lap was compared to the first lap in F . . .' Could you clarify if 'F' refers to the 'familiar environment?  

      The figure legend has been changed accordingly.

      (4) Line 176: '36 LC neurons' - should this not be '36 imaged axon terminals in dorsal CA1' or something along these lines?  

      This reference has been changed to “LC axon ROIs”

      (5) Line 353: Why was water restriction started before the hippocampal window implant, if behavioral training to run for water reward only started after the implant? Please clarify.

      A sentence was added to the methods to explain that this was done to reduce bleeding and swelling during the hippocampal window implantation.  

      (6) Line 377: '. . . which took 10-14 days (although some mice never reached this threshold).' How many mice did not reach the criterion within 14 days? I think it is not accurate to say the mice 'never' reached the threshold, as they were only tested for a limited period of time.  

      We have added details of how many mice were excluded from each group and the reason why they were excluded.

      (7) Exclusion criteria for imaging data: The authors state (from line 402): 'Imaging sessions with large amounts of drift or bleaching were excluded from analysis (8 sessions for NET mice, 6 sessions for LC Mice).' What exactly were the quantitative exclusion criteria? Were these defined before the onset of the study or throughout the study?  

      Imaging sessions were first qualitatively assessed by looking for disappearance or movement of structures in the Z-plane throughout the imaging FOV. Additionally, following motion correction in suite2p, we used the registration metrics, which plot the first principal component (PC) of the motion-corrected images, to assess for drift, bleaching, or heat bubbles. If this variable increased or decreased greatly throughout a session, to the point where any apparent activity was not visible in the first PC, the dataset was excluded. We have added these exclusion criteria to the methods section.

      Reviewer #3 (Recommendations For The Authors):  

      Please provide a justification or rationale for having two different criteria for immobility (< 5cm/sec) and freezing (<0.2 cm/sec). If VTA and LC axon activities are different between these two velocities, please provide some commentary on this difference.  

      This is a typo leftover from before we converted velocity from rotational units to cm/s.

    1. Welcome back and in this demo lesson I'm going to step through how you can register a domain using Route 53. Now this is an optional step within the course. Worst case you should know how to perform the domain registration process within AWS and optionally you can use this domain within certain demos within the course to get a more real-world like experience.

      To get started, as always, just make sure that you're logged in to the IAM admin user of the general AWS account which is the management account of the organization. Now make sure that you have the Northern Virginia region selected. While Route 53 is a global service, I want you to get into the habit of using the Northern Virginia region. Now we're going to be using the Route 53 product, so click in the search box at the top of the screen, type Route 53 and then click to move to the Route 53 console.

      Now Route 53, at least in the context of this demo lesson, has two major areas. First is hosted zones and this is where you create or manage DNS zones within the product. Now DNS zones, as you'll learn elsewhere in the course, you can think of as databases which store your DNS records. When you create a hosted zone within Route 53, Route 53 will allocate four name servers to host this hosted zone. And that's important, you need to understand that every time you create a new hosted zone, Route 53 will allocate four different name servers to host that zone. Now the second area of Route 53 is registered domains, and it's in the registered domains area of the console where you can register a domain or transfer a domain in to Route 53.

      Now we're going to register a domain, but before we do that, if you do see any notifications about trying out new versions of the console, then go ahead and click to try out that new version. Where possible, I always like to teach using the latest version of the console UI because it's going to be what you'll be using long-term. So in my case, I'm going to go ahead and click on, try out the new console, depending on when you're doing this demo, you may see this or not. In either case, you want to be using this version of the console UI. So if you are going to register a domain for this course, then you need to go ahead and click register domains.

      The first step is to type the domain that you want into this box. Now, a case study that I use throughout the course is Animals for Life. So I'm going to go ahead and register a domain related to this case study. So if I type animalsforlife.com and press enter, it will search for the domain and tell us whether it's available. In this case, animalsforlife.com is not available. It's already been registered. In my case, I'm going to use an alternative, so I'm going to try and register animalsforlife.io. Now, .io domains are among the most expensive, so if you are registering a domain yourself, I would tend to advise you to look for one of the cheaper ones. I'm going to register this one and it is available.

      Once I've verified that it is available and it's the one I want, we're gonna go ahead and click on select. We can verify the price of this domain for one year, in this case it's 71 US dollars, and then go ahead and click on proceed to check out. Now it's here where you can specify a duration for the domain registration. You can use the default of one year, or alternatively you can go ahead and pick a longer registration period. For this domain I'm going to choose one year and then you can choose whether you want to auto renew the domain after that initial period. In my case I'm going to leave this selected. You'll see a subtotal at the price and then you can click next to move on to the next step.

      Now at this point you need to specify the contact type. In most cases you'll be putting a person or a company but there's also association, public body or reseller. You need to go ahead and fill in all of these details and they do need to be valid details, that's really important. If you are worried about privacy, most domains will allow you to turn on privacy protection, so any details that you enter here cannot be seen externally. Now obviously to keep my privacy intact, I'm going to go ahead and fill in all of these details and I'm going to hide the specifics and once I've entered them all, I'm going to go ahead and click on 'Next' and you should do the same. Again I've hidden my details on the bottom of the screen.

      Route 53 does tell you that in addition to the domain registration cost there is a monthly cost for the hosted zone which will be created as part of this registration. So there is a small monthly cost for every hosted zone which you have hosted using Route 53 and every domain that you have will need one hosted zone. So I'm going to scroll down. Everything looks good, you'll need to agree to the terms and conditions and then click on submit. Now at this point the domain is registering and it will take some time to complete. You may receive a registration email which may include something that you need to do, clicking on a link or some other form of identity verification. You might not get that, but if you do get it, it's important that you do follow all of the steps contained within that email. And if you don't receive an email, you should check your spam folder, because if there are any actions to perform and you don't, it could result in the domain being disabled.

      You can see the status of the domain registration by clicking on "requests" directly below "registered domains". The status will initially be listed as "in progress", and we need this to change to "successful". So pause the video, wait for this status to change, and then you're good to continue. Welcome back, in my case this took about 20 minutes to complete, but as you can see my domain is now registered. So if we go to registered domains you'll be able to see the domain name listed together with the expiration date, the auto renew status, and the status of the transfer lock. Now transfer lock is a security feature, it means the domain cannot be transferred away from Route 53 without you disabling this lock.

      Now we're able to see additional details on the domain if we click on the domain name. Now obviously I've hidden my contact information. If you click on the DNSSEC keys tab then it's here where you can configure DNSSEC on the domain. We won't be doing anything with that at this stage. One of the important points I want to draw your attention to is the name servers. So I've registered animalsforlife.io and it's these name servers that will be entered into the animalsforlife record within the .io top level domain zone. So these servers are the ones that the DNS system will point at. These currently are set to four Route 53 name servers. And because we've registered the domain inside Route 53, this process is automatic. So a hosted zone is created, four name servers are allocated to host this hosted zone, and then those four name servers are entered into our domain record in the top level domain zone.

      This process end-to-end is all automatic. So the four name servers for the animalsforlife.io hosted zone are entered into the animalsforlife.io record within the .io top level domain zone. It's all automatic. So if we move to the hosted zone area of the console, go inside animalsforlife.io, and then expand the hosted zone details at the top, these are the four name servers which are hosting this hosted zone. And if you're paying attention, you'll note these are the same four servers that are contained within the registered domains area of the console, and these are the same four servers which have been entered into the .io top level domain zone. Now if you ever delete and then recreate a hosted zone, it's going to be allocated four brand new name servers. These name servers will be different from the name servers for the zone which you deleted. So if you delete and recreate a hosted zone, you'll be given four brand new name servers. In order to stop any DNS problems, you'll need to take these brand new name servers and update the items within the registered domains area of the console. But again, because you've registered the domain within Route 53, this process has been handled for you end to end. You won't need to worry about any of this unless you delete and recreate the hosted zone.
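      The check described here, that the name servers hosting the zone are the same ones recorded against the registered domain, can be sketched as a small comparison. This is only an illustration under stated assumptions: the function names are hypothetical, the name server values are placeholders rather than real Route 53 servers, and in practice the two lists would be copied from the hosted zone details and the registered domains area of the console.

```python
def normalize(name_servers):
    """Lower-case each name and drop any trailing dot so the lists compare cleanly."""
    return {ns.strip().rstrip(".").lower() for ns in name_servers}


def name_servers_in_sync(zone_name_servers, registered_name_servers):
    """Return True when the hosted zone and the registered domain list the same servers."""
    return normalize(zone_name_servers) == normalize(registered_name_servers)


def stale_name_servers(zone_name_servers, registered_name_servers):
    """Servers still recorded on the registered domain that no longer host the zone."""
    return sorted(normalize(registered_name_servers) - normalize(zone_name_servers))


# Placeholder values, not real Route 53 name servers.
zone_ns = ["ns-1.example.net.", "ns-2.example.org.", "ns-3.example.com.", "ns-4.example.co.uk."]
registered_ns = ["NS-1.example.net", "ns-2.example.org", "ns-3.example.com", "ns-9.example.co.uk"]

print(name_servers_in_sync(zone_ns, registered_ns))
print(stale_name_servers(zone_ns, registered_ns))
```

      A mismatch like the one in this made-up data is what you would expect to see after deleting and recreating a hosted zone without updating the registered domain.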

      Now that's everything you need to do at this point. If you followed this process throughout this demo lesson, you now have an operational domain within the global DNS infrastructure that's manageable within Route 53. Now as I mentioned earlier, this is an optional step for the course. If you do have a domain registered, then you will have the opportunity to use it within various demo lessons within the course. If you don't, don't worry, none of this is mandatory; you can do the rest of the course without having a domain. At this point though that is everything I wanted you to do in this demo lesson. Go ahead and complete the video and when you're ready I'll look forward to you joining me in the next.

    1. Welcome back and in this demo lesson you're going to get some experience interacting with CloudWatch. So you're going to create an EC2 instance, you're going to cause that instance to consume some CPU capacity and then you're going to monitor exactly how that looks within CloudWatch. Now to do this in your own environment you'll just need to make sure that you're logged into the general AWS account as the IAM admin user and as always make sure that you have the Northern Virginia region selected which is US-East-1. Once you've got those set correctly then click in the search box at the top and type EC2, find the EC2 service and then just go ahead and open that in a brand new tab.

      Now we're going to skip through the instance creation process because you've done that in a previous demo lesson. So just go ahead and click on instances and then Launch Instance. Under Name, I just want you to put CloudWatch Test as the instance name. Then scroll down and then under the Amazon Machine image to use, go ahead and select Amazon Linux. We're going to pick the Amazon Linux 2023 version, so that's the most recent version of this AMI. It should be listed as Free Tier Eligible, so just make sure that's the case. We'll leave the architecture set to 64-bit x86 and scroll down. It should already be set to an instance type which is free tier eligible, in my case t2.micro. We'll be connecting to this instance using EC2 Instance Connect so we won't be using an SSH key pair. So in this drop down just click and then say proceed without a key pair. We won't need one because we won't be connecting with a local SSH client. Scroll down further still and under Network Settings click on Edit and just make sure that the default VPC is selected. There should only be one in this list but just make sure that it's set as default. Under Subnet we can leave this as No Preference because we don't need to set one. We will need to make sure that Auto Assign Public IP is set to Enable.

      Under create security group, for the name and for the description, just go ahead and type CloudWatch SG, so CloudWatch SG for both the security group name and the description. Now the default security group rule should be fine, because it allows SSH to connect from any source location, and that's what we want. Scroll down further still, and we'll be leaving storage as default; remember this is set from the AMI that we pick. Now because this is a CloudWatch lesson, we're going to set something a little bit different. So expand Advanced Details and then scroll down and look for Detailed CloudWatch Monitoring. Now this does come at an additional cost, so you've got a couple of options. You can just watch me do this or you can do this demo without Detailed Monitoring enabled. And if you don't enable this, it will be entirely free, but you might need to wait a little bit longer for things to happen in the demo lesson, so keep that in mind.

      What I'm going to do is enable detailed CloudWatch monitoring, and if we click on info here we can see some details about exactly what that does. We can also open this in a new tab and explore what additional charges apply if we want to enable it. Now in this case I'm going to enable it. You don't have to; it's not a huge charge, but I think for me demoing this to you it's good that I enable it. If you don't, you might just have to wait a little bit longer for things to happen in the demo. Now once all of that is set, just scroll all the way down to the bottom and go ahead and click launch instance. Now this might take a few minutes to create. We're first waiting for this success dialog, and once that shows we can go ahead and click on view all instances. Go ahead and click refresh until you see the instance. It will start off in a pending state with nothing listed under status check. After a few moments this will change status; we'll see that it's in a running state, and then we need to wait for this to change to two of two status checks before we continue. So go ahead and pause the video, wait for your status check to update, and once it does we're good to continue.

      Okay so now this has changed to two out of two checks passed, and that's good, that's what we want. So it should display running under instance state and then two out of two checks passed under status check. Once this is the case, go ahead and click in the search box at the top and just type CloudWatch, locate the CloudWatch service, and then open that in a brand new tab. This is the CloudWatch console, and it's here where we're going to create a CloudWatch alarm. Now if you see anything about a new UI or new features, you can just go ahead and close down that dialog. Once we're here, go ahead and click on Alarms on the left and then click on all alarms. This will show a list of all the alarms that you've configured within CloudWatch, and currently there aren't any. What we're going to do is to create an alarm. So click on create alarm, and then click on select metric. Once we're on this screen, scroll down, and we're going to be looking for an EC2 metric, because we need to find the CPU utilization metric, which is inside the EC2 namespace. In other words, it comes from the EC2 service. So go ahead and click on EC2, and then we're looking for per instance metrics. So click on per instance metrics, and this will show all of the EC2 instance metrics that we currently have. Now if I scroll through this list, what you'll see is that I have two different instance IDs, because I'm using this account to create all of these demo lessons. In my case, I see previous instances. Now if you're doing this in your account, if you go back to the EC2 Management Console, you can see your instance ID here. Just remember the last four digits of this instance ID, and then go back to the CloudWatch Console. If you have more than one instance listed in CloudWatch, look for the instance ID that ends with the four digits that you just noted down, and then from that list you need to identify CPU utilization. And so I'm going to check the box next to this metric.

      Now this is the metric that monitors, as the name suggests, CPU utilization on this specific instance ID, which is our CloudWatch test instance. If I scroll up, I'm able to see any data that's already been gathered for this specific instance. And as you can see, it's not a great deal at the moment because we've only just launched this instance. So I'm gonna go ahead and click on Select Metric, and then because we're creating an alarm, it's going to ask us for what metric and conditions we want to evaluate.

      So I'm going to scroll down, and under Conditions, I'm going to pick Static, because I want this alarm to go into an alarm state when something happens to the CPU utilization. So I'm going to ask CloudWatch to go into an alarm state whenever the CPU utilization is greater than or equal to a specific value. That value is going to be 15%. So whenever the CPU utilization on this EC2 instance is greater than or equal to 15%, then this alarm will go into the alarm state. So I'm gonna go ahead and click on Next. Now you can set this up so that if this alarm goes into an alarm state, it can notify you using SNS. Now that's useful if this is in production usage, but in this case we're not using it in production, so I'm going to go ahead and click on remove. Scroll down to the bottom, there's also other things that you could pick, so you could do an auto scaling action, an EC2 action, or a systems manager action. But we're going to be talking about these in much more detail as we move through the course. For now we're going to keep this simple, it's just going to be a basic alarm which goes into an alarm state or not. So click on next and then under alarm name I'm going to put CloudWatch test and then high CPU and you should do the same. So type that, click on next, scroll down to the bottom and create that alarm.
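      For reference, the alarm configured in the console here could also be written down as a set of parameters. The sketch below only builds a dictionary and calls nothing. The keys are intended to mirror the parameters of CloudWatch's PutMetricAlarm API, but check them against the current documentation before use. The helper name, the instance ID, the alarm name spelling, the `Average` statistic, the 60 second period, and the single evaluation period are all assumptions, not values taken from the lesson.

```python
def build_cpu_alarm_params(instance_id, alarm_name="CloudWatchTestHighCPU",
                           threshold_percent=15.0, period_seconds=60):
    """Describe a static-threshold CPU alarm as a plain dict (hypothetical helper)."""
    return {
        "AlarmName": alarm_name,                  # assumed spelling of the lesson's alarm name
        "Namespace": "AWS/EC2",                   # the EC2 namespace from the lesson
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",                   # assumption
        "Period": period_seconds,                 # assumption: 60 s suits detailed monitoring
        "EvaluationPeriods": 1,                   # assumption
        "Threshold": threshold_percent,           # the 15% threshold from the lesson
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "ActionsEnabled": False,                  # no SNS notification, as in the lesson
    }


# "i-0123456789abcdef0" is a placeholder instance ID.
params = build_cpu_alarm_params("i-0123456789abcdef0")
for key, value in params.items():
    print(f"{key}: {value}")
```

      Without detailed monitoring, EC2 metrics arrive at five minute intervals, so a period of 300 seconds would likely be the more sensible choice in that case.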

      Now initially this alarm state will be insufficient data because CloudWatch hasn't yet gathered enough data on the CPU utilization to generate the state. That's fine because we've got another thing that we need to do first. So now move back to the EC2 console and we're going to connect into this instance using EC2 Instance Connect. Remember, that's the web-based way to get access to this instance. So over the top of the CloudWatch Test instance, right click and go to Connect. Make sure that EC2 Instance Connect is selected, so click that tab. You can leave everything as default and click on Connect and that will connect you to this EC2 instance. Now at this point, we need to install an application called stress on this EC2 instance. And stress is an application which will put artificial CPU load onto a system. And that's what we want to do in order to see how CloudWatch reacts. To install stress, we're going to run this command. And this next command will use the yum package manager to install the stress utility. So go ahead and run this command and then clear the screen again. Now the stress command can be run by typing stress and what we're going to do is do a double hyphen help just to get the help for this command. So what we're going to do is we're going to run stress and we're going to specify the number of CPUs to use and we want that number to be the same number of virtual CPUs that this instance has. Now a t2.micro has one virtual CPU and so the command that we need to run is stress space hyphen c space 1 and then space and then we're going to use hyphen t which is the timeout option and this specifies how long we want to run this for. So we're going to specify 3600 so hyphen t and then a space 3600 and this will run the stress for 3600 seconds and that's plenty for us to see how this affects the metrics which are being monitored by CloudWatch.
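      Because the command is spelled out verbally above, it may help to see it assembled. The sketch below builds the same `stress -c 1 -t 3600` invocation from its two values. The helper name `stress_command` is hypothetical, and nothing is executed here; the block only prints the command line.

```python
import shlex


def stress_command(cpu_workers=1, timeout_seconds=3600):
    """Build the argument list for the stress invocation described in the lesson."""
    if cpu_workers < 1 or timeout_seconds < 1:
        raise ValueError("cpu_workers and timeout_seconds must be positive")
    # -c sets the number of CPU workers, -t sets the timeout in seconds.
    return ["stress", "-c", str(cpu_workers), "-t", str(timeout_seconds)]


# One worker to match the single vCPU of a t2.micro, running for one hour.
args = stress_command(cpu_workers=1, timeout_seconds=3600)
print(shlex.join(args))
```

      On an instance with more virtual CPUs, the worker count would be raised to match, which is the rule the lesson gives for choosing the `-c` value.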

      Now what I want to do before we do that is go back to the CloudWatch console. You might need to refresh if you haven't seen the state update yet. In my case it's already showing as okay. So this means that it's now got access to some data. So click on this alarm and you'll be able to see that currently the CPU started off at very low levels and then it spiked up and potentially in my case that's because we've just installed some software. But note here this red line which indicates the alarm level for this alarm. So if the CPU utilisation, which is in blue, exceeds this red line then this alarm will move from OK to ALARM. And that's what we want to simulate. So go back to the instance and press Enter to run this stress command. And that's going to begin placing high levels of CPU load on this instance and what we'll see over the next few minutes is CloudWatch will detect this additional CPU load and it will cause this alarm to go from OK into an alarm state. So move back to the CloudWatch console and just keep hitting refresh until you see a change in the alarm state. Again this might take a few minutes. What I suggest you do is pause the video and wait for your alarm to change away from OK and then you're good to continue.
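      The state changes described here follow a simple rule that can be sketched in a few lines. This is a deliberately simplified model, assuming the alarm looks only at the most recent datapoint; the real service evaluates a configurable number of periods. The state names match those CloudWatch reports, while the function name and the CPU readings are made up for illustration.

```python
def alarm_state(datapoints, threshold=15.0):
    """Return the alarm state implied by the most recent datapoint (simplified model)."""
    if not datapoints:
        return "INSUFFICIENT_DATA"   # nothing gathered yet, as right after alarm creation
    # The lesson's condition is "greater than or equal to" the threshold.
    return "ALARM" if datapoints[-1] >= threshold else "OK"


# Made-up CPU readings: idle, a little activity, stress running, stress cancelled.
cpu_readings = []
states = [alarm_state(cpu_readings)]
for value in [2.0, 8.5, 99.7, 100.0, 1.2]:
    cpu_readings.append(value)
    states.append(alarm_state(cpu_readings))
print(states)
```

      The sequence starts with insufficient data, settles at OK, moves to ALARM while the load is applied, and returns to OK once the load is removed, which is the behaviour the blue line and red threshold line show in the console.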

      Now in my case this only took a few minutes and as you can see the CPU load reported by this alarm in CloudWatch went from this value here and spiked all the way up to this value which is well above the 15% alarm threshold. So the alarm changed from OK to in alarm based on this excessive CPU and if we keep monitoring this over time you'll see that this trend continues because this CPU is under extremely high load because it's been artificially simulated using the stress utility. Now if we go back to this EC2 instance and press Ctrl and C at the same time this will exit out of the stress utility and at this point the artificial CPU load has been removed and the instance will gradually move back down to its normal levels which is very close to zero. So again what you'll see is this may take a few minutes to be reflected inside CloudWatch. So keep refreshing this once you've cancelled the stress utility and wait for the reported CPU utilization to move back down below the alarm value. Again that might take a few minutes so go ahead and pause the video and wait for this blue line to move back under the red line and once it does you should see that the alarm state changes from in alarm to OK again.

      In my case it took a few minutes for the blue line to move below the alarm threshold and then a few more minutes afterwards for the alarm to change from in alarm to OK. But as you can see at this point that's exactly what's happened once the CPU usage goes below the configured threshold value then the alarm changes back to an OK state. And at this point that's everything that I wanted to cover in this demo lesson on CloudWatch. CloudWatch is a topic that I'm going to be going into much more detail later on in the course. This has just been a really brief introduction to the product and how it interacts with EC2. Now at this point the only thing left is to clear up the account and put it back into the same state as it was at the start of this lesson. So to do that go ahead and click on All Alarms, select the CloudWatch Test High CPU Alarm that you created, click on the actions dropdown, select delete, and then confirm that deletion. Then go back to EC2, go to the instances overview, right click on the CloudWatch test instance, making sure that it is the correct instance, so CloudWatch test, and then select terminate instance and confirm that termination. Now that's going to move through a few states, it will start with shutting down, and you need to wait until that instance is in a terminated state. Go ahead and pause the video and wait for your instance to change into terminated.

      Okay, so once your instance has terminated, on the menu on the left scroll down, go to Security Groups, and select the CloudWatch SG security group, making sure that you do pick the correct one, so CloudWatch SG. Click on Actions, scroll down, select Delete security groups, and click on Delete. At that point the account is back in the same state as it was at the start of this demo lesson. So thanks for watching this video. I hope you gained some experience of the CloudWatch product, and again, we're going to be talking about it in much more detail later in the course. At this point though, go ahead and complete this video, and when you're ready, I'll look forward to you joining me in the next.
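
For anyone who prefers scripting the clean-up, the same order of operations can be sketched in Python. This is a hedged sketch rather than part of the lesson: the function name `clean_up` and its parameters are hypothetical, and the clients are passed in as arguments so the function can be read and exercised without AWS credentials. With boto3 installed, they would typically be `boto3.client("cloudwatch")` and `boto3.client("ec2")`.

```python
# Sketch of the clean-up order from this demo (alarm, instance, security group).
# The alarm name, instance ID and security group ID are supplied by the caller;
# none of them are taken from the lesson.


def clean_up(cloudwatch, ec2, alarm_name, instance_id, security_group_id):
    """Delete the alarm, terminate the instance, then delete the group."""
    # 1. Delete the CloudWatch alarm.
    cloudwatch.delete_alarms(AlarmNames=[alarm_name])

    # 2. Terminate the instance and wait until it reaches "terminated".
    #    The security group cannot be deleted while the instance still uses it.
    ec2.terminate_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_terminated").wait(InstanceIds=[instance_id])

    # 3. Delete the security group.
    ec2.delete_security_group(GroupId=security_group_id)
```

The waiter in step 2 plays the same role as pausing the video until the instance shows as terminated.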

    1. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Otero-Coronel and colleagues use a combination of acoustic stimuli and electrical stimulation of the tectum to study MSI in the M-cells of adult goldfish. They first perform a necessary piece of groundwork in calibrating tectal stimulation for maximal M-cell MSI, and then characterize this MSI with slightly varying tectal and acoustic inputs. Next, they quantify the magnitude and timing of FFI that each type of input has on the M-cell, finding that both the tectum and the auditory system drive FFI, but that FFI decays more slowly for auditory signals. These are novel results that would be of interest to a broader sensory neuroscience community. By then providing pairs of stimuli separated by 50ms, they assess the ability of the first stimulus to suppress responses to the second, finding that acoustic stimuli strongly suppress subsequent acoustic responses in the M-cell, that they weakly suppress subsequent tectal stimulation, and that tectal stimulation does not appreciably inhibit subsequent stimuli of either type. Finally, they show that M-cell physiology mirrors previously reported behavioural data in which stronger stimuli underwent less integration.

      The manuscript is generally well-written and clear. The discussion of results is appropriately broad and open-ended. It's a good document. Our major concerns regarding the study's validity are captured in the individual comments below. In terms of impact, the most compelling new observation is the quantification of the FFI from the two sources and the logical extension of these FFI dynamics to M-cell physiology during MSI. It is also nice, but unsurprising, to see that the relationship between stimulus strength and MSI is similar for M-cell physiology to what has previously been shown for behavior. While we find the results interesting, we think that they will be of greatest interest to those specifically interested in M-cell physiology and function.

      Strengths:

      The methods applied are challenging and appropriate and appear to be well executed. Open questions about the physiological underpinnings of M-cell function are addressed using sound experimental design and methodology, and convincing results are provided that advance our understanding of how two streams of sensory information can interact to control behavior.

      Weaknesses:

      Our concerns about the manuscript are captured in the following specific comments, which we hope will provide a useful perspective for readers and actionable suggestions for the authors.

      Comment 1 (Minor):

      Line 124. Direct stimulation of the tectum to drive M-cell-projecting tectal neurons not only bypasses the retina, it also bypasses intra-tectal processing and inputs to the tectum from other sources (notably the thalamus). This is not an issue with the interpretation of the results, but this description gives the (false) impression that bypassing the retina is sufficient to prevent adaptation. Adding a sentence or two to accurately reflect the complexity of the upstream circuitry (beyond the retina) would be welcome.

      Comment 2 (Major):

      The premise is that stimulation of the tectum is a proxy for a visual stimulus, but the tectum also carries the auditory, lateral line, and vestibular information. This seems like a confound in the interpretation of this preparation as a simple audio-visual paradigm. Minimally, this confound should be noted and addressed. The first heading of the Results should not refer to "visual tectal stimuli".

      Comment 3 (Major):

      Figure 1 and associated text.

      It is unclear and not mentioned in the Methods section how phasic and tonic responses were calculated. It is clear from the example traces that there is a change in tonic responses and the accumulation of subthreshold responses. Depending on how tonic responses were calculated, perhaps the authors could overlay a low-passed filtered trace and/or show calculations based on the filtered trace at each tectal train duration.

      Comment 4 (Minor):

      Figure 3 and associated text.

      This is a lovely experiment. Although it is not stated in the text, it provides the logic for choosing a 50 ms time interval in the next experiment. It would be great if the authors calculated the first timepoint at which the percentage of shunting inhibition is not significantly different from zero. This would provide a convincing basis for picking 50 ms for the next experiment. That said, I suspect that this time point would be earlier than 50 ms. This may explain and add further complexity to why the authors found mostly linear or sublinear integration, and perhaps provide the basis for future experiments to test different stimulus time intervals. Please move calculations to Methods.

      Comment 5 (Major):

      Figure 4C and lines 398-410.

      These are beautiful examples of M-cell firing, but the text suggests that they occurred rarely and nowhere close to significantly above events observed from single modalities. We do not see this as a valid result to report because there is insufficient evidence that the phenomenon shown is consistent or representative of your data.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02497

      Corresponding author(s): Tourriere, Hélene and Maraver, Antonio

      1. General Statements [optional]

      We sincerely thank the Editors and Reviewers for the time devoted to our manuscript. We found their critiques interesting and very helpful. After careful examination, and thanks to a large collaborative effort, we will be able to address all the reviewers’ comments by adding substantial new experimental data.

      We are also encouraged by the positive comments of the Reviewers:

      “This manuscript will likely engage oncologists who investigate the chemotherapy-resistant mechanisms of platinum compounds in NSCLC treatment” (Reviewer 1);

      “Overall, the authors have conducted experiments that sufficiently elucidate their claims, and the description of the experiments is detailed.”; and “Overall, this work unveils a novel mechanism for Notch activation in response to platinum chemotherapy, providing a renewed outlook on overcoming chemotherapy resistance in NSCLC” (Reviewer 2).

      We are also aware that both reviewers agreed that there is room for improvement, and we are confident that upon completion of all proposed experiments both reviewers will be fully satisfied.

      Please bear in mind that although platinum-based chemotherapy was known to induce the Notch pathway in lung cancer cells, the underlying molecular mechanism was largely unknown. Thanks to the molecular dissection performed in our study, we propose an innovative treatment for patients with lung cancer, the leading cause of cancer death worldwide. Hence, we agree with both reviewers that our study will appeal to a large number of cancer researchers, and we feel this will also be the case for those interested in the DNA damage, Notch and MDM2 pathways.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript from Maraver and co-authors investigates the putative resistance mechanisms that hinder the efficacy of platinum-based therapies (e.g., carboplatin) against non-small cell lung carcinoma (NSCLC). Using in vitro lung cancer cell lines, shRNA-based knockdown, and exogenous overexpression systems, the research describes a DNA damage-induced resistance mechanism involving the NOTCH signaling pathway and the E3 ligase MDM2. The authors show that carboplatin treatment induces DNA damage and promotes ATM activation, which in turn activates the NOTCH signaling pathway via ubiquitination and stabilization of the Notch Intracellular Domain (NICD). New findings include the MDM2-mediated ubiquitination and stabilization of NICD. Using in vivo NSCLC-PDX models, they demonstrate that combining carboplatin with Notch and MDM2 inhibitors can enhance tumor killing, suggesting that targeting the MDM2/NICD axis in conjunction with carboplatin may be a viable therapeutic alternative. Furthermore, they show that NICD and MDM2 levels are elevated among tumor samples from chemo-resistant patients. Consistent with these findings, high MDM2 levels correlate with poor progression-free survival (PFS) in NSCLC patients.

      [Authors] We thank this reviewer for her/his fair summary of our work that highlights our new findings.

      Major comments:

      Some of the key conclusions may not be convincing.

      [Authors] We understand the concerns that the reviewer might have, and we are confident that upon completion of all the experiments detailed below, she/he will be convinced that the manuscript is ready for publication.

      1. One significant weakness of the manuscript is the lack of exploration into the underlying mechanism of how MDM2 mediates the stabilization of NICD. While the observation of MDM2-mediated NICD stabilization is intriguing, it is important to provide a more convincing explanation for the reviewers. This could be achieved by offering a detailed molecular mechanism, especially considering that MDM2 typically targets proteins for degradation.

      [Authors] After reading this reviewer’s comment, we realize that we did not adequately discuss the previous study demonstrating that MDM2 induces ubiquitination of NICD, but not for degradative purposes (Pettersson et al., 2013). In particular, that study used a form of ubiquitin mutated at lysine 48, i.e., the K48R mutant. In this way, the authors of this seminal study demonstrated that MDM2 was still able to induce ubiquitination of NICD, and hence that this ubiquitination was not degradative.

      Still, to confirm that this is also the case upon DNA damage, we will perform experiments using the same K48R mutant to formally prove that MDM2 does not ubiquitinate NICD via lysine 48-linked polymers upon DNA damage, and hence that this ubiquitination is not degradative. Furthermore, following discussion with Laetitia Linares, an author of our study and a long-standing expert in ubiquitination (for instance, see (Riscal et al., 2016) and (Arena et al., 2018)), we will use another ubiquitin mutant, at lysine 63. Ubiquitination at this lysine does not mark proteins for degradation but promotes association of the targeted protein with DNA, helping DNA repair (Liu et al., 2018). With a ubiquitin mutated at this lysine, i.e., K63R, this type of ubiquitination cannot occur. Given that we observe increased NICD ubiquitination upon DNA damage, the use of K63R will be very informative.

      Hence, we will repeat the experiments of current Figure 3A with the same WT ubiquitin as before, and now also with the K48R and K63R mutants. In addition, we will include mutant forms of ubiquitin that can only form ubiquitin chains on lysine 48 (K48 only) or lysine 63 (K63 only). We anticipate that in the presence of the K48-only mutant, NICD will not be ubiquitinated upon DNA damage, while the K63-only mutant will be very informative. All these data will be part of the new Figure 3A.

      Of note, Dr Linares has all the tools required to perform these experiments, and hence we will start them soon.

      Another weakness lies in the unclear role and the underlying mechanism of ATM in the MDM2-mediated NICD stabilization. While the data presented (Fig. 3B, 3C) suggest that carboplatin could elevate MDM2 levels for NICD stabilization, a more precise method to induce MDM2 overexpression specifically for targeting NICD is required. It appears that ATM plays a crucial role in this regulatory process. The following questions must be addressed: Does ATM induce the phosphorylation of MDM2 for its protein stabilization and/or E3 ligase activity?

      [Authors] There are several points here.

      For the first point, the use of a more precise method to induce MDM2 overexpression, this is exactly what we did in Figure 4A, i.e., ectopic expression of MDM2 to demonstrate that MDM2 is sufficient to increase NICD levels.

      For the second point, i.e., the phosphorylation status of MDM2 by ATM in our system, we will perform several experiments. Up to six residues in MDM2 have been proposed to be phosphorylated by ATM upon DNA damage: S386, S395, S407, T419, S425, and S429 (Cheng et al., 2011). Among these, S395 is the best characterized, and again Dr Linares has useful tools that we will use to address this specific point. We will use an MDM2 mutant harboring an aspartate instead of the serine at this position, i.e., S395D, which mimics the serine 395 phosphorylation induced by ATM upon DNA damage. We will use this mutant together with the WT and 464A MDM2 proteins already used; if this residue is important for our phenotype, total levels of NICD will be even higher and/or NICD will localize more to the nuclei than with WT MDM2. All these new data will appear as the new Figure 4A and new Figure 4B.

      Furthermore, we will also use an antibody that recognizes this phosphorylation site by WB after carboplatin treatment and it will be part of the new Figure 3B.

      Finally, we will also express WT MDM2 and purify it by immunoprecipitation under different experimental conditions (steady state, carboplatin treatment, and carboplatin combined with an ATM inhibitor) and perform phospho-proteomic analysis under all these conditions. Of note, to show the feasibility of this approach, the proteomic platform at Biocampus in Montpellier has experience using this technique (Kassouf et al., 2019).

      The combination therapy of carboplatin with MDM2 and NICD inhibitors may lack compelling rationale (see below).

      [Authors] This is a very important point, which we discuss below, where the reviewer provides more information. We can already anticipate that we will perform a new in vivo experiment to address it.

      In lines 275-276, the authors stated that their preclinical data establish the enhancement of carboplatin's therapeutic effect in NSCLC in vivo through MDM2-NICD axis inhibition. However, it's important to note that this finding remains preliminary at this stage.

      [Authors] We consider that our statement is not exaggerated, but we will tone down the message as proposed by the reviewer in the next submission.

      Minor comments:

      1. The observed loss of NICD during ATMi + carboplatin treatment in Figures 2A and 2B raises the question of whether ATM regulates the gene transcription of NOTCH. In addition to the CHX assay conducted in Figures 2C and 2D, quantifying NOTCH mRNA upon ATM inhibition could provide further insights. Alternatively, referencing relevant studies on this topic may strengthen the discussion.

      [Authors] This is an interesting experiment and we will perform it.

      In Figures 4A and 4B, the noticeable discrepancy between the exogenous expression of wild-type (WT) MDM2 and catalytically inactive MDM2-464A raises concerns. It is essential to consider if the reduced ubiquitination and stability of NICD might be attributed to varying levels of MDM2-464A in the cells rather than its catalytic inactivity. While p53 ubiquitination was utilized as a control, ensuring comparable levels of MDM2 and MDM2-464A expression could enhance the experimental rigor. Compared to the smear poly-ubiquitination bands observed for MDM2 in Figure 4B, the ubiquitination of NICD appears simpler. What distinguishes the feature of MDM2-mediated NICD ubiquitination? Could it potentially involve mono-ubiquitination?

      [Authors] The point of the reviewer is well taken, and importantly, as mentioned above in main point 2, we will repeat these experiments and they will appear as new Figure 4A and new Figure 4B.

      Regarding the type of ubiquitination, as explained in detail in major point 1 from the same reviewer, we will fully characterize the type of ubiquitination on NICD induced by DNA damage, and we will confirm that MDM2 is required for this specific ubiquitination in the future new Figure 4C, where we will overexpress the required ubiquitin forms and WT MDM2.

      In Figure 5A, the authors need to consider examining additional NOTCH-associated factors to definitively demonstrate the activation of NOTCH signaling beyond HES1. Alternatively, in Figure 5B, the NICD Western blot could be complemented by detecting HES1 or other NOTCH-associated factors.

      [Authors] To address this particular point, we will test other downstream targets of Notch, such as NRARP, and the results will appear as part of new Figure 5C.

      In Figures 5C and 5D, crucial control groups are missing, specifically mice treated solely with SP141+DBZ, carboplatin+SP141, and SP141+DBZ. It is essential to include these groups to demonstrate that the enhanced tumor killing results from the combination of carboplatin with SP141 and/or DBZ, rather than from SP141 and DBZ alone. Furthermore, in addition to the currently used NSCLC-PDX model harboring the p53 (P151R) mutation, it would be informative to include a NSCLC-PDX model expressing WT p53.

      [Authors] This is a crucial point in this rebuttal, as mentioned before in major point 3, and we detail it here.

      We included only 3 groups because preliminary data indicated that SP141 in combination with carboplatin showed no benefit compared with carboplatin alone, while combining carboplatin with Notch inhibition gave only a slight increase in the therapeutic benefit of carboplatin. For simplicity, we preferred not to show these data. However, after reading this point from Reviewer 1, even if we will ultimately propose only the triple combination for patients, we clearly need to demonstrate that the other combinations are insufficiently potent or not potent at all.

      The reviewer asked us to include: “SP141+DBZ, carboplatin+SP141, and SP141+DBZ”. We assume that she/he meant SP141+DBZ, carboplatin+SP141, and carboplatin+DBZ, which together with vehicle, carboplatin, and carboplatin+SP141+DBZ makes 6 treatment groups. With 8 mice per group devoted to tumor growth and survival, plus 4 mice per group for the acute treatment for IHC and WB purposes (for current Figures 5A and 5B), this makes a total of 72 mice, which is a substantial number. Of note, since we performed the in vivo experiment presented in the current manuscript, a new Notch inhibitor called nirogacestat has come to market, being the first-in-class Notch inhibitor to treat patients with solid tumors (desmoid tumors) after demonstrating a significant therapeutic effect in clinical trials (Gounder et al., 2023).
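
As a quick check of the animal numbers stated in this response, the count can be written out explicitly. The variable names below are illustrative only; the group list and per-group numbers are those given in the response.

```python
# Treatment groups as listed in the response.
groups = [
    "vehicle",
    "carboplatin",
    "SP141 + DBZ",
    "carboplatin + SP141",
    "carboplatin + DBZ",
    "carboplatin + SP141 + DBZ",
]

MICE_GROWTH_SURVIVAL = 8  # per group, for tumor growth and survival
MICE_ACUTE = 4            # per group, acute treatment for IHC and WB

mice_per_group = MICE_GROWTH_SURVIVAL + MICE_ACUTE
total_mice = mice_per_group * len(groups)
print(mice_per_group, total_mice)
```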

      Hence, we will take advantage of the repetition of this experiment to replace DBZ, which is an interesting molecule for preclinical research but has no clinical relevance, with this new molecule. The use of nirogacestat will therefore further increase the medical impact of our data. Importantly, nirogacestat is better tolerated than DBZ, meaning that mice can be treated for longer periods, and we propose here to treat for up to 12 weeks. Finally, after discussion with Quentin Thomas, an author of the manuscript and clinical researcher in the lab, we will give 4 carboplatin cycles, as is currently proposed to NSCLC patients, in an attempt to get closer to the clinical setting. In particular, we will give carboplatin to mice on weeks 1, 4, 7 and 10, while treating with the MDM2 inhibitor (SP141) and the Notch inhibitor (nirogacestat) from Monday to Friday for the 12 weeks.
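
The proposed dosing calendar can be laid out week by week as a minimal sketch. The names `weekly_plan` and `schedule` are hypothetical, and the day of the week on which carboplatin is given is not stated in the response, so it is deliberately left out. The inhibitor dosing-day count is simply derived from 5 days per week over 12 weeks.

```python
TOTAL_WEEKS = 12
CARBOPLATIN_WEEKS = (1, 4, 7, 10)                      # one cycle in each of these weeks
INHIBITOR_DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")   # SP141 and nirogacestat


def weekly_plan(week: int) -> dict:
    """Describe what is given in a particular week of the experiment."""
    return {
        "week": week,
        "carboplatin": week in CARBOPLATIN_WEEKS,
        "inhibitor_days": INHIBITOR_DAYS,
    }


schedule = [weekly_plan(week) for week in range(1, TOTAL_WEEKS + 1)]
carboplatin_cycles = sum(plan["carboplatin"] for plan in schedule)
inhibitor_dosing_days = sum(len(plan["inhibitor_days"]) for plan in schedule)
print(carboplatin_cycles, inhibitor_dosing_days)
```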

      This experiment will be long and will require substantial human and financial resources, but we are confident that the effect on tumor growth and survival will be more dramatic than the one presented now.

      In contrast, as explained in the 4th section of this “revision plan”, including another 72 mice to treat a p53-proficient NSCLC PDX, when we have already demonstrated in vitro that p53 is not required for the phenotype described in this study, is for us not feasible for ethical reasons, i.e., the use of animals in research (please see below for further details).

      All the new data will appear as new Figure 5 (B to E). For new Figure 5A please see below the major comment 2 of Reviewer 2.

      Though beyond the current study's scope, in the discussion section, the authors may want to propose or hypothesize on how MDM2-mediated NICD stabilization contributes to carboplatin resistance. This could provide valuable insights for future research directions.

      [Authors] We will discuss this part as proposed by the reviewer.

      In the Western blot results, the total ATM and ATR controls were absent.

      [Authors] The reviewer is totally right and we will repeat experiments to include all the totals as requested.

      Authors may choose to include a graphical abstract at the end of their study to visually illustrate the mechanisms they have described.

      [Authors] Very good idea, thank you. We will do so.

      Reviewer #1 (Significance (Required)):

      Advance: The authors aim to present a novel perspective on the resistance mechanisms to platinum compounds in NSCLC therapy. They explore platinum compounds-induced DNA damage, ATM activation, and MDM2-mediated stabilization of the active form of NOTCH (NICD). However, to strengthen their claims, they must provide more conclusive results.

      Audience: This manuscript will likely engage oncologists who investigate the chemotherapy-resistant mechanisms of platinum compounds in NSCLC treatment, as well as scientists specializing in NOTCH and MDM2 pathways. However, the manuscript's central claims lack robust support from the available data, and the current approaches employed are not sufficiently thoughtful and rigorous; there is room for improvement.

      My expertise is molecular medicine, cancer biology, and epigenetics.

      [Authors] We want to thank again this reviewer for her/his helpful comments that will increase the impact and the relevance of our study while keeping the original message.

      We are also very pleased that she/he said: “This manuscript will likely engage oncologists who investigate the chemotherapy-resistant mechanisms of platinum compounds in NSCLC treatment”.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Sara Bernardo et al. investigated the molecular mechanisms underlying the activation of the Notch signaling in response to DNA damage induced by platinum-based chemotherapeutic agents in non-small cell lung cancer (NSCLC). They demonstrated that carboplatin treatment induces DNA double-strand breaks (DSBs) and stabilizes NICD, a process dependent on ATM and mediated by MDM2. In vivo experiments in patient-derived xenografts (PDX) showed that inhibition of NICD and MDM2 enhanced platinum effectiveness. Furthermore, clinical analysis revealed a correlation between MDM2 expression and poor prognosis in NSCLC patients treated with platinum compounds, emphasizing the clinical relevance of the MDM2-NICD axis in platinum resistance.

      [Authors] We thank this reviewer for her/his nice synopsis of our study.

      Major comments:

      Overall, the authors have conducted experiments that sufficiently elucidate their claims, and the description of the experiments is detailed. However, there is still room for improvement.

      [Authors] We are very pleased that the reviewer finds that our experiments “…sufficiently elucidate their claims, and the description of the experiments is detailed.” We are confident that after all the new experiments we propose here, she/he will be fully satisfied.

      1. The finding that MDM2 promotes NICD stability through non-degradative ubiquitination is interesting and in line with a previous study. However, NICD is also known to be regulated by various post-translational modifications, including ubiquitination that promotes NICD degradation, and it is unclear what the potential difference between these two types of ubiquitination is. For example, do the two differ in specific ubiquitination sites? Can the authors provide some discussion?

      [Authors] We agree with the reviewer, and hence we will perform a new set of experiments to determine the role of 2 key lysine residues in the ubiquitin protein that promote either degradation or DNA binding. As explained in detail in major point 1 from reviewer 1, we will determine whether DNA damage promotes ubiquitination at position 48, i.e., to degrade, at position 63, i.e., to facilitate binding to DNA for repair upon DNA damage, or at either of these 2 positions. As mentioned above, we will then confirm that MDM2 is responsible for the specific ubiquitination type we uncover. We are confident that the reviewer will be satisfied by these new data once they are generated.

      As for the specific ubiquitination sites in NICD, there are up to 17 lysine residues susceptible to ubiquitination. Hence, we feel that unveiling which residues are targeted by MDM2, and whether they differ from those inducing degradation, such as those targeted by the E3 ligase FBXW7, is beyond the scope of the current manuscript. Still, we will discuss this point as kindly proposed by the reviewer.

      Could the overexpression of MDM2 or NICD lead to carboplatin resistance in A549 or H358 cells?

      [Authors] This is a very interesting experiment and, prompted by the reviewer’s comment, we have started subcloning inducible NICD into lentiviral vectors to generate stable cells and test carboplatin sensitivity in the presence of different levels of NICD. These new data will be the new Figure 5A.

      The trends observed in the western blot data within the manuscript appear inconsistent. While the authors propose that NICD levels increased upon incubation with carboplatin, the discrepancy arises when considering the NICD levels without cycloheximide (CHX) treatment in Figure 1E, where no significant elevation is observed (Lane 6 vs. Lane 1).

      [Authors] The point of the reviewer is well taken. Please bear in mind that here we are dealing with several interacting signaling pathways, each with different kinetics. Our finding of increased NICD upon carboplatin treatment is highly consistent in vitro and in vivo, but it is true that it is not obvious in the experiment mentioned by the reviewer, probably due to a kinetic issue. We are repeating this experiment to show the increase in NICD upon carboplatin treatment, as seen in the rest of the manuscript (up to 9 times in the main figures alone).

      The quality of western blots needs to be improved, especially Fig. 1C and S1C, also Figure 3B. Moreover, the NICD western blot sometimes appears as one band and sometimes as two bands. Please provide an explanation. If possible, please quantify the bands in western blots.

      [Authors] We agree with the reviewer that not all WBs have the same quality, and we will repeat some of them to homogenize the quality throughout the manuscript; in particular, we will repeat the ones kindly pointed out by the reviewer.

      We also noticed the two bands, and we will pay attention to this while reproducing the WBs, since it might be related to differences in the percentage of acrylamide. If this is not the case, i.e., if upon repetition we still observe two bands in some conditions and not in others, we will provide an explanation in the new submission, as kindly proposed by the reviewer.

      Finally, and also as proposed by the reviewer we will quantify the WB bands.

      Please provide a necessary discussion on whether the targeted treatment approach towards the MDM2-NICD axis is applicable to all patients or only to those with high expression of MDM2/NICD.

      [Authors] In the discussion of the current manuscript, we focused on the subset of patients with high MDM2 expression for this issue, but in the next submission we will extend the discussion to patients with high levels of NICD as well.

      How to interpret the significance of the simultaneous increase in NICD ubiquitination and stability mediated by MDM2? Please provide a relevant discussion.

      [Authors] We will provide strong experimental data to go beyond discussion (please see above the experiments with ubiquitin mutants), but we will also provide discussion of this particular point.

      In Figure 5B, please also check the level of MDM2. In Figure 5C, carboplatin appears to have little impact on tumor growth. How to explain the increase of Ki-67 in the carboplatin treatment group in Figure 5A?

      [Authors] We will measure also levels of MDM2 in the future new Figure 5C as requested by the reviewer.

      As for the interesting observation regarding Ki67, since we will repeat the whole experiment, we will pay special attention to whether it is reproduced. Should this be the case, we will provide an explanation.

      Minor comments:

      1. Please include scale bars in Figure 1B and Supplemental Figure 1B.

      [Authors] We thank the reviewer for this comment. We will include the scale bars where required.

      2. In Figure 5D, the P values of the survival curve should be indicated in the figure.

      [Authors] We will include the P values in the future new Figure 5E.

      3. The presentation of survival curve data in Figures 5D and 6A should be consistent.

      [Authors] The point of the reviewer is well taken and we will use Prism to draw the PFS for patients in Figure 6A as we did for the mice in current Figure 5D.

      4. It seems that supplemental figure 2 is missing.

      [Authors] We actually jumped from supplemental figure 1 to 3 because we do not have any associated supplemental figure to main Figure 2. We will clarify this point in the next submission.

      5. Please carefully check the spelling of the entire text, for example, on page 20, line 426 it should be 'western'. Also, please spell out the abbreviations DDR and ATM.

      [Authors] We will double check all spelling and provide the abbreviations kindly suggested by the reviewer.

      6. The abbreviation for Cleaved caspase 3 should be CC3.

      [Authors] We thank the reviewer for this information, we will use CC3 in the next submission.

      Reviewer #2 (Significance (Required)):

      Notch signaling is associated with the occurrence and development of non-small cell lung cancer (NSCLC). A previous study indicates that the expression of Notch protein is significantly higher in NSCLC tissues compared to normal tissues (PMID: 31170211). Additionally, the upregulation of Notch1 is correlated with higher tumor grades, lymph node metastasis, tumor-node-metastasis (TNM) staging, and poor prognosis (PMID: 25996086). Abnormal activation of the Notch signaling pathway is frequently observed in chemotherapy-resistant NSCLC, and some studies have aimed to address NSCLC drug resistance via modulating Notch signaling (PMID: 30087852, 38301911). This manuscript is the first to propose that MDM2-mediated stabilization of NICD upon DNA damage plays a major role in NSCLC response to platinum chemotherapy. It further suggests that targeting the MDM2-NICD axis could prove to be an effective therapeutic strategy. Overall, this work unveils a novel mechanism for Notch activation in response to platinum chemotherapy, providing a renewed outlook on overcoming chemotherapy resistance in NSCLC. This manuscript will attract those interested in the mechanisms of chemotherapy resistance and novel treatment approaches.

      [Authors] We sincerely thank the reviewer for finding that our “…work unveils a novel mechanism for Notch activation in response to platinum chemotherapy, providing a renewed outlook on overcoming chemotherapy resistance in NSCLC”. We are also very satisfied when she/he says: “This manuscript will attract those interested in the mechanisms of chemotherapy resistance and novel treatment approaches.”

      Finally, we are convinced that the reviewer will appreciate all the new proposed experimental data, and also that upon finishing all experiments, she/he will think that the manuscript will be suitable for publication.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      For simplicity, we decided to introduce all changes in next submission upon conclusion of all experimental approaches proposed above.

      4. Description of analyses that authors prefer not to carry out

      While we will perform almost all experiments proposed by reviewers, there is one we feel is not possible to do due to ethical reasons. Reviewer 1 wanted us to perform a new in vivo experiment with the same PDX using up to 6 treatment groups. We use 8 mice per condition (for tumor growth and survival) plus 4 for the “acute” treatment for WB and IHC purposes, hence 12 mice x 6 groups = 72 mice, and we will perform this experiment as indicated above and proposed by the reviewer.

      In contrast, the reviewer also asked us to repeat the same experiment with a p53-proficient PDX. While we understand the possible interest, since we demonstrated in vitro that p53 is not required for the protective phenotype of MDM2 and Notch upon DNA damage, we honestly believe that using another 72 mice to confirm this aspect in vivo is against the rational use of animals in research and goes against the 3Rs rule. Hence, we will not perform this experiment unless the Editors believe it is strictly required.

      REFERENCES

      Arena, G., Cisse, M. Y., Pyrdziak, S., Chatre, L., Riscal, R., Fuentes, M., Arnold, J. J., Kastner, M., Gayte, L., Bertrand-Gaday, C., et al. (2018). Mitochondrial MDM2 Regulates Respiratory Complex I Activity Independently of p53. Mol Cell 69, 594-609 e598.

      Cheng, Q., Cross, B., Li, B., Chen, L., Li, Z., and Chen, J. (2011). Regulation of MDM2 E3 ligase activity by phosphorylation after DNA damage. Mol Cell Biol 31, 4951-4963.

      Gounder, M., Ratan, R., Alcindor, T., Schoffski, P., van der Graaf, W. T., Wilky, B. A., Riedel, R. F., Lim, A., Smith, L. M., Moody, S., et al. (2023). Nirogacestat, a gamma-Secretase Inhibitor for Desmoid Tumors. N Engl J Med 388, 898-912.

      Kassouf, T., Larive, R. M., Morel, A., Urbach, S., Bettache, N., Marcial Medina, M. C., Merezegue, F., Freiss, G., Peter, M., Boissiere-Michot, F., et al. (2019). The Syk Kinase Promotes Mammary Epithelial Integrity and Inhibits Breast Cancer Invasion by Stabilizing the E-Cadherin/Catenin Complex. Cancers (Basel) 11.

      Liu, P., Gan, W., Su, S., Hauenstein, A. V., Fu, T. M., Brasher, B., Schwerdtfeger, C., Liang, A. C., Xu, M., and Wei, W. (2018). K63-linked polyubiquitin chains bind to DNA to facilitate DNA damage repair. Sci Signal 11.

      Pettersson, S., Sczaniecka, M., McLaren, L., Russell, F., Gladstone, K., Hupp, T., and Wallace, M. (2013). Non-degradative ubiquitination of the Notch1 receptor by the E3 ligase MDM2 activates the Notch signalling pathway. Biochem J 450, 523-536.

      Riscal, R., Schrepfer, E., Arena, G., Cisse, M. Y., Bellvert, F., Heuillet, M., Rambow, F., Bonneil, E., Sabourdy, F., Vincent, C., et al. (2016). Chromatin-Bound MDM2 Regulates Serine Metabolism and Redox Homeostasis Independently of p53. Mol Cell 62, 890-902.

    1. Reviewer #2 (Public Review):

      Summary:

      Turning behavior plays a crucial role in animal exploration and escape responses, regardless of the presence or absence of environmental cues. These turns can be broadly categorized into two categories: strong reorientations, characterized by sudden changes in path directionality, and smooth turns, which involve gradual changes in the direction of motion, leading to sinuosity and looping patterns. One of the key model animals to study these behaviors is the nematode Caenorhabditis elegans, in which the role of strong reorientations has been thoroughly studied. Despite their impact on trajectories, smooth turns have received less attention and remain poorly understood. This study addresses this gap in the literature, by studying the interplay between smooth turns and strong reorientations in nematodes moving in a uniform environment, surrounded by an aversive barrier. The authors use this set-up to study both exploration behavior (when the worm is far from the aversive barrier) and avoidance behavior (when the worm senses the aversive barrier). The main claims of the paper are that (1) during exploratory behavior, the parameters governing strong reorientations are optimized to compensate for the effect of smooth turns, increasing exploration efficiency, and (2) during avoidance, strong reorientations are biased towards the side that maximizes escape success. To support these two claims, the paper presents a detailed quantitative characterization of the statistics of smooth turns and strong reorientations. These results offer insights that may interest a diverse audience, including those in movement ecology, animal search behavior, and the study of Caenorhabditis elegans. In our opinion, the experimental work and data analysis are of the highest quality, resulting in a very clean characterization of C. elegans' turning behavior. 
      However, the experimental design and data analyses presented are not fully aligned with some of the central conclusions drawn, and in particular, we believe that further work is needed to fully support the claim that strong reorientations are optimized to increase exploration efficiency.

      Strengths:

      The authors have addressed important questions in movement ecology through hypothesis-driven experiments. The choice of C. elegans as a model organism to investigate the impact of turning dynamics on escape and exploration is well-justified by its limited repertoire of strong reorientation behaviors and consistent turning bias across strains and individuals. The quality of the experimental data is very high, using state-of-the-art techniques, and a set-up where a robust and reproducible avoidance response can be studied. The data analysis benefits from state-of-the-art techniques and a deep understanding of C. elegans' behavior, resulting in a very clean and very clear set of results. We particularly appreciated the use of a ventral/dorsal reference system (rather than a left/right one), which is more natural and insightful. As a result, the paper presents one of the best characterizations of C. elegans sharp turning behavior published to date. We find that the claim that strong reorientations are chosen in a way that optimizes avoidance behavior is solid and well-supported. The manuscript is well-written and maintains a coherent line of reasoning throughout.

      Weaknesses:

      Our primary concerns revolve around the significance and rigor of the research on exploratory behavior. First, we believe that the experimental arena was too small for accurately observing the unfolding of exploration. The movement of assayed animals was clearly impaired by boundary effects, which obscured key elements of C. elegans exploratory behavior such as the mean square displacement or large-scale trajectory structures emerging from curvature bias. Second, we think that the proof that strong reorientations are optimized to maximize exploration performance is too indirect: it relies on a particular model with some unrealistic assumptions and lacks a quantification of the gains provided by the optimization to the individuals. We believe that a more thorough and direct analysis would be needed to fully support the claim.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to recommendations

      Reviewer #1 (Recommendations For The Authors):

      Describe more precisely how gene expression graphs are built (tissues, reads counts). For example, how were read counts normalized? Were they from DESeq2 data, which only works by comparing two samples? If so, all samples should be independently compared to a reference and the normalized expression value of the reference will change from sample to sample... thus introducing a pure technical artifact.

      We have added additional information about the normalisation method to the Material and Methods section (lines 597-598: “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.”) and figure legends (lines 247, 286, 372, 404: “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.”) to address this recommendation.

      DESeq2 provides a reference-independent normalisation through a median of ratios method (a good explanation can be found here: https://hbctraining.github.io/DGE_workshop/lessons/02_DGE_count_normalization.html). The normalised expression values are independent of any reference, and therefore will not change from sample to sample as suggested in this comment. In contrast, pairwise comparisons are made when testing for significantly differentially expressed genes between two treatments using a Wald test, which is done against a reference and generates log2 fold change information and p-values; however, this is different from the normalisation described above.
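      The median of ratios idea can be illustrated with a minimal sketch. This is not the DESeq2 code itself: the function names and the toy count matrix below are hypothetical, and the sketch only assumes the standard description of the method, in which each gene's counts are divided by that gene's geometric mean across all samples (a pseudo-reference built from every sample jointly, not a chosen reference sample) and the per-sample median of these ratios is used as the size factor.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples list of raw counts."""
    # Use only genes with a nonzero count in every sample, since log(0) is undefined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across all samples: the pseudo-reference.
    log_ref = [sum(math.log(c) for c in row) / len(row) for row in usable]
    factors = []
    for j in range(len(counts[0])):
        # Log ratio of each usable gene in sample j to the pseudo-reference.
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalise(counts):
    """Divide every raw count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical toy data: two samples, the second sequenced twice as deeply.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 3], [30, 60]]
toy_factors = size_factors(toy_counts)
toy_normalised = normalise(toy_counts)
```

      In this toy case the two size factors differ by a factor of two, so the normalised counts of the unchanged genes agree between samples, and no sample has to be singled out as the reference.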

      Provide bioinformatics workflows and, if possible, the set of parameters used, the computing resources, etc. Were some assembly finishing steps carried out (by long-range PCR?) and experimental validations (especially for allele-specific transcripts, by conventional RT-PCR based on diagnostic mutations)?

      We have added additional information on the bioinformatics workflows where required, including parameters used (lines 530, 536, 549-551, and 574-583). No finishing steps other than HiC scaffolding were performed. No allele-specific analysis was done as part of this manuscript.

      To further improve transparency, we have also uploaded all the scripts used for this study to https://github.com/R-Huerlimann/Malabar_grouper_genome and the gene models and functional annotation to https://figshare.com/projects/Malabar_grouper_Epinephelus_malabaricus_genome_annotation/199909. This information has been added to the manuscript in lines 600-601 and 609-611.

      Reviewer #3 (Recommendations For The Authors):

      General author response:

      All the recommendations of this reviewer are very relevant and would certainly provide a lot of information, but they constitute a full project in themselves, as they would imply establishing this grouper species as an experimental model in our lab. Currently we only have access to the larval and juvenile stages via a collaboration with the Okinawa Prefectural Sea Farming Center, which is an hour's drive from our lab, and this access is limited to the grouper spawning season. To do everything that is suggested, we would need regular and easy access to the fish. This would require establishing this model in our marine station, which is not possible due to space and time constraints. These groupers grow to a very large size (1-2 m in length, and up to 150 kg in weight) and only mature into males after > 6 years.

      First and foremost, I would advise the authors to extend their TH and cortisol levels measurements to the entire developmental time considered in their analysis.

      For the reasons stated above we could not perform these experiments. We must emphasize that the data regarding TH are available for a closely related species (e.g., Epinephelus coioides, de Jesus et al. 1998) and there is no reason to think that the situation will be drastically different in E. malabaricus. In addition, given that we have now studied several coral reef fish species in the same context (clownfish, surgeonfish, damselfish, gobies) we observed that the transcriptomic data are more robust, more sensitive, and more precise than hormone measurements. 

      Consider carrying out in situ hybridisation of TSH with putative CRH receptors to determine if thyrotrophin could be competent to respond to HPA axis signals.

      We agree that studying the interplay between corticoids and thyroid hormones at the neuroendocrine level would be desirable and we fully agree with the experiment suggested by the reviewer, but this is impossible in our current situation. We are not working with an established animal model like zebrafish or Xenopus, but with a large, long-lived marine fish that reproduces in spawning aggregations and whose husbandry is notoriously difficult.

      Consider conducting cortisol treatment experiments to functionally determine if indeed cortisol is involved in grouper metamorphosis.

      We tried to do TH and cortisol treatments specifically on the early larval stages corresponding to the early TH peak to see how this would impact the development of the fin spines, but our trials were unsuccessful. The larvae at that stage are extremely fragile and even putting them into small volumes of treatment drugs induced massive mortalities. Again, this would mean establishing this grouper species as a model organism and would require a massive effort to improve larval rearing as discussed above. We feel that our data stands on its own in the meantime and adds valuable information to the existing literature by studying a rarely investigated species.

      Responses to comments

      Reviewer #1 (Public Review):

      Weaknesses:

      The manuscript needs proper editing and is not complete. Some wordings lack precision and make it difficult to follow (e.g. line 98 "we assembled a chromosome-scale genome of ..." should read instead "we assembled a chromosome-scale genome sequence of ..."). Also, panel Figure 2E is missing.

      We made the suggested change of adding “sequence” in lines 32 and 121. Concerning additional changes, we have carefully edited our manuscript and looked for any incomplete sections. Unfortunately, it is difficult to see what other issues are being raised here without any further information. 

      As for panel E of figure 2, it is not missing. The panel is located to the right, just below “Target Cells”.

      The shortcomings of the manuscripts are not limited to the writing style, and important technical and technological information is missing or not clear enough, thereby preventing a proper evaluation of the resolution of the genomic resources provided:

      Several RNASeq libraries from different tissues have been built to help annotate the genome and identify transcribed regions. This is fine. But all along the manuscript, gene expression changes are summarized into a single panel where it is not clear at all which tissue this comes from (whole embryo or a specific tissue ?), or whether it is a cumulative expression level computed across several tissues (and how it was computed) etc. This is essential information needed for data interpretation.

      No fertilised eggs or embryos have been sequenced. The individual tissues derived from juvenile fish were used for the genome annotation only, using ISOseq. The whole larval fish were used for the developmental analysis using RNAseq, as well as the genome annotation. We have added additional information in the figures and text that the results shown are from whole larvae, and added more detail to the material and methods section about which type of sample was analysed in which way.

      Specifically, we have added “Lastly, expression levels shown in figures 2-5 are normalised gene counts produced by DESeq2.” to lines 597-598 in the Material and Methods section, “Gene expression data was generated from whole larvae.” to line 191, and “Gene expression data was generated from whole fish. Expression levels were derived from DESeq2 normalised gene counts.” to the figure legends (lines 247, 286, 372, 404). Additionally, we have added clarifications in lines 489, 497, 530, and 536.

      The bioinformatic processing, especially of the assembly and annotation, is very poorly described. This is also a sensitive topic, as illustrated by the numerous "assemblathon" and "annotathon" initiatives to evaluate tools and workflows. Importantly, providing configuration files and in-depth description of workflows and parameter settings is highly recommended. This can be made available through data store services and documents even benefit from DOIs. This provides others with more information to evaluate the resolution of this work. No doubt that it is well done, but especially in the field of genome assembly and annotation, high resolution is VERY cost and time-intensive. Not surprisingly, most projects are conditioned by trade-offs between cost, time, and labor. The authors should provide others with the information needed to evaluate this.

      We have added additional information on parameters used in the genome assembly, annotation and transcriptome analysis in lines 549-551, 577, 579, 580, and 582. Additionally, we have uploaded all scripts to github as outlined in the Code and Data Availability section (lines 599-614).

      The genome assembly did not use a specific workflow (e.g., nextflow), but was done with a simple command and standard parameters in IPA. Scaffolding was carried out by Phase Genomics using their standardised proprietary workflow, of which a detailed description provided by Phase Genomics can be found in the supplementary material.

      Quantifications of T3 and T4 levels look fairly low and not so convincing. The work would clearly benefit from a discussion about why the signal is so low and what are the current technological limitations of these quantifications. This would really help (general) readers.

      The T3/T4 levels are consistent with other published work in fish. In the present manuscript for grouper we have a peak level of 1.2 ng/g (1,200 pg/g) of T4 and 0.06 ng/g (60 pg/g) of T3. This is a higher level of T4 and comparable level of T3 to what was found in convict tang (Holzer et al. 2017; Figure 2) with 30 pg/g of T4 and 100 pg/g of T3. Of course, there are also examples with higher levels, such as clownfish (Roux et al. 2023; Figure 1), with 10 ng/g (10,000 pg/g) of T4 and 2 ng/g (2,000 pg/g) of T3.

      The differences could be due to differences in the structure of fish tissues and therefore in hormone extraction efficiency, different hormone measurement protocols, different fish physiology, or different fish size (e.g., the weighing of tiny grouper larvae is difficult and less precise than in convict tang). What is important is not the absolute level but the relative level, which shows the change across larval stages of a species with identical extraction and measurement protocols. This means our data are internally consistent and coherent with the grouper literature.

      Holzer, Guillaume, et al. "Fish larval recruitment to reefs is a thyroid hormone-mediated metamorphosis sensitive to the pesticide chlorpyrifos." Elife 6 (2017): e27595.

      Roux, Natacha, et al. "The multi-level regulation of clownfish metamorphosis by thyroid hormones." Cell Reports 42.7 (2023).

      Differential analysis highlights up to ~ 15,000 differentially expressed genes (DEG), out of a predicted 26k genes. This corresponds to more than half of all genes. ANOVA-based differential analysis relies on the simple fact that only a minority of genes are DEG. Having >50% DEG is well beyond the validity of the method. This should be addressed, or at least discussed.

      The large number of differentially expressed genes is due to the fact that this is coming from a larval developmental transcriptome going from one day old larva to fully metamorphosed juveniles at around day 60. 

      While DESeq2 indeed works on the assumption that most genes are not differentially expressed, this affects normalisation but not hypothesis testing (Wald test, LRT, or ANOVA). Moreover, normalisation in DESeq2 is fairly robust to violations of this assumption. According to the author of DESeq2, Michael Love, DESeq2 uses the median ratio for normalisation, and as long as the numbers of up- and downregulated genes are relatively even, DESeq2 will be able to handle the data. As part of our general quality control for this project we consulted the MA plots, which do not show any overrepresented up or down expression patterns. Additionally, see Michael Love's comment on comparing different tissues, which is also applicable here when comparing vastly different larval stages (https://support.bioconductor.org/p/63630/):

      “For experiments where all genes increase in expression across conditions, the median ratio method will not be able to capture this difference, but this is typically not the case for a tissue comparison, as there are many "housekeeping" genes with relatively similar expression pattern across tissues.”
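      The argument above can be checked on a toy example. The following is a minimal sketch with hypothetical numbers and function names, assuming only the standard median of ratios definition (it is not DESeq2 code): two samples with a true sequencing depth ratio of 2, in which six of seven genes are differentially expressed, either in a balanced way (three up, three down) or all in the same direction.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples list of raw counts."""
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log geometric mean across samples (pseudo-reference).
    log_ref = [sum(math.log(c) for c in row) / len(row) for row in usable]
    factors = []
    for j in range(len(counts[0])):
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


TRUE_DEPTH_RATIO = 2   # hypothetical: sample B sequenced twice as deeply as A
BASE_COUNT = 100       # hypothetical count of every gene in sample A


def estimated_depth_ratio(fold_changes):
    """Depth ratio B/A recovered by the size factors for given true fold changes."""
    counts = [[BASE_COUNT, BASE_COUNT * TRUE_DEPTH_RATIO * fc] for fc in fold_changes]
    factor_a, factor_b = size_factors(counts)
    return factor_b / factor_a


# Six of seven genes change, but up- and downregulation are balanced.
balanced_ratio = estimated_depth_ratio([4, 4, 4, 0.25, 0.25, 0.25, 1])
# Six of seven genes change, all in the same direction.
one_sided_ratio = estimated_depth_ratio([4, 4, 4, 4, 4, 4, 1])
```

      In the balanced case the median falls on the unchanged gene and the true depth ratio is recovered even though most genes are differentially expressed; in the one-sided case the median falls on an upregulated gene and the depth ratio is overestimated, which is the situation described in the quotation.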

      Reviewer #3 (Public Review):

      Weaknesses:

      However, the authors make substantial considerations that are not proven by experimental or functional data. In fact, this is a descriptive study that does not provide any functional evidence to support the claims made.

      We agree with the reviewer that our paper lacks functional experiments but despite that, the transcriptomic data clearly show the activation of TH and corticoid pathways during two distinct periods: an early activation between D1 and D10, and a second one between D32 and juvenile stage. These data are interesting as they call for further examination of 1) the existence of an early larval developmental step also involving TH and corticosteroids and 2) the possible interaction of corticoids and TH during metamorphosis. This is a question that is certainly not settled yet in teleost fishes and which is of great interest.

      Especially 1) is of interest and importance, since this early activation (unique to our knowledge in any teleost fish studied so far) raises a lot of new questions and once again will certainly be scrutinised by other groups in the years to come, therefore ensuring a good citation impact of this study. We hope that the reviewer, while disagreeing with some of our statements, will recognize that our study will be stimulating at that level and that this is what scientific studies should do.

      We acknowledge the descriptive nature of the data and the lack of functional experiments in the Discussion in lines 443 to 445: “This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians, but functional experiments need to be conducted to confirm this hypothesis.” As stated above, doing such functional experiments would require establishing the grouper as an experimental model in our husbandry, which currently is not possible due to the large size of the adult fish.

      The consideration that cortisol is involved in metamorphosis in teleosts has never been shown, and the only example cited by the authors (REF 20) clearly states that cortisol alone does not induce flatfish metamorphosis. In that work, the authors clearly state that in vivo cortisol treatment had no synergistic effect with TH in inducing metamorphosis. Moreover, in Senegalensis, the sole pre-otic CRH neuron number decreases during metamorphosis, further arguing that, at least in flatfish, cortisol is not involved in flatfish metamorphosis (PMID: 25575457).  

      We will do our best to improve the clarity of the revised manuscript to avoid any misunderstanding about our claims. However, we would like to point out the semantic shift in the reviewer's first sentence: indeed, “being involved” is not the same as “cortisol alone does not induce”. In ref 20 the authors explicitly wrote that “Cortisol further enhanced the effects of both T4 and T3, but was ineffective in the absence of thyroid hormones” and in our view this indeed corresponds to “being involved in metamorphosis”.

      We are not claiming that cortisol alone is involved in metamorphosis as the reviewer suggests, but simply that there is a possible involvement of cortisol together with TH in metamorphosis. We stand by this claim, as we indeed observed an activation of corticoid pathway genes around D32, which is sufficient to say it is involved. We do agree that functional experiments will be needed to properly demonstrate the involvement of corticoids in grouper metamorphosis, but this was not possible in the current study as it would require setting up a full grouper life cycle in lab conditions, which is beyond the scope of this manuscript.

      We also mentioned in the discussion that the role of corticoids in fish larval development is still debated, and we agree that this remains a contentious issue. We have clarified the Discussion on this point (lines 375-376, lines 439-464).

      We wrote that “There is contrasting evidence of communication between these two pathways during teleost fish larval development with some data suggesting a synergic and other an antagonistic relationship. In terms of synergy, an increase in cortisol level concomitantly with an increase in TH levels has been observed in flatfish [26], golden sea bream [64] and silver sea bream [65]. Cortisol was also shown to enhance in vitro the action of TH on fin ray resorption (phenomenon occurring during flatfish metamorphosis) in flounder[27]. It has also been shown that cortisol regulates local T3 bioavailability in the juvenile sole via regulation of deiodinase 2 in an organ-specific manner [66]. On the antagonistic side, it has been shown that experimentally induced hyperthyroidism in common carp decreases cortisol levels[67], whereas cortisol exposure decreases TH levels in European eel [68]. Given this scattered evidence, the existence of a crosstalk active during teleost larval development and metamorphosis has never been formally demonstrated. The results we obtained in grouper are clearly indicating that HPI axis is activated during both early development and metamorphosis and that cortisol synthesis is activated during early development. This may suggest that in some aspect, cortisol synthesis could work in concert with TH, as has been shown in several different contexts in amphibians [25], but functional experiments need to be conducted to confirm this hypothesis.” In the revised manuscript, we have also added the interesting case of the Senegal sole mentioned by the reviewer.

      In the last revision, we had also added that our results “brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy” meaning that we clearly acknowledge that we are only revealing a hypothesis that remains to be tested. We later follow up with a discussion about the most novel observation and focus of our study, the increase in THs and cortisol during early development, which was unexpected and very intriguing. Again, these results suggest that there might be a link between the two, as has been shown in amphibians. This is typically the kind of results that should encourage more investigations into other fish species. Indeed, this has been pointed out by other authors and in particular by Bob Denver (probably the foremost expert on this topic) in Crespi and Denver 2012: “Elevation in HPA/I axis activity has been described prior to Metamorphosis in amphibians and fish, birth in mammals (reviewed in Crespi & Denver 2005a; Wada 2008)”. B. Denver also adds that: “Experiments in which GCs were elevated prior to metamorphosis or prior to hatching or birth (e.g. Weiss, Johnston & Moore 2007) or inhibited by treatments with GC synthesis blockers (e.g. metyrapone) or receptor antagonists (e.g. RU486, Glennemeir & Denver 2002) demonstrate that GCs play a causal role in precipitating these life-history transitions (also reviewed in Crespi & Denver 2005a; Wada 2008).” We believe the reviewer will be convinced by these elements coming from a colleague unanimously respected in the field. 

      Furthermore, the authors need to recognise that the transcriptomic analysis is whole-body and that HPA axis genes are upregulated, which does not mean they are involved in regulating the HPT axis. The authors do not show that in thyrotrophs, any CRH receptor is expressed or in any other HPT axis-relevant cells and that changes in these genes correlate with changes in TSH expression. An in-situ hybridisation experiment showing co-expression on thyrotrophs of HPA genes and TSH could be a good start. However, the best scenario would be conducting cortisol treatment experiments to see if this hormone affects grouper metamorphosis.

      We agree that functional experiments are needed to validate our hypothesis. As the early peaks of expression levels observed for many genes were very intriguing for us, we did carry out thyroid hormones and goitrogenic treatment on young grouper larvae to test their effect on the morphological changes. Unfortunately, such experiments, already tricky on metamorphosing larvae, are even more risky on such tiny individuals just after hatching and we encountered high mortality rates. We must add that because we cannot establish a full grouper life cycle under lab conditions, we have done these experiments in the context of a commercial husbandry system in Japan, which while excellent limits the scope of possible experiments. We were thus not able to provide functional validation of our hypothesis. Such experiments will be a full project in itself, requiring setting up a rearing system suitable for both larval survival and economical constraints related to drug treatments. We were further limited by the spawning times of the grouper in the operational aquaculture farm, which are limited to a short time during each year. So even if we strongly agree with the necessity of conducting such experiments, we think that this is not in the scope of the present paper, but something future research can explore.

      High TSH and Tg levels usually parallel whole-body TH levels during teleost metamorphosis. However, in this study, high Tg expression levels are only achieved at the juvenile stage, whereas high TSH is achieved at D32, and at the juvenile stage, they are already at their lowest levels.

      This is exactly our point. We observe two peaks in TSH expression, one at D3 and one at D32. The peak at D3 coincides with high thyroid hormone levels on the same day, and while we have not measured TH at D32, existing literature shows that there is a peak in TH during that time (e.g., de Jesus et al., 1998). Similarly, there is a small peak of Tg at D3. Our manuscript focused more on the upregulation of these genes at D3, which has not been reported before in the literature and raised the question of the role of TH so early in the larval development, outside of the metamorphosis period. 

      Regarding the respective levels of TSH and Tg, we first would like to add that their respective order of appearance before metamorphosis (TSH at D32, Tg after) is consistent with what we would expect. We agree however that the strong increase of Tg and TPO expression is later than expected. Therefore, we have added the following sentence in lines 212 to 216: “The respective order of appearance of TSH and Tg (TSH at D32, Tg after) is consistent with what we would expect but a bit later than expected given the morphological transformation. It would be interesting to revisit this in a future series of experiments, with tighter temporal sampling to study how gene expression and morphological transformation aligned.”

      It is very difficult to conclude anything with the TH and cortisol levels measurements. The authors only measured up until D10, whereas they argue that metamorphosis occurs at D32. In this way, these measurements could be more helpful if they focus on the correct developmental time. The data is irrelevant to their hypothesis.

      We respectfully disagree with the reviewer, considering that 1) TH levels have already been investigated in groupers coinciding with pigmentation changes and fin rays resorption (Figure 4 in de Jesus et al, 1998), 2) there is also evidence in numerous fish species that TH level increase is concomitant with increase of TH related genes, and 3) we observed in our data an increase in the expression of TH related genes as well as pigmentation changes and fin rays resorption. Based on our experience in fish metamorphosis and the literature we can say confidently that those observations indicate that metamorphosis is occurring between D32 and the juvenile stage. This clearly shows that our inference is correct. Additionally, we would like to reemphasize that from our experience in several fish species transcriptomic data are more robust and precise than hormone measurements.

      However, as we were surprised by the activation of TH and corticoid pathway genes very early in the larval development (at D3), which is clearly outside of the metamorphosis period, we decided to measure TH and cortisol levels during this period to determine whether this surprising early activation indeed corresponded to an increase in both TH and cortisol. As such an observation has never been made in other teleost species (to our knowledge), and as we were wondering if gene activation was accompanied by hormonal increase, the measurements we did for TH and cortisol between D1 and D10 are relevant. In order to clarify our message further, we have changed some of the mentions of “metamorphosis” to “larval development” throughout the manuscript and added other improvements to avoid any confusion between the two periods we are studying: early larval development (between D1 and D10) and metamorphosis (between D32 and juvenile stage).

      Moreover, as stated in the previous review, a classical sign of teleost metamorphosis is the upregulation of TSHb and Tg, which does not occur at D32; therefore, it is very hard for me to accept that this is the metamorphic stage. With the lack of TH measurements, I cannot agree with the authors. I think this has to be toned down and made clear in the manuscript that D32 might be a putative metamorphic climax but that several aspects of biology work against it. Moreover, at D10, the authors show the highest cortisol level and lowest T4 and T3 levels. These observations are irreconcilable with cortisol enhancing or participating in TH-driven metamorphosis.

      We thank the reviewer for this comment, but we think that there might be a misunderstanding here. 

      (1) We clearly observed an increase of TSHb (that occurs between D18 and juvenile stage) and an increase of tg from D32, which coincide with the activation of other genes involved in the TH pathway (dio2, dio3, and also a strong increase of TRb). All this, put in the context of what we know from previous grouper studies, clearly supports our conclusion that TH-regulated metamorphosis starts at around D32 in grouper. We also observed morphological changes such as fin ray resorption and pigmentation changes between D32 and juvenile stage. Such morphological changes have already been associated with metamorphosis in groupers (De Jesus et al 1998) as they occur during the TH level increase, and they also happen to be under the control of TH in grouper (De Jesus et al 1998). Based on this study, but also on studies (conducted on many other teleost species) showing that the increase of TH levels is always associated with an activation of TH pathway genes and with morphological and pigmentation changes, we concluded that metamorphosis of E. malabaricus occurs between D32 and juvenile stage. We have improved the clarity of the manuscript in several places to make sure that our conclusion is based on our transcriptomic and morphological data plus the available literature.

      (2) We clearly observed another activation of TH-related genes earlier in development (between D1 and D10, with a surge of trhrs, tg and tpo at D3). As this activation was very unexpected for us, we decided to focus the analysis of TH levels between D1 and D10 and, very interestingly, we observed a high level of T4 at D3, indicating that THs are instrumental very early in the larval development of the malabar grouper, which has never been shown before. We stated in lines 224-225 that our “data reinforce the existence of two distinct periods of TH signalling activity, one early on at D3 and one late corresponding to classic metamorphosis at D32”. We agree that we could have explained more clearly that this early activation was very intriguing for us and that we wanted to investigate hormonal levels around that period. However, we never claimed anywhere in the manuscript that this early developmental period corresponds to metamorphosis. Something else is occurring and both TH and cortisol seem to be involved, but further experiments need to be conducted to understand their role and their possible interaction. We have added corresponding statements in the abstract (lines 39-43) and discussion (lines 447 to 449).

      (3) Finally, regarding the comment about cortisol enhancing or participating in TH-driven metamorphosis, our data clearly showed an activation of the corticoid pathway genes around metamorphosis (between D32 and juvenile stage), suggesting a potential implication of corticoids in metamorphosis, but we agree with the reviewer that further experiments are needed to test that. We never claimed that cortisol was enhancing or participating in metamorphosis; on the contrary, we are “suggesting a possible interaction between TH and corticoid pathway during metamorphosis”. And we also say that our “results brought a first insight into the potential role of corticoids in the metamorphosis of E. malabaricus and call for functional experiments directly testing a possible synergy.” Nonetheless, we agree that some parts of our manuscript can be confusing regarding cortisol synthesis during metamorphosis, as we did not measure cortisol levels between D32 and juvenile stage. We have therefore made changes throughout the Introduction and Discussion to make this clearer.

      Given this, the authors should quantify whole-body TH levels throughout the entire developmental window considered to determine where the peak is observed and how it correlates with the other hormonal genes/systems in the analysis.

      We did not measure TH levels at later stages as they have already been measured during Epinephelus coioides metamorphosis, and the morphological changes observed in this species around the TH peak correspond to what we observed in Epinephelus malabaricus around the peak of expression of TH pathway genes (see De Jesus et al., 1998 General and Comparative Endocrinology, 112:10-16). The main focus of this manuscript is the novel observation of an early activation period at D3, for which we needed TH levels to determine if they were involved in another early developmental process (not related to metamorphosis). Our hypothesis is that this early activation might be related to the growth of fin rays necessary to enhance floatability during the oceanic larval dispersal. As we may have arrived at the explanation of this hypothesis too rapidly without setting up the context well enough, we have made changes to the introduction and discussion.

      Even though this is a solid technical paper and the data obtained is excellent, the conclusions drawn by the authors are not supported by their data, and at least hormonal levels should be present in parallel to the transcriptomic data. Furthermore, toning down some affirmations or even considering the different hypotheses available that are different from the ones suggested would be very positive.

      We thank the reviewer for acknowledging the solidity of the method of our paper and the quality of the results. We agree that there were several parts where our message was unclear. We have addressed these points in the revised version of the manuscript to make sure there is no more confusion between the two distinct periods we studied in this paper (early larval development and metamorphosis). We also made sure that our claims about TH/corticoid interaction during both periods remain hypothetical, as we cannot yet, despite trials, support them with functional experiments.

    1. Author response:

      eLife assessment

      This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.

      We thank the Reviewers and the Reviewing Editor for taking the time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. In what follows we summarize our current plan to improve the paper in response to their suggestions.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.

      Thanks for these insights and for this summary of our work.

      Major comments:

      (1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.

      We will better describe the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how this ratio of neuron numbers depends on the weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig 6E). We will make sure that these results are suitably expanded and better emphasized in revision. We will also include a new analysis of how the optimal values of other parameters (namely: noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity, time constants of single E and I neurons) depend on the relative weighting of encoding error vs metabolic cost in the loss function.

      (2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.

      We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity similar to those proposed in the above papers. We apologize if this was not clear enough in the previous version. We will make it clearer in revision. We nevertheless think it useful to report the effects of perturbations within this network because the structure derived in our network is not identical to those studied in the above papers, and because these results give information about how lateral inhibition works in this network. Thus, we will keep presenting it in the revised version, although we will de-emphasize and simplify its presentation to give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding.

      (3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.

      We will improve the Limitations paragraph in Discussion, and also anticipate caveats in tandem with results when needed, as suggested.

      (4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future - but most of the "predictions" from the model are actually findings that broadly match earlier experimental results, making them "postdictions".

      This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.

      We will better distinguish between pre- and post-dictions in revision.

      Reviewer #2 (Public Review):

      Summary: In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.

      Strengths:

      While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.

      In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.

      Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.

      Thanks for these insights and for the kind words of appreciation of the strengths of our work.

      Weaknesses:

      Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following. Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output. Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022). Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.

      We are addressing this issue in two ways. First, we will present results of joint sweeps over pairs of parameters whose joint variation is expected to influence optimality in a way that cannot be understood by varying one parameter at a time. Namely, we plan to vary jointly the noise intensity and the metabolic constant, as well as the ratio of E to I neuron numbers and the ratio of mean I-I to E-I connectivity. Second, we will identify a reasonable/realistic range of possible variations of each individual parameter and then perform a Monte Carlo search for the optimal point within this range, and compare the results so obtained with the understanding gained from varying one or two parameters at a time. We will also add the suggested citation to Calaim et al. 2022 in regard to the points discussed above.

      We will improve the comparison between the Excitatory-Inhibitory and the 1-Cell-Type model (see reply to the suggestions of Referee 3 for more details).
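      The two planned analyses (a joint sweep over a pair of parameters, followed by a Monte Carlo search within stated ranges) can be sketched as follows. This is only a minimal illustration and not the code used for the paper: `toy_loss` is a hypothetical stand-in for a full network simulation, and the parameter names, ranges and optimum location are assumptions chosen so that the best value of one parameter depends on the other.

```python
import numpy as np

def toy_loss(noise, metabolic):
    # Hypothetical stand-in for a network simulation: a smooth bowl with an
    # interaction term, so the two parameters cannot be tuned independently.
    u, v = noise - 1.0, metabolic - 2.0
    return u ** 2 + v ** 2 + 0.5 * u * v

def joint_sweep(loss, x_values, y_values):
    # Evaluate the loss on the full 2-D grid and return the best grid point.
    grid = np.array([[loss(x, y) for y in y_values] for x in x_values])
    i, j = np.unravel_index(np.argmin(grid), grid.shape)
    return x_values[i], y_values[j], grid[i, j]

def monte_carlo_search(loss, bounds, n_samples=5000, seed=0):
    # Draw parameter sets uniformly within the given ranges and keep the best.
    rng = np.random.default_rng(seed)
    lows = np.array([b[0] for b in bounds])
    highs = np.array([b[1] for b in bounds])
    samples = rng.uniform(lows, highs, size=(n_samples, len(bounds)))
    losses = np.array([loss(*s) for s in samples])
    best = np.argmin(losses)
    return samples[best], losses[best]

# Assumed ranges for the two illustrative parameters.
grid_x, grid_y, grid_loss = joint_sweep(
    toy_loss, np.linspace(0.0, 3.0, 31), np.linspace(0.0, 4.0, 41)
)
mc_params, mc_loss = monte_carlo_search(toy_loss, [(0.0, 3.0), (0.0, 4.0)])
print("grid optimum:", grid_x, grid_y, "Monte Carlo optimum:", mc_params)
```

      Comparing the grid optimum with the Monte Carlo optimum is one simple way to check that conclusions drawn from one- or two-parameter sweeps hold when all parameters in the chosen range vary together.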

      Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply correspond to the values that should come out of the derivation.

      In the previously submitted manuscript we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We will improve this work by adding the suggested calculations to provide quantitative measures of the dependence of the optimal network parameters and configurations on this relative weighting.
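      To illustrate what such a quantitative measure could look like, the sketch below combines an encoding-error curve and a metabolic-cost curve with a relative weight and reports where the optimum falls for several weights. Only the 0.7 / 0.3 weighting comes from the review above; the two curves (error falling as 1/p, cost rising as p) and the parameter `p` are hypothetical choices, picked because the optimum then has the closed form p* = sqrt(g / (1 - g)), which makes the example easy to check.

```python
import numpy as np

def weighted_loss(error, cost, g_error=0.7):
    # Convex combination of encoding error and metabolic cost.
    return g_error * error + (1.0 - g_error) * cost

def optimum_vs_weight(param_values, error_curve, cost_curve, weights):
    # For each weighting, return the parameter value that minimises the loss.
    optima = {}
    for g in weights:
        loss = weighted_loss(error_curve, cost_curve, g)
        optima[float(g)] = float(param_values[np.argmin(loss)])
    return optima

# Hypothetical curves: error decreases and cost increases with the parameter.
params = np.linspace(0.1, 5.0, 4901)
error_curve = 1.0 / params
cost_curve = params

optima = optimum_vs_weight(params, error_curve, cost_curve, [0.5, 0.7, 0.9])
for g, p_star in optima.items():
    print(f"weight on error = {g:.1f}: optimal parameter ~ {p_star:.3f}")
```

      In this toy case the optimum moves from 1 to 3 as the weight on the encoding error goes from 0.5 to 0.9, which shows how strongly an optimum can depend on the weighting when error and cost pull in opposite directions.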

      Reviewer #3 (Public Review):

      Summary: In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?

      They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.

      They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.

      Strengths:

      (1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.

      (2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.

      (3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.

      Thanks for this summary and for these kind words of appreciation of the strengths of our work.

      Weaknesses:

      (1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.

      We will improve the text to make sure that credit to previous studies is more precisely and more clearly given.

      (2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.

      We indeed removed non-Dalian connections because having only connections respecting Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. However, to get better insights into how Dale’s Law constrains or influences the design of efficient networks, we added a comparison of the coding properties of networks that either do or do not satisfy Dale’s law. We apologize if this was not sufficiently clear in the previous version and we will clarify this in revision. 

      (3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.

      We will perform the suggested detailed comparisons between the network loss in the 1CT-model and E-I model and then revise or refine conclusions if and as needed, according to the results we will obtain.

      (4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.

      We will try to make the presentation of the model more accessible to a non-computational audience.

      Assessment and context: Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.

      Thanks for these kind words. We will make sure that these points emerge more clearly and in a more accessible way from the revised paper.

    1. https://web.archive.org/web/20240725080148/https://fossacademic.tech/2024/02/11/Move-Slowy-Preview.html [[Move Slowly and Build Bridges by Robert Gehl]] is a forthcoming book on 'Mastodon, the Fediverse, and the Struggle for Ethical Social Media'. This post gives summaries per chapter of the draft. Ch1 focuses on Xodus after Musk only. Odd, there are many examples where costs of leaving socmed platforms played a role, which may well be more informative than just n=1. Ch 2 on AP as protocol. Ch 3 CoC as a social layer on networked tech (no regard here it seems for the fact that human networks exist outside of tech and span multiple tech platforms simultaneously, and themselves have social norms that guide behaviour regardless whether codified in CoC or expressed in federation choices). Ch 4 on blocking and defederation as a needed safety tool. Socially I think the default might need to be the other way around, federating is the choice, defed the default, as it is how we do it socially irl. We are not unwelcoming to newcomers in a group but we are wary. Ch 5. Who pays for the fediverse infra. Short answer is we all do/many of us do. I pay my own instance, and also contribute hours to the governance of the largest Dutch instance. Good point about people forgetting there are other bizz models for digital media than what centralised adtech kraken do. Ch 6. on eco impact of socmed, and need of awareness what running this stuff costs ecologically. Seems to then pivot to how degrowth and solarpunk people are using fediverse tech to interact, which is not the same thing. (It says mitigate, but compared to what, X?) Ch 7. Threads, or the corp reaction to a growing fediverse. Conclusion, this is where the ethics will be discussed finally.

      Forthcoming w Oxford Univ Press. Not sure this is for me, reads like a snapshot with a limited time window in which it might be informative. Perhaps of interest for [[Stichting ActivityClub Bestuur Hoofdnote]].

    1. Reviewer #1 (Public Review):

      Summary:

      Boldt et al test several possible relationships between transdiagnostically-defined compulsivity and cognitive offloading in a large online sample. To do so, they develop a new and useful cognitive task to jointly estimate biases in confidence and reminder-setting. In doing so, they find that over-confidence is related to less utilization of reminder-setting, which partially mediates the negative relationship between compulsivity and reminder-setting. The paper thus establishes that, contrary to the over-use of checking behaviors in patients with OCD, greater levels of transdiagnostically-defined compulsivity predict less deployment of cognitive offloading. The authors offer speculative reasons as to why (perhaps it's perfectionism in less clinically-severe presentations that lowers the cost of expending memory resources), and set an agenda to understand the divergence in cognition between clinical and nonclinical samples. Because only a partial mediation had robust evidence, multiple effects may be at play, whereby compulsivity impacts cognitive offloading via overconfidence and also by other causal pathways.

      Strengths:

      The study develops an easy-to-implement task to jointly measure confidence and cognitive offloading, and replicates several major findings on both. The study uses a useful measure of cognitive offloading - the tendency to set reminders to augment accuracy in the presence of experimentally manipulated costs. Moreover, the study utilizes multiple measures of presumed biases - overall tendency to set reminders, the empirically estimated indifference point at which people engage reminders, and a bias measure that compares optimal indifference points to engage reminders relative to the empirically-observed indifference points. That the study observes convergence along all these measures strengthens the inferences made relating compulsivity to the under-use of reminder-setting. Lastly, the study does find evidence for one of several a priori hypotheses and sets a compelling agenda to try to explain why such a finding diverges from an ostensible opposing finding in clinical OCD samples and the over-use of cognitive offloading.

      Weaknesses:

      Although I think this design and study are very helpful for the field, I felt that a feature of the design might reduce the task's sensitivity to measuring dispositional tendencies to engage cognitive offloading. In particular, the design introduces prediction errors (PEs) that could induce learning and interfere with natural tendencies to deploy reminder-setting behavior. These PEs concern whether or not a given selected strategy will be allowed to be engaged. We know individuals with compulsivity can learn even when instructed not to learn (e.g., Sharp, Dolan, and Eldar, 2021, Psychological Medicine), and that more generally, they have trouble with structure knowledge (e.g., Seow et al; Fradkin et al), and thus might be sensitive to these PEs. Thus, a dispositional tendency to set reminders might be differentially impacted for those with compulsivity after a negative prediction error (NPE), where they want to set a reminder, but aren't allowed to. After such an NPE, they may be even more inclined to avoid setting reminders. Those with compulsivity likely have superstitious beliefs about how checking behaviors lead to a resolution of catastrophes, which might in part originate from inferring structure in the presence of noise or from purely irrelevant sources of information for a given decision problem.

      It would be good to know whether such learning effects exist, whether they're modulated by PE (you can imagine PEs are higher if you are more incentivized - e.g., 9 points as opposed to only 3 points - to use reminders, and you are told you cannot use them), and whether this learning effect confounds the relationship between compulsivity and reminder-setting.

      A more subtle point: I think this study is better described as an exploration than as a deductive test of a particular model -> hypothesis -> experiment. Typically, when we test a hypothesis, we contrast it with competing models. Here, the tests were two-sided because multiple models, with mutually exclusive predictions (over-use or under-use of reminders), were tested. Moreover, it's unclear exactly how to make sense of what is called the direct mechanism, which is supported by partial (as opposed to complete) mediation.
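
      To make the distinction between partial and complete mediation concrete, the sketch below decomposes a total effect into a direct and an indirect path using ordinary least squares. It is a minimal illustration on synthetic numbers, not the authors' analysis: the variable names (`x` for compulsivity, `m` for overconfidence, `y` for reminder-setting), the data, and the helper `mediation_paths` are all assumptions introduced here.

```python
def _cross(u, v):
    """Sum of cross-products of two sequences after centering each on its mean."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Single-mediator OLS decomposition (hypothetical helper).

    a        : slope of m on x
    b        : slope of y on m, controlling for x
    direct   : slope of y on x, controlling for m
    total    : slope of y on x alone
    indirect : a * b
    """
    sxx, smm = _cross(x, x), _cross(m, m)
    sxm, sxy, smy = _cross(x, m), _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2  # zero only if x and m are perfectly collinear
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / det
    direct = (sxy * smm - smy * sxm) / det
    total = sxy / sxx
    return {"a": a, "b": b, "direct": direct, "total": total, "indirect": a * b}


# Synthetic, noise-free data built so that the direct path is non-zero.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]          # stand-in for compulsivity scores
m = [0.2, 0.8, 2.3, 2.9, 4.1, 4.7]          # stand-in for overconfidence
y = [-0.5 * xi - 0.4 * mi for xi, mi in zip(x, m)]  # stand-in for reminder-setting

paths = mediation_paths(x, m, y)
print(paths)
```

      In this decomposition the total effect equals the direct effect plus the indirect effect. Complete mediation would correspond to a direct effect near zero; partial mediation, as reported here, leaves a non-zero direct effect that other pathways must explain.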

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2024-02394

      Corresponding author(s): Altman, Brian J

      1. General Statements [optional]

      We thank all three Reviewers for their insightful and helpful feedback and suggestions. We strongly believe that addressing these comments has now resulted in a much-improved manuscript. We appreciate that the Reviewers found the manuscript "interesting" with "valuable insights and... obvious novelty", "an important study that is well-done", and "an important understanding of the crosstalk between cancer cells and immune cells as well as the understanding of how the TME disrupts circadian rhythms". All three Reviewers requested a significant revision, which we provide here. We carefully and completely responded to each Reviewer question or suggestion, in most cases with new experiments and text, and in a very few cases with changes or additions to the Discussion section. This includes new data in seven of the original Figures and Supplementary Figures, and one new main Figure and three new Supplementary Figures. Highlights of these new data include testing the role of low pH in cancer cell supernatant on macrophage rhythms, and analysis of single-cell RNA-sequencing data for heterogeneity in macrophage circadian gene expression. Additional experiments were also performed that were not included in the manuscript, and these data are presented in this Response. A detailed point-by-point response to each comment is included below with excerpts of the data and updated text for the reviewers. Please note that the PDF version of this Response includes images of the new Figures inserted into the manuscript.

      2. Point-by-point description of the revisions

      Reviewer #1

      Evidence, reproducibility and clarity

      The manuscript by Knudsen-Clark et al. investigates the novel topic of circadian rhythms in macrophages and their role in tumorigenesis. The authors explore how circadian rhythms of macrophages may be influenced by the tumor microenvironment (TME). They utilize a system of bone marrow-derived macrophages obtained from transgenic mice carrying PER2-Luciferase (PER2-Luc), a trackable marker of rhythmic activity. The study evaluates how conditions associated with the TME, such as polarizing stimuli (to M1 or M2 subtype), acidic pH, and elevated lactate, can each alter circadian rhythms in macrophages. The authors employ several approaches to explore macrophage functions in cancer-related settings. While the manuscript presents interesting findings and may be the first to demonstrate that tumor stimuli alter circadian rhythms in macrophages and impact tumor growth, it lacks a clear conclusion regarding the role of altered circadian rhythms in suppressing tumor growth. Several discrepancies need to be addressed; the manuscript therefore requires revision before publication, addressing the following comments:

      We thank Reviewer #1 for the comments regarding the quality of our work and are pleased that the Reviewer finds that this manuscript "presents interesting findings and may be the first to demonstrate that tumor stimuli alter circadian rhythms in macrophages and impact tumor growth". We have addressed all comments and critiques from Reviewer #1 below. To summarize, we added new data on how different macrophage polarization states affect media pH (Supplementary Figure 4), further characterized gene expression in our distinct macrophage populations (Supplementary Figure 1), provided clarity in the data and text on the universal nature of Clock Correlation Distance (CCD) across macrophage populations (Figure 6), included human tumor-associated macrophage (TAM) data for CCD (Figure 7), analyzed single-cell RNA-sequencing data of TAMs to demonstrate heterogeneity in circadian gene expression (Figure 9), and used tumor-conditioned media to show that low pH still affects macrophage rhythms in this context (Supplementary Figure 5). Thanks to the helpful suggestions of the Reviewer, we also made numerous clarifications and fixed a critical referencing error that the Reviewer identified.

      Major comments: 1. It is well known that pro-inflammatory macrophages primarily rely on glycolysis during inflammation, exhibiting dysregulated tricarboxylic acid (TCA) cycle activity. These pro-inflammatory macrophages are commonly referred to as 'M1' or pro-inflammatory, as noted in the manuscript. In contrast, M2 macrophages, or pro-resolution macrophages, are highly dependent on active mitochondrial respiration and oxidative phosphorylation (OXPHOS). Given that M1 macrophages favor glycolysis, they create an acidic environment due to elevated lactate levels and other acidifying metabolites. However, the study does not address this effect. The authors' hypothesis revolves around the acidic environment created by glycolytic tumors, yet they overlook the self-induced acidification of media when culturing M1 macrophages. This raises the question of how the authors explain the reduced circadian rhythms observed in pro-inflammatory macrophages in their study, while low pH and higher lactate levels enhance the amplitude of circadian rhythms. I would encourage the authors to incorporate the glycolytic activity of pro-inflammatory macrophages into their experimental setup. Otherwise, the data appear contradictory and, to some extent, misleading.

      We appreciate the important point Reviewer #1 made that macrophages polarized toward a pro-inflammatory phenotype such as those stimulated with IFNγ and LPS (M1 macrophages) prioritize metabolic pathways that enhance glycolytic flux, resulting in increased release of protons and lactate as waste products from the glycolysis pathway. In this way, polarization of macrophages toward the pro-inflammatory phenotype can lead to acidification of the media, which may influence our observations given that we are studying the effect of extracellular pH on rhythms in macrophages. To address this point, we have performed additional experiments in which we measured pH of the media to capture changes in media pH that occur during the time in which we observe changes in rhythms of pro-inflammatory macrophages.

      In line with the documented enhanced glycolytic activity of pro-inflammatory macrophages, the media of pro-inflammatory macrophages is acidified over time, in contrast to media of unstimulated or pro-resolution macrophages. Notably, while pH decreased over time in the pro-inflammatory group, the pH differential between the pH 7.4, pH 6.8, and pH 6.5 sample groups was maintained over the period in which we observe and measure changes in circadian rhythms of pro-inflammatory macrophages. Additionally, media that began at pH 7.4 was acidified only to pH 7 by day 2, above the acidic pH of 6.8 or 6.5. As a result, there remained a difference in pH between the two groups (pH 7.4 and pH 6.5) out to 2 days, consistent with the changes in rhythms that we observe between these two groups. This indicates that the difference in circadian rhythms observed in pro-inflammatory macrophages cultured at pH 7.4 compared to pH 6.5 was indeed due to the difference in extracellular pH between the two conditions. We have incorporated these data, shown below, into Supplementary Figure 4 and added the following discussion of these data to the Results section:

      "In line with their documented enhanced glycolytic capacity, pro-inflammatory macrophages acidified the media over time (Supplementary Figure 4C). Notably, while the pH of the media in which the pro-inflammatory macrophages were cultured decreased over time, the pH differential between the pH 7.4, pH 6.8, and pH 6.5 sample groups of pro-inflammatory macrophages was maintained out to 2 days, consistent with the changes in rhythms that we observe and measure between these groups."

      The article examines the role of circadian rhythms in tumor-associated macrophages, yet it lacks sufficient compelling data to support this assertion. Two figures, Figure 7 and Figure 9, are presented in relation to cancer. In Figure 7, gene expression analysis of Arg1 (an M2 marker) and Crem (a potential circadian clock gene) is conducted in wild-type macrophages, BMAL1-knockout macrophages with dysregulated circadian rhythms, and using publicly available data on tumor-associated macrophages from a study referenced as 83. However, it is noted that this referenced study is actually a review article by Geeraerts et al. (2017) titled "Macrophage Metabolism as Therapeutic Target for Cancer, Atherosclerosis, and Obesity" published in Frontiers in Immunology. This raises concerns about the reliability of the results. Furthermore, comparing peritoneal macrophages from healthy mice with macrophages isolated from lung tumors is deemed inaccurate. It is suggested that lung macrophages from healthy mice and those from mice with lung tumors should be isolated separately for a more appropriate comparison. Consequently, Figure 7B is further questioned regarding how the authors could compare genes from the circadian rhythm pathway between these non-identical groups. As a result, the conclusion drawn from these data, suggesting that tumor-associated macrophages exhibit a gene expression pattern similar to BMAL1-KO macrophages, is deemed incorrect, affecting the interpretation of the data presented in Figure 8.

      We thank Reviewer #1 for pointing out our error in the reference provided as the source of the TAM data used for CCD in Figure 7. While we took care to provide the GEO ID for the data set (GSE188549) in the Methods section, we mistakenly cited Geeraerts (2017) Front Immunol when we should have cited Geeraerts (2021) Cell Rep. We have corrected this citation error in the main text.

      We also appreciate Reviewer #1's concern that we are comparing circadian gene expression of peritoneal macrophages to tumor-associated macrophages derived from LLC tumors, which are grown ectopically in the flank for the experiment from which the data set was produced. To ensure an accurate comparison of gene expression, we downloaded the raw FASTQ files from each dataset and processed them in identical pipelines. Our main comparison between these cell types is Clock Correlation Distance (CCD), which compares the pattern of co-expression of circadian genes (Shilts et al PeerJ 2018). CCD was built from multiple mouse and human tissues to be a "universal" tool to compare circadian rhythms, and designed to compare between different tissues and cell types. Each sample is compared to a reference control built from these multiple tissues. To better convey this concept to readers and to give confidence in the suitability of CCD for comparing data sets across different tissues, we have added the reference control to Figure 7 (now Figure 6B). We have also expanded our analysis to include bone marrow-derived macrophages, to further demonstrate that the organization of clock gene co-expression is not specific to peritoneal macrophages; we have added these data to Figure 7 (now Figure 6C,D). Finally, we have included an abbreviated explanation of the points made above in the Results section.

      Due to the universal nature of the CCD tool, we disagree with Reviewer #1's assertion that "the conclusion drawn from these data, suggesting that tumor-associated macrophages exhibit a gene expression pattern similar to BMAL1-KO macrophages, is deemed incorrect". Indeed, this finding mirrors findings in the original CCD paper, which showed that tumor tissues universally exhibit a disordered molecular clock as compared to normal tissue. Notably, the original CCD paper also compared across cell and tumor types.

      As an additional note to the review, we would like to clarify that nowhere in the manuscript do we propose that Crem is a potential circadian clock gene. We are clear throughout the manuscript that we are using Crem as a previously established biomarker for acidic pH-sensing in macrophages. Please see below for the modified Figure and text.

      "To understand the status of the circadian clock in TAMs, we performed clock correlation distance (CCD) analysis. This analysis has previously been used to assess functionality of the circadian clock in whole tumor and in normal tissue[102]. As the circadian clock is comprised of a series of transcription/translation feedback loops, gene expression is highly organized in a functional, intact clock, with core clock genes existing in levels relative to each other irrespective of the time of day. In a synchronized population of cells, this ordered relationship is maintained at the population level, which can be visualized in a heatmap. CCD is designed to compare circadian clock gene co-expression patterns between different tissues and cell types. To accomplish this, CCD was built using datasets from multiple different healthy tissues from mouse and human to be a universal tool to compare circadian rhythms. Each sample is compared to a reference control built from these multiple tissues (Figure 6B)[102]. To validate the use of this analysis for assessing circadian disorder in macrophages, we performed CCD analysis using publicly available RNA-sequencing data from bone marrow-derived macrophages and wild type peritoneal macrophages, as a healthy control for functional rhythms in a synchronized cell population, and BMAL1 KO peritoneal macrophages, as a positive control for circadian disorder[44]."

      And in the Discussion:

      "Interestingly, analysis of TAMs by clock correlation distance (CCD) presents evidence that rhythms are disordered in bulk TAMs compared to other macrophage populations (Figure 6). CCD is one of the most practical tools currently available to assess circadian rhythms due to its ability to assess rhythms independent of time of day and without the need for a circadian time series, which is often not available in publicly available data from mice and humans[102]."

      If the authors aim to draw a clear conclusion regarding the circadian rhythms of tumor-associated macrophages (TAMs), they may need to analyze single-sorted macrophages from tumors and corresponding healthy tissues. Such data are publicly available (of course not in #83)

      We agree with Reviewer #1 that while our interpretation of the data is that there may be heterogeneity in circadian rhythms of tumor-associated macrophages, we cannot prove this without assessing circadian rhythms at the single cell level. While single-cell RNA-sequencing data of freshly isolated tumor associated macrophages of sufficient read depth for circadian gene expression analysis has historically been unavailable, fortunately a dataset was released recently (May 2024) which we were able to use to address this point. We have analyzed publicly available single-cell RNAseq data of tumor-associated macrophages (GSE260641, Wang 2024 Cell) to determine whether there are differences in expression of circadian clock genes between different TAM populations. We have added these data as a new Figure 9. Please see the figure and updated text below.

      "Tumor-associated macrophages exhibit heterogeneity in circadian clock gene expression.

      Our findings suggested that heterogeneity of the circadian clock may lead to disorder in bulk macrophage populations, but did not reveal if specific gene expression changes exist in tumor-associated macrophages at the single-cell level. To determine whether heterogeneity exists within the expression of circadian clock genes of the tumor-associated macrophage population, we analyzed publicly available single-cell RNA sequencing data of macrophages isolated from B16-F10 tumors[107]. To capture the heterogeneity of macrophage subsets within the TAM population, we performed unbiased clustering (Figure 9A). We then performed differential gene expression to determine if circadian clock genes were differentially expressed within the TAM subpopulations. The circadian clock genes Bhlhe40 (DEC1), Bhlhe41 (DEC2), Nfil3 (E4BP4), Rora (RORα), Dbp (DBP), and Nr1d2 (REV-ERBβ) were significantly (adj.p

      We next sought to determine whether differences in circadian clock gene expression between TAM subpopulations were associated with exposure to acidic pH in the TME. To this end, we first assessed Crem expression in the TAM subpopulations that were identified by unbiased clustering. Crem expression was significantly higher in TAM clusters 4, 5, and 6 compared to TAM clusters 1-3 and 7-9 (Figure 9C). Clusters were subset based on Crem expression into Crem high (clusters 4-6) and Crem low (clusters 1-3, 7-9) (Figure 9D), and differential gene expression analysis was performed. The circadian clock genes Nfil3, Rora, Bhlhe40, and Cry1 (CRY1) were significantly (adj.p

      And in the Discussion:

      "Supporting the notion that population-level disorder may exist in TAMs, we used scRNA-sequencing data and found evidence of heterogeneity between the expression of circadian clock genes in different TAM subpopulations (Figure 9A, B). Phenotypic heterogeneity of TAMs in various types of cancer has previously been shown[20, 21, 125, 126], and we have identified distinct TAM subpopulations by unbiased clustering (Figure 9A). Within those TAM subpopulations, we identified differential expression of circadian clock genes encoding transcription factors that bind to different consensus sequences: DEC1 and DEC2 bind to E-boxes, NFIL3 and DBP bind to D-boxes, and RORα and REV-ERBβ bind to retinoic acid-related orphan receptor elements (ROREs)[127, 128]. While little is known about regulation of macrophages by E-box and D-box elements beyond the circadian clock, aspects of macrophage function have been shown to be subject to transcriptional regulation through ROREs[129, 130]. Thus, we speculate that variations in these transcription factors may exert influence on expression of genes to drive diversity between TAM subpopulations. Differential expression of circadian clock genes between TAM subpopulations was also associated with Crem expression (Figure 9C-E), suggesting that exposure of TAMs to acidic pH within the TME can alter the circadian clock. However, there remained significant variation in expression of circadian clock genes within the Crem high and Crem low groups (Figure 9B), suggesting that acidic pH is not the only factor in the TME that can alter the circadian clock. Together, these data implicate the TME in driving heterogeneity in TAM circadian rhythms just as it drives heterogeneity in TAM phenotype.

      Interestingly, in contrast to our observations of circadian disorder in TAMs isolated from LLC tumors (Figure 6), rhythmicity in expression of circadian genes was observed in bulk TAMs isolated from B16 tumors[107]. This suggests that circadian rhythms of TAMs are maintained differently in different types of cancer. Notably, both of these observations were at the population level. Upon separation of the B16 TAM population into subsets by unbiased clustering of single-cell RNA sequencing data, we measured differences in expression of circadian clock genes between TAM subpopulations (Figure 9A,B). This suggests that even within a rhythmic TAM population, there is heterogeneity in the circadian clock of TAM subpopulations."

      Additionally, it is widely acknowledged that human and mouse macrophages exhibit distinct gene expression profiles, both in vitro and in vivo. While assuming that genes involved in circadian rhythms are conserved across species, the authors could consider extending their funding to include analyses of single-sorted macrophages from cancer patients, such as those with lung cancer or pancreatic ductal adenocarcinoma (PDAC). These experiments would provide relevant insights into TAM biology.

      We agree with Reviewer #1 that, ultimately, being able to relate findings in mice to humans is critical. It is important to assess if circadian disorder is observed in TAMs in human cancers as it is for LLC tumor-derived macrophages in mice. To address this point, we have performed CCD using a human data set (GSE116946; Garrido 2020 J Immunother Cancer) suitable for use with CCD (wherein macrophages were isolated from bulk tumor in humans, with a sufficiently large sample size, and not cultured prior to sequencing). We have added these data as a new Figure 7, shown below. Please see the added data and updated text below.

      "We next assessed the status of the circadian clock in human TAMs from NSCLC patients. We performed CCD with publicly available RNA-seq data of tumor-adjacent macrophages and tumor-associated macrophages from NSCLC patients, using alveolar macrophages from healthy donors as a control[104, 105]. To assess the contribution of the acidic TME to circadian disorder, we subset TAM NSCLC patient samples into groups (Crem high TAMs and Crem low TAMs) based on median Crem expression. Notably, in macrophages from human NSCLC there was a trend toward disorder in Crem low but not Crem high TAM samples (Figure 7A,B). Additionally, the co-variance among core clock genes observed in alveolar macrophages from healthy donors was absent within Crem low and Crem high TAM samples (Figure 7C). In all, these data indicate that there is population-level disorder in the circadian rhythms of tumor-associated macrophages in humans and mice, suggesting that circadian rhythms are indeed altered in macrophages within the TME."

      And in the Discussion:

      "Indeed, we observed differences in the circadian clock of Crem low human TAM samples compared to Crem high human TAM samples, suggesting that acidic pH influences circadian disorder in TAMs (Figure 7). Interestingly, Crem low TAM samples exhibited a trend toward disorder while Crem high TAM samples did not. This is of particular interest, as we have observed that acidic pH can enhance circadian rhythms in macrophages, raising the question of whether acidic pH promotes or protects against circadian disorder."

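      The median-based grouping of TAM samples into Crem high and Crem low described above can be illustrated with a short sketch. The sample identifiers and expression values are hypothetical, and the handling of samples that fall exactly on the median (assigned here to the low group) is an assumption, since the text does not specify it.

```python
from statistics import median


def split_by_median(expression_by_sample):
    """Split samples into high and low groups around the median expression.

    expression_by_sample : dict mapping sample ID -> expression of one gene
    Samples equal to the median are placed in the low group (an assumption).
    """
    cutoff = median(expression_by_sample.values())
    high = sorted(s for s, v in expression_by_sample.items() if v > cutoff)
    low = sorted(s for s, v in expression_by_sample.items() if v <= cutoff)
    return cutoff, high, low


# Hypothetical normalized Crem expression for four TAM samples.
crem = {"TAM_1": 2.0, "TAM_2": 8.5, "TAM_3": 4.1, "TAM_4": 6.3}
cutoff, crem_high, crem_low = split_by_median(crem)
print(cutoff, crem_high, crem_low)
```

      Each resulting group can then be analyzed separately, which is what allows the two groups to be compared against the same reference.
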
      Minor comments: 1. Figure 2C needs clarification. It's unclear why pro-inflammatory macrophages treated with lactic acid would have a shorter amplitude and period, while acidic pH would increase amplitude and period in M2 macrophages.

      We thank Reviewer #1 for this important observation. Based on the comment, it is our understanding that the Reviewer is referring to the data in Figure 2 (low pH) compared to Figure 4 (lactate). We also find it very interesting that lactate alters rhythms in a manner distinct from the way in which acidic pH alters rhythms. Reviewer 3 asked for clarification on how lactate affected circadian gene expression at pH 7.4 or 6.5. We have added these data as Figure 4C (data and text below). It is notable that lactate had opposing effects on circadian gene expression at pH 6.5, enhancing the effects of low pH in some cases (Nr1d1) while blunting them in other cases (Cry1). This is mentioned in the text.

      "Lactate was also observed to alter expression of the circadian clock genes Per2, Cry1, and Nr1d1 over time in BMDMs cultured at pH 6.5, while having more subtle effects at pH 7.4 (Figure 4C). Notably, lactate blunted the effect of pH 6.5 on Cry1 expression, while enhancing the effect of low pH on Nr1d1 expression."

      Why these two stimuli alter rhythms differently remains an open question that is discussed in the Discussion section and is a prime topic for future investigation. We have added to the Discussion section potential reasons why these conditions may alter rhythms differently, such as the different pathways downstream of sensing these two different conditions. Please see the updated text, below.

      "Although lactate polarizes macrophages toward a pro-resolution phenotype similar to acidic pH[30, 93], exposure to lactate had different effects on circadian rhythms - and in some cases, circadian clock gene expression - than exposure to acidic pH (Figure 4). Sensing of lactate occurs through different pathways than acid-sensing, which may contribute to the different ways in which these two stimuli modulate circadian rhythms of macrophages[111]. One previously published finding that may offer mechanistic insight into how phenotype can influence circadian rhythms is the suppression of Bmal1 by LPS-inducible miR-155[54]. It has also been observed that RORα-mediated activation of Bmal1 transcription is enhanced by PPARγ co-activation[112]. In macrophages, PPARγ expression is induced upon stimulation with IL-4 and plays a key role in alternative activation of macrophages, promoting a pro-resolution macrophage phenotype, and supporting resolution of inflammation[113-115]. Such observations prompt the question of whether there are yet-unidentified factors induced downstream of various polarizing stimuli that can modulate expression of circadian genes at the transcriptional and protein levels. Further work is required to understand the interplay between macrophage phenotype and circadian rhythms."

      The scale in Figure 2C should be equal for all conditions (e.g., -200).

      We appreciate Reviewer #1's preference for the axes to be scaled similarly to enable cross-comparison between graphs. However, due to the different amplitude of pro-inflammatory macrophages compared to the others, we feel that making all axes the same will make it hard to see the rhythms of pro-inflammatory macrophages, hindering the reader's ability to observe the data. Thus, we have put the matched-axis plots, shown below, in Supplementary Figure 4A.

      Absolute values of amplitude, damping, and period differ between Figure 1 and Figure 2A, B, C. The authors should explain these discrepancies.

      As with many experimental approaches, there is slight variation in absolute values between independent experiments, which Reviewer #1 correctly notes. However, while the absolute values vary slightly, the relationship between the values in each of these conditions remains the same across the panels mentioned by Reviewer #1.

      The authors should consider modulating the acidic environment of macrophages in settings more representative of cancer. For example, by adding conditioned media from tumor cells with pronounced glycolysis.

      We appreciate Reviewer #1's desire to more closely mimic the tumor microenvironment. To address Reviewer #1's point, we cultured macrophages in RPMI or cancer cell (KCKO) supernatant at pH 6.5 or pH-adjusted to pH 7.4 and assessed rhythms by measuring rhythmic activity of Per2-Luc with LumiCycle analysis. We then compared changes in rhythms between macrophages cultured in normal media and those cultured in cancer cell supernatant under pH-matched conditions to assess how cancer cell-conditioned media may influence circadian rhythms of macrophages, and the contribution of acidic pH. We have added these data, shown below, as a new Supplementary Figure 5, and included a discussion of these data in the manuscript. Please see the new Figure and updated text below.

      "Cancer cell supernatant alters circadian rhythms in macrophages in a manner partially reversed by neutralization of pH.

      We have observed that polarizing stimuli, acidic pH, and lactate can alter circadian rhythms. However, the tumor microenvironment is complex. Cancer cells secrete a variety of factors and deplete nutrients in the environment. To model this, we cultured BMDMs in RPMI or supernatant collected from KCKO cells, which are a murine model of pancreatic ductal adenocarcinoma (PDAC)[94, 95], at pH 6.5 or neutralized to pH 7.4 (Supplementary Figure 5). Circadian rhythms of BMDMs cultured in cancer cell supernatant at pH 7.4 or pH 6.5 exhibited increased amplitude and lengthened period compared to RPMI control at pH 7.4 or 6.5, respectively, indicating that cancer cell supernatant contains factors that can alter circadian rhythms of BMDMs. Notably, BMDMs cultured in cancer cell supernatant at pH 6.5 had increased amplitude and shortened period compared to BMDMs cultured in cancer cell-conditioned media at pH 7.4, indicating that pH-driven changes in rhythms were maintained in BMDMs cultured in cancer cell supernatant. When the pH of cancer cell supernatant was neutralized to pH 7.4, the increase in amplitude was reduced and the shortened period was lengthened, indicating that neutralizing acidic pH partially reverses the changes in rhythms observed in macrophages cultured in cancer cell supernatant at pH 6.5. These data further support our observations that acidic pH can alter circadian rhythms of macrophages both alone and in combination with various factors in the TME."

      And, in the Discussion:

      "We have shown that various stimuli can alter rhythms of macrophages in a complex and contributing manner, including polarizing stimuli, acidic pH, and lactate. TGFβ is produced by a variety of cells within the TME, and was recently identified as a signal that can modulate circadian rhythms[123, 124]. Additionally, when we exposed macrophages to cancer cell-conditioned media, rhythms were modulated in a manner distinct from acidic pH or lactate, with these changes in rhythms partially reversed by neutralization of the cancer cell-conditioned media pH (Supplementary Figure 5). It is conceivable that, in addition to acidic pH, other stimuli in the TME are influencing circadian rhythms to drive population-level disorder that we observed by CCD."

      Arg1 alone is not sufficient as an M2 polarization marker. The authors should include additional markers.

      We thank Reviewer #1 for bringing up this critical point in experimental rigor. While Arg1 is a commonly-used marker for M2 polarization, Reviewer #1 points out that polarization of macrophages is typically assessed by a full panel of markers characteristic of the M2 state. To address this point, we have expanded our panel to include several other markers of M2 polarization in mice such as Retnla, Ym1, MGL1, and CD206. In response to Reviewer 2's major point 2 and Reviewer 3's major point 4 below, we have also expanded our panel of markers used to assess the M1 polarization state with Tnfa, Il1b, and Il6. We have added these data, shown below, to Supplementary Figure 1 and updated the text appropriately. Please see the new Figure and updated text below.

      "Consistent with previous studies, we found that genes associated with anti-inflammatory and pro-resolution programming characteristic of IL-4 and IL-13-stimulated macrophages such as Arg1, Retnla, Chil3 (Ym1), Clec10a (MGL1), and Mrc1 (CD206) were induced in IL-4 and IL-13-stimulated macrophages, but not IFNγ and LPS-stimulated macrophages. In contrast, genes associated with pro-inflammatory activity characteristic of IFNγ and LPS-stimulated macrophages such as Nos2 (iNOS), Tnfa, Il1b, and Il6 were induced in IFNγ and LPS-stimulated macrophages, but not IL-4 and IL-13-stimulated macrophages (Supplementary Figure 1)[28, 30, 65, 71, 74, 75]. This indicates that macrophages stimulated with IL-4 and IL-13 were polarized toward a pro-resolution phenotype, while macrophages stimulated with IFNγ and LPS were polarized toward a pro-inflammatory phenotype."

      Significance

      While the manuscript provides valuable insights and has obvious novelty, it requires a significant revision.

      We thank Reviewer #1 for their deep read of our manuscript, and their helpful feedback and suggestions. As shown by the comments above, we are confident we have fully addressed each of the points that were made to result in a much-improved revised manuscript.

      Reviewer #2

      Evidence, reproducibility and clarity

      Knudsen-Clark et al. showed that the circadian rhythm of bone marrow-derived macrophages (BMDM) can be affected by polarization stimuli, pH of the microenvironment, and by the presence of sodium-lactate. Mechanistically, the acidic pH of cell microenvironment is partly regulated by intracellular cAMP-mediated signaling events in BMDM. The authors also showed that the circadian clock of peritoneal macrophages is also modified by the pH of the cell microenvironment. Using publicly available data, the authors showed that the circadian rhythm of tumor-associated macrophages is similar to that of Bmal1-KO peritoneal macrophages. In a murine model of pancreatic cancer, the authors showed that the tumor growth is accelerated in C57BL/6 mice co-injected with cancer cells and Bmal1-KO BMDM as compared to mice co-injected with cancer cell and wild type BMDM.

      We thank Reviewer #2 for their insightful and helpful comments and feedback. Their Review guided key clarifying experiments and additions to the Discussion and Methods. To summarize, we added new data to Supplementary Figure 1 to characterize distinct gene expression in our different polarized macrophage populations, showed in Supplementary Figure 2 that serum shock independently induces cAMP and Icer, discussed the limitations of the artificial polarization models more clearly, and updated our Methods to better explain how macrophages were isolated from the peritoneum. We also quantified multiple immunoblots of pCREB, provided clarity in the Methods and Reviewer-only data on how our protein-extraction protocol isolates nuclear protein, better introduced the BMAL1-KO mouse model, and showed in Supplementary Figure 6 that low pH can induce oscillations in the absence of a serum shock.

      Major points of criticism: 1. Nine main figures include different experimental models in a non-systematic manner in the manuscript, and only literature-based correlation is used to link the results to each other. The authors used in vitro BMDM and peritoneal cell-based model systems to study the effects of IL4+IL13, IFNg+LPS, low pH, sodium-lactate, adenylate cyclase inhibitors on the circadian clock of macrophages. The link between these microenvironment conditions of the cells is still correlative with the tumor microenvironment: publicly available data were used to correlate the increased expression level of cAMP-activated signaling events with the presence of acidic pH of tumor microenvironment. Notably, the cell signaling messenger molecule cAMP is produced not only in response to low extracellular pH via activated GPCRs, but also upon starvation of the cell. The starvation is also relevant to this study, since the BMDM used in the in vitro culture system were starving for 24 hours before the measurement of Per2-Luc expression to monitor circadian rhythm.

      We agree with the important point that Reviewer #2 makes that our synchronization protocol of serum starvation followed by serum shock can impact the cAMP signaling pathway. Indeed, it has previously been shown that serum shock induces phosphorylation of CREM in rat fibroblasts, which is indicative of signaling through the cAMP pathway. To address this point, we have added a schematic of our synchronization protocol to Supplementary Figure 2B for additional clarity. We have also performed additional experiments to test whether cAMP signaling is induced in macrophages by our synchronization protocol. For this, we assessed downstream targets of the cAMP signaling pathway, Icer and pCREB, after serum starvation but before serum shock, and at several time points post-treatment with serum shock (Supplementary Figures 2D,E). We observed that Icer and phosphorylation of CREB are induced rapidly in macrophages upon exposure to serum shock, as early as 10 minutes for pCREB and 1 hour post-exposure for Icer. Notably, this signaling is transient and rapidly returns to baseline, with pCREB levels fully returned to baseline by 2 hours post-treatment, at which time media is replaced and the experiment begins (CT 0). These data, shown below, have been added to Supplementary Figure 2 and a discussion of these data has been added to the manuscript - please see the modified text below.

      "The synchronization protocol we use to study circadian rhythms in BMDMs involves a 24-hour period of serum starvation followed by 2 hours of serum shock. It has previously been shown that serum shock can induce signaling through the cAMP pathway in rat fibroblasts[98]. To determine whether the synchronization protocol impacts cAMP signaling in macrophages, we harvested macrophages before and after serum shock. We then assessed Icer expression and phosphorylation of cyclic AMP-response element binding protein (CREB), which occur downstream of cAMP and have been used as readouts to assess induction of cAMP signaling in macrophages[29, 96, 100]. Serum shock of macrophages following serum starvation led to rapid phosphorylation of CREB and Icer expression that quickly returned to baseline (Supplementary Figure 2D,E). This indicates that serum starvation followed by serum shock in the synchronization protocol we use to study circadian rhythms in BMDMs induces transient signaling through the cAMP signaling pathway. "

      The definition of pre-resolution macrophages (MF) used across the manuscript could be argued. The authors defined BMDM polarized with IL-4 and IL-13 as pre-resolution MF. Resolution is followed by inflammation, but the IL-4 secretion does not occur in every inflammatory setting. Moreover, IL-4 and IL-13 are secreted during specific tissue environment and immunological settings involving type 2 inflammation or during germinal center reactions of the lymph nodes. • What are the characteristics of pre-resolution macrophages (MF)? The authors indicated that IL-4 and IL-13 cytokines were used to model the pre-resolution macrophages. In which pathological context are these cytokines produced and induce pre-resolution macrophages? IL-4 polarized BMDM can also produce pro-inflammatory protein and lipid mediators as compared to LPS-stimulated BMDM, and IL-4 polarized BMDM still have potent capacity to recruit immune cells and to establish type 2 inflammation.

      • The authors showed Arg1 and Vegfa qPCR data from BMDM only. Based on the literature, these MFs are anti-inflammatory cells particularly. Resolution-related MFs followed by acute inflammation are a specific subset of MFs, and the phenotype of pre-resolution MF should be described, referred, and measured specifically.

      We thank Reviewer #2 for bringing up this important point that clarity is required in describing our in vitro macrophage models. We chose the most commonly used models of in vitro macrophage polarization in the tumor immunology field, M2 (IL-4+IL-13) and M1 (IFNγ+LPS). These polarization conditions have been used for over two decades in the field, and have been well-characterized to drive a pro-inflammatory (for M1) and pro-resolution or anti-inflammatory (for M2) macrophage phenotype (Murray 2017 Annu Rev Phys). Each of these cell states has similarities in phenotype to pro-inflammatory and pro-resolution (pro-tumorigenic) macrophages found in tumors. In fact, in the literature, pro-inflammatory and pro-resolution TAMs will frequently be categorized as "M1" or "M2", respectively, even though this is a gross oversimplification (Ding 2019 J Immunol, Garrido-Martin 2020 J Immunother Cancer).

      As Reviewer #2 points out, IL-4 and IL-13 play a role in inflammatory settings, mediating protective responses to parasites and pathological responses to allergens. Importantly, IL-4 and IL-13 are also key regulators and effectors of resolution and wound repair (Allen 2023 Annu Rev Immunol). In line with this, M2 macrophages show many of the characteristics of pro-resolution programming in their gene expression profile, expressing genes associated with wound healing (e.g., Vegf) and immunoregulation (e.g., Arg1) (Mantovani 2013 J Pathol). These cells have frequently been used as a model for studying TAMs in vitro, due to the similarity in pro-resolution programming that is dysregulated/hijacked in TAMs (Biswas 2006 Blood). M2 macrophages have also been referred to as anti-inflammatory, and this is in line with their role in the type 2 response driven by IL-4 and IL-13, as this is primarily a response induced by allergy or parasites where tissue damage drives an anti-inflammatory and pro-resolution phenotype in macrophages (Pesce 2009 PLOS Pathogens and Allen 2023 Annu Rev Immunol).

      We do not assert that these in vitro models recapitulate the macrophage polarization cycle that Reviewer #2 astutely describes, and indeed, stimuli polarizing macrophages in tumors are much more diverse and complex (Laviron 2022 Cell Rep). We also fully agree with Reviewer #2 that, while IL-4 and IL-13 may exist in the tumor and be secreted by Th2 CD4 T cells (see Shiao 2015 Cancer Immunol Res), there may be multiple reasons why macrophages may be polarized to a pro-resolution, M2-like state in a tumor (in fact, exposure to low pH and lactate each independently do this, as we show in Supplementary Figure 2 and Figure 4, and was previously shown in Jiang 2021 J Immunol and Colegio 2014 Nature). Nonetheless, using the well-described M1 and M2 in vitro models allows our findings to be directly comparable to the vast literature that also uses these models, and to understand how distinct polarization states respond to low pH.

      We fully agree with Reviewer #2 that these cells must be defined more clearly in the text. We have taken care to discuss the limitations of using in vitro polarization models to study macrophages in our Limitations of the Study section. To better address Reviewer #2's concern, we have more thoroughly introduced the M2 macrophages as a model, and are clear that these are type 2-driven macrophages that share characteristics of pro-resolution macrophages. We have also added additional citations to the manuscript, including those highlighted above in our response. Finally, we have expanded our panel to better characterize the IL-4/IL-13 stimulated macrophages using more markers that have been characterized in the literature, in line with both Reviewer #2's comments and those of Reviewer #1 and Reviewer #3. Please see the updated data and text below.

      "As macrophages are a phenotypically heterogeneous population in the TME, we first sought to understand whether diversity in macrophage phenotype could translate to diversity in circadian rhythms of macrophages. To this end, we used two well-established in vitro polarization models to study distinct macrophage phenotypes[5, 60-63]. For a model of pro-inflammatory macrophages, we stimulated macrophages with IFNγ (interferon γ) and LPS (lipopolysaccharide) to elicit a pro-inflammatory phenotype[60, 64]. These macrophages are often referred to as 'M1' and are broadly viewed as anti-tumorigenic, and we will refer to them throughout this paper as pro-inflammatory macrophages[65, 66]. For a model at the opposite end of the phenotypic spectrum, we stimulated macrophages with IL-4 and IL-13[60, 67]. While these type 2 stimuli play a role in the response to parasites and allergy, they are also major drivers of wound healing; in line with this, IL-4 and IL-13-stimulated macrophages have been well-characterized to adopt gene expression profiles associated with wound-healing and anti-inflammatory macrophage phenotypes[68-71]. As such, these macrophages are often used as a model to study pro-tumorigenic macrophages in vitro and are often referred to as 'M2' macrophages; throughout this paper, we will refer to IL-4 and IL-13-stimulated macrophages as pro-resolution macrophages[66, 72, 73]. Consistent with previous studies, we found that genes associated with anti-inflammatory and pro-resolution programming characteristic of IL-4 and IL-13-stimulated macrophages such as Arg1, Retnla, Chil3 (Ym1), Clec10a (MGL1), and Mrc1 (CD206) were induced in IL-4 and IL-13-stimulated macrophages, but not IFNγ and LPS-stimulated macrophages. 
In contrast, genes associated with pro-inflammatory activity characteristic of IFNγ and LPS-stimulated macrophages such as Nos2 (iNOS), Tnfa, Il1b, and Il6 were induced in IFNγ and LPS-stimulated macrophages, but not IL-4 and IL-13-stimulated macrophages (Supplementary Figure 1)[28, 30, 65, 71, 74, 75]. This indicates that macrophages stimulated with IL-4 and IL-13 were polarized toward a pro-resolution phenotype, while macrophages stimulated with IFNγ and LPS were polarized toward a pro-inflammatory phenotype.

      In the Limitations of the Study section, we now write the following:

      "Our observations of rhythms in macrophages of different phenotypes are limited by in vitro polarization models. It is important to note that while our data suggest that pro-inflammatory macrophages have suppressed rhythms and increased rate of desynchrony, it remains unclear the extent to which these findings apply to the range of pro-inflammatory macrophages found in vivo. We use IFNγ and LPS co-treatment in vitro to model a pro-inflammatory macrophage phenotype that is commonly referred to as 'M1', but under inflammatory conditions in vivo, macrophages are exposed to a variety of stimuli that result in a spectrum of phenotypes, each highly context-dependent. The same is true for for 'M2'; different tissue microenvironment are different and pro-resolution macrophages exist in a spectrum."

      The authors used IFNg and LPS, or IL-4 and IL-13, co-treatments to polarize BMDM into type 1 (referred to as pro-inflammatory MF) and type 2 (referred to as pre-resolution MF) activation states. The comparison between these BMDM populations has limitations, since LPS induces a potent inflammatory response in MF. Single treatment with MF-polarizing cytokines would enable a more relevant comparison to study the circadian clock in classically and alternatively activated MF.

      We thank Reviewer #2 for bringing up this important point to provide additional clarity on our polarization conditions. The use of IFNγ and LPS to polarize macrophages toward a pro-inflammatory, M1 phenotype, and the use of IL-4 and IL-13 to polarize macrophages toward a pro-resolution, M2 phenotype have been commonly used for over two decades, and thus are well-characterized in the literature (please see Murray 2017 Annu Rev Phys for an extensive review on the history of these polarization models, as well as Hörhold 2020 PLOS Computational Biology, Binger 2015 JCI, McWhorter 2013 PNAS, Ying 2013 J Vis Exp for more recent studies using these models). The use of LPS alone or in combination with IFNγ, and IL-13 along with IL-4, was introduced in 1998 (Munder 1998 J Immunol). This approach was originally designed to mimic what could happen when macrophages were exposed to CD4+ Th1 cells, which produce IFNγ, or Th2 cells, which produce IL-4 and IL-13 (Munder 1998 J Immunol, Murray 2017 Annu Rev Phys). As Reviewer #2 points out, these stimuli induce potent responses, driving macrophages to adopt pro-inflammatory or pro-resolution/anti-inflammatory phenotypes that are two extremes at opposite ends of the spectrum of macrophage phenotypes (Mosser 2008 Nat Rev Immunol). Since our goal was to study rhythms of distinct macrophage phenotypes in vitro, and how TME-associated conditions such as acidic pH and lactate affect their rhythms, these cell states were appropriate for our questions. Thus, the polarization models used in this paper allowed us to achieve this goal. We include a section in the Discussion on the limitations of in vitro polarization models.

      "A critical question in understanding the role of circadian rhythms in macrophage biology is determining how different polarization states of macrophages affect their internal circadian rhythms. This is especially important considering that tumor-associated macrophages are a highly heterogeneous population. Our data indicate that compared to unstimulated macrophages, rhythms are enhanced in pro-resolution macrophages, characterized by increased amplitude and improved ability to maintain synchrony; in contrast, rhythms are suppressed in pro-inflammatory macrophages, characterized by decreased amplitude and impaired ability to maintain synchrony (Figure 1). These agree with previously published work showing that polarizing stimuli alone and in combination with each other can alter rhythms differently in macrophages[80, 81]. In a tumor, macrophages exist along a continuum of polarization states and phenotypes[18-21, 24]. Thus, while our characterizations of rhythms in in vitro-polarized macrophages provide a foundation for understanding how phenotype affects circadian rhythms of macrophages, further experiments will be needed to assess macrophages across the full spectrum of phenotypes. Indeed, alteration of rhythms may be just as highly variable and context-dependent as phenotype itself."

      There are missing links between the results of showing the circadian rhythm of polarized BMDM, sodium-lactate treated BMDM, and tumor growth. Specifically, do the used pancreatic ductal adenocarcinoma cells produce IL-4 and sodium-lactate? In the LLC-based experimental in silico analysis of tumors, the LLC do not produce IL-4.

      Reviewer #2 raises important points about the sources of lactate and IL-4 in tumors, which bear on the relevance of our investigation of how these factors alter rhythms in macrophages. Tumor-infiltrating Th2 CD4 T cells are potential sources of IL-4 and IL-13 in the tumor (see Shiao 2015 Cancer Immunol Res). Various cells in the tumor can produce lactate. We discuss this in both the Introduction and the Results: poor vascularization of tumors results in hypoxic areas, where cells are pushed toward glycolysis to survive and thus secrete increased glycolytic waste products such as protons and lactate. Lactate is the ionized form of lactic acid, and sodium L-lactate is its sodium salt.

      How can the circadian rhythm affect the function of BMDM? The Authors should provide evidence that circadian rhythm affects the function of polarized MF.

      We agree with Reviewer #2 that the next step is to determine how altered rhythms influence the function of macrophages. This will be the topic of future work, but is outside the scope of this paper. Our contribution with this paper is providing the first evidence that rhythms are altered in the TME and that TME-associated conditions can alter rhythms in macrophages. We have added what is currently known about how circadian rhythms influence macrophage function to the Discussion section to facilitate a conversation about this important future direction. Please see the updated text below.

      "Considering our observations that conditions associated with the TME can alter circadian rhythms in macrophages, it becomes increasingly important to understand the relevance of macrophage rhythms to their function in tumors. It has been shown that acidic pH and lactate can each drive functional polarization of macrophages toward a phenotype that promotes tumor growth, with acidic pH modulating phagocytosis and suppressing inflammatory cytokine secretion and cytotoxicity[28, 30, 93]. However, how the changes in circadian rhythms of macrophages driven by these conditions contributes to their altered function remains unknown. Current evidence suggests that circadian rhythms confer a time-of-day-dependency on macrophage function by gating the macrophage response to inflammatory stimuli based on time-of-day. As such, responses to inflammatory stimuli such as LPS or bacteria are heightened during the active phase while the inflammatory response is suppressed during the inactive phase. An important future direction will be to determine how changes in circadian rhythms of macrophages, such as those observed under acidic pH or high lactate, influences the circadian gating of their function."

      In Figure 3, the authors show data from peritoneal cells. The isolated peritoneal cells are not pure macrophage populations. Based on the referred article in the manuscript, the peritoneal cavity contains more than 50% of lymphocytes, and the myeloid compartment contains 80% macrophages.

      Reviewer #2 raises important concerns about the purity of the peritoneal population used in our experiments. We enrich for peritoneal macrophages from the peritoneal exudate cells by removing non-adherent cells in culture. This is described in our Methods section and is a method of isolation that is commonly used in the field, as lymphocytes are non-adherent. In addition to the source cited in the paper within our Methods section (Goncalves 2015 Curr Prot Immunol), please see Layoun 2015 J Vis Exp, de Jesus 2022 STAR Protocols, and Harvard HLA Lab protocol - macrophages enriched in this manner have been shown to be over 90% pure. We have modified our Methods section to make this clear, and added the additional references in this response to this section of our Methods. Please see the modified text below.

      "Peritoneal exudate cells were harvested from mice as previously published[137]. To isolate peritoneal macrophages, peritoneal exudate cells were seeded at 1.2*106 cells/mL in RPMI/10% HI FBS supplemented with 100U/mL Penicillin-Streptomycin and left at 37⁰C for 1 hour, after which non-adherent cells were rinsed off[136]. Isolation of peritoneal macrophages using this method has been shown to yield a population that is over 90% in purity[138, 139]. Peritoneal macrophages were then cultured in Atmospheric Media at pH 7.4 or 6.5 with 100μM D-luciferin, and kept at 37⁰C in atmospheric conditions."

      The figure legend of Figure 3 describes the effects of pH on the circadian rhythm of bone marrow-derived macrophages ex vivo. Peritoneal macrophages involve tissue resident peritoneal macrophages with yolk sac and fetal liver origin, and also involve small peritoneal MF with bone marrow origin. The mismatched description of results and figure legends causes confusion.

      We are very grateful to Reviewer #2 for pointing out our typo. We have fixed the caption of Figure 3 to properly describe the data as "peritoneal macrophages ex vivo".

      In Figure 6C, one single Western blot is shown without any quantification. The authors should provide data of the relative protein level of p-CREB from at least 3 independent experiments. In the Western-blot part of the methods, the authors described that the pellet was discarded after cell lysis. The p-CREB is the activated form of the transcription factor CREB and there is increased binding to the chromatin to regulate gene expression. By discarding the pellet after cell lysis, the chromatin-bound p-CREB could be also removed at the same time.

      We thank Reviewer #2 for bringing up this point. We agree that quantification is an important aspect of western blot. We have repeated the experiment for n=3 and provide quantification of pCREB normalized to total protein. We have added these data, shown below, to Figure 5.
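
      As an illustration of what normalizing pCREB to total protein across three independent experiments involves, the following is a minimal sketch only. The densitometry values are invented placeholders rather than data from the study, and the function names `normalize_to_total_protein` and `summarize` are hypothetical, not part of any analysis described in the manuscript.

```python
from math import sqrt
from statistics import mean, stdev


def normalize_to_total_protein(pcreb, total_protein):
    """Divide each pCREB band intensity by the total-protein signal of the same lane."""
    if len(pcreb) != len(total_protein):
        raise ValueError("need one total-protein value per pCREB value")
    return [p / t for p, t in zip(pcreb, total_protein)]


def summarize(values):
    """Return (mean, standard error of the mean) across independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Placeholder densitometry values for n=3 independent experiments (NOT real data).
pcreb_ph74 = [100.0, 120.0, 110.0]
total_ph74 = [1000.0, 1000.0, 1000.0]
pcreb_ph65 = [300.0, 330.0, 360.0]
total_ph65 = [1000.0, 1100.0, 1200.0]

norm_74 = normalize_to_total_protein(pcreb_ph74, total_ph74)
norm_65 = normalize_to_total_protein(pcreb_ph65, total_ph65)

mean_74, sem_74 = summarize(norm_74)
mean_65, sem_65 = summarize(norm_65)

# Fold change of the normalized pCREB signal at pH 6.5 relative to pH 7.4.
fold_change = mean_65 / mean_74
```

      Normalizing each lane before averaging keeps loading differences between lanes from being mistaken for changes in phosphorylation.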

      Reviewer #2 also expressed concern that we may not be capturing all of the CREB due to nuclear localization and chromatin binding. We specifically chose the lysis buffer M-Per, which is formulated to lyse the nucleus and solubilize nuclear and chromatin-bound proteins. To demonstrate this, we show in the below Figure to the Reviewer that the nuclear protein p85 is solubilized and readily detectable by western blot using our protein extraction method.

      We have also added an additional sentence in the Methods section for clarity - please see the modified text below.

      "Cells were lysed using the M-Per lysis reagent (Thermo Scientific, CAT#78501), supplemented with protease and phosphatase inhibitor cocktail (1:100; Sigma, CAT#PPC1010) and phosphatase inhibitor cocktail 2 (1:50; Sigma, CAT#P5726), with 200μM deferoxamine (Sigma, CAT#D9533). M-Per is formulated to lyse the nucleus and solubilize nuclear and chromatin-bound proteins, allowing isolation of nuclear proteins as well as cytosolic proteins. Lysates were incubated on ice for 1 hour, then centrifuged at 17,000 xg to pellet out debris; supernatant was collected."

      It is confusing that adenylate-cyclase inhibitor MDL-12 elevated the phospho-CREB levels in BMDM. How can the authors exclude any other inducers of CREB phosphorylation?

      We agree with Reviewer #2 that it is surprising pCREB was elevated with MDL-12 treatment alone, and we do indeed think that there are other pathways contributing to this. We have addressed this point in the Discussion - please see the text below.

      "The mechanism through which acidic pH can modulate the circadian clock in macrophages remains unclear. Evidence in the literature suggests that acidic pH promotes a pro-resolution phenotype in macrophages by driving signaling through the cAMP pathway[29]. It has previously been shown that cAMP signaling can modulate the circadian clock[99]. However, our data indicated that cAMP signaling was not fully sufficient to confer pH-mediated changes in circadian rhythms of macrophages (Figure 5A,B). Treatment with MDL-12, commonly known as an inhibitor of adenylyl cyclase[29, 117], resulted in suppression of pH-induced changes in amplitude of circadian rhythms but did not inhibit signaling through the cAMP signaling pathway (Figure 5C,D). While MDL-12 is commonly used as an adenylyl cyclase inhibitor, it has also been documented to have inhibitory activity toward phosphodiesterases (PDEs) and the import of calcium into the cytosol through various mechanisms[118, 119]. This is of particular interest, as calcium signaling has also been shown to be capable of modulating the circadian clock[120]. Furthermore, while acid-sensing through GPCRs have been the most well-characterized pathways in macrophages, there remain additional ways in which acidic pH can be sensed by macrophages such as acid-sensing ion channels[121, 122]. Further work is required to understand the signaling pathways through which pH can influence macrophage phenotype and circadian rhythms."

      It is described in the methods that BMDM were starving for 24 hours in serum-free culture media followed by serum shock (50% FBS). The cAMP production can be induced during cell starvation which should be considered for the data representation.

      We appreciate that Reviewer #2 points out that our synchronization protocol of serum starvation followed by serum shock may impact the cAMP signaling pathway in macrophages, as serum shock has been shown to induce pCREB, a downstream mediator of cAMP signaling, in rat fibroblasts. Indeed, we show in additional experiments performed (in response to Reviewer #2's major comment 1) evidence that cAMP signaling is induced in macrophages following the serum shock phase of our synchronization protocol, as indicated by elevation of Icer and pCREB. As we note above, this induction is transient and returns to baseline by 2 hours post-serum shock, the time at which we replace media and begin our experiments (CT 0).

      Despite the transient nature of cAMP induction by our synchronization protocol, we agree wholeheartedly with Reviewer #2 that this must be considered in light of our experimental system in which we are studying the effect of acidic pH on circadian rhythms of macrophages, which in itself induces signaling through the cAMP signaling pathway. To address Reviewer #2's point, we have performed experiments in which we culture unstimulated BMDMs in neutral pH 7.4 or acidic pH 6.5, without prior serum starvation and serum shock (i.e. we do not submit these BMDMs to the synchronization protocol). We then observed circadian rhythms of Per2-Luc by LumiCycle to determine whether acidic pH alters circadian rhythms of BMDMs in the absence of prior serum starvation followed by serum shock. Similar to our observations in Figure 2, circadian rhythms of macrophages at pH 6.5 had increased amplitude and shortened period compared to rhythms of macrophages at pH 7.4. This indicates that pH-driven changes in circadian rhythms observed in our system are not due to the synchronization protocol. The data, shown below, have been placed in a new Supplementary Figure 6, and a discussion of these results has been added to the Results section - please see the updated text below.

      "As acidic pH induces signaling through the cAMP pathway, we sought to determine whether acidic pH independently contributed to the pH-driven changes in circadian rhythms we observe in BMDMs. To test this, we omitted the synchronization step and observed BMDM rhythms by LumiCycle when cultured in neutral pH 7.4 or acidic pH 6.8 or pH 6.5 (Supplementary Figure 6). Circadian rhythms of BMDMs cultured at pH 6.5 exhibited similar changes as previously observed, with enhanced amplitude and shortened period relative to BMDMs at pH 7.4. This indicates pH-driven changes observed in circadian rhythms of BMDMs occur in the absence of prior serum starvation and serum shock. "As acidic pH independently induces signaling through the cAMP pathway, we sought to determine whether acid pH could also independently contribute to the pH-driven changes in circadian rhythms we observe in BMDMs. To test this, we omitted the synchronization step and observed BMDM rhythms by LumiCycle when cultured in neutral pH 7.4 or acidic pH 6.8 or pH 6.5 (Supplementary Figure 6). Circadian rhythms of BMDMs cultured at pH 6.5 exhibited similar changes as previously observed, with enhanced amplitude and shortened period relative to BMDMs at pH 7.4. This indicates pH-driven changes observed in circadian rhythms of BMDMs occur in the absence of prior serum starvation and serum shock."

      How can the authors explain and prove that the wild type and Bmal1-KO BMDM co-injected with pancreatic cancer cells subcutaneously survive, are present, and have effector functions to the same extent in the subcutaneous tissue, before and during tumor growth (Figure 9)? In other words, what kind of MF-derived parameters could be modified by disrupting the circadian rhythm of MF during tumor development? Is the production of MF-derived regulatory enzymes, cytokines, and growth factors affected by the disrupted circadian clock in MF?

      Reviewer #2 poses the very important question of why we see differences in tumor growth in our co-injection model, and what might be driving it. Of note, co-injection models of tumor growth are commonly used to determine macrophage-specific roles in tumor growth (Colegio 2014 Nature, Mills 2019 Cell Rep, Lee 2018 Nat Comm). We observed that tumor growth is altered when macrophages with disrupted circadian rhythms (BMAL1 KO) are co-injected compared to when macrophages with intact circadian rhythms (WT) are co-injected in a murine model of pancreatic cancer using KCKO cells. Our observation is supported by a previously published paper that used a co-injection model of melanoma, which we cite in the manuscript (Alexander 2020 eLife). What drives this difference in tumor growth remains an open question that is the subject of future work and is outside the scope of this paper, which focuses on our discovery that factors associated with the tumor microenvironment can alter circadian rhythms in macrophages. We have included a discussion on what is currently known about how circadian rhythms alter macrophage function, acknowledging that we have yet to answer these important questions and identifying this as an area of interest for future work. Please see the text below.

      "Considering our observations that conditions associated with the TME can alter circadian rhythms in macrophages, it becomes increasingly important to understand the relevance of macrophage rhythms to their function in tumors. It has been shown that acidic pH and lactate can each drive functional polarization of macrophages toward a phenotype that promotes tumor growth, with acidic pH modulating phagocytosis and suppressing inflammatory cytokine secretion and cytotoxicity[28, 30, 93]. However, how the changes in circadian rhythms of macrophages driven by these conditions contributes to their altered function remains unknown. Current evidence suggests that circadian rhythms confer a time-of-day-dependency on macrophage function by gating the macrophage response to inflammatory stimuli based on time-of-day. As such, responses to inflammatory stimuli such as LPS or bacteria are heightened during the active phase while the inflammatory response is suppressed during the inactive phase. An important future direction will be to determine how changes in circadian rhythms of macrophages, such as those observed under acidic pH or high lactate, influences the circadian gating of their function. Data from our lab and others suggest that disruption of the macrophage-intrinsic circadian clock accelerates tumor growth, indicating that circadian regulation of macrophages is tumor-suppressive in models of PDAC (our work) and melanoma [109]. This agrees with complementary findings that behavioral disruption of circadian rhythms in mice (through chronic jetlag) disrupts tumor macrophage circadian rhythms and accelerates tumor growth[56]. It remains unclear whether this is through the pro-tumorigenic functions of macrophages such as extracellular matrix remodeling or angiogenesis, through suppression of the anti-tumor immune response, or a combination of both functions. Further work will be needed to tease apart these distinctions."

      Minor points of criticism: 1. The figure legends of the graphs and diagrams are missing in Figure 2D,E,F

      We thank Reviewer #2 for pointing out that figure legends were missing. We have added legends for Figure 2D,E,F.

      The BMAL1-based in vivo murine model of circadian rhythm is not introduced in the manuscript.

      We thank Reviewer #2 for bringing to our attention that the BMAL1 KO macrophage model was not well-introduced in the manuscript. To address this point, we have modified the text to better introduce this model. Please see the modified text below.

      "As a positive control for circadian clock disruption, we used data from BMAL1 KO peritoneal macrophages [44]. BMAL1 KO macrophages have a genetic disruption of the circadian clock due to the loss of Bmal1, the central clock gene. As a result, circadian rhythms of BMAL1 KO macrophages are disrupted, lacking rhythmicity and downstream circadian regulation of macrophage function (Supplementary Figure 8)[45, 54]. "As a positive control for circadian clock disruption, we used data from BMAL1 KO peritoneal macrophages [44]. BMAL1 KO macrophages have a genetic disruption of the circadian clock due to the loss of Bmal1, the central clock gene. As a result, circadian rhythms of BMAL1 KO macrophages are disrupted, lacking rhythmicity and downstream circadian regulation of macrophage function (Supplementary Figure 8)[45, 54]."__ __

      Significance

      Knudsen-Clark et al. showed that the circadian rhythm of bone marrow-derived macrophages (BMDM) can be affected by polarization stimuli, pH of the microenvironment, and by the presence of sodium-lactate. Mechanistically, the acidic pH of the cell microenvironment is partly regulated by intracellular cAMP-mediated signaling events in BMDM. The authors also showed that the circadian clock of peritoneal macrophages is also modified by the pH of the cell microenvironment. Using publicly available data, the authors showed that the circadian rhythm of tumor-associated macrophages is similar to that of Bmal1-KO peritoneal macrophages. In a murine model of pancreatic cancer, the authors showed that tumor growth is accelerated in C57BL/6 mice co-injected with cancer cells and Bmal1-KO BMDM as compared to mice co-injected with cancer cells and wild type BMDM.

      We are grateful to Reviewer #2 for their very helpful comments and suggestions, which we believe have greatly enhanced the clarity and reproducibility of this manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Review for Knudsen-Clark et al.

      "Circadian rhythms of macrophages are altered by the acidic pH of the tumor microenvironment"

      Knudsen-Clark and colleagues explore the impact of TME alterations on macrophage circadian rhythms. The authors find that both acidic pH and lactate modulate circadian rhythms which alter macrophage phenotype. Importantly, they define circadian disruption of tumor-associated macrophages within the TME and show that circadian disruption in macrophages promotes tumor growth using a PDAC line. This represents an important understanding of the crosstalk between cancer cells and immune cells as well as the understanding of how the TME disrupts circadian rhythms. The study is well-done, however, authors need to address several important points below.

      We thank Reviewer #3 for their in-depth and insightful comments and suggestions, which have resulted in a much-improved manuscript. We were pleased that Reviewer #3 found the work to be "an important study that is well-done" and that it "represents an important understanding of the crosstalk between cancer cells and immune cells as well as the understanding of how the TME disrupts circadian rhythms." In response to Reviewer #3's comments, we have added several new key experiments and changes to the text. To summarize, we added new data to Supplementary Figure 1 to better characterize our macrophage polarization states, showed in Figure 3 that low pH affects peritoneal macrophage circadian gene expression in a similar fashion as bone marrow-derived macrophages, added new data in Figure 4 to show how lactate and low pH affect circadian gene expression over time, and added new computational analysis to Figures 6, 7, and Supplementary Figure 9 to probe circadian gene covariance from publicly available data. We also made several key additions to the Discussion to discuss the functional implications of macrophage circadian rhythm disruption by low pH and potential mechanisms of this disruption. Finally, at the request of Reviewer #3, we consolidated several existing Figures and added new data, where appropriate, to existing figures, and we worked to describe new findings succinctly.

      Major comments:

      • In Figures 3 and 4, the authors can include additional clock genes that can be run by qPCR. This was done in Figure 2 and was a nice addition to the data.

      We agree with Reviewer #3's suggestion that an analysis of clock gene expression at the mRNA level would enhance our data in Figures 3 and 4. To address this point, we have performed short time course experiments to assess circadian clock gene expression over time in BMDMs cultured with or without lactate at neutral or acidic pH (for Figure 4). In line with the difference in circadian rhythms of Per2-Luc levels between BMDMs cultured in the presence or absence of lactate, which we observed by LumiCycle analysis, we measured changes in expression of the circadian clock genes Per2, Nr1d1, and Cry1 between macrophages cultured with 25 mM sodium-L-lactate and those cultured with 0 mM sodium-L-lactate at pH 6.5. We have added these data, shown below, to Figure 4, and updated the manuscript accordingly to discuss these results. Please see below for the new Figure Panel and modified text.

      "Lactate was also observed to alter expression of the circadian clock genes Per2, Cry1, and Nr1d1 over time in BMDMs cultured at pH 6.5, while having more subtle effects at pH 7.4 (Figure 4C). Notably, lactate blunted the effect of pH 6.5 on Cry1 expression, while enhancing the effect of low pH on Nr1d1 expression. In all, these data indicate that concentration of lactate similar to that present in the TME can influence circadian rhythms and circadian clock gene expression of macrophages."

      As an additional measure to address Reviewer #3's point about Figure 3 (peritoneal macrophages), we have compared expression of circadian clock genes in peritoneal macrophages cultured at neutral pH 7.4 or acidic pH 6.8 for 24 hours using a publicly available RNA-seq data set from Jiang 2021 J Immunol (GSE164697). In line with previous observations in macrophages cultured under acidic compared to neutral pH conditions, including the clock gene expression data from Figure 2 in BMDMs and the Per2-Luc levels observed in peritoneal macrophages in Figure 3, we found that peritoneal macrophages exhibited differences in expression of circadian clock genes when cultured at acidic pH 6.8 compared to neutral pH 7.4. We have added these data, shown below, as Figure 3B, and have updated the manuscript accordingly - please see below for the new Figure panel and modified text.

      "To test whether pH-driven changes in circadian rhythms of peritoneal macrophages were reflected at the mRNA level, we compared expression of circadian clock genes in peritoneal macrophages cultured at neutral pH 7.4 or acidic pH 6.8 for 24 hours using publicly available RNA-sequencing data [30]. In line with altered circadian rhythms observed by Lumicycle, peritoneal macrophages cultured at pH 6.8 expressed different levels of circadian clock genes than peritoneal macrophages culture at pH 7.4 (Figure 3B). The trends in changes of gene expression in peritoneal macrophages cultured at pH 6.8 matched what we observed in BMDMs, where low pH generally led to higher levels of circadian clock gene expression (Figure 2D-F). These data support our observations by LumiCycle and indicate that acidic pH drives transcriptional changes in multiple components of the circadian clock. In all, these data are evidence that pH-dependent changes in circadian rhythms are relevant to in vivo-differentiated macrophages."

      We have also updated the Methods section accordingly:

      "FASTQ files from a previously published analysis of peritoneal macrophages cultured under neutral pH 7.4 or acidic pH 6.8 conditions were downloaded from NCBI GEO (accession #GSE164697) [30]."

      2) There are far too many figures with minimal data in each. Please consolidate the figures. For example, Figures 1-3 can be fully combined, Figures 4-6 can be combined, and Figures 7-8 can be combined. Additionally, it is unclear if Figure 5 needs to be in the main, it can be moved to the supplement.

      We appreciate the preference of Reviewer #3 to see some of the figures consolidated. We have combined Figures 5 and 6 into a single new Figure 5. Additionally, we have added new data from revisions to current figures to increase the amount of data in each figure and minimize the number of new figures generated. In all, despite the large amount of new data added to the paper in response to Reviewer comments and suggestions (including additional data in Figure 4 and new Figures 6 and 8), our manuscript now contains 10 main Figures, only one more than the initial submission.

      3) The observation that conditions like pH and lactate alter macrophage phenotype and rhythmicity is important. However, macrophage phenotype via gene expression does not always correlate to function. It is important for authors to demonstrate the effect of pH or lactate on macrophage function. This can be done using co-culture assays with cancer cells. If these experiments cannot be performed, it is suggested that authors discuss these limitations and considerations in the discussion.

      Reviewer #3 correctly points out that changes in phenotype do not always correlate to changes in function. Others have shown that acidic pH and lactate can each alter macrophage phenotype, and also alter macrophage function and the ability to promote tumor growth (please see El-Kenawi 2019 Br J Cancer, Jiang 2021 J Immunol, Colegio 2014 Nature). How changes in rhythms influence macrophage function remains unknown, and we agree with Reviewer #3 that this is an important future direction. We have added a section in the Discussion to address this important future direction. Please see the text below.

      "Considering our observations that conditions associated with the TME can alter circadian rhythms in macrophages, it becomes increasingly important to understand the relevance of macrophage rhythms to their function in tumors. It has been shown that acidic pH and lactate can each drive functional polarization of macrophages toward a phenotype that promotes tumor growth, with acidic pH modulating phagocytosis and suppressing inflammatory cytokine secretion and cytotoxicity[28, 30, 93]. However, how the changes in circadian rhythms of macrophages driven by these conditions contributes to their altered function remains unknown. Current evidence suggests that circadian rhythms confer a time-of-day-dependency on macrophage function by gating the macrophage response to inflammatory stimuli based on time-of-day. As such, responses to inflammatory stimuli such as LPS or bacteria are heightened during the active phase while the inflammatory response is suppressed during the inactive phase. An important future direction will be to determine how changes in circadian rhythms of macrophages, such as those observed under acidic pH or high lactate, influences the circadian gating of their function."

      4) On line 119-122, authors describe a method for polarization of macrophages. They then reference one gene to confirm each macrophage polarization state. To more definitively corroborate proper macrophage polarization, authors should perform qPCR for additional target genes that are associated with each phenotype. For example, Socs3, CD68, or CD80 for M1, and CD163 or VEGF for M2. Alternatively, the authors should cite previous literature validating this in vitro polarization model.

      We appreciate Reviewer #3's suggestion to better characterize the phenotypic identity of our polarization models with additional canonical markers. To address this point, we have expanded our panel using transcriptional markers commonly used in the murine polarization model for M1 macrophages such as Tnfa, Il6, and Il1b. As discussed in the response to Reviewer #1's minor point 5 and Reviewer #2's major point 2, we have also expanded our panel to include additional markers for M2 such as Vegf, Retnla, Ym1, Mgl1, and CD206. We have added these new data to Supplementary Figure 1. Finally, we have added additional citations for the in vitro polarization models. Please see the modified text and new data, below.

      "As macrophages are a phenotypically heterogeneous population in the TME, we first sought to understand whether diversity in macrophage phenotype could translate to diversity in circadian rhythms of macrophages. To this end, we used two well-established in vitro polarization models to study distinct macrophage phenotypes[5, 60-63]. For a model of pro-inflammatory macrophages, we stimulated macrophages with IFNγ (interferon γ) and LPS (lipopolysaccharide) to elicit a pro-inflammatory phenotype[60, 64]. These macrophages are often referred to as 'M1' and are broadly viewed as anti-tumorigenic, and we will refer to them throughout this paper as pro-inflammatory macrophages[65, 66]. For a model at the opposite end of the phenotypic spectrum, we stimulated macrophages with IL-4 and IL-13[60, 67]. While these type 2 stimuli play a role in the response to parasites and allergy, they are also major drivers of wound healing; in line with this, IL-4 and IL-13-stimulated macrophages have been well-characterized to adopt gene expression profiles associated with wound-healing and anti-inflammatory macrophage phenotypes[68-71]. As such, these macrophages are often used as a model to study pro-tumorigenic macrophages in vitro and are often referred to as 'M2' macrophages; throughout this paper, we will refer to IL-4 and IL-13-stimulated macrophages as pro-resolution macrophages[66, 72, 73]. Consistent with previous studies, we found that genes associated with anti-inflammatory and pro-resolution programming characteristic of IL-4 and IL-13-stimulated macrophages such as Arg1, Retnla, Chil3 (Ym1), Clec10a (MGL1), and Mrc1 (CD206) were induced in IL-4 and IL-13-stimulated macrophages, but not IFNγ and LPS-stimulated macrophages. 
In contrast, genes associated with pro-inflammatory activity characteristic of IFNγ and LPS-stimulated macrophages such as Nos2 (iNOS), Tnfa, Il1b, and Il6 were induced in IFNγ and LPS-stimulated macrophages, but not IL-4 and IL-13-stimulated macrophages (Supplementary Figure 1)[28, 30, 65, 71, 74, 75]. This indicates that macrophages stimulated with IL-4 and IL-13 were polarized toward a pro-resolution phenotype, while macrophages stimulated with IFNγ and LPS were polarized toward a pro-inflammatory phenotype.

      5) Several portions of the manuscript are unnecessarily long, including the intro and discussion. Please consolidate the text. The results section is also very lengthy, please consider consolidation.

      We appreciate Reviewer #3's preference for a shorter manuscript. The revised manuscript, in response to the many Reviewer comments and requests, contains many new pieces of data, and we have taken care to describe these new data as briefly and simply as possible. In preparation for this Revision, we also removed and shortened several sections of the Results and Discussion where we felt extra explanation was not necessary. We will work with the editor of the journal to which we submit to ensure that the length of the manuscript sections complies with the journal's guidelines.

      6) The authors find that macrophage phenotype impacts rhythmicity. However, there is no mechanistic understanding of why this occurs. The authors should provide some mechanistic insight on this topic in the discussion.

      We agree with Reviewer #3 that while the mechanism by which macrophage phenotype alters rhythms remains unknown, this is an important topic of discussion. While there is some literature on how circadian rhythms modulate inflammatory response (and hints at how it may influence phenotype) in macrophages, there is very little on the converse: how phenotype may influence circadian rhythms. We address this point by expanding on our Discussion - please see the modified text below.

      "Elucidating the role of circadian rhythms in regulation of macrophage biology necessitates a better understanding of the crosstalk between phenotype and circadian rhythms. Although lactate polarizes macrophages toward a pro-resolution phenotype similar to acidic pH[30, 93], exposure to lactate had different effects on circadian rhythms - and in some cases, circadian clock gene expression - than exposure to acidic pH (Figure 4). Sensing of lactate occurs through different pathways than acid-sensing, which may contribute to the different ways in which these two stimuli modulate circadian rhythms of macrophages[111]. One previously published finding that may offer mechanistic insight into how phenotype can influence circadian rhythms is the suppression of Bmal1 by LPS-inducible miR-155[54]. It has also been observed that RORα-mediated activation of Bmal1 transcription is enhanced by PPARγ co-activation[112]. In macrophages, PPARγ expression is induced upon stimulation with IL-4 and plays a key role in alternative activation of macrophages, promoting a pro-resolution macrophage phenotype, and supporting resolution of inflammation[113-115]. Such observations prompt the question of whether there are yet-unidentified factors induced downstream of various polarizing stimuli that can modulate expression of circadian genes at the transcriptional and protein levels. Further work is required to understand the interplay between macrophage phenotype and circadian rhythms."

      7) The data presented in Figure 9 is very intriguing and arguably the strongest aspect of the paper. To strengthen the point, the authors could repeat this experiment with an additional cell model, another PDAC line or a different cancer line.

      We appreciate Reviewer #3's comment about the impact of the tumor growth data. Indeed, our finding that deletion of Bmal1 in co-injected macrophages accelerated PDAC growth has been recapitulated by others in different cancer models. This lends strength to our observations. We discuss and cite complementary work on macrophage rhythms and tumor growth in other models of cancer in the Discussion; please see below.

      "Data from our lab and others suggest that disruption of the macrophage-intrinsic circadian clock accelerates tumor growth, indicating that circadian regulation of macrophages is tumor-suppressive in models of PDAC (our work) and melanoma [109]. This agrees with complementary findings that behavioral disruption of circadian rhythms in mice (through chronic jetlag) disrupts tumor macrophage circadian rhythms and accelerates tumor growth[56]."

      Minor Comments:

      1) Data in Figure 2 are interesting and the impact on circadian rhythms is clear based on changes in amplitude and period. However, though the impact on period and amplitude is clear from Figures 2A-C, the changes in circadian gene expression are less clear. For instance, though amplitude is up in 2B, amplitude is suppressed in 2C. However, that does not appear to be reflected in the gene expression data in Figures 2E and F. The authors should comment on this.

      Reviewer #3 correctly points out that there appear to be discrepancies between the LumiCycle data in Figure 2 and the circadian gene expression data in Figure 2. This discrepancy is perhaps unsurprising given that the gene expression data cover only a short time course of 12 hours, while the LumiCycle data are collected over a course of 3 days. The gene expression data do not allow us to determine changes in period or rhythm. Another point of interest is that circadian regulation has been shown to occur on many different levels (transcriptional, post-transcriptional, translational, post-translational). As a result, circadian patterns observed in gene transcripts don't always match those of their encoded proteins; likewise, circadian patterns of proteins aren't always reflected in their encoding gene transcripts (Collins 2021 Genome Res). Due to this multi-level regulation, we propose that the results of the LumiCycle analysis, which measures PER2-Luc levels, are a more robust readout of rhythms because they are further downstream of the molecular clock than transcriptional readouts. That said, observing changes at both the protein level (by LumiCycle) and the transcriptional level confirms that multiple components of the clock are altered by acidic pH, even if the way in which they are altered appears to differ. We have incorporated the points we raised above into the Results section.

      Please see the modified text below.

      "Low pH was also observed to alter the expression of the circadian clock genes Per2, Cry1, and Nr1d1 (REV-ERBα) over time across different macrophage phenotypes, confirming that multiple components of the circadian clock are altered by acidic pH (Figure 2D-F). Notably, the patterns in expression of circadian genes did not always match the patterns of PER2-Luc levels observed by LumiCycle. This is perhaps unsurprising, as circadian rhythms are regulated at multiple levels (transcriptional, post-transcriptional, translational, post-translational); as a result, circadian patterns observed in circadian proteins such as PER2-Luc do not always match those of their gene transcripts[77]."

      2) On line 156-158, authors describe damping rate. I believe the authors are trying to say that damping rate increases as the time it takes cells to desynchronize decreases and vice versa. However, this point needs to be better explained.

      We thank Reviewer #3 for bringing to our attention that this was not communicated clearly in the text. We have adjusted our explanation to be clearer. Please see the modified text below.

      "Damping of rhythms in most free-running cell populations (defined as populations cultured in the absence of external synchronizing stimuli) occurs naturally as the circadian clocks of individual cells in the population become desynchronized from each other; thus, damping can be indicative of desynchrony within a population[84]. The damping rate increases as the time it takes for rhythms to dissipate decreases; conversely, as damping rate decreases as the time it takes for rhythms to dissipate increases."

      3) Data presented in Figures 3 and 4 are different in terms of the impact of changing the pH. The source of the macrophages is different, but the authors could clarify this further.

      We thank Reviewer #3 for this comment. Our conclusion is that the impact of low pH is largely similar in Figure 3 (peritoneal macrophages) and Figure 4 (BMDMs exposed to low pH and lactate). In both Figures 3 and 4, exposure to acidic pH by culturing macrophages at pH 6.5 increased amplitude, decreased period, and increased damping rate compared to macrophages cultured at neutral pH 7.4.

      4) For heatmaps shown in Figures 7 and 8, please calculate covariance and display asterisks where P

      We thank Reviewer #3 for the excellent suggestion to use an additional approach to assess circadian clock status in samples by measuring co-variance in the circadian clock gene network. To address this point, we have performed weighted gene co-expression network analysis (WGCNA) to calculate covariance, as was originally performed in Chun and Fortin et al Science Advances 2022. For the samples analyzed in Figure 7 (now Figure 6), we have added these data to the figure. We have applied this analysis to a new set of human data that we analyzed and added it to the new Figure 7. Finally, for the samples analyzed in Figure 8, we have added these data as a new Supplementary Figure 9. Please see the data and modified text below.

      Figure 6

      "Weighted gene co-expression network analysis (WGCNA) has been used as an alternate approach to measure the co-variance between clock genes and thus assess bi-directional correlations among the core clock gene network in healthy tissue and tumor samples [103]. In line with the circadian disorder observed by CCD, while many bi-directional correlations among the core clock gene network were significant and apparent in wild type peritoneal macrophages, these relationships were altered or abolished within BMAL1 KO peritoneal macrophages and TAM samples, and in some cases replaced by new relationships (Figure 6E). This indicates that there is population-level disorder in the circadian rhythms of tumor-associated macrophages in murine lung cancer."

      Figure 7

      "We next assessed the status of the circadian clock in human TAMs from NSCLC patients. We performed CCD with publicly available RNA-seq data of tumor-adjacent macrophages and tumor-associated macrophages from NSCLC patients, using alveolar macrophages from healthy donors as a control[104, 105]. To assess the contribution of the acidic TME to circadian disorder, we subset TAM NSCLC patient samples into groups (Crem high TAMs and Crem low TAMs) based on median Crem expression. Notably, in macrophages from human NSCLC there was a trend toward disorder in Crem low but not Crem high TAM samples (Figure 7A,B). Additionally, the co-variance among core clock genes observed in alveolar macrophages from healthy donors was absent within Crem low and Crem high TAM samples (Figure 7C). In all, these data indicate that there is population-level disorder in the circadian rhythms of tumor-associated macrophages in humans and mice, suggesting that circadian rhythms are indeed altered in macrophages within the TME."

      Supplementary Figure 9

      "CCD score worsened as populations became increasingly desynchronized, with the 12hr desynchronized population having a significantly worse CCD score than synchronized, homogenous macrophage population (Figure 8C). This indicates that as circadian rhythms of individual macrophages within a population become more different from each other, circadian disorder increases at the population-level. This is further supported by WGCNA, which revealed that the significant co-variance of circadian clock genes in the synchronized population was progressively altered and lost as the population is increasing desynchronized to 12 hours (Supplementary Figure 9)."

      Reviewer #3 (Significance (Required)):

      This is an important study that is well-done. It is the feeling of the reviewer that the study warrants a revision, at the discretion of the editor. The study represents an important understanding of the crosstalk between cancer cells and immune cells as well as the understanding of how the TME disrupts circadian rhythms.

      We thank Reviewer #3 for their comments regarding the impact and significance of our work. As shown by the comments above, we are confident we have fully addressed each of the points that were made to result in a much-improved revised manuscript.




    1. Author response:

      The following is the authors’ response to the original reviews.

      Editors’ recommendations for the authors

      The reviewers recommend the following: 

      (a) Digging deeper into the discussion of the density-dependent dispersal. 

      (b) Clarifying the microfluidic setup.  

      (c) Clarifying the description and interpretation of the transcriptomic evidence. 

      (d) Toning down carbon cycle connections (some reviewers felt the evidence did not fully support the claims). 

      We would like to thank the editors for their thoughtful evaluation of our manuscript and their clear suggestions. We have revised the manuscript in the light of these comments, as we outline below and address in detail in the point-by-point response to the reviewers’ comments that follows. 

      (a) We have expanded the discussion of density-dependent dispersal and revised Figure 2C to improve clarity. 

      (b) We have also added further information concerning the microfluidic setup in the results section and provide an illustration of the setup in a new figure panel, Figure 1A.

      (c) Addressing the reviewers’ comments on the transcriptomic analysis, we have added more information in the description and interpretation of the results. 

      (d) We have rephrased the text describing the role of degradation-dispersal cycles for carbon cycling to highlight it as the motivation of this study and emphasize the link to literature on foraging, without creating expectations of direct measurements of global carbon cycling.

      Public Reviews:

      Reviewer #1 (Public Review):

      [...]

      Weaknesses: 

      Much of the genetic analysis, as it stands, is quite speculative and descriptive. I found myself confused about many of the genes (e.g., quorum sensing) that pop up enriched during dispersal quite in contrast to my expectations. While the authors do mention some of this in the text as worth following up on, I think the analysis as it stands adds little insight into the behaviors studied. However, I acknowledge that it might have the potential to generate hypotheses and thus aid future studies. Further, I found the connections to the carbon cycle and marine environments in the abstract weak --- the microfluidics setup by the authors is nice, but it provides limited insight into naturalistic environments where the spatial distribution and dimensionality of resources are expected to be qualitatively different. 

      We thank the reviewer for their suggestions to improve our manuscript. We agree that the original manuscript would have benefitted from more detailed interpretation of the observed changes in gene expression. We have revised the manuscript to elaborate on the interpretation of the changes in expression of quorum sensing genes (see response to reviewer 1, comment 3), motility genes (see response to reviewer 1, comment 6), alginate lyase genes (see response to reviewer 1, comment 7 and reviewer 2, comment 2), and ribosomal and transporter genes (see response to reviewer 2, comment 2).

      In general, we think that the gene expression study not only supports the phenotypic observations that we made in the microfluidic device, such as the increased swimming motility when exposed to digested alginate medium, but also adds further insights. We studied the transcriptomes in well-mixed batch cultures because gene expression dynamics cannot be measured in our microfluidic setup, and we wanted to support the phenotypic observations about differential motility and chemotaxis. The transcriptomic data clearly show that even in well-mixed environments, growth on digested alginate instead of alginate is sufficient to increase the expression of motility and chemotaxis genes. In addition, the finding that expression of alginate lyases and metabolic genes is increased during growth on digested alginate was revealed through the analysis of transcriptomes, something which would not have been possible in the microfluidic setup. We agree with the reviewer that our analyses implicate further, perhaps unexpected, mechanisms like quorum sensing in the cellular response to breakdown products, and that this represents an interesting avenue for further studies.

      Finally, we also agree with the reviewer that it would be good to be more explicit in the text that our microfluidic system cannot fully capture the complex dynamics of natural environments. Our approach does, however, allow the characterization of cellular behaviors at spatial and temporal scales that are relevant to the interactions of bacteria, and thus provides a better understanding of colonization and dispersal of marine bacteria in a manner that is not possible through in situ experiments. We have edited our manuscript to highlight this and modified our statements regarding carbon cycling to emphasize the role of degradation-dispersal cycles in the remineralization of polysaccharides (see response to reviewer 1, comment 2).

      Reviewer #2 (Public Review):

      [...]

      Weaknesses: 

      The explanation of the microfluidics measurements is somewhat confusing but I think this could be easily remedied. The quantitative interpretation of the dispersal data could also be improved and I'm not clear if the data support the claim made. 

      We thank the reviewer for their comments and helpful suggestions. We have revised the manuscript with these suggestions in mind and believe that the manuscript is improved by a more detailed explanation of the microfluidic setup. We have added more information in the text (detailed in response to reviewer 2, comments 1 and 2) and have added a depiction of the microfluidic setup (Fig. 1A). We have also modified the presentation and discussion of the dispersal data (Fig. 2C), as described in detail below in response to reviewer 2, comment 4, and argue that they clearly show density-dependent dispersal. We believe that this modification of how the results are presented provides a more convincing case for our main conclusion, namely that the presence of degradation products controls bacterial dispersal in a density-dependent manner.  

      Reviewer #3 (Public Review):

      [...]

      Weaknesses: 

      I find this paper very descriptive and speculative. The results of the genetic analyses are quite counterintuitive; therefore, I understand the difficulty of connecting them to the observations coming from experiments in the microfluidic device. However, they could be better placed in the literature of foraging - dispersal cycles, beyond bacteria. In addition, the interpretation of the results is sometimes confusing. 

      We thank the reviewer for their suggestions to improve the manuscript. We have edited the manuscript to interpret the results of this study more clearly, in particular with regard to the fact that breakdown products of alginate cause cell dispersal (see response to reviewer 2, comment 1), gene expression changes of ribosomal proteins and transporters (see response to reviewer 2, comment 2), as well as genes relating to alginate catabolism (see response to reviewer 2, comment 3).

      To provide more context for the interpretation of our results we now also embed our findings in more detail in the previous work on foraging strategies and dispersal tradeoffs.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The authors should clarify in more detail what they mean by density dependence in Figure 2. Usually density dependence refers to a per capita dependence, but here it seems that the per capita rate of dispersal might be roughly independent of density (Figure 2c; if you double the number of cells it doubles the number of cells leaving). Rather it seems the dispersal is such that the density of remaining cells falls below a threshold (~300 cells). 

      We thank the reviewer for raising this important point. To analyze the data more explicitly in terms of per capita dependence and so make the density dependence in the dispersal from the microfluidic chambers more clear, we have modified Figure 2C and edited the text. 

      In the modified Figure 2C, we computed the fraction of dispersed cells for each chamber (i.e., the change in cell number divided by the cell number at the time of the nutrient switch). This quantity directly reveals the per-capita dependence, as mentioned by reviewer 1, and is now represented on the y-axis of Figure 2C instead of the absolute change in cell number.

      These data demonstrate that the fraction of dispersed cells increases with increasing numbers of cells present in the chamber at the time of switching, with more highly populated chambers showing a higher fraction of dispersed cells. These findings indicate that there is a strong density dependence in the dispersal process.

      As pointed out by reviewer 1, another interesting aspect of the data is the transition at low cell number. The fraction of dispersed cells is negative in the case of the chamber with approximately 70 cells, consistent with no dispersal at this low density, and a moderate density increase as a function of continued growth.  
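
      As an illustration of the quantity plotted in the modified Figure 2C, the per-chamber calculation can be sketched as follows. This is a minimal sketch and not the analysis code used for the manuscript: the function name and the cell counts are hypothetical placeholders, and the sign convention (positive values mean net loss of cells, negative values mean net growth) is an assumption chosen to match the description above.

```python
def dispersed_fraction(cells_at_switch, cells_after):
    """Per-capita dispersal for one chamber.

    Returns (cells_at_switch - cells_after) / cells_at_switch.
    Positive: net loss of cells after the nutrient switch.
    Negative: the population kept growing and no net dispersal occurred.
    """
    if cells_at_switch <= 0:
        raise ValueError("cells_at_switch must be positive")
    return (cells_at_switch - cells_after) / cells_at_switch


# Hypothetical counts (cells at the switch, cells afterwards); NOT measured data.
chambers = [(70, 80), (300, 240), (800, 320)]

for at_switch, after in chambers:
    fraction = dispersed_fraction(at_switch, after)
    print(f"{at_switch:4d} cells at switch: dispersed fraction = {fraction:+.2f}")
```

      Normalizing by the cell number at the time of the switch is what separates a constant per-capita dispersal rate (a flat line in such a plot) from genuine density dependence (a fraction that rises with cell number).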

      In addition to the new analysis presented in Figure 2C, we have modified the paragraph that discusses this result as follows (line 208):

      “We indeed found that the nutrient switch caused a few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      (2) The authors should tone down their claims about the carbon cycle in the abstract. I do not believe the results as they stand could be used to understand degradation-dispersal cycles in marine environments relevant to the carbon cycle, since these behaviors have been studied in microfluidic environments which in my understanding are quite different. As such, statements such as "degradation-dispersal cycles are an integral part in the global carbon cycle, we know little about how cells alternate between degradation and motility" and "Overall, our findings reveal the cellular mechanisms underlying bacterial degradation-dispersal cycles that drive remineralization in natural environments" are overstated in the abstract. 

      We appreciate the reviewer’s comments regarding the connections of our work with the carbon cycle. We have now rephrased these statements in our manuscript to describe a potential connection between our work and the marine carbon cycle. The colonization of polysaccharide particles by bacteria and their subsequent degradation have been widely acknowledged to play a significant role in controlling the carbon flow in marine ecosystems (Fenchel, 2002; Preheim et al., 2011; Yawata et al., 2014, 2020). We still refer to carbon flow in the revised manuscript, though cautiously, as microbial remineralization of biomass, which is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000; Jiao et al., 2024). As stated in the previous version of the manuscript, the main motivation of our work was to study the growth behaviors of marine heterotrophic bacteria during polysaccharide degradation, especially to understand when bacteria depart already colonized and degraded particles and find novel patches to grow and degrade, a process that is poorly understood. Therefore, it is conceivable that degradation-dispersal cycles do play a role in the flow of carbon in marine ecosystems. However, we acknowledge that the carbon cycle is influenced by a multitude of biological and chemical processes, and the bacterial degradation-dispersal cycle might not be the sole mechanism at play.

      We also appreciate the reviewer’s comments highlighting that the complexity of natural environments is not fully captured in our microfluidics system. However, our microfluidics setup does allow us to quantify responses and behaviors of microbial groups at high spatial and temporal resolution, especially in the context of environmental fluctuations. Microbes in nature interact at small spatial scales and have to respond to changes in the environment, and the microfluidics setup enables the quantification of these responses. Moreover, dispersal of the bacterium V. cyclitrophicus that we use in our study has been previously observed even during growth on particulate alginate (Alcolombri et al., 2021), but the cues and regulation controlling dispersal behaviors have been unclear. Microfluidic experiments have now allowed us to study this process in a highly quantitative manner, and the results align well with observations from experiments in more nature-like settings. These quantitative experiments on bacterial strains isolated from marine particles are expected to constrain quantitative models of carbon degradation in the ocean (Nguyen et al., 2022).

      We have now adjusted our statements throughout our manuscript to reflect the knowledge gaps in understanding the triggers of degradation-dispersal cycles and their links with carbon flow in marine ecosystems. The revised manuscript, especially, contains the following statements (line 47 and line 60):

      “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      (3) The authors should clarify why they think quorum-sensing genes are increased in expression on digested alginate. The authors currently mention that QS could be used to trigger dispersal, but given the timescales of dispersal in Figure 2 (~half an hour), I find it hard to believe that these genes are expressed and have the suggested effect on those timescales. As such I would have expected the other way round - for QS genes to be expressed highly during alginate growth, so that density could be sensed and responded to. Please clarify. 

      We have now clarified this point in the revised manuscript. While the triggering of dispersal by quorum-sensing genes may indeed appear counterintuitive, and the response is rapid (we see dispersal of cells within 30-40 minutes), both observations are in line with previous studies in another model organism, Vibrio cholerae. The dispersal time is similar to the dispersal time of V. cholerae cells from biofilms, as described by Singh and colleagues (Figure 1E of Ref. Singh et al., 2017). In that case, induction of the quorum sensing dispersal regulator HapR was observed during biofilm dispersal within one hour after the switch of condition (Fig. 2, middle panel of Ref. Singh et al., 2017). Even though the specific quorum sensing signaling molecules are probably different in our strain (there is no annotated homolog of the hapR gene in V. cyclitrophicus), we observed that the full set of quorum sensing genes was enriched in cells growing on digested alginate (as reported in line 314 and Fig. 4A).

      We have added this information in the manuscript (line 317): 

      “The set of quorum sensing genes was also positively enriched in cells growing on digested alginate (Fig. 4A and S4F, Table S13). This role in dispersal is in agreement with a previous study that showed induction of the quorum sensing master regulator in V. cholerae cells during dispersal from biofilms on a similar time scale as here (less than an hour) [28].”

      Reviewer #2 (Recommendations For The Authors):

      (1) Around line 144 - I don't really understand how you flow alginate through the microfluidic platform. It seems if the particles are transiently going through the microfluidic chamber then the flow rate and hence residence time of the alginate particles will matter a lot by controlling the time the cells have to colonize and excrete enzymes for alginate breakdown. Or perhaps the alginate is not particulate but is instead a large but soluble polymer? I think maybe a schematic of the microfluidic device would help -- there is an implicit assumption that we are familiar with the Dal Co et al device, but I don't recall its details and maybe a graphic added to Figure 1 would help. 

      a. In reviewing the Dal Co paper I see that cells are trapped and the medium flows through channels and the plane where the cells are held. I am still a little confused about the size of the polymeric alginate -- large scale (>1um) particles or very small polymers? 

      We have now provided a detailed description of our microfluidic experimental system. At the start of the experiments, cells are in fact not trapped within the microfluidic device, but grow and can move freely within a chamber designed with dimensions (sub-micron heights) so that growth occurs only as a monolayer. Cells were exposed to nutrients, either alginate or alginate digestion products, both in soluble form (not particles). These compounds were flowed into the device through a main channel, but entered the flow-free growth chambers by diffusion. To make these aspects of our experiments clearer, we have added further information on this in the Materials & Methods section (line 556), in the abstract (line 51), and in the results (line 123).

      To make our microfluidic setup clearer, we have followed this advice and added a schematic as Figure 1A and have added more information on the setup to the main text (line 153):

      “In brief, the microfluidic chips are made of an inert polymer (polydimethylsiloxane) bound to a glass coverslip. The PDMS layer contains flow channels through which the culture medium is pumped continuously. Each channel is connected to several growth chambers that are laterally positioned. The dimensions of these growth chambers (height: 0.85 µm, length: 60 µm, width: 90-120 µm) allow cells to freely move and grow as monolayers. The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4]. This setup combined with time-lapse microscopy allowed us to follow the development of cell communities over time.”

      (2) What makes this confusing is the difference between Figure 1C and Figure S2A -- the authors state that the difference in Figure 1C is due to dispersal, but is there flow through the microfluidic device? So what role does that flow through the device have in dispersal? Is the adhesion of the cell groups driven at all by a physical interaction with high molecular weight polymers in the microfluidic devices or is this purely a biological effect? Could this also be explained by different real concentrations of nutrients in the two cases? 

      We realize from this comment that the role of flow of the medium in the microfluidic setup was not clearly addressed in our manuscript. In fact, cells were not exposed to flow, and nutrients were provided to the growth chambers by diffusion. We have added a clearer explanation of this point on line 158:

      “The culture medium, containing either alginate or digested alginate in their soluble form, is constantly pumped through the flow channel and enters the growth chambers primarily through diffusion [15,16,4,17,8]. Therefore, the number of cells and their positioning within microfluidic chambers is determined by the cellular growth rate as well as by cell movement [4].”

      One purely physical effect that we anticipated is that a high viscosity of the medium could immobilize cells. To address this point, we measured the viscosity of both alginate and digested alginate and concluded that the increase in viscosity is not strong enough to immobilize cells. We added a statement in the text (line 170):

      “To test the role of increased viscosity of polymeric alginate in causing the increased aggregation of cells, we measured the viscosity of 0.1% (w/v) alginate or digested alginate dissolved in TR media. For alginate, the viscosity was 1.03±0.01 mPa·s (mean and standard deviation of three technical replicates) whereas the viscosity of digested alginate in TR media was found to be 0.74±0.01 mPa·s. Both these values are relatively close to the viscosity of water at this temperature (0.89 mPa·s [18]) and, while they may affect swimming behavior [19], they are insufficient to physically restrain cell movement [20].”

      as well as a section in the Materials and Methods (line 594):

      “Viscosity of the alginate and digested alginate solution

      We measured the viscosity of alginate solutions using shear rheology measurements. We used a 40 mm cone-plate geometry (4° cone) in a Netzsch Kinexus Pro+ rheometer. 1200 µL of sample was placed on the bottom plate, the gap was set at 150 µm and the sample trimmed. We used a solvent trap to avoid sample evaporation during measurement. The temperature was set to 25 °C using a Peltier element. We measured the dynamic viscosity over a range of shear rates of 0.1–100 s⁻¹. We report the viscosity of each solution as the average viscosity measured over the shear rates 10–100 s⁻¹, where the shear-dependence of the viscosity was low.

      We measured the viscosity of 0.1% (w/v) alginate dissolved in TR media, which was 1.03±0.01 mPa·s (reporting the mean and standard deviation of three technical replicates). The viscosity of 0.1% digested alginate in TR media was found to be 0.74±0.01 mPa·s. This means that the viscosity of alginate in our microfluidic experiments is 36% higher than that of digested alginate, but the viscosities are close to those expected of water (0.89 mPa·s at 25 °C according to Berstad and colleagues [18]).”
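
      The reporting scheme described in this Methods section (average the viscosity over the 10–100 s⁻¹ window, then summarize technical replicates as mean and standard deviation) can be sketched as below. This is only an illustrative sketch under stated assumptions: the function name, the data layout, and all numbers are hypothetical and do not reproduce the rheometer output or the values reported above.

```python
from statistics import mean, stdev


def plateau_viscosity(shear_rates, viscosities, lo=10.0, hi=100.0):
    """Average viscosity (mPa·s) over shear rates in [lo, hi] (s^-1)."""
    window = [v for rate, v in zip(shear_rates, viscosities) if lo <= rate <= hi]
    if not window:
        raise ValueError("no measurements inside the shear-rate window")
    return mean(window)


# Hypothetical sweeps for three technical replicates; NOT measured data.
shear_rates = [0.1, 1.0, 10.0, 30.0, 100.0]
replicates = [
    [1.40, 1.10, 1.00, 1.00, 1.00],
    [1.50, 1.20, 1.02, 1.02, 1.02],
    [1.45, 1.15, 1.04, 1.04, 1.04],
]

per_replicate = [plateau_viscosity(shear_rates, sweep) for sweep in replicates]
print(f"viscosity = {mean(per_replicate):.2f} ± {stdev(per_replicate):.2f} mPa·s")
```

      Restricting the average to the high-shear window avoids the low-shear points, where the viscosity depends more strongly on the shear rate.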

      While our microfluidic setup allows us to track the position and movement of cells in a spatially structured setting, these observations do not allow us to distinguish directly whether the differences in dispersal are a result of purely physical effects of polymers on cells or are a result of them triggering a biological response in cells that causes them to become sessile. It is known that bacterial appendages like pili interact with polysaccharide residues (Li et al., 2003). Therefore, it is quite plausible that cross-linking by polysaccharides can contribute to growth behaviors on alginate. However, our analysis of gene expression demonstrates that flagellum-driven motility is decreased in the presence of alginate compared to digested alginate, alongside other major changes in gene expression. In addition, our measures of dispersal show that dispersal of cells when exposed to digested alginate is density dependent. Both observations suggest that the patterns in dispersal are governed by decision-making processes by cells resulting in changes in cell motility, rather than being a product of purely physical interactions with the polymer.

      The finding that the viscosities of both alginate and digested alginate are similar to that of water suggests that diffusion of nutrients in the growth chambers should be similar. Therefore, we think that differences in the real concentrations of nutrients are unlikely to contribute to the observed differences in behavior.

      (3) Why is Figure S1 arbitrary units? Does this have to do with the calibration of LC-MS? It would be better, it seems, to know the concentrations in real units of the monomer at least. 

      We agree with the reviewer that it would have been better to have absolute concentrations for these compounds. However, to calibrate the mass spectrometer signals (ion counts) to absolute concentrations for the different alginate compounds, we would need an analytical standard of known concentration. We are not aware of such a standard and thus report only relative concentrations. We agree that the y-axis label of Figure S1 should not contain ‘arbitrary’ units, as it shows a ratio (of measurements in the same arbitrary units). We have edited the labels of Figure S1 accordingly and the figure legend in line 26 of the Supplemental Material (“Relative concentrations…”).

      (4) Line 188 - density-dependent dispersal. The claim here is that "cells in chambers with many cells were more likely to disperse than cells in chambers with less cells." (my emphasis). Looking at the data in Figure 2C it appears that about 40% of the cells disperse irrespective of the density, before the switch to digested alginate. So it would seem that there is not a higher likelihood of dispersal at higher cell densities. For the very highest cell density, it does appear that this fraction is larger, but I'd be concerned about making this claim from what I understand to be a single experiment. To support the claim made should the authors plot Change in Cell number/Starting Cell number on the y-axis of Fig. 2C to show that the fraction is increasing? It would seem some additional data at higher starting cell densities would help support this claim more strongly. 

      We thank the reviewer for this comment, which is in line with a remark made by reviewer 1 in their comment 1. In response to these two comments (and as described above), we have edited Figure 2C and now plot the change in cell number relative to the starting cell number on the y-axis to directly show the density dependence. We observe a positive (approximately linear) relationship between the fraction of dispersed cells and the number of cells present in the chamber at the time of switching. This indicates that there is a density dependence in the dispersal process, with highly populated chambers showing a higher fraction of dispersed cells.

      In addition to the change in Figure 2C, we have modified the paragraph around line 208: “We indeed found that the nutrient switch caused a few or no cells to disperse from small cell groups (Fig. 2B), whereas a large fraction of cells from large cell groups dispersed (Fig. 2C). In fact, the fraction of cells that dispersed upon imposition of the nutrient switch showed a strong positive relationship with the number of cells present, meaning that cells in chambers with many cells were more likely to disperse than cells in chambers with fewer cells (Fig. 2C).”

      The highest cell number at the time of the switch that we include is about 800 cells. The maximum number of cells that can fit into a chamber is ca. 1000. Thus, 800 resident cells are close to the maximal density.

      (5) A comment -- I find the result of significant chemotaxis towards alginate but not the monomers of alginate to be quite surprising. The ecological relevance of this (line 219) seems like an important result that is worth expanding on a bit at least in the discussion. For now, my question is whether the authors know of any mechanism by which chemotaxis receptors could respond to alginate but not the monomer. How can a receptor distinguish between the two? 

      We agree that this result is surprising, given that oligomers can be more easily transported into the periplasm where sensing takes place, and they also provide an easier accessible nutrient source. Indeed, in case of the insoluble polymer chitin it has been shown that chemotaxis towards chitin is mediated by chitin oligomers (Bassler et al., 1991), which was suggested as a general motif to locate polysaccharide nutrient sources (Keegstra et al., 2022). However, a recent study has changed this perspective by showing widespread chemotaxis of marine bacteria towards the glucose-based marine polysaccharide laminarin, but not towards laminarin oligomers or glucose (Clerc et al., 2023). Together with our results on chemotaxis towards alginate (but not significantly toward alginate oligomers) this suggests that chemotaxis towards soluble polysaccharides can be mediated by direct sensing of the polysaccharide molecules.

      As recommended, we expanded the discussion of the ecological relevance and also added more information on possible mechanisms of selective sensing of alginate and its breakdown products (around line 479):

      “Direct chemotaxis towards polysaccharides may facilitate the search for new polysaccharide sources after dispersal. We found that the presence of degradation products not only induces cell dispersal but also increases the expression of chemotaxis genes. Interestingly, we found that V. cyclitrophicus ZF270 cells show chemotaxis towards polymeric alginate but not digested alginate. This contrasts with previous findings for bacterial strains degrading the insoluble marine polysaccharide chitin, where chemotaxis was strongest towards chitin oligomers [53], suggesting that oligomers may act as an environmental cue for polysaccharide nutrient sources [55]. However, recent work has shown that certain marine bacteria are attracted to the marine polysaccharide laminarin, and not laminarin oligomers [56]. Together with our results, this indicates that chemotaxis towards soluble polysaccharides may be mediated by the polysaccharide molecules themselves. The mechanism of this behavior is yet to be identified, but could be mediated by polysaccharide-binding proteins as have been found in Sphingomonas sp. A1 facilitating chemotaxis towards pectin [57]. Direct polysaccharide sensing adds complexity to chemosensing as polysaccharides cannot freely diffuse into the periplasm, which can lead to a trade-off between chemosensing and uptake [58]. Furthermore, most polysaccharides are not immediately metabolically accessible as they require degradation. But direct polysaccharide sensing can also provide certain benefits compared to using oligomers as sensory cues. First, it could enable bacterial strains to preferably navigate to polysaccharide nutrient sources that are relatively uncolonized and hence show little degradation activity. Second, strong chemotaxis towards degradation products could hinder a timely dispersal process as the dispersal then requires cells to travel against a strong attractant gradient formed by the degradation products.
Overall, this strategy allows cells to alternate between degradation and dispersal to acquire carbon and energy in a heterogeneous world with nutrient hotspots [44,59–61].”

      (6) Comment on lines 287-8 -- that the "positive enrichment of the gene set containing bacterial motility proteins matched the increase in motile cells that we observe in Fig 3E." I'm confused about what is meant by the word "matched" here. Is the implication that there is some quantitative correspondence between increased motility in Figure 3 and the change in expression in Figure 4? Or is the statement a qualitative one -- that motility genes are upregulated in the presence of digested alginate? Table S12 didn't help me answer this question. 

      We thank the reviewer for their helpful comment. Our original statement was a qualitative one - observing that gene expression enrichment in genes associated with bacterial motility aligned with our expectations based on the previous observation of an increase in motile cells. We have now changed the wording to highlight the qualitative nature of this statement (line 315):

      “The positive enrichment of the gene set containing bacterial motility proteins aligned with our expectations based on the increase in motile cells that we observed in Figure 3E (Fig. 4A, Table S12).”

      (7) Line 326 - what is the explanation for the production of public enzymes in the presence of digest? How does this square with the previous narrative about cells growing on alginate digest expressing motility genes and chemotaxing towards alginate? It seems like the story is a bit tenuous here in the sense that digested alginates stimulate both motility - which is hypothesized to drive the discovery of new alginate particles - and lyase enzymes which are used to degrade alginate. So do the high motility cells that are chemotaxing towards alginate also express lyases en route? I'm of the opinion that constructing narratives like these in the absence of a more quantitative understanding of the colonization and degradation dynamics of alginate particles presents a major challenge and may be asking more of the data than the data can provide. 

      a. I noted later that this is addressed later around lines 393 in the Discussion section.

      Indeed, the notion that the presence of breakdown products triggers motility and also increases the expression of alginate lyases and other metabolic genes for alginate catabolism seems counterintuitive. We have now expanded our discussion of these results to contextualize these findings (around line 443):

      "One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell50. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products may likely result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes may likely decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. 
Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients."

      (8) I like Figure 6, and I think this hypothesis is a good result from this paper, but I think it would be important to emphasize this as a proposal that needs further quantitative analysis to be supported. 

      We have now edited the manuscript to make this point more clear. While both degradation and dispersal are well-appreciated parts of microbial ecology, the transitions and underlying mechanisms are unclear. We have edited the discussion to improve the clarity (line 419): 

      “This cycle of biomass degradation and dispersal has long been discussed in the context of foraging e.g., [44,45,13,46,47], but the cellular mechanisms that drive the cell dispersal remain unclear.”

      Also, we have updated Figure 6 to indicate more clearly which findings are newly proposed in this work (now bold font) and which are previous findings from different bacterial taxa and carbon sources that align with our work (now light font). We edited the figure legend accordingly (line 503):

      "By integrating our results with previous studies on cooperative growth on the same system, as well as results on dispersal cycles in other systems, we highlight where the specific results of this work add to this framework (bold font)."

      Minor comments 

      (1) Is there any growth on the enzyme used for alginate digestion? E.g. is the enzyme used to digest the alginate at sufficiently high concentrations that cells could utilize it for a carbon/nitrogen source? 

      We thank the reviewer for raising this point. We added the following paragraph as Supplemental Text to address it (line 179):

      “Protein amount of the alginate lyases added to create digested alginate

      Based on the following calculation, we conclude that the amount of protein added to the growth medium by the addition of alginate lyases is so small that we consider it negligible. In our experiment we used 1 unit/ml of alginate lyases in a 4.5 ml solution to digest the alginate. As the commercially purchased alginate lyases are 10,000 units/g, our 4.5 ml solution contains 0.45 mg of alginate lyase protein. The digested alginate solution was diluted 45x when added to culture medium. This means that we added 0.18 µg alginate lyase protein to 1 ml of culture medium.

      As a comparison, for 1 ml of alginate medium, 1,000 µg of alginate is added, and for 1 ml of Lysogeny broth (LB) culture medium, 3,500 µg of LB is added. Thus, the amount of alginate lyase protein that we added is ca. 5,000–20,000 times smaller than the amount of alginate or LB that one would add to support cell growth. Therefore, we expect the growth that the digestion of the added alginate lyases would allow to be negligible.”
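
      The comparison in the quoted text can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it converts the stated enzyme activity into a protein mass and then computes the fold excess of alginate or LB over the lyase protein, taking the reported 0.18 µg per ml of medium as given. The function and constant names are hypothetical and do not come from the manuscript.

```python
# Values quoted in the supplemental text above; nothing here is newly measured.
LYASE_UNITS_PER_ML = 1.0        # units of alginate lyase per ml of digestion mix
DIGESTION_VOLUME_ML = 4.5       # volume of the digestion mix
UNITS_PER_GRAM = 10_000.0       # activity of the purchased enzyme preparation
LYASE_UG_PER_ML_MEDIUM = 0.18   # lyase protein per ml of medium, as reported
ALGINATE_UG_PER_ML = 1_000.0    # alginate per ml of alginate medium
LB_UG_PER_ML = 3_500.0          # LB per ml of LB medium, as stated in the text


def lyase_mass_mg(units_per_ml: float, volume_ml: float, units_per_gram: float) -> float:
    """Protein mass (mg) corresponding to the enzyme units in the digestion mix."""
    total_units = units_per_ml * volume_ml
    grams = total_units / units_per_gram
    return grams * 1_000.0


def fold_excess(nutrient_ug: float, lyase_ug: float) -> float:
    """How many times more nutrient than lyase protein is present per ml."""
    return nutrient_ug / lyase_ug


if __name__ == "__main__":
    mass = lyase_mass_mg(LYASE_UNITS_PER_ML, DIGESTION_VOLUME_ML, UNITS_PER_GRAM)
    print(f"lyase protein in digestion mix: {mass:.2f} mg")
    print(f"alginate / lyase: {fold_excess(ALGINATE_UG_PER_ML, LYASE_UG_PER_ML_MEDIUM):.0f}x")
    print(f"LB / lyase: {fold_excess(LB_UG_PER_ML, LYASE_UG_PER_ML_MEDIUM):.0f}x")
```

      With these inputs the two ratios fall within the "ca. 5,000–20,000" range quoted above.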

      (2) The lines in Figure 2B are very hard to see. 

      We have addressed this comment by using thicker lines in Figure 2B.

      (3) The black background and images in Figure 3A and B are hard to see as well. 

      We have now replaced Figure 3A and B, now using a white background.

      (4) Typo at the beginning of line 251? 

      Unfortunately we failed to find the typo referred to. We are happy to address it if it still exists in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) I think there is not enough experimental evidence to conclude that the underlying cause of increased motility is the accumulation of digested alginate products. To conclusively show that this is the cause and not just some signal linked to cell density, perhaps the experiment should be repeated with a different carbon source. 

      We thank the reviewer for their comment, which made us realize that we did not make the nature of the dispersal cue clear. The gene expression data were obtained from batch cultures harvested at approximately the same bacterial density, which indeed shows that digested alginate is a sufficient signal for an increase in motility gene expression. This agrees very well with our observation that cells growing on digested alginate in microfluidic chambers have an increased fraction of motile cells in comparison with cells exposed to alginate (Fig 3E). However, we did not mean to suggest that the observed dispersal by bacterial motility is not influenced by cell density. In fact, we see that dispersal (and hence the increase in cell motility) in microfluidic chambers that are switched from polymeric to digested alginate depends on the bacterial density in the chamber, with higher bacterial densities showing increased dispersal. This shows that the presence of alginate oligomers does trigger dispersal through motility, but this signal affects bacterial groups in a cell density-dependent manner.

      Similar observations have been made in Caulobacter crescentus, which was found to form cell groups on the polymer xylan while cells disperse when the corresponding monomer xylose becomes available (D’Souza et al., 2021). We reference the additional work in lines 179 and 230. Taken together, these observations indicate a more general phenomenon in dispersal from polysaccharide substrates.

      (2) About the expression data: 

      • Ribosomal proteins and ABC transporters are enriched in cells grown on digested alginate and the authors discuss that this explains the difference in max growth rate between alginate and digested alginate. However, in Figure S2E the authors report no statistical difference between growth rates. 

      We have now edited the manuscript to clarify this point. We found that cells grown on degradation products reached their maximal growth rate around 7.5 hours earlier (Fig. S2D) and showed increased expression of ribosomal biosynthesis and ABC transporters in late-exponential phase (Fig. 4A). We consider this shorter lag time as a sign of a different growth state and therefore a possible reason for the difference in ribosomal protein expression.

      As the reviewer correctly points out, the maximum growth rates that were computed from the two growth curves were not significantly different (Fig. S2E). However, for our gene expression analysis, we harvested the transcriptome of cells that reached OD 0.39-0.41 (mid- to late-exponential phase). At this time point, the cell cultures may have differed in their momentary growth rate.

      We edited the manuscript to make this clearer (line 287):

      “Both observations likely relate to the different growth dynamics of V. cyclitrophicus ZF270 on digested alginate compared to alginate (Fig. S2A), where cells in digested alginate medium reached their maximal growth rate 7.5 hours earlier and thus showed a shorter lag time (Fig. S2D). As a consequence, the growth rate at the time of RNA extraction (mid-to-late exponential phase) may have differed, even though the maximum growth rate of cells grown in alginate medium and digested alginate medium were not found to be significantly different (Fig. S2E).”

      • The increased expression of transporters for lyases in cells grown on digested alginate (lines 273-274 and 325-328) is very confusing and the explanation provided in lines 412-420 is not very convincing. My two cents on this: Expression of more enzymes and induction of motility might be a strategy to be prepared for more likely future environments (after dispersal, alginate is the most likely carbon source they will find). This would be in line with observed increased chemotaxis towards the polymer rather than the monomer (Similar to C. elegans). 

      This comment is in line with reviewer 2, comment 7. In response to these two comments (and as described above), we expanded our discussion of these results to contextualize these findings (around line 443):

      “One reason for this observation may be that cells primarily rely on intracellular monosaccharide levels to trigger the upregulation of genes associated with polysaccharide degradation and catabolism, as has previously been observed for E. coli across various carbon sources [50,51]. In fact, the majority of carbon sources are sensed by prokaryotes through one‑component sensors inside the cell [50]. In the one‑component internal sensing scheme, the enzymes and transporters for the use of various carbon sources are expressed at basal levels, which leads to an increase in pathway intermediates upon nutrient availability. The pathway intermediates are sensed by an internal sensor, usually a transcription factor, and lead to the upregulation of transporter and enzyme expression [50,51]. This results in a positive feedback loop, which enables small changes in substrate abundance to trigger large transcriptional responses [50,52]. Thus, the presence of alginate breakdown products may likely result in increased expression of all components of the alginate degradation pathway, including the expression of degrading enzymes. As the gene expression analysis was performed on well-mixed cultures in culture medium containing alginate breakdown products, we therefore expect a strong stimulation of alginate catabolism. In a natural scenario, where cells disperse from a polysaccharide hotspot before its exhaustion, the expression of alginate catabolism genes may likely decrease again once the local concentration of breakdown products decreases. However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      Additionally, we agree with the intriguing comment that continued expression of alginate lyases may also prepare cells for likely future environments. Further studies that aim to answer whether marine bacteria are primed by their growth on one carbon source towards faster re-initiation of degradation on a new particle will be an interesting research question. We now address this point in our manuscript (line 458):

      “However, continued production of alginate lyases could also provide an advantage when encountering a new alginate source and continued production of alginate lyases may thus help cells to prepare for likely future environments. Further investigations of bacterial enzyme secretion in changing nutrient environments and at relevant spatial scales are required to improve our understanding of the regulation of enzyme secretion along nutrient gradients.”

      (3) The yield reached by Vibrio on alginate is significantly higher than the yield in digested alginate, not similar, as stated in lines 133-134. Only cell counts are similar. Perhaps the author can correct this statement and speculate on the reason leading to this discrepancy: perhaps cells tend to aggregate in alginate despite the fact that these are well-mixed cultures. 

      We have edited the description of the OD measurements accordingly and agree with the reviewer that aggregation is indeed a possible reason for the discrepancy (line 141):

      “We also observed that the optical density at stationary phase was higher when cells were grown on alginate (Fig. S2B and C). However, colony counts did not show a significant difference in cell numbers (Fig. S3), suggesting that the increased optical density may stem from aggregation of cells in the alginate medium, as observed for other Vibrio species [7].”

      (4) I suggest toning down the importance of the results presented in this study for understanding global carbon cycling. There is a link but at present it is too much emphasized. 

      We have edited our statements regarding the carbon cycle. In the revised manuscript we stress the lack of direct quantifications of carbon cycling. We still refer to carbon flow in the revised manuscript, as we would argue that microbial remineralization of biomass is recognized as an important factor in the marine biological carbon pump (e.g., Chisholm, 2000) and research on marine bacterial foraging investigates how bacterial cells manage to find and utilize this biomass.

      Our revised manuscript contains the following modified statements (line 47 and line 60): “Even though many studies indicate that these degradation-dispersal cycles contribute to the carbon flow in marine systems, we know little about how cells alternate between polysaccharide degradation and motility, and which environmental factors trigger this behavioral switch.”

      “Overall, our findings reveal cellular mechanisms that might also underlie bacterial degradation-dispersal cycles, which influence the remineralization of biomass in marine environments.”

      References

      • Alcolombri, U., Peaudecerf, F. J., Fernandez, V. I., Behrendt, L., Lee, K. S., & Stocker, R. (2021). Sinking enhances the degradation of organic particles by marine bacteria. Nature Geoscience, 14(10), 775–780. https://doi.org/10.1038/s41561-021-00817-x
      • Bassler, B. L., Gibbons, P. J., Yu, C., & Roseman, S. (1991). Chitin utilization by marine bacteria. Chemotaxis to chitin oligosaccharides by Vibrio furnissii. Journal of Biological Chemistry, 266(36), 24268–24275. https://doi.org/10.1016/S0021-9258(18)54224-1
      • Chisholm, S. W. (2000). Stirring times in the Southern Ocean. Nature, 407(6805), 685–686. https://doi.org/10.1038/35037696
      • Chubukov, V., Gerosa, L., Kochanowski, K., & Sauer, U. (2014). Coordination of microbial metabolism. Nature Reviews Microbiology, 12(5), 327–340. https://doi.org/10.1038/nrmicro3238
      • Clerc, E. E., Raina, J.-B., Keegstra, J. M., Landry, Z., Pontrelli, S., Alcolombri, U., Lambert, B. S., Anelli, V., Vincent, F., Masdeu-Navarro, M., Sichert, A., De Schaetzen, F., Sauer, U., Simó, R., Hehemann, J.-H., Vardi, A., Seymour, J. R., & Stocker, R. (2023). Strong chemotaxis by marine bacteria towards polysaccharides is enhanced by the abundant organosulfur compound DMSP. Nature Communications, 14(1), 8080. https://doi.org/10.1038/s41467-023-43143-z
      • Dal Co, A., van Vliet, S., Kiviet, D. J., Schlegel, S., & Ackermann, M. (2020). Short-range interactions govern the dynamics and functions of microbial communities. Nature Ecology and Evolution, 4(3), 366–375. https://doi.org/10.1038/s41559-019-1080-2
      • D’Souza, G., Ebrahimi, A., Stubbusch, A., Daniels, M., Keegstra, J., Stocker, R., Cordero, O., & Ackermann, M. (2023). Cell aggregation is associated with enzyme secretion strategies in marine polysaccharide-degrading bacteria. The ISME Journal. https://doi.org/10.1038/s41396-023-01385-1
      • D’Souza, G. G., Povolo, V. R., Keegstra, J. M., Stocker, R., & Ackermann, M. (2021). Nutrient complexity triggers transitions between solitary and colonial growth in bacterial populations. The ISME Journal, 15(9), 2614–2626. https://doi.org/10.1038/s41396-021-00953-7
      • D’Souza, G., Schwartzman, J., Keegstra, J., Schreier, J. E., Daniels, M., Cordero, O. X., Stocker, R., & Ackermann, M. (2023). Interspecies interactions determine growth dynamics of biopolymer-degrading populations in microbial communities. Proceedings of the National Academy of Sciences of the United States of America, 120(44), e2305198120. https://doi.org/10.1073/pnas.2305198120
      • Fenchel, T. (2002). Microbial Behavior in a Heterogeneous World. Science, 296(5570), 1068–1071. https://doi.org/10.1126/science.1070118
      • Jiao, N., Luo, T., Chen, Q., Zhao, Z., Xiao, X., Liu, J., Jian, Z., Xie, S., Thomas, H., Herndl, G. J., Benner, R., Gonsior, M., Chen, F., Cai, W.-J., & Robinson, C. (2024). The microbial carbon pump and climate change. Nature Reviews Microbiology. https://doi.org/10.1038/s41579-024-01018-0
      • Keegstra, J. M., Carrara, F., & Stocker, R. (2022). The ecological roles of bacterial chemotaxis. Nature Reviews Microbiology, 20(8), 491–504. https://doi.org/10.1038/s41579-022-00709-w
      • Konishi, H., Hio, M., Kobayashi, M., Takase, R., & Hashimoto, W. (2020). Bacterial chemotaxis towards polysaccharide pectin by pectin-binding protein. Scientific Reports, 10(1), 3977. https://doi.org/10.1038/s41598-020-60274-1
      • Li, Y., Sun, H., Ma, X., Lu, A., Lux, R., Zusman, D., & Shi, W. (2003). Extracellular polysaccharides mediate pilus retraction during social motility of Myxococcus xanthus. Proceedings of the National Academy of Sciences, 100(9), 5443–5448. https://doi.org/10.1073/pnas.0836639100
      • Martínez-Antonio, A., Janga, S. C., Salgado, H., & Collado-Vides, J. (2006). Internal sensing machinery directs the activity of the regulatory network in Escherichia coli. Trends in Microbiology, 14(1), 22–27. https://doi.org/10.1016/j.tim.2005.11.002
      • McDougald, D., Rice, S. A., Barraud, N., Steinberg, P. D., & Kjelleberg, S. (2012). Should we stay or should we go: Mechanisms and ecological consequences for biofilm dispersal. Nature Reviews Microbiology, 10(1), 39–50. https://doi.org/10.1038/nrmicro2695
      • Nguyen, T. T. H., Zakem, E. J., Ebrahimi, A., Schwartzman, J., Caglar, T., Amarnath, K., Alcolombri, U., Peaudecerf, F. J., Hwa, T., Stocker, R., Cordero, O. X., & Levine, N. M. (2022). Microbes contribute to setting the ocean carbon flux by altering the fate of sinking particulates. Nature Communications, 13(1), 1657. https://doi.org/10.1038/s41467-022-29297-2
      • Norris, N., Alcolombri, U., Keegstra, J. M., Yawata, Y., Menolascina, F., Frazzoli, E., Levine, N. M., Fernandez, V. I., & Stocker, R. (2022). Bacterial chemotaxis to saccharides is governed by a trade-off between sensing and uptake. Biophysical Journal, 121(11), 2046–2059. https://doi.org/10.1016/j.bpj.2022.05.003
      • Povolo, V. R., D’Souza, G. G., Kaczmarczyk, A., Stubbusch, A. K., Jenal, U., & Ackermann, M. (2022). Extracellular appendages govern spatial dynamics and growth of Caulobacter crescentus on a prevalent biopolymer. bioRxiv, 2022.06.13.495907. https://doi.org/10.1101/2022.06.13.495907
      • Preheim, S. P., Boucher, Y., Wildschutte, H., David, L. A., Veneziano, D., Alm, E. J., & Polz, M. F. (2011). Metapopulation structure of Vibrionaceae among coastal marine invertebrates. Environmental Microbiology, 13(1), 265–275. https://doi.org/10.1111/j.1462-2920.2010.02328.x
      • Schwartzman, J. A., Ebrahimi, A., Chadwick, G., Sato, Y., Orphan, V., & Cordero, O. X. (2021). Bacterial growth in multicellular aggregates leads to the emergence of complex lifecycles. bioRxiv, 2021.11.01.466752. https://doi.org/10.1101/2021.11.01.466752
      • Singh, P. K., Bartalomej, S., Hartmann, R., Jeckel, H., Vidakovic, L., Nadell, C. D., & Drescher, K. (2017). Vibrio cholerae Combines Individual and Collective Sensing to Trigger Biofilm Dispersal. Current Biology, 27(21), 3359-3366.e7. https://doi.org/10.1016/j.cub.2017.09.041
      • Ulrich, L. E., Koonin, E. V., & Zhulin, I. B. (2005). One-component systems dominate signal transduction in prokaryotes. Trends in Microbiology, 13(2), 52–56. https://doi.org/10.1016/j.tim.2004.12.006
      • Wall, M. E., Hlavacek, W. S., & Savageau, M. A. (2004). Design of gene circuits: Lessons from bacteria. Nature Reviews Genetics, 5(1), 34–42. https://doi.org/10.1038/nrg1244
      • Yawata, Y., Carrara, F., Menolascina, F., & Stocker, R. (2020). Constrained optimal foraging by marine bacterioplankton on particulate organic matter. Proceedings of the National Academy of Sciences, 117(41), 25571–25579. https://doi.org/10.1073/pnas.2012443117
      • Yawata, Y., Cordero, O. X., Menolascina, F., Hehemann, J.-H., Polz, M. F., & Stocker, R. (2014). Competition–dispersal tradeoff ecologically differentiates recently speciated marine bacterioplankton populations. Proceedings of the National Academy of Sciences, 111(15), 5622–5627. https://doi.org/10.1073/pnas.1318943111
      • Zöttl, A., & Yeomans, J. M. (2019). Enhanced bacterial swimming speeds in macromolecular polymer solutions. Nature Physics, 15(6), 554–558. https://doi.org/10.1038/s41567-019-0454-3
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility, and clarity (Required)):

      Summary: Viruses exploit host endoplasmic reticulum (ER)-resident chaperones to support new protein synthesis during viral replication. Here, Najarro et al. study the role of the ER-resident HSP70 family member Binding immunoglobulin protein (BiP) during lytic infection by the Kaposi's sarcoma-associated herpesvirus (KSHV). Using the established doxycycline-inducible lytic reactivation infection model cell line iSLK-BAC16, they showed that KSHV reactivation leads to an upregulation of total BiP protein but not RNA, and is independent of the unfolded protein response. siRNA knockdown or pharmacological inhibition by HA15 of BiP significantly reduced global viral gene expression and infectious virus production. The authors attribute this to at least the reduction of levels of the K1 gene which is required for efficient viral replication. Finally, they showed that HA15 has cytostatic activity in KSHV-transformed B cells and cytotoxic effects in KSHV-infected lymphatic endothelial cells arguing for BiP inhibition as a potential therapeutic strategy to treat KSHV-driven malignancies. The manuscript is well-written and the conclusions were generally supported by the data with a few exceptions below.

      Major comments:

      • They propose in lines 196-199 that the reduction of K1 from HA15 treatment partially explains the defect in virion production during lytic reactivation. I am not convinced that this statement is fully supported by their data. Reduction of K1 is likely a downstream consequence and not the cause of the inhibition of lytic replication.

      We thank the reviewer for this comment. We conducted a more detailed analysis of our RNAseq data in iSLK.219 cells and confirmed the downregulation of the K1 transcript in latently infected cells treated with HA15 (See Fig 3 and Sup Fig 5). It is likely that the drop in transcript levels results from IRE1-mediated degradation in a recently described process known as RIDDLE (IRE1-mediated RNA decay lacking endomotif), in which IRE1 depletes mRNAs [1]. We have included this hypothesis in the discussion.

      Unfortunately, we cannot confirm the downregulation of K1 at the protein level in iSLK.219 cells since the antibodies are highly specific for K1 variants in PEL cells. To overcome this technical limitation, we conducted mass spectrometry analysis of the viral proteome from whole cell lysates of latent and lytic cells undergoing HA15 treatment. While we detect the expected global downregulation of viral proteins in lytic cells treated with HA15, we were not able to detect any viral proteins except for LANA in the latently infected cells, and our detection of several lytic proteins was limited. We speculate that the levels of latent viral proteins expressed in iSLK.219 cells are below the limits of detection of our assay, or that extensive modification of some of these viral proteins may hinder their detection. Due to these limitations, we decided not to include these data in the manuscript.

      Additionally, we note that the lower levels of K1 detected in latent iSLK.219 and TREx-BCBL-1 cells treated with HA15 may affect viral reactivation, which is consistent with findings from the Damania lab showing K1's crucial role in viral replication [2].

      • The quantification of the K1 blots in Fig. 3C only has n=2. With subtle differences by eye, large error bars, and no statistical analysis, it is hard to conclude here with confidence.

      We agree with the reviewer. We have moved the K1 blot to Sup. Fig. 3E and adjusted the text accordingly.

      • Like K1, ORF45, and K8.1 proteins are similarly decreased at 24 h in Fig. 2E, suggesting that the defect is upstream of K1. Does HA15 affect the amount of endogenous and/or transgene copy of RTA being produced (hence the broader effect in early gene expression at 24h?)?

      To answer the Reviewer's query, we re-evaluated the impact of HA15 treatment on the activity of dox-inducible RTA. However, we think it is unlikely that HA15 alters RTA activity, since RTA does not enter the secretory pathway.

      To evaluate the activity of RTA in HA15-treated cells, we measured the expression of the viral episome-encoded RFP reporter, driven by the viral PAN promoter [4], at 24h post-doxycycline treatment of iSLK.219 cells. We compared the response of the PAN promoter to RTA in cells treated with or without HA15 at this early timepoint, to avoid any potential confounding effects stemming from elevated endogenous RTA expression at later times post-reactivation. We demonstrate that the levels of RFP in iSLK.219 cells treated with Dox are identical in the presence or absence of HA15. This result, included in Sup. Fig. 3, indicates that the activity of RTA, crucial for initiating the lytic cycle in this context, is unaffected by BiP inhibition at early times post reactivation.

      • K1 levels appear to decrease even during latency. Are the other latent proteins also affected? What about latent genome copies?

      To address this query, we compared the Log2 fold change of latent transcripts (K1, K2, K12, ORF71, ORF72, ORF73) in the iSLK.219 RNAseq data set (Fig 3). Only the K1 transcript is reduced in HA15-treated cells. We include these data in Sup Fig 5A.

      Regarding differences in genome copies, the consistent levels of the viral genome-encoded GFP in iSLK.219 cells with or without HA15 (Sup Fig 3) indicate no significant changes in the levels of viral genomes at 24h post-treatment (prior to DNA replication). Previous studies by our lab and others show that knockdown of the major latency protein LANA results in episomal loss and lower levels of GFP [5]. These results validate the use of GFP fluorescence in iSLK.219 as a proxy for genome copies.

      • Fig. 3C was performed in a PEL cell line which they showed to enter cytostasis upon HA15 treatment (Fig. 5). This cytostasis (rather than K1) may be the root cause of the defect in viral replication as cells could be arrested at a different stage compared to the G2 requirement for lytic replication in PEL cells (Balisteri et al., PLOS Pathogens 2016, PMID: 26891221).

      See point 2. below

      • The cytostatic effect in PEL cell lines (Fig. 5) should be demonstrated using more direct methods that measure cell cycle (e.g. PI-BrdU).

      We thank the reviewer for this comment. While more direct methods to measure the cell cycle stage affected by HA15 treatment would inform on its mechanism of action, these experiments lie outside the scope of this manuscript, and we consider them better suited for future studies on the anticancer properties of HA15. The data presented in Fig. 5 demonstrate that HA15 treatment of PEL cells causes a reduction in cell numbers without cytotoxicity, thus supporting our conclusion of a net negative effect on proliferation rather than cell death. The loss of our LN2 tank and PEL cell lines currently limits our ability to do these more detailed analyses. At the moment, we do not have an accurate estimate of how long it will take to replace these cell lines for our subsequent studies.

      • While having an uninfected B cell as a matched negative control for PEL is challenging, primary peripheral B cells (mostly of mature memory B cell stage) may not be the appropriate negative control. PEL cells are of plasma cell lineage which have unusually high protein translation and overloaded ER. The plasma cell lineage may explain the sensitivity of PEL cells to HA15. It is possible that HA15 may be toxic to plasma cells when used as a therapeutic agent.

      We agree with the reviewer on the potential impact of HA15 on plasma cell viability. Indeed, HA15 (>2 µM) treatment reduces the viability of plasma cell myeloma lines (NCI-H929 and U266 cells), substantiating its use as a potential anti-cancer drug [6]. Although HA15 has not been tested as a therapeutic agent in humans, studies in mice have demonstrated tolerability without evident toxicity, measured as normal body weight [7]. The potential therapeutic application of HA15 for cancer warrants further investigation and is beyond the scope of our manuscript.

      • Does HA15 have cytostatic effects in uninfected or latently infected iSLK cells?

      We observed no cytostatic or cytotoxic effects in uninfected or latently infected iSLK cells exposed to up to 30 µM of HA15. Although HA15 has been tested on various cancer types [8], it has not been evaluated in Renal Carcinoma Cells (RCC), the cellular background of iSLK.219 cells. The mechanism behind the resistance of these cells to HA15 eludes us, but its link to the cellular background of iSLK.219 cells merits exploration in future studies.

      Minor comments:

      1. Consider changing the title of line 98 to specify cell type since BiP levels do not increase in BCBL-1 (Supp. Fig. 3).

      Revised in the manuscript

      Fig. 3A may benefit from using z-scores instead of log2TPM so differences are more obvious per gene.

      Since the data have already been collected, can the authors include both latent and lytic cells with and without HA15 treatment in Fig. 3A? It may give more information for the reader.

      We have reanalyzed all the RNAseq data and included a z-score plot for all samples in Fig. 3. We also provide three new supplementary tables with the raw counts, the z-scores for viral genes, and the log2 of the normalized counts.
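
      To illustrate why per-gene z-scores make differences between samples easier to see than log2-transformed expression values, the sketch below standardizes each gene across samples. It is a minimal example under stated assumptions, not the analysis pipeline used for Fig. 3: the function names, the pseudocount of 1, and the use of the sample standard deviation are all choices made for illustration.

```python
import math
from statistics import mean, stdev


def log2_expression(value: float, pseudocount: float = 1.0) -> float:
    """Log2-transform an expression value; the pseudocount avoids log2(0)."""
    return math.log2(value + pseudocount)


def gene_zscores(values: list[float]) -> list[float]:
    """Standardize one gene across samples: (x - mean) / sample standard deviation.

    A gene with identical values in all samples has zero variance, so its
    z-scores are returned as 0.0 rather than dividing by zero.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical normalized counts for one gene in three samples.
    counts = [7.0, 15.0, 31.0]
    logged = [log2_expression(c) for c in counts]
    print(gene_zscores(logged))
```

      Because each gene is centred and scaled on its own, highly and lowly expressed genes end up on the same scale, which is what makes per-gene contrasts visible in a heatmap.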

      Reviewer #1 (Significance (Required)):

      Significance: Here, the authors convincingly demonstrate the proviral role of the ER chaperone BiP during KSHV reactivation. This manuscript will be relevant to researchers in the gammaherpesvirus field. Although the authors did present some interesting data, the scope is narrow, and mechanistic studies were not pursued that would have added more insight in BiP and/or KSHV biology. For instance, how do BiP protein levels increase during reactivation (is this at the level of RNA sequestration/export, translation, or protein stability?)? How does BiP promote lytic replication?

      Field of expertise: KSHV, molecular and cell biology

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Many viruses have complex relationships with cellular ER proteostasis machinery that remain poorly understood. Here Najarro, et al. report on studies of the oncogenic gammaherpesvirus KSHV. They report that the ER chaperone BiP is upregulated in epithelial cells during KSHV lytic replication. Unexpectedly, BiP upregulation is independent of the unfolded protein response, which stimulates transcriptional activation of BiP to meet the protein folding demand in the ER. Using a combination of genetic and pharmacologic approaches (CRISPRi and selective chemical inhibitor) they demonstrate that BiP inhibition interferes with the replication of diverse enveloped viruses including poxviruses and several herpesviruses, and reduces proliferation of KSHV-infected cells.

      Figure-by-figure:

      Fig. 1: This figure convincingly demonstrates the selective upregulation of BiP at the protein level during the course of KSHV lytic replication, and that KSHV late genes are dispensable for this upregulation. It further demonstrates that BiP is not upregulated at the mRNA level at all during KSHV infection, despite the fact that the UPR-dependent BiP mRNA upregulation pathway (presumably via ATF6 and IRE1) remains functional.

      Fig. 2: This figure convincingly demonstrates that BiP ATPase activity is required to support KSHV lytic replication in both epithelial and B cell models on infection, even though it is also clear that BiP is not upregulated in the B cell model.

      Fig. 3: This data demonstrates that steady-state levels of KSHV lytic gene products are reduced following HA15-treatment, whereas later gene expression was unaffected. As an interesting side note, v-IL6 bucks the trend of HA15-mediated downregulation of viral mRNA levels, suggesting that it may be regulated in a different manner. One thing that the authors may consider is the report from Drs. Yuan Chang and Patrick Moore (PMID: 12434062) that demonstrated that the v-IL6 gene is transactivated by type I interferon. Considering the poor replication of this virus during HA15 treatment, it may be valuable to investigate IFN production by these cells, and the extent to which it is impacted by inhibition of BiP ATPase activity.

      We thank the reviewer for bringing this report to our attention. We also found intriguing the specific transcriptional upregulation of IL6 in IFN-a treated BCP-1 cells. Although we see a dramatic upregulation of the vIL6 in HA15 treated cells, we still detect the expression of most viral genes, albeit at significantly lower levels than in untreated cells, which indicates that the viral transcriptional program in lytic+HA15 iSLK.219 cells is different from the one seen in IFN-treated BCP-1 cells. Preliminary analyses of the host transcriptome from our RNAseq results show the expression of several ISGs (OAS1, 2 and 3, IFI6, IFIT1, IFIT3, IFITM1) in lytic-untreated iSLK.219 cells, but not in those treated with HA15. Together, these observations substantiate the notion that there is no IFN-driven expression of vIL6 in HA15-treated iSLK.219 cells.

      Fig. 4: This figure demonstrates that HA15 has broad, non-cytotoxic, antiviral activity against diverse enveloped viruses.

      Figs. 5/6: These figures show cytotoxic effects of HA15 on latently infected PEL cells, either solely infected with KSHV or co-infected with KSHV and EBV, whereas normal B cells were unaffected. HA15 was also cytotoxic to KSHV-infected lymphatic endothelial cells.

      **Referees cross-commenting**

      I appreciate the insightful comments from Reviewer #1 and Reviewer #3. I think we are largely on the same page. The data is generally supportive of the authors' conclusions, with a few exceptions that are straightforward to address in revisions. The manuscript is limited in scope, which could also be addressed by additional experimentation if the authors are motivated to explore mechanism in greater depth. Of particular note is the lack of mechanistic insight into how BiP is upregulated at the protein level during lytic replication, if the mRNA is unchanged. The experimental approaches to this are straightforward.


      We appreciate the reviewers' comments on the scope of our study. The mechanism of BiP upregulation remains an outstanding question for the following technical reasons. We hypothesized that the upregulation of BiP may depend on the IRES element present in its 5' UTR (ref. 9). We tested this hypothesis by transfecting iSLK.219 cells with a bicistronic Renilla-(BiP)IRES-Firefly luciferase reporter from Licursi et al. (ref. 10). Unfortunately, for reasons that still elude us, our reactivation rates in transfected cells were consistently low in all of our experiments, and we were therefore unable to measure luciferase changes consistently and reliably. A potential workaround for this technical limitation is to use a lentivirus-encoded IRES reporter, as transduction of iSLK.219 cells does not alter viral reactivation, in our experience. At the moment, we do not have access to these reporters due to our lab's move to a different institution, and the first author of our study has started the next stage of their career. Therefore, we will not be able to pursue these experiments in a timely manner.

      As for the scope of this manuscript, even though the mechanism of BiP upregulation in KSHV-infected cells remains unsolved, we consider the broad-spectrum antiviral effect of BiP inhibition an exciting finding that advances the field and benefits the virology community: the proteostasis network has seldom been explored as a potential node for broad-spectrum antiviral intervention. Our results provide important proof-of-concept to continue the investigation of factors involved in protein synthesis, folding and transport as potential targets for the development of versatile broad-spectrum antivirals.

      Reviewer #2 (Significance (Required)):

      Strengths: This is a well-written manuscript. The text and figures are clear and accurate and the methods are sufficiently informative that the study can be reproduced. The data generally supports the authors' conclusions. BiP appears to be a druggable target with minimal off-target cytotoxicity in normal, uninfected cells, although this study does not go beyond cell culture studies to validate in vivo.

      Weaknesses: The study is somewhat limited in scope. The authors make the case for UPR transcription-independent upregulation of BiP during KSHV infection, and that late gene synthesis is dispensable, but the mechanism is not investigated further.

      Point by point discussion:

      Could an early KSHV gene product involved in this phenotype be identified by screening an ORF library or viral genome-wide CRISPRi screen?

      The question of the viral protein responsible for the upregulation of BiP during lytic infection is indeed a fascinating one. However, we suspect that the mechanism may be not specifically directed to BiP, but rather general modulation of IRES-related translation. Identifying the gene product(s) affected and corroborating IRES involvement is a major undertaking and a long-term goal requiring considerable effort. These analyses are outside the scope of this manuscript, but we will pursue them in the future.

      Or, beyond implicating viral factors in the mechanism of BiP upregulation, can some simple biochemical studies be performed to investigate BiP protein? Is the BiP mRNA more efficiently spliced and exported in KSHV infected cells?

      Do alternative translation initiation mechanisms like eIF2A play a role in boosting BiP levels during infection?

      What is the normal BiP protein turnover mechanism, and is this hindered during KSHV lytic replication? Is BiP AMPylation/de-AMPylation by FICD affected (PMID: 36041787)? These kinds of mechanistic studies are well within reach and would help extend the impact and interest to a broad audience.

      We agree on the putative involvement of translation initiation factors like eIF2A in promoting the translation of BiP (see discussion). We tested the effect of siRNA-mediated KD of eIF2A on BiP expression and found that, interestingly, the levels of BiP rose above those of controls in latent iSLK.219 cells (data included in the manuscript; the discussion has been modified accordingly). This finding aligns with previous reports suggesting that eIF2A may suppress IRES-mediated translation in yeast cells and in mammalian in vitro translation assays. Moreover, Starck et al. (ref. 11) observed a 50% increase of endogenous BiP levels in HeLa cells transfected with siRNAs against eIF2A, supporting the IRES-suppressor role for eIF2A in mammalian cells. Future work will be required to address the role of eIF2A in BiP translation. These analyses are beyond the scope of our manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Najarro et al. investigates the contribution of BiP/GRP78 to double-stranded DNA virus infection, primarily focusing on the oncogenic gammaherpesvirus Kaposi's sarcoma-associated herpesvirus (KSHV). The authors observe that BiP expression is increased in lytic iSLK.219 cells as well as in KSHV-infected LECs. Interestingly, the authors' data suggest a post-translational regulation of BiP in the iSLK.219 cells. Using various knockdown approaches and chemical inhibitors, the authors demonstrate that inhibition of BiP impacts KSHV reactivation in multiple cell lines. Importantly, the authors also find that BiP inhibition can selectively kill KSHV-infected cells, while sparing primary B cells. Overall, this is a very well controlled and presented manuscript. My comments for the manuscript are minor, and largely cosmetic to aid the presentation of the data.

      • Fig 1C, It would be ideal to show that PAA treatment did indeed prevent the virus from entering the late stage of gene expression.

      We have included an immunoblot for K8.1 in Figure 1C to confirm the effect of PAA on arresting the KSHV lytic cycle.

      Sup Fig2, should show KD efficiency of XBP1, same goes for ATF6.


      Sup. Fig. 2D shows the expression of XBP1s in NS vs. XBP1KD cells in the presence or absence of Tg. In Sup Fig. 2G we have also included a bar graph showing the efficiency of downregulation of ATF6 mRNA in the presence of the targeting sgRNA.

      Sup Fig 3. It is interesting that the authors do not see increased BiP in TREx-BCBL1-RTA cells. A potential caveat is that lytic reactivation in TREx-BCBL1-RTA cells is not as efficient as in iSLK.219 cells. Therefore, it may simply be a result of the reduced population entering the lytic cycle. It may be worth adding a comment regarding this.

      • Images of the microscopy for Figure 4 would be useful.

      Images have been included in Fig. 4

      • Add label of the cell types for Figure 5.

      DONE

      • Does HSV1, HCMV, or VacV increase BiP expression compared to mock-infected cells?

      Yes, we have included a comment on this in the discussion

      Reviewer #3 (Significance (Required)):

      Overall, this is a very well controlled and presented manuscript.


    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Many viruses have complex relationships with cellular ER proteostasis machinery that remain poorly understood. Here Najarro, et al. report on studies of the oncogenic gammaherpesvirus KSHV. They report that the ER chaperone BiP is upregulated in epithelial cells during KSHV lytic replication. Unexpectedly, BiP upregulation is independent of the unfolded protein response, which stimulates transcriptional activation of BiP to meet the protein folding demand in the ER. Using a combination of genetic and pharmacologic approaches (CRISPRi and selective chemical inhibitor) they demonstrate that BiP inhibition interferes with the replication of diverse enveloped viruses including poxviruses and several herpesviruses, and reduces proliferation of KSHV-infected cells.

      Figure-by-figure:

      Fig. 1: This figure convincingly demonstrates the selective upregulation of BiP at the protein level during the course of KSHV lytic replication, and that KSHV late genes are dispensable for this upregulation. It further demonstrates that BiP is not upregulated at the mRNA level at all during KSHV infection, despite the fact that the UPR-dependent BiP mRNA upregulation pathway (presumably via ATF6 and IRE1) remains functional.

      Fig. 2: This figure convincingly demonstrates that BiP ATPase activity is required to support KSHV lytic replication in both epithelial and B cell models of infection, even though it is also clear that BiP is not upregulated in the B cell model.

      Fig. 3: This data demonstrates that steady-state levels of KSHV lytic gene products are reduced following HA15-treatment, whereas later gene expression was unaffected. As an interesting side note, v-IL6 bucks the trend of HA15-mediated downregulation of viral mRNA levels, suggesting that it may be regulated in a different manner. One thing that the authors may consider is the report from Drs. Yuan Chang and Patrick Moore (PMID: 12434062) that demonstrated that the v-IL6 gene is transactivated by type I interferon. Considering the poor replication of this virus during HA15 treatment, it may be valuable to investigate IFN production by these cells, and the extent to which it is impacted by inhibition of BiP ATPase activity.

      Fig. 4: This figure demonstrates that HA15 has broad, non-cytotoxic, antiviral activity against diverse enveloped viruses.

      Figs. 5/6: These figures show cytotoxic effects of HA15 on latently infected PEL cells, either solely infected with KSHV or co-infected with KSHV and EBV, whereas normal B cells were unaffected. HA15 was also cytotoxic to KSHV-infected lymphatic endothelial cells.

      Referees cross-commenting

      I appreciate the insightful comments from Reviewer #1 and Reviewer #3. I think we are largely on the same page. The data is generally supportive of the authors' conclusions, with a few exceptions that are straightforward to address in revisions. The manuscript is limited in scope, which could also be addressed by additional experimentation if the authors are motivated to explore mechanism in greater depth. Of particular note is the lack of mechanistic insight into how BiP is upregulated at the protein level during lytic replication, if the mRNA is unchanged. The experimental approaches to this are straightforward.

      Significance

      Strengths: This is a well-written manuscript. The text and figures are clear and accurate and the methods are sufficiently informative that the study can be reproduced. The data generally supports the authors' conclusions. BiP appears to be a druggable target with minimal off-target cytotoxicity in normal, uninfected cells, although this study does not go beyond cell culture studies to validate in vivo.

      Weaknesses: The study is somewhat limited in scope. The authors make the case for UPR transcription-independent upregulation of BiP during KSHV infection, and that late gene synthesis is dispensable, but the mechanism is not investigated further. Could an early KSHV gene product involved in this phenotype be identified by screening an ORF library or viral genome-wide CRISPRi screen? Or beyond implicating viral factors in the mechanism of BiP upregulation, can some simple biochemical studies be performed to investigate BiP protein? Is the BiP mRNA more efficiently spliced and exported in KSHV infected cells? Do alternative translation initiation mechanisms like eIF2A play a role in boosting BiP levels during infection? What is the normal BiP protein turnover mechanism, and is this hindered during KSHV lytic replication? Is BiP AMPylation/de-AMPylation by FICD affected (PMID: 36041787)? These kinds of mechanistic studies are well within reach and would help extend the impact and interest to a broad audience.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript "Engineering of PAClight1P78A: A High-Performance Class-B1 GPCR-Based Sensor for PACAP1-38" by Cola et al. presents the development of a novel genetically encoded sensor, PAClight1P78A, based on the human PAC1 receptor. The authors provide a thorough in vitro and in vivo characterization of this sensor, demonstrating its potential utility across various applications in life sciences, including drug development and basic research.

      The diverse methods to validate PAClight1P78A demonstrate a comprehensive approach to sensor engineering by combining biochemical characterization with in vivo studies in rodent brains and zebrafish. This establishes the sensor's biophysical properties (e.g., sensitivity, specificity, kinetics, and spectral properties) and demonstrates its functionality in physiologically relevant settings. Importantly, the inclusion of control sensors and the testing of potential intracellular downstream effects such as G-protein activation underscore a careful consideration of specificity and biological impact.

      Strengths:

      The fundamental development of PAClight1P78A addresses a significant gap in sensors for Class-B1 GPCRs. The iterative design process -starting from PAClight0.1 to the final PAClight1P78A variant - demonstrates compelling optimization. The innovative engineering results in a sensor with a high apparent dynamic range and excellent ligand selectivity, representing a significant advancement in the field. The rigorous in vitro characterization, including dynamic range, ligand specificity, and activation kinetics, provides a critical understanding of the sensor's utility. Including in vivo experiments in mice and zebrafish larvae demonstrates the sensor's applicability in complex biological systems.

      Weaknesses:

      The manuscript shows that the sensor fundamentally works in vivo, albeit in a limited capacity. The titration curves show sensitivity in the nmol range at which endogenous detection might be possible. However, perhaps the sensor is not sensitive enough or there are not any known robust paradigms for PACAP release. A more detailed discussion of the sensor's limitations, particularly regarding in vivo applications and the potential for detecting endogenous PACAP release, would be helpful.

      We thank the reviewer for carefully analyzing our in vivo data and highlighting the limitation of our results regarding the sensor's applicability in detecting endogenous PACAP. We added several sections to the discussion covering future possibilities for optimization (see paragraphs 2-4). We agree that a more specific discussion of the limitations of our study is an important addition to help design future experiments.

      There are several experiments with an n=1 and other low single-digit numbers. I assume that refers to biological replicates such as mice or culture wells, but it is not well defined. n=1 in experimental contexts, particularly in Figure 1, raises significant concerns about the exact dynamic range of the sensor, data reproducibility, and the robustness of conclusions drawn from these experiments. Also, ROI for cell cultures, like in Figure 1, is not well defined. The methods mentioned ROIs were manually selected, which appears very selective, and the values in Figure 1c become unnecessarily questionable. The lack of definition for "ROI" is confusing. Do ROIs refer to cells, specific locations on the cell membrane, or groups of cells? It would be best if the authors could use unbiased methods for image analysis that include the majority of responsive areas or an explanation of why certain ROIs are included or excluded.

      We thank the reviewer for the helpful suggestions. We have increased the number of replicates to n=3 for both HEK293T and neuron data depicted in Fig.1c. Furthermore, we have added Fig.1c’ containing the quantification of the maximum responses obtained in the dataset shown in Fig.1c also depicting the single values for each replicate. To clarify the definition of an ROI in our manuscript, we have detailed the process of ROI selection in the Methods section “Cell culture, imaging and quantification section”. Additionally, we also increased mouse numbers for in vivo PACAP infusions in mice (see Figure 4g).

      Reviewer #2 (Public Review):

      Summary:

      The PAClight1 sensor was developed using an approach successful for the development of other fluorescence-based GPCR sensors, which is the complete replacement of the third intracellular loop of the receptor with a circularly-permuted green fluorescent protein. When expressed in HEK cells, this sensor showed good expression and a weak but measurable response to the extracellular presence of PACAP1-38 (a ΔF/Fo of 43%). Additional mutation near the site of insertion of the linearized GFP, at the C-terminus of the receptor, and within the second intracellular loop produced a final optimized sensor with ΔF/Fo of >1000%. Finally, screening of mutational libraries that also included alterations in the extracellular ligand-binding domain of the receptor yielded a molecule, PAClight1P78A, that exhibited a high ligand-dependent fluorescence response combined with a high differential sensitivity to PACAP (EC50 30 nM based on cytometric sorting of stably transfected HEK293 cells) compared to its congener VIP (with which PACAP shares two highly related receptors, VPAC1 and VPAC2) as well as several unrelated neuropeptides, and significantly slowed activation kinetics by PACAP in the presence of a 10-fold molar excess of the PAC1 antagonist PACAP6-38. A structurally highly similar control construct, PAClight1P78Actl, showed correspondingly similar basal expression in HEK293 cells, but no PACAP-dependent enhancement in fluorescent properties.

      PAClight1P78A was expressed in neurons of the mouse cortex via AAV9.hSyn-mediated gene transduction. Slices taken from PAClight1P78A-transfected cortex, but not slices taken from PAClight1P78Actl-transfected cortex, exhibited prompt and persistent elevation of ΔF/Fo after 2 minutes of perfusion with PACAP1-38 which persisted for up to 14 minutes and was statistically significant after perfusion with 3000, but not 300 or 30 nM, of peptide. Likewise, microinfusion of 200 nL of 300 µM PACAP1-38 into the cortex of optical fiber-implanted freely moving mice elicited a ΔF/Fo (%) of greater than 15, significantly higher than that elicited by application of similar concentrations of VIP, CRF, or enkephalin, or vehicle alone. In vivo experiments were carried out in zebrafish larvae by the introduction of PAClight1P78A into single-cell stage Danio rerio embryos using a Tol2 transposase-based plasmid with a UAS promoter via injection (of plasmid and transposase mRNA), and sorting of post-fertilization embryos using a marker for transgenesis carried in the UAS:PAClight1P78A construct. Expression of PAClight1P78A was directed to cells in the olfactory bulb which express the fish paralog of the human PAC1 receptor by using the Tg(GnRH3:gal4ff) line, and fluorescent signals were elicited by intracerebroventricular administration of PACAP1-38 at a single concentration (1 mM), which were specific to PACAP and to the presence of PAClight1P78A per se, as controlled by parallel experiments in which PAClight1P78Actl instead of PAClight1P78A was contained in the transgenic plasmid.

      Major strengths and weaknesses of the methods and results

      The report represents a rigorous demonstration of the elicitation of fluorescent signals upon pharmacological exposure to PACAP in nervous system tissue expressing PAClight1P78A in both mammals (mice) and fish (zebrafish larvae). Figure 4d shows a change in GFP fluorescence activation by PACAP occurring several seconds after the cessation of PACAP perfusion over a two-minute period, and its persistence for several minutes following. One wonders if one is apprehending the graphical presentation of the data incorrectly, or if the activation of fluorescence efficiency by ligand presentation is irreversible in this context, in which case the utility of the probe as a real-time indicator, in vivo, of released peptide might be diminished.

      We thank the reviewer for their careful consideration of our manuscript and agree that the activation of PAClight persisting for several minutes at micromolar concentrations could be a potential limitation for in vivo applications. We added a possible explanation for the persisting sensor activation in response to artificial application of PACAP38 in paragraph 3 of the discussion. We agree that this addition eases the interpretation of PAClight signals detected in vivo. 

      Appraisal of achievement of aims, and data support of conclusions:

      Small cavils with controls are omitted for clarity; the larger issue of appraisal of results based on the scope of the designed experiments is discussed in the section below. An interesting question related to the time dependence of the PACAP-elicited activation of PAClight1P78A is its onset and reversibility, and additional data related to this would be welcome.

      We agree that the reversibility of the sensor's fluorescence is indeed an important feature, especially for detecting endogenous PACAP release. Our data indicate that the sensor's fluorescence is reversible when detecting small to medium doses of PACAP38 (see Figure 4d, application of 30-300 nM) that are presumably closer to physiological concentrations than the non-reversible concentration of 3000 nM. Please see also our new discussion on peptide concentrations in paragraph 4 of our discussion. For future experiments, it is indeed advisable to adjust the interval of repeated applications to the decay of the response at the respective concentration. Considering the long-lasting downstream effects of endogenous signaling, longer intervals between ligand applications are generally preferred to match more closely the physiological range in which endogenous PAC1 is most likely effective.

      Discussion of the impact of the work, and utility of the methods and data:

      Increasingly, neurotransmitter function may be observed in vivo, rather than by inferring in vivo function from in vitro, in cellular, or ex vivo experimentation. This very valuable report discloses the invention of a genetically encoded sensor for the class B1 GPCR PAC1. PAC1 is the major receptor for the neuropeptide PACAP, which in turn is a major neurotransmitter involved in brain response to psychogenic stress, or threat, in vertebrates as diverse as mammals and fishes. If this sensor possesses the sensitivity to detect endogenously released PACAP in vivo it will indeed be an impactful tool for understanding PACAP neurotransmission (and indeed PACAP action in general, in immune and endocrine compartments as well) in future experiments.

      However, the sensor has not yet been used to detect endogenously released PACAP. Until this has been done, one cannot answer the question as to whether the levels of exogenously perfused/administered PACAP used here merely to calibrate the sensor's sensitivity are indeed unphysiologically high. If endogenous PACAP levels don't get that high, then the sensor will not be useful for its intended purpose. The authors should address this issue and allude to what kind of experiments would need to be done in order to detect endogenous PACAP release in living tissue in intact animals. The authors could comment upon the success of other GPCR sensors that have been used to observe endogenous ligand release, and where along the pathway to becoming a truly useful reagent this particular sensor is.

      We thank the reviewer for highlighting the lack of clarity regarding the scope of this paper, which was not intended to cover the detection of endogenous PACAP release. We therefore expanded our discussion to encompass the intended purpose of detecting artificially infused or applied PAC1 agonists, such as conducting fundamental tests of drug specificity and developing new pharmacological ligands to selectively target PAC1. This includes a more detailed discussion of our in vivo findings and a clearer phrasing that stresses the potential application for applied drugs and not endogenous PACAP (see last paragraph in the discussion).

      We also agree that little is known about endogenous concentrations of PACAP in the brain. However, we have supplemented our discussion with several references estimating lower concentrations of PACAP and other peptides in vivo, suggesting average PACAP levels below the detection threshold of the sensor. Importantly, within certain brain regions and in closer proximity to release sites, significantly higher concentrations might be reached. Additionally, our data indicate that the concentrations observed under our current conditions do not saturate the sensor in vivo.  

      We therefore acknowledge the reviewer’s comment on the sensor’s potential limitations under our current experimental conditions. Hence, we expanded our discussion and suggest the use of higher resolution imaging to potentially reveal loci of high PACAP concentrations, which should be validated by future studies (see also our added discussion in paragraph 4). 

      Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces PAClight1P78A, a novel genetically encoded sensor designed to facilitate the study of class-B1 G protein-coupled receptors (GPCRs), focusing on the human PAC1 receptor. Addressing the significant challenge of investigating these clinically relevant drug targets, the sensor demonstrates a high dynamic range, excellent ligand selectivity, and rapid activation kinetics. It is validated across a variety of experimental contexts including in vitro, ex vivo, and in vivo models in mice and zebrafish, showcasing its utility for high-throughput screening, basic research, and drug development efforts related to GPCR dynamics and pharmacology.

      Strengths:

      The innovative design of PAClight1P78A successfully bridges a crucial gap in GPCR research by enabling real-time monitoring of receptor activation with high specificity and sensitivity. The extensive validation across multiple models emphasizes the sensor's reliability and versatility, promising significant contributions to both the scientific understanding of GPCR mechanisms and the development of novel therapeutics. Furthermore, by providing the research community with detailed methodologies and access to the necessary viral vectors and plasmids, the authors ensure the sensor's broad applicability and ease of adoption for a wide range of studies focused on GPCR biology and drug targeting.

      Weaknesses

      To further strengthen the manuscript and validate the efficacy of PAClight1P78A as a selective PACAP sensor, it is crucial to demonstrate the sensor's ability to detect endogenous PACAP release in vivo under physiological conditions. While the current data from artificial PACAP application in mouse brain slices and microinfusion in behaving mice provide foundational insights into the sensor's functionality, these approaches predominantly simulate conditions with potentially higher concentrations of PACAP than naturally occurring levels.

      We thank the reviewer for their valuable comments and agree that the use of PAClight for detecting endogenous PACAP will be of big interest for the scientific community and should be a goal for future research. Considering the time, equipment and additional animal licenses necessary, we are convinced that these questions would go beyond the scope of the current paper and might rather be addressed in a follow-up publication. We therefore rephrased the discussion and added more details to clarify further the intended purpose of the current study. Additionally, we added a paragraph in the discussion suggesting experiments needed to validate PAClight for putative future in vivo applications. 

      Although the sensor's specificity for the PAC1 receptor and its primary ligand is a pivotal achievement, exploring its potential application to other GPCRs within the class-B1 family or broader categories could enhance the manuscript's impact, suggesting ways to adapt this technology for a wider array of receptor studies. Additionally, while the sensor's performance is convincingly demonstrated in short-term experiments, insights into its long-term stability and reusability in more prolonged or repeated measures scenarios would be valuable for researchers interested in chronic studies or longitudinal behavioral analyses. Addressing these aspects could broaden the understanding of the sensor's practical utility over extended research timelines.

      We extend our gratitude to the reviewer for diligently assessing our results. 

      Indeed, the very high level of sensitivity that we could achieve in PAClight leads us to think that potentially a grafting-based approach, such as the one we’ve recently described for class-A GPCR-based sensors (PMID: 37474807) could also work for the direct generation of multiple class-B1 sensors based on the optimized fluorescent protein module present in PAClight. Unfortunately, considering the amount of work that testing this hypothesis would entail, we are not able to perform these experiments in the context of this revision, and would rather pursue them as a future project. Nevertheless, we have expanded the discussion of the manuscript with a paragraph with these considerations.

      While we lack comprehensive data on the long-term stability of the sensor, our preliminary findings from the optimization of photometry recordings indicate consistent baseline expression of PAClight and PAClight ctrl over several weeks. Conducting experiments to systematically assess stability would require several months, which is currently impractical due to limitations in tools and licenses for repeated in vivo infusions. Hence, we intend to include these experiments in potential follow-up studies.

      Furthermore, the current in vivo experiments involving microinfusion of PACAP near sensor-expressing areas in behaving mice are based on a relatively small sample size (n=2), which might limit the generalizability of the findings. Increasing the number of subjects in these experimental groups would enhance the statistical power of the results and provide a more robust assessment of the sensor's in vivo functionality. Expanding the sample size will not only validate the findings but also address potential variability within the population, thereby reinforcing the conclusions drawn from these crucial experiments.

      We agree with the reviewer that a sample size of N=2 is not sufficient for in vivo recordings. We therefore increased the sample size and now present recordings with 5 PAClight1P78A and 4 PAClight-control mice. Of note, the new data validate our previous findings and conclusions and give a better idea of the in vivo variability, which we now discuss in much more detail in the discussion (see paragraph 2).

      Recommendations for the Authors:

      Reviewer #1 (Recommendations For The Authors):

      The lower potency of maxadilan activation might reflect broader implications for ligand-receptor dynamics. Perhaps the authors could discuss the maxadilan binding from a structural perspective, including AlphaFold models. Also, discussing how these findings might influence sensor application in diverse biological contexts would be insightful. Clear definitions and consistent use of these terms are crucial for ensuring that readers understand the methods and results.

      We would like to thank the reviewer for the comments. As part of this work, we did not obtain a dose-response curve for the maxadilan peptide, and only reported the maximal response of the sensor to a high concentration of the peptide (10 µM). Thus, our findings inform us on the maximal efficacy of the peptide rather than on its potency towards the PAC1R. Furthermore, we would like to point out that, due to the lack of structural details for any GPCR-based sensor published to date, we cannot draw any molecularly accurate conclusion regarding the precise reasons why a different ligand (in this case the sandfly maxadilan) induces a lower maximal efficacy of the response compared to the endogenous cognate ligand of the receptor. We do not believe that AlphaFold models can accurately replace structural information in this regard, especially given that the amino acid linker regions between the GPCR and the fluorescent protein, which are a critical determinant of allosteric chromophore modulation by ligand-induced conformational changes, typically obtain the lowest confidence score in all AlphaFold-predicted structural models of GPCR-based sensors. Finally, we would like to refer the reviewer to a very nice recent publication (PMID: 32047270) that resolved the structures of each of these peptides bound to the PAC1 receptor-Gs protein complex, and that provides accurate molecular details on the different modalities of receptor binding and activation by PACAP1-38 versus maxadilan.
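      The distinction drawn above between maximal efficacy and potency can be illustrated with a minimal sketch based on the standard Hill equation. The function name and all parameter values below are hypothetical and chosen purely for illustration; they are not measurements from the manuscript.

```python
def hill_response(concentration, e_max, ec50, hill_coefficient=1.0):
    """Response predicted by the Hill equation.

    e_max is the maximal efficacy (plateau of the curve) and ec50 is the
    potency (concentration giving half of e_max). All values used here
    are illustrative assumptions, not data from the study.
    """
    if concentration <= 0:
        return 0.0
    ratio = (concentration / ec50) ** hill_coefficient
    return e_max * ratio / (1.0 + ratio)


# Two hypothetical ligands with the same efficacy but a 10-fold
# difference in potency (EC50 of 10 nM versus 100 nM).
saturating_dose = 1e-5  # 10 uM, expressed in mol/L
response_potent = hill_response(saturating_dose, e_max=1.0, ec50=1e-8)
response_weak = hill_response(saturating_dose, e_max=1.0, ec50=1e-7)

# At a saturating dose both responses sit close to e_max, so a single
# high-concentration measurement cannot separate the two EC50 values.
print(response_potent, response_weak)
```

      Under these assumptions, a single measurement at a saturating concentration reports on the plateau of the curve only, which is why a full dose-response curve would be needed to estimate potency.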

      Reviewer #2 (Recommendations For The Authors):

      The authors are congratulated on the meticulous achievement of their aim, i.e. a fluorescence-based sensor for the detection of PACAP with in vivo utility. Whether or not this sensor will have the requisite sensitivity to detect the release of endogenous PACAP within various regions of the nervous system, in response to specific environmental stimuli or changes in brain or physiological state, remains to be determined.

      We thank the reviewer for the very positive evaluation of our manuscript and for the suggested additions that will improve the strength of our arguments.

      We agree that the in vivo detection of endogenous PACAP will be an important objective for future studies. Due to time, resource and animal license constraints, we are not able to address this objective in our current study, but we now detail possible future experiments in the discussion section. Please also see our earlier answer regarding the suggested discussion points.

      Reviewer #3 (Recommendations For The Authors):

      To comprehensively assess the sensor's sensitivity and specificity to endogenous PACAP, I recommend conducting additional in vivo experiments where PAClight1P78A is expressed in neurons that endogenously express the Pac1r receptor (using Adcyap1r1-Cre mouse line). These experiments should involve applying sensory or emotional stimuli known to evoke PACAP release or activating upstream PACAP-expressing neurons. Such studies would offer valuable data on the sensor's performance under natural physiological conditions and its potential utility for exploring PACAP's roles in vivo.

      We express our gratitude to the reviewer for providing detailed methodological approaches to examine endogenous PACAP release. These suggestions will prove invaluable for future investigations and are important additions to a follow-up publication. As mentioned earlier, we have incorporated some of these approaches into our discussion. Additionally, we have underscored the existing limitations in detecting endogenous PACAP in vivo and emphasized the relevance of PAClight for drug development purposes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): Throughout, the authors claim that there is a cross-talk between UPRmt and SG. This is unsubstantiated and unclear.

      We strongly disagree with this comment. Throughout the manuscript, we show how manipulating UPRmt signalling affects SG formation, and how manipulating SG assembly alters mitochondrial functions and UPRmt-associated mitochondrial outputs. In addition, both other reviewers are supportive of our conclusions.

      Major: Link between UPRmt and stress granules:

      The authors claim a link between the UPRmt and stress granule formation based on the finding that the loss of ATF5 affects the expression of UPRmt markers, but not ISR markers. Yet, the authors actually show that GTPP-induced SGs form in a manner independent of ATF5 (Supp. Fig. 2). Thus, there is no data in the manuscript that substantiates this claim.

      In the revised manuscript, we show that reducing ATF5 levels results in defective SG assembly, with SGs that are smaller and more numerous, reflecting a maturation defect (Sup Figure 6B, 6C and 6D). In addition, we show a clear dependence of SGs on PERK activation (see comment below) and a specific increase of the main negative regulator of the ISR, GADD34 (Figure 2A and 2B). Therefore, we disagree with this reviewer's conclusion and provide data supporting a link between UPRmt and SG formation.

      PERK-mediated activation of the ISR. The authors claim that PERK mediates activation of the ISR following GTPP treatment. However, the experiments in Fig. 2E were done 1h after treatment. The authors in Fig. 1C nicely show that SG formation begins at 2h. Thus, it is possible that following a longer GTPP treatment (i.e. >2h) the ISR is activated by different branches; for example, the mitochondrial branch that is mediated by HRI. Thus, the authors should determine which kinase mediates ISR activation at the time point that SG formation is maximal.

      We apologise if the description of the experimental procedure was unclear. These experiments are performed at 2h post GTPP treatment as explained in the text (see line 222) and legend (see lines 715-717, Figure 2 legend), and therefore performed at a time of maximal SG induction. Therefore, the identification of PERK as the driver for eIF2α-P and SG formation is performed at a time point where SG formation is maximal.

      Role of SG-linked decrease in cellular adaptation to stress. The finding that SGs limit mitochondrial respiration is interesting. Presumably this promotes cellular adaptation to mitochondrial stresses. The authors should test whether G3BP1/2 DKO cells are more susceptible to death following longer GTPP treatments.

      We thank the reviewer for this comment. These data are presented in Figure 8, where we show that G3BP1/2 dKO cells are less viable compared to wild-type cells following GTPP treatment for up to 28 hours.

      Minor: Fig. 2C should be moved to supplemental as well as the data indicated the lack of ISR inhibition.

      Figure 2C is now supplementary Figure 3.

      Fig. 3A should have representative images of all conditions from Fig. 3B.

      This has now been included as supplementary Figure 4.

      IFAs in Fig. 3 and 4 are hard to interpret given both DAPI and G3BP1 are in shades of blue. Ideally, insets of a merged panel should show each individual panel.

      We adopted the combination of cyan, magenta and blue for our images to make scientific figures accessible to readers with red/green colour-blindness. For these figures, G3BP1 is in light cyan and DAPI in dark blue, a colour scheme we adopted previously in three publications (PMID 36965618, PMID 35098996, PMID 31905230), allowing colour-blind readers to appreciate the results.

      Reviewer #1 (Significance (Required)): The link between the UPRmt and SGs is interesting and would be an advance. However, the authors put forward data that indicates SGs form in an UPRmt (ATF5)- independent manner. An interesting aspect of this story for which there is data is that SGs limit mitochondrial function. This should be explored further (i.e. although it limits mitochondrial respiration, perhaps SGs protect mitochondria against chronic ISR stress).

      As suggested, we now provide an extensive amount of additional data supporting a role in mitochondrial functions, with data demonstrating that the absence of SGs rescues cell viability (Figure 8A and 8B), restoring mitochondrial functions such as respiration, ATP production (Figure 6D, 6E and 6F) or translation (Figure 7A), and reducing the production of mitochondrial ROS (Figure 6C) or mitochondrial fragmentation (Figure 6A and 6B).

      Reviewer #2 (Evidence, reproducibility and clarity (Required)): Summary: The article by Lopez-Nieto Jordana et al entitled "Activation of the mitochondrial unfolded protein response regulates the dynamic formation of stress granules" describes the identification of a novel cross talk between the mitochondrial unfolded protein response (UPRmt) and the integrated stress response (ISR) and the contributory role SG regulation plays in mitochondrial function and adaptation to stress. This manuscript presents data highlighting that activation of the UPRmt results in the temporal modulation of SG formation via GADD34 levels and further this analysis by suggesting that these levels of GADD34 may enable cells to be protected from prolonged stress.

      Minor comments: This is a very well written manuscript with beautifully presented data. There are some inconsistencies/typos with the abbreviation GTPP- this needs to be checked within the manuscript but examples are on Lines: 204/206/214/324/328/357.

      This has now been corrected throughout.

      Check reference list for inconsistencies; line 680 reference has no page numbers, line 718 reference has no issue or page numbers

      This has now been corrected, references curated throughout.

      Line 255 - is it correct to say induction here? I think impairment should be used.

      This has now been corrected, see lines 283-284.

      Cell type not mentioned in Fig 2 legend.

      This has now been corrected, see line 707.

      Errors in Fig 4 legend - 4F, G do not exist.

      This has now been corrected, see lines 748-750.

      Major comments: In figure 1- the GTPP treatment only results in 25% of cells showing SGs compared with 80% in Ars treated cells. While the activation of ISR markers by GTPP treatment is convincing (in Figure 2A), What happens to overall protein synthesis levels in these cells? Puromycin incorporation assays would be a useful addition here.

      We now show in Figure 1D that GTPP treatment results in a global reduction in translation, and that cells displaying SGs present with a stronger shut-off when compared with treated cells lacking SGs.

      Fig. 1A - ATF4 upregulation is lower in ATF5 siRNA treated cells - what is % uptake of the siRNA in these cells - also see comment below. If possible, it would be nice to see the re-localisation of ATF5 to the nucleus to confirm the UPRmt activation of this protein

      These are experiments that we had planned to perform, however in our hands none of the commercially available antibodies allowed us to determine with confidence the localisation of ATF5. We have not determined the uptake of ATF5 siRNA but show by qPCR a reduction in ATF5 mRNA levels following siRNA treatment (see Figure 1A).

      Does the dispersal of SGs also correlate with a recovery of protein synthesis? There is still a relatively high level of eIF2α-P at 8 h (from Figure 2A).

      We have not performed these experiments as we do not believe they would have added depth to our study. It is well accepted that SG disassembly results in mRNA re-entry in polysomes and the restart of translation (PMID: 30664789). SGs disappear a few minutes before translation is resumed.

      In Figure 2A the 30 min treatment of GTPP induces a robust level of eIF2α-P yet SGs are only observed following the induction of ATF4/GADD34 at 2h. Puromycin incorporation assays may also be able to shed light on the lack of SG inductions at this stage. The formation of SGs around the time when ATF4 and GADD34 are induced seems counterintuitive and should be commented on.

      As commented in response to an earlier point, our analysis shows that GTPP results in a global reduction in translation levels. The assembly of SGs in a subpopulation of cells (as also reported in the context of many viral infections) may reflect cell-specific differences in the levels of eIF2α kinases and/or differences in reaching the threshold needed for eIF2α phosphorylation to induce SG assembly (as shown in PMID 30674674 and PMID 35319985).

      In line 207-208 you state that "PERK is the main eIF2α kinase responsive to GTTP. Overall, these results suggest that induction of the UPRmt is associated with an early SG assembly and ISR activation through PERK." Does the PERK inhibitor inhibit the formation of SG following GTTP treatment?

      This is now shown in Figures 2E and 2F. Indeed pharmacological inhibition of PERK following GTPP treatment resulted in inhibition of SG assembly.

      Additionally, does GTPP activation of the UPRmt also induce an oxidative stress and therefore activate an additional EIF2AK such as HRI? If so could be the reason you don't get formation of SGs following Ars treatment? Have you considered what would happen if you used the UV stress which activates GCN2 followed by Ars treatment?

      As shown in Figures 2D and 2E, we could not detect a contribution from the other eIF2α kinases GCN2 and PKR following GTPP treatment, and Figures 2E and 2F demonstrate that PERK inhibition is sufficient to revert eIF2α phosphorylation and ablate SG induction, as noted in the response to the point above. This strongly suggests that the eIF2α kinase HRI does not contribute to eIF2α signalling; however, we do not exclude, in the broader sense (beyond eIF2α signalling), an induction of oxidative stress during UPRmt activation. Furthermore, as shown in Figure 2D, A-92 treatment reduced p-eIF2α levels in response to UV treatment but not those induced by GTPP; therefore, we can exclude a contribution from GCN2. If we understand correctly, this reviewer asks what would happen if cells were UV-stressed to activate GCN2 and then subjected to oxidative stress with arsenite. This is outside the scope of this manuscript, but based on our previous work showing that GADD34 mRNA levels act as the molecular memory of the ISR and drive cell adaptation to acute and chronic stress, we would expect that the response to a second pulse of stress would be dampened by the sustained level of GADD34 mRNA induced following the first stress (see PMID 35319985). In these previous studies we already demonstrated that induction of p-eIF2α and SGs by a first acute stress (heat shock or thapsigargin) impairs the induction of p-eIF2α and SGs by a second acute (heat shock or arsenite) or chronic (HCV infection) stress (PMID 35319985, see Figure 6; PMID: 38602876, see Figure 7).

      Overall, this and the response to the previous comment strongly support that PERK activation, and the resulting induction of GADD34, are responsible for SG regulation following GTPP treatment.

      In Figure 3, for the paraquat experiments have you missed the transient induction of SGs by only looking at 48h? You already have GADD34 levels high here so SGs/eIF2α-P levels will already be lowered.

      We have now included additional timepoints, see supplementary Figure 5, showing the absence of SGs at 1, 2, 6 and 24h post paraquat treatment, to complement the 48h treatment previously shown.

      In addition, when analysing GTPP + Ars treatment impact on SG formation (Fig 2B), could the 2 h GTPP + Ars data also be included, as this is the peak time for SG induction by GTPP

      This is now included in Figure 3B.

      In line 211 you refer to the early and late stages of the stress, how have these been defined? It seems that the ability of the UPRmt to be protective to an additional stressor is time dependent- the number of SGs that are present following the additional stress increases from 4-8h. Does this correlate with a decrease in the level of GADD34?

      We define early and late as the time points corresponding to the induction (early) or disassembly (late) of SGs. Also see lines 227-230.

      In line 254 you state that ATF5 silencing didn't impact the ISR or SG formation? These data suggest that the formation of SGs is not a direct impact of activation of the UPRmt but rather activation of the cellular ISR possibly due to the proteotoxic and/or oxidative stress? Can the authors comment on this?

      We now show in supplementary Figure 6 that reducing the expression of ATF5 results in defects in SG maturation, with GTPP treatment resulting in more numerous and smaller SGs. Moreover, it should be noted that HSF1, in addition to ATF5, is a key controller of UPRmt induction, and future studies could aim at dissecting the role of HSF1 in the SG-UPRmt crosstalk (discussed in lines 459-461).

      In Figure 4, If GADD34 was driving the loss of SGs in GTPP treated cells why are SGs not persistent in these KO cells. Please comment on this.

      Two phosphatases are known to catalyse eIF2α-P dephosphorylation, GADD34 and CReP. The current model proposes that GADD34, which is induced following stress, acts in a negative feedback loop to resolve cellular stress. In contrast, CReP is constitutively expressed and controls basal eIF2α-P levels independently of stress levels (PMID 27161320). In recent work, we have shown that when GADD34 expression is silenced, CReP takes over to revert eIF2α-P and therefore disassemble SGs (PMID: 38602876). This work also showed that CReP is stress-induced in the absence of GADD34. Therefore, in Figure 4 we can speculate that the absence of SGs in GTPP-treated KO cells is due to the ability of CReP to compensate for the absence of GADD34. In the context of GTPP treatment followed by arsenite, GADD34 is important to increase the threshold at which SGs can form, altering the response to a second pulse of stress.

      In addition, in these GADD34KO cells there should also be a persistent level of eIF2α-P when treated with GTPP and Pq, there is some as evidenced by the quantification but this is not very convincing

      As noted here, we do provide evidence of sustained levels of eIF2α-P, at least in cells treated with GTPP: the results of independent experiments (n=3) show persistent phosphorylation in GADD34 KO cells relative to WT cells. However, as noted in the point above, the likely activity of CReP can compensate for the lack of GADD34 and therefore dampen the amount of eIF2α phosphorylation observed.

      Fig 4B shows no cells exhibiting SG following 4h GTPP treatment, which does not correlate with other experiments in the original cell line, e.g. supp 2B - please explain. Can GTPP still activate the UPR-mt in this CRISPR control cell line

      GTPP still activates the UPRmt in the CRISPR control cell line, as shown by the inhibition of arsenite-induced SG assembly when cells are pre-treated with GTPP for 4h (Figure 4A). However, we have noted that the timing of the response to GTPP can vary slightly, impacting the exact SG kinetics, depending on the purity of the drug (synthesised through organic routes by our collaborator Dr Altieri), with the SG peak either at 2 h or at 4 h post-GTPP treatment. Live imaging of SGs in control and GADD34 KO cells could potentially alleviate this caveat; however, in the time frame of the rebuttal, further engineering of GADD34 KO and parental lines into G3BP1/2 knock-outs / GFP-G3BP1 knock-ins was not achievable.

      In Figure 5, of the 80% of SG still present in GTPP treated Sil SGs- was size or frequency impacted here too as in Pq treatment?

      These data are now provided, see Figure 5C and the result section, lines 325-329. These show that GTPP treatment resulted in a reduction in the average size of silvestrol-induced SGs, from 0.98 μm² to 0.9 μm², and an increase in the average number of SGs, from 18 to 22, when compared to non-treated cells. Additionally, we also quantified features of Ars-induced SGs in GTPP-pretreated cells, with data provided in Figure 3C and in the result section, lines 245-250. The analysis showed that, as with paraquat, GTPP pre-treatment also impacts the size and frequency of arsenite-induced SGs.

      This is just for clarification but If GTPP is a hsp90 inhibitor, is it specific to mitochondrial Hsp90 proteins?

      Indeed GTPP is specific to mitochondrial Hsp90.

      In the last results section the authors suggest that G3BP1/2 KO cells unable to assemble SGs present with improved mitochondrial function during stress. Firstly, is the UPRmt activated in these KO cells? Could the increased activity just be a consequence of the cells not being able to sense the stress and adapt? Are these cells able to recover from the GTPP stress to the same extent as the wt? Do they die at later timepoints? If you inhibited the disassembly of SGs using DYRK3 inhibitors would you decrease mitochondrial activity?

      The figure below confirms the upregulation of UPRmt gene mRNA levels after GTPP treatment in U2OS G3BP1/2 dKO cells (rebuttal Figure 1). We did not include this in the main manuscript given that it is already figure-heavy and this did not add depth to our results. Our extensive additional analysis shows that cells unable to assemble SGs present with multiple restored mitochondrial functions following UPRmt induction, including increased ATP production (Fig 6D) and respiration (Fig 6E, 6F), and reduced mitochondrial ROS levels (Fig 6C) and fragmentation (Fig 6A, 6B). These all support a model in which SGs assembled following UPRmt induction contribute to impaired mitochondrial function and in which their inhibition/disassembly is necessary to restore mitochondrial homeostasis.

      Rebuttal Figure 1: RT-qPCR analysis of the UPRmt and ISR markers DNAJA3, HSPD1, CHOP and ATF4 mRNA levels in U2OS cells treated with GTPP for up to 6 h. Results shown are representative of n=3, normalised to RPL9 mRNA and shown relative to DMSO.

      Reviewer #2 (Significance (Required)): Significance: This is an interesting and clearly important observation providing mechanistic insight into the role SGs may play in the cells control of mitochondrial function during stress. The functional role of SGs in disease and stress is still widely unknown and this manuscript therefore sheds light on how the cell may use SGs to modulate and adapt to mitochondrial stress. This is an exciting area of research that will be applicable to a large audience as SGs are implicated in a wide range of diseases. While the data is significant there are currently a number of important experiments required to strengthen the current observational analysis. Below are some minor and major comments linked to the manuscript.

      We thank the reviewer for highlighting the importance of our work in an 'exciting area of research'.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)): As it stands, this study will be suited for a specialized cell biology journal. In order to be published in a journal of a broader readership, the authors would need to address two major points:

      1. Mitochondrial dysfunction affects cellular function in many ways. Reduced levels of ATP, oxidative stress by increased ROS levels and mitochondrial precursor proteins that challenge proteostasis in the cytosol are just three major consequences of mitochondrial defects. Arguably, for the generation of stress granules, it will be important which of these consequences of mitochondrial dysfunction are prevalent. Since mitochondrial dysfunction is an ill-defined umbrella term, this study would be stronger if the authors could link stress granule formation to the specific molecular defects that arise from specific inhibition of mitochondrial functions.

      We agree with this reviewer that mitochondrial dysfunction can take many shapes, and to address their comment we have now performed an extensive number of additional experiments probing various aspects of mitochondrial function. In addition to the data previously included, we can now show that inhibition of SG formation during UPRmt induction results in increased cell viability (Figure 8A-B), restoring mitochondrial functions such as respiration, ATP production (Figure 6C-F) or translation (Figure 7A), and reducing mitochondrial ROS (Figure 6C) or fragmentation (Figure 6A-B). These all support a model in which SGs assembled following UPRmt induction contribute to impaired mitochondrial function and in which their inhibition/disassembly is necessary to restore mitochondrial homeostasis.

      2. Also stress granules are an umbrella term. Different treatments will presumably change the spectrum of transcripts that are sequestered in these granules. As mitochondrial defects remodel the transcription and translation of mitochondrial precursor proteins, the study would benefit from a comprehensive analysis of the spectrum of transcripts that are contained in granules induced by GTPP and sodium arsenite, respectively.

      Previous studies, including our own, have demonstrated that different stresses (or infections) can indeed result in the assembly of compositionally distinct SGs (or SG-like foci) that sequester specific subsets of mRNAs or proteins. These studies are based on affinity purification or proximity ligation approaches followed by multi-omics analysis of SG components by RNA-seq and mass spectrometry. While we agree with this reviewer that determining the composition of UPRmt-induced SGs could help understand their function, we believe these studies are outside the scope of the current manuscript and would instead form the basis of a subsequent study and manuscript.

      Reviewer #3 (Significance (Required)): The study is interesting but descriptive. It confirms previous observations. The advance in mechanistic insights is limited. Nevertheless, the study is technically sound and of interest for a specialized readership. As it stands, the study might be published in a specialized journal. In order to be of general interest for a large and general readership, the authors will have to provide much more mechanistic and molecular insight, which will require at least another six months of work.

      We have now produced an extensive additional body of work to answer specific comments made by all three reviewers, bolstering our hypothesis, and delving deeper into the impact of SG assembly on mitochondrial functions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We are grateful for the valuable, constructive comments of the reviewers, which helped to substantially improve the quality of our manuscript. We particularly agree that the original structure of the manuscript was confusing and in parts misleading, since we followed the history of the project, which first identified the RBM39-mediated impact on IRF3 expression, whereas the -omics studies, identifying additional factors, were done at a far later point. Many discrepancies further arose from the low sensitivity of our initial proteomics analysis, which we have now repeated, thereby obtaining far more sensitive detection of the key factors we also found in the transcriptomics data.

      We have re-structured the entire manuscript by moving the -omics data from the end of the paper towards the middle, and we now provide downstream analysis of similar depth for all relevant key factors identified (RIG-I/MDA5, IFN receptors, STAT1/2), to reduce the focus on IRF3, as suggested. We further changed the title and abstract to reflect this major conceptual change. Thanks to this helpful comment, we think that our manuscript is now conceptually much clearer.

      We further added new data to support the central claims of our manuscript, including a repetition of the proteomics study. Proteomics and transcriptomics now consistently demonstrate the impact of RBM39 knockdown as well as indisulam treatment on several key factors of innate immunity, including IRF3, STAT1/2, RIG-I and MDA5 (now in Fig. 5), with IFNAR2 and IL10RB additionally found in transcriptomics. We provide additional functional evidence that IRF3 is the key factor affected in the TLR3 pathway (IRF3 overexpression, Fig. 6B, C), whereas diminished abundance of RIG-I/MDA5 is equally important in the respective pathway, thereby also affecting the NF-κB response (Fig. 6F-I). We further show the functional significance of IFN-receptor/STAT downregulation on type I and III IFN responses (Fig. 7E-G).

      The reviewers also pointed to some datasets showing the expected trends, but in some cases lacking statistical significance, due to variability in knockdown efficiency. We repeated all mentioned datasets with new batches of siRNA and with sufficient biological replicates (n=3). We thereby obtained consistent, statistically significant data in all cases. Importantly, all experiments implementing the RBM39.esc control now show consistent rescue (Fig. 2A-E).

      To generate a homogenous experimental design for virus infections, we further added new data showing a comparable impact of siRNA knockdown (Fig. 3F) and indisulam treatment (new Fig. 3G) on Sendai virus infection in A549 cells and took this as a rationale to consistently use indisulam for all other infections.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): This manuscript by Li and colleagues examines the role of RBM39 in innate immune signaling. Splicing factor RBM39 was identified through a genome wide screen with a death reporter under control of the IFIT1 promoter that got stimulated with pIC in a TLR3-dependent manner. Besides IFIT1, further experiments showed that RBM39 is also involved in optimal expression of other innate immunity genes like IFNB, CXCL10, RIG-I or MDA5. While NFkB-dependent genes seem not to depend on RBM39, for IRF3 it was shown that protein levels decrease under conditions of RBM39 depletion, because IRF3 mRNAs are (slightly) reduced and spliced differently. The sulfonamide Indisulam could largely recapitulate the phenotype of RBM39 depletion. Further analyses using proteomics and transcriptomics showed that RBM39 is required for mRNA splicing and expression of a large set of other proteins. Altogether, this well designed and written study highlights the fundamental role played by RBM39 in maintaining the pathways of immunity and metabolism. The key conclusions are convincing but some additional experiments would strengthen them further.

      We are grateful for the very positive general comments of this reviewer.

      Major comments: - For the statistics, authors seem not to have done multiple tests but rather tested individual datasets within larger graphs against each other. Please explain where this is the case and use corrections if multiple testing was done

      We apologize for not having been clearer here; we indeed corrected for multiple testing. In the proteomics, statistical significance was evaluated by "two-sample tests" (Student's t-test with a permutation-based FDR of 0.05 and 250 randomizations). For the analysis of RNAseq data, p values were calculated with the Wald test and corrected for multiple testing according to Benjamini-Hochberg. We have now included this information in the materials and methods section and in the respective figure legends.
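      For readers unfamiliar with the Benjamini-Hochberg procedure mentioned above, the following is a minimal sketch of how the adjusted p values are obtained. It is an illustration only, not the authors' analysis pipeline; the function name `benjamini_hochberg` and the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only (not data from the study).
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

      Each raw p value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.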

      • Fig. 4 shows that RBM39 depletion reduces IFIT expression in virus infected cells and slightly increases virus replication. RBM39 has a major effect on IRF3 levels, but also on other players in innate immunity. What happens if IRF3 is ectopically expressed as in figure 5? With this experiment one could measure how high the contribution of IRF3 mis-splicing is to innate immunity.

      We thank this reviewer for the valuable suggestion. We restructured the entire manuscript to address several reviewer comments regarding the focus on IRF3 and the lack of data on other factors in the pathway. We now clearly demonstrate that ectopic IRF3 expression entirely rescues the TLR3 response to poly(I:C) in PH5CH cells (Fig. 6B-C), which also explains the lack of impact on the NF-κB pathway (Fig. 2G-H). In contrast, overexpression of IRF3 does not rescue the RIG-I/MDA5 response in A549 cells (new data, Fig. 6F-I). Here, the NF-κB pathway is also affected by knockdown of RBM39, suggesting that reduced RIG-I/MDA5 abundance upon RBM39 knockdown substantially contributes to the diminished innate immune response.

      • Fig. 4 A uses siRNAs but B, C and D only indisulam treatment. It would be better if siRNAs would also be used for the other viruses.

      We agree that a homogenous setup for virus infection would be favorable. However, the use of different cell lines was necessary due to the limited permissiveness of the cell types used towards virus infection, and it appeared challenging to achieve similar knockdown efficiencies. To generate a homogenous experimental design, we now added new data showing a comparable impact of siRNA knockdown (Fig. 3F) and indisulam treatment (new Fig. 3G) on Sendai virus infection in A549 cells and took this as a rationale to consistently use indisulam for all other infections.

      • RBM39 depletion strongly reduces IRF3 levels in the WB, but not so much in RT-PCR and not at all in proteomics. Is the antibody used for WB perhaps recognizing a domain that is underrepresented in isoforms after disturbed splicing? Please clarify.

      Our previous proteomics data suffered from a very low sensitivity, therefore we missed clear detection of many factors, including IRF3. We repeated the whole proteomics analysis with siRNA and indisulam treatment (new Fig. 5A, B) and now found significantly reduced IRF3 protein levels in both conditions (new Fig. S5C), in agreement with the WB data. The lower impact on IRF3 mRNA abundance is due to the additional contribution of alternative splicing (Fig. 6A, Fig. S6A-D), which both in combination affect protein abundance.

      • Volcano plots in figure 7 show a lot of hits obtained after both RBM39 siRNA and indisulam (green dots), and some that are additionally identified in transcriptomes and in proteomes (red dots). Nonetheless only innate immunity and stress response genes are marked, although they do not belong to these highly conserved classes. Please elaborate more on the most RBM39-dependent genes, e.g. by presenting them in a heat map.

      To our knowledge, our study is the first with a comprehensive comparison of the impact of RBM39 knockdown and indisulam treatment on the host cell proteome and transcriptome. However, several studies have already performed -omics analyses on individual conditions/readouts (e.g. (Coomar et al, 2023; Dou et al, 2023; Mai et al, 2016; Nijhuis et al, 2022)). These studies already identified and described in detail key changes in transcriptome and proteome, e.g. affecting genes involved in cell cycle control and metabolism, which we find as well. However, the novelty of our paper is the impact on the innate immune response; we therefore decided to put an even stronger focus on these genes and to omit other factors, like stress response pathway components, etc. This strategy is supported by the higher sensitivity of our new proteome analysis, which now generated a far better overlap with the transcriptomics, favoring a display that highlights in the volcano plots only those factors that were further analyzed in detail (Fig. 5). Still, interested readers will find the comprehensive list of data in the supplementary Excel datasheets as well as in our primary data in online depositories.

      Minor comments: - Some abbreviations are not explained, like PGK, siNT, siVTN

      We apologize and have added the missing explanation of abbreviations.

      • Welsch should read Welch

      Corrected.

      • Fig. 2H: were cells also stimulated and if yes, how?

      These were unstimulated conditions, to show the impact of RBM39 on basal expression of the IFNlambda receptor chains. However, we deleted this dataset due to the re-organisation of the manuscript. The analysis of the type I and type III receptor and STAT1/2 expression is now comprehensively shown in Fig. 7/S6E, F, solely based on the transcriptomic data for consistency reasons, along with the functional impact on the IFN response.

      • Fig. 6E: I cannot see a difference between the IRF3-203 and 228 isoforms. And what are the white boxes?

      • Also 6E: Location of the primers is barely visible

      Due to the re-organization of the manuscript these data are now shown in Fig. S6D. Both isoforms are indeed very similar and only differ by a very small (16nt) additional exon in isoform 228. The white boxes are exons not translated in the respective isoforms. We have included this important information in the legend to Fig. S6 and enlarged the arrows indicating the positions of the primers.

      • Some materials are not properly referenced, like the death reporter, the lentiviral system, or the Rift Valley fever luciferase virus

      We are sorry for the missing information, which has now been added to the materials and methods section.

      • Supplement has no page numbers

      We have added page numbers to the supplementary information.

      Reviewer #1 (Significance (Required)):

      The study advances our knowledge about the regulation of innate immunity. Strengths are the discovery of a novel layer of innate immunity regulation by splicing and the in-depth analysis of the importance of RBM39 for cellular gene expression. A potential weakness might be the focus on innate immunity as other biological functions seem even more dependent on RBM39. However, this reviewer sees the necessity that covering all aspects of RBM39 function would be beyond the scope of a single study. The relevant literature is appropriately cited (except for some materials, see minor comments). Results will be of interest not only to people doing basic research on innate immunity, but also to those interested in gene regulation in general or to cancer researchers using indisulam.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors performed a CRISPR-based screen for genes required for TLR3-mediated signaling and gene expression in Hepatoma cells. Interferon-stimulated expression of an apoptosis inducer was used as a read-out system. A number of candidate genes were identified and one of these, RBM39, investigated in detail. The protein has previously been linked to both transcriptional control and RNA processing. Validation studies confirm that reduction of cellular RBM39 results in less TLR3-mediated IFN-beta synthesis and lower levels of ISG mRNA synthesis. Initial studies suggest a role of RBM39 in regulating of IRF3 levels, the transcription factor activated by TLR3 signaling to induce IFN-beta synthesis. However, the effect is variable and poorly supported by transcriptomic and proteomic data. Moreover, only one out of four cell-based viral infection models reports a substantial effect of the RBM39 knockdown.

      We apologize for the lack of consistency among several datasets, which was mainly due to the low sensitivity of the proteomic analysis. This has been repeated and now fully confirms all other data. In part due to the comments of this reviewer, we further broadened the scope of the manuscript away from IRF3, including a change of the title.

      Major comments:

      1. The data do not support the claim that RBM39 is a broadly acting player in innate immune responses. In addition, they suggest that IRF3 may not be the only relevant RBM39 target. The most informative knockdown control in this regard would be IRF3 siRNA.

      We have re-structured the entire manuscript and added new data to support the central claims of our manuscript, including a repetition of the proteomics study. Proteomics and transcriptomics now consistently demonstrate the impact of RBM39 knockdown as well as indisulam treatment on several key factors of innate immunity, including IRF3, STAT1/2, RIG-I and MDA5 (now in Fig. 5), with IFNAR2 and IL10RB additionally found in transcriptomics. We further provide functional evidence that IRF3 is the key factor affected in the TLR3 pathway (IRF3 overexpression, Fig. 6B, C), whereas diminished abundance of RIG-I/MDA5 is equally important in the respective pathway, thereby also affecting the NF-κB response (Fig. 6F-I). We further show the functional significance of IFN-receptor/STAT downregulation on type I and III IFN responses (Fig. 7E-G). We hope this reviewer now agrees with our claim that RBM39 is a broadly acting player in innate immune responses.

      2. The structure of the manuscript is rather confusing because IRF3 is presented as the main RBM39 target in figures 3-6, but the -omics data in figures 7 and 8 do not support this view. The authors argue different sensitivities of the experimental approaches, but I think few people would agree that western blots are more sensitive than MS. In my opinion a narrative with less focus on IRF3 and a broader integration of candidates of the -omics approaches would be preferable.

      We are grateful for this valuable comment and fully agree that the original structure of the manuscript was confusing and in parts misleading. This was mainly because we followed the history of the project, which first identified the RBM39-mediated impact on IRF3 expression, whereas the -omics studies, identifying additional factors, were done at a far later point. Many discrepancies further arose from the low sensitivity of our proteomics analysis, which we now repeated, thereby obtaining far more sensitive detection of the key factors we also found in the transcriptomics data. We now moved the -omics data from the end of the paper towards the middle and provide a downstream analysis of similar depth for all relevant key factors identified (RIG-I/MDA5, IFN receptors, STAT1/2), to reduce the focus on IRF3, as suggested. We further changed the title and abstract to reflect this major conceptual change. Thanks to this helpful comment, we think that our manuscript is now conceptually much clearer.

      Investigating the role of RBM39 by RNA-seq in pIC-treated cells would further strengthen the manuscript. It will yield a broader view of the protein's role in induced innate immunity.

      We did not add pIC treatment to the RNA-seq analysis since, based on our own experience and numerous papers, this would change the expression of literally thousands of genes. Based on the key factors of the pIC response modulated by RBM39 (RLRs and IRF3), this would very likely simply result in reduced induction of the whole ISG panel (as exemplified for IFIT1, ISG15, MxA and CXCL10 in Fig. 2B-E).

      3. The results in figures 6A-C are confusing for two reasons. First, the siRNA-mediated knockdown should result in reduced RBM39 protein as well (as shown in Fig. 3A) and, therefore, in an increase in RBM39 levels. Second, why was this effect not noted in the experiments shown in figs. 1-5? To avoid this confusion it might be good to mention which IRF3 splice isoforms are detected by the primers and antibodies used in these figures.

      Unfortunately, the reviewer seems to have conceptually misinterpreted Fig. 6A-C of the original paper, which did not show protein, but transcriptome data. We now added the corresponding data of the proteomic analysis in the new Fig. S5, for all detectable, relevant candidates, showing consistency with all previous data. The confusing point in previous Fig. 6B, which the reviewer appears to refer to, is the upregulation of RBM39 transcript levels upon indisulam treatment, which was not apparent in previous experiments, since we always used WB to show diminished RBM39 protein levels upon indisulam treatment. This increase in RBM39 mRNA is due to an autoregulation of RBM39 mRNA by protein abundance, which has been reported in the literature (Campagne et al, 2023). Since this is rather confusing and not relevant for our study, we removed previous Fig. 6B and show this aspect only in the volcano plot in Fig. 5D, mentioning and citing the paper on autoregulation.

      Minor comments.

      1. Fig S1: the figure panels and legend are inconsistent. IFIT1 is labeled as ISG56 in panel S1A.

      We apologize for this inconsistency and now use IFIT1 throughout the paper.

      2. Data with the siRNA escape mutant of RBM39 are inconsistent. For example, why is its effect significantly different only in 1 out of 4 ISG in figures S2A-D?

      We apologize for the inconsistency, which is due to variability of silencing efficiency. We repeated the entire set of experiments (n=3) with a new batch of siRNA and obtained comparable, significant differences for all ISGs analyzed (new Fig. 2B-E).

      3. Line 164: the statement that TRIF and RBM39 siRNAs produce effects of similar magnitude is incorrect for the IFIT1 gene in figure S2A.

      This experiment was repeated (see previous point), now obtaining significant, more homogenous data. We have modified the text accordingly.

      4. Fig. 2H: In absence of additional evidence for functional implications, the data showing reduced IL10RB expression should be omitted.

      We omitted the data, as suggested by the reviewer, however, we provide a more in depth analysis of the type I and III IFN response in Fig. 7, based on the transcriptomic data and a functional analysis.

      5. Fig. 3: More datapoints would be needed in panel A to sustain the lack of significant difference between the untreated and escape mutant samples. Are the viability data in panels B and C normalized to untreated cells to control for Indisulam toxicity? In figure S3A the effect of the mutant is rather small. To allow for comparison, the Indisulam titration curves should be adapted to the concentrations used in Fig. 3.

      Fig. 3 (now Fig. 4) was replaced by another representative experiment, now also containing the quantification of the shown western blots. However, the statistical analysis shown in the previous version was and is based on three independent biological replicates, as indicated in the figure legend. Viability data were normalized to controls and this information is now added to the figure legend as well. The mutant analyzed in Fig. S3A (now S4A) confers only partial resistance, which explains the limited but clear rescue. We did not include higher indisulam concentrations here due to the increased cytotoxicity of concentrations above 5 µM in PH5CH, in the absence of pronounced additional effects on RBM39 abundance (Fig. 4B).

      6. RNA-seq measures steady-state RNA, not transcription.

      This is of course correct; we changed all sentences where our wording might have indicated that we are measuring transcription by RNAseq. However, we still need to differentiate between the role of RBM39 in transcriptional regulation and splicing, where changes in RNA abundance found in RNAseq rather point to transcriptional regulation.

      Reviewer #2 (Significance (Required)):

      The identification of RBM39 as a candidate player in innate immune responses is of interest to a large scientific community with interest in signalling by pattern recognition receptors. Its role should be strengthened with additional infection models. It is puzzling that three out of four viruses don't benefit from the reduced IFN-beta synthesis in the RBM39 knockdown. Moreover, the data are not convincing (or too diverse) to nail down IRF3 as a major, or the most relevant, RBM39 target.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      CRISPR Screen for factors that are required for dsRNA-dependent ISG production. Found a large number of hits but most did not validate in subsequent assays. The authors follow up the one candidate that did pass secondary screening criteria, RBM39, although re-expression of RBM39 only rescues the phenotype of the siRNAs against RBM39 (siRBM39) in one of the two cell lines tested. Additionally, siRBM39 impacts only a subset of polyIC-induced ISGs and does not regulate NFkB-driven gene expression. They go on to attempt to investigate the impact of siRBM39 on other key innate immune genes and proteins, although many key controls and appropriate methods are missing.

      We thank this reviewer for pointing at inconsistencies and missing controls in our manuscript. We have critically re-evaluated the respective datasets.

      Major comments: 1) The authors propose some rationale for the limited success of the screen, however, while RBM39 may have a role in dsRNA-induced innate immunity, in general the screen seems to have limited value.

      The aim of our CRISPR/Cas9 death reporter screen was the identification of so far unknown contributors to the innate immune response. This was achieved by identifying a critical role of RBM39, followed by an in-depth validation focusing on RBM39. We further found known components of the TLR3 pathway in our candidate list (e.g. TRIF and UNC93B1), pointing to the overall quality of the experimental setup. At no point in the manuscript do we claim that our screen aimed for or delivered a comprehensive overview of innate immunity pathways. Honestly, no comparable screen (e.g. on cytopathic viruses) has delivered such data.

      2) Given that the siRBM39 clearly has off-target effects (since expression of a resistant RBM39 cDNA only gives limited rescue in many cases - Fig S2), each of the experiments in which siRBM39 is used (i.e. Fig 2) should have the RBM39.esc control - especially those that drive subsequent experiments such as the expression of IFNbeta and IFNLR1 (Fig 2a, h)

      The inconsistency in some datasets, which all showed the same trends but in some cases lacked statistical significance, was due to variability in knockdown efficiency. We repeated all mentioned datasets with new batches of siRNA with sufficient biological replicates (n=3), and all of them now reveal consistent, statistically significant data. Importantly, all experiments implementing the RBM39.esc control now show consistent rescue.

      3) Since RBM39 reduction has an apparent impact even in IFNLR1-deficient cells (although need the rescue control to know if this is real) the authors conclude that RBM39 regulates the initial wave of dsRNA signaling-events, but this should be tested with the use of Ruxolitinib to block JAK-STAT signaling.

      Due to the general major re-organization of the manuscript, aiming for a less confusing data presentation and consistency towards depth of candidate evaluation, we have removed the data on the IFNLR-deficient cell line. The claim that RBM39 affects the initial wave of ISG responses is based on reduced IFNb expression, which is exclusively induced by the initial wave of ISG response, and on the general impact on ISG expression, which we measure at 6h after induction, too early for autocrine IFN stimulation (Burkart et al, 2023). However, we further demonstrate that downregulation of type I and type III IFN receptors in conjunction with STAT1/2 affects the type I and the type III IFN response as well (Fig. 7E-G, in part new data). Thus, RBM39 affects both the initial wave and the auto-/paracrine IFN response, and we therefore undertook no further efforts to separate these effects.

      4) IRF3 expression in the Indisulam-treated cells more closely tracks cell viability than RBM39 expression. For example in Fig 3C 10 microM gives 50% IRF3 expression and 50% viability but still 95% RBM39 expression - arguing that the impact of siRBM39 on IRF3 might be very indirect (and error bars on rescue are large so unclear if the rescue really worked in Fig 3A).

      Based on this reviewer comment we re-evaluated the quantification in previous Fig. 3C (now Fig. 4C), which combines data from three independent experiments. We deeply apologize, but the initial quantification proved to be wrong, due to erroneous background subtraction, as the background was relatively high in one of the PHH replicates (Replicate 1, see Reviewer Fig. 1 in uploaded file). The re-evaluated quantification revealed 55% for the RBM39 abundance at 10 µM indisulam, which better reflects the data shown and is now in line with the impact on cytotoxicity and IRF3 abundance.

      5) It is unclear in Fig 4 why some cell/virus combinations are tested with siRBM39 and others are tested with Indisulam. Also the conclusion that RBM39 "substantially contributes to the cell intrinsic innate immune response to viral infections" is greatly overstated given that the differences are between ~3 fold and non-significant.

      We agree that a homogenous setup for virus infection would be favorable. However, the use of different cell lines was necessary due to the limited permissiveness of the cell types used towards virus infection, and it appeared challenging to achieve similar knockdown efficiencies. To generate a homogenous experimental design, we now added new data showing a comparable impact of siRNA knockdown (Fig. 3F) and indisulam treatment (new Fig. 3G) on Sendai virus infection in A549 cells and took this as a rationale to consistently use indisulam for all other infections. Overall, the aim of the virus infection experiments was to use a variety of natural triggers of innate immunity beyond synthetic poly(I:C). Here we indeed found significant reductions of ISG induction for all viruses tested, similar to poly(I:C); this is the basis for the statement that RBM39 contributes to the cell intrinsic innate immune response to viral infections. Our experimental design did not intend to see pronounced effects on viral replication; this was only measured to ensure that reduced ISG induction was not due to inhibition of viral replication. We have now explained this strategy more clearly and toned down the corresponding statements, to exclude potential overinterpretation of the data.

      6) Neither DTU/DRIMseq or qPCR are valid methods to measure splice isoform differences. The authors need to use rMATS or MAJIQ and validate by gel-based RT-PCR.

      Output generated by modern alignment algorithms like Salmon is suitable for studies on an isoform level (Love et al, 2018) and has been used in a variety of studies (e.g. Jabs et al, 2020; Xiong et al, 2023). MAJIQ and rMATS are only superior tools if the detection of so far unknown isoforms is of interest (Love et al., 2018), which is beyond the scope of this project. We have validated the data for IRF3 in RT-qPCR, showing close to identical results to the DTU analysis (compare Fig. 6A and S6D). We disagree that a gel-based RT-PCR analysis would be superior here, due to the lack of quantification.
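      As an illustration of the quantity a differential transcript usage (DTU) analysis compares between conditions, the sketch below converts transcript-level abundances for one gene into isoform proportions. It is a simplified sketch rather than the DRIMSeq model itself; the function name `isoform_proportions` and the abundance values are hypothetical placeholders, not data from the study.

```python
def isoform_proportions(abundance_by_isoform):
    """Convert per-isoform abundances (e.g. TPM) of one gene into usage proportions."""
    total = sum(abundance_by_isoform.values())
    if total == 0:
        # Gene not expressed: no isoform usage can be defined.
        return {name: 0.0 for name in abundance_by_isoform}
    return {name: value / total for name, value in abundance_by_isoform.items()}


# Placeholder abundances for two IRF3 isoforms in two conditions (illustrative only).
control = {"IRF3-203": 30.0, "IRF3-228": 10.0}
knockdown = {"IRF3-203": 10.0, "IRF3-228": 10.0}

print(isoform_proportions(control))
print(isoform_proportions(knockdown))
```

      A shift in these proportions between conditions, independent of the total abundance of the gene, is what a DTU test evaluates statistically across replicates.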

      7) The conclusions from the proteomic and transcriptomic analyses should be treated with extreme caution given the caveats of methodology and controls discussed above.

      We are aware of the caveats of these technologies. The previous proteomic analysis indeed suffered from low sensitivity, failing to detect essential candidates like IRF3. The repetition of the experiment (new Fig. 5A, B, new Fig. S5) now revealed data very consistent with the transcriptomic data. Overall, the strength of our approach is the direct comparison of siRNA based RBM39 knockdown and RBM39 depletion by indisulam throughout transcriptomics and proteomics analyses. The wide overlap argues for the validity of our data and suggests that we thereby circumvented many caveats.

      Reviewer #3 (Significance (Required)):

      Innate immune signaling is a complex and essential pathway for maintaining health. While much is known about key components of this pathway, additional regulators are likely to exist. This manuscript describes an attempt to identify new regulators of dsRNA-mediated gene expression.

      References

      Burkart SS, Schweinoch D, Frankish J, Sparn C, Wust S, Urban C, Merlo M, Magalhaes VG, Piras A, Pichlmair A et al (2023) High-resolution kinetic characterization of the RIG-I-signaling pathway and the antiviral response. Life Sci Alliance 6

      Campagne S, Jutzi D, Malard F, Matoga M, Romane K, Feldmuller M, Colombo M, Ruepp MD, Allain FH (2023) Molecular basis of RNA-binding and autoregulation by the cancer-associated splicing factor RBM39. Nat Commun 14: 5366

      Coomar S, Mota P, Penson A, Schwaller J, Abdel-Wahab O, Gillingham D (2023) Overlaid Transcriptional and Proteome Analyses Identify Mitotic Kinesins as Important Targets of Arylsulfonamide-Mediated RBM39 Degradation. Mol Cancer Res 21: 768-778

      Dou Z, Zhang X, Su W, Zhang T, Ye F, Zhao D, Chen X, Li Q, Zhang H, Di C (2023) Indisulam exerts anticancer effects via modulation of transcription, translation and alternative splicing on human cervical cancer cells. Am J Cancer Res 13: 2922-2937

      Jabs S, Biton A, Becavin C, Nahori MA, Ghozlane A, Pagliuso A, Spano G, Guerineau V, Touboul D, Giai Gianetto Q et al (2020) Impact of the gut microbiota on the m(6)A epitranscriptome of mouse cecum and liver. Nat Commun 11: 1344

      Love MI, Soneson C, Patro R (2018) Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification. F1000Res 7: 952

      Mai S, Qu X, Li P, Ma Q, Cao C, Liu X (2016) Global regulation of alternative RNA splicing by the SR-rich protein RBM39. Biochim Biophys Acta 1859: 1014-1024

      Nijhuis A, Sikka A, Yogev O, Herendi L, Balcells C, Ma Y, Poon E, Eckold C, Valbuena GN, Xu Y et al (2022) Indisulam targets RNA splicing and metabolism to serve as a therapeutic strategy for high-risk neuroblastoma. Nat Commun 13: 1380

      Xiong L, Liu J, Han SY, Koppitch K, Guo JJ, Rommelfanger M, Miao Z, Gao F, Hallgrimsdottir IB, Pachter L et al (2023) Direct androgen receptor control of sexually dimorphic gene expression in the mammalian kidney. Dev Cell 58: 2338-2358 e2335

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This paper by Beath et al. identifies a potential regulatory role for proteins involved in cytoplasmic streaming and maintaining the grouping of paternal organelles: holding sperm contents in the fertilized embryos away from the oocyte meiotic spindle so that they don't get ejected into the polar body during meiotic chromosome segregation. The authors show that by time-lapse video, paternal mitochondria (used as a readout for sperm and its genome) are excluded from yolk granules and maternal mitochondria, even when moving long distances by cytoplasmic streaming. To understand how this exclusion is accomplished, they first show that it is independent of both internal packing and the engulfment of the paternal chromosomes by maternal endoplasmic reticulum creating an impermeable barrier. They then test whether the control of cytoplasmic streaming affects this exclusion by knocking down two microtubule motors, Katanin and kinesis I. They find that the ER ring, which is used as a proxy for paternal chromosomes, undergoes extensive displacement with these treatments during anaphase I and interacts with the meiotic spindle, supporting their hypothesis that the exclusion of paternal chromosomes is regulated by cytoplasmic streaming. Next, they test whether a regulator of maternal ER organization, ATX-2, disrupts sperm organization so that they can combine the double depletion of ATX-2 and KLP-7, presumably because klp-7 RNAi (unlike mei-1 RNAi) does not affect polar body extrusion and they can report on what happens to paternal chromosomes. They find that the knockdown of both ATX-2 and KLP-7 produces a higher incidence of what appears to be the capture of paternal chromosomes by the meiotic spindle (5/24 vs 1/25). However, this capture event appears to halt the cell cycle, preventing the authors from directly observing whether this would result in the paternal chromosomes being ejected into the polar body.

      Strengths: 

      This is a useful, descriptive paper that highlights a potential challenge for embryos during fertilization: when fertilization results in the resumption of meiotic divisions, how are the paternal and maternal genomes kept apart so that the maternal genome can undergo chromosome segregation and polar body extrusion without endangering the paternal genome? In general, the experiments are well-executed and analyzed. In particular, the authors' use of multiple ways to knock down ATX-2 shows rigor. 

      Weaknesses: 

      The paper makes a case that this regulation may be important but the authors should do some additional work to make this case more convincing and accessible for those outside the field. In particular, some of the figures could include greater detail to support their conclusions, they could explain the rationale for some experiments better and they could perform some additional control experiments with their double depletion experiments to better support their interpretations. Also, the authors' inability to assess the functional biological consequences of the capture of the sperm genome by the oocyte spindle should be discussed, particularly in light of the cell cycle arrest that they observe. 

      These general comments are addressed in the more specific critiques below.

      Reviewer #2 (Public Review): 

      Summary 

      In this manuscript, Beath et al. use primarily C. elegans zygotes to test the overarching hypothesis that cytoplasmic mechanisms exist to prevent interaction between paternal chromosomes and the meiotic spindle, which are present in a shared zygotic cytoplasm after fertilization. Previous work, much of it by this group, had characterized cytoplasmic streaming in the zygote and the behavior of paternal components shortly after fertilization, primarily the clustering of paternal mitochondria and membranous organelles around the paternal chromosomes. This work set out to identify the molecular mechanisms responsible for that clustering and test the specific hypothesis that the "paternal cloud" helps prevent the association of paternal chromosomes with the meiotic spindle.

      Strengths 

      This work is a collection of technical achievements. The data are primarily 3- and 4-channel time-lapse images of zygotes shortly after fertilization, which were performed inside intact animals. There are many instances in which the experiments show extreme technical skill, such as tracking the paternal chromosomes over large displacements throughout the volume of the embryo. The authors employ a wide variety of fluorescent reporters to provide a remarkably clear picture of what is going on in the zygote. These reagents and the novel characterization of these stages that they provide will be widely beneficial to the community. 

      The data provide direct visualization of what had previously been a mostly hypothetical structure, the "paternal cloud," using simultaneous labeling of paternal DNA and mitochondria in combination with a variety of maternal proteins including maternal mitochondria, yolk granules, tubulin, and plasma membrane. Together, these images provided convincing evidence of the existence of this specified cytoplasmic domain. They go on to show that the knockdown of the ataxin-2 homolog ATX-2, a protein previously shown to affect ER dynamics, disrupted the paternal cloud, identifying a role for ER organization in this structure.

      The authors then used the system to test the functional consequences of perturbing the cytoplasmic organization. Consistent with the paternal cloud being a stable structure, it stayed intact during large movements the authors generated using previously published knockdowns (of mei-1/katanin and kinesin-13/klp-7) that increased cytoplasmic streaming. They used this data to document instances in which the paternal chromosomes were likely to have been attached to the spindle. They concluded with direct evidence of spindle fibers connecting to the paternal chromatin upon knockdown of ATX-2 in combination with increased cytoplasmic streaming, providing strong, direct support for their overarching hypothesis.

      Weaknesses 

      While the data is convincing, the narrative of the paper could be streamlined to highlight the novelty of the experiments and better articulate the aims. For example, the cloud of paternal mitochondria and membranous organelles was previously shown, but Figures 1-2 largely reiterate that observation. The innovation seems to be that the combination of ER, yolk, and maternal mitochondrial markers makes the existence of a specified domain more concrete. There are also some instances where more description is needed to make the conclusions from the images clear. 

      These general comments are addressed in the more specific critiques below.

      The manuscript intersperses what read like basic characterizations of fluorescent markers that, as written, can distract from the main story. The authors characterized the dynamics of ER organization throughout the substages of meiosis and the permeability of the envelope of ER that surrounds the paternal chromatin, but it could be more clearly established how the ability to visualize these structures allowed them to address their aims.

      We have added the following after the initial description of ER morphology changes: (ER morphology was used to determine cell-cycle stages during live imaging reported below in Fig. 6.)

      More background on what was previously known about ER organization in M-phase and the role of ataxin proteins specifically may help provide more continuity. 

      We have added references to transitions to ER sheets during mitotic M-phase in HeLa cells and Xenopus extracts.

      Reviewer #3 (Public Review): 

      Summary: 

      This study by Beath et al. investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. Time-lapse imaging revealed that sperm DNA is surrounded by paternal mitochondria and maternal ER that is permeable to proteins. By increasing cytoplasmic streaming using kinesin-13 or katanin RNAi, the authors demonstrated that limiting cytoplasmic streaming in the embryo is an important step that prevents the capture of sperm DNA by the oocyte meiotic spindle. Further experiments showed that the Ataxin-2 protein is required to hold paternal mitochondria together and close to the sperm DNA. Finally, double depletion of kinesin-13 and Ataxin-2 suggested an increased risk of meiotic spindle capture of sperm DNA. 

      Overall, this is an interesting finding that could provide a new understanding of how meiotic spindle capture of sperm DNA and its accidental expulsion into the polar body is prevented. However, some conceptual gaps need to be addressed and further experiments and improved data analyses would strengthen the paper. 

      - It would be helpful if the authors could discuss in good detail how they think maternal ER surrounds the sperm DNA

      We have added 2 references to papers about nuclear envelope re-assembly from Shirin Bahmanyar’s lab and suggest the ER envelope is a halted intermediate in nuclear envelope reassembly.

      and why is it not disrupted following Ataxin disruption. 

      We have been attempting to disrupt ER structures in the meiotic embryo for the last 5 years by depleting profilin, BiP, atlastin, ATX-2 and by optogenetically packing ER into a ball in the middle of the oocyte.  None of these treatments prevent envelopment of the sperm DNA by maternal ER.  None of these treatments remove ER from the spindle envelope and none remove ER from the plasma membrane.  These treatments mostly result in “large aggregates” of ER that we have not examined by EM.  Wild speculation: any disruption of the ER strong enough to prevent ER envelopment around chromatin would be sterile because the M to S transition in the mitotic zone of the germline would be blocked.  Rapid depletion of ATX-2 to the extent shown by rigorous data in this manuscript does not prevent ER envelopment around chromatin.  We chose not to speculate about the reasons for this because we do not know why.

      - Since important phenotypes revealed in RNAi experiments (e.g. kinesin-13 and ataxin-2 double depletion) are not very robust, the authors should consider toning down their conclusions and revising some of their section headings. I appreciate that they are upfront about some limitations, but they do nonetheless make strong concluding sentences. 

      We have changed the discussion of the klp-7 atx-2 double depletion to: “The capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos suggests that the integrity of the exclusion zone around the sperm DNA might insulate the sperm DNA from spindle microtubules. However, a much larger number of klp-7(RNAi) singly depleted and atx-2(degron) singly depleted time-lapse sequences are needed to rigorously support this idea.”

      - The discussion section could be improved further to present the authors' findings in the larger context of current knowledge in the field. 

      We have expanded the discussion as suggested.

      - The authors previously demonstrated that F-actin prevents meiotic spindle capture of sperm DNA in this system. However, the current manuscript does not discuss how the katanin, kinesin-13 and Ataxin-2 mechanisms could work together with previously established functions of F-actin in this process. 

      We have added pfn-1(RNAi) to the discussion section.

      - How can the authors exclude off-target effects in their RNAi depletion experiments? Can kinesin-13, katanin, and Ataxin phenotypes be rescued for instance? 

      For ataxin-2 phenotypes, two completely independent controls for off-target effects are shown: GFP(RNAi) on a strain with an endogenous ATX-2::GFP tag vs GFP(RNAi) on a strain with no tag on the ATX-2, and ATX-2::AID with or without auxin. For kinesin-13 and katanin, we did not do a rigorous control for off-target effects of RNAi. However, the effects of these depletions on cytoplasmic microtubules have been previously reported by others.

      - How are the authors able to determine if the paternal genome was actually captured by the spindle? Does lack of movement definitively suggest capture without using a spindle marker? 

      mKate::tubulin labels the spindle in each capture event. This can be seen in Video S3 for mei-1(RNAi) and Figure 9 for atx-2 klp-7 double depletions.

      (1) Major issues: 

      The images provided do not convincingly show that mitochondria are entirely excluded from the regions with yolk granules. Please provide insets of magnified images of the paternal mitochondria in Figure 1E to more clearly show the exclusion even when paternal mitochondria are streaming. Providing grayscale images, individual z-sections and/or some quantification of this data might also be more convincing to this reviewer.

      We have modified Fig. 1 by adding single wavelength magnified insets to more clearly show that paternal mitochondria are in a “black hole” in the maternal yolk granules during  cytoplasmic streaming.

      Figure 2 -This figure can be retitled to highlight that the paternal organelle cloud is impermeable to mitochondria and conserved. 

      The legend has been re-titled as suggested.

      Figure 3B, An image of the DNA within the ring of maternal ER especially since the maternal ER ring is used as a proxy for the paternal chromosomes in later figures would strengthen the authors' claims.

      We have added a panel showing DAPI-stained DNA in the center of the ER ring and paternal mitochondria cloud. 

      Why is the faster time scale imaging significant? I think this could be more clearly set up in the paper. Perhaps rapid imaging of maternal mito-labeled kca-1(RNAi) embryos would better show the difference in time scale, with the expectation that the paternal cloud forms and persists while the ER invades. 

      We are not sure what the reviewer means.  5 sec time intervals were used throughout the paper.  We are also not sure how kca-1(RNAi) would help.  Movement of the entire oocyte into and out of the spermatheca is what limits the ability to keep a fusing sperm in focus.  kca-1(RNAi) would prevent cytoplasmic streaming but not ovulation movements.

      Figure 4 - The question about the permeability of the ER envelope seems to come out of nowhere as written. It isn't clear how it contributes to the larger story about preventing sperm incorporation in the spindle.

      This section of the results is introduced with: “If the maternal ER envelope around sperm DNA was sealed and impermeable during meiosis, this could both prevent the sperm DNA from inducing ectopic spindle assembly and prevent the sperm DNA from interacting with meiotic spindle microtubules.” 

      The data in Figure 4 would probably not be expected to be in this paper based on the paper title. Maybe the title needs something about ER dynamics? "eg. ATX-2 but not an ER envelope" isolates the paternal chromatin? 

      In Figure 5, it seems that RNAi of klp-7 and Mei-1 had slightly different effects on short-axis displacement of the ER envelope (klp-7 affecting it more dramatically than mei-1) and slightly different effects on interaction with the meiotic spindle (capture vs streaming past the spindle). The authors mention in their discussion that the difference in the interaction with the meiotic spindle might reflect the effects that loss of Mei-1 may have on the spindle but could it also be a consequence of the differences in cytoplasmic streaming observed?

      With our current data, the only statistically significant difference between cytoplasmic streaming of the sperm contents in mei-1(RNAi) vs klp-7(RNAi) is that excessive streaming persists longer into metaphase II in klp-7(RNAi).  We have added a sentence describing this difference to the results.  If differences in streaming were the cause of different capture frequencies, then klp-7(RNAi) would cause more capture events than mei-1(RNAi) but the opposite was observed.  We have avoided too much discussion here because the frequency of capture events is too low to demonstrate statistically significant differences between mei-1(RNAi), klp-7(RNAi), and atx-2(degron) + klp-7(RNAi) without a very large increase in the number of time-lapse sequences.  

      Also, the authors should find a way to represent this interaction with the meiotic spindle in a quantitative or table form to allow the reader to observe some of the patterns they report more easily.

      We have added a table to Fig. 9 that summarizes capture data.

      Finally, can the authors report when they observe the closest association with the meiotic spindle: Does it correlate with the period of greatest displacement (AI) or are they unlinked? 

      The low frequency of capture events makes it difficult to test this rigorously.

      Figure 6- 'Endogenously tagged ATX-2 was observed throughout oocytes and meiotic embryos without partial co-localization with ER.' How can the authors exclude co-localization with ER? 

      We have changed the wording to: “Endogenously tagged ATX-2 was observed throughout oocytes and meiotic embryos (Fig. 6A; Fig. S2). ATX-2 did not uniquely co-localize with ER (Fig. S2).”

      The rationale for why the authors think that the integrity of sperm organelles is important to keep the genomes apart is not clear to this reviewer and needs to be explained better. Moving the discussion of the displacement experiments in Figure S3 from the end of the results section to the ATX-2 knockdown section would help accomplish this. 

      We have added the sentence: “The frequency of sperm capture by the meiotic spindle (Fig. 9D) was significantly higher than wild-type controls in klp-7(RNAi) atx-2(AID) double depleted embryos (p=0.011 Fisher’s exact test).   Although the number of single mutant embryos analyzed was too low to demonstrate a significant difference between single and double mutant embryos,  these results qualitatively support the hypothesis that limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria are both important for preventing capture events between the meiotic spindle and sperm DNA.”

      It looks like, in the double knockdown of ATX-2 and KLP-7, the spread of paternal mitochondria is less affected than when only ATX-2 is depleted. What effect does this result have on the observation that the incidence of sperm capture appears to increase in the double depletion? What does displacement of the ER ring look like in the double depletion? Is it additive, consistent with their interpretation that both limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria is required to keep the genomes separate? 

      We cannot show a significant difference between single and double knockdowns without increasing n substantially. We did not analyze ER ring displacement in the double mutant.

      Is the increased incidence of capture in the double-depleted embryos significant? 

      We have added the sentence: “The frequency of sperm capture by the meiotic spindle (Fig. 9D) was significantly higher than wild-type controls in klp-7(RNAi) atx-2(AID) double depleted embryos (p=0.011 Fisher’s exact test).   Although the number of single mutant embryos analyzed was too low to demonstrate a significant difference between single and double mutant embryos,  these results qualitatively support the hypothesis that limiting cytoplasmic streaming and maintaining the integrity of the ball of paternal mitochondria are both important for preventing capture events between the meiotic spindle and sperm DNA.”
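      For readers unfamiliar with the test named in this reply, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of captured versus not-captured embryos. The counts in the usage example are hypothetical placeholders chosen only for illustration; they are not the values behind the reported p=0.011, and the function name is an assumption rather than anything used by the authors.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are genotypes (e.g. control, double depletion); columns are
    outcomes (e.g. captured, not captured). Returns the p-value.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance guards against float rounding).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, for illustration only (NOT the study's data):
# control: 0 captured, 20 not captured; double depletion: 5 captured, 13 not.
p_value = fisher_exact_two_sided(0, 20, 5, 13)
print(f"two-sided p = {p_value:.4f}")
```

      Because the test enumerates every table consistent with the fixed margins, it remains valid at the small sample sizes and low event frequencies described in this response, where a chi-squared approximation would be unreliable.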

      What do the authors make of the cell cycle arrest observed when paternal chromosomes are captured? Is there an argument to be made that this arrest supports the idea that preventing this capture is actively regulated and therefore functionally important? 

      We chose not to discuss the mechanism of this arrest because considerably more work would be required to prove that it is not caused by a combination of imaging conditions and genotype.  The low frequency of these capture + arrest events would make it very difficult to show that the arrest does not occur after depleting a checkpoint protein.

      (2) Minor concerns: 

      Top of page 4: "streaming because depletion tubulin stops cytoplasmic streaming (7)" should be "streaming because depletion of tubulin stops cytoplasmic streaming (7)" 

      The “of” has been inserted.

      Page 6: "This result indicated that the volume of paternal mitochondria excludes maternal mitochondria and yolk granules but not maternal ER." The authors have only shown this for maternal mitochondria, not yolk granules. 

      We have deleted the mention of yolk granules here.

      Page 7: "These results suggest that all maternal membranes are initially excluded from the sperm at fusion." Should be "These results show that maternal ER are initially excluded from the sperm at fusion. Since maternal mitochondria and yolk granules are excluded later, this suggests that all maternal membranes are initially excluded from the sperm at fusion." 

      We have changed this sentence as suggested.

      It's not clear why the authors show other types of movement that might be quantified when cytoplasmic streaming is affected in Figure 5A and only quantify long-axis and short-axis displacement. 

      We have deleted the other types of movement from the schematic.  Although these parameters were quantified, we did not include this data in the results so it would be confusing for the reader to have them in the schematic.

      Bottom of page 7: Mention that the GFP::BAF-1 was maternally provided. 

      We have added “Maternally provided..”

      Missing an Arrow on Figure 1A 9:20. 

      We removed the text citation to an arrow in Fig. 1A because we moved most of the description of the ER ring to Fig. 3 to address other reviewer suggestions.

      Supplemental videos should be labeled appropriately to indicate what structures are labeled. It is currently difficult to understand what is being shown. 

      (3) Issues with the Discussion section: 

      "The simplest explanation is that cytoplasm does not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm." - Citation page 12. 

      We have changed the sentence to: “The simplest hypothesis is that maternal and paternal cytoplasm might not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm.” 

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubule" - Pages 12-13 reference the figures. 

      This sentence has been rewritten in response to other comments but the new sentence now references revised Fig. 9.

      "ATX-2 is required to maintain the integrity of the ball of paternal mitochondria around the sperm DNA, but the mechanism is unknown." - Page 13 reference figure. 

      A reference to Figs 7 and 8 has been inserted.

      " In control embryos, the sperm contents rarely came near the meiotic spindle in agreement with a previous study that found that male and female pronuclei rarely form next to each other (6). Streaming of the sperm contents was most commonly restricted to a jostling motion with little net displacement, circular streaming in the short axis of the embryo, or long axis streaming in which the sperm turned away from the spindle before the halfway point of the embryo. Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)." - Page 13, the corresponding figures need to be referenced for these sentences. 

      We have inserted figure references.

      "In capture events observed after double depletion of ATX-2 and KLP-7, a bundle of microtubules was discernible extending from the spindle into the ER envelope surrounding the sperm DNA. Such bundles were not observed in mei-1(RNAi) capture events, likely because of the previously reported low density of microtubules in mei-1(RNAi) spindles (36, 37)." - Pages 13-14 references figures here. 

      We have inserted figure references.

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubules." - This should be toned down since this phenotype is not robust. 

      We have changed this to: “The capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos suggests that the integrity of the exclusion zone around the sperm DNA might insulate the sperm DNA from spindle microtubules. However, a much larger number of klp-7(RNAi) singly depleted and atx-2(degron) singly depleted time-lapse sequences are needed to rigorously support this idea.”

      ATX-2 depletion alters ER morphology but does not impact the maternal ER envelope - could the authors provide a potential explanation for this? 

      In the discussion, we cite papers showing that ATX-2 depletion affects many different cellular processes so the effect we see on paternal mitochondria might have nothing to do with the ER ring.   We have been attempting to disrupt ER structures in the meiotic embryo for the last 5 years by depleting profilin, BiP, atlastin, ATX-2 and by optogenetically packing ER into a ball in the middle of the oocyte.  None of these treatments prevent envelopment of the sperm DNA by maternal ER.  None of these treatments remove ER from the spindle envelope and none remove ER from the plasma membrane.  These treatments mostly result in “large aggregates” of ER that we have not examined by EM.  Wild speculation: any disruption of the ER strong enough to prevent ER envelopment around chromatin would be sterile because the M to S transition in the mitotic zone of the germline would be blocked.  Rapid depletion of ATX-2 to the extent shown by rigorous data in this manuscript does not prevent ER envelopment around chromatin.  We chose not to speculate about the reasons for this because we do not know why.

      It would be good to have representative images of what the altered spindle looks like in MEI-1-depleted oocytes. 

      The structure of MEI-1-depleted spindles has been described in the cited references.

      "Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)" - It is intriguing that this does not happen in the double depletion experiments of kinesin-13 and ATX-2. The authors should perhaps discuss this. 

      This does happen in KLP-7 ATX-2 double depleted embryos as shown in Fig. 9.

      (4) Missing citations: 

      "This analysis was restricted to embryos from anaphase I through anaphase II because our streaming data and that of Kimura 2020 indicate that the sperm contents have not moved significantly before anaphase I." - This needs an appropriate citation. Page 10. 

      We have inserted citations here.

      " The simplest explanation is that cytoplasm does not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm." - Citation page 12. Not referencing figures in the discussion. 

      We have changed the sentence to: “The simplest hypothesis is that maternal and paternal cytoplasm might not mix during the 45 min from GVBD to pronucleus formation due to the high viscosity of cytoplasm.” 

      "The higher frequency of capture of the sperm DNA by the meiotic spindle in ATX-2 KLP-7 double depleted embryos compared with either single depletion suggests that the integrity of the exclusion zone around the sperm DNA may insulate the sperm DNA from spindle microtubule" - Pages 12-13 reference the figures. 

      A reference to the revised Fig. 9 has been inserted in the revised version of this sentence.

      "ATX-2 is required to maintain the integrity of the ball of paternal mitochondria around the sperm DNA, but the mechanism is unknown." 

      References to Figs. 7 and 8 have been inserted.

      Page 13 reference figure 

      " In control embryos, the sperm contents rarely came near the meiotic spindle in agreement with a previous study that found that male and female pronuclei rarely form next to each other (6). Streaming of the sperm contents was most commonly restricted to a jostling motion with little net displacement, circular streaming in the short axis of the embryo, or long axis streaming in which the sperm turned away from the spindle before the halfway point of the embryo. Depletion of MEI-1 or KLP-7 resulted in longer excursions of the sperm contents in the long axis of the embryo toward the spindle but frequent capture of the sperm by the spindle was only observed in mei-1(RNAi)." Page 13, the corresponding figures need to be referenced for these sentences. 

      We have inserted citations here.

      "In capture events observed after double depletion of ATX-2 and KLP-7, a bundle of microtubules was discernible extending from the spindle into the ER envelope surrounding the sperm DNA. Such bundles were not observed in mei-1(RNAi) capture events, likely because of the previously reported low density of microtubules in mei-1(RNAi) spindles (36, 37)." Pages 13-14 references figures here. 

      We have inserted citations here.

      (5) Referencing wrong figures in the text: 

      Figure 5 - In the figure legend there is a 5C but there is no 5C panel in the figure. 

      A C has been inserted in Fig. 5.

      Figure 6A - "Dark holes were observed suggesting exclusion from the lumens of larger membranous organelles (Fig. 6A; Fig. S2)." Page 10. 

      6A has been changed to 6C.

      Figure 6A is showing background autofluorescence in WT oocytes so I am not certain why it is cited here. 

      The Figure citation has been corrected to 6B, C.

      Figure 8 - I could not find the supplemental data file with the individual mitochondria distance measurements. 

      We are including the Excel file with the revised submission.

      The last sentence of the first paragraph should be re-worded to be more concise ". In C. elegans, the nucleus is positioned away from the site of future fertilization so that the meiosis I spindle assembles at the opposite end of the ellipsoid zygote from the site of fertilization (2-4). " 

      Every word of this sentence is important.

      Last sentence second paragraph typo "These microtubules are thought to drive meiotic cytoplasmic streaming because depletion tubulin stops cytoplasmic streaming (7) and depletion of the microtubule-severing protein katanin by RNAi results in an increased mass of cortical microtubules and an increase in cytoplasmic streaming (8)." Pages 3-4. 

      “of” has been inserted.

      (6) Typos in the introduction should be corrected: 

      Ataxin or kinesin-13 are not mentioned in the introduction but these are a big focus of the paper. 

      Gong et al 2024 written instead of number citation (page 5), no citation in References.

      This has been corrected. 

      Supplemental videos should be labeled appropriately to indicate what structures are labeled. It is currently difficult to understand what is being shown.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      Summary:

      The authors used four datasets spanning 30 countries to examine funding success and research quality score for various disciplines. They examined whether funding or research quality score were influenced by the majority gender of the discipline and whether these affected men, women, or both within each discipline. They found that disciplines dominated by women have lower funding success and research quality score than disciplines dominated by men. These findings are surprising because even the men in women-dominated fields experienced lower funding success and research quality score.

      Strengths:

      - The authors utilized a comprehensive dataset covering 30 countries to explore the influence of the majority gender in academic disciplines on funding success and research quality scores.

      - Findings suggest a systemic issue where disciplines with a higher proportion of women have lower evaluations and funding success for all researchers, regardless of gender.

      - The manuscript is notable for its large sample size and the diverse international scope, enhancing the generalizability of the results.

      - The work accounts for various factors including age, number of research outputs, and bibliometric measures, strengthening the validity of the findings.

      - The manuscript raises important questions about unconscious bias in research evaluation and funding decisions, as evidenced by lower scores in women-dominated fields even for researchers that are men.

      - The study provides a nuanced view of gender bias, showing that it is not limited to individuals but extends to entire disciplines, impacting both the funding and the perceived quality or worth of research.

      - This work underscores the need to explore motivations behind gender distribution across fields, hinting at deep-rooted societal and institutional barriers.

      - The authors have opened a discussion on potential solutions to counter bias, like adjusting funding paylines or anonymizing applications, or other practical solutions.

      - While pointing out limitations such as the absence of data from major research-producing countries, the manuscript paves the way for future studies to examine whether its findings are universally applicable.

      Weaknesses:

      - The study does not provide data on the gender of grant reviewers or stakeholders, which could be critical for understanding potential unconscious bias in funding decisions. These data are likely not available; however, this could be discussed. Are grant reviewers in fields dominated by women more likely to be women?

      - There could be more exploration into whether the research quality score is influenced by inherent biases towards disciplines themselves, rather than only being gender bias.

      - The manuscript should discuss how non-binary gender identities were addressed in the research. There is an opportunity to understand the impact on this group.

      - A significant limitation is absence of data from other major research-producing countries like China and the United States, raising questions about the generalizability of the findings. How comparable are the findings observed to these other countries?

      - The motivations and barriers that drive gender distribution in various fields could be expanded on. Are fields striving to reach gender parity through hiring or other mechanisms?

      - The authors could consider if the size of funding awards correlates with research scores, potentially overlooking a significant factor in the evaluation of research quality. Presumably there is less data on smaller 'pilot' funds and startup funds for disciplines where these are more common. Would funding success follow the same trend for these types of funds?

      - The language used in the manuscript at times may perpetuate bias, particularly when discussing "lower quality disciplines," which could influence the reader's perception of certain fields.

      - The manuscript does not clarify how many gender identities were represented in the datasets or how gender identity was determined, potentially conflating gender identity with biological sex.

      Reviewer #3 (Public Review):

      This study seeks to investigate one aspect of disparity in academia: how gender balance in a discipline is valued in terms of evaluated research quality score and funding success. This is important in understanding disparities within academia.

      This study uses publicly available data to investigate covariation between gender balance in an academic discipline and:

      i) Individual research quality scores of New Zealand academics as evaluated by one of 14 broader subject panels.

      ii) Funding success in Australia, Canada, Europe, UK.

      The study would benefit from further discussion of its limitations, and from the clarification of some technical points (as described in the recommendations for the authors).

      Recommendations For The Authors:

      Reviewer #2 (Recommendations For The Authors):

      This is a very nice study as-is. In the following comments, I have mainly put my thoughts as I was reading the manuscript. If there are practical ways to answer my questions, I think they could improve the manuscript but the data required for this may not be available.

      Are there any data on the gender of grant reviewers or stakeholders who make funding decisions?

      The research quality score metrics seem to be more related to unconscious bias. The funding metrics may also, but there are potentially simple fixes (higher paylines for women or remove gender identities from applications).

      We have included some details about PBRF funding panel gender diversity. These panels are usually more gender balanced than the field they represent, but in the extreme cases (Engineering, Education, Mathematics) they are skewed as would be expected. Data on panels for other award decision makers were not available.

      I wonder if the research score metric isn't necessarily reflecting on the gender bias in the discipline but rather on the discipline itself? Terms like "hard science" and "soft science" are frequently used and may perpetuate these biases. This is somewhat supported by the data - on line 402-403 the authors state that women in male-dominated fields like Physics have the same expected score as a man. Could it be that Physics has a higher score than Education even if Physics was woman-dominated and Education was man-dominated? Are there any instances in the data where traditionally male- or female-dominated disciplines are outliers and happen to be the opposite? If so, in those cases, do the findings hold up?

      Overall we would love to answer this question! But our data is not enough. We mention these points in the Discussion (Lines 472-466). We have extended this a little to cover the questions raised here.

      How are those with non-binary gender identities handled in this article? If there is any data on the subject, I would be curious to know how this affects research score and funding success.

      These data were either unavailable or the sample size was too small to be considered anonymously (Mentioned on Lines 74-76).

      A limitation of the present article is a lack of data on major research-producing countries like China and the United States. Is there any data relevant to these or other countries? Is there reason to believe the findings outlined in this manuscript would apply or not apply to those countries also?

      We would be very excited to see if the findings held up in other countries, particularly any that were less European based. Unfortunately we could not find any data to include. Maybe one day!

      What are the motivations or other factors driving men to certain fields and women to certain fields over others? What are the active barriers preventing all fields from 50% gender parity?

      Field choice is a highly studied area and the explanations are myriad; we have included a few references in the discussion section on job choice. I usually recommend my students read the blog post at

      https://www.scientificamerican.com/blog/hot-planet/the-people-who-could-have-done-science-didnt/

      It is very thoughtful but unfortunately not appropriate to reference here.

      The authors find very interesting data on funding rates. Have you considered funding rates and the size of funding awards as a factor in research score? Some disciplines like biomedical science receive larger grants than others like education.

      A very interesting thought for our next piece of work. We would definitely like to explore our hypothesis further.

      There are instances where the authors' writing may perpetuate bias. If possible these should be avoided. One example is on line 458-459 where the authors state "...why these lower quality disciplines are more likely..." This could be re-written to emphasize that some disciplines are "perceived" as lower quality. Certainly those in these disciplines would not characterize their chosen discipline as "low quality".

      Well-spotted! Now corrected as you suggest.

      Similar to the preceding comment, the authors should use care with the term "gender". In the datasets used, how many gender identities were captured? How many gender identity options were given in the surveys or data intake forms? Could individuals in these datasets have been misgendered? Do the data truly represent gender identity or biological sex?

      We know that in the PBRF dataset gender was a binary choice and transgender individuals were able to choose which group they identified with. There was no non-binary option (in defence, the latest dataset is from 2018 and NZ has only recently started updating official forms to be more inclusive) and individuals with gender not stated (a very small number) were excluded. ARC did mention that a small number of individuals were either non-binary or gender not stated; again, these are not included here for reasons of anonymity. This is now mentioned on Lines 74-76. The effects on this group are important and understudied, likely because, as here, the numbers are too small to be included meaningfully.

      Reviewer #3 (Recommendations For The Authors):

      Major revisions:

      Could you add line numbers to the Supplementary Materials for the next submission?

      Yes! Sorry for the omission.

      (1) In the main text L146 and Figure 1, it is not clear why the expected model output line is for a 50 year old male from University of Canterbury only, but the data points are from disciplines in all eight universities in New Zealand. I think it would be more clear and informative to report the trend lines that represent the data points. At the moment it is hard to visualise how the results apply to other age groups or universities.

      As age and institution are linear variables with no interactions, they are only a constant adjustment above or below this line, and the adjustment is small in comparison to the linear trend. Unfortunately, including them graphically does not aid understanding. We agree that including raw data with an adjusted trend line can be confusing, but after a lot of between-author discussion this was the most informative compromise we could find (many people like raw data so we included it).

      (2) Does your logistic regression model consider sample size weighting in pmen? Weighting according to sample sizes needs to be considered in your model. At the moment it is unclear and suggests a proportion between 0 and 1 only is used, with no weighting according to sample size. If using R, you can use glm(cbind(nFem, nMalFem).

      Yes. All data points were weighted by group size exactly as you suggest. We have updated the text on Lines 317 to make this clear.
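
To make the weighting point concrete, the following is a minimal sketch of a logistic regression fitted to grouped counts, so that each data point is weighted by its group size rather than treated as a bare proportion. It is an illustration only, not the authors' code (their analysis was run in Matlab); the function name `fit_grouped_logit` and the variable names `p_men`, `n_funded`, and `n_applied` are hypothetical.

```python
import math

def fit_grouped_logit(x, n_success, n_total, n_iter=100, tol=1e-12):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x on grouped binomial counts.

    Each group contributes n_total[i] trials, so large groups pull the fit
    harder than small ones (the same idea as a binomial GLM on counts).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, n_success, n_total):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = yi - ni * p        # observed minus expected successes
            w = ni * p * (1.0 - p)     # weight grows with group size
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton system
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical example: three disciplines with very different sizes.
p_men = [0.2, 0.5, 0.8]          # proportion of researchers who are men
n_applied = [400, 50, 1200]      # number of applications per discipline
n_funded = [60, 9, 260]          # number of funded applications
intercept, slope = fit_grouped_logit(p_men, n_funded, n_applied)
print(intercept, slope)
```

Fitting on counts rather than on proportions means a discipline with 1200 applications influences the slope far more than one with 50, which is the behaviour the reviewer asked about.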

      (3) For PBRF, I think it is useful to outline the 14 assessment panels and the disciplines they consider. Did you include the assessment panel as an explanatory variable in your model too to investigate whether quality is assessed in the same manner between panels? If not, then suggest reasons for not doing so.

      We have now included more detail in main text on the gender split of the panels. They were not included as an explanatory variable. In theory there was some cross-referencing of panel scores to ensure consistency as part of the PBRF quality assurance guidelines.

      (4) There are several limitations which should be discussed more openly:

      Patterns only represent the countries studied, not necessarily academia worldwide.

      Mentioned on Line 485-487.

      Gender is described as a binary variable.

      Discussed on Line 74-76.

      The measure of research evaluation as a reflection of academic merit.

      This is acknowledged in the data limitations paragraph at the end of the discussion.

      Minor revisions:

      (1) L186. Why do you analyse bibliometric differences between individuals from University of Canterbury only? It would be helpful to outline your reasons.

      Although bibliometric data is publicly available it is difficult to collect for a large number of individuals. You also need some private data to match bibliometrics with PBRF data which is anonymous. We were only able to do this for our own institution with considerable internal support.

      (2) How many data records did you have to exclude in L191 because they could not be linked? This is helpful to know how efficient the process was, should anyone else like to conduct similar studies.

      We matched over 80% of available records (384 individuals). We have mentioned this on Line 194.

      (3) Check grammar in the sentence beginning in L202.

      Thank-you. Corrected.

      (4) Please provide a sample size gender breakdown for "University of Canterbury (UC) bibliometric data", as you do for the preceding section. A table format is helpful.

      Included on Line 194.

      (5) L377 I think this sentence needs revision.

      Thank you, we have reworked that paragraph.

      (6) L389-392 Is it possible evaluation panels can score women worse than men and that because more women are present in female-biassed disciplines, the research score in these are worse? Women scoring worse between fields, may be a result of some scaling to the mean score.

      No.  This is not possible because women in male-dominated fields score higher.

      (7) L393 Could you discuss explanations for why men outperform women in research evaluation scores more when disciplines are female dominated?

      Unfortunately, we don’t have an explanation for this and can’t get one from our data. We hope it will be an interesting topic for future work.

      (8) Could the figures be improved by having the crosses, x and + scaled, for example, in thickness corresponding to sample size? Alternatively, some description of the sample size variation? Sorting the rows by order of pmen in Table E1 would also be helpful for the reader.

      As with the previous figure, we have tried many ways of presenting it (including this one). Unfortunately nothing helped.

      We have provided Table E1 as a spreadsheet to allow readers to do this themselves.

      (9) Please state in your methods section the software used to aid repeatability.

      This is now in Supplementary Materials (Matlab 2022b).

      (10) It is great to report your model findings into real terms for PBRF and ARC. Please can you extend this to CIHR and EIGE. i.e. describing how a gender skew increase of x associates with a y increase in funding success chance.

      We have added similar explanations for both these datasets comparing the advantage of being male with the advantage of working in a male dominated discipline.

      (11) I would apply care to using pronouns "his" and "her" in L322-L324 and avoid if at all possible, instead, replacing them with "men" and "women".

      We have updated the text to avoid these pronouns in most places.

      The article in general would benefit from a disclosure statement early on conceding that gender investigated here is only as a binary variable, discounting its spectrum.

      See Line 74-76.

      Please also report how gender balance is defined in the datasets as in the data summary in supplementary materials, within the main text.

      Our definition of gender balance (proportion of researchers who are men, pmen) is given on Line 103.

      (12) The data summary Table S1 could benefit from explaining the variables in the first column. It is currently unclear how granularity, size of dataset and quotas/pre-allocation? are defined.

      These lines have been removed as the information they contained is included elsewhere in the table with far better explanations!

      (13) There are only 4 data points for investigating covariation between gender balance and funding success in CIHR. This should be discussed as a limitation.

      The small size of the dataset is now mentioned on Line 348.

      (14) L455 "Research varies widely across disciplines" in terms of what?

      This sentence has been extended.

      (15) L456 Maybe I am missing something but I don't understand the relevance of "Physicists' search for the grand unified theory" to research quality.

      Removed.

      (16) Can you provide more discussion into the results of your bibliographic analysis and Figure 2? An explanation into the relationships seen in the figure at least would be helpful.

      Thank you, we have clarified the relationships seen in each of figures 2A (Lines 226-235), 2B (Lines 236-252), and 2C (Lines 260-268).

      (17) It would be helpful to include in the discussion a few more sentences outlining:

      - Potential future research that would help disentangle mechanisms behind the trends you find.

      - How this research could be applied. Should there be some effort to standardise?

      We have added a short paragraph to the discussion about implications/applications, and future research (Lines 481-484).

      (18) The introduction could benefit from discussing and explaining their a priori hypotheses for how research from female-biassed disciplines may be evaluated differently.

      While not discussed in the introduction, possible explanations for why and how research in female dominated fields might be evaluated differently are explored in some detail in the Discussion.  We think once is enough, and towards the end is more effective than at the beginning.

      (19) L16 "Our work builds on others' findings that women's work is valued less, regardless of who performs that work." I find this confusing because in your model, there is a significant interaction effect between gender:pmen. This suggests that for female-biassed disciplines, there is even more of a devaluation for women, which I think your lines in figure 1 suggest.

      Correct but men are still affected, so the sentence is correct.  What is confusing is that the finding is counter to what we might expect.

    1. Overall Rating (⭐⭐⭐⭐☆)

      Impact (⭐⭐⭐⭐⭐): This paper compares the ability of three different species, two primates and one non-primate, to persist in behaviors and works to explain why there are such similarities in some actions but differences in others. It makes some interesting findings that may be relevant to brain cognition in humans and therefore has the potential to have high impact in the behavioral neuroscience field. In this study the authors utilized a well known decision making paradigm to study how decision making compares between primates and mice. This is important because rodent models are increasingly being used as replacements for primates in cognitive studies, particularly developmental studies, drug development, and injury models. This use of rodents can only be of value if their decision making behaviors properly model those of the primates they are replacing. Neural network studies in primates would suggest that rodents would not be a total replacement for cognitive studies and this paper seems to corroborate that, at least when it comes to task persistence, with rodents switching tasks at a more rapid pace than the primates. It is unclear yet how to incorporate this information into cognitive studies using rodents but having this information is an important step in being able to

      Methods (⭐⭐⭐⭐☆): The authors use three different species, mice (Mus musculus (males and females). They were presented with a species appropriate k-armed bandit task where targets were placed for them to choose from to get a reward. Individuals had to choose from a known reward or to explore for a new reward. Switching between choices was analyzed and compared using standard ANOVA.

      Note: Please add IACUC and IRB protocol numbers to the methods sections.

      Results (⭐⭐⭐⭐⭐): The authors examined switching behavior and exploratory behaviors in each species and found that while all three species engaged in switching and exploring, the mice switched targets most often indicating a lack of task persistence. In other words, they seemed to explore their options more than the primates did. This remained even after controlling for trial times and task design. Overall the results were compelling and well documented. The statistical analysis was thorough. The figures were clear but, if space were not an object, I would recommend that the 4 across panels be reworked to be a 4 square with 2 top panels and 2 lower panels, all of which could be a bit larger for better viewing.

      Discussion (⭐⭐⭐⭐☆): The discussion is fairly thorough, although I think adding a discussion of the neural network models of task switching could add value. The neural networks are vastly different between rodents and primates and may also be a reason for the difference in task persistence seen in this study. A discussion of next steps on how to potentially incorporate this information into the analysis of rodent studies on cognitive abilities would be very helpful, seeing as we will only continue to increase the use of rodents in these types of studies, although that may take further experimentation.

      Overall, the study is excellent and should be published.

      Reviewer Information The reviewer is the Chair of Biotechnology at the Franklin Cummings Technical Institute. Her PhD is in neuroscience, but her work is as a protein biochemist working on inflammation, signal transduction, and cell-cell communication. She has worked in both industry and academia for over 20 years.

      Dr. Heather Duffy on ResearchHub: https://www.researchhub.com/user/1790894/

      ResearchHub Peer Reviewer Statement: This peer review has been uploaded from ResearchHub as part of a paid peer review initiative. ResearchHub aims to accelerate the pace of scientific research using novel incentive structures.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Although the manuscript is well organized and written, it could be largely improved and therefore made more plausible and easier to read. See my point-by-point comments listed below:

      (1) The introduction section is a bit overloaded with some unnecessary information. For example, the authors discussed the relationship between neurotransmitters in the prefrontal and striatum and substance use/sustained attention. However, the results are related to neither the neurotransmitters nor the striatum. In addition, there is a contradictory description about neurotransmitters there, Nicotine/THC leads to increased neurotransmitters, and decreased neurotransmitters is related to poor sustained attention. Does that mean that the use of Nicotine/THC could increase sustained attention?

      Thanks for this insightful question. We understand your concern regarding the seemingly contradictory statements about neurotransmitters and sustained attention. Previous studies have shown that acute administration of nicotine can improve sustained attention (Lawrence et al., 2002; Potter and Newhouse, 2008; Valentine and Sofuoglu, 2018; Young et al., 2004). On the other hand, the acute effects of smoking cannabis on sustained attention are mixed and depend on factors such as dosage and individual differences (Crean et al., 2011). For instance, a previous study (Hart et al., 2001) found that performance on a tracking task, which requires sustained attention, improved significantly after smoking cannabis with a high dose of THC, albeit in experienced cannabis users. However, chronic substance use, including nicotine and cannabis, has been associated with impaired sustained attention (Chamberlain et al., 2012; Dougherty et al., 2013).

      To address your concerns and improve clarity and succinctness of the Introduction, we have removed the description of neurotransmitters from the Introduction. This revision should make the introduction more concise and focus on the direct relationships pertinent to our study.

      (2) It is a bit hard to follow the story for the readers because the Results section went straight into detail. For example, the authors directly introduced that they used the ICV from the Go trials to index sustained attention without basic knowledge about the task. Why use the ICV of Go trials instead of other trials (i.e., successful stop trials) as an index of sustained attention? I suggest presenting the subjects and task details about the data before the detailed behavioral results. The results section should include enough information to understand the presenting results for the readers, rather than forcing the reader to find the answer in the later Methods section.

      We appreciate your suggestion to provide more context about the task and ICV before diving into the detailed behavioural results.

      We used the ICV derived from the Go trials instead of Successful stop trials as an index of sustained attention, based on the nature of the stop-signal task and the specific data it generates. Previous studies have indicated that reaction time (RT) variability is a straightforward measure of sustained attention, with increasing variability thought to reflect poorer ability to sustain attention (Esterman and Rothlein, 2019). RT variability is defined as ICV, calculated as the standard deviation of Go RT divided by the mean Go RT from Go trials (O'Halloran et al., 2018). The stop signal task includes both Go trials and Stop trials. During Go trials, participants are required to respond as quickly and accurately as possible to a Go signal, allowing for the recording of RT for calculating ICV. In contrast, Stop trials are designed to measure inhibitory control, where successful response inhibition results in no RT or response recorded in the output. Therefore, Go trials are specifically used to assess sustained attention, while Stop trials primarily assess inhibitory control (Verbruggen et al., 2019).

      We acknowledge the importance of providing this contextual information within the Results section to enhance reader understanding. We have added this information before presenting the behavioural results on Page 6.

      Results

      (1) Behavioural changes over time

      Reaction time (RT) variability is a straightforward measure of sustained attention, with increasing variability thought to reflect poor sustained attention. RT variability is defined as intra-individual coefficient of variation (ICV), calculated as the standard deviation of Go RT divided by the mean Go RT from Go trials in the stop signal task. Lower ICV indicates better sustained attention.
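
As a concrete illustration of this definition, here is a minimal sketch of the ICV calculation for one participant. The function name `intra_individual_cv`, the use of `None` for trials without a recorded response, and the choice of the sample standard deviation are assumptions of this sketch, not details taken from the study.

```python
from statistics import mean, stdev

def intra_individual_cv(go_rts):
    """ICV = standard deviation of Go RTs divided by mean Go RT.

    Trials without a recorded response (None) are dropped first.
    The sample standard deviation (n - 1 denominator) is an assumption here.
    """
    rts = [rt for rt in go_rts if rt is not None]
    if len(rts) < 2:
        raise ValueError("need at least two Go RTs to compute variability")
    return stdev(rts) / mean(rts)

# Hypothetical Go-trial reaction times in milliseconds for two participants.
steady = [480, 500, 510, 495, 505]
variable = [350, 620, 410, 700, 430]
print(intra_individual_cv(steady), intra_individual_cv(variable))
```

The participant with the more variable reaction times gets the larger ICV, which under the definition above indicates poorer sustained attention.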

      (3) The same problem for section 2 in the Results. What are the predictive networks? Are the predictive networks the same as the networks constructed based on the correlation with ICV? My intuitive feeling is that they are the circular analyses here. The positive/negative/combined networks are calculated based on the correlation between the edges and ICV. Then the author used the network to predict the ICV again. The manipulation from the raw networks (I think they are based on PPI) to the predictive network, and the calculation of the predicted ICV are all missing. The direct exposure of the results to the readers without enough detailed knowledge made everything hard to digest.

      We thank the Reviewer for the insightful comment. We agree with the need for more clarity regarding the predictive networks and the CPM analysis before presenting results. CPM, a data-driven neuroscience approach, is applied to predict individual behaviour from brain functional connectivity (Rosenberg et al., 2016; Shen et al., 2017). The CPM analysis used the strength of the predictive network to predict the individual difference in traits and behaviours. CPM includes several steps: feature selection, feature summarization, model building, and assessment of prediction significance (see Fig. S1).

      During feature selection, we assessed whether connections between brain areas (i.e., edges) in a task-related functional connectivity matrix (derived from generalized psychophysiological interaction analysis) were positively or negatively correlated with ICV using a significance threshold of P < 0.01. These positively or negatively correlated connections are regarded as the positive or negative network, respectively. The network strength of the positive network (or negative network) was determined in each individual by summing the connection strength of each positively (or negatively) correlated edge. The combined network was determined by subtracting the strength of the negative network from the positive network. Next, CPM built a linear model between the network strength of the predictive network and ICV. This model was initially developed using the training set. The predictive networks were then applied to the test set, where network strength was calculated again, and the linear model was used to predict ICV using k-fold cross-validation. Following your advice, we have updated the Results section to include these details on Page 7.

      Results

      (2) Cross-sectional brain connectivity

      This study employed CPM, a data-driven neuroscience approach, to identify three predictive networks (positive, negative, and combined) that predict ICV from brain functional connectivity. CPM typically uses the strength of the predictive networks to predict individual differences in traits and behaviors. The predictive networks were obtained based on connectivity analyses of the whole brain. Specifically, we assessed whether connections between brain areas (i.e., edges) in a task-related functional connectivity matrix derived from generalized psychophysiological interaction analysis were positively or negatively correlated with ICV using a significance threshold of P < 0.01. These positively or negatively correlated connections were regarded as the positive or negative network, respectively. The network strength of the positive network (or negative network) was determined for each individual by summing the connection strength of each positively (or negatively) correlated edge. The combined network was determined by subtracting the strength of the negative network from the positive network. We then built a linear model between network strength and ICV in the training set, and applied the predictive networks and the linear model to the test set to calculate predicted ICV using k-fold cross-validation.
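
The steps described above (edge selection, network strength, linear model, cross-validated prediction) can be sketched as follows. This is a simplified illustration and not the pipeline used in the study: it models only the combined network, approximates edge-wise p-values with the Fisher z transform instead of an exact test, and assigns folds by interleaving subjects rather than by random shuffling. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (sketch assumption)."""
    r = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def select_edges(edges, icv, p_thresh=0.01):
    """Feature selection: indices of edges correlated with ICV at p < p_thresh."""
    pos, neg = [], []
    for j in range(len(edges[0])):
        r = pearson_r([row[j] for row in edges], icv)
        if approx_p_value(r, len(icv)) < p_thresh:
            (pos if r > 0 else neg).append(j)
    return pos, neg

def combined_strength(row, pos, neg):
    """Positive-network strength minus negative-network strength."""
    return sum(row[j] for j in pos) - sum(row[j] for j in neg)

def fit_line(x, y):
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def cpm_kfold(edges, icv, k=10, p_thresh=0.01):
    """Predict ICV for every subject from a model built without that subject."""
    n = len(icv)
    predicted = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        tr_edges = [edges[i] for i in train]
        tr_icv = [icv[i] for i in train]
        # Edges are selected on the training set only, inside each fold.
        pos, neg = select_edges(tr_edges, tr_icv, p_thresh)
        strength = [combined_strength(row, pos, neg) for row in tr_edges]
        slope, intercept = fit_line(strength, tr_icv)
        for i in test:
            predicted[i] = intercept + slope * combined_strength(edges[i], pos, neg)
    return predicted
```

Because edges are selected within each training fold and the resulting model is only then applied to held-out subjects, the predicted ICV for a subject never depends on that subject's own behavioural score, which addresses the circularity concern raised by the reviewer. Predictive performance would then be summarised by correlating predicted and observed ICV.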

      (4) The authors showed the positive/negative/combined networks from both Go trials and successful stop trials can predict the ICV. I am wondering how the author could validate the specificity of the prediction of these positive/negative/combined networks. For example, how about the networks from the failed stop trials?

      We appreciate the opportunity to clarify the specificity of the predictive networks identified in our study. Here is a more detailed explanation of our findings and their implications.

      To validate the specificity of the sustained attention network identified from CPM analysis, we calculated correlations between the network strength of positive and negative networks and performances from a neuropsychology battery (CANTAB) at each timepoint separately. CANTAB includes several tasks that measure various cognitive functions, such as sustained attention, inhibitory control, impulsivity, and working memory. We found that all positive and negative networks derived from Go and Successful stop trials significantly correlated with a behavioural assay of sustained attention – the rapid visual information processing (RVP) task – at ages 14 and 19 (all P values < 0.028). Age 23 had no RVP task data in the IMAGEN study. There were sporadic significant correlations between constructs such as delay aversion/impulsivity and negative network strength, for example, but the correlations with the RVP were always significant. This demonstrates that the strength of the sustained attention brain network was specifically and robustly correlated with a typical sustained attention task, rather than other cognitive measures. The results are described in the main text on Page 8 and shown in Supplementary materials (Pages 1 and 3) and Table S12.

      In addition, we conducted a CPM analysis to predict ICV using gPPI under Failed stop trials. Our findings showed that positive, negative, and combined networks derived from Failed stop trials significantly predicted ICV: at age 14 (r = 0.10, P = 0.033; r = 0.19, P < 0.001; and r = 0.17, P < 0.001, respectively), at age 19 (r = 0.21; r = 0.18; and r = 0.21, all P < 0.001, respectively), and at age 23 (r = 0.33, r = 0.35, and r = 0.36, respectively, all P < 0.001). Similar results were obtained using a 5-fold CV and leave-site-out CV.

      Our analysis further showed that task-related functional connectivity derived from Go trials, Successful Stop trials, and Failed Stop trials could predict sustained attention across three timepoints. However, the predictive performances of networks derived from Go trials were higher than those from Successful Stop and Failed Stop trials. This suggests that sustained attention is particularly crucial during Go trials when participants need to respond to the Go signal. In contrast, although Successful Stop and Failed Stop trials also require sustained attention, these tasks primarily involve inhibitory control along with sustained attention.

      Taken together, these findings underscore the specificity of the predictive networks of sustained attention. We have updated these results in the Supplementary Materials (Pages 3-5 and Page 7):

      Method

      CPM analysis using Failed stop trials

      We performed another CPM analysis on Failed stop trials, using the gPPI matrix obtained from the second GLM described in the main text. The CPM analysis was conducted using 10-fold CV, 5-fold CV and leave-site-out CV.

      Results

      CPM predictive performance under Failed stop trials

      Positive, negative, and combined networks derived from Failed stop trials significantly predicted ICV: at age 14 (r = 0.10, P = 0.033; r = 0.19, P < 0.001; and r = 0.17, P < 0.001, respectively), at age 19 (r = 0.21; r = 0.18; and r = 0.21, all P < 0.001, respectively), and at age 23 (r = 0.33, r = 0.35, and r = 0.36, respectively, all P < 0.001). We obtained similar results using a 5-fold CV and leave-site-out CV (Table S6).

      Discussion

      Specificity of the prediction of predictive networks

      We found that task-related functional connectivity derived from Go trials, Successful stop trials, and Failed stop trials successfully predicted sustained attention across three timepoints. However, predictive performances of predictive networks derived from Go trials were higher than those derived from Successful stop trials and Failed stop trials. These results suggest that sustained attention is particularly crucial during Go trials when participants need to respond to the Go signal. In contrast, although Successful stop and Failed stop trials also require sustained attention, these tasks primarily involve inhibitory control along with sustained attention.

      (5) The author used PPI to define the connectivity of the network. I am not sure why the author used two GLMs for the PPI analysis separately. In the second GLM, Go trials were treated as an implicit baseline. What does this exactly mean? And the gPPI analysis across the entire brain using the Shen atlas is not clear. Normally, as I understand, the PPI/gPPI is conducted to test the task-modulated connectivity between one seed region and the voxels of the whole rest brain. Did the author perform the PPI for each ROI from Shen atlas? More details about how to use PPI to construct the network are required.

      Thank you for your insightful questions. Here, we’d like to clarify how we applied generalized PPI across the whole brain using the Shen atlas and why we used two separate GLMs for the gPPI analysis.

      Yes, PPI is conducted to test the task-modulated connectivity between one seed region and other brain areas. This method can be either voxel-based or ROI-based. In our study, we performed an ROI-based gPPI analysis using the Shen atlas with 268 regions. Specifically, we performed the PPI on each seed region of interest (ROI) to estimate the task-related FC between this ROI and the remaining 267 ROIs under a specific task condition. By performing this analysis for each ROI in the Shen atlas, we generated a 268 × 268 gPPI matrix for each task condition. Each matrix was then averaged with its transpose to yield a symmetrical matrix, which was subsequently used for CPM analysis.
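
      The symmetrization step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `symmetrize_gppi` is hypothetical, random numbers stand in for real gPPI estimates, and taking the upper triangle as the edge vector is an assumption about how the matrix is passed to CPM.

```python
import numpy as np


def symmetrize_gppi(gppi: np.ndarray) -> np.ndarray:
    """Average a directed ROI-by-ROI gPPI matrix with its transpose."""
    if gppi.ndim != 2 or gppi.shape[0] != gppi.shape[1]:
        raise ValueError("expected a square ROI-by-ROI matrix")
    return (gppi + gppi.T) / 2.0


# Toy input: random values stand in for one participant's gPPI estimates
# (row = seed ROI, column = target ROI, 268 Shen-atlas regions).
rng = np.random.default_rng(0)
n_roi = 268
directed = rng.standard_normal((n_roi, n_roi))

symmetric = symmetrize_gppi(directed)

# Assumption: each ROI pair is used once, so keep the upper triangle
# without the diagonal as the edge vector.
upper = np.triu_indices(n_roi, k=1)
edges = symmetric[upper]
```

      With 268 regions this leaves 268 × 267 / 2 = 35,778 unique edges per participant and task condition.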

      Regarding the use of two separate GLMs for the gPPI analysis, our study aimed to define the task-related FC under two conditions: Go trials and Successful stop trials. The first GLM including Go trials was built to estimate the gPPI during Go trials. However, due to the high frequency of Go trials in the stop signal task, it is common to regard the Go trials as an implicit baseline, as in previous IMAGEN studies (D'Alberto et al., 2018; Whelan et al., 2012). Therefore, to achieve a more accurate estimation of FC during Successful stop trials, we built a second GLM specifically for these trials. Accordingly, we have updated it in the Method Section in the main text on Page 16.

      Method

      2.5 Generalized psychophysiological interaction (gPPI) analysis

      In this study, we adopted gPPI analysis to generate task-related FC matrices and applied CPM analysis to investigate predictive brain networks from adolescents to young adults. PPI analysis describes task-dependent FC between brain regions, traditionally examining connectivity between a seed region of interest (ROI) and the voxels of the rest of the brain. However, this study conducted a generalized PPI analysis on an ROI-to-ROI basis (Di et al., 2021) to yield a gPPI matrix across the whole brain instead of just a single seed region.

      Given the high frequency of Go trials in the SST, Go trials are commonly treated as an implicit baseline, as in previous IMAGEN studies (D'Alberto et al., 2018; Whelan et al., 2012). Hence, we built a separate GLM for Successful stop trials, which included two task regressors (Failed and Successful stop trials) and 36 nuisance regressors.

      (6) Why did the author use PPI to construct the network, rather than the other similar methods, for example, beta series correlation (BSC)?

      Thanks for your question. PPI is an approach used to calculate the functional connectivity (FC) under a specific task (i.e., task-related FC). Although most brain connectomic research has utilized resting-state FC (e.g., beta series correlation), FC during task performance has demonstrated superiority in predicting individual behaviours and traits, due to its potential to capture more behaviourally relevant information (Dhamala et al., 2022; Greene et al., 2018; Yoo et al., 2018). Specifically, Zhao et al. (2023) suggested that task-related FC outperforms both typical task-based and resting-state FC in predicting individual differences. Therefore, we chose to use task-related FC to predict sustained attention over time. We have updated it in the Introduction on Page 5.

      Introduction

      Although most brain connectomic research has utilized resting-state fMRI data, functional connectivity (FC) during task performance has demonstrated superiority in predicting individual behaviours and traits, due to its potential to capture more behaviourally relevant information (Dhamala et al., 2022; Greene et al., 2018; Yoo et al., 2018). Specifically, Zhao et al. (2023) suggested that task-related FC outperforms both typical task-based and resting-state FC in predicting individual differences. Hence, we applied task-related FC to predict sustained attention over time.

      (7) In the section of 'Correlation analysis between the network strength and substance use', the author just described that 'the correlations between xx and xx are shown in Fig5X', and repeated it three times for three correlation results. What exactly are the results? The author should describe the results in detail. And I am wondering whether there are scatter plots for these correlation analyses?

      We’d like to clarify the results in Fig. 5. Fig. 5 illustrates the significant correlations between cigarette and cannabis use (Cig+CB) and both behaviour and brain activity associated with sustained attention, after FDR correction. Panel A shows the significant correlation between the behavioural level of sustained attention and Cig+CB. Panels B and C show the correlations between brain activity associated with sustained attention and Cig+CB. While Panel B presents the brain activity derived from Go trials, Panel C presents brain activity derived from Successful stop trials. In response to your suggestion, we have described these results in detail on Page 9. We have also included scatter plots for the significant correlations shown in Fig. 5 in the Supplementary Materials (Fig. S10).

      Results

      (6) Correlation between behaviour and brain to cannabis and cigarette use

      Figs. 5A-C summarize the results showing the correlation between ICV/brain activity and Cig+CB per timepoint and across timepoints. Fig. 5A shows correlations between ICV and Cig+CB (Tables S14-15). ICV was correlated with Cig+CB at ages 19 (Rho = 0.13, P < 0.001) and 23 (Rho = 0.17, P < 0.001). ICV at ages 14 (Rho = 0.13, P = 0.007) and 19 (Rho = 0.13, P = 0.0003) were correlated with Cig+CB at age 23. Cig+CB at age 19 was correlated with ICV at age 23 (Rho = 0.13, P = 9.38E-05). Fig. 5B shows correlations between brain activity derived from Go trials and Cig+CB (Tables S18-19). Brain activities of positive and negative networks derived from Go trials were correlated with Cig+CB at age 23 (positive network: Rhop = 0.12, P < 0.001; negative network: Rhon = -0.11, P < 0.001). Brain activity of the negative network derived from Go trials at age 14 was correlated with Cig+CB at age 23 (Rhon = -0.16, P = 0.001). Cig+CB at age 19 was correlated with brain activity of the positive network derived from Go trials at age 23 (Rhop = 0.10, P = 0.002). Fig. 5C shows the correlations between brain activity derived from Successful stop trials and Cig+CB (Tables S18-19). Brain activities of positive and negative networks derived from Successful stop trials were correlated with Cig+CB at ages 19 (positive network: Rhop = 0.10, P = 0.001; negative network: Rhon = -0.08, P = 0.013) and 23 (positive network: Rhop = 0.13, P < 0.001; negative network: Rhon = -0.11, P = 0.001).

      (8) Lastly, the labels of (A), (B) ... in the figure captions are unclear. The authors should find a better way to place the labels in the caption and keep them consistent throughout all figures.

      Thank you for this valuable comment. We have revised the figure captions in the main text to ensure the labels (A), (B), etc., are placed more clearly and consistently across all figures.

      Reviewer #2 (Public Review):

      While the study largely achieves its aims, several points merit further clarification:

      (1) Regarding connectome-based predictive modeling, an assumption is that connections associated with sustained attention remain consistent across age groups. However, this assumption might be challenged by observed differences in the sustained attention network profile (i.e., connections and related connection strength) across age groups (Figures 2 G-I, Fig. 3 G_I). It's unclear how such differences might impact the prediction results.

      Thank you for your insightful comment. We’d like to clarify that we did not assume that connections associated with sustained attention remain completely consistent across age groups. Indeed, we expected that connections would change across age groups, due to the developmental changes in brain function and structure from adolescence to adulthood. Our focus was on the consistency of individual differences in sustained attention networks over time, recognising that the actual connections within those networks may change. However, we did show that there is some consistency in the specific connections associated with sustained attention over time. Notably, this consistency markedly increases when comparing ages 19 and 23, when developmental factors are less relevant. We support our reasoning above with the following analyses:

      (1) Supplementary materials (Pages 2 and 5), relevant sections highlighted here for emphasis.

      Method

      Comparison of predictive networks identified at one timepoint versus another

      Steiger’s Z value was employed to compare predictive performances of networks identified at different timepoints. This analysis involved comparing the r values derived from networks defined at distinct ages to predict ICV at the same age. For example, we compared the r values of brain networks defined at age 14 when predicting ICV at 19 (i.e., positive network: r = 0.25, negative network: r = 0.25, combined network: r = 0.28) with the r values of brain networks defined at age 19 itself (i.e., positive network: r = 0.16, negative network: r = 0.14, combined network: r = 0.16) derived from Go trials using Steiger's Z test (age 14 → age 19 vs. age 19 → age 19). Similarly, comparisons were made between networks defined at age 14 predicting ICV at age 23 and those at age 23 predicting ICV at age 23 (age 14 → age 23 vs. age 23 → age 23), as well as between networks defined at age 19 predicting ICV at age 23 and those at age 23 predicting ICV at age 23 (age 19 → age 23 vs. age 23 → age 23). These comparisons were performed separately for Go trials and Successful stop trials.
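
      For readers unfamiliar with the test, one common form of Steiger's (1980) Z for two dependent correlations that share a variable (here, observed ICV at the target age) can be sketched as below. The excerpt does not state which variant or software was used, so the formula choice, the function name `steiger_z`, the two-sided P value, and all input values are assumptions for illustration only.

```python
import math
from typing import Tuple


def steiger_z(r_jk: float, r_jh: float, r_kh: float, n: int) -> Tuple[float, float]:
    """Steiger (1980) Z test for two dependent, overlapping correlations.

    r_jk: correlation of observed ICV with predictions from network A
    r_jh: correlation of observed ICV with predictions from network B
    r_kh: correlation between the two sets of predictions
    n:    number of participants
    Returns (Z, two-sided P value).
    """
    z_jk = math.atanh(r_jk)  # Fisher r-to-z transform
    z_jh = math.atanh(r_jh)
    r_bar_sq = ((r_jk + r_jh) / 2.0) ** 2
    # Covariance term between the two z-transformed correlations.
    c = (
        r_kh * (1.0 - 2.0 * r_bar_sq)
        - 0.5 * r_bar_sq * (1.0 - 2.0 * r_bar_sq - r_kh ** 2)
    ) / (1.0 - r_bar_sq) ** 2
    z = (z_jk - z_jh) * math.sqrt(n - 3) / math.sqrt(2.0 - 2.0 * c)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return z, p


# Hypothetical inputs, not values from the study.
z_stat, p_value = steiger_z(r_jk=0.30, r_jh=0.20, r_kh=0.60, n=1000)
```

      The test needs the correlation between the two sets of predictions (`r_kh`) as well as the two predictive correlations, which is why it cannot be reproduced from the reported r values alone.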

      Results

      Comparison of predictive performance at different timepoints

      For positive, negative, and combined networks predicting ICV derived from Go trials at age 19, the R values were higher when using predictive networks defined at 19 than those defined at 14 (Z = 3.79, Z = 3.39, Z = 3.99, all P < 0.00071). Similarly, the R values for positive, negative, and combined networks predicting ICV derived from Go trials at age 23 were higher when using predictive networks defined at age 23 compared to those defined at ages 14 (Z = 6.00, Z = 5.96, Z = 6.67, all P < 3.47e-9) or 19 (Z = 2.80, Z = 2.36, Z = 2.57, all P < 0.005).

      At age 19, the R value for the positive network predicting ICV derived from Successful stop trials was higher when using predictive networks defined at 19 compared to those defined at 14 (Z = 1.54, P = 0.022), while the negative and combined networks did not show a significant difference (Z = 0.85, P = 0.398; Z = 2.29, P = 0.123). At age 23, R values for the positive and combined networks predicting ICV derived from Successful stop trials were higher when using predictive networks defined at 23 compared to those defined at 14 (Z = 3.00, Z = 2.48, all P < 3.47e-9) or 19 (Z = 2.52, Z = 1.99, all P < 0.005). However, the R value for the negative network at age 23 did not significantly differ when using predictive networks defined at 14 (Z = 1.80, P = 0.072) or 19 (Z = 1.48, P = 0.138).

      These results indicate that some specific pairwise connections associated with sustained attention at earlier ages, such as 14 and 19, are still relevant as individuals grow older. However, some connections are not optimal for good sustained attention at older ages. That is, the brain reorganizes its connection patterns to maintain optimal functionality for sustained attention as it matures.

      (2) Consistency of Individual Differences:

      We found that individual differences in ICV were significantly correlated between the three timepoints (Fig. 1B). In addition, we calculated the correlations of network strength of predictive networks predicting sustained attention derived from Go trials and Successful stop trials between each pair of timepoints. We found that the correlations of network strength for predictive networks (derived from Go trials and Successful stop trials) were also significant (all P < 0.003). We have updated these results in the main text (Pages 7-8) and Supplementary Materials (Table S7).

      (2) Cross-sectional brain connectivity

      In addition, we found that network strength of positive, negative, and combined networks derived from Go trials was significantly correlated between the three timepoints (Table S7, all P < 0.003).

      In addition, we found that network strength of positive, negative, and combined networks derived from Successful stop trials was significantly correlated between the three timepoints (Table S7, all P < 0.001).

      (3) Predictive networks across timepoints: Predictive networks defined at age 14 were successfully applied to predict ICV at ages 19 and 23. Similarly, predictive networks defined at age 19 were successfully applied to predict ICV at age 23 (Fig. 4). These results reflect the robustness of the brain network associated with sustained attention over time.

      (4) Dice coefficient analysis: We calculated the Dice coefficient to quantify the similarity of predictive networks across the three timepoints. Connections in the sustained attention networks were significantly similar from ages 14 to 23 (Table S13), despite relatively few overlapping edges over time (as discussed in Supplementary Materials on Page 6).

      (5) Global brain activation: Based on these findings, we indicate that sustained attention relies on global brain activation (i.e., network strength) rather than specific regions or networks (see also (Zhao et al., 2021)).

      In summary, brain network connections undergo change and are not completely consistent across time. However, individual differences in sustained attention and its network are consistent across time, as we found that: 1) the brain reorganizes its connection patterns to maintain optimal functionality for sustained attention as it matures; 2) ICV and network strength of the sustained attention network were significantly correlated between each pair of timepoints; 3) sustained attention networks identified at previous timepoints could predict ICV at the subsequent timepoint; 4) Dice coefficient analysis indicated that the edges in the sustained attention networks were significantly similar from ages 14 to 23; and 5) sustained attention networks function as a global activation, rather than relying on specific regions or networks.
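
      The Dice coefficient referred to in point 4 measures the overlap between two sets of selected edges as 2|A ∩ B| / (|A| + |B|). A minimal sketch is given below, assuming each predictive network is stored as a boolean edge mask; the masks are toy data and the function name is hypothetical. How the significance of the overlap was assessed is not described in this excerpt and is not shown.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b) -> float:
    """Dice overlap between two boolean edge masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("edge masks must have the same shape")
    total = a.sum() + b.sum()
    if total == 0:
        return 0.0  # convention chosen here for two empty networks
    return float(2.0 * np.logical_and(a, b).sum() / total)


# Toy masks over six edges (True = edge selected at that age).
edges_age14 = np.array([1, 1, 0, 0, 1, 0], dtype=bool)
edges_age23 = np.array([1, 0, 0, 1, 1, 0], dtype=bool)

overlap = dice_coefficient(edges_age14, edges_age23)
```

      In this toy case two edges are shared and each mask contains three edges, so the coefficient is 2 × 2 / (3 + 3) ≈ 0.67.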

      (2) Another assumption of the connectome-based predictive modeling is that the relationship between sustained attention network and substance use is linear and remains linear over development. Such linear evidence from either the literature or their data would be of help.

      Thanks for your valuable suggestion. We'd like to clarify that while CPM assumes a linear relationship between brain and behaviour (Shen et al., 2017), it does not assume that the relationship between the sustained attention network and substance use remains linear over development.

      Our approach in applying CPM to predict sustained attention across different timepoints was based on previous neuroimaging studies (Rosenberg et al., 2016; Rosenberg et al., 2020), which indicated linear associations between brain connectivity patterns and sustained attention using CPM analysis. These findings support the notion of a linear relationship between brain connectivity and sustained attention. In this study, we performed CPM analysis to identify predictive networks predicting sustained attention, not substance use and used the network strength of these predictive networks to represent sustained attention activity.

      To examine the relationship between substance use and sustained attention, as well as its associated brain activity, we conducted correlation analyses and utilized a latent change score model instead of CPM analysis. This decision was informed by cross-sectional studies (Broyd et al., 2016; Lisdahl and Price, 2012) that consistently reported linear associations between substance use and impairments in sustained attention. Additionally, longitudinal research by (Harakeh et al., 2012) indicated a linear relationship between poorer sustained attention and the initiation and escalation of substance use over time.

      Given these previous findings, we assumed a linear relationship between sustained attention and substance use. Our analyses included calculating correlations between substance use and sustained attention, as well as its associated brain activity, at each timepoint and across timepoints (Fig. 5). Furthermore, we employed a three-wave bivariate latent change score model, a longitudinal approach, to assess the relationship between substance use and the behaviour and brain activity associated with sustained attention (Figs. 6-7). We have added more information to the Introduction on Page 6 to make this clearer.

      Introduction

      Additionally, previous cross-sectional and longitudinal studies (Broyd et al., 2016; Harakeh et al., 2012; Lisdahl and Price, 2012) have shown that there are linear relationships between substance use and sustained attention over time. We therefore employed correlation analyses and a latent change score model to estimate the relationship between substance use and both behaviours and brain activity associated with sustained attention.

      (3) Heterogeneity in results suggests individual variability that is not fully captured by group-level analyses. For instance, Figure 1A shows decreasing ICV (better-sustained attention) with age on the group level, while there are both increasing and decreasing patterns on the individual level via visual inspection. Figure 7 demonstrates another example in which the group with a high level of sustained attention has a lower risk of substance use at a later age compared to that in the group with a low level of sustained attention. However, there are individuals in the high sustained attention group who have substance use scores as high as those in the low sustained attention group. This is important to take into consideration and could be a potential future direction for research.

      Thanks for this valuable comment. We appreciate your observation regarding the individual variability that is not fully captured by group-level analyses to some degree. Fig. 1A shows the results from a linear mixed model, which explains group-level changes over time while accounting for the random effect within subjects. Similarly, Fig. 7 shows the group-level association between substance use and sustained attention. We agree that future research could indeed consider individual variability. For example, participants could be categorized based on their consistent trajectories of ICV or substance use (i.e., keep decreasing/increasing) over multiple timepoints. We agree that incorporating individual-level analyses in the future could provide valuable insights and are grateful for your suggestion, which will inform our future research directions.

      The above-mentioned points might partly explain the significant but low correlations between the observed and predicted ICV as shown in Figure 4. Addressing these limitations would help enhance the study's conclusions and guide future research efforts.

      We have updated the text in the Discussion on Page 13:

      Discussion

      However, there is still some individual variability not captured in this study, which could be attributed to the diversity in genetic, environmental, and developmental factors influencing sustained attention and substance use. Future research should aim to explore this variability in greater depth to gain a better understanding of the relationship between sustained attention and substance use.

      Reviewer #3 (Public Review):

      Weaknesses: It's questionable whether the prediction approach (i.e., CPM), even when combined with longitudinal data, can establish causality. I recommend removing the term 'consequence' in the abstract and replacing it with 'predict'. Additionally, the paper could benefit from enhanced rigor through additional analyses, such as testing various thresholds and conducting lagged effect analyses with covariate regression.

      Thank you for your comment. We have replaced “consequence” by “predict” in the abstract.

      Abstract

      Previous studies were predominantly cross-sectional or under-powered and could not indicate if impairment in sustained attention was a predictor of substance-use or a marker of the inclination to engage in such behaviour.

      Reviewer #3 (Recommendations For The Authors):

      (1) The connectivity analysis predicts both baseline and longitudinal attention measures. However, given the high correlation in attention abilities across the three time-points, it's unclear whether the connectivity predicts shared variations of attention across three time points. It would be insightful to assess if predictions at the 2nd and 3rd-time points remained significant after controlling for attention abilities at the initial time point.

      Thanks for your comments. We performed the CPM analysis to predict ICV at the 2nd and 3rd timepoints, controlling for ICV at age 14 as a covariate. After controlling for ICV at age 14, positive, negative, and combined networks derived from Successful stop trials defined at age 14 still predicted ICV at ages 19 and 23, and those defined at age 19 predicted ICV at age 23. Positive, negative, and combined networks derived from Go trials defined at age 19 also still predicted ICV at age 23. However, positive, negative, and combined networks derived from Go trials defined at age 14 had lower predictive performances in predicting ICV at ages 19 and 23 after controlling for ICV at age 14. Notably, controlling for ICV at the initial timepoint did not significantly impact the performances of predictive networks derived from Successful stop trials. Accordingly, we have added this analysis and the results in the Supplementary Materials (Pages 3 and 5).

      Method

      Prediction across timepoints controlling for ICV at age 14

      To examine whether connectivity predicts shared variation in sustained attention across timepoints, we applied predictive models developed at ages 14 and 19 to predict ICV at subsequent timepoints, controlling for ICV at age 14. Specifically, we used predictive models (including parameters and selected edges) developed at age 14 to predict ICV at ages 19 and 23 separately. First, we calculated the network strength using the gPPI matrix at ages 19 and 23 based on the selected edges identified from CPM analysis at age 14. We then estimated the predicted ICV at ages 19 and 23 by applying the linear model parameters (slope and intercept) obtained from CPM analysis at age 14 to the network strength. Finally, we evaluated the predictive performance by calculating the partial correlation between the predicted and observed values at ages 19 and 23, controlling for ICV at age 14. Similarly, we applied models developed at age 19 to predict ICV at age 23, also controlling for ICV at age 14. To assess the significance of the predictive performance, we used a permutation test, shuffling the predicted ICV values and calculating the partial correlation to generate a random distribution over 1,000 iterations.
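
      The procedure above (network strength from previously selected edges, a fixed linear model, partial correlation, and a permutation test) can be sketched as follows. This is an illustration on simulated data, not the authors' code: the helper names, the edge mask, the slope and intercept, the one-sided P value, and the +1 correction in the permutation estimate are all assumptions.

```python
import numpy as np


def network_strength(conn: np.ndarray, edge_mask: np.ndarray) -> np.ndarray:
    """Sum connectivity over the selected edges; conn is (n_subjects, n_edges)."""
    return conn[:, edge_mask].sum(axis=1)


def partial_corr(x: np.ndarray, y: np.ndarray, covar: np.ndarray) -> float:
    """Pearson correlation of x and y after regressing covar out of both."""
    design = np.column_stack([np.ones(len(covar)), covar])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def permutation_p(pred, obs, covar, n_perm=1000, seed=0):
    """Shuffle the predicted values to build a null distribution (one-sided)."""
    rng = np.random.default_rng(seed)
    r_obs = partial_corr(pred, obs, covar)
    null = np.array(
        [partial_corr(rng.permutation(pred), obs, covar) for _ in range(n_perm)]
    )
    p = (np.sum(null >= r_obs) + 1) / (n_perm + 1)
    return r_obs, float(p)


# Simulated data: ICV at two ages and age-19 connectivity for 200 participants.
rng = np.random.default_rng(1)
n_sub, n_edges = 200, 50
icv14 = rng.standard_normal(n_sub)
icv19 = 0.5 * icv14 + rng.standard_normal(n_sub)
conn19 = rng.standard_normal((n_sub, n_edges))

# Hypothetical model "learned at age 14": ten selected edges plus slope/intercept.
edge_mask = np.zeros(n_edges, dtype=bool)
edge_mask[:10] = True
conn19[:, edge_mask] += 0.3 * icv19[:, None]  # make those edges informative
slope, intercept = 0.05, 0.0

pred19 = slope * network_strength(conn19, edge_mask) + intercept
r_partial, p_perm = permutation_p(pred19, icv19, covar=icv14)
```

      Because the edges and the linear parameters are fixed from the earlier timepoint, nothing is refitted on the later data; only the partial correlation between predicted and observed values is evaluated.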

      Results

      Predictions across timepoints controlling for ICV at age 14

      Positive and combined networks derived from Go trials defined at age 14 predicted ICV at age 19 (r = 0.10, P = 0.028; r = 0.08, P = 0.047), but the negative network did not (r = 0.06, P = 0.119). The positive network derived from Go trials defined at age 14 predicted ICV at age 23 (r = 0.11, P = 0.013), but the negative and combined networks did not (r = 0.04, P = 0.187; r = 0.08, P = 0.056). Positive, negative, and combined networks derived from Go trials defined at age 19 predicted ICV at age 23 (r = 0.22, r = 0.19, and r = 0.22, respectively, all P < 0.001).

      Positive, negative, and combined networks derived from Successful stop trials defined at age 14 predicted ICV at age 19 (r = 0.08, P = 0.036; r = 0.10, P = 0.012; r = 0.11, P = 0.009) and 23 (r = 0.11, P = 0.005; r = 0.13, P = 0.005; r = 0.13, P = 0.017) respectively. Positive, negative, and combined networks derived from Successful stop trials defined at age 19 predicted ICV at age 23 (r = 0.18, r = 0.18, and r = 0.17, respectively, all P < 0.001).

      (2) In the Results section, a significance threshold of p = 0.01 was used for the CPM analysis. It would be beneficial to test the stability of these findings using alternative thresholds such as p = 0.05 or p = 0.005.

      We appreciate the suggestion to test the stability of our findings using alternative significance thresholds. Indeed, we have already conducted CPM analyses using a range of thresholds, including 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, and 0.0001 (see Table S8 in Supplementary Materials). The results were similar across different thresholds. Following prior studies (Feng et al., 2024; Ren et al., 2021; Yoo et al., 2018), which used P < 0.01 for feature selection, we chose to focus on the threshold of P < 0.01 for our main analysis. Following your suggestion, we have highlighted this in the Method section on Pages 17-18.

      Method

      2.6.1 ICV prediction

      The r value with an associated P value for each edge was obtained, and a threshold P = 0.01 (Feng et al., 2024; Ren et al., 2021; Yoo et al., 2018) was set to select edges.

      2.6.2 Three cross-validation schemes

      In addition, we conducted the CPM analysis using a range of thresholds for feature selection and observed similar results across different thresholds (See Supplementary Materials Table S8).
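
      The edge-selection step and the threshold sweep can be illustrated with the sketch below. It assumes a Pearson correlation between each edge and behaviour with a two-sided P value from the t distribution; the function name `select_edges` and the simulated data are hypothetical, and the sketch covers feature selection only, not the full cross-validated CPM.

```python
import numpy as np
from scipy import stats


def select_edges(conn: np.ndarray, behaviour: np.ndarray, p_thresh: float = 0.01):
    """Return boolean masks of positively and negatively correlated edges."""
    n = conn.shape[0]
    conn_c = conn - conn.mean(axis=0)
    beh_c = behaviour - behaviour.mean()
    # Pearson r between every edge (column) and the behavioural score.
    r = (conn_c * beh_c[:, None]).sum(axis=0) / (
        np.sqrt((conn_c ** 2).sum(axis=0)) * np.sqrt((beh_c ** 2).sum())
    )
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(np.abs(t), df=n - 2)  # two-sided P value
    return (p < p_thresh) & (r > 0), (p < p_thresh) & (r < 0)


# Simulated data: 300 participants, 40 edges, 5 positive and 5 negative edges.
rng = np.random.default_rng(2)
n_sub, n_edges = 300, 40
icv = rng.standard_normal(n_sub)
conn = rng.standard_normal((n_sub, n_edges))
conn[:, :5] += 0.5 * icv[:, None]
conn[:, 5:10] -= 0.5 * icv[:, None]

pos_mask, neg_mask = select_edges(conn, icv, p_thresh=0.01)

# Sweep the thresholds listed in the response and count the selected edges.
thresholds = [0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001]
counts = []
for thr in thresholds:
    pos, neg = select_edges(conn, icv, p_thresh=thr)
    counts.append(int(pos.sum() + neg.sum()))
```

      Stricter thresholds can only keep the same number of edges or fewer, so comparing results across the sweep shows how sensitive the selected network is to the cutoff.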

      (3) Could you clarify if you used one sub-sample to extract connectivity related to sustained attention and then used another sub-sample to predict substance use with attention-related connectivity?

      Thank you very much for the question. We used the same sample to extract the brain network strength and estimated the correlation with substance use using both the Spearman correlation and latent change score model across three timepoints. We controlled for covariates including sex, age, and scan site at the same time. Accordingly, we have clarified this in the Method section on Page 20. We note that the CPM analyses were conducted using cross-validation, plus a leave-site-out analysis.

      Method

      2.7.3 Correlation between network strength and substance use

      It is worth noting that all the correlations between substance use and sustained attention were conducted using the same sample across three timepoints.

      (4) Could you clarify whether you have regressed covariates in the lagged effects analysis of part 7?

      Thanks for this question. Yes, we confirm that we controlled for covariates including age, sex, and scan site in the latent change score model. We have now described them more clearly in the Method section (Page 18).

      Method

      2.7.3 Correlation between network strength and substance use

      Additionally, cross-lagged dynamic coupling (i.e., bidirectionality) was employed to explore individual differences in the relationships between substance use and linear changes in ICV/brain activity, as well as the relationship between ICV/brain activity and linear change in substance use. The model accounted for covariates such as age, sex, and scan site.

      References:

      Broyd, S.J., van Hell, H.H., Beale, C., Yucel, M., Solowij, N., 2016. Acute and Chronic Effects of Cannabinoids on Human Cognition-A Systematic Review. Biol Psychiatry 79, 557-567.

      Chamberlain, S.R., Odlaug, B.L., Schreiber, L.R.N., Grant, J.E., 2012. Association between Tobacco Smoking and Cognitive Functioning in Young Adults. The American Journal on Addictions 21, S14-S19.

      Crean, R.D., Crane, N.A., Mason, B.J., 2011. An evidence based review of acute and long-term effects of cannabis use on executive cognitive functions. J Addict Med 5, 1-8.

      D'Alberto, N., Chaarani, B., Orr, C.A., Spechler, P.A., Albaugh, M.D., Allgaier, N., Wonnell, A., Banaschewski, T., Bokde, A.L.W., Bromberg, U., Buchel, C., Quinlan, E.B., Conrod, P.J., Desrivieres, S., Flor, H., Frohner, J.H., Frouin, V., Gowland, P., Heinz, A., Itterman, B., Martinot, J.L., Paillere Martinot, M.L., Artiges, E., Nees, F., Papadopoulos Orfanos, D., Poustka, L., Robbins, T.W., Smolka, M.N., Walter, H., Whelan, R., Schumann, G., Potter, A.S., Garavan, H., 2018. Individual differences in stop-related activity are inflated by the adaptive algorithm in the stop signal task. Hum Brain Mapp 39, 3263-3276.

      Dhamala, E., Yeo, B.T.T., Holmes, A.J., 2022. Methodological Considerations for Brain-Based Predictive Modelling in Psychiatry. Biological Psychiatry.

      Di, X., Zhang, Z.G., Biswal, B.B., 2021. Understanding psychophysiological interaction and its relations to beta series correlation. Brain Imaging and Behavior 15, 958-973.

      Dougherty, D.M., Mathias, C.W., Dawes, M.A., Furr, R.M., Charles, N.E., Liguori, A., Shannon, E.E., Acheson, A., 2013. Impulsivity, attention, memory, and decision-making among adolescent marijuana users. Psychopharmacology (Berl) 226, 307-319.

      Esterman, M., Rothlein, D., 2019. Models of sustained attention. Curr Opin Psychol 29, 174-180.

      Feng, Q., Ren, Z., Wei, D., Liu, C., Wang, X., Li, X., Tie, B., Tang, S., Qiu, J., 2024. Connectome-based predictive modeling of Internet addiction symptomatology. Soc Cogn Affect Neurosci 19.

      Greene, A.S., Gao, S., Scheinost, D., Constable, R.T., 2018. Task-induced brain state manipulation improves prediction of individual traits. Nature Communications 9, 2807.

      Harakeh, Z., de Sonneville, L., van den Eijnden, R.J., Huizink, A.C., Reijneveld, S.A., Ormel, J., Verhulst, F.C., Monshouwer, K., Vollebergh, W.A., 2012. The association between neurocognitive functioning and smoking in adolescence: the TRAILS study. Neuropsychology 26, 541-550.

      Hart, C.L., van Gorp, W., Haney, M., Foltin, R.W., Fischman, M.W., 2001. =. Neuropsychopharmacology 25, 757-765.

      Lawrence, N.S., Ross, T.J., Stein, E.A., 2002. Cognitive mechanisms of nicotine on visual attention. Neuron 36, 539-548.

      Lisdahl, K.M., Price, J.S., 2012. Increased marijuana use and gender predict poorer cognitive functioning in adolescents and emerging adults. J Int Neuropsychol Soc 18, 678-688.

      O'Halloran, L., Cao, Z.P., Ruddy, K., Jollans, L., Albaugh, M.D., Aleni, A., Potter, A.S., Vahey, N., Banaschewski, T., Hohmann, S., Bokde, A.L.W., Bromberg, U., Buchel, C., Quinlan, E.B., Desrivieres, S., Flor, H., Frouin, V., Gowland, P., Heinz, A., Ittermann, B., Nees, F., Orfanos, D.P., Paus, T., Smolka, M.N., Walter, H., Schumann, G., Garavan, H., Kelly, C., Whelan, R., 2018. Neural circuitry underlying sustained attention in healthy adolescents and in ADHD symptomatology. Neuroimage 169, 395-406.

      Potter, A.S., Newhouse, P.A., 2008. Acute nicotine improves cognitive deficits in young adults with attention-deficit/hyperactivity disorder. Pharmacol Biochem Behav 88, 407-417.

      Ren, Z., Daker, R.J., Shi, L., Sun, J., Beaty, R.E., Wu, X., Chen, Q., Yang, W., Lyons, I.M., Green, A.E., Qiu, J., 2021. Connectome-Based Predictive Modeling of Creativity Anxiety. Neuroimage 225, 117469.

      Rosenberg, M.D., Finn, E.S., Scheinost, D., Papademetris, X., Shen, X., Constable, R.T., Chun, M.M., 2016. A neuromarker of sustained attention from whole-brain functional connectivity. Nat Neurosci 19, 165-171.

      Rosenberg, M.D., Scheinost, D., Greene, A.S., Avery, E.W., Kwon, Y.H., Finn, E.S., Ramani, R., Qiu, M., Constable, R.T., Chun, M.M., 2020. Functional connectivity predicts changes in attention observed across minutes, days, and months. Proc Natl Acad Sci U S A 117, 3797-3807.

      Shen, X., Finn, E.S., Scheinost, D., Rosenberg, M.D., Chun, M.M., Papademetris, X., Constable, R.T., 2017. Using connectome-based predictive modeling to predict individual behavior from brain connectivity. Nat Protoc 12, 506-518.

      Valentine, G., Sofuoglu, M., 2018. Cognitive Effects of Nicotine: Recent Progress. Curr Neuropharmacol 16, 403-414.

      Verbruggen, F., Aron, A.R., Band, G.P.H., Beste, C., Bissett, P.G., Brockett, A.T., Brown, J.W., Chamberlain, S.R., Chambers, C.D., Colonius, H., Colzato, L.S., Corneil, B.D., Coxon, J.P., Dupuis, A., Eagle, D.M., Garavan, H., Greenhouse, I., Heathcote, A., Huster, R.J., Jahfari, S., Kenemans, J.L., Leunissen, I., Li, C.S.R., Logan, G.D., Matzke, D., Morein-Zamir, S., Murthy, A., Pare, M., Poldrack, R.A., Ridderinkhof, K.R., Robbins, T.W., Roesch, M.R., Rubia, K., Schachar, R.J., Schall, J.D., Stock, A.K., Swann, N.C., Thakkar, K.N., van der Molen, M.W., Vermeylen, L., Vink, M., Wessel, J.R., Whelan, R., Zandbelt, B.B., Boehler, C.N., 2019. A consensus guide to capturing the ability to inhibit actions and impulsive behaviors in the stop-signal task. Elife 8.

      Whelan, R., Conrod, P.J., Poline, J.B., Lourdusamy, A., Banaschewski, T., Barker, G.J., Bellgrove, M.A., Buchel, C., Byrne, M., Cummins, T.D., Fauth-Buhler, M., Flor, H., Gallinat, J., Heinz, A., Ittermann, B., Mann, K., Martinot, J.L., Lalor, E.C., Lathrop, M., Loth, E., Nees, F., Paus, T., Rietschel, M., Smolka, M.N., Spanagel, R., Stephens, D.N., Struve, M., Thyreau, B., Vollstaedt-Klein, S., Robbins, T.W., Schumann, G., Garavan, H., Consortium, I., 2012. Adolescent impulsivity phenotypes characterized by distinct brain networks. Nat Neurosci 15, 920-925.

      Yoo, K., Rosenberg, M.D., Hsu, W.T., Zhang, S., Li, C.R., Scheinost, D., Constable, R.T., Chun, M.M., 2018. Connectome-based predictive modeling of attention: Comparing different functional connectivity features and prediction methods across datasets. Neuroimage 167, 11-22.

      Young, J.W., Finlayson, K., Spratt, C., Marston, H.M., Crawford, N., Kelly, J.S., Sharkey, J., 2004. Nicotine improves sustained attention in mice: evidence for involvement of the alpha7 nicotinic acetylcholine receptor. Neuropsychopharmacology 29, 891-900.

      Zhao, W., Makowski, C., Hagler, D.J., Garavan, H.P., Thompson, W.K., Greene, D.J., Jernigan, T.L., Dale, A.M., 2023. Task fMRI paradigms may capture more behaviorally relevant information than resting-state functional connectivity. Neuroimage, 119946.

      Zhao, W., Palmer, C.E., Thompson, W.K., Chaarani, B., Garavan, H.P., Casey, B.J., Jernigan, T.L., Dale, A.M., Fan, C.C., 2021. Individual Differences in Cognitive Performance Are Better Predicted by Global Rather Than Localized BOLD Activity Patterns Across the Cortex. Cereb Cortex 31, 1478-1488.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper by Schommartz and colleagues investigates the neural basis of memory reinstatement as a function of both how recently the memory was formed (recent, remote) and its development (children, young adults). The core question is whether memory consolidation processes as well as the specificity of memory reinstatement differ with development. A number of brain regions showed a greater activation difference for recent vs. remote memories at the long versus shorter delay specifically in adults (cerebellum, parahippocampal gyrus, LOC). A different set showed decreases in the same comparison, but only in children (precuneus, RSC). The authors also used neural pattern similarity analysis to characterize reinstatement, though in this revised paper I still have substantive concerns about how the analyses were performed. While scene-specific reinstatement decreased for remote memories in both children and adults, claims about its presence cannot be made given the analyses. Gist-level reinstatement was observed in children but not adults, but I also have concerns about this analysis. Broadly, the behavioural and univariate findings are consistent with the idea that memory consolidation differs between children and adults in important ways, and the paper takes a step towards characterizing how.

      Strengths:

      The topic and goals of this paper are very interesting. As the authors note, there is little work on memory consolidation over development, and as such this will be an important data point in helping us begin to understand these important differences. The sample size is great, particularly given this is an onerous, multi-day experiment; the authors are to be commended for that. The task design is also generally well controlled, for example as the authors include new recently learned pairs during each session.

      Weaknesses:

      As noted above and in my review of the original submission, the pattern similarity analysis for both item and category-level reinstatement were performed in a way that is not interpretable given concerns about temporal autocorrelation within scanning run. Unfortunately these issues remain of concern in this revision because they were not rectified. Most of my review focuses on this analytic issue, though I also outline additional concerns.

      (1) The pattern similarity analyses are largely uninterpretable due to how they were performed.

      (a) First, the scene-specific reinstatement index: The authors have correlated a neural pattern during a fixation cross (delay period) with a neural pattern associated with viewing a scene as their measure of reinstatement. The main issue with this is that these events always occurred back-to-back in time. As such, the two patterns will be similar due simply to the temporal autocorrelation in the BOLD signal. Because of the issues with temporal autocorrelation within scanning run, it is always recommended to perform such correlations only across different runs. In this case, the authors always correlated patterns extracted from the same run, and which moreover have temporal lags that are perfectly confounded with their comparison of interest (i.e., from Fig 4A, the "scene-specific" comparisons will always be back-to-back, having a very short temporal lag; "set-based" comparisons will be dispersed across the run, and therefore have a much higher lag). The authors' within-run correlation approach also yields correlation values that are extremely high - much higher than would be expected if this analysis was done appropriately. The way to fix this would be to restrict the analysis to only cross-run comparisons, which is not possible given the design.
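The autocorrelation argument above can be illustrated with a toy simulation. This is a minimal sketch, not a model of the authors' data: it assumes pure-noise voxel time series in which each voxel follows an AR(1) process, and the coefficient `phi`, the array sizes, and both function names are hypothetical choices made only for illustration. Even with no signal at all, multivoxel patterns from adjacent time points correlate strongly, while patterns far apart in time do not.

```python
import numpy as np

def simulate_ar1_patterns(n_timepoints, n_voxels, phi, rng):
    """Pure-noise voxel time series; each voxel follows an AR(1) process."""
    noise = rng.standard_normal((n_timepoints, n_voxels))
    data = np.empty_like(noise)
    data[0] = noise[0]
    scale = np.sqrt(1.0 - phi ** 2)  # keeps the variance stationary at 1
    for t in range(1, n_timepoints):
        data[t] = phi * data[t - 1] + scale * noise[t]
    return data

def mean_similarity_at_lag(data, lag):
    """Average Pearson r between multivoxel patterns `lag` samples apart."""
    rs = [np.corrcoef(data[t], data[t + lag])[0, 1]
          for t in range(data.shape[0] - lag)]
    return float(np.mean(rs))

rng = np.random.default_rng(0)
# phi = 0.8 is an arbitrary illustrative value, not an estimate for BOLD data.
data = simulate_ar1_patterns(n_timepoints=200, n_voxels=500, phi=0.8, rng=rng)
short_lag_r = mean_similarity_at_lag(data, lag=1)   # back-to-back events
long_lag_r = mean_similarity_at_lag(data, lag=30)   # events dispersed in time
```

In this toy setting the short-lag similarity is driven entirely by the noise structure, which is why a comparison whose conditions differ in temporal lag cannot be read as evidence of reinstatement.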

      To remedy this, in the revision the authors have said they will refrain from making conclusions about the presence of scene-specific reinstatement (i.e., reinstatement above baseline). While this itself is an improvement from the original manuscript, I still have several concerns. First, this was not done thoroughly and at times conclusions/interpretations still seem to imply or assume the presence of scene reinstatement (e.g., line 979-985, "our research supports the presence of scene-specific reinstatement in 5-to-7-year-old children"; line 1138). Second, the authors' logic for the neural-behavioural correlations in the PLSC analysis involved restricting to regions that showed significant reinstatement for the gist analysis, which cannot be done for the analogous scene-specific reinstatement analysis. This makes it challenging to directly compare these two analyses since one was restricted to a small subset of regions and only children (gist), while scene reinstatement included both groups and all ROIs. Third, it is also unclear whether children and adults' values should be directly comparable given pattern similarity can be influenced by many factors like motion, among other things.

      My fourth concern with this analysis relates to the lack of regional specificity of the effects. All ROIs tested showed a virtually identical pattern: "Scene-specific reinstatement" decreased across delays, and was greater in children than adults. I believe control analyses are needed to ensure artifacts are not driving these effects. This would greatly strengthen the authors' ability to draw conclusions from the "clean" comparison of day 1 vs. day 14. (A) The authors should present results from a control ROI that should absolutely not show memory reinstatement effects (e.g., white matter?). Results from the control ROI should look very different - should not differ between children and adults, and should not show decreases over time. (B) Do the recent items from day 1 vs. day 14 differ? If so, this could suggest something is different about the later scans (and if not, it would be reassuring). (C) If the same analysis was performed comparing the object cue and immediately following fixation (rather than the fixation and the immediately following scene), the results should look very different. I would argue that this should not be an index of reinstatement at all since it involves something presented visually rather than something reinstated (i.e., the scene picture is not included in this comparison). If this control analysis were to show the same effects as the primary analysis, this would be further evidence that this analysis is uninterpretable and hopelessly confounded.

      (b) For the category-based neural reinstatement: (1) This suffers from the same issue of correlations being performed within run. Again, to correct this the authors would need to restrict comparisons to only across runs (i.e., patterns from run 1 correlated with patterns for run 2 and so on). The authors in their response letter have indicated that because the patterns being correlated are not derived from events in close temporal proximity, they should not suffer from the issue of temporal autocorrelation. This is simply not true. For example, see the paper by Prince et al. (eLife 2022; on GLMsingle). This is not the main point of Prince et al.'s paper, but it includes a nice figure that shows that, using standard modelling approaches, the correlation between (same-run) patterns can be artificially elevated for lags as long as ~120 seconds (and can even be artificially reduced after that; Figure 5 from that paper) between events. This would affect many of the comparisons in the present paper. The cleanest way to proceed is to simply drop the within-run comparisons, which I believe the authors can do and yet they have not. Relatedly, in the response letter the authors say they are focusing mainly on the change over time for reinstatement at both levels including the gist-type reinstatement; however, this is not how it is discussed in the paper. They in fact are mainly relying on differences from zero, as children show some "above baseline" reinstatement while adults do not, but I believe there were no significant differences over time (i.e., the findings the authors said they would lean on primarily, as they are arguably the most comparable). (2) This analysis uses a different approach of comparing fixations to one another, rather than fixations to scenes. In their response letter and the revised paper, the authors do provide a bit of reasoning as to why this is the most sensible. 
However, it is still not clear to me whether this is really "reinstatement" which (in my mind) entails the re-evoking of a neural pattern initially engaged during perception. Rather, could this be a shared neural state that is category specific? In any case, I think additional information should be added to the text to clarify that this definition differs from others in the literature. The authors might also consider using some term other than reinstatement. Again (as I noted in my prior review), the finding of no category-level reinstatement in adults is surprising and confusing given prior work and likely has to do with the operationalization of "reinstatement" here. I was not quite sure about the explanation provided in the response letter, as category-level reinstatement is quite widespread in the brain for adults and is robust to differences in analytic procedures etc. (3) Also from a theoretical standpoint, I'm still a bit confused as to why gist-based reinstatement would involve reinstatement of the scene gist, rather than the object's location (on the screen) gist. Were the locations on the screen similar across scene backgrounds from the same category? It seems like a different way to define memory retrieval here would be to compare the neural patterns when cued to retrieve the same vs. similar (at the "gist" level) vs. different locations across object-scene pairs. This is somewhat related to a point from my review of the initial version of this manuscript, about how scene reinstatement is not necessary. The authors state that participants were instructed to reinstate the scene, but that does not mean they were actually doing it. The point that what is being measured via the reinstatement analyses is actually not necessary to perform the task should be discussed in more detail in the paper.
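The recommendation to drop within-run comparisons can be sketched as a mask over the trial-by-trial similarity matrix. This is a minimal illustration under stated assumptions: one pattern per trial is already available as a row of `patterns`, `run_labels` records which scanning run each trial came from, and the function name is hypothetical rather than taken from the authors' pipeline.

```python
import numpy as np

def cross_run_similarity(patterns, run_labels):
    """Mean Pearson r over trial pairs that come from different runs.

    patterns   : (n_trials, n_voxels) array, one pattern per trial
    run_labels : (n_trials,) array giving the run of each trial
    """
    patterns = np.asarray(patterns, dtype=float)
    run_labels = np.asarray(run_labels)
    r = np.corrcoef(patterns)                           # trial-by-trial similarity
    different_run = run_labels[:, None] != run_labels[None, :]
    upper = np.triu(np.ones(r.shape, dtype=bool), k=1)  # count each pair once
    keep = different_run & upper
    if not keep.any():
        raise ValueError("no cross-run pairs available")
    return float(r[keep].mean())
```

Pairs drawn from the same run never enter the average, so lag-dependent autocorrelation within a run cannot inflate or deflate the estimate. The cost is fewer usable pairs, which matters when categories are unevenly distributed across runs.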

      (2) Inspired by another reviewer's comment, it is unclear to me the extent to which age group differences can be attributed to differences in age/development versus memory strength. I liked the other reviewer's suggestions about how to identify and control for differences in memory strength, which I don't think the authors actually did in the revision. They instead showed evidence that memory strength does seem to be lower in children, which indicates this is an interpretive confound. For example, the reviewer's suggestion of performing analyses on subsets of participants who were actually matched in initial learning/memory performance would have been very informative. As it is, the authors didn't really control for memory strength adequately in my opinion, and as such their conclusions about children vs. adults could have been reframed as people with weak vs. strong memories. This is obviously a big drawback given what the authors want to conclude. Relatedly, I'm not sure the DDM was incorporated as the reviewer was suggesting; at minimum I think the authors need to do more work in the paper to explain what this means and why it is relevant. (I understand putting it in the supplement rather than the main paper, but I still wanted to know more about what it added from an interpretive perspective.)

      (3) Some of the univariate results reporting is a bit strange, as they are relying upon differences between retrieval of 1- vs. 14-day memories in terms of the recent vs. remote difference, and yet don't report whether the regions are differently active for recent and remote retrieval. For example in Figure 3A, neither anterior nor posterior hippocampus seem to be differentially active for recent vs. remote memories for either age group (i.e., all data is around 0). Precuneus also interestingly seems to show numerically recent>remote (values mostly negative), whereas most other regions show the opposite. This difference from zero (in either direction) or lack thereof seems important to the message. In response to this comment on the original manuscript, the authors seem to have confirmed that hippocampal activity was greater during retrieval than implicit baseline. But this was not really my question - I was asking whether hippocampus is (and other ROIs in this same figure are) differently engaged for recent vs. remote memories.

      (4) Related to point 3, the claims about hippocampus with respect to multiple trace theory feel very unsupported by the data. I believe the authors want to conclude that children's memory retrieval shows reliance on hippocampus irrespective of delay, presumably because this is a detailed memory task. However the authors have not really shown this; all they have shown is that hippocampal involvement (whatever it is) does not vary by delay. But we do not have compelling evidence that the hippocampus is involved in this task at all. That hippocampus is more active during retrieval than implicit baseline is a very low bar and does not necessarily indicate a role in memory retrieval. If the authors want to make this claim, more data are needed (e.g., showing that hippocampal activity during retrieval is higher when the upcoming memory retrieval is successful vs. unsuccessful). In the absence of this, I think all the claims about multiple trace theory supporting retrieval similarly across delays and that this is operational in children are inappropriate and should be removed.

      (5) There are still not enough methodological details in the main paper to make sense of the results. Some of these problems were addressed in the revision but others remain. For example, a couple of things that were unclear: that initially learned locations were split, where half were tested again at day 1 and the other half at day 14; what specific criterion was used to pick the 'well-learned' associations that were used for comparisons at different delay periods (object-scene pairs that participants remembered accurately in the last repetition of learning? Or across all of learning?).

      (6) I still find the revised Introduction a bit unclear. I appreciated the added descriptions of different theories of consolidation, though the order of presented points is still a bit hard to follow. Some of the predictions I also find a bit confusing as laid out in the introduction. (1) As noted in the paper, multiple trace theory predicts that hippocampal involvement will remain high provided memories retained are sufficiently high detail. The authors however also predict that children will rely more on gist (than detailed) memories than adults, which would seem to imply (combined with the MTT idea) that they should show reduced hippocampal involvement over time (while in adults, it should remain high). However, the authors' actual prediction is that hippocampus will show stable involvement over time in both kids and adults. I'm having a hard time reconciling these points. (2) With respect to the extraction of gist in children, I was confused by the link to Fuzzy Trace Theory given the children in the present study are a bit young to be showing the kind of gist extraction shown in the Brainerd & Reyna data. Would 5-7 year olds not be more likely to show reliance on verbatim traces under that framework? Also from a phrasing perspective, I was confused about whether gist-like information was something different from just gist in this sentence: "children may be more inclined to extract gist information at the expense of detailed or gist-like information." (p. 8) - is this a typo?

      (7) For the PLSC, if I understand this correctly, the profiles were defined for showing associations with behaviour across age groups. (1) As such, is it not "double dipping" to then show that there is an association between brain profile and behaviour; must this not be true by definition? If I am mistaken, it might be helpful to clarify this in the paper. (2) In addition, I believe for the univariate and scene-specific reinstatement analyses these profiles were defined across both age groups. I assume this doesn't allow for separate definition of profiles across the two groups (i.e., a kind of "interaction"). If this is the case, it makes sense that there would not be big age differences... the profiles were defined for showing an association across all subjects. If the authors wanted to identify distinct profiles in children and adults they may need to run another analysis. (3) Also, as for differences between short delay brain profile and long delay brain profile for the scene-specific reinstatement - there are 2 regions that become significant at long delay that were not significant at a short delay (PC, and CE). However, given there are ceiling effects in behaviour at the long but not short delay, it's unclear if this is a meaningful difference or just a difference in sensitivity. Is there a way to test whether the profiles are statistically different from one another? (4) As I mentioned above, it also was not ideal in my opinion that all regions were included for the scene-specific reinstatement due to the authors' inability to have an appropriate baseline and therefore define above-chance reinstatement. It makes these findings really challenging to compare with the gist reinstatement ones.

      (8) I would encourage the authors to be specific about whether they are measuring/talking about memory representations versus reinstatement, unless they think these are the same thing (in which case some explanation as to why would be helpful). For example, especially under the Fuzzy Trace framework, couldn't someone maintain both verbatim and gist traces of a memory yet rely more on one when making a memory decision?

      (9) With respect to the learning criteria - it is misleading to say that "children needed between two to four learning-retrieval cycles to reach the criterion of 83% correct responses" (p. 9). Four was the maximum, and looking at the Figure 1C data it appears as though there were at least a few children who did not meet the 83% minimum. I believe they were included in the analysis anyway? Please clarify. Was there any minimum imposed for inclusion?

      (10) For the gist-like reinstatement PLSC analysis, results are really similar at short and long delays and yet some of the text seems to imply specificity to the long delay. One is a trend and one is significant (p. 31), but surely these two associations would not be statistically different from one another?

      (11) As a general comment, I had a hard time tying all of the (many) results together. For example adults show more mature neocortical consolidation-related engagement, which the authors say is going to create more durable detailed memories, but under multiple trace theory we would generally think of neocortical representations as providing more schematic information. If the authors could try to make more connections across the different neural analyses, as well as tie the neural findings in more closely with the behaviour & back to the theoretical frameworks, that would be really helpful.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Reviews):

      Summary:

      This paper by Schommartz and colleagues investigates the neural basis of memory reinstatement as a function of both how recently the memory was formed (recent, remote) and its development (children, young adults). The core question is whether memory consolidation processes as well as the specificity of memory reinstatement differ with development. A number of brain regions showed a greater activation difference for recent vs. remote memories at the long versus shorter delay specifically in adults (cerebellum, parahippocampal gyrus, LOC). A different set showed decreases in the same comparison, but only in children (precuneus, RSC). The authors also used neural pattern similarity analysis to characterize reinstatement, though I have substantive concerns about how this analysis was performed and as such will not summarize the results. Broadly, the behavioural and univariate findings are consistent with the idea that memory consolidation differs between children and adults in important ways, and takes a step towards characterizing how.

      Strengths:

      The topic and goals of this paper are very interesting. As the authors note, there is little work on memory consolidation over development, and as such this will be an important data point in helping us begin to understand these important differences. The sample size is great, particularly given this is an onerous, multi-day experiment; the authors are to be commended for that. The task design is also generally well controlled, for example as the authors include new recently learned pairs during each session.

      Weaknesses:

      As noted above, the pattern similarity analysis for both item and category-level reinstatement was performed in a way that is not interpretable given concerns about temporal autocorrelation within the scanning run. Below, I focus my review on this analytic issue, though I also outline additional concerns.

      We thank the reviewer for both the positive and critical appraisal of our paper.

      (1) The pattern similarity analyses were not done correctly, rendering the results uninterpretable (assuming my understanding of the authors' approach is correct).

      a. First, the scene-specific reinstatement index: The authors have correlated a neural pattern during a fixation cross (delay period) with a neural pattern associated with viewing a scene as their measure of reinstatement. The main issue with this is that these events always occurred back-to-back in time. As such, the two patterns will be similar due simply to the temporal autocorrelation in the BOLD signal. Because of the issues with temporal autocorrelation within the scanning run, it is always recommended to perform such correlations only across different runs. In this case, the authors always correlated patterns extracted from the same run, which moreover have temporal lags that are perfectly confounded with their comparison of interest (i.e., from Fig 4A, the "scene-specific" comparisons will always be back-to-back, having a very short temporal lag; "set-based" comparisons will be dispersed across the run, and therefore have a much higher lag). The authors' within-run correlation approach also yields correlation values that are extremely high - much higher than would be expected if this analysis was done appropriately. The way to fix this would be to restrict the analysis to only cross-run comparisons, but I don't believe this is possible unfortunately given the authors' design; I believe the target (presumably reinstated) scene only appears once during scanning, so there is no separate neural pattern during the presentation of this picture that they can use. For these reasons, any evidence for "significant scene-specific reinstatement" and the like is completely uninterpretable and would need to be removed from the paper.

      We thank the reviewer for this important input. We acknowledge that our study design leads to temporal autocorrelation in the BOLD signal when calculating RSA between fixation and scene time windows. We also recognize that we cannot interpret the significance of scene-specific reinstatement compared to zero and have accordingly removed this information. Nevertheless, our primary objective was to investigate changes in scene-specific reinstatement in relation to the different time delays of retrieval. Given that the retrieval procedure is the same over time and presumably similarly influenced by temporal autocorrelations, we argue that our results must be attributed to the relative differences in reinstatement across recent and remote trials. Bearing this in mind, we argue that our results can be interpreted in terms of delay-related changes in reinstatement. This information is discussed in pp. 21, 40 of the manuscript.

      We agree with the reviewer that cross-run comparisons would be extremely interesting. This could be achieved by introducing the same items repeatedly across different runs, which was not possible in our current setup because we were interested in single-exposure retrieval and faced practical time restrictions when scanning children. We have introduced this idea in the Limitations and Discussion sections (pp. 40, 44) of the manuscript to inform future studies.

      Finally, thanks to the reviewer’s comment, we identified a bug in the final steps of our RSA calculation. Fisher’s z-transformation was incorrectly applied to r-1 values, resulting in abnormally high values. We apologize for this error. We have revised the scripts and rectified the bug by correctly applying Fisher’s z-transformation to the r similarity values. We also adjusted the methods description figure accordingly (Figure 5, p. 22). This adjustment led to slightly altered reinstatement indices. Nevertheless, the overall pattern of delay-related attenuation in the scene-specific reinstatement index, observed in both children and adults, remains consistent. Similarly, we observed gist-like reinstatement uniquely in children.
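The kind of error described above can be sketched in a few lines. This is an illustration only, not the authors' script: Fisher's z-transformation is the inverse hyperbolic tangent of a correlation coefficient, and reading "r-1 values" as the dissimilarity 1 − r is an assumption made here for the example. The sample correlations are hypothetical.

```python
import numpy as np

def fisher_z(r):
    """Fisher z-transformation: z = arctanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    r = np.asarray(r, dtype=float)
    return np.arctanh(r)

# Hypothetical small pattern-similarity values.
r = np.array([0.05, 0.10, 0.20])

z_from_similarity = fisher_z(r)           # intended: transform r itself
z_from_dissimilarity = fisher_z(1.0 - r)  # assumed form of the bug: transform 1 - r
```

Under this reading, small similarities map to dissimilarities close to 1, where the transformation grows steeply, which would be consistent with the inflated values the reviewer noticed.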

      b. From a theoretical standpoint, I believe the way this analysis was performed considering the fixation and the immediately following scene also means that the differences between recent and remote could have to do with either the reactivation (processes happening during the fixation, presumably) or differences in the processing of the stimulus itself (happening during the scene presentation). For example, people might be more engaged with the more novel scenes (recent) and therefore process those scenes more; such a difference would be interpreted in this analysis as having to do with reinstatement, but in fact could be just related to the differential scene processing/recognition, etc.

      Thank you for your insightful comments. We acknowledge the theoretical concerns raised about distinguishing between the effects of reactivation processes occurring during fixation and differential processing of the stimulus itself during scene presentation. Specifically, the notion that engagement levels with recent scenes could result in enhanced processing, which might be misattributed to memory reinstatement mechanisms.

      We argue, however, that during scene presentation, scenes are processed more “memory-wise” rather than “perception-wise”, since both recent and remote memories are well-learned, as we included only correctly recalled memories in the analysis.

      We concur that scene presentations entail perceptual processing; however, such processing would be consistent across all items, given that they were presented with the same repeated learning procedure, rendering them equally familiar to participants. In addition, we would argue that distinct activation patterns elicited during varying delays are more likely attributable to memory-related processing, since participants actively engaged in a memory-based decision-making task during these intervals. We have incorporated this rationale into the discussion section of our manuscript (p. 40).

      With this in mind, we hypothesized that in the case of “memory-wise” processing, neural engagement during the scene time window should be higher for remote than for recent items, and that this difference should increase with passing time, as retrieval should require more control and effort due to the reorganized and distributed nature of memories. If the scenes are processed more “perception-wise”, we would expect higher neural engagement during the retrieval of recent compared to remote items. Our exploratory analysis (detailed overview in supplementary materials, Figure S3, Table S9) revealed higher neural activation for remote compared to recent items in medial temporal, prefrontal, occipital and cerebellar brain regions, supporting the notion of “memory-wise” processes during the scene time window. However, this exploratory analysis cannot provide a direct solution to the reviewer’s concern, as our paradigm per se cannot arbitrate between the “memory-wise” and “perception-wise” nature of retrieval. We added this point to the discussion (see p. 40).

      c. For the category-based neural reinstatement:

      (1) This suffers from the same issue of correlations being performed within the run. Again, to correct this the authors would need to restrict comparisons to only across runs (i.e., patterns from run 1 correlated with patterns for run 2 and so on). With this restriction, it may or may not be possible to perform this analysis, depending upon how the same-category scenes are distributed across runs. However, there are other issues with this analysis, as well.

      (2) This analysis uses a different approach of comparing fixations to one another, rather than fixations to scenes. The authors do not motivate the reason for this switch. Please provide reasoning as to why fixation-fixation is more appropriate than fixation-scene similarity for category-level reinstatement, particularly given the opposite was used for item-level reinstatement. Even if the analyses were done properly, it would remain hard to compare them given this difference in approach.

      (3) I believe the fixation cross with itself is included in the "within category" score. Is this not a single neural pattern correlated with itself, which will yield maximal similarity (Pearson r=1) or minimal dissimilarity (1-Pearson r=0)? Including these comparisons in the averages for the within-category score will inflate the difference between the "within-category" and "between-category" comparisons. These (e.g., forest1-forest1) should not be included in the within-category comparisons considered; rather, they should be excluded, so the fixations are always different but sometimes the comparisons are two retrievals of the same scene type (forest1-forest2), and other times different scene types (forest1-field1).

      (4) It is troubling that the results from the category reinstatement metric do not seem to conceptually align with past work; for example, a lot of work has shown category-level reinstatement in adults. Here the authors do not show any category-level reinstatement in adults (yet they do in children), which generally seems extremely unexpected given past work and I would guess has to do with the operationalization of the metric.

      Thank you for this important input regarding category-based reinstatement.

      (1) The distribution of within-category items across runs was approximately similar and balanced. Additionally, within runs, they were presented randomly without close temporal proximity. Based on this arrangement, we believe that the issue of close temporal autocorrelation, as pointed out by the reviewer in the context of scene-specific reinstatement, may not apply to the same extent here. Again, our focus is not on the absolute level of category-based reinstatement, but on the relative difference across conditions (recent vs. remote short delay vs. remote long delay), which are equally impacted by the autocorrelations.

      (2) We apologize for not motivating this analysis further. Whereas the scene-reinstatement index (i.e., fixation to scene correlation) gives us a measure of the pre-activation of a concrete scene (e.g., a yellow forest in autumn), the gist-like reinstatement gives us a measure of the pre-activation of a whole category of scenes (e.g., forests). Critically, our window of interest is the fixation period for both sets of analyses (in the absence of any significant visual input). The scene-specific reinstatement uses the scene window as a neural template against which the fixation period can be compared, while the gist-like reinstatement compares the similarity of reactivation patterns for trials that come from the same category but differ in the exact memory content. The reinstatement of more generic, gist-like memory (e.g., forest) across multiple trials should yield more similar neural activation patterns. Significant gist-like reinstatement would suggest that neural patterns for scenes within the same category are more generic, as indicated by higher similarity among them. On the other hand, a more detailed reinstatement of specific types of forests (e.g., a yellow forest in autumn, green pine trees, a bare-leaved forest in spring, etc.) that differ in various dimensions could result in neural activation patterns that are as dissimilar as those seen in the reinstatement of scenes from entirely different categories. Through this methodology, we could distinguish between more generic, gist-like reinstatement and more specific, detailed reinstatement. This is now clarified in the manuscript, see p. 25.

      (3) We apologize for the confusion caused by the figure and analysis description. In our analysis, we indeed excluded the correlation of the fixation cross with itself. Consequently, the diagonal in the figure should be blank to indicate this. This is now revised in the manuscript (Figure 7B and in Methods).

      (4) We appreciate your concern and recognize that the terminology we used might not align perfectly with the conventional understanding of category-based reinstatement. Typically, category-level neural representations (as discussed in Polyn et al., 2005; Jafarpour et al., 2014; among others) are investigated to identify specific brain areas associated with encoding/perception of scenes or faces. Our aim, however, was to explore the mnemonic reinstatement of highly detailed scenes that were elaborately encoded, with the hypothesis that substantial representational transformations would occur over time and vary with age. This hypothesis is based on the memory literature, including the Fuzzy-Trace Theory, the Contextual Binding Theory, and the Trace Transformation Theory (Brainerd & Reyna, 1998; Yonelinas, 2019; Moscovitch & Gilboa, 2023). Therefore, we renamed 'category-based' reinstatement to 'gist-like' reinstatement, which clarifies our concept and better aligns it with existing literature.

      We anticipated that young adults, having the ability to retain detailed narratives post-encoding, would demonstrate a reinstatement of scenes with distinct details, making these scenes dissimilar from each other (see similar findings in Sommer et al., 2021). In contrast, given the anticipated lesser strategic elaboration during learning in children, we hypothesized that they would demonstrate a shallower, more gist-like reinstatement (for instance, children recalling a forest or a field in a general sense without specific details or vivid imagery). This could result in higher category-based similarity, as children might reinstate a more generic forest concept.

      We did not gather additional data on the verbal quality of reinstatement due to the limited scanning time available for children, so these assumptions remain unverified. However, anecdotal observations post-retrieval indicated that adults often reported very vivid scenes associated with clear narrative recall. In contrast, children frequently described more vague memories (e.g., “I know it was a forest”) without specific details. Future studies should include measures to assess the quality of reinstatement, potentially outside the scanning environment.
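
      To make the logic of the gist-like index concrete, the following is a minimal sketch of how within-category and between-category similarity of fixation-period patterns could be contrasted while excluding each pattern's correlation with itself. It is an illustration under stated assumptions, not the analysis code used for the manuscript: the function name `gist_like_index`, the input layout (one row per trial), and the use of raw, untransformed Pearson correlations are all hypothetical choices.

```python
import numpy as np


def gist_like_index(patterns, categories):
    """Within- minus between-category similarity of fixation-period patterns.

    patterns   : array (n_trials, n_voxels), one fixation-period pattern per trial
                 (hypothetical layout, assumed for this sketch).
    categories : length n_trials sequence of category labels (e.g., "forest").
    """
    patterns = np.asarray(patterns, dtype=float)
    categories = np.asarray(categories)

    # Pairwise Pearson correlations between all trial patterns.
    r = np.corrcoef(patterns)

    # True where two trials belong to the same scene category.
    same = categories[:, None] == categories[None, :]

    # Strict upper triangle: each pair counted once, diagonal (self-correlation) excluded.
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)

    within = r[upper & same].mean()
    between = r[upper & ~same].mean()
    return within - between
```

      Because only the strict upper triangle enters the averages, comparisons such as forest1-forest1 never contribute, so the within-category score cannot be inflated by self-correlations.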

      (2) I did not see any compelling statistical evidence for the claim of less robust consolidation in children.

      Specifically in terms of the behavioral results of retention of the remote items at 1 vs 14 days, shown in Figure 2B, the authors conclude that memory consolidation is less robust in children (line 246). Yet they do not report statistical evidence for this point, as there was no interaction of this effect with the age group. Children had worse memory than adults overall (in terms of a main effect - i.e. across recent and remote items). If it were consolidation-specific, one would expect that the age differences are bigger for the remote items, and perhaps even most exaggerated for the 14-day-old memories. Yet this does not appear to be the case based on the data the authors report. Therefore, the behavioral differences in retention do not seem to be consolidation specific, and therefore might have more to do with differences in encoding fidelity or retrieval processes more generally across the groups. This should be considered when interpreting the findings.

      Thank you for highlighting this important issue. We acknowledge that our initial description and depiction of our behavioral findings may not have effectively conveyed the main message about memory consolidation. Therefore, we have revised the behavioral results section (see pp. 12-14) to communicate our message more clearly.

      As detailed in the methods section, we reported retention rates only for those items that were correctly (100%) learned on day 0, day 1, and day 14. This approach meant that different participants had varying numbers of items learned correctly. However, this strategy allowed us to address our primary question: whether memory consolidation, based on all items initially encoded successfully, is comparably robust between groups.

      To illustrate more clearly and straightforwardly how retention rate slopes change over time for recently learned items (i.e., tested 30 minutes after learning), short delay remote items, and long delay remote items, relative to the initially correctly learned items, we conducted the following analysis: after observing no differences between sessions in both age groups for recent items on days 1 and 14, we combined the recent items. This approach enabled us to investigate how the slope of memory retention for initially correctly learned items (with a baseline of 100%) changes over time. We observed a significant interaction between item type (recent, short delay remote, long delay remote) and group (F(3,250) = 17.35, p < .001, ω² = .16). The follow-up of this interaction revealed significantly less robust memory consolidation across all delay times in children compared to young adults. This information has been added to the manuscript on pp. 12-14. We have also updated the figures, incorporating the baseline of 100% correct performance.
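
      As an illustration of the retention measure described above, the following minimal sketch computes a retention rate relative to the items that were initially learned correctly, so that the baseline is 100% by construction. The function name `retention_rate` and the representation of items as identifiers are hypothetical; this is a sketch of the definition, not the code used for the reported analysis.

```python
def retention_rate(learned_correct, retrieved_correct):
    """Percentage of initially correctly learned items that are still retrieved correctly.

    learned_correct   : iterable of item identifiers answered correctly at the end of learning.
    retrieved_correct : iterable of item identifiers answered correctly at retrieval.
    """
    learned = set(learned_correct)
    if not learned:
        raise ValueError("no correctly learned items to compute a retention rate from")

    # Only items that were learned correctly can count as retained.
    retained = learned & set(retrieved_correct)
    return 100.0 * len(retained) / len(learned)
```

      Because the denominator is each participant's own set of correctly learned items, participants who learned different numbers of items are placed on the same 0-100% scale.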

      (3) Please clarify which analyses were restricted to correct retrievals only. The univariate analyses states that correct and incorrect trials were modelled separately but does not say which were considered in the main contrast (I assume correct only?). The item specific reinstatement analysis states that only correct trials were considered, but the category-level reinstatement analysis does not say. Please include this detail.

      Thank you for bringing this to our attention. We indeed limited our analysis – including univariate, specific reinstatement, and gist-like analyses – to only correctly remembered items. This decision was made because our goal was to observe delay-related changes in the neural correlates of correct memories, which are potentially stronger. We have incorporated this information into the manuscript.

      (4) To what extent could performance differences be impacting the differences observed across age groups? I think (see prior comment) that the analyses were probably limited to correct trials, which is helpful, but still yields pretty big differences across groups in terms of the amount of data going into each analysis. In general, children showed more attenuated neural effects (e.g., recent/remote or session effects); could this be explained by their weaker memory? Specifically, if only correct trials are considered that means that fewer trials would be going into the analysis for kids, especially for the 14-day remote memories, and perhaps pushing the remote > recent difference for this condition towards 0. The authors might be able to address this analytically; for example, does the remote > recent difference in the univariate data at day 14 correlate with day 14 memory?

      Thank you for pointing this out. Indeed, there was a significant relationship between the remote > recent difference in the univariate data and memory performance at day 14 across both age groups (see Figure 4C-D). The performance of all participants, including children, was above chance level for remote trials on day 14. In addition, although the number of remote trials was lower in children (18 trials on average) than in adults (22 trials on average), we believe that the number of remote trials was neither too low nor too different across groups for the contrast.

      (5) Some of the univariate results reporting is a bit strange, as they are relying upon differences between retrieval of 1- vs. 14-day memories in terms of the recent vs. remote difference, and yet don't report whether the regions are differently active for recent and remote retrieval. For example, in Figure 3A, neither anterior nor posterior hippocampus seem to be differentially active for recent vs. remote memories for either age group (i.e., all data is around 0). This difference from zero or lack thereof seems important to the message - is that correct? If so, can the authors incorporate descriptions of these findings?

      Thank you for this valuable input. When examining recent and remote retrieval separately, both the anterior and posterior regions of the hippocampus indeed exhibited activation significantly different from zero in adults (all p < .0003, FDR-corrected) and children (all p < .014, FDR-corrected, except for recent posterior hippocampus) during all delays. We have included this information in the manuscript (see p. 17) and added it to the supplementary materials (Figure S2, Table S7).

      (6) Please provide more details about the choices available for locations in the 3AFC task. (1) Were they different each time, or always the same? If they are always the same, could this be a motor or stimulus/response learning task? (2) Do the options in the 3AFC always come from the same area - in which case the participant is given a clue as to the gist of the location/memory? Or are they sometimes randomly scattered across the image (in which case gist memory, like at a delay, would be sufficient for picking the right option)? Please clarify these points and discuss the logic/impact of these choices on the interpretation of the results.

      Thank you for pointing this out. During learning and retrieval, we employed the 3AFC (Three-Alternative Forced Choice) task.

      The choices for locations varied across scenes while remaining the same across time within individuals. There were 18 different key locations for the objects, distributed across the stimulus set. This means the locations of the objects were quite heterogeneous and differed between objects. The location of the object within the task was presented once during encoding and remained consistent throughout learning. Given the location heterogeneity, we believe our task cannot be reduced to a mere “stimulus/response learning task” but is more accurately described as an object-location association task.

      As noted above, the options for the 3AFC task did not originate from the same area, as there were 18 different areas in total. The correct answer was distributed equally across the three choice options: sometimes it was the left option, sometimes the middle option, and sometimes the right option. Therefore, we believe that the 3AFC task did not provide clues to the location but required detailed and precise memory of the location. Moreover, the options were not randomly scattered but rather presented close together in the scene, demanding a high level of differentiation between choices.

      Taking all the above into consideration, we assert that precise object-location associative memory is necessary for a correct answer. We have added this information to the manuscript (p. 9).

      (7) Often p values are provided but test statistics, effect sizes, etc. are not - please include this information. It is at times hard to tell whether the authors are reporting main effects, interactions, pairwise comparisons, etc.

      Thank you for bringing this to our attention. We realize that including this information in the Tables may not be the most straightforward approach. Therefore, we have incorporated the test statistics, effect sizes, and related details into the text of the results section for clarity.

      (8) There are not enough methodological details in the main paper to make sense of the results. For example, it is not clear from reading the text that there are new object-location pairs learned each day.

      Thank you for pointing this out. We have added this information to the main manuscript. Additionally, we have emphasized this information in the text referring to Figure 1B.

      (9) The retrieval task does not seem to require retrieval of the scene itself, and as such it would be helpful for the authors to explain their reasoning for using this task to measure reinstatement. Strictly speaking, participants could just remember the location of the object on the screen. Was it verified that children and adults were recalling the actual scene rather than just the location (e.g. via self-report)? It's possible that there may be developmental differences in the tendency to reinstate the scene depending on e.g., their strategy.

      Thank you for highlighting this important point. Indeed, the retrieval task included explicit instructions for participants to recall and visualize the scene associated with the object presented during the fixation time window. Participants were also instructed to recollect the location of the object within the scene. Since the location was contextually bound to the scene and each object had a unique location in each scene, the location of the object was always embedded in the specific scene context. We have added this information to both the Methods and Results sections.

      In their self-reports (which unfortunately were not systematically collected on all occasions), participants indicated that when they could recall the scene and the location thanks to the stories created during strategic encoding, this aided their memory for the scene and location immensely. We also concur with your observation that children and young adults may differ in their ability to reinstate scenes, depending on the success of their employed recall strategies. This task was conducted with an awareness of potential developmental differences in the ability to form complex contextual memories. Our elaborative learning procedure was designed to minimize these differences. It is important to note, though, that we did not expect children to achieve performance levels fully comparable to adults. There may indeed be developmental differences in reinstatement, for example due to differences in knowledge availability and accessibility (Brod, Werkle-Bergner, & Shing, 2013). We think that these differences may underlie our findings of neural reinstatement. This is now discussed on pp. 34-35 and 39-43 of the manuscript.

      (10) In general I found the Introduction a bit difficult to follow. Below are a few specific questions I had.

      a. At points findings are presented but the broader picture or take-home point is not expressed directly. For example, lines 112-127, these findings can all be conceptualized within many theories of consolidation, and yet those overarching frameworks are not directly discussed (e.g., that memory traces go from being more reliant on the hippocampus to more on the neocortex). Making these connections directly would likely be helpful for many readers.

      Thank you for bringing this to our attention. We have incorporated a summary of the general frameworks of memory consolidation into the introduction. This addition outlines how our summarized findings, particularly those related to memory consolidation for repeatedly learned information, align with these frameworks (see lines 126-138, 146-150).

      b. Lines 143-153 - The comparison of the Tompary & Davachi (2017) paper with the Oedekoven et al. (2017) reads like the two analyses are directly comparable, but the authors were looking at different things. The Tompary paper is looking at organization (not reinstatement); while the Oedekoven et al. paper is measuring reinstatement (not organization). The authors should clarify how to reconcile these findings.

      Thank you for highlighting this aspect. We have revised how we present the results from Tompary & Davachi (2017). This study examined memory reorganization for memories both with and without overlapping features, and it observed higher neural similarity for memories with overlapping features over time. The authors also explored item-specific reinstatement for recent and remote memories by assessing encoding-retrieval similarity. Since Oedekoven et al. (2017) utilized a similar approach, their results are comparable in terms of reinstatement. We have updated and expanded our manuscript to clarify the parallels between these studies (see lines 157-162).

      c. Line 195-6: I was confused by the prediction of "stable involvement of HC over time" given the work reviewed in the Introduction that HC contribution to memory tends to decrease with consolidation. Please clarify or rephrase.

      Drawing on the Contextual Binding Theory (Yonelinas et al., 2019), as well as the Multiple Trace Theory (Nadel et al., 2000) and supported for instance by evidence from Sekeres et al. (2018), we hypothesized that detailed contextual memories formed through repeated and strategic learning would strengthen the specificity of these memories, resulting in consistent hippocampal involvement for successfully recalled contextualized detailed memories. We have included additional explanatory information in the manuscript to clarify this hypothesis (see lines 217-219).

      d. Lines 200-202: I was a bit confused about this prediction. Firstly, please clarify whether immediate reinstatement has been characterized in this way for kids versus adults. Secondly, don't adults retain gist more over long delays (with specific information getting lost), at least behaviourally? This prediction seems to go against that; please clarify.

      Thank you for raising this important point. Indeed, there are no prior studies that examined memory reinstatement over extended durations in children. The primary existing evidence suggests that neural specificity or patterns of neural representations in children can be robustly observed, while neural selectivity or univariate activation in response to the same stimuli tends to mature later (i.e., Fandakova et al., 2019). Bearing this in mind and recognizing that such neural patterns can be observed in both children and adults, we hypothesized that adults may form stronger detailed contextual memories compared to children. By employing strategies such as creating stories, adults might more easily recall scenes without the need to resort to forming generic or gist-like memories (for example, 'a red fox was near the second left pine tree in a spring green forest'). This assumption aligns with the Fuzzy Trace Theory (Reyna & Brainerd, 1995), which posits that verbatim memories can be created without the extraction of a gist.

      Conversely, we hypothesized that children, due to the ongoing maturation of associative and strategic memory components (as discussed in Shing et al., 2008 and 2010), which are dependent respectively on the hippocampus (HC) and the prefrontal cortex (PFC), would be less adept at creating, retaining, and extracting stories to aid their retrieval process. This could result in them remembering more generic integrated information, like the relationship between a fox and some generic image of a forest. We have added explanatory information to the manuscript to elucidate these points (see lines 225-230).

      Reviewer #1 (Recommendations For The Authors):

      (1) For Figure 3, I would highly recommend changing the aesthetics for the univariate data - at least on my screen they appear to be open boxes with solid vs. dashed lines, and as such look identical to the recent vs. remote distinction in Figure 2B. It also doesn't match the legend for me, which shows the age groups having purple vs. yellow coloring.

      Thank you for this observation. We have adjusted Figure 2 (now Figure 3) (please refer to p. 14) accordingly, now utilizing purple and yellow colors to distinguish between the age groups.

      (2) Lines 329-330, it is not true that "all" indices were significant from zero but this is only apparent if you read the next sentence. Please rephrase to clarify. e.g., "All ... indices with a few exceptions ... were significantly..."?

      Based on the above suggestions and considering our primary focus on time-related changes in scene-specific reinstatement, we will refrain from further interpreting the relative expression of individual scene-specific indices against 0. Consequently, we have removed this information from our analysis.

      (3) It is challenging to interpret some of the significance markers, such as those in Figure 3. For example what effects are being denoted by the asterisks and bars above vs. below the data on panel D? Please clarify and/or note in the legend.

      We have included a note in the legend to clarify the meaning of all significance markers. In addition, we decided to state any significant main and interaction effects in the figure rather than use significance markers.

      (4) For Figures 2 and 3, only the meaning of error bars is described in the caption. It is not explained in the caption what the boxes, lines, and points denote. Please clarify.

      Thank you for highlighting this. We have added explanations to the figure's annotation for clarity. Please note that, considering the other reviewers’ suggestions, figure plots may have been adjusted or changed, resulting in corresponding adjustments to the explanations in the figure annotation.

      (5) How were recent and remote interspersed relative to one another? The text says that each run had 10 recent and 10 remote pairs, presented in a "pseudo-random order" - not clear what that (pseudo) means in this case. Please clarify.

      Thank you for raising this point. We provide this information in the Methods section “Materials and Procedure”: 'The jitters and the order of presentation for recent and remote items were determined using OptimizeXGUI (Spunt, 2016), following an exponential distribution (Dale, 1999). Ten unique recently learned pairs (from the same testing day) and ten unique remotely learned items (from Day 0) were distributed within each run (in total three runs) in the order as suggested by the software as the most optimal. There were three runs with unique sets of stimuli each resulting in thirty unique recent and thirty unique remote stimuli overall.'

      (6) Figure 1A, second to last screen on the learning cycles row - what would be presented to participants here, one of these three emojis? What does the sleepy face represent? I see some of these points were mentioned in the methods, but additional clarification in the caption would be helpful.

      Thank you for highlighting this. We have included this information in the figure caption. Specifically, the sleepy face symbol in the figure denotes a 'missed response'.

      (7) Not clear how the jittered fixation time between object presentation and scene test is dealt with in representational similarity analyses.

      Thank you for pointing this out. Beta estimates were obtained from a Least Squares Separate (LSS) regression model. Each event was modeled with its respective onset and duration and, as such, one beta value was estimated per event (with the lags between events differing from trial to trial). We have edited the corresponding section (see p. 53).
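
      For readers unfamiliar with the approach, the following is a minimal sketch of the general LSS idea: one regression is fit per event, with the event of interest as its own regressor and all remaining events collapsed into a single nuisance regressor. This is an assumption-laden illustration rather than our actual pipeline: the function name `lss_betas` is hypothetical, the trial regressors are assumed to be already built from each event's onset and duration (and convolved with a hemodynamic response function), and motion or other confound regressors are omitted.

```python
import numpy as np


def lss_betas(bold, trial_regressors):
    """Estimate one beta per event with a Least Squares Separate scheme.

    bold             : array (n_timepoints, n_voxels), the measured time series.
    trial_regressors : array (n_timepoints, n_trials), one column per event,
                       assumed to be pre-built from onsets and durations.
    Returns an array (n_trials, n_voxels) of single-event beta estimates.
    """
    bold = np.asarray(bold, dtype=float)
    trial_regressors = np.asarray(trial_regressors, dtype=float)
    n_timepoints, n_trials = trial_regressors.shape

    betas = np.empty((n_trials, bold.shape[1]))
    intercept = np.ones(n_timepoints)

    for i in range(n_trials):
        # Regressor for the event of interest.
        target = trial_regressors[:, i]
        # All other events summed into one nuisance regressor.
        others = np.delete(trial_regressors, i, axis=1).sum(axis=1)

        design = np.column_stack([target, others, intercept])
        coef, *_ = np.linalg.lstsq(design, bold, rcond=None)

        # Keep only the beta of the event of interest.
        betas[i] = coef[0]
    return betas
```

      Since each event enters the model with its own onset, jittered intervals between events are accommodated directly: the lag only changes where the event's regressor sits in time, not how its beta is estimated.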

      (8) It was a little bit strange to have used anterior vs posterior HPC ROIs separately in univariate analysis but then combined them for multivariate. There are many empirical and theoretical motivations for looking at item-specific and category reinstatement in anterior and posterior HPC separately, so I was surprised not to see this. Please explain this reasoning.

      Thank you for pointing this out. We agree with the reviewer and have included the anterior and posterior HC ROIs in the multivariate analysis. Please see the revised results section (pp. 13-15).

      (9) The term "neural specificity" is introduced (line 164) without explanation; please clarify.

      Thank you for bringing this to our attention. The term ‘neural specificity,’ as defined by Fandakova et al. (2019), refers to the distinctiveness of neural representations in the regions that process the respective sensory input. We decided, however, to refrain from using this term and instead to use ‘neural representational distinctiveness’, which is more self-explanatory and was also introduced in the manuscript.

      (10) Age range is specified as 5-7 years initially (line 187) and then 6-7 years (line 188).

      We have corrected the age range in line 188 to '5 to 7 years.'

      Reviewer #2 (Public Reviews):

      Schommartz et al. present a manuscript characterizing neural signatures of reinstatement during cued retrieval of middle-aged children compared to adults. The authors utilize a paradigm where participants learn the spatial location of semantically related item-scene memoranda which they retrieve after short or long delays. The paradigm is especially strong as the authors include novel memoranda at each delayed time point to make comparisons across new and old learning. In brief, the authors find that children show more forgetting than adults, and adults show greater engagement of cortical networks after longer delays as well as stronger item-specific reinstatement. Interestingly, children show more category-based reinstatement, however, evidence supports that this marker may be maladaptive for retrieving episodic details. The question is extremely timely both given the boom in neurocognitive research on the neural development of memory, and the dearth of research on consolidation in this age group. Also, the results provide novel insights into why consolidation processes may be disrupted in children. Despite these strengths, there are quite a few important design and analytical choices that derail my enthusiasm for the paper. If the authors could address these concerns, this manuscript would provide a solid foundation to better understand memory consolidation in children.

      We thank the reviewer for both the positive and critical appraisal of our paper.

      Reviewer #2 (Recommendations For The Authors):

      (1) My greatest concern is the difference in memory accuracy that emerges as soon as immediate learning, which undermines the interpretation of any consolidation-related differences. This concern is two-fold. The authors utilize an adaptive learning approach in which participants learn to criteria or stop after 4 repetitions. This type of approach leads to children seeing the stimuli more often during learning compared to adults, which on its own could have consequences for consolidation-related neural markers. Specifically, within adults, theoretical and empirical work shows that repeating information can actually lead to more gist-like representations, which is the exact profile the children are showing. While there could be a strength to this approach because it allows for equivocal memory, the decision to stop repetitions before criteria means that memory performance is significantly lower in the children, which again could have consequences for consolidation-related neural markers. First, the authors do not show any of the learning-related data which would be critical to assess the impact of this design choice. Second, there are likely differences in memory strength at the delay, making it extremely difficult to determine if the neural markers reflect development, worse memory strength, or both. This issue is compounded by the use of a 3-AFC paradigm, wherein "correct responses" included in the analysis could contain a significant amount of guessing responses. I think a partial solution to this problem is to analyze the RT data and include them in the analyses or use a drift-diffusion modeling approach to get more precise estimates of memory strength to control for this feature. An alternative is to sub-select participants in each group to have a sample matched on performance (including # of repetitions) and re-run all the analyses in this sub-sample. Without addressing these concerns it is near impossible to interpret the presented data.

      Thank you for highlighting this point.

      Firstly, we believe that our approach, involving strategic and repeated learning coupled with feedback, enhances the formation of detailed contextual memories. The retrieval procedure also emphasized the need for detailed memory for location. These are critical differences in experimental procedure from previous studies, which enhanced the importance of detailed representations and likely reduced the likelihood of forming gist-like memories.

      Indeed, we ceased further learning after the fourth repetition. Extensive piloting, where we initially stopped after the seventh repetition, showed no improvement beyond the fourth repetition. In fact, performance tended to decline due to fatigue. Therefore, we limited the number of repetition cycles to the point where an improvement in performance was still feasible. Even though children exhibited lower final learning performance overall, we believe our procedure enabled them to reach their maximal performance within the experimental setup.

      To address the reviewer’s concern, we included learning data to illustrate the progression of learning (see Fig. 1C, pp. 9-10 in Results).

      When interpreting the retention rates, it is important to note that we reported retention rates only for items that were correctly learned (100%) on day 0, day 1, and day 14. This approach meant that different participants had varying numbers of items learned correctly. However, this method enabled us to address our primary question: whether memory consolidation, based on all items initially encoded successfully, is comparably robust between the groups. To simultaneously examine the change in retention rate slopes over time for recent (30 minutes after learning), short delay (one night after) remote, and long delay (two weeks after) remote items, we conducted a separate analysis of retention rates for recent items on days 1 and 14. After observing no differences between sessions in both age groups, we combined the data for recent items. This allowed us to investigate how the slope of memory retention for initially correctly learned items (with a baseline of 100%) changes over time. We observed a significant interaction between item type (recent, short delay remote, long delay remote) and group. Analysis of this interaction revealed significantly less robust memory consolidation across all delay times for children compared to young adults. The figures have been adjusted accordingly to incorporate the baseline of 100% correct performance.
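
      As an illustration only (this is not the authors' analysis code), the retention rate described above can be sketched as the percentage of initially learned items that are still retrieved correctly at a later session. The function name and the dictionary layout are assumptions made for this example.

```python
def retention_rate(learned_correct, retrieved_correct):
    """Percent of initially learned items that are still retrieved correctly.

    learned_correct:   dict item_id -> bool, correct at the end of learning
    retrieved_correct: dict item_id -> bool, correct at the later session
    Only items learned correctly enter the denominator, so the baseline
    for every participant is 100% by construction.
    """
    learned = [item for item, ok in learned_correct.items() if ok]
    if not learned:
        return None  # no correctly learned items, rate is undefined
    retained = sum(1 for item in learned if retrieved_correct.get(item, False))
    return 100.0 * retained / len(learned)


# Hypothetical participant: three of four items were learned correctly
learned = {"a": True, "b": True, "c": False, "d": True}
retrieved = {"a": True, "b": False, "c": True, "d": True}
rate = retention_rate(learned, retrieved)
```

      Because the denominator differs between participants, each rate is relative to that participant's own set of learned items, as the response notes.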

      Following your suggestion, we also employed the drift diffusion model approach to characterize memory strength, calculating drift rate, boundary and non-decision time parameters. We added the results to the Supplementary Materials (section S2.1, Figure S1).
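
      The response does not state how the diffusion model was fitted. As a hedged illustration of how drift rate, boundary separation and non-decision time can be derived from accuracy and RT summaries, the sketch below uses the closed-form EZ-diffusion equations (Wagenmakers et al., 2007). This is a swapped-in technique: EZ-diffusion assumes a two-choice task, whereas the study used a 3-AFC task, so it should not be read as the authors' method. The function and parameter names are assumptions.

```python
import math


def ez_diffusion(prop_correct, rt_variance, rt_mean, s=0.1):
    """Closed-form EZ-diffusion estimates for two-choice data.

    prop_correct: proportion of correct responses (not 0, 0.5 or 1)
    rt_variance:  variance of correct RTs, in seconds squared
    rt_mean:      mean of correct RTs, in seconds
    s:            scaling parameter (0.1 by convention)
    Returns (drift_rate, boundary_separation, non_decision_time).
    """
    if prop_correct in (0.0, 0.5, 1.0):
        raise ValueError("prop_correct of 0, 0.5 or 1 needs an edge correction")
    logit = math.log(prop_correct / (1.0 - prop_correct))
    x = logit * (logit * prop_correct ** 2 - logit * prop_correct
                 + prop_correct - 0.5) / rt_variance
    drift = math.copysign(1.0, prop_correct - 0.5) * s * x ** 0.25
    boundary = s ** 2 * logit / drift
    y = -drift * boundary / s ** 2
    mean_decision_time = (boundary / (2.0 * drift)) * (1.0 - math.exp(y)) / (1.0 + math.exp(y))
    return drift, boundary, rt_mean - mean_decision_time


# Hypothetical summary statistics for one participant and one delay
v, a, ter = ez_diffusion(prop_correct=0.802, rt_variance=0.112, rt_mean=0.723)
```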

      Generally, our findings indicate a lower overall drift rate in children when considering all items that had to be learned. We also observed that adults show a steeper decline in drift rate at the short and long delays, which are nevertheless still characterized by higher memory strength compared to children. Both age groups required a similar amount of evidence to make a decision, and this amount declined with delay, which may indicate an adaptation to weaker memory. Further, we observed shorter non-decision time in children compared to adults, potentially suggesting less error checking or less thorough processing and strategy-based memory access in children.

      Overall, these results indicate weaker memory strength in children as a quantitative measure. It may nevertheless stem from qualitatively different memory representations that children form, as our RSA findings suggest. We believe that our neural effect reflects the effect of interest (i.e., worse memory due to lower memory strength in children). Controlling for memory strength would remove variance of interest from the neural data. Therefore, we will refrain from including memory strength in the model. However, we will include mean RT as an indicator of general response tendencies.

      Given that the paper is already very complex and long, we opted to add the diffusion model results to the Supplementary Materials (section S2.1, Fig. S1), while discussing the results in the discussion (p. 35).

      (2) More discussion of the behavioral task should be included in the results, in particular the nature of the adaptive learning paradigm including the behavioral results as well as the categorical nature of the memoranda. Without this information, it is difficult for the reader to understand what category-level versus item-level reinstatement reflects.

      Thank you for this valuable input. We have incorporated this information into the results section. Please refer to pp. 9-10, 12, 14, 21, 25-26 for the added details.

      (3) Some of the methods for the reinstatement analysis were unclear to me or warranted further adjustment. I believe the authors compared the scene against all other scenes. I believe it would be more appropriate to only compare this against scenes drawn from the same category as opposed to all scenes. Secondly, from my reading, it seems like the reinstatement was done during the scene presentation, rather than the object presentation in which they would retrieve the scene. I believe the reinstatement results would be much stronger if it was captured during the object presentation rather than the re-presentation of the scene. Or perhaps both sets of analyses should be included.

      We apologize for the confusion regarding the analysis method.

      During the review process we have improved the description of this analysis and hope it is easier to follow now. In short, we used both approaches (within and between categories) to suit different goals (i.e., measuring scene-reinstatement and gist-like reinstatement).

      Both types of reinstatement were assessed during the fixation cross to avoid confounds with the object itself being on the screen. We only used the scene window in one analysis (scene-reinstatement index) as a neural template to track its pre-activation during the fixation. So, as the reviewer suggests, our rationale is that the reinstatement indeed starts taking place at the short object presentation window, but importantly, extends to the fixation window. We added this clarifying information to the results section (see pp. 21-27).

      (4) For the univariate results, it was unclear to me when reading the results whether they were focusing on the object presentation portion of the trial or the scene presentation portion of the trial. Again, I think the claims of reinstatement related activity would be stronger if they accounted for the object presentation period.

      Thank you for pointing this out. Indeed, the univariate results were based on the object presentation time window. We added this information to the results section (Fig. 3, pp. 14, 16).

      (5) Further, given the univariate differences shown across age groups, the authors should re-run all analyses for the RSA controlling for mean activation within the ROI.

      Thank you for highlighting this. We re-ran all RSA analyses controlling for the mean activation within the ROI. The results remained unchanged. We have added this information to the results section as well as to Tables S8 and S11 in the Supplementary Materials for further details.
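
      The response does not specify how mean activation was controlled for. One simple possibility, sketched here purely as an assumption, is to regress each participant's similarity index on mean ROI activation and carry the residuals forward. Including mean activation as a covariate in the statistical model is an alternative and is not strictly equivalent. All names and values below are hypothetical.

```python
from statistics import mean


def residualize(values, covariate):
    """Remove the linear effect of one covariate via ordinary least squares.

    values:    per-participant similarity index (for example a reinstatement index)
    covariate: per-participant mean activation within the same ROI
    Returns the residuals, which are uncorrelated with the covariate.
    """
    mean_x, mean_y = mean(covariate), mean(values)
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        raise ValueError("covariate has no variance")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values)) / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Hypothetical values for four participants
mean_activation = [1.0, 2.0, 3.0, 4.0]
similarity = [3.1, 4.9, 7.2, 8.8]
adjusted = residualize(similarity, mean_activation)
```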

      (6) The authors should include explicit tests across groups for their brain-behavior analyses if they want to make any developmentally relevant interpretations of the data. Also, It would be helpful to include similar analyses to those using the univariate signals, and not just the RSA results.

      Following the reviewer's suggestion, we included brain-behavior analyses for univariate data as well as RSA data with explicit tests across groups. These can be found in the Results Section pp. 18-20, 28-32. Due to the interdependence of predefined ROIs and to avoid running a high number of correlation tests, we employed partial least squares correlation analysis for this purpose. This approach focuses on multivariate links between specified Regions of Interest (ROIs) and fluctuations in memory performance over short and long delays across different age cohorts. We argue that this multivariate strategy offers a more comprehensive understanding of the relationships between brain metrics across various ROIs and memory performance, given their mutual dependence and connectivity (refer to Genon et al. (2022) for similar discussions).
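
      To make the logic of partial least squares correlation concrete, here is a minimal sketch under stated assumptions: it z-scores hypothetical brain and behavior matrices, forms their cross-correlation matrix, and decomposes it with a singular value decomposition. It omits the permutation and bootstrap tests that a full analysis would normally include, and it is not the authors' pipeline. The simulated data and all names are invented for the example.

```python
import numpy as np


def plsc(brain, behavior):
    """Minimal partial least squares correlation.

    brain:    (n_subjects, n_rois) array of brain measures
    behavior: (n_subjects, n_measures) array, e.g. short and long delay retention
    """
    X = np.asarray(brain, dtype=float)
    Y = np.asarray(behavior, dtype=float)
    n = X.shape[0]
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Yz.T @ Xz / (n - 1)  # cross-correlation matrix, measures x ROIs
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return {
        "behavior_saliences": U,         # weight of each behavioral measure
        "brain_saliences": Vt.T,         # weight of each ROI
        "singular_values": s,
        "explained": s ** 2 / np.sum(s ** 2),
        "brain_scores": Xz @ Vt.T,       # latent brain score per subject
        "behavior_scores": Yz @ U,       # latent behavior score per subject
    }


# Simulated data: three ROIs and one behavioral measure share a latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=40)
brain = np.column_stack([latent + 0.1 * rng.normal(size=40) for _ in range(3)])
behavior = np.column_stack([latent + 0.1 * rng.normal(size=40), rng.normal(size=40)])
result = plsc(brain, behavior)
```

      Each singular value summarizes the covariance captured by one latent brain-behavior profile, which is why a single decomposition can replace many pairwise ROI correlations.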

      (7) There could be dramatic differences in memory processing across 5-7 year olds. I know the sample is a little small for this, but I would like to see regressions done within the middle childhood group in addition to the across-group comparisons.

      We have included information detailing the relationship between memory retention rate and age within the child group (refer to p. 13). In the child group, both recent and short delay remote memory improved with age. However, the retention rate for long-delayed memory did not show a significant improvement with increasing age in children.

      (8) I am concerned that the authors used global-signal as a regressor in their first-level analyses, given that there could be large changes in the amount of univariate activation that occurs across groups. This approach can lead to false positives and negatives that obscure localized differences. The authors should remove this term, and perhaps use the mean sum of the white matter or CSF to achieve the noise regressor they wanted to include.

      We understand the reviewer's concern. However, we believe that our approach is recommended for the pediatric population. Specifically, Graff et al., 2021, found that global signal regression is a highly efficacious denoising technique in their study of 4- to 8-year-old children. This technique was previously suggested for adults by Ciric et al., 2017, and the benefits in terms of motion and physiological noise removal outweigh the potential costs of removing some signal of interest, as indicated by Behzadi et al., 2007. Additionally, we incorporated six anatomical component-based noise correction (CompCor) components to account for WM and CSF signals, as recommended in the pediatric literature.
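
      For readers unfamiliar with nuisance regression, the sketch below shows the general idea behind global signal regression: a nuisance time course is fitted to every voxel by least squares and the fitted part is removed. It is a simplified stand-in rather than the authors' first-level model, in which nuisance regressors typically sit alongside the task regressors in one design matrix. The global signal here is the mean over the supplied voxels, and the simulated data are hypothetical.

```python
import numpy as np


def global_signal(timeseries):
    """Mean time course across all voxels; timeseries is (n_volumes, n_voxels)."""
    return np.asarray(timeseries, dtype=float).mean(axis=1)


def regress_out(timeseries, nuisance):
    """Remove nuisance regressors from each voxel by ordinary least squares.

    timeseries: (n_volumes, n_voxels)
    nuisance:   (n_volumes, n_regressors), e.g. global signal, motion
                parameters, or CompCor components
    Returns residuals with the same shape as timeseries.
    """
    Y = np.asarray(timeseries, dtype=float)
    N = np.asarray(nuisance, dtype=float)
    design = np.column_stack([np.ones(Y.shape[0]), N])  # intercept + nuisance
    beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return Y - design @ beta


# Simulated data: four voxels that share a strong common fluctuation
rng = np.random.default_rng(1)
shared = rng.normal(size=50)
data = rng.normal(size=(50, 4)) + 3.0 * shared[:, None]
gs = global_signal(data)
cleaned = regress_out(data, gs[:, None])
```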

      (9) The authors discuss the relationship between hippocampal reactivation and worse memory through the lens of Schapiro et al., but a new paper by Tanriverdi et al came out in JOCN recently that is more similar to the authors' findings.

      Thank you for highlighting the recent paper by Tanriverdi et al. in JOCN, which aligns closely with our findings. We appreciate the suggestion and agree that exploring this alignment could further enrich our discussion on the relationship between hippocampal reactivation and memory retention. We incorporated this work in our revised manuscript.

      Minor Comments

      - I was surprised that the authors did not see any differences in univariate signals for memory retrieval as a function of development, as much of the prior work has shown differences (for example work by Tracy Riggins). I believe this contrast should be highlighted in the discussion.

      - Given the robust differences in sleep patterns across childhood and the role of sleep in systems consolidation framework, I think this feature should be highlighted in either the introduction or discussion.

      - Could the authors report on differences (or lack of differences) in head motion across the groups, and if they are different whether they could include them as a confounding variable.

      I believe we included six motion parameters and their derivatives into the model

      Thank you for your comments.

      First, prior work on univariate signals of memory retrieval focused mostly on remembered vs. forgotten contrasts, while in our study we focused on remote vs. recent items at short and long delays, only for correctly remembered items. This can partially explain the results. We highlighted this information in the discussion section.

      Second, we agree with the reviewer that sleep patterns across childhood should be addressed in the analysis. Therefore, we incorporated them in the discussion section.

      Third, head motion parameters were indeed included in the analysis as confounding variables, as adding them is highly recommended for developmental populations (e.g., Graff et al. 2021). As an example, we observed higher framewise displacement in children compared to adults, t(218) = -16, p < .001, as well as in translational y, t(288) = -2.33, p = .02.
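
      The response reports framewise displacement without defining it. One widely used definition (Power et al., 2012) sums the absolute frame-to-frame changes in the six realignment parameters, converting rotations to millimetres on a sphere of 50 mm radius. The sketch below implements that definition as an assumption about what was computed; the function name and example values are hypothetical.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Framewise displacement following the Power et al. (2012) definition.

    motion: sequence of rows (tx, ty, tz, rx, ry, rz), with translations
            in mm and rotations in radians
    Returns one value per volume; the first volume is set to 0.
    """
    fd = [0.0]
    for prev, curr in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3]))
        # a rotation of theta radians displaces a point on the sphere by radius * theta
        rotation = sum(abs(c - p) * radius_mm for p, c in zip(prev[3:6], curr[3:6]))
        fd.append(translation + rotation)
    return fd


# Hypothetical realignment parameters for three volumes
params = [
    (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.2, 0.0, 0.01, 0.0, 0.0),
]
fd = framewise_displacement(params)
```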

      Reviewer #3 (Public Reviews):

      Summary:

      This study aimed to understand the neural correlates of memory recall over short (1-day) and long (14-days) intervals in children (5-7 years old) relative to young adults. The results show that children recall less than young adults and that this is accompanied by less activation (relative to young adults) in brain networks associated with memory retrieval.

      Strengths:

      This paper is one of few investigating long-term memory (multiple days) in a developmental population, an important gap in the field. Also, the authors apply a representational similarity analysis to understand how specific memories evolve over time. This analysis shows how the specificity of memories decreases over time in children relative to adults. This is an interesting finding.

      We thank the reviewer for the appraisal of our manuscript.

      Weaknesses:

      Overall, these results are consistent with what we already know: recall is worse in children relative to adults (e.g., Cycowicz et al., 2001) and children activate memory retrieval networks to a lesser extent than adults (Bauer et al, 2017).

      It seems that the reduced activation in memory recall networks is likely associated with less depth of memory encoding in children due to inattentiveness, reduced motivation, and documented differences in memory strategies. In regard to this, there was consideration of IQ, sex, and handedness but these were not included as covariates as they were not significant although I note p<.16 suggests there was some level of association nonetheless. Also, IQ is measured differently for the children and adults so it's not clear these can be directly contrasted. The authors suggest the instructed elaborative encoding strategy is effective for children and adults but the reference in support of this (Craik & Tulving, 1975) does not seem to support this point.

      Thank you for your review, and we appreciate your valuable feedback. Here are our responses and clarifications:

      Regarding the novelty of the results relative to the existing literature, we believe that, in contrast to Cycowicz et al. (2001), Bauer et al. (2017), and related work, we assess not only immediate memory after encoding with semantic judgement of abstract associations, but also add to these findings by investigating consolidation-related changes in complex associative and contextual information in a much under-investigated sample of 5-to-7-year-old preschoolers. This also allows us to infer how children's neural representations change over time, providing invaluable insights into knowledge formation in this developmental cohort.

      Accordingly, the observed age differences are of less importance here than the time-related changes in mnemonic representations observed in children.

      Regarding the assumption of inattentiveness in children, we want to emphasize that the experimenter was present throughout the learning process, closely supervising the children. We observed prompt responses to every trial in children and noted an increase in accuracy over the encoding-learning cycles, leading us to conclude that the children were indeed attentive to the task. The observed accuracy improvement across learning cycles indicates an increase in remembered information. Furthermore, we took measures to ensure their engagement, including extensive training in both verbal and computerized versions to ensure that they understood and actively created stories to support their learning.

      We collected motivation data after each task execution in children, and the results indicated that they scored high in motivation. Children not only completed the tasks but also expressed their willingness to participate in subsequent appointments, highlighting their active involvement in the study.

      The observed differences in the efficiency of strategy utilization were expected, given developmental differences in the associative and strategic components of memory in children, as noted in prior research (Shing, 2008, 2010).

      We appreciate your point about IQ, sex, and handedness. These variables were indeed included in the behavioral models, and mean brain activation was also included in the brain data models, addressing the potential influence of these factors on our results.

      While it's true that we applied different tests to measure IQ in children and adults, these tests targeted comparable subtests that addressed similar cognitive constructs. As the final IQ values are standardized, we believe it is appropriate to compare them between the two groups.

      Lastly, we agree that the citation Craik & Tulving, 1975 supports the notion of effectiveness of instructed elaborative learning only in adults, but not in children. For this purpose, we added relevant literature for the child cohort (i.e., Pressley, 1982; Pressley et al., 1981; Shing et al., 2008).

      Reviewer #3 (Recommendations For The Authors):

      An additional point for the authors to consider is that the hypotheses were uncertain. The first is that prefrontal, parietal, cerebellar, occipital, and PHG brain regions would have greater activation over time in adults and not children - which is very imprecise as this is basically the whole brain. Moreover, brain imaging data may be in opposition to this prediction: e.g., the hippocampus has a delayed maturational pattern beyond 5-yrs (e.g., Canada 2019; Uematsu 2012) and some cortical data predicts earlier development in these regions.

      Thank you for your feedback, and we appreciate your insights regarding our hypotheses.

      The selection of our regions of interest (ROIs) was guided by prior literature that has demonstrated the interactive involvement of multiple brain areas in memory retrieval and consolidation processes. Additionally, our recent work utilizing multivariate partial least squares correlation analysis (Schommartz, 2022, Developmental Cognitive Neuroscience) has indicated that unique profiles derived from the structural integrity of multiple brain regions are differentially related to short and long-delay memory consolidation.

      Indeed, the literature suggests that the hippocampus may exhibit a more delayed maturational pattern extending into adolescence, as supported by studies such as Canada (2019) and Uematsu (2012), etc. We added this information as well as findings from the literature on cortical development to be more balanced in our review of the literature.

      Given this complexity, we believe it is important to emphasize in our discussion that both the medial temporal lobe, including the hippocampus, and cortical structures, as well as the cerebellum, undergo profound neural maturation. We highlight these nuances in our revised manuscript to provide a more comprehensive perspective on the developmental differences in memory retention over time.

      The writing was challenging to follow - consider as an example on page 9 the sentence that spans 10 lines of text.

      Thank you for bringing this to our attention. We have carefully reviewed the manuscript and have made efforts to streamline the text, ensuring that sentences are not overly long or complex to improve readability and comprehension.

      I found the analysis (and accompanying figures) a bit of a data mine - there are so many results that are hard to digest and in other cases highly redundant one from the other. This may be resolved in part by moving redundant findings to the supplemental. Some were hard to follow - so when there is a line between recent and recent data, that seems confusing to connect data that, I believe, are different sets of items. Later scatterplots (Fig 7) have pale yellow dots that I had a hard time seeing.

      Thank you for bringing up your concerns regarding the analysis and figures in our manuscript. We have carefully considered your feedback and made several improvements to address these issues.

      To alleviate the challenge of digesting numerous results, we have taken steps to enhance clarity and reduce redundancy. Specifically, we have moved some of the redundant findings to the supplementary sections, which should help streamline the main manuscript and make it more reader friendly.

      Regarding the line between 'recent' and 'recent data,' the figures were revised to be clearer. Furthermore, we have improved the visibility of certain elements, such as the pale-yellow dots in the scatterplots (Fig 1, 2, 4, etc.), to ensure that readers can better discern the data points.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): 

      […] 

      Weaknesses: 

      The question of the physiological relevance of short bouts of ischemia remains.

      The chemical ischemia protocol induces a duration-dependent ATP depletion in acute slices on a time scale of minutes (Pape and Rose 2023). This is about the same time scale as the peri-infarct depolarisation (Lauritzen et al. 2011) that the protocol attempts to model. Of course, such models do not completely replicate the complex situation in vivo. However, the presented analyses of synapse function cannot be performed in vivo. We discuss this now in the manuscript.

      The precise mechanisms underlying the shift between ischemia-induced long-term potentiation and long-term failure of synaptic responses were not addressed. Could this be cell death?

      Thank you for the comment. Yes, we indeed believe that the persistent failure of synaptic transmission is because of neuronal cell death (i.e., of CA1 pyramidal cells) or at least persistent depolarisation. We did not explicitly state that in the original submission but do so in the revised manuscript. It is supported by the unquantified observation of swelling and/or loss of integrity of CA1 pyramidal cell bodies in parallel to postsynaptic failure. It is also in line with many reports from the literature, of which we now cite two (lines 186-198).

      Sex differences are not addressed or considered.

      We have performed all experiments on male mice, as indicated in Material and Methods. We have indeed not addressed sex differences of the observed effects. We consider this, and many other important factors, to be interesting topics for follow-up studies. This is now discussed (lines 413-424).

      Reviewer #2 (Public Review): 

      […]

      Weaknesses: 

      The weaknesses are minor and only relate to the interpretation of some of the data regarding the presynaptic mechanisms causing the potentiation of release. The authors measured the fiber volley, which reflects the extracellular voltage of the compound action potential of the fiber bundle. The half-duration of the fiber volley was increased, which could be due to the action potential broadening of the individual axons but could also be due to differences in conduction velocity. We are therefore skeptical whether the conclusion of action potential broadening is justified.

      These are excellent points. We have added an analysis demonstrating that axonal conduction velocity is unlikely to be affected. Nonetheless, the fiber volley is indeed an indirect measure of what happens in individual axons. We have adjusted our interpretation accordingly and now also discuss alternative explanations of our findings (lines 363-379).

      Reviewer #3 (Public Review): 

      […]

      Weaknesses: 

      The data on fiber volley duration should be supported by more direct measurements to prove that chemical ischemia increases presynaptic Ca2+ influx due to a presynaptic broadening of action potentials. Given the influence that positioning of the stimulating and recording electrode can have on the fiber volley properties, I found this data insufficient to support the assumption of a relationship between increased iGluSnFR fluorescence, action potential broadening, and increased presynaptic Ca2+ levels.

      We have added a new analysis showing that the latency of the fiber volley is unaffected and relatively constant, which strengthens our conclusion. But the fiber volley is indeed an indirect measure of action potential firing in individual axons. The suggested experiment, which would require simultaneous recording of Ca2+ and action potentials in single axons in combination with chemical ischemia, is extremely difficult, if possible at all. Instead, we have extended the discussion and include now further alternative mechanistic explanations (lines 363-379).

      The results are obtained in an ex-vivo preparation, it would be interesting to assess if they could be replicated in vivo models of cerebral ischemia. 

      This would certainly be very interesting but also extremely challenging technically. For a detailed analysis of synaptic changes as presented here, the main difficulty will be to stimulate and visualise glutamate release exclusively in an isolated population of synapses while recording postsynaptic responses in a stroke model.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors): 

      […]

      Labelling of experimental groups of 2-minute and 5-minute chemical ischemia is more accurate than "metabolic stress" and "with postsynaptic failure". The critical difference between these two conditions is lost with this nomenclature. The reader could be misled to believe that the two groups form a heterogenous population of responses from the same experimental manipulation which is incorrect.

      We had stated in the manuscript that we ‘ … grouped combined iGluSnFR and electrophysiological recordings according to the effect of chemical ischemia on the synaptic response: ‘chemical ischemia with postsynaptic failure’ if the postsynaptic response did not recover to above 50% of the baseline level and ‘chemical ischemia’ when it did (as indicated in Fig. 1H). …’. The recordings were not grouped according to chemical stress duration but according to the effect on the postsynaptic response. We have revised the text explaining this (lines 125-135) and illustrate that now also in Fig. 1H. We hope this is easier to follow now.
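
      The grouping rule quoted above can be written as a small function. This is only an illustrative sketch: the 50% threshold and the two labels come from the quoted text, while the function name, the use of a slope ratio, and the example values are assumptions.

```python
def classify_recording(baseline_slope, recovered_slope, threshold=0.5):
    """Group a recording by how far the postsynaptic response recovered.

    baseline_slope:  fEPSP slope before chemical ischemia
    recovered_slope: fEPSP slope at the end of the recovery period
    The ratio is used so that negative-going slopes are handled correctly.
    """
    if baseline_slope == 0:
        raise ValueError("baseline slope must be non-zero")
    recovery = recovered_slope / baseline_slope
    if recovery > threshold:
        return "chemical ischemia"
    return "chemical ischemia with postsynaptic failure"


# Hypothetical recordings (slopes in mV/ms)
group_a = classify_recording(-1.0, -0.8)  # recovered to 80% of baseline
group_b = classify_recording(-1.0, -0.2)  # recovered to 20% of baseline
```

      The grouping depends only on the outcome of the postsynaptic response and not on the duration of the chemical stress, which is the distinction the response draws.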

      More details on the long-term impact of 5-minute ischemia on cell viability would be enlightening regarding the specific mechanism separating these two conditions. With 2 minutes it would appear that cells remain alive (i.e. intact post-synaptic responses), 5 minutes however, inducing cell death. 

      Yes, our observations, although not quantified, are in line with cell death as CA1 pyramidal cell bodies appeared swollen and/or lost their integrity when chemical ischemia was followed by postsynaptic failure. This is also in line with reports from the literature. We have revised the results section accordingly (lines 186-201).

      In the paragraph titled "glutamate uptake is unaffected after acute chemical ischemia", there are two erroneous citations of Figure S3 that should be Figure S4.

      Thank you. We corrected this mistake.

      The sex of animals is not given. This is essential information. 

      We used male mice as indicated in the initial version of the manuscript (Material and Methods). We have added a statement regarding the role of sex to the final section of the Discussion.

      Reviewer #2 (Recommendations For The Authors):

      We propose addressing the weaknesses mentioned in the public review. As said, the fibre volley is a very indirect measure of action potential broadening. Based on the iGluSnFR data, the authors predict that the potentiation is mediated by depolarization, action potential broadening, and increased presynaptic calcium influx. The latter could be tested experimentally, but this does not seem necessary if the data are interpreted more cautiously. For example, other explanations for the broadened fiber volley could be mentioned, such as a slowing and/or dispersion of the action potential propagation speed. Furthermore, depolarization could cause elevated resting calcium concentrations, which could potentiate release independently of action potential broadening. Finally, classical forms of presynaptic potentiation of the release machinery that occur during homeostatic plasticity or Hebbian plasticity may operate independently of calcium dynamics.

      Thank you for this comment. The discussion of the mechanism was indeed too short. We have added an analysis of the fiber volley delay after stimulation, which was not affected. Presynaptic action potential broadening is, in our opinion, a very likely explanation for our observations but we did not perform direct experiments. Directly recording presynaptic action potentials and Ca2+ transients in the chemical ischemia model over extended periods of time is a major technical challenge and certainly of interest in the future. As suggested, we have expanded the discussion section and now mention various alternative explanations (lines 363-379).

      There are the following minor suggestions:

      Add line numbers.

      We have added line numbers.

      We would suggest providing exact P values instead of asterisks in the figures. 

      We agree that having exact P values in the figure panels can be very helpful. However, in the present figures they are hard to integrate without overcrowding the already complex panels and thereby obscuring other important details. All p-values are included in the figure legends and/or main text.

      Abstract: "We also observed an unexpected hierarchy of vulnerability of the involved mechanisms and cell types." This sentence is hard to understand and cell types were not directly compared (i.e. axons of CA3 and axons of CA1 neurons were not compared).

      We have revised this statement and removed the reference to cell types.

      In Figure 1G there seems to be an increase in the fiber volley. Is this significant? Could this be due to swelling of the slice during chemical ischemia? Or an increase in excitability? Maybe this could be discussed. 

      The effect was analysed in the context of Fig. 2. A significant increase of the fiber volley amplitude was detected in chemical ischemia (Fig. 2H) but also under control conditions (Fig. 2F). We therefore consider this a change that is detectable but not related to chemical ischemia and not a potential explanation for increased glutamate release (lines 157-160). Also, no significant fiber volley increase was detected in chemical ischemia with postsynaptic failure (Fig. 2H) and in the experiments illustrated in Fig. 4E. Our interpretation is that the fiber volley increases non-specifically in some experiments over the time course of the experiment (~ 60 min) but this is unrelated to chemical ischemia.

      In the results: "A fully separate set of experiments..." Please explain better what this means. 

      We have revised the entire section to explain more clearly how recordings were grouped (lines 125-135).

      In the results: "...(Syková and Nicholson, 2008) (Figure S3). However, this was not observed for chemical ischemia without postsynaptic failure (Figure S3), in which the increased glutamate transients were observed." This should probably refer to Figure S4. 

      Thank you for spotting this mistake. We corrected it.

      The last sentence in the results "... most likely by increased presynaptic Ca2+ influx, and, at the same time, the postsynaptic response." This is difficult to understand. Does "at the same time" refer to another mechanism or the consequence of more Ca2+? 

      We revised this part of the results section to improve clarity and toned down our conclusions (lines 328-335 and 363-379).

      Reviewer #3 (Recommendations For The Authors): 

      There are a few points that the author needs to clarify: 

      The authors do not discuss the different behaviour of iGlu F0 during chemical ischemia and chemical ischemia with postsynaptic failure shown in Figure 2, panels D and E. In the first case, during the application of the solution to induce ischemia, iGluF0 decreases while in the other case, it strongly increases before falling down. In both cases, the fEPSP slope is decreased. How does the author explain this observation? 

      We attribute the transient increase of extracellular glutamate during prolonged chemical ischemia to the increase of synaptic glutamate release observed previously under such conditions (Hershkowitz et al. 1993; Tanaka et al. 1997) and other mechanisms reviewed by us (Passlick et al. 2021) (e.g., glial glutamate release, transiently reduced glutamate uptake), which we could not detect during shorter chemical ischemia. The initial drop of the fEPSP slope is most likely due to postsynaptic depolarisation, which is followed by a repolarisation if the chemical stress duration is short. We now explain this in more detail in lines 185-200 of the revised manuscript. Although we focussed on the bi-directional effect on longer timescales in this manuscript, this transient phase during chemical ischemia is very interesting for further investigations.

      On page 8, first line, I think that the authors meant Figure S4, not Figure S3 when they mentioned results on ECS diffusivity and ECS fraction. 

      Yes, thank you for spotting this. We corrected the mistake.

      In Supplementary Figure 5 panel B It seems that PPR is significantly reduced upon chemical ischemia (asterisk on columns green) but the authors claimed in the paper at page 10 that "Analysing the paired-pulse ratio (PPR) of postsynaptic response and iGluSnFR transients revealed no consistent changes after chemical ischemia (Figure S5).". Did the authors refer to the data normalized in panel D? In this case, I do not see the need to normalize raw data that have been already shown in a previous panel and that give different statistical results, probably due to the different tests used (paired in panel B and not paired in panel D). 

      We have clarified this point in the supplementary material (Figure S5, legend). There is a relevant difference between the analyses presented in panels B and D. The paired test presented in B analyses the change of the electrophysiological PPR in response to chemical ischemia. The test in D on the electrophysiological PPR asks whether the reduction in B is significantly different from the changes seen under control conditions. Because it is not, we conclude that chemical ischemia has no relevant effect on the electrophysiological PPR and, in combination with the results on the iGluSnFR PPR, also not on short-term plasticity, as tested here.

      References

      Hershkowitz N, Katchman AN, Veregge S. Site of synaptic depression during hypoxia: a patch-clamp analysis. Journal of Neurophysiology 69: 432–441, 1993.

      Lauritzen M, Dreier JP, Fabricius M, Hartings JA, Graf R, Strong AJ. Clinical Relevance of Cortical Spreading Depression in Neurological Disorders: Migraine, Malignant Stroke, Subarachnoid and Intracranial Hemorrhage, and Traumatic Brain Injury. J Cereb Blood Flow Metab 31: 17–35, 2011.

      Pape N, Rose CR. Activation of TRPV4 channels promotes the loss of cellular ATP in organotypic slices of the mouse neocortex exposed to chemical ischemia. The Journal of Physiology 601: 2975–2990, 2023.

      Passlick S, Rose CR, Petzold GC, Henneberger C. Disruption of Glutamate Transport and Homeostasis by Acute Metabolic Stress. Front Cell Neurosci 15: 637784, 2021.

      Tanaka E, Yamamoto S, Kudo Y, Mihara S, Higashi H. Mechanisms Underlying the Rapid Depolarization Produced by Deprivation of Oxygen and Glucose in Rat Hippocampal CA1 Neurons In Vitro. Journal of Neurophysiology 78: 891–902, 1997.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Review:

      Reviewer #1 (Public Review):

      In 'Systems analysis of miR-199a/b-5p and multiple miR-199a/b-5p targets during chondrogenesis', Patel et al. present a variety of analyses using different methodologies to investigate the importance of two miRNAs in regulating gene expression in a cellular model of cartilage development. They first re-analysed existing data to identify these miRNAs as one of the most dynamic across a chondrogenesis development time course. Next, they manipulated the expression of these miRNAs and showed that this affected the expression of various marker genes as expected. An RNA-seq experiment on these manipulations identified putative mRNA targets of the miRNAs which were also supported by bioinformatics predictions. These top hits were validated experimentally and, finally, a kinetic model was developed to demonstrate the relationship between the miRNAs and mRNAs studied throughout the paper.

      I am convinced that the novel relationships reported here between miR-199a/b-5p and target genes FZD6, ITGA3, and CAV1 are likely to be genuine. It is important for researchers working on this system and related diseases to know all the miRNA/mRNA relationships but, as the authors have already published work studying the most dynamic miRNA (miR-140-5p) in this biological system I was not convinced that this study of the second miRNA in their list provided a conceptual advance on their previous work.

      We believe this study is an enhancement on our previous work for two reasons, which have been alluded to in new text within the introduction. Firstly, our previous work used experimental and bioinformatic analysis to identify microRNAs with significant regulatory roles during chondrogenesis. This new manuscript additionally uses a systems biology approach to identify novel miRNA-mRNA interactions and capture these within an in silico model. Secondly, this work was initiated by the analysis of our previously generated data, using a novel tool we developed for this type of data (Bioconductor - TimiRGeN).

      I was also concerned with the lack of reporting of details of the manipulation experiments. The authors state that they have over-expressed miR-199a-5p (Figure 2A) and knocked down miR-199b-5p (Figure 2B) but they should have reported their proof that these experiments had worked as predicted, e.g. showing the qRT-PCR change in miRNA expression. Similarly, I was concerned that one miRNA was over-expressed while the other was knocked down - why did the authors not attempt to manipulate both miRNAs in both directions? Were they unable to achieve a significant change in miRNA expression or did these experiments not confirm the results reported in the manuscript?

      We agree with the reviewer that some additional data were needed to demonstrate the effective regulation of miR-199-5p. Hence, Supplementary Figure 1 is now included, which provides validation of the effects of miR-199a-5p overexpression (Supplementary Figure 1A) and inhibition of miR-199a/b-5p (Supplementary Figure 1B). Within the main manuscript, Figure 2B has been amended to include the consequences of inhibition of miR-199a-5p, with 2C showing the consequences of miR-199b-5p inhibition. Further, we include new data with regards to miR-199a/b-5p inhibition on CAV1 (Figure 4A).

      I had a number of issues with the way in which some of the data was presented. Table 1 only reported whether a specific pathway was significant or not for a given differential expression analysis but this concealed the extent of this enrichment or the level of statistical significance reported. Could it be redrawn to more similarly match the format of Figure 3A? The various shades of grey in Figure 2 and Figure 4 made it impossible to discriminate between treatments and therefore identify whether these data supported the conclusions made in the text. It also appeared that the same results were reported in Figure 3B and 3C and, indeed, Figure 3B was not referred to in the main text. Perhaps this figure could be made more concise by removing one of these two sets of panels.

      We agree with all points made here and have amended these within the manuscript. Figure 1A now shows pathway enrichment plots from the TimiRGeN R Bioconductor package, and the table which previously showed the pathways enriched at each time point is now in the supplementary materials (Supplementary Table 1). Figures 2 and 4 now use color instead of shades of grey. Figure 3C has been moved to the supplementary materials (Supplementary Figure 2) and is referenced in the text.

      Overall, while I think that this is an interesting and valuable paper, I think its findings are relatively limited to those interested in the role of miRNAs in this specific biomedical context.

      Reviewer #2 (Public Review):

      Summary:

      This study represents an ambitious endeavor to comprehensively analyze the role of miR199a/b-5p and its networks in cartilage formation. By conducting experiments that go beyond in vitro MSC differentiation models, more robust conclusions can be achieved.

      Strengths:

      This research investigates the role of miR-199a/b-5p during chondrogenesis using bioinformatics and in vitro experimental systems. The significance of miRNAs in chondrogenesis and OA is crucial, warranting further research, and this study contributes novel insights.

      Weaknesses:

      While miR-140 and miR-455 are used as controls, these miRNAs have been demonstrated to be more relevant to Cartilage Homeostasis than chondrogenesis itself. Their deficiency has been genetically proven to induce Osteoarthritis in mice. Therefore, the results of this study should be considered in comparison with these existing findings.

      We agree with the reviewer's comments. miR-455-null mice develop normally but miR-140-null (or mutated) mice and humans do have skeletal abnormalities (e.g. Nat Med. 2019 Apr;25(4):583-590. doi: 10.1038/s41591-019-0353-2), indicating a role in chondrogenesis. We have made an addition in the description to point towards the need to assess the roles miR-199a/b-5p may play during skeletogenesis and OA. We anticipate miR-199a/b-5p to be relevant in OA and have ongoing additional work for this, but this is beyond the scope of this manuscript.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Beyond the issues raised in the public review, I had a few minor recommendations that are largely designed to help improve the understanding of the manuscript as it is currently written.

      (1) Please provide the statistical tests used to obtain p-values in the Figure 2 and 4 legends.

      We have now added statistical test information to the figure legends of figures 2 and 4.

      (2) It is stated on p. 9 that both miRNAs may share a functional repertoire because 25 and 341 genes are intersected between their inhibition experiments. Please provide statistical support that this overlap is an enrichment over the null background in this experiment. Total DE genes – chi squared. Expected / Observed.

      A chi-squared test is now presented in the manuscript, which shows that the number of significant genes found in common between miR-199a-5p knockdown and miR-199b-5p knockdown was significantly higher than expected for day 0 and day 1 of the experiments.
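
      As an illustration of the kind of overlap test described in this response, the following is a minimal sketch, not the authors' analysis. It assumes a 2x2 contingency table built from two differentially expressed gene sets drawn from a shared background, and computes the expected overlap under independence, the Pearson chi-squared statistic (1 degree of freedom, no continuity correction), and its p-value. The function name `overlap_chi2` and all counts in the usage note are hypothetical; the response does not report the set sizes or background used.

```python
import math


def overlap_chi2(n_a, n_b, n_overlap, n_total):
    """Chi-squared test for the overlap of two gene sets (hypothetical helper).

    n_a, n_b   -- sizes of the two differentially expressed gene sets
    n_overlap  -- number of genes present in both sets
    n_total    -- size of the shared background (all genes tested)
    Returns (expected_overlap, chi2_statistic, p_value).
    """
    # Cells of the 2x2 contingency table.
    a = n_overlap                        # in both sets
    b = n_a - n_overlap                  # only in set A
    c = n_b - n_overlap                  # only in set B
    d = n_total - n_a - n_b + n_overlap  # in neither set
    if min(a, b, c, d) < 0:
        raise ValueError("counts are inconsistent with the background size")

    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a margin of the table is zero; test is undefined")

    # Overlap expected if membership in A and B were independent.
    expected_overlap = n_a * n_b / n_total

    # Pearson chi-squared statistic for a 2x2 table.
    chi2 = n_total * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return expected_overlap, chi2, p_value
```

      With made-up counts, `overlap_chi2(100, 200, 40, 1000)` compares an observed overlap of 40 genes against the 20 expected by chance. An observed overlap above the expected value combined with a small p-value would indicate enrichment; the choice of background (`n_total`) strongly affects the result and should match the genes actually tested.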

      (3) The final sentence on p. 12 (beginning 'Size of the points reflect...') seemed out of place - is it part of a legend?

      Thank you for pointing out this mistake - it was part of figure 3C and now is in the supplementary materials.

      (4) A sentence on p. 14 reads that 'FZD6 and ITGA3 levels increased significantly' but this should read decreased, rather than increased. Quite an important typo!

      Thank you for pointing this error out. It has been corrected.

      (5) Theoretical transcripts are mentioned in the legend of Figure 5A but these were not present in the figure. Please include these or remove them from the legend.

      This error has been removed from Figure 5A.

      (6) On p 20, the references 22 and 27 should I think be moved to earlier in the sentence (after 'miR-199a-5p-FZD6 has been predicted previously'). Currently, it reads as if these references support your luciferase assays which you claim are the first evidence for this target relationship.

      We agree with this change and have corrected the manuscript.

      (7) The reference to Figure 5D on p. 20 should be a reference to Figure 5C.

      Thank you for pointing this error out – this has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      (1) The paper is based on the importance of miR-140 and miR-455 as miRNAs in chondrogenesis, citing only Barter, M. J. et al. Stem Cells 33, (2015). Considering the scope and results of this study, this citation is insufficient.

      We agree with this reviewer's comments. For many years miR-140 and miR-455 have been experimented on and their importance in OA research has become apparent. We included additional references within the introduction to address this.

      (2) Analyzing chondrogenesis solely through differentiation experiments from MSCs is inadequate. It is essential to perform experiments involving the network within normal cartilage tissue and/or the generation of knockout mice to understand the precise role of miR199a/b-5p in chondrogenesis.

      We have added an additional paragraph in the discussion to state this, and do believe it is highly important that miR-199a/b-5p be tested in OA samples – however this would be beyond the intended scope of this article.

      (3) In light of the above points, it is imperative to investigate the role of miR-199a/b-5p beyond the in vitro differentiation model from MSCs, encompassing mouse OA models or human disease samples.

      In line with our previous response, we agree with the premise and believe additional experiments should be performed to gain more insight into the mechanism by which miR-199a/b-5p regulates OA. However, development of a new mouse line to investigate this is not in the scope of this manuscript.

    1. Author response:

      eLife assessment

      This important study describes the crystallographic screening of a number of small molecules against a viral enzyme critical for the 5' capping of SARS-CoV-2 RNA and viral replication. While the high-quality crystal structures and complementary biophysical assays in this study provide solid evidence to support the major claims regarding how these small molecule compounds bind to the viral enzyme, the mismatch between the antiviral activity and binding to the viral enzyme of several small molecule compounds could have been more thoroughly investigated or discussed. This paper would be of interest to the fields of coronavirus biology, structural biology, and drug discovery.

      We fully agree that the antiviral assay results could be better put into context by clarifying that the antiviral effects of tubercidin and its derivatives are due to off-target effects.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript describes the crystallographic screening of a number of small molecules derived from the natural substrates S-adenosyl methionine (SAM) and adenine, against the SARS-CoV-2 2'-O-methyltransferase NSP16 in complex with its partner NSP10. High-quality structures for several of these are presented together with efforts to evaluate their potential biophysical binding and antiviral activities. The structures are of high quality and the data are well presented but do not yet show potency in biophysical binding. They only offer limited insights into the design of inhibitors of NSP16/10.

      Strengths:

      The main strengths of the study are the high quality of the structural data, and associated electron density maps making the structural data highly accurate and informative for future structure-based design. These results are clearly presented and communicated in the manuscript. Another strength is the authors' attempts to probe the binding of the identified fragments using biophysical assays. Although in general the outcome of these experiments shows negative data or very weak binding affinities the authors should be commended for attempting several techniques and showing the data clearly. This study is also useful as an example of the complexities associated with drug discovery on a bi-substrate target such as a methyltransferase, several of the observed binding poses were unexpected with compounds that are relatively similar to substrates binding in different parts of the active site or other unexpected orientations. This serves as an example of how experimental structural information is still of crucial importance to structure-based drug design. In general, the claims in the manuscript are well supported by the data.

      Weaknesses:

      The main limitations of the study are that the new structures generated in the study are fairly limited in terms of chemical space being similar to either SAM or RNA-CAP analogues. It feels a little bit of a lost opportunity to expand this to more diverse ligands which may reveal potential inhibitors that are distinct from current methyltransferase inhibitors based on SAM analogues and truly allow a selective targeting of this important target.

      It is true that it makes sense to screen more diverse compounds to expand the ligand set, and we hope our study motivates others to do so. Given the limited number of crystal structures of nsp10-16 with potential drug molecules, the aim of this study was to expand the database with new complex structures, providing a pool of complex structures for future compound designs with increased selectivity. Furthermore, some of the hits are known inhibitors of similar enzymes, and the most prominent and potent methyltransferase inhibitors are structurally related to SAM, like sinefungin and tubercidin. We think that knowing which SAM compounds or fragments of SAM are able to bind in the nsp10-16 active site is highly valuable for further specific and optimized inhibitor design.

      Another limitation is the potentially misleading nature of the antiviral assays. It is not possible to say if these compounds display on-target activity in these assays or even if the inhibition of NSP16/10 would have any effect in these assays. Whilst the authors do mention these points I think this should be emphasized more strongly.

      That is a very valid point and we do not believe that the antiviral activity is based on on-target effects. We do agree that the way it is currently presented can be considered misleading and we indeed clarify this point in the revised version.

      Minor critical points:

      The authors state that their crystals and protein preps have co-purified SAM occupying the active site of the crystals. Presumably, this complicates the interpretation of electron density maps as many of the ligands share overlap with the existing SAM density making traditional analysis of difference maps challenging. The authors did not utilize the PanDDA analysis for this step, perhaps this is related to the presence of SAM in the ground state datasets? Also, occupancies are reported in the manuscript in some cases to two significant figures, this seems to be an overestimation of the ability of refinement to determine occupancy based on density alone and the authors should clarify how these figures were reached.

      We have used PanDDA in parallel for hit finding. However, we did not see any advantages for this target over the hit finding results from the visual inspection. This is probably, as mentioned, because SAM is present in the “ground state”, which complicates the PanDDA map calculations.

      Regarding the occupancies, we fully agree with this comment and change it to reasonable digits and clarify how the figures were reached.  

      The molecular docking approach to pre-selection of library compounds to soak did not appear to be successful. Could the authors make any observations about the compounds selected by docking or the docking approach used that may explain this?

      Yes, it is a good point to give possible explanations why the docking approach was not successful to facilitate similar approaches in future studies.

      Reviewer #2 (Public Review):

      Summary:

      The study by Kremling et al. describes a study of the nsp16-nsp10 methyl transferase from SARS CoV-2 protein which is aimed at identifying inhibitors by x-ray crystallography-based compound screening. A set of 234 compounds were screened resulting in a set of adenosine-containing compounds or analogues thereof that bind in the SAM site of nsp16-nsp10. The compound selection was mainly based on similarity to SAM and docking of commercially available libraries. The resulting structures are of good quality and clearly show the binding mode of the compounds. It is not surprising to find that these compounds bind in the SAM pocket since they are structurally very similar to portions of SAM. Nevertheless, the result is novel and may be inspirational for the future design of inhibitors. Following up on the crystallographic screen the identified compounds were tested for antiviral activity and binding to nsp16-nsp10. In addition, an analysis of similar binding sites was presented.

      Strengths:

      The crystallography is solid and the structures are of good quality. The compound binding constitutes a novel finding.

      Weaknesses:

      The major weakness is the mismatch between antiviral activity and binding to the target protein. Only one of the compounds could be demonstrated to bind to the nsp16-nsp10 protein. By performing a displacement experiment using ITC Sangivamycin is concluded to bind with a Kd > 1mM. However, the same compound displays antiviral activity with an EC50 of 0.01 microM. Even though the authors do not make specific claims that the antiviral effect is due to inhibition of nsp16-nsp10, it is implicit. If the data is included, it should state specifically that the effect is not likely due to nsp16-nsp10 inhibition.

      We do believe that the antiviral data are valuable and should be published within this work. We also agree with the comment that it should be clearly stated that the antiviral effect is not likely because of nsp10-16 inhibition and we will optimize that accordingly.

      The structure of the paper and the language needs quite a lot of work to bring it to the expected quality.

      We will go through the manuscript again and further improve the structure and language as much as possible.

      Technical point:

      Refinement of crystallographic occupancies to single digit percentage is not normally supported by electron density.

      We agree with that point and correct it in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Ewing sarcoma is an aggressive pediatric cancer driven by the EWS-FLI oncogene. Ewing sarcoma cells are addicted to this chimeric transcription factor, which represents a strong therapeutic vulnerability. Unfortunately, targeting EWS-FLI has proven to be very difficult, and a better understanding of how this chimeric transcription factor works is critical to achieving this goal. Towards this perspective, the group had previously identified a DBD-α4 helix (DBD) in FLI that appears to be necessary to mediate EWS-FLI transcriptomic activity. Here, the authors used multi-omic approaches, including CUT&tag, RNAseq, and MicroC to investigate the impact of this DBD domain. Importantly, these experiments were performed in the A673 Ewing sarcoma model where endogenous EWS-FLI was silenced, and EWS-FLI-DBD proficient or deficient isoforms were re-expressed (isogenic context). They found that the DBD domain is key to mediating EWS-FLI cis activity (at msat) and to generating the formation of specific TADs. Furthermore, cells expressing DBD-deficient EWS-FLI display very poor colony-forming capacity, highlighting that targeting this domain may lead to therapeutic perspectives.

      We thank Reviewer 1 for their strong summary of Ewing sarcoma background and accurate description of our experimental approaches and findings.

      Strengths:

      The group has strong expertise in Ewing sarcoma genetics and epigenetics and also in using and analyzing this model (Theisen et al., 2019; Boone et al., 2021; Showpnil et al., 2022).

      We thank the reviewer.  

      They aim at better understanding how EWS-FLI mediated its oncogenic activity, which is critical to eventually identifying novel therapies against this aggressive cancer.

      We are happy to see that our overall aim was also appreciated by Reviewer 1.

      They use the most recent state-of-the-art omics methods to investigate transcriptome, epigenetics, and genome conformation methods. In particular, Micro-C enables achieving up to 1kb resolved 3D chromatin structures, making it possible to investigate a large number of TADs and sub-TADs structures where EWS-FLI1 mediates its oncogenic activity.

      We thank Reviewer 1 for their acknowledgement of our approaches and the resolution achieved with our Micro-C experiments.  

      They performed all their experiments in an Ewing sarcoma genetic background (A673 cells) which circumvents bias from previously reported approaches when working in non-orthologous cell models using similar approaches.

      We agree with the reviewer about the importance of using model systems that accurately capture features of the disease being studied. As we have added an additional cell line in the revision we should note that this second model also represents a Ewing sarcoma genetic background while representing tumors expressing another oncogenic fusion found in this disease. 

      Weaknesses:

      The main weakness comes from the poor reproducibility of Micro-C data. Indeed, it appears that the distances/clustering observed between replicates are typically similar or even larger than between biological conditions. For instance, in Figure 1B, I do not see any clustering when considering DBD1, DBD2, DBD+1, DBD+2.

      Lines 80-83: "KD replicates clustered together with DBD replicate 1 on both axes and with DBD replicate 2 on the y-axis. DBD+ replicates, on the other hand, clustered away from both KD and DBD replicates. These observations suggest that the global chromatin structure of DBD replicates is more similar to KD than DBD+ replicates."

      When replacing DBD replicate 1 with DBD replicate 2, their statement would not be true anymore.

      Additional replicates to clarify this aspect seem absolutely necessary since those data are paving the way for the entire manuscript.

      These are valid concerns and we thank the reviewers for highlighting this limitation of poor clustering of Micro-C replicates on the MDS plot. We account for this variability between different replicates when identifying differentially interacting regions. By using an adjusted p-value < 0.05, we aim to ensure that, on repeating the experiments, we would discover the same differentially interacting regions with a false discovery rate of 5%.

      We would also like to note that the replicates cluster much more closely on the PCA plot of RNA-seq data (Supplementary Figure 1C) as well as on the PCA plot of H3K27ac CUT&Tag data (Figure 4A). Notably, the RNA-seq result has now been reproduced when performed with different sets of hands across multiple studies (Boone, et. al., 2021 and this report), as well as in a second cell line (as reported in this manuscript revision). These observations suggest that the cells of these replicates are functionally similar to each other at a population level. Chromatin organization detected by Micro-C is highly heterogeneous among cells of a population (Misteli, et. al., 2020). Moreover, despite increased resolution with Micro-C over Hi-C, the conventional sequencing depth at which Micro-C is performed makes resolving finer-scale 3D interactions, particularly between enhancers and promoters, challenging (Goel, et. al., 2023). Thus, biologically relevant interactions driving EWSR1::ETS transcriptional regulation through de novo enhancers may have relatively weak signal in Micro-C. Both the strength of the signal and the heterogeneous chromatin state present in bulk samples could affect the average signal, leading to poor clustering of replicates (Hafner and Boettiger, 2022).

      Importantly, rather than add an additional replicate of a single cell line, we repeated our study in an additional cell line, TTC466, and largely reproduced our high-level findings for transcription, enhancer formation, and 3D chromatin. Specific limitations of the TTC466 study are addressed in the Discussion section (392-420). The reproduction of weak/moderate clustering in the MDS plot in both A673 and TTC466 cell lines suggests the α4 helix of EWSR1::ETS fusions is important for reshaping 3D chromatin. However, higher resolution analyses focused on specific EWSR1::ETS-bound loci are likely an important area of future study required to fully understand the role of the α4 helix in chromatin regulation in Ewing sarcoma.
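
      To make the adjusted p-value threshold mentioned in the response above concrete, here is a minimal sketch of one common multiple-testing correction, the Benjamini-Hochberg procedure. This is an assumption for illustration only: the response states an adjusted p-value cutoff of 0.05 and a 5% false discovery rate, but does not name the adjustment method or software used. The function name `bh_adjust` and the example values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper).

    Returns a list of adjusted values in the same order as the input.
    Calling regions with adjusted p < 0.05 controls the expected false
    discovery rate at 5% under the procedure's assumptions.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

      For example, `bh_adjust([0.01, 0.04, 0.03, 0.005])` yields adjusted values of approximately 0.02, 0.04, 0.04 and 0.02, so all four hypothetical regions would pass a 0.05 cutoff. Note that this controls the expected proportion of false positives among the called regions; it does not by itself guarantee that the same regions are recovered in a repeat experiment.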

      Similarly:

      - In Figure 1C, how would the result look when comparing DBD2/KD2/DBD+2? Same when comparing DBD 1 with KD1 and DBD+1. Would the difference go in the same direction?

      This is a great point. We added distance decay plots of individual replicates in Supplementary Figure 2 and added discussion of these results in lines 88-89 of the text.

      - Figure 1D-E. How would these plots look like when comparing each replicate to each other's? How much difference would be observed when comparing, for instance, DBD1/DBD2 ? or DBD1/DBD+1?

      Unfortunately, separate replicates are required to conduct Differentially Interacting Region analysis as it determines statistically significant interactions. Therefore, we are unable to plot these analyses with individual replicates. 

      - Figure 2: again, how would these analyses look like when performing the analysis with only DBD1/DBD+1/KD1 or DBD2/DBD+2/KD?

      This is a good suggestion. It is possible to do such an analysis. However, we would lose resolution, such that we may not accurately detect TADs, especially smaller TADs. Therefore, we decided to combine the biological replicates.

      Another major question is the stability of EWS-FLI DBD vs EWS-FLI DBD+ proteins. In the WB, FLAG intensities seem also higher (2/3 replicates) in DBD+ condition compared to the DBD condition (Figure S1B).

      This is a valid concern with the shRNA knock-down/rescue system, and we regularly validate new constructs to ensure that they have similar expression levels as rescue with the wildtype fusion before proceeding to more exhaustive experimental workups. We would note that while we have not tested for differences in protein stability, for these constructs we largely see similar expression levels across multiple experiments, multiple cell lines, and multiple sets of hands. There may be some variation in expression level from experiment to experiment, but western blotting is a semiquantitative assay and it is also not possible to rule out that slight differences in band intensity may be a result of error in gel loading. For this reason, alongside western blotting for construct expression, we also validate construct function using RNA-seq and colony formation assays (as reported in this manuscript), and these show good agreement across biological replicates.

      Indeed, it seems that they have more FLAG (i.e., EWS-FLI) peaks in the DBD+ condition compared to the DBD condition (Figure 2B). 

      We appreciate the comment, since the legend of Figure 2B led to a misunderstanding. Figure 2B depicts the number of TADs detected in the DBD and DBD+ conditions (height of the bars) and the proportion of those TADs that overlap FLAG peaks, CTCF peaks, both, or neither (y-axis). The number of FLAG peaks is actually lower in DBD+ than in DBD, as shown in Figure 5A-B. We clarified our Figure 2 legend to accurately describe the proportions (color-coded sections) of TADs bound by DBD/DBD+ FLAG and CTCF.
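      To make the distinction between bar height (number of TADs) and colored sections (overlap categories) concrete, the following is a minimal sketch of how such a tally could be computed. It is not the authors' pipeline: the function names, the half-open interval convention, the rule that any intersecting peak counts as an overlap, and the toy coordinates are all assumptions for illustration.

```python
from collections import Counter

def overlaps(interval, peaks):
    """True if any peak on the same chromosome intersects the interval.

    Intervals and peaks are (chrom, start, end) tuples, treated as half-open.
    """
    chrom, start, end = interval
    return any(c == chrom and s < end and e > start for c, s, e in peaks)

def classify_tads(tads, flag_peaks, ctcf_peaks):
    """Count TADs overlapping FLAG peaks, CTCF peaks, both, or neither."""
    counts = Counter()
    for tad in tads:
        has_flag = overlaps(tad, flag_peaks)
        has_ctcf = overlaps(tad, ctcf_peaks)
        if has_flag and has_ctcf:
            counts["both"] += 1
        elif has_flag:
            counts["FLAG only"] += 1
        elif has_ctcf:
            counts["CTCF only"] += 1
        else:
            counts["neither"] += 1
    return counts

def proportions(counts):
    """Convert category counts into fractions of all TADs."""
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

# Hypothetical toy data, not taken from the manuscript.
tads = [("chr1", 0, 100), ("chr1", 100, 200), ("chr1", 200, 300), ("chr2", 0, 100)]
flag_peaks = [("chr1", 10, 20), ("chr1", 150, 160)]
ctcf_peaks = [("chr1", 90, 98), ("chr2", 50, 60)]

counts = classify_tads(tads, flag_peaks, ctcf_peaks)
total_tads = sum(counts.values())   # corresponds to the bar height
fractions = proportions(counts)     # corresponds to the colored sections
```

      In this sketch the total number of TADs and the per-category fractions are separate quantities, so a condition can have more TADs overlapping FLAG peaks without having more FLAG peaks overall.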

      Would it be possible that DBD+ is just more expressed or more stable than DBD? The higher stability of the re-expressed DBD+ could also partially explain their results independently of the 3D conformational change. In other words, can they exclude that DBD+ and DBD binding are not related to their respective protein stability or their global re-expression levels?

      It is possible that the DBD+ protein is more highly expressed or more stable than DBD. With our current data, we cannot conclusively exclude that binding by DBD and DBD+ is related to their expression level or stability. We would note, as above, that western blots, RNA-seq, and agar assays have largely reproduced across experiments, hands, and cell lines, and that western blotting is an imperfect assay for assessing protein stability.

      Surprisingly, WB FLI bands in DBD+ conditions are systematically (3/3 replicates) fainter than in DBD conditions (Figure S1B). How do the authors explain these opposite results between FLI and FLAG in the WB?

      This is an excellent observation that highlights one of the intricacies of studying EWSR1::FLI1 in our KD/rescue system. Often the limiting factor for an experiment is whether or not the KD condition maintains KD through a second viral transduction for rescue and selection. We have observed over many years of working with this system that rescue conditions which are fully functional (i.e. wildtype EWSR1::FLI1, DBD+, etc.) tend to maintain better KD of endogenous EWSR1::FLI1. Constructs that don’t rescue EWSR1::FLI1 function sometimes maintain KD to a lesser degree, though frequently to a functional degree (i.e. cells are not transformed and EWSR1::FLI1 transcriptional regulation is not rescued). We suspect that this observation, also raised by Reviewer 1, results from selection of cells in which more endogenous EWSR1::FLI1 escapes KD in DBD conditions, owing to selective pressures during expansion in tissue culture.

      We should note that the antibody used for detecting FLI recognizes residues that are deleted in the DBD and DBD+ constructs, such that the FLI1 blot in Supplementary Figure 1B does not detect either construct. It only detects endogenous EWSR1::FLI1 and the 3X-FLAG-EWSR1::FLI1 construct in the middle lane, which runs at a slightly higher molecular weight. The FLAG antibody is the only antibody that detects all three rescue constructs.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Bayanjargal et al. entitled "The DBD-alpha4 helix of EWS::FLI is required for GGAA microsatellite binding that underlies genome regulation in Ewing sarcoma" reports on the critical role of a small alpha helix in the DNA binding domain (DBD) of the FLI1 portion of EWS::FLI1 that is critical for binding to repetitive stretches of GGAA-motifs, i.e. GGAA microsatellites, which serve as potent neoenhancers in Ewing sarcoma.

      We thank Reviewer 2 for their succinct and accurate summary of our manuscript. 

      Strengths:

      The paper is generally well-written and easy to follow, and the data presented are of high quality, well-described and underpin the conclusions of the authors. The report sheds new light on how EWS::FLI1 mechanistically binds to and activates GGAA microsatellite enhancers, which is of importance to the field.

      We appreciate the reviewer’s assessment of our work. 

      Weaknesses:

      While there are no major weaknesses in this paper, there are a few minor issues that the authors may wish to address before publication:

      (1) While the official protein symbol for the gene EWSR1 is indeed EWS, the protein symbol for the gene FLI1 is identical, i.e. FLI1. The authors nominate the fusion oncoprotein EWS::FLI (even in the title) but it appears more adequate to use EWS::FLI1.

      We thank the reviewer for bringing this to our attention. Indeed, the most recent guideline for fusion protein nomenclature is to use the full gene symbols separated by double colons. Therefore, the accurate nomenclature is EWSR1::FLI1. We replaced instances of EWS::FLI with EWSR1::FLI1 and have used the EWSR1::ERG nomenclature in our revised manuscript.

      (2) The used cell lines should be spelled according to their official nomenclature (e.g. A-673 instead of A673).

      Corrected, thanks!

      (3) It appears as if the vast majority of results were generated in a single Ewing sarcoma cell line (A-673) which is an atypical Ewing sarcoma cell line harboring an activating BRAF mutation and may be genomically quite unstable as compared to other Ewing sarcoma cell lines (Kasan et al. 2023 preprint at bioRxiv https://www.biorxiv.org/content/10.1101/2023.11.20.567802v1). Hence, it may be supportive for the paper to recapitulate/cross-validate a few key results in other Ewing sarcoma cell lines, e.g. by using EWS::ERG-positive cell lines. Perhaps the authors could make use of available published data.

      We thank Reviewer 2 for this helpful comment. We replicated the experiments in TTC-466 cells, which contain the EWSR1::ERG fusion, and found that, as in A-673 cells, the DBD-α4 helix is important for transcriptional, enhancer, and 3D chromatin regulation (Supplementary Figures 9-18).

      (4) Figure 6 and Supplementary Figure 5 are very interesting but focus on two selected target genes of the fusion (FCGRT and CCND1). It would be interesting to see whether these findings also extend to common EWS::ETS transcriptional signatures that have been reported. The authors could explore their data and map established consensus EWS::ETS signatures to investigate which other hubs might be affected at relevant target genes.

      We expanded our analysis to other genes shown to be regulated by EWSR1::FLI1-nucleated transcriptional hubs (Chong et al., 2018) and included the NKX2-2 and GSTM4 gene regions in Supplementary Figures 7-8 for A-673 cells. We also investigated the same gene regions (FCGRT, CCND1, NKX2-2, GSTM4) in TTC466 cells and report them in Supplementary Figures 14-17. For brevity, we decided to include only the above examples. We may need to develop different tools to conduct further analysis of the gene regulatory networks driven by DBD and DBD+ in relation to hub formation. Although mapping such a network is a great suggestion, it may be outside the scope of this manuscript. We thank the reviewer for bringing this point to our attention.

      (5) Table 1 is a bit hard to read. In my opinion, it is not necessary to display P-values with up to 8 decimal positions. The gene symbols should be displayed in italic font.

      Suggestions adopted, thanks!

      Reviewing Editor (Recommendations For The Authors):

      We would draw the authors' attention to the following issues that would best benefit from additional revision.

      As indicated by Referee 1, an important issue concerns the apparent poor reproducibility of Micro-C data. In Figure 1B, the clustering of the DBD1, DBD2, DBD+1, and DBD+2 is poor.

      It appears that the distances/clustering observed between replicates are typically similar or even larger than between biological conditions. Lines 80-83: "KD replicates clustered together with DBD replicate 1 on both axes and with DBD replicate 2 on the y-axis. DBD+ replicates, on the other hand, clustered away from both KD and DBD replicates." If one replaced DBD replicate 1 with DBD replicate 2, this statement would no longer be true. The referees believe that it is important to fully account for these potential discrepancies. Most of the study is based on analyses of these data sets, so if there are issues with them it has repercussions on the entire study. We note however that in Figure 4A the clustering of the H3K27ac data is much more convincing. The referees also feel that it is important to show immunoblots of the expression of DBD and DBD+ levels in the experiments performed here. While this was previously shown in the Boone et al publication in 2021, it could be illustrated again here.

      We thank the editors for concisely summarizing the main weaknesses of the paper and underscoring the importance of the Micro-C data in the rest of the paper. While the Editors note tighter clustering of the H3K27ac data (Figure 4A), we would like to note that the replicates cluster much more closely on the PCA plot of the RNA-seq data (Supplementary Figure 1C). Notably, the RNA-seq result has now been reproduced by different sets of hands across multiple studies (Boone et al., 2021 and this report), as well as in a second cell line (as reported in this manuscript revision). Though not as tight, the H3K27ac CUT&Tag also reproduces in TTC466 cells. Thus, we interpret these findings to indicate that our replicates are functionally similar to each other. As discussed above in more detail in the response to Reviewer 1, several factors could affect how these functional similarities are represented in Micro-C data. Micro-C is ultimately a readout of the chromatin organization in a heterogeneous population of cells (Misteli et al., 2020). Additionally, sequencing depth limitations in conventional Micro-C experiments limit the ability to faithfully assess the enhancer-promoter interactions that may be relevant for our model system (Goel et al., 2023). Thus, both the strength of the biologically relevant signal and the heterogeneous chromatin state present in bulk samples could affect the average signal and lead to poorly clustering replicates (Hafner and Boettiger, 2022).

      To address these important concerns about the rigor and reproducibility of the analyses, we repeated our study in an additional cell line, TTC466, and largely reproduced our high-level findings for transcription, enhancer formation, and 3D chromatin. These additional studies were not without their own limitations, which are addressed in the Discussion section (lines 392-420). The reproduction of weak/moderate clustering in the MDS plot in both A673 and TTC466 cell lines suggests that the α4 helix of EWSR1::ETS fusions is important for reshaping 3D chromatin. However, additional genomic analyses geared toward higher resolution at specific EWSR1::ETS-bound loci are likely an important area of future study, required to fully understand the role of the α4 helix in chromatin regulation in Ewing sarcoma. Live cell imaging, as performed by Chong et al., 2018, and additional biochemical techniques may also be informative but are beyond the scope of this report.

      With regard to concerns about construct expression, we have included immunoblots of the rescue constructs in both cell lines (Supplementary Figures 1B and 9A) and discussed Reviewer 1’s specific concerns in detail above.

      The referees also raise the issue of using an additional cell line to make a more general message. Although it would perhaps be asking too much to repeat the MicroC experiments, consolidation of the observations could be performed by focusing on specific loci such as FCGRT and CCND1 that were analyzed in this study. Could the authors use 4C-type experiments to reproduce the conclusions in an additional cell line? It would also be pertinent to consolidate the findings at these loci by 4C-type approaches even in the cell line used here. For the moment, all conclusions are based on the same set of data and a single technical approach.

      We repeated the experiments in TTC466 cells and analyzed the data using the same cut-offs used for A-673 cells, which allows us to compare the two cell lines. We hope this new set of experiments and analyses addresses the reviewers’ concerns.

      Reviewer #1 (Recommendations For The Authors):

      All the data are performed in A673 cells. Knowing the transcriptomic and epigenetic heterogeneity of Ewing sarcoma cells, some of the experiments supporting their findings should be replicated in at least another Ewing sarcoma model.

      Per our discussion above, we have replicated our experiments in an additional cell line model of Ewing sarcoma. Importantly, the TTC466 cell line used expresses the EWSR1::ERG fusion found in 10-15% of Ewing sarcoma cases.  

      Supplementary Figure 2B. Proportion of TAD boundaries bound by FLAG (i.e., EWS-FLI1) and CTCF. The number/proportion of FLAG (i.e., EWS-FLI) peaks observed at CTCF peak/TAD boundaries seems unexpectedly high. How do they explain this result since EWS-FLI peaks are rather intra-TAD to mediate their enhancer function?

      In our previous study, we showed that EWSR1::FLI1 binding can be detected at the boundaries of TADs (Showpnil et al., 2022). We therefore think it likely that EWSR1::FLI1 binding is able to mediate enhancer function both inside TADs and at their borders and may, in some cases, function as an insulator between TADs.

      For the >50kb loop analysis, what was the low-range threshold? Up to 15-20 kb, contact frequency interactions may be caused by PFA crosslink (did they use a 5kb threshold ?). Were those excluded from that analysis?

      We acknowledge that we did not use a lower threshold to exclude those short-range loop interactions. In our previous study, we observed that EWSR1::FLI1 binding reduces long-range interactions in favor of short-range interactions (Showpnil et al., 2022), and we wanted to be able to capture short-range loops in our analysis.

      In Figure 2D, they observed that within TADs containing FLAG peaks at GGAA microsatellites, the intensity of the DBD+ FLAG peaks was higher compared to DBD FLAG peaks. How would this analysis look when considering the ETS FLAG peaks (i.e., EWS-FLI rather repressive peaks)? Could they compare TAD with GGAA msat vs TAD with ETS peaks?

      We agree that this is an interesting observation. In our prior analyses we found no discernible relationship between EWSR1::FLI1 binding and changes in 3D chromatin associated with repression (Showpnil et al., Nucleic Acids Research, 2022). In contrast, EWSR1::FLI1-bound superenhancers had greater H3K27ac deposition when overlapping both a bound GGAA repeat and a non-microsatellite site. While there have been several additional reports about the relevance of EWSR1::FLI1 binding at non-microsatellite peaks, motifs at these loci have not yet been defined as rigorously as GGAA repeats were by Johnson et al. in PLoS One, 2017. Each ETS factor binds different motifs containing the core 5’-GGAA-3’ with varying affinities depending on the flanking residues. There may be a >100-fold difference in sequence-specific binding affinity between “high” and “low” affinity motifs. Better defining the types of ETS motifs bound by EWSR1::FLI1 and the functional changes associated with them thus represents an interesting area of future study.

      Figure 1F: What is the biological meaning of these results (29.7, 39.5, and 54Mbp)? These distances are typically the size of a chromosome arm and clearly beyond classical chromatin loop/TAD structures in which EWS-FLI mediates its cis-activity.

      We agree with the referee here. This panel has been removed from our revised manuscript.

      How do DBD, KD, and DBD+ conditions compare with WT parental cells in the omics data? (Figures 1B, 4A). Do DBD+ conditions overlap with WT conditions? It would be nice to have these analyses also for Micro-C and Cut&Tag data. To be acknowledged here, the transcriptome data showing this aspect in Figure S1C are very convincing.

      This is a fair point. We were not able to obtain sequencing depth for the wtEF Micro-C libraries similar to that of KD, DBD, and DBD+ because a disproportionate share of the wtEF libraries was used in troubleshooting. Therefore, we decided to exclude the wtEF condition from these analyses.

      EWS-FLI cis-regulation at CCND1 also occurs through a much closer EWS-FLI peak (~-20kb msat upstream of CCND1 TSS) which was not taken into consideration. EWS-FLI peak intensity in both DBD and DBD+ at this msat seems similar. How would this fit into their model?

      The referee is correct. The closest peak upstream of the CCND1 TSS is ~19kb away. We highlighted this peak with the dashed boxes near the CCND1 TSS (Supplementary Figure 6). The peak intensity of DBD+ FLAG is slightly higher than that of DBD. Nonetheless, we acknowledge that the difference is small. We suspect that the DBD-α4 helix affects binding dynamics at GGAA repeats, but these genomics approaches are not well suited to detecting small, but significant, changes in binding affinity or dynamics. In this case a more biochemical approach may be needed. Even though both proteins can still bind the same microsatellites, they might differ in their stability of binding or in the recruitment of additional proteins. These possibilities are discussed in the Discussion section (lines 444-463).

      For the Micro-C, they sequenced only 7 to 8 million reads per condition. This coverage seems particularly low, especially for their analyses using 1-5kb bins. How does this compare with other published Micro-C data? Can this explain the variability observed between replicates?

      We apologize for the inconsistent wording about sequencing coverage, which may have caused confusion. The 7 to 8 million reads were used for shallow sequencing and QC analysis. Once a sample passed QC, we then sequenced 300 million reads per sample. At line 598, "300M" has been changed to "300 million" to prevent misunderstanding.
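      As a rough back-of-the-envelope illustration of why the two depths matter at 1-5 kb resolution, the sketch below compares the average number of reads per genomic bin at 8 million and 300 million reads. It is only an approximation under stated assumptions: the genome length of about 3.1 billion bp is an approximate figure for the human genome, reads are treated as evenly spread, and the function name is hypothetical. Real Micro-C contacts populate pairs of bins and only valid read pairs contribute, so usable coverage per bin pair is considerably lower than these one-dimensional averages.

```python
# Assumption: approximate human genome length in base pairs.
GENOME_SIZE_BP = 3.1e9

def mean_reads_per_bin(total_reads, bin_size_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average reads per bin if reads were spread evenly along the genome."""
    n_bins = genome_size_bp / bin_size_bp
    return total_reads / n_bins

# Depths mentioned in the response: shallow QC run vs. full sequencing.
for depth in (8e6, 300e6):
    for bin_size in (1_000, 5_000):
        avg = mean_reads_per_bin(depth, bin_size)
        print(f"{depth:.0e} reads, {bin_size} bp bins: {avg:.1f} reads per bin")
```

      Under these assumptions, the full-depth libraries carry 37.5 times more reads per bin than the shallow QC runs at any bin size, since the ratio depends only on the two read totals.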

      They mention:

      "In our recent studies of EWS::FLI, we found a small alpha helix in the DNA binding domain, DBD-α4, to be required for transcription and regulation by the fusion protein (Boone et al., 2021). Interestingly, this study did not find any change in chromatin accessibility (ATAC-Seq) and genome localization of EWS::FLI constructs (CUT&RUN) when DBD-α4 helix was deleted leaving the mechanistic basis for the requirement of DBD-α4 in transcription regulation unclear."

      And

      "To assay the enhancer landscape, we collected H3K27ac CUT&Tag data from KD, DBD, and DBD+ cells. Principal component analysis of H3K27ac localization shows that the DBD replicates were clustered closer to the KD replicates while being in between the KD and the DBD+ replicates (Figure 4A), suggesting that DBD-α4 helix is required to reshape the enhancer landscape."

      But now H3K27ac CUT&Tag show strong differences which were not observed in ATAC seq. How to explain this discrepancy?

      Though both H3K27ac and ATAC signal are associated with enhancers and promoters in euchromatin, they do not measure exactly the same thing. H3K4me2 is a mark more closely associated with ATAC signal than H3K27ac is (Henikoff et al., 2020). Nonetheless, there are clear differences between the prior publication (Boone et al., 2021) and this work with regard to the similar ATAC signal for each replicate and the differences in H3K27ac. We suspect this may reflect a tighter association of EWSR1::FLI1-mediated genome regulation with H3K27ac than with ATAC signal. Notably, there were very few differentially accessible regions between EWSR1::FLI1-depleted cells and conditions with EWSR1::FLI1 expression (either endogenous or wildtype rescue) using the A673 KD/rescue system in Boone et al., 2021. In contrast, other A673 KD/rescue studies have reported differences in H3K27ac in EWSR1::FLI1-expressing conditions relative to EWSR1::FLI1-depleted conditions (Theisen et al., 2021).

      The authors mention:

      "Our study thus uncovered a surprising role for FLI DBD in the process of hub formation which is usually attributed to the EWS low complexity domain."

      Not sure this can be claimed, hubs are composed of many other factors that are not investigated here. Furthermore, promoter enhancer hubs/loops often include combined ETS and mSat chains to generate transcriptional hubs which have not been considered here. None of these points were discussed here.

      We replaced “uncovered” with “suggest” in our revised manuscript at line 476.  

      What are the barcode patterns in Supp 5, are those frequently observed in their Micro-C data, likely mapping artifacts, do they have any impact on their analyses?

      The barcode patterns in what is now Supplementary Figure 6 are blind spots in the hg19 genome assembly. Since they are few in number, we don’t expect these blind spots to impact our analysis.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2024-02516

      Corresponding author(s): Christopher Shoemaker

      1. General Statements [optional]

      Thank you to all the reviewers for their helpful efforts on behalf of our manuscript. We appreciate the time and effort they have invested in providing valuable feedback.

      Overall, the positive reception from our reviewers highlighted their appreciation for our approach and findings. Moreover, their comments underscored the relevance and potential impact of our findings, particularly within the fields of autophagy and protein interaction networks. Their detailed and constructive critiques will also help refine both the content and presentation of our work.

      In response to the reviews, we have proposed targeted revisions to the manuscript, all of which are well within our lab's capabilities and can be executed efficiently. We have detailed our responses to each specific point raised by the reviewers below.

      2. Description of the planned revisions

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary:

      Selective autophagy receptors (SARs) of the Sequestosome-1 like receptor group (SLRs) including SQSTM1(Sequestosome-1)/p62, NBR1, TAX1BP1, NDP52, CALCOCO1 and Optineurin are soluble SARs that engage cargo and ATG8 family proteins as well as components of the core autophagy machinery like FIP200/RBCC1 to bring about the autophagic degradation of the cargo and themselves. In the autophagic degradation of protein aggregates (aggrephagy) the most studied SAR p62 collaborates with the archetypal autophagy receptor NBR1 and also TAX1BP1 to bring about effective turnover of ubiquitinated cargos sequestered into p62 bodies or droplets by liquid-liquid phase separation. How this intricate co-operation of these SARs is orchestrated is incompletely understood. In the paper by North et al entitled "The LC3-interacting region of NBR1 is a protein interaction hub enabling optimal flux" the authors use peptide arrays to map the binding sites for ATG8-family proteins LC3A and GABARAPL1, FIP200 and TAX1BP1 to the autophagy receptor NBR1. The authors find that three short linear interaction motifs (SLiMs), the LIR, FIR and TIR interacting with ATG8 family proteins, FIP200 and TAX1BP1, respectively, partly overlap in a short region of NBR1 that can adopt different conformations to accommodate the different binding partners. In short, the different interactions are mediated by distinct overlapping determinants, rather than a single, convergent, SLiM. While the important binding determinants for ATG8 proteins and FIP200 show more overlap and it was not possible here to find mutations that distinguish LIR and FIR binding, TAX1BP1 bound more to a region downstream of the LIR and a specific mutation in NBR1 and in TAX1BP1 could abolish binding. Checking the role of phosphorylations in augmenting binding using phosphomimetic mutations it was seen that while FIP200 and Atg8-family binding were generally augmented by phosphorylation, TAX1BP1 binding did not respond to these mutations. 
      Very interestingly, the authors found that co-expression of TAX1BP1 with tandem-tagged NBR1 in pentaKO cells (not expressing the SLRs p62, NBR1, NDP52, TAX1BP1 and OPTN) increased significantly the autophagic turnover of NBR1. None of the other SLRs could do this. Instead, this over-expression assay revealed a competition.

      Major points:

      1) In Fig 4 the peptide array binding assay is not sufficient as it is only semiquantitative. The data shown should be accompanied by a more direct binding assay allowing the determination of kDs for the binding where the WT peptides are directly compared to the phosphor mimicking mutant peptides. Here the fluorescence anisotropy assay the authors use in Suppl Fig. 1E or ITC, OctetRed96 or another assay suitable for kD determinations should be used.

      Response: Thank you for the constructive comments regarding our peptide array binding assay. We agree that the semi-quantitative nature of this method limits its ability to provide detailed binding affinity measurements. To address this, we will purify multiple peptides and measure the binding affinities of LIR peptides, with and without phosphomimetic substitutions, for Atg8s, FIP200, and TAX1BP1. While testing all peptides may be cost- and time-prohibitive, we will prioritize a representative range for this validation effort.

      2) As this paper is already dominated by the use of peptides it would significantly enhance the quality of the data if the authors had included studies with peptides phosphorylated at the specific positions to allow comparison with the phosphomimetic substitutions to aspartate.

      Response: Thank you for your insightful comment. We agree that incorporating studies with peptides phosphorylated at specific positions could provide a more nuanced comparison with the phosphomimetic substitutions to aspartate. Previous studies, including Popelka and Klionsky (2022) and Kliche et al. (2022), have indeed suggested that phosphomimetic substitutions do not perfectly replicate phosphorylation events.

      In response, we plan to order a peptide array containing phosphorylated peptides, not merely phosphomimetics, and will conduct additional experiments with TAX1BP1, FIP200, and LC3A. This approach will allow us to directly assess the effects of actual phosphorylation compared to phosphomimetic substitutions.

      While we acknowledge the possibility of subtle differences in binding affinity or regulatory interactions, we anticipate that the primary conclusions of our study—namely, that TAX1BP1 is largely insensitive to phosphorylation, whereas FIP200 and LC3A binding activities are affected—will remain unchanged. These experiments will provide valuable data to confirm the robustness of our conclusions under the conditions of true phosphorylation.

      3) The quality of the 2D peptide array probing of GST-LC3A binding in Fig 3A is poor. Is this a stripped and re-probed membrane? I do not think these data are publication quality and the experiment should be redone unless the authors have very good arguments against my suggestion. It would also be nice to see a 2D peptide array of GABARAPL1 binding too to make the comparative study complete.

      Response: Thank you for your constructive feedback regarding the quality of the 2D peptide array probing of GST-LC3A in Figure 3A. As you rightly pointed out, the membrane was indeed stripped and reprobed, with LC3A being the final probe. This method sometimes introduces artifacts, such as the 'ring' effect observed, which are common with this technique. However, the results consistently aligned with established consensus sequences for LC3, reinforcing the reliability of our findings despite the suboptimal image quality.

      Recognizing the concerns about the quality of the blot, we are prepared to repeat this experiment using a new commercial vendor, as our previous collaborator is no longer available. We anticipate some differences in the appearance of the blots due to changes in dot size and spacing from the new supplier. Given these variations, we propose adding the revised blot to the supplementary materials rather than the main figures to avoid disrupting the visual continuity of the data presentation.

      Additionally, in response to the reviewer’s suggestion, we will include a 2D peptide array probing for GABARAPL1. This will enhance the comparative analysis within our study.

      One alternative (related to Reviewer 3, comment 3) that we can deliver is using our LIR arrays to derive consensus sequences for LC3 binders and GABARAPL1 binders. In doing this, we find the same differences in LC3 and GABARAP binding preferences that were reported previously in Rogov et al., 2017. Recovering these known, and somewhat subtle, differences in binding preference further bolsters the validity of our approach.

      4) For the data shown in Fig 6 it should be noted that although these are very interesting results a clear limitation of the study is that the results on the autophagic turnover is based on overexpressing the SLRs in the pentaKO cells. In a physiological setting with all relevant actors in place and with a different stoichiometry the effects could likely be different.

      Response: We appreciate the observation regarding the limitations of our study due to the use of overexpressed SLRs in pentaKO cells. As the reviewer rightly points out, the stoichiometry and interaction dynamics in a physiological setting might differ significantly. Critically, after submission of this manuscript, a preprint from Sascha Martens’ group (Bauer et al., bioRxiv) showed similar results using endogenously tagged p62, TAX1BP1, and NBR1. That study corroborates our results, suggesting that the interactions we observed are not merely artifacts of overexpression but reflect genuine biological phenomena. We will incorporate a detailed discussion of this study in the Discussion section of our manuscript to contextualize our findings within a more physiologically relevant framework.

      Therefore, we believe that our reductionist approach, while not fully reflective of physiological conditions, offers valuable and generalizable insights into the intricate cooperation of SARs in autophagy.

      Minor points:

      1) It would be beneficial for the reader to show a cartoon of the domain organization of both TAX1BP1 and NBR1 in Figure 1. NBR1 is shown in supplemental figure 1, but there is no depiction of the domain organization of TAX1BP1.

      Response: As suggested, a domain schematic for NBR1 and TAX1BP1 will be included.

      2) The authors say at the bottom of page 4 "Complementary in vivo studies reveal that while SLRs typically compete". But do they actually typically compete? Is this not a result of the experimental strategies employed? There is more a shortage of SLRs based on cargo competition as shown recently by Peter Kim's group that excessive pexophagy may reduce mitophagy etc. (Germain et al. 2023).

      Response: Thank you for pointing out this overstatement. We will soften this statement.

      3) In Fig. 3D it should be shown that D, E, A and V are preferred residues at position +1 for LC3A binding.

      Response: As suggested, we will amend the figure to include these residues at the +1 position.

      4) In such a 2D mutational analysis it is often just as important to determine which residues are not allowed for binding. It would therefore be nice if the authors could summarize/visualize their results in a better way in Fig 3D to also show the residues that lead to loss of binding. These could be shown below the sequence and the use of color to distinguish basic, acidic, hydrophobic and aromatic residues could be attempted.

      Response: As suggested, we will expand this figure to make it more comprehensive by including both the preferred residues and those that lead to loss of binding. Furthermore, we have incorporated the use of color to distinguish the traits of the residues (basic, acidic, hydrophobic and aromatic) that are (dis)favored at each position.

      5) Line 327: To be clear about the fact that this is an overexpression assay, "simultaneous expression" should be corrected to "simultaneous overexpression".

      Response: We will make the suggested change.

      6) There are LIRs and FIRs that overlap and those that do not. To check the degree of overlaps that may occur among known LIRs the authors made a peptide array with 100 established LIR sequences taken from the LIR-Central database (Chatzichristofi et al., 2023). The peptide array was probed with LC3A (29 bound), GABARAPL1 (49 bound), the FIP200 Claw domain (57 bound) and the TAX1BP1 CC2 domain (49 bound). As much as one third (32) of the LIR peptides were not bound by any of the four probes. Do the authors have a good explanation for the fact that so many peptides did not bind?

      Response: Thank you for highlighting the significant number of LIR peptides that did not bind to any of the probes in our study. At first, we were similarly surprised by this. In our manuscript, we will expand on several factors that might explain this observation:

      • Specificity of Atg8 Family Proteins: The LIR-Central database indicates that these sequences bind at least one Atg8-family protein, but not necessarily all. Our assay might not have included the specific Atg8 proteins that some LIRs preferentially bind.
      • Peptide Solubility and Conformation: The solubility and conformational stability of peptides printed on an array can vary, affecting binding efficiency. Certain sequences may not adopt the optimal conformation for binding under these assay conditions.
      • Sequence Context and Accessibility: The native context in which the LIR motif is contained, including neighboring amino acids, can influence binding. Peptide arrays strip these peptides of their physiological context. As short linear interaction motifs, the assumption is that context will not strongly affect binding, but it’s known that many LIRs adopt partially structured motifs that influence binding (e.g. a C-terminal helix). Our peptide array approach is likely to impede such secondary structures from forming and may limit binding.
      • Misannotated Sequences: The LIRs included from the database have varying levels of validation. Some sequences might be misannotated and, therefore, do not bind any of the probes.

      These discussion points will be included in the manuscript to provide a comprehensive explanation for the observed data.

      7) Strangely enough, the NBR1 peptide used in Figure 2A did not bind any of the probes while the NBR1 peptides used in Fig. 1C bound very well. Do the authors have any explanation for this?

      Response: Thank you for noting the discrepancy in NBR1 peptide binding observed in Figure 2A compared to Figure 1C. This observation was noted by all reviewers. The difference likely arises from the solubility issues associated with the NBR1 peptide in the format used for Figure 2A, where the peptide sequence included the LIR motif plus 10 amino acids on each side. The core LIR sequence of NBR1 (YIII) is highly hydrophobic, which can affect its solubility and, consequently, its observed binding in our peptide array.

      To overcome this, we optimized the LIR sequence of NBR1 for peptide arrays (amino acids 725-749), which includes seven residues before the LIR and 14 residues after. This shift enhanced solubility and facilitated more reliable probing in our experiments (notably Fig. 3). In Fig. 2A and other assays, both the standard and the optimized formats of the NBR1 LIR were included: the standard format to maintain consistency with other LIRs extracted from the LIR-Central database and the optimized version as a control to validate our results.

      We will detail this explanation in the manuscript, clarifying the rationale behind the observed binding differences.


      Significance

      I found this paper very interesting to read with a lot of interesting new detailed and useful information on binding specificity for the proteins and motifs involved. It is a generally well performed study with interesting results. I also very much enjoyed the Discussion section which opens up for several interesting possible scenarios. The study also produced important point mutants that can be used in future studies to selectively abolish TAX1BP1 binding to NBR1. I think this is a "must read" paper for researchers interested in selective autophagy and co-operation between SARs, and more generally for getting some insight into how SLiMs may work. As such, this paper will be of interest for all interested in autophagy research and for a wider audience too as it is in essence about how overlapping SLiMs may be employed to orchestrate multiple protein-protein interactions using distinct overlapping determinants, rather than a single, convergent, SLiM. It is also one of the very few papers I have come across exploiting the power of the peptide array method so extensively with success for mapping protein binding sites.

      It could perhaps be interesting if the authors discussed their results in relation to another study from the group of Sascha Martens on the role of TAX1BP1 in p62 bodies or condensates (doi: https://doi.org/10.1101/2024.05.17.594671). These two papers should be read together as they are both very interesting and important contributions.

      Response: Thank you for pointing out this important reference that was posted shortly after our manuscript was submitted. As mentioned above, we will include an expanded discussion section to discuss these corroborating findings. We will also include a citation to Ferrari et al (PMID: ) on Tau evasion of autophagy through exclusion of TAX1BP1.

      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: In this manuscript, North et al. examined how short linear interaction motifs (SLiMs) help to orchestrate the function of selective autophagy receptors (SARs) during cargo engulfment in autophagosomes. In particular, the authors focused on NBR1 as a model SAR to address its role in the clearance of protein aggregates (aggrephagy). Using binding assays, the authors showed that a SLiM harboring NBR1's LIR motif also mediates binding to FIP200 and TAX1BP1. Intrigued by these overlapping binding sites, the authors probed 100 LIRs for their binding to TAX1BP1's coiled-coil 2 region (CC2), FIP200's claw domain and two different ATG8 family members and found heterogeneous binding patterns and distinct correlations between these four binding partners. Using mutational peptide arrays of NBR1's SLiM, the authors revealed unique binding determinants of these NBR1 partners and their potential differential regulation by phosphorylation. Taking advantage of their new NBR1 binding insights, the authors structurally modeled the binding of TAX1BP1's CC2 to NBR1's SLiM and identified crucial residues in both proteins for this interaction. Lastly, the authors turned to autophagy flux assays in cells and showed that TAX1BP1 acts synergistically with NBR1 to increase its lysosomal delivery. Overall, the claims and the conclusions are largely supported by the data. However, a few critical issues should be addressed.

      Are the data and the methods presented in such a way that they can be reproduced?

      Are the experiments adequately replicated and statistical analysis adequate?

      Major comments

      1) What are the expression levels of the different tf-SAR fusions compared to the endogenous levels of the respective SAR? And are tf-NBR1 protein levels changed upon co-expression of the other SARs?

      Response: We appreciate the questions concerning the expression levels of tf-SAR fusions relative to the endogenous levels of the respective SARs, similar to inquiries from Reviewer 1 (major comment 4). In our study, the levels of tf-NBR1 are notably higher than the endogenous levels. Interestingly, we observed that the co-expression of autophagy-competent NBR1 and TAX1BP1 generally leads to a decrease in the levels of both proteins, likely due to enhanced autophagic turnover. This pattern is not seen with autophagy-deficient mutants, suggesting a functional interaction affecting protein stability.

      Furthermore, a recent preprint by Sascha Martens’ group (Bauer et al., BioRxiv) has presented findings that echo our results using endogenously tagged versions of p62, TAX1BP1, and NBR1. This study supports our observations, indicating that the interactions and effects we report are not artifacts of overexpression but are reflective of genuine biological processes. These findings will be thoroughly discussed in the Discussion section of our manuscript to provide context for our results within a physiologically relevant framework.

      Therefore, we believe that our reductionist approach, while not fully reflective of physiological conditions, offers valuable and generalizable insights into the intricate cooperation of SARs in autophagy.

      2) Which of the 100 LIRs have been shown to specifically bind LC3A or GABARAPL1? The authors should include this information from the literature in Figure 2 (e.g., highlighted by color or else).

      Response: Thank you for your suggestion to detail the specific interactions between the 100 LIRs and Atg8 homologs like LC3A and GABARAPL1 in Figure 2. While each LIR in the LIR-Central database has been validated, detailed information on which LIRs bind specific Atg8 homologs, and with what relative affinity, is often lacking in the literature. This gap makes it challenging to present comprehensive binding preferences in a visually coherent way within Figure 2.

      Nevertheless, we recognize the value of such information. We plan to conduct a thorough literature review on all 100 LIRs included in our study. Should we find sufficient and reliable data regarding binding specificities, we will incorporate this into Figure 2, potentially using color coding or another method to highlight these relationships clearly.

      We can also perform the reciprocal experiment by using our LIR arrays to derive consensus sequences for LC3 binders and GABARAPL1 binders. In doing this, we find the same differences in LC3 and GABARAP preferences that were reported previously in Rogov et al 2017. Recovering these known, and somewhat subtle, differences in binding preference further bolsters the validity of our approach. These new data will be added to the manuscript.


      3) How effective is the stripping of the peptide array? The authors should provide evidence that there is no carry over binding from sequential probing the array. As a control, the authors should at least repeat probing for the last binder in their sequential binding assay with a new peptide array that has not yet been incubated with a different binder and then stripped.

      Response: This is an important question, related to Reviewer 1 (comment 3), as the stripping of the peptide array can be variably effective. Prior to performing any of the arrays included in this manuscript, we ran several validation arrays to identify the proper ordering of probes (e.g. which proteins can be stripped and which cannot). FIP200 and TAX1BP1 probing was performed on fresh or successfully stripped blots. LC3A probing was done last, as there is substantial previous literature defining the LC3 motif. The results of the LC3A binding consistently aligned with established consensus sequences for LC3, reinforcing the reliability of our findings despite the stripping process. Therefore, while stripping sometimes introduces artifacts, such as the 'ring effect' observed in Figure 3A, the results did not appear to be influenced by prior probes.

      As suggested, we are prepared to repeat the LC3A probing on a new array to fully cement this interpretation. We note, however, that this will be done using a new commercial vendor, as our previous collaborator is no longer available (The original blots were ordered over 3 years ago). We anticipate some differences in the appearance of the blots due to changes in dot size and spacing from the new supplier. Given these variations, we propose adding the revised blot to the supplementary materials rather than the main figures to avoid disrupting the visual continuity of the data presentation.

      4) What is the number of replicates for the peptide array assays?

      Response: Due to cost considerations, peptide array assays in our study were conducted as one or two replicates. We understand the limitations this presents in terms of statistical robustness and variability assessment. However, where possible, we supplemented these assays with additional validation experiments and controls to ensure reliability of our findings. For critical experiments, including key interaction validations, we used independent biochemical assays to confirm the results obtained from the peptide arrays.

      5) The authors should test whether the enhancement of NBR1 flux by TAX1BP1 is only due to the contribution of an additional LIR or potential other functions of TAX1BP1 (e.g. ubiquitin binding or FIP200 binding). The authors should expand the panel shown in Figure 6E with TAX1BP1 mutant which are deficient in ubiquitin or FIP200 binding.

      Response: We thank the reviewer for their suggestion. We will include data with TAX1BP1 mutants that are deficient in ubiquitin or FIP200 binding.

      Minor comments

      6) Molecular weight markers are missing on immunoblots.

      Response: We apologize for this oversight. We will amend the figures to include molecular weight markers.

      7) It would be more informative (since some proteins have more than one LIR) if the actual LIR motif would be displayed next to the peptide array (as e.g. done for NBR1) and not only in the supplements.

      Response: We appreciate this thoughtful input and will consider its implementation carefully. We will explore the feasibility of integrating this detail in a manner that maintains figure clarity.

      8) Along this line in Figure 2A, NBR1's LIR (marked with a red star) is among the LIRs for which no binding was observed. The authors should explain this.

      Response: Thank you for noting the discrepancy in NBR1 peptide binding observed in Figure 2A compared to Figure 1C. This observation was noted by all reviewers. The difference likely arises from the solubility issues associated with the NBR1 peptide in the format used for Figure 2A, where the peptide sequence included the LIR motif plus 10 amino acids on each side. The core LIR sequence of NBR1 (YIII) is highly hydrophobic, which can affect its solubility and, consequently, its observed binding in our peptide array.

      To overcome this, we optimized the LIR sequence of NBR1 for peptide arrays (amino acids 725-749), which includes seven residues before the LIR and 14 residues after. This shift enhanced solubility and facilitated more reliable probing in our experiments (notably Fig. 3). In Fig. 2A and other assays, both the standard and the optimized formats of the NBR1 LIR were included: the standard format to maintain consistency with other LIRs extracted from the LIR-Central database and the optimized version as a control to validate our results.

      We will detail this explanation in the manuscript, clarifying the rationale behind the observed binding differences.


      Significance

      Collectively, the work of North and colleagues provides valuable new mechanistic insights into the network of interactions that governs the function of SARs. Importantly, this work extends the knowledge in the field that SARs act in an orchestrated manner, which reinforces their delivery to lysosomes. However, given the involvement of several SARs in the same process, it is crucial to dissect the binding modalities among these factors. In this regard, the current study on fine mapping of binding sites provides an important contribution, in particular by probing the in vitro findings in reconstituted KO cells. This part is really strong. In addition, the identification of critical residues for these binding events provides important tools for the autophagy community, which will be among the basic research audience most interested in this technical study.


    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The study entitled "Rifampicin tolerance and growth fitness among isoniazid-resistant clinical Mycobacterium tuberculosis isolates: an in-vitro longitudinal study" by Vijay et al. provides valuable insights into the association of rifampicin tolerance and growth fitness with isoniazid resistance among clinical isolates of M. tuberculosis. Antibiotic tolerance in M. tuberculosis is an important topic since it contributes to the lengthy and complicated treatment required to cure tuberculosis disease and may portend the emergence of antibiotic resistance. The authors found that rifampicin tolerance was correlated with bacterial growth, rifampicin minimum inhibitory concentrations, and isoniazid-resistance mutations.

      Strengths:

      The large number of clinical isolates evaluated and their longitudinal nature during treatment for TB (including exposure to rifampin) are strengths of the study.

      Weaknesses:

      Some of the methodologies are not well explained or justified and the association of antibiotic tolerance with growth rate is not a novel finding. In addition, the molecular mechanisms underlying rifampicin tolerance only in rapidly growing isoniazid-resistant isolates have not been elucidated and the potential implications of these findings for clinical management are not immediately apparent.

      We thank the reviewer for the comments. We have modified the Methods section and Figure 1 to clarify the method, as suggested by the reviewer.

      Although we agree that previous studies have shown the association of slow growth rate with antibiotic tolerance, ours is, to our knowledge, the most comprehensive assessment of rifampicin tolerance among clinical isolates. In particular, we show that the degree of tolerance in clinical isolates can vary over several orders of magnitude, which had not been previously documented or appreciated. Furthermore, the association of high tolerance with IR isolates is a new finding, and given the potential for tolerance to increase the risk of de novo drug resistance, our study suggests that IR isolates with high rifampicin tolerance may present a risk for development of MDR-TB.

      In addition, we have analysed the longitudinal isolates and the genetic variants emerging in them that are associated with an increase in rifampicin tolerance. This analysis reveals multiple possible pathways to increased rifampicin tolerance among clinical M. tuberculosis isolates. A possible clinical implication is that high rifampicin tolerance combined with isoniazid resistance may be a risk factor for tuberculosis treatment failure. This study provides a basis for further clinical studies evaluating the role of rifampicin tolerance in IR isolates and treatment outcome. We have focused on these aspects in the discussion of the revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This study by Vijay and colleagues addresses a clinically important, and often overlooked, aspect of TB treatment. Variation in the level of antibiotic tolerance amongst otherwise antibiotic-susceptible isolates is difficult to screen for routinely, and consequently such screening is not performed. The authors present a convincing argument that there is indeed significant variation in the susceptibility of isoniazid-resistant strains to killing by rifampicin, in some cases at the same tolerance levels as bona fide resistant strains. On the whole, the study is easy to follow and the results are justified. This work should be of interest to the wider TB community at both a clinical and basic level.

      Weaknesses:

      The manuscript is long, repetitive in places, and the figures could use some amending to improve clarity (this could be a me-specific issue as they look ok on my screen, yet the colour is poor when printed).

      We thank the reviewer for the comments. We have modified the revised manuscript as per the reviewer's suggestions.

      It would have been great to have seen some correlation between increased rifampicin tolerance and treatment outcome, although I'm not sure if this data is available to the researchers. I agree with the researchers the use of a single media condition is a limitation. However, this is true of a lot of studies. Rifampicin tolerance and treatment outcome analysis.

      We agree with the reviewer that correlation between rifampicin tolerance and treatment outcome is important. This needs to be performed in future studies with better design to correlate rifampicin tolerance with treatment progression or outcome data.  

      Reviewer #3 (Public Review):

      Summary:

      The authors have initiated studies to understand the molecular mechanisms underlying the development of multi-drug resistance in clinical Mtb strains. They demonstrate the association of isoniazid-resistant isolates by rifampicin treatment supporting the idea that selection of MDR is a microenvironment phenomenon and involves a group of isolates.

      Strengths:

      The methods used in this study are robust and the results support the authors' claims to a major extent.

      Weaknesses:

      The manuscript needs a thorough vetting of the language. At present, the language makes it very difficult to comprehend the methodology and results.

      We thank the reviewer for the comments. We have revised the manuscript as per the reviewer’s suggestions.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) Methods: The authors attempt to differentiate between "fast"- and "slow"-growing bacteria in order to determine if the growth rate is associated with rifampicin tolerance. This is accomplished by assessing growth on solid agar at 15 and 60 days post-incubation, respectively. However, mycobacterial growth rate is not a binary phenomenon but rather a continuous variable. Moreover, it is not clear why 15 and 60 days were selected. Also, instead of a "slow growth" phenotype, the 60-day time point might simply reflect a longer lag phase. Were the plates examined at any interval time points? It would be interesting to know whether colony growth was delayed overall in the populations observed only at 60 days, or simply if the appearance of microcolonies visible to the naked eye was delayed (with normal growth afterwards).

      We thank the reviewer for the comments. We want to clarify that we did not use agar plates but the most-probable-number (MPN) method to determine the survival fraction after antibiotic treatment. We have clarified this in the revised manuscript and in the revised Figure 1. The MPN method is a binary measure (growth/no growth) and therefore cannot differentiate between a long lag time and other mechanisms. In our original analysis, we included an intermediate time point of 30 days, but these data (included as Supplementary Figure 1) cannot address the issue of lag phase directly. Since the 30-day time point did not add to the overall analysis and interpretation, we had not included it in the original submission.
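
      To illustrate why an MPN readout is binary per well yet still yields a quantitative estimate, the sketch below computes the standard maximum-likelihood MPN for a dilution series. The response does not state how MPN values were calculated in the study, so this is only an assumption about a typical procedure; the function name, well counts, and volumes are hypothetical.

```python
import math

def mpn_estimate(volumes, n_wells, n_positive):
    """Maximum-likelihood most-probable-number estimate (organisms per unit volume).

    volumes[i]    : amount of original sample inoculated per well at dilution i
    n_wells[i]    : number of wells inoculated at dilution i
    n_positive[i] : number of those wells showing growth (binary readout)

    Hypothetical helper for illustration; not the study's own code.
    """
    if sum(n_positive) == 0:
        return 0.0          # no growth anywhere: estimate is zero
    if all(p == n for p, n in zip(n_positive, n_wells)):
        return math.inf     # growth everywhere: no finite estimate

    target = sum(n * v for n, v in zip(n_wells, volumes))

    def score(lam):
        # Score equation: sum_i p_i v_i / (1 - exp(-lam v_i)) = sum_i n_i v_i
        lhs = sum(p * v / -math.expm1(-lam * v)
                  for p, v in zip(n_positive, volumes) if p)
        return lhs - target

    lo, hi = 1e-12, 1.0
    while score(hi) > 0:    # expand the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(200):    # plain bisection; score() decreases with lam
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: 10 wells at a single dilution, 5 showing growth.
estimate = mpn_estimate([1.0], [10], [5])
```

      A survival fraction could then be expressed as the log10 ratio of the estimate after treatment to the estimate before treatment, although whether the study used exactly this ratio is not stated in the response.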

      (2) Methods/Results/Discussion: Some important clinical information is missing-how were the patients treated who had IR isolates? Did they receive the standard regimen for DS TB or was another drug substituted for isoniazid? Exposure to different drugs could affect the rifampicin-tolerant populations during the intensive phase (Figure 5).

      Thank you for this comment. We have included the information regarding the treatment regimen in the revised manuscript.

      Were there differences in microbiological (sputum culture conversion rate at 8 weeks or time to culture negativity) or clinical outcomes based on isoniazid susceptibility? Perhaps more importantly, were there differences in microbiological/clinical outcomes based on the proportion of bacterial subpopulations with rifampicin tolerance for a particular isolate? There should be more discussion on the potential clinical implications of the study's findings.

      We agree with the reviewer that correlation between rifampicin tolerance and treatment progression or outcome is important. This needs to be performed in future studies with better design to correlate rifampicin tolerance with treatment progression or outcome data.  

      (3) Results (Figure 3A): Although an interesting finding, the increased rifampicin tolerance observed only in the "rapidly" growing populations of isoniazid-resistant isolates (IR) vs. isoniazid-susceptible (IS) isolates is not explained. In contrast, equally, increased rifampicin tolerance is seen in the "slowly" growing populations of both IR and IS isolates. It would be interesting to know if these slowly growing populations show specific tolerance to rifampicin or if, as expected, slow growth confers tolerance to a range of different bactericidal antibiotics.

      We thank the reviewer for the suggestions. We agree that these would be interesting to investigate in a future study, but they are outside the scope of the current study.

      (4) Results (Figure 3B): The basis for the classification into tertiles is not clear and appears somewhat arbitrary-does this represent the survival of a particular isolate following rifampicin exposure relative to the other isolates based on isoniazid susceptibility (IS or IR) or the % growth relative to other populations for the same isolate? Figure 3B is missing a y-axis label. Is it a log10 MPN ratio?

      We thank the reviewer for pointing this out. We want to clarify that, for the classification into tertiles, we first pooled both groups of isolates, isoniazid-susceptible (IS) and isoniazid-resistant (IR), into a single population. We then categorized this unified population into three distinct groups (low, medium, and high) based on survival fraction following rifampicin treatment. Consequently, the 'low,' 'medium,' and 'high' tertiles represent the survival of each isolate following rifampicin exposure relative to the total set of isolates, combining both IS and IR isolates.

      For clarity, we provide a breakdown of the criteria for each tertile:

      + Low tertile: Consists of isolates with the lowest survival fractions (bottom 25%).

      + Medium tertile: Encompasses isolates with survival fractions that fall between the bottom 25% and the top 25%.

      + High tertile: Comprises isolates with the highest survival fractions (top 25%).

      We have modified the revised manuscript to clarify this.

      We have also modified the Figure 3B to correct the y-axis label.
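
      The pooling and grouping described in this response can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the isolate names and survival values are hypothetical, and the cut-offs default to the 25% and 75% quantiles stated in the response (strict tertiles would instead use 1/3 and 2/3, which can be passed as arguments).

```python
def percentile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, with q in [0, 1]."""
    if not sorted_vals:
        raise ValueError("no values supplied")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def classify_pooled(survival_by_isolate, low_q=0.25, high_q=0.75):
    """Assign each isolate to 'low', 'medium', or 'high' rifampicin survival.

    IS and IR isolates are pooled first, so the cut-offs are computed on the
    combined population rather than within each group.
    """
    pooled = sorted(survival_by_isolate.values())
    low_cut = percentile(pooled, low_q)
    high_cut = percentile(pooled, high_q)
    groups = {}
    for name, value in survival_by_isolate.items():
        if value <= low_cut:
            groups[name] = "low"
        elif value >= high_cut:
            groups[name] = "high"
        else:
            groups[name] = "medium"
    return groups

# Hypothetical log10 survival fractions for a pooled set of IS and IR isolates.
survival = {"IS-1": -4.0, "IS-2": -3.0, "IR-1": -2.0, "IR-2": -1.0, "IR-3": 0.0}
groups = classify_pooled(survival)
```

      Because the cut-offs come from the pooled population, the share of IS or IR isolates landing in each group can then be compared directly between the two groups.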

      (5) Results (lines 185-186): For correlating relative growth in the absence of antibiotics, 19 clinical isolates "outliers" were removed without explanation.

      We have added an explanation for the “outliers”, which were removed earlier due to deviation from the normal distribution. We have also provided Supplementary Figure 3, which includes these outliers.

      (6) Results (lines 203-211): The authors attempted to investigate a potential association between the mechanism of M. tuberculosis isoniazid resistance and the degree of rifampicin tolerance. However, the vast majority of IR clinical isolates (n=71) had a katG_S315X mutation and only 8 isolates had alternative mutations (inhA_I21T and fabG1_C-15X). Given the wide range of rifampicin tolerance observed within these isoniazid-resistant isolates, they concluded that other genetic or epigenetic determinants must be playing a role. WGS of longitudinally collected isolates from the same patients during TB treatment yielded non-synonymous SNPs in a list of genes previously reported to be associated with persistence, tolerance, and mycobacterial survival. However, precise mechanisms (including, e.g., expression of efflux pumps) are not investigated.

      We thank the reviewer for summarising the findings. Yes, we agree that investigating the precise mechanism of rifampicin tolerance is beyond the scope of the current work.

      Minor comments:

      (1) Abstract (line 41): The nonstandard abbreviations "IR" and "IS" have not been introduced prior to this usage.

      We have modified this in the abstract.

      (2) Introduction (line 60): Insert "phenomena" or "mechanisms" after "two".

      We have modified this in the introduction.

      (3) Introduction (lines 66-69): This sentence is confusing, especially the second part ("supporting this studies...").

      We have modified the lines to clarify.

      (4) Introduction (line 84): In the current text, it appears as if "IR" is the abbreviation for "isoniazid". Therefore, I recommend changing "resistance to isoniazid" to "isoniazid resistance".

      We have modified this in the revised manuscript.

      (5) Results (line 141): Insert "the" before "rest".

      We have modified this in the revised manuscript.

      (6) Results (line 187): Replace "did not had" with "did not have".

      We have modified this in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Abstract:

      The abstract is long and repetitive. It needs reworking and shortening to improve clarity and highlight the main takeaway message.

      We thank the reviewer for the suggestions and have modified this in the revised manuscript.

      The introduction is interesting and contains relevant information. However, it is long and takes a while to get to the point of the study. It needs re-writing to emphasise key prior results and the purpose of this study.

      We thank the reviewer for the suggestions and have modified this in the revised manuscript.

      Results:

      As the study relies predominately on the use of MPN, I think a simple schematic of how the experiment is performed would be informative. Could this be added to Figure 1?

      We have revised Figure 1 in the manuscript to include a schematic representation.

      Some of the differences in MDK90, whilst they may be significant, are small so it would at least provide context as to the relevance of these differences. This may also alleviate my confusion as to how the authors can measure the time required to achieve MDK90 as 1.23-1.31 days when the first time point that is taken is day 2 (the data in Figure 2). They have FigS6 but this is small and hard to follow.

      We thank the reviewer for this suggestion. We have modified this in the revised manuscript and in Figure S6.
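
      One way an MDK90 value can fall before the first sampled time point is interpolation between the untreated baseline (day 0) and the first measurement. The response does not state which method was used, so the sketch below simply assumes linear interpolation on log10 survival; the function name and the data values are hypothetical.

```python
import math

def mdk_by_interpolation(times, log10_survival, kill_fraction=0.90):
    """Estimate the time at which the given fraction of cells has been killed.

    times          : sampling times in days, increasing, starting at 0
    log10_survival : log10 survival fraction at each time (0.0 at day 0)

    Returns None if the kill threshold is never crossed. Assumes a straight
    line between neighbouring points on the log10 scale (an assumption, not
    the study's documented method).
    """
    target = math.log10(1.0 - kill_fraction)   # about -1 for 90% kill
    points = list(zip(times, log10_survival))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 >= target >= y1 and y0 != y1:
            return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
    return None

# Hypothetical kill curve sampled at days 0, 2 and 5.
mdk90 = mdk_by_interpolation([0, 2, 5], [0.0, -1.6, -3.0])
```

      With these made-up values the threshold is crossed between day 0 and day 2, which is how an estimate earlier than the first treated sample can arise.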

      Figure 2:

      Would be helpful to have -1 on the Y axis.

      The grey dots don't print very well (Might be my printer)

      We have modified Figure 2 in the revised manuscript accordingly.

      Line 142: The authors note a difference in RIF tolerance at day 15 that disappeared by day 60. I assume they are referring to the day 5 timepoint although this isn't clear as written.

      Yes, it is referring to the day 5 time point and we have clarified this in the revised manuscript.

      The section starting at line 148 (fig 3) is interesting, but it is difficult to read and follow what the difference is between this data and the prior data in Figure 2. It also wasn't until about line 165 that the purpose became clear. Overall the conclusions are sound and interesting.

      We have modified this in the revised manuscript.

      Line 154: What are the early and late time recovery time points?

      Is Figure 3A the same data as Figure 2?

      We have clarified this in the revised manuscript. Figure 3A shows the same data as Figure 2.

      I found Figure 6 hard to follow. I'm not sure how better to present this data, but it should be improved. Some further clarification in the text would be helpful.

      We thank the reviewer for the suggestions. We have added more explanation in the text to clarify Figure 6.

      Conclusions:

      The conclusions are sound, based on the data presented. The clinical relevance is highlighted, yet appropriately phrased to not be too far-reaching.

      Again, I think the conclusions could be condensed considerably. It is repetitive in places, which dilutes the main outcomes of this otherwise interesting and important study. The authors appropriately highlight some of the limitations of their study.

      We thank the reviewer for these comments and have modified this in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript "Rifampicin tolerance and growth fitness among isoniazid-resistant clinical Mycobacterium tuberculosis isolates: an in-vitro longitudinal study" by Srinivasan et al. details the identification/development of isoniazid-resistant strains in clinical isolates following treatment with rifampicin. This is an important aspect of understanding MDR development in TB strains. The results are promising and gel well with the hypothesis. However, the manuscript requires a thorough language modification. While the overall idea is clear, the methodology does not come out clearly.

      Specific comments:

      (1) It is not clear whether rifampicin treatments were given for 2 and 5 days before kill curves or for 15 and 60 days. The methodology needs to be phrased clearly. Why was this time interval of 15 days and 60 days taken? Is there a rationale for this?

      We thank the reviewer for the suggestions. We have modified the method and Figure 1 to clarify this in the revised manuscript.

      (2) A concentration of 2 µg/mL was used for in vitro culture in this study. While the authors themselves indicate that this is well above the MIC, this might represent a non-natural dose and hence may force the evolution of strains. What will be the scenario in the natural course of antibiotic treatment (dose at MIC or less than MIC)?

      We observed no significant emergence of resistance up to day 5; resistance emerges only after day 5. We therefore avoided determining the survival fraction after resistance emergence, so the kill curve mostly represents the tolerant subpopulation. Pharmacokinetic studies of rifampicin dosing suggest that peak concentrations of >2-32 µg/mL are typical for standard doses of the drug; we therefore believe the chosen concentration of 2 µg/mL to be physiologically relevant.

      (3) As described in line 155, the survival spanned a broad distribution, across a million times in difference. This is rather surprising that 5 days of rifampicin treatment would lead to such a spread in resistance patterns. Did the authors study the different populations to understand this phenomenon? This is important given the scale of resistance developed in this short time.

      We want to clarify that the broad range of survival fractions reflects differences in the rifampicin-tolerant subpopulations, not the resistant subpopulations, because survival is determined after rifampicin treatment in rifampicin-free media. This has been clarified in the revised Figure 1.

      Overall, the manuscript is a detailed study with new insights into the development of multi-drug resistance by Mtb. A thorough vetting for language is essential for a greater impact of the study.

      We thank the reviewer and have attempted to improve the clarity of the language to increase the potential impact of our findings.

    1. Author response:

      The following is the authors' response to the current reviews.

      Reviewer #1 (Public Review):

      I'll begin by summarizing what I understand from the results presented, and where relevant how my understanding seems to differ from the authors' claims. I'll then make specific comments with respect to points raised in my previous review (below), using the same numbering. Because this is a revision I'll try to restrict comments here to the changes made, which provide some clarification, but leave many issues incompletely addressed.

      As I understand it the main new result here is that certain recurrent network architectures promote emergence of coordinated grid firing patterns in a model previously introduced by Kropff and Treves (Hippocampus, 2008). The previous work very nicely showed that single neurons that receive stable spatial input could 'learn' to generate grid representations by combining a plasticity rule with firing rate adaptation. The previous study also showed that when multiple neurons were synaptically connected their grid representations could develop a shared orientation, although with the recurrent connectivity previously used this substantially reduced the grid scores of many of the neurons. The advance here is to show that if the initial recurrent connectivity is consistent with that of a line attractor then the network does a much better job of establishing grid firing patterns with shared orientation.

      Beyond this point, things become potentially confusing. As I understand it now, the important influence of the recurrent dynamics is in establishing the shared orientation and not in its online generation. This is clear from Figure S3, but not from an initial read of the abstract or main text. This result is consistent with Kropff and Treves' initial suggestion that 'a strong collateral connection... from neuron A to neuron B... favors the two neurons to have close-by fields... Summing all possible contributions would result in a field for neuron B that is a ring around the field of neuron A.' This should be the case for the recurrent connections now considered, but the evidence provided doesn't convincingly show that attractor dynamics of the circuit are a necessary condition for this to arise. My general suggestion for the authors is to remove these kinds of claims and to keep their interpretations more closely aligned with what the results show.

      We would like to clarify that the simple (flexible) attractor is a weaker condition than the ones previously used to align grid cells. However, by no means do we claim that it is a necessary condition for grid maps to align. Other architectures, certainly more complex ones but perhaps even simpler ones, can align grid maps in our model.

      Major (numbered according to previous review)

      (1) Does the network maintain attractor dynamics after training? Results now show that 'in a trained network without feedforward Hebbian learning the removal of recurrent collaterals results in a slight increase in gridness and spacing'. This clearly implies that the recurrent collaterals are not required for online generation of the grid patterns. This point needs to be abundantly clear in the abstract and main text so the reader can appreciate that the recurrent dynamics are important specifically during learning.

      We respectfully disagree with the interpretation of this result. In this model cells self-organize to produce aligned grid maps. In such systems it makes sense to characterize the equilibrium states of the system. We turned learning off in Figure S3 to show that the recurrent connections have a contractive effect on grid spacing. But artificially turning off learning means that one can no longer make claims about the equilibrium states of the system, since it can no longer evolve freely. In a functional network, if the recurrent attractor is removed, the system will evolve towards poor gridness and no alignment no matter what the starting point is, as also shown in Figure S3. Several experimental results invite us to think of grid cells as the equilibrium solution of a series of constraints that is ready to change at any time: Barry et al, 2012; Yoon et al, 2013; Carpenter et al, 2015; Krupic et al, 2015; Krupic et al, 2018; Jayakumar et al, 2019.

      One point in which we perhaps agree with the reviewer is that information about the hexagonal maps is kept in the feedforward weights, while behavior and the recurrent collaterals act as constraints of which these feedforward weights are the equilibrium solution.

      (2) Additional controls for Figure 2 to test that it is connectivity rather than attractor dynamics (e.g. drawing weights from Gaussian or exponential distributions). The authors provide one additional control based on shuffling weights. However, this is far from exhaustive and it seems difficult on this basis to conclude that it is specifically the attractor dynamics that drive the emergence of coordinated grid firing.

      Again, we do not claim that this is the only way in which grid maps can be aligned, but it is the simplest one proposed so far. We were asked if it was the specific combination of input weights to a cell rather than the organization provided by the attractor which resulted in aligned maps. By shuffling the inputs to a cell we keep the combination of inputs invariant but lose the attractor architecture. Since grid maps in this new situation are not aligned, we can safely conclude that it is not the combination of inputs per se, but the specific organization of these inputs that allows grid alignment. It is not fully clear to us what ‘exhaustive’ means in this context.
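
      The logic of this shuffling control can be illustrated with a minimal sketch. It is not the simulation code used in the manuscript: the function names, the Gaussian weight profile, the network size and the width `sigma` are all hypothetical choices made only for illustration. The sketch builds ring-like recurrent weights, permutes each cell's incoming weights, and checks that the set of weight values per cell is unchanged while the neighbourhood structure along the ring is lost.

```python
import math
import random


def ring_weights(n_cells, sigma):
    """Hypothetical recurrent weights decaying with distance along a 1D ring.

    weights[i][j] is the weight from cell j onto cell i; no self-connections.
    """
    weights = []
    for i in range(n_cells):
        row = []
        for j in range(n_cells):
            d = min(abs(i - j), n_cells - abs(i - j))  # circular distance
            row.append(0.0 if i == j else math.exp(-d ** 2 / (2 * sigma ** 2)))
        weights.append(row)
    return weights


def shuffle_inputs(weights, seed=0):
    """Permute each cell's incoming weights among its presynaptic partners.

    The weight values received by a cell are kept; only their assignment to
    presynaptic cells changes, which destroys the ring organisation.
    """
    rng = random.Random(seed)
    n = len(weights)
    shuffled = []
    for i, row in enumerate(weights):
        others = [j for j in range(n) if j != i]
        values = [row[j] for j in others]
        rng.shuffle(values)
        new_row = [0.0] * n  # self-connection stays at zero
        for j, w in zip(others, values):
            new_row[j] = w
        shuffled.append(new_row)
    return shuffled


def neighbour_fraction(weights, radius):
    """Fraction of total recurrent weight coming from cells within `radius` on the ring."""
    n = len(weights)
    near = total = 0.0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            d = min(abs(i - j), n - abs(i - j))
            total += w
            if 0 < d <= radius:
                near += w
    return near / total


if __name__ == "__main__":
    ring = ring_weights(100, 3.0)
    shuffled = shuffle_inputs(ring)
    print("ring:    ", neighbour_fraction(ring, 6))
    print("shuffled:", neighbour_fraction(shuffled, 6))
```

      In this toy setting the shuffled network receives exactly the same weight values per cell, so any difference in the outcome can only be attributed to how those weights are organised.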

      (3) What happens if recurrent connections are turned off? The new data clearly show that the recurrent connections are not required for online grid firing, but this is not clear from the abstract and is hard to appreciate from the main text.

      This point is related to (1). Absent this constraint, Figure S3 shows that the system evolves toward larger spacing, with poorer gridness and no alignment.

      (4) This is addressed, although the legend to Fig. S2D could provide an explanation / definition for the y-axis values.

      We have now added: Mean input fields are the sum of all inputs of a given kind entering a neuron at a given moment in time, averaged across cells and time.
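
      As a hedged illustration of this definition (the nested-list layout `inputs[t][i][j]` and the function name are assumptions made for the example, not taken from the model code), the quantity could be computed as follows:

```python
def mean_input_field(inputs):
    """Mean input field for one kind of input (e.g. feedforward or recurrent).

    inputs[t][i][j] is assumed to hold the input that cell i receives from
    presynaptic unit j at time step t. For each cell and time step the inputs
    are summed over presynaptic units; the result is averaged over cells and time.
    """
    total = 0.0
    count = 0
    for frame in inputs:           # loop over time steps
        for per_cell in frame:     # loop over cells
            total += sum(per_cell)  # sum of all inputs entering this cell
            count += 1
    return total / count


if __name__ == "__main__":
    # Two time steps, two cells, two presynaptic units each (toy numbers).
    example = [[[1, 2], [3, 4]], [[0, 0], [2, 2]]]
    print(mean_input_field(example))
```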

      (5) Given the 2D structure of the network input it perhaps isn't surprising that the network generates 2D representations and this may have little to do with its 1D connectivity. The finding that the networks maintain coordinated grids when recurrent connections are switched off supports my initial concern and the authors' explanation, to me at least, remains confusing. I think it would be helpful to consider that the connectivity is specifically important for establishing the coordinated grid firing, but that the online network does not require attractor dynamics to generate coordinated grid firing.

      This point is related to (1) and (3). We agree with the reviewer that the input lies within a 2D manifold, but this is not something that the network has to find out because it receives one datapoint of information at a time. This alone is not enough to form aligned grid cells, since each grid cell can find a roughly equivalent equilibrium in a different direction. It is only the constraint imposed by the recurrent collaterals that aligns grid maps, and, as we show, this constraint does not need to be constructed ad hoc to work on 2D, as previously thought. When recurrent connections are switched off, the system evolves toward unaligned grid maps, with larger spacing and lower gridness. Regarding the results obtained after modifying the network and turning off learning, we think they have a very limited scope (in this case showing the contractive effect of recurrent collaterals on grid spacing), given that the system is artificially being kept out of its natural equilibrium.

      (6) Clarity of the introduction. This is somewhat clearer, but I wonder if it would be hard for someone not familiar with the literature to accurately appreciate the key points.

      We have made our best effort to improve the clarity of the introduction.

      (7) Remapping. I'm not sure why this is ill-posed. It seems the proposed model cannot account for remapping results (e.g. Fyhn et al. 2007). Perhaps the authors could just clearly state this as a limitation of the model (or show that it can do this).

      We view our model as perfectly consistent with Fyhn et al, 2007. Remapping is not triggered by the network itself, though, but rather by a re-arrangement of the inputs requiring the network to learn new associations. Different simulations of the same model with identical parameters can be interpreted as remapping experiments.

      Reviewer #3 (Public Review):

      Summary:

      The paper proposes an alternative to the attractor hypothesis, as an explanation for the fact that grid cell population activity patterns (within a module) span a toroidal manifold. The proposal is based on a class of models that were extensively studied in the past, in which grid cells are driven by synaptic inputs from place cells in the hippocampus. The synapses are updated according to a Hebbian plasticity rule. Combined with an adaptation mechanism, this leads to patterning of the inputs from place cells to grid cells such that the spatial activity patterns are organized as an array of localized firing fields with hexagonal order. I refer to these models below as feedforward models.

      It has already been shown by Si, Kropff, and Treves in 2012 that recurrent connections between grid cells can lead to alignment of their spatial response patterns. This idea was revisited by Urdapilleta, Si, and Treves in 2017. Thus, it should already be clear that in such models, the population activity pattern spans a manifold with toroidal topology. The main new contributions in the present paper are (i) in considering a form of recurrent connectivity that was not directly addressed before. (ii) in applying topological analysis to simulations of the model. (iii) in interpreting the results as a potential explanation for the observations of Gardner et al.

      We wanted to note that we do not see this paper as proposing an alternative to the attractor hypothesis, given that we use attractor networks, but rather as an exploration of possibilities not yet visited by this hypothesis.

      Strengths:

      The exploration of learning in a feedforward model, when recurrent connectivity in the grid cell layer is structured in a ring topology, is interesting. The insight that this not only aligns the grid cells in a common direction but also creates a correspondence between their intrinsic coordinate (in terms of the ring-like recurrent connectivity) and their tuning on the torus is interesting as well, and the paper as a whole may influence future theoretical thinking on the mechanisms giving rise to the properties of grid cells.

      Weaknesses:

      (1) In Si, Kropff and Treves (2012) recurrent connectivity was dependent on the head direction tuning, in addition to the location on a 2d plane, and therefore involved a ring structure. Urdapilleta, Si, and Treves considered connectivity that depends on the distance on a 2d plane. The novelty here is that the initial connectivity is structured uniquely according to latent coordinates residing on a ring.

      The recurrent architectures in the cited works are complex and require arranging cells in a 2D manifold to calculate connectivity based on their relative 2D position. In other words, the 2D structure is imprinted in the architecture, as in our 2D condition. In this work the network is much simpler and only requires neighboring relations in 1D. Such relationships have been shown to spontaneously emerge in the hippocampal formation (Pastalkova et al, 2008; Gonzalo Cogno et al, 2024).

      (2) The paper refers to the initial connectivity within the grid cell layer as one that produces an attractor. However, it is not shown that this connectivity, on its own, indeed sustains persistent attractor states. Furthermore, it is not clear whether this is even necessary to obtain the results of the model. It seems possible that (possibly weaker) connections with ring topology, that do not produce attractor dynamics but induce correlations between neurons with similar locations on the ring would be sufficient to align the spatial response patterns during the learning of feedforward weights.

      Regarding the first part of the comment, the recurrent collaterals create one or at times multiple bumps of activity in the network so that neighboring (interconnected) cells activate together. An initial random state of activity rapidly falls into this dynamic, constrained by the attractor. To us this is not surprising given that this connectivity is the classical means of creating a continuous attractor. Perhaps there is some deeper meaning in this comment that we are not fully grasping.

      Regarding the second part of the comment, we fully agree with the reviewer. We are presenting what so far is the simplest connectivity that can align grid maps, but by no means do we claim that it is the simplest possible one. Regarding weaker connections with ring topology, we show in Figure S2 that a ring attractor with too weak or too strong connections is incapable of aligning grids, since a balance between feedforward and feedback inputs is required.

      (3) Given that all the grid cells are driven by an input from place cells that span a 2d manifold, and that the activity in the grid cell network settles on a steady state which is uniquely determined by the inputs, it is expected that the manifold of activity states in the grid cell layer, corresponding to inputs that locally span a 2d surface, would also locally span a 2d plane. The result is not surprising. My understanding is that this result is derived as a prerequisite for the topological analysis, and it is therefore quite technical.

      We understand that the reviewer is referring to the motivation behind studying local dimensionality. We agree that the topological analysis approach is quite technical, but it provides unique insights. The theorem of closed surfaces, which allows us to deduce a toroidal topology from Betti numbers (1,2,1), only applies to closed surfaces. One thus needs to show that the point cloud is a surface (local dimensionality of 2) and is closed (no borders or singularities). If borders or singularities were present, a toroidal topology could not be claimed from these Betti numbers. Thus, it is a crucial step of the analysis.
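
      The step from these Betti numbers to a torus can be written out through the Euler characteristic. This is a brief sketch under stated assumptions: the point cloud samples a closed, connected surface, and homology is computed with coefficients for which b_2 = 1 implies orientability (for instance rational coefficients; the coefficient field is an assumption here and is not specified above). Under those assumptions, the genus g follows directly:

```latex
\chi = b_0 - b_1 + b_2 = 1 - 2 + 1 = 0,
\qquad
\chi = 2 - 2g \;\Longrightarrow\; g = 1 .
```

      A closed orientable surface of genus 1 is a torus. The relation between the Euler characteristic and the genus holds only for closed surfaces, which is why local dimensionality and the absence of borders or singularities are checked first.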

      (4) The modeling is all done in planar 2d environments, where the feedforward learning mechanism promotes the emergence of a hexagonal pattern in the single neuron tuning curve. Under the scenario in which grid cell responses are aligned (i.e. all neurons develop spatial patterns with the same spacing and orientation) it is already quite clear, even without any topological analysis that the emerging topology of the population activity is a torus.

      However, the toroidal topology of grid cells in reality has been observed by Gardner et al also in the wagon wheel environment, in sleep, and close to boundaries (whereas here the analysis is restricted to a sub-region of the environment, far away from the walls). There is substantial evidence based on pairwise correlations that it persists also in various other situations, in which the spatial response pattern is not a hexagonal firing pattern. It is not clear that the mechanism proposed in the present paper would generate toroidal topology of the population activity in more complex environments. In fact, it seems likely that it will not do so, and this is not explored in the manuscript.

      We agree that our work was constrained to exploration in 2D and that the situations posed by the reviewer are challenging, but we do not see them as unsurmountable. The wagon wheel shows a preservation of toroidal topology locally, where the behavior of the animal is rather 2-dimensional. Globally, hexagonal maps are lost, which is compatible with some flexibility in the way grid maps are formed. If sleep meant that all inputs are turned off, our model would predict a dynamic dictated by the architecture (1D for the ring attractor, for example), but we do not really know that this is the case. In the future, we intend to explore predictive activity along the linear attractor, which could both result in path integration and in some level of preservation of the activity when inputs are completely turned off.

      Regarding boundaries, as we have argued before, the cited work chooses to filter away what looks like more than half of the overall explained variance through PCA, and this is only before applying a non-linear dimensionality reduction algorithm. It is specifically shown that the analyzed components are the ones with global periodicity throughout the environment. Thus, it is conceivable that through this approach, local irregularities found only at the borders are disregarded in favor of a clearer global picture. While using a different methodology, our approach follows a similar spirit, albeit with far less noisy data.

      (5) Moreover, the recent work of Gardner et al. demonstrated much more than the preservation of the topology in the different environments and in sleep: the toroidal tuning curves of individual neurons remained the same in different environments. Previous works, that analyzed pairwise correlations under hippocampal inactivation and various other manipulations, also pointed towards the same conclusion. Thus, the same population activity patterns are expressed in many different conditions. In the present model, this preservation across environments is not expected. Moreover, the results of Figure 6 suggest that even across distinct rectangular environments, toroidal tuning curves will not be preserved, because there are multiple possible arrangements of the phases on the torus which emerge in different simulations.

      We agree with this observation. Because of a symmetry in our implementation, the system falls into the preferred solution only ~50% of the time, and otherwise falls into other local minima. Whether this result is at odds with current observations can be debated on the basis of probabilities. However, we believe that the symmetry we found is purely circumstantial, and that it can be broken by elements such as head direction modulation or other ingredients used to achieve path integration. In other words, we acknowledge that symmetry is an issue of the implementation we show here (which has been kept as simple as possible to serve as a proof-of-principle) but we do not think that it is a defining feature of flexible attractors in general. We expect that future implementations that incorporate path integration capabilities will not present this kind of symmetry in the space of solutions.

      Regarding the rigid phase translation across modalities, while this effect is very clear in Gardner et al, it is less so in other datasets. The analyses shown in Hermansen et al (2024) can rather be interpreted as lying somewhere between perfect rigid translation and fully randomized phases across navigation modalities.

      (6) In real grid cells, there is a dense and fairly uniform representation of all phases (see the toroidal tuning of grid cells measured by Gardner et al). Thus, the highly clustered phases obtained in the model (Fig. S1) seem incompatible with the experimental reality. I suspect that this may be related to the difficulty in identifying the topology of a torus in persistent homology analysis based on the transpose of the matrix M.

      We partly agree with this observation and note that a pattern of ordered phases is an issue not only for the 1D attractor but also for the 2D one, which appears much more uniform than in experimental data. The low number of neurons we used for computational economy and the full connectivity could be key ingredients to generate these phase patterns. To show that this is not a defining feature of flexible attractors, apart from the fact that these patterns appear also with non-flexible 2D architectures, we included in Figure S1 simulations with ‘fragmented 1D’ architectures. In this case the architecture is a superposition of 20 random 1D stripe-like attractors. While the alignment of maps achieved with this architecture is almost at the same level as the one obtained with 1D and 2D attractors, the phases are much more similar to what has been observed experimentally, and less uniform than what is obtained with 2D attractors.

      (7) The motivations stated in the introduction came across to me as weak. As now acknowledged in the manuscript, attractor models can be fully compatible with distortions of the hexagonal spatial response patterns - they become incompatible with these spatial distortions only if one adopts a highly naive and implausible hypothesis that the attractor state is updated only by path integration. While attractor models are compatible with distortions of the spatial response pattern, it is very difficult to explain why the population activity patterns are tightly preserved across multiple conditions without a rigid two-dimensional attractor structure. This strong prediction of attractor models withstood many experimental tests - in fact, I am not aware of any data set where substantial distortions of the toroidal activity manifold were observed, despite many attempts to challenge the model. This is the main motivation for attractor models. The present model does not explain these features, yet it also does not directly offer an explanation for distortions in the spatial response pattern.

      Some interesting examples are experiments in 3D, where grid cells presumably communicate with each other through the same recurrent collaterals, but global periodicity is lost and only some local order is preserved even away from boundaries (Ginosar et al, 2021; Grieves et al, 2021). While these datasets have not been explored using topological analysis, they serve as strong motivators to understanding 2D grid cells as one equilibrium solution that arises under some set of constraints, but belongs to a wider space of possible solutions that may arise as well under more flexible constraints. Even (and especially) if one adheres to the hypothesis that grid cells are pre-wired into a 2D torus, a concept like flexible attractors might become useful to understand how their activity is rendered in 3D. Another strong motivation is our lack of understanding of how a perfectly balanced 2D structure is formed and maintained. Simpler architectures could be thought of as alternatives, but also as an intermediate step towards it.

      Regarding the rigid phase translation across modalities, while this effect is very clear in Gardner et al, it is less so in other datasets. The analyses shown in Hermansen et al (2024) can rather be interpreted as lying somewhere between perfect rigid translation and fully randomized phases.

      On a separate point, although it might not be strictly related to the comment, we do not fully share the idea that persistent activity patterns during sleep are necessary or sufficient conditions for attractor dynamics, although we do agree that attractors could be the mechanism behind them and any alternative is at least as complex as attractors. On the necessity side, attractors in the hippocampus are not constantly engaged (Wills et al, 2005). For sufficiency, one should prove that no other network is capable of reproducing the phenomenon, and to the best of our knowledge we are still far from that point.

      (8) There is also some weakness in the mathematical description of the dynamics. Mathematical equations are formulated in discrete time steps, without a clear interpretation in terms of biophysically relevant time scales. It appears that there are no terms in the dynamics associated with an intrinsic time scale of the neurons or the synapses (a leak time constant and/or synaptic time constants). I generally favor simple models without lots of complexity, yet within this style of modelling, the formulation adopted in this manuscript is unconventional, introducing a difficulty in interpreting synaptic weights as being weak or strong, and a difficulty in interpreting the model in the context of other studies.

      We chose to keep the model as simple as possible and in line with the previous publications that developed it. However, we see the usefulness of putting it in what in the meantime has become a canonical framework. Fortunately, this has been done by D’Albis and Kempter (2017). In our simplified version of the model there is no leak term and adaptation on its own brings down activity in the absence of input, but we agree that such a term could be added, albeit not without modifying all other network parameters.

      In my view, the weaknesses discussed above limit the ability of the model, as it stands, to offer a compelling explanation for the toroidal topology of grid cell population activity patterns, and especially the rigidity of the manifold across environments and behavioral states. Still, the work offers an interesting way of thinking on how the toroidal topology might emerge.

      Reviewer 1:

      Reviewer #1 (Recommendations For The Authors):

      See comments above. In addition:

      (1) Abstract: '...interconnected by a two-dimensional attractor guided by path integration'. This is unclear. I think the intended meaning might be along the lines of '...their being computed by a 2D continuous attractor that performs path integration'?

      'path integration allowing for no deviations from the hexagonal pattern' This is incorrect. Local modulation of the gain of the speed input to a standard CAN would distort the grid pattern.

      'Using topological data analysis, we show that the resulting population activity is a sample of a torus' Activity in the model?

      'More generally, our results represent a proof of principle against the intuition that the architecture and the representation manifold of an attractor are topological objects of the same dimensionality, with implications to the study of attractor networks across the brain' I guess one might hold this intuition, but it strikes me as obvious that if you impose a sufficiently strong n-dimensional input on a network then its activity could have the same dimensionality. I don't really see this as being a point worth highlighting. Perhaps the more interesting point is that during learning the recurrent connectivity aligns the grid fields of neurons in the network, and this may be a specific function of the 1D attractor dynamics, although I don't think the authors have made this point convincingly.

      'The flexibility of this low dimensional attractor allows it to negotiate the geometry of the representation manifold with the feedforward inputs'. See above for comments on the use of 'negotiate'.

      'while the ensemble of maps preserves features of the network architecture'. I don't understand this. What is the 'ensemble of maps' and what are the features referred to?

      We have reviewed the abstract considering these points. Regarding the ‘strong n-dimensional input’, we want to point out that it is not the input itself that generates a torus (the no attractor condition does not lead to a torus) but rather the interplay between the input and the attractor.

      ‘Perhaps the more interesting point …’, we do not fully understand how this sentence deviates from our own conclusions. We here show that a strong n-dimensional input is not enough to align grid cells (produce an n-torus); it is the interplay between inputs and attractor dynamics that does so, even if the attractor is not n-dimensional in terms of architecture.

      The ensemble of maps refers to the transpose of the population activity matrix, where each point in the cloud is a map, and the features refer to the persistent homology.
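
      The relation between the two point clouds can be sketched as follows. This is a toy illustration, not the authors' code: the matrix values are made up, and the orientation (rows as positions, columns as neurons) is an assumption chosen for the example.

```python
# Hypothetical population activity matrix: 4 spatial bins x 3 neurons.
# Each row is one point in the population cloud (a population vector at one position).
population_activity = [
    [0.0, 0.3, 0.7],
    [1.0, 0.0, 0.4],
    [0.5, 0.9, 0.0],
    [0.2, 0.1, 0.8],
]

# Transposing gives the ensemble of maps: 3 neurons x 4 spatial bins.
# Each row is now one point in the cloud, namely the rate map of a single neuron.
ensemble_of_maps = [list(column) for column in zip(*population_activity)]

print(len(population_activity), "population vectors of dimension", len(population_activity[0]))
print(len(ensemble_of_maps), "maps of dimension", len(ensemble_of_maps[0]))
```

      Persistent homology can then be computed on either cloud; only the choice of what counts as a point changes.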

      (2) The manuscript still fails to clarify the difference between a model that path integrates in two dimensions and a model that simply represents information with a given dimensionality. The argument that it's surprising that a network with 1D architecture represents a higher dimensional input strikes me as incorrect and an unnecessary attempt to argue for conceptual importance. At least to me this isn't surprising. It would be surprising if the 1D network could path integrate but this doesn't seem to be the case.

      In response to the reviewer’s concerns, we have made clear in the introduction and discussion that this model has no path integration capabilities, although we aim to develop a model capable of path integration using the kind of simple architecture presented here. We want to highlight here that equating attractor dynamics with path integration would be a conceptual mistake.

      (3) Other wording also seems to make unnecessary conceptual claims. E.g. The repeated use of 'negotiate' implies some degree of intelligence, or at least an exchange of information, that isn't shown to exist. I wonder if more precise language could be used? As I understand it the dimensionality is bounded by the inputs on the one hand, and the network connectivity on the other, with the actual dimensionality being a function of the recurrent and feedforward synaptic weights. There's clearly some role for the relative weights and the properties of plasticity rules, but I don't see any evidence for a negotiation.

      An interesting observation in Figure S2 is that grid maps are aligned only if the relative strength of feedforward and recurrent inputs is similar. If one of them dominates the other, grid maps do not align. This equilibrium can metaphorically be thought of as an instance of negotiation, where the negotiation is an emergent property of the system rather than something happening at an individual synapse.


      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Reviewer #1 (Recommendations For The Authors):

      Major

      (1) What is the evidence that, after training, the 1D network maintains its attractor dynamics when feedforward inputs are active? If the claim is that it does then it's important to provide evidence, e.g. responses to perturbations, or other tests. The alternative is that after training the recurrent inputs are drowned out by the feed forward spatial inputs.

      We agree with the reviewer on the importance of this point. In our model, networks are always learning, and the population activity represented by aligned grid maps in a trained network is a dynamic equilibrium that emerges from the interplay between feedforward and collateral constraints. If Hebbian learning is turned off, one gets a snapshot of the network at that moment. We now show in Fig. S3 that in a trained network without feedforward Hebbian learning the removal of recurrent collaterals results in a slight increase in gridness and spacing. The expansion is due to the fact that, as we argue in the Results section, the attractor has a contractive effect on grid maps, which could relate to observations in novel environments (Barry et al, 2007). If Hebbian learning is turned on in the same situation, the maps, no longer constrained by the attractor, drift toward the equilibrium solution of the ‘No attractor’ condition, with significantly larger spacing, no alignment and lower individual gridness. Thus, the attractor is the force preventing them from doing so when feedforward Hebbian learning is on.

      These observations point to the key role played by the attractor not only in forming but also in sustaining grid activity. The dynamic equilibrium framework fits well-known properties of the system, such as its capacity to recalibrate very fast (Jayakumar et al, 2019), although this particular feature cannot be modeled with the current version of our model, which lacks path integration capabilities.

      (2) It would be useful to include additional control conditions for Figure 2 to test the hypothesis that it is simply connectivity, rather than attractor dynamics, that drives alignment.

      This could be achieved by randomly assigning strengths to the recurrent connections, e.g. drawing from exponential or Gaussian distributions.

      We agree and have included Fig. S2b-d, showing that the same distribution of collateral input weights entering each neuron, but lacking the 1D structure provided by the attractor, does not align grid maps. This is achieved by shuffling rows in the connectivity matrix, while avoiding self connections to make the comparison fair (self connections substantially alter the dynamic of the network, making it much more rigid). We observed that individual grid maps have very low gridness levels, even lower than in the no-attractor condition. In contrast, they have levels of population gridness slightly higher than in the no-attractor condition, but closer to 0 than to levels achieved with attractors. Our interpretation of these results is that irregular connectivity achieves some alignment in a few arbitrary directions and/or locations, which improves the coordination between maps at the expense of impairing rather than improving hexagonal responses of individual cells. Such observations stand in clear contrast to what is observed with continuous attractors with an orderly architecture.

      These results suggest that it is the structure of the attractor that allows grid cells to be aligned rather than the mere presence of recurrent collateral connections.
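
      A minimal sketch of such a shuffle control is given below. It is an illustration under stated assumptions rather than the authors' implementation: rows are taken to hold the incoming weights of each neuron, the function name `shuffle_incoming_weights` and the toy matrix are hypothetical, and the diagonal is simply held fixed so that no weight is moved onto a self connection.

```python
import random

def shuffle_incoming_weights(weights, seed=0):
    """Permute each neuron's incoming weights, leaving the diagonal in place.

    Assumes weights[i][j] is the weight from neuron j onto neuron i.
    The multiset of weights entering each neuron is preserved, but any
    ordered (e.g. ring-like) structure across neurons is destroyed.
    """
    rng = random.Random(seed)
    n = len(weights)
    shuffled = []
    for i, row in enumerate(weights):
        # Collect the off-diagonal entries and permute them.
        off_diagonal = [row[j] for j in range(n) if j != i]
        rng.shuffle(off_diagonal)
        # Reinsert the untouched self-connection entry at position i.
        shuffled.append(off_diagonal[:i] + [row[i]] + off_diagonal[i:])
    return shuffled

# Hypothetical 4-neuron ring: strong weights to neighbours, weak to the opposite cell.
weights = [
    [0.0, 0.6, 0.1, 0.6],
    [0.6, 0.0, 0.6, 0.1],
    [0.1, 0.6, 0.0, 0.6],
    [0.6, 0.1, 0.6, 0.0],
]
shuffled = shuffle_incoming_weights(weights, seed=1)
```

      Because each row keeps the same set of values, any difference in alignment between the original and shuffled networks can be attributed to the arrangement of the weights rather than to their overall strength.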

      (3) It seems conceivable that once trained the recurrent connections would no longer be required for alignment. Can this be evaluated by considering what happens if the recurrent connections are turned off after training (or slowly turned off during training)? Does the network continue to generate aligned grid fields?

      This point has elements in common with point 1. As we argued in that response, the attractor has two main effects on grid maps: it aligns them and it contracts them. If the attractor is turned off, feedforward Hebbian learning progressively drives maps toward the solution obtained for the ‘no attractor’ condition, characterized by maps with larger spacing, poorer gridness and lack of alignment.

      (4) After training what is the relative strength of the recurrent and feedforward inputs to each neuron?

      Both recurrent and feedforward synaptic-strength matrices are normalized throughout training, so that the overall incoming synaptic strength to each neuron is invariant. Because of this, although individual feed-forward and recurrent input fields vary dynamically, their average is constant, with the exception of the very first instances of the simulation, before a stable regime is reached in grid-cell activity levels. We have included Fig. S2d, showing the dynamics of feedforward and recurrent mean fields throughout learning as well as their ratio. In addition, Fig. S2a shows that the strength of recurrent relative to feedforward inputs is an important parameter, since alignment is only obtained in an intermediate range of ratios.
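
      The kind of normalization described here can be sketched as follows. The response does not state which norm is used, so the Euclidean norm below is an assumption, as are the function name `normalize_incoming` and the example values.

```python
import math

def normalize_incoming(weights):
    """Rescale each neuron's incoming weights to unit Euclidean norm.

    Assumes weights[i] lists the synaptic weights entering neuron i.
    Rows that are entirely zero are returned unchanged.
    """
    normalized = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        if norm > 0:
            normalized.append([w / norm for w in row])
        else:
            normalized.append(list(row))
    return normalized

# Hypothetical weights after a Hebbian update, before renormalization.
updated = [
    [0.2, 0.9, 0.4],
    [3.0, 0.0, 4.0],
]
normalized = normalize_incoming(updated)
```

      Applying such a step after every weight update keeps the total incoming strength per neuron fixed, so learning can only redistribute strength among synapses.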

      (5) It would be helpful to also evaluate the low dimensional structure of the input to the network. Assuming it has a 2D structure, as it represents 2D space, can an explanation be provided for why it is surprising that the trained network also encodes activity with a 2D manifold? It strikes me that the more interesting finding might relate to alignment of the grids rather than claims about a 1D attractor encoding a 2D representation. Either way, stronger evidence and clearer discussion would be helpful.

      The reviewer is correct in assuming that the input has a 2D structure, that can be represented by a sheet embedded in a high dimensional space and thus has the Betti numbers [1,0,0]. The surprising element in our results is that we are showing for the first time that the population activity of an attractor network is constrained to a manifold that results from the negotiation between the architecture of the attractor and the inputs, and does not merely reflect the former as previously assumed. In this sense, the alignment of grid cells by a 1D attractor is an instance of the more general case that 1D attractors can encode 2D representations.

      It is certainly the case that the 2D input is a strong constraint pushing population activity toward a 2D manifold. However, the final form of the 2D manifold is strongly constrained by the attractor, as shown by the contrast with the no-attractor condition (a 2D sheet, as in the input, vs a torus when the attractor is present). The 1D attractor is able to flexibly adapt to the constraint posed by the inputs while doing its job (as demonstrated in previous points), which results in 2D grid maps aligned by a 1D attractor. Generally speaking, this work provides a proof of principle demonstrating that the topology of the attractor architecture and the manifold of the population activity space need not be identical, as previously widely assumed by the attractor community, and need not even have the same dimensionality. Instead, a single architecture can potentially be applied to many purposes. Hence, our work provides a valuable new perspective that applies to the study of attractors throughout the brain.

      (6) The introduction should be clearer about the different types of grid model and the computations they implement. E.g. The authors' previous model generates grid fields from spatial inputs, but if my understanding is correct it isn't able to path integrate. By contrast, while the many 2D models with continuous attractor dynamics also generate grid representations, they do so by path integration mechanisms that are computationally distinct from the spatial transformation implemented by feedforward models (see also general comments above).

      We agree with the reviewer and have made this point explicit in the introduction.

      (7) A prediction from continuous attractor models is that when place cells remap the low dimensional manifold of the grid activity is unaffected, except that the location of the activity bump is moved. It strikes me as important to test whether this is the case for the model presented here (my intuition is that it won't be, but it would be important to establish either way).

      We want to emphasize that our model is a continuous attractor model, so the question regarding the difference between what our model and continuous attractor network models predict is an ill-posed one. One of our main conclusions is precisely that attractors can work in a wider spectrum of ways than previously thought.

      For lack of a better definition, our multiple simulations could be thought of as training in different arenas. It is true that in our model maps take time to form, but this is also the case in novel environments (Barry et al, 2007), and continuous attractor models exclusively or strongly guided by self motion cues struggle to replicate this phenomenon. We show that the current version of our model accepts multiple solutions (in practice four but conceptually countably infinite), all of them resulting in a torus for the population activity (i.e. the same topology or low dimensional manifold). It is not clear to us how easy it would be to differentiate between most of these solutions in experimental data, with only incomplete information. This said, incorporating a symmetry-breaking ingredient to the model, for example related to head direction modulation, could perhaps lead to the prevalence of a single type of solution. We intend to explore this possibility in the future in order to add path-integration capabilities to the system, as described in the discussion.

      (8) The Discussion implies that 1D networks could perform path integration in a manner similar to 2D networks. This is a strong claim but isn't supported by evidence in the study. I suggest either providing evidence that this is the case for models of this kind or replacing it with a more careful discussion of the issue.

      The current version of our model has no path integration capabilities, as is now made explicit in the Introduction and Discussion. In addition, we have now made clear that the idea that path integration could perhaps be implemented using 1D networks is, although reasonable, purely speculative.

      Minor

      (1) Introduction. 'direct excitatory communication between them'. Suggest rewording to 'local synaptic interactions', as communication can also be purely inhibitory (e.g. Burak and Fiete, 2009) or indirect by excitation of local interneurons (e.g. Pastoll et al., Neuron, 2013).

      We agree and have adopted this phrasing.

      (2) The decision to focus the topology analysis on the 60 cm wide central square appears somewhat arbitrary. Are the irregularities referred to a property of the trained networks or would they also emerge with analysis of simulated ideal data? Can more justification be expanded and supplementary analyses be shown when the whole arena is used?

      In practical terms, a subsampling of the data to around half was needed because the persistent homology packages struggle to handle large amounts of data, especially in the calculation of H2. We decided to cut a portion of contiguous pixels in the open field at least larger than the hexagonal tile representing the whole grid population period (as represented in Figure 6). Leaving the borders aside was a logical choice since it is known that the solution at the borders is particularly influenced by the speed anisotropy of the virtual rat (see Si, Kropff & Treves, 2012), in a way that mimics how borders locally influence grid maps in actual rats (Krupic et al, 2015). The specific way in which our virtual rat handles borders is arbitrary and might not generalize. A second issue around borders is that maps are differently affected by incomplete smoothing, although this issue does not apply to our data because we did not smooth across neighboring pixels. In sum, considering the central 60 cm wide square was sufficient to contain the whole torus and a reasonable compromise that would allow us to perform all analyses in the part of the environment less influenced by boundaries.
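
      The cropping step can be sketched as below. Only the 60 cm width of the central square comes from the text; the arena side of 100 cm, the 5 cm bin size and the function name `central_square_bins` are hypothetical values chosen for illustration.

```python
def central_square_bins(arena_cm, square_cm, bin_cm):
    """Return (row, col) indices of the spatial bins inside a centred square.

    Assumes a square arena discretized into square bins of side bin_cm.
    """
    n_bins = round(arena_cm / bin_cm)
    # Number of bins discarded on each side of the arena.
    margin = round((arena_cm - square_cm) / (2 * bin_cm))
    kept = range(margin, n_bins - margin)
    return [(row, col) for row in kept for col in kept]

# Hypothetical arena: 100 cm side, 5 cm bins, central 60 cm square retained.
bins = central_square_bins(arena_cm=100, square_cm=60, bin_cm=5)
print(len(bins), "of", 20 * 20, "bins kept")
```

      Restricting the point cloud to these bins reduces the number of points passed to the persistent homology computation while excluding the positions closest to the walls.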

      (3) It could help the general reader to briefly explain what a persistence diagram is.

      This is developed in the Appendix, but we have now added a reference to it and a brief description in the main text.

      (4) For the analyses in Figure 3-4, and separately for Figure 5, it might help the reader to provide visualizations of the low dimensional point cloud.

      All these calculations take place in the original high-dimensional point cloud. Doing them in a reduced space would be incorrect because there is no dimensionality reduction technique that guarantees the preservation of topology. In Figure 7 we reduce the dimensionality of data but emphasize that it is only done for visualization purposes, not to characterize topology. We also point out in this Figure that the same non-linear dimensionality reduction technique applied to objects with identical topology yields a wide variety of visualizations, some of them clear and some less clear. This observation further exemplifies why one cannot assume that a dimensionality-reduction technique preserves topology, even for a low-dimensional object embedded in a high-dimensional space.

      (5) The detailed comparison of the dynamics of each model is limited by the number of data points. Why not address this by new simulations with more neurons?

      We are not sure we understand this comment. In Figure 2, the dynamics for each model are markedly different. These are averages over 100 simulations. We are not sure what benefit would be obtained from adding more neurons. Before starting this work we searched for the minimal number of neurons that would result in convergence to an aligned solution in 2D networks, which we found to be around 100. Optimizing this parameter in advance was important to reduce computational costs throughout our work.

      (6) Could the variability in Figure 7 also be addressed by increasing the number of data points?

      As we argued in a previous point, there is no reason to expect preservation of topology after applying Isomap. We believe this lack of topology preservation to be the main driver of variability.

      (7) Page/line numbers would be useful.

      We agree. However, the text is curated by bioRxiv which, to the best of our knowledge, does not include them.

      Reviewer 2:

      Reviewer #2 (Recommendations For The Authors):

      (1) I highly suggest that the author rewrite some parts of the Results. There are lots of details which should be put into the Methods part, for example, the implementation details of the network, the analysis details of the toroidal topology, etc. It will be better to focus on the results part first in each section, and then introduce some of the key details of achieving these results, to improve the readability of the work.

      This suggestion contrasts with that of Reviewer #1. As a compromise, we decided to include in the Results section only methodological details that are key to understanding the conclusions, and describe everything else in the Methods section.

      (2) 'Progressive increase in gridness and decrease in spacing across days have been observed in animals familiarizing with a novel environment...' From Fig.2c I didn't see much decrease. The authors may need to carry out some statistical test to prove this. Moreover, even if the changes are significant, this might not be the consequence of the excitatory collateral constraint. To prove this, the authors may need to offer some direct evidence.

      We agree that the decrease is not evident in this figure due to the scale, so we are adding the correlation in the figure caption as proof. In addition, several arguments, some related to new analyses, demonstrate that the attractor contracts grid maps. First, the ‘no attractor’ condition has a markedly larger spacing compared to all other conditions (Fig. 2a). We also now show that spacing monotonically decreases with the strength of recurrent relative to feedforward weights, in a way that is rather independent of gridness (Fig. S2a). Second, as we now show in Fig. S2b-d, simulations with a shuffled 1D attractor, such that the sum of input synapses to each neuron is the same as in the 1D condition but no structure is present, lead to a spacing that is mid-way between the ‘no attractor’ condition and the conditions with attractors. Third, as we now show in Fig. S3a, turning off both recurrent connections and feedforward learning in a trained network results in a small increase in spacing. Fourth, as we now show in Fig. S3b, turning off recurrent connections while feedforward learning is kept on increases grid spacing to levels comparable to those of the ‘no attractor’ condition. All these elements support a role of the attractor in contracting grid spacing.

      (3) Some of the items need to be introduced first before going into details in the paper, for instance, the stripe-like attractor network, the Betti number, etc.

      We have added in the Results section a brief description and references to full developments in the Appendix.

      Reviewer 3 (Public Review):

      (1) It is not clear to me that the proposal here is fundamentally new. In Si, Kropff and Treves (2012) recurrent connectivity was dependent on the head direction tuning and thus had a ring structure. Urdapilleta, Si, and Treves considered connectivity that depends on the distance on a 2d plane.

      In the work of Si et al, connectivity is constructed ad hoc for conjunctive cells to represent a torus; it depends on head-directionality but also on the distance in a 2D plane. The topology of this architecture has not been assessed, but it is close to the typical 2D ‘rigid’ constraint. In the work of Urdapilleta et al, the network is a simple 2D one. The difference with our work is that we focus on the topology of the recurrent network and do not use head-direction modulation. In this context, we prove that a 1D network is enough to align grid cells and, more generally, we provide a proof of principle that the topology of the architecture and the representation space of an attractor network do not need to be identical, as previously assumed by the attractor community. These two important points were neither argued, speculated nor self-evident from the cited works.

      (2) The paper refers to the connectivity within the grid cell layer as an attractor. However, would this connectivity, on its own, indeed sustain persistent attractor states? This is not examined in the paper. Furthermore, is this even necessary to obtain the results in the model? Perhaps weak connections that do not produce an attractor would be sufficient to align the spatial response patterns during the learning of feedforward weights, and reproduce the results? In general, there is no exploration of how the strength of collateral interactions affects the outcome.

      The reviewer makes several important points. Local excitation combined with global inhibition is the archetypical architecture for continuous attractors (see for example Knierim and Zhang, Annual review of neuroscience, 2012). Thus, in the absence of feedforward input, we observe a bump of activity. As in all continuous attractors, this bump is not necessarily ‘persistent’ and instead is free to move along the attractor.
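
      For readers unfamiliar with this architecture, a ring of neurons with local excitation and uniform inhibition can be sketched as follows. This is a generic textbook-style construction, not the connectivity used in the manuscript: the Gaussian kernel, the parameter values and the function name `ring_connectivity` are all assumptions, and folding inhibition into the weight matrix as a constant offset is a simplification made for the example.

```python
import math

def ring_connectivity(n, sigma=2.0, inhibition=0.2):
    """Build an n x n weight matrix for a 1D ring attractor.

    Weights depend only on circular distance between neurons: a Gaussian
    excitatory profile of width sigma minus a uniform inhibitory offset.
    Self connections are set to zero.
    """
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(0.0)
                continue
            # Shortest distance between i and j around the ring.
            distance = min(abs(i - j), n - abs(i - j))
            excitation = math.exp(-distance ** 2 / (2 * sigma ** 2))
            row.append(excitation - inhibition)
        weights.append(row)
    return weights

w = ring_connectivity(20)
```

      Nearby neurons excite each other and distant ones inhibit each other, which is the ingredient that supports a localized bump of activity that can sit anywhere along the ring.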

      We cannot prove that there is not a simpler architecture that has the same effect as our 1D or 1DL conditions, and we think that there are some interesting candidates to investigate in the future. What we now prove in new Fig. S2b-d is that it is not the strength of recurrent connections themselves, but instead the continuous attractor structure that aligns grid cells in our model. To demonstrate this, we shuffle incoming recurrent connections to each neuron in the 1D condition (while avoiding self-connections for fairness), and show that training does not lead to grid alignment. We also show in Fig. S1 that an architecture represented by 20 overlapping 1DL attractors, each formed by concatenating 10 random cells, aligns grid cells to levels slightly lower but similar to the 1D or 1DL attractors. This architecture can perhaps be considered as simpler to build in biological terms than all the others, but it is still constituted by continuous attractors.

      The strength of recurrent collaterals, or more precisely the recurrent to feedforward ratio, is crucial in our model to achieve a negotiated outcome from constraints imposed by the attractor and the inputs. We now show explicit measures of this ratio in Fig. S2, as well as examples showing that an imbalance in this ratio impairs grid alignment. When the ratio is too high or too low, both individual and population gridness are low. Interestingly, grid spacing behaves differently, decreasing monotonically with the relative strength of recurrent connections.

      (3) I did not understand what is learned from the local topology analysis. Given that all the grid cells are driven by an input from place cells that spans a 2d manifold, and that the activity in the grid cell network settles on a steady state that depends only on the inputs, isn't it quite obvious that the manifold of activity in the grid cell layer would have, locally, a 2d structure?

      The dimensionality of the input is important, although not the only determinant of the topology of the activity. The recurrent collaterals are the other determinant, and their architecture is a crucial feature. For example, as we now show in Figure S2b-d, shuffled recurrent synaptic weights fail to align grid cells. In the 1D condition, if feedforward inputs were absent, the dynamics of the activity would be confined to a ring. The opposite condition is our ‘no attractor’ condition, in which activity in the grid cell layer mimics the topology of inputs, a 2D sheet (and not a torus). It is in the intermediate range, when both feedforward and recurrent inputs are important, that a negotiated solution (a torus) is achieved.

      The analyses of local dimensionality and local homology of Figure 3 are crucial steps to demonstrate toroidal topology. According to the theorem of classification of closed surfaces, global homology is not enough to univocally define the topology of a point cloud, and thus this step cannot be skipped. The step is aimed to prove that the point cloud is indeed a closed surface.
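
      A short worked comparison may make this step concrete. The Betti numbers below are standard results from algebraic topology rather than values taken from the manuscript, and the choice of examples is ours. They show that the signature (1, 2, 1) does not by itself single out a torus.

```latex
\begin{array}{lccc}
\text{Space} & \beta_0 & \beta_1 & \beta_2 \\
\hline
\text{2D sheet (disk)} & 1 & 0 & 0 \\
\text{Sphere } S^2 & 1 & 0 & 1 \\
\text{Torus } T^2 & 1 & 2 & 1 \\
\text{Klein bottle } (\mathbb{Z}/2 \text{ coefficients}) & 1 & 2 & 1 \\
S^2 \vee S^1 \vee S^1 \text{ (not a surface)} & 1 & 2 & 1
\end{array}
```

      The last row is a sphere with two circles attached at a point: it has the same Betti numbers as the torus but is not locally two-dimensional everywhere, which is the kind of alternative that a check of local dimensionality and local homology can exclude.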

      (4) The modeling is all done in planar 2d environments, where the feedforward learning mechanism promotes the emergence of a hexagonal pattern in the single neuron tuning curve. This, combined with the fact that all neurons develop spatial patterns with the same spacing and orientation, implies even without any topological analysis that the emerging topology of the population activity is a torus.

      We cannot agree with this intuition. In the ‘no attractor’ condition, individual maps have hexagonal symmetry with standardized spacing, but given the lack of alignment the population activity is not a closed surface and thus not a torus. It can rather be described as a 2D sheet embedded in a high dimensional space, a description that also applies to the input space.

      While it is rather evident that an ad hoc toroidal architecture folds this 2D population activity into a torus, it is less evident and rather surprising that 1D architectures have the same capability. This is the main novelty in our work.

      (5) Moreover, the recent work of Gardner et al. demonstrated much more than the preservation of the topology in the different environments and in sleep: the toroidal tuning curves of individual neurons remained the same in different environments. Previous works, that analyzed pairwise correlations under hippocampal inactivation and various other manipulations, also pointed towards the same conclusion. Thus, the same population activity patterns are expressed in many different conditions. In the present model, the results of Figure 6 suggest that even across distinct rectangular environments, toroidal tuning curves will not be preserved, because there are multiple possible arrangements of the phases on the torus which emerge in different simulations.

      We agree with the reviewer on the main point, although the recently found ring activity in the absence of sensory feedback (Gonzalo Cogno et al, 2023) suggests that what is happening in the EC is more nuanced than a pre-wired torus. Solutions in Figure 6 are different ways of folding a 1D strip into a torus, with or without the condition of periodicity in the 1D strip. Whether or not these different solutions would be discernible from one another in a practical setup is not clear to us. For example, global homology, as addressed in the Gardner paper, is the same for all these solutions. Furthermore, while our solutions of up to order 3 are highly discernible, higher order solutions, potentially achievable with other network parameters, would be impossible to discern by eye in representations similar to the ones in Figure 6. In addition, while we chose to keep our model in the simplest possible form as a clear proof of principle, new elements introduced to the model such as head directionality could break the symmetry and lead to the prevalence of one preferred solution for all simulation replicates. We plan to investigate this possibility in the future when attempting to incorporate path-integration capabilities to the model.

      (6) In real grid cells, there is a dense and fairly uniform representation of all phases (see the toroidal tuning of grid cells measured by Gardner et al). Here the distribution of phases is not shown, but Figure 7 suggests that phases are non uniformly represented, with significant clustering around a few discrete phases. This, I believe, is also the origin for the difficulty in identifying the toroidal topology based on the transpose of the matrix M: vectors representing the spatial response patterns of individual neurons are localized near the clusters, and there are only a few of them that represent other phases. Therefore, there is no dense coverage of the toroidal manifold that would exist if all phases were represented equally. This is not just a technical issue, however: there appears to be a mismatch between the results of the model and the experimental reality, in terms of the phase coverage.

      As mentioned in the results section, Figure 7 is meant for visualization purposes only, and serves more as a cautionary tale regarding the unpredictable risks of non-linear dimensionality reduction than as a proof of the organization of activity in the network. Isomap is a non-linear transformation that deforms each of our solutions in a unique way so that, while all have the topology of a torus embedded in a high dimensional space, only a few of them exhibited one of two possible toroidal visualizations in a 3D Isomap reduction. Isomap, like all other popular dimensionality reduction techniques, provides no guarantee of topology invariance. A better argument to judge the homogeneous distribution of phases is persistent homology, which identifies relatively large holes (compared to the sampling spacing) in the original manifold embedded in a high dimensional space. In our case, persistent homology identified only two holes significantly larger than noise (the two cycles of a torus) and one cavity in all conditions that included attractors. Regarding the specific distribution of phases in different conditions, however, see our reply below.

      (7) The manuscript makes several strong claims that incorrectly represent the relation between experimental data and attractor models, on one hand, and the present model on the other hand. For the latter, see the comments above. For the former, I provide a detailed list in the recommendations to the authors, but in short: the paper claims that attractor models induce rigidness in the neural activity which is incompatible with distortions seen in the spatial response patterns of grid cells. However, this claim seems to confuse distortions in the spatial response pattern, which are fully compatible with the attractor model, with distortions in the population activity patterns, which would be incompatible with the attractor model. The attractor model has withstood numerous tests showing that the population activity manifold is rigidly preserved across conditions - a strong prediction (which is not made, as far as I can see, by feedforward models). I am not aware of any data set where distortions of the population activity manifold have been identified, and the preservation has been demonstrated in many examples where the spatial response pattern is disrupted. This is the main point of two papers cited in the present manuscript: by Yoon et al, and Gardner et al.

      First of all, we would like to note that our model is a continuous attractor model. Different attractor models have different outcomes, and one of the main conclusions of our manuscript is that attractors can do a wider range of operations than previously thought.

      We agree with the reviewer that distortions in spatial activity (which speak against a purely path-integration guided attractor) should not be confused with distortions in the topology of the population activity (which would instead speak against the attractor dynamics itself). We have rephrased these observations in the manuscript. In fact, we believe that the capacity of grid cells to present distorted maps without a distortion of the population activity topology, as shown for example by Gardner and colleagues, could result from a tension between feedforward and recurrent inputs, the potential equilibria of which our manuscript aims to characterize.

      (8) There is also some weakness in the mathematical description of the dynamics. Mathematical equations are formulated in discrete time steps, without a clear interpretation in terms of biophysically relevant time scales. It appears that there are no terms in the dynamics associated with an intrinsic time scale of the neurons or the synapses, and this introduces a difficulty in interpreting synaptic weights as being weak or strong. As mentioned above, the nature of the recurrent dynamics within the grid cell network (whether it exhibits continuous attractor behavior) is not sufficiently clear.

      We agree with the reviewer that our model is rather simple, and we value the extent to which this simplicity allows for a deep characterization. All models are simplifications, and the best model in any given setup is the one with the minimum amount of complexity necessary to describe the phenomenon under study. We believe that to understand whether or not a 1D continuous attractor architecture can result in a toroidal population activity, a biophysically detailed model, with prohibitive computational costs, would have been unnecessarily complex. This argument is not intended to discredit biophysically detailed models, which can address a wider range of questions, for example regarding the spiking dynamics of grid cells, that our simple model cannot address.

      Reviewer #3 (Recommendations For The Authors):

      The work points to an interesting scenario for the emergence of toroidal topology, but the interpretation of this idea should be more nuanced. I recommend reconsidering the claims about limitations of the attractor theory, and acknowledging the limitations of the present theory.

      I don't see the limitations mentioned above as a reason to reject the ideas proposed in this manuscript, for two main reasons: first, additional research might reveal a regime of parameters where some issues can be resolved (e.g. the clustering of phases). In addition, the mechanism described here might act at an early stage in development to set up initial dynamics along a toroidal manifold, while other mechanisms might be responsible for the rigidity of the toroidal manifold in an adult animal. But all this implies that the novelty in the present manuscript is weaker than implied, the ability to explain experimental observations is more limited than implied, and these limitations should be acknowledged and discussed.

      I recommend reporting on the distribution of grid cell phases and, if indeed clustered, this should be discussed. It will be helpful to explore whether this is the reason for the difficulty in identifying the toroidal topology based on the collection of spatial response patterns (using the transpose of the matrix M).

      Ideally, a more complete work would also explore in a more systematic and parametric way the influence of the recurrent connectivity's strength on the learning, and whether a toroidal manifold emerges also in non-planar, such as the wagon-wheel environment studied in Gardner et al.

      Some of these recommendations have been addressed in the previous points (public review). Regarding the reason why the transpose of M does not fully recapitulate the architecture with our conservative classification criteria, we believe that there is no reason why it should in the first place. We view the fact that the transpose of M recapitulates some features of the architecture as a purely phenomenological observation, and we think it is important as a proof that M is not exactly the same for the different conditions. We imagined that if M matrices were exactly the same this could be due to poor spatial sampling by our bins. Knowing that they are intrinsically different is important even if the reason why they have these specific features is not fully clear to us.

      Although we do not think that the distribution of phases is related to the absence of a cavity in the transpose of M or to the four clusters found in Isomap projections, it remains an interesting question that we did not explore initially. We are now showing examples of the distribution of phases in Figure S1. We observed that in both 2D and 1D conditions phases are distributed following rather regular patterns. Whether or not these patterns are compatible with experimental observations of phase distribution is in our view debatable, given that so far state-of-the-art techniques have only allowed simultaneous recording of a small fraction of the neurons belonging to a given module. This said, we think that it is important to note that ordered phase patterns are an anecdotal outcome of our simulations rather than a necessary outcome of flexible attractors or attractors in general. To prove this point, we simulated a condition with a new architecture represented by the overlap of 20 short 1DL attractors, each recruiting 10 random neurons from the pool of 100 available ones.

      The rest of the parameters of the simulations were identical to those in the other conditions.

      By definition, the topology of this architecture has Betti numbers [20,0,0]. We show in Figure S1 that this architecture aligns grid cells, with individual and population gridness reaching slightly lower levels compared to the 1D condition. However, the distribution of phases of these grid cells has no discernible pattern. This result is an arbitrary example that serves as a proof-of-principle to show that flexible attractors can align grid cells without exhibiting ordered phases, not a full characterization of the outcome of this type of architecture, which we leave for future work. For the rest of our work, we stick to the simplest versions of 1D architectures, which allow for a more in-depth characterization.

      The wagon-wheel is an interesting case in which maps lose hexagonal symmetry although the population activity lies in a torus, perhaps evidencing the tension between feedforward and recurrent inputs and suggesting that grid cell response does not obey the single master of path integration. If we modeled it with a 1D attractor, we believe the outcome would strongly depend on the virtual rat's trajectory. If the trajectory was strictly linear, the population activity would be locally one-dimensional and potentially represented by a ring. Instead, if the trajectory allowed for turns, i.e. a 2D trajectory within a corridor-like maze, the population activity would be toroidal as in our open field simulations, while maps would not have perfect hexagonal symmetry, mimicking experimental results.

      More minor comments:

      Recurrent dynamics are modeled as if there is no intrinsic synaptic or membrane time constant. This may be acceptable for addressing the goals of this paper, but it is a bit unusual and it will be helpful to explain and justify this choice.

      As mentioned above, we believe that the best model in a given setup is the one with the lowest number of complexities that can still address the phenomenon under study. One does not use general relativity to build a bridge, although it provides a ‘more accurate’ description of the physics involved. All models are simplifications, and the more complex a model, the more it has to be taken as a black box.

      The Introduction mentions that in most models interaction between co-modular neurons occurs through direct excitatory communication, but in quite a few models the interaction is inhibitory. The crucial feature is that the interaction is strongly inhibitory between neurons that differ in their tuning, and either less inhibitory or excitatory between neurons with similar phases.

      We agree that directed inhibition has been shown to be as efficient as directed excitation, and we have modified the introduction to reflect this.

      The Discussion claims that the present work is the first one in which the topology of the recurrent architecture differs from the topology of the emergent state space. However, early works on attractor models of grid cells showed how neural connectivity which is arranged on a 2d plane, without any periodic boundary conditions, leads to a state space that exhibits the toroidal topology. Therefore, this claim should be revised.

      We agree, although the 2D sheet in this case acts as a piece of the torus, and locally the input space and architecture are identical objects. It could be argued that architectures that represent a 2D local slice of the torus, the whole torus, or several cycles around the torus form a continuous family parametrized by the extension of recurrent connections, and as a consequence it is not surprising that these works have not made claims about the incongruence between architecture and representation topologies. The 2D sheet connectivity is still constructed ad hoc to organize activity in a 2D bump, and there is no negotiation between disparate constraints because locally the constraints imposed by input and architecture are the same. We believe this situation is conceptually different from our flexible 1D attractors. We have adapted our claim to include this technical nuance.

      Why are neural responses in the perimeter of the environment excluded from the topological analysis? The whole point of the toroidal manifold analysis on real experimental data is that the toroidal manifold is preserved regardless of the animal's location and behavioral condition.

      We agree, although experimental data needs to go through extensive pre-processing such as dimensionality reduction before showing a toroidal topology. Such manipulations might smooth away the specific effects of boundaries on maps, together with other sources of noise. In our case, the original reason to downsample the dataset is related to the explosion in computational time that we experience with the ripser package when using more than ~1000 data points. For a proof-of-principle characterization we were much more interested in what happened in the center of the arena, where a 1D attractor could fold itself to confine population activity into a torus. The area we chose was sufficiently large to contain the whole torus. Borders do affect the way the attractor folds (they also affect grid maps in real rats). We feel that these imperfections could be interesting to study in relation to the parameters controlling how our virtual rat behaves at the borders, but not at this proof-of-principle stage.
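
      The restriction to the center of the arena and the cap of roughly 1000 data points can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: the function name `central_subsample`, the square arena, the `margin` parameter and the uniform random subsampling are all hypothetical choices, not the procedure actually used in the manuscript, and the persistent homology step itself is not shown.

```python
import random


def central_subsample(positions, activity, arena_size, margin,
                      max_points=1000, seed=0):
    """Keep samples recorded away from the walls, then cap their number.

    positions : list of (x, y) tuples, one per time bin (hypothetical layout)
    activity  : list of population vectors aligned with `positions`
    arena_size: side length of a square arena (assumption)
    margin    : width of the excluded border strip (assumption)
    """
    if len(positions) != len(activity):
        raise ValueError("positions and activity must be aligned")

    # Indices of samples that fall inside the central square.
    keep = [
        i for i, (x, y) in enumerate(positions)
        if margin <= x <= arena_size - margin
        and margin <= y <= arena_size - margin
    ]

    # Uniform random subsampling to bound the cost of the topological analysis.
    if len(keep) > max_points:
        rng = random.Random(seed)
        keep = sorted(rng.sample(keep, max_points))

    return [activity[i] for i in keep]


# Tiny illustrative call with made-up data.
toy_positions = [(0.05, 0.5), (0.5, 0.5), (0.4, 0.6), (0.95, 0.2)]
toy_activity = ["a", "b", "c", "d"]
print(central_subsample(toy_positions, toy_activity, arena_size=1.0, margin=0.1))
```

      The returned population vectors would then be passed to whichever persistent homology routine is used; the cap keeps that step tractable.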

      The periodic activity observed in Ref. 29 could in principle provide the basis for the ring arrangement of neurons. However, it is not yet clear whether grid cells participate in this periodic activity.

      We agree. So far it seems that entorhinal cells in general participate in the ring, which would imply that all kinds of cells are involved. However, it could well be that only some functional types participate in the ring and grid cells specifically do not, as future experiments will tell.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable work explores death coding data to understand the impact of COVID-19 on cancer mortality. The work provides solid evidence that deaths with cancer as a contributing cause were not above what would be expected during pandemic waves, suggesting that cancer did not strongly increase the risk of dying of COVID-19. These results are an interesting exploration into the coding of causes of death that can be used to make sense of how deaths are coded during a pandemic in the presence of other underlying diseases, such as cancer.

      We thank the editor and reviewers for the time they took to review our manuscript and for the thoughtful suggestions they provided. We have completed several revisions based on their feedback and we feel our paper is stronger as a result. However, none of these revisions change the overall conclusions of our study.

      Reviewer #1 (Public Review):

      Summary:

      In the paper "Disentangling the relationship between cancer mortality and COVID-19", the authors study whether the number of deaths in cancer patients in the USA went up or down during the first year (2020) of the COVID-19 pandemic. They found that the number of deaths with cancer mentioned on the death certificate went up, but only moderately. In fact, the excess with-cancer mortality was smaller than expected if cancer had no influence on the COVID mortality rate and all cancer patients got COVID with the same frequency as in the general population. The authors conclude that the data show no evidence of cancer being a risk factor for COVID and that the cancer patients were likely actively shielding themselves from COVID infections.

      Strengths:

      The paper studies an important topic and uses sound statistical and modeling methodology. It analyzes both, deaths with cancer listed as the primary cause of death, as well as deaths with cancer listed as one of the contributing causes. The authors argue, correctly, that the latter is a more important and reliable indicator to study relationships between cancer and COVID. The authors supplement their US-wide analysis by analysing three states separately.

      Weaknesses:

      The main findings of the paper can be summarized as six numbers. Nationally, in 2020, multiple-cause cancer deaths went up by 2%, Alzheimer's deaths by 31%, and diabetes deaths by 39%. At the same time, assuming no relationship between these diseases and either Covid infection risk or Covid mortality risk, the deaths should have gone up by 7%, 46%, and 28%. The authors focus on cancer deaths and as 2% < 7%, conclude that cancer is not a risk factor for COVID and that cancer patients must have "shielded" themselves against Covid infections.

      However, I did not find any discussion of the other two diseases. For diabetes, the observed excess was 39% instead of "predicted by the null model" 28%. I assume this should be interpreted as diabetes being a risk factor for Covid deaths. I think this should be spelled out, and also compared to existing estimates of increased Covid IFR associated with diabetes.

      And what about Alzheimer's? Why was the observed excess 31% vs the predicted 46%? Is this also a shielding effect? Does the spring wave in NY provide some evidence here? Why/how would Alzheimer's patients be shielded? In any case, this needs to be discussed and currently, it is not.

      We thank the reviewer for their positive feedback on the paper and for these suggestions. It is true that we have emphasized the impact on cancer deaths, as this was the primary aim of the paper. In the revised version, we have expanded the results and discussion sections to more fully describe the other chronic conditions we used as comparators (lines 267-284;346 – 386).

      Note that we are somewhat reluctant to designate any of these conditions as risk factors based solely on comparing the time series model with the demographic model of our expectations. As we mention in the discussion, there is considerable uncertainty around estimates from the demographic model in terms of the size of the population-at-risk, the mean age of the population-at-risk, and the COVID-19 infection rates and infection fatality ratios. Our demographic model is primarily used to demonstrate the effects of competing risks across types of cancers and chronic conditions, since these findings are robust to model assumptions. In contrast, the demographic model should be used with caution if the goal is to titrate the level of these risk factors (as the level of imputed risk is dependent on model assumptions). In the updated version of the manuscript, we have included uncertainty intervals in Table 3, using the upper and lower bounds of the estimated infection rates and IFRs, to better represent this uncertainty. We have also discussed this uncertainty more explicitly in the text and ran sensitivity analyses with different infection rate assumptions in the discussion (lines 354-362; 367 -370).

      We would like to note that rather than interpreting the absolute results, we used this demographic model as a tool to understand the relative differences between these conditions. From the demographic model we determined that we would expect to see much higher mortality in diabetes and Alzheimer’s deaths compared to cancer deaths due to three factors (1. Size of population-at-risk, 2. Mean age of the population-at-risk, 3. Baseline risk of mortality from the condition), that are separate from the COVID-19 associated IFR. And in general, this is what we observed.

      In comparing the results from the demographic model to the observed excess, diabetes does stand out as an outlier from cancer and Alzheimer’s disease in that the observed excess is consistently above the null hypothesis, which does lend support to the conclusion that diabetes is in fact a risk factor for COVID-19, a conclusion that is also supported by many other studies. Our findings for hematological cancers are also similar, in that we find consistent support for this condition being a risk factor. We have commented on this in the discussion and added a few references (lines 346-354; 395-403).

      Our hypothesis regarding non-hematological cancer deaths (lower than anticipated mortality due to shielding) could also apply to Alzheimer’s deaths. Furthermore, we used the COVID-19 attack rate for individuals >65 years (based on the data that is available), but we estimate that the mean age of Alzheimer’s patients is actually 80-81 years, so this attack rate may in fact be a bit too high, which would increase our expected excess. We have commented on this in the discussion (lines 363-377).

      Reviewer #2 (Public Review):

      The article is very well written, and the approach is quite novel. I have two major methodological comments, that if addressed will add to the robustness of the results.

      (1) Model for estimating expected mortality. There is a large literature using different models to predict expected mortality during the pandemic. Different models come with different caveats, see the example of the WHO estimates in Germany and the performance of splines (Msemburi et al Nature 2023 and Ferenci BMC Medical Research Methodology 2023). In addition, it is a common practice to include covariates to help the predictions (e.g., temperature and national holidays, see Kontis et al Nature Medicine 2020). Last, fitting the model independently for each region neglects potential correlation patterns in the neighbouring regions, see Blangiardo et al 2020 PlosONE.

      Thank you for these comments and suggestions. We agree there are a range of methods that can be used for this type of analysis, and they all come with their strengths, weaknesses, and caveats. Broadly, the approach we chose was to fit the data before the pandemic (2014-2019), and project forward into 2020. To our knowledge it is not a best practice to use an interpolating spline function to extrapolate to future years. This is demonstrated by the WHO estimates in Germany in the paper you mention. This was our motivation for using polynomial and harmonic terms.

      Based on the above:

      a. I believe that the authors need to run a cross-validation to justify model performance. I would suggest training the data leaving out the last year for which they have mortality and assessing how the model predicts forward. Important metrics for the prediction performance include mean square error and coverage probability, see Konstantinoudis et al Nature Communications 2023. The authors need to provide metrics for all regions and health outcomes.

      Thank you for this suggestion. We agree that our paper could be strengthened by including cross validation metrics to justify model performance. Based on this suggestion, and your observations regarding Alzheimer’s disease, we have done two things. First, for the full pre-pandemic period (2014-2019) for each chronic condition and location we tested three different models with different degree polynomials (1. linear only, 2. linear + second degree polynomial, 3. linear + second degree polynomial + third degree polynomial) and used AIC to select the best model for each condition and location. Next, also in response to your suggestion, we estimated coverage statistics. Using the best fit model from the previous step, we then fit the model to data from 2014-2018 only and used the model to predict the 2019 data. We calculated the coverage probability as the proportion of weekly observed data points that fell within the 95% prediction interval. For all causes of death and locations the coverage probability was 100% (with the exception of multiple cause kidney disease in California, which is only shown in the appendix). The methods and results have been updated to reflect this change and we have added a figure to the appendix showing the selected model and coverage probability for each cause of death and location (lines 504 – 519; 847-859; Appendix 1- Figure 11).
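
      The coverage statistic described above (the proportion of held-out weekly observations falling inside the 95% prediction interval) can be sketched as follows. This is a minimal illustration only: the function name and the example numbers are hypothetical, and the interval bounds are assumed to come from a model fitted beforehand, which is not shown here.

```python
def coverage_probability(observed, lower, upper):
    """Fraction of observed values lying inside their prediction intervals.

    observed, lower, upper are equal-length sequences: the held-out weekly
    counts and the lower/upper bounds of the prediction interval per week.
    """
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("all three sequences must have the same length")
    if len(observed) == 0:
        raise ValueError("need at least one observation")

    inside = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical held-out counts and interval bounds (illustrative values only).
held_out = [100, 104, 98, 110]
pi_lower = [90, 92, 91, 93]
pi_upper = [108, 109, 107, 109]

print(coverage_probability(held_out, pi_lower, pi_upper))
```

      A coverage of 100%, as reported for the 2019 hold-out year, corresponds to every weekly observation falling inside its interval.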

      b. In the context of validating the estimates, I think the authors need to carefully address the Alzheimer case, see Figure 2. It seems that the long-term trends pick an inverse U-shape relationship which could be an overfit. In general, polynomials tend to overfit (in this case the authors use a polynomial of second degree). It would be interesting to see how the results change if they also include a cubic term in a sensitivity analysis.

      Thank you for this observation. Based on the changes described above, the model for Alzheimer’s disease now includes a cubic term in the national data and in Texas and California. The model with the second-degree polynomial remained the best fit for New York (Appendix 1 – Figure 11).

      c. The authors can help with the predictions using temperature and national holidays, but if they show in the cross-validation that the model performs adequately, this would be fine.

      At the scale of the US, adding temperature or environmental covariates is difficult and few US-wide models do so (see Goldstein 2012 and Quandelacy 2014 for examples from influenza). Furthermore, because we are looking at chronic disease outcomes, it is unclear that viral covariates or national holidays would drive these outcomes in the same way as they would if we were looking at mortality outcomes more directly related to transmissible diseases (such as respiratory mortality). Our cross validation also indicates that our models fit well without these additional covariates.

      d. It would be nice to see a model across the US, accounting for geography and spatial correlation. If the authors don't want to fit conditional autoregressive models in the Bayesian framework, they could just use a random intercept per region.

      We think the reviewer is mistaken here about the scale of our national analysis. Our national analysis did not fit independent models for each state or region. Rather, we fit a single model to the weekly-level national mortality data where counts for the whole of the US have been aggregated. We have clarified in the text (lines 156, 464). As such, we do not feel a model accounting for spatial correlation would be appropriate nor would we be able to include a random intercept for each region. We did fit three states independently (NY, TX, CA), but these states are very geographically distant from each other and unlikely to be correlated. These states were chosen in part because of their large population sizes, yet even in these states, confidence intervals were very wide for certain causes of death. Fitting models to each of the 50 US states, most of which are smaller than those chosen here, would exacerbate this issue.

      (2) I think the demographic model needs further elaboration. It would be nice to show more details, the mathematical formula of this model in the supplement, and explain the assumptions

      Thank you for this comment. We have added additional details on the demographic model to the methods. We have also extended this analysis to each state to further strengthen our conclusions (lines 548-590).

      Reviewing Editor Recommendations:

      I think that perhaps something that is missing is that the authors never make their underlying assumption explicit: they are assuming that if cancer increases the risk of dying of COVID-19, this would be reflected in the data on multiple causes of death where cancer would be listed as one of the multiple causes rather than as the underlying cause, and that their conclusions are predicated on this assumption. I would suggest explicitly stating this assumption, as opposed to other reasons why cancer mortality would increase (ex. if cancer care worsened during pandemic waves leading to poorer cancer survival).

      Response: Thank you for this suggestion. We have added a few sentences to the introduction to make this assumption clear (lines 106-112).

      Reviewer #1 (Recommendations For The Authors):

      - It could make sense to add "in the United States" into the title, as the paper only analyses US data.

      - It may make sense to reformulate the title from "disentangling the relationship..." into something that conveys the actual findings, e.g. "Lack of excess cancer mortality during Covid-19 pandemic" or something similar. Currently, the title tells nothing about the findings.

      Thank you for these suggestions. We have added “in the US” to the title. However, we feel that our findings are a bit more subtle than the suggested reformulation would imply, and we prefer to leave it in its current form.

      - Abstract, lines 42--45: This is the main finding of the paper, but I feel it is simplified too strongly in the abstract. Your simulations do *not* "largely explain" excess mortality with cancer; they give higher numbers! Which you interpret as "shielding" etc., but this is completely absent from the abstract. This sentence makes the impression that you got a good fit between simulated excess and real excess, which I would say is not the case.

      Thank you for this comment. We have rephrased the sentence in the abstract to better reflect our intentions for using the demographic model (lines 46-49). As stated above, the purpose of the demographic model was not to give a good fit with the observed excess mortality. Rather, we used the demographic model as a tool to understand the relative differences between these conditions in terms of expected excess mortality given the size, age-distribution, and underlying risk of death from the condition itself, assuming similar IFR and attack rates. And based on this, we conclude that it is not necessarily surprising that we see higher excess mortality for diabetes and Alzheimer’s compared to cancer.

      - Results line 237: you write that it's "more consistent with the null hypothesis", however clearly it is *not* consistent with the null hypothesis either (because 2% < 7%). You discuss in the Discussion that it may be due to shielding, but it would be good to have at least one sentence about it already here in the Results, and refer to the Discussion.

      We have mentioned this in the results and refer to the discussion (lines 277-278).

      - Results line 239: why was it closer to the assumption of relative risk 2? If I understand correctly, your model prediction for risk=1 was 7% and for risk=2 it was 13%. In NY you observed 8% (line 187). How is this closer to risk=2?

      Thank you for this observation. We have updated the demographic model with new data, extended the model to state-level data, and included confidence intervals on these estimates. We have also added additional discussion around the differences between our observations and expectations (lines 249-284).

      - Discussion line 275: "we did not expect to see large increases" -- why exactly? Please spell it out here. Was it due to the age distribution of the cancer patients? Was it due to the high cancer death risk?

      We demonstrate that it is the higher baseline risk of death for cancer that seems to be driving our low expectations for cancer excess mortality (lines 304-320). We have added this to the sentence to clarify our conclusions on this point and have added a figure to better illustrate this concept of competing risks (Figure 6).
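
      The intuition that a high baseline risk of death lowers the expected percentage excess can be illustrated with a deliberately simplified toy calculation. It is not the demographic model from the manuscript: the formula below is an assumption made for illustration (only people who would otherwise have survived the period can add to the excess), and all parameter values are hypothetical.

```python
def expected_relative_excess(baseline_risk, attack_rate, ifr):
    """Toy expected excess deaths as a fraction of baseline deaths.

    baseline_risk: probability of dying from the condition in the period
    attack_rate  : probability of infection in the period
    ifr          : infection fatality ratio
    Population size cancels out in this simplified ratio.
    """
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must be strictly between 0 and 1")

    # Extra deaths per person: survive the condition, get infected, die.
    extra_per_person = (1 - baseline_risk) * attack_rate * ifr
    return extra_per_person / baseline_risk


# Hypothetical values: same attack rate and IFR, different baseline risks.
high_baseline = expected_relative_excess(baseline_risk=0.20, attack_rate=0.10, ifr=0.02)
low_baseline = expected_relative_excess(baseline_risk=0.02, attack_rate=0.10, ifr=0.02)

print(f"high baseline risk: {high_baseline:.1%} expected excess")
print(f"low baseline risk:  {low_baseline:.1%} expected excess")
```

      With identical infection and fatality assumptions, the condition with the higher baseline risk shows the smaller relative excess, which is the qualitative pattern described in the response.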

      - Methods, line 405: perhaps it makes sense to cite some other notable papers on Covid excess mortality such as Msemburi et al Nature 2023, Karlinsky & Kobak eLife 2021, Islam et al BMJ 2021, etc.

      Thank you for mentioning this oversight. We certainly should have cited these papers and have included them in the updated version.

      - Methods line 410: why did you use a 5-week moving average? Why not fit raw weekly death counts? NB regression should be able to deal with it.

      Smoothing time series data with a moving average prior to running regression models is a very common practice. We did a sensitivity analysis using the raw data. This produced excess estimates with slightly larger confidence intervals, but does not change the overall conclusions of the paper.
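
      For readers unfamiliar with the smoothing step, a 5-week moving average can be sketched as below. The window alignment (centered) and the handling of the first and last weeks (averaging over the weeks available) are assumptions, since the response does not specify them; the counts are made up.

```python
def moving_average(values, window=5):
    """Centered moving average; edge weeks use whatever neighbours exist."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        chunk = values[start:end]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical weekly death counts with one noisy week.
weekly_deaths = [10, 12, 11, 30, 12, 10, 11]
smoothed_deaths = moving_average(weekly_deaths, window=5)
print(smoothed_deaths)
```

      Smoothing dampens isolated weekly spikes before model fitting, which is consistent with the slightly wider confidence intervals reported when the raw counts are used instead.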

      - Methods line 416: please indicate the software/library/package you used for fitting NB regression.

      We fit the NB regression using the MASS package in R version 4.3. We have added this to the methods (line 519).

      - Line 489: ORCHID -> ORCID

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      Codol et al. present a toolbox that allows simulating biomechanically realistic effectors and training Artificial Neural Networks (ANNs) to control them. The paper provides a detailed explanation of how the toolbox is structured and several examples that demonstrate its usefulness.

      Main comments:

      (1) The paper is well written and easy to follow. The schematics help in understanding how the toolbox works and the examples provide an idea of the results that the user can obtain.

      We thank the reviewer for this comment.

      (2) As I understand it, the main purpose of the paper should be to facilitate the usage of the toolbox. For this reason, I have missed a more explicit link to the actual code. As I see it, researchers will read this paper to figure out whether they can use MotorNet to simulate their experiments, and how they should proceed if they decide to use it. I'd say the paper provides an answer to the first question and assures that the toolbox is very easy to install and use. Maybe the authors could support this claim by adding "snippets" of code that show the key steps in building an actual example.

      This is an important point, which we also considered when writing this paper. We instead decided to focus on the first approach, because it is easier to illustrate the scientific use of the toolbox using code or interactive (Jupyter) notebooks than in a publication format. We find the “how to proceed” aspect of the toolbox can more easily and comprehensively be covered using online, interactive tutorials. Additionally, this allows us to update these tutorials as the toolbox evolves over different versions, while it is more difficult to update a scientific article. Consequently, we explicitly avoided code snippets in the article itself. However, we appreciate that the paper would gain in clarity if this was more explicitly stated early. We have modified the paper to include a pointer to where to find tutorials online. We added this in the last paragraph of the introduction section:

      The interested reader may consult the full API documentation, including interactive tutorials on the toolbox website at https://motornet.org.

      (3) The results provided in Figures 1, 4, 5 and 6 are useful, because they provide examples of the type of things one can do with the toolbox. I have a few comments that might help improving them:

      a. The examples in Figures 1 and 5 seem a bit redundant (same effector, similar task). Maybe the authors could show an example with a different effector or task? (see point 4).

      The effectors from figures 1 and 5 are indeed very similar. However, the tasks in figures 1 and 5 present some important differences. The training procedure in figure 1 never includes any perturbations, while the one from figure 5 includes a wide range of perturbations of different magnitudes, timing and directions. The evaluation procedure of figure 1 includes center-out reaches with permanent viscous (proportional to velocity) external dynamics, while that of figure 5 uses fixed, transient, square-shaped perturbations orthogonal to the reach direction. Finally, the networks in figure 1 undergo a second training procedure after evaluation while the network of figure 5 does not.

      While we agree that some variation of effectors would be beneficial, we do show examples of a point-mass effector in figure 6. Overall, figure 5 shows a task that is quite different from that of figure 1 with a similar effector, while the opposite is true for figure 6. We have modified the text to clarify this for the reader, by adding the following.

      End of 1st paragraph, section 2.4.

      Therefore, the training protocol used for this task largely differed from section 2.1 in that the networks are exposed to a wide range of mechanical perturbations with varying characteristics.

      1st paragraph of section 2.5

      […] this asymmetrical representation of PMDs during reaching movements did not occur when RNNs were trained to control an effector that lacked the geometrical properties of an arm such as illustrated in Figure 4c-e and section 2.1.

      b. I missed a discussion on the relevance of the results shown in Figure 4. The moment arms are barely mentioned outside section 2.3. Are these results new? How can they help with motor control research?

      We thank the reviewer for this comment. This relates to a point from reviewer 2 indicating that the purpose of each section was sometimes difficult to grasp as one reads. Section 2.3 explains the biomechanical properties that the toolbox implements to improve realism of the effector. They are not new results in the sense that other toolboxes implement these features (though not in differentiable formats) and these properties of biological muscles are empirically well-established. However, they are important to understand what the toolbox provides, and consequently what constraints networks must accommodate to learn efficient control policies. An example of this is the results in figure 6, where a simple effector versus a more biomechanically complex effector will yield different neural representations.

      Regarding the manuscript itself, we agree that more clarity on the goal of every paragraph may improve the reader’s experience. Consequently, we made sure to specify such goals at the start of each section. Particularly, we clarify the purpose of section 2.3 by adding several sentences on this at the end of the first paragraph in that section. We also now clearly state the purpose of section 2.3 with the results of figure 6 and reference figure 4 in that section.

      c. The results in Figure 6 are important, since one key asset of ANNs is that they provide access to the activity of the whole population of units that produces a given behavior. For this reason, I think it would be interesting to show the actual "empirical observations" that the results shown in Fig. 6 are replicating, hence allowing a direct comparison between the results obtained for biological and simulated neurons.

      These empirical observations are available from previous electrophysiological and modelling work. Particularly, polar histograms across reaching directions like panel C are displayed in figures 2 and 3 of Scott, Gribble, Graham, Cabel (2001, Nature). Colormaps of modelled unit activity across time and reaching directions like panel F are also displayed in figure 2 of Lillicrap, Scott (2013, Neuron). Electrophysiological recordings of M1 neurons during a similar task in non-human primates can also be seen on “Preserved neural population dynamics across animals performing similar behaviour” figure 2 B (https://doi.org/10.1101/2022.09.26.509498) and “Nonlinear manifolds underlie neural population activity during behaviour” figure 2 B as well (https://doi.org/10.1101/2023.07.18.549575). Note that these two pre-prints use the same dataset.

      We have added these citations to the text and made it explicit that they contain visualizations of similar modelling and empirical data for comparison:

      This heterogeneous set of responses matches empirical observations in non-human primate primary motor cortex recordings (Churchland & Shenoy, 2007; Michaels et al., 2016) and replicate similar visualizations from previously published work (Fortunato et al., 2023; Lillicrap & Scott, 2013; Safaie et al., 2023).

      (4) All examples in the paper use the arm26 plant as effector. Although the authors say that "users can easily declare their own custom-made effector and task objects if desired by subclassing the base Plant and Task class, respectively", this does not sound straightforward. Table 1 does not really clarify how to do it. Maybe an example that shows the actual code (see point 2) that creates a new plant (e.g. the 3-joint arm in Figure 7) would be useful.

      Subclassing is a Python process more than a MotorNet process, as Python is an object-oriented language. Therefore, there are many Python tutorials on subclassing in the general sense that would be beneficial for that purpose. We have amended the main text to ensure that this is clearer to the reader.

      Subclassing a MotorNet object, in a more specific sense, requires overwriting some methods from the base MotorNet classes (e.g., Effector or Environment classes, which correspond to the original Plant and Task object, respectively). Since we made the decision (mentioned above) to not include code in the main text, we added tutorials to the online documentation, which include dedicated tutorials for MotorNet class subclassing. For instance, this tutorial showcases how to subclass Environment classes:

      https://colab.research.google.com/github/OlivierCodol/MotorNet/blob/master/examples/3-environments.ipynb
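      As a general illustration of the subclassing pattern described above, here is a minimal sketch in plain Python. The classes BaseEnvironment and CenterOutReach, and the methods get_goal and describe, are hypothetical stand-ins invented for this example; they are not MotorNet's actual classes or method names, for which the linked tutorial remains the reference.

```python
import math


class BaseEnvironment:
    """Hypothetical base class (NOT MotorNet's API) exposing an overridable hook."""

    def __init__(self, max_steps=100):
        self.max_steps = max_steps

    def get_goal(self):
        # Default behaviour: the target sits at the origin.
        return (0.0, 0.0)

    def describe(self):
        # Shared logic that calls the hook, so subclasses inherit it unchanged.
        x, y = self.get_goal()
        name = type(self).__name__
        return f"{name}: goal=({x:.2f}, {y:.2f}), max_steps={self.max_steps}"


class CenterOutReach(BaseEnvironment):
    """Hypothetical subclass: only the goal-selection method is overwritten."""

    def __init__(self, radius=0.1, angle_deg=0.0, **kwargs):
        # Forward the remaining arguments to the base class constructor.
        super().__init__(**kwargs)
        self.radius = radius
        self.angle_deg = angle_deg

    def get_goal(self):
        # Place the target on a circle of the given radius around the origin.
        angle = math.radians(self.angle_deg)
        return (self.radius * math.cos(angle), self.radius * math.sin(angle))
```

      In this sketch, CenterOutReach(radius=0.1, angle_deg=90.0).describe() reuses the inherited describe method while picking up the overwritten get_goal, which is the general mechanism that subclassing relies on.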

      (5) One potential limitation of the toolbox is that it is based on Tensorflow, when the field of Computational Neuroscience seems to be, or at least that's my impression, transitioning to pyTorch. How easy would it be to translate MotorNet to pyTorch? Maybe the authors could comment on this in the discussion.

      We have received a significant amount of feedback asking for a PyTorch implementation of the toolbox. Consequently, we decided to enact this, and the next version of the toolbox will be exclusively in PyTorch. We will maintain the Application Programming Interface (API) and tutorial documentation for the TensorFlow version of the toolbox on the online website. However, going forward we will focus exclusively on bug-fixing and expanding from the latest version of MotorNet, which will be in PyTorch. We now believe that the greater popularity of PyTorch in the academic community makes that choice more sustainable while helping a greater proportion of research projects.

      These changes led to a significant alteration of the MotorNet structure, which is reflected by changes made throughout the manuscript, notably in Figure 3 and Table 1.

      (6) Supervised learning (SL) is widely used in Systems Neuroscience, especially because it is faster than reinforcement learning (RL). Thus providing the possibility of training the ANNs with SL is an important asset of the toolbox. However, SL is not always ideal, especially when the optimal strategy is not known or when there are different alternative strategies and we want to know which is the one preferred by the subject. For instance, would it be possible to implement a setup in which the ANN has to choose between 2 different paths to reach a target? (e.g. Kaufman et al. 2015 eLife). In such a scenario, RL seems to be a more natural option. Would it be easy to extend MotorNet so it allows training with RL? Maybe the authors could comment on this in the discussion.

      The new implementation of MotorNet that relies on PyTorch is already standardized to use an API that is compatible with Gymnasium. Gymnasium is a standard and popular interfacing toolbox used to link RL agents to environments. It is very well-documented and widely used, which will ensure that users who wish to employ RL to control MotorNet environments will be able to do so relatively effortlessly. We have added this point to accurately reflect the updated implementation, so users are aware that it is now a feature of the toolbox (new section 3.2.4.).
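      To make the interface convention concrete, the sketch below mimics the reset/step calling convention that Gymnasium defines (reset returns an observation and an info dictionary; step returns observation, reward, terminated, truncated and info). The PointMassReach class, its parameters and the rollout helper are hypothetical and written in plain Python for illustration; this is neither MotorNet code nor a Gymnasium subclass.

```python
import random


class PointMassReach:
    """Hypothetical 1-D reaching task following the Gymnasium calling convention."""

    def __init__(self, dt=0.01, max_steps=200, tolerance=0.01):
        self.dt = dt
        self.max_steps = max_steps
        self.tolerance = tolerance
        self._rng = None

    def _observation(self):
        return (self.position, self.velocity, self.target)

    def reset(self, seed=None, options=None):
        # Re-seed only when a seed is given (or on the very first call).
        if seed is not None or self._rng is None:
            self._rng = random.Random(seed)
        self.position = 0.0
        self.velocity = 0.0
        self.target = self._rng.uniform(-1.0, 1.0)
        self.steps = 0
        return self._observation(), {}

    def step(self, action):
        # Clip the force, then integrate a unit mass with a semi-implicit Euler step.
        force = max(-1.0, min(1.0, float(action)))
        self.velocity += force * self.dt
        self.position += self.velocity * self.dt
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -distance                         # closer to the target is better
        terminated = distance < self.tolerance     # task solved
        truncated = self.steps >= self.max_steps   # time limit reached
        return self._observation(), reward, terminated, truncated, {}


def rollout(env, policy, seed=0):
    """Run one episode and return the summed reward."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total += reward
        if terminated or truncated:
            return total
```

      Any agent that expects this calling convention can drive such an environment through a loop like rollout, for example rollout(PointMassReach(), lambda obs: obs[2] - obs[0]) with a simple proportional policy.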

      Impact:

      MotorNet aims at simplifying the process of simulating complex experimental setups to rapidly test hypotheses about how the brain produces a specific movement. By providing an end-to-end pipeline to train ANNs on the simulated setup, it can greatly help guide experimenters to decide where to focus their experimental efforts.

      Additional context:

      Since the main result is a toolbox, the paper is complemented by a GitHub repository and a documentation webpage. Both the repository and the webpage are well organized and easy to navigate. The webpage walks the user through the installation of the toolbox and the building of the effectors and the ANNs.

      Reviewer #2 (Public Review):

      MotorNet aims to provide a unified interface where the trained RNN controller exists within the same TensorFlow environment as the end effectors being controlled. This architecture provides a much simpler interface for the researcher to develop and iterate through computational hypotheses. In addition, the authors have built a set of biomechanically realistic end effectors (e.g., a 2-joint arm model with realistic muscles) within TensorFlow that are fully differentiable.

      MotorNet will prove a highly useful starting point for researchers interested in exploring the challenges of controlling movement with realistic muscle and joint dynamics. The architecture features a conveniently modular design and the inclusion of simpler arm models provides an approachable learning curve. Other state-of-the-art simulation engines offer realistic models of muscles and multi-joint arms and afford more complex object manipulation and contact dynamics than MotorNet. However, MotorNet's approach allows for direct optimization of the controller network via gradient descent rather than reinforcement learning, which is a compromise currently required when using other simulation engines (as these engines' code cannot be differentiated through).

      The paper could be reorganized to provide clearer signposts as to what role each section plays (e.g., that the explanation of the moment arms of different joint models serves to illustrate the complexity of realistic biomechanics, rather than a novel discovery/exposition of this manuscript). Also, if possible, it would be valuable if the authors could provide more insight into whether gradient descent finds qualitatively different solutions to RL or other non gradient-based methods. This would strengthen the argument that a fully differentiable plant is useful beyond improving training time / computational power required (although this is a sufficiently important rationale per se).

      We thank the reviewer for these comments. We agree that more clarity on the section goals may improve the reader’s experience and ensured this is the case throughout the manuscript. Particularly, we added the following on the first paragraph of section 2.3, for which an explicit goal was most missing:

      In this section we illustrate some of these biomechanical properties displayed by MotorNet effectors using specific examples. These properties are well-characterised in the biology and are often implemented in realistic biomechanical simulation software.

      Regarding the potential difference in solutions obtained from reinforcement or supervised learning, this would represent a non-trivial amount of work to do so conclusively and so may not be within the scope of the current article. We do appreciate however that in some situations RL may be a more fitting approach to a given task design. In relation to this point we now specify in the discussion that the new API can accommodate interfacing with reinforcement learning toolboxes for those who may want to pursue this type of policy training approach when appropriate (new section 3.2.4.).

      Reviewer #3 (Public Review):

      Artificial neural networks have developed into a new research tool across various disciplines of neuroscience. However, specifically for studying neural control of movement it was extremely difficult to train those models, as they require not only simulating the neural network, but also the body parts one is interested in studying. The authors provide a solution to this problem which is built upon one of the main software packages used for deep learning (Tensorflow). This allows them to make use of state-of-the-art tools for training neural networks.

      They show that their toolbox is able to (re-)produce several commonly studied experiments e.g., planar reaching with and without loads. The toolbox is described in sufficient detail to get an overview of the functionality and the current state of what can be done with it. Although the authors state that only a few lines of code can reproduce such an experiment, they unfortunately don't provide any source code to reproduce their results (nor is it given in the respective repository).

      The possibility of adding code snippets to the article is something we originally considered, and which aligns with comment two from reviewer one (see above). Hopefully this provides a good overview of the motivation behind our choice not to add code to the article.

      The modularity of the presented toolbox makes it easy to exchange or modify single parts of an experiment e.g., the task or the neural network used as a controller. Together with the open-source nature of the toolbox, this will facilitate sharing and reproducibility across research labs.

      I can see how this paper can enable a whole set of new studies on neural control of movement and accelerate the turnover time for new ideas or hypotheses, as stated in the first paragraph of the Discussion section. Having such a low effort to run computational experiments will be definitely beneficial for the field of neural control of movement.

      We thank the reviewer for these comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The main goal of the authors was to study the testis-specific role of the protein FBXO24 in the formation and function of the ribonucleoprotein granules (membraneless electron-dense structures rich in RNAs and proteins).

      We appreciate the summary comment of reviewer #1.

      Strengths:

      The wide variety of methods used to support their conclusions (including transgenic models)

      We appreciate the positive comment of reviewer #1.

      Weaknesses:

      The lack of specific antibodies against FBXO24. Some of the experiments showing a specific phenotype are descriptive and lack a logical explanation of the possible mechanism (i.e. AR or the tail structure).

      Because we could not obtain specific antibodies against FBXO24, we generated Fbxo24-FLAG transgenic mice, which can be used to show the interaction between FBXO24 and IPO5. For the mechanism of impaired acrosome reaction, we added some results and discussion as written in the response to the question (1) of reviewer #1 (public review). For the mechanism of abnormal flagellar structure, we added new results and fixed the manuscript as written in the response to the major comments of reviewer #3 (recommendations for the authors).

      Questions:

      The paper is excellent and employs a wide variety of methods to substantiate the conclusions. I have very few questions to ask:

      (1) KO mice cannot undergo acrosome reaction (AR) even spontaneously. How do you account for this, given that no visible defects were observed in the acrosome?

      One possibility is that Fbxo24 KO spermatozoa cannot undergo capacitation; however, it is difficult to analyze the capacitation status such as tyrosine phosphorylation because most Fbxo24 KO spermatozoa are not alive (Figure S3A). Another possibility is that AR-related proteins are affected in Fbxo24 KO spermatozoa. Therefore, we analyzed the amounts of AR-related proteins with mass spectrometry (Figure S3C). Although previous studies indicate that the assembly of the SNARE complex is a key event prior to AR [Hutt et al., 2005 (PMID: 15774481); Katafuchi et al., 2000 (PMID: 11066067); Schulz et al., 1997 (PMID: 9356173); Tomes et al., 2002 (PMID: 11884041)], no clear differences were detected for SNARE proteins (Figure S3C and D). PLCD4, which is important for AR [Fukami et al., 2001 (PMID: 11340203)], was also detected in Fbxo24 KO spermatozoa (Figure S3C). Although we could not find differences in the amounts of AR-related proteins, it is still possible that FER1L5, another AR-related protein [Morohoshi et al., 2023 (PMID: 36696506)] not detected in the mass spectrometry analyses, or AR-related proteins not yet identified are affected in Fbxo24 KO spermatozoa. We added these results and discussion (line 160-166 and 305-312).

      (2) KO sperm are unable to migrate in the female tract, and, more intriguingly, they do not pass through the utero-tubal junction (UTJ). The levels of ADAM3 are normal, suggesting that the phenotype is influenced by other factors. The authors should investigate the levels of Ly6K since mice also exhibit the same phenotype but with normal levels of ADAM3.

      We detected LY6K in Fbxo24 KO spermatozoa with immunoblotting, but no difference was found. We added the results (Figure S3E and line 172–175).

      (3) In Figure 4A, the authors assert that "RBGS Tg mice revealed that mitochondria were abnormally segmented in Fbxo24 KO spermatozoa." I am unable to discern this from the picture shown in that panel. Could you please provide a more detailed explanation or display the information more explicitly?

      We are sorry for the ambiguous explanation of the morphology of the sperm mitochondrial sheath. Fbxo24 KO cauda epididymal spermatozoa show a disorganized mitochondrial sheath rather than a “segmented” one. We fixed the sentence (line 190-192) and added white arrowheads that indicate the disorganized regions (Figure 4A).

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kaneda et al "FBXO24 ensures male fertility by preventing abnormal accumulation of membraneless granules in sperm flagella" is a significant paper on the role of FBXO24 in murine male germ cell development and sperm ultrastructure and function. The body of experimental evidence that the authors present is extraordinarily strong in both breadth and depth. The authors investigate the protein's functions in male germ cells and sperm using a wide variety of approaches but focusing predominantly on their novel mouse model featuring deletion of the Fbxo24 gene and its product. Using this mouse, and a cross of it with another model that expresses reporters in the head and midpiece, they logically build from one experiment to the next. Together, their data show that this protein is involved in the regulation of membraneless electron-dense structures; loss of FBXO24 led to an accumulation of these materials and defects in the sperm flagellum and fertilizing ability. Interestingly, the authors found that several of the best-known components of electron-dense ribonucleoprotein granules that are found in the intermitochondrial cement and chromatoid body were not disrupted in the Fbxo24 knockout, suggesting that the electron-dense material and these structures are not all the same, and the biology is more complicated than some might have thought. They found evidence for the most changes in IPO5 and KPNB1, and biochemical evidence that FBXO24 and IPO5 could interact.

      We appreciate the summary comment of reviewer #2.

      Strengths:

      The authors are to be commended for the thoroughness of their experimental approaches and the extent to which they investigated impacts on sperm function and potential biochemical mechanisms. Very briefly, they start by showing that the Fbxo24 message is present in spermatids and that the protein can interact with SKP1, in a way that is dependent on its F-box domain. This points toward a potential function in protein degradation. To test this, they next made the knockout mouse, validated it, and found the males to be sterile, although capable of plugging a female. Looking at the sperm, they identified a number of ultrastructural and morphological abnormalities, which they looked at in high resolution using TEM. They also cross their model with RBGS mice so that they have reporters in both the acrosome and mitochondria. The authors test a variety of sperm functions, including motility parameters, ability to fertilize by IVF, cumulus-free IVF, zona-free-IVF, and ICSI. They found that ICSI could rescue the knockout but not other assisted reproductive technologies. Defects in male fertility likely resulted from motility disruption and failure to get through the utero-tubal junction but defects in acrosome exocytosis also were noted. The authors performed thorough investigations including both targeted and unbiased approaches such as mass spectrometry. These enabled them to show that although the loss of the FBXO24 protein led to more RNA and elevated levels of some proteins, it did not change others that were previously identified in the electron-dense RNP material.

      The manuscript will be highly significant in the field because the exact functions of the electron-dense RNP materials have remained somewhat elusive for decades. Much progress has been made in the past 15 years but this work shows that the situation is more complex than previously recognized. The results show critical impacts of protein degradation in the differentiation process that enables sperm to change from non-descript round cells into highly polarized and compartmentalized mature sperm, with an equally highly compartmentalized flagellum. This manuscript also sets a high bar for the field in terms of how thorough it is, which reveals wide-ranging impacts on processes such as mitochondrial compaction and arrangement in the midpiece, the correct building of the major cytoskeletal elements in the flagellum, etc.

      We appreciate the positive comment of reviewer #2.

      Weaknesses:

      There are no real weaknesses in the manuscript that result from anything in the control of the authors. They attempted to rescue the knockout by expressing a FLAG-tagged Fbxo24 transgene, but that did not rescue the phenotype, either because of inappropriate levels/timing/location of expression, or because of interference by the tag. They also could not make anti-FBXO24 that worked for coimmunoprecipitation experiments, so relied on the FLAG epitope, an approach that successfully showed co-IP with IPO5 and SKP1.

      We could not rescue the phenotype with Fbxo24-FLAG transgene, but different Fbxo24 mutant mice show the same phenotypes (Figure S6G). Further, another group showed that Fbxo24 KO mice exhibited abnormal mitochondrial coiling [Li et al., 2024 (PMID: 38470475)], confirming that FBXO24 is involved in the mitochondrial sheath formation.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors found that FBXO24, a testis-enriched F-box protein, is indispensable for male fertility. Fbxo24 KO mice exhibited malformed sperm flagella and compromised sperm motility.

      We appreciate the summary comment of reviewer #3.

      Strengths:

      The phenotype of Fbxo24 KO spermatozoa was well analyzed.

      We appreciate the positive comment of reviewer #3.

      Weaknesses:

      The authors observed numerous membraneless electron-dense granules in the Fbxo24 KO spermatozoa. They also showed abnormal accumulation of two importins, IPO5 and KPNB1, in the Fbxo24 KO spermatozoa. However, the data presented in the manuscript do not support the conclusion that FBXO24 ensures male fertility by preventing the abnormal accumulation of membraneless granules in sperm flagella, as indicated in the manuscript title.

      Fbxo24 KO mice showed abnormal accumulation of membraneless granules in sperm flagella and male infertility, suggesting that FBXO24 is involved in these processes, but there are no results that show the direct relationship as reviewer #3 mentioned. Therefore, we fixed the title.

      Recommendations For The Authors:

      Reviewer #2 (Recommendations For The Authors):

      On page 4, lines 152-154, the authors introduce the RBGS mouse model and use it in their experiments. However, they left out an obvious but helpful sentence that tells the reader that they crossed the Fbxo24-null mouse with the RBGS. As one continues reading it is clear, but best to avoid even slight confusion.

      We revised the explanation in the result section (line 150-153).

      Reviewer #3 (Recommendations For The Authors):

      In this manuscript, the authors found that FBXO24, a testis-enriched F-box protein, is indispensable for male fertility. Fbxo24 KO mice exhibited malformed sperm flagella and compromised sperm motility. The phenotype of Fbxo24 KO spermatozoa was well analyzed.

      The authors observed numerous membraneless electron-dense granules in the Fbxo24 KO spermatozoa. They also showed abnormal accumulation of two importins, IPO5 and KPNB1, in the Fbxo24 KO spermatozoa. However, the data presented in the manuscript do not support the conclusion that FBXO24 ensures male fertility by preventing the abnormal accumulation of membraneless granules in sperm flagella, as indicated in the manuscript title.

      Fbxo24 KO mice showed abnormal accumulation of membraneless granules in sperm flagella and male infertility, suggesting that FBXO24 is involved in these processes, but there are no results that show the direct relationship as reviewer #3 mentioned. Therefore, we fixed the title.

      Major comments:

      In the title, abstract, introduction, and some sections such as lines 275-276, the authors conclude that FBXO24 prevents the accumulation of importins and RNP granules during spermiogenesis. However, the provided data do not substantiate this claim. To provide conclusive evidence to support the current title, the authors need to present evidence supporting: 1) direct degradation of IPO5 and KPNB1 by FBXO24; 2) the direct requirement of IPO5 for the formation of the membraneless granules, and 3) infertility resulting from the presence of membraneless granules, rather than other issues such as abnormal ODF and AX.

      (1) direct degradation of IPO5 and KPNB1 by FBXO24.

      To examine if IPO5 can be degraded by FBXO24, we performed a ubiquitination assay using HEK293T cells. Ubiquitination of IPO5 was upregulated in the presence of WT FBXO24 but not with the mutant ΔF-box FBXO24, suggesting that IPO5 can be ubiquitinated by FBXO24. We did not examine the ubiquitination of KPNB1 because we failed to construct a plasmid vector expressing mouse KPNB1. We think that KPNB1 is not the substrate because we did not detect the interaction between FBXO24 and KPNB1 (Figure 5E). We added the results of the ubiquitination assay (Figure 5F and line 261-265) and mentioned it in the abstract (line 35).

      (2) the direct requirement of IPO5 for the formation of the membraneless granules.

      (3) infertility resulting from the presence of membraneless granules, rather than other issues such as abnormal ODF and AX.

      We revealed that IPO5 aggregates under stress conditions in COS7 cells (Figure 6C and D); however, we did not examine whether IPO5 is required for the formation of the membraneless granules. We consider that protein degradation systems such as PROTAC or Trim-Away to knock down IPO5 at the protein level in Fbxo24 KO mice could be a good way to see if the membraneless granules are diminished and male fertility is rescued. However, it takes time to apply the degradation systems in vivo. Therefore, we would like to leave this rescue experiment for future studies. We fixed the title and abstract (line 37-38), and removed the last sentence of the introduction.

      Also, another group reported analyses of Fbxo24 KO mice [Li et al., 2024 (PMID: 38470475)] right after we submitted our manuscript to eLife. They reported not only disorganized flagellar structures but also abnormal head morphology, which may lead to male infertility. The differences from our study may be due to different mouse genetic backgrounds. We mentioned it in the discussion section (line 348-353).

      Minor comments:

      (1) The authors claimed a significant increase in the total amount of RNAs in Fbxo24 KO spermatozoa (lines 259-261), suggesting that the ...contain RNAs. More direct evidence supporting this claim should be provided.

      We show that the amounts of IPO5 and KPNB1 increased in Fbxo24 KO spermatozoa (Figure 5A and B), both of which could be incorporated into RNP granules in COS7 cells (Figure 6C and D), supporting the idea that membraneless electron-dense structures may be RNP granules. However, because we did not show direct evidence that electron-dense structures contain RNAs, we removed the sentences (line 259-261 of the 1st submission manuscript).

      (2) The author should provide an explanation for the absence of a FLAG band in the input Tg in Figure 5D and the larger size of the IPO5 band in the FLAG-IP group compared to the input. Similar observations are also noted in Figure 5E.

      The FLAG band is weak because the protein amount is low. When we increase the contrast, we can see the FLAG band. We added an image with high contrast (Figure 5D). Sometimes, proteins run differently with SDS-PAGE after immunoprecipitation, likely due to varying protein composition in the sample. We explained it in the figure legend (line 868-869).

      (3) In Line 526, clarify the procedure for sperm purification, and determine the potential for contamination from somatic cells.

      We did not perform sperm purification, but when we observed spermatozoa obtained from cauda epididymis, we rarely observed either somatic cells or immature spermatogenic cells. We added pictures in Figure S7. Further, we added a detailed explanation of how to collect spermatozoa from the epididymis (line 549-550).

      (4) Define the Y-axis in Figure 2E, F, and G.

      We have revised the figures.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors investigate the impact of fecal microbiota transfer (FMT) on intestinal recovery from enterotoxigenic E. coli infection following antibiotic treatment. Using a piglet model of intestinal infection, the authors demonstrate that FMT reduces weight loss and diarrhea and enhances the expression of tight junction proteins. Sequencing analysis of the intestinal microbiota following FMT showed significant increases in Akkermansia muciniphila and Bacteroides fragilis. Using additional mouse and organoid models, the authors examine the impact of these microbes on intestinal recovery and modulation of the Wnt signaling pathway. Overall, the data support the notion that FMT following ETEC infection is beneficial, however, additional investigation is required to fully elucidate the mechanisms involved.

      Strengths:

      Initial experiments used a piglet model of infection to test the value of FMT on recovery from E. coli. The FMT treatment was beneficial and the authors provide solid evidence that the treatment increased the diversity of the microbiota and enhanced the recovery of the intestinal epithelium. Sequencing data highlighted an increase in Akkermansia muciniphila and Bacteroides fragilis after FMT.

      The mouse data are consistent with the observations in pigs, and reveal that daily gavage with A. muciniphila or B. fragilis enhances intestinal recovery based on histological analysis, expression of tight junction proteins, and analysis of intestinal barrier function.

      The authors demonstrate the benefit of probiotic treatment following infection using a range of model systems.

      Weaknesses:

      Without sequencing the pre-infection pig microbiota or the FMT input material itself, it's challenging to firmly say that the observed bloom in Akkermansia muciniphila and Bacteroides fragilis stemmed from the FMT.

      Response: We have determined the relative abundance of each bacterium in fecal bacterial suspension, referring to Hu et al. (2018). The absolute abundances of Akkermansia muciniphila and Bacteroides fragilis in the FMT were 1.3 × 10^3 ± 2.6 × 10^3 and 4.5 × 10^3 ± 6.1 × 10^3 respectively.

      Reference:

      Hu LS, Geng SJ, Li Y, et al. Exogenous Fecal Microbiota Transplantation from Local Adult Pigs to Crossbred Newborn Piglets. Front. Microbiol. 2018, 8.

      The lack of details for the murine infection model, such as weight loss and quantification of bacterial loads over time, makes it challenging for a reader to fully appreciate how treatment with Akkermansia muciniphila and Bacteroides fragilis is altering the course of infection. Bacterial loads of E. coli were only quantified at one time point, and the mice that received A. muciniphila and B. fragilis had very low levels of E. coli. Therefore, it is not clear if all mice were subjected to the same level of infection in the first place. The reduced translocation of E. coli to the organs and enhanced barrier function may just reflect the low level of infection in these mice. Further, the authors' conclusion that the effect is specific to A. muciniphila or B. fragilis would be more convincing if the experiments included an inert control bacterium, to demonstrate that gavage with any commensal microbe would not elicit a similar effect.

      The weight loss was added in Figure S2A. All mice were subjected to the same level of infection in the first place.

      Many of the conclusions in the study are drawn from the microscopy results. However, the methods describing both light microscopy and electron microscopy lack sufficient detail. For example, it is not clear how many sections and fields of view were imaged or how the SEM samples were prepared and dehydrated. The mucus layer does not appear to be well preserved, which would make it challenging to accurately measure the thickness of the mucus layer.

      For light microscopy, 3-4 fields were selected from each mouse to count about 30 crypts. The electron microscopy method was supplemented on lines 263-270. We have removed the data on the mucus layer.

      Gene expression data appears to vary across the different models, for example, Wnt3 expression in mice versus organoids. Additional experiments may be required to clarify the mechanisms involved. Considering that both of the bacteria tested elicited similar changes in Wnt signaling, this pathway might be broadly modulated by the microbiota.

      The difference in Wnt3 expression pattern between mice and porcine intestinal organoids may be caused by the different ETEC infection periods in vivo and in vitro. Furthermore, in vivo, the intestinal stem cell niche is not only regulated by intestinal epithelial cells, but also affected by mesenchymal cells in connective tissues (Luo et al., 2022). In in vitro models, however, the stem cell niche is regulated only by epithelial secretory factors, which may also account for the differences between the in vitro and in vivo results.

      It has been reported that B. fragilis pretreatment significantly increased the relative abundance of A. muciniphila in the intestine of CDI mice, and the growth and maintenance of A. muciniphila were involved in the restoration of intestinal barrier integrity after CDI infection, indicating that there might exist a bacterial metabolic symbiosis between A. muciniphila and B. fragilis (Deng et al., 2018).

      References:

      Luo HM, Li MX, Wang F, et al. The role of intestinal stem cell within gut homeostasis: Focusing on its interplay with gut microbiota and the regulating pathways. Int. J. Biol. Sci. 2022, 18(13): 5185-5206.

      Deng HM, Yang SQ, Zhang YC, et al. Bacteroides fragilis Prevents Clostridium difficile Infection in a Mouse Model by Restoring Gut Barrier and Microbiome Regulation. Front. Microbiol. 2018, 9.

      The unconventional choice to not include references in the results section makes it challenging for the reader to put the results in context with what is known in the field. Similarly, there is a lack of discussion acknowledging that B. fragilis is a potential pathogen, associated with intestinal inflammation and cancer (Haghi et al. BMC Cancer 19, 879 (2019) ), and how this would impact its utility as a potential probiotic.

      Bacteroides fragilis is a symbiotic anaerobe of the mammalian gut and also an opportunistic pathogen that is often isolated from clinical specimens. It was first isolated from a site of infection and was therefore considered a pathogenic bacterium. However, further research has gradually shown that, over long-term evolution, Bacteroides fragilis colonizing the gut has established a friendly relationship with the host and is an essential component in maintaining host health, especially with respect to obesity, diabetes and immune deficiency diseases. We have supplemented the discussion on lines 598-603.

      Reviewer #2 (Public Review):

      Ma X. et al proposed that A. muciniphila was a key strain that promotes the proliferation and differentiation of intestinal stem cells by acting on the Wnt/β-catenin signaling pathway. They used various models, such as the piglet model, mouse model, and intestinal organoids to address how A. muciniphila and B. fragilis offer protection against ETEC infection. They showed that FMT with fecal samples, A. muciniphila or B. fragilis protected piglets and/or mice from ETEC infection, and this protection is manifested as reduced intestinal inflammation/bacterial colonization, increased tight junction/Muc2 proteins, as well as proper Treg/Th17 cells. Additionally, they demonstrated that A. muciniphila protected basal-out and/or apical-out intestinal organoids against ETEC infection via Wnt signaling. While a large body of work has been performed in this study, there are quite a few questions to be addressed.

      Major comments:

      - The similar protective effect of FMT with fecal samples, A. muciniphila or B. fragilis is perhaps not that surprising, considering that FMT likely restores microbiota-mediated colonization resistance against ETEC infection. While FMT with fecal samples increases SCFAs, it is unclear whether/how FMT with A. muciniphila or B. fragilis alter the microbiota composition/abundance as well as metabolites in the current models in a way that offers protection.

      We examined changes in the gut microbiota of mice treated with A. muciniphila and B. fragilis by 16S rRNA sequencing. The results showed that both A. muciniphila and B. fragilis improved the alpha and beta diversities of the microbiota, although these results were not included in this manuscript.

      - Does ETEC infection in piglets/mice cause histological damage in the intestines? These data should be shown.

      The results of scanning electron microscopy (Figure 3A) showed the intestinal damage of piglets after ETEC infection. H&E staining and transmission electron microscopy (Figure 5A and 5B) showed the intestinal damage of mice after ETEC infection.

      - Line 447, "ETEC adheres to intestinal epithelial cells". However, there is no data showing the adherence (or invasion) of ETEC to intestinal epithelial cells, irrespective of piglets/mouse/organoids.

      Scanning electron microscopy (Figure 3A, bottom) showed obvious rod-shaped bacteria adhering to the surface of the microvilli in ETEC K88-infected piglets. Figure 2C showed the colonization of ETEC K88 in the jejunum and colon of piglets. Figure S2A showed E. coli colonization in the intestines and other tissues of mice.

      - In both basal-out and apical-out intestinal organoid models, A. muciniphila protects organoids against ETEC infection. Did ETEC enter into intestinal epithelial cells at all after only one hour of infection? Is the protection through certain A. muciniphila metabolites?

      It has been reported that the co-culture duration for studying host-microbiota cross-talk in the apical-out organoid model is 1 hour (Poletti et al., 2021). In addition, Co et al. (2019) used an apical-out organoid model to study host-pathogen interactions, with Salmonella enterica serovar Typhimurium or Listeria monocytogenes invading organoids for an hour.

      References:

      Poletti M, Arnauts K, Ferrante M, et al. Organoid-based Models to Study the Role of Host-microbiota Interactions in IBD. J. Crohns Colitis. 2021, 15(7): 1222-1235.

      Co JY, Margalef-Catala M, Li XN, et al. Controlling Epithelial Polarity: A Human Enteroid Model for Host-Pathogen Interactions. Cell Reports. 2019, 26(9): 2509-2520.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.

      Strengths:

      The strengths of this manuscript include the use of multiple model systems and follow-up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.

      Weaknesses:

      The major weakness is that, as presented, the manuscript is quite difficult to follow, even for someone familiar with the field. The lack of detail in figure legends, organization of the text, and frequent use of non-intuitive abbreviated group names without a clear key (ex. EP/EF, or C E A B) make comprehension challenging. The results section is perhaps too succinct and does not provide sufficient information to understand experimental design and interpretation without reading the methods section first or skipping to the discussion (as an example: WNT-c59 treatment). Extensive revisions could be encouraged to aid in communicating the potentially exciting findings.

      The abbreviations of the experimental groups are first defined in the Methods and Materials, and we have supplemented the experimental design in the results section on lines 397-399, 439-442 and 516-520.

      The bioinformatics section of the methods requires revision and may indicate issues in the pipeline. Merging the forward and reverse reads may represent a problem for denoising. Also since these were sequenced on a NovaSeq, the error learning would have to be modified or the diversity estimates would be inappropriately multiplied. "Alpha diversity and beta diversity were calculated by normalized to the same sequence randomly." Not sure what this means, does this mean subsampled? "Blast was used for sequence alignment", does this mean the taxonomic alignment? This would need to be elaborated on and database versions should be included. The methods, including if any form of multiple testing was included, for LEFSE was also not included.

      Denoising was conducted using UNOISE3 to correct for sequencing errors. Subsequent analyses of alpha diversity and beta diversity were all performed on the normalized output data. Multiple sequence alignment was performed using MUSCLE (v3.8.31) software to obtain the phylogenetic relationships of all OTU sequences. We have supplemented the method of multiple testing on lines 323-328.
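
      The reviewer asks whether "normalized to the same sequence randomly" means subsampling. If it does (an assumption, since the response does not confirm it), the usual procedure is rarefaction: every sample is randomly subsampled without replacement to a common read depth before diversity is computed. The sketch below uses made-up OTU counts and hypothetical sample names, and is not the authors' pipeline.

```python
import math
import random

def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from a vector of OTU counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the count vector into one entry per read, labelled by OTU index.
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(pool, depth)
    out = [0] * len(counts)
    for otu in picked:
        out[otu] += 1
    return out

def shannon(counts):
    """Shannon diversity index (natural log) of a count vector."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

rng = random.Random(42)  # fixed seed so the subsample is reproducible

# Hypothetical OTU tables with very different sequencing depths.
samples = {
    "sample_1": [120, 30, 50, 0, 10],
    "sample_2": [900, 400, 300, 150, 50],
}

# Rarefy every sample to the depth of the shallowest one.
depth = min(sum(c) for c in samples.values())
rarefied = {name: rarefy(c, depth, rng) for name, c in samples.items()}

for name, counts in rarefied.items():
    print(name, counts, round(shannon(counts), 3))
```

      Reporting the subsampling depth and the random seed alongside the diversity values is what makes such an analysis reproducible.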

      Reviewer #1 (Recommendations For The Authors):

      At some points, the rationale for using both porcine and murine models was unclear, and it would be helpful for the reader to elaborate on the benefits of these models and why they were used in the introduction. Similarly, it would be helpful to describe the benefits of basal-in organoids versus injecting standard organoids with bacteria.

      The main subject of this study was piglets, supplemented by a mouse model for validation. Interpretation of measurements from organoid microinjection experiments must account for multiple confounding variables, such as heterogeneous exposure concentrations and durations, as well as the impact of disrupting the organoid wall. We have added this description to the introduction on lines 88-90.

      Line 165 -- The number of piglets used seems high, is it correct approximately 100 pigs were used?

      Nine litters were selected for processing, while only 18 piglets were finally slaughtered.

      There is very little discussion of the preliminary experiment that the authors used to determine how much bacteria to use. I recommend either discussing the data and how the doses were chosen or omitting it. It was not clear if the authors used pasteurized or live bacteria in the experiments. It would also be interesting to include a discussion of the observation that relatively low levels of Akkermansia (10^6 CFU) appeared more beneficial than the higher doses, typically used in these types of experiments.

      We removed these results. The experiments used live bacteria.

      Microscopy methods for both light microscopy and EM would be stronger with added details including how many sections and fields of view were imaged and how the numbers of goblet cells normalized across samples. Without having a clear cross-section of a crypt, it is not clear to me how the images can be used to accurately quantify the number of cells per crypt. Additional details in the methods on how many total crypts were counted should also be included.

      For light microscopy, 3-4 fields were selected from each mouse to count about 30 crypts. We have removed the data of the mucus layer and goblet cells.

      Line 236 -- missing which gene was used.

      The Genbank Accession was added on line 232-233.

      Line 310 -- OTU nomenclature.

      We have supplemented the OTU nomenclature on line 314.

      Line 413 -- This line seems inconsistent with the data analysis described in the methods section. The authors may need to expand their description of the 16S data analysis to be clear and reproducible.

      We have redescribed the 16S data analysis on line 312-328.

      Line 413 -- it is not surprising that 16s analysis did not capture species, it will have limited resolution beyond the genus level.

      We deleted this sentence.

      Methods are missing some details on the data analysis, eg. methods/programs and statistical analysis of PCoA and NMDS, LefSe.

      The methods and statistical analysis of PCoA, NMDS and LEfSe were supplemented on lines 323-328.

      Fig 4C -- The images do not clearly capture the mucus layer or how it was analyzed. The sections appear to be cut at a slight angle, with multiple partial sections of crypts. I think this might make it challenging to count goblet cells, especially if the counts are normalized over the number of crypts or villi. The mucus layer does not appear well preserved. For example, I would expect to see an intact mucus layer lining the colon in the PBS control group. Re-cutting sections with a clean cross-section through the tissue will make data analysis easier.

      We have removed data of the mucus layer.

      Fig 4D -- The images appear to be of the mouse proximal colon, whereas the mucus layer and most muc2 will be in the distal colon. If the authors have tissue sections of the distal colon, this may give a clearer image of the mucus layer and might be more consistent with the TEM images in Fig. 4B.

      We apologize for the absence of the distal colon sections.

      To fully preserve the mucus layer, in addition to fixing in Carnoy's solution, the embedding process must be run without the standard washes in 70% ethanol (see: Johansson and Hansson. Methods Mol Biol. (2012) 229; doi: 10.1007/978-1-61779-513-8_13). The mucus will wash away during standard paraffin embedding if the tissue is washed with 70% ethanol, and I wonder if that has occurred in these samples.

      The tissue wasn’t washed with 70% ethanol.

      Fig 6A and 6B -- Although the legend indicates that the data is representative of two independent experiments, it is not clear how many fields of view or cells were imaged. In the bar graphs, it is not clear how many crypts were analyzed and from how many fields of view.

      3-4 fields were selected from each mouse to count about 30 crypts.

      For all of the bar graphs, this could be addressed by displaying all of the data points, rather than just the mean, to give the reader a sense of how many cells were counted (as was done in Fig 7B).

      We have changed the bar graphs with data points.

      498-501 -- The text says that the gene expression patterns in the organoids are consistent with the in vivo data, but the data patterns of gene expression appear to be different. For example, patterns for Wnt3 and B-catenin expression in mice, appear to be the opposite of what was observed in the organoid?

      Lines 509-512 mean that the expression patterns in mouse organoids and in vivo are consistent. Figure 7C was incorrectly written as Figure 8C; we have corrected it.

      Since Akkermansia does not grow under aerobic conditions, it should be made clear that the organoid co-culture treatment does not involve actively growing bacterial cultures.

      Reunanen et al. found that Akkermansia can tolerate oxygen: more than 90% of Akkermansia cells remained viable for 1 h under oxic, 5% CO2 conditions.

      Reference:

      Reunanen J, Kainulainen V, Huuskonen L, et al. Akkermansia muciniphila Adheres to Enterocytes and Strengthens the Integrity of the Epithelial Cell Layer. Appl. Environ. Microbiol. 2015, 81(11): 3655-3662.

      Minor points

      Line 50 -"evidence".

      We have changed to “evidence” on line 49.

      Line 64, 422 - italicize, check italics throughout.

      We have checked italics throughout the manuscript.

      Line 64 - may need to be reworded.

      We have changed to “Clostridioides difficile” on line 66.

      Line 77 - pathogen.

      We have changed to “pathogen” on line 77.

      Line 161 - the.

      We have removed “the” on line 161.

      Line 178 - mouse.

      We have changed to “mouse” on line 179.

      Line 313 -- wording is confusing.

      We have changed the description on line 319-320.

      Line 318 -- Silva version #.

      The version is Silva 132. We have added it on line 316.

      Line 334 - Manufacturer for Live/Dead cell stain?

      The Live/Dead cell stain used was BD Biosciences FVS510. We have added this on line 345.

      Line 433 -- FD4 not defined until here.

      We have defined FD4 on lines 218-219.

      Line 512 -- but did not promote.

      We have changed to “but did not promote” on line 526.

      Line 517 -- Looks like this should be "basal-in organoids" instead of basal-out?

      We have changed "basal-out" to "apical-out" on line 531.

      Line 546 -- induced neonatal should be protected?

      They are in separate pens.

      Jumps from Fig 7B to Fig 8C in the text.

      We apologize for the error, and we have corrected it.

      Reviewer #2 (Recommendations for The Authors):

      The title itself is a bit misleading. Please consider changing it. The authors meant that A. muciniphila prevents pathogen invasion, but does not function in pathogen invasion.

      We have changed the title.

      Major comments:

      - Figures 4A, 4D, and 6B should include presentation of cross-section pictures.

      We provided cross-section pictures to the journal.

      - Figures 7, 8, and 9 should indicate clearly whether mouse or piglet organoids are used. For instance, in the main text, line 490, it indicates piglet organoids, but in Figure 7A legend, it indicates mouse tissue.

      We apologize for the misspelling, and have changed to “mice” on line 501-502.

      - In Figure 7A, the 3rd row, 2nd panel, crypts formed into spherical organoids; whereas in Figure 8, ETEC infection of basal-out organoids formed budding organoids. This needs to be better explained.

      Mouse intestinal organoids were cultured ex vivo from crypts isolated from mice infected with ETEC, while porcine intestinal organoids were co-cultured with ETEC in vitro.

      Minor comments:

      - In the result section, the numbering of Figures or supplementary Figures is problematic, i.e it should start with Figure 1..., Figure S1, but not directly go to Figure S2A etc.

      Figure 1 is in the Materials and Methods.

      - Line 458, please add the gating strategy used in the flow cytometry study.

      The gating strategy was added on line 351-356.

      - The effect of A. muciniphila on the proliferation of intestinal epithelium through the Wnt/β-catenin signaling pathway is well known (such as PMID: 32138776). The authors should discuss this in detail.

      We have supplemented the discussion on line 637-639.

      Reviewer #3 (Recommendations For The Authors):

      It is somewhat unusual that the results from the piglets are in the supplement as this is a major strength of the manuscript (Fig S2).

      We have put these results into Figure 2 of the manuscript.

      "Collectively, our results may provide theoretical basis that FMT is a promising mitigation method for pathogenic bacteria infection and a new strategy for precise application of FMT in clinical and livestock production"- This is somewhat of an odd statement as the introduction of the manuscript completely skips over most of what is known about FMTs in the context of C. difficile. Also if anything, does the authors' own data not point mostly at using A. muciniphila on its own? Clinical trials are well underway in humans.

      We have changed the sentences to “Collectively, our results may provide theoretical basis that A. muciniphila is a promising method to repair intestinal barrier damage and a new strategy for the precise application of A. muciniphila in livestock production.” on line 98-100.

      Line 26: I am not sure probiotic is the right word here given its strict scientific definition. Perhaps beneficial or protective would be more appropriate.

      We have changed “probiotic” to “beneficial” on line 25.

      Line 27: I believe AIMD is antibiotic-induced microbiome-depletion in most usages which may be more accurate and informative than dysregulated.

      The antibiotic type, dose, and duration we used were chosen to induce microbiota disorder.

      It would appear that there are issues in the reference formatting where a number of journal names are missing.

      We have re-edited the reference formatting.

      Line 64- I believe eLife requires the standard practice of italicizing genus and species names. Also Clostridium difficile should now be referred to as Clostridioides difficile.

      We have changed to “Clostridioides difficile” and italicized it on lines 66 and 569. The italicization of genus and species names was checked throughout the manuscript.

      Figure S2C: it is not clear why the melt curve was included here, and the legend should make it clearer what is being shown. I assume this is to provide evidence of specificity?

      The melting curve was used to demonstrate that only the ETEC K88 could be amplified by the primers we used. We have added an illustration in the figure legend.

      Figure 2D: there should be a quantitative analysis done on the staining of Muc2.

      We have quantified the staining of MUC2 in Figure 3D.

      Figure 3: The legends are not sufficient. For example: it is not clear what Figure 3A actually shows as the y-axis is not labelled and it is not clear what the relationship is between this and the anosim which is a function for permanova.

      ANOSIM analysis was performed in R with the anosim function, based on the rank order of Bray-Curtis distance values, to test the significance of differences between groups. The y-axis is the rank of the distance between samples.
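
      To make the rank-based logic concrete, here is a minimal sketch of the ANOSIM R statistic, R = (mean between-group rank − mean within-group rank) / (N(N − 1)/4), computed from Bray-Curtis distances. It is a pure-Python illustration on invented counts, not the R function the authors used, and it omits the permutation test that supplies the p-value.

```python
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

def average_ranks(values):
    """Rank values starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def anosim_r(samples, groups):
    """ANOSIM R statistic from ranked pairwise Bray-Curtis distances."""
    pairs = list(combinations(range(len(samples)), 2))
    dists = [bray_curtis(samples[i], samples[j]) for i, j in pairs]
    ranks = average_ranks(dists)
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    n = len(samples)
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)

# Invented OTU counts: two groups with clearly different compositions.
samples = [
    [10, 0, 1], [9, 1, 0], [8, 0, 2],   # group "A"
    [0, 10, 5], [1, 9, 6], [0, 8, 4],   # group "B"
]
groups = ["A", "A", "A", "B", "B", "B"]

r_stat = anosim_r(samples, groups)
print(r_stat)
```

      R approaches 1 when every between-group distance outranks every within-group distance, as in this toy data, and sits near 0 when group membership explains nothing about the distances.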

      Line 416- OTU not OUT.

      We have changed to “OTU” on line 428.

      Figure 4- the naming key needs to be included in the figure legend. C, E, A, and B are not immediately obvious.

      The naming key was included in the figure legend.

      Methods: additional information on the flow cytometry gating strategy/controls should be included.

      The gating strategy was added on line 351-356.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript addresses a fundamental question about how different types of communication signals differentially affect brain state and neurochemistry. In addition, their manuscript highlights the various processes that modulate brain responses to communication signals, including prior experience, sex, and hormonal status. Overall, the manuscript is well-written and the research is appropriately contextualized.

      That being said, it remains important for the authors to think more about their analytical approaches, in particular the effect of normalization and the explicit outlining and interpretation of statistical models. As mentioned in the original review, the normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis; by normalizing all data to the baseline data and including this baseline data in the repeated-measures analysis, one artificially creates a baseline period with minimal variation that dramatically differs in variance from other periods (akin to heteroscedasticity). If the authors want to analyze how a stimulus changes neurochemical concentrations, they could analyze the raw data but depict normalized data in their figures (similar to other papers). Or they could analyze group differences in the normalized data of the two stimulus periods (i.e., excluding the baseline period used for normalization).

      We appreciate the reviewer’s point on the difference in variance caused by including the 100% baseline values in the analysis. After consulting with our statistician, we chose the latter of the two approaches suggested by the reviewer. Specifically, we reran the analysis to exclude the baseline and focus only on the playback windows and the group differences. The text in the results, the significance signs in the figures, and the discussion are corrected accordingly. Despite these changes, our major conclusions remain as before.

      We also followed this reviewer’s suggestions to clarify the statistical model in studying the experience effect. After further consultation with our statistician, we reran the analysis on experience effect, including all the groups of EXP and INEXP animals together. We have corrected text in the figure captions, results, discussion, and data analysis sections of the manuscript related to the effect of experience and its interactions. This has not changed the conclusion made related to the experience effect in the dataset.

      It would also be useful for the authors to provide further discussion of the potential contributions of different types of experiences (mating vs. restraint) to the change in behavior and neurochemical responses to the vocalization playbacks and to try to disentangle sensory and motor contributions to neurochemical changes.

      We have acknowledged in the Discussion that previous studies suggest that the effect of experience involving stress could be generalized. We believe that this is an important area of future research. Our Discussion acknowledges that the relationship between sensory and motor contributions to neurochemical changes remains an area of interest. We further point out that the time resolution of microdialysis data renders the suggested discussion highly speculative. We plan to use other methods to assess this in future experiments.

      Reviewer #3 (Public Review):

      The work by Ghasemahmad et al. has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the basolateral amygdala (BLA) while an animal listens to social vocalizations.

      Ghasemahmad et al. made changes to the manuscript that have significantly improved the work. In particular, the transparency in showing the underlying levels of Ach, DA, and 5HIAA is excellent. My previous concerns have been adequately addressed.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I appreciate the authors' responses to my previous queries (and to the comments by other reviewers). The introduction does a better job contextualizing the data, and the additional details in the results and Methods sections help readers digest the material. I continue to think the topic is interesting and the manuscript is potentially impactful. However, I continue to be concerned about their analytical approaches and other aspects of the revised manuscript.

      (a) Normalization

      In my original review I wrote: "The normalization of neurochemical data seems unnecessary given the repeated-measures design of their analysis and could be problematic; by normalizing all data to the baseline data (p. 24), one artificially creates a baseline period with minimal variation (all are "0"; Figures 2, 3 & 5) that could inflate statistical power." I continue to feel that an analysis of normalized data that includes the baseline data is inappropriate because of the minimal variation in the normalized data for the baseline period. When the normalized data for the baseline period is included in the analysis, there is clearly variation in the extent of variability within each of the time periods (no variability at baseline, variability during periods 1 & 2; analogous to heteroscedasticity). For example, when analyzing the RAW DATA about the change in ACh release in experienced males listening to restraint vocalizations (thank you for releasing the raw data), there was a non-significant effect of time (baseline, period 1, and period 2; linear mixed effects model; F(2,12)=3.2, p=0.0793). However, when the normalized data for this dataset was analyzed (with baseline values being set at 100% for each mouse), there was a statistically significant effect (F(2,12)=4.5, p=0.0352). This example is just to illustrate how normalization can affect (e.g., inflate) statistical power.
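
      The variance point can be illustrated with a small sketch. The numbers below are simulated, not the study's data; the group size of seven and the concentration range are assumptions chosen only for illustration. Expressing each animal's values as a percentage of its own baseline forces the baseline column to exactly 100% for every animal, so its between-animal variance is zero while the two playback periods keep their variance.

```python
import random
import statistics

def normalize_to_baseline(raw):
    """Express each animal's values as a percentage of its own baseline."""
    return [tuple(100.0 * (v / row[0]) for v in row) for row in raw]

def per_period_variance(rows):
    """Sample variance across animals for each time period (column)."""
    return [statistics.variance(col) for col in zip(*rows)]

random.seed(0)

# Simulated concentrations (arbitrary units) for 7 hypothetical animals:
# (baseline, playback period 1, playback period 2). Baselines differ widely,
# mimicking non-experimental factors such as probe placement.
raw = []
for _ in range(7):
    base = random.uniform(5.0, 50.0)
    raw.append((base,
                base * random.uniform(0.9, 1.6),
                base * random.uniform(0.9, 1.6)))

norm = normalize_to_baseline(raw)
raw_var = per_period_variance(raw)
norm_var = per_period_variance(norm)

print("raw variance per period:       ", [round(v, 2) for v in raw_var])
print("normalized variance per period:", [round(v, 2) for v in norm_var])
```

      A repeated-measures model that includes the normalized baseline therefore compares one period with no variance against two periods with variance, which is the heteroscedasticity the review describes.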

      That being said, I do think that it is reasonable to analyze normalized data if the period used for normalization is NOT included in the analysis (see Figure 3 of one of the papers the authors listed in their response to reviewers: Galvez-Marquez et al., 2022). However, from the reading of this manuscript, it does seem like normalized baseline data are analyzed to assess how stimuli affect neurochemical concentrations.

      We appreciate the reviewer’s point on the difference in variance caused by including the 100% baseline values in the analysis. After consulting with our statistician, we chose one of the two approaches suggested by the reviewer. Specifically, we reran the analysis to exclude the baseline and focus only on the playback windows and the group differences. The text in the results, the significance signs in the figures, and the discussion are corrected accordingly. Despite these changes, our major conclusions remain as before. We have included some descriptive statistics in the text because we think these are informative.

      We decided to take this approach because the inter-individual variability in the raw data levels, caused by non-experimental factors, is too great to be useful. As we have stated before, these values are affected by probe placement, collection process, or differences in the HPLC or LC/MS runs. These effects are widely recognized in the field.
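The point at issue in this exchange can be illustrated with a minimal sketch. The concentrations below are invented for illustration and are not taken from the study; the function names are likewise hypothetical. The sketch only shows that expressing each animal's values as a percentage of its own baseline removes between-animal variance from the baseline period while leaving it in the playback periods.

```python
from statistics import pvariance

# Hypothetical raw concentrations (arbitrary units) for three animals,
# ordered as (baseline, period 1, period 2). Invented values, not study data.
raw = {
    "mouse_a": (2.0, 2.6, 3.0),
    "mouse_b": (8.0, 9.2, 10.4),
    "mouse_c": (20.0, 27.0, 25.0),
}


def percent_of_baseline(series):
    """Express every time point as a percentage of the animal's own baseline."""
    baseline = series[0]
    return tuple(100.0 * value / baseline for value in series)


def variance_by_period(table):
    """Population variance across animals, computed separately for each period."""
    return [pvariance(column) for column in zip(*table.values())]


normalized = {mouse: percent_of_baseline(series) for mouse, series in raw.items()}

raw_var = variance_by_period(raw)          # large at every period, baseline included
norm_var = variance_by_period(normalized)  # zero at baseline, non-zero afterwards

print("raw variance per period:       ", raw_var)
print("normalized variance per period:", norm_var)
```

In this toy example the raw data vary widely between animals at baseline, which is the authors' argument for normalizing, while the normalized baseline has no variance at all, which is the reviewer's argument for excluding it from the model.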

      It is worth pointing out a few things about the papers listed by the authors. Li et al. (2023) does depict normalized microanalysis data but it isn't clear that any analysis of the normalized data is conducted. The same can be said about Holly et al. (2016). Further, in Bagley et al. (2011), the authors depict normalized data in the figures but conduct analyses on the raw data ("After chronic morphine treatment, systemic naloxone injection increased GABA outflow in PAG by 41% (from 24.6 ± 2.9 nM to a peak of 34.8 ± 3.8 nM, n = 6, P = 0.016), but did not alter GABA levels after vehicle treatment (39.8 ± 8.3 to 38.6 ± 7.4 nM with naloxone at matched peak time, n = 4; Fig. 3a)"). This latter approach (analyzing raw data in a repeated-measures manner and depicting normalized data) seems reasonable for the authors of the current study.

      (b) Clarification and modification of statistical models

      When analyzing the effect of experience on neuromodulator release, the authors analyze the experienced and inexperienced mice independently (e.g., figure 3 vs. 6). The ideal way to assess the effects of experience is to create a factorial model. For example, one could analyze a full factorial model with experience (exp vs. inexp), stimulus type (mating vs. restraint) and time (baseline, period 1 vs period 2, assuming raw data are used). If one wanted to exclude the baseline period because group differences in baseline are not informative, conducting a factorial analysis of normalized data with just the data from period 1 and 2 seems fine. I believe an analysis like this will help increase the legitimacy of the analysis. For example, when analyzing the normalized data (periods 1 and 2) of experienced and inexperienced males in response to mating or restraint vocalizations, you find a significant interaction between experience and stimulus type. Finding an effect of experience in an analysis that includes both experienced and inexperienced mice is ideal from an analytical framework.

      In Figure 6, it is not clear what the statistical model is and what the interactions mean. For example, in the figure legend for figure 6, the authors report time*context and time*sex interactions. However, in this analysis there are two groups of inexperienced males (males that are listening to restraint vocalizations, males that are listening to mating vocalizations) and one group of females (females that are listening to mating vocalizations); in other words, this is an unbalanced analysis. So, when the authors indicate a time*context interaction, does that mean they are comparing the male-restraint group to the combination of males and females listening to mating vocalizations? And when they talk about a time*sex interaction, are they analyzing how males listening to either mating or restraint vocalizations differ from females listening to a mating vocalization? This all seems peculiar to me.

      - A similar set of questions could be raised about interaction effects depicted in Figure 4.

      Overall, I would like this manuscript to be reviewed by a statistician to provide additional input on how best to analyze the data.

      We followed the reviewer’s suggestions to clarify the statistical model in studying the experience effect. After further consultation with the statistician, we reran the analysis on experience effect, including all the groups of EXP and INEXP animals together.

      Design: Intercept + Sex + Context + Experience + Sex*Experience + Context*Experience.

      As recommended by the statistician, the model is not full factorial, because we don’t have females in the restraint group and including that cell would make the design unbalanced. Therefore, running a GLM based on the above model and included factors, as advised by the statistician, is the best way of approaching the analysis for the current dataset.
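The structure of the stated design can be sketched as a small dummy-coded design matrix. This is an illustrative sketch only: the 0/1 coding, the level names, and the assumption that females were tested in both experience groups are not taken from the manuscript, and the authors' statistical software and coding scheme are not specified here.

```python
from itertools import product

# Cells assumed present in the dataset: every combination of sex, context and
# experience except females in the restraint context (hypothetical level names).
cells = [
    (sex, context, experience)
    for sex, context, experience in product(
        ("male", "female"), ("mating", "restraint"), ("EXP", "INEXP")
    )
    if not (sex == "female" and context == "restraint")
]

columns = [
    "Intercept", "Sex", "Context", "Experience",
    "Sex*Experience", "Context*Experience",
]


def design_row(sex, context, experience):
    """One row of the design matrix under an assumed 0/1 dummy coding."""
    s = 1 if sex == "female" else 0
    c = 1 if context == "restraint" else 0
    e = 1 if experience == "INEXP" else 0
    # Intercept + Sex + Context + Experience + Sex*Experience + Context*Experience
    return [1, s, c, e, s * e, c * e]


design = [design_row(*cell) for cell in cells]

# A Sex*Context column would be the product of the Sex and Context columns.
# With no female-restraint cell it is zero in every row, so it carries no information.
sex_by_context = [row[1] * row[2] for row in design]

for cell, row in zip(cells, design):
    print(cell, dict(zip(columns, row)))
print("Sex*Context column:", sex_by_context)
```

Under this coding the Sex*Context column is identically zero, which is one way to see why that interaction is left out of the model when the female-restraint cell is missing.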

      We have corrected text in the figure captions, results, discussion, and data analysis sections of the manuscript related to the effect of experience and its interactions. The GLM models are clarified for all the figures in the “data analysis” section of the manuscript. We have clarified that the major effect of experience on neuromodulators was seen in the ACh data.

      (c) Analysis of post-stimulus period

      I agree with Reviewer 3 that analyzing the post-stimulus period would be useful. As mentioned in the original review, these data could serve as an opportunity to show that the neurochemical levels returned to baseline and add further support for the model described in Figure 6. In addition, these data could help reveal the link between neurochemical release, auditory responses, and behavior. If neurochemical changes reflect auditory responses, then these should return to baseline during the post-stimulus period. In addition, if behavioral variation (e.g., between mice hearing mating vs. restraint stimuli) persists following the termination of playback, then one could similarly assess whether neurochemical variation persists following playback. If the latter is the case, then the neurochemical release could be more related to the behavior than to the playback stimulus itself.

      We did not change this analysis. Our response to Reviewer 3’s comment is shown below.

      “We decided not to include analyses of the post-stimulus period because this period is subject to wider individual and neuromodulator-specific effects and because it weakens statistical power in addressing the core question—the change in neuromodulator release DURING vocal playback. We agree that the general question is of interest to the field, but we don’t think our study is best designed to answer that question.”

      This was accepted by Reviewer 3. We also note that release patterns have multiple time courses (e.g., Aitta-aho et al., 2018 for ACh), and thus may not support an assumption that levels should return to baseline shortly after playback offset.

      Minor comments:

      Page 7, line 15: I suggest changing "vocalization-dependent" to "stimulus-dependent" because the former could connote patterns of release related to the animal itself vocalizing.

      Changed to: “There were also distinct patterns of ACh and DA release into the BLA depending on the type of vocalization playback (Fig 3C,D).”

      Discussion section: The authors should point out a few caveats with their experiments in the Discussion section. First, experienced animals received both mating (social) and restraint experiences, and it is not clear to what degree each type of experience affected neural and behavioral responses (i.e., specificity of experience effects). For example, mating experience can lead to a wide range of physiological changes, including a resilience to stress (e.g., Leuner et al., PLoS One, 2010; Arnold et al., Hormones and Behavior, 2019), so it is possible that mating experiences by themselves could have induced these changes. Or it could be that experiencing restraint stress affects responses to mating stimuli. This could be added to the first paragraph in page 16. (The authors could also discuss which aspects of the sexual encounters might be most important for the behavioral and neural plasticity.)

      We have added text to raise this issue, stating that it is unknown whether the experience effects are specific and citing the above references concerning the generalized effects of certain experiences.

      Discussion section: It would also be useful for the authors to discuss the extent to which behavior might be driving the neurochemical changes. Some of the analyses suggest that the release is independent of the behavior (e.g., reflects a sensory response) but this could be emphasized more in the Discussion.

      We believe that we have addressed this issue sufficiently in our previous response to related issues raised by this reviewer. As we note, there are limitations in the time resolution of microdialysis data that render the suggested discussion highly speculative. We plan to use other methods to assess this in future experiments.

      Figure 2, legend: Please note that the text above the images describes the stimulus played back to these animals and their hormonal state, and not the type of experience they underwent (i.e., clarify the titles).

      Changed as requested.

      I also agree with Reviewer 3 that "mating experience" is a misnomer for this manuscript. "Social experience with a female" is a more accurate descriptor. If they wanted to specifically provide mating experience, males should have only been tested with estrus (receptive) females. I don't think this wording change detracts from their findings.

      We have not changed this term. As noted in our previous response to Reviewer #3, we stated: “In the mating experience, mounting or attempted mounting was required for the animal to be included in subsequent testing.” Due to this requirement, the term “mating behavior” is informative and appropriate. In our view, “Social experience with a female” does not adequately describe our inclusion criterion or the experience.

      Reviewer #3 (Recommendations For The Authors):

      The work by Ghasemahmad et al. has the potential to significantly advance our understanding of how neuromodulators provide internal-state signals to the basolateral amygdala (BLA) while an animal listens to social vocalizations.

      Ghasemahmad et al. made changes to the manuscript that have significantly improved the work. In particular, the transparency in showing the underlying levels of Ach, DA, and 5HIAA is excellent. My previous concerns have been adequately addressed. I only have a few minor suggestions for the text and one figure.

      Minor suggestions:

      Page 2, Ln 9: add adult before male and female mice

      Changed as requested

      Page 4, Ln 10: add a period after Tsukano et al., 2019)

      Changed as requested

      Page 6, Ln 9: what did you mean by "their interaction"? Being more specific, but concise, would help the readers.

      We revised the wording to clarify that the neuromodulatory systems interact in the emission of positive and negative vocalizations.

      Page 6, Ln 17: You mention Stim 1 and Stim 2, but the stimuli are not defined at this point. The clear explanation is provided in the following paragraph. Maybe consider switching the order and defining the stimuli before you describe the liquid chromatography/mass spectrometry technique.

      We have revised and merged these paragraphs so that Stim 1 and Stim 2 are defined on first use. We also revised our description of the depiction and analysis of neurochemical data.

      Page 11, Ln 12: replace well-proven with well-documented

      Changed as requested

      Figure 2: There are two arrows pointing towards a single track. I assume one of the arrows is a duplicate. If so, delete one of the arrows. If not, please explain what the second arrow represents.

      Arrow removed

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show that upon treatment with Doxorubicin (Doxo), there is an increase in senescence and inflammatory markers in the muscles. They also show these genes get upregulated in C2C12 myoblasts when treated with conditioned media or 15d-PGJ2. 15d-PGJ2 induces cell death in the myoblasts, decreases proliferation (measured by cell numbers), and decreases differentiation and fusion. 15d-PGJ2 modified Cys184 of HRas, which is required for its activation as indicated by the FRET analysis with RAF RBD. They also showed that 15d-PGJ2 activates ERK signaling, but not Akt signaling, through the electrophilic center. 15d-PGJ2 inhibits Golgi localization of HRAS (only WT, not C181 or C184 mutant). They also showed that expressing the WT HRas followed by 15d-PGJ2 treatment led to a decrease in the levels of MHC mRNA and protein, and this defect is dependent on C184. This is a well-written manuscript with interesting insights into the mechanism of action of 15d-PGJ2. However, some clarification and experiments will help the paper advance the field significantly.

      Strengths:

      The data clearly shows that 15d-PGJ2 has a negative role in the myoblast cells and that it leads to modification of HRas protein. Moreover, the induction of biosynthetic enzymes in the PGD2 pathway also supports the induction of 15d-PGJ2 in Doxorubicin-treated cells. Both conditioned media experiments and the 15d-PGJ2 experiments show that 15d-PGJ2 could be the active component secreted by the senescent myoblasts.

      Weaknesses:

      The genes that are upregulated in the muscles upon injection with Doxo are also markers for inflammation. Since Doxo is also known to induce systemic inflammation, it is important to delineate these two effects (Inflammatory cells vs senescent cells). The expression of beta Gal and other markers of senescence in the tissue sections will help to delineate these.

      As pointed out, Doxo induces systemic inflammation along with inducing DNA damage-mediated senescence. Therefore, along with the inflammatory markers of the SASP (CXCL1/2, TNF1α, IL6, PTGS1/2, PTGDS), we also observed an increase in the mRNA levels of canonical markers of DNA damage-mediated senescence. We observed an increase in the mRNA levels of cell cycle and senescence-associated proteins p16 and p21 (Fig. 1C). We also observed an increased nuclear accumulation of p21 (Fig. 1A) and increased levels of phosphorylated H2A.X in the nucleus (Fig. 1B).

      In Figure 2, where the defect in the differentiation of myoblasts upon treatment with 15d-PGJ2 is shown, most of the cells die within 48 hours at higher concentrations, making it difficult to perform the experiments. This also shows that 15d-PGJ2 was toxic to these cells. Lower concentrations show a decrease in the differentiation based on the lower number of nuclei in fibers and low expression of MyoD, MyoG, and MHC. However, it is unclear if this is due to increased cell death or defective differentiation. It would be a lot more informative if the cell count, cell division, and cell death could be plotted for these concentrations of the drug during the experiment.

      We measured the viability of C2C12 cells after 24 hours of treatment with 15d-PGJ2 using the MTT assay and observed that the viability of cells was decreased after treatment with 15d-PGJ2 (10 µM) but not with 15d-PGJ2 (1 µM, 2 µM, 4 µM, or 5 µM) (see Fig. S2A of the updated manuscript). The results and figures of the manuscript have been updated accordingly.

      Also, in the myoblast experiments, are the effects of treatment with Dox reversible?

      The treatment with Doxorubicin is irreversible as the senescent phenotype was not reversed after withdrawal of Doxorubicin, even after 20 days.

      In Figure 3, most of the experiments are done at a high concentration, which induces almost complete cell death within 48 hours.

      Figure 3 is an acute experiment for only 1 hour, at which time no cell death was observed. Specifically, we measured the phosphorylation of Erk and Akt proteins after 1 hour of treatment with 15d-PGJ2 (10 µM) during which we did not observe any cell death.

      Even at such a high concentration of 15d-PGJ2, the increase in ERK phosphorylation is minimal.

      We observe a ~30% increase in the phosphorylation of Erk proteins after treatment with 15d-PGJ2 in 0.2% serum medium compared to treatment with vehicle (DMSO). This is reproducible and significant.

      The experiment in Figure 4C shows that C181 and C184 mutants of the HRas show higher levels in Golgi compared with WT. However, this could very well be due to the defect in palmitoylation rather than the modification with 15d-PGJ2.

      Our data does not suggest higher levels of C184S mutant in the Golgi compared with WT (Fig. S4A). We observed that the ratio of HRas levels in the Golgi to the HRas levels in the plasma membrane were similar in C2C12 cells expressing HRas C184S and HRas WT (Fig. S4A graph columns 1 and 5).

      Though the authors allude to the possibility that intracellular redistribution of HRas by 15d-PGJ2 requires C181 palmitoylation, the direct influence of C184 modification on C181 palmitoylation is not shown. To have a meaningful conclusion, the authors need to compare the palmitoylation and modification with 15d-PGJ2.

      Palmitoylation of HRas at C181 is required for the localization of HRas at the plasma membrane. The inhibition of palmitoylation of C181, either by mutation (C181S) or treatment with protein palmitoyl transferase inhibitor (2-Bromopalmitate), results in the accumulation of HRas at the Golgi (Rocks et al., 2005) (Fig. S4A). Modification of HRas at C184 by 15d-PGJ2 (Fig. 3A) could inhibit the palmitoylation of HRas at C181. However, our data does not support this hypothesis, as modification of HRas WT by 15d-PGJ2 does not increase the level of HRas at the Golgi, unlike the inhibition of cysteine palmitoylation caused by the C181S mutation.

      To test if the inhibition of myoblast differentiation depends on HRas, they overexpressed the HRas and mutants in the C2C12 lines. However, this experiment does not take the endogenous HRas into consideration, especially when interpreting the C184 mutant. An appropriate experiment to test this would be to knock down or knock out HRas (or make knock-in mutations of C184) and show that the effect of 15d-PGJ2 disappears.

      Endogenous HRas (wild type) is present in the C2C12 cells overexpressing the EGFP-tagged HRas constructs. Therefore, we only observe a partial rescue in the differentiation after 15d-PGJ2 treatment in C2C12 cells expressing the C184S mutant (Fig. 4D and E). However, since HRas is expressed under high expression CMV promoter and in the absence of other regulatory elements, the overexpressed constructs do show a dominant effect over the endogenous HRas, showing cysteine mutant dependent inhibition of differentiation of myoblasts after treatment with 15d-PGJ2 (Fig. 4D and E).

      Moreover, in this specific experiment, it is difficult to interpret without a control with no HRas construct and another without the 15d-PGJ2 treatment.

      The mRNA levels of MyoD, MyoG, and MHC in C2C12 cells expressing HRas constructs after treatment with 15d-PGJ2 were normalized to the mRNA levels in C2C12 cells expressing the corresponding constructs and treated with vehicle (DMSO). mRNA levels in C2C12 cells treated with vehicle are not shown as they were normalized to 1. MHC protein levels in C2C12 cells expressing HRas constructs after 15d-PGJ2 treatment were normalized to those in C2C12 cells treated with vehicle (DMSO). Since the aim was to study the effect of HRas cysteine mutations on the differentiation of myoblasts after treatment with 15d-PGJ2, C2C12 cells expressing HRas WT serve as an adequate control. Fig. 2 shows the effect of 15d-PGJ2 on muscle differentiation when HRas was not overexpressed.

      Moreover, the overall study does not delineate the toxic effects of 15d-PGJ2 from its effect on the differentiation.

      The inhibition of differentiation in C2C12 cells after treatment with 15d-PGJ2 cannot be attributed to the general toxicity of 15d-PGJ2 in cells. We show that the inhibition of differentiation of myoblasts after 15d-PGJ2 treatment depends on modification of HRas at C184; i.e., failure of 15d-PGJ2 to modify HRas at C184 (Fig. 3A), and the resulting lack of activation (Fig. 3B), rescues this inhibition of differentiation of C2C12 cells (Fig. 4D and E), dissecting the inhibition of differentiation of myoblasts by 15d-PGJ2 from general toxic effects of 15d-PGJ2 on cell physiology.

      Please note that the effect of 15d-PGJ2 on cell physiology is context-specific. On one hand, 15d-PGJ2 has been shown to exert tumor-suppressor effects by inhibiting the proliferation of ovarian cancer cells and lung adenocarcinoma cells (de Jong et al., 2011; Slanovc et al., 2024). On the other hand, 15d-PGJ2 also exerts pro-carcinogenic effects by induction of epithelial to mesenchymal transition in breast cancer cells MCF7 and inhibition of tumor-suppressor protein p53 in MCF7 and PC-3 cells (Choi et al., 2020; Kim et al., 2010).

      Reviewer #2 (Public Review):

      Summary:

      In this study, Swarang and colleagues identified the lipid metabolite 15d-PGJ2 as a potential component of senescent myoblasts. They proposed that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas, suggesting its potential as a target for restoring muscle homeostasis post-chemotherapy.

      Strengths:

      The regulation of HRas by 15d-PGJ2 is well controlled.

      Weaknesses:

      (1) I still think the novelty is limited by previous published findings. The authors themselves noted that the accumulation of 15d-PGJ2 in senescent cells has been reported in various cell types, including human fibroblasts, HEPG2 hepatocellular carcinoma cells, and HUVEC endothelial cells (PMCID: PMC8501892). Although the current study observed similar activation of 15d-PGJ2 in myoblasts, it appears to be additive rather than fundamentally novel. The covalent adduct of 15d-PGJ2 with Cys-184 of H-Ras was reported over 20 years ago (PMID: 12684535), and the biochemical principles of this interaction are likely universal across different cell types. The regulation of myogenesis by both HRas and 15d-PGJ2 has also been previously extensively reported (PMID: 2654809, 1714463, 17412879, 20109525, 11477074). The main conceptual novelty may lie in the connection between these points in myoblasts. But as discussed in another comment, the use of C2C12 cells as a model for senescence study is questionable due to the lack of the key regulator p16. The findings in C2C12 cells may not accurately represent physiological-relevant myoblasts. It is recommended that these findings be validated in primary myoblasts to strengthen the study's conclusions.

      This is the first study to show a molecular mechanism where activation of HRas signaling in skeletal myoblasts due to covalent modification by 15d-PGJ2 at C184 of HRas inhibits the differentiation of skeletal myoblasts.

      (2) The C2C12 cell line is not an ideal model for senescence study.

      C2C12 cells are a well-established model for studying myogenesis. However, their suitability as a model for senescence studies is questionable. C2C12 cells are immortalized and do not undergo normal senescence like primary cells as C2C12 cells are known to have a deleted p16/p19 locus, a crucial regulator of senescence (PMID: 20682446). The use of C2C12 cells in published studies does not inherently validate them as a suitable senescence model. These studies may have limitations, and the appropriateness of the C2C12 model depends on the specific research goals.

      Several reports have shown that cells undergo senescence independent of p16 expression. MCF7 human breast adenocarcinoma cells have been shown to undergo DNA damage-mediated and oncogene-induced senescence, as seen after treatment with Doxorubicin (PMCID: PMC7025418) and expression of constitutively active HRas (PMID: 17135242), despite the homozygous deletion of the p16 locus (ISBN 9780124375512 Chapter 17 Table 2), by upregulation of cell cycle inhibitor protein p21. In this study, we observe an increase in the senescence markers in C2C12 cells after treatment with Doxo (Fig. 1). We also observed an increase in the markers of DNA damage-mediated senescence in MCF7 after treatment with Doxo (Data will be included in the revised manuscript). Based on these observations, we have concluded that C2C12 cells undergo senescence despite lacking the p16/p19 locus.

      In the study by Moustogiannis et al. (PMID: 33918414), they claimed to have aged C2C12 cells through multiple population doublings. However, the SA-β-gal staining in their data, which is often used to confirm senescence, showed almost fully confluent "aged" C2C12 cells. This confluent state could artificially increase SA-β-gal positivity, suggesting that these cells may not truly represent senescence. Moreover, the "aged" C2C12 cells exhibited normal proliferation, which contradicts the definition of senescence. Similar findings were reported in another study of C2C12 cells subjected to 58 population doublings (PMID: 21826704), where even at this late stage, the cells were still dividing every 2 or 3 days, similar to younger cells at early passages. More importantly, I do not know how the p16 was detected in that paper since the locus was already mutated. In terms of p21, there was no difference in the proliferative C2C12 cells at day 0.

      In the study by Moiseeva et al. in 2023 (PMID: 36544018), C2C12 cells were used for senescence modeling for siRNA transfection. However, the most significant findings were obtained using primary satellite cells or confirmed with complementary data.

      In conclusion, while molecular changes observed in studies using C2C12 cells may be valid, the use of primary myoblasts is highly recommended for senescence studies due to the limitations and questionable senescence characteristics of the C2C12 cell line.

      (3) Regarding the source of increased PGD in the conditioned medium, I want to emphasize that it's unclear whether the PGD or its metabolites increase in response to DNA damage or the senescence state. Thus, using a different senescent model to exclude the possibility of DNA damage-induced increase will be crucial.

      Though senescence can be induced by several stress stimuli like DNA damage, oncogene expression, ROS, mitochondrial dysfunction, etc., DNA damage remains critical for the induction of the SASP (reviewed in PMID: 20078217). Also, other models of senescence, like Oncogene Induced Senescence (reviewed in PMID: 17671427), ROS Induced Senescence (PMID: 24934860), and Mitochondrial Dysfunction Associated Senescence (MiDAS) (PMID: 26686024), have shown upregulation of DNA damage-associated signaling pathways. In this study, we have explored the SASP of cells undergoing senescence upon chemotherapy drug Doxorubicin-mediated DNA damage.

      (4) Similarly for the in vivo Doxorubicin (Doxo) injection, both reviewers have raised concerns about the potential side effects of Doxo, including inflammation, DNA damage, and ROS generation. These effects could potentially confound the results of the study. The physiological significance of this study will heavily rely on the in vivo data. However, the in vivo senescence component is confounded by the side effects of Doxo.

      We concur that this is a limitation of this study and the subsequent work will demonstrate the origin of prostaglandin biosynthesis after treatment with Doxo in vivo.

      (5) Figure 2A lacks an important control from non-senescent cells during the measurement of C2C12 differentiation in the presence of conditioned medium. The author took it for granted that the conditioned medium from senescent cells would inhibit myogenesis, relying on previous publications (PMID: 37468473). However, that study was conducted in the context of myotonic dystrophy type 1. To support the inhibitory effect in the current experimental settings, direct evidence is required. It would be necessary to include another control with conditioned medium from normal, proliferative C2C12 cells.

      Conditioned medium of senescent cells of several types, like senescent myoblasts in case of DM1 (PMID: 37468473) and adipocytes undergoing senescence due to H2O2 treatment, insulin resistance, and replicative senescence (PMID: 37321332), has been shown to inhibit the differentiation of myoblasts. Therefore, in this study, we measured the effect of prostaglandin PGD2 and its metabolites on the differentiation of myoblasts by inhibiting the biosynthesis of PGD2 in senescent myoblasts by treatment with AT-56 and then collecting the conditioned medium. Conditioned medium collected from senescent C2C12 cells treated with vehicle (DMSO) served as a control for the experiment.

      (6) Statistical analyses problems.

      Only the t-test was used throughout the study even when there are more than two groups. Please have a statistician evaluate the replicates and statistical analyses used.

      In experiments with more than two groups, the t-test was used for column-wise comparison of the experiment samples to the control sample. Multiple sample comparisons using one-way or two-way ANOVA were avoided as experimental samples were individually compared to the control sample.
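The statistical concern behind this exchange, that running several separate t-tests against one control raises the chance of at least one false positive, can be sketched numerically. This is a minimal illustration that assumes independent comparisons, which is only an approximation when every comparison shares the same control sample; the function names are hypothetical and nothing here reproduces the authors' analysis.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_comparisons


for k in (1, 2, 4, 6):
    print(
        f"{k} comparison(s): familywise error = {familywise_error_rate(0.05, k):.3f}, "
        f"Bonferroni per-test threshold = {bonferroni_threshold(0.05, k):.4f}"
    )
```

Procedures designed for several treatments compared with a single control (for example ANOVA followed by a many-to-one post hoc test) address the same issue without treating each comparison in isolation.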

      For the 15d-PGJ2/cell concentration measurements in Figure 1F, there were only two replicates, which were provided in the supplementary table after being requested. Was that experiment repeated with more biological replicates?

      Additional replicates of the experiment will be included in the revised manuscript.

      For figure 1C, Fig 1F, 1G, 1J, 2C, 2E, 3A, 3E, 3F, 4D, 4E, please include each data point in bar graphs as used in Fig 1D, or at least provide how many biological replicates were used for each experiment.

      Appropriate revisions will be made in the figure legends of the revised manuscript.

      There is no error bar in a lot of control groups (Fig 2C, 2E, 3EF, 4E, S4B).

      There are no error bars for the control groups in the figures 2C, 2E, 3E, 3F, 4E, and S4B as the experimental samples of each replicate were normalized to the corresponding control sample, rendering the values for the control sample of each replicate to 1.
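The arithmetic behind this explanation can be sketched as follows. The signal values are invented for illustration and the variable names are hypothetical; the sketch only shows that dividing each replicate by its own control forces every control value to 1, so the control group has no spread left to plot, while the treated group retains its replicate-to-replicate variation.

```python
from statistics import mean, stdev

# Hypothetical raw signal for three independent replicates: (control, treated).
replicates = [(10.0, 14.0), (20.0, 26.0), (5.0, 8.0)]

# Normalize each replicate to its own control sample.
normalized = [(control / control, treated / control) for control, treated in replicates]

control_values = [pair[0] for pair in normalized]
treated_values = [pair[1] for pair in normalized]

print("control mean, SD:", mean(control_values), stdev(control_values))
print("treated mean, SD:", mean(treated_values), stdev(treated_values))
```

Note that the raw control values in this toy example differ fourfold between replicates, and that information is not visible once the data are normalized in this way.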

      For qPCR data in Figure 1C, the authors responded that the data were plotted using 2^-ΔCT instead of 2^-ΔΔCT to show the variability in the expression of mRNAs isolated from animals treated with Saline. This statement does not align with the method section. Please revise.

      Appropriate revisions will be made to the method sections of the revised manuscript.
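For readers unfamiliar with the two quantities named in this comment, the following minimal sketch shows how each is commonly computed. The Ct values, the choice of the mean saline ΔCt as calibrator, and the function names are assumptions made for illustration; they are not taken from the manuscript or its methods.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """ΔCt: target-gene Ct minus reference-gene Ct for one sample."""
    return ct_target - ct_reference


def expression_2_neg_dct(ct_target, ct_reference):
    """2^-ΔCt: expression of the target relative to the reference gene."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def fold_change_2_neg_ddct(sample_dct, calibrator_dct):
    """2^-ΔΔCt: expression relative to a calibrator ΔCt (here, the control-group mean)."""
    return 2.0 ** (-(sample_dct - calibrator_dct))


# Hypothetical (target Ct, reference Ct) pairs, one per animal.
saline = [(24.0, 18.0), (25.0, 18.0)]
doxo = [(22.0, 18.0)]

saline_dct = [delta_ct(t, r) for t, r in saline]
calibrator = mean(saline_dct)  # assumed calibrator: mean ΔCt of saline animals

for label, group in (("saline", saline), ("doxo", doxo)):
    for t, r in group:
        d = delta_ct(t, r)
        print(label, "2^-dCt =", expression_2_neg_dct(t, r),
              "2^-ddCt =", fold_change_2_neg_ddct(d, calibrator))
```

Whichever quantity is plotted, the methods section should name the same one, which is the inconsistency the reviewer asks the authors to resolve.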

      (7) For Figure 1, the title may not be appropriate as there is insufficient data to support the inhibition of myoblast differentiation.

      Appropriate revisions will be made to the revised manuscript.

      Recommendations for the authors:

      After careful review, the editors advise you to carefully address the following concerns.

      (1) There were concerns that in the revised manuscript, the DMSO and Doxo experiments depicted in Figure 1H appeared quite homogenous despite the author's description to the contrary. This leads to concerns about the type of statistics employed and the possible low number of replicates of experiments shown in Fig. 1.

      (2) Experiments in Figure 1F, 1I, and 1J had as few as n=2 experiments. For Figures 1C, 1D, 1F, 1G, and 1J, the statistics used a two-tailed Student's t-test; for all other experiments, they marked N/A for statistics. Using a t-test for multi-group comparisons (as indicated in the figure legend) and relying on only 2 replicates for many experiments are not appropriate.

      Additional replicates for the experiments shown in figures 1F, 1I, and 1J have been done and the data will be revised along with updated statistical tests during the revision of the manuscript.

      (3) In several experiments, the difference between technical replicates is too high.

      Reviewer #1 (Recommendations For The Authors):

      Most of my concerns were addressed in the revised manuscript.

      We thank the reviewer for their time in reviewing the manuscript and for their consideration of the authors’ response to their comments during the previous round of review.

      Reviewer #2 (Recommendations For The Authors):

      Validating the findings in a primary myoblast is highly recommended for senescence studies due to the limitations and questionable senescence characteristics of the C2C12 cell line.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      Validate the finding in a different senescent model to exclude the possibility of DNA damage-response.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      For Fig 2A, add another control with a conditioned medium from normal, proliferative C2C12 cells.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      Please have a statistician to evaluate the replicates and statistical analyses used.

      We have explained the statistical tests used in the manuscript in the general comment section of the reviewer’s comments.

      For the bar plots (Fig. 1C, 1F, 1G, 1J, 2C, 2E, 3A, 3E, 3F, 4D, 4E), please include each data point, or at least state how many biological replicates were used for each experiment.

      Appropriate revisions will be made in the figure legends of the revised manuscript.

      For Figure 1, the title may not be appropriate as there is insufficient data to support the inhibition of myoblast differentiation.

      Appropriate revisions will be made to the revised manuscript.


      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript provides useful information about the lipid metabolite 15d-PGJ2 as a potential regulator of myoblast senescence. The authors provide experimental evidence that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas. However, the manuscript is incomplete in its current form, as it lacks robust support from the data regarding the main conclusions related to senescence and technical concerns related to the senescence models used in this study.

      We are grateful to the editors and the reviewers for their time and comments in sharpening the science and the writing of the manuscript. We have attached a detailed response to emphasize that the manuscript does include robust evidence regarding the claims, which could have been missed during the review process. We have provided a better context for these points now.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors show that upon treatment with Doxorubicin (Doxo), there is an increase in senescence and inflammatory markers in the muscles. They also show these genes get upregulated in C2C12 myoblasts when treated with conditioned media or 15d-PGJ2. 15d-PGJ2 induces cell death in the myoblasts, decreases proliferation (measured by cell numbers), and decreases differentiation and fusion. 15d-PGJ2 modified Cys184 of HRas, which is required for its activation as indicated by the FRET analysis with RAF RBD. They also showed that 15d-PGJ2 activates ERK signaling, but not Akt signaling, through the electrophilic center. 15d-PGJ2 inhibits Golgi localization of HRAS (only WT, not C181 or C184 mutant). They also showed that expressing the WT HRas followed by 15d-PGJ2 treatment led to a decrease in the levels of MHC mRNA and protein, and this defect is dependent on C184. This is a well-written manuscript with interesting insights into the mechanism of action of 15d-PGJ2. However, some clarification and experiments will help the paper advance the field significantly.

      Strengths:

      The data clearly shows that 15d-PGJ2 has a negative role in the myoblast cells and that it leads to modification of HRas protein. Moreover, the induction of biosynthetic enzymes in the PGD2 pathway also supports the induction of 15d-PGJ2 in Doxorubicin-treated cells. Both conditioned media experiments and the 15d-PGJ2 experiments show that 15d-PGJ2 could be the active component secreted by the senescent myoblasts.

      Weaknesses:

      The genes that are upregulated in the muscles upon injection with Doxo are also markers for inflammation. Since Doxo is also known to induce systemic inflammation, it is important to delineate these two effects (inflammatory cells vs senescent cells). The expression of beta Gal and other markers of senescence in the tissue sections will help to delineate these.

      As pointed out, Doxo induces systemic inflammation along with DNA damage-mediated senescence. Therefore, along with the inflammatory markers of the SASP (CXCL1/2, TNF1α, IL6, PTGS1/2, PTGDS), we also observed an increase in the mRNA levels of canonical markers of DNA damage-mediated senescence. We observed an increase in the mRNA levels of the cell cycle and senescence-associated proteins p16 and p21 (Fig. 1C). We also observed an increased nuclear accumulation of p21 (Fig. 1A) and increased levels of phosphorylated H2A.X in the nucleus (Fig. 1B).

      In Figure 2, where the defect in the differentiation of myoblasts upon treatment with 15d-PGJ2 is shown, most of the cells die within 48 hours at higher concentrations, making it difficult to perform the experiments. This also shows that 15d-PGJ2 was toxic to these cells. Lower concentrations show a decrease in the differentiation based on the lower number of nuclei in fibers and low expression of MyoD, MyoG, and MHC. However, it is unclear if this is due to increased cell death or defective differentiation. It would be a lot more informative if the cell count, cell division, and cell death could be plotted for these concentrations of the drug during the experiment.

      We measured the viability of C2C12 cells after 24 hours of treatment with 15d-PGJ2 using the MTT assay and observed that the viability of cells was decreased after treatment with 15d-PGJ2 (10 µM) but not with 15d-PGJ2 (1 µM, 2 µM, 4 µM, or 5 µM) (see Fig. S2A of the updated manuscript). The results and figures of the manuscript have been updated accordingly.

      Also, in the myoblast experiments, are the effects of treatment with Dox reversible?

      The treatment with Doxorubicin is irreversible as the senescent phenotype was not reversed after withdrawal of Doxorubicin, even after 20 days.

      In Figure 3, most of the experiments are done at a high concentration, which induces almost complete cell death within 48 hours.

      Figure 3 is an acute experiment for only 1 hour, at which time no cell death was observed. Specifically, we measured the phosphorylation of Erk and Akt proteins after 1 hour of treatment with 15d-PGJ2 (10 µM) during which we did not observe any cell death. 

      Even at such a high concentration of 15d-PGJ2, the increase in ERK phosphorylation is minimal.

      We observe a ~30% increase in the phosphorylation of Erk proteins after treatment with 15d-PGJ2 in 0.2% serum medium compared to treatment with vehicle (DMSO). This is reproducible and significant.

      The experiment in Figure 4C shows that the C181 and C184 mutants of HRas show higher levels in the Golgi compared with WT. However, this could very well be due to the defect in palmitoylation rather than the modification with 15d-PGJ2.

      Our data does not suggest higher levels of C184S mutant in the Golgi compared with WT (Fig. S4A). We observed that the ratio of HRas levels in the Golgi to the HRas levels in the plasma membrane were similar in C2C12 cells expressing HRas C184S and HRas WT (Fig. S4A graph columns 1 and 5).

      Though the authors allude to the possibility that intracellular redistribution of HRas by 15d-PGJ2 requires C181 palmitoylation, the direct influence of C184 modification on C181 palmitoylation is not shown. To have a meaningful conclusion, the authors need to compare the palmitoylation and modification with 15d-PGJ2.

      Palmitoylation of HRas at C181 is required for the localization of HRas at the plasma membrane. The inhibition of palmitoylation of C181, either by mutation (C181S) or by treatment with the protein palmitoyl transferase inhibitor 2-Bromopalmitate, results in the accumulation of HRas at the Golgi (Rocks et al., 2005) (Fig. S4A). Modification of HRas at C184 by 15d-PGJ2 (Fig. 3A) could inhibit the palmitoylation of HRas at C181. However, our data do not support this hypothesis, as modification of HRas WT by 15d-PGJ2 does not increase the level of HRas at the Golgi, unlike the inhibition of cysteine palmitoylation caused by the C181S mutation.

      To test if the inhibition of myoblast differentiation depends on HRas, they overexpressed HRas and its mutants in the C2C12 lines. However, this experiment does not take the endogenous HRas into consideration, especially when interpreting the C184 mutant. An appropriate experiment to test this would be to knock down or knock out HRas (or make knock-in mutations of C184) and show that the effect of 15d-PGJ2 disappears.

      Endogenous HRas (wild type) is present in the C2C12 cells overexpressing the EGFP-tagged HRas constructs. Therefore, we only observe a partial rescue in the differentiation after 15d-PGJ2 treatment in C2C12 cells expressing the C184S mutant (Fig. 4D and E). However, since HRas is expressed under the high-expression CMV promoter and in the absence of other regulatory elements, the overexpressed constructs do show a dominant effect over the endogenous HRas, showing cysteine mutant-dependent inhibition of differentiation of myoblasts after treatment with 15d-PGJ2 (Fig. 4D and E).

      Moreover, in this specific experiment, it is difficult to interpret without a control with no HRas construct and another without the 15d-PGJ2 treatment.

      The mRNA levels of MyoD, MyoG, and MHC in C2C12 cells expressing HRas constructs after treatment with 15d-PGJ2 were normalized to the mRNA levels in C2C12 cells expressing the corresponding constructs and treated with vehicle (DMSO). mRNA levels in C2C12 cells treated with vehicle are not shown, as they were normalized to 1. MHC protein levels in C2C12 cells expressing HRas constructs after 15d-PGJ2 treatment were normalized to those in C2C12 cells treated with vehicle (DMSO). Since the aim was to study the effect of HRas cysteine mutations on the differentiation of myoblasts after treatment with 15d-PGJ2, C2C12 cells expressing HRas WT serve as an adequate control. Fig. 2 shows the effect of 15d-PGJ2 on muscle differentiation when HRas was not overexpressed.

      Moreover, the overall study does not delineate the toxic effects of 15d-PGJ2 from its effect on the differentiation.

      The inhibition of differentiation in C2C12 cells after treatment with 15d-PGJ2 cannot be attributed to the general toxicity of 15d-PGJ2 in cells. We show that the inhibition of differentiation of myoblasts after 15d-PGJ2 treatment depends on modification of HRas at C184: failure of 15d-PGJ2 to modify HRas at C184 (Fig. 3A) and thereby activate it (Fig. 3B) rescues this inhibition of differentiation of C2C12 cells (Fig. 4D and E). This dissects the inhibition of differentiation of myoblasts by 15d-PGJ2 from the general toxic effects of 15d-PGJ2 on cell physiology.

      Please note that the effect of 15d-PGJ2 on cell physiology is context-specific. On the one hand, 15d-PGJ2 has been shown to exert tumor-suppressor effects by inhibiting the proliferation of ovarian cancer cells and lung adenocarcinoma cells (de Jong et al., 2011; Slanovc et al., 2024). On the other hand, 15d-PGJ2 also exerts pro-carcinogenic effects by inducing epithelial to mesenchymal transition in MCF7 breast cancer cells and inhibiting the tumor-suppressor protein p53 in MCF7 and PC-3 cells (Choi et al., 2020; Kim et al., 2010).

      Reviewer #2 (Public Review):

      Summary:

      In this study, Swarang and colleagues identified the lipid metabolite 15d-PGJ2 as a potential component of senescent myoblasts. They proposed that 15d-PGJ2 inhibits myoblast proliferation and differentiation by binding and regulating HRas, suggesting its potential as a target for restoring muscle homeostasis post-chemotherapy.

      Strengths:

      The regulation of HRas by 15d-PGJ2 is well controlled.

      Weaknesses:

      The novelty of the study is compromised as the activation of PGD and 15d-PGJ2, as well as the regulation of HRas and cell proliferation, have been previously reported. 

      The literature does not support this statement, and it is important to correct this misimpression for the field as a whole.

      Let us clarify:

      Covalent modification of HRas by 15d-PGJ2 has been reported only twice in the literature (Luis Oliva et al., 2003; Yamamoto et al., 2011), in fibroblasts and neurons respectively.

      Interaction between Hras and 15d-PGJ2 in skeletal muscles has not been shown before, even though both Hras and 15d-PGJ2 are shown to be key regulators of muscle homeostasis. 

      Activation of Hras by 15d-PGJ2 was reported first by Luis Oliva et al (Luis Oliva et al., 2003). However, this study does not comment on the functional implications of activation of Hras signaling. 

      Recently, our lab contributed to a study where the functional implication of activation of Hras signaling due to covalent modification by 15d-PGJ2 was shown in the maintenance of senescence phenotype (Wiley et al., 2021). 

      15d-PGJ2 was shown to inhibit the differentiation of myoblasts by Hunter et al (Hunter et al., 2001). This study hypothesized that the inhibition of myoblast differentiation is via 15d-PGJ2-mediated activation of PPARγ signaling; however, the study also showed inhibition of myoblast differentiation independent of PPARγ activity, suggesting the presence of other mechanisms.

      This is the first study to show a molecular mechanism where activation of Hras signaling in skeletal myoblasts due to covalent modification by 15d-PGJ2 at C184 of Hras inhibits the differentiation of skeletal myoblasts.

      Additionally, there are major technical concerns related to the senescence models, limiting data interpretation regarding the relevance to senescent cells.

      Major concerns:

      (1) The C2C12 cell line is not an ideal model for senescence study due to its immortalized nature and lack of normal p16 expression. A more suitable myoblasts model is recommended, with a more comprehensive characterization of senescence features.

      C2C12 is a good model for DNA damage-based senescence that is used in this manuscript. Several reports in the literature have shown the induction of senescence in C2C12 cells. Moiseeva et al 2023 show induction of senescence in C2C12 cells after etoposide-mediated DNA damage. Moustogiannis et al 2021 show the induction of replicative senescence in C2C12 cells. In this study, we show that C2C12 cells undergo DNA damage-mediated senescence after treatment with Doxo. We measured the induction of senescence in C2C12 cells upon DNA damage using several physiological (Nuclear Size, Cell Size, and SA β-gal) and molecular markers (mRNA levels of p21 and SASP factors (IL6 and TGFβ), protein levels of p21) of senescence (see Fig. 1 of the updated manuscript). The results and the figures in the manuscript have been updated accordingly.

      (2) The source of increased PGD or its metabolites in the conditioned medium is unclear. Including other senescence models, such as replicative or oncogene-induced senescence, would strengthen the study.

      Fig. 1E shows a time-dependent increase in the expression of PGD2 biosynthetic enzymes in senescent C2C12 cells. Fig. 1F shows an increase in the levels of 15d-PGJ2 secreted by senescent C2C12 cells in the conditioned medium. These data show that senescent C2C12 cells are the source of PGD and its metabolites in the conditioned medium.

      Again, C2C12 is not suitable for replicative senescence due to its immortalized status.

      We and others have shown that C2C12 cells undergo senescence, and this manuscript only used DNA damage induced senescence.

      (3) In the in vivo part, it is unclear whether the increased expression of PTGS1, PTGS2, and PTGDS is due to senescence or other side effects of DOXO.

      We concur that this is a limitation of this study and the subsequent work will demonstrate the origin of prostaglandin biosynthesis after treatment with Doxo in vivo.

      (4) Figure 2A lacks an important control from non-senescent cells during the measurement of C2C12 differentiation in the presence of a conditioned medium.

      Figure 2A tests the effect of prostaglandin PGD2 and its metabolites secreted by the senescent cells on the differentiation of myoblasts. Therefore, we inhibited the synthesis of PGD2 in senescent cells by treatment with AT-56, and then collected the conditioned medium. Conditioned medium collected from senescent C2C12 cells treated with vehicle (DMSO) served as a control for the experiment, whereas differentiation of C2C12 cells without any treatment serves as a positive control.

      There is no explanation of how differentiation was quantified or how the fusion index was calculated.

      The fusion index was calculated using a published myotube analyzer software (Noë et al., 2022). Appropriate information has been added to the materials and methods section of the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Line 3: Expand SA in "SA β-gal".

      The manuscript has been updated accordingly (See line 3).

      Line 68: HRas is highly regulated by lipid modifications.

      The manuscript has been updated accordingly (See line 67).

      Figures

      Figure S1A seemed incomplete (maybe some processing issue).

      The Figure has been updated in the revised manuscript (See Fig. S1A).

      Figure S1B-H are mislabeled.

      The figure has been updated in the revised manuscript (See Fig. S1C, D, E, and F).

      Figures S1E-H are not mentioned in the manuscript.

      The manuscript has been updated accordingly (See line 120).

      Many supplementary figures are not cited in the article.

      The manuscript has been updated accordingly. (See lines 85, 120, 123, 166, 225, 356, 364, 412, and 413)

      Reviewer #2 (Recommendations For The Authors):

      (1) Clarify the injection method for Doxorubicin in B6J mice on line 83 (IP or IM).

      Mice were injected intraperitoneally with Doxorubicin (as mentioned in the materials and methods, see lines 83 and 794)

      (2) Address missing information in figures or figure legends.

      There is a missing piece in Sup Fig 1A.

      The figure has been updated in the revised manuscript (See Fig. S1A).

      Correct labels in Sup Fig 1C and 1D.

      The figure has been updated in the revised manuscript (See Fig. S1C, D, E, and F).

      How would the authors explain the dramatic differences in the morphology of C2C12 cells treated with DOXO between bright field and SA-beta-gal staining images in Sup Fig 1B and 1C.

      The SA β-gal image after treatment with Doxo does show a flattened cell morphology. Another field of view from the same experiment has been added in the figure to show the difference in the cell morphology more prominently in the revised manuscript (See Fig. 1H).

      Provide explanations for Sup Fig 1E-1G, including the meaning of the y-axis and the blue dots and red lines.

      We have provided an explanation for the multiple reaction monitoring mass spectrometry used to measure the concentration of 15d-PGJ2 in the conditioned medium in the revised manuscript (see lines 119-130 and the legends of Fig. S1C, D, and E)

      (3) Please review the calculation of qPCR data in Figure 1C for correctness, ensuring reference samples with an average expression level of 1.

      The data in Fig. 1C were plotted using 2^-ΔCT instead of 2^-ΔΔCT to show the variability in the expression of mRNAs isolated from animals treated with Saline.
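
      The difference between the two calculations can be illustrated with a minimal Python sketch. The CT values, group names, and function names below are hypothetical and are not data or code from the study. With 2^-ΔCT, each sample is expressed relative to its own reference gene, so the spread among control animals stays visible; with 2^-ΔΔCT, each sample is additionally expressed relative to the mean ΔCT of the control group, which centers the controls around 1 (their geometric mean is exactly 1).

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Delta CT: CT of the target gene minus CT of the reference gene."""
    return ct_target - ct_reference


def expression_dct(ct_target, ct_reference):
    """2^-dCT: expression relative to the reference gene only."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def expression_ddct(ct_target, ct_reference, control_dcts):
    """2^-ddCT: expression relative to the mean dCT of the control group."""
    ddct = delta_ct(ct_target, ct_reference) - mean(control_dcts)
    return 2.0 ** (-ddct)


# Hypothetical (target CT, reference CT) pairs, one per animal.
saline = [(25.0, 18.0), (26.0, 18.0)]
doxo = [(23.0, 18.0), (22.0, 18.0)]

saline_dcts = [delta_ct(t, r) for t, r in saline]

# Control-group values under each scheme.
saline_dct_values = [expression_dct(t, r) for t, r in saline]
saline_ddct_values = [expression_ddct(t, r, saline_dcts) for t, r in saline]
doxo_ddct_values = [expression_ddct(t, r, saline_dcts) for t, r in doxo]

print(saline_dct_values)
print(saline_ddct_values)
print(doxo_ddct_values)
```

      Either scheme preserves the fold difference between any two samples; they differ only in the common scaling factor applied to all samples.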

      (4) Please explain the calculation of 15d-PGJ2/cell concentration in Figure 1F and provide raw data for review, considering the substantial changes and small error bars. The method or result section lacks an explanation of how this calculation was performed. Additionally, there is no mention of the cell number count.

      All the raw values (concentration of 15d-PGJ2 measured using mass spec and cell numbers counted at the time of collection of conditioned medium) are provided in Supplementary Table 1. The standard curve used to calculate the concentration of 15d-PGJ2 in the conditioned medium is shown in Fig. S1F. The cell number was counted after trypsinization using a hemocytometer on the day of collection of the conditioned medium.
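
      A per-cell normalization of this kind can be sketched as follows. This is only an illustration under stated assumptions: the standard curve is assumed to be linear, the numbers are hypothetical rather than values from Supplementary Table 1, and the function names are invented for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def concentration_per_cell(signal, slope, intercept, cell_count):
    """Invert the standard curve, then divide by the counted cell number."""
    concentration = (signal - intercept) / slope
    return concentration / cell_count


# Hypothetical standards: known concentrations and measured MS signals.
standard_conc = [0.0, 1.0, 2.0, 4.0]
standard_signal = [0.0, 10.0, 20.0, 40.0]
slope, intercept = fit_line(standard_conc, standard_signal)

# Hypothetical sample: measured signal and hemocytometer cell count.
per_cell = concentration_per_cell(30.0, slope, intercept, 1_000_000)
print(per_cell)
```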

      (5) Please clarify how cell number normalization and doubling time calculation were done in Fig 2B. Consider replacing the figure with a growth curve showing confluence on the y-axis for easier interpretation.

      Cells were counted every 24 hours and the normalization was done to the number of cells counted on day 0 of the treatment (to consider attaching efficiency and other cell culture parameters). Doubling time was calculated as the reciprocal of the slope of the graph of log2(normalized cell number) vs time.
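
      The calculation described above can be sketched in Python. The cell counts are hypothetical, the function name is invented for this example, and an ordinary least-squares fit is assumed for the slope, which the response does not specify.

```python
import math


def doubling_time(times_h, cell_counts):
    """Doubling time as 1 / slope of log2(N_t / N_0) versus time.

    Counts are normalized to the day-0 count; the slope is obtained by an
    ordinary least-squares fit (an assumption of this sketch).
    """
    n0 = cell_counts[0]
    y = [math.log2(n / n0) for n in cell_counts]
    t_mean = sum(times_h) / len(times_h)
    y_mean = sum(y) / len(y)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times_h, y))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    slope = sxy / sxx  # doublings per hour
    if slope <= 0:
        raise ValueError("no net growth: doubling time is undefined")
    return 1.0 / slope


# Hypothetical counts taken every 24 hours; the population doubles each day.
times_h = [0, 24, 48, 72]
counts = [10_000, 20_000, 40_000, 80_000]
td = doubling_time(times_h, counts)
print(td)
```

      Because the counts are normalized to day 0 before taking the logarithm, differences in attachment efficiency shift the intercept but not the slope, so the estimated doubling time is unaffected by them.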

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer 1

      The paper is overall convincing. However, a little more attention to data presentation and possibly the addition of at least another technique (see below) would greatly strengthen the findings.

      As we hope to demonstrate below, we have taken steps to improve our manuscript on both fronts (data presentation and experimental evidence).

      The absence of statistics immediately catches the eye. I am sure that the shown differences are statistically significant (thanks to the number of analyzed cells), but reporting the result of some statistical test would help the reader in identifying the relevant data in a plot. This is somehow necessary considering that sometimes in the text something is deemed to be "significant" or "not significant", and I felt that I really needed that when looking at the plot in Fig. 3D.

      To facilitate the interpretation of figures that contain data from multiple strains (such as the one mentioned by the reviewer), we have carried out a nonparametric single-step multiple comparison test (Games-Howell) to identify mutants whose means differ significantly from each other. To avoid overcrowding the figures, we have graphically summarized the p-values of all pairwise comparisons in a small matrix within the corresponding panel, and provided 99% confidence intervals and p-values of all differences in the Supplement.
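
      For readers unfamiliar with the Games-Howell procedure, the pairwise quantities it relies on can be sketched in Python. This is an illustration only, not the analysis code used for the manuscript; the strain names and values are hypothetical. For each pair of groups, the sketch computes the mean difference, the Welch-Satterthwaite degrees of freedom, and the studentized range statistic q. Converting q into a p-value requires the studentized range distribution with k groups (for example via `scipy.stats.studentized_range`, assuming a recent SciPy is available), which is left out here to keep the sketch dependency-free.

```python
import math
from itertools import combinations
from statistics import mean, variance


def games_howell_statistics(groups):
    """Pairwise Games-Howell statistics for a dict of name -> list of values.

    Returns, for each pair, the mean difference, the Welch-Satterthwaite
    degrees of freedom, and the studentized range statistic q.
    """
    results = {}
    for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
        # Squared standard errors of the two group means (unequal variances).
        va = variance(a) / len(a)
        vb = variance(b) / len(b)
        diff = mean(a) - mean(b)
        df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
        q = abs(diff) / math.sqrt(va + vb) * math.sqrt(2)
        results[(name_a, name_b)] = {"diff": diff, "df": df, "q": q}
    return results


# Hypothetical N/C ratios for three strains.
groups = {
    "strain_A": [1.0, 2.0, 3.0],
    "strain_B": [2.0, 3.0, 4.0],
    "strain_C": [4.0, 6.0, 8.0],
}
stats = games_howell_statistics(groups)
for pair, values in stats.items():
    print(pair, values)
```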

      Related to the previous point: for every N/C distribution analysis, a number of analyzed cells is reported. By the way it is written, it seems that the replication relies solely by the cells in that specific population, i.e.: each cell is treated as a replicate. At least I could not find if that is not the case in the legends or in the methods. I wonder what the results would be (and their significance) if each replicate would be a new assay on another population.

      Cell populations exhibit significant variability in their phenotypic characteristics. Consequently, the quantification of a specific feature (e.g., the Sfp1 nuclear/cytoplasmic ratio) across a sample of cells from a given population results in a distribution rather than a single fixed value. For each quantification, we report the number of cells that were used to construct the corresponding distribution, i.e. the sample size. To compare samples from different populations (e.g., different Sfp1 mutant strains), we run them in parallel during microscopy experiments and compare their means, as described above. Throughout our study, we have tried to ensure that we quantify a sufficiently large number of cells to overcome cell-to-cell variability and enhance the reliability of our results.

      In this context, the question of the reviewer is not entirely clear to us, as individual measurements of a sample are not replicates. However, one can replicate the entire experiment on a different day by re-growing the different strains, running microscopy, quantifying the new movies etc. In this sense, the experiments shown in the manuscript consist of single replicates, i.e. experiments that were carried out on the same day, with all the relevant mutants and controls quantified together. However, we have monitored many of our mutants multiple times over the course of our work. For example, Fig. 1 below shows replicates of the Sfp1 N/C ratio distributions at steady-state in the analog-sensitive (A) and wild-type (B) background, which were quantified several times across various experiments. Day-to-day variability in the empirical distributions of the same mutant exists, but it is quite small.

      The scale of x axes in N/C ratio plots. Besides not being consistent throughout the figures, it originates from 1, visually enhancing the differences.

      We believe the reviewer was referring to the y-axes, as the x-axes represent time. Summarizing the N/C ratio dynamics of different Sfp1 mutants has been challenging. First, the average N/C ratios at steady-state vary considerably across different mutants, as shown in the panels that summarize steady-state N/C ratios. To compare the magnitude and features of their responses, normalization is necessary. We chose to normalize the time series of each mutant to have a mean of 1 prior to the onset of a perturbation. This allows the normalized time series to represent the percentage-wise changes in the Sfp1 N/C ratio upon perturbation.
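
      The normalization described above can be sketched in Python. The time points, N/C values, and function name are hypothetical and only illustrate the idea of scaling each time series so that its pre-perturbation mean equals 1.

```python
def normalize_to_baseline(times, values, t_perturbation):
    """Scale a time series so that its mean before the perturbation is 1."""
    baseline = [v for t, v in zip(times, values) if t < t_perturbation]
    if not baseline:
        raise ValueError("no time points before the perturbation")
    baseline_mean = sum(baseline) / len(baseline)
    return [v / baseline_mean for v in values]


# Hypothetical mean N/C ratios; the perturbation is applied at t = 0 min.
times = [-2, -1, 0, 1, 2]
nc_ratio = [2.0, 2.0, 1.8, 1.6, 1.7]
normalized = normalize_to_baseline(times, nc_ratio, t_perturbation=0)
print(normalized)
```

      After this scaling, a value of 0.8 reads directly as a 20% drop relative to the pre-perturbation level, whatever the steady-state N/C ratio of the mutant.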

      Using a common y-axis scale for all plots of N/C ratio dynamics is not ideal, as some responses are subtler than others. Additionally, we do not believe that N/C dynamics across different figures need to (or should) be compared to each other. However, within a figure, panels that require comparison are placed in the same row and share the same y-axis scale. We believe that this approach optimizes data visualization and facilitates important visual comparisons.

      Related to the previous point: it is evident from the plots that the N/C ratio is always positive, even in the most deficient of the analyzed mutants. This implies that a relevant fraction of Sfp1 is still nuclear. I thus wonder what the impact of these mutations would be on the actual function of Sfp1. For this reason, I feel that qPCR evaluation of transcripts of Sfp1 target genes is particularly needed. Since lack of Sfp1 is known to yield some of the smallest cells possible, it would also be cool to have an estimate of the size of mutants where Sfp1 is less nuclear. These analyses could confer phenotypical relevance to the data, but would also help in assessing a currently unexplored possibility, that phosphorylation events by PKA influence Sfp1 function besides its localization, i.e.: the still somehow nuclear fraction is not as functional as wt Sfp1 in promoting transcription.

      It is indeed the case that the recorded N/C ratios are larger than 1 in all strains that we have monitored. We have never observed an N/C ratio smaller than 1 using widefield microscopy, for two main reasons. First, out-of-focus light from the cytosol above and below the nucleus is added to the nuclear signal, causing the nuclear signal to always be non-zero, even for predominantly cytosolic proteins. Second, both in-focus and out-of-focus vacuoles are devoid of the fluorescent protein fusions that we quantify, which reduces the average brightness of the cytosol. For these reasons, even when a protein is largely cytosolic, the average N/C ratio over a cell population is no lower than around 1.5. Keeping these points in mind, one can observe that our most delocalized Sfp1 mutants have an N/C ratio of around 1.6-1.7, which is very close to the lower limit. This means that these Sfp1 mutants are largely cytosolic, and the nuclear fraction (if non-zero) is quite small.

      We agree that assessing the phenotypic relevance of Sfp1 mutations is of interest. However, this was impossible with our original strains, as we introduced each Sfp1 mutant as an extra copy in the HO locus while leaving the endogenous Sfp1 locus intact. This was done in order to avoid any phenotypic changes that might result from changes in Sfp1 activity.

      To address the suggestion of the reviewer, we therefore deleted the endogenous Sfp1 copy in strains carrying sfp1^PKA2A, sfp1^PKA2D and sfp1^13A, leaving only the mutated Sfp1 copy at the HO locus. Surprisingly, the growth rate and drug sensitivity (determined by halo assays) of these single-copy mutants did not differ much from those of the mutants carrying the functional Sfp1 copy or of the wild type (Supp. Figs. 4J and 7). This observation aligns with findings for the single-copy sfp1-1 mutant in [Lempiäinen et al. 2009], which corresponds to sfp1^TOR7A in our work. [Lempiäinen et al. 2009] had suggested that Sch9 compensates for the loss of Sfp1 activity via a feedback mechanism, which could explain our results as well. If this is the case, acute depletion of wild-type Sfp1 could unveil transient changes in cell growth, before the compensatory effect of Sch9 is established. Unfortunately, we were unable to efficiently degrade wild-type Sfp1 carrying a C-terminal auxin-inducible degron. Instead, we followed the same approach as [Lempiäinen et al. 2009] and deleted SCH9.

      As we describe in the last section of Results, the difference was dramatic for sfp1^13A mutants, which were extremely slow-growing in the absence of Sch9 (doubling time was around 4 hours, but it was hard to estimate because we could not grow the cells consistently). Interestingly, SCH9 deletion had a negative impact on sfp1^PKA2D but not sfp1^PKA2A cells (Supp. Fig. 7). Overall, these results demonstrate that Sch9 can compensate for loss of Sfp1 activity, which makes it challenging to study the impact of Sfp1 mutations on cellular phenotypes.

      To further understand to what extent Sch9 compensates for loss of Sfp1 phosphorylation, we carried out RNA-seq on WT cells and cells carrying a single copy of sfp1^13A (with the endogenous SFP1 copy removed). Despite the fact that sfp1^13A cells grow as well as WT, RNA-seq picked up several differentially expressed genes related to amino acid biosynthesis. This surprising finding is presented in the last section of Results, and in Supplementary Figures 8, 9 and 10. We explore the relevance of these results and their connection with past literature on Sfp1 and Sch9 in the Discussion section.

      I found some typos here and there, and it would greatly help to report them if in the manuscript line numbers were included.

      We apologize for the typos. We have tried to eliminate them, and we have also added line numbers to the manuscript.

      Reviewer 2

      There is no biochemical evidence presented that the putative PKA sites (S105 and S136) are genuinely phosphorylated by PKA. The fact that they match the PKA consensus motif, alone, does not guarantee this. In order to claim that they are looking at the effect of PKA by mutagenizing these residues, the authors have to demonstrate the PKA-dependency of S105 and S136 phosphorylation by, for example, mass spec experiments or western blotting with phospho-specific antibodies (Cell Signaling Technology #9624 for example). Also, is the band-shift caused by PKA inhibition (Fig 3C) canceled by the S105A/S136A mutation?

      We took several actions to demonstrate that the putative PKA sites are indeed phosphorylated by PKA. We first tried to detect Sfp1 phosphorylation using the antibody mentioned by the reviewer, but failed as the sensitivity of this antibody appears to be quite low. On the other hand, mass spectrometry did not produce the right fragments to detect the sites of interest. We therefore resorted to an in vitro kinase assay using [γ-32P]ATP together with purified PKA and Sfp1. Unfortunately, bacterial overexpression of MBP-tagged Tpk1, Tpk2 and Tpk3 (the catalytic subunits of PKA) was quite challenging and we were unable to produce soluble protein. We therefore resorted to commercially available bovine PKA (bPKA, PKA catalytic subunit, Sigma-Aldrich 539576), which shows high homology to the yeast Tpk kinases [Toda et al. 1987]. Moreover, 87% of bPKA substrates have been shown to also be Tpk1 substrates [Ptacek et al. 2005], and bPKA has been used to identify new Tpk substrates in budding yeast [Budovskaya et al. 2005]. As we show in the revised manuscript, bovine PKA does phosphorylate Sfp1. Moreover, phosphorylation is reduced by 50% in the double S105A, S136A mutant (Fig. 1F), and becomes undetectable in the 13A mutant (Supp Fig. 6). Together with the rapid response of Sfp1 localization to acute PKA inhibition which we had already reported, we believe that these results provide strong evidence that Sfp1 is a direct PKA substrate, and that the two phosphosites that we identified are functional.

      As the above in vivo experiments do not exclude S105/S136 phosphorylation by other kinases downstream of PKA, in order to claim the direct phosphorylation, the authors need in vitro PKA kinase assay. These biochemical experiments are not trivial, but I think absolutely necessary for this story.

      One cannot exclude that S105/S136 are also phosphorylated by other kinases of the AGC family (note that [Lempiäinen et al. 2009] has already excluded Sch9). However, as we hope to have shown, PKA indeed phosphorylates Sfp1. Examining if other kinases besides PKA and TORC1 target Sfp1 is a very interesting question that should be addressed in future work.

      The authors only look at the localization of Sfp1. To assess its functionality and so physiological impact, it would be informative to measure the mRNA level of target ribosomal genes in various Sfp1 mutants they created.

      As we described in our response to Reviewer 1 above, we did perform RNA-seq on WT and cells carrying a single copy of sfp113A. We observed a notable absence of differentially expressed ribosomal genes and ribosome-related categories in the GO analysis (Supp. Figs. 8, 9 and 10). Together with our observations on SCH9 deletion (Supp. Fig. 7), these results suggest that Sch9 can largely compensate for the loss of Sfp1 activity. On the other hand, the emergence of differentially expressed amino acid biosynthesis genes is a finding that merits further investigation, as it connects with previous observations made with Sch9 deletion mutants and the [ISP+] prion form of Sfp1 (cf. Discussion).

      In the experiments using analog-sensitive PKA (Fig 1D and E for example), they directly compare wildtype-PKA versus analog sensitive-PKA, or with 1-NM-PP1 versus without 1-NM-PP1. This makes interpretation difficult, particularly because 1-NM-PP1 itself has a significant impact even in the wild PKA strain. The real question is the difference between wild-type Sfp1 versus mutant Sfp1. In the current form, they compare Fig 1D versus 1E, these two do not look like a single, side-by-side experiment. They should compare wild-type Sfp1 versus mutant Sfp1 side-by-side.

      Figure 1D shows that 1-NM-PP1 has a transient off-target effect on Sfp1 localization in WT cells, which could also affect Sfp1 mutants. This observation prompted us to use wild-type PKA as a control when testing the effect of 1-NM-PP1 on sfp1PKA2D in cells carrying PKAas (Figure 1E). As Fig. 1E shows, the effect of 1-NM-PP1 on sfp1PKA2D localization in PKAas cells is quite similar to the off-target effect in cells carrying sfp1PKA2D and wild-type PKA. This behavior of sfp1PKA2D is clearly different from the response of wild-type Sfp1 to PKAas inhibition, which results in sustained delocalization. We have made the latter observation repeatedly, both in this study and our previously published work [Guerra et al. 2021].

      In Figure 3, the argument around the additive effects of PKA and TORC1 is confusing. The authors say they are additive referring to Figure 3E, but say they are not additive referring to Figure 3B. Which is true? In fact, Figure 3B appears to show an additive effect as well.

      We did not use the word "additive" in the text, because we find it difficult to interpret. Instead, we state that PKA and TORC1 appear to control Sfp1 phosphorylation independently of each other. PKA and TORC1 phosphorylation converges to the same response, affecting Sfp1 localization. It appears that loss of either kinase delocalizes Sfp1, while loss of both kinases may only have a small additional effect.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study identifies new types of interactions between Drosophila gustatory receptor neurons (GRNs) and shows that these interactions influence sensory responses and behavior. The authors find that HCN, a hyperpolarization-activated cation channel, suppresses the activity of GRNs in which it is expressed, preventing those GRNs from depleting the sensillum potential, and thereby promoting the activity of neighboring GRNs in the same sensilla. HCN is expressed in sugar GRNs, so HCN dampens the excitation of sugar GRNs and promotes the excitation of bitter GRNs. Impairing HCN expression in sugar GRNs depletes the sensillum potential and decreases bitter responses, especially when flies are fed on a sugar-rich diet, and this leads to decreased bitter aversion in a feeding assay. The authors' conclusions are supported by genetic manipulations, electrophysiological recordings, and behavioral assays.

      Strengths:

      (1) Non-synaptic interactions between neurons that share an extracellular environment (sometimes called "ephaptic" interactions) have not been well-studied, and certainly not in the insect taste system. A major strength of this study is the new insight it provides into how these interactions can impact sensory coding and behavior.

      We appreciate the reviewer’s view that our findings may allow researchers to better understand sensory coding and behavior. However, we respectfully disagree that the SP homeostasis in Drosophila gustation we describe here pertains to ephaptic interaction. Although SP reduction was proposed as the basis of post-ephaptic hyperpolarization in Drosophila olfaction, we find that SP changes are too slow to mediate the fast action of ephaptic inhibition in gustation reported in ref#17. We observed a slow, sweet-dependent SP depletion (Fig. 5B, revised), which takes more than one hour. The real-time change of SP was also slow even upon contact with 200-mM sucrose; this result was set aside for another manuscript in preparation. Therefore, we believe the main findings in this paper concern the homeostatic preservation of SP for the maintenance of gustatory function, not ephaptic interaction.

      (2) The authors use many different types of genetic manipulations to dissect the role of HCN in GRN function, including mutants, RNAi, overexpression, ectopic expression, and neuronal silencing. Their results convincingly show that HCN impacts the sensillum potential and has both cell-autonomous and nonautonomous effects that go in opposite directions. There are a couple of conflicting or counterintuitive results, but the authors discuss potential explanations.

      (3) Experiments comparing flies raised on different food sources suggest an explanation for why the system may have evolved the way that it did: when flies live in a sugar-rich environment, their bitter sensitivity decreases, and HCN expression in sugar GRNs helps to counteract this decrease.

      Weaknesses/Limitations:

      (1) The genetic manipulations were constitutive (e.g. Ih mutations, RNAi, or misexpression), and depleting Ih from birth could lead to compensatory effects that change the function of the neurons or sensillum. Using tools to temporally control Ih expression could help to confirm the results of this study.

      We attempted to address this point by using the tub-Gal80ts system. The result is now included as Fig. 1-figure supplement 2. At 29°C, a non-permissive temperature for GAL80ts which allows GAL4-dependent expression of Ih-RNAi, we observed that bGRN responses were decreased and sGRN responses were increased compared to the control maintained at 18°C, which parallels the result in Fig. 1C,D. For this experiment, we inserted “To exclude the possibility that Ih is required for normal gustatory development, we temporally controlled Ih RNAi knockdown to occur only in adulthood, which produced similar results (Fig. 1-figure supplement 2).” (~line 113).

      (2) The behavioral experiment shows a striking loss of bitter sensitivity, but it was only conducted for one bitter compound at one concentration. It is not clear how general this effect is. The same is true for some of the bitter GRN electrophysiological experiments that only tested one compound and concentration.

      We conducted additional behavioral experiments with other bitters such as lobeline and theophylline (Fig. 5-figure supplement 1), which showed sensitivity losses in Ih mutants similar to caffeine. For these results, the following is inserted at ~line 274: “These results were recapitulated with other bitters, lobeline and theophylline (Fig. 5-figure supplement 1).”

      We also added single sensillum recording data with bitters, berberine, lobeline, theophylline and umbelliferone, which yielded results similar to those obtained with caffeine (Fig. 1-figure supplement 1). This is described with the sentence at ~line 105 “Other bitter chemical compounds, berberine, lobeline, theophylline, and umbelliferone, also required Ih for normal bGRN responses (Fig. 1-figure supplement 1).”

      (3) Several experiments using the Gal4/UAS system only show the Gal4/+ control and not the UAS/+ control (or occasionally neither control). Since some of the measurements in control flies seem to vary (e.g., spiking rate), it is important to compare the experimental flies to both controls to ensure that any observed effects are in fact due to the transgene expression.

      We thank the reviewer for raising this point. Indeed, there was a small logical flaw in the controls. We have now included all the necessary controls for Fig. 1C-F, Fig. 2I,J, Fig. 4E, and Fig. 5D, as the reviewers suggested. These experiments remained statistically significant after including the new control groups.

      (4) I was surprised that manipulations of sugar GRNs (e.g. Ih knockdown, Gr64a-f deletion, or Kir silencing) can impact the sensillum potential and bitter GRN responses even in experiments where no sugar was presented.

      We are afraid there is a misunderstanding of the early part of the paper. We suspected that the manipulations impacted bGRNs and SP due to the sweetness in the regular cornmeal food, as stated in lines 214-220 “Typically, we performed extracellular recordings on flies 4-5 days after eclosion, during which they were kept in a vial with fresh regular cornmeal food containing ~400 mM D-glucose. The presence of sweetness in the food would impose long-term stimulation of sGRNs, potentially requiring the delimitation of sGRN excitability for the homeostatic maintenance of gustatory functions. To investigate this possibility, we fed WT and Ihf03355 flies overnight with either non-sweet sorbitol alone (200 mM) or a sweet mixture of sorbitol (200 mM) + sucrose (100 mM).”

      I believe the authors are suggesting that the effects of sugar GRN activity (e.g., from consuming sugar in the fly food prior to the experiment) can have long-lasting effects, but it wasn't entirely clear if this is their primary explanation or on what timescale those long-lasting effects would occur. How much / how long of a sugar exposure do the flies need for these effects to be triggered, and how long do those effects last once sugar is removed?

      We attempted to address this point with additional experiments (Fig. 5A,B). A reduction of SP to similar degrees could be observed in WT and HCN-deficient mutants 1 hr after the flies were transferred from nonsweet sorbitol-containing vials to sweet sucrose-containing ones. Moreover, the mutants, but not WT, showed further depression of SP when the sweetness persisted in the media for 4 hrs and overnight. This long-term exposure to sweetness, longer than 1 hr, may simulate feeding on the regular sweet cornmeal food. The recovery of SP was also tested by removing flies from the sweet media after overnight sweet exposure and placing them on sorbitol food. SPs of WT and the mutants recovered to similar levels 1 hr after separating the animals from sweetness, although the HCN-lacking mutants showed much lower SP right after overnight sweetness exposure. The unimpaired recovery of the mutants suggests that HCN is not required for generating the transepithelial potential itself. Therefore, regardless of HCN, SP changes are not fast even in the presence of strong sweetness, and SP is much better guarded when sGRNs express HCN in a sweet environment.

      We inserted the following at ~line 260 to describe the newly added recovery experiment: “Following overnight sweet exposure, SPs of WT and Ihf03355 were recovered to similar levels after 1-hr incubation with sorbitol only food. However, it was after 4 hrs on the sorbitol food that the two lines exhibited SP levels similar to those achieved by overnight incubation with sorbitol only food (Fig. 5B). These results indicate that SP depletion by sweetness is a slow process, and that the dysregulated reduction and recovery of SPs in Ihf03355 manifest only after long-term conditioning with and without sweetness, respectively.”.

      (5) The authors mention that HCN may impact the resting potential in addition to changing the excitability of the cell through various mechanisms. It would be informative to record the resting potential and other neuronal properties, but this is very difficult for GRNs, so the current study is not able to determine exactly how HCN affects GRN activity.

      On this point, we cannot but rely on previous studies of biophysical and electrophysiological characterization on mammalian HCN channels and a heterologous expression study that revealed a robust hyperpolarization-activated cation current from Drosophila HCN channels (PMID: 15804582).

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors start by showing that HCN loss-of-function mutation causes a decrease in spiking in bitter GRNs (bGRN) while leaving sweet GRN (sGRN) response in the same sensillum intact. They show that a perturbation of HCN channels in sweet-sensing neurons causes a similar decrease while increasing the response of sugar neurons. They were also able to rescue the response by exogenous expression. Ectopic expression of HCN in bitter neurons had no effect. Next, they measure the sensillum potential and find that sensillum potential is also affected by HCN channel perturbation. These findings lead them to speculate that HCN in sGRN increases sGRN spiking which in turn affects bGRNs. To test this idea, they carried out multiple perturbations aimed at decreasing sGRN activity. They found that decreasing sGRN activity by either using receptor mutant or by expressing Kir (a K+ channel) in sGRN increased bGRN responses. These responses also increase the sensillum potential. Finally, they show that these changes are behaviorally relevant as conditions that increase sGRN activity decrease avoidance of bitter substances.

      Strengths:

      There is solid evidence that perturbation of sweet GRNs affects bitter GRN in the same sensillum. The measurement of transsynaptic potential and how it changes is also interesting and supports the authors' conclusion.

      Weaknesses:

      The ionic basis of how perturbation in GRN affects the transepithelial potential which in turn affects the second neuron is not clear.

      We speculate that HCN-dependent membrane potential regulation, rather than ionic composition change, is responsible for the observed SP preservation, as further discussed as an author response in the section of “Recommendations for the authors”. The transepithelial potential can be dissipated by increased conductance through receptor-linked ion channels following gustatory receptor activation in GRNs. The volume of the sensillum lymph is very small according to electron micrographs of horizontally sliced bristles (PMID: 11456419). Therefore, robust excitation of a gustatory neuron may easily deplete the extracellular potential built up in the form of polarized ion concentrations across the tight junction. When the consumption is too strong and extended, the neighboring neuron, which shares the TEP with the activated GRN, can be negatively affected. We propose that HCN suppresses overexcitation of sGRNs by means of membrane potential stabilization. This stabilization prevents sGRNs from excessively reducing the TEP, thereby protecting the activity of neighboring bGRNs.

      Reviewer #3 (Public Review):

      Ephaptic inhibition between neurons housed in the same sensilla has been long discovered in flies, but the molecular basis underlying this inhibition is underexplored. Specifically, it remains poorly understood which receptors or channels are important for maintaining the transepithelial potential between the sensillum lymph and the hemolymph (known as the sensillum potential), and how this affects the excitability of neurons housed in the same sensilla.

      Although a reduction of sensillum potential was proposed to underlie membrane hyperpolarization of post-ephaptic olfactory neurons in Drosophila, our preliminary data (not shown due to a manuscript in preparation) and the results included in the paper (Fig. 5B) strongly suggest that SP reduction is not a requisite for ephaptic inhibition at least in GRNs. Ephaptic inhibition is expected to be instantaneous, whereas we find that SP reduction in gustation is very slow. Therefore, we would like to indicate that the findings we report in this manuscript are not directly related to ephaptic inhibition.

      Lee et al. used single-sensillum recordings (SSR) of the labellar taste sensilla to demonstrate that the HCN channel, Ih, is critical for maintaining sensillum potential in flies. Ih is expressed in sugar-sensing GRNs (sGRNs) but affects the excitability of both the sGRNs and the bitter-sensing GRNs (bGRNs) in the same sensilla. Ih mutant flies have decreased sensillum potential, and bGRNs of Ih mutant flies have a decreased response to the bitter compound caffeine. Interestingly, ectopic expression of Ih in bGRNs also increases sGRN response to sucrose, suggesting that Ih-dependent increase in sensillum potential is not specific to Ih expressed in sGRNs. The authors further demonstrated, using both SSR and behavior assays, that exposure to sugars in the food substrate is important for the Ih-dependent sensitization of bGRNs. The experiments conducted in this paper are of interest to the chemosensory field. The observation that Ih is important for the activity in bGRNs albeit expressed in sGRNs is especially fascinating and highlights the importance of non-synaptic interactions in the taste system.

      Despite the interesting results, this paper is not written in a clear and easily understandable manner. It uses poorly defined terms without much elaboration, contains sentences that are borderline unreadable even for those in the narrower chemosensory field, and many figures can clearly benefit from more labeling and explanation. It certainly needs a bit of work.

      We would like to revise the language aspect of the manuscript after finalizing the scientific revision.

      Below are the major points:

      (1) Throughout the paper, it is assumed that Ih channels are expressed in sugar-sensing GRNs but not bitter-sensing GRNs. However, both this paper and citation #17, another paper from the same lab, contain only circumstantial evidence for the expression of Ih channels in sGRNs. A simple co-expression analysis, using the Ih-T2A-GAL4 line and Gr5a-LexA/Gr66a-LexA line, all of which are available, could easily demonstrate the co-expression. Including such a figure would significantly strengthen the conclusion of this paper.

      We did conduct confocal imaging with Ih-T2A-Gal4 in combination with GRN Gal4s (ref#17 version2). The expression is very broad, including both neurons and non-neuronal cells. We observed much stronger sGRN expression than bGRN expression, but the promiscuous expression of the reporter in many cells prevented us from clearly demonstrating the absence of the reporter in bGRNs. However, the functional and physiological examination of Ih-T2A-Gal4 with neuronal modifiers such as TRPA1 and Kir2.1 in ref#17 indicates strong expression of Ih in sGRNs and little expression in bGRNs. Furthermore, the RNAi knockdown results present another line of evidence that HCN expressed in sGRNs regulates SP and bGRN activity (Fig. 1C,D, Fig. 1-figure supplement 2). Ih-RNAi expression in bGRNs did not result in any statistically significant changes in the activities of sGRNs and bGRNs compared to controls (Fig. 1C,D, revised), supporting our claim that Ih acts in sGRNs for the functional homeostasis of SP and GRNs.

      (2) Throughout this paper, it is often unclear which class of labellar taste sensilla is being recorded. S-a, S-b, I-a, and I-b sensilla all have different sensitivities to bitters and sugars. Each figure should clearly indicate which sensilla is being recorded. Justification should be provided if recordings from different classes of sensilla are being pooled together for statistics.

      We mainly performed SSR (single sensillum recording) on i-type bristles as they have the simplest composition of GRNs compared to s- and L-type bristles. As each s-type bristle also contains an sGRN and a bGRN, we also measured SP in s-types (Figs. 2, 3F and 4D). In the case of Fig. 3-figure supplement 1, L-types were tested for the relationship between water cell activity and SP. Now all the panels are labelled with the tested bristle types.

      (3) In many figures, there is a lack of critical control experiments. Examples include Figures 1C-F (lacking UAS control), Figure 2I-J (lacking UAS control), Figure 4E (lacking the UAS and GAL4 control, and it is also strange to compare Gr64f > RNAi with Gr66a > RNAi, instead of with parental GAL4 and UAS controls.), and Figure 5D (lacking UAS control). Without these critical control experiments, it is difficult to evaluate the quality of the work.

      Thank you for pointing this out. We appreciate the feedback and have addressed these concerns by including all the requested controls in the figures. Specifically, we have added the UAS controls for Figs 1C-F and 2I-J, as well as the UAS and GAL4 controls for Fig. 4E. We have also included the UAS control for Fig. 5D.

      (4) Figure 2A could benefit from more clarification about what exactly is being recorded here. The text is confusing: a considerable amount of text is spent on explaining the technical details of how SP is recorded, but very little text about what SP represents, which is critical for the readers. The authors should clarify in the text that SP is measuring the potential between the sensillar lymph, where the dendrites of GRNs are immersed, and the hemolymph. Adding a schematic figure to show that SP represents the potential between the sensillar lymph and hemolymph would be beneficial.

      SP was defined at lines 55-56 in the first paragraph of the introduction, which also contains the background information for SP as a transepithelial potential. As the reviewer suggested, we have now also included a sentence describing SP (“SP is known as a transepithelial potential between the sensillum lymph and the hemolymph, generated by active ion transport through support cells”, line 126) and a drawing to illustrate the concept of SP (Fig. 2A), and revised the legend.

      (5) The sGRN spiking rate in Figure 4B deviates significantly from previous literature (Wang, Carlson, eLife 2022; Jiao, Montell PNAS 2007, as examples), and the response to sucrose in the control flies is not dosage-dependent, which raises questions about the quality of the data. Why are the responses to sucrose not dosage-dependent? The responses are clearly not saturated at these (10 mM to 100 mM) concentrations.

      Our recordings show different spiking frequencies from others’ work because the frequencies are averaged over 5-sec bins, not only the first 0.5 sec. This lowers the frequencies, as spikes are relatively more frequent at the beginning of the recording (Fig. 4-figure supplement 1).
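
      The effect of the analysis window on the reported rate can be illustrated with a minimal Python sketch. The spike train below is invented for illustration (a fast initial burst followed by slower adapted firing) and is not taken from the recordings; the function name `binned_rates` is likewise hypothetical. It only shows why averaging over 5 sec yields a lower frequency than counting the first 0.5 sec, and how 1-sec bins expose the adaptation.

```python
def binned_rates(spike_times, bin_width, duration):
    """Return firing rates (spikes/sec) in consecutive bins of `bin_width`
    seconds, covering the window from contact (t = 0) to `duration` seconds."""
    n_bins = round(duration / bin_width)
    counts = [0] * n_bins
    for t in spike_times:
        if 0 <= t < duration:
            # Guard against floating-point edge cases at the last bin boundary.
            idx = min(int(t / bin_width), n_bins - 1)
            counts[idx] += 1
    return [c / bin_width for c in counts]


# Invented adapting spike train: 10 spikes in the first 0.5 sec,
# then one spike every 0.25 sec until the end of a 5-sec recording.
burst = [0.025 + 0.05 * k for k in range(10)]
adapted = [0.6 + 0.25 * k for k in range(18)]
spikes = burst + adapted

first_half_sec = binned_rates(spikes, bin_width=0.5, duration=0.5)
whole_5_sec = binned_rates(spikes, bin_width=5.0, duration=5.0)
per_second = binned_rates(spikes, bin_width=1.0, duration=5.0)

print("first 0.5 sec:", first_half_sec)
print("5-sec bin:    ", whole_5_sec)
print("1-sec bins:   ", per_second)
```

      With this made-up train, the same recording gives a much higher rate in the first 0.5 sec than over the full 5 sec, so rates from the two conventions are not directly comparable.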

      Why are the responses to sucrose not dosage-dependent? The responses are clearly not saturated at these (10 mM to 100 mM) concentrations.

      We were also puzzled by the flat dose dependence to sucrose. This result may suggest the existence of another mechanism moderating sucrose responses of sGRNs. This flat curve reappeared with other genotypes over the same concentration range (5-50 mM) in Fig. 4E. However, 1-mM sucrose produced much lower spiking frequencies (Fig. 4E), suggesting that sGRN responses are saturated at 5 mM sucrose under our recording/analysis conditions.

      (6) In Figure 4C, instead of showing the average spike rate of the first five seconds and the next 5 seconds, why not show a peristimulus time histogram? It would help the readers tremendously, and it would also show how quickly the spike rate adapts to overexpression and control flies. Also, since taste responses adapt rather quickly, a 500 ms or 1 s bin would be more appropriate than a 5-second bin.

      Taste single sensillum recording starts upon contact with the stimulant, which prevents us from recording pre-stimulus responses of GRNs. Therefore, we showed post-stimulus graphs with 1-sec bins (Fig. 4-figure supplement 1), as the reviewer suggested.

      (7) Lines 215 - 220. The authors state that the presence of sugars in the culture media would expose the GRNs to sugar constantly, without providing much evidence. What is the evidence that the GRNs are being activated constantly in flies raised with culture media containing sugars? The sensilla are not always in contact with the food.

      We agree with the reviewer. We replaced “long-term stimulation of sGRNs” with “strong and frequent stimulation of sGRNs for extended period”, as the word “long-term” may be interpreted as constant.

      (8) Line 223. To show that bGRN spike rates in Ih mutant flies "decreased even more than WT", you need to compare the difference in spike rates between the sorbitol group and the sorbitol + sucrose group, which is not what is currently shown.

      The data were examined by ANOVA and a multiple comparison test (Dunn’s) between all the groups in the panel (all the groups sharing the y axis), regardless of genotype and condition. Therefore, the differences were statistically examined. However, the expression we used read as if it concerned the slope or extent of the decrease. We intended to indicate the difference between the genotypes in the absolute values of spiking frequencies after overnight sweet exposure, whereas bGRN activities were statistically indistinguishable between WT and Ih mutants when they were kept only on sorbitol food. We revised it to “decreased to the level significantly lower than WT”. We also changed the graph style to better present the trend of changes in bGRN sensitivity with comparison between genotypes. Again, the groups were statistically examined together regardless of genotype and condition.
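
      For readers unfamiliar with this kind of all-groups comparison, the following is a minimal Python sketch of a rank-based Dunn's test with Bonferroni correction, assuming the standard formulation (pooled ranks with tie correction). The software and correction actually used for the manuscript are not stated here, and the spike frequencies below are made up for illustration; the function names `average_ranks` and `dunn_test` are hypothetical.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def dunn_test(groups):
    """Pairwise Dunn's test on pooled ranks.

    groups: dict mapping group name -> list of observations.
    Returns dict mapping (name_a, name_b) -> (z, Bonferroni-adjusted p).
    """
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Mean rank of each group, taken from the pooled ranking.
    mean_rank, start = {}, 0
    for name in names:
        n = len(groups[name])
        mean_rank[name] = sum(ranks[start:start + n]) / n
        start += n

    # Tie correction to the variance of the rank differences.
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values()) / (12 * (n_total - 1))
    base_var = n_total * (n_total + 1) / 12 - tie_term

    pairs = list(combinations(names, 2))
    results = {}
    for a, b in pairs:
        se = math.sqrt(base_var * (1 / len(groups[a]) + 1 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p_two_sided = math.erfc(abs(z) / math.sqrt(2))
        results[(a, b)] = (z, min(1.0, p_two_sided * len(pairs)))
    return results


# Made-up bGRN spike frequencies (not data from the manuscript).
data = {
    "WT sorbitol": [30, 32, 28, 31, 29, 33],
    "WT sorbitol+sucrose": [24, 26, 22, 25, 23, 27],
    "Ih sorbitol": [29, 31, 27, 30, 28, 32],
    "Ih sorbitol+sucrose": [10, 12, 8, 11, 9, 13],
}
results = dunn_test(data)
for pair, (z, p_adj) in results.items():
    print(pair, round(z, 3), round(p_adj, 4))
```

      Because every pair is tested against the same pooled ranking, a statement such as "lower than WT after sweet exposure" compares absolute levels between two groups, not the size of the decrease within each genotype.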

      (9) To help readers better understand the proposed mechanisms here, including a schematic figure would be helpful. This should show where Ih is expressed, how Ih in sGRNs impacts the sensillum potential, how elevated sensillum potential increases the electrical driving force for the receptor current, and affects the excitability of the bGRNs in the same sensilla, and how exposure to sugar is proposed to affect ion homeostasis in the sensillum lymph.

      As the reviewer suggested, we included two panels showing a working model for gustatory homeostasis via SP maintenance by HCN (Fig. 5E,F).

      Reviewer #1 (Recommendations For The Authors):

      (1) The relationship between this paper and the authors' bioRxiv preprint posted last year is not clear. In the introduction they made it seem like this paper is a follow-up that builds on the preprint, but most or all of the experiments in this paper were already performed in the preprint. I guess the authors are planning to divide the original paper into two papers. I would suggest updating the preprint to avoid confusion.

      Thank you for the comment. We updated the preprint by removing part of Fig. 6 and the entire Fig. 7, along with the associated text. As the reviewer pointed out, our eLife paper was spun off from part of the preprint, because we feel that the two stories could confuse readers when presented together.

      (2) Have the authors considered testing responses of water GRNs? They reside in the same sensilla as sugar neurons, so are they also affected by Ih mutation or RNAi in sugar neurons? This would strengthen the evidence that the indirect (non-cell autonomous) effects of Ih are due to the sensillum potential and not some specific interaction between sweet and bitter cells.

      As the reviewer proposed, we examined water GRN activity in the L-type bristles of WT, Ihf03355 and a genomic rescue line for Ihf03355. Spiking responses in water GRNs were evoked by hypo-osmolarity of electrolyte (0.1 mM tricholine citrate-TCC). Interestingly, the Ih mutant showed reduced 0.1 mM TCC-provoked spiking frequencies compared to WT. This impairment was rescued by the genomic fragment containing an intact Ih locus (Figure 3-figure supplement 1A).

      Additionally, SPs in L-type bristles were reduced by Ih deficiencies but increased in Gr64af, suggesting that HCN regulates sGRNs in L-type bristles as well (Figure 3-figure supplement 1B). Again, the bristles of animals with both mutations together exhibited SPs similar to those of WT.

      Furthermore, when we conducted cDNA rescue experiments in L bristles, introduction of Ih-RF cDNA in sGRNs restored SPs, while expressing it in bGRNs did not, unlike the results from the i- and s-bristles (Fig. 2K,L), likely because L-bristles lack bGRNs. These cDNA rescue and genetic interaction experiments were conducted using flies fed on fresh cornmeal food with strong sweetness, suggesting that the sweetness in the media is the likely key factor producing the genetic interaction and necessitating HCN, consistent with other results in the manuscript. Therefore, SP regulation by HCN is observed in the L-type bristles.

      Minor comments:

      Line 52: typo, "Many of"

      Thank you. Corrected

      Line 95: typo, "sensilla do an sGRN"

      Corrected

      Line 98: typo, "we observed reduced the spiking responses"

      Corrected

      Line 206: typo, "a relatively low sucrose concentrations"

      Corrected

      Line 260: "inverse relationship between the two GRNs in excitability" - I am not exactly sure what data you are referring to.

      Although alleles did not show increased sGRN activities, knockdown of Ih decreased bGRN activity but increased sGRN activity (Fig. 1C,D, Fig.1-figure supplement 2B), while suppression of sGRNs increased bGRN activity (Fig. 3). To clarify this point, we revised the phrase to “the inverse relationship between the two GRNs in excitability observed in Fig. 1C,D, Fig. 1-figure supplement 2B, and Fig. 3”.

      Methods: typo, "twenty of 3-5 days with 10 males and 10 females"

      Corrected to “Twenty flies, aged 3-5 days and consisting of 10 males and 10 females,”

      Methods: typo, "Kim's wipes" should be "Kimwipes"

      Corrected

      Reviewer #2 (Recommendations For The Authors):

      (1) More clarification is necessary on Transepithelial potential (TEP). TEP is typically created by having pumps and tight junctions between the sensillar lymph and the hemolymph.

      We have an introduction to TEP or SP in the context of sensory functions (lines 40-57) with relevant references. The involvement of pumps and tight junction was mentioned in the same paragraph; “Glia-like support cells exhibit close physical association with sensory receptor neurons, and conduct active transcellular ion transport, which is important for the operation of sensory systems” (line 40) and “Tight junctions between support cells separate the externally facing sensillar lymph from the internal body fluid known as hemolymph” (line 53).

      It is not clear how HCN channels in one of the neurons might change the composition of the sensillum lymph. An explanation of their model of how TEP depends on HCN is necessary.

      Although the ionic composition of the sensillum lymph is a contributing factor to the sensillum potential, it is more conceptually relevant to describe our findings with the perspective of membrane potential regulation given the role of HCN in membrane potential stabilization as discussed in our manuscript.

      We speculate that HCN controls the membrane potential at rest and/or in motion to modulate sGRN activity towards saving SP despite the sweetness in the niche. We positioned our results in relation to SP in the Discussion: “Our results provide multiple lines of evidence that HCN suppresses HCN-expressing GRNs, thereby sustaining the activity of neighboring GRNs within the same sensilla. We propose that this modulation occurs by restricting SP consumption through HCN-dependent neuronal suppression rather than via chemical and electrical synaptic transmission.” (lines 252-255). Moreover, it is unclear whether HCN is localized to the dendrite bathed in the sensillum lymph to influence the ionic composition of the lymph. It would be very interesting to study in the future whether the ionic flow through HCN channels itself is critical for the function of HCN in this context, and whether HCN is exclusively present in the dendrite to support the postulation. However, we would like to remind the reviewer that Kir2.1 and HCN channels in sGRNs showed similar effects on SP and bGRNs, although they differ in Na+ conductance.

      In the initially submitted manuscript (lines 325-343), we discussed the potential mechanism by which Kir2.1 and HCN channels commonly increase SP in terms of how the membrane potential regulation in the soma can control the SP consumption in the dendrite of sGRNs.

      Another point about the TEP that needs some explanation is that these sensilla are open to the environment as tastants must flow in and are different from mechanical sensilla in that sense.

      This is a very important question regarding the general physiology of the taste sensilla, as the sensillum lymph is in contact with the external environment through the pore of the sensillum. It is indeed interesting to consider how the composition and potential of the lymph are maintained despite the relatively vast volume of food the sensilla encounter during gustation and the continuous evaporation to air between episodes of gustation. However, we believe that this question, while important, is distinct from the primary focus of our manuscript.

      Are the TEP measurements in Figure 2 under control conditions where there are no tastants?

      There is no tastant in the SP-measuring glass electrode other than the electrolyte. We apologize that we did not specify the recording electrode condition. We inserted a clause in the Methods: “For SP recordings, the recording electrode contained 2 mM TCC as the electrolyte, and…”

      Does the TEP change dynamically as sGRN is activated?

      SP does shift in response to sweets. Please see Fig. 5B. Also, we showed SP changes by mechanical stimuli, which depended on the mechanoreceptor, NompC (Fig. 2D-F). Mechanoreceptor neurons share the sensillum lymph with GRNs.

      (2) More clarification on the potential transduction mechanism and how TEP affects one neuron differentially. Essentially, sGRN perturbation affects sGRN activity and it affects the TEP. More explanation is needed for the potential ionic mechanism of each.

      Our results strongly suggest that HCN lowers the activity of HCN-expressing GRNs, mitigating SP consumption. This modulation is crucial because the SP serves as a driving force for neuronal activation within the sensillum. HCN is particularly necessary in sGRNs because of the flies’ sweet feeding niche, which is expected to result in frequent and strong activation of sGRNs. The SP saved by HCN-dependent limitation of sGRN activity can be used to raise the responsiveness of bGRNs.

      (3) The authors refer to their own unreviewed paper (Reference 17). This paper is on a similar topic and there seems to be some overlap. Clarification on this point would be important.

      We revised the bioRxiv preprint so that version 2 no longer contains the parts overlapping with this eLife paper. This eLife paper was originally part of the preprint, but it was separated to clarify the messages of the two stories. As we explained in the Discussion (lines 276-297), HCN provides resistance to both hyperpolarization and depolarization of the membrane potential. Simply put, one paper focuses on the role of HCN in resisting hyperpolarization, while the other (this paper in eLife) focuses on resisting depolarization.

      (4) Methods are sparse. Many details on the method are necessary. For example, Sensilla recordings are being done by the tip-dip method (I assume). What does "number of experiments" mean in Figure 1? Is it the number of animals or the number of sensilla? How many trials/sensilla?

      We indicated the extracellular recording was performed by the tip-dip method; “In vivo extracellular recordings were performed by the tip-dip method as detailed previously”. We also added a statement on the number of experiments; “The number of experiments indicated in figures are the number of naïve bristles tested. The naïve bristles were from at least three different animals.”

      (5) Figure 1: I understand the author's interpretation. But if one compares WT in Figure 1A to Gr64a-IhRNAi in 1C, we can come to the conclusion that there is no change. In other words, the control in Figure 1C (grey) has a much higher response than WT. Similar conclusions can be made for other experiments. Is the WT response stable enough to make the conclusions made here?

      The genetic background of each genotype may influence GRN activity to some extent. RNAi knockdown experiments are well known for their hypomorphic nature, and their effects should be evaluated by comparison with their parental controls such as Gal4 and UAS lines. As suggested by all reviewers, we added the results from the UAS control. This confirms that Gr89a>Ih RNAi is statistically indistinguishable from the UAS control as well as the Gr64f-Gal4 control in bGRN spiking evoked by 2 mM caffeine, while Gr64f>Ih RNAi showed reduced bGRN responses to 2 mM caffeine compared to all the controls.

      (6) Figure 3: Why is bGRN spiking not plotted against sensillum potential to observe the dependence more directly?

      This is a very interesting suggestion. We are not, however, equipped to measure spiking and sensillum potential simultaneously. Therefore, they are independent experiments, and we treated them accordingly.

      (7) Figure 4: Why bGRN response is only affected at high caffeine concentrations is not clear.

      We were also surprised by the differences in the dose dependence results of b- and sGRNs, genetically manipulated to mis-express and over-express HCN in Fig. 4A and 4E, respectively. Each gustatory neuron likely has distinct sets of players and parameters that set its own membrane potential and excitability.

      One possibility is that there is a range of membrane potentials within which HCN does not engage. In bGRNs, the resting membrane potential may lie low within this range, so that some degree of membrane depolarization by low concentrations of caffeine does not significantly close HCN channels, thus preventing their hyperpolarizing effects. On the other hand, the membrane potential of sGRNs may be high within this range, showing suppressive effects at all tested sucrose concentrations. However, we find this explanation too speculative to include in the main text, although we stated in the original manuscript, “implying a complex cell-specific regulation of GRN excitability.” (line 210).

      (8) Minor:

      L98 - there is a small typo

      Corrected

      L274: "funny" !?

      “Funny” currents, denoted If, were initially observed by electrophysiologists and later attributed to HCN channels, now indicated by Ih (thus the gene name Ih in Drosophila). These currents were termed "funny" due to their unusual properties compared to other currents. For more detailed information, please refer to the cited references.

      L257: Neuropeptide seemed to be abrupt

      We attempted to discuss possible mechanisms that mediate excitability changes across GRNs beyond SP shifts. Neuropeptides, which are chemical neurotransmitters along with small-molecule neurotransmitters, were mentioned following the discussion on synaptic transmission to suggest alternative pathways for excitability regulation. This inclusion is meant to provide a comprehensive overview of potential mechanisms influencing GRN activity.

      Reviewer #3 (Recommendations For The Authors):

      Congratulations on your fascinating research! The results are certainly of interest to the chemosensory field. However, I suggest using academic editing services to enhance the clarity of your text and ensure that the terminology and jargon align with standard usage in the field. The current choice of words may not be consistent with commonly used terms. As it is now, the writing might not fully showcase the compelling story and the effort behind your study, and is underselling your interesting results. Proper refinement could make sure your valuable findings are appropriately recognized.

      We appreciate your comments and apologize for any difficulties reviewers faced during the review process. We are currently prioritizing the review of scientific content and plan to address language issues in a subsequent revision. It would be very helpful for future revisions if the problematic sentences or expressions could be indicated in detail after this revision. This will allow us to ensure that our terminology and expression align with standard usage in the field, and that our findings are clearly and effectively communicated.

      Minor points:

      (1) Line 110: what is Ih-RF?

      We apologize that we relied on a reference in describing the cDNA. The following clause was inserted with additional reference and the Flybase id: “(Flybase id: FBtr0290109), which previously rescued Ih deficiency in other contexts17,26 ,”  

      (2) Line 158: Gr64af mutant flies still have Gr5a and a residual response to fructose and sucrose (Slone, Amrein 2007).

      We revised the line to “is severely impaired in sucrose and glucose sensing”, since there is a substantial loss of sucrose and glucose sensing in both Gr64af from Kim et al. 2018 and ΔGr64 from Slone et al. 2007, when they were examined by the proboscis extension reflex assay. This was also confirmed in the study by Jiao et al. 2009. We also deleted “sugar-ageusic” and instead describe the mutant as “impaired in sucrose and glucose sensing” in the Fig. 3 legend.

      (3) Lines 264-273 seem unnecessary. This paper is not about the function of HCN in mammals, and these discussions seem largely irrelevant.

      We feel that it is important to position our results within a broader context by discussing the potential implications of our findings for sensory systems of other animals. As we stated, HCN channels have been localized in mammalian sensory systems, but their roles are often not well understood. By including this discussion, we aim to highlight the relevance of our findings beyond the model organism used in our study and suggest possible areas for future research in mammalian systems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Overall, the manuscript is very well written, the approaches used are clever, and the data were thoroughly analyzed. The study conveyed important information for understanding the circuit mechanism that shapes grid cell activity. It is important not only for the field of MEC and grid cells, but also for broader fields of continuous attractor networks and neural circuits.

      We appreciate the positive comments.

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. However, it is unclear what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. More detailed information/statistics about the asynchronization of SC activity is necessary for interpreting the results.

      The short answer here is that spiking responses from the pairs of SCs that we sampled appear asynchronous. We now show this in the form of cross-correlograms for all recorded pairs of SCs (Figure 2, Figure Supplement 1). The correlograms lack peaks that would indicate synchronous activation. Thus, while our dataset is not large enough to rule out occasional direct synchronisation of SCs, this appears unlikely to account for synchronised input to PV+INs.

      This conclusion is consistent with consideration of mechanisms that could in principle synchronise SCs:

      First, if responses to ramping light inputs were fully deterministic, then this could lead to fixed relative timing of spikes fired by different SCs. This is unlikely given the influence of stochastic channel gating on SC spiking (Dudman and Nolan 2009) and is inconsistent with trial-to-trial variability in spike timing (Figure 2, Figure Supplement 2).

      Second, as SCs are glutamatergic they could excite one another. However, excitatory connections between stellate cells are rare (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016) and when detected they have low amplitude (mean < 0.25 mV; (Winterer et al. 2017)). Our finding that spiking by pairs of SCs is not correlated is consistent with this.

      Third, strong interaction between stellate cells mediated by local inhibitory pathways (Pastoll et al. 2013; Couey et al. 2013) could coordinate their activity. The lack of correlation between spiking of pairs of SCs suggests that such coordination is rarely recruited by our ramping protocols. Nevertheless, recruitment of inhibition may happen to some extent as experiments in Figure 4 show that correlated input from SCs to more distant, but not nearby PV+INs, is reduced by blocking inhibitory synapses. Given that we don't find evidence for synchronised spiking of SCs, this additional common input to widely separated PV+INs is instead best explained by recruitment of interneurons that act directly on the target SCs. We have modified Figure 8 to make this clear.

      Thus, for experiments with ramping light stimuli, synchronous activation of SCs is unlikely to explain common input to PV+INs. Input from the same SC best explains correlated responses of nearby PV+IN inhibitory populations, while recruitment of an additional inhibitory pathway may contribute to correlated responses of more distant PV+INs.

      For experiments using focal stimulation, substantial trial-to-trial variation in SC spike timing argues strongly against deterministic coordination. Indirect coordination of presynaptic neurons is also extremely unlikely given that focal activation is sparse and brief, while inputs from many presynaptic SCs are required to drive a postsynaptic interneuron to spike (e.g. Pastoll et al. 2013; Couey et al. 2013). Results from these experiments thus corroborate results from experiments using ramping light stimulation.

      In revising the manuscript we have tried to ensure these arguments are clear (e.g. p 5, para 3; p 6, para 2; p 10, para 1).
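
      The cross-correlogram check described above can be illustrated with a minimal sketch. It is not the analysis code used for the manuscript: the Poisson spike trains, the firing rate, the 5 ms bins, the 1 ms jitter and the `peak_ratio` summary are all assumptions chosen only to show how a flat correlogram (asynchronous pair) differs from one with a central peak (synchronised pair).

```python
import random


def poisson_train(rate_hz, duration_s, rng):
    """Spike times (s) of a homogeneous Poisson process (illustrative stand-in for SC spiking)."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return spikes
        spikes.append(t)


def cross_correlogram(spikes_a, spikes_b, max_lag=0.05, bin_width=0.005):
    """Histogram of spike-time differences (t_b - t_a) within +/- max_lag seconds."""
    n_bins = int(round(2 * max_lag / bin_width))
    counts = [0] * n_bins
    for ta in spikes_a:
        for tb in spikes_b:
            lag = tb - ta
            if -max_lag <= lag < max_lag:
                # Guard against floating-point rounding at the upper edge.
                idx = min(int((lag + max_lag) / bin_width), n_bins - 1)
                counts[idx] += 1
    return counts


def peak_ratio(counts):
    """Largest bin divided by the mean bin (hypothetical summary; about 1 for a flat correlogram)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean > 0 else 0.0


rng = random.Random(0)
cell_a = poisson_train(10.0, 100.0, rng)
cell_b = poisson_train(10.0, 100.0, rng)                 # independent of cell_a
cell_c = [t + rng.gauss(0.0, 0.001) for t in cell_a]     # cell_a with 1 ms jitter

independent = cross_correlogram(cell_a, cell_b)
synchronous = cross_correlogram(cell_a, cell_c)

print("independent pair peak ratio:", round(peak_ratio(independent), 2))
print("synchronous pair peak ratio:", round(peak_ratio(synchronous), 2))
```

      In this toy setting the independent pair gives a roughly flat correlogram, whereas the jittered copy gives a pronounced peak around zero lag, which is the signature that was absent from the recorded SC pairs.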

      (2) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. However, the evidence supporting this "direct interaction" between these two cell types is missing. Is it possible that pyramidal cells are also involved in this interaction? Some pieces of evidence or discussions are necessary to further support the "direct interaction".

      Indirect connections between stellate cells mediated via fast spiking inhibitory interneurons are well established by previous studies (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), and so were not addressed here. Previous work also establishes that connections from stellate cells to pyramidal cells are extremely rare (Winterer et al. 2017). Because the Sim1:Cre mouse line is specific to stellate cells and does not drive transgene expression in pyramidal cells (Sürmeli et al. 2015), it's therefore unlikely that pyramidal cells play a role.

      To make these points clearer we have modified the text in the discussion (p 5, para 3; p 10, paras 1 & 2). We have also modified Figure 8 to highlight that the indirect interaction may be best accounted for by inhibitory pathways onto PV+INs rather than via SCs (which our new cross-correlation analyses indicate is unlikely).

      Reviewer #2 (Public Review):

      In this study, Huang et al. employed optogenetic stimulation alongside paired whole-cell recordings in genetically defined neuron populations of the medial entorhinal cortex to examine the spatial distribution of synaptic inputs and the functional-anatomical structure of the MEC. They specifically studied the spatial distribution of synaptic inputs from parvalbumin-expressing interneurons to pairs of excitatory stellate cells. Additionally, they explored the spatial distribution of synaptic inputs to pairs of PV INs. Their results indicate that both pairs of SCs and PV INs generally receive common input when their relative somata are within 200-300 µm of each other. The research is intriguing, with controlled and systematic methodologies. There are interesting takeaways based on the implications of this work to grid cell network organization in MEC.

      We appreciate the positive comments.

      (1) Results indicate that in brain slices, nearby cells typically share a higher degree of common input. However, some proximate cells lack this shared input. The authors interpret these findings as: "Many cells in close proximity don't seem to share common input, as illustrated in Figures 3, 5, and 7. This implies that these cells might belong to separate networks or exist in distinct regions of the connectivity space within the same network.". Every slice orientation could have potentially shared inputs from an orthogonal direction that are unavoidably eliminated. For instance, in a horizontal section, shared inputs to two SCs might be situated either dorsally or ventrally from the horizontal cut, and thus removed during slicing. Given the synaptic connection distributions observed within each intact orientation, and considering these distributions appear symmetrically in both horizontal and sagittal sections, the authors should be equipped to estimate the potential number of inputs absent due to sectioning in the orthogonal direction. How might this estimate influence the findings, especially those indicating that many close neurons don't have shared inputs?

      Given we find high probabilities of correlated inputs to nearby cells in both planes, our conclusion that nearby cells are likely to receive common inputs appears to be independent of the slice plane. For cells further apart, where the degree of correlated input becomes more variable, it is possible that cell pairs that have low input correlations measured in one slice plane would have high input correlations if measured in a different plane. An argument against this is that as the cell pairs are further apart, it is less likely that an orthogonal axon would intersect dendritic trees of both cells. Nevertheless, we can't rule this out given the data here. We have amended the discussion to highlight this possibility (p 10, para 1). We agree it would be interesting to address this point further with quantitative analyses but this will be difficult without detailed reconstructions of the circuit.

      (2) The study examines correlations during various light-intensity phases of the ramp stimuli. One wonders if the spatial distribution of shared (or correlated) versus independent inputs differs when juxtaposing the initial light stimulation phase, which begins to trigger spiking, against subsequent phases. This differentiation might be particularly pertinent to the PV to SC measurements. Here, the initial phase of stimulation, as depicted in Figure 7, reveals a relatively sparse temporal frequency of IPSCs. This might not represent the physiological conditions under which high-firing INs function. While the authors seem to have addressed parts of this concern in their focal stim experiments by examining correlations during both high and low light intensities, they could potentially extract this metric from data acquired in their ramp conditions. This would be especially valuable for PV to SC measurements, given the absence of corresponding focal stimulation experiments.

      We understand the gist of the question to be whether differences in correlation scores between initial and later phases of responses to ramping light inputs can be used to infer spatial organisation. These differences are likely to reflect heterogeneity in the spiking of the input neurons, for example through differences in spike threshold, spike frequency adaptation and saturation of spiking (e.g. Figure 2, Figure Supplement 1A, and also see (Pastoll et al. 2020)). We don't expect these differences to have any spatial organisation along the mediolateral axis, and while spike threshold follows a dorsoventral organisation there is nevertheless substantial local variation between neurons (Pastoll et al. 2020). It's therefore unlikely we can use differences in early versus late correlations to make the inferences proposed by the reviewer.

      With respect to PV to SC measurements, similar heterogeneity is likely. We note that we were unable to carry out focal stimulation experiments for PV to SC connections as PV neurons did not spike in response to focal optogenetic stimulation.

      With respect to physiological conditions, our aim here is simply to assess connectivity in well controlled conditions, e.g. voltage-clamp, minimal spontaneous activity, known neuronal locations, etc. It's not clear that physiological activation patterns would improve on these tests and quite likely data would be noisier and harder to interpret.

      (3) Re results from Figure 2: Please fully describe the model in the methods section. Generally, I like using a modeling approach to explore the impact of convergent synaptic input to PVs from SCs that could effectively validate the experimental approach and enhance the interpretability of the experimental stim/recording outcomes. However, as currently detailed in the manuscript, the model description is inadequate for assessing the robustness of the simulation outcomes. If the IN model is simply integrate-and-fire with minimal biophysical attributes, then the results shown in Fig 2F might be trivial. Conversely, if the model offers a more biophysically accurate representation (e.g., with conductance-based synaptic inputs, synapses appropriately dispersed across the model IN dendritic tree, and standard PV IN voltage-gated membrane conductances), then the model's results could serve as a meaningful method to both validate and interpret the experiments.

      We appreciate the simulation descriptions were insufficient and have modified the manuscript to include additional details and clarification (p 14, paras 1-3).

      We're not sure we follow the logic here with respect to model types. The experiments were carried out in the voltage-clamp recording configuration with the goal of identifying correlated inputs independently from how they are integrated by the postsynaptic neuron. Given that membrane potential doesn't change (and so the CdVm/dt term of the membrane equation = 0), integrate and fire and point conductance-based models both simplify down to summing of input currents. We achieve this by convolving spike times with experimentally measured synaptic current waveforms. An assumption of our approach is that we achieve a reasonable space clamp. We believe this is justified given that stellate cells and PV interneurons are reasonably electrotonically compact, and that our analysis relies on consistent correlations rather than absolute amplitudes or time constants of the postsynaptic response and so should tolerate moderate space clamp errors.
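
      The argument above can be made concrete with a minimal sketch, offered as an illustration rather than the simulation code described in the manuscript. Under voltage clamp the postsynaptic current is modelled as the sum of presynaptic spike trains convolved with a synaptic current waveform. Here a biexponential kernel stands in for the experimentally measured waveform, and the time constants, firing rates, numbers of inputs and all function names are assumptions. Two cells that share half of their presynaptic trains show correlated currents, whereas a cell with entirely separate inputs does not.

```python
import math
import random


def poisson_train(rate_hz, duration_s, rng):
    """Spike times (s) of a homogeneous Poisson process (assumed presynaptic firing)."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return spikes
        spikes.append(t)


def synaptic_kernel(dt, tau_rise=0.0005, tau_decay=0.003, length_s=0.02):
    """Biexponential waveform normalised to peak 1 (stand-in for a measured waveform)."""
    n = int(round(length_s / dt))
    k = [math.exp(-i * dt / tau_decay) - math.exp(-i * dt / tau_rise) for i in range(n)]
    peak = max(k)
    return [v / peak for v in k]


def summed_current(trains, kernel, dt, duration_s):
    """Voltage-clamp approximation: add one kernel copy per presynaptic spike."""
    n = int(round(duration_s / dt))
    current = [0.0] * n
    for train in trains:
        for t in train:
            start = int(t / dt)
            for j, k in enumerate(kernel):
                if start + j >= n:
                    break
                current[start + j] += k
    return current


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


rng = random.Random(1)
dt, duration = 0.0002, 10.0
kernel = synaptic_kernel(dt)


def make_trains(n):
    return [poisson_train(5.0, duration, rng) for _ in range(n)]


shared = make_trains(5)  # presynaptic neurons contacting both cell A and cell B
cell_a = summed_current(shared + make_trains(5), kernel, dt, duration)
cell_b = summed_current(shared + make_trains(5), kernel, dt, duration)
cell_c = summed_current(make_trains(10), kernel, dt, duration)  # no inputs in common

r_shared = pearson(cell_a, cell_b)
r_independent = pearson(cell_a, cell_c)
print("half of inputs shared:", round(r_shared, 2))
print("no shared inputs:", round(r_independent, 2))
```

      Because every input has the same rate and waveform in this sketch, the expected correlation for the pair sharing half of its inputs is about 0.5; with measured waveforms and heterogeneous inputs the value would differ, but the contrast with the independent pair is the point of interest.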

      Reviewer #3 (Public Review):

      This paper presents convincing data from technically demanding dual whole-cell patch recordings of stellate cells in medial entorhinal cortex slice preparations during optogenetic stimulation of PV+ interneurons. The authors show that the patterns of postsynaptic activation are consistent with dual recorded cells close to each other receiving shared inhibitory input and sending excitatory connections back to the same PV neurons, supporting a circuitry in which clusters of stellate cells and PV+IN interact with each other with much weaker interactions between clusters. These data are important to our understanding of the dynamics of functional cell responses in the entorhinal cortex. The experiments and analysis are quite complex and would benefit from some revisions to enhance clarity.

      These are technically demanding experiments, but the authors show quite convincing differences in the correlated response of cell pairs that are close to each other in contrast to an absence of correlation in other cell pairs at a range of relative distances. This supports their main point of demonstrating anatomical clusters of cells receiving shared inhibitory input.

      We appreciate the positive comments.

      The overall technique is complex and the presentation could be more clear about the techniques and analysis. In addition, due to this being a slice preparation they cannot directly relate the inhibitory interactions to the functional properties of grid cells which was possible in the 2-photon in vivo imaging experiment by Heys and Dombeck, 2014.

      We have modified the manuscript to try to improve the presentation (specific changes are detailed below). We agree that an important future challenge is to relate our findings to in vivo observations (p 11, para 2).

      Reviewer #1 (Recommendations For The Authors):

      Major points

      (1) The study largely relies on the fact that ramp-like wide-field optogenetic stimulation and focal optogenetic activation both drove asynchronous action potentials in SCs, and therefore, if a pair of PV+ INs exhibited correlated activity, they should receive common inputs. In Figure 2 and its supplementary figures, the authors also showed examples of asynchronized activity. However, it is unclear to me what criteria/thresholds were used to determine the level of activity asynchronization, and under these criteria, what percentage of cells actually showed synchronized or less asynchronized activity. A notable percentage of synchronized or less asynchronized SCs could complicate the results, i.e., PV+ INs with correlated activity could receive inputs from different SCs (different inputs), which had synchronized activity. Related to this concern, it would also be important to simulate what level of activity asynchronization in SCs could still lead to correlated PV+ IN activity above shuffle, and among the recorded SCs, what percentage of cells belong to this synchronized/less asynchronized category.

      We address this point in our response to the public review. In brief, we have added additional cross-correlograms showing that ramp activation of SC pairs does not cause detectable synchronous activation. We also clarify that sensitivity of correlations of some widely separated pairs to GABA-blockers is suggestive of SCs activating common inhibitory inputs to cell pairs.

      (2) The above concern is more relevant to the focal stimulation experiments, in which the authors tried to claim that a pair of PV+ INs with correlated activity could receive inputs from the same SCs neurons. The authors also showed that the stimulation patterns leading to the activation of PV+ INs were more similar if PV+ INs had correlated activity (Figure 5D). However, if nearby SCs were more synchronized than distal SCs within this stimulation scale, even though a pair of PV+ INs showed correlated activity, they could still receive inputs from different but nearby SCs. In this case, it would be helpful to quantify the relationship between the level of activity synchronization of SCs and their distances. In Figure 5 Supplementary Figure 1, the data were only provided for 8 cells. If feasible, collecting data from more cells would be needed for the proposed analysis.

      We explain in our responses to point 1 above and in the public review that direct synchronisation of SCs is unlikely. This is particularly unlikely for focal stimulation experiments as the timing of responses of individual SCs is extremely variable between trials. Thus, even if there were strong synaptic connections between SCs, which the evidence suggests there are not (Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016), this would be unlikely to result in reliably timed coordinated firing.

      (3) It is unclear what the definition of "common inputs" is. Do they refer to inputs from the same group of cells? If different groups of cells provide synchronized inputs, will the inputs be considered "common inputs" or "different inputs"?

      We used "common" in an attempt to be consistent with classic work by Yoshimura et al. and in an attempt to be succinct. Thus, by common input we are referring to cell pairs for which a proportion of their input is from the same presynaptic neuron(s), as opposed to cell pairs for which their input is from different neurons and therefore have no common input. We have attempted to make sure this is clear in the revised manuscript (e.g. description of simulations on p 4, para 2).

      (4) In the introduction and abstract, it was mentioned that "dense, but specific, direct excitatory-inhibitory synaptic interactions may operate at the scale of grid cell clusters". It is unclear to me how "dense" was demonstrated in the data. Can the authors clarify?

      Thanks for flagging this, we were insufficiently clear. We have revised the text to refer to cell pairs for which a proportion of their input is from the same presynaptic neurons (e.g. p 3, para 1), and separately about indirect coordination, by which we mean inputs to cell pairs that appear correlated because of coordination between upstream neurons.

      (5) The hypothesis about the "direct excitatory-inhibitory" synaptic interactions is made based on the GABAzine experiments in Figure 4. In the Figure 8 diagram, the direct interaction is illustrated between PV+ INs and SCs. Is there any evidence supporting this "direct interaction"?

      The direct interactions from SCs to PV+INs and from PV+INs to SCs were previously demonstrated by experiments with recordings from pairs of neurons (e.g. Pastoll et al. 2013; Couey et al. 2013; Fuchs et al. 2016; Winterer et al. 2017). Our results in Figures 3-5, which show that exciting SCs by light activation of ChR2 leads to excitation of PV+INs, and in Figure 7, which show that light activation of PV+INs expressing ChR2 leads to inhibition of SCs, are consistent with these previous conclusions. We have modified the manuscript to make sure this is clear (p 2, para 3).

      Is it possible that pyramidal cells are also involved in this interaction? If this is unlikely, the author may provide some pieces of evidence (e.g., timing of responses after optogenetic stimulation) or some discussions.

      This is unlikely given that previous studies indicate that connections from stellate to pyramidal cells are weak or absent (Winterer et al. 2017). We now clarify this in the Discussion (p 10, para 1).

      Minor points

      (1) Page 4, last paragraph: the authors claimed that CCpeakmean was reduced and CClagvar increased with cell separation. Although the trends are visible in the figures, the authors may provide appropriate statistics to support this statement, such as a correlation between cell separation and CCpeakmean/CClagvar.

      We have inserted summaries of linear model fits into the legends for Figure 3E-F, Figure 5F-H and Figure 7D.
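
      To illustrate the kind of summary a linear model fit provides, the following is a minimal sketch of an ordinary least-squares fit of peak cross-correlation against cell separation. The data are synthetic, and the names `fit_line`, `separation_um`, and `cc_peak_mean` are hypothetical; nothing here is taken from the study's data or analysis code.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit y = slope * x + intercept, plus the Pearson r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 polynomial fit
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation coefficient
    return slope, intercept, r

# Synthetic, illustrative data only (NOT values from the study):
# a weak linear decline of the correlation peak with separation, plus noise.
rng = np.random.default_rng(0)
separation_um = rng.uniform(20, 400, size=60)
cc_peak_mean = 0.8 - 0.001 * separation_um + rng.normal(0, 0.05, size=60)

slope, intercept, r = fit_line(separation_um, cc_peak_mean)
print(f"slope = {slope:.4f} per um, intercept = {intercept:.3f}, r = {r:.2f}")
```

      A negative slope with a clearly negative r would support a statement that the measure decreases with separation; a full analysis would also report uncertainty on the slope.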

      (2)  If I understood correctly, in the second last paragraph on page 6, "pairs of SCs" should be changed to "pairs of PV+ INs".

      Thanks. Corrected.

      (3)  Page 9: the 7th line to the end: where is Figure S4?

      Corrected to 'Figure 3, Figure Supplement 2'.

      (4)  Page 27: at the end of figure caption B: two ".

      Corrected.

      (5)  Figures 3A and B: what are the red vertical rectangles?

      These are the regions shown on an expanded time base in C and D. This is now clarified in the legend.

      (6)  Page 28 Figure caption of D and E: (C) and (D) should be (D) and (E).

      Corrected.

      (7)  The first sentence of the third paragraph in INTRODUCTION: 'later' should be 'layer'.

      Corrected.

      Reviewer #2 (Recommendations For The Authors):

      - Some related work has been done by Beed et al. 2013 to map the spatial distribution of inputs to neurons in MEC. Certainly, there are differences in the approaches and the key questions, but the contribution of this study would benefit from a more detailed comparison of the results from Beed vs the current study and should be included in the discussion.

      It's hard to include a detailed comparison of results, at least without losing focus, as the two studies address different questions with different approaches. We already noted that 'Local optical activation of unidentified neurons has also been used to infer connectivity principles but with a focus on responses of single postsynaptic neurons (Beed et al., 2013, 2010)'. In addition, we now note that 'Our focal optogenetic stimulation approach also offers insight into the spatial organization of presynaptic neuronal populations, with the advantage, compared to focal glutamate uncaging previously used to investigate connectivity in the MEC (Beed et al., 2013, 2010), that the identity of the presynaptic cell population is genetically defined'.

      - There are a few places where the language is ambiguous or needs a more detailed description for clarity.

      - 3rd paragraph under "Focal activation of SCs generates common input to nearby PV+INs". The correlation probability description in this paragraph and a similar sentence in the methods are very hard to understand. I had to look up the analysis in Yoshimura et al. 2005 to understand what was done here. It's a nice analysis, but the manuscript could benefit from a more detailed description of this measure in the methods.

      We agree, it is a somewhat complex metric and is challenging to explain. In the interests of keeping the main text succinct, we have left the bare bones explanation as it was in the Results, but have expanded the explanation in the Methods. We hope this is now clear.

      - " Alternatively, if there is no clear spatial organization of SC to PV+INs connections, then the similarity between stimulus locations for pairs of SCs should have a random distribution." This sentence is hard to understand. I think the use of the phrase "similarity of stimulus location" is a strange phrasing and is driving the confusion in this sentence.

      We have replaced this with 'correspondence between active stimulus locations'.

      - In the discussion under "Spatial extent and functional organization of L2 circuits" there is a grammatical mistake (seems to be 2x phrasing of "leads to common synaptic input").

      Corrected.

      - Citation in the introduction/discussion. Introduction: in addition to Gu et al. 2018, Heys et al 2014 also showed there are non-random correlations among putative grid cells as a function of their somatic distance. In the discussion section, in addition to Gu et al. 2018, Heys et al. 2014 showed there is anatomical clustering of grid cells in MEC. This earlier work investigating functional correlations among neurons in the superficial aspect of MEC in vivo should be cited and is particularly relevant in these two sections of the manuscript.

      Thanks, we apologise for the oversight. We're well aware of this important study and have now cited it.

      -Typo - Paragraph 3 of the intro; "later" should be layer.

      Corrected.

      -Figure 5 (D-E) there is a typo high correlation probability is D and low correlation is E (text says C/D).

      Corrected.

      Reviewer #3 (Recommendations For The Authors):

      The paper is missing the bibliography section. This makes the review somewhat difficult as some cited papers are not immediately familiar based on the citation.

      Thanks, and our apologies for creating extra work by omitting this. It is now included.

      Page 2 - "cell clusters" - they should also cite the paper by Heys and Dombeck, 2014 that shows a spatial scale of inhibitory interactions computed based on correlations of grid cells recorded using 2-photon calcium imaging.

      Added (see above).

      Page 2 - "later 2 of the MEC" - layer.

      Corrected.

      Page 2 - "synaptic interactions" - again they should mention the work by Heys and Dombeck, 2014 that indirectly measured the spatial scale of inhibition.

      Now cited in this paragraph.

      Page 4 "we simulated responses" and Figure 2E - in each simulation - did they fit the magnitude and time constant of the simulated EPSCs to individual EPSCs in the data? Or did they randomly vary these to find the best fit?

      The parameters for the simulations are given in the Methods and were chosen to correspond to the experimental values. We have rewritten this section to make the simulation methods clearer. Simulations using different time constants within a physiological range support similar conclusions.

      Page 4 - "we identified 35/71" - Are these the cells that appear in yellow as correlated in Figures 3E-F? If so, the text should indicate that these cells are shown in yellow.

      We have added this and have also updated the legends for additional clarification.

      Figure 2, Figure Supplement 1 - B,C - the following phrase is not clear: "when the 4 / 8 of each neurons inputs from SCs also project to the other neuron (B)," Should the "the" be removed? Also, by 4/8 do they mean 50%, or do they mean 4 to 8?

      Thanks, we've reworded to improve the clarity.

      E - "receiving presynaptic inputs consisted of 4 overlapping SCs" - should it say "consisting"?

      Corrected.

      Figure 3, Figure Supplement 1 part E - "the same data as (C )" - should this be the same data as (D)?? I do not see how doing clustering on the shuffled data in (C ) would give two groups, but it makes sense if it is from (D).

      That's right, now corrected.

      Page 5 - "used action potentials" - this is confusing. Is the word "used" supposed to be there?

      Corrected.

      Page 5 - "widefield activation experiments" - they should cite the experiments that they are referring to here.

      Added.

      Page 5 - "effect of blocking" - "Figure 4" - I find it very odd that the agent GABAzine in Figure 4 is not explicitly mentioned in the main text (though it is mentioned in the methods). The main text should indicate that blocking was performed using GABAzine.

      Added.

      Page and page 14 and Figure 5 - "shifted" - do they mean shuffled?

      We do. The classic papers by Yoshimura et al. used shifted so we keep this here so it's clear we've used their approach. We've added additional explanation to try to make sure the meaning is clear.

      Figure 5 A, B, D, and E would benefit from a more detailed description. They should state whether the labels "1a" and "1b" and "2a" and "2b" refer to different recorded neurons in each pair. They should indicate that 2a and 2b are a different pair? Are the x, y axes of the images corresponding to anatomical position? Does "B" indicate the location of recordings shown in Figure 5B? The authors probably think this is all obvious, but it is not immediately obvious to the reader.

      We have added additional clarification.

      Page 8 - "Beed et al." - These papers by Beed ought to be cited in the introduction as well as they are highly relevant.

      We now cite Beed et al. 2013 in the Introduction when we discuss local inhibitory input to SCs. While the Beed et al. 2010 paper is an important contribution to understanding about pathways from deep to superficial layers, the introduction focuses on communication between identified pre- and postsynaptic populations within layer 2 and therefore we haven't found a way to cite it without losing focus. We do cite this paper multiple times elsewhere.

      Page 10 - "Excitatory-inhibitory interactions" - this summary of attractor models ought to cite the paper by Burak and Fiete as well.

      The discussion focuses on models with excitatory-inhibitory connectivity and cites an important paper from the Fiete group. The model by Burak and Fiete, while also important, is purely inhibitory and so is not well constrained by the known circuitry, and therefore could not be correctly cited here.

      Page 10 - "be consistent with models…or that focus on pyramidal neurons have also been proposed" - this seems ungrammatical as if two different sentences were merged.

      Corrected.

      References

      Couey, Jonathan J, Aree Witoelar, Sheng-Jia Zhang, Kang Zheng, Jing Ye, Benjamin Dunn, Rafal Czajkowski, et al. 2013. “Recurrent Inhibitory Circuitry as a Mechanism for Grid Formation.” Nat. Neurosci. 16 (3): 318–24. https://doi.org/10.1038/nn.3310.

      Dudman, Joshua T, and Matthew F Nolan. 2009. “Stochastically Gating Ion Channels Enable Patterned Spike Firing through Activity-Dependent Modulation of Spike Probability.” PLoS Comput. Biol. 5 (2): e1000290. https://doi.org/10.1371/journal.pcbi.1000290.

      Fuchs, Elke C, Angela Neitz, Roberta Pinna, Sarah Melzer, Antonio Caputi, and Hannah Monyer. 2016. “Local and Distant Input Controlling Excitation in Layer II of the Medial Entorhinal Cortex.” Neuron 89 (1): 194–208. https://doi.org/10.1016/j.neuron.2015.11.029.

      Pastoll, Hugh, Derek L Garden, Ioannis Papastathopoulos, Gülşen Sürmeli, and Matthew F Nolan. 2020. “Inter- and Intra-Animal Variation in the Integrative Properties of Stellate Cells in the Medial Entorhinal Cortex.” eLife 9 (February). https://doi.org/10.7554/eLife.52258.

      Pastoll, Hugh, Lukas Solanka, Mark C W van Rossum, and Matthew F Nolan. 2013. “Feedback Inhibition Enables Theta-Nested Gamma Oscillations and Grid Firing Fields.” Neuron 77 (1): 141–54. https://doi.org/10.1016/j.neuron.2012.11.032.

      Sürmeli, Gülşen, Daniel Cosmin Marcu, Christina McClure, Derek L F Garden, Hugh Pastoll, and Matthew F Nolan. 2015. “Molecularly Defined Circuitry Reveals Input-Output Segregation in Deep Layers of the Medial Entorhinal Cortex.” Neuron 88 (5): 1040–53. https://doi.org/10.1016/j.neuron.2015.10.041.

      Winterer, Jochen, Nikolaus Maier, Christian Wozny, Prateep Beed, Jörg Breustedt, Roberta Evangelista, Yangfan Peng, Tiziano D’Albis, Richard Kempter, and Dietmar Schmitz. 2017. “Excitatory Microcircuits within Superficial Layers of the Medial Entorhinal Cortex.” Cell Rep. 19 (6): 1110–16. https://doi.org/10.1016/j.celrep.2017.04.041.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works [1,2] have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.
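
      The distinction between the two kinds of rule can be made concrete with a minimal Hopfield-style sketch. It assumes binary ±1 patterns and a one-step sign update, which are simplifications chosen for brevity and not the rate-based model of the manuscript; the names `J_sym`, `J_asym`, `step`, and `overlap` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 500, 5                               # neurons, patterns in the sequence
xi = rng.choice([-1.0, 1.0], size=(P, N))   # random binary patterns (assumption)

# Temporally symmetric rule: each pattern is associated with itself,
# which makes the stored patterns fixed points (attractors).
J_sym = xi.T @ xi / N

# Temporally asymmetric rule: pattern mu is associated with pattern mu + 1,
# which drives transitions along the sequence.
J_asym = xi[1:].T @ xi[:-1] / N

def step(J, state):
    """One synchronous update with a sign nonlinearity."""
    return np.sign(J @ state)

def overlap(a, b):
    """Normalised overlap between two +/-1 patterns (1.0 means identical)."""
    return float(a @ b) / len(a)

start = xi[0]
after_sym = step(J_sym, start)    # expected to stay close to pattern 0
after_asym = step(J_asym, start)  # expected to move close to pattern 1

print("symmetric:  overlap with pattern 0 =", overlap(after_sym, xi[0]))
print("asymmetric: overlap with pattern 1 =", overlap(after_asym, xi[1]))
```

      In this toy setting the symmetric matrix holds the network on the current pattern while the asymmetric matrix advances it by one step, which is the intuition behind using the relative drive to the two components to set retrieval speed.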

      Strengths:

      The authors have expertly characterised the system dynamics using both simulations and theory. How the speed and quality of retrieval vary across phase space has been well-studied. The authors are also able to vary the external inputs to reproduce a preparatory followed by an execution phase of sequence retrieval as seen experimentally in motor control. They also propose a simple reinforcement learning scheme for learning to map the two external inputs to the desired retrieval speed.

      Weaknesses:

      (1) The authors translate spike-based synaptic plasticity rules to a way to learn/set connections for rate units operating in discrete time, similar to their earlier work in [5]. The bio-plausibility issues of learning in [5] carry over here, e.g. the authors ignore any input due to the recurrent connectivity during learning and effectively fix the pre and post rates to the desired ones. While the learning itself is not fully bio-plausible, it does lend itself to writing the final connectivity matrix in a manner that is easier to analyze theoretically.

      We agree with the reviewer that learning is not `fully bio-plausible’. However, we believe that extending the results to a model in which synaptic plasticity depends on recurrent inputs is beyond the scope of this work. We have added a mention of this issue in the Discussion in the revised manuscript.

      (2) While the authors learn to map the set of two external input strengths to speed of retrieval, they still hand-wire one external input to the subpopulation of neurons with temporally symmetric plasticity and the other external input to the other subpopulation with temporally asymmetric plasticity. The authors suggest that these subpopulations might arise due to differences in the parameters of Ca dynamics as in their earlier work [29]. How these two external inputs would connect to neurons differentially based on the plasticity kernel / Ca dynamics parameters of the recurrent connections is still an open question which the authors have not touched upon.

      The issue of how external inputs could self-organize to drive the network to retrieve sequences at appropriate speeds is addressed in the Results section, paragraph `Reward-driven learning’. These inputs are not `hand-wired’ - they are initially random and then acquire the necessary strengths to allow the network to retrieve the sequences at different speeds thanks to a simple reinforcement learning scheme. We have rewritten this section to clarify this issue.

      (3) The authors require that temporally symmetric and asymmetric learning rules be present in the recurrent connections between subpopulations of neurons in the same brain region, i.e. some neurons in the same brain region should have temporally symmetric kernels, while others should have temporally asymmetric ones. The evidence for this seems thin. Though, in the discussion, the authors clarify 'While this heterogeneity has been found so far across structures or across different regions in the same structure, this heterogeneity could also be present within local networks, as current experimental methods for probing plasticity only have access to a single delay between pre and post-synaptic spikes in each recorded neuron, and would therefore miss this heterogeneity'.

      We agree with the reviewer that this is currently an open question. We describe this issue in more detail in the Discussion of the revised manuscript.

      (4) One aspect that the authors have not connected to is an earlier work by one of the authors:

      Brunel, N. (2016). Is cortical connectivity optimized for storing information? Nature Neuroscience, 19(5), 749-755. https://doi.org/10.1038/nn.4286, which argues that the experimentally observed over-representation of symmetric synapses indicates that cortical networks are optimized for attractors rather than sequences.

      We thank the reviewer for this suggestion. We have added a paragraph in the discussion that discusses work on statistics of synaptic connectivity in optimal networks. We expect that in networks that contain two subpopulations of neurons, the degree of symmetry should be intermediate between a network storing fixed point attractors exclusively, and a network storing sequences exclusively.

      Despite the above weaknesses, the work is a solid advance in proposing an alternate model for modulating speed of sequence retrieval and extends the use of well-established theoretical tools. This work is expected to spawn further works like extending to a spiking neural network with Dale's law, more realistic learning taking into account recurrent connections during learning, and experimental follow-ups. Thus, I expect this to be an important contribution to the field.

      We thank the reviewer for the insightful comments.

      Reviewer #2 (Public Review):

      Sequences of neural activity underlie most of our behavior. And as experience suggests we are (in most cases) able to flexibly change the speed for our learned behavior which essentially means that brains are able to change the speed at which the sequence is retrieved from the memory. The authors here propose a mechanism by which networks in the brain can learn a sequence of spike patterns and retrieve them at variable speed. At a conceptual level I think the authors have a very nice idea: use of symmetric and asymmetric learning rules to learn the sequences and then use different inputs to neurons with symmetric or asymmetric plasticity to control the retrieval speed. The authors have demonstrated the feasibility of the idea in a rather idealized network model. I think it is important that the idea is demonstrated in more biologically plausible settings (e.g. spiking neurons, a network with exc. and inh. neurons with ongoing activity).

      Summary

      In this manuscript authors have addressed the problem of learning and retrieving sequential activity in neuronal networks. In particular, they have focussed on the problem of how sequence retrieval speed can be controlled.

      They have considered a model with excitatory rate-based neurons. Authors show that when sequences are learned with both temporally symmetric and asymmetric Hebbian plasticity, by modulating the external inputs to the network the sequence retrieval speed can be modulated. With the two types of Hebbian plasticity in the network, sequence learning essentially means that the network has both feedforward and recurrent connections related to the sequence. By giving different amounts of input to the feed-forward and recurrent components of the sequence, authors are able to adjust the speed.

      Strengths

      - Authors solve the problem of sequence retrieval speed control by learning the sequence in both feedforward and recurrent connectivity within a network. It is a very interesting idea for two main reasons: 1. It does not rely on delays or short-term dynamics in neurons/synapses 2. It does not require that the animal is presented with the same sequences multiple times at different speeds. Different inputs to the feedforward and recurrent populations are sufficient to alter the speed. However, the work leaves several issues unaddressed as explained below.

      Weaknesses

      - The main weakness of the paper is that it is mostly driven by a motivation to find a computational solution to the problem of sequence retrieval speed. In most cases they have not provided any arguments about the biological plausibility of the solution they have proposed e.g.:

      - Is there any experimental evidence that some neurons in the network have symmetric Hebbian plasticity and some temporally asymmetric? In the references authors have cited some references to support this. But usually the switch between temporally symmetric and asymmetric rules is dependent on spike patterns used for pairing (e.g. bursts vs single spikes). In the context of this manuscript, it would mean that in the same pattern, some neurons burst and some don't and this is the same for all the patterns in the sequence. As far as I see here authors have assumed a binary pattern of activity which is the same for all neurons that participate in the pattern.

      There is currently only weak evidence for heterogeneity of synaptic plasticity rules within a single network, though there is plenty of evidence for such heterogeneity across networks or across locations within a particular structure (see references in our Discussion). The reviewer suggests another interesting possibility, that the temporal asymmetry could depend on the firing pattern of the post-synaptic neuron. An example of such behavior can be found in a paper by Wittenberg and Wang in 2006, where they show that pairing single spikes of pre- and post-synaptic neurons leads to LTD at all time differences in a symmetric fashion, while pairing a pre-synaptic spike with a burst of post-synaptic spikes leads to temporally asymmetric plasticity, with an LTP window at short positive time differences. We now mention this possibility in the Discussion, but we believe fully exploring this scenario is beyond the scope of the paper.

      - How would external inputs know that they are impinging on a symmetric or asymmetric neuron? Authors have proposed a mechanism to learn these inputs. But that makes the sequence learning problem a two stage problem -- first an animal has to learn the sequence and then it has to learn to modulate the speed of retrieval. It should be possible to find experimental evidence to support this?

      Our model does not assume that the two processes necessarily occur one after the other. Importantly, once the correct external inputs that can modulate sequence retrieval are learned, sequence retrieval modulation will automatically generalize to arbitrary new sequences that are learned by the network.

      - Authors have only considered homogeneous DC input for sequence retrieval. This kind of input is highly unnatural. It would be more plausible if the authors considered fluctuating input which is different from each neuron.

      We have modified Figure 1e and Figure 2c to show the effects of fluctuating inputs on pattern correlations and single unit activity. We find that these inputs do not qualitatively affect our results.

      - All the work is demonstrated using a firing rate based model of only excitatory neurons. I think it is important that some of the key results are demonstrated in a network of both excitatory and inhibitory spiking neurons. As the authors very well know it is not always trivial to extend rate-based models to spiking neurons.

      I think at a conceptual level authors have a very nice idea but it needs to be demonstrated in a more biologically plausible setting (and by that I do not mean biophysical neurons etc.).

      We have included a new section in the discussion with an associated figure (Figure 7) demonstrating that flexible speed control can be achieved in an excitatory-inhibitory (E-I) spiking network containing two excitatory populations with distinct plasticity mechanisms.

      Reviewer #1 (Recommendations For The Authors):

      In the introduction, the authors state: 'symmetric kernels, in which coincident activity leads to strengthening regardless of the order of pre and post-synaptic spikes, have also been observed in multiple contexts with high frequency plasticity induction protocols in cortex [21]'. To my understanding, [21]'s final model 3, ignores LTD if the post-spike also participates in LTP, and only considers nearest-neighbour interactions. Thus, the kernel would not be symmetric. Can the authors clarify what they mean and how their conclusion follows, as [21] does not show any kernels either.

      In this statement, we were not referring to the model in [21], but rather the experimentally observed plasticity kernels at different frequencies. In particular, we were referring to the symmetric kernel that appears in the bottom panel of Figure 7c in that paper.

      The authors should also address the weaknesses mentioned above. They don't need to solve the issues but expand (and maybe indicate resolutions) on these issues in the Discussion.

      For ease of reproducibility, the authors should make their code available as well.

      We intend to publish the code required to reproduce all figures on GitHub.

      Reviewer #2 (Recommendations For The Authors):

      -  Show the ground state of the network before and after learning.

      We have decided not to include such a figure, as we have not analyzed the learning process, but instead a network with a fixed connectivity matrix which is assumed to be the end result of a learning process.

      -  Authors have only considered a network of excitatory neurons. This does not make sense. I think they should demonstrate a network of both exc. and inh. neurons (spiking neurons) exhibiting ongoing activity.

      See our comment to Reviewer #2 in the previous section.

      -  Show how the sequence dynamics unfolds when we assume a non-zero ongoing activity.

      We are not sure what the reviewer means by `non-zero ongoing activity’. We now show the dynamics of the network in the presence of noisy inputs, which can represent ongoing activity from other structures (see Fig 1e and 2c).

      -  From the correlation (==quality) alone it is difficult to judge how well the sequence has been recovered. Authors should consider showing some examples so that the reader can get a visual estimate of what 0.6 quality may mean. High speed is not really associated with high quality (Fig 2b). So it is important to show how the sequence retrieval quality is for non-linear and heterogeneous learning rules.

      We believe that some insight into the relationship between speed and quality for the case of non-linear and heterogeneous learning rules is provided by the correlation plots for chosen input configurations (see Fig. 3a and 5b). We leave a full characterization for future work.

      -  Authors should show how the retrieval and quality of sequences change when they are recovered with positive input, or positive input to one population and negative to another. In the current version sequence retrieval is shown only with negative inputs. This is a somewhat non-biological setting. The inhibitory gating argument (L367-389) is really weak.

      We would like to clarify that with the parameters chosen in this paper, the transfer function has half its maximal rate at zero input. This is due to the fact we chose the threshold to be zero, using the fact that any threshold can be absorbed in the external inputs. Thus, negative inputs really mean sub-threshold inputs, and they are consistent with sub-threshold external excitatory inputs. We have clarified this issue in the revised manuscript.
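
      The threshold argument can be written out explicitly. The sketch below assumes a sigmoidal transfer function; its specific form and the symbols $\theta$ (threshold), $\beta$ (gain), $r_{\max}$ (maximal rate), $h_i$ (recurrent input) and $\tilde{I}_i$ (shifted external input) are notation introduced here for illustration and may differ from the manuscript.

```latex
\begin{align}
\phi_\theta(x) &= \frac{r_{\max}}{1 + e^{-\beta (x - \theta)}}, &
\phi_\theta(\theta) &= \frac{r_{\max}}{2}, \\
\phi_\theta(h_i + I_i) &= \frac{r_{\max}}{1 + e^{-\beta (h_i + I_i - \theta)}}
  = \phi_0(h_i + \tilde{I}_i), &
\tilde{I}_i &= I_i - \theta .
\end{align}
```

      Under this assumption, a negative shifted input $\tilde{I}_i < 0$ simply means $I_i < \theta$: an external drive below threshold, which can still be excitatory ($I_i > 0$) whenever $\theta > 0$.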

      -  Authors should demonstrate how the sequence retrieval dynamics is altered when they assume a fluctuating input current for sequence retrieval instead of a homogeneous DC input.

      See our comment to Reviewer #2 in the previous section.

      -  Authors should show what are the differences in synaptic weight distribution for the two types of learning (bi-linear and non-linear). I am curious to know if the difference in the speed in the two cases is related to the weight distribution. In general I think it is a good idea to show the synaptic weight distribution before and after learning.

      As mentioned above, we do not study any learning process, but rather a network with a fixed connectivity matrix, assumed to represent the end result of learning. In this network, the distribution of synaptic weights converges to a Gaussian in the large p and cN limits, independently of the functions f and g, because of the central limit theorem, if there are no sign constraints on weights. In the presence of sign constraints, the distribution is a truncated Gaussian.
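
      The central limit argument can be checked numerically with a small sketch. It assumes full connectivity, Gaussian patterns, `tanh` as a stand-in for both f and g, and a 1/sqrt(p) scaling chosen so that weights are of order one; all of these are illustrative assumptions rather than the choices made in the manuscript.

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 200, 300                        # neurons, number of stored transitions
xi = rng.standard_normal((p + 1, N))   # Gaussian patterns, one row per pattern

f = np.tanh  # assumed post-synaptic nonlinearity (hypothetical choice)
g = np.tanh  # assumed pre-synaptic nonlinearity (hypothetical choice)

# J_ij is proportional to the sum over mu of f(xi_i^{mu+1}) * g(xi_j^mu);
# the 1/sqrt(p) factor is an illustrative scaling, not the manuscript's.
J = f(xi[1:]).T @ g(xi[:-1]) / np.sqrt(p)

# Compare the empirical weight distribution with a Gaussian via its
# standardised third and fourth moments (both are 0 for an exact Gaussian).
w = J.ravel()
z = (w - w.mean()) / w.std()
skewness = float((z ** 3).mean())
excess_kurtosis = float((z ** 4).mean() - 3.0)

print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```

      Each weight is a sum of p weakly dependent terms, so both moments should be close to zero for large p, whatever bounded nonlinearities are substituted for `tanh`.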

      -  I suggest the use of a monochromatic color scale for figure 2b and 3b.

      Figure 3: The sentence describing panel 2 seems incomplete.

      Also explain why there is non-monotonic relationship between I_s and speed for some values of I_a in 3b

      There is a non-monotonic relationship for retrieval quality, not speed. We have clarified this in the manuscript text, but don’t currently have an explanation for why this phenomenon occurs for these specific values of I_a.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Additional Discussion Points

      (1) There is not much exploration of potential mechanisms, i.e., the impact of PV neuron activity on the broader circuit. Additionally, the study exclusively focuses on PV cells and does not explore the role of other prefrontal populations, particularly those known to respond to cue-evoked fear states. The discussion should consider how PV activity might impact the broader circuit and whether the present findings are specific to PV cells or applicable to other interneuron subtypes.

      We have added an extensive discussion of potential mechanisms and the potential contributions of other interneuron subtypes:

      “For example, PV neurons aid in improving visual discrimination through sharpening response selectivity in visual cortex (Lee et al., 2012). In prefrontal cortex, PV neurons are critical for task performance, particularly during performance of tasks that require flexible behavior such as rule shift learning (Cho et al., 2020) and reward extinction (Sparta et al., 2014). Further, PV neurons play an essential role in the generation of cortical gamma rhythms, which contribute to synchronization of selective populations of pyramidal neurons (Sohal et al., 2009; Cardin et al., 2009). Courtin et al (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons (Courtin et al., 2014). This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004). These and other studies support the idea that PV neural activity supports the execution of a behavior by shaping rather than suppressing cortical activity, potentially by selecting among conflicting behaviors by the synchronization of different pyramidal populations (Warden et al., 2012; Lee et al., 2014).

      The roles of other inhibitory neural subtypes (such as somatostatin (SOM)-expressing and vasoactive intestinal peptide (VIP)-expressing IL GABA neurons) in avoidance behavior are currently unknown, but are likely important given the role of SOM neurons in gamma-band synchronization (Veit et al., 2017), and the role of VIP neurons in regulating PV and SOM neural activity (Cardin, 2018).” 

      (2) There is some discordance between changes in neural activity and behavior. For example, in Figure 4C, the relationship between PV neuron activity and movement emerges almost immediately during learning, but successful active avoidance emerges much more gradually. Why is this?

      We have added extensive text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (LeDoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (3) vmPFC was defined here as including the infralimbic (IL) and dorsal peduncular (DP) regions. While the role of IL has been frequently characterized for motivated behavior, relatively few studies have examined DP. Perhaps the authors are just being cautious, given the challenges involved in the viral targeting of the IL region without leakage to nearby regions such as DP. But since the optical fibers were positioned above the IL region, it is possible that DP did not contribute much to either the fiber photometry signals or the effects of the optogenetic manipulations. Perhaps DP should be completely omitted, which is more consistent with the definitions of vmPFC in the field.

      Yes, we included DP to be cautious as our viral expression sometimes leaks into DP, though the optic fiber targets IL. We have replaced vmPFC with IL throughout the manuscript. 

      (4) In the Discussion, the authors should consider why PV cells exhibit increased activity during both movement initiation and successful chamber crossing during avoidance. While the functional contribution of the PV signal during movement initiation was tested with optogenetic inhibition, some discussion on the possible role of the additional PV signal during chamber crossing is of interest to readers who are intrigued by the signaling of two events. Is the chamber crossing signal related to successful avoidance or learned safety (e.g., see Sangha, Diehl, Bergstrom, Drew 2020)?

      IL PV neural activity starts to increase at movement initiation, peaks at chamber crossing (when movement speed is highest), and decreases after chamber crossing (Figure 1E). Thus, the increase in PV neural activity at movement initiation and at chamber crossing are different phases of the same event. 

      We think this signal is unlikely to be a safety signal, and have added text to the discussion to clarify this issue:

      “We think the IL PV signal is unlikely to be a safety signal (Sangha et al., 2020). First, the PV signal rises during movement not only in the avoidance context, but during any movement in a “threatening” context (i.e. a context where the animal has been shocked). For example, PV neural activity rises during movement during the intertrial interval in the avoidance task. Further, the emergence of the PV signal during movement happens quickly – after the first shock – and significantly before the animal has learned to move to the safe zone. This suggests a close association with enabling movement in a threatening environment, when animals must suppress a freezing response in order to move. Additionally, the rise in PV activity was specifically associated with movement and not with tone offset, the indicator of safety in this task. Finally, if IL PV neural activity reflects safety signals one would expect the response to be enhanced by learning, but the amplitude of the IL PV response was unaffected by learning after the first shock.”

      (5) The primary conclusion here that PV cells control the fear response should be considered within the context of prior findings by the Herry laboratory. Courtin et al (2014) demonstrated a select role of prefrontal PV cells in the regulation of fear states, accomplished through their control over prefrontal output to the basolateral amygdala. The observations in this paper, which used both ChR2 and Arch-T to address the impact of vmPFC PV activity on reactive behavior, are highly relevant to issues raised both in the Introduction and Discussion.

      Courtin et al (2014)’s finding is very important. We did not discuss this paper originally because Courtin et al. is about dmPFC, which has a different role in fear processing than IL/vmPFC. We have added text about this finding to the discussion:

      “Courtin et al (2014) showed that brief suppression of dorsomedial prefrontal (dmPFC) PV neural activity enhanced fear expression, one of the main functions of the dmPFC, by synchronizing the spiking activity of dmPFC pyramidal neurons (Courtin et al., 2014). This result is potentially relevant to our findings, but likely involves different circuit mechanisms because of the difference in timescale, targeted area, and downstream projection targets (Vertes, 2004).”

      Additional analyses

      (1) As avoidance trials progress (particularly on days 2 and 3), do PFC PV responses attenuate? That is, do continued unreinforced tone presentations lead to reduced reliance on PV cell-mediated suppression in order for successful avoidance to occur?

      We added Figure 1—Figure supplement 1M and 1N and a sentence on page 5: “IL PV neural activity during the avoidance movement was not attenuated by learning or repeated reinforcement (Figure 1—Figure supplement 1M and N, N = 8 mice, p = 0.8886, 1-way ANOVA).” We only included data from days 1 and 2, since we started to introduce short and long tone trials on day 3 which might interfere. 

      (2) In Figure 3D, it would be very informative and further support the claim of "no role for movement during reward" if the response of these cells during the "initiation of movement during reward-approach" was shown (similar to Figure 1F for threat avoidance).

      Thank you for the question. We added Figure 3—Figure supplement 1B and C to show IL PV neural activity aligned to initiation of movement during reward-approach. IL PV activity decreased after movement initiation for reward approach (N = 6 mice, p=0.0382, paired t-test). This further solidifies our claim that IL PV neuron activity only increases for threat avoidance.   

      Reviewer 1 (Recommendations For The Authors):

      (1) Fig1G shows the average response of PV cells during chamber crossing on an animal-to-animal basis. It would be informative to also see a similar plot for movement initiation.

      We have added the suggested figure in Figure 1—Figure supplement 1B.  

      (2) In the Results section (Page 5), there is a small issue with the logic. It says: "As vmPFC inactivation impairs avoidance behavior, the activity of inhibitory vmPFC PV neurons might be predicted to be low during successful avoidance trials." As opposed to "low", it should say "high", right? If inhibition impairs avoidance, then high responding by these cells would be presumed to drive the avoidance response, as supported by your findings.

      We have re-worded the text in this section. Based on prior findings that IL inactivation impairs avoidance (Moscarello et al., 2013), we predicted that inhibitory PV neurons would be less active during avoidance, because activating these neurons could suppress IL. However, we found that they were selectively active during avoidance.

      (3) In the caption/legend for Fig1E, it says that the "black ticks" indicate "tone onset". But it should say "movement initiation".

      We thank the reviewer for pointing out this error. The ticks do indicate tone onset, and we have corrected the figure to reflect this. 

      Reviewer 2 (Recommendations For The Authors):

      (4) Perhaps replace the term 'good outcomes' with 'reinforcing outcomes' or simply 'reinforcement'.

      Thank you for the suggestion. We have replaced ‘good outcomes’ with ‘reinforcing outcomes’.

      Reviewer 3 (Recommendations For The Authors):

      (5) It would be useful to provide some (perhaps speculative) explanation for the discordance between the PV activity-movement relationship and success of active avoidance in Fig. 4C

      We have added text to the discussion that addresses this issue:

      “Interestingly, the rise in IL PV neural activity during movement does not require avoidance learning. IL PV neurons begin to respond during movement immediately after the animal has received a single shock in an environment, but learning to cross the chamber to avoid the signaled shock takes tens of trials. Why is there a discordance between the emergence of the IL PV signal during movement and avoidance learning?

      The components underlying active avoidance have been debated over the years, but are thought to involve at least two essential behaviors – suppressing freezing, and moving to safety (LeDoux et al., 2017). Freezing is the default response of mice upon hearing a shock-predicting tone, and can be learned in a single trial (LeDoux, 1996; Fanselow, 2010; Zambetti et al., 2022). When a predator is in the distance, freezing can increase the chance of survival by reducing the chances of detection. However, a strategic avoidance behavior may prevent a future encounter with the predator altogether. The importance of IL PV neural activity in defensive behavior may be to suppress reactive defensive behaviors such as freezing in order to permit a flexible goal-directed response to threat.

      The freezing suppression and avoidance movement components of the avoidance response are dissociable, both because freezing precedes avoidance learning, and because animals intermittently move prior to avoidance learning. Our finding that the rise in PV activity during movement emerges immediately after receiving a single shock, tens of trials before animals have learned the avoidance behavior, suggests that the IL PV signal is associated with the suppression of freezing. Further, IL PV neurons do not respond during movement toward cued rewards because in reward-based tasks there is no freezing response in conflict with reward approach behavior.” 

      (6) I don't really understand what is shown in Figure 4D -- exactly what time points does this represent? Was habituation performed every day?

      Figure 4D shows data from the approach task, not the avoidance task. This data is from well-trained mice, not the first day of training on this task. There was a pre-task recording period every day.

      (7) Why was optogenetic inhibition only delivered from 0.5-2.5 sec after the tone cue?

      We wanted to avoid any possibility that perception of the tone would be disrupted, so we delayed the onset of optogenetic inhibition. We chose 0.5 sec onset because animals typically begin to move ~1 second after tone onset.

      (8) The regression analysis with shuffled time points is not well explained -- some additional methodological details are needed (Fig. 2H).

      We added the following to the methods section to provide a clearer explanation: 

      “DF/F (t) was modeled as the linear combination of all event kernels. Given the event occurrence time points of all event types, we can use linear regression to decompose characteristic kernels for each event type. Kernel coefficients of the model were solved by minimizing the mean square errors between the model and the actual recorded signals. To prove that kernel ki is an essential component for the raw calcium dynamics, we compared the explanatory power of the full model to the reduced model where the time points of the occurrence of event ki were randomly assigned. Thus, the kernel coefficients should not reflect the response to the event in the reduced model.”
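The regression described above can be illustrated with a minimal sketch. This is not the authors' code: the function names (`design_matrix`, `fit_r2`, `shuffled_r2`), the kernel length, the synthetic kernels, and the noise level are all assumptions chosen for illustration. It builds a lagged indicator matrix for each event type, solves for the kernel coefficients by least squares, and compares the variance explained by the full model with that of a reduced model in which one event type's time points are randomly reassigned.

```python
import numpy as np

def design_matrix(event_times, n_samples, kernel_len):
    """One column per (event type, lag); entry counts events that occurred `lag` samples earlier."""
    cols = []
    for times in event_times:
        for lag in range(kernel_len):
            col = np.zeros(n_samples)
            idx = np.asarray(times) + lag
            idx = idx[idx < n_samples]
            np.add.at(col, idx, 1.0)  # accumulate so overlapping events sum linearly
            cols.append(col)
    return np.column_stack(cols)

def fit_r2(X, y):
    """Least-squares kernel coefficients and the fraction of variance explained."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, r2

def shuffled_r2(event_times, y, kernel_len, which, rng):
    """Reduced model: time points of event type `which` are randomly reassigned."""
    reduced = [np.asarray(t) for t in event_times]
    n_events = len(reduced[which])
    reduced[which] = rng.choice(len(y) - kernel_len, size=n_events, replace=False)
    X = design_matrix(reduced, len(y), kernel_len)
    return fit_r2(X, y)[1]

# Synthetic demonstration (all values hypothetical).
rng = np.random.default_rng(0)
n_samples, kernel_len = 2000, 20
times_a = rng.choice(n_samples - kernel_len, size=30, replace=False)
times_b = rng.choice(n_samples - kernel_len, size=30, replace=False)
k_a = np.exp(-np.arange(kernel_len) / 5.0)                     # assumed kernel for event type A
k_b = 0.5 * np.sin(np.pi * np.arange(kernel_len) / kernel_len)  # assumed kernel for event type B

X_full = design_matrix([times_a, times_b], n_samples, kernel_len)
signal = X_full @ np.concatenate([k_a, k_b]) + rng.normal(0.0, 0.05, n_samples)

coef, r2_full = fit_r2(X_full, signal)
r2_reduced = shuffled_r2([times_a, times_b], signal, kernel_len, which=0, rng=rng)
print(f"full R^2 = {r2_full:.3f}, reduced R^2 (event A shuffled) = {r2_reduced:.3f}")
```

In this sketch, a large drop in explained variance after shuffling one event type's time points is taken as evidence that the corresponding kernel captures a real response. In practice the shuffle would typically be repeated many times to build a null distribution rather than run once.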

      Editor's notes:

      -  Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      Thank you for pointing this out. We have included all the test statistics and exact p values as suggested.

      -  Please note the sex of the mice and distribution of sexes in each group for each experiment.

      We have added the sex of mice for all experiments in the methods section.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This work successfully identified and validated TRLs in hepatic metastatic uveal melanoma, providing new horizons for enhanced immunotherapy. Uveal melanoma is a highly metastatic cancer that, unlike cutaneous melanoma, has a limited effect on immune checkpoint responses, and thus there is a lack of formal clinical treatment for metastatic UM. In this manuscript, the authors described the immune microenvironmental profile of hepatic metastatic uveal melanoma by sc-RNAseq, TCR-seq, and PDX models. Firstly, they identified and defined the phenotypes of tumor-reactive T lymphocytes (TRLs). Moreover, they validated the activity of TILs by in vivo PDX modelling as well as in vitro coculture of 3D tumorsphere cultures and autologous TILs. Additionally, the authors found that TRLs are mainly derived from depleted and late activated T cells, which recognize melanoma antigens and tumor-specific antigens. Most importantly, they identified TRLs associated phenotypes, which provide new avenues for targeting expanded T cells to improve cellular and immune checkpoint immunotherapy.

      Strengths:

      Jonas A. Nilsson et al. have been working on new therapies for melanoma. The team has also previously performed the most comprehensive genome-wide analysis of uveal melanoma available, presenting the latest insights into metastatic disease. In this work, the authors performed paired sc-RNAseq and TCR-seq on 14 patients with metastatic UM, which is the largest single-cell map of metastatic UM available. This provides huge data support for other studies of metastatic UM.

      We thank the reviewer for these kind words about our work.

      Weaknesses:

      Although the paper does have strengths in principle, the weaknesses of the paper are that these strengths are not directly demonstrated. That is, insufficient analyses are performed to fully support the key claims in the manuscript by the data presented. In particular:

      The authors' description of the overall results of the article should be logical, not just a description of the observed phenomena. For example, the presentation related to the results of TRLs lacked logic. In addition, the title of the article emphasizes the three subtypes of hepatic metastatic UM TRLs, but these three subtypes are not specifically discussed in the results or in the discussion section. The title of the article is not a very comprehensive generalization and should be carefully considered by the authors.

      We thank the reviewer for the critical reading of our work. We have added more data and more discussion.

      The authors' claim that they are the first to use autologous TILs and sc-RNAseq to study immunotherapy needs to be supported by the corresponding literature to be more convincing. This can help the reader to understand the innovation and importance of the methodology.

      We have gone through the manuscript and found that we only refer to being first in using PDX models and autologous TILs to study immunotherapy responses by single-cell sequencing. While there are data to be deduced from other studies, we still believe this to be an accurate statement.

      In addition, the authors argue that TILs from metastatic UM can kill tumor cells. This is the key and bridging point to the main conclusion of the article. Therefore, the credibility of this conclusion should be considered. Metastatic UM1 and UM9 remain responsive to autologous tumors under in vitro conditions with their autologous TILs.

      UM1 also responds in vivo in the subcutaneous model in the paper. We have also finished an experiment showing that this model responds in a liver metastasis model as well. These data have been added to this revised version of the paper. We have added two main figures and one supplementary figure in which we characterize the response in vivo and by single-cell sequencing of TILs.

      In contrast, UM22, also as a metastatic UM, did not respond to TIL treatment. In particular, the presence of MART1-responsive TILs. The reliability of the results obtained by the authors in the model of only one case of UM22 liver metastasis should be considered. The authors should likewise consider whether such a specific cellular taxon might also exist in other patients with metastatic UM, producing an immune response to tumor cells. The results would be more comprehensive if supported by relevant data.

      The reviewer has interpreted the results correctly: the allogenic and autologous MART1-specific TILs, while reactive in vitro against UM22, cannot kill this tumor in either a subcutaneous or a liver metastasis model. We hypothesize that this has to do with an immune exclusion phenotype and show weak immunohistochemistry that suggests this. We hope the addition of more UM1 data can be viewed as supportive of tumor reactivity in vivo as well.

      In addition, the authors in that study used previously frozen biopsy samples for TCR-seq, which may be associated with low-quality sequencing data, high risk of outcome indicators, and unfriendly access to immune cell information. The existence of these problems and the reliability of the results should be considered. If special processing of TCR-seq data from frozen samples was performed, this should also be accounted for.  

      We agree with the reviewers and acknowledge that we never anticipated the development of single-cell sequencing techniques when we started the biobank in 2013. We performed dead cell removal before the 10x Genomics experiment. We have also done extensive quality controls and believe that the data from the biopsies should be viewed as a whole and that quantitative intra-patient comparisons cannot be done.

      Reviewer #2 (Public Review):  

      Summary:  

      The study's goal is to characterize and validate tumor-reactive T cells in liver metastases of uveal melanoma (UM), which could contribute to enhancing immunotherapy for these patients. The authors used single-cell RNA and TCR sequencing to find potential tumor-reactive T cells and then used patient-derived xenograft (PDX) models and tumor sphere cultures for functional analysis. They discovered that tumor-reactive T cells exist in activated/exhausted T cell subsets and in cytotoxic effector cells. Functional experiments with isolated TILs show that they are capable of killing UM cells in vivo and ex vivo.

      Strengths:  

      The study highlights the potential of using single-cell sequencing and functional analysis to identify T cells that can be useful for cell therapy and marker selection in UM treatment. This is important and novel as conventional immune checkpoint therapies are not highly effective in treating UM. Additionally, the study's strength lies in its validation of findings through functional assays, which underscores the clinical relevance of the research. 

      We thank the reviewer for these kind words about our work.

      Weaknesses:  

      The manuscript may pose challenges for individuals with limited knowledge of single-cell analysis and immunology markers, making it less accessible to a broader audience.

      The first draft of the manuscript (excluding methods) was written by a person (J.A.N) who is not a bioinformatician. It has been corrected to include the correct nomenclature where applicable but overall it is written with the aim to be understandable. We have made an additional effort in this version. 

      Reviewer #1 (Recommendations For The Authors):  

      (1) Firstly, the authors should provide high-resolution pictures to ensure readability for readers. 

      We converted the figures to PDF ourselves, which improved the resolution. We are happy to provide high-resolution images to the office if needed for printing.

      (2) Furthermore, some parts of the article are more colloquial, and the authors should consider the logic and academic nature of the overall writing of the article. For example, authors should double-check whether the relevant expressions in the results are correct. For example, 'TCR' in the fourth part of the results should be 'TRLs'.

      We thank the reviewer for the recommendations and have gone through the manuscript.

      (3) Moreover, UM22 is described several times in the results as a metastatic UM and should be clearly defined in the methodology.

      The UM22 and UM1 samples are described in-depth in Karlsson et al., Nature Communications, 2020, a paper that is cited in the beginning of Results as part of the narrative. The current work can be viewed as an extension of that work.

      (4) Finally, it is recommended that authors describe a part of the results in full before citing the corresponding picture, otherwise, it will lead to confusion among readers.

      We have made an effort in the revised version to describe the new data in more detail.

      Reviewer #2 (Recommendations For The Authors):  

      The manuscript is very interesting and important to understanding key aspects of uveal melanoma immune profile and functionality. However, in my opinion, there are a few aspects that could be addressed.  

      - The manuscript lacks comprehensive details about the samples used, such as their disease progression, response to treatment, or any relevant information that could shed light on potential differences between samples. It would be valuable to know whether these samples were collected before any systemic treatment or if any of the patients underwent immunotherapy post-sample collection, along with the outcomes of such treatments. Providing this information would enrich the manuscript and provide a more holistic view of the research.

      We thank the reviewer for the recommendation and have included a new Supplementary table 7 with information about the samples. We have also added the individual samples’ contributions to the UMAP to provide a more holistic view.

      - The results presented and discussed in the manuscript seem to indicate that there were no significant differences across the various samples, including comparisons between lymph-node and liver metastases. However, this lack of variation or the reasons for not discussing any observed differences should be clarified. If there are distinctions between the samples, it would be beneficial to discuss these findings in the manuscript.

      We thank the reviewer for the recommendation. Although 14 samples are many for a uveal melanoma study, the study is not really powered to do intra-patient comparisons.

      - The manuscript may pose difficulties for individuals with limited knowledge of single-cell analysis and immunology markers, potentially limiting its accessibility. To make the research more inclusive, the authors might consider presenting the technical aspects of their work in a less descriptive manner and providing explanations for those less familiar with the technology. This would help a broader audience grasp the significance of the study's findings. 

      The manuscript is from a multidisciplinary team where all have read and commented. The draft was written by a tumor biologist and edited by a bioinformatician for accuracy. We honestly think it is more understandable than most studies in this bioinformatics era. But we have tried to describe the new data in an easier way.

    1. if we fail to control our numbers and our appetites well then yes our society will start to crash in a similar way to that of Easter Island only on a worldwide scale and that means the whole industrial civilization will break down and our descendants will essentially be uh savages to use that term very advisably and savages in the sense that they will have lost the fruits of civilization and hate us

      for - progress trap - dark futures scenario - like Easter Island but on a global scale

      comment - The potential global breakdown of global industrialized society, rupturing supply chains so that our highly interdependent world becomes the very Achilles Heel that hastens its demise is chilling - It could mean a huge disruption to the most important aspect of civilization - the continuing accruing and inter-generational transmission of knowledge - It would be catastrophic to lose that, but it is entirely possible - As Wright himself famously said, to use a computer metaphor, we humans are like 50,000 year old hardware, running modern software - By that, he meant that our cognitive physiology (brain and sensory processing system) has not changed for tens of thousands of years, yet cultural evolution happens at exponentially faster rates, so much so that our biological systems are not adapted to keep up with the pace, and that spells disaster - When we no longer have the sensory or cognitive apparatus to sense danger, and we are offloading that to AI, we are in an extremely vulnerable situation

      progress trap - Gedanken - Think of our ancestors from 50,000 years ago. - What Wright is saying with his metaphor is that if a child from 50,000 years ago were transported by a time machine to modernity, (s)he would have little problem integrating into modern society - LIKEWISE, if we lose all the fruits of knowledge accumulated over so many thousands of years, it would be like being born into a human tribe 50,000 years ago. - We would likely still have language, but all our technology may have to start from scratch!

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors provide solid data on a functional investigation of potential nucleoid-associated proteins and the modulation of chromosomal conformation in a model cyanobacterium. While the experiments presented are convincing, the manuscript could benefit from restructuring towards the precise findings; alternatively, additional data buttressing the claims made would significantly enhance the study. These valuable findings will be of interest to the chromosome and microbiology fields.

      We thank the editors for taking the time to assess our work and the reviewers for their critical suggestions. Both reviewers were concerned about our interpretation of the 3C data, and Reviewer #2 suggested biochemical analysis of cyAbrB2 to reinforce our claim. We agree with the concern and suggest that the editors add the sentence “How cyAbrB2 affects chromosome structure is still elusive from this study, and the biochemical assays are needed in the future experiment.” to the eLife assessment.

      The major revision points are the following:

      Reconstruction of Figures

      Previous Figure 5E has been omitted

      Additional 3C data on the nifJ region

      Rephrasing the conclusion of 3C data

      Additional discussion on cyAbrB2 and NAPs

      Reviewer #1 (Public Review): 

      Strength: 

      At first glance, I had a very positive impression of the overall manuscript. The experiments were well done, the data presentation looks very structured, and the text reads well in principle.

      Weakness: 

      Having a closer look, the red line of the manuscript is somewhat blurry. Reading the abstract, the introduction, and parts of the discussion, it is not really clear what the authors exactly aim to target. Is it the regulation of fermentation in cyanobacteria because it is under-investigated? Is it to bring light to the transcriptional regulation of hydrogenase genes? The regulation by SigE? Or is it to get insight into the real function of cyAbrB2 in cyanobacteria? All of this would be good of course. But it appears that the authors try to integrate all these aspects, which in the end is a little bit counterintuitive and in some places even confusing. From my point of view, the major story is a functional investigation of the presumable transcriptional regulator cyAbrB2, which turned out to be a potential NAP. To demonstrate/prove this, the hox genes have been chosen as an example due to the fact that a regulatory role of cyAbrB2 has already been described. In my eyes, it would be good to restructure or streamline the introduction according to this major outcome. 

      As you pointed out, the major focus of this study is cyAbrB2 as a potential NAP. To focus on NAPs, we simplified the first paragraph of the discussion (ll.246-263) and added a section comparing cyAbrB2 with other known NAPs (ll.269-299). To emphasize the description of cyAbrB2, we also rearranged the figures and divided the analysis of the cyAbrB2 ChIP data into two figures. We shortened the first paragraph of the introduction but mostly preserved the composition of the introduction to keep the general-to-specific pattern, even though the manuscript is blurry.

      Points to consider: 

      The authors suggest that the microoxic condition is the reason for the downregulation of e.g. photosynthesis (l.112-114). But of course, they also switched off the light to achieve a microoxic environment, which presumably is the trigger signal for photosynthesis-related genes. I suggest avoiding making causal conclusions exclusively related to oxygen and recommend rephrasing (for example, "were downregulated under the conditions applied").

      We agree with this point. We rephrased l.114 to “by the transition to dark microoxic conditions from light aerobic conditions” (ll.108-109).

      The authors hypothesized that cyAbrB2 modulates chromosomal conformation and conducted a 3C analysis. But if I read the data in Figure 5B & C correctly, there is a lot of interaction in a range of 1650 and 1700 kb, not only at marked positions c and j. Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant? In the case of position j the variation between the replicates seems quite high, in the case of position c the mean difference is not that high. Moreover, does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A? If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT. That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown. But I have to mention that I am not an expert in these kinds of assays. Nevertheless, if there is a biological function that shall be revealed by an experiment, the data must be crystal clear on that. At least the descriptions of the 3C data and the corresponding conclusions need to be improved. For me, it is hard to follow the authors' thoughts in this context. 

      Following your suggestion, we carefully re-examined the 3C data. Furthermore, we conducted an additional 3C experiment on the nifJ region (Figures 7F-J). We acknowledge that we had overinterpreted the 3C data. Therefore, we rewrote the results and discussion of the 3C assay in line with the data (ll.220-245) and removed the previous Figure 5E. Our individual responses follow.

      Positions c and j have been picked because it appears that cyAbrB2 deletion impacts this particular interaction. But is it really significant?

      We could not find statistically significant differences at loci c and j. Therefore, we added the following to the results section: “Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.231-232)

      does all this correlate with cyAbrB2 binding, i.e. with positions of gray bars in panel A?

      As you suspected, interaction frequency and cyAbrB2 binding do not correlate. Therefore, we withdrew the previous claim and stated the following: “Moreover, our 3C data did not support bridging at least in hox region and nifJ region, as the high interaction locus and cyAbrB2 binding region did not seem to correlate (Figure 7).” (ll.280-282)

      If this was the case, the data obtained for the cyabrB2 mutant should look totally different but they are quite similar to WT.

      We rewrote it as follows; “Then we compared the chromatin conformation of wildtype and cyabrb2∆. Although overall shapes of graphs did not differ, some differences were observed in wildtype and cyabrb2∆ (Figures 7B and 7G); interaction of locus (c) with hox region were slightly lower in cyabrb2∆ and interaction of loci (f’) and (g’) with nifJ region were different in wildtype and cyabrb2∆. Note that the interaction scores exhibit considerable variability and we could not detect statistical significance at those loci.” (ll.228-232)

      That's why the sentence "By contrast, the interaction frequency in Δcyabrb2 mutant were low and unchanged in the aerobic and microoxic conditions" does not fit to the data shown.

      We rewrote the sentence as follows: “While the interaction scores exhibit considerable variability, the individual data over time demonstrate declining trends of the wildtype at locus (c) and (j) (Figure S8). In ∆cyabrb2, by contrast, the interaction frequency of loci (c) and (j) was unchanged in the aerobic and microoxic conditions (Figure 7E). The interaction frequency of locus (c) in ∆cyabrb2 was as low as that in the microoxic condition of wildtype, while that of locus (j) in ∆cyabrb2 was as high as that in the aerobic condition of wildtype (Figures 7B and 7C).” (ll.238-243)

      The figures are nicely prepared, albeit quite complex and in some cases not really supportive of the understanding of the results description. Moreover, they show a rather loose organization that sometimes does not fit the red line of the results section. For example, Figure 1D is not mentioned in the paragraph that refers to several other panels of the same figure (see lines 110-128). Panel 1D is mentioned later in the discussion. Does 1D really fit into Figure 1 then? Are all the panels indeed required to be shown in the main document? As some elements are only briefly mentioned, the authors might also consider moving some into the supplement (e.g. left part of Figure 1C, Figure 2A, Figure 3B ...) or at least try to distribute some panels into more figures. This would reduce complexity and increase comprehensibility for future readers. Also, Figure 3 is way too complex. Panel G could be a standalone figure. The latter would also allow for an increase in font sizes or to show ChIP data of both conditions (L+O2 and D-O2) separately. Moreover, a figure legend typically introduces the content as a whole by one phrase but here only the different panels are described, which fits the impression that all the different panels are not well connected. Of course, it is the decision of the authors what to present and how, but they may consider restructuring and simplifying.

      According to the advice, we have rearranged the Figure composition.

      The left side of Figure 1C has been moved to supplement. Instead, representative expression fold changes of “Transient”, “Plateau”, “Continuous”, and “Late” genes are shown for comprehensibility. We left Figure 1D in Figure 1, as this diagram shows our motive to focus on hox and nifJ. We moved Figure 2A to supplement. We did not move Fig3B, as this figure shows the distribution of cyAbrB2 (“long tract of AT-rich DNA”) comprehensively and simply. We agree that Figure 3 was too complex. Therefore, we moved Figures 3F and 3G to a new independent figure (Figure 4). In Figure 4C (former 3G), we show the ChIP data of the L+O2 condition only, and the change of ChIP data under the D-O2 condition is shown in Figure 5. The schematic image showing cyanobacterial chromosome and NAPs (previous Figure 5E) was omitted because it was overinterpreting.

      The authors assume a physiological significance of transient upregulation of e.g. hox genes under microoxic conditions. But does the hydrogenase indeed produce hydrogen under the conditions investigated and is this even required? Moreover, the authors use the term "fermentative gene". But is hydrogen indeed a fermentation product, i.e. are protons the terminal electron acceptor to achieve catabolic electron balance? Then huge amounts of hydrogen should be released. Comment should be made on this.

      This is a very important point. Yes, hydrogenase indeed produces hydrogen under the conditions we investigated, and protons accept a majority of the reducing power under the dark microoxic condition. We wrote the following in the introduction: “Hydrogen is generated in quantities comparable to lactate and dicarboxylic acids as the result of electron acceptance in the dark microoxic condition (Akiyama and Osanai 2023; Iijima et al. 2016)” (ll.54-55). A detailed explanation follows, although it is omitted from the manuscript.

      A recent study (Akiyama and Osanai 2023) quantified the consumed glycogen and secreted fermentative products (hydrogen, lactate, dicarboxylic acid, and acetate) in Synechocystis under the dark microoxic condition, the same conditions as we investigated. The system of the study consists of a 10 mL liquid layer and a 10 mL gas layer, cultivated for 3 days under dark microoxic conditions. The amounts of lactic acid, dicarboxylic acid, and hydrogen were then approximately 2 µmol, 3.5 µmol, and 11 µmol, respectively (assuming the gas layer was at 1 atm and ignoring the fraction in the aqueous phase). On the other hand, glycogen equivalent to 15 µmol of glucose was consumed in the system. This estimate supports the view that hydrogen accounts for a substantial portion of fermentative products during dark microoxic conditions.
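
      The proportions implied by these figures can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the amounts quoted above; the function names are illustrative, and the shares are computed on a molar basis across the three products listed (no acetate amount is quoted in the response, so it is left out).

```python
# Amounts quoted in the response (µmol) for the 10 mL liquid / 10 mL gas system.
products_umol = {"lactate": 2.0, "dicarboxylic acids": 3.5, "hydrogen": 11.0}
glucose_consumed_umol = 15.0  # glycogen consumed, expressed as glucose equivalents


def product_shares(products):
    """Molar share of each product among the products listed."""
    total = sum(products.values())
    return {name: amount / total for name, amount in products.items()}


def yield_per_glucose(products, glucose_umol):
    """Moles of each product formed per mole of glucose equivalent consumed."""
    return {name: amount / glucose_umol for name, amount in products.items()}


shares = product_shares(products_umol)
yields = yield_per_glucose(products_umol, glucose_consumed_umol)

for name in products_umol:
    print(f"{name}: {shares[name]:.1%} of listed products, "
          f"{yields[name]:.2f} mol per mol glucose equivalent")
```

      On these numbers hydrogen is the largest of the three listed products on a molar basis, which is the sense in which it accounts for a substantial portion of the fermentative output.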

      The necessity of hydrogen production under dark microoxic conditions was demonstrated by Gutekunst et al. (2014). They showed that hydrogenase activity is required for mixotrophic growth in the light-dark and microoxic cycle with arginine. The necessity remains unclear in our conditions because we only performed continuous dark microoxic conditions without glucose.

      The authors also mention a reverse TCA cycle. But is its existence an assumption or indeed active in cyanobacteria, i.e. is it experimentally proven? The authors are a little bit vague in this regard (see lines 241-246).

      We misused the terminology. We meant to refer to the “reductive branch of TCA”. Cyanobacteria operate a branched TCA cycle under microoxic conditions. One of the branches is the reductive branch, which reduces oxaloacetate to produce malate. We corrected “reverse TCA cycle” to “reductive branch of TCA”. (Figure 1D and ll.260-262)

      Reviewer #2 (Public Review): 

      This work probes the control of the hox operon in the cyanobacterium Synechocystis, where this operon directs the synthesis of a bidirectional hydrogenase that functions to produce hydrogen. In assessing the control of the hox system, the authors focused on the relative contributions of cyAbrB2, alongside SigE (and to a lesser extent, SigA and cyAbrB1) under both aerobic and microoxic conditions. In mapping the binding sites of these different proteins, they discovered that cyAbrB2 bound many sites throughout the chromosome, repressed many of its target genes, and preferentially bound regions that were (relatively) rich in AT-residues. These characteristics led the authors to consider that cyAbrB2 may function as a nucleoid-associated protein (NAP) in Synechocystis, given its functional similarities with other NAPs like H-NS. They assessed the local chromosome conformation in both wild-type and cyabrB2 mutant strains at multiple sites within a 40 kb window on either side of the hox locus, using a region within the hox operon as bait. They concluded that cyAbrB2 functions as a nucleoid-associated protein that influences the activity of SigE through its modulation of chromosome architecture.

      The authors approached their experiments carefully, and the data were generally very clearly presented and described.

      Based on the data presented, the authors make a strong case for cyAbrB2 as a nucleoid-associated protein, given the multiple ways in which it seems to function similarly to the well-studied Escherichia coli H-NS protein. It would be helpful to provide some additional commentary within the discussion around the similarities and differences of cyAbrB2 to other nucleoid-associated proteins, and possible mechanisms of cyAbrB2 control (post-translational modification; protein-protein interactions; etc.). The manuscript would also be strengthened with the inclusion of biochemical experiments probing the binding of cyAbrB2, particularly focusing on its oligomerization and DNA polymerization/bridging potential.

      We agree with the comment that biochemical experiments will deepen our insights into cyAbrB2 and chromatin conformation. As the reviewer pointed out, biochemical assays will provide valuable information on mechanisms of cyAbrB2 control, such as post-translational modification, cooperation with cyAbrB1, oligomerization, and the structure of cyAbrB2-bound DNA. However, we think those potential findings are worthy of a new independent research paper, rather than a part of this paper. Therefore, we added a discussion mentioning biochemistry as future work (ll.275-290; the section “The biochemistry of cyAbrB2 will shed light on the regulation of chromatin conformation in the future”).

      Previous work had revealed a role for SigE in the control of hox cluster expression, which nicely justified its inclusion (and focus) in this study. However, the results of the SigA studies here suggested that SigA both strongly associated with the hox promoter, and its binding sites were shared more frequently than SigE with cyAbrB2. The focus on cyAbrB2 is also well-justified, given previous reports of its control of hox expression; however, it shares binding sites with an essential homologue cyAbrB1. Interestingly, while the B1 protein appears to bind similar sites, instead of repressing hox expression, it is known as an activator of this operon. It seems important to consider how cyAbrB1 activity might influence the results described here.

      We infer that the minor side of the bimodal SigE peak is the genuine population that contributes to hox transcription, as hox genes are expressed in a SigE-dependent manner (Figure S2). We consider that the strong SigA peak upstream of the hox operon corresponds to binding at the promoter of TU1715, which is transcribed in the opposite direction to the hox operon. We added the following description of the single SigA peak and bimodal SigE peak near the TSS of the hox operon:

      “A bimodal peak of SigE was observed at the TSS of the hox operon in a microoxic-specific manner (Figure 6C bottom panel). The downstream side of the bimodal SigE peak coincides with SigA peak and the TSS of TU1715. Another side of the bimodal peak lacked SigA binding and was located at the TSS of the hox operon (marked with an arrow in Figure 6C), although the peak caller failed to recognize it as a peak.” (ll.206-209)

      The point that cyAbrB1 binds similar sites as cyAbrB2, despite regulating hox expression in the opposite direction, is very interesting. Therefore, we referred to the transcriptome data of the cyAbrB1 knockdown strain and compared the impact of cyAbrB1 knockdown and cyAbrB2 deletion. We described this in the results and discussion as follows:

      “we referred to the recent study performing transcriptome of cyAbrB1 knockdown strain, whose cyAbrB1 protein amount drops by half (Hishida et al. 2024). Among 24 genes induced by cyAbrB1 knockdown, 12 genes are differentially downregulated genes in cyabrb2∆ in our study (Figure S5D).” (ll.162-165)

      “CyAbrB1, the homolog of cyAbrB2, may cooperatively work, as cyAbrB1 directly interacts with cyAbrB2 (Yamauchi et al. 2011), their distribution is similar, and they partially share their target genes for suppression (Figures 3A S5C and S5D). The possibility of cooperation would be examined by the electrophoretic mobility shift assay of cyAbrB1 and cyAbrB2 as a complex. Despite their similar repressive function, cyAbrB1 and cyAbrB2 regulate hox expression in the opposite directions, and their mechanism remains elusive.” (ll.292-296)

      The hox operon differs from this general tendency. To see if cyAbrB1 behaves differently from cyAbrB2 at the hox operon, we did an additional ChIP-qPCR experiment on cyAbrB1 in the aerobic condition and the dark microoxic condition (Figure 5C). However, we could not find a difference.

      Reviewer #1 (Recommendations For The Authors): 

      Figure 1B: I recommend changing the header in the grey bar to terms like "upregulated" and "downregulated", which are also used in the legend description. Upregulation of genes can also be a result of de-repression, which is why the term "activated" is somewhat misleading.

      Corrected.

      Lines 114-116: It is unclear what the authors exactly mean here. Please clarify. 

      We rephrased the sentence as follows: “The enrichment in the butanoate metabolism pathway indicates the upregulation of genes involved in carbohydrate metabolism. We further classified genes according to their expression dynamics.” (ll.110-111)

      Reviewer #3 (Recommendations For The Authors): 

      Major/experimental comments: 

      (1) For the chromosome conformation capture experiments, it is indicated that these were conducted at aerobic (1hr) and microoxic (4 hr) conditions. But the data presented in Figure 1 suggest that 1 hr corresponds to the beginning of microoxic growth, and that time 0 is aerobic. The composite 3C data in Figure 5 show some interesting but specific differences. It is appreciated that the authors presented the profiles for individual samples in Figure S7, and the differences here do not seem to be as compelling. Are the major differences being highlighted significantly (statistically) different (e.g. at the (c) and (j) loci)? Might the differences be starker if an earlier aerobic condition (e.g. time 0) had been used instead of the 1 hr - microoxic - timepoint?

      The previous Figure 5 consisted of three time points (solid line: aerobic condition, dashed line: 1 hr of microoxic condition, and dotted line: 4 hr of microoxic condition). We omitted the 4 hr data from the main figure (Figure 7), as the 4 hr microoxic time point complicates the data. All three time points are shown in the profiles of individual loci (Figure S8).

      No statistically significant difference was found at loci (c) and (j) by t-test. Therefore, we have toned down the interpretation of the 3C data as follows: “Our 3C result demonstrated that cyAbrB2 influences the chromosomal conformation of hox and nifJ region to some extent (Figure 7).” (ll.325-326)
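
      For readers who want to see what such a comparison involves, here is a minimal sketch of a two-sample t statistic in Python. The response does not say which t-test variant was used; Welch's unequal-variance form is assumed here, and the replicate values are hypothetical placeholders rather than data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Requires at least two values per sample and non-zero variance overall.
    """
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical interaction scores for one locus (three replicates each).
wildtype_scores = [1.0, 1.6, 0.7]
mutant_scores = [0.8, 1.1, 0.5]

t_stat, dof = welch_t(wildtype_scores, mutant_scores)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

      With few replicates and high variability between them, the standard error in the denominator is large, so even a visible difference in means can fail to reach significance.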

      (2) This is a complicated system that involves multiple regulatory proteins, each of which is differentially affected by the growth conditions (aerobic/microoxic). It is obviously beyond the scope of this work to probe deeply into all of these proteins. The focus here was on cyAbrB2, and to a slightly lesser extent SigE; however, based on the data presented, it seems that SigA and cyAbrB1 may be equally important contributors to hox control/expression, and in the case of cyAbrB1, possibly also to chromosome conformation. cyAbrB1 appears to have the same binding sites as cyAbrB2, and has been reported to interact with cyAbrB2. Given this association, it is possible that the two proteins may affect the binding of each other, and that loss of one might lead to enhanced binding by the other (or binding may require heterooligomerization?). Probing the regulatory interplay between these two proteins (or at least discussing it) feels important. Conducting e.g. mobility shift assays with each protein, both individually and together, could possibly allow for some understanding of how they function together. 

      We agree that the biochemistry of cyAbrB2 and cyAbrB1 may explain why cyAbrB1 and cyAbrB2 bind long tracts of AT-rich genome regions in vitro. We would like to leave the biochemistry as future work, as we think biochemical data are beyond the scope of the present study.

      The idea that cyAbrB1 and cyAbrB2 cooperate to form heterooligomers and bind broadly to the genome is a very rational and interesting prediction. We added this idea to the discussion: “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290). We also compared our transcriptome of ∆cyabrb2 with the recent study of cyabrb1 knockdown (ll.162-165), and concluded “they partially share their target genes for suppression (Figures 3A S5C and S5D)” (l.293).

      (3) Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means. It appears that when cyAbrB2 binds, any given protected region can be quite extensive, which can be suggestive of polymerization along the chromosome. Are the boundaries for binding sites typically clearly delineated, and this changes when the cultures are growing under microoxic conditions? There is also no mention made anywhere about oligomerization potential for cyAbrB2, which would be important for the polymerization, and bridging suggested for cyAbrB2 in the model presented in Figure 5. Previous publications (Song et al., 2022; Ishi et al., 2008) have suggested that it can exist as a dimer in vivo, but that in vitro it is largely monomeric. The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      Throughout the manuscript, there is reference made to cyAbrB2 binding becoming 'blurry' or non-specific under microoxic conditions. It is not clear what this means.

      In order to clearly describe “cyAbrB2 binding becomes blurry”, we rearranged the figure composition and made an exclusive figure (Figure 5). We also rephrased the description by adopting the reviewer’s phrase “boundaries for binding sites”, as it describes the change well: “When cells entered microoxic conditions, the boundaries of the cyAbrB2 binding region and cyAbrB2-free region became obscure (Figure 5)” (ll.319-320)

      There is also no mention made anywhere about oligomerization potential for cyAbrB2,

      We added the following discussion about oligomerization: “DNA-bound cyAbrB2 is expected to oligomerize, based on the long tract of cyAbrB2 binding region in our ChIP-seq data. However, no biochemical data mentioned the DNA deforming function or oligomerization of cyAbrB2 in the previous studies and preference for AT-rich DNA is not fully demonstrated in vitro (Dutheil et al. 2012; Ishii and Hihara 2008; Song et al. 2022)” (ll.277-280) and “Overall, the biochemistry integrating assay conditions (PTM, buffer condition, and cooperation with cyAbrB1) and output (DNA binding, oligomerization, and DNA structure) will deepen the understanding of cyAbrB2 as cyanobacterial NAPs.” (ll.287-290)

      The manuscript would benefit from some additional biochemical analyses of cyAbrB2 binding activity, with a particular focus on DNA binding and oligomerization/bridging potential, and some additional discussion about these characteristics as well. 

      We added the discussion integrally considering known features of cyAbrB2, novel findings on cyAbrB2, and the comparison with known NAPs (ll.269-290).

      (4) Given that the major take-away for the authors (based on the title) seems to be the nucleoid-associated protein potential for cyAbrB2, the Discussion would benefit from some additional focus in this area. How similar is cyAbrB2 to other nucleoid-associated proteins (e.g. H-NS, Lsr2)? How does counter-silencing work for other nucleoid-associated proteins? Can the authors definitively exclude the possibility of binding site competition/occlusion, given that cyAbrB2 covers the promoter region of hox? What other nucleoid-associated proteins have been characterized in the cyanobacteria? 

      We agree with this point, so we added a discussion comparing cyAbrB2 with H-NS and Lsr2, the canonical NAPs (ll.269-290).

      We did not deny the possibility that cyAbrB2 excludes RNAP, but the previous manuscript did not discuss it sufficiently. To emphasize that cyAbrB2 excludes RNA polymerase, we simplified Figure 6 and employed mosaic plots showing the anti-co-occurrence of cyAbrB2 binding regions and SigE peaks. Furthermore, we added a discussion of SigE exclusion by cyAbrB2 (ll.355-359).

      We mention the possibility of other nucleoid-associated proteins in cyanobacteria in the discussion. “Furthermore, the conformational changes by deletion of cyAbrB2 were limited, suggesting there are potential NAPs in cyanobacteria yet to be characterized.” (ll.336-339)

      (5) Previous work (Song et al., 2022) showed that changing the AT content of cyAbrB2 binding sites did not affect its ability to bind DNA. There are also previous papers suggesting that cyAbrB2 may be subject to diverse post-translational modifications (e.g. phosphorylation - Spat et al., 2023; glutathionylation - Sakr et al., 2013), as well as association with cyAbrB1. These collectively suggest there may be other factors that contribute to cyAbrB2 binding specificity/activity. These seem like relevant points to discuss, particularly given the transient nature of the cyAbrB2 effects on some genes.

      We have included the discussion about AT content, post-translational modifications and transient regulations, and association with cyAbrB1 (ll. 284-295)

      (6) Given the major binding site for SigA upstream of the hox operon, it seems that it likely also contributes to hox cluster expression, together with SigE. Is there a sense for the relative contribution of each sigma factor to hox cluster expression? And whether both are subject to the same inhibitory effect of cyAbrB2? 

      As described above in our response to the public review, the SigA binding site upstream of the hox operon should be assigned to the TSS of TU1715 (Figure 6C). Transcription of the hox operon is highly dependent on SigE, as shown in Figure S2, and residual transcription in the sigE∆ strain is derived from other sigma factors (SigABCD). Estimating the relative contribution of sigma factors other than SigE is difficult at present because SigABCDE can partially compensate for each other.

      As the different impact of NAPs on the primary and alternative sigma factor is observed in H-NS (Shin et al. 2005), whether both the primary sigma factor (SigA) and the alternative sigma factor (SigE) are inhibited by cyAbrB2 to the same extent is a very interesting question.

      We calculated the odds ratios of SigE and SigA being in the cyAbrB2-free region and wrote in the results: “SigE preferred the cyAbrB2-free region in the aerobic condition more than SigA did (Odds ratios of SigE and SigA being in the cyAbrB2-free region were 4.88 and 2.74, respectively).” (ll.193-195). We also discussed: “The higher exclusion pressure of cyAbrB2 on SigE may contribute to sharpening the transcriptional response of hox and nifJ on entry to microoxic conditions.” (ll.357-359)
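
      As an illustration of the statistic being reported, the sketch below computes an odds ratio from a 2x2 contingency table in Python. The layout of the table (genomic bins with or without a sigma-factor peak, split by cyAbrB2-free versus cyAbrB2-bound) is an assumption about how such a comparison could be set up, not a description of the authors' pipeline, and the counts are hypothetical.

```python
def odds_ratio(peak_free, peak_bound, nopeak_free, nopeak_bound):
    """Odds ratio for a 2x2 table.

    Rows: bins with / without a sigma-factor peak.
    Columns: bins in the cyAbrB2-free / cyAbrB2-bound region.
    A value above 1 means peaks favour the cyAbrB2-free region.
    """
    counts = (peak_free, peak_bound, nopeak_free, nopeak_bound)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    odds_with_peak = peak_free / peak_bound
    odds_without_peak = nopeak_free / nopeak_bound
    return odds_with_peak / odds_without_peak


# Hypothetical counts of genomic bins, for illustration only.
example = odds_ratio(peak_free=90, peak_bound=30, nopeak_free=400, nopeak_bound=600)
print(f"odds ratio = {example:.2f}")
```

      Comparing the value obtained for one sigma factor with that for another, as the response does for SigE and SigA, indicates which factor is more strongly depleted from cyAbrB2-bound regions.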

      (7) The 3C experiments suggest there are indeed changes in chromosome architecture in the hox region as growth conditions change and when different regulators are present. Across the chromosome, analogous changes are expected; however, it may be premature to draw this conclusion based on changes at one locus. Is there a reason that the authors did not take full advantage of their 3C samples and sequence them, to capture the full chromosome interactome at the two time-points? This would allow broader conclusions to be drawn regarding changes in chromosome structure and the impact of cyAbrB2.

      In response to the suggestion, we performed an additional 3C assay on the nifJ region using the residual 3C samples. Expanding to genome-wide sequencing (Hi-C) requires enrichment of ligated fragments by biotinylation, a step that was omitted in our 3C samples.

      We rewrote the result as obtained from the 3C data of hox and nifJ (ll.220-245) and omitted the schematic image of an entire chromosome of cyanobacteria (previous Figure 5E).

      Editorial comments: 

      (1) The data presentation in Figure 1 is very effective. 

      (2) Line 87: please rephrase - you can have 'high similarity' or 'high levels of identity', but not high levels of homology - genes/proteins are either homologous or not.

      (3) Line 118: classified into four 'groups'? 

      (4) Line 590: remove 'the'. 

      (5) Figure 2S, panel B: please define acronyms in the legend (GT, IP) and write out 'FLAG' in full for AbrB1.

      (2) to (5) have been corrected.

      (6) Please provide information on or a reference for the tagging of SigA for use in the ChIP-seq experiments within the Materials and Methods.

      Added (l.365)

      (7) Line 648: space between 'binding' and 'regions'. 

      Corrected.

      (8) Fig 4E: please make the solid lines thicker - they are currently difficult to see.

      We have made Figure 6C (former 4E) larger and the line thicker.

      (9) Line 666: location. 

      (10) Line 673: Individual. 

      (11) Figure S5, panel C graph title: should this be 'Relative'? 

      (12) Figure S7: What is 'GT'? Should this be 'WT'? 

      (9) to (12) have been corrected.

      (13) In addition to the data presented in Figure 3G, it would be nice to have a small table or Venn diagram summarizing the number of cyAbrB2 binding sites that fall into the different categories (full gene/operon; downstream of a gene; within a gene; promoter region). 

      In response to the comment, we noticed the categories we had applied (full gene/operon; downstream of a gene; within a gene; promoter region) were arbitrary. Therefore, we categorized transcriptional units (TUs) according to the extent of occupancy by cyAbrB2. (Figures 4B and 4C)

      (14) Line 280-281: suggest replacing 'mediates' with 'influences'. 'Mediates' sounds like a direct interaction (for which the evidence is not currently strong without some additional biochemical data), but 'influences' could better accommodate both direct and indirect possibilities. 

      (15) Line 410: it is not clear what this means. 

      We have omitted “As a result, DNA ~600-fold condensed DNA than 3C samples were ligated.”, as it does not give any information about the experimental procedure.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript builds upon the authors' previous work on the cross-talk between transcription initiation and post-transcriptional events in yeast gene expression. These prior studies identified an mRNA 'imprinting' phenomenon linked to genes activated by the Rap1 transcription factor (TF), a surprising role for the Sfp1 TF in promoting RNA polymerase II (RNAPII) backtracking, and a role for the non-essential RNAPII subunits Rpb4/7 in the regulation of mRNA decay and translation. Here the authors aimed to extend these observations to provide a more coherent picture of the role of Sfp1 in transcription initiation and subsequent steps in gene expression. They provide evidence for (1) a physical interaction between Sfp1 and Rpb4, (2) Sfp1 binding and stabilization of mRNAs derived from genes whose promoters are bound by both Rap1 and Sfp1 and (3) an effect of Sfp1 on Rpb4 binding or conformation during transcription elongation. 

      Strengths: 

      This study provides evidence that a TF (yeast Sfp1), in addition to stimulating transcription initiation, can at some target genes interact with their mRNA transcripts and promote their stability. Sfp1 thus has a positive effect on two distinct regulatory steps. Furthermore, evidence is presented indicating that strong Sfp1 mRNA association requires both Rap1 and Sfp1 promoter binding and is increased at a sequence motif near the polyA track of many target mRNAs. Finally, they provide compelling evidence that Sfp1-bound mRNAs have higher levels of RNAPII backtracking and altered Rpb4 association or conformation compared to those not bound by Sfp1. 

      Weaknesses: 

      The Sfp1-Rpb4 association is supported only by a two-hybrid assay that is poorly described and lacks an important control. Furthermore, there is no evidence that this interaction is direct, nor are the interaction domains on either protein identified (or mutated to address function). 

      Indeed, our two hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicated that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6, sentences highlighted in blue)

      The contention that Sfp1 nuclear export to the cytoplasm is transcription-dependent is not well supported by the experiments shown, which are not properly described in the text and are not accompanied by any primary data. 

      This section has been re-written for better clarity (see page 7). We note that this assay was originally developed and published by Lee, M. S., M. Henry, and P. A. Silver in their 1996 paper in G&D and has since been reported in numerous subsequent studies. Reassuringly, our conclusion is bolstered by the observation that Sfp1 binds to Pol II transcripts co-transcriptionally, suggesting that Sfp1 is exported in the context of the mRNA.

      The presence of Sfp1 in P-bodies is of unclear relevance and the authors do not ask whether Sfp1-bound mRNAs are also present in these condensates. 

      P-bodies consist of both RNA and proteins (reviewed in doi: 10.1021/acs.biochem.7b01162). The significance of this experiment lies in its contribution to further confirming the co-localization of Sfp1 with mRNAs and Rpb4. This observation could also yield valuable insights for future investigations into the role of Sfp1.

      Further analysis of Sfp1-bound mRNAs would be of interest, particularly to address the question of whether those from ribosomal protein genes and other growth-related genes that are known to display Sfp1 binding in their promoters are regulated (either stabilized or destabilized) by Sfp1. 

      Fig. 4A, C and D show that RP mRNAs become destabilized in sfp1Δ cells.

      The authors need to discuss, and ideally address, the apparent paradox that their previous findings showed that Rap1 acts to destabilize its downstream transcripts, i.e. that it has the opposite effect of Sfp1 shown here. 

      We would like to thank Reviewer 1 for this valuable comment. In the revised paper, we discuss our hypothesis that Rap1 is likely responsible for regulating the imprinting of other proteins, such as Rpb4, which in turn lead to the destabilization of mRNAs. See the blue paragraph on page 20.

      Finally, recent studies indicate that the drugs used here to measure mRNA stability induce a strong stress response accompanied by rapid and complex effects on transcription. Their relevance to mRNA stability in unstressed cells is questionable. 

      Half-lives were determined mainly by GRO analysis of optimally proliferating cells. This method does not require any drug or stressful treatment. The results obtained by this method were consistent with those obtained after thiolutin addition. Using both methods, we discovered that disruption of Sfp1 results in substantial mRNA destabilization. Nevertheless, in our revised manuscript, we show results obtained by subjecting cells to a temperature shift to 42°C, a natural method to inhibit transcription. This approach to determining half-lives has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). This may rule out effects of the drug on half-lives. Indeed, this assay determines half-lives under heat stress. Thus, it demonstrates that, at least during heat shock, Sfp1 stabilizes mRNAs. Since the results are similar to those obtained by the GRO method at 30°C, we concluded that Sfp1 stabilizes mRNAs under both optimal and heat-stress conditions.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Kelbert et al. presents results on the involvement of the yeast transcription factor Sfp1 in the stabilisation of transcripts whose synthesis it stimulates. Sfp1 is known to affect the synthesis of a number of important cellular transcripts, such as many of those that code for ribosomal proteins. The hypothesis that a transcription factor can remain bound to the nascent transcript and affect its cytoplasmic half-life is attractive, but the methods used to demonstrate the half-life effects and the association of Sfp1 with cytoplasmic transcripts remain to be fully validated, as explained in my comments on the results below: 

      Comments on methodology and results: 

      (1) A two-hybrid-based assay for protein-protein interactions identified Sfp1, a transcription factor known for its effects on ribosomal protein gene expression, as interacting with Rpb4, a subunit of RNA polymerase II. Classical two-hybrid experiments depend on the presence of the tested proteins in the nucleus of yeast cells, suggesting that the observed interaction occurs in the nucleus. Unfortunately, the two-hybrid method cannot determine whether the interaction is direct or mediated by nucleic acids. 

      Indeed, our two-hybrid, immunoprecipitation and imaging results do not allow us to conclusively discern whether the interaction between Rpb4 and Sfp1 is direct or indirect. While the interaction holds significance, we consider the direct versus indirect distinction to be of secondary importance in the context of this paper. In the current text we indicate that 'our two hybrid, immunoprecipitation and imaging results do not differentiate between a direct or indirect interactions' (see page 6).

      (2) Inactivation of nup49, a component of the nuclear pore complex, resulted in the redistribution of GFP-Sfp1 into the cytoplasm at the temperature non-permissive for the nup49-313 strain, suggesting that GFP-Sfp1 is a nucleo-cytoplasmic shuttling protein. This observation confirmed the dynamic nature of the nucleo-cytoplasmic distribution of Sfp1. For example, a similar redistribution to the cytoplasm was previously reported following rapamycin treatment and under starvation (Marion et al., PNAS 2004). In conjunction with the observation of an interaction with Rpb4, the authors observed slower nuclear import kinetics for GFP-Sfp1 in the absence of Rpb4 when cells were transferred to a glucose-containing medium after a period of starvation. Since the redistribution of GFP-Sfp1 was abolished in an rpb1-1/nup49-313 double mutant, the authors concluded that Sfp1 localisation to the cytoplasm depends on transcription. The double mutant yeast cells may show a variety of non-specific effects at the restrictive temperature, and whether transcription is required for Sfp1 cytoplasmic localisation remains incompletely demonstrated. 

      We agree with Reviewer 2 that any heat inactivation of a temperature-sensitive (ts) protein can lead to non-specific effects. It is evident that nup49-313 does not prevent Sfp1 export to the cytoplasm. In the case of rpb1-1, these non-specific effects are expected due to transcriptional arrest, which can eventually result in a reduction in protein content. However, this process takes some time, while the impact on export is more rapid. It is worth noting that this assay was developed and previously published by Pam Silver (Henry and Silver G&D 1996) and has been reported in many subsequent papers. Importantly, our conclusion is supported by the observation that Sfp1 binds both nascent RNA (co-transcriptionally) and mature mRNA (cytoplasmic). These observations, along with the reduced mRNA export upon transcription blocking, are consistent with our proposal that Sfp1 is exported in association with mRNA.

      (3) Under starvation conditions, which led to the presence of Sfp1 in the cytoplasm and have previously been correlated with a decrease in the transcription of Sfp1 target genes, the authors observed that a plasmid-based expressed GFP-Sfp1 accumulated in cytoplasmic foci. These foci were also labelled by P-body markers such as Dcp2 and Lsm1. The quality of the microscopic images provided does not allow to determine whether Rpb4-RFP colocalises with GFP-Sfp1. 

      The submitted PDF figure is of low quality. We believe that the high-quality figure in the final submission is convincing.

      (4) To understand to which RNA Sfp1 might bind, the authors used an N-terminally tagged fusion protein in a cross-linking and purification experiment. This method identified 264 transcripts for which the CRAC signal was considered positive and which mostly correspond to abundant mRNAs, including 74 ribosomal protein mRNAs or metabolic enzyme-abundant mRNAs such as PGK1. The authors did not provide evidence for the specificity of the observed CRAC signal, in particular, what would be the background of a similar experiment performed without UV cross-linking. In a validation experiment, the presence of several mRNAs in a purified SFP1 fraction was measured at levels that reflect the relative levels of RNA in a total RNA extract. Negative controls showing that abundant mRNAs not found in the CRAC experiment were clearly depleted from the purified fraction with Sfp1 would be crucial to assessing the specificity of the observed protein-RNA interactions. The NON-CRAC+ selected mRNAs were enriched for genes whose expression was previously shown to be upregulated upon Sfp1 overexpression (Albert et al., 2019). The presence of unspliced RPL30 pre-mRNA in the Sfp1 purification was interpreted as a sign of co-transcriptional assembly of Sfp1 into mRNA, but in the absence of valid negative controls, this hypothesis would require further experimental validation.

      We would like to thank Reviewer 2 for bringing this issue up, as it helped us to clarify it in the revised paper.

      First, we emphasized in the Discussion that many CRAC+ genes do not fall into the category of highly transcribed genes. Please see more detailed discussion below.

      Secondly, we examined various features of the 264 genes - classified as CRAC+ - to estimate their specificity and biological significance. Our various experiments revealed that the CRAC+ genes represent a distinct group with many unique features.

      The biological significance of the 264 CRAC+ mRNAs was demonstrated by various experiments, none of whose results can be attributed to technical flaws. In fact, all the experiments and analyses that we have pursued indicate the unique nature of the CRAC+ genes. Some examples are:

      (1) Fig. 2A and B show that most reads of CRAC+ mRNAs were mapped to a specific location, close to the pA sites.

      (2) Fig. 2C shows that most reads of CRAC+ mRNAs were mapped to a specific RNA motif located near the 3’ ends of the mRNAs.

      (3) Most RiBi CRAC+ promoters contain Rap1 binding sites (p = 1.9 × 10⁻²²), whereas the vast majority of RiBi non-CRAC+ promoters do not (Fig. 3C).

      (4) Fig. 4A shows that RiBi CRAC+ mRNAs become destabilized due to Sfp1 deletion, whereas RiBi non-CRAC+ mRNAs do not. Fig. 4B shows similar results due to Sfp1 depletion.

      (5) Fig. 6B shows that the impact of Sfp1 on backtracking is substantially higher for CRAC+ than for non-CRAC+ genes. This is most clearly visible in RiBi genes.

      (6) Fig. 7A shows that the Sfp1-dependent changes along the transcription units are substantially more pronounced for CRAC+ than for non-CRAC+ genes.

      (7) In Fig. S4B, the chromatin binding profile of Sfp1 is shown to be different for CRAC+ and non-CRAC+ genes.

      Taken together, these many unique features (in fact, every feature that we examined) indicate the specificity and significance of this group, demonstrating that our CRAC results are biologically significant.

      Most importantly, these genes do not all fall into the category of highly transcribed genes.  On the contrary, as depicted in Figure 6A (green dots), it is evident that CRAC+ genes exhibit a diverse range of Rpb3 ChIP and GRO signals. Furthermore, as illustrated in Figure 7A, when comparing CRAC+ to Q1 (the most highly transcribed genes), it becomes evident that the Rpb4/Rpb3 profile of CRAC+ genes behaves differently from the Q1 group. Evidently, despite the heterogeneous transcription of CRAC+ genes (as mentioned above), the Rpb4/Rpb3 profile decreases more substantially than that of the highly transcribed genes (Q1).  Moreover, despite similar expression levels among all RiBi mRNAs, only a portion of them binds Sfp1.

      Thus, all our results indicate that CRAC+ genes represent a biologically significant group, irrespective of the expression of its members. In response to this comment, we included a new paragraph discussing the validity of our conclusions. See page 18, blue paragraph.

      (5) To address the important question of whether co-transcriptional assembly of Sfp1 with transcripts could alter their stability, the authors first used a reporter system in which the RPL30 transcription unit is transferred to vectors under different transcriptional contexts, as previously described by the Choder laboratory (Bregman et al. 2011). While RPL30 expressed under an ACT1 promoter was barely detectable, the highest levels of RNA were observed in the context of the native upstream RPL30 sequence when Rap1 binding sites were also present. Sfp1 showed better association with reporter mRNAs containing Rap1 binding sites in the promoter region. However, removal of the Rap1 binding sites from the reporter vector also led to a drastic decrease in reporter mRNA levels. Whether the fraction of co-purified RNA is nuclear and co-transcriptional or not cannot be inferred from these results. 

      The proposed co-transcriptional binding of Sfp1 is based on the findings presented in Figure 5C and Figure S2D, as well as the observed binding of Sfp1 to transcripts containing introns, as shown in Figures 2D and 3B.  The results of Fig. 3 led us to the assertion that the "RNA-binding capacity of Sfp1 is regulated by Rap1-binding sites located at the promoter." We maintain our stance on this conclusion. Indeed, the Rap1 binding site does impact mRNA levels, as highlighted by Reviewer 2. However, "construct E," which possesses a promoter with a Rap1 binding site, exhibits lower transcript levels compared to "construct F," which lacks such a binding site in its promoter. Despite this difference in transcript levels, Sfp1 was able to pull down the former transcript but not the latter, even though expression of the former gene is relatively low. Thus, the results appear to be more reliant on the specific capacity of Sfp1 to interact with the transcript rather than on the transcript's expression level.

      (6) To complement the biochemical data presented in the first part of the manuscript, the authors turned to the deletion or rapid depletion of SFP1 and used labelling experiments to assess changes in the rate of synthesis, abundance, and decay of mRNAs under these conditions. An important observation was that in the absence of Sfp1, mRNAs encoding ribosomal protein genes not only had a reduced synthesis rate but also an increased degradation rate. This important observation needs careful validation, as genomic run-on experiments were used to measure half-lives, and this particular method was found to give results that correlated poorly with other measures of half-life in yeast (e.g. Chappelboim et al., 2022 for a comparison). Similarly, the use of thiolutin to block transcription as a method of assessing mRNA half-life has been reported to be problematic, as thiolutin can specifically inhibit the degradation of ribosomal protein mRNA (Pelechano & Perez-Ortin, 2008). Specific repressible reporters, such as those used by Baudrimont et al. (2017), would need to be tested to validate the effect of Sfp1 on the half-life of specific mRNAs. Also, it would be very difficult to infer from the images presented whether the rate of deadenylation is altered by Sfp1.

      Various methods exist for assessing mRNA half-lives (HLs), and each of them carries its own set of challenges and biases. Consequently, it becomes problematic to directly compare HL values of a specific mRNA when different methods are employed. The superiority of one particular method over others remains unclear (in my opinion). However, these methods are highly reliable when it comes to comparing different strains under identical conditions using a single method.

      Estimating HLs through the GRO approach is a non-invasive method, applied to optimally proliferating cells, which has been employed in numerous publications. While no method is without its limitations, our experience over the years has reassured us that this approach is among the most dependable. Our HL determination using thiolutin to block transcription provided results that were consistent with the values obtained by the GRO approach.

      Nevertheless, in our revised manuscript, we supplemented the HL data obtained with thiolutin with results obtained by subjecting cells to a temperature shift to 42°C, a natural method to block transcription in wild-type (WT) cells. This approach to determining HLs has been previously reported in our publications, such as Lotan et al. (2005, 2007) and Goler Baron et al. (2008). The new results are shown in Fig. S3B. They are consistent with our conclusion that Sfp1 stabilizes mRNAs.
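
      For readers unfamiliar with how a half-life is derived from a transcription shut-off time course, the following is a minimal sketch of the standard calculation. It assumes simple first-order decay, so that ln(abundance) falls linearly with time and HL = ln(2)/k. The function name, time points and abundance values are hypothetical and are not taken from the manuscript, and this is not the GRO-based calculation.

```python
import math


def estimate_half_life(times, abundances):
    """Estimate an mRNA half-life from a decay time course.

    Assumes first-order decay: ln(A(t)) = ln(A0) - k * t.
    The decay constant k is the negative least-squares slope of
    ln(abundance) against time, and the half-life is ln(2) / k.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    k = -(sxy / sxx)
    if k <= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / k


# Hypothetical time course (minutes after transcription shut-off).
times = [0, 5, 10, 20, 40]
# Synthetic abundances built with known half-lives of 10 and 5 minutes.
wild_type = [100 * 0.5 ** (t / 10) for t in times]
mutant = [100 * 0.5 ** (t / 5) for t in times]

hl_wt = estimate_half_life(times, wild_type)
hl_mut = estimate_half_life(times, mutant)
print(f"WT half-life: {hl_wt:.2f} min; mutant half-life: {hl_mut:.2f} min")
```

      Because both strains are fitted with the same procedure, any method-specific bias affects them equally, which is why comparisons between strains are more robust than absolute HL values.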

      Using a repressible promoter to determine mRNA HL is, unfortunately, not suitable in this paper because the promoter itself is involved in HL regulation. This observation is supported by Bregman et al. (2011) and depicted in Fig. 3, which illustrates that the promoter is critical for mRNA imprinting, consequently regulating HL.

      (7) The effects of SFP1 on transcription were investigated by chromatin purification with Rpb3, a subunit of RNA polymerase, and the results were compared with synthesis rates determined by genomic run-on experiments. The decrease in Pol II presence on transcripts in the absence of SFP1 was not accompanied by a marked decrease in transcript output, suggesting an effect of Sfp1 in ensuring robust transcription and avoiding RNA polymerase backtracking. To further investigate the phenotypes associated with the depletion or absence of Sfp1, the authors examined the presence of Rpb4 along transcription units compared to Rpb3. One effect of Sfp1 deficiency was that this ratio, which decreased from the start of transcription towards the end of transcripts, increased slightly. The results presented are largely correlative and could arise from the focus on very specific types of mRNAs, such as those of ribosomal protein genes, which are sensitive to stress and are targeted by very active RNA degradation mechanisms activated, for example, under heat stress (Bresson et al., 2020). 

      Figure 7A illustrates a significant reduction in Rpb4/Rpb3 ratios along the transcription unit in WT cells. This reduction is notably more pronounced in CRAC+ genes compared to the highly transcribed quartile (Q1), which includes all ribosomal protein (RP) genes, and it is completely absent in sfp1∆ cells. Furthermore, it's important to highlight that the CRAC+ gene group displays a wide range of transcription rates, as measured by either Rpb3 ChIP or GRO (Figure 6A). Given these observations, we do not think that heightened sensitivity of RP mRNA degradation in response to stress is responsible for the pronounced difference in the configuration of the Pol II elongation complex that is detected in CRAC+ genes, mainly because this experiment was performed under standard (non-stress) culture conditions.

      Correlative studies are particularly informative when a gene mutation eliminates a correlation, and this is precisely the type of study depicted in Figure 7B-C. The correlations shown in these panels are dependent on Sfp1. Indeed, RP genes are sensitive to stress. However, we used non-stressed conditions. Furthermore, CRAC+ genes did not display any apparent unusual destabilization but rather exhibited higher (not lower) mRNA stability compared to non-CRAC+ genes (Figure 7C).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The paper combines phenotypic and genomic analyses of the "sheltered load" (i.e. the accumulation of deleterious mutations linked to S-loci that are hidden from selection in the homozygous state) in Arabidopsis. The authors compare results to previous theoretical predictions concerning the extent of the load in dominant vs recessive S-alleles, and further develop exciting theory to reconcile differences between previous theory and observed results.

      Strengths:

      This is a very nice combination of theory and data to address a classical question in the field.

      We thank the reviewer for this positive feedback.

      Weaknesses:

      The "genetic load" is a poorly defined concept in general, and its quantification via the number of putatively deleterious mutations is quite difficult. Furthermore counting up the number of derived mutations at fully constrained nucleotides may not be a great estimate of the load, and certainly does not allow for evaluation of recessivity -- a concept critical to ideas concerning the sheltered load. Alternative approaches - including estimating the severity of mutations - could be helpful as well. This imperfection in available approaches to test theory must be acknowledged more strongly by the authors.

      As suggested by the reviewer, we implemented alternative approaches to estimate the severity of deleterious mutations and now report the results of SNPeff and SIFT4G analyses in Table S6. The results we obtained with these other metrics were overall very similar to those based on our previous counting of mutations at 0-fold and 4-fold degenerate sites. More generally, we tried to improve the presentation of our strategy to estimate the genetic load (clarified in lines 262-268, 271, 292-295, 297). In particular, we made it clear that our population genetic analysis cannot assess the recessivity of the observed mutations (lines 428-434).
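
      As background on the 0-fold and 4-fold degenerate sites mentioned above, the following is a minimal sketch of how such sites can be identified in a coding sequence using the standard genetic code. A site is 0-fold degenerate when every nucleotide change alters the amino acid, and 4-fold degenerate when none does. The function names and the example sequence are hypothetical; this is an illustration of the definition, not the pipeline used in the manuscript.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(codon, pos):
    """Return the degeneracy (0, 2, 3 or 4) of position `pos` in `codon`.

    Counts how many of the four nucleotides at this position encode the
    same amino acid; if only the original nucleotide does, the site is
    0-fold degenerate.
    """
    aa = CODON_TABLE[codon]
    same = sum(
        1
        for base in BASES
        if CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == aa
    )
    return 0 if same == 1 else same


def classify_sites(cds):
    """Return (zero_fold, four_fold) lists of 0-based positions in `cds`.

    Codons containing ambiguous bases and stop codons are skipped, as is
    any incomplete trailing codon.
    """
    zero_fold, four_fold = [], []
    cds = cds.upper()
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        if CODON_TABLE.get(codon, "*") == "*":
            continue
        for pos in range(3):
            fold = degeneracy(codon, pos)
            if fold == 0:
                zero_fold.append(start + pos)
            elif fold == 4:
                four_fold.append(start + pos)
    return zero_fold, four_fold


# Hypothetical two-codon sequence: ATG (Met) followed by GGT (Gly).
zero_fold, four_fold = classify_sites("ATGGGT")
print("0-fold sites:", zero_fold)
print("4-fold sites:", four_fold)
```

      Counting variants at the 0-fold positions relative to the 4-fold positions gives a simple proxy for the load, but it carries no information on the severity or recessivity of each mutation.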

      Reviewer #2 (Public Review):

      Summary:

      This study looks into the complex dominance patterns of S-allele incompatibilities in Brassicaceae, through which it attempts to learn more about the sheltering of deleterious load. I found several weak points in the analyses that diminished my excitement about the results. In particular, the way in which deleterious mutations were classified lacked the ability to distinguish the severity of the mutations and thus their expected associated dominance.

      First, we would like to clarify that our goal with this study is NOT to learn something about dominance of the linked deleterious mutations (we cannot). Instead, we compare the accumulation of deleterious mutations linked to dominant vs recessive S-ALLELES, but are agnostic regarding the dominance level of the LINKED mutations themselves. The rationale is that the different intensities of natural selection between dominant vs recessive S-alleles provide a powerful way to examine the process by which deleterious mutations are sheltered in general. We further clarified this aspect on lines 70-73 and 399-401.

      Second, as mentioned above in response to Reviewer 1, we complemented the analysis by predicting the severity of the deleterious mutations by SIFT4G and SNPeff. The results were largely consistent, with the exception that the number of sites included in SIFT4G was low, such that the statistical power was reduced (lines 296-300).

      Furthermore, the simulation approach could have provided this exact sort of insight but was not designed to do so, making this comparison to the empirical data also less than exciting for me.

      As explained above, studying dominance of the linked mutations we observed is an interesting research question (albeit a difficult one), but it was not our goal here. Instead, our study was designed as an empirical test of the predictions presented in Llaurens et al (2009), and we re-analysed some aspects of the model outcome to illustrate our points.

      We now better explain that we based our choice of parameters on the fact that in the theoretical study by Llaurens et al (2009), recessive deleterious mutations are predicted to accumulate in a much more straightforward manner (lines 316-318).

      We now dedicate a paragraph of the discussion to explain how our stochastic simulations could be improved, and acknowledge that a full exploration of the interaction between dominance of the S-alleles and dominance of the linked deleterious mutations would be an interesting follow-up - albeit beyond the scope of our study (lines 437-441).

      Major and minor comments:

      I think the introduction (or somewhere before we dive into it in the results) of the dominance hierarchy for the S-alleles needs a more in-depth explanation. Not being familiar with this beforehand really made this paper inaccessible to me until I then went to find out more before continuing. I would expect this paper to be broad enough that self-contained information makes it accessible to all readers. For example, lines 110-115 could be in the Introduction.

      We thank the reviewer for this useful remark. We now give a more comprehensive description of the dominance hierarchy and introduce the classes of dominance in A. lyrata already in the introduction, on lines 64-70.

      Along with my above comment, perhaps it is not my place to comment, but I find the paper not of a broad enough scope to be of interest to a broad readership. This S-allele dominance system is more than simple balancing selection, it is a very complex and specific form of dominance between several haplotypes, and the mechanism of dominance does not seem to be genetic. I am not sure that it thus extrapolates to broad comments on general dominance and balancing selection, e.g. it would not be the same as considering inversions and this form of balancing selection where we also expect recessive deleterious mutations to accumulate.

      We disagree with these interpretations by the reviewer, for two reasons:

      First, the mechanism of dominance is actually entirely genetic. In fact, we uncovered some years ago that it is based on the molecular interaction between small non-coding RNAs from dominant alleles and their target sites on recessive alleles (Durand et al. Science 2014, see lines 68-70). If there is something specific with this system, it is that the dominance phenomenon is better understood at the mechanistic level than in most other cases, but the resulting phenomenon in itself (a dominance hierarchy) is rather common.

      Second, the kind of variation in the intensity of linked selection created by this mechanism is actually a general phenomenon, so our results have broad relevance beyond our particular study system. We modified the introduction to explain this point more clearly, highlighting in particular the fact that the situation we study closely resembles the case of sex chromosomes, where X (or Z) chromosomes are genetically recessive and Y (or W) chromosomes are genetically dominant. We cite this example in lines 83-87 of the introduction and also several well-studied other examples on lines 480-489 of the discussion.

      It would have been particularly interesting, or a nice addition, to see deleterious mutations classed by something like SNPeff or GERP where you can have different classes of moderate to severe deleterious variants, which we would expect also to be more recessive the more deleterious they are. In line with my next comment on the simulations, I think relative differences between mutations expected to be more or less dominant may be even more insightful into the process of sheltering which may or may not be going on here.

      We agree with the reviewer, and as detailed above we have now integrated such analyses with SNPeff and SIFT4G (Table S6). These new results reinforce our conclusion that while S-allele dominance influences the fixation of deleterious mutations, it has no effect on their total number. See lines 270-272 and 296-300.

      In the simulations, h=0 and s=0.01 (as in Figure 5) for all deleterious mutations seems overly simplistic, and at the convenient end for realistic dominance. I think besides recessive lethals which we expect to be close to h=0 would have a much larger selection coefficient, and other deleterious mutations would only be partially recessive at such an s value. I expect this would change some of the simulation results seen, though to what degree I am not certain. It would be nice to at least check the same exact results for h=0.3 or 0.2 (or additionally also for recessive lethals, e.g. h=0 and s=-0.9). I would also disagree with the statement in line 677, many studies have shown, particularly those on balancing selection, that partially recessive deleterious mutations are not eliminated by natural selection and do play a role in population genetic dynamics. I am also not surprised that extinction was found for higher s values when the mutation rate for such mutations was very high and the distribution of s values was constant. An influx of such highly deleterious mutations is unlikely to ever let a population survive, yet that does NOT mean that in nature, the rare influx of such mutations does lead to them being sheltered. I find overall that the simulation results contribute very little, to none, to this paper, as without something more realistic, like a simultaneous distribution of s and h values, you cannot say which, if any class of these mutations are the ones expected to accumulate because of S-allele dominance.

      We understand that the previous version of our manuscript did not clearly distinguish between dominance of the S-alleles and dominance of the linked deleterious mutations. We clarified that our study focuses on the effect of the former only (lines 99, 263-264 and 581-583).

      We agree that a complete exploration of the interaction between dominance of the S-alleles and dominance of the linked mutations being sheltered would have been an asset, but as explained above this is not the focus of our study. The previous work by Llaurens et al (2009) has already established that deleterious mutations can fix within S-allele lineages, especially when linked to dominant S-alleles, and when the number of S-alleles is large. Under the conditions they examined, deleterious mutations were much more strongly eliminated if not fully recessive (h=0 vs h=0.2), so for the present study we decided to simulate fully recessive mutations only. We now formally acknowledge the possibility that some complex interaction may take place between dominance of the S-alleles and dominance of the linked deleterious mutations (lines 440-442). However, as explained above we feel that fully exploring this complex interaction would require a detailed investigation, which is clearly beyond the scope of the present study.

      Rather they only show the disappointing or less exciting result that fully recessive, weakly deleterious mutations (which I again think do not even exist in nature as I said above) have minor, to no effect across the classes of S-allele dominance. They provide no insight into whether any type of recessive deleterious mutation can accumulate under the S-allele dominance hierarchy, and that is the interesting question at hand. I would either remove these simulations or redo them in another approach. The authors never mention what simulation approach was used, so I can only assume this is custom, in-house code. Yet I do not find that code provided on the github page. I do not know if the lack of a distribution for h and s values is then a choice or a programming limitation, but I see it as one that should be overcome if these simulations are meant to be meaningful to the results of the study.

      The code we used (in C) was adapted from the previous study by Llaurens et al. (2009), which at the time was unfortunately not deposited in a data repository. With the agreement of the authors of that study, this code is now available on GitHub (https://github.com/leveveaudrey/model_ssi_Llaurens; line 723).

      It is correct that our simulations were not aimed at determining whether “any type of recessive deleterious mutation can accumulate”, but we strongly believe that they help interpret the observations made in the genomic data.

      Recommendations for the authors:

      Notes from the editor:

      I found Table 1 confusing, with column headings of observed proportion but perhaps numbers reflecting counts.

      Thank you for pointing out this confusion. There was indeed an error in the last column, which we have now corrected.

      I found Figure 2 a bit hard to parse, with the vertical lines being unclear and the x-axis ticks of insufficient resolution to evaluate the physical extent of the signals.

      We increased the size of the x-axis labels and made them more detailed in Figure 2, which is now hopefully clearer. We also increased the size of the vertical lines.

      Finally, I wonder, given the rapid decay of signal in lyrata, whether 25kb is the right choice for evaluating load and whether the pattern may look different on a smaller scale.

      It is true that the signal decays rapidly in A. lyrata, as can be seen in the haplotype structure analysis and in line with our previous analysis of the same populations (Le Veve et al., MBE 2023), in which we explored the effect of the size of the chromosomal region analyzed (lines 266-269). However, for the sake of comparison, we prefer to stick to the same window size. The fact that we still see an effect of dominance in spite of the lower statistical power associated with the more rapid decay (because a smaller number of genes is expected to be impacted) actually reinforces our conclusions.

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional suggestions to improve the manuscript.

      (1) How does the load linked to the S-locus compare to that observed in other genomic regions? It would be useful to provide a comparison of the results quantified in Figures three and four to comparable genomic regions unlinked to the S-locus. How severe is the linked load?

      This comparison to the genomic background was actually the core of our previous study (Le Veve et al MBE 2023), which was based on the same populations. This analysis revealed that polymorphism of the 0-fold degenerate sites was more than twice as high in the 25kb immediately flanking the S-locus as in a series of 100 unlinked control regions. The main focus of the present study is on the effect of linkage to particular S-alleles (which was not possible previously because haplotypes had to be phased).

      (2) Details of the GLM for data underlying Figures 3 and 4 are somewhat unclear. Is the key explanatory variable (Dominance) treated as continuous? Categorical? Ordinal etc…

      Dominance is considered as a continuous variable. We specify this in line 162 of the results, in the legends of Figures 3 and 4, in the Material and Method (lines 627 and 660) and in the legend of Table S4.

      (3) I had some trouble understanding the two different p-values in columns five and six of table one. Please provide more detail.

      We understand that the two p-values in Table 1 were confusing. The first was related to the binomial test and the second to the permutation test. To be consistent with the rest of the manuscript, we retained only the p-value of the permutation test.

      (4) As mentioned in the "weaknesses" above, the authors should be more clear about what they are quantifying. They are explicitly counting the number of variants at 0-fold degenerate sites as a proxy for the genetic load. How good this proxy is is unclear. The most egregious misstatement here was on line 314 in which they make reference to the "total load." However, this limitation should be acknowledged throughout the manuscript and deserves more attention in the methods and discussion.

      As mentioned above, we now integrate additional methods to define and quantify the load (SIFT4G and SNPeff), which reinforced our previous conclusions (lines 271-272, 297-302).

      We clarified our wording and replaced the mention of “total load” by “mean number of linked deleterious mutations per copy of S-allele” (line 324-325). In the discussion we tried to better explain the limitations of approaches to estimate the genetic load (line 431-437).

      Reviewer #2 (Recommendations For The Authors):

      Line 60, it should be specified that this is only for recessive deleterious mutations.

      Non-recessive deleterious mutations would certainly not be expected to accumulate.

      As explained in detail above, the question of whether and how non-recessive deleterious mutations can accumulate when linked to the S-locus is difficult and would in itself deserve a full treatment, which is clearly beyond the scope of the present study. We clarified this point on line 56.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews: 

      Reviewer #1 (Public Review): 

      The goal of the current study was to evaluate the effect of neuronal activity on blood-brain barrier permeability in the healthy brain, and to determine whether changes in BBB dynamics play a role in cortical plasticity. The authors used a variety of well-validated approaches to first demonstrate that limb stimulation increases BBB permeability. Using in vivo-electrophysiology and pharmacological approaches, the authors demonstrate that albumin is sufficient to induce cortical potentiation and that BBB transporters are necessary for stimulus-induced potentiation. The authors include a transcriptional analysis and differential expression of genes associated with plasticity, TGF-beta signaling, and extracellular matrix were observed following stimulation. Overall, the results obtained in rodents are compelling and support the authors' conclusions that neuronal activity modulates the BBB in the healthy brain and that mechanisms downstream of BBB permeability changes play a role in stimulus-evoked plasticity. These findings were further supported with fMRI and BBB permeability measurements performed in healthy human subjects performing a simple sensorimotor task. There is literature to suggest that there are sex differences in BBB dysfunction in pathophysiological conditions and the authors have acknowledged the use of only males as a minor limitation of the study that should be addressed in the future. Future studies should also test whether the upregulation of OAT3 plays a role in cortical plasticity observed following stimulation. Overall, this study provides novel insights into how neurovascular coupling, BBB permeability, and plasticity interact in the healthy brain. 

      Reviewer #2 (Public Review): 

      Summary: 

      This study builds upon previous work that demonstrated that brain injury results in leakage of albumin across the blood brain barrier, resulting in activation of TGF-beta in astrocytes. Consequently, this leads to decreased glutamate uptake, reduced buffering of extracellular potassium and hyperexcitability. This study asks whether such a process can play a physiological role in cortical plasticity. They first show that stimulation of a forelimb for 30 minutes in a rat results in leakage of the blood brain barrier and extravasation of albumin on the contralateral but not ipsilateral cortex. The authors propose that the leakage is dependent upon neuronal excitability and is associated with an enhancement of excitatory transmission. Inhibiting the transport of albumin or the activation of TGF-beta prevents the enhancement of excitatory transmission. In addition, gene expression associated with TGF-beta activation, synaptic plasticity and extracellular matrix are enhanced on the "stimulated" hemisphere. That this may translate to humans is demonstrated by a break down in the blood brain barrier following activation of brain areas through a motor task. 

      Strengths: 

      This study is novel and the results are potentially important as they demonstrate an unexpected break down of the blood brain barrier with physiological activity and this may serve a physiological purpose, affecting synaptic plasticity. 

      The strengths of the study are: 

      (1) The use of an in vivo model with multiple methods to investigate the blood brain barrier response to a forelimb stimulation. 

      (2) The determination of a potential functional role for the observed leakage of the blood brain barrier from both a genetic and electrophysiological view point 

      (3) The demonstration that, by inhibiting different points in the putative pathway from activation of the cortex to transport of albumin and activation of the TGF-beta pathway, the effect on synaptic enhancement could be prevented.

      (4) Preliminary experiments demonstrating a similar observation of activity dependent break down of the blood brain barrier in humans. 

      Weaknesses: 

      The authors adequately addressed most of my points. A few remain: 

      (1) Although the authors have addressed the possible effects of anaesthesia on neuro-vascular coupling, they have not mentioned or addressed the possible effects of ketamine (an NMDA receptor antagonist) on synaptic plasticity. Indeed, the low percentage of SEP increase following potentiation (10-20%) could perhaps be explained by partial block of NMDA receptors by ketamine.

      We agree and apologize for this oversight. This important issue is now addressed in the Discussion.

      “Notably, the antagonistic effect of ketamine on NMDA receptors might attenuate the magnitude of SEP potentiation recorded in our experiments (Anis et al., 1983; Salt et al., 1988).”

      (2) The experimental paradigms remain unclear to me. Now, it appears that drugs are applied for 50 minutes and that the stimulation occurs during the "washout period". The more conventional approach would be to have the drug application during the stimulation period to determine if the drugs occlude or enhance the effects of stimulation and then washout the drugs. The problem is that drugs variably washout at different rates depending upon their lipid solubility.

      We agree that the more conventional approach would have been to continue applying the drug throughout the experiment and that differential rates of washout may add variability to our experiments. However, despite this limitation, within each treatment group we found that the SEP response at 50 minutes (immediately after the drug application window) does not differ from SEP response at 80 minutes (after 30 minutes of stimulation and washout) [Figure 3H&G]. This suggests that the drug effects were still present despite terminating drug application and performing potentiation-inducing stimulation. Moreover, our analysis showed that animals within each treatment group (except AP5) had similar SEP responses with little intra-group variability.

      (3) It is still not clear to what extent the experimenters and those doing the analysis were blinded to group. If one or both were blind to group, then please put this in the methods.

      Thank you for this comment. We revised the Methods section to clearly confirm that data was collected and analyzed blindly.  

      Reviewer #3 (Public Review): 

      Summary: 

      This study used prolonged stimulation of a limb to examine possible plasticity in somatosensory evoked potentials induced by the stimulation. They also studied the extent that the blood brain barrier (BBB) was opened by the prolonged stimulation and whether that played a role in the plasticity. They found that there was potentiation of the amplitude and area under the curve of the evoked potential after prolonged stimulation and this was long-lasting (>5 hrs). They also implicated extravasation of serum albumin, caveolae-mediated transcytosis, and TGFb signalling, as well as neuronal activity and upregulation of PSD95. Transcriptomics was done and implicated plasticity related genes in the changes after prolonged stimulation, but not proteins associated with the BBB or inflammation. Next, they address the application to humans using a squeeze ball task. They imaged the brain and suggest that the hand activity led to an increased permeability of the vessels, suggesting modulation of the BBB. 

      Strengths: 

      The strengths of the paper are the novelty of the idea that stimulation of the limb can induce cortical plasticity in a normal condition, and it involves opening of the BBB with albumin entry. In addition, there are many datasets and both rat and human data. 

      Weaknesses: 

      The conclusions are not compelling however because of a lack of explanation of methods.

      In the revised paper, we added a section titled ‘study design’ that presents an overview of the experimental approach.

      The explanation of why prolonged stimulation in the rat was considered relevant to normal conditions should be as clear in the paper as it is in the rebuttal.

      We added a new paragraph to the Discussion section explaining this point as we did in the rebuttal:  

      “Our animal experiments show that a 30 min limb stimulation (at 6Hz and 2mA) increases cross-BBB influx, while a 1 min stimulation (of similar frequency and magnitude) does not. We believe that both types of stimulations fall within the physiological range because our continuous electrophysiological recordings showed no signs of epileptiform or otherwise pathological activity. Moreover, the recorded SEP levels were similar to those reported in previous physiological LTP studies in rats (Eckert & Abraham, 2010; Han et al., 2015; Mégevand et al., 2009) and humans (McGregor et al., 2016). In humans, skill acquisition often involves motor training sessions that last ≥30 minutes (Bengtsson et al., 2005; Classen et al., 1998) and result in physiological plasticity of sensory and motor systems (Classen et al., 1998; Draganski et al., 2004; Sagi et al., 2012). Hence, the experimental task in our human study (30 minutes of repetitive squeezing of an elastic stress-ball) is likely to represent physiological activity, with neuronal activation in primarily motor and sensory areas (Halder et al., 2005). Future human and animal studies are needed to explore the BBB modulating effects of additional stimulation protocols – with varying durations, frequencies, and magnitudes. Such studies may also elucidate the temporal and ultrastructural characteristics that differentiate between physiological and pathological BBB modulation.”

      The authors need to ensure other aspects of the rebuttal are as clear in the paper as in the rebuttal too. 

      Thank you for this comment. This was addressed in the revised paper.

      The only remaining concern that is significant is that it is hard to understand the figures. 

      Thank you for this comment. We revised the figures according to the reviewer’s recommendations. We hope that these changes increase the legibility of the figures. 

      Reviewer #3 (Recommendations For The Authors): 

      The manuscript is improved but there are still suggestions that do not appear to have been addressed. More experiments are not involved in addressing these concerns but one wants the paper to be clarified in terms of what was done. 

      Figures. Please use arrows to point to the effect that the reader should see. Please note what the main point is. 

      Major concerns: 

      Please add explanations, exact p values, and other revisions in the rebuttal to the paper. 

      Rebuttal explanations were added to the paper and p values appear in figure legends.

      Fig 1d shows a seizure-like event which the authors don't think is a seizure because it lacks a depolarization shift. This explanation is not convincing because an LFP would not necessarily show a depolarization shift. Another argument, or a discussion of the event as a seizure, is warranted. Note that expanding the trace might also show it is unlike a seizure. Regarding the idea that 6Hz 2 mA stimuli for 30 min are physiological, the authors make three arguments which are not clear. First, no epileptiform activity was found, but in Fig. 1 it looks like a seizure occurred. Second, memory and skill acquisition in humans often involve a similar training duration - but what about 6Hz 2 mA?

      Rats are known to rhythmically move their whiskers at frequencies ranging between 5 and 15 Hz (Mégevand et al., 2009). We agree that there is no clear way to justify the similarity between the experimental design in humans and rats. However, we believe that both paradigms (paw stimulation in rats and ball squeeze in humans) represent non-pathological input that we found to modulate barrier permeability. This argument was added to the discussion of the paper:

      “We believe that both types of stimulations fall within the physiological range because in rats, activity between 5-15 Hz represents physiological rhythmic whisker movement during environment exploration (Mégevand et al., 2009).” 

      Seizures are typically induced in rats via direct tetanic stimulation of the brain (at 50 Hz and 0.3-2.5mA) or maximal electroshock test to the cornea (at 50 Hz and 150 mA) (Swinyard et al., 1952). We, therefore, assert that the activity we observe represents physiological responses and not seizures. This argument is beyond the scope of the current paper. 

      Please note a limitation is that the high level of serum albumin is unlikely to be physiological but may not have been as high in the animal because of the low diffusion rate and degradation (please add the refs in the rebuttal). 

      Thank you, we added the following to the Results section: 

      “The relatively high concentration of albumin was chosen to account for factors that lower its effective tissue concentration such as its low diffusion rate and its likelihood to encounter a degradation site or a cross-BBB efflux transporter (Tao & Nicholson, 1996; Zhang & Pardridge, 2001).”

      Fig. 1. 

      Please consider a box in b to show where the expanded traces in the lower row came from. 

      Thank you for the suggestion. We added lines indicating where the trace excerpts were taken from.

      c. Please use arrows to point to the parts that the authors want the reader to note. In the legend, explain what t is, and delta HbT.

      Thank you. We implemented this suggestion.

      d. It is not clear what the double-sided arrows are meant to show compared to the arrow without two sides. 

      We replaced the two-headed arrow with two single ones.

      e. Please explain what the upward lines at the top signify. What does the red asterisk mean? 

      Thank you. We implemented this suggestion.

      f. Is the reader supposed to note the yellow area? Please make it with an arrow or circle if so. 

      Thank you, we added a white circle to mark the area of tracer accumulation.

      g. Please explain what the permeability index is or reference the part of the paper that does. 

      Further to this suggestion, we added a reference to the appropriate methods section in the legend.

      h. Please use arrows to point to the area of interest. 

      Thank you. We implemented this suggestion.

      m-n. Please mark areas of interest with arrows.  m. the top right two images are unclear. I suggest making them say ipsi inset and contra inset instead of using asterisks. 

      Thank you. We added the ipsi and contra labels to panels in m. The images in panel n represent a phenomenon with no particular region of interest, but rather peri-vascular tracer accumulation along the entire depicted blood vessel. We clarified that panel n represents a separate experiment from panel m: “n. In an animal injected with both EB and NaFlu post stimulation, fluorescence imaging shows extravascular accumulation of both tracers along a cortical small vessel in the stimulated hemisphere.”

      Figure 2. 

      (2) a. Middle. What are the vertical lines at the top? The rebuttal states that was explained in the revised legends but I don't see it. 

      Our apologies. We have now included an explanation that “an excerpt of the stimulation trace is shown above the middle LFP trace”.

      c and d are very different field potentials in shape and therefore hard to compare. The rebuttal addresses this but the explanation is not in the revised text. 

      We agree that there is variability in SEP responses between animals. We now added a statement acknowledging this in the methods section: “To overcome potential variability in SEP morphology between animals (Mégevand et al., 2009), each animal’s plasticity measures (max amplitude and AUC of post stimulation SEP) were compared to the same measures at baseline.” 

      In d, it is not clear there is potentiation because the traces are not aligned. 

      All panels depicting SEP traces represent raw data with no alignment. The shift observed in panel d exemplifies why we compare post-stimulation parameters of max amplitude and area under curve to baseline in each animal. 

      Exact P values are said to have been added in the rebuttal but they were not. 

      Exact P values appear in Figure legends.

      (3) b. Use arrows to mark the area of interest. 

      Thank you. We added a white circle to mark the area of tracer accumulation similar to Figure 1f.

      d. Why is there an oscillation superimposed on all traces except CNQX? 

      We agree that this is an interesting question. Future studies should determine the source of this SEP pattern.   

      (4) What does the line and the number 2 mean? How were data normalized? What was counted? What area of cortex?

      The number 2 refers to the scale bar: the length of the scale bar line corresponds to a log fold change of 2. 

      The plot shows the log fold change against the mean count of each gene in the contralateral somatosensory cortex between 1 and 24 hours after stimulation.

      The x axis title was changed to “mean expression” and the legend was modified to:

      “Scatter plot of gene expression from RNA-seq in the contralateral somatosensory cortex 24 vs. 1 h after 30 min stimulation. The y axis represents the log fold change, and the x axis represents the mean expression levels (see methods, RNA Sequencing & Bioinformatics). Blue dots indicate statistically significant differentially expressed genes (DEGs) by Wald Test (n=8 rats per group).”
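      The two axes described in this legend can be illustrated with a minimal sketch. It is not the authors' pipeline, which is described in their Methods and not reproduced here. The function name `ma_point`, the base-2 logarithm, the pseudocount of 1 and the example counts are all assumptions made for illustration only.

```python
import math
from statistics import mean


def ma_point(counts_1h, counts_24h, pseudocount=1.0):
    """Return (mean expression, log2 fold change) for one gene.

    counts_1h / counts_24h: normalized read counts per animal (hypothetical data).
    The pseudocount (an assumption) avoids taking the log of zero.
    """
    mean_1h = mean(counts_1h)
    mean_24h = mean(counts_24h)
    # y axis: log fold change of 24 h relative to 1 h (base 2 assumed)
    log_fc = math.log2((mean_24h + pseudocount) / (mean_1h + pseudocount))
    # x axis: mean expression across all samples in both groups
    mean_expr = mean(list(counts_1h) + list(counts_24h))
    return mean_expr, log_fc


if __name__ == "__main__":
    # Hypothetical gene with higher counts at 24 h than at 1 h
    print(ma_point([7, 7], [31, 31]))
```

      Under these assumptions, a gene that sits one scale-bar length (a log fold change of 2) above zero is expressed roughly four times more strongly at 24 h than at 1 h.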

      How were the pericytes, smooth muscle cells, etc. distinguished? 

      This was explained under Methods->RNA Sequencing & Bioinformatics: “Analysis of cell-specific and vascular zonation genes was performed as described (Vanlandewijck et al., 2018), using the database provided in (http://betsholtzlab.org/VascularSingleCells/database.html).”

      What were the chi square statistics? If there were cells used instead of rats, please justify. 

      Thank you. The legend was expanded to include the following:

      “The contralateral somatosensory cortex was found to have a significantly higher number of DEGs related to synaptic plasticity, than the ipsilateral side (***p<0.001, Chi-square).”     

      (5) b. what do the icons mean? 

      We agree that the icons were confusing. We simplified this panel to just show when participants were asked to squeeze the ball (black icon). This explanation was added to the Figure legend.

      Abbreviations? 

      Abbreviations of MRI protocols were added to the figure legend for clarity.

      In c-e what are the units of measure? Fold-change? 

      The units represent t-statistics values for each voxel. The label ‘t-statistic’ was added to the figure.  

      What are the white lines, + and - signs? 

      The white lines point to voxels of highest activation (t-statistic). This was added to the legend.

      These are not +/- signs; they are voxels with significant activation that only appear similar.

      f. Please explain f and g for clarity. 

      Thank you. The explanation was modified for added clarity.

      Supplemental Fig. 4. 

      Original question: If ipsilateral and contralateral showed many changes why do the authors think the effects were only contralateral? 

      The authors replied: Our gene analysis was designed to complement our in vivo and histological findings, by assessing the magnitude of change in differentially expressed genes (DEGs). This analysis showed that: (1) the hemisphere contralateral to the stimulus has significantly more DEGs than the ipsilateral hemisphere; and (2) the DEGs were related to synaptic plasticity and TGF-b signaling. These findings strengthen the hypothesis raised by our in vivo and histological experiments. 

      Could the authors clarify the answer to the question in the text? 

      Thank you. This section was added to the Discussion. 

      Papers referenced in this letter:

      Anis, N. A., Berry, S. C., Burton, N. R., & Lodge, D. (1983). The dissociative anaesthetics, ketamine and phencyclidine, selectively reduce excitation of central mammalian neurones by N-methyl-aspartate. British Journal of Pharmacology, 79(2), 565–575. https://doi.org/10.1111/j.1476-5381.1983.tb11031.x

      Bengtsson, S. L., Nagy, Z., Skare, S., Forsman, L., Forssberg, H., & Ullén, F. (2005). Extensive piano practicing has regionally specific effects on white matter development. Nature Neuroscience, 8(9), 1148–1150. https://doi.org/10.1038/nn1516

      Classen, J., Liepert, J., Wise, S. P., Hallett, M., & Cohen, L. G. (1998). Rapid plasticity of human cortical movement representation induced by practice. Journal of Neurophysiology, 79(2), 1117–1123. https://doi.org/10.1152/jn.1998.79.2.1117

      Draganski, B., Gaser, C., Busch, V., Schuierer, G., Bogdahn, U., & May, A. (2004). Changes in grey matter induced by training. Nature, 427(6972), 311–312. https://doi.org/10.1038/427311a

      Eckert, M. J., & Abraham, W. C. (2010). Physiological effects of enriched environment exposure and LTP induction in the hippocampus in vivo do not transfer faithfully to in vitro slices. Learning and Memory, 17(10), 480–484. https://doi.org/10.1101/lm.1822610

      Halder, P., Sterr, A., Brem, S., Bucher, K., Kollias, S., & Brandeis, D. (2005). Electrophysiological evidence for cortical plasticity with movement repetition. European Journal of Neuroscience, 21(8), 2271–2277. https://doi.org/10.1111/j.1460-9568.2005.04045.x

      Han, Y., Huang, M. De, Sun, M. L., Duan, S., & Yu, Y. Q. (2015). Long-term synaptic plasticity in rat barrel cortex. Cerebral Cortex, 25(9), 2741–2751. https://doi.org/10.1093/cercor/bhu071

      McGregor, H. R., Cashaback, J. G. A., & Gribble, P. L. (2016). Functional Plasticity in Somatosensory Cortex Supports Motor Learning by Observing. Current Biology, 26(7), 921–927. https://doi.org/10.1016/j.cub.2016.01.064

      Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M., & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. Journal of Neuroscience, 29(16), 5326–5335. https://doi.org/10.1523/JNEUROSCI.5965-08.2009

      Sagi, Y., Tavor, I., Hofstetter, S., Tzur-Moryosef, S., Blumenfeld-Katzir, T., & Assaf, Y. (2012). Learning in the Fast Lane: New Insights into Neuroplasticity. Neuron, 73(6), 1195–1203. https://doi.org/10.1016/j.neuron.2012.01.025

      Salt, T. E., Wilson, D. G., & Prasad, S. K. (1988). Antagonism of N-methylaspartate and synaptic responses of neurones in the rat ventrobasal thalamus by ketamine and MK-801. British Journal of Pharmacology, 94(2), 443–448. https://doi.org/10.1111/j.1476-5381.1988.tb11546.x

      Swinyard, E. A., Brown, W. C., & Goodman, L. S. (1952). Comparative assays of antiepileptic drugs in mice and rats. The Journal of Pharmacology and Experimental Therapeutics, 106(3), 319–330. http://jpet.aspetjournals.org/content/106/3/319.abstract

      Tao, L., & Nicholson, C. (1996). Diffusion of albumins in rat cortical slices and relevance to volume transmission. Neuroscience, 75(3), 839–847. https://doi.org/10.1016/0306-4522(96)00303-X

      Vanlandewijck, M., He, L., Mäe, M. A., Andrae, J., Ando, K., Del Gaudio, F., Nahar, K., Lebouvier, T., Laviña, B., Gouveia, L., Sun, Y., Raschperger, E., Räsänen, M., Zarb, Y., Mochizuki, N., Keller, A., Lendahl, U., & Betsholtz, C. (2018). A molecular atlas of cell types and zonation in the brain vasculature. Nature, 554(7693), 475–480. https://doi.org/10.1038/nature25739

      Zhang, Y., & Pardridge, W. M. (2001). Mediated efflux of IgG molecules from brain to blood across the blood–brain barrier. Journal of Neuroimmunology, 114(1–2), 168–172. https://doi.org/10.1016/S0165-5728(01)00242-9

    1. to simply slot Clarkson into the standard history of the field would miss much of the point

      I think this is too much modesty, or a kind of self-undercutting to try to convey the importance of the point. But functionally it undercuts the significance of the earlier parts of the chapter. At the end here I'm understanding the argument as being

      1. The antislavery campaign shows what state-of-the-art data visualization meant c. 1800, and these two different visualizations from Clarkson make the case that he should be considered one of the canonical figures.
      2. That's important because it heightens a set of ethical and political questions about whether and when to visualize. Clarkson's work can be considered a countervisualization or something -- possibly a concept to introduce? -- because it's taking advantage of the trade etc. Also highlights dataviz as a political-rhetorical form, not just a scientific practice about astronomy etc.
      3. Just because we admire things about Clarkson's career doesn't mean we should literally canonize him as a saint. Equiano's reaction shows that even at that time there are a different set of requirements.

      And then there is the metaphor of water and streams. This does a few things: 1. provides a counterpoint to the God's-eye, object view by adding a contingency of flow and direction, fluidity, and contingency. 2. was useful for ~1800 readers who ALSO weren't always looking for this objective god's-eye view, which is OK. (I think the infographic/dataviz distinction from the introduction here is useful, because it underlines that the more 'subjective' or whatever flow timelines are an ADVANCE on Priestley's straight lines and can be seen as such.) 3. Motivates your own data visualization of the streams of with the also-canonical Mississippi visualization. I may have missed this but I think the connection here is almost fully implicit. This could be one key to motivating the water thing as your own choice.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      We thank the reviewer for his/her time and for the constructive comments. Below please find our detailed responses to your points.

      STING is a key signalling hub in the innate immune system, receiving multiple inputs from upstream activators (such as cGAS) and in turn triggering multiple downstream events (such as IFN induction, NF-kB signalling, autophagy, cell death). Mutations in the STING gene cause a rare inflammatory disease called SAVI. Using a previously established STING ki mouse that recapitulates some of the clinical observations in SAVI patients, this manuscript tests the hypothesis that TNF signalling drives pathology. Using anti-TNF antibody and TNF receptor knockout, the authors show that TNF indeed plays important roles in causing disease in this mouse model. For example, the loss of T cells and neurons is prevented when TNF signalling is blocked, and lung pathology is rescued in STING ki mice lacking TNF receptors. Overall, the manuscript is well written and laid out, and the experimental work is of a high technical standard.

      Major comments

        • Most figures show pooled data from two independent experiments including a total of 5-8 mice. Given the variability in some of the readouts, this raises the question of whether there is sufficient statistical power to draw conclusions. For example, in Figure 2, the conclusion that "Infliximab did not alter the expression of inflammatory mediators" seems questionable given the results in Figure 2F and G. Did the authors perform a power calculation? What effect size can the authors detect given the variability and number of replicates? Similarly, in Figure 3, the authors conclude that "Disruption of TNFR signaling did not significantly prevent T cell lymphopenia"; however, with some more replicates, the data in Figure 3D would likely reach significance. Similar concerns apply to several panels in Figures 4 and 6 and to Figure S5M. Ideally, the authors should perform additional repeat experiments to increase the number of replicates. If that is not possible, power calculations need to be provided and conclusions should explicitly mention the minimum effect size that the author can detect given the small sample size (for example "Infliximab did not alter the expression of inflammatory mediators more than x-fold").

      Thank you for this suggestion. However, it is not possible to repeat the treatment of mice with Infliximab for generation of more replicates. The blockade of TNF signalling by treatment with drugs did not cure the murine SAVI disease. According to animal welfare restrictions, we cannot perform additional treatment experiments with Infliximab or Etanercept.

      We calculated the effect sizes (d, f) and the power for all of these results and collected them in Table S4. Additional explanations of the effect sizes were added to the text accompanying Figures 2 and 3. The results shown in Figures 4 and 6 already contain significant data, so we did not calculate effect sizes for them.
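      The reviewer's question about the minimum detectable effect size can be illustrated with a minimal sketch. It uses a normal approximation for a two-sided, two-group comparison; it is not the calculation behind Table S4, whose method is not described here. The function names, the 5% significance level and the 80% power target are assumptions made for illustration. For very small groups the normal approximation is optimistic compared with an exact t-based calculation, so the true detectable effect is somewhat larger than shown.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate two-sided power for a standardized effect size d (Cohen's d).

    Assumption: two independent groups of equal size, normal approximation.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)
    # Probability of exceeding the critical value in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Smallest Cohen's d detectable with the requested power (approximation)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_crit + z_power) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Hypothetical group sizes in the range mentioned by the reviewer.
    for n in (3, 4, 5, 8):
        print(f"n per group = {n}: minimum detectable d is about {min_detectable_d(n):.2f}")
```

      Under these assumptions, groups of three to eight animals can only detect effects well above one pooled standard deviation, which is the kind of statement the reviewer asks to be attached to non-significant results.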

      • The authors should not make unjustified overstatements. For example, STING KI; TNFR1/2 KO mice should not be referred to as a "new mouse model". The manuscript simply tests the role of TNFR1/2 in the already published STING N153S model. In line 687, avoid using "impressively" and in line 734 avoid using "massively".

      Thank you for this suggestion. We changed this sentence to: “…these newly generated mouse lines of TNFR…” (see line 796). Additionally, in line 687 (current line 705) we omitted “impressively”, and in line 734 (current line 752) we changed “massively produced” to “elevated”.

      Minor comments

      • Line 767-769: The statement that spike activates cGAS is misleading, because this effect is an indirect consequence of cell-to-cell fusion (Liu et al 2022).

      Thank you for this suggestion. We changed this sentence to: “Cell fusion caused by the SARS-CoV-2 spike protein is a potent…” (current line 785).

      Reviewer #1 (Significance (Required)):

      The main strengths of this study are (1) the use of complementary antibody-based and genetic methods to test the role of TNF signalling; (2) the use of multiple different readouts; and (3) the analysis of many different cell types / organ systems. The main weaknesses are (1) small sample sizes limiting statistical power (see above) and (2) the exclusive use of mouse models.

      Overall, my opinion is that the advance is important, both fundamentally and clinically. Studies of this and the related V154M mouse model previously showed an important role of non-IFN pathways in driving disease. This study indicates that TNF signalling may cause pathology. This not only extends our understanding of STING's role in autoinflammation but also opens a direct therapeutic avenue using approved TNF targeting drugs.

      This study will be primarily of interest to specialised audiences working on STING and SAVI, and secondarily to the wider innate immunity field.

      This reviewer has expertise in the field of nucleic acid sensing, including cGAS-STING.

      Reviewer #2:

      We thank the reviewer for his/her time and for the constructive comments. Below please find our detailed responses to your points.

      In this paper, Luksch et al (2024) examine the role of TNF signaling in STING-associated vasculopathy with onset in infancy (SAVI). By using pharmacological inhibition and genetic inactivation of TNF receptors in a murine SAVI model (STING ki), the research found that pharmacologically inhibiting TNF signaling improved T cell lymphopenia but had limited effects on lung disease. Genetic inactivation of TNFR signaling, particularly TNFR1, enhanced thymocyte survival and expanded the peripheral T cell pool, reducing inflammation and neurodegeneration. The development and progression of severe lung disease in STING ki mice are also reliant on TNFR1 signaling, while TNFR2 deletion did not alleviate lung inflammation. The authors also explored the severe inflammatory lung disease manifestation, showing that primary lung endothelial cells in STING ki mice allowed more neutrophil attachment compared to those in STING WT mice, indicating chronic STING activity in endothelial cells disrupts the endothelial barrier and promotes severe lung disease. The study highlights TNFR signaling as crucial in SAVI and COVID-19 progression and suggests blocking TNFR1 signaling as a potential therapeutic approach for both diseases.

      Major comments:

      The paper establishes a strong connection between TNFR1 depletion and the reduction of SAVI disease severity in lung and neuroinflammation, suggesting TNFR1 blockade as a viable therapeutic strategy for SAVI. To strengthen the arguments and improve the therapeutic potential, the authors should address the following major comments:

        • The authors conclude that TNFR1 signaling drives murine SAVI disease, as evidenced by the reduced severity of lung disease in TNFR1 -/- mice. While the genetic model is convincing, the discrepancy between pharmacological inhibition and genetic models needs clarification. Before attributing the pharmacological failure to late administration, have the authors considered that Infliximab might not sufficiently deplete TNF to achieve therapeutic benefits? In figure 2H, serum TNF levels were not significantly altered in STING ki mice treated with Infliximab. Have the authors considered using other TNF inhibitors or alternative methods to measure TNF depletion efficacy in STING ki murine models, such as qPCR, flow cytometry, or immunohistochemistry in lymph nodes or lung tissues?

      Thank you for this suggestion. In a preliminary experiment that is not included in the paper, we treated STING WT and STING ki mice with Etanercept. Three-week-old mice received subcutaneous injections of 25 mg/kg Etanercept or saline twice per week for 7 weeks. After treatment, all mice were euthanized, and single-cell suspensions of blood and spleen were used for flow cytometry analysis. Lung tissue was harvested for histological analysis. Gene expression was quantified in snap-frozen lung and kidney tissue, and secreted proteins were quantified in snap-frozen serum.

      The transcription of ISGs and proinflammatory mediators in lung tissue was not significantly improved by Etanercept treatment (see additional figure below, A – D). Interestingly, the amount of secreted CXCL9 in the serum was reduced in Etanercept-treated mice compared to vehicle-treated mice (E). We concluded that our treatment strategy had no impact on the manifestation and progression of murine SAVI disease in highly inflamed tissues and organs. However, we found a reduction (partially significant) in the transcription of proinflammatory mediators in the kidney of Etanercept-treated mice compared to vehicle control mice. Murine SAVI disease is a systemic autoinflammatory disease without histological alterations in the kidney tissue of 10-week-old mice. Remarkably, transcription of ISGs and proinflammatory mediators is nonetheless highly upregulated in SAVI mice. Treatment with Etanercept improved this aberrant gene expression in this SAVI-affected organ (I – K). These results encouraged us to perform the treatment with Infliximab, because we expected a more pronounced effect: Infliximab can bind the monomeric and trimeric forms of TNF, while Etanercept can only bind the active trimeric form.

      Etanercept treatment of STING WT (in black) and STING ki (in red) mice.

      (A) Relative expression level of Cxcl10, (B) Mx1, (C) Tnf and (D) Il1b in lung tissue of Etanercept or saline treated STING WT and STING ki mice. (E) Quantification of CXCL9, (F) CXCL10, (G) IL-6 and (H) TNF in serum samples from STING WT and STING ki mice after treatment. (I) Relative expression level of Cxcl10, (J) Mx1, (K) Tnf and (L) Il1b in kidney tissue of treated mice.

      • The TNF pathway exhibits redundancy, as multiple signaling molecules or pathways can compensate for the loss of TNF function to maintain cellular processes and immune responses. The authors showed that thymocytes of STING ki mice lacking TNFR1/2 expressed significantly lower levels of IFN-related genes (Cxcl10, Sting1), and mice lacking TNFR1 and TNFR1/2 expressed reduced levels of NF-κB-related genes. Does this imply that IFN and NF-κB pathways are downstream of TNF signaling driving SAVI progression? It would be valuable to hear the authors' comments or postulations on the potential mechanisms of TNF driving SAVI progression in the discussion, and the methods to dissect the mechanisms further using genetic or pharmacological methods.

      Thank you for this suggestion. STING is a key player in various proinflammatory mechanisms and is directly involved in IFN and NF-κB signalling. We assume that these signalling pathways are adaptable to various proinflammatory situations. Knockout of TNFR1 and TNFR1/2 results in a strong inhibition of all inflammatory reactions in the whole organism. We think it is not possible to infer the mechanisms of murine SAVI manifestation and progression from the results of these mouse lines alone. These observations provide new hypotheses but cannot completely explain the mechanism.

      • The authors mentioned that the pharmacological inhibition of TNF by Infliximab is ineffective due to late administration compared to the onset of SAVI. How would this affect the therapeutic treatment of TNF if the treatment is going to be later than the disease onset? Can the authors elaborate on the potential ways to circumvent the timing of treatment? Would TNFR1 antagonists experience the same issue? To understand disease progression and optimal targeting times, the creation of an inducible TNFR1/2 -/- mouse model could be beneficial. This is optional, but the authors are encouraged to comment on improving TNFR1/2 -/- mouse SAVI models to further study the therapeutic potential of TNF signaling blockage in treating SAVI.

      We agree with the suggestion. In our next project, we plan to generate STING ki mice with an inducible knockout.

      Minor comments:

      • The authors separate STING WT and STING ki into different graphs, which can sometimes make it hard to compare STING WT and STING ki baseline levels. It would be beneficial to merge the two genotypes into single graphs for easier comparison.

      Thank you for this suggestion. In the first version of this manuscript, we collected results from STING WT and STING ki mice in one graph with 8 bars in different colours and textures in the case of the TNFR knockout lines. These graphs were overloaded and very confusing. It was not possible to mark statistical comparisons inside these graphs without losing focus. Hence, we created the graph design presented here, which we think is the clearest version.

      • Figure S5 lacks statistical annotations, although the legends mention them. Are the statistics usually shown when a comparison is mentioned in the text, or are they only displayed when the differences are significant? It would be helpful if the authors could clarify this and ensure that all relevant statistical comparisons are clearly reflected in the graphs, regardless of the significance level. This consistency would improve the clarity and interpretation of the data presented.

      Thank you for this suggestion. We removed the significance level from the legend of Figure S5 (current line 1199).

      The authors did an excellent job discussing the study's implications, but some of this content could be moved to the introduction. The hypothesis that "tumor necrosis factor (TNF) signaling is involved in the manifestation and progression of murine SAVI disease" can be introduced more naturally once the authors present previous findings on TNF's association with various autoimmune disorders. This would set a clear context for the study's objectives and rationale.

      We agree with this suggestion and inserted the sentence: “In our previous investigations, we observed an elevated transcription of Tnf in spleen and thymus of STING ki mice (Siedel et al., 2020).” (current lines 89/90).

      General Assessment: The study identifies enhanced TNF signaling as a driver of SAVI and specifies TNFR1 blockage as a promising treatment to reduce disease severity. It thoroughly characterizes pharmacological inhibition and genetic perturbations of TNF signaling in murine SAVI models and creates a novel mouse model for studying TNF-targeted therapies in SAVI treatment.

      However, the study is limited in characterizing the discrepancy between pharmacological inhibition and genetic depletion of TNF and understanding the underlying mechanisms of TNF driving chronic STING activation and tissue inflammation.

      Advances: The study extends knowledge in the field by demonstrating that enhanced TNF signaling drives SAVI, establishing causation rather than mere correlation. The authors provide strong rationale for treating SAVI with TNF inhibitors/blockage, previously used in other autoimmune disorders like IBD or Crohn's disease, but not in SAVI. They also present a valuable genetic model for studying TNFR signaling blockage in SAVI progression, which is important for both the field of SAVI and future therapy development.

      Audience: The research provides translational and clinical insights by suggesting that targeting TNFR1 signaling could inspire novel treatments for SAVI. The study also advances basic research on SAVI disease progression. Immunologists and clinicians studying and treating autoimmune disorders are the intended audience, but the findings have broader implications. The study highlights the potential role of TNF signaling in COVID-19 disease progression and treatment, thus attracting interest beyond the field of autoimmune disorders.

      Field of expertise:

      cGAS-STING regulation in chromosomally unstable cancers, genomic instability, nuclear envelope rupture and repair

      Do not have sufficient expertise in:

      Immunological underpinning of autoimmune disorders, clinical models or manifestations of SAVI

      Reviewer #3:

      We thank the reviewer for his/her time and for the constructive comments. Below please find our detailed responses to your points.

      Uncontrolled activation of STING is linked to autoinflammatory disease "STING-associated vasculopathy with onset in infancy (SAVI)". The authors had previously published a mouse model of SAVI, which was generated by knocking in the disease causing variant N153S into the endogenous murine Sting1 gene (STING ki) (Luksch et.al., 2019). In the current study, the author further investigated the role of tumor necrosis factor (TNF) signaling in manifestation and progression of murine SAVI disease by using the approach of pharmacologic and genetic inhibition of TNF receptors TNFR1 and TNFR2. Overall, the authors were able to demonstrate the following novel findings:

      1) Infliximab treatment of STING ki mice significantly increased the number of blood CD8+ T cells and the thymic cell count. The authors claimed that the pharmacological inhibition of TNF signalling has a partial rescue effect on T cell lymphopenia. However, pharmacologic inhibition of TNF signalling has no effect on lung disease.

      2) On the other hand, STING ki;Tnfr1-/- (lacking TNFR1) showed the similar modest rescue of the CD8+ T and CD4+ T cells in blood compared to the WT C57BL/6 (BL6) but not with STING ki;Tnfr2-/- (lacking TNFR2). STING ki;Tnfr1-/-, STING ki;Tnfr2-/- and STING ki;Tnfr1/2-/- had modest rescue of thymic cell count and reduced spleen cell count (reduced splenomegaly). Along with the rescued thymic content and reduced splenomegaly, genetic ablation of TNF signalling (STING ki;Tnfr1-/-) also prevented manifestation of severe inflammatory lung disease.

      3) To investigate the role of lung endothelial cells in the development of interstitial lung disease, primary murine lung endothelial cells from STING WT, STING ki and STING WT;Tnfr1/2-/- and STING ki;Tnfr1/2-/- mice were isolated and bulk RNAseq was performed. This showed decreased level of several proinflammatory cytokines (e.g. Tnf, Il1b) and chemokines (e.g. Cxcl1, Cxcl2, Cxcl9, Cxcl10, Ccl2, Ccl3 and Ccl4) in STING ki mice lacking TNFR1/2 compared to STING ki mice.

      4) Neutrophils were isolated from bone marrow and were added to cultured primary lung endothelial cell monolayers. The experiments demonstrated that the attachment and transmigration of neutrophil cells were dependent on expression of STING gain-of-function mutation in endothelial cells.

      A few points require clarification before publication of this study.

      • Tnfr1-/-, Tnfr2-/- and Tnfr1/2-/- did not show any statistically significant improvement of thymic cell count in STING ki mice. As such, the statement in the conclusion/summary section of the discussion that Tnfr1 can restore thymocyte numbers should be toned down.
      • Thank you for this suggestion. In Figure 4E, we demonstrated that knockout of TNFR1 leads to an increase in the SP CD8 thymocyte count and, in part, the SP CD4 thymocyte count (Fig. 4D). In line with this suggestion, we now specify this thymocyte subpopulation in the discussion and summary sections (see current lines 684 and 794).

      2) The section on Neuroinflammation and neurodegeneration and dependency of TNFR1/2 signaling is currently very difficult to follow (based on how the data are presented in figures and text). This section needs to be rewritten for clarity.

      Thank you for this suggestion. We re-wrote this section, see line 472 - 499.

      Neuroinflammation and neurodegeneration in dependency of TNFR1/2 signaling

      The extent of inflammation in the mouse brain resulting from constitutive activation of STING N153S was assessed by quantifying the density of Iba1-positive microglia (Fig. 5A). Consistent with our previous findings (Szego et al., 2022), the density of Iba1-positive microglia in the substantia nigra was higher in STING ki;BL6 mice than in STING WT mice (Fig. 5B). TNFR deficiency did not affect neuroinflammation because there was no significant difference in the density of Iba1-positive microglia between STING ki;BL6 mice and STING ki;Tnfr1/2-/- mice (Fig. 5B). This suggests that the TNF pathway is not required for STING-induced microglia activation in the substantia nigra.

      In addition, we measured the extent of STING-induced astrogliosis by quantifying the density of GFAP-positive cells (Fig. 5A). Consistent with our previous findings, the density of GFAP-positive astroglia was higher in STING ki than in STING WT mice (Fig. 5C). Yet, as for microglia, there was no significant difference in the density of GFAP-positive astroglia between STING ki;BL6 mice and STING ki;Tnfr1/2-/- mice (Fig. 5C), suggesting that the TNF pathway is not required for STING-induced astrogliosis in the substantia nigra.

      Finally, we measured the extent of STING-induced neurodegeneration by quantifying the density of TH-positive dopaminergic neurons in the substantia nigra (Fig. 5A). As in our previous findings, the density of TH-positive neurons was lower in STING ki;BL6 mice than in STING WT mice (Fig.5 D). The density of TH-positive neurons in the substantia nigra of STING ki;Tnfr1/2-/- mice was higher than the density of TH-positive neurons in the substantia nigra of STING ki;BL6 mice (Fig. 5 D), suggesting that the STING-induced degeneration of TH-positive neurons was blunted in Tnfr1/2-/- mice and that TNFR1/2 are involved in the STING-induced degeneration of dopaminergic neurons.

      Hence, there is a discrepancy between STING-induced effects on glial cells as opposed to STING-induced effects on neurons. The dependence of STING-induced neurodegeneration but not glial response on TNFR1/2 suggests that the STING-induced degeneration of dopaminergic neurons is not a direct consequence of microglia or astroglia activation. This is consistent with the emerging concept of a neuron-specific inflammatory response (Welikovitch et al., 2020).

      The powerful use of in vivo genetic KO models and TNF inhibitor makes this study a valuable contribution to the field - helping further decipher the importance of the NF-κB/TNF branch of STING in SAVI (knowledge gap). The audience for this work would be specialised to STING biology and potential clinical treatments of SAVI.

      Our expertise is in nucleic acids sensing (such as STING) and auto-immunity.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Response to Reviewers

      We thank the reviewers for their comments and suggestions, which we think are helpful and will improve the manuscript. We intend to address them with the changes and planned revisions below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Bello et al look at the SNP rs28834970 associated with Alzheimer's disease (AD), with C being the risk allele, on chromatin accessibility and expression of a nearby gene, PTK2B, in microglia. Their contention is that the single SNP affects chromatin accessibility and binding of the transcription factor CEBPβ in an intronic region of PTK2B and thereby affects PTK2B expression. I had a few questions that I think are critical to be addressed. Please note that my numbering of panels is based on the figures, not the legends, which do not seem to quite agree with each other. There are also some figure legends that say "IFNg" while the figures say "LPS", which should be fixed.

      We apologise for the mistake in the figure legend that made this confusing, which we have now revised.

      The abstract says that editing a line that is homozygous for protective alleles to homozygous for risk results in "subtle downregulation of PTK2B expression". It isn't clear to me that the presented data fully supports this contention, which is central to the argument of the paper. In figure 2e, the authors show in both RNAseq and ddPCR that there is numerically lower PTK2B expression but this is not indicated to be statistically significant by one-way paired ANOVA. If there is no nominally significant difference in the edited lines, compared to the proposed significant differences in lines carrying the full risk haplotype (figure 1), then it would not seem sensible to ascribe the effects to the single edited base pair.

      We agree with the reviewer that, given that the effect of the SNP on PTK2B expression in the edited lines is small and only significant in macrophages, we should not interpret the effects as being mediated solely through PTK2B expression, and we have substantially reworded the manuscript accordingly.

      Whilst the effects in the eQTL analysis are significant, it is worth noting that this is likely due to the much larger number of donors (133-217) giving greater power to detect the subtle changes in expression (~1.1 to 2 fold in eQTL). This change is of a similar magnitude in our SNP edited lines (~1.2 fold in SNP edited lines) as would be expected of most common regulatory variants so we believe that it could be the primary causal variant. However, we cannot exclude that other variants in the haplotype could contribute to the effect, so have also reworded accordingly to make this clear.

      Given this uncertainty about the overall strength of effect of the single base pair change it would seem important to evaluate the proposed mechanism of CEBPb binding. It wasn't clear whether the ATAC-seq data summarized in the volcano plot in 2C is proposed to be a cause or a consequence of the CEBPb binding change. I am assuming that the 'fold change' estimate here is CC compared to TT, which would be consistent with direction of effect in figure 1, but please clarify.

      We apologise for the mistake in the figure legend that made this confusing, which we have now revised along with clarification in the revised text. It is difficult to be sure whether changes in chromatin accessibility are a cause or consequence of CEBPb binding, but the fact that the binding of CEBPb is increased in the CC allele (Fig 2a, Fig 2c), that the C allele better matches the consensus sequence (Fig 2b) and there is increased chromatin accessibility (Fig 2a, Supp Fig 3b) suggests that CEBPb binding is causing the formation of the region of chromatin accessibility.

      In contrast to the subtle effects at PTK2B, the global transcriptional effects in figure 3 look quite strong. Are any of these changes dependent on PTK2B, that is to say, are they mimicked by partial suppression of PTK2B expression or activity?

      We agree that the downstream effects of the SNP are much stronger than the effects on PTK2B expression, and we have substantially reworded the manuscript to make it clear that we are unsure that the effects of the SNP are all mediated via PTK2B.

      However, we note that there is evidence in the literature of a loss in CCL4 and CCL5 expression upon PTK2B knockout in macrophages (https://www.nature.com/articles/s41467-021-27038-5) and inhibition of PTK2B in monocytes results in a reduction in CCL5 and CXCL1 (https://www.nature.com/articles/s41598-019-44098-2) consistent with our observations.

      Experiments to manipulate PTK2B expression in microglia and readout changes at the RNA level would take a few months to complete, but we would be willing to do this if the reviewer felt this was necessary.

      Finally, in figure 4, it should be clarified as to why lower expression of PTK2B would be expected to have a detrimental effect on Alzheimer's risk. If understood correctly, and again fixing the figure legends would be helpful, the CC edited lines (risk) have lower chemokine induction than the unedited TT lines.

      We apologise for the error in this figure which we have corrected in the revised version. You are correct that the CC lines have a lower chemokine level in both unstimulated and stimulated cells, and we have now discussed further how this may be linked to increased disease risk.

      "Even though overexpression of these chemokines is characteristic of neuroinflammation, correlated with disease progression and found in late stages of AD, knockout of chemokines, such as CCL2, and chemokine receptors, such as CCR2 and CCR5, in mice is associated with increased Aβ deposition and accumulation [47,50-52,107]. It has also been found that patients carrying the CCR5Δ32 mutation, which prevents CCR5 surface expression, develop AD at a younger age [108]. Therefore, we hypothesize that downregulation of these chemokines in macrophages and microglia harbouring the C/C allele of rs28834970 affects Aβ-induced microglia chemotaxis, leukocyte recruitment and clearance of Aβ, and may increase the risk of developing symptomatic AD."

      Reviewer #1 (Significance (Required)):

      Going from GWAS hits, which represent blocks of high LD inherited variants, to single functional variants is a difficult problem in human genetics. The current paper attempts to isolate the effect of a single variant within an LD block on IPSC derived macrophages and microglia. This idea might be useful in nominating PTK2B as a therapeutic target for AD, although there is some question in my mind as to direction of effect.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      SUMMARY: In this manuscript the authors explore the biological effects of an intronic SNP in the PTK2B gene, previously shown to be associated with late onset Alzheimer's disease (AD) risk. Based on the likely effect of the SNP locus on PTK2B expression in the macrophage lineage, the authors explore the consequences of introducing with the Crispr/Cas9 technique the biallelic SNP base change (C/C vs T/T) in a human IPSC line that is then differentiated into macrophages or microglia. They observe that C/C increases chromatin accessibility and CEBPb binding in comparison to T/T, with a slight decrease in PTK2B expression, significant in macrophages but not in microglia. The authors then investigate the transcriptome changes induced by the C/C mutation and find alteration in many genes, including a decreased expression of a number of cytokine or receptor proteins involved in inflammatory responses. The authors also mention a decreased effect on IFNg-induced reduced mobility but the data are missing (see Figure errors below). Overall the authors propose that the risk SNP is associated with a decreased PTK2B expression and hypothesize a link between this change and a decreased function of macrophages/microglia that may contribute to AD pathology.

      MAJOR COMMENTS

      1- The authors claim that their results show that the investigated SNP has causal effects in "microglial function" (Title) and in Alzheimer's disease (AD) (Abstract 2nd sentence "Here we validate a causal single nucleotide polymorphism (SNP) associated with an increased risk of Alzheimer's disease"). The word "causal" is repeated many times. However the authors should qualify their claim with respect to AD. Their results do show that the SNP has an effect on chromatin accessibility, CEBP binding, PTK2B expression and transcriptome, but the link between these changes is not formally demonstrated and their potential role in AD-like phenotype is not explored. The "causal" role is not formally and logically demonstrated. It remains an interesting, plausible hypothesis and the results provide strong arguments in support of that hypothesis but do not prove it, yet.

      Concerning the title, "causal effects on microglial function" is awkward, anything that has effects is logically "causal" in these effects. The title should be "... has effects on microglial functions" or "... alters microglial function".

      We agree with the reviewer that, given that the effect of the SNP on PTK2B expression in the edited lines is small and only significant in macrophages, we should not interpret the effects as being mediated solely through PTK2B expression, or as causing AD. We have substantially reworded the manuscript throughout to account for this.

      2- One major difficulty in the results is to link the slight decrease in PTK2B transcript, which is only significant in macrophages, with the rest of the phenotype. Because what matters to make this link is not the mRNA but the protein, and because mRNA levels are often not strictly correlated with the protein levels, the authors should measure the PTK2B/PYK2 protein levels in their differentiated cell lines in basal conditions and following activation (as they do for other readouts) using immunoblotting. A robust and significant diminution in PYK2 protein would strongly support its role in linking PTK2B expression and transcriptome change.

      We have performed preliminary analyses of PTK2B expression by Western blot in these cell lines after differentiation, but were unable to observe a significant change in abundance in the edited cell lines. This is not unexpected given the results at the RNA level, since the effect size of this common regulatory variant is likely very small (estimated to be ~1.2 fold from the eQTL analysis), and likely within the variability of this assay.

      As mentioned above, we have reworded the manuscript to avoid interpreting that the effects of rs28834970 are mediated solely through effects on PTK2B expression. We think that an experiment to manipulate PTK2B levels (see next point) may be a better way to demonstrate whether these effects are mediated through PTK2B expression.

      An optional additional key experiment would be to reverse the transcriptome phenotype by increasing the expression of PTK2B (e.g. by cDNA transfection). Note that these points are important because an alternative hypothesis to explain the effects of C/C mutation on macrophage function would be that the C/C mutation has a long distance effect on other chromatin regions with key role in regulating these cells.

      We agree that this would be a valuable experiment, and are planning additional experiments to investigate the effect of manipulating PTK2B levels (through knockout) on microglia.

      3- The manuscript contains several errors in the figures and figure legends. In Fig. 2 the legends for the figure items are shuffled. Figure 4 and Supplementary Figure 5 are duplicates of the same one. Consequently important data are not presented.

      We apologise for the errors in these figures that were due to a mistake during uploading where the incorrect versions were used. The legends for figure 2 and panels in figure 4 have now been corrected, and show the effects of rs28834970 on microglial migration and chemokine release in the presence or absence of IFNg.

      4- When the number of replicates is small (e.g. n = 3) it is preferable to use non parametric tests (rank analysis, e.g. Mann Whitney's test) rather than t test. This applies to Figures 2D (current legend 2A), 2E (current legend 2B), Figure 4A-C, Supplementary Figures 2A, 2B. In Supplementary Fig 4E (MARCO) the number of replicates (presumably 3 because based on RNAseq) and the used test are not indicated. Is it the RNAseq statistical analysis?

      We thank the reviewer for this comment. We acknowledge that the t-test may lead to inflated false discovery rates. However, it has been shown that for small sample sizes parametric tests have a power advantage compared to non-parametric ones that may outweigh the possibly exaggerated false positives. See https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02648-4#Sec3 which states:

      "In conclusion, when the per-condition sample size is less than 8, parametric methods may be used because their power advantage may outweigh their possibly exaggerated false positives."

      We have also modified the legend of supplementary figure 4E to clarify the number of replicates used.
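      The trade-off discussed in this exchange can be made concrete with a small sketch. It is not taken from the manuscript: the replicate values are hypothetical, and the exact permutation form of the Mann-Whitney test is implemented by hand (assuming no tied values) rather than taken from the authors' analysis. With n = 3 per group, even perfectly separated groups give a two-sided exact p-value of 2/20 = 0.1, so a rank test cannot reach p < 0.05 at this sample size.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U for x: number of (xi, yj) pairs with xi > yj (assumes no ties)."""
    return sum(1 for xi in x for yj in y if xi > yj)


def exact_mann_whitney_p(x, y):
    """Two-sided exact p-value, obtained by enumerating every relabelling of the pooled data."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    centre = n1 * n2 / 2  # expected value of U under the null hypothesis
    observed = abs(u_statistic(x, y) - centre)
    extreme = total = 0
    for idx in combinations(range(n1 + n2), n1):
        chosen = set(idx)
        g1 = [pooled[i] for i in chosen]
        g2 = [pooled[i] for i in range(n1 + n2) if i not in chosen]
        total += 1
        if abs(u_statistic(g1, g2) - centre) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical, perfectly separated replicates (illustrative values only).
t_allele = [1.0, 1.1, 1.2]
c_allele = [2.0, 2.1, 2.2]
print(exact_mann_whitney_p(t_allele, c_allele))  # 0.1
```

      With four replicates per group the same calculation gives 2/70, which is below 0.05. This is one way to see why the choice of test matters so much at n = 3.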

      5- In addition to the above comment on tests, when the number of replicates is small it is not appropriate (and misleading) to show box plots or bars with SEM. In the indicated figures the individual data points should be shown.

      We now show individual replicates on box plots (Figure 2D, 2E and supp figure 4E).

      MINOR COMMENTS:

      a- Macrophages and microglia are very similar cell types. Could the authors comment more on the differences they observe and how they are related to those previously described?

      We have now referenced the original papers and commented on the markers that we see differentially expressed, notably P2RY12 which is a key homeostatic microglia marker that distinguishes these cells from macrophages.

      b- In Fig. 2A CEBPb cut and run plot, the differences are not limited to the SNP immediate vicinity, there are also visible differences between T/T and C/C plots in at least a 40-kb range. Is it due to multiple interactions of CEBPb? How can the point difference have broad consequences? Please explain this potentially interesting and relevant finding.

      Whilst there may be small changes in CEBPb binding at the second intronic PTK2B chromatin peak, this is not statistically significant given the variability between repeats. In fact, the only significant change we see in CEBPb binding genome-wide is at the locus overlapping the SNP (Fig 2c).

      c- Potentially cis-altered genes near the SNP include CHRNA2 and EPHX2 (see Sup. Fig. 3a). Their expression may not be detected in macrophage lineage. If this is the case please indicate in the text, otherwise please include the corresponding data in Sup. Fig. 3b to show the presence or absence of SNP-induced change.

      You are correct that CHRNA2 and EPHX2 are not expressed in our macrophages or microglia, and we have now explicitly stated this in the revised text.

      d- In general the Figures are not of very high quality and are difficult to read or understand without constantly going back and forth to the legends (which are mislabeled in some instances). To improve:

      . Please increase font size whenever possible.

      . Please improve Fig. 1d by indicating the position of the SNP, numbering the exons (an intermediate scale plot may be necessary and lines on bottom trace are hardly visible).

      . Please indicate the correct color code for T/T and C/C in Fig 3a and b, left panels, which currently doesn't match.

      . Please label the Venn's diagrams comparisons in Sup. Fig. 4b.

      . In the text and legends the Figure items are identified with letters in upper case, in the figures they are in lower case. Please be consistent.

      We have improved the resolution of the images in the pdf and Fig 1d has been revised to include the position of the SNP. The colour code for T/T and C/C is correct in fig 3a and 3b, but since the PCA plots are independently created, we would not always expect the position of the T/T and C/C alleles to be the same. The Venn diagrams in Sup Fig 4b have been updated, and the letters for the figure panels made consistently upper case throughout.

      e- In Fig. 2D and 2E, the Y axes should start at zero to avoid artificially increasing the visual differences. If there is a strong reason not to do so (I don't see any here), the Y axis should be clearly interrupted to avoid confusion.

      We have altered this accordingly.

      f- In the introduction the authors provide some background about previous work about the potential role of PTK2B/PYK2 in AD pathophysiology. The cited preclinical results suggest that PTK2B activity could have a deleterious effect (references in the manuscript). In contrast, some other reports (PMID: 29803828, 33718872) suggest a protective effect of PTK2B/PYK2. Because the evidence in the current manuscript suggests that the risk-associated SNP results in a decreased function of PTK2B/PYK2 (through decreased levels), at least in cells of the macrophage lineage, the authors could broaden their discussion to include these results.

      We have now discussed the conflicting evidence in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      ADVANCE: Late onset Alzheimer's disease is a major medical issue. It has a complex genetic risk component with many associated loci identified in GWAS. Most of these have only a small individual impact on the risk. One of the SNPs associated with increased risk (rs28834970) is located in an intron of the PTK2B gene. Although various reports have investigated the role of the PTK2B gene product, the tyrosine kinase PYK2, in several AD models, the possible link with rs28834970, is unclear.

      An important point is to determine whether T→C SNP corresponding to rs28834970 alters PTK2B expression and how it does so. An alternative hypothesis could be that the SNP has a strong linkage disequilibrium with an unidentified allele in human populations that could be responsible for AD risk. The current manuscript is a significant step forward in addressing that question. By generating a biallelic C/C SNP mutation in a human IPSC line the current study allows to eliminate such linked contribution.

      The strength of the manuscript is to show an effect on chromatin accessibility, CEBP binding and possibly PTK2B transcripts. It also provides interesting evidence of a broad effect of the C/C mutation on the transcriptome of macrophage lineage cells. In its current form the manuscript presents weaknesses that could be improved. These flaws include issues with the presentation discussed above and the incomplete demonstration that it is the decrease in PTK2B expression that causes the macrophage/microglia phenotype. If these flaws were overcome the paper would represent a significant advance.

      AUDIENCE: The expected audience is specialized in AD with a possible broader range if all weaknesses are addressed.

      REVIEWER EXPERTISE: Basic science close to the field.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Response to Reviewer #1:

      We agree with Reviewer 1 that a function of ROPGEFs in this process was expected to some degree. However, we want to point out that this manuscript focuses on the requirement of ROPGEFs and especially the spatio-temporal description of ROP signalling polarisation and activation during pollen germination. Moreover, unlike the downstream ROPs, we show that ROPGEFs do not act in a strictly redundant manner, confirming results from root hair initiation and providing additional evidence that multiple signalling pathways are required for pollen germination and that ROPGEFs might be essential for bringing specificity to these signals.

      Major comments:

      1. Only one GEF11 mutant line, gef11-t1, was analyzed for germination ratio. It is presumptuous to conclude that GEF11 has no function in the pollen germination of Arabidopsis thaliana (line 241- line 242).

      After the initial negative results, we did not focus on GEF11 further. Thus, we fully agree that it is presumptuous to make such strong statements about the role of GEF11 during pollen germination. We generated additional gef11 mutant alleles for this revision plan using CRISPR/Cas9 as no other suitable lines were available. Moreover, we now have additional higher-order mutants available to demonstrate the function of GEF11 during pollen germination. These additional lines were generated and confirmed and are growing right now. Thus, we will be able to add new results addressing this point in a timely manner, allowing us to make a better-founded statement about the function of GEF11 (see Response to Reviewer #2).

      Minor comments:

      1. In Figure 2A, pollen germination ratio was not provided for the single mutants gef8-c△3 and gef9-c△

      This is due to the generation process of the CRISPR/Cas9 alleles. These alleles were generated by a construct mutating both genes simultaneously; thus, these mutants are unavailable as single mutant lines. Instead of separating these alleles by outcrossing, we included additional single mutant alleles for both GEFs with a similar deletion. As all these CRISPR/Cas9 mutants have a complete deletion of the GEF-ORF, we are sure about the loss of the corresponding GEF function. Additional alleles account for possible unspecific effects.

      In Figure 3D, the subcellular localization of GEF12GEF8C is fuzzy. Better imaging is needed.

      We agree that the quality of these images is not ideal due to this specific line having less fluorescent signal. We screened for more lines of this construct and already performed more experiments. We will provide better images for this genotype.

      In Figure 3E, it is intriguing that both GEF8-S518A and GEF8-S518D are not associated with the PM in germinating pollen grains. Does it mean that phosphorylation at S518 is not relevant to polar distribution of GEF8?

      We also find this very intriguing as we did not expect this result. However, we interpret it slightly differently in the way that the S518 site is relevant for GEF polarisation, which might be conferred by RLK interaction. We think both mutant forms alter this potential association with RLKs, thus losing polarisation. We will include more imaging experiments of these constructs and additional lines to strengthen our results. Moreover, we generated lines to study these lines' functionality and complementation capacity, which will be included in a revised manuscript.

      T-DNA insertion lines, gef11-t1 and gef12-t1, need to be verified by PCRs in Figure S3D.

      Thanks for pointing this out. This control should be provided, and we will include the verification in the supplement.

      Response to Reviewer #2:

      Like Reviewer #2, we are also very intrigued by the biphasic accumulation of GEFs, as this is an entirely novel feature of this process. Like Reviewer #2, we also interpret this as an exploration and establishment phase, which could help us to understand how the pollen germination site is decided in species without aperture-dependent pollen germination.

      Major comments:

      1. In line 241, the authors conclude that GEF11 has no function in pollen germination. However, it is likely that GEF11 also plays a redundant role as GEF12 does. I recommend the authors check the phenotypes of gef11,gef12 double mutant and gef8,gef9,gef11 triple mutant to confirm that GEF11 has indeed no function. Otherwise, this conclusion should be better rephrased.

      This point is well justified and similar to the comment of Reviewer #1. As stated before, we had to generate additional lines for this. We will analyse an additional gef11 allele, gef8/gef11 and gef9/11 double mutants, and gef9/11/12 triple mutants to address the function of GEF11 in more detail. The conclusions of the original manuscript will, of course, be adjusted according to the new results.

      Although GEF12 is in the cytosol, the strong pollen germination defects in gef8,gef9,gef12 triple mutants do indicate a critical role of GEF12. Is it possible that GEFs could function in the cytosol? The authors can test this possibility by examining the rescuing ability of several constructs that express, for example, GEF12, GEF12(+GEF8C), GEF8(SA), or GEF8(SD) in gef8. The authors may not perform all of these rescue experiments, but some of the mentioned lines are already in hands. They could readily check the phenotypes.

      We thank the Reviewer for this great point. This information is crucial to discriminate the function of the individual GEFs. We have generated new lines expressing some of the mentioned constructs in the gef8 background to address this. We now have lines that complement gef8 with GEF12, GEF12GEF8C, GEF8S518A, GEF8S518D, and GEF8ΔC. We are currently performing experiments which determine the functionality of these constructs, which will allow us to make more conclusive statements about the function of GEFs in the cytosol and how important the PRONE domain alone, or the membrane attachment of GEFs, is for their function.

      The authors conclude that the C-terminus of GEF8 and GEF9 is necessary and sufficient for membrane localization because GEF8/9C can target GEF12 PRONE domain to the membrane. It is intriguing whether the C-terminus alone could confer membrane targeting ability. Currently, it is not fully understood how GEFs localize to the membrane. Examining the localization of GEF8/9C itself would help clarify this and improve our understanding of GEF regulation. Alternatively, the authors may discuss evidence that supports or disagrees with this possibility.

      This is a good suggestion by the reviewer and indeed intriguing if the C-Terminus alone could confer membrane attachment. Meanwhile, we obtained plants expressing such constructs, showing that the C-terminus alone is insufficient for membrane attachment. This is not surprising, as these domains are largely disordered, and we suspect that the context of an adjacent PRONE domain is required to carry out this function. We will include our new results in the revised manuscript.

      Minor comments:

      1. The N- and C-terminus of GEF8 are predicted to inhibit complex formation. How is the prediction performed? Do the authors use monomer prediction or multimer prediction? Alphafold2 has a low accuracy in predicting non-conserved regions. How confident are the predicted inhibitory contacts?

      We used multimer-prediction of Alphafold2 for the shown structures. However, we fully agree that the predicted structures of Alphafold have low accuracy in that regard, especially for disordered domains like this. We will provide confidence models and predicted aligned error (PAE) plots for this structure. Additionally, we will put our conclusions in a better perspective of these structure confidences and tone down our interpretations of this section.

      Localization of ROPs and calcium reporter in Figure 4 appears to be variable. It would help clarify the specific effects on each reporter if the authors present these data more quantitatively.

      We agree with the reviewer that some of the observations are variable. We will present the data more quantitatively, including overviews of the percentage of cells in which we observed the described phenomena and a more quantitative analysis of the strength and timing of signal accumulation (see also Response to Reviewer #3).

      Response to Reviewer #3:

      Major points:

      1. One of my major points is that the manuscript is now mainly based on the observations of individual pollen grains. These are then subjected to well-performed image analysis approaches but still represent somewhat anecdotal evidence (Fig 1A, B, Fig 3C-E, etc). The analysis and (numerical) presentation of a more robust data sample (which I presume the authors have acquired) would strengthen the ms considerably. This goes beyond the Figs - e.g. in l. 164-165 authors state rather vaguely, "we observed that mCit-GEF8 and mCit-GEF9 accumulated at a defined region in the cell periphery, which strongly correlated with the future germination site." Here, I would appreciate the data showing the actual correlation, if every germinated pollen grain displays GEF8/9 accumulation, whether there is a population of pollen grains showing the GEF8/9 transient but not germinating, etc...

      We very much appreciate the reviewer's comment, as this version of the manuscript indeed seems like we made our conclusions based on observations made from individual pollen. However, this is not the case. As the reviewer suspected, more data is available but not included in the manuscript. We have multiple observations for each of the shown constructs and only show a representative one. Furthermore, we imaged more pollen germination events of lines that showed variability and included additional lines for some constructs. We will provide a more quantitative analysis of the results to better represent the variability of the individual constructs, and we will adjust the manuscript accordingly (see comment 2).

      Where the authors analyse multiple cells, we are still missing some info - e.g. it is not stated what the error bars in Fig 1C, D represents (SD, SEM, CI?), size of the sample, etc. In any case, it is evident that there is quite substantial variability in the data, which is understandable. Maybe the authors can plot the individual profile lines along the average? Plus, GEF9 seem to have the maximum pre-germination localisation at -5 min rather than -9 min.

      We agree with the Reviewer that information is missing or not obviously stated. We will correct this for the revised manuscript. Moreover, we agree that the suggested way of showing the data would provide more information and allow a better representation of the results and their variability. This can be seen in the reviewer's interpretation of the results of GEF9. In this case, we see some variability in the timing of GEF9 accumulation, leading to the peak maximum shift. In a revised manuscript, we will, as suggested, show the data as individual lines, providing a better representation of the data. Moreover, we will include such representations for other used constructs to provide a general, more quantitative data analysis (see comment 1).

      I know it is very challenging, but the ms would be much stronger with the in vivo imaging of pollen germination on stigmatic papillae (i) GEF8/9 in wt, (ii) gef8/9 double mutant. This would bring crucial data about the role of the GEF polar domain and its functional relation to pollination.

      This would indeed be great to see. We put an effort into establishing such in vivo imaging experiments with our fluorescent markers. However, we cannot image these events in an in vivo setup (at least with our resources). This has two reasons: 1. The events are very fast and limited to a small region at the pollen-papilla contact site, which we have issues resolving spatially and temporally. 2. The used marker lines only have a low fluorescent level due to the native promoter, and stronger expression would lead to overexpression artefacts. Even in vitro, it is difficult to see the observed signal accumulation. In the in vivo situation, we are facing additional diffraction of the papilla cells, which would make the observation of GEF accumulation impossible with our microscopes.

      The phylogeny presented in Fig S1 is only rudimental and not very interesting. Given the author's results, I would love to see if GEF8/9 orthologs also exist in species with defined pollen apertures (where establishing a dynamic site makes little sense). The authors touch on this (L409-411), but it would deserve better analysis and discussion.

      We agree with the reviewer that studying GEF function/accumulation in species with aperture-dependent germination would be interesting. However, we cannot infer functional orthologs in other species based on phylogeny. Such phylogenetic analyses were done, for example, by Kim et al. (BMC Plant Biology, 2020, doi: 10.1186/s12870-020-2298-5). The issue is that all Arabidopsis pollen-expressed GEFs form a closed phylogenetic group without allowing the interpretation of which rice homolog is the functional ortholog of the respective Arabidopsis GEF (this is the same for maize). Thus, such phylogenetic analyses are not conclusive, and they would require experimental data to prove orthology. However, we agree that this point can be interpreted and discussed better, and we will include this in the revised manuscript.

      I am not entirely convinced by the authors' interpretation of rather strange S518 mutation data. Could S518A mutation affect overall GEF8 structure/stability?

      We were also suspicious about these results, as they were unexpected (see also Response to Reviewer #1). To confirm these results, we made additional lines for these constructs, double-checked that the constructs were correct and made more observations for both GEF8S518A and GEF8S518D. Additionally, we started investigating the functionality of these constructs and will have these data available shortly. Preliminary results suggest that the constructs are partially to fully functional compared to the WT GEF8, arguing against an effect of these mutations on structure or stability. We will include more data for these constructs in a revised manuscript to allow a more conclusive interpretation of these unexpected observations.

      Although the authors cannot observe the localisation of ROPs in the plasma membrane, they see the apparent accumulation of active ROP marker CRIB4 there - implying that ROPs must localise to the pollen PM at the germination site. This discrepancy should be solved or at least discussed more.

      The reviewer is correct in that we cannot observe ROP accumulation but rather the accumulation of ROP activity (as seen by CRIB4). This is in line with the observation made by Xiang et al. (2023, Plant Physiology, doi: 10.1093/plphys/kiad196), which also cannot find ROP accumulation. We are convinced that ROPs are present at the plasma membrane of the pollen germination site, but no accumulation is observable. We believe this is due to a high mobility of ROPs and that no accumulation is required, as only a few ROPs are sufficient to activate downstream signals. We will discuss these results in more detail in a revised manuscript to better explain the observed discrepancy.

      Given that calcium oscillates very rapidly in pollen and pollen tubes (with frequency ~6-20s), the profound, long-term changes in calcium levels reported by the authors can hardly be referred to as oscillations. The phenomenon observed should again be analysed using a bigger sample.

      We agree that the terminology is not good, as it suggests similarities to the oscillations found in pollen tubes. Thus, we will change the revised manuscript and refer to the changes in Ca2+ levels as “elevations”. Moreover, we will provide a more quantitative analysis and a bigger sample size, as stated in Response to Reviewer #2.

      Minor points:

      1. In Fig 1F, GEF12 also seems to be polarly localised to the future site.

      The chosen sample is not ideal, as it looks like GEF12 would also slightly accumulate. However, as seen in the quantification of this cell, GEF12 does not significantly accumulate at the pollen germination site, and we never observed any accumulation of GEF12 that is comparable to GEF8 or GEF9. We will include another sample of this colocalisation in the revised manuscript to avoid misinterpretation of the data.

      It is difficult to make any assumptions based on the AlphaFold2 predictions without showing their confidence assessments (e.g., PAE plots). The authors state this themselves in the discussion (L. 447-449).

      As the Response to Reviewer #2 stated, we will include structures with confidence values and PAE plots in the supplement. We additionally tone down our interpretation of these structure predictions to make clear that these structures should be interpreted carefully.

      On one hand the authors repeatedly state that pollen GEFs do not act in a redundant manner (and provide some evidence for it), on the other hand the absence of an in vivo phenotype for single and double knockout lines and only mild phenotype for a triple ko line does suggest a level of redundancy. This should be rephrased.

      We agree that this is not clearly phrased. In a revised version, we will change the manuscript to indicate which type and level of redundancy are described. We will discriminate between genetic redundancy, as seen in the mild in vivo effects, and non-redundant molecular function, as observed by protein localisation.

    1. Reviewer #1 (Public Review):

      Summary:

      The novel advance by Wang et al is in the demonstration that, relative to a standard extinction procedure, the retrieval-extinction procedure more effectively suppresses responses to a conditioned threat stimulus when testing occurs just minutes after extinction. The authors provide some solid evidence to show that this "short-term" suppression of responding involves engagement of the dorsolateral prefrontal cortex.

      Strengths:

      Overall, the study is well-designed and the results are potentially interesting. There are, however, a few issues in the way that it is introduced and discussed. Some of the issues concern clarity of expression/communication. However, others relate to a theory that could be used to help the reader understand why the results should have come out the way that they did. More specific comments and questions are presented below.

      Weaknesses:

      INTRODUCTION & THEORY

      (1) Can the authors please clarify why the first trial of extinction in a standard protocol does NOT produce the retrieval-extinction effect? Particularly as the results section states: "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." The importance of this point comes through at several places in the paper:

      1A. "In the current study, fear recovery was tested 30 minutes after extinction training, whereas the effect of memory reconsolidation was generally evident only several hours later and possibly with the help of sleep, leaving open the possibility of a different cognitive mechanism for the short-term fear dementia related to the retrieval-extinction procedure." ***What does this mean? The two groups in study 1 experienced a different interval between the first and second CS extinction trials; and the results varied with this interval: a longer interval (10 min) ultimately resulted in less reinstatement of fear than a shorter interval. Even if the different pattern of results in these two groups was shown/known to imply two different processes, there is absolutely no reason to reference any sort of cognitive mechanism or dementia - that is quite far removed from the details of the present study.

      1B. "Importantly, such a short-term effect is also retrieval dependent, suggesting the labile state of memory is necessary for the short-term memory update to take effect (Fig. 1e)." ***As above, what is "the short-term memory update"? At this point in the text, it would be appropriate for the authors to discuss why the retrieval-extinction procedure produces less recovery than a standard extinction procedure as the two protocols only differ in the interval between the first and second extinction trials. References to a "short-term memory update" process do not help the reader to understand what is happening in the protocol.

      (2) "Indeed, through a series of experiments, we identified a short-term fear amnesia effect following memory retrieval, in addition to the fear reconsolidation effect that appeared much later." ***The only reason for supposing two effects is because of the differences in responding to the CS2, which was subjected to STANDARD extinction, in the short- and long-term tests. More needs to be said about how and why the performance of CS2 is affected in the short-term test and recovers in the long-term test. That is, if the loss of performance to CS1 and CS2 is going to be attributed to some type of memory updating process across the retrieval-extinction procedure, one needs to explain the selective recovery of performance to CS2 when the extinction-to-testing interval extends to 24 hours. Instead of explaining this recovery, the authors note that performance to CS1 remains low when the extinction-to-testing interval is 24 hours and invoke something to do with memory reconsolidation as an explanation for their results: that is, they imply (I think) that reconsolidation of the CS1-US memory is disrupted across the 24-hour interval between extinction and testing even though CS1 evokes negligible responding just minutes after extinction.

      (3) The discussion of memory suppression is potentially interesting but, in its present form, raises more questions than it answers. That is, memory suppression is invoked to explain a particular pattern of results but I, as the reader, have no sense of why a fear memory would be better suppressed shortly after the retrieval-extinction protocol compared to the standard extinction protocol; and why this suppression is NOT specific to the cue that had been subjected to the retrieval-extinction protocol.

      3A. Relatedly, how does the retrieval-induced forgetting (which is referred to at various points throughout the paper) relate to the retrieval-extinction effect? The appeal to retrieval-induced forgetting as an apparent justification for aspects of the present study reinforces points 2 and 3 above. It is not uninteresting but needs some clarification/elaboration.

      (4) Given the reports by Chalkia, van Oudenhove & Beckers (2020) and Chalkia et al (2020), some qualification needs to be inserted in relation to reference 6. That is, reference 6 is used to support the statement that "during the reconsolidation window, old fear memory can be updated via extinction training following fear memory retrieval". This needs a qualifying statement like "[but see Chalkia et al (2020a and 2020b) for failures to reproduce the results of 6]."

      https://pubmed.ncbi.nlm.nih.gov/32580869/
      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7115860/

      CLARIFICATIONS, ELABORATIONS, EDITS

      (5) The Abstract was not easy to follow:

      5A. What does it mean to ask: "whether memory retrieval facilitates update mechanisms other than memory reconsolidation"? That is, in what sense could or would memory retrieval be thought to facilitate a memory update mechanism?

      5B. "First, we demonstrate that memory reactivation prevents the return of fear shortly after extinction training in contrast to the memory reconsolidation effect which takes several hours to emerge and such a short-term amnesia effect is cue independent (Study 1, N = 57 adults)." ***The phrasing here could be improved for clarity: "First, we demonstrate that the retrieval-extinction protocol prevents the return of fear shortly after extinction training (i.e., when testing occurs just minutes after the end of extinction)." Also, cue-dependence of the retrieval-extinction effect was assessed in study 2.

      5C. "Furthermore, memory reactivation also triggers fear memory reconsolidation and produces cue-specific amnesia at a longer and separable timescale (Study 2, N = 79 adults)." ***In study 2, the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction. This result is interesting but cannot be easily inferred from the statement that begins "Furthermore..." That is, the results should be described in terms of the combined effects of retrieval and extinction, not in terms of memory reactivation alone; and the statement about memory reconsolidation is unnecessary. One can simply state that the retrieval-extinction protocol produced a cue-specific disruption in responding when testing occurred 24 hours after the end of extinction.

      5D. "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that both memory retrieval and intact prefrontal cortex functions were necessary for the short-term fear amnesia."
      ***This could be edited to better describe what was shown: E.g., "...we directly manipulated brain activities in the dorsolateral prefrontal cortex and found that intact prefrontal cortex functions were necessary for the short-term fear amnesia after the retrieval-extinction protocol."

      5E. "The temporal scale and cue-specificity results of the short-term fear amnesia are clearly dissociable from the amnesia related to memory reconsolidation, and suggest that memory retrieval and extinction training trigger distinct underlying memory update mechanisms."
      ***The pattern of results when testing occurred just minutes after the retrieval-extinction protocol was different from that obtained when testing occurred 24 hours after the protocol. Describing this in terms of temporal scale is unnecessary, and suggesting that memory retrieval and extinction trigger different memory update mechanisms is not obviously warranted. The results of interest are due to the combined effects of retrieval+extinction and there is no sense in which different memory update mechanisms should be identified with retrieval (mechanism 1) and extinction (mechanism 2).

      5F. "These findings raise the possibility of concerted memory modulation processes related to memory retrieval..."
      ***What does this mean?

      (6) "...suggesting that the fear memory might be amenable to a more immediate effect, in addition to what the memory reconsolidation theory prescribes..."
      ***What does it mean to say that the fear memory might be amenable to a more immediate effect?

      (7) "Parallel to the behavioral manifestation of long- and short-term memory deficits, concurrent neural evidence supporting memory reconsolidation theory emphasizes the long-term effect of memory retrieval by hypothesizing that synapse degradation and de novo protein synthesis are required for reconsolidation."
      ***This sentence needs to be edited for clarity.

      (8) "previous behavioral manipulations engendering the short-term declarative memory effect..."
      ***What is the declarative memory effect? It should be defined.

      (9) "The declarative amnesia effect emerges much earlier due to the online functional activity modulation..."
      ***Even if the declarative memory amnesia effect had been defined, the reference to online functional activity modulation is not clear.

      (10) "However, it remains unclear whether memory retrieval might also precipitate a short-term amnesia effect for the fear memory, in addition to the long-term prevention orchestrated by memory consolidation."
      ***I found this sentence difficult to understand on my first pass through the paper. I think it is because of the phrasing of memory retrieval. That is, memory retrieval does NOT precipitate any type of short-term amnesia for the fear memory: it is the retrieval-extinction protocol that produces something like short-term amnesia. Perhaps this sentence should also be edited for clarity.

      I will also note that the usage of "short-term" at this point in the paper is quite confusing: Does the retrieval-extinction protocol produce a short-term amnesia effect, which would be evidenced by some recovery of responding to the CS when tested after a sufficiently long delay? I don't believe that this is the intended meaning of "short-term" as used throughout the majority of the paper, right?

      (11) "To fully comprehend the temporal dynamics of the memory retrieval effect..."
      ***What memory retrieval effect? This needs some elaboration.

      (12) "We hypothesize that the labile state triggered by the memory retrieval may facilitate different memory update mechanisms following extinction training, and these mechanisms can be further disentangled through the lens of temporal dynamics and cue-specificities."
      ***What does this mean? The first part of the sentence is confusing around the usage of the term "facilitate"; and the second part of the sentence that references a "lens of temporal dynamics and cue-specificities" is mysterious. Indeed, as all participants received the same retrieval-extinction exposures in Study 2, it is not clear how or why any differences between the groups are attributed to "different memory update mechanisms following extinction".

      (13) "In the first study, we aimed to test whether there is a short-term amnesia effect of fear memory retrieval following the fear retrieval-extinction paradigm."
      ***Again, the language is confusing. The phrase, "a short-term amnesia effect" implies that the amnesia itself is temporary; but I don't think that this implication is intended. The problem is specifically in the use of the phrase "a short-term amnesia effect of fear memory retrieval." To the extent that short-term amnesia is evident in the data, it is not due to retrieval per se but, rather, the retrieval-extinction protocol.

      (14) The authors repeatedly describe the case where there was a 24-hour interval between extinction and testing as consistent with previous research on fear memory reconsolidation. Which research exactly? That is, in studies where a CS re-exposure was combined with a drug injection, responding to the CS was disrupted in a final test of retrieval from long-term memory which typically occurred 24 hours after the treatment. Is that what the authors are referring to as consistent? If so, which aspects of the results are consistent with those previous findings? Perhaps the authors mean to say that, in the case where there was a 24-hour interval between extinction and testing, the results obtained here are consistent with previous research that has used the retrieval-extinction protocol. This would clarify the intended meaning greatly.

      DATA

      (15) Points about data:

      15A. The eight participants who were discontinued after Day 1 in study 1 were all from the no-reminder group. Can the authors please comment on how participants were allocated to the two groups in this experiment so that the reader can better understand why the distribution of non-responders was non-random (as it appears to be)?

      15B. Similarly, in study 2, of the 37 participants that were discontinued after Day 2, 19 were from Group 30 min, and 5 were from Group 6 hours. Can the authors comment on how likely these numbers are to have been by chance alone? I presume that they reflect something about the way that participants were allocated to groups, but I could be wrong.

      15C. "Post hoc t-tests showed that fear memories were resilient after regular extinction training, as demonstrated by the significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group (t26 = 7.441, P < 0.001; Fig. 1e), while subjects in the reminder group showed no difference of fear recovery between CS+ and CS- (t29 = 0.797, P = 0.432, Fig. 1e)."
      ***Is the fear recovery index shown in Figure 1E based on the results of the 1st test trial only? How can there have been a "significant difference between fear recovery indexes of the CS+ and CS- for the no-reminder group" when the difference in responding to the CS+ and CS- is used to calculate the fear recovery index shown in 1E? What are the t-tests comparing exactly, and what correction is used to account for the fact that they are applied post-hoc?

      15D. "Finally, there is no statistical difference between the differential fear recovery indexes between CS+ in the reminder and no reminder groups (t55 = -2.022, P = 0.048; Fig. 1c, also see Supplemental Material for direct test for the test phase)."
      ***Is this statement correct - i.e., that there is no statistically significant difference in fear recovery to the CS+ in the reminder and no reminder groups? I'm sure that the authors would like to claim that there IS such a difference; but if such a difference is claimed, one would be concerned by the fact that it is coming through in an uncorrected t-test, which is the third one of its kind in this paragraph. What correction (for the Type 1 error rate) is used to account for the fact that the t-tests are applied post-hoc? And if no correction, why not?

      15E. In study 2, why is responding to the CS- so high on the first test trial in Group 30 min? Is the change in responding to the CS- from the last extinction trial to the first test trial different across the three groups in this study? Inspection of the figure suggests that it is higher in Group 30 min relative to Groups 6 hours and 24 hours. If this is confirmed by the analysis, it has implications for the fear recovery index which is partly based on responses to the CS-. If not for differences in the CS- responses, Groups 30 minutes and 6 hours are otherwise identical.

      15F. Was the 6-hour group tested at a different time of day compared to the 30-minute and 24-hour groups; and could this have influenced the SCRs in this group?

      15G. Why is the range of scores in "thought control ability" different in the 30-minute group compared to the 6-hour and 24-hour groups? I am not just asking about the scale on the x-axis: I am asking why the actual distribution of the scores in thought control ability is wider for the 30-minute group?
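
      To illustrate the correction question raised in 15C and 15D, the following minimal sketch applies a Holm-Bonferroni adjustment to the three p-values quoted above. Treating the three t-tests as a single family is an assumption made here, as is substituting 0.001 for the reported "P < 0.001". The function name `holm_adjust` is hypothetical, and nothing is known about the authors' actual analysis pipeline.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# p-values as quoted in 15C and 15D; 0.001 is an assumed upper bound
# standing in for the reported "P < 0.001".
reported = {
    "no-reminder group, CS+ vs CS-": 0.001,
    "reminder group, CS+ vs CS-": 0.432,
    "reminder vs no-reminder": 0.048,
}
adjusted = holm_adjust(list(reported.values()))

for label, p_adj in zip(reported, adjusted):
    print(f"{label}: Holm-adjusted p = {p_adj:.3f}")
```

      Under these assumptions, the between-group comparison (P = 0.048) adjusts to 0.096 and would no longer fall below a 0.05 threshold, which is the kind of outcome the question about Type 1 error control is probing.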

      (16) During testing in each experiment, how were the various stimuli presented? That is, was the presentation order for the CS+ and CS- pseudorandom according to some constraint, as it had been in extinction? This information should be added to the method section.

      (17) "These results are consistent with previous research which suggested that people with better capability to resist intrusive thoughts also performed better in motivated dementia in both declarative and associative memories."
      ***Which parts of the present results are consistent with such prior results? It is not clear from the descriptions provided here why thought control ability should be related to the present findings or, indeed, past ones in other domains. This should be elaborated to make the connections clear.
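
      The chance question raised in 15B can be sketched as a chi-square goodness-of-fit test. The sketch assumes that the 13 discontinued participants not accounted for in that comment (37 - 19 - 5) all belonged to the 24-hour group, and that the three groups enrolled equal numbers of participants. Neither assumption is stated in the review, and the function names are hypothetical.

```python
import math


def chi_square_gof_equal(observed):
    """Chi-square statistic against equal expected counts per category."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_sf_df2(statistic):
    """Survival function of the chi-square distribution with 2 df.

    For exactly 2 degrees of freedom this has the closed form exp(-x / 2).
    """
    return math.exp(-statistic / 2.0)


# Discontinued participants per group: 30 min, 6 hours, 24 hours.
# The 24-hour count (13) is inferred, not reported (assumption).
excluded = [19, 5, 13]

stat = chi_square_gof_equal(excluded)
p_value = chi_square_sf_df2(stat)
print(f"chi-square = {stat:.2f}, df = 2, p = {p_value:.4f}")
```

      Under these assumptions the statistic is 8.0 with a p-value of about 0.018. The figure would change if the groups differed in enrolled size, so the actual allocation numbers would be needed to answer the question properly.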

    2. Reviewer #3 (Public Review):

      SUMMARY

      Wang et al. have addressed how acquired fear and extinction memories evolve over time. Using a retrieval-extinction procedure in healthy humans, they have investigated the recovery of fear memories 30-60 minutes, 6 hours, and 24 hours after the retrieval-extinction phase. They have addressed this research question through 3 different experiments which included manipulations of the reminder cue, the time interval, and brain activity. Together, the studies suggest that early on after retrieval-extinction (30-60 min. later), retrieval-extinction may lead to an attenuation of fear recovery (after reinstatement) for all fear cues, including the non-reminded ones. Study 3 moreover suggests that this effect may depend on normal dlPFC function. In addition, the paper also contains data in line with prior findings suggesting that a 6-hour interval does not benefit from the reminder cue, and that a 24-hour interval does, and specifically for the reminded fear cue. The latter findings are seen as evidence of fear memory reconsolidation.

      STRENGTHS

      (1) The paper combines three related human fear conditioning studies, each with decent sample sizes. The authors are transparent about the fact that they excluded many participants and about which conditions they belonged to.

      (2) The effect that this paper investigates (short-term fear memory after a retrieval-extinction procedure) has not been studied extensively, thus making it a relevant topic.

      (3) The application of brain stimulation as a means to study causal relationships is interesting and goes beyond the purely behavioral or pharmacological interventions that are often used in human fear conditioning research. Also, the use of an active control stimulation is a strength of the study.

      WEAKNESSES

      (1) The entire study hinges on the idea that there is memory 'suppression' if (1) the CS+ was reminded before extinction and (2) the reinstatement and memory test takes place 30 minutes later (in Studies 1 & 2). However, the evidence supporting this suppression idea is not very strong. In brief, in Study 1, the effect seems to only just reach significance, with a medium effect size at best, and, moreover, it is unclear if this is the correct analysis (which is a bit doubtful, when looking at Figure 1D and E). In Study 2, there was no optimal control condition without reminder and with the same 30-min interval (which is problematic, because we can assume generalization between CS1+ and CS2+, as pointed out by the authors, and because generalization effects are known to be time-dependent). Study 3 is more convincing, but entails additional changes in comparison with Studies 1 and 2, i.e., applications of cTBS and an interval of 1 hour instead of 30 minutes (the reason for this change was not explained). So, although the findings of the 3 studies do not contradict each other and are coherent, they do not all provide strong evidence for the effect of interest on their own.

      Related to the comment above, I encourage the authors to double-check if this statement is correct: "Also, our results remain robust even with the "non-learners" included in the analysis (Fig. S1 in the Supplemental Material)". The critical analysis for Study 1 is a between-group comparison of the CS+ and CS- during the last extinction trial versus the first test trial. This result only just reached significance with the selected sample (p = .048), and Figures 1D and E even seem to suggest otherwise. I doubt that the analysis would reach significance when including the "non-learners" - assuming that this is what is shown in Supplemental Figure 1 (which shows the data from "all responded participants").

      Also related to the comment above, I think that the statement "suggesting a cue-independent short-term amnesia effect" in Study 2 is not correct and should read: "suggesting extinction of fear to the CS1+ and CS2+", given that the response to the CS+'s is similar to the response to the CS-, as was the case at the end of extinction. Also the next statement "This result indicates that the short-term amnesia effect observed in Study 2 is not reminder-cue specific and can generalize to the non-reminded cues" is not fully supported by the data, given the lack of an appropriate control group in this study (a group without reinstatement). The comparison with the effect found in Study 1 is difficult because the effect found there was relatively small (and may have to be double-checked, see remarks above), and it was obtained with a different procedure using a single CS+. The comparison with the 6-h and 24-h groups of Study 2 is not helpful as a control condition for this specific question (i.e., is there reinstatement of fear for any of the CS+'s) because of the large procedural difference with regard to the intervals between extinction and reinstatement (test).

      (2) It is unclear which analysis is presented in Figure 3. According to the main text, it either shows the "differential fear recovery index between CS+ and CS-" or "the fear recovery index of both CS1+ and CS2+". The authors should clarify what they are analyzing and showing, and clarify to which analyses the ** and NS refer in the graphs. I would also prefer the X-axes and particularly the Y-axes of Fig. 3a-b-c to be the same. The image is a bit misleading now. The same remarks apply to Figure 5.

      (3) In general, I think the paper would benefit from being more careful and nuanced in how the literature and findings are represented. First of all, the authors may be more careful when using the term 'reconsolidation'. In the current version, it is put forward as an established and clearly delineated concept, but that is not the case. It would be useful if the authors could change the text in order to make it clear that the reconsolidation framework is a theory, rather than something that is set in stone (see e.g., Elsey et al., 2018 (https://doi.org/10.1037/bul0000152), Schroyens et al., 2022 (https://doi.org/10.3758/s13423-022-02173-2)).

      In addition, the authors may want to reconsider if they want to cite Schiller et al., 2010 (https://doi.org/10.1038/nature08637), given that neither the main findings of this paper nor the analyses could be replicated (see Chalkia et al., 2020 (https://doi.org/10.1016/j.cortex.2020.04.017; https://doi.org/10.1016/j.cortex.2020.03.031)).

      Relatedly, it should be clarified that Figure 6 is largely speculative, rather than a proven model as it is currently presented. This is true for all panels, but particularly for panel c, given that the current study does not provide any evidence regarding the proposed reconsolidation mechanism.

      Lastly, throughout the paper, the authors equate skin conductance responses (SCR) with fear memory. It should at least be acknowledged that SCR is just one aspect of a fear response, and that it is unclear whether any of this would translate to verbal or behavioral effects. Such effects would be particularly important for any clinical application, which the authors put forward as the ultimate goal of the research.

      (4) The Discussion quite narrowly focuses on a specific 'mechanism' that the authors have in mind. Although it is good that the Discussion is to the point, it may be worthwhile to entertain other options or (partial) explanations for the findings. For example, have the authors considered that there may be an important role for attention? When testing very soon after the extinction procedure (and thus after the reminder), attentional processes may play an important role (more so than with longer intervals). The retrieval procedure could perhaps induce heightened attention to the reminded CS+ (which could be further enhanced by dlPFC stimulation)?

      (5) There is room for improvement in terms of language, clarity of the writing, and (presentation of the) statistical analyses, for all of which I have provided detailed feedback in the 'Recommendations for the authors' section. The same applies to data availability: the data are currently not publicly available, in contrast with what is stated in the paper. In addition, it would be helpful if the authors would provide additional explanation or justification for some of the methodological choices (e.g., the 18-s interval and why stimulate 8 minutes after the reminder cue, the choice of stimulation parameters), and comment on reasons for (and implications of) the large number of excluded participants (>25%).

      Finally, I think several statements made in the paper are overly strong in light of the existing literature (or the evidence obtained here) or imply causal relationships that were not directly tested.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Review, 3D SIM + AO, Wang and coworkers

      In this manuscript, Wang and coworkers report an upright 3D SIM system with adaptive optics (AO) correction. They demonstrate that AO improves imaging into thick 3D samples, including Drosophila larval brain. They also explore the use of remote focusing with their setup. The authors clearly demonstrate a gain with AO, and we are convinced that the microscope they build offers some utility over existing state of the art, particularly in samples thicker than a single cell. That said, we have concerns with the manuscript that we would like to see addressed before recommending publication:

      • Given the emphasis on super-resolution imaging deep inside a sample, we were surprised to see no mention of other forms of structured illumination that allow super-resolution imaging in samples thicker than a single cell. These include the 'spot-scanning' implementations of SIM that offer better imaging at depth by virtue of pinholes, and include MSIM, iSIM, and rescan confocal technologies. The two-photon / AO implementation of iSIM seems particularly germane, e.g. https://pubmed.ncbi.nlm.nih.gov/28628128/ Please consider citing these works, as they help place the existing work into context.
      • As we're sure the authors appreciate, besides aberrations, a major additional obstacle to 3D SIM in thick tissues is the presence of out-of-focus background. Indeed, this point was mentioned by Gustafsson in his classic 2008 paper on 3D SIM (https://pubmed.ncbi.nlm.nih.gov/18326650/): 'The application area of three-dimensional structured illumination microscopy overlaps with that of confocal microscopy, but the two techniques have different and complementary strengths. Structured illumination microscopy offers higher effective lateral resolution, because it concentrates much of the excitation light at the very highest illumination angles, which are most effective for encoding high-resolution information into the observed data, whereas confocal microscopy spreads out its illumination light more or-less uniformly over all available angles to form a focused beam. For very thick and compactly fluorescent samples, however, confocal microscopy has an advantage in that its pinhole removes out-of focus light physically. Structured illumination microscopy is quite effective at removing out-of-focus light computationally, because it is not subject to the missing-cone problem, but computational removal leaves behind the associated shot noise. Therefore confocal microscopy may be preferable on very thick and dense samples, for which the in-focus information in a conventional microscope image would be overwhelmed by out-of-focus light, whereas structured illumination microscopy may be superior in a regime of thinner or sparser samples.' This point is not mentioned at all in the manuscript, yet we are certain it is at least partially responsible for the residual image artifacts the authors mention. Please discuss the problem of out of focus light on 3D samples, particularly with an eye to the 'spot-scanning' papers mentioned above.
      • The authors use a water dipping lens, yet they image into samples that are mounted on coverslips, i.e. they use a dipping lens to image through a coverslip: see attached pdf for reference

      This almost certainly introduces spherical aberration, which the authors seem to observe: see attached pdf for reference

      We find this troubling, as it seems that in the process of building their setup, the authors have made a choice of objective lens that introduces aberrations - that they later correct. At the very least, this point needs to be acknowledged in the manuscript (or please correct us if we're wrong) - as it renders the data in Figs. 3-4 somewhat less compelling than if the authors used an objective lens that allowed correction through a coverglass, e.g. a water dipping lens with a correction collar. In other words, in the process of building their AO setup, the authors have introduced system aberrations that render the comparison with 3D SIM somewhat unfair. Ideally the authors would show a comparison with an objective lens that can image through a glass coverslip.

      • The authors tend to include numbers for resolution without statistics. This renders the comparisons meaningless in my opinion; ideally every number would have a mean and error bar associated with it. We have included specific examples in the minor comments below.
      • In Fig. 5, after the 'multipoint AO SIM', the SNR in some regions seems to decrease after AO: see attached pdf for reference

      Please comment on this issue.

      • Please provide timing costs for the indirect AO methods used in the paper, so the reader understands how this time compares to the time required for taking a 3D SIM stack. In a similar vein, the authors in Lines 213-215, mention a 'disproportionate measurement time' when referring to the time required for AO correction at each plane - providing numbers here would be very useful to a reader, so they can judge for themselves what this means. What is the measurement time, why is it so long, and how does it compare to the time for 3D SIM? It would also be useful to provide a comparison between the time needed for AO correction at each (or two) planes without remote focusing (RF) vs. with RF, so the reader understands the relative temporal contributions of each part of the method. We would suggest, for the data shown in Fig. 5, to report a) the time to acquire the whole stack without AO (3D SIM only); b) the time to acquire the data as shown; c) the time to acquire the AO stack without RF. This would help bolster the case for remote focusing in general; as is we are not sure we buy that this is a capability worth having, at least for the data shown in this paper.
      • Some further discussion on possibly extending the remote focusing range would be helpful. We gather that limitations arose from an older model of the DM being used, due to creep effects. We also gather from the SI that edge effects at the periphery of the DM were also problematic. Are these limitations likely non-issues with modern DMs, and how much range could one reasonably expect to achieve as a result? We are wondering if the 10 um range is a fundamental practical limitation or if in principle it could be extended with commercial DMs.

      Minor comments

      • The paper mentions Ephys multiple times, even putting micromanipulators into Fig. 1 - although it is not actually used in this paper. If including in Figure 1, please make it clear that these additional components are aspirational and not actually used in the paper.
      • The abstract mentions '3D SIM microscopes'; 'microscopes' is redundant, as the 'M' in 'SIM' stands for 'microscopy'.
      • 'fast optical sectioning', line 42, how can optical sectioning be 'fast'? Do they mean rapid imaging with optical sectioning?
      • line 59, 'effective imaging depth may be increased to some extent using silicone immersion objectives', what about water immersion objectives? We would guess these could also be used.
      • line 65 - evidence for 'water-dipping objectives are more sensitive to aberrations' ? Please provide citation or remove. They are certainly more prone to aberrations if used with a coverslip as done here.
      • 'fast z stacks' is mentioned in line 103. How fast is fast?
      • line 116 'we imaged 100 nm diameter green fluorescent beads'. Deposited on glass? Given that this paper is about imaging deep this detail seems worth specifying in the main text.
      • lines 127-130, when describing changes in the bead shape with numbers for the FWHM, please provide statistics - quoting single numbers for comparison is almost useless and we cannot conclude that there is a meaningful improvement without statistics.
      • In the same vein, how can we understand that remote focus actually improves the axial FWHM of the widefield bead? Is this result repeatable, or is it just noise?
      • line 155, 'Because of the high spatial information...' -> 'Because of the high resolution spatial information...'
      • When quoting estimated resolution #s from microtubules (lines 158-163) similarly please provide statistics as for beads.
      • It seems worth mentioning the mechanism of AO correction (i.e. indirect sensing) in the main body of the text, not just the methods.
      • How long do the AO corrections take for the datasets in the paper?
      • Were the datasets in Fig. 2-4 acquired with remote focusing, or in conventional z stack mode? Please clarify this point in the main text and the figure captions.
      • It would be helpful when showing z projections in Figs. 3-5 to indicate the direction of increasing depth (we assume this is 'down' due to the upright setup, but this would be good to clarify)
      • line 174, 'showed significant improvements in both intensity and contrast after reconstruction' - we see the improvements in contrast and resolution, it is harder to appreciate improvements in intensity. Perhaps if the authors showed some line profiles or otherwise quantified intensity this would be easier to appreciate.
      • line 195 'reduced artefacts' due to AO. We would agree with this statement - the benefit from AO is obvious, and yet there are still artefacts. If the authors could clarify what these (residual) artefacts are, and their cause (out of focus light, uncorrected residual aberrations, etc) this would be helpful for a reader that is not used to looking at 3D SIM images.
      • Line 197, 'expected overall structure', please clarify what is expected about the structure and why.
      • Line 199, what is a 'pseudo structure'?
      • Fig. 4B, 'a resolution of ~200 nm is retained at depth', please clarify how this estimate was obtained, ideally with statistics.
      • Fig. 4D, please comment on the unphysical negative valued intensities in Fig. 4D, ideally explaining their presence in the caption. It would also be helpful to highlight where in the figure these plots arise, so the reader can visually follow along.
      • Line 245, 'rapid mitosis'. What does rapid mean, i.e. please provide the expected timescale for mitosis.
      • For the data in Fig. 6, was remote refocusing necessary?
      • What is the evidence for 'reduced residual aberrations', was a comparative stack taken without AO? In general we feel that the results shown in Fig. 6 would be stronger if there were comparative results shown without AO (or remote focusing).
      • Line 350, 'incorporation of denoising algorithms' - citations would be helpful here.
      • Line 411, 'All three were further developed and improved' - vague, how so?
      • Sensorless AO description; how many Zernike modes were corrected?
      • Multi-position aberration correction. Was the assumption of linearity in the Zernike correction verified or met? Why is this a reasonable assumption?
      • Fig. S1B is not useful; if the idea is to give a visual impression of the setup, we would recommend providing more photos with approximate distances indicated so that the reader has a sense of the scale of the setup. As is - it looks like a photograph of some generic optical setup.
      • SI pattern generation - 'the maximum achievable reconstruction resolution was only slightly reduced to about 95% of the theoretical maximum'. We don't understand this sentence, as the resolution obtained on the 100 nm beads is considerably worse than 95% of the theoretical maximum. Or do the authors mean 95% of the theoretical maximum given their pitch size of 317 nm for green and 367 nm for red?

      SI Deformable mirror calibration

      'spanning the range [0.1, 0.9]' - what are the units here?

      What are the units in Fig. S5C, S5D?

      It would be useful to define 'warmup' also in the caption of SI Fig. S6A.

      SI Remote Focusing

      'four offsets, {-5 mm, -2.5 mm, 2.5 mm, 5 mm}...' are the units mm or um? '...whereas that of the 10 beads was...' here, do the authors mean the position of the beads derived from the movement of the piezo stage, as opposed to the remote focusing? The authors refer to the 'results from Chapter 3.2'. What are they talking about? Do they mean a supplementary figure, or earlier supplementary results? In general, we found the discussion in this paragraph difficult to follow.

      Supplementary Fig. 9 does not seem to be referred to anywhere in the text.

      • Since the paper emphasizes 3D SIM, OTFs along the axial direction would also be useful to show, in addition to the lateral OTFs shown in Fig. 2D.
      • When the sample is moved by the piezo, the axial phase of the 3D-SIM illumination pattern is stable as the sample is scanned through the illumination pattern. When remote focusing is performed, the sample is always stable so the axial phase of the 3D-SIM illumination pattern is presumably changing with remote focusing. Can the authors clarify if the 3D SIM illumination pattern is scanned when remote focusing is applied, or is the intensity pattern stable in z?
      • In Supplementary Fig. 9, primary spherical is referred to twice, both at index 11 and 22. The latter is presumably secondary spherical?
      • We do not understand the x axis label in Fig. S4D; is it really [0, 50, 50, 50] as written? see attached pdf for reference

      Referee Cross-Commenting

      I don't have much to add; the other reviewers raise good points and I think it would be good if the authors could respond to their feedback in a revised manuscript.

      Significance

      Nearly all fluorescence images deteriorate as a function of depth. Methods to ameliorate this depth-dependent degradation are thus of great practical value, as they improve the information content of images and thus (hopefully) biological insight. In this work, the authors develop a method to improve super-resolution imaging (3D SIM) at depth, by combining it with adaptive optics.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1:

      This study provides negative in vivo evidence for the use of two PERK inhibitors and of TUDCA for the treatment of Sil1-related Marinesco-Sjögren syndrome (MSS).

      Overall, the manuscript reports a substantial amount of work and the study could be published in its present format. The experiments are well described in terms of methodology and appropriate analysis has been applied. Claims are proportionate and not overstated.

      I would have only minor comments related to some clarifications that the authors could make in the present manuscript and a suggestion for experiments that could improve the manuscript.

      First, although this is not my expertise, the in vitro analysis of CHOP luciferase assays suggests that very high concentrations, in particular of TUDCA, are needed to observe an effect. The authors may wish to clarify their opinion and whether this could be the reason why in vivo they have been unable to obtain any inhibition of the PERK pathway.

      The reviewer is correct in pointing out that high concentrations of trazodone, DBM and TUDCA were required to inhibit the PERK pathway in the CHOP::luciferase reporter cell lines. However, as we state in the Discussion, we do not think that their lack of effect in vivo was due to insufficient drug levels, since woozy mice were treated with trazodone, DBM or TUDCA according to dose regimens and administration routes that have proved effective in other neurodegenerative disease mouse models. Moreover, our analysis did not find major differences in drug bioavailability between mice with the woozy genetic background (CXB5/ByJ) and C57BL/6J mice in which these drugs had shown neuroprotective effects (see also the response to the next point).

      Second, it seems to me that when measuring the Trazodone metabolism there is a difference between acute and chronic treatment. It would be worth discussing what the authors make of that and what is more relevant (I assume chronic) to the disease model outcome.

      We realized that the nomenclature used in Figures 6 and 7 was confusing, leading the reader to think there were differences in trazodone levels between chronically and acutely treated mice.

      The experiment shown in Figure 6 was designed to test whether there were differences in trazodone pharmacokinetics and metabolism between mice of the woozy strain, which have the CXB5/ByJ genetic background, and C57BL/6J mice in which trazodone had shown neuroprotective effects in previous studies. In contrast, Figure 7 illustrates the levels of trazodone and m-CPP in control and woozy mice (both of which have the CXB5/ByJ genetic background) that had been chronically treated with trazodone for 5 weeks. These are the same animals as in Figure 3, as we state in the Figure 7 legend. Therefore, one should compare the levels of trazodone and m-CPP in Figure 7 with those of the "woozy" group (CXB5/ByJ genetic background) in Figure 6. This comparison shows that trazodone and m-CPP levels are comparable after chronic and acute (6h) treatment.

      To avoid confusion, we have changed the mouse nomenclature. We have renamed the control group of mice as "CT" (previously "WT") throughout the text and figures. In Figure 6, we have used CXB5/ByJ instead of "woozy" to emphasize the comparison between the different genetic backgrounds (CXB5/ByJ vs C57BL/6J). Finally, we have replaced the colors of symbols in Figure 7 in order to match those of Figure 3. We have also made the description and discussion of these results clearer in the revised manuscript.

      With respect to the experiments a simple and informative addition would be the evaluation of the PERK pathway in mice treated with TUDCA, as this is missing.

      The effect of TUDCA treatment on the PERK pathway is shown in Figure 5, where we measured CHOP mRNA levels in Purkinje cells microdissected from mice treated with 0.4% TUDCA in the chow, and in Figure 9C and D, where we measured the percentage of CHOP-immunopositive Purkinje cells in the cerebellum of the same groups of mice by immunohistochemistry.

      Figure 10 illustrates the results of an additional experiment in which woozy mice were treated with 500 mg/kg TUDCA intraperitoneally (ip), to test whether this alternative dosing regimen was any better. Like the treatment per os, TUDCA ip had no beneficial effect on motor dysfunction. Therefore we deemed it unnecessary to check the effect on PERK pathway inhibition in this group of mice.

      A more difficult but potentially more interesting line of investigation is that of searching for potential actions of Trazodone that are PERK independent and might be responsible for the partial rescue observed in the beam walking test, which is much more sensitive and specific than rotarod, so worth considering. Assuming authors want to go down this route and add significance to their study my suggestion would be an unbiased RNA seq from the brain samples they already have. However, this is a suggestion to steer the study towards a more positive outcome and it is not necessary to support their current conclusions.

      We agree with the reviewer that it would be interesting to investigate the mechanism by which trazodone slightly ameliorated the motor performance of woozy mice in the beam walking test. In the Discussion, we speculated that this could be due to an effect of trazodone on cerebellar serotonergic neurotransmission, which would require electrophysiological investigations to demonstrate. Of course, other mechanisms may also be operative, which RNA seq may help identify, as the reviewer suggests. However, this would be a complex and lengthy investigation, the results of which would not change the main conclusions of the present paper. We plan to explore this line of investigation in a future study.

      Reviewer #2:

      Lavigna et al. described the effect of Trazodone in Marinesco-Sjögren syndrome model mice. Although the results are somewhat disappointing, this research has provided fundamental evidence for the future development of MSS therapeutics. There are a few minor comments to further improve the manuscript.

      Major comment

      P14

      "Trazodone metabolism to m-CPP was slightly impaired in woozy mice compared to C57BL/6J mice. This was evident from the m-CPP/trazodone ratio, calculated on the AUC0-t in the plasma, which was 0.34 in woozy and 0.67 in C57BL/6J mice."

      Why was the concentration different between WT and woozy mice? Which organ mainly contributes to the metabolism of trazodone? Is the function of this target organ different between WT and woozy mice?

      Similar to trazodone, m-CPP clearance from plasma was slightly faster in woozy than in C57BL/6J mice.

      Is m-CPP eliminated via the kidney or the liver? Why is there a difference? Does SIL1 function in the liver or kidney? This needs discussion. The same applies to brain m-CPP levels.

      As explained in the response to the second comment of reviewer #1, "woozy" in Figure 6 referred to mice with the CXB5/ByJ genetic background, and in this experiment we compared trazodone pharmacokinetics and metabolism between CXB5/ByJ and C57BL/6J mice. We have modified the nomenclature of Figure 6 and the Results to make this clear.

      Trazodone undergoes extensive hepatic metabolism, and only a small percentage is excreted unchanged in the urine. Metabolism involves hydroxylation, oxidation and dealkylation reactions, forming in particular the 5HT-active metabolite m-CPP (by CYP3A4). This and other metabolites are mainly excreted in urine, as conjugates [1-3]. The slight differences in trazodone pharmacokinetics and metabolism between the CXB5/ByJ and C57BL/6J mice shown in Figure 6 are not attributable to loss of SIL1 function, since both groups of mice carried wild-type Sil1 alleles, but are most likely due to subtle differences between the two strains, for example in the binding to plasma proteins, metabolic enzymes, transporters and/or the excretion processes. The available data do not allow us to clarify this issue.

      The main point, however, is that no major differences were found in the plasma and brain concentrations of trazodone between these two strains of mice, which could have explained the lack of efficacy of trazodone in woozy mice, as we now further stress in the revised Discussion.
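
      For readers unfamiliar with the metric discussed above, the following is a minimal sketch of how an m-CPP/trazodone ratio based on AUC0-t can be computed. It assumes the linear trapezoidal rule, which the response does not specify; the function names and the concentration values are hypothetical and are not taken from the study.

```python
def auc_trapezoid(times_h, conc):
    """AUC from the first to the last sampling time (linear trapezoidal rule)."""
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration points")
    points = list(zip(times_h, conc))
    # Sum the area of each trapezoid between consecutive sampling times.
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for (t0, c0), (t1, c1) in zip(points, points[1:]))


def metabolite_parent_ratio(times_h, parent, metabolite):
    """Ratio of metabolite AUC0-t to parent-drug AUC0-t."""
    return auc_trapezoid(times_h, metabolite) / auc_trapezoid(times_h, parent)


# Hypothetical plasma concentrations (arbitrary units), NOT data from the study.
times = [0.0, 1.0, 2.0, 6.0]        # hours after dosing
trazodone = [0.0, 10.0, 6.0, 2.0]   # parent drug
mcpp = [0.0, 3.0, 4.0, 1.0]         # metabolite

ratio = metabolite_parent_ratio(times, trazodone, mcpp)
print(f"m-CPP/trazodone AUC ratio: {ratio:.2f}")
```

      A lower ratio in one strain would indicate less metabolite exposure relative to the parent drug over the sampled interval; whether the ratio is computed on mass or molar concentrations is a further choice not stated in the response.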

      Minor comments

      P3 L5 mutation should be variant.

      This has been changed.

      P4 L1 eIF2a-P should be phosphorylated eIF2α (p-eIF2α). The reviewer prefers p-eIF2α to eIF2α-P throughout the manuscript.

      There is no standard rule for indicating phosphorylated proteins, and phosphorylated eIF2α is referred to in various ways in different papers, with the "p" in capital or lowercase, preceding or following the protein name, separated by a dash or not. We would prefer to maintain the current nomenclature for consistency with our previous publications, unless the Editor deems otherwise.

      P9 L11 M-CPP should be fully spelled out the first time it appears. m-Chlorophenylpiperazine (m-CPP)

      m-CPP is spelled out the first time it appears in the Materials and Methods, subheading Drug treatments and bioanalysis.

      Please explain the difference between the expected function of trazodone and its metabolite m-CPP. Why is m-CPP not effective?

      Based on the observation that mice of the woozy strain had lower brain levels of m-CPP than C57BL/6J mice (Figure 6), we hypothesized that the lack of effect of trazodone in woozy mice could be due to m-CPP mediating the PERK signaling inhibitory activity of trazodone. However, experiments in CHOP::luciferase reporter cells demonstrated that m-CPP does not inhibit PERK signaling (Figure 2D). The precise mechanism by which trazodone inhibits PERK signaling is not known [4], which makes it difficult to speculate why its main metabolite, m-CPP, does not exhibit this activity.

      P11 L3 Fig. 3 Fig. 3A and B?

      Yes, we specifically refer to panels A and B of Figure 3. We have indicated this in the revised manuscript.

      P11 L6 at 7 weeks of age?

      We have re-done the statistical analysis by two-way ANOVA and reported the results in the legend to Figure 3. There is a significant difference between vehicle- and trazodone-treated woozy mice in the number of missteps when the two groups are compared globally. No statistically significant difference in the number of missteps is detected at specific time points by post-hoc analysis. There is no statistically significant difference between vehicle- and trazodone-treated woozy mice in the time to traverse the beam. The Results section has been revised accordingly.

      P12 L17 ~4 times, 4 times? Please state the exact value.

      Done.

      Figure 7 Why are brain m-CPP levels higher than plasma levels? Is trazodone metabolized in brain tissue?

      Trazodone is extensively metabolized in the liver through Cytochrome P450 (Rotzinger et al., 1999). It is well documented that m-CPP readily passes the blood-brain barrier, much better than the parent compound, explaining its high brain levels [2].

      P19 L7 ISRIB; please fully spell out the first time it appears.

      Done.

      References

      1. Rotzinger S, Bourin M, Akimoto Y, Coutts RT, Baker GB (1999) Metabolism of some “second”- and “fourth”-generation antidepressants: iprindole, viloxazine, bupropion, mianserin, maprotiline, trazodone, nefazodone, and venlafaxine. Cell Mol Neurobiol 19:427–442. https://doi.org/10.1023/a:1006953923305
      2. Caccia S, Ballabio M, Samanin R, Zanini MG, Garattini S (1981) (--)-m-Chlorophenyl-piperazine, a central 5-hydroxytryptamine agonist, is a metabolite of trazodone. J Pharm Pharmacol 33:477–478. https://doi.org/10.1111/j.2042-7158.1981.tb13841.x
      3. DeVane CL, Boulton DW, Miller LF, Miller RL (1999) Pharmacokinetics of trazodone and its major metabolite m-chlorophenylpiperazine in plasma and brain of rats. Int J Neuropsychopharm 2:17–23. https://doi.org/10.1017/S1461145799001303
      4. Halliday M, Radford H, Zents KAM, Molloy C, Moreno JA, Verity NC, Smith E, Ortori CA, Barrett DA, Bushell M, Mallucci GR (2017) Repurposed drugs targeting eIF2alpha-P-mediated translational repression prevent neurodegeneration in mice. Brain 140:1768–1783. https://doi.org/10.1093/brain/awx074
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Author responses


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In their manuscript, Dutta and colleagues compared the meiotic recombination landscapes between five budding yeast species. In the first part of the work, the authors constructed a high-resolution map of meiotic recombination events in Kluyveromyces lactis supported by high-quality genome assemblies for two strains of this yeast. Then, partially repeating their CO and NCO mapping strategy, they compared a number of meiotic recombination parameters between the five species (sometimes three, depending on the quality of the data for each species). They particularly focused on key parameters for meiotic recombination, such as crossover interference and homeostasis and obligate crossover. Although the analysis is interesting, it is underdeveloped in many places and lacks the general conclusions regarding the evolution of recombination and the broader perspective that would be expected from a comparison of these phenomena in budding yeasts.

      [R] Tackling the evolution of recombination is ambitious. Here, with a dataset of five species, it is hard to draw strong evolutionary conclusions besides the variations in the crossover (CO) landscapes and the control of CO formation that we observed, which is already significant. The multiple losses of CO interference that we describe here may constitute our strongest evolutionary conclusion. It potentially underscores the minor evolutionary advantage associated with CO interference, at least in budding yeasts. In this context, we changed the title to be more factual and updated the text to better highlight the significance and implications of our findings.

      Major comments:

      The authors indicate that the distribution of hotspots and coldspots is not preserved between species, but this finding is not properly documented. I think it would be useful to include recombination maps in a main figure for all species (or at least for S. cerevisiae, K. lactis and L. waltii) with the elements highlighted. This will allow for a visual illustration of the variability in the recombination landscape between the studied species.

      [R] The genomes of the species show blocks of synteny but overall they are not collinear; therefore, it is not possible to directly compare the recombination maps. In our previous work, we have highlighted the non-conservation of CO hotspots between S. cerevisiae, L. kluyveri and L. waltii (Brion et al. 2017; Dutreux et al. 2023). Briefly, we retrieved conserved syntenic blocks in L. kluyveri and L. waltii genomes containing at least two S. cerevisiae orthologs associated with one hotspot. L. waltii shares only five out of the 92 S. cerevisiae crossover hotspots (RHO5, SLS1, GYP6, OLE1 and MRPL8), while L. kluyveri shares only one. L. waltii and L. kluyveri share no crossover hotspots. In addition, our current study shows that none of the K. lactis hotspots is conserved in any of the four other species (response figure 1 and new supplementary figure S11).

      Response Figure 1. Density of crossovers along the genome using a 5 kb window in the S. cerevisiae genome (Mancera et al. 2008; Oke et al. 2014; Krishnaprasad et al. 2015 combined dataset). Horizontal dotted green line represents crossover hotspot significance threshold. Solid spheres represent the conserved CO hotspots with either L. kluyveri (red) or L. waltii (blue). None of the 92 S. cerevisiae crossover hotspots is conserved in K. lactis.

      Although analyses analogous to those presented in Fig. S5 had already been published in other comparisons of the recombination landscape in yeast (e.g. Dutreux et al., 2023), I think that Figs. S5A and S5B are worth presenting in the main figures (not supplementary data). In many species of eukaryotes, the detection of NCOs is practically impossible, therefore only results for COs are presented. Therefore, it is perhaps also worth discussing the fact that the relationship applies to all recombination events and not only COs, and therefore is related to the regulation of DSB frequency and not individual DSB repair pathways.

      [R] Figures S5A-B are now included in the main figure, Figure 2B. The association also holds true for total recombination (CO+NCO) events (new supplementary figure S6A).

      The authors find that CO coldspots were associated with DNA repair genes. Unfortunately, an equivalent analysis was not performed for all recombination events (CO + NCO). I presume this approach is based on the belief that COs are more mutagenic than NCOs. However, recent studies in humans suggest that, at least in mammals, meiotic DSBs themselves are mutagenic, regardless of the pathway used for their repair (Hinch et al., Science 2023). Therefore, I would suggest repeating the analysis also considering NCOs (although I am aware that the picture of NCOs may be incomplete). I would also like to see some graphical representation of the analysis. Is it possible to perform a classic analysis of coldspots/hotspot enrichment in relation to gene ontology?

      [R] As suggested, we performed the analysis to independently detect coldspots for all recombination events (CO+NCO). Based on a threshold of

      In relation to the previous point - it may be worth repeating this type of analysis also for other yeasts used in this study, or at least for S. cerevisiae, to be able to consider the extent to which this relationship is universal and dependent on the meiotic DSB repair pathway.

      [R] The analysis regarding the CO coldspots has been performed in the other species as well. As mentioned in the main text, although some overlap between CO coldspots and DNA repair genes has been observed in the other species as well, we observed a significant enrichment in K. lactis only, maybe because the dataset is larger than in the other species.

      In Fig. S7, the point where WGD occurred is marked in the wrong place, or at least that is what the sentence in the text says ("The Lachancea and Kluyveromyces species branched from the Saccharomyces lineage more than 100 million years ago, before to the ancestral whole-genome duplication (WGD) event specific of the S. cerevisiae lineage").

      [R] We regret the oversight and have corrected the figure.

      The result presented in Fig. S8 is interesting and should be shown in the main figures. Perhaps it would be worth adding an illustration of simple versus complex COs.

      [R] The old Figure S8 is now a part of main Figure 2C-D with the illustrations describing the CO types.

      The last part of the results includes an analysis of the evolutionary rates of the ZMM genes. In the discussion, the authors should also refer the results of this analysis to the previous analysis of the overrepresentation of DNA repair genes in recombination coldspots. I understand that ZMM are not DNA repair proteins in the strict sense, but I think it is worth familiarizing readers with the authors' view on this matter. Moreover, I would suggest showing where MLH1 and MLH3 are located on the plot in Fig. 6 (especially the meiosis-specific MLH3), whether the selection pressure acts on them as on ZMM proteins, or rather as on DNA repair proteins. Showing the SLX4 and MUS81 would also be interesting.

      [R] Figure 6 has been updated as suggested and now shows the Mlh1, Mlh3, Slx4 and Mus81 dN/dS values for the three species.

      I feel like the discussion is underdeveloped. I missed a deeper summary of the comparison between meiotic recombination among the tested budding yeasts in the context of the presence and absence of functional ZMM. Even the title of the work is not properly developed in the manuscript text. The analysis shows that it is not the presence of a functional ZMM pathway or its lack that introduces differences between the individual recombination landscapes, although ZMM determines the presence of proper CO interference. With the caveat that for L. kluyveri it is basically unknown whether it has a functional ZMM or not. Maybe confirming the lack of expression of some ZMM genes in meiosis of this species would answer the question of how it should be treated?

      [R] We agree with this reviewer that our original title was imprecise, so we changed it to be more factual, emphasizing the multiple losses of crossover interference in budding yeasts. As stated above, it potentially underscores the minor/negligible evolutionary advantage associated with CO interference, at least in budding yeasts. From there, it is hard to draw deeper conclusions since the actual roles/functions of CO interference are still under debate, notably in yeasts where the CO frequency tends to be high. We improved the discussion to better highlight these points.

      We also agree that a deeper characterization of the ZMM factors persisting in the non-Saccharomyces yeasts would be informative, but we believe it is beyond the scope of the current manuscript and more suitable for follow-up work. However, our recent publication about L. kluyveri (Legrand et al 2024) shows that Zip3 is properly expressed in meiosis and behaves as in S. cerevisiae since it is located at DSB sites. Furthermore, we have unpublished transcriptomic data (Response Figure 2) showing that all the ZMM genes from L. kluyveri are specifically induced in meiosis (fold increase >16 at least compared to pre-sporulation conditions). Therefore, so far, although the level of CO interference in L. kluyveri is minimal, there is no indication that the ZMM genes are misregulated.

      Response Figure 2. Transcriptomic data showing that all the ZMM genes from L. kluyveri are specifically induced in meiosis (Unpublished data from Llorente Lab, CRCM, Marseille).






      Minor comments:

      In general, Figure captions are imprecise, many of them lack clear information explaining what is depicted. Authors should remember that figure legends should be self-sufficient.

      [R] The figure legends have been updated and are now self-sufficient.

      In the revised manuscript, I would suggest placing figure numbers on the figures and using line numbering, which would facilitate the reception of the work and possible reference to its individual elements in the review.

      [R] We regret the omission. Figure numbers, Line numbers and Page numbers have been added.

      Reviewer #1 (Significance (Required)):

      The study provides a new insight into the variation in recombination landscape within budding yeast species with a special emphasis on crossover control. This includes also de novo assemblies of Kluyveromyces lactis genome and high-resolution tetrad-based maps of meiotic recombination events. Previously, recombination maps of different yeast species were compared, however this study focuses on budding yeasts, some of which lost ZMM pathway and differ in some crossover parameters, like interference and homeostasis. Although the analysis is interesting, it lacks the general conclusions regarding the evolution of recombination and the broader perspective that would be expected from a comparison of these phenomena in budding yeasts.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper describes the genome-wide mapping of meiotic recombination in a non-Saccharomyces yeast, Kluyveromyces lactis. By using heterologous parental strains, the authors mapped crossovers (COs) and noncrossovers (NCOs) on the genome of K. lactis, which lacks some proteins necessary for CO formation in organisms such as S. cerevisiae, mammals and plants. This is an extension of previous works by the authors' group, which mapped COs and NCOs in other yeasts, Lachancea kluyveri and L. waltii, by a similar approach. The authors found that CO frequencies in K. lactis are much lower than those in S. cerevisiae and that COs showed weaker interference, which facilitates the non-random distribution of COs along a chromosome. Overall, the experiments and informatic analyses have been done in good quality and the results are convincing. The paper provides additional new information on the landscape of meiotic recombination in different yeast species. These results are of great interest to researchers in the field of meiotic recombination and evolution of meiosis. There are some issues that the authors may be able to address before the publication.

      Major points:

      While the authors noted that K. lactis shows the loss of pro-CO factors (ZMM proteins), Spo16 and Msh5 (due to the introduction of an in-frame stop codon), it still possesses other proteins such as Zip1, Zip2, Zip3, Zip4/Spo22, Mer3, and Msh4. It is still likely that these pro-CO factors control CO formation (and interference) in this yeast. It would be nice for the authors to study whether these genes are dispensable for CO formation and interference in meiosis. A similar analysis should be done for L. kluyveri, which retains all ZMM genes, but this is clearly out of the scope of this paper.

      [R] The question of the functions of the remaining ZMM factors is indeed interesting and related to point #8 from reviewer 1 (please see above). Although this is beyond the scope of our work, we would like to refer here to work from Amy MacQueen's lab using K. lactis Zip1 in S. cerevisiae (Voelkel-Meiman 2015). This study shows that K. lactis Zip1 does not allow synaptonemal complex assembly in S. cerevisiae but allows the formation of COs that are independent of the Msh4/5 complex but depend on Zip2/4/Spo16 and Mlh1/3 for their resolution. Overall, these results suggest that K. lactis Zip1 at least retained ancestral functions shared with S. cerevisiae Zip1. However, it is not possible to conclude whether the lack of full complementation by K. lactis Zip1 in S. cerevisiae comes from functional divergence or simply from the inability of K. lactis Zip1 to function properly in a heterologous context.

      Minor points:

      No page number, no main Figure number. It is hard to review this paper.

      [R] We regret the oversight. Figure numbers, Line numbers and Page numbers have been added.

      References: In some cases, in the Introduction, the authors referred to review papers such as Pyatnitskaya et al. (2019) for ZMM proteins while in the other parts, they referred to original papers; for example, three papers for Mlh1-Mlh3. If the number of references is not limited, original papers should be cited in the text.

      [R] We regret this omission. Original papers have now been included in the citations.

      Figure 3A, page 9, second paragraph: When the authors compared CO and NCO densities, it would be nice to show P-values for the comparison.

      [R] p-values have now been added to the updated figure.

      Please show a ratio of CO to NCO in each yeast in Figure 3B in the second paragraph of page 9 in the main text.

      [R] The ratios have now been included in the figure for both the CO:NCO ratios and CO:corrected_NCO ratios, in the main text and figure legends.

      Figure S5 and page 7, the first paragraph and page 9, third paragraph: CO/NCO densities (negative correlation to chromosome sizes) in S. cerevisiae should be checked with or without short chromosomes (I, III, and VI), which show very unique regulation of meiotic DSB formation (see Murakami et al. Nature 2020).

      [R] Even excluding the small chromosomes, the size dependent trend persists for S. cerevisiae and S. paradoxus.

      Table S7: Please add the S. cerevisiae gene name such as ZIP1 next to S. cerevisiae orthologs such as YDR285W. Moreover, please explain the column in detail or clarify the data. What does "meiosis" mean here? For example, YJL074C is SMC3, which is expressed in mitosis as well as in meiosis. The same is true for YGL163C, which is RAD54, which plays a minor role in meiosis, but plays a critical role in mitotic DSB repair.

      [R] We corrected Table S7 as desired by systematically including the standardized gene names.

      The Gene Ontology (GO) annotation is a statement about the function of a particular gene. It offers a structured framework and a comprehensive set of concepts to describe the functions of gene products across all organisms. It is specifically crafted to support the computational representation of biological systems. In our specific case, we only looked at genes with the gene ontology annotation "meiosis". Together, these statements comprise a "snapshot" of current biological knowledge and are by no means absolute. This has been detailed in the supplementary Table S7.

      Reviewer #2 (Significance (Required)):

      This study provides the landscape of meiotic recombination in non-Saccharomyces yeast, Kluyveromyces lactis. The genome-wide recombination map in K. lactis shows lower crossover frequencies with weaker crossover interference than those in S. cerevisiae. Overall, the experiments and informatic analyses have been done in good quality and the results are convincing. The paper provides additional new information on the landscape of meiotic recombination in different yeast species, particularly in terms of the evolution of meiotic recombination. These results are of great interest to researchers in the field of meiotic recombination and evolution of meiosis.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Dutta et al. have compiled a genome-wide meiotic recombination map for Kluyveromyces lactis and compared it to a compilation of meiotic recombination maps for four other species, two of which (Lachancea kluyveri and Lachancea waltii), like K. lactis, predate the genome duplication event that produced the other two (Saccharomyces cerevisiae and S. paradoxus). Meiosis in many species studied (including metazoans and plants) shows control over the number and distribution of crossovers, which are critical for faithful chromosome segregation during meiosis. This takes the form of crossover interference, where crossovers are spaced more evenly than expected by chance, and crossover homeostasis, where many fewer chromosomes lack a crossover than is expected by chance. While both of the post-duplication species show both crossover interference and homeostasis, none of the pre-duplication species show crossover homeostasis, and crossover interference is very weak. In two cases (K. lactis and L. waltii), this can be explained by mutational loss of a few of the genes (called the ZMM genes) that promote meiotic crossovers in many species. However, L. kluyveri behavior cannot be explained in this way. Recombination hotspots are present but are not shared between the pre-duplication species or between the pre- and post-duplication species, perhaps not surprising for species that diverged more than 100 million years ago. Overall, this work will be a useful contribution to our understanding of the different possible flavors of meiotic recombination mechanisms and control that are possible (and, one might add, promote long-term species viability).

      A) Evaluation, reproducibility and clarity

      The work presented in this paper is straightforward and unimpeachable and will largely be of interest to those studying meiotic recombination, be it mechanistic studies or studies of the implications for population genetics. The analysis is technically correct, although there are some aspects where a slightly different emphasis should be considered (see comments below). However, the data and the analysis could stand as they currently are, without further revision.

      Suggestions are below.

      1. (trivial) it would have been useful if pages and lines were numbered.

      [R] We regret the oversight. Figure numbers, Line numbers and Page numbers have been added.

      "Across the 205 meioses...". In general, it would be desirable to apply compensation for the fact that NCOs and COs are differently detected. Since, in K. lactis, 35% of COs are not accompanied by detectable gene conversion, it seems reasonable to apply a correction to measured NCOs here and throughout the paper, regardless of the species. For example, if one assumes that 35% of NCOs are not detected, how does this affect estimates of chromosomes that do not appear to have undergone interhomolog recombination? Estimates of CO/NCO bias? In a similar vein, if the CO event is not considered (just the conversion events associated with it), how does this affect measures of conversion tract lengths in COs and NCOs?

      [R] We thank the reviewer for this suggestion. We have corrected the NCO estimates as described in Mancera et al. 2008, on a per-tetrad basis across all the species. The fractions of missed NCOs were 7%, 34%, 30%, 23% and 25% for S. paradoxus, S. cerevisiae, K. lactis, L. waltii and L. kluyveri, respectively. The fraction of missed NCOs depends on the parental marker density. In addition, we performed the CO:NCO bias analysis with both the detected and the corrected NCO frequencies, and the trends remain unchanged (now included in figure 3). Finally, we refrain from using the corrected NCO frequencies when reporting the NCO frequencies (Table 1, main text) to maintain uniformity with our previous work and because these corrections do not alter any results.
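      As a rough illustration of how a missed-event fraction changes an NCO count, the sketch below applies a simple scaling, corrected = detected / (1 - missed fraction). This is an assumption made for illustration only and is not the procedure of Mancera et al. 2008; the function name and the detected count in the usage note are hypothetical, while the missed fractions are those quoted in the response.

```python
# Illustrative scaling only; NOT the correction procedure of Mancera et al. 2008.
# Missed-NCO fractions are the values quoted in the authors' response.
MISSED_NCO_FRACTION = {
    "S. paradoxus": 0.07,
    "S. cerevisiae": 0.34,
    "K. lactis": 0.30,
    "L. waltii": 0.23,
    "L. kluyveri": 0.25,
}


def corrected_nco(detected: float, missed_fraction: float) -> float:
    """Scale a detected NCO count by the fraction of events assumed to be missed."""
    if not 0.0 <= missed_fraction < 1.0:
        raise ValueError("missed_fraction must be in [0, 1)")
    return detected / (1.0 - missed_fraction)


if __name__ == "__main__":
    hypothetical_detected = 10.0  # hypothetical detected NCOs per tetrad
    for species, fraction in MISSED_NCO_FRACTION.items():
        estimate = corrected_nco(hypothetical_detected, fraction)
        print(f"{species}: {estimate:.1f} NCOs per tetrad after scaling")
```

      Under this scaling, a species with a higher missed fraction receives a larger upward adjustment, which is why a comparison of CO:NCO bias across species is worth checking with both detected and corrected counts.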

      It might be useful to report recombination event frequencies in terms of events/chromosome, as this, rather than event/unit distance, is functionally more relevant. In the same vein, it might be useful to consider total event homeostasis, in addition to just crossover homeostasis.

      [R] This has been updated as suggested.

      An interesting observation is that two of the three pre-duplication species clearly at one time had a full complement of ZMM genes but lost some due to mutation. Have there ever been attempts to detect either synaptonemal complex or axial elements in these species?

      [R] This is related to point #8 from reviewer 1 and to the major point of reviewer 2 (please see above).

      To our knowledge, cytological observations of the synaptonemal complex (SC) or axial elements have been performed in L. kluyveri only by us, and the SC is clearly visible (Legrand et al. 2024).

      However, it is important to recall here that the K. lactis axis protein-encoding genes HOP1 and RED1 were cloned by the Roeder lab by functional complementation of the corresponding S. cerevisiae mutants, supporting the functional conservation of these genes (Smith and Roeder 2000). Finally, as mentioned above, K. lactis Zip1 retained at least some functions of the ancestral Zip1 protein that are also shared by the S. cerevisiae protein (Voelkel-Meiman 2015).

      The observation of elevated evolutionary rates in ZMM genes is also intriguing, but it would help if "dN/dS ratio" was defined.

      [R] It is now defined in the text.

      The observation of frequent E0 chromosomes is taken to suggest efficient achiasmate segregation; has the "corrected" NCO frequency been considered? Do the different frequencies of E0 chromosomes predict the different spore viabilities seen between species?

      [R] E0 is not at all predictive of spore viability, as we have shown in previous studies (see L. kluyveri - Brion et al. 2017; L. waltii - Dutreux et al. 2023). In addition, this has been shown in S. cerevisiae as well (Nishant et al. 2009).

      Figure 3A-what would this look like if it were plotted as "Events per chromosome" rather than per megabase?

      [R] We changed the figure (now figure 2A) and plotted as events per chromosome to show the variability of events at the chromosome level.

      Figure legends tend to be unreasonably terse, which makes figures more difficult to interpret.

      [R] This has been updated as suggested.

    1. We would like to thank you and the reviewers for your thoughtful comments, which helped us improve the manuscript. We carefully followed the reviewers’ recommendations and provide a detailed point-by-point account of our responses to the comments.

      Please find below the important changes in the updated manuscript.

      (1) We changed the title according to the comments provided by reviewer #1.

      (2) We edited the introduction, results, and discussion to improve the link between the objectives of the study, the findings, and their discussion, as reviewer #2 recommended.

      (3) We clarified the link between camouflage and fitness, which is now presented as a hypothesis, as reviewer #1 suggested.

      (4) We added new analyses and figures in the main text and in the supplementary materials to better emphasize sex differences in landing force, foraging strategies and hunting success, following reviewer #1's suggestion.

      (5) Following reviewer #2's comments, we edited the results, adding key information about methods to help the reader understand the findings without reading the Methods section.

      (6) We added important details about the model selection approach along with a discussion of the low R-square values reported in our analyses on hunting success, as reviewer #2 suggested.

      eLife assessment 

      This fundamental work substantially advances our understanding of animals' foraging behaviour, by monitoring the movement and body posture of barn owls in high resolution, in addition to assessing their foraging success. With a large dataset, the evidence supporting the main conclusions is convincing. This work provides new evidence for motion-induced sound camouflage and has broad implications for understanding predator-prey interactions. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In this paper, Schalcher et al. examined how barn owls' landing force affects their hunting success during two hunting strategies: strike hunting and sit-and-wait hunting. They tracked tens of barn owls that raised their nestlings in nest boxes and utilized high-resolution GPS and acceleration loggers to monitor their movements. In addition, camcorders were placed near their nest boxes and used to record the prey they brought to the nest, thus measuring their foraging success. 

      This study generated a unique dataset and provided new insights into the foraging behavior of barn owls. The researchers discovered that the landing force during hunting strikes was significantly higher compared to the sit-and-wait strategy. Additionally, they found a positive relationship between landing force and foraging success during hunting strikes, whereas, during the sit-and-wait strategy, there was a negative relationship between the two. This suggests that barn owls avoid detection by generating a lower landing force and producing less noise. Furthermore, the researchers observed that environmental characteristics affect barn owls' landing force during sit-and-wait hunting. They found a greater landing force when landing on buildings, a lower landing force when landing on trees, and the lowest landing force when landing on poles. The landing force also decreased as the time to the next hunting attempt decreased. These findings collectively suggest that barn owls reduce their landing force as an acoustic camouflage to avoid detection by their prey. 

      The main strength of this work is the researchers' comprehensive approach, examining different aspects of foraging behavior, including high-resolution movement, foraging success, and the influence of the environment on this behavior, supported by impressive data collection. The weakness of this study is that the results only present a partial biological story contained within the data. The focus is on acoustic camouflage without addressing other aspects of barn owls' foraging strategy, leaving the reader with many unanswered questions. These include individual differences, direct measurements of owls' fitness, a detailed analysis of the foraging strategy of males and females, and the collective effort per nest box. However, it is possible that these data will be published in a separate paper. 

      We greatly appreciate your recognition of the comprehensive approach and extensive data collection. Our primary objective was to study the role of acoustic camouflage. Nonetheless, the manuscript now includes a detailed analysis of the foraging strategy and hunting success of males and females (lines 164-225).

      The results presented support the authors' conclusion that lower landing force during sit-and-wait hunting increases hunting success, likely due to a decreased probability of detection by their prey, resulting in acoustic camouflage. The authors also argue that hunting success is crucial for survival, and thus, acoustic camouflage has a direct link to fitness. While this statement is reasonable, it should be presented as a hypothesis, as no direct evidence has been provided here.

      Thank you for the comment. We agree and thus have edited the language accordingly.  

      However, since information about nestling survival is typically monitored when studying behavior during the breeding period, the authors' knowledge of the effect of acoustic camouflage on owls' fitness can probably be provided. Furthermore, it will be interesting to further examine the foraging strategies used by different individuals during foraging, the joint foraging success of both males and females within each nest box, and the link between landing force and foraging success if the data are available.

      We are currently writing a manuscript on these topics. We are aware that several scientific questions regarding the foraging ecology of the barn owl still need our attention. Regarding the link between landing force and foraging success, we believe that our revised manuscript addresses this specific topic, please see specific responses below.

      However, even without this additional analysis on survival, this paper provides an unprecedented dataset and the first measurement of landing force during hunting in the wild. It is likely to inspire many other researchers currently studying animal foraging behavior to explore how animals' movements affect foraging success.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide new evidence for motion-induced sound camouflage and can link the hunting approach to hunting success (detailing the adaptation and inferring a fitness consequence). 

      Strengths: 

      Strong evidence by combining high-resolution accelerometer data with a ground-truthed data set on prey provisioning at nest boxes. A good set of co-variates to control for some of the noise in the data provides some additional insights into owl hunting attempts. 

      Weaknesses: 

      There is a disconnect between the hypotheses tested and the results presented, and insufficient detail is provided on the statistical approach. R2 values of the presented models are very small compared to the significance of the effect presented. Without more detail, it is impossible to assess the strength of the evidence.

      In the revised manuscript, we changed the way results are presented and we improved the link between the hypotheses and the results. The R2 values are indeed small. It is however important to keep in mind that we are assessing the outcome of one specific behavior (i.e. landing force during sit-and-wait hunts) on hunting success in a wild environment, where many complex ecological interactions likely influence hunting success. Nonetheless, the coefficients (as reported in the results) show that for every 1 N increase in landing force, there is a 15% reduction in hunting success, which is substantial. In the discussion we also note that 50 Hz is a relatively low sampling frequency for estimating the peak ground reaction force. We have gone back over the presentation of our results and made our discussion more nuanced to acknowledge this aspect. 
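      To make the size of that effect concrete, the sketch below shows how a 15% reduction per Newton would compound over larger differences in landing force. It assumes, purely for illustration, that the 15% applies multiplicatively to the odds of success, as it would in a logistic-type model; the response does not state the model family, and the function names and baseline probability are hypothetical.

```python
# Assumption for illustration: the reported 15% reduction per 1 N acts on the
# odds of hunting success (odds ratio 0.85 per Newton). The response does not
# state this explicitly.
ODDS_RATIO_PER_NEWTON = 0.85


def odds_multiplier(delta_force_n: float, per_newton: float = ODDS_RATIO_PER_NEWTON) -> float:
    """Factor by which the odds of success change for a given change in force (N)."""
    return per_newton ** delta_force_n


def shifted_probability(baseline_p: float, delta_force_n: float) -> float:
    """Success probability after changing landing force, from a baseline probability."""
    if not 0.0 < baseline_p < 1.0:
        raise ValueError("baseline_p must be strictly between 0 and 1")
    odds = baseline_p / (1.0 - baseline_p) * odds_multiplier(delta_force_n)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    baseline = 0.30  # hypothetical baseline success probability
    for delta in (1, 2, 3):
        print(delta, "N:", round(shifted_probability(baseline, delta), 3))
```

      Read this way, an effect can be substantial per Newton even when the model explains little of the overall variance, because R2 summarizes all unmodelled sources of variation in hunting success rather than the slope of one predictor.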

      We have also added a detailed description about our model selection process in the methods section and provide a model selection table for each analysis in the supplementary materials.

      The authors seem to overcome persisting challenges associated with the validation and calibration of accelerometer data by ground-truthing on-board measures with direct observations in captivity, but here the methods are not described any further and sample sizes (2 owls - how many different loggers were deployed?) might be too small to achieve robust behavioural classifications.

      Thank you for the comment. Details of our methods of behavioural identification are provided in lines 385 – 429. There are two reasons why our results should not be limited by the sample size. First, we used the temporal sequence of changes in acceleration, and rates of change in acceleration data, which make the methods robust to individual differences in acceleration values. Furthermore, our methods for behavioural identification were not based on machine learning. Instead, we use a Boolean based approach (as described in Wilson et al. 2018. MEE), which is more robust to small differences in absolute values that might occur e.g. in relation to slight changes in device position. 

      Recommendation for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Comment 1. This study provides new insights into animals' foraging behavior and will probably inspire other researchers to examine foraging behavior in such high resolution.

      We hope so, thank you.

      Comment 2. However, it is necessary to describe better the measured landing force and the hunting strike and perching behavior so the readers can understand these methods when reading the results (and without reading the Methods).

      We have now changed the text in the “Results” to help the reader understand the key methods while reading the results.

      Comment 3. In addition, make sure you use the same terminology for hunting strategies during the entire paper and especially in all figures and corresponding result descriptions.

      We now use consistent terminology throughout the text and figures. We hope that this is now clear in the revised manuscript.

      Comment 4. In addition, although I find your statement about the link between acoustic camouflage and fitness reasonable, it should be described as a hypothesis or examined if you want to keep the direct link statement. I believe showing a direct link can add an additional outstanding aspect to this paper, but I also understand that it can be addressed in a separate paper.

      We agree that the relationship between hunting success and barn owl fitness is an important topic, but it necessitates a consideration of both hunting strategies, including hunting on the wing, which extends beyond the limits of our current study. Indeed, our primary objective was to conduct a detailed examination of the interplay between acoustic camouflage and the success of the sit-and-wait technique.

      However, we have edited the manuscript to explicitly describe the link between acoustic camouflage and fitness as a hypothesis. We believe this adjustment provides a more accurate representation of our approach. We hope this clarifies the specific emphasis of our work and its contribution to the understanding of barn owl hunting behavior.

      Here are my detailed comments about the paper: 

      Comment 5. Title: Consider changing the title to "Acoustic camouflage predicts hunting success in a wild predator." 

      Thank you for this suggestion. However, we opted for a different title, which is now “Landing force reveals new form of motion-induced sound camouflage in a wild predator”.

      Comment 6. Line 91-93: Please provide additional information about the collected dataset, including: 

      Description of the total period of observations, an average and standard deviation of perching and hunting attempt events per individual per night, number of foraging trips per individual per night, details about the geographic location and characteristics of the habitat, season, and reproductive state. 

      The revised manuscript now includes detailed information about the collected dataset (e.g. study area and reproductive state). “We used GPS loggers and accelerometers to record high resolution movement data during two consecutive breeding seasons (May to August in 2019 and 2020) from 163 wild barn owls (79 males and 84 females) breeding in nest boxes across a 1,000 km² intensive agricultural landscape in the western Swiss plateau.” (Results section, lines 79 – 82)

      Details about the number of foraging trips per individual and per night are now presented in the results: “Sexual dimorphism in body mass was marked among our sampled individuals. Males were lighter than females (84 females, average body mass: 322 ± 22.6 g; 79 males, average body mass 281 ± 16.5 g, Fig. S6) and provided almost three times more prey per night than females (males: 8 ± 5 prey per night; females: 3 ± 3 prey per night; Fig. S7). Males also displayed higher nightly hunting effort than females (Males: 46 ± 16 hunting attempts per night, n = 79; Females: 25 ± 11 hunting attempts per night, n = 84; Fig. 3A, Fig. S8). However, females were more likely to use a sit and wait strategy than males (females: 24% ± 15%, males: 13% ± 10%, Fig. S9). As a result, the number of perching events per night was similar between males and females (Females: 76 ± 23 perching events per night; Males: 69 ± 20 perching events per night; Fig. S8).” (lines 165 – 174) 

      Comment 7. In addition, state if the information describes breeding pairs of males and females and provides statistics on the number of tracked pairs and the number of nest boxes.

      The revised manuscript now includes a description of the number of tracked breeding pairs and the number of nest boxes. “Of these individuals, 142 belonged to pairs for which data were recovered from both partners (71 pairs in total, 40 in 2019, 31 in 2020). The remaining 21 individuals belonged to pairs with data from one partner (11 females and 1 male in 2019; 4 females and 5 males in 2020).” (lines 82 – 85)

      Comment 8. Line 93: Briefly define the term "landing force" and explain how it was measured (and let the reader know that there is a detailed description in the Methods).

      We now include a brief definition of the “landing force” along with a brief explanation of how it was measured in the results section. “We extracted the peak vectoral sum of the raw acceleration during each landing and converted this to ground reaction force (hereafter “landing force”, in Newtons) using measurements of individual body mass (see methods for detailed description).” (lines 92 – 95).
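      The definition quoted above can be sketched in a few lines. This is a minimal illustration and not the authors' code: it assumes the logger reports tri-axial acceleration in units of g, that the "vectoral sum" is the Euclidean norm of the three axes, and that force is obtained as mass times peak acceleration. The function names and sample values are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, used to convert acceleration from g to m/s^2


def peak_vector_sum(samples_g):
    """Largest Euclidean norm among (x, y, z) acceleration samples, in g."""
    return max(math.sqrt(x * x + y * y + z * z) for x, y, z in samples_g)


def landing_force_newtons(samples_g, body_mass_g):
    """Peak force in Newtons: body mass (kg) times peak acceleration (m/s^2).

    Assumes raw acceleration in g and body mass in grams.
    """
    mass_kg = body_mass_g / 1000.0
    return mass_kg * peak_vector_sum(samples_g) * STANDARD_GRAVITY


if __name__ == "__main__":
    # Hypothetical acceleration samples around one landing, in g.
    landing = [(0.0, 0.0, 1.0), (0.5, 0.2, 2.5), (0.0, 3.0, 4.0), (0.1, 0.0, 1.2)]
    print(round(landing_force_newtons(landing, body_mass_g=322), 2), "N")
```

      Because the peak is taken from discrete samples, a true peak that falls between two samples is underestimated, which is consistent with the authors' remark that 50 Hz is a relatively low sampling frequency for estimating peak ground reaction force.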

      Comment 9. Line 94: All definitions, including "pre-hunting force," need to be better described in the Results section.

      Thank you for this suggestion. We now provide a better description of those key definitions directly in the results section: 

      Measurement of landing force: “Barn owls employing a sit-and-wait strategy land on multiple perches before initiating an attack, with successive landings reducing the distance to the target prey (Fig. 2C). 

      We used the acceleration data to identify 84,855 landings. These were further categorized into perching events (n = 56,874) and hunting strikes (n = 27,981), depending whether barn owls were landing on a perch or attempting to strike prey on the ground (Fig. 1A and B, see methods for specific details on behavioral classification).” (lines 88 – 95)

      Pre-hunt perching force predicts hunting success: “Finally, we analyzed whether the landing force in the last perching event before each hunting attempt (i.e. pre-hunt perching force) predicted variation in hunting success” (lines 229 – 230)

      Comment 10. Line 102: Remove "Our analysis of 27,981 hunting strikes showed that" and add "n = 27,981" after the statistics. You have already stated your sample size earlier. There is no need to emphasize it again, although your sample size is impressive.

      We modified the text in the results section as suggested.

      Comment 11. Line 104: The results so far suggest that the difference in landing force between males and females is an outcome of their different body masses. However, it is not clear what is the reason for the difference in the number of hunting strike attempts between males and females (Lines 104-106). Can you compare the difference in landing force between males and females with similar body mass (females from the lower part of the distribution and males from the upper part)? Is there still a difference?

      Thank you. Following your comment, we ran new analyses that clarify the sex differences in landing force during perching and hunting strike events. First, however, we would like to clarify why there is a difference in the number of hunting attempts between males and females. During the breeding season, females typically perform most of the incubation, brooding, and feeding of nestlings in the nest, while the male primarily hunts food for the female and chicks. The female supports the male in providing food in a very irregular way, and this varies from pair to pair (paper in prep.). The difference in the number of hunting attempts between males and females reflects this asymmetry in food provisioning between the sexes during this specific period. We specified this in the revised version of the manuscript (lines 164 – 174). 

      We also provide a new analysis to investigate sex differences in mass-specific landing force (force/body mass). We found that males and females produce similar force per unit of body mass during perching events. This demonstrates that the overall higher perching force in females (see Fig. 4C in the manuscript) is driven by their higher body mass (lines 194 – 199).
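      The reasoning can be checked with a small worked example. The sketch below uses the average body masses quoted in the response (322 g for females, 281 g for males) together with a hypothetical, equal mass-specific force for both sexes; the function names and the 20 N/kg value are assumptions for illustration only.

```python
def mass_specific_force(force_n: float, body_mass_g: float) -> float:
    """Force per unit body mass, in N/kg."""
    return force_n / (body_mass_g / 1000.0)


def force_from_mass_specific(force_per_kg: float, body_mass_g: float) -> float:
    """Absolute force in Newtons implied by a given mass-specific force."""
    return force_per_kg * (body_mass_g / 1000.0)


if __name__ == "__main__":
    same_force_per_kg = 20.0  # hypothetical value shared by both sexes, N/kg
    female = force_from_mass_specific(same_force_per_kg, 322)  # average female mass
    male = force_from_mass_specific(same_force_per_kg, 281)    # average male mass
    print(f"female: {female:.2f} N, male: {male:.2f} N")
```

      With identical force per kilogram, the heavier sex still lands with a higher absolute force, which is the pattern the authors attribute to body mass.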

      Comment 12. Line 154: I believe Boonman et al. (2018) is relevant to this part of the discussion. Boonman, Arjan, et al. found that barn owl noise during landing and taking off is worth considering. ["The sounds of silence: barn owl noise in landing and taking off." Behavioral Processes 157 (2018): 484-488.]

      We have now cited this paper in the discussion.

      Comment 13. Line 164: Your results do not directly demonstrate a link to fitness, although they potentially serve as a proxy for fitness (add a reference). However, you might have information regarding nestlings' survival - that will provide a direct link for fitness. Change your statement or add the relevant data.

      We appreciated your feedback, and we adjusted the language accordingly.

      Comment 14. Line 213: If the poles are closer to the ground - is it possible that the higher trees and buildings serve for resting and gathering environmental information over greater distances? For example, identifying prey at farther distances or navigating to the next pole?

      Yes, this is indeed the most likely explanation for the fact that owls land more on buildings and trees than on poles until the last period (about 6 minutes) before hunting. In these last minutes, barn owls preferentially use poles, as we showed in figure 2B. The revised manuscript now includes this explanation in the discussion (lines 269 – 284).

      Comment 15. Line 250: The product "AXY-Trek loggers" does not appear on the Technosmart website (there are similar names, but not an exact match). Are you sure this is the correct name of the tracking device you used? 

      Thank you for pointing out this detail that we missed. The device we used is now called "AXY-Trek Mini" (https://www.technosmart.eu/axy-trek-mini/). We have corrected this error directly in the revised manuscript.

      Comment 16. Line 256: Please explain how the devices were recovered. Did you recapture the animals? If so, how? Additionally, replace "after approximately 15 days" with the exact average and standard deviation. Furthermore, since you have these data, please state the difference in body mass between the two measurements before and after tagging.

      The birds were recaptured to recover the devices. Adult barn owls were recaptured at their nest sites, again using automatic sliding traps that are activated when birds enter the nest box. The statement "after approximately 15 days" was replaced by the exact mean and standard deviation, which were 10.47 ± 2.27 days. Those numbers exclude five individuals from the total of 163 individuals included in this study. They could not be recaptured in the appropriate time window but were re-encountered when they initiated a second clutch later in the season (4 individuals) or a new clutch the year after (1 individual).

      We integrated this previously missing information in the revised manuscript (lines 370 – 372).

      Comment 17. Line 259: What was the resolution of the camera? What were the recording methods and schedule? How did you analyze these data? 

      The resolution was set to 3.1 megapixels. Motion-sensitive camera traps were installed at the entrance to each nest box throughout the period when the barn owls were wearing data loggers, and each detected movement triggered a burst of three photos. The photos were not analyzed as such for this study but were used to confirm each prey delivery that had previously been detected from the accelerometer data. We added these details in the revised manuscript (lines 377 – 380).

      Comment 18_1. Figure 1: 

      Panel A) Include the sex of the described individual. 

      The sex of the described individual is now included in the figure caption.

      Comment 18_2. It would be interesting to show these data for both males and females from the same nest box (choose another example if you don't have the data for this specific nest box). 

      Although we agree that showing tracks of males and females from the same nest is very interesting, the purpose of this figure was to illustrate our data annotation process and we believe that adding too many details on this figure will make it appear messy. However, the revised manuscript now includes a new figure (Fig. 3A) which shows simultaneous GPS tracks of a male and a female during a complete night, with detailed information about perching and hunting behaviors.

      Comment 18_3. Add the symbol of the nest box to the legend. 

      Done

      Comment 18_4. Provide information about the total time of the foraging trip in the text below. 

      The duration of the illustrated foraging trip has been included in the figure caption.

      Comment 18_5. To enhance the figure’s information on foraging behavior, consider color coding the trajectory based on time and adding a background representing the landscape. Since this paper may be of interest to researchers unfamiliar with barn owl foraging behavior, it could answer some common questions. 

      For reasons similar to those explained in our answer above (Comment 18_2), we would rather keep this figure as clean as possible. However, we followed your recommendations and included these details in the new Figure 3 described above. In this new figure, GPS tracks are color coded according to the foraging trip number and shown on a background representing the landscape. To provide even more detail about the landscape, we added another figure in the supplementary materials (Fig. S2), which illustrates barn owl foraging grounds and a nest site and which we think might be of interest to people unfamiliar with barn owls.

      Comment 18_6. Inset panels) provide a detailed description of the acceleration insert panels. 

      Done

      Comment 18_7. Color code the acceleration data with different colors for each axis, add x and y axes with labels, and ensure the time frame on the x-axis is clear. How was the self-feeding behavior verified (should be described in the methods section)? 

      We kept both inset panels as simple as possible since they serve here as examples, but a complete representation of these behaviors (with time frame, different colors and labels) is provided in the supplementary materials (figure S3). We included this statement in the figure caption and added a reference to the full representations from the supplementary materials: 

      In the Figure caption: “Inset panels show an example of the pattern of the tri-axial acceleration corresponding to both nest-box return and self-feeding behaviors (but see Fig. S3 for a detailed representation of the acceleration pattern corresponding to each behavior).” 

      In the Method section: “Self-feeding was evident from multiple and regular acceleration peaks in the surge and heave axes (resulting in peaks in VeDBA values > 0.2 g and < 0.9 g, Fig. S3D), with each peak corresponding to the movement of the head as the prey was swallowed whole.”
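      For readers unfamiliar with VeDBA (vectorial dynamic body acceleration), the sketch below shows one common way to compute it and to flag samples inside the 0.2 to 0.9 g band quoted above. It is a hedged illustration, not the authors' pipeline: static acceleration is estimated here with a centered running mean, and the window length (101 samples, roughly 2 s at 50 Hz) and all function names are assumptions.

```python
import math


def running_mean(values, window):
    """Centered running mean; the window is truncated at the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def vedba(surge, sway, heave, window=101):
    """VeDBA per sample: norm of the dynamic (raw minus static) acceleration, in g."""
    dynamic = []
    for axis in (surge, sway, heave):
        static = running_mean(axis, window)
        dynamic.append([raw - s for raw, s in zip(axis, static)])
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(*dynamic)]


def in_self_feeding_band(vedba_values, low=0.2, high=0.9):
    """Boolean flag per sample: is VeDBA strictly between the two thresholds (g)?"""
    return [low < v < high for v in vedba_values]
```

      A Boolean rule of this kind depends on whether values fall inside a band and on the temporal pattern of peaks rather than on exact magnitudes, which fits the authors' argument that their classification is robust to small shifts in absolute acceleration.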

      Comment 18_8. Panel B) Note in the caption that you refer to the acceleration z-axis.

      We believe that keeping the statement “the heave acceleration…” in the figure caption is more informative than referring to the “z-axis” as it describes the real dimension to which we are referring. The use of the x, y and z axes can be misleading as they can be interchanged depending on the type and setting of recorders used.

      Comment 18_9. Present the same time scale for both hunting strategies to facilitate comparison. You can achieve this by showing only part of the flight phase before perching. 

      Done

      Comment 18_10. Panel C) Presenting the data for both hunting strategy and sex would provide more comprehensive information about the results and would be relatively easy to implement. 

      We agree with your comment. We present the differences in landing force for both landing contexts and sexes in the new Figure 3 as well as in the supplementary materials (Figure S10) of this revised manuscript.

      Comment 19. Figure 2: Please provide an explanation of the meaning of the circles in the figure caption.  

      Done

      Comment 20. Figure 3: 

      Panel A) It is unclear how the owl illustration is relevant to this specific figure, unlike the previous figures where it is clear. Also, suggest removing the upper black line from the edge of the figure or add a line on the right side. 

      Done (now in Figure 2).

      Panel B) "Density" should be capitalized. 

      Done

      Panel C) Add a scale in meters, and it would be helpful to include an indication of time before hunting for each data point. 

      Done

      Comment 21. Figure S1: Mark the locations of the nest boxes and ensure that trajectories of different individuals and sexes can be identified. 

      The purpose of this figure was to show the spatial distribution of the data. We think that adding nest locations and coloring the paths according to individuals and/or sex will make the figure less clear. However, the new Figure 3 highlights those details.

      Comment 22. Figure S2: Show the pitch angle similarly to how you showed the acceleration axes, and explain what "VeDBA" stands for. Provide a description of the perching behavior, clearly indicating it on the figure. Add axes (x, y, z) to the illustration of the acceleration explanation. 

      We edited this figure (now figure S3) to show the pitch angle and provide an explanation of what “VeDBA” stands for in the figure caption. The figure caption now also provides a better description of the perching behavior. For the axes (i.e. X, Y, Z), we prefer to refer to the heave, surge, and sway as this is more informative and refers to what is usually reported in studies working with tri-axial accelerometers.

      Comment 23. Table S1: Improve the explanation in the caption and titles of the table. 

      Done

      Reviewer #2 (Recommendations For The Authors): 

      Comment 1. From the public review and my assessment there, the authors can be assured that I thoroughly enjoyed the read and am looking forward to seeing a revised and improved version of this paper. 

      We thank the reviewer for this comment. We revised the manuscript according to their comments.

      Comment 2. In addition to my major points stated above, I would like to add the following recommendations: 

      The manuscript is overall well written, but it uses a very pictorial language (a little as if we were in a David Attenborough documentary) that I find inappropriate for a research paper (especially in the abstract and introduction: "remarkable" (2x), "sophisticated" (are there any unsophisticated adaptations? We are referring to something under selection after all), etc.).

      We appreciate that you found the paper well written overall, and we understand the comment about pictorial language. We therefore slightly changed the text to make sure that the adjectives used to describe adaptive strategies are not over-emphasized.

      Comment 3. Abstract 

      "While the theoretical benefits of predator camouflage are well established, no study has yet been able to quantify its consequences for hunting success." - This claim is actually not fully true: 

      Nebel Carina, Sumasgutner Petra, Pajot Adrien and Amar Arjun 2019: Response time of an avian prey to a simulated hawk attack is slower in darker conditions, but is independent of hawk colour morph. Soc. open sci.6:190677 

      We edited our claim to specify that the consequences of predator camouflage for hunting success have never been quantified in natural conditions, and we cited the reference in the introduction.

      Comment 4. Line 23. Rephrase to: "We used high-resolution movement data to quantify how barn owls (Tyto alba) conceal their approach when using a sit-and-wait strategy, as well as the power exerted during strikes." 

      We edited this sentence in the abstract, as suggested.

      Comment 5. Results 

      There is a disconnect between the objectives outlined at the end of the introduction and the following results that should be improved. 

      The authors state: "Using high-frequency GPS and accelerometer data from wild barn owls (Tyto alba), we quantify the landing dynamics of this sit-and-wait strategy to (i) examine how birds adjust their landing force with the behavioral and environmental context and (ii) test the extent to which the magnitude of the predator cue affects hunting success." But one of the first results presented are sex differences. 

      This is a fair point. We have now changed our statement at the end of the introduction, as well as the order of the results, to improve the link between the objectives outlined in the introduction and the way the results are presented.

      Comment 6. At this stage, the reader does not even know yet that we are presented with a size-dimorphic species that also has very different parental roles during the breeding season. This should be better streamlined, with an extra paragraph in the introduction. And these sex differences are then not even discussed, so why bring them up in the first place (and not just state "sex has been fitted as additional co-variate to account for the size-dimorphism in the species" without further details). 

      We edited the way the objectives are outlined in the introduction to cover the size dimorphism (lines 70 – 76). We also completely changed the way the sex differences are presented in the results, including a new analysis that we believe provides a more comprehensive understanding of barn owl foraging behavior (lines 164 – 206). Finally, we added a new paragraph in the discussion to consider those results (lines 319 – 339).

      Comment 7. It is not clear to me where and how high-resolution GPS data were used? The results seem to concentrate on ACC – why GPS was used and how it features should be foreshadowed in a few lines in the introduction. I definitely prefer having the methods at the end of a manuscript, but with this structure, it is crucial to give the reader some help to understand the storyline. 

      GPS data were used to validate some behavioral classifications (prey provisioning for example), but most importantly they were used to link each landing event with perch types. We edited the text in the result section to clarify where GPS and/or ACC data were used.

      Comment 8. Discussion 

      Move the orca example further down, where more detail can be provided to understand the evidence. 

      After our extensive edits in the discussion, we felt this example was interrupting the flow. We now cite this study in the introduction. 

      Comment 9. Size dimorphism and evident sex differences are not discussed. 

      The revised manuscript now includes a new paragraph in the discussion in which sex differences are discussed (lines 319 – 339).

      Comment 10. Be more precise in the terminology used (for example, land use seems to be interchangeable with habitat characteristics?). 

      We replaced “land use” with “habitat data” in the revised manuscript.

      Comment 11. Methods 

      Please provide a justification for the very high weight limit (5%; line 256). This limit is outdated and does not fulfill the international standard of 3% body weight. I assume the ethics clearance went through because of the short nature of the study (i.e., the birds were not burdened for life with the excess weight? But a line is needed here or under the ethics considerations to clarify this). 

      The 5% weight limit was considered acceptable due to the short deployment period, and we have now edited the ethics statement to emphasize this point. However, it is important to note that there is no real international standard, with both 3% and 5% weight limits being commonly used. Both limits are arbitrary, and the impact of a fixed mass on a bird varies with species and flight style. All owls survived and bred similarly to the non-tagged individuals in the population (lines 373 – 376 & lines 558 – 561).

      EDITORIAL COMMENT: We strongly encourage you to provide further context and clarification on this issue, as suggested by the Reviewer. On a related point, the ethics statement refers to GPS loggers, rather than GPS and ACC devices; we encourage you to clarify wording here.

      Thank you for highlighting this point, which indeed needed some clarification.

      Although we used the terminology "GPS recorders", the authorization granted by the Swiss authorities for this study effectively covers the entire tracking system, which combines both GPS and ACC recorders in the same device. We have therefore changed the wording used in the ethics statement to avoid any misunderstanding (lines 373 – 376 & lines 558 – 561).

      Comment 12. Please provide more information on the model selection approach, what does "Non-significant terms were dropped via model simplification by comparing model AIC with and without terms." mean? Did the authors use a stepwise backward elimination procedure (drop1 function)? Or did they apply a complete comparison of several candidate models? I think a model comparison approach rather than stepwise selection would be more informative, as several rather than only one model could be equally probable. This might also improve model weights or might require a model averaging procedure - current reported R2values are very small and do not seem to support the results well. 

      We apologize for the lack of details about this important aspect of the statistical analysis. We applied an automated model selection using the dredge function from the R package “MuMIn”, therefore applying a complete comparison of several candidate models. The final models were chosen as the best models since the number of candidate models within ∆AIC<2 was relatively low in each analysis, and thus model averaging was not appropriate here. We edited the methods section to ensure clarity, and added model selection tables for each analysis, ranked according to AICc scores, in the supplementary materials (lines 532 – 552).

      In addition, we agree that the reported R-squared values in our analyses are quite low, specifically regarding the influence of pre-hunt perching force on hunting success (cond R2 = 0.04). Nonetheless, landing impact still has a notable effect size (an increase of 1 N reduces hunting success by 15%). The reported values are indicative of the inherent complexity in studying hunting behavior in a wild setting where numerous variables come into play. We specifically investigated the hypothesis that the force involved during pre-hunt landings, and consequently the emitted noise, influences the success of the next hunting attempt in wild barn owls. Factors such as prey behavior and micro-habitat characteristics surrounding prey (such as substrate type and vegetation height) are most likely to be influential but hard, or nearly impossible, to model. We now cover this in a more nuanced way in the discussion (lines 266 – 268).
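
      The all-subsets comparison described in the response to Comment 12 can be illustrated with a minimal sketch. This is not the authors' code (they used the dredge function in R); it only shows how candidate models are ranked by AICc and how the set within a ∆AICc of 2 is identified. The model names, log-likelihoods, parameter counts, and sample size below are hypothetical values chosen purely for illustration.

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k estimated
    parameters fitted to n observations."""
    aic = 2 * k - 2 * log_lik
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def rank_models(candidates, n):
    """Return rows (name, AICc, delta AICc, Akaike weight), best model first."""
    scored = [(name, aicc(ll, k, n)) for name, (ll, k) in candidates.items()]
    best = min(score for _, score in scored)
    rel = [(name, s, s - best, math.exp(-0.5 * (s - best))) for name, s in scored]
    total = sum(r[3] for r in rel)
    rows = [(name, s, d, w / total) for name, s, d, w in rel]
    return sorted(rows, key=lambda r: r[1])

# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {
    "force + perch + sex": (-1040.0, 6),
    "force + perch": (-1040.5, 5),
    "force": (-1046.0, 4),
    "null": (-1052.0, 3),
}

table = rank_models(candidates, n=3000)
# Models that are not clearly worse than the best one (delta AICc < 2).
top_set = [row[0] for row in table if row[2] < 2]

for name, score, delta, weight in table:
    print(f"{name:22s} AICc={score:9.2f} delta={delta:6.2f} weight={weight:.3f}")
```

      When only a few models fall within the ∆AICc < 2 set, reporting the top-ranked model is a common choice; with many near-equivalent models, averaging over the set is usually considered instead.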

      Comment 13. Please explain why BirdID was nested in NightID - this is not clear to me.

      There may be a misunderstanding here: we wrote that we nested NightID in BirdID (and not BirdID in NightID).

      Comment 14. I hope the final graphs and legends will be larger, they are almost impossible to read. 

      We enlarged the graphs and legends as much as possible to improve readability. However, looking at the graphs in the published version they seem clear and readable.

      Comment 15. Figure S1: Does "representation" mean the tracks don't show all of the 163 owls? If so, be precise and tell us how many are illustrated in the figure. 

      Figure S1 represents the tracks for each of the 163 barn owls used in the study. We changed the terminology used in the figure caption to avoid any misunderstanding.

      Comment 16. Figure S4: Please adjust the y-axis to a readable format. 

      Done

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1 comments:

      (1) SY1 aggregation enhances (in terms of number of aggregates) when Sphingolipid biosynthesis is blocked.

      a. Line no 132-133: I agree that there is circumstantial evidence that the maturation pathway of SY1 IB is perturbed by knocking down sphingolipid biosynthesis. However, to prove this formally, a time course of IB maturation needs to be reported in the knock-down strains.

      Please see Figure 2-figure supplement 1 for the time course of SY1 IB maturation in the knock-down strains. We have added this result to the manuscript; please see lines 129-131 on page 5 in the revised version.

      b. It will be good to have formal evidence that sphingolipids are indeed downregulated when these genes are downregulated (knocked down).

      This has been clearly demonstrated in previous reports, and we have added the appropriate references to the main text. For example, Huang et al. (https://doi.org/10.1371/journal.pgen.1002493) showed that down-regulation of LCB1 or SPT in yeast decreased sphingolipid levels. According to the report from Tafesse FG et al. (https://doi.org/10.1371/journal.ppat.1005188), in mammalian cells in which Sptlc2 was knocked down by CRISPR/Cas9, sphingolipid and glucosylceramide production is almost completely blocked. In addition, the levels of sphingosine, sphingomyelin, and ceramide were significantly lower compared to control cells. Please see lines 143-144 on page 6 and lines 232-233 on page 9 in the revised version.

      (2) In a normal cell (where sphingolipid biosynthesis is not hampered), the aggregate of SY1 (primarily the Class I aggregate) is localized only on the mitochondrial endomembrane system. These results have been published for other aggregation-prone proteins and are partly explained in the literature. However, their role in the context of maturation is relatively unclear. The authors however provide no strong evidence to show if mitochondria are preferentially involved in any of the stages of IB maturation. Specifically:

      a. Line 166-167: It is not clear from Figure 4B that this is indeed the case. Only the large IB seems to colocalize in all three panels (Class I, 2, 3) with Mitotracker. The smaller IBs in 2 and 3 do not show any obvious co-localization. It is also possible that they do co-localize, but it is not clear from the images. I would appreciate it if the authors either provide stronger evidence (better image) or revise this statement. This point is crucial in some claims made later in the manuscript. (pls see comment #5A).

      Based on the reviewer's suggestion, we replaced the images in Figure 4B. In addition, we added the 3D reconstruction results of the interrelationship between Class 3 and Mitotracker in Figure 4-figure supplement 1B, to further show their relationship.

      (3) The localization is due to the association of SY1 (aggregates) with mitochondrial proteins like Tom70, Tim44 etc. There are some critical points (that can strengthen the manuscript) that are not addressed here. Primarily, the important role of mitochondria in the context of toxicity is neglected. Although the authors have mentioned in the discussion that it was not their main focus, I believe that this is the novel part of the manuscript and this part is potentially a beautiful addition to literature. The questions I found unanswered are:

      a. Is the localization completely lost upon deleting these genes? I see only a partial loss in shape/localization. This is not properly explained in the manuscript. The shape of the IB seems to remain intact while the localization is slightly altered. This indicates that even when sphingolipid is present, SY1 localization is dictated by the (lipid-raft embedded) proteins. Interestingly, it shows that even in the absence of mitochondrial localization the shape of the aggregates is not altered in these deletion strains! How do the authors explain this if mitochondrial surface sphingolipids are important for IB maturation? (the primary screen found that sphingolipid biosynthesis promotes the formation of Class I IBs).

      We agree that mutation of a single mitochondrial binding protein causes only a partial loss in shape/localization, and we have replaced “association” with “surrounding” in the manuscript. Please see lines 163-166 on page 6 in the revised version. In deletion mutants of the proteins that interact with SY1, we counted the proportion of Class 3 aggregates formed by SY1 and found that it increased compared to controls. Thus, a partial loss of the interaction of SY1 with mitochondria does affect the shape of the aggregates, as detailed in line 184 on page 7 and Figure 4-figure supplement 1D. We think that SY1 interactions with mitochondrial proteins are important for the localization of SY1 IB in mitochondria, whereas sphingolipids play an important role in facilitating the formation of Class 1 IBs from Class 3 aggregates.

      b. What happens to the toxicity when the aggregates are not localized on mitochondria?

      We thank the reviewer for the comment. However, because a single mutant only partially affects the phenotype, investigating this issue may require constructing mutants that combine deletions of several genes, which we will pursue in future studies. What we want to show in this work is that SY1 binds to mitochondria by interacting with these mitochondrial proteins.

      c. It is important to note that sphingolipids may affect the whole process indirectly by altering pathways involved in protein quality control or UPR. UPR may regulate the maturation of IBs. It is therefore important to test if any of the effects seen could be of direct consequence.

      We agree with the reviewer's comments, but there was no significant enrichment for protein quality control or UPR-related pathways in our genome-wide screen, so it is unlikely that sphingolipids indirectly cause maturation of IBs by affecting these two pathways. We addressed this issue in our discussion. Please see lines 325-328 on page 12 in the revised version.

      d. In Figure 4D, the authors find SY1 when they pull down Tom70, Tom37 or Tim44. Tim44 is a protein found in the mitochondrial matrix, how do the authors explain that this protein is interacting with a protein outside the mitochondrial outer membrane?

      This interaction could potentially arise because some of the soluble SY1 enters the mitochondrial matrix and interacts with Tim44.

      e. Is it possible that the authors are immunoprecipitating SY1 since IBs have some amount of unimported mitochondrial proteins in aggregates formed during proteotoxic stress (https://doi.org/10.1073/pnas.2300475120) (Liu et al. 2023).

      Our Co-IP experiments were performed on the soluble supernatant fraction, so mitochondrial proteins in aggregates were not detected.

      f. Line 261 (Discussion): Does deletion of Tom70 or one of the anchors increase Class III aggregation and increase toxicity? Without this, it is hard to say if mitochondria are involved in detoxification.

      We thank the reviewer for the comments, please see our response to comment 3b.

      (4) This fuels the loss of mitochondrial function.

      a. Line 218-219: Although the change is significant, the percentage change is very slight. Is this difference enough to be of physiological relevance in mitochondrial function? In our hands, the DCF fluorescence is much more variable.

      We agree with the reviewer that the difference is small (but significant). The extent to which such a difference is of physiological relevance for mitochondrial function needs to be investigated further.

      b. Is SY1-induced loss of mitochondrial function less in knockouts of Tom70 or the other ones found to be important for localizing the SY1 aggregate to mitochondria?

      We examined mitochondrial membrane potential (indicated by Rho 123 fluorescence intensity) in tom70Δ, tom37Δ and control his3Δ strains and found that knocking out Tom70 or Tom37 reduced the mitochondrial toxicity caused by SY1 expression. Please see lines 212-214 on page 8 in the revised version, and Figure 5-figure supplement 2.

      (5) Mitochondrial function is further abrogated when there is a block in sphingolipid biosynthesis.

      a. Myriocin acted like the deletion strains that showed less structured aggregates. There were more aggregates (Class 3) but visually they seemed to be spread apart. The first comment (#2A) on aggregate classes and their interaction with mitochondria may become relevant here.

      According to a recent review article (https://doi.org/10.3389/fcell.2023.1302472), sphingolipids are present in the mitochondrial membrane, bind to many mitochondrial proteins and have emerged as key regulators of mitochondrial morphology, distribution and function. Dysregulation of sphingolipid metabolism in mitochondria disrupts many mitochondrial processes, leading to mitochondrial fragmentation, impaired bioenergetics and impaired cellular function. Myriocin treatment, which affects sphingolipid metabolism, causes mitochondria to become more fragmented, which may explain why the aggregates appear visually spread apart. Regarding the interaction with mitochondria, we counted the proportion of SY1 aggregates surrounded by mitochondria after treatment with myriocin, and the results were not significantly different compared to the control. Please see lines 168-169 on page 6 in the revised version, and Figure 4-figure supplement 1C.

      (6) A similar phenomenon is conserved in mammalian cell lines.

      a. Line 225-226: Did the authors confirm that this was the only alteration in the genome? Or did they complement the phenotype, genetically?

      We performed SPTLC2 gene complementation experiments in knockout cell lines and found that SPTLC2 gene complementation was able to reduce the number of cells forming IBs and the percentage of dispersed irregular IBs compared to controls. Please see lines 240-242 on page 9 in the revised version, and Figure 6-figure supplement 2B.

      b. Line 241-245: One of the significant phenotypes observed by downregulating sphingolipid biosynthesis in yeast and mammalian cells was the increase in the number of aggregates. This is not shown for myriocin treatment in mammalian cells. This needs to be shown to maintain concordance with the original screen and the data presented with the KO mammalian cell line.

      Please see Figure 7-figure supplement 1A for the data on the proportion of cells forming SY1 IBs after myriocin treatment in mammalian cells. The effect of myriocin treatment in mammalian cells was the same as that observed in the KO mammalian cell line.

      Minor Comments:

      Line 273-275: How is this statement connected to the previous statement? Was it observed that aggregate fusion was advantageous to the cells?

      Yes, aggregate/oligomer fusion is advantageous to the cells, and we have modified the previous statement. Please see line 280 on page 10 in the revised version.

      Line 293-294: I am not sure I understand this statement.

      We have modified this statement. Please see lines 302-303 on page 11 in the revised version.

      Line 295-296: But the authors have commented at multiple places that mitochondria detoxify the cell from SY1 aggregates. I find this link fascinating and worth investigating. Most of the current work has some known links in literature (not everything). The mitochondrial connection being the most fascinating one.

      We have removed this sentence. We have added a validation experiment for the role of mitochondrial activity in SY1 IB maturation in the revised version.

      Line 318: Do the authors mean: The open question is...

      We thank the reviewer; we have corrected it.

      Response to Reviewer #2 comments:

      I recommend considering live cell microscopy to analyze whether sphingolipid-dependent formation of SY1 IB takes place at the mitochondrial outer membrane. The IBs could also be produced at other membranes and then transported to the mitochondrial outer membrane for storage.

      As shown in Figure 4A, SY1 IB primarily interacts with mitochondria.

      I recommend analyzing whether mitochondrial activity is needed for sphingolipid-dependent SY1 IB formation. Are these IBs localized to mitochondrial membrane solely as scaffold or are these organelles needed to provide the energy for driving IB formation in concert with sphingolipids? This point could be addressed with rho0 strains lacking mitochondrial DNA.

      We thank the reviewer for this recommendation. We expressed SY1 protein in the BY4741 rho0 strain as suggested and found that the maturation and mitochondrial surrounding state of SY1 IB were not affected by mitochondrial activity. Please see lines 185-187 on page 7 in the revised version, and Figure 4-figure supplement 1E and 1F.

      The authors should be more precise in the statistical methods used in their study (method, pre-/post-tests, number of replicates...).

      We thank the reviewer for the comment and we have provided a more precise description of the statistical methods. Please see lines 531-534 on page 19 and figure legends in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript aims at a quantitative model of how visual stimuli, given as time-dependent light intensity signals, are transduced into electrical currents in photoreceptors of macaque and mouse retina. Based on prior knowledge of the fundamental biophysical steps of the transduction cascade and a relatively small number of free parameters, the resulting model is found to fairly accurately capture measured photoreceptor currents under a range of diverse visual stimuli and with parameters that are (mostly) identical for photoreceptors of the same type.

      Furthermore, as the model is invertible, the authors show that it can be used to derive visual stimuli that result in a desired, predetermined photoreceptor response. As demonstrated with several examples, this can be used to probe how the dynamics of phototransduction affect downstream signals in retinal ganglion cells, for example, by manipulating the visual stimuli in such a way that photoreceptor signals are linear or have reduced or altered adaptation. This innovative approach had already previously been used by the same lab to probe the contribution of photoreceptor adaptation to differences between On and Off parasol cells (Yu et al, eLife 2022), but the present paper extends this by describing and testing the photoreceptor model more generally and in both macaque and mouse as well as for both rods and cones.

      Strengths:

      The presentation of the model is thorough and convincing, and the ability to capture responses to stimuli as different as white noise with varying mean intensity and flashes with a common set of model parameters across cells is impressive. Also, the suggested approach of applying the model to modify visual stimuli that effectively alter photoreceptor signal processing is thought-provoking and should be a powerful tool for future investigations of retinal circuit function. The examples of how this approach can be applied are convincing and corroborate, for example, previous findings that adaptation to ambient light in the primate retina, as measured by responses to light flashes, mostly originates in photoreceptors.

      Weaknesses:

      In the current form of the presentation, it doesn't become fully clear how easily the approach is applicable at different mean light levels and where exactly the limits for the model inversion are at high frequency. Also, accessibility and applicability by others could be strengthened by including more details about how parameters are fixed and what consensus values are selected.

      Thank you - indeed a central goal of writing this paper was to provide a tool that could be easily used by other laboratories. We have clarified and expanded four points in this regard: (1) we have stated more clearly that mean light levels are naturally part of the inversion process, and hence the approach can be applied across a broad range of light levels (lines 292-297); (2) we have expanded our analysis of the high frequency limits to the inversion and added that expanded figure to the main text (new Fig 5); (3) we have included additional detail about our calibration procedures, including our calibration code, to facilitate transfer to other labs; and (4) we have detailed the procedure for identification of consensus parameters (lines 172-182, 191-199 and Methods section starting on line 831).

      Reviewer #2 (Public Review):

      Summary:

      This manuscript proposes a modeling approach to capture nonlinear processes of photocurrents in mammalian (mouse, primate) rod and cone photoreceptors. The ultimate goal is to separate these nonlinearities at the level of photocurrent from subsequent nonlinear processing that occurs in retinal circuitry. The authors devised a strategy to generate stimuli that cancel the major nonlinearities in photocurrents. For example, modified stimuli would generate genuine sinusoidal modulation of the photocurrent, whereas a sinusoidal stimulus would not (i.e., because of asymmetries in the photocurrent to light vs. dark changes); and modified stimuli that could cancel the effects of light adaptation at the photocurrent level. Using these modified stimuli, one could record downstream neurons, knowing that any nonlinearities that emerge must happen post-photocurrent. This could be a useful method for separating nonlinear mechanisms across different stages of retinal processing, although there are some apparent limitations to the overall strategy.

      Strengths:

      (1) This is a very quantitative and thoughtful approach and addresses a long-standing problem in the field: determining the location of nonlinearities within a complex circuit, including asymmetric responses to different polarities of contrast, adaptation, etc.

      (2) The study presents data for two primary models of mammalian retina, mouse, and primate, and shows that the basic strategy works in each case.

      (3) Ideally, the present results would generalize to the work in other labs and possibly other sensory systems. How easy would this be? Would one lab have to be able to record both receptor and post-receptor neurons? Would in vitro recordings be useful for interpreting in vivo studies? It would be useful to comment on how well the current strategy could be generalized.

      We agree that generalization to work in other laboratories is important, and indeed that was a motivation for writing this as a methods paper. The key issue in such generalization is calibration. We have expanded our discussion of our calibration procedures and included that code as part of the github repository associated with the paper. Figure 10 (previously Figure 9) was added to illustrate generalization. We believe that the approach we introduce here should generalize to in vivo conditions. We have expanded the text on these issues in the Discussion (sections starting on line 689 and 757).

      Weaknesses:

      (1) The model is limited to describing photoreceptor responses at the level of photocurrents, as opposed to the output of the cell, which takes into account voltage-dependent mechanisms, horizontal cell feedback, etc., as the authors acknowledge. How would one distinguish nonlinearities that emerge at the level of post-photocurrent processing within the photoreceptor as opposed to downstream mechanisms? It would seem as if one is back to the earlier approach, recording at multiple levels of the circuit (e.g., Dunn et al., 2006, 2007).

      Indeed the current model is limited to a description of rod and cone photocurrents. Nonetheless, the transformation of light inputs to photocurrents can be strongly nonlinear, and such nonlinearities can be difficult to untangle from those occurring late in visual processing. Hence, we feel that the ability to capture and manipulate nonlinearities in the photocurrents is an important step. We have expanded Figure 10 to show an additional example of how manipulation of nonlinearities in phototransduction can give insight into downstream responses. We have also noted in text that an important next step would be to include inner segment mechanisms (section starting on line 661); doing so will require not only characterization of the current-to-voltage transformation, but also horizontal cell feedback and properties of the cone output synapse.

      (2) It would have been nice to see additional confirmations of the approach beyond what is presented in Figure 9. This is limited by the sample (n = 1 horizontal cell) and the number of conditions (1). It would have been interesting to at least see the same test at a dimmer light level, where the major adaptation mechanisms are supposed to occur beyond the photoreceptors (Dunn et al., 2007).

      We have added an additional experiment to this figure (now Figure 10) which we feel nicely exemplifies the approach. The approach that we introduce here really only makes sense at light levels where the photoreceptors are adapting; at lower light levels the photoreceptors respond near-linearly, so our “modified” and “original” stimuli as in Figure 10 (previously Figure 9) would be very similar (and post-phototransduction nonlinearities are naturally isolated at these light levels).

      Reviewer #3 (Public Review):

      Summary:

      The authors propose to invert a mechanistic model of phototransduction in mouse and rod photoreceptors to derive stimuli that compensate for nonlinearities in these cells. They fit the model to a large set of photoreceptor recordings and show in additional data that the compensation works. This can allow the exclusion of photoreceptors as a source of nonlinear computation in the retina, as desired to pinpoint nonlinearities in retinal computation. Overall, the recordings made by the authors are impressive and I appreciate the simplicity and elegance of the idea. The data support the authors' conclusions but the presentation can be improved.

      Strengths:

      -  The authors collected an impressive set of recordings from mouse and primate photoreceptors, which is very challenging to obtain.

      -  The authors propose to exploit mechanistic mathematical models of well-understood phototransduction to design light stimuli that compensate for nonlinearities.

      -  The authors demonstrate through additional experiments that their proposed approach works.

      Weaknesses:

      -  The authors use numerical optimization for fitting the parameters of the photoreceptor model to the data. Recently, the field of simulation-based inference has developed methods to do so, including quantification of the uncertainty of the resulting estimates. Since the authors state that two different procedures were used due to the different amounts of data collected from different cells, it may be worthwhile to rather test these methods, as implemented e.g. in the SBI toolbox (https://joss.theoj.org/papers/10.21105/joss.02505). This would also allow them to directly identify dependencies between parameters, and obtain associated uncertainty estimates. This would also make the discussion of how well constrained the parameters are by the data or how much they vary more principled because the SBI uncertainty estimates could be used.

      Thank you - we have improved how we describe and report parameter values in several ways. First, the previous text erroneously stated that we used different fitting procedures for different cell types - but the real difference was in the amount of data and range of stimuli we had available between rods and cones. The fitting procedure itself was the same for all cell types. We have clarified this along with other details of the model fitting both in the main text (lines 121-130) and in the Methods (section starting on line 832). We also collected parameter values and estimates of allowed ranges in two tables. Finally, we used sloppy modeling to identify parameters that could covary with relatively small impact on model performance; we added a description of this analysis to the Methods (section starting on line 903).

      -  In several places, the authors refer the reader to look up specific values e.g. of parameters in the associated MATLAB code. I don't think this is appropriate, important values/findings/facts should be in the paper (lines 142, 114, 168). I would even find the precise values that the authors measure interesting, so I think the authors should show them in a figure/table. In general, I would like to see also the average variance explained by different models summarized in a table and precise mean/median values for all important quantities (like the response amplitude ratios in Figures 6/9).

      We have added two tables with these parameter values and estimates of allowable ranges. We also added points to show the mean (and SD) across cells to the population figures and added those numerical values to the figure legends throughout.

      -  If the proposed model is supposed to model photoreceptor adaptation on a longer time scale, I fail to see why this can be an invertible model. Could the authors explain this better? I suspect that the model is mainly about nonlinearities as the authors also discuss in lines 360ff.

      For the stimuli that we use we see little or no contribution of slow adaptation in phototransduction. We have expanded the description of this point in the text and referred to Angueyra et al (2022) which looks at this issue in more detail for primate cones (paragraph starting on line 280).

      -  The important Figures 6-8 are very hard to read, as it is not easy to see what the stimulus is, the modified stimulus, the response with and without modification, what the desired output looks like, and what is measured for part B. Reworking these figures would be highly recommended.

      We have reworked all of the figures to make the traces clearer.

      -  If I understand Figure 6 correctly, part B is about quantifying the relative size of the response to the little first flash to the little second flash. While clearly, the response amplitude of the second flash is only 50% for the second flash compared to the first flash in primate rod and cones in the original condition, the modified stimulus seems to overcompensate and result in 130% response for the second flash. How do the authors explain this? A similar effect occurs in Figure 9, which the authors should also discuss.

      Indeed, in those instances the modified stimulus does appear to overcompensate. We suspect this is due to differences in sensitivity of the specific cells probed for these experiments and those used in the model construction. We now describe this limitation in more detail (lines 524-526). A similar point comes up for those experiments in which we speed the photoreceptor responses (new Figure 9B), and we similarly note that the cells used to test those manipulations differed systematically from those used to fit the model (lines 558-560).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I only have a few minor questions and suggestions for clarification.

      It hasn't become fully clear to me how general the model is when different mean light levels (on long-time scales) are considered. Are there slow adaptation processes not captured in the model that affect model performance? And how should one go about setting the mean light level when, for example, probing ganglion cells with a stimulus obtained through model inversion? Should it work to add an appropriate DC component to the current that is provided as input to the inverted model? (Presumably, deriving a stimulus and then just adding background illumination should not work, or could this be a good approximation, given a steady state that is adapted to the background?)

      We have clarified in the main text that slow adaptation does not contribute substantially to responses to the range of stimuli we explored (lines 281-289). We have also clarified that the stimulus in the model inversion is specified in isomerizations per second - so the mean value of the stimulus is automatically included in the model inversion (lines 293-298).

      Furthermore, a caveat for the model inversion seems to be the potential amplification of high-frequency noise. The suggested application of a cutoff temporal frequency seems appropriate, but data are shown only for a few example cells. Is this consistent across cells? (Given that performance between, e.g., mouse cones can vary considerably according to Fig. 4B?) I would also like to suggest moving the corresponding Supplemental Figure (4.1) into the main part of the manuscript, as it seems quite important.

      We have added population analysis to the new Figure 5 (which was Figure 4 - Figure Supplement 1). We have also clarified that the amplification of high frequency noise is an issue only when we try to apply model inversion to measured stimuli. When we use model inversion to identify stimuli that elicit desired responses, the target responses are computed from a linear model that has no noise, so this is not a concern in applications like those in Figures 6-10.

      Also, could the authors explain more clearly what the effect of the normalization of the estimated stimulus by the power of the true stimulus is? Does this simply reduce power at high frequency or also affect frequencies below the suggested cutoff (where the stimulus reconstruction should presumably be accurate even without normalization)?

      Indeed this normalization reduces high frequency power and has little impact on low frequencies where the inversion is accurate; this is now noted in the text (line 363). As for amplification of high frequency noise (previous comment), the normalization by the stimulus power is only needed when inverting measured responses (i.e. responses with noise) and is omitted when we are identifying stimuli that elicit desired responses (e.g. in Figures 6-10).
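
      The effect of suppressing high-frequency power in a stimulus estimate can be illustrated with a minimal sketch. This is not the procedure used in the manuscript: the helper name `lowpass_estimate`, the hard frequency cutoff, and the toy signals are all assumptions chosen for illustration.

```python
import numpy as np

def lowpass_estimate(estimate, dt, cutoff_hz):
    """Hypothetical helper: zero all Fourier components above cutoff_hz."""
    spectrum = np.fft.rfft(estimate)
    freqs = np.fft.rfftfreq(len(estimate), d=dt)
    spectrum[freqs > cutoff_hz] = 0.0  # frequencies at or below the cutoff are untouched
    return np.fft.irfft(spectrum, n=len(estimate))

dt = 1e-3                                    # assumed 1 ms sampling interval
t = np.arange(0, 1.0, dt)
true_stim = np.sin(2 * np.pi * 5 * t)        # toy low-frequency stimulus
# Toy "estimate": the true stimulus plus a high-frequency contaminant
noisy_estimate = true_stim + 0.5 * np.sin(2 * np.pi * 200 * t)
filtered = lowpass_estimate(noisy_estimate, dt, cutoff_hz=50.0)
```

      In this toy case the component below the cutoff passes through unchanged while the 200 Hz component is removed, which mirrors the qualitative point that low frequencies are little affected.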

      While the overall performance of the model to predict photoreceptor currents is impressive, it seems that particular misses occur for flashes right after a step in background illumination and for the white-noise responses at low background illumination (e.g. Figure 1B). Is that systematic, and if so what might be missing in the model?

      Indeed the model (at least with fixed parameters across stimuli) appears to systematically miss a few aspects of the photoreceptor responses. These include the latency of the response to a bright flash and the early flashes in the step + flash protocol in Figure 1B. Model errors for the variable mean noise stimulus (Figure 2) showed little dependence on time even when responses were sorted by mean light level and by previous mean level. Model errors did not show a clear systematic dependence on light level; this likely reflects, at least in part, the use of mean-square-error to identify model parameters. We have expanded our discussion of these systematic errors in the text (lines 164-166).

      I was also wondering whether this is related to the fact that in Figure 9B, the gain in the modified condition is actually systematically higher when there is more background light. Do the authors think that this could be a real effect or rather an overcompensation from the model? (By the way, is it specified what "Delta-gain" really is, i.e., ratio or normalized difference?)

      We suspect this is an issue with the sensitivity of the specific cells for which we did these experiments (i.e. variability in the gamma parameter between cells). This sensitivity varies between cells, and such variations are likely to place the strongest limitation on our ability to use this approach to manipulate responses in different retinas. We now note those issues in the Results (lines 523-526, 557-559 and 591-593) with reference to Figures 9 (previously Figure 8) and 10 (previously Figure 9), and describe this limitation more generally in the Discussion (section starting on line 649). We have also changed delta-gain to response ratio, which seemed more intuitive.

      Maybe I missed this, but it seems that the parameter gamma is fitted in a cell-type-specific fashion (e.g. line 163), but then needs to be fixed for held-out cells. How was this done? Is there much variability of gamma between cells?

      There is variability in gamma between cells, and this likely explains some of the systematic differences between data and model (see above and Methods, lines 902-903). For the consensus models in Figure 2B, gamma was allowed to vary for each cell while the remaining consensus model parameters were fixed. Gamma was set equal to the mean value across cells for model inversion (i.e. for all of the analyses in Figures 4-10). We have described the fitting procedure in considerably more detail in the revised Methods (starting on line 832).

      For completeness, it would be nice to have the applied consensus model parameters in the manuscript rather than just in the Matlab code (especially since the code has not been part of the submission). Also, some notes on how the numerical integration of the differential equations was done would be nice (time step size?).

      We have added tables with consensus parameters and estimates of the sensitivity of model predictions to each parameter. We have also added additional details about the numerical approaches (including the time step) to Methods.

      Similarly, it would be nice to explicitly see the relationships that are used to fix certain model parameters (lines 705ff). And can the constants k and n (lines 709-710) be assumed identical for different species and receptor types?

      We have added more details on the model fitting to the Methods, including the use of steady-state conditions to hold certain parameters fixed (lines 862 and 866). We are not aware of any direct comparisons of k and n across species and receptor types. We have noted that model performance was not improved by modest changes in these parameters (due to compensation by other model parameters). More generally, we have explained how some parameters trade for others and hence the logic of fixing some even when exact values were not available.

      For the previous measurements of m and beta (lines 712-713), is there a reference or source?

      We have added references for these values.

      Did the authors check for differences in the model parameters between cone types (e.g., S vs. M)?

      We did not include S cones here. They are harder to record from and collecting a fairly large data set across a range of stimuli would be challenging. Our previous work shows that S cones have slower responses than L and M cones, and this would certainly be reflected in differences in model parameters. We have noted this in the text (Methods, lines 808-810).

      For the stated flash responses time-to-peak (lines 183-184), is this for a particular light intensity with no background illumination?

      Those are flashes from darkness - now noted in the text.

      Figure 2 - Supplement 1 doesn't have panel labels A and B, unlike the legend.

      Fixed - thank you.

      Reviewer #2 (Recommendations For The Authors):

      (1) Fig. 2B - for some cells, the consensus model seems to fit better than the individual model. How is this possible?

      This was mostly an error on our part (we inadvertently included responses to more stimuli in fitting the individual models, which slightly hampered their performance). Even with this correction, however, a few cells remain for which the consensus model outperforms an individual model. We believe this is because there is more data to constrain model parameters for the consensus models (since they are fit to all cells at the same time), and that can compensate for improvements associated with customizing parameters to specific cells.

      (2) Fig. 2 Supplement 1, it would be useful to see a blow-up of the data in an inset, as in Fig. 2B.

      Thanks - added.

      (3) Line 400 - this paragraph could include additional quantification and statistics to back up claims re 'substantially reduced', 'considerably lower'.

      We quantify that in the next sentence by computing the mean-square-error between responses and sinusoidal fits (also in Figure 7B, which now includes statistics as well). We have made that connection more direct in the text.
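
      A mean-square-error between a response and its best sinusoidal fit can be computed as in the following minimal sketch. It assumes the stimulus frequency is known and fits amplitude, phase, and offset by linear least squares; the function name `sinusoid_fit_mse` and the toy traces are illustrative assumptions, not the analysis code used for Figure 7B.

```python
import numpy as np

def sinusoid_fit_mse(t, response, freq_hz):
    """Hypothetical helper: MSE between response and its best-fit sinusoid at freq_hz."""
    # Sine, cosine, and constant columns capture amplitude, phase, and offset linearly
    design = np.column_stack([
        np.sin(2 * np.pi * freq_hz * t),
        np.cos(2 * np.pi * freq_hz * t),
        np.ones_like(t),
    ])
    coeffs, *_ = np.linalg.lstsq(design, response, rcond=None)
    fit = design @ coeffs
    return np.mean((response - fit) ** 2)

t = np.arange(0, 1.0, 1e-3)
pure = 2.0 * np.sin(2 * np.pi * 4 * t + 0.3) + 1.0      # toy sinusoidal response
distorted = pure + 0.5 * np.sin(2 * np.pi * 8 * t)      # toy response with a second harmonic
mse_pure = sinusoid_fit_mse(t, pure, 4.0)
mse_distorted = sinusoid_fit_mse(t, distorted, 4.0)
```

      A response that is closer to sinusoidal yields a smaller error, so the measure grows with the amount of nonlinear distortion.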

      (4) Maybe a supplement to Fig. 8 could show the changes to the stimulus required to alter the kinetics in both directions - to give more insight into part B., especially.

      Good suggestion - we have added the stimuli to all of the panels of the figure (now Figure 9).

      (5) Fig. 8B - in 'Speed response up' condition - there seems to be error in the model for the decay time of the response - especially for the 'original' condition, which is not quantified in 8C. Was it generally difficult to predict responses to flashes?

      That seems largely to reflect that the cells used for those experiments had faster initial kinetics than the average cells (responses to the control traces are also faster than model predictions in these cells - black traces in Figure 9B). We have added this to the text.

      (6) Line 678, possibly note that 405 nm equally activates S and M photopigments in mice, since most of the cones co-express the two photopigments (Rohlich et al., 1994; Applebury et al., 2000; Wang et al., 2011).

      Thanks - we have added this (lines 827-829).

      (7) The discussion could include a broader description of the various approaches to identifying nonlinearities within retinal circuitry, which include (incomplete list): recording at multiple levels of the circuit (e.g., Kim and Rieke 2001; Rieke, 2001; Baccus and Meister, 2002; Dunn et al., 2006; 2007; Beaudoin et al., 2007; Baccus et al., 2008); recording currents vs. spiking responses in a ganglion cell (e.g., Kim and Rieke, 2001; Zaghloul et al., 2005; Cui et al., 2016); neural network modeling approaches (e.g., Maheswaranathan et al., 2023); optogenetic approaches to studying filtering/nonlinear behavior at synapses (e.g., Pottackal et al., 2020; 2021).

      Good suggestion - we have added this to the final paragraph of the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      -  I am personally not a fan of the style: "... as Figure 4A shows..." or comparable and much prefer a direct "We observe that X is the case (Figure 4A)". If the authors agree, they may want to revise their paper in this way.

      We have revised the text to avoid the “... as Figure xx shows” construction. We have retained multiple instances which follow a “Figure xx shows that …” construction (which is both active rather than passive and does not use a personal pronoun).

      -  I am not a fan of the title. Light-adaption clamp caters only to a very specialized audience.

      We have changed the title to “Predictably manipulating photoreceptor light responses to reveal their role in downstream visual responses.”

      -  The parameter fitting procedure should not only be described in Matlab code, but in the paper.

      Thanks - we have expanded this in the Methods considerably (section starting on line 832).

      -  The authors should elaborate on why different fitting procedures were used.

      We did not describe that issue clearly. The fitting procedures used across cells were identical, but we had different data available for different cell types due to experimental limitations. We have substantially revised that part of the main text to clarify this issue (paragraph starting on line 121).

      -  The authors state in line 126 that the input stimulus is supposed to mimic eye movements: mouse, monkey, or human? Please clarify.

      Thanks - we have changed this sentence to “abrupt and frequent changes in intensity that characterize natural vision.”

      -  Please improve the figure style. For example, labels should be in consistent capitalization and ideally use complete words (e.g. Figure 2B, 4B, and others).

      We have made numerous small changes in the figures to make them more consistent.

      -  Is the fraction of variance calculated on held-out-data? Linear models should be added to Figure 2B.

      The fraction of variance explained was not calculated on held out data because of limitations in the duration of our recordings. Given the small number of free parameters, and the ability of the model to capture held out cells, we believe that the model generalizes well. We have added a supplemental figure with linear model performance (Figure 2 - Figure Supplement 2).
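
      For reference, one common definition of the fraction of variance explained is sketched below. Whether the manuscript uses exactly this definition is an assumption; the function name and the toy data are illustrative only.

```python
import numpy as np

def fraction_variance_explained(measured, predicted):
    """Hypothetical helper: 1 - (residual sum of squares) / (total sum of squares)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual_ss = np.sum((measured - predicted) ** 2)
    total_ss = np.sum((measured - measured.mean()) ** 2)
    return 1.0 - residual_ss / total_ss

measured = np.array([1.0, 2.0, 4.0, 3.0])               # toy response
perfect = fraction_variance_explained(measured, measured)
mean_only = fraction_variance_explained(measured, np.full(4, measured.mean()))
```

      Under this definition a perfect prediction scores 1 and a prediction equal to the mean response scores 0. Evaluating it on data not used for fitting would be the held-out variant the reviewer asks about.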

      -  Fig. 9A is lacking bipolar cell and amacrine cell labels. Currently, it looks like HC is next to the BC in the schematic.

      Thanks - we have updated that figure (now Figure 10A).

      -  Maybe I am misunderstanding something, but it seems like the linear model prediction shown in Figure 2A for the rod could be easily improved by scaling it appropriately. Is this impression correct or why not?

      We have clarified how the linear model is constructed (by fitting the linear model to low contrast responses of the full model at the mean stimulus intensity). We also added a supplemental figure, following the suggestion above, showing the linear model performance when a free scaling factor is included for each cell.
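
      A free scaling factor of the kind mentioned above has a closed-form least-squares solution, sketched here under the assumption that a single multiplicative gain is fit per cell. The function name `best_scale` and the toy traces are hypothetical.

```python
import numpy as np

def best_scale(measured, predicted):
    """Hypothetical helper: gain a minimizing sum((measured - a * predicted)**2)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Setting the derivative with respect to a to zero gives a = <p, m> / <p, p>
    return np.dot(predicted, measured) / np.dot(predicted, predicted)

predicted = np.array([0.0, 1.0, 2.0, 1.0, 0.0])   # toy linear-model prediction
measured = 3.0 * predicted                        # toy response that differs only in gain
scale = best_scale(measured, predicted)
```

      Rescaling can correct only an overall gain mismatch; differences in response shape or kinetics between the linear prediction and the data remain after scaling.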

      -  The verification experiment in Fig. 5 is only anecdotal and is elaborated only in Figure 6. If I am not mistaken, this does not necessitate its own figure/section but could rather be merged.

      We have kept this figure separate (now Figure 6) as we felt that it was important to highlight the approach in general in a figure before getting into quantification of how well it works.

      -  Figure 5 right is lacking labels. What is red and grey?

      Thanks for catching that - labels are added now.

      -  The end of the Discussion is slightly unusual. Did some text go missing?

      Thanks - we have rearranged the Discussion so as not to end on Limitations.

      -  There is a bonus figure at the end which seems also not to belong in the manuscript.

      Thanks - the bonus figure is removed now.

      -  The methods should also describe briefly what kind of routines were used in the Matlab code, e.g. gradient descent with what optimizer?

      We’ve added that information as well.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers for their positive assessment of our manuscript. We agree that there are some further experiments suggested by the reviewers that would enhance our study. We have highlighted further proposed experimental work in bold for clarity.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      1. EVIDENCE, REPRODUCIBILITY AND CLARITY Summary: The Matrix 2 (M2) protein of influenza A virus (IAV) is a single pass transmembrane protein known to act as a tetrameric ion channel that is important for both viral entry and egress. The paper by Figueras-Novoa et al. entitled "Caspase cleavage of Influenza A virus M2 disrupts M2-LC3 interaction and regulates virion production" reports on the regulation of IAV virion production through a regulatory interplay between a caspase cleavage site and a LC3 interacting region (LIR) motif in M2. In its C-terminal cytoplasmic tail the IAV M2 protein contains a C-terminal LIR motif interacting with LC3. The authors show that this LIR motif is preceded by a functional caspase cleavage motif cleaved predominantly by caspase-6, with some contribution from caspase-3: The motif 82-SAVD-85 directs cleavage after the aspartate (D) at position 85. The cleavage leads to loss of the remaining C-terminal sequence from amino acid 86 to 97. The core LIR motif 91-FVSI-94 is then lost from M2, which can no longer bind LC3. As previously described by the same group using point mutations in the LIR motif (Ref. 12), loss of a functional LIR, here by caspase-mediated deletion of the LIR, affects the virion production and inhibits filamentous budding. LC3B lipidation is increased upon treatment with a caspase inhibitor. The authors show for the first time that LC3 is included into IAV virions via binding to M2. Furthermore, they also report a co-crystal structure of the M2 C terminus (aa 70-97), containing the caspase cleavage site and LIR, and LC3B (aa 3-125) adding new insights into this interaction and showing that the caspase cleavage site is in a flexible region N-terminal to the LIR. This work shows how caspase cleavage may modulate LC3B lipidation, trafficking to the plasma membrane, incorporation of LC3B in the virions, filamentous budding and virion production (viral titer).

      Major comments: The findings reported here are very well supported by the data shown. This is a very clearly written paper with well described and nicely visualized results that are accompanied by adequate statistical analyses.

      We thank the reviewer for their assessment of our manuscript.

      The authors report a new way the LC3B binding to the C-terminal tail of the M2 proteins is regulated and suggest that this is an adaptation the virus has made to adjust virion production to host cell status by hijacking the function of host caspases. They show that the caspase cleavage motif is evolutionarily conserved and use that as an argument. Perhaps it could be discussed if it also could be an argument that the host protects itself against a too massive virion production as this could be too detrimental to the host? Would it not also be an evolutionary advantage to the virus in the long run by avoiding killing the host?

      This is an interesting point. We agree there could be an advantage for the virus not to overproduce virions under certain circumstances. Consistent with this, caspase-6 deficient mice had increased mortality in response to IAV PR8 infection, and presented an increase in viral spread in the lungs (Zheng, 2021; doi: 10.1016/j.cell.2020.03.040). This is also relevant for the comments made by Reviewer 2. The manuscript will be updated to include a discussion of this point.

      A question I may raise which is optional as it may be too much work to address as part of this study is if the reported regulation of LC3B binding has any role in regulating the ion channel function of the M2 tetramer?

      It is well established that there is no impact of distal C-terminal truncations on M2 ion channel activity (Cady et al., 2009, doi: 10.1021/bi9008837; Schnell and Chou, doi: 10.1038/nature06531; Nguyen et al., 2008, doi: 10.1021/bi801315m; Tobler et al., 1999, doi: 10.1128/jvi.73.12.9695-9701.1999). This is also consistent with data from our lab (Ulferts et al., 2021, doi: 10.1016/j.celrep.2021.109899, Beale et al., 2014, doi: 10.1016/j.chom.2014.01.006) as well as others (Ren et al., 2015, doi: 10.1128/JVI.00576-15) showing the effects of the LIR motif and the proton channel are distinct. We appreciate the reviewer suggesting further work here as optional, but there is already compelling evidence to show there is no substantial effect of the LIR motif on ion channel activity. (See also Reviewer 2 points 4 and 5).

      Minor comments: Delete "with" in line 145.

      This will be changed in the updated manuscript.

      Line 217: It should be written more specifically how "cells were surface stained with M2"

      The protocol for surface staining of M2 will be explained in more detail in the updated manuscript.

      1. SIGNIFICANCE

      This is a very well performed study with a sound experimental strategy and well performed assays with clear results increasing our insight into the interplay between the Influenza A virus and host cells. Although caspase mediated cleavage of the autophagy receptor and signaling scaffold protein p62 (Ref. 25), removing the LIR and LC3-binding, has been reported before I consider this study as novel in reporting this type of regulation of LC3 binding. The cleavage of p62 deletes a large part of the protein while here it is a "clean" deletion of the LIR sequence representing a conceptual advance of regulation of LC3 binding. The study also reports for the first time on LC3B incorporated into virions. The effects on trafficking to the plasma membrane and viral budding and virion production are similar to those reported before (Ref. 12) using viruses with point mutations crippling the LIR motif. This research will be of interest to all studying virus-host interaction and to the autophagy field both as a non-autophagic role of LC3B, and as a regulatory mechanism of LIR-LC3B interactions involving the irreversible caspase cleavage-mediated deletion of the LIR motif.

      We thank the reviewer for this assessment of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The influenza A virus (IAV) M2 protein is a small transmembrane protein which plays a role in virus entry and egress. In a previous study, Beale et al. (2014) identified an LC3-interacting region (LIR) in the M2 cytoplasmic domain that was found to recruit the LC3B protein to the plasma membrane. Recombinant IAV harboring mutations in the LIR motif showed reduced particle stability and lost filamentous morphology.

      In the present study, Figueras-Novoa et al. show that the LIR motif is removed in response to activation of cellular caspases. The authors demonstrate that in IAV-infected THP-1 cells M2 is partially cleaved at the motif (82)SAVD(85)↓A by caspase 6. Caspase inhibitors abolished cleavage, and a mutant virus harboring the D85A substitution was found to be resistant to caspase action. A crystal structure of purified M2 C-terminus and LC3B revealed that the caspase cleavage site lies in a flexible region that is accessible to caspases.

      Mutant virus encoding a truncated M2 protein (M2∆86-97) was unable to interact with LC3, in accordance with the absence of the LIR motif. The M2∆86-97 mutant showed reduced lipidation of LC3, while enhanced lipidation of LC3 was observed when wild-type virus-infected cells were treated with caspase inhibitors. The authors also observed that cell surface transport of M2∆86-97 but not M2-D85A was impaired. However, in purified virus particles a mix of cleaved and uncleaved M2 was detected. The authors also demonstrated that lipidated LC3B was present in purified virions of wild-type virus particles but even more abundant in M2-D85A virions. Finally, M2∆86-97 mutants produced significantly fewer infectious particles compared to wild-type virus while the D85A cleavage mutant replicated to titers similar to those of wt virus.

      Based on these findings the authors concluded that caspases regulate the interaction of M2 protein with LC3 which impacts virion production. Specifically, they propose that caspase-mediated removal of the LIR motif may enable a switch between filamentous and non-filamentous budding in response to depletion of cellular resources. However, the authors were unable to rescue a filamentous IAV with a truncated M2 protein and therefore could not provide direct proof for their guess.

      While the data are sound and presented well, they do not support the conclusions of the authors.

      1. In the authors' opinion, the conserved caspase cleavage site in the M2 protein might provide an evolutionary advantage for the virus. However, the M2-D85A mutation has no effect on viral replication, so the biological significance of why M2 needs to be cleaved at all is unclear. The conclusion that caspase-induced M2 cleavage is a fine-tuning mechanism of IAV has not been supported by experiments.

      We thank the reviewer for the assessment of our data. We think the reviewer is specifically objecting to the phrase “We conclude that this highly conserved interaction and cleavage act as a regulatory mechanism exploited by IAV to fine-tune virion production in different cellular contexts.” This is a reasonable inference from our results, but we accept that it is not proven. We will change the wording to make it clear this has not been definitively demonstrated.

      2. The finding that the permanently truncated IAV M2 mutant virus was substantially attenuated does not necessarily mean that abrogation of M2-LC3 interaction was responsible for this attenuation. As the M2 protein plays a role in virus budding at the plasma membrane (recruitment of M1 protein, induction of membrane curvature, membrane scission), the impaired transport of the truncated M2 protein might already explain that the virus was attenuated and that incorporation of the protein into the viral envelope was reduced.

      We will confirm this further with additional experiments using LIR mutants. Recapitulating the plasma membrane transport defect of truncated M2 with LIR mutants including the newly characterised M2D87A and M2D88A mutants and a more severe mutant with a FVSI_AAAA substitution would strongly imply this truncation mutant phenotype is due to the lack of LIR motif.

      3. It is also not clear whether the loss of the C-terminal 11 amino acids may have affected the interaction of the M2 protein with other proteins such as TRAPPC6A-delta (Zhu et al., 2017).

      This is a reasonable point; however, Zhu et al., 2017 (https://doi.org/10.1128/jvi.01757-16) reported that the interaction with TRAPPC6A retains M2 intracellularly. If the phenotype observed with our truncation were due to the loss of interaction with TRAPPC6A, the opposite phenotype would be expected (more M2 in the plasma membrane with the truncated M2∆86-97 mutant). To address this point directly, we will attempt to rescue an M2 mutant virus that has a disrupted TRAPPC6A binding site, as reported, and assess M2 plasma membrane localization.

      4. The authors did not rule out whether the truncation of the M2 protein by 11 amino acids would have an effect on proton channel activity. Proton channel activity, however, might be important to preserve the metastable conformation of HA in the secretory pathway and might also be important for virus uncoating.

      5. M2∆86-97 induced less LC3 lipidation than wild-type M2 or the D85A mutant. The remaining lipidation was attributed to the ion channel activity of the M2 protein. Can the authors rule out that the truncation of the M2 protein led to reduced ion channel activity which in turn led to reduced LC3B lipidation?

      We have addressed points 4 and 5 in response to Reviewer 1.

      6. The suggested role of caspase cleavage as a regulatory switch between filamentous and spherical virions (lines 304-313) is highly speculative as long as the authors do not provide any experimental proof for it. The authors indicated that they were unable to rescue filamentous IAV with M2∆86-97. However, would it be possible to use caspase inhibitors to test their hypothesis?

      We acknowledge that M2∆86-97 could not be rescued in a filamentous background. The use of caspase inhibitors would only increase the amount of full-length M2 present, and does not provide an alternative strategy for increasing the proportion of truncated M2. However, since the M2∆86-97 mutant could not be rescued, we will attempt to rescue additional LIR motif mutants to address this point. In particular, D87A and D88A mutants could be generated in a MUd background, as well as the F91S mutant.

      7. The authors used only the PR8 strain for their studies, a highly cell culture-adapted strain with spherical morphology. Are the findings obtained with this strain also valid for other IAV strains?

      As we highlight in Figure 2I, both the caspase cleavage motif and LIR motif are highly conserved in human IAV strains. PR8 was used as it is the reverse genetic system in use and approved for use in the lab. We will attempt to address this by testing whether other IAV strains we are able to obtain also undergo caspase mediated cleavage of M2. If possible, we will obtain recent clinical isolates to show cleavage of M2 in a strain that has not adapted to cell culture.

      8. The authors mainly used THP-1 cells for their studies, a human macrophage-like cell line. However, human IAVs mostly replicate in epithelial cells of the respiratory tract and cause only abortive infections of macrophages. Why did the authors choose this cell line? Can the findings obtained with this cell line be translated to epithelial cells of the airways?

      THP-1 cells are widely used for the study of caspase activity. However, we also show M2 cleavage in MDCK cells and HAP1 cells. PR8 infection of A549 cells does not induce significant amounts of cell death in the infection time points used and, as caspase activation is linked to cell death, we did not observe M2 cleavage in this cell type. We will attempt to infect some epithelial cell types to confirm this phenotype.

      Minor issues:

      • Fig. 1C: There seem to be considerable differences in the cleavage efficiency of M2 between panels A, B, C, and D. Any explanations?

      Different cell types (THP-1 cells and HAP1 cells) are used for the experiments mentioned above, which accounts for the different amounts of M2 cleavage.

      • Fig. 1: Panel E: The labeling of the first amino acids as aa 76 seems to be wrong!

      We thank the reviewer for pointing this out; this will be corrected in the updated manuscript.

      Line 147: ...caspase mediated disruption of the M2-LC3 interaction (Fig 2A-B). Should be Fig. 2A-C.

      This sentence was referring to Figure 2A-B, as it refers to LC3B lipidation and not the coIP. This sentence will be changed in the text to reflect the intended meaning.

      • Growth kinetics of the various mutant viruses are missing?

      We will provide growth kinetics for the relevant mutants (M2D85A and M2∆86-97).

      • Line 195: The authors speculate that aa85 is important for viral fitness: That should be demonstrated!

      This speculation is based on the very strong conservation of D85 in human IAV strains. The importance of D85 in viral fitness (permitting cleavage of M2) is only likely to be directly demonstrable in transmission models (for example ferrets) which is not feasible or justifiable.

      Reviewer #2 (Significance (Required)):

      Authors concluded that caspases regulate the interaction of M2 protein with LC3 which impacts virion production. Specifically, they propose that caspase-mediated removal of the LIR motif may enable a switch between filamentous and non-filamentous budding in response to depletion of cellular resources. However, the authors were unable to rescue a filamentous IAV with a truncated M2 protein and therefore could not provide direct proof for their guess.

      As stated in the response to the comments above, we will attempt to rescue LIR mutant viruses (D87A and D88A) in a MUd background, which would provide further support for our hypothesis. Our data have significance for the understanding of the cell biology of influenza infection, as commented on by Reviewers 1 and 3.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In this article, the authors identify a caspase cleavage site in the influenza A virus (IAV) Matrix 2 protein (M2) that leads to a truncated form of M2 lacking its C-terminal LC3-interacting region (LIR). This cleaved form of M2 is detected and accumulates starting at 12 hours post-infection. IAV expressing the M2 delta 86-97 mutant, corresponding to cleaved M2, seems to disrupt LC3B localization to the plasma membrane upon infection. The authors also show that IAV M2 delta 86-97 has a reduced viral titer compared to IAV WT. Overall, the data are quite exciting: the authors identify the specific caspase responsible for the cleavage and show the residues of M2 necessary for LC3 interaction. However, some of the data showing the consequence of the cleavage for viral replication could be better clarified.

      We thank Reviewer 3 for their kind comments and we propose further experiments to clarify the consequences of cleavage.

      Major comments:

      • In Fig3A-B, the authors seek to demonstrate that the localization of M2 to the plasma membrane requires the LIR motif. However, the representative images for cells infected with the delta 86-97 mutant show relatively few cells expressing M2, raising questions about the infectivity of this mutant virus or whether the overall expression of M2 in this assay is lower for the delta 86-97 mutant. The authors should consider first quantifying the ratio of M2 cell surface staining over total M2 staining and second re-evaluating the representative images chosen.

      We will include more examples of permeabilised cells in which comparable numbers of cells are M2 positive between mutants. We will also include high-content microscopy based quantification to support this. To clarify, we confirm that the quantification of M2 intensity in the plasma membrane is carried out relative to the number of M2 positive cells, as the reviewer agrees is the most accurate way. To avoid confusion, we will update figure legends to describe more accurately the quantification process. A comparison between surface M2 and total M2 cannot be done on an individual cell basis, as once cells are permeabilized (to look for internal M2), robust differentiation between surface and internal M2 is difficult. The above clarification and additional data should provide the necessary support for our conclusions.

      • In fig3E, it is unclear what is being quantified in the graph, as the legend and text lines 222-223 mention that spot intensity was measured but the y axis indicates LC3 relocalization intensity. Given that LC3 is punctate, particularly in the cytosol, it is unclear which spots of LC3 they are referring to. Based on the images shown, using a graph with LC3 surface staining as performed for M2 would clarify the data. The authors should clarify the reporting of these data in the results section. Additionally, the images of the control non-infected cells should be added to 3C.

      We agree with the reviewer on this point. The figure will be updated to describe more accurately what is being quantified. Additionally, images for uninfected cells in 3C will be added.

      • The data in Fig4 and FigS3 need to be strengthened to be conclusive. The volcano plot in FigS3A indicates that there is more LC3B and IAV proteins in M2 D85A than M2delta86-97. However, in Fig4E, both LC3 I and LC3 II are increased in virions M2 delta 86-97 compared to M2 D85A, which is opposite to the authors' conclusions in lines 244-245. In other words, the total amount of lipidated LC3 is higher in virions from IAV M2 without LIR motif than M2 with LIR. LC3II/I ratio in fig4F would suggest in virions containing M2 with LIR motif, LC3B II may be preferentially incorporated compared to virions containing M2 without LIR, which incorporates both LC3B I and LC3B II. Since this is a critical point made by the authors, performing a co-immunoprecipitation of M2 D85A and M2delta86-97 in the particles and then assessing for binding of LC3 I or II would bolster their conclusions.

      Figure 4F quantifies the ratio of LC3II to LC3I in infectious particles. Another two repeats used to quantify this ratio will be shown in addition, with a better representation of increased amounts of lipidated LC3II in M2D85A infectious particles, as well as an increased LC3II/LC3I ratio in said particles when compared to M2∆86-97. Because of the low yield acquired from the purification of IAV virions, performing an IP would be difficult. Even if this were technically feasible, it would not prove that M2 is binding LC3 inside the virion. We do not make this claim in our paper, merely that LC3B can be detected in the purified viral particles. We will clarify this point in the revised manuscript.

      • In Fig4J, even if statistically significant, the PFU difference between M2 D85A and M2 delta86-97 is minimal; performing a growth curve assay would help appreciate this difference over time. In THP-1 cells, as the authors show caspase cleavage of M2 at 12, 14, 16 hpi, etc. (Fig. 1), they should also show PFU data at these same time points for the M2 D85A mutant compared to WT and M2 delta 86-97.

      We agree with the reviewer and indeed this was a point we attempted to make in our manuscript: Figure 4J shows a statistically significant difference between the titers. However, in the text we state that, even though statistically significant, the difference is much smaller than in other titer quantifications performed. Given the nature of a plaque assay, differences of less than a log fold cannot be considered as definitively indicating biological significance. We will clarify this in a revised manuscript. We will also provide the relevant growth kinetics (as per response to Reviewer 2).

      • The title of Fig4 and FigS3 and in text line 226 should be changed as M2 incorporation into virions is not shown and not described in the text. Plus, in figS3B, the authors show that between the M2 mutants, there is no difference in the abundance of M2 and other viral proteins compared to M1.

      The title of Figures 4 and S3 will be changed to more accurately reflect all of the points made by the figure.

      • In the image shown in Fig4H the number of plaques is higher for M2delta86-97 even though the size is smaller than M2 WT. Could the authors clarify in the text of the results section how they quantify PFU in their plaque assay and if they used a size criterion when quantifying the number of plaques?

      The images of plaques are taken at different dilutions, with the M2∆86-97 image taken from a well two dilutions lower than the M2WT image. We will include the calculation used for PFU/mL, which does not take into account plaque size. Furthermore, images of the whole plate, showing the plaque assays for the serial dilutions, will be provided.
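      For readers unfamiliar with plaque-assay arithmetic, the following is a minimal sketch of the standard titer calculation, which depends only on plaque count, dilution, and inoculum volume, and not on plaque size. The response does not give the authors' exact calculation, so the function names, the parameter conventions, and the example numbers below are illustrative assumptions.

```python
import math


def pfu_per_ml(plaque_count, dilution_factor, inoculum_volume_ml):
    """Titer in PFU/mL from a single countable well.

    dilution_factor is the fraction of the original stock in the well,
    e.g. 1e-5 for a 10^-5 dilution (an assumed convention).
    Plaque size plays no role in this calculation.
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if not 0 < dilution_factor <= 1:
        raise ValueError("dilution_factor must be in (0, 1]")
    if inoculum_volume_ml <= 0:
        raise ValueError("inoculum_volume_ml must be positive")
    return plaque_count / (dilution_factor * inoculum_volume_ml)


def log10_fold_difference(titer_a, titer_b):
    """Difference between two titers in log10 units (a relative to b)."""
    return math.log10(titer_a / titer_b)


# Hypothetical example: 30 plaques at a 10^-5 dilution vs. 30 plaques
# at a 10^-3 dilution, both with 0.1 mL of inoculum.
wild_type = pfu_per_ml(30, 1e-5, 0.1)
mutant = pfu_per_ml(30, 1e-3, 0.1)
print(f"{wild_type:.2e} vs {mutant:.2e} PFU/mL")
print(f"log10 fold difference: {log10_fold_difference(wild_type, mutant):.2f}")
```

      In this hypothetical case, equal plaque counts in wells two ten-fold dilutions apart correspond to titers that differ by two log10 units, which is why images taken at different dilutions cannot be compared by eye.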

      • In fig3B, the legend indicates 8 hpi but on the graphs it is 9 hpi.

      We thank the reviewer for pointing out this mistake. Both should read 8 hpi; this will be corrected in the new manuscript.

      Reviewer #3 (Significance (Required)):

      The authors demonstrated that IAV M2 binding to LC3 is regulated by caspase cleavage. The authors clearly identify the cleavage site and the caspase involved: caspase 6. The cleaved form of M2 seems relevant to IAV infection as it is accumulating after 12hpi. Using an M2 mutant D85A that cannot be cleaved by caspase 6 and truncated M2 mutant delta86-97 mimicking caspase cleaved M2, the authors are able to elegantly address the role of M2 cleavage. However, the importance of M2 caspase cleavage on IAV infection is not demonstrated. Eventually, addressing the impact of the caspase cleavage of M2 LIR motif on autophagy or CASM would be interesting.

      • Advance: conceptual.
      • Audience: basic research, specialized in virology, specialized in autophagy.
      • Field of expertise: virology, autophagy.

      We agree with the reviewer that we have made a conceptual advance in our understanding of the cell biology of influenza A virus infection. We have also determined the structure of the terminal part of the M2 tail in complex with LC3B. The biological importance of the phenotypes we show is most likely in transmission of the virus between hosts, which for IAV would require animal experiments outside the scope of this study. We have demonstrated regulation of the LIR motif by caspase cleavage in a variety of ways, using cell biological and biochemical methods. IAV is a very significant human and animal pathogen, and we believe we have made an important advance in describing a host-pathogen interaction of relevance for viral egress.

    1. Author response:

      Reviewer #1 (Public Review):

      Weaknesses:

      There are some minor weaknesses.

      Notably, there are not a lot of new insights coming from this paper. The structural comparisons between MCC and PCC have already been described in the literature and there were not a lot of significant changes (outside of the exo- to endo- transition) in the presence vs. absence of substrate analogues.

      We agree that the structures of the human MCC and PCC holoenzymes are similar to their bacterial homologs. That is due to the conserved sequences and functions of MCC and PCC across different species.

      There is not a great deal of depth of analysis in the discussion. For example, no new insights were gained with respect to the factors contributing to substrate selectivity (the factors contributing to selectivity for propionyl-CoA vs. acetyl-CoA in PCC). The authors state that the longer acyl group in propionyl-CoA may mediate stronger hydrophobic interactions that stabilize the alpha carbon of the acyl group at the proper position. This is not a particularly deep analysis and doesn't really require a cryo-EM structure to invoke. The authors did not take the opportunity to describe the specific interactions that may be responsible for the stronger hydrophobic interaction nor do they offer any plausible explanation for how these might account for an astounding difference in the selectivity for propionyl-CoA vs. acetyl-CoA. This suggests, perhaps, that these structures do not yet fully capture the proper conformational states.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      The authors also need to be careful with their over-interpretation of structure to invoke mechanisms of conformational change. A snapshot of the starting state (apo) and final state (ligand-bound) is insufficient to conclude *how* the enzyme transitioned between conformational states. I am constantly frustrated by structural reports in the biotin-dependent enzymes that invoke "induced conformational changes" with absolutely no experimental evidence to support such statements. Conformational changes that accompany ligand binding may occur through an induced conformational change or through conformational selection and structural snapshots of the starting point and the end point cannot offer any valid insight into which of these mechanisms is at play.

      Point accepted. We will revise our manuscript to use "conformational differences" instead of "conformational changes" to describe the differences between the apo and ligand-bound states.

      Reviewer #2 (Public Review):

      Comments and questions to the manuscripts:

      I'm quite impressed with the protein purification and structure determination, but I think some functional characterization of the purified proteins should be included in the manuscript. The activity of enzymes should be the foundation of all structures and other speculations based on structures.

      We appreciate this comment. However, since we purified the endogenous BDCs and the sample we obtained was a mixture of four BDCs, the enzymatic activity of this mixture cannot accurately reflect the catalytic activity of PCC or MCC holoenzyme. We will acknowledge this limitation in the discussion section of our revised manuscript.

      In Figure 1B, the structure of MCC is shown as two layers of beta units and two layers of alpha units, while there is only one layer of alpha units resolved in the density maps. I suggest the authors show the structures resolved based on the density maps and show the complete structure with the docked layer in the supplementary figure.

      We appreciate this comment. We have shown the cryo-EM maps of the PCC and MCC holoenzymes in fig. S8 to indicate the unresolved regions in these structures. The BC domains in one layer of MCCα in the MCC-apo structure were not resolved. However, we think it would be better to show a complete structure in Fig. 1 to provide an overall view of the MCC holoenzyme. We will revise Fig. 1B and the figure legend to clearly point out which domains were not resolved in the cryo-EM map and were built in the structure through docking.

      In the introduction, I suggest the author provide more information about the previous studies about the structure and reaction mechanisms of BDCs, what is the knowledge gap, and what problem you will resolve with a higher resolution structure. For example, you mentioned in line 52 that G437 and A438 are catalytic residues, are these residues reported as catalytic residues or this is based on your structures? Has the catalytic mechanism been reported before? Has the role of biotin in catalytic reactions revealed in previous studies?

      Point accepted. It was reported that G419 and A420 in S. coelicolor PCC, corresponding to G437 and A438 in human PCC, were the catalytic residues (PMID: 15518551). The same study also reported the catalytic mechanism of the carboxyl transfer reaction. The role of biotin in the BDC-catalyzed carboxylation reactions has been extensively studied (PMIDs: 22869039, 28683917). We will include this information in the introduction section of our revised manuscript.

      In the discussion, the authors indicate that the movement of biotin could be related to the recognition of acyl-CoA in BDCs, however, they didn't observe a change in the propionyl-CoA bound MCC structure, which is contradictory to their speculation. What could be the explanation for the exception in the MCC structure?

      We appreciate this comment. We do not have a good explanation for why we did not observe a change in the propionyl-CoA bound MCC structure. It is noteworthy that neither acetyl-CoA nor propionyl-CoA is the natural substrate of MCC. Recently, a cryo-EM structure of the human MCC holoenzyme in complex with its natural substrate, 3-methylcrotonyl-CoA, has been resolved (PDB code: 8J4Z). In this structure, the binding site of biotin and the conformation of the CT domain closely resemble those in our acetyl-CoA-bound MCC structure. Therefore, the movement of biotin induced by acetyl-CoA binding mimics that induced by the binding of MCC's natural substrate, 3-methylcrotonyl-CoA, indicating that, in comparison with propionyl-CoA, acetyl-CoA is closer to 3-methylcrotonyl-CoA regarding its ability to bind to MCC. We will discuss this possibility in our revised manuscript.

      In the discussion, the authors indicate that the selectivity of PCC to different acyl-CoA is determined by the recognition of the acyl chain. However, there are no figures or descriptions about the recognition of the acyl chain by PCC and MCC. It will be more informative if they can show more details about substrate recognition in Figures 3 and 4.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      How do the solved structures compare with the latest AlphaFold3 prediction?

      Since AlphaFold3 was not released when our manuscript was submitted, we did not compare the solved structures with the AlphaFold3 predictions. We have now carried out the predictions using AlphaFold3. Due to the token limitation of the AlphaFold3 server, we can only include two α and six β subunits of human PCC or MCC in the prediction. The overall assembly patterns of the AlphaFold3-predicted structures are similar to those of the cryo-EM structures. The RMSDs between PCCα, PCCβ, MCCα, and MCCβ in the apo cryo-EM structures and those in the AlphaFold3-predicted structures are 7.490 Å, 0.857 Å, 7.869 Å, and 1.845 Å, respectively. The PCCα and MCCα subunits adopt an open conformation in the cryo-EM structures but adopt a closed conformation in the AlphaFold3-predicted structures, resulting in large RMSDs.
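      As a reminder of what the reported RMSD values measure, here is a minimal sketch of a root-mean-square deviation between two matched sets of atomic coordinates. It assumes the two structures have already been superposed and that atoms are paired one-to-one (for example, Cα atoms in the same residue order). The response does not state which atoms or which software were used, so the function name and the toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (in the input units, e.g. Å) between two pre-aligned,
    one-to-one matched lists of (x, y, z) coordinates.

    No superposition is performed here; that step is assumed to have
    been done beforehand.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates (hypothetical, not taken from any deposited structure).
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(f"RMSD = {rmsd(experimental, predicted):.3f}")
```

      Because every atom pair contributes its squared displacement, a domain that swings between an open and a closed conformation can dominate the value for a whole subunit even when each domain is individually well predicted.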

    1. Abstract: Defining a multicellular model can be challenging. There may be hundreds of parameters that specify the attributes and behaviors of objects. Hopefully the model will be defined using some format specification, e.g., a markup language, that will provide easy model sharing (and a minimal step toward reproducibility). PhysiCell is an open source, physics-based multicellular simulation framework with an active and growing user community. It uses XML to define a model and, traditionally, users needed to manually edit the XML to modify the model. PhysiCell Studio is a tool to make this task easier. It provides a graphical user interface that allows editing the XML model definition, including the creation and deletion of fundamental objects, e.g., cell types and substrates in the microenvironment. It also lets users build their model by defining initial conditions and biological rules, run simulations, and view results interactively. PhysiCell Studio has evolved over multiple workshops and academic courses in recent years which has led to many improvements. Its design and development has benefited from an active undergraduate and graduate research program. Like PhysiCell, the Studio is open source software and contributions from the community are encouraged.

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.128), and the reviews have been published under the same license. This is part of the PhysiCell Ecosystem Series: https://doi.org/10.46471/GIGABYTE_SERIES_0003

      Reviewer 1. Meghna Verma:

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined?

      The authors have provided links for video descriptions for installation and that is appreciated.

      One overall recommendation: if all the screenshots (e.g., Figs 1-12 of the main paper and all the subsections in the Supplementary) could be combined into one figure, that would help convey a complete overview and improve the overall flow of the paper.

      Additional comments are available here: https://gigabyte-review.rivervalleytechnologies.comdownload-api-file?ZmlsZV9wYXRoPXVwbG9hZHMvZ3gvVFIvNTA3L1Jldmlld19QaHlzaUNlbGxTdHVkaW9fTVYucGRm

      Reviewer 2. Koert Schreurs and Lin Wouters supervised by Inge Wortel

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is?

      The problem statement is addressed in the introduction, which mentions the need for a GUI tool as a much more accessible way to edit the XML-based model syntax. However, it is somewhat confusing who exactly the intended audience of the paper is. Is the paper targeted at researchers that already use PhysiCell, but might want to switch to the GUI version? Or should it (also) target the potential new user-base of researchers interested in using ABMs, for whom the XML version was not sufficiently accessible and who will now gain access to these models because there is a GUI? Specifying the intended audience might impact some sections of the paper. For example, for users who already use PhysiCell, the step-by-step tutorials might not be useful since they would already know most of the available options; they would just need a quick overview of what info is in which tab. But if the paper is (also) targeted at potential new users, then some additional information could make both the paper and the tool much more accessible, such as:
      
      • A clear comparison to other modeling frameworks and their functionalities. Why should they use PhysiCell instead of one of the other available (GUI) tools? For example, the referenced Morpheus, CC3D and Artistoo all focus on a different model framework (CPMs); this might be worth mentioning. And what about Chaste? Does it represent different types of models, or are there other reasons to consider PhysiCell over Chaste or vice versa? For new users, this would be important information to include. The paper currently also does not mention other frameworks except those that offer a GUI. While the main point of the paper is the addition of the GUI, for completeness' sake it might still be good to mention a broader overview of ABM frameworks and how they compare to PhysiCell, or simply to refer to an existing paper that provides such an overview.
      • The current tutorial immediately dives into very specific instructions (what to click and exact values to enter), often without explaining what these options mean or do. New users would probably appreciate a rough outline of which types of processes can be modelled, and which steps they would take to do so. This could be as easy as summarising the different main tabs before going into the details. I understand that some of these explanations will overlap with the main PhysiCell software – but considering that the GUI will open up modelling to a different type of community, it might make sense to outline them here to get a self-contained overview of functionality.
      • Indeed, if the above information is provided, the detailed tutorial might fit better as an appendix or in online documentation. That would also leave more space to explain not only which values to enter, but also what these variables do, why choose these values, what other options to consider, etc. Having this information together in one place would be very useful for beginning users.

      Is the source code available, and has an appropriate Open Source Initiative license been assigned to the code?

      The software is available under the GPL v3 licence.

      As Open Source Software are there guidelines on how to contribute, report issues or seek support on the code?

      There is a GitHub repository, ensuring that it is possible to contribute and report issues, and the paper explicitly invites community contributions. However, although the paper mentions that it is possible to seek support through GitHub Issues and “Slack channels”, we could find no link to the latter resource. This should probably be added to make this resource usable for the reader (or otherwise the statement should be removed).

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined?

      Mostly yes, as installation and deployment are outlined in the paper and documentation. However, we did notice a couple of issues:

      • The studio guide explains how to compile a project in PhysiCell (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md), but does not mention that Mac users need to specify the g++ version at the top of the Makefile. This is explained in a separate blog (http://www.mathcancer.org/blog/setting-up-gcc-openmp-on-osx-homebrew-edition/) but should be outlined (or at least referenced) here as well.
      • There are several different resources covering the installation process, referring to e.g. github.com/physicell-training, github.com/PhysiCell-Tools/Studio-Guide, and the abovementioned blog. But this might not be very accessible to all users targeted by the new GUI functionality (especially when command line interventions and manual Makefile edits are involved). While not all of this has to be changed before publication, having all information in one place would already improve accessibility to a larger user-base.
      • When following the instructions (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md), “python studio/bin/studio.py -p -e virus-sample” the -p flag gives an error: “Invalid argument(s): [‘-p’]”. We assumed it has to be left out, but perhaps the docs have to be updated.

      Is the documentation provided clear and user friendly?

      Mostly yes, as there is already a lot of documentation available. However, the user-friendliness could be improved with some minor changes. For example, the documentation could be made more user-friendly if resources were available from a central spot. Currently, information can be found in different places:

      • https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md provides installation instructions and a nice overview of what is where in the GUI, but as mentioned above, does not mention potential issues when installing on MacOS.
      • The paper provides very detailed examples; these might be nice to include along with the abovementioned overview.
      • Potentially other places as well.

      It would be great if the main documentation page could at least link to these other resources with a brief description of what the user will find there. Further, some additions would make the documentation more complete:

      • It would be good to have an overview somewhere of all the configuration files that can be supplied/loaded (e.g. those for “rules” and for initial configurations).
      • A clearer instruction/small tutorial on how to use simularium and paraview with physicell studio; especially for paraview there is no instruction on how to use your own data or make your own `.pvsm` file.

      In the longer term, it might be worthwhile to set up a self-contained documentation website (this is relatively easy nowadays using e.g. Github pages), which can outline dependencies, installation instructions, a quick overview, detailed tutorials, example models, links to Github issues/slack communities. This is not a requirement for publication but might be worth looking into in the future as it would be more user-friendly.
      

      Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level?

      No. The core functionality of the software is nicely outlined in the Github README (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md), but as mentioned before, this high-level overview is missing in the paper itself. The README and paper recommend installing the Anaconda python distribution to get the required python dependencies. This is fine, but adding a setup file or requirements.txt might still be useful for users who are more familiar with python and want a more minimal installation. Providing a conda environment.yml that allows running the studio along with paraview and/or simularium might also be helpful. Note that running the studio with simularium in anaconda did not work because anaconda did not have the required vtk v9.3.0; instead we had to install simularium without anaconda (“pip3 install simularium”).
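      The version mismatch described above (a missing vtk release inside an Anaconda environment) is the kind of problem a short dependency check could surface before the Studio is launched. The following is only a sketch using the Python standard library; the package names are taken from this review's text and are assumptions, not an authoritative list of the Studio's dependencies, and `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical package list (assumption): the real dependencies should be
# taken from the Studio documentation or a future requirements.txt.
for name in ("vtk", "simularium"):
    print(name, installed_version(name) or "not installed")
```

      A requirements.txt or environment.yml would make such a check unnecessary for most users, but it illustrates how little is needed to report a missing or outdated package clearly.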

      Are there (ideally real world) examples demonstrating use of the software?

      The detailed tutorial nicely walks the reader through the tool (although as mentioned before, a high-level overview is missing and the level of detail feels slightly out of place in the paper itself). When walking through the example in the paper and the supplementary, we did run into a few (minor) issues:

      - It might be good to stress explicitly that after copying the template.xml into tumor_demo.xml, the first step is always to compile using “make”. The paper mentions “Assuming … you have compiled the template project executable (called “project”) …”. But it might not be immediately clear to all users how exactly they should do so (presumably by running “make tumor_demo” after copying the xml file?).
      - When running “python studio/bin/studio.py -c tumor_demo.xml -e project” as instructed, a warning pops up that “rules0.csv” is not valid (although the tool itself still works).
      - The instructions for plotting say to press “enter” when changing cmin and cmax, but Mac offers only a return key. Pressing fn+return to get the enter functionality also does not work; it might be good to offer an alternative for Mac.
      - When reproducing the supplementary tutorial, results were slightly different. It might be good if the example would offer a random seed so that users can verify that they can reproduce these results exactly. In our hands, reproducing figs 39, 40, 48, 49 yielded far more (red) macrophages (even when running multiple times), but we could not be sure if this is due to variation between runs, or a mistake in the settings somewhere.
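      The request for a random seed in the last point can be illustrated with a toy stochastic model. This is a generic Python sketch, not PhysiCell or Studio code: `simulate_counts`, its parameters, and the recruitment probability are all hypothetical and serve only to show that a fixed seed makes a stochastic run repeatable.

```python
import random


def simulate_counts(seed, n_steps=1000, p_recruit=0.02):
    """Toy stochastic model: count 'recruitment' events over n_steps."""
    # A private generator keeps the run independent of global random state.
    rng = random.Random(seed)
    return sum(1 for _ in range(n_steps) if rng.random() < p_recruit)


run_a = simulate_counts(seed=42)
run_b = simulate_counts(seed=42)
run_c = simulate_counts(seed=7)

print(run_a == run_b)  # True: the same seed reproduces the same count
print(run_a, run_c)    # different seeds will usually give different counts
```

      With a documented seed, a reader who still sees more macrophages than in the published figures could rule out run-to-run variation and look for a difference in settings instead.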
      
      
      The paper mentions that they have started setting up automated testing, but it does not give an idea of what the current test coverage is. Did they add a few tests here and there, or start to systematically test all parts of the software? I understand the latter might not be achievable immediately, but it would be good if users and/or contributors can at least get a sense of how good the current coverage is. (Note: the framework uses pytest, which seems to offer some functionality to generate coverage reports, see e.g. https://www.lambdatest.com/blog/pytest-code-coverage-report/). The code in studio_for_pytest.py has a comment “do later, otherwise problems sometimes”, but it is not entirely clear if the relevant issue has been resolved.
      

      Additional Comments: The presented tool offers a GUI interface to the PhysiCell framework for agent-based modeling. As outlined for the paper, this offers significant value to the users since editing a model is now much more accessible. The tool comes with extensive functionality and instructions. Overall, the tool functions as advertised, and will be of great value to the community of PhysiCell users that now have to edit XML files by hand. It is therefore (mostly) publishable as is if some of the issues with installation (mentioned above) can be straightened out. That said, we do think some improvements could make both the tool and the paper more accessible to a larger user audience. Most of these have been mentioned in the other questions, but we will list some additional ones below. Note that many of these are just suggestions, so we will leave it up to the authors if and when they implement them.

      Suggestions for the paper: While the paper nicely outlines design ideas and usage of the tool, there were some points where we felt that the main point did not quite come across, for example:

      - As mentioned in the question about problem statement and intended audience, adding some information to the paper would make it a more useful resource to users not yet familiar with PhysiCell (see remarks there).
      - The section “Design and development” describes the development history of the tool. In principle this is a valuable addition, because it illustrates how the project is under ongoing development and has already been improved several times based on feedback of users. However, the amount of information on each previous stage is slightly confusing; it is not entirely clear how this relates to the paper and current tool. If the main point is to showcase that the current tool has been built based on practical user experiences, this would probably come across better if this section was somewhat shorter and focused on the design choices rather than previous versions. If the main point is something else, it should be clarified what the main idea is.
      - The point of Table 1 was unclear to us – consider removing or explaining the main idea.
      - Several figures do not have captions (e.g. Figure 1 but also others); it would be helpful to clarify what message the figure should convey.
      - P4 “adjust the syntax for Windows if necessary” – is it self-explanatory how users should adjust? Consider adding the correct code for windows as well if possible, since users that want to use the GUI tool might not be familiar with command line syntax.
      - P6 “if you create your own custom C++ code referring directly to cell type ID” – this functionality is never discussed. This might be part of the general PhysiCell functionality, but it would be good to at least provide a link to a resource on how you could do this.
      - P8 “Only those parameters that display … editing the C++ code” – it was not entirely clear to me what this means, could you clarify?
      - P13 mentions you can immediately see changes to the model parameters made. This is very useful for prototyping when users want immediate feedback. However, what happens when you try to save output for a simulation where parameters were changed while the simulation was running? Would users be reminded that their current output is not representative?
      - Discussion: it is good to mention that the tool is already being used. Can you give an indication based on your experience how long it takes new users to learn to navigate the tool? This might be useful information to add in the paper.
      - The last statement on LLMs seems to come out of nowhere. Consider leaving it out or expanding further on what would be needed to make this work/how feasible this is.

      Further comments on the tool itself:

      - The paper mentions that results may not be fully reproducible if multiple threads are used (I assume this is the case even when a random seed is set). In this case, would it make sense to throw a warning the first time a user tries to set a seed with multiple threads, to avoid confusion as to why the results are not reproducible?
      - Unusable fields are not always greyed out to indicate that they are disabled, which sometimes makes it seem as though the tool is unresponsive. In other places unusable options are set to grey, so it might be good to double-check if this is consistent.
      - At the initial conditions (IC) page there is no legend; it might be good to add one.
      - There are some small inconsistencies between the field names mentioned in the paper and those in the tool/screenshots. For example “boundary condition” (p5) should be “dirichlet BC”, “uptake” (p6) should be “uptake rate”. For the latter, the paper mentions that the length scale is 100 micron but this should be visible in the tool as well.
      - Not all fields have labels, so it is not always clear what the options do (see e.g. drop-downs in Figure 6).
      - There are a few points in the tool where you have to “enable” a functionality before it works, but this might not always be intuitive. For example, if you upload a file with initial conditions, it can be assumed that you want to use it. There might be good reasons for this in some cases but in general, consider if all these checkpoints are necessary or if this could be simplified. Same goes for the csv files that have to be saved separately instead of through the main “save” button – in the long term it might be worth saving all relevant files when they are updated, or at least throwing a warning that you have to save some of them separately.
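      The warning suggested in the first point could be as small as the sketch below. It is written in plain Python with the standard `warnings` module; the function name and its two parameters are hypothetical and do not correspond to any existing Studio or PhysiCell API.

```python
import warnings


def check_seed_reproducibility(random_seed, omp_num_threads):
    """Return True if a fixed seed can be expected to reproduce results.

    Hypothetical helper: warns when a seed is combined with several threads.
    """
    if random_seed is not None and omp_num_threads > 1:
        warnings.warn(
            "A random seed is set but more than one thread is in use; "
            "results may not be exactly reproducible.",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True


print(check_seed_reproducibility(random_seed=0, omp_num_threads=1))  # True
```

      Checking `random_seed is not None` rather than its truthiness means that a seed of 0 is still treated as a deliberate choice.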

    1. Abstract: Despite advances in identifying genetic markers associated to severe COVID-19, the full genetic characterisation of the disease remains elusive. This study explores the use of imputation in low-coverage whole genome sequencing for a severe COVID-19 patient cohort. We generated a dataset of 79 imputed variant call format files using the GLIMPSE1 tool, each containing an average of 9.5 million single nucleotide variants. Validation revealed a high imputation accuracy (squared Pearson correlation ≈0.97) across sequencing platforms, showing GLIMPSE1’s ability to confidently impute variants with minor allele frequencies as low as 2% in Spanish ancestry individuals. We conducted a comprehensive analysis of the patient cohort, examining hospitalisation and intensive care utilisation, sex and age-based differences, and clinical phenotypes using a standardised set of medical terms developed to characterise severe COVID-19 symptoms. The methods and findings presented here may be leveraged in future genomic projects, providing vital insights for health challenges like COVID-19.
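      For readers unfamiliar with the accuracy metric quoted in the abstract, the squared Pearson correlation between true genotypes and imputed dosages can be computed as in the sketch below. This is an illustration only: the data are invented, and the sketch does not reproduce how the authors or GLIMPSE1 aggregate the metric (for example per variant or per allele-frequency bin).

```python
def squared_pearson(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov * cov / (var_x * var_y)


# Hypothetical data (assumption): true genotypes coded as alternate-allele
# counts (0, 1, 2) and the corresponding imputed dosages.
truth = [0, 1, 2, 0, 1, 2, 0, 1]
dosage = [0.1, 0.9, 1.8, 0.0, 1.2, 2.0, 0.2, 1.0]
print(round(squared_pearson(truth, dosage), 3))
```

      A value close to 1 means the imputed dosages track the true genotypes almost linearly; the metric says nothing about which individual calls are wrong.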

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.127), and the reviews have been published under the same license. For a video summary from the author see: https://youtu.be/x6oVzt_H_Pk?si=Byufhl0mIL3h0K6u

      The reviews are as follows:

      Reviewer 1. Jong Bhak:

      Severe cases of covid-19 patients are critical data. This manuscript deals with detailed clinical information genome set as a subset of exome sequences and provides invaluable data for on-going global covid-19 omics studies.

      Reviewer 2. Alfredo Iacoangeli:

      The authors present the release of a new dataset that includes low coverage WGS data of 79 individuals who experienced severe covid-19 in Madrid (Spain). The authors processed the data and imputed common variants and they are making this dataset available to the scientific community. They also present the clinical data of these patients in a descriptive and informative fashion. Finally, the authors also validated the quality of their imputation, showcasing the potential of low coverage WGS as an alternative to microarrays. Overall the manuscript is written very well, clear, and exhaustive. The data is certainly valuable. Its generation, processing, and analysis appear robust.
      

      Overall I support the publication of this article and dataset. I only have a small number of minor suggestions for the authors: The sentence "Traditionally, the genotyping process has relied on array technologies as the standard, both at the broader GWAS level and the more specific genetic scoring and genetic diagnostics levels" sounds a little off. I totally understand where the authors come from, but given the central role of NGS and Sanger for genetic diagnostics I would suggest that the authors modify it accordingly or keep the GWAS focus.

      Please double-check the use of statistical terms in the description of the imputed data. For example: "On average, each VCF file in this rich dataset contains 9.49 million high-confidence single nucleotide variants [95%CI: 9.37 million - 9.61 million] (Figure 1)." The use of CI in this context is a little misleading as it is not strictly referring to a distribution of probability but to a finite collection. A range would be more appropriate. The authors say that they examined the ethnicity of the 79 individuals, however I do not think the ancestry is actually reported anywhere while a few figures show ancestral population data. The authors might clarify or correct the terminology.
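      The distinction between a confidence interval and a range can be made concrete with a small calculation. The numbers below are invented for illustration and are not the study's data; `summarise` is a hypothetical helper, and the interval uses a normal approximation (multiplier 1.96), which is itself an assumption for so few values.

```python
import statistics


def summarise(counts):
    """Return the mean, a normal-approximation 95% CI of the mean, and the range."""
    mean = statistics.mean(counts)
    sem = statistics.stdev(counts) / len(counts) ** 0.5  # standard error of the mean
    return {
        "mean": mean,
        # Uncertainty about the mean of a wider population of files.
        "ci95": (mean - 1.96 * sem, mean + 1.96 * sem),
        # Spread of the files actually present in this finite collection.
        "range": (min(counts), max(counts)),
    }


# Hypothetical per-file variant counts, in millions (assumption).
counts = [9.1, 9.4, 9.5, 9.6, 9.8, 9.3, 9.7, 9.5]
summary = summarise(counts)
print(summary["ci95"])
print(summary["range"])
```

      For these values the interval around the mean is much narrower than the range, which is why reporting it as a description of the files themselves understates how much individual files differ.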

      Looking at figure 2 the sentence "although the male age distribution exhibits a broader range and higher variability, suggestive of a greater" does not appear justified. The authors might want to clarify or correct accordingly.

      The sentence "This exploratory analysis highlights the diverse ways in which severe COVID-19 can present, and the importance of comprehensive and nuanced clinical phenotyping in improving our understanding and management of the disease." suggests some basic clustering might be useful. The readers might benefit from a couple of graphs or figures quantifying the overlap of the SNPs across samples and maybe one that shows the density of SNPs across the genome.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances, and no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from a block in protein synthesis, and cell division coupled to differential stability of the proteins.

      We agree that the presented data could benefit from the addition of the suggested experiments. We will address the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine if RPS26-depleted ribosomes are translationally active.

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      Tables S3 and S6 show the mass spectrometry output data from MaxQuant analysis without any filtering. Certain identifications, i.e. those denoted as contaminants (such as keratins), were removed during statistical analysis in Perseus software. Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2000 proteins were identified in a 125min gradient, with over 80% of proteins identified with at least 2 unique peptides. However, we acknowledge that the description of Tables S3 and S6 may lead to misunderstanding, thus we will clarify their explanation.

      I am not convinced that the mass spec data is reliable.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein co-precipitated with FMR1 RNA containing expanded CGG repeats. However, we do not claim that they interact directly. Downregulation of FMRpolyG biosynthesis could be an outcome of the alteration of ribosomal assembly, changes in efficiency and fidelity of PIC scanning, impeded elongation or, more likely, a combination of some of these processes. We will provide a better explanation regarding those issues in the revised version of the manuscript.

      (3) Rps26 is an essential gene, and I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under conditions where Rps26 was reduced, not fully depleted, to give small growth defects.

      We agree with Reviewer 1 that RPS26 is an essential protein. Previously, it was shown that cell viability is decreased in cells with a C-terminal deletion mutant of RPS26 (Havkin-Solomon T, Nucleic Acids Res 2023). We will address the question regarding the suppression of FMRpolyG in models with partial RPS26 knock-down.

      (4) Knockdown efficiency must be shown for all tested genes.

      Missing experiments showing efficiency of knock-down will be included in the revised version of the manuscript.

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      We will clarify this ambiguity in the revised version of the manuscript.

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effect may stem partially from reduced ribosome content, however, we argue that this is not the only explanation. In the publication concerning RPS25 regulation of G4C2-related RAN translation (Yamada SB, 2019, Nat Neurosci), it was shown that RPS25 KO does not affect global translation. Our experiments (SUnSET assay, unpublished) indicated that RPS26 KD also did not reduce global translation rate significantly. We will present these data in the revised version of the manuscript.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      The results shown in Fig. S3 do not imply that RPS26 has no effect whatsoever on the selection of the start codon context. We just tested a few hypotheses. We decided to test the -4 position, because this position was indicated as the most sensitive to RPS26 regulation in yeast (Ferretti M, 2017, Nat Struct Mol Biol). Regarding the WebLOGO analysis, we wrote in the manuscript that we did not identify any specific motif or enrichment within analysed transcripts in comparison to background. We will clarify this ambiguity in the revised version of the manuscript.

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      As in (7).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of poly G, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in Hela cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      We agree that the presented data could benefit from the addition of some experiments. Therefore we will address questions regarding the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine if RPS26-depleted ribosomes are translationally active. However, we did not state that RPS26 binds directly to RNA with expanded CGG repeats and that this interaction is crucial for translation regulation of the studied RNA. We just tested such hypotheses. We will improve the narrative in the revised version of the manuscript to make the major conclusions clearer.

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank Reviewer 2 for this comment. We will show the data derived from a few different cell models that we have already obtained. Moreover, we will include results of experiments with luminescence readout for FMRpolyG fused with luciferase upon RPS26 KD.

      Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B); however, the change in the composition of the ribosomes involved in actual translation has not been elucidated. Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation", is not fully supported by the presented data and is misleading.

      We thank Reviewer 3 for critical comments and suggestions. We agree that the proposed title may be misleading and the presented data does not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Hence, we will change the title together with a narrative regarding these unfortunate statements that go beyond the presented results.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We will address the question regarding the influence of the content of CGG repeats and START codon selection (including different near-cognate start codons) on RPS26-sensitive translation, and present these data in the revised version of the manuscript.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that there are multiple factors affecting final translation of investigated mRNA including aforementioned processes. We evaluated the level of FMR1 mRNA, which turned out not to be affected upon RPS26 depletion (Figure 2B&C), however, we will address other possibilities as well.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G et al., Front Genet 2019), additional evaluations for cellular viability would strengthen this conclusion.

      We thank Reviewer 3 for this suggestion. We addressed the effect of RPS26 KD on apoptotic process induced by FMRpolyG. We will perform other experiments regarding different aspects of FMRpolyG-mediated cell toxicity as well.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The mechanisms by which axonal projections find their correct target require the interplay of signalling pathways and cell adhesion that act over short and long distances. The current study aims to use the small ventral lateral clock neurons (s-LNvs) of the Drosophila clock circuit as a model to study axon projections. These neurons are born during embryonic stages and are part of the core of the clock circuit in the larval brain. Moreover, these neurons are maintained through metamorphosis and become part of the adult clock circuit. The authors use anti-Pdf antibody staining or Pdf>GFP as a read-out for the axonal length. Using ablation of the MB, the overall target region of the s-LNvs, the authors find defects in the projections. Next, by using Dscam mutants or knock-down they observe defects in the projections. Manipulations of the DNs, another group of clock neurons, can induce defects in the s-LNvs axonal form, suggesting an active role of these neurons in the morphology of the s-LNvs.

      Strengths:

      The use of Drosophila genetics and a specific neural type allows targeted manipulations with high precision.

      Proposing a new model for a small group of neurons for axonal projections allows us to explore the mechanism with high precision.

      Weaknesses:

      It is unclear how far the proposed model can be seen as developmental.

      The study of changes in fully differentiated and functioning neurons may affect the interpretation of the findings.

      We appreciate the reviewer's feedback on the strengths and weaknesses of our study.

      We acknowledge the strengths of our research, particularly the precision afforded by using Drosophila genetics and a specific neural type for targeted manipulations, as well as the proposal of a new model for studying axonal projections in a small group of neurons.

      We understand the concerns about the developmental aspects of our proposed model and the use of Pdf-GAL4 >GFP as a read-out for the axonal length (revised manuscript Figure 1--figure supplement 1). However, even with the use of Clk856-GAL4 that began to be expressed at the embryonic stage (revised manuscript Figure 3--figure supplement 1) to suppress Dscam expression, the initial segment of the dorsal projection of s-LNvs (the vertical part) remained unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      Thank you for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Public Review):

      Summary:

      The paper from Li et al shows a mechanism by which axons can change direction during development. They use the sLNv neurons as a model. They find that a new group of neurons (DNs), which appears during post-embryonic proliferation, secretes netrins and repels the axonal tip of the LNvs horizontally towards the midline.

      Strengths:

      The experiments are well done and the results are conclusive.

      Weaknesses:

      The novelty of the study is overstated, and the background is understated. Both things need to be revised.

      We appreciate your acknowledgment that the experiments were well-executed and the results conclusive. This validation reinforces the robustness of our findings.

      We take note of your feedback regarding the novelty of the study being overstated and the background being understated. When axonal projections navigate without distinct landmarks, such as the midline or the layers, columns, and segments, they face more challenges and uncertainties. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons.

      We value your insights, and we have carefully addressed these points in our manuscript revision to improve the overall quality and presentation of our research.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      The overall idea of using the s-LNvs as a model is indeed intriguing. There are genetic tools available to tackle these cells with great precision.

      However, the stage at which these cells are investigated raises some issues that I feel are critical to address.

      These neurons develop their axonal projections during embryogenesis and are fully functioning when the larvae hatch, thus to investigate axonal pathfinding one would have to address embryonic development.

      The larval brain indeed continues to grow during larval life, however extensive work from the Hartenstein lab, Truman lab, and others have shown that the secondary (larval born) neurons do not yet wire into the brain, but stall their axonal projections.

      It is thus quite unclear, what the authors are actually studying.

      One interpretation could be that the authors observe changes in axon length due to morphological changes in the brain. Indeed, as the MB expands, the anatomy of the surrounding neuropil changes too.

      Moreover, it is unclear when exactly the Pdf-Gal4 (and other drivers) are active, thus how far (embryonic) development of s-LNvs is affected, or if it's all happening in the differentiated, functioning neuron. (Gal4 temporal delay and dynamics during embryonic development may further complicate the issue). As far as I am aware the MB drivers might already be active during embryonic stages.

      Since the raised issue is quite fundamental, I am not sure what might be the best and most productive fashion to address this.

      Eg. either to completely re-focus the topic on "neural morphology maintenance" or to study the actual development of these cells.

      We thank the reviewer for the detailed and insightful feedback on our study. We have tested whether Pdf-GAL4 could effectively label s-LNvs, and tracked the s-LNv projection in the early stage after larval hatching. We did not observe the PDF antibody staining signal or the GFP signal driven by Pdf-GAL4 when the larvae were newly hatched. At 2-4 hours ALH, PDF signals were primarily concentrated at the end of axons, while GFP signals were mainly concentrated at the cell body. Helfrich-Förster initially detected immunoreactivity for PDF in the brains approximately 4-5 hours ALH. The GFP signal driven by the Pdf-GAL4 driver is indeed delayed. However, at 8 hours ALH, the GFP signal strongly co-localized with the PDF signal within the axons (see revised manuscript lines 98-101) (Figure 1—figure supplement 1).

      Based on previous research findings and our staining of Clk856-GAL4 >GFP, it is indeed confirmed that the dorsal projection of s-LNvs in Drosophila is formed during the embryonic stage (Figure 3—figure supplement 1). The s-LNvs in first-instar larval Drosophila are capable of detecting signal output and may play a role in regulating certain behaviors. Our selection of tools for characterizing the projection pattern of s-LNv was not optimal, leading us to overlook the crucial detail that the projection had already formed during its embryonic stage.

      However, even when employing Clk856-GAL4 to suppress Dscam expression from the embryonic stage, the initial segment of the dorsal projection of s-LNvs (the vertical part) remains unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      From the results searched in the Virtual Fly Brain (VFB) database (https://www.virtualflybrain.org/), it is clear that the neurons that form synaptic connections with s-LNvs at the adult stage are essentially completely different from the neurons that are associated with them at the L1 larval stage. Thus, most neurons that form synapses with s-LNvs in the early larvae either cease to exist after metamorphosis or assume other roles in the adult stage. Similar to the scenario where Cajal-Retzius cells and GABAergic interneurons establish transient synaptic connections with entorhinal axons and commissural axons, respectively, these cells form a transient circuit with presynaptic targets and subsequently undergo cell death during development. In our model, the neurons that synapse with s-LNvs in early development serve as "placeholders," offering positive or negative cues to guide the axonal targeting of s-LNvs towards their ultimate destination.

      Thank you again for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      In the introduction too many reviews are cited and very few actual research papers. This should be corrected and the most significant papers in the field should be cited. For example, there is no reference to the pioneering work from the Christine Holt lab or the first paper looking at axon guidance and guideposts by Klose and Bentley, Isbister et al 1999.

      The introduction should encapsulate the actual knowledge based on actual research papers.

      We acknowledge your concern regarding the citation of review papers rather than primary research papers in the introduction. Following your suggestion, we have revised the introduction section to incorporate references to relevant research papers.

      In the introduction and discussion: The authors cite reviews where the signals that guide axons across different regions including turning are shown and they end up saying: "However, how the axons change their projection direction without well-defined landmarks is still unclear." I think the sentence should be changed. Many things are still not clear but this is not a good phrasing. Maybe they could focus on their temporal finding?

      We appreciate the reviewer's feedback and insightful suggestions. We agree that emphasizing the temporal aspect is crucial in our study. However, we also recognize the significance of understanding the origin of signals that guide axonal reorientation at specific locations. While axonal projections navigating without distinct landmarks pose more challenges and uncertainties compared to those guided by prominent landmarks like the midline, our research demonstrates the crucial role of a specific cell population near turning points in providing accurate guidance cues to ensure precise axonal reorientation. We have revised our phrasing in the introduction and discussion to better reflect these key points (see revised manuscript lines 69-71 and 350-354). Thank you for highlighting the significance of focusing on our temporal findings and the complexities involved in studying axonal projection.

      Many rather old papers have looked into the effect of repulsive guideposts to guide axon projections. In particular, I can think of the paper from Isbister et al. 1999 (DOI: 10.1242/dev.126.9.2007) that not only shows how semaphorin guides Ti axon projection but also shows how the pattern of expression of sema 2a changes during development to guide the correct projection. I really think that the novelty of the paper should be revised in light of the actual knowledge in the field.

      We appreciate the reviewer's reference to the seminal work by Isbister et al. (1999) and the importance of guidepost cells in axon projection guidance, which we have already cited in our revised manuscript. It is crucial to recognize that segmented patterns such as the limb segment traversed by Ti1 neuron projections or neural circuits formed in a layer- or column-specific manner also serve as intrinsic "guideposts," offering valuable insights into axonal pathfinding processes. In our model, explicit guidance cues are lacking. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons (see revised manuscript lines 350-354). We have ensured that our revised manuscript reflects these insights and emphasizes the significance of studying axonal guidance in the absence of distinct guideposts. Thank you for underscoring these essential points, which enhance our understanding of axonal projection dynamics.

      Minors:

      Line 54, the authors start talking about floorplate at the end of a section on Drosophila. Please use “In vertebrates”, or “in invertebrates” or “in Drosophila” etc.. when needed to put things in context.

      We thank the reviewer for this suggestion and have modified this sentence. Please refer to lines 62-63 of the revised manuscript.

      Line 69: many factors change the axonal outgrowth. The authors are missing the paper from Fernandez et al. 2020, who have shown that unc5 the receptor of netrin induces the stalling for sLNvs projections before the turn. https://doi.org/10.1016/j.cub.2020.04.025

      We thank the reviewer for this suggestion and have added this research article. Please refer to line 79 of the revised manuscript.

      Line 99: "precisely at the pivotal juncture". It is hard to see how it was done in the figures shown. Can the authors add a small panel with neuronal staining showing this (please no HRP)?

      For all figures, the magenta is too strong and it is really hard to see the sLNvs projections. Can this be sorted, please?

      We have depicted the pivotal juncture in the schematic diagram on the left side of Figure 1C. Additionally, we have included a separate column of images without HRP in Figure 1A. Moreover, we have modified the pseudo-color of HRP from magenta to blue to enhance the visualization of the s-LNv projection. The figure legends have also been correspondingly modified.

      Line 407: Spatial position relationship between calyx and s-LNvs. OK107-GAL4 labels ... calyx and s-LNvs labeled by, which which.

      We have modified it according to your suggestion. Please refer to lines 430-432 of the revised manuscript.

      Line 137 typo RPRC

      We thank the reviewer for noticing this mistake, which has now been corrected. Please refer to lines 148-149 of the revised manuscript.

      Section 158-164. the paper from Zhang et al 2019 needs to be cited since they have found the same effect of decreasing Dscam even if they didn't think about horizontal projection.

      Following the suggestion, we have included in the manuscript the phenotype observed by Zhang et al. (2019) upon knocking down Dscam1-L in adults. Please refer to lines 170-172 of the revised manuscript.

      Line 176: typo senses (instead of sensor).

      Thank you for pointing out our mistake. We have modified it according to your suggestion. Please refer to line 189 of the revised manuscript.

      Line 193: more than Interesting it is Notable. Add "ubiquitous" knockdown.

      Thank you for the suggestion. We have included the word "ubiquitous" to enhance the precision of the narrative. Please refer to line 206 of the revised manuscript.

      Line 224: the pattern of expression of the crz cells is not visible where the projections of sLNvs are located. Are they in that region? Or further away?

      We've changed the pseudo-color of HRP, and in the updated Figure 5- figure supplement 1, you can see the projection pattern of crz+ cells, positioned close to the end of the s-LNv axon terminal.

      Line 243: applied? Do you mean "used"

      Thank you for the suggestion. We have revised it at line 256.

      Figure 5 Sup1: the schematic shows DNs proliferation that is not visible on the GFP image. Please comment.

      We have modified Figure 5, figure supplement 1 for the 120 h per-GAL4, Pdf-GAL80 >GFP expression pattern. Due to the strong GFP intensity in some DN neurons, there was a loss of GFP signal. Additionally, in Figure 6, figure supplement 1, we have added co-localization images of DN and s-LNv at 72 h and 96 h. To better illustrate the co-localization information, we have shown only a portion of the layers in the right panel. We hope these additions address your concerns.

      Line 251: cite Fernandez et al. 2020 with Purohit et al 2012.

      We have modified it according to your suggestion. Please refer to line 264 of the revised manuscript.

      Line 272: you have not shown synergistic effects because you have not modulated both pathways at the same time. You should talk about complementary.

      We have modified it according to your suggestion at lines 25, 285, 439.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) Point for more elaborate discussion: Apparently the timescale of negative feedback signals is conserved between endothelial cell migration in vitro (with human cells) and endothelial migration during the formation of ISVs in zebrafish. What do you think might be an explanation for such conserved timescales? Are there certain processes within cytoskeletal tension build up that require this quantity of time to establish? Or does it relate to the time that is needed to begin to express the YAP/TAZ target genes that mediate feedback?

      Identifying the underlying mechanisms responsible for the conserved timescale is a major direction that we continue to explore. Localization of YAP/TAZ to the nucleus is likely not rate-limiting. We showed previously that acute RhoA activation produced significant YAP/TAZ nuclear localization within minutes, while subsequent co-transcriptional activity aligned with the gene expression dynamics observed here (Berlew et al., 2021). We hypothesize that the dynamics of YAP/TAZ-dependent transcription and the translation of those target genes are rate-limiting for initial feedback loop completion (tic = 4 hours). This is supported by work from us and others in a variety of cell lines showing that YAP/TAZ transcriptional responses take place during the first few hours after activation (Franklin et al., 2020; Mason et al., 2019; Plouffe et al., 2018). While our data identify mediators of initial feedback loop completion, the molecular effectors that determine the timescale of new cytoskeletal equilibrium establishment (teq = 8 hours) remain unclear.

      (2) Do you expect different timescales for slower endothelial migratory processes (e.g. for instance during fin vascular regeneration which takes days)?

      We selected the ISV development model because it exhibits similar migratory kinetics to our previously explored human ECFC migration in vitro. The comparable kinetics allowed us to study dynamics of the feedback loop in vivo on similar time scales, but we have not explored models featuring either slower or faster dynamics.

      It would be interesting to test how feedback dynamics are impacted in distinct endothelial migratory processes. Our data suggest that the feedback loop is necessary for persistent migration; however, YAP and TAZ respond to a diversity of upstream regulators in addition to mechanical signals, which might depend on the process of vascular morphogenesis. For example, after fin amputation, inflammation and tissue regeneration may impact the biochemical and mechanical environment experienced by the endothelium. Additionally, cells display different migratory behaviors in ISV morphogenesis compared to fin regeneration. During ISV formation, sprouting tip cells migrate dorsally through avascular tissue, followed by stalk cells (Ellertsdóttir et al., 2010). In contrast, the fin vasculature regenerates by forming an intermediate vascular plexus, where some venous-derived endothelial cells migrate towards the sprouting front, while others migrate against it (Xu et al., 2014). We are excited to study the role of this feedback loop in these different modes of neovessel formation in future studies.

      (3) Is the ~4hrs and 8hrs feedback time window a general property or does it differ between specific endothelial cell types? In the veins the endothelial cells generate fewer stress fibers and adhesions compared to in the arteries. Does this mean that there might be a difference in the feedback time window, or does that mean that certain endothelial cell types may not have such a YAP/TAZ-controlled feedback system?

      Recent studies suggest that venous endothelial cells are the primary endothelial subtype responsible for blood vessel morphogenesis (Lee et al., 2022, 2021; Xu et al., 2014). They are highly motile and mechanosensitive, migrating against blood flow (Lee et al., 2022). The Huveneers group has shown that the actin cytoskeleton is differently organized in adult arteries and veins in response to biomechanical properties of its extracellular matrix, rather than intrinsic differences between arterial and venous cells (van Geemen et al., 2014). This suggests that arterial and venous cells have distinct cytoskeletal setpoints due to mechanical cues in their environment (Price et al., 2021). We expect this to impact the degree of cytoskeletal remodeling and cell migration at equilibrium, rather than the kinetics of the feedback loop per se, though we have not yet tested this hypothesis. Testing these predictions on cytoskeletal setpoint stability and adaptation is a major direction that we continue to explore.

      (4) The experiments are based on perturbations to prove that transcriptional feedback is needed for endothelial migration. What would happen if the feedback system is always switched on? An experiment to add might be to analyse the responsiveness of endothelial cells expressing constitutively active YAP/TAZ.

      This is a problem that we are actively pursuing. Though the feedback system forms a coherent loop, we anticipate that the identity of the node of the loop selected for constitutive activation will influence the outcome, depending on whether that node is rate-limiting for feedback kinetics and the extent of intersection of that node with other signaling events in the cell. For example, we have observed that constitutive YAP activation drives profound changes to the transcriptional landscape including, but not limited to, RhoA signaling (Jones et al., 2023). We further anticipate that constitutive activation of feedback loop nodes may alter feedback dynamics, while dynamic or acute perturbation will be required to dissect these contributions in real time. For these reasons, ongoing work in the lab is pursuing these questions using optogenetic tools that enable precise spatial and temporal control (Berlew et al., 2021).   

      (5) To investigate the role of YAP-mediated transcription in an accurate time-dependent manner the authors may consider using the recently developed optogenetic YAP translocation tool: https://doi.org/10.15252/embr.202154401

      We are enthusiastic about the power of optogenetics to interrogate the nodes and timescales of this feedback system, and we are now funded to pursue this line of research. 

      Reviewer #2:

      The idea is intriguing, but it is not clear how the feedback actually works, so it is difficult to determine if the events needed could occur within 4 hrs. Specifically, it is not clear what gene changes initiated by YAP/TAZ translocation eventually lead to changes in Rho signaling and contractility. Much of the evidence to support the model is preliminary. Some of the data is consistent with the model, but alternative explanations of the data are not excluded. The fish washout data is quite interesting and does support the model. It is unclear how some of the in vitro data supports the model and excludes alternatives.

      Major strengths:

      The combination of in vitro and in vivo assessment provides evidence for timing in physiologically relevant contexts, and a rigorous quantification of outputs is provided. The idea of defining temporal aspects of the system is quite interesting.

      Major weaknesses:

      The evidence for a "loop" is not strong; rather, most of the data can also be interpreted as a linear increase in effect with time once a threshold is reached. Washout experiments are key to setting up a time window, yet these experiments are presented only for the fish model. A major technical challenge is that siRNA experiments take time to achieve depletion status, making precise timing of events on short time scales problematic. Also, Actinomycin D blocks most transcription so exposure for hours likely leads to secondary and tertiary effects and perhaps effects on viability. No RNA profiling is presented to validate proposed transcriptional changes.

      We thank the reviewer for these helpful suggestions. We have expanded our explanation of the history and known mediators of the feedback loop in the introduction. We and, independently, the Huveneers group recently reported that human endothelial cells maintain cytoskeletal equilibrium for persistent motility through a YAP/TAZ-mediated feedback loop that modulates cytoskeletal tension (Mason et al., 2019; van der Stoel et al., 2020). Because YAP and TAZ are activated by tension of the cytoskeleton (Dupont et al., 2011), suppression of cytoskeletal tension by YAP/TAZ transcriptional target genes constitutes a negative feedback loop (Fig. 1A). We described key components of this cell-intrinsic feedback loop, which acts as a control system to maintain cytoskeletal homeostasis for persistent motility via modulation of Rho-ROCK-myosin II activity (Mason et al., 2019). Both we and the Huveneers group found that perturbation of genes and pathways regulated by YAP/TAZ mechanoactivation can functionally rescue motility in YAP/TAZ-depleted cells (e.g., RhoA/ROCK/myosin II, NUAK2, DLC1) (Mason et al., 2019; van der Stoel et al., 2020). We further showed previously that both YAP/TAZ depletion and acute YAP/TAZ-TEAD inhibition consistently increased stress fiber and FA maturation and arrested cell motility, accounting for these limitations of siRNA (Mason et al., 2019).

      Enduring limitations to the temporal, spatial, and cell-specific control of the genetic and pharmacologic methods have inspired us to initiate alternative approaches, which are the subject of ongoing efforts. Further research will be necessary in the zebrafish to determine the extent to which the observed migratory dynamics are driven by cytoskeletal arrest. 

      To identify early YAP/TAZ-regulated transcriptional changes, we have added RNA profiling of control and YAP/TAZ depleted cells cultured on stiff matrices for four hours. Genes upregulated by YAP/TAZ depletion were enriched for Gene Ontology (GO) terms associated with Rho protein signal transduction, vascular development, cellular response to vascular endothelial growth factor (VEGF) stimulus, and endothelial cell migration (Fig. 9B). These data support a role for YAP and TAZ as negative feedback mediators that maintain cytoskeletal homeostasis for endothelial cell migration and vascular morphogenesis.  

      Reviewer #3:

      The authors used ECFC - endothelial colony forming cells (circulating endothelial cells that activate in response to vascular injury).

      Q: Did the authors characterize these cells and make sure that they are truly endothelial cells - for example examine specific endothelial markers, arterial-venous identity markers & Notch signalling status, overall morphology etc prior to the start of the experiment? How were ECFC isolated from human individuals, are these "healthy" volunteers - any underlying CVD risk factors, cells from one patient or from pooled samples, what injury were these humans exposed to trigger the release of the ECFCs into the circulation, etc. The materials & methods on ECFC should be expanded.

      Human umbilical cord blood-derived ECFCs were isolated at Indiana University School of Medicine and kindly provided by Dr Mervin Yoder. Cells were cultured as described by the Yoder group (Rapp et al., 2011) and our prior paper (Mason et al., 2019). We have expanded the materials and methods section to describe the source and characterization of these cells.

      The authors suggest that loss of YAP/TAZ phenocopies actinomycin-D inhibition - "both transcription inhibition and YAP/TAZ depletion impaired polarization, and induced robust ventral stress fiber formation and peripheral focal adhesion maturation". However, the cell size of actinomycin-D treated cells (Fig. 1B, top right panel), differs from the endothelial cell size upon siYAP/TAZ (Fig. 1E, top right panel) - and vinculin staining seems more pronounced in actinomycin-D treated cells (B, bottom right) when compared to siYAP/TAZ group. Cell shape is defined by acto-myosin tension.

      Q: Besides Fraction of focal adhesion >1um; focal adhesion number did the authors measure additional parameters related to cytoskeleton remodelling / focal adhesions that can substantiate their statement on similarity between loss of YAP/TAZ and actinomycin-D treatment? Would it be possible to make a more specific genetic intervention (besides YAP/TAZ) interfering with the focal adhesion pathway as opposed to the broad spectrum inhibitor actinomycin-D?

      Our previous paper (Mason et al., 2019) delineated the mechanistic relationships between YAP/TAZ signaling, focal adhesion turnover, actomyosin polymerization, and the intervening mechanisms of myosin regulation. Specifically, we demonstrated that YAP/TAZ regulate the myosin phosphatase kinase, NUAK2, and ARHGAP genes to mediate this feedback. Expanding on this work, the current study aimed to define the temporal kinetics of the cytoskeletal mechanotransductive feedback in vitro and in vivo. We used actinomycin-D and YAP/TAZ depletion to interrogate the role of transcriptional regulation and YAP/TAZ signaling, respectively. In this revision, we have added RNA profiling that identifies early YAP/TAZ-regulated transcriptional changes and further points to other molecular mediators of focal adhesions (e.g. TRIO, RHOB, THBS1) that will be the subjects of future studies.    

      Q: Does the actinomycin-D treatment affect responsiveness to Vegf? induce apoptosis or reduce survival of the ECFC?

      We have not looked specifically at the effect of actinomycin-D treatment on responsiveness to VEGF. However, actinomycin-D has been reported to reduce transcription of VEGF receptors (E et al., 2012). In contrast, we found that YAP/TAZ depletion upregulated GO terms associated with endothelial cell migration and response to VEGF stimulus (Fig. 9B), as well as receptors to angiogenic growth factors, including KDR and FLT4 (Fig. 9E). These results suggest YAP/TAZ depleted cells may be more sensitive to VEGF stimulation but remain nonmotile due to cytoskeletal arrest.

      We showed previously that long-term treatment with actinomycin-D reduces ECFC survival (Mason et al., 2019).

      Q: Which mechanism links ECM stiffness with endothelial surface area in the authors' scenario? In zebrafish, activity of endothelial guanine exchange factor Trio specifically at endothelial cell junctions (Klems, Nat Comms, 2020) and endoglin in response to hemodynamic factors (Siekmann, Nat Cell Biol 2017) have been shown to control EC shape/surface area - do these factors play a role in the scenario proposed by the authors?

      Our new transcriptional profiling indicates both Trio and endoglin are regulated through YAP and TAZ in human ECFCs. We plan to follow up on these findings.

      Q: The authors report that EC migrate faster on stiff substrate, and concomitantly these cells have a larger surface area. What is the physiological rationale behind these observations. Did the authors observe such behaviors in their zebrafish ISV model? How do these observations integrate with the tip - stalk cell shuffling model (Jakobsson & Gerhardt, Nat Cell Biol, 2011) and Notch activity in developing ISVs.

      This question raises important distinctions between the mode of migration in ISV morphogenesis and that of endothelial cells adherent to substrates. Cells behave and respond to mechanical cues differently in 2D vs. 3D matrices (LaValley and Reinhart-King, 2014). Additionally, the microenvironment in vivo is much more complex, combining numerous biochemical signals and changing mechanical properties (Whisler et al., 2023). We are actively investigating the downstream targets of YAP/TAZ mechanotransduction and how they integrate with other pathways known to regulate vascular morphogenesis, such as Notch signaling.

      The authors examined the formation of arterial intersegmental vessels in the trunk of developing zebrafish embryos in vivo. They used a variety of pharmacological inhibitors of transcription and acto-myosin remodelling and linked the observed morphological changes in ISV morphogenesis with changes in endothelial cell motility.

      Q: Reduced formation and dorsal extension of ISVs may have several reasons, including reduced EC migration and proliferation. The Tg(fli1a:EGFP) reporter however is not the most suitable line to monitor migration of individual endothelial cells. Can the authors repeat the experiments in Tg(fli1a:nEGFP); Tg(kdrl:HRAS-mCherry) double transgenics to visualize movement-migration of the individual endothelial cells and EC proliferation events, in the different treatment regimes?

      So far, we have not tracked individual endothelial cells during ISV morphogenesis. We agree this is the best approach and are pursuing a similar technique for these experiments.

      ISV formation is furthermore affected by Notch signalling status and a series of (repulsive) guidance cues.

      Q: Does de novo blockade of gene expression with Actinomycin D affect Notch signalling status, expression of PlexinD - sFlt1, netrin1 or arterial-venous identity genes?

      While we have not performed gene expression analysis under the Actinomycin D condition, Actinomycin D functions as a broad transcription inhibitor. We are currently pursuing the downstream targets of YAP/TAZ mechanotransduction in both ECFCs and zebrafish.

      Remark: The authors may want to consider using the Tg(fli1:LIFEACT-GFP) reporter for in vivo imaging of actin remodelling events.

      We thank the reviewer for their helpful suggestion.

      Remark: the authors report "As with broad transcription inhibition, in situ depletion of YAP and TAZ by RNAi arrested cell motility, illustrated here by live-migration sparklines over 10 hours: siControl: , siYAP/TAZ: (25 μm scale-bar: -)". Can the authors make a separate figure panel for this, how many cells were measured?

      Please refer to our previous publication for the complete details on this data (Mason et al., 2019). We have added the citation in the text.

      Remark: in the wash-out experiments, exposure to the inhibitors is not the same in the different scenarios - could it be that the longer exposure time induces "toxic" side effect that cannot be "washed out" when compared to the short treatment regimes?

      This is a possible limitation of the pharmacological approach, and we have included it in the discussion section. We are currently exploring alternative approaches to interrogate the timescale of the feedback loop more precisely.

      References

      Berlew EE, Kuznetsov IA, Yamada K, Bugaj LJ, Boerckel JD, Chow BY. 2021. Single-Component Optogenetic Tools for Inducible RhoA GTPase Signaling. Advanced Biology 5:2100810. doi:10.1002/adbi.202100810

      Dupont S, Morsut L, Aragona M, Enzo E, Giulitti S, Cordenonsi M, Zanconato F, Le Digabel J, Forcato M, Bicciato S, Elvassore N, Piccolo S. 2011. Role of YAP/TAZ in mechanotransduction. Nature 474:179–183. doi:10.1038/nature10137

      E G, Cao Y, Bhattacharya S, Dutta S, Wang E, Mukhopadhyay D. 2012. Endogenous Vascular Endothelial Growth Factor-A (VEGF-A) Maintains Endothelial Cell Homeostasis by Regulating VEGF Receptor-2 Transcription. J Biol Chem 287:3029–3041. doi:10.1074/jbc.M111.293985

      Ellertsdóttir E, Lenard A, Blum Y, Krudewig A, Herwig L, Affolter M, Belting H-G. 2010. Vascular morphogenesis in the zebrafish embryo. Developmental Biology, Special Section: Morphogenesis 341:56–65. doi:10.1016/j.ydbio.2009.10.035

      Franklin JM, Ghosh RP, Shi Q, Reddick MP, Liphardt JT. 2020. Concerted localization-resets precede YAP-dependent transcription. Nat Commun 11:4581. doi:10.1038/s41467-020-18368-x

      Jones DL, Hallström GF, Jiang X, Locke RC, Evans MK, Bonnevie ED, Srikumar A, Leahy TP, Nijsure MP, Boerckel JD, Mauck RL, Dyment NA. 2023. Mechanoepigenetic regulation of extracellular matrix homeostasis via Yap and Taz. Proceedings of the National Academy of Sciences 120:e2211947120. doi:10.1073/pnas.2211947120

      LaValley DJ, Reinhart-King CA. 2014. Matrix stiffening in the formation of blood vessels. Advances in Regenerative Biology 1:25247. doi:10.3402/arb.v1.25247

      Lee H-W, Shin JH, Simons M. 2022. Flow goes forward and cells step backward: endothelial migration. Exp Mol Med 54:711–719. doi:10.1038/s12276-022-00785-1

      Lee H-W, Xu Y, He L, Choi W, Gonzalez D, Jin S-W, Simons M. 2021. Role of Venous Endothelial Cells in Developmental and Pathologic Angiogenesis. Circulation 144:1308–1322. doi:10.1161/CIRCULATIONAHA.121.054071

      Mason DE, Collins JM, Dawahare JH, Nguyen TD, Lin Y, Voytik-Harbin SL, Zorlutuna P, Yoder MC, Boerckel JD. 2019. YAP and TAZ limit cytoskeletal and focal adhesion maturation to enable persistent cell motility. Journal of Cell Biology 218:1369–1389. doi:10.1083/jcb.201806065

      Plouffe SW, Lin KC, Moore JL, Tan FE, Ma S, Ye Z, Qiu Y, Ren B, Guan K-L. 2018. The Hippo pathway effector proteins YAP and TAZ have both distinct and overlapping functions in the cell. J Biol Chem 293:11230–11240. doi:10.1074/jbc.RA118.002715

      Price CC, Mathur J, Boerckel JD, Pathak A, Shenoy VB. 2021. Dynamic self-reinforcement of gene expression determines acquisition of cellular mechanical memory. Biophysical Journal 120:5074–5089. doi:10.1016/j.bpj.2021.10.006

      Rapp BM, Saadatzedeh MR, Ofstein RH, Bhavsar JR, Tempel ZS, Moreno O, Morone P, Booth DA, Traktuev DO, Dalsing MC, Ingram DA, Yoder MC, March KL, Murphy MP. 2011. Resident Endothelial Progenitor Cells From Human Placenta Have Greater Vasculogenic Potential Than Circulating Endothelial Progenitor Cells From Umbilical Cord Blood. Cell Med 2:85–96. doi:10.3727/215517911X617888

      Tammela T, Zarkada G, Nurmi H, Jakobsson L, Heinolainen K, Tvorogov D, Zheng W, Franco CA, Murtomäki A, Aranda E, Miura N, Ylä-Herttuala S, Fruttiger M, Mäkinen T, Eichmann A, Pollard JW, Gerhardt H, Alitalo K. 2011. VEGFR-3 controls tip to stalk conversion at vessel fusion sites by reinforcing Notch signalling. Nat Cell Biol 13:1202–1213. doi:10.1038/ncb2331

      van der Stoel M, Schimmel L, Nawaz K, van Stalborch A-M, de Haan A, Klaus-Bergmann A, Valent ET, Koenis DS, van Nieuw Amerongen GP, de Vries CJ, de Waard V, Gloerich M, van Buul JD, Huveneers S. 2020. DLC1 is a direct target of activated YAP/TAZ that drives collective migration and sprouting angiogenesis. Journal of Cell Science 133:jcs239947. doi:10.1242/jcs.239947

      van Geemen D, Smeets MWJ, van Stalborch A-MD, Woerdeman LAE, Daemen MJAP, Hordijk PL, Huveneers S. 2014. F-Actin–Anchored Focal Adhesions Distinguish Endothelial Phenotypes of Human Arteries and Veins. Arteriosclerosis, Thrombosis, and Vascular Biology 34:2059–2067. doi:10.1161/ATVBAHA.114.304180

      Whisler J, Shahreza S, Schlegelmilch K, Ege N, Javanmardi Y, Malandrino A, Agrawal A, Fantin A, Serwinski B, Azizgolshani H, Park C, Shone V, Demuren OO, Del Rosario A, Butty VL, Holroyd N, Domart M-C, Hooper S, Szita N, Boyer LA, Walker-Samuel S, Djordjevic B, Sheridan GK, Collinson L, Calvo F, Ruhrberg C, Sahai E, Kamm R, Moeendarbary E. 2023. Emergent mechanical control of vascular morphogenesis. Science Advances 9:eadg9781. doi:10.1126/sciadv.adg9781

      Xu C, Hasan SS, Schmidt I, Rocha SF, Pitulescu ME, Bussmann J, Meyen D, Raz E, Adams RH, Siekmann AF. 2014. Arteries are formed by vein-derived endothelial tip cells. Nat Commun 5:5758. doi:10.1038/ncomms6758

    1. Colletotrichum fungi infect a wide diversity of monocot and eudicot hosts, causing plant diseases on almost all economically important crops worldwide. In addition to its economic impact, Colletotrichum is a suitable model for the study of gene family evolution on a fine scale to uncover events in the genome that are associated with the evolution of biological characters important for host interactions. Here we present the genome sequences of 30 Colletotrichum species, 18 of them newly sequenced, covering the taxonomic diversity within the genus. A time-calibrated tree revealed that the Colletotrichum ancestor diverged in the late Cretaceous around 70 million years ago (mya) in parallel with the diversification of flowering plants. We

      Reviewer 1: Jamie McGowan

      In this study, Baroncelli and colleagues carry out a comprehensive analysis of genomic evolution in Colletotrichum fungi, an important group of plant pathogens with diverse and economically significant hosts. Their comparative genomic and phylogenomics analyses are based on the genome sequences of 30 Colletotrichum species spanning the diversity of the genus, including pathogens of dicots, monocots, and both dicots and monocots. This includes 18 genome sequences that are newly reported in this study. They also perform comparative transcriptomic analyses of 4 Colletotrichum species (2 dicot pathogens and 2 monocot pathogens) on different carbon sources. Overall, I thought the manuscript was very well written and technically sound. The results should be of interest to a broad audience, particularly to those interested in fungal evolutionary genomics and plant pathology. I only have a few minor comments.

      Minor comments:

      (1) Lines 50 - 51: "The plant cell wall (PCW) consists of many different polysaccharides that are attached not only to each other through a variety of linkages providing the main strength and structure for the PCW". I found this confusing - is the sentence incomplete?
      (2) Line 66: "Some Colletotrichum species show…" I think there should be a couple of introductory sentences about Colletotrichum before this.
      (3) Figure 1: It would be informative to label which genomes were sequenced with PacBio versus just Illumina.
      (4) Lines 254 - 255: "As no other enrichment was identified we performed a manual annotation of genes identified in Figure 3D". I don't think it is clear here what manual annotation this is referring to.
      (5) One area where I felt the analysis was lacking was the lack of analyses on genome repeat content. The authors highlight the large variation in genome sizes within Colletotrichum species (~44 Mb vs ~90 Mb) and show in Figure 1 that this correlates with increased non-coding DNA. It would have been interesting to determine if this is driven by the proliferation of particular repeat families.
      (6) Another concern is the inconsistent use of genome annotation methods. 12 of the genomes reported in this study were annotated using the JGI annotation pipeline, whereas the other 6 were annotated using the MAKER pipeline. Several studies (e.g., Weisman et al., 2022 - Current Biology) show that inconsistent genome annotation methods can inflate the number of observed lineage specific genes. The authors may wish to comment on this or demonstrate that this isn't an issue in their study (e.g., by aligning lineage specific proteins against the other genome assemblies).

    1. Structural variants (SVs) play a significant role in speciation and adaptation in many species, yet few studies have explored the prevalence and impact of different categories of SVs. We conducted a comparative analysis of long-read assembled reference genomes of closely related Eucalyptus species to identify candidate SVs potentially influencing speciation and adaptation. Interspecies SVs can be either fixed differences, or polymorphic in one or both species. To describe SV patterns, we employed short-read whole-genome sequencing on over 600 individuals of E. melliodora and E. sideroxylon, along with recent high quality genome assemblies. We aligned reads and genotyped interspecies SVs predicted between species reference genomes. Our results revealed that 49,756 of 58,025 and 39,536 of 47,064 interspecies SVs could be typed with short reads, in E. melliodora and E. sideroxylon

      Reviewer 1: Jakob Butler

      Ferguson et al have performed a thorough analysis of two species of Eucalyptus, quantifying the extent of structural variation between assembled genomes of the species and determining how prevalent those variations are across a selection of wild material. I believe this study is of sufficient quality for publication in GigaScience, if some minor inconsistencies and grammatical issues are addressed, and a few supporting analyses are performed.

      The major changes I would like to see include the addition of a syri plot of the complete set of SVs between E. melliodora and E. sideroxylon. I believe this, along with correcting the scale on the plots of recombination in Figure S6/S7 would allow for a better comparison of how recombination rate is interacting with the SVs. I would also suggest a more formal test of enrichment for COG terms, to better support the statements of "enrichment" in the discussion.

      Suggested changes by line:

      Line 142 - This section is quite short, I would either merge this section into the Genome scaffolding (and annotation) section, or expand on the results of the gene annotation.
      Line 182 - (Supplementary Figure S4)
      Line 183 (and throughout) - Please be consistent with your references to tables and figures.
      Line 186 - delete comma after 28.63%
      Line 194 - These are density plots rather than histograms
      Figure 4 - Both axes are labelled as PC1
      Line 217 (page 10, line numbers are doubled up) - This seems repetitive, perhaps "…especially as they may also represent divergent sequences".
      Line 221 (page 11) - Please insert "and" before polymorphic translocations
      Line 223 - You have stated that those not successfully genotyped in both species are private or artefacts earlier in the paragraph, please reduce the repetition.
      Figure 6 - I don't find this figure particularly informative (and somewhat confusing to interpret). I think showing the percentages of each different SV in a visual form implies a level of equivalence in genomic impact, which is difficult to reconcile with the raw difference in numbers. I think a supplemental table with the focus on the percentages would illustrate the point better.
      Line 246 - There is no mention in the methods about what r threshold was used to declare a pair "correlated", please state it here or in the methods.
      Line 265 - This line was confusing to interpret. A suggested alteration: "significant value. After attempting to functionally annotate all genes across the genome and placing them within COG categories, 247 of the total 281 gene candidates in SSPs were annotated. These genes were enriched for...."
      Line 266 - I would like to see a formal enrichment analysis rather than "increased/decreased association", so we could have a clearer picture of which gene functions are truly over/underrepresented in SSPs. You could subsequently limit Figure 8 to those that show a difference.
      Line 275 - The grammar of this title is a bit off, perhaps "Effect of syntenic, rearranged, unaligned regions and genes on recombination rates"
      Line 276 - This is the first mention of p, please define it as recombination rate
      Line 283 - The supplemental Figure S6 and S7 seem to have regions of heightened recombination, but this is difficult to interpret and compare with the current variable axis scales. Please make these consistent. I would also like to see the syri graph of the two aligned genomes, as this would allow for a visual comparison of SV regions with recombination rate.
      Line 290 - How were p-values adjusted?
      Line 294 - More information about this 'significantly' higher recombination rate would be good, either in the figure or further expanded in the text.
      Line 307 - Italics for species names (repeated in Figure 10 and Figure 11 caption)
      Line 310 - Similar problem to line 275
      Figure 10 - Having Figure 9b repeated in Figure 10 and Figure 11 is unnecessary.
      Line 336 - Vertical lines show average FST, not p
      Line 341 - Similar problem to line 275
      Line 356 - translocations should be plural
      Line 367 - Vertical lines show average SNP density, not p
      Line 391 - This is the first mention of barrier loci, please define
      Line 413 - As mentioned above, I would recommend a formal enrichment test to support this statement
      Line 428 - Grammar is poor here, please correct
      Line 490 - Please make this a complete sentence
      Line 499 - Please state how the Hi-C map was manually edited, and what informed the position of those edits.
      Line 508 - Please provide an example of how well your LAI score of ~18 compares. The LAI paper seems to intimate that 10 is low quality?
      Line 513 - Missing bracket for version number
      Line 536 - Syntenic rather than synteny
      Line 717 - Formatting error in references
      Supp table S3-S4-S5 - Space between E. and sideroxylon

  3. Jun 2024
    1. However, by examining the bacteriome in detail, we can obtain much more information about its composition and function than diversity alone can tell us. Based on the taxonomic constitution of our samples, Proteobacteria and Actinobacteria phyla were clearly dominant both in fish skin mucus and water samples. The dominance of the Proteobacteria phylum is not an uncommon observation in fish external mucus samples1,3,5,6,8,11,21,62,63, however, differences between fish species have been observed for the other phyla1,11,62,63. Moreover, significant within-species variability in dominant phyla has been described64, and variability within individuals related to body sites should be noted12. The microbiome can be an important indicator of various pathological conditions, which has already been described in fish, for example, in the case of the gastrointestinal tract65. In this regard, the Bacteroidota phylum may be interesting, which has been highlighted as a marker for eutrophication9,66. Understanding the changes in the composition of the bacteriome or even the microbiome during different pathological conditions can be an important step in understanding and potentially diagnosing disease processes. Our results are therefore in line with the dominance of the Proteobacteria phylum observed in other fish species, but direct comparison with C. carpio is not possible due to the lack of available data. Of course, our observations on the bacteriome composition of our samples are also limited by their paramount host genome contamination, which reduced the coverage of bacterial genomes of interest in the sequencing reaction.

      Since you have the resolution to go below phylum, I think it would be interesting to focus on that more in the discussion.

    1. 17) The just man is the freest of anyone from anxiety; but the unjust man is perpetually haunted by it.

      I found this passage disturbing and I do not necessarily agree with it. I think that because we have two different people with two different moral compasses, their views on the world are polar and there is a struggle in comparing them. This "unjust" person has an opposite view of anxiety, punishment, power, fear, etc... because they are "morally wrong," and may not experience the same emotional spectrum as a person who always does the right thing.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study presents a valuable tool for searching molecular dynamics simulation data, making such data sets accessible for open science. The authors provide convincing evidence that it is possible to identify useful molecular dynamics simulation data sets and their analysis can produce valuable information.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      Tiemann et al. have undertaken an original study on the availability of molecular dynamics (MD) simulation datasets across the Internet. There is a widespread belief that extensive, well-curated MD datasets would enable the development of novel classes of AI models for structural biology. However, currently, there is no standard for sharing MD datasets. As generating MD datasets is energy-intensive, it is also important to facilitate the reuse of MD datasets to minimize energy consumption. Developing a universally accepted standard for depositing and curating MD datasets is a huge undertaking. The study by Tiemann et al. will be very valuable in informing policy developments toward this goal.

      Strengths:

      The study presents an original approach to addressing a growing concern in the field. It is clear that adopting a more collaborative approach could significantly enhance the impact of MD simulations in modern molecular sciences.

      The timing of the work is appropriate, given the current interest in developing AI models for describing biomolecular dynamics.

      Weaknesses:

      The study primarily focuses on one major MD engine (GROMACS), although this limitation is not significant considering the proof-of-concept nature of the study.

      We thank the reviewer for his/her comments. Moving forward, our plan includes expanding this research to encompass other MD engines used in biomolecular simulations and materials sciences, such as NAMD, Charmm, Amber, LAMMPS, etc. However, this requires parsing associated files to supplement the sparse metadata generally available for the related datasets.

      Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.
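      The "explore-and-expand" idea summarized above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the `Dataset` record, the function names, and the set of marker extensions (a small subset of GROMACS file types) are all hypothetical choices made for this example.

```python
import os
from collections import Counter
from dataclasses import dataclass, field

# Assumed marker extensions: an illustrative subset of GROMACS file types.
MD_MARKER_EXTENSIONS = {".xtc", ".trr", ".gro", ".mdp", ".tpr", ".itp", ".top", ".edr"}


@dataclass
class Dataset:
    """Hypothetical record for one repository dataset (not the authors' schema)."""
    identifier: str
    files: list = field(default_factory=list)


def file_extension(filename):
    """Return the lower-cased extension including the dot, or '' if there is none."""
    return os.path.splitext(filename)[1].lower()


def explore(datasets):
    """Explore step: keep datasets containing at least one MD marker file."""
    return [
        d for d in datasets
        if any(file_extension(f) in MD_MARKER_EXTENSIONS for f in d.files)
    ]


def expand(dataset):
    """Expand step: index every file of a flagged dataset, counted by extension."""
    return Counter(file_extension(f) for f in dataset.files)


if __name__ == "__main__":
    candidates = [
        Dataset("dataset-a", ["traj.xtc", "README.md", "analysis.py"]),
        Dataset("dataset-b", ["paper.pdf", "figure.png"]),
    ]
    for hit in explore(candidates):
        print(hit.identifier, dict(expand(hit)))
```

      The point of the design is that the file type, rather than free-text metadata, decides whether a dataset is relevant; once a dataset is flagged, all of its files are indexed, including those that are not MD-specific.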

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      The authors make a prototype search engine available.

      The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was the most prevalent among all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      Materials sciences seem to be severely under-represented --- commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as the one in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB' DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

      We thank the reviewer for his/her comments. As mentioned to Reviewer 1, we intend to extend this work to other MD engines in the near future to go beyond Gromacs and even biomolecular simulations. Furthermore, as the value of accessing and indexing specialized MD databases such as MDDB, MemprotMD, GPCRmd, NMRLipids, ATLAS, and others has been mentioned by the reviewer, it is indeed one of our next steps to continue to expand the MDverse catalog of MD data. This indexing may also extend the visibility and widespread adoption of these specific databases.

      Reviewer #3 (Public Review):

      Molecular dynamics (MD) simulations nowadays are an essential element of structural biology investigations, complementing experiments and aiding their interpretation by revealing transient processes or details (such as the effects of glycosylation on the SARS-CoV-2 spike protein; for example, Casalino et al. ACS Cent. Sci. 2020; 6, 10, 1722-1734 https://doi.org/10.1021/acscentsci.0c01056) that cannot be observed directly. MD simulations can allow for the calculation of thermodynamic, kinetic, and other properties and the prediction of biological or chemical activity. MD simulations can now serve as "computational assays" (Huggins et al. WIREs Comput Mol Sci. 2019; 9:e1393. https://doi.org/10.1002/wcms.1393). Conceptually, MD simulations have played a crucial role in developing the understanding that the dynamics and conformational behaviour of biological macromolecules are essential to their function, and are shaped by evolution. Atomistic simulations range up to the billion atom scale with exascale resources (e.g. simulations of SARS-CoV-2 in a respiratory aerosol. Dommer et al. The International Journal of High Performance Computing Applications. 2023; 37:28-44. doi:10.1177/10943420221128233), while coarse-grained models allow simulations on even larger length- and timescales. Simulations with combined quantum mechanics/molecular mechanics (QM/MM) methods can investigate biochemical reactivity, and overcome limitations of empirical forcefields (Cui et al. J. Phys. Chem. B 2021; 125, 689 https://doi.org/10.1021/acs.jpcb.0c09898).

      MD simulations generate large amounts of data (e.g. structures along the MD trajectory) and increasingly, e.g. because of funder mandates for open science, these data are deposited in publicly accessible repositories. There is real potential to learn from these data en masse, not only to understand biomolecular dynamics but also to explore methodological issues. Deposition of data is haphazard and lags far behind experimental structural biology, however, and it is also hard to answer the apparently simple question of "what is out there?". This is the question that Tiemann et al explore in this nice and important work, focusing on simulations run with the widely used GROMACS package. They develop a search strategy and identify almost 2,000 datasets from Zenodo, Figshare and Open Science Framework. This provides a very useful resource. For these datasets, they analyse features of the simulations (e.g. atomistic or coarse-grained), which provides a useful snapshot of current simulation approaches. The analysis is presented clearly and discussed insightfully. They also present a search engine to explore MD data, the MDverse data explorer, which promises to be a very useful tool.

      As the authors state: "Eventually, front-end solutions such as the MDverse data explorer tool can evolve being more user-friendly by interfacing the structures and dynamics with interactive 3D molecular viewers". This will make MD simulations accessible to non-specialists and researchers in other areas. I would envisage that this will also include approaches using interactive virtual reality for an immersive exploration of structure and dynamics, and virtual collaboration (e.g. O'Connor et al., Sci. Adv. 4, eaat2731 (2018). DOI:10.1126/sciadv.aat2731).

      The need to share data effectively, and to compare simulations and test models, was illustrated clearly in the COVID-19 pandemic, which also demonstrated a willingness and commitment to data sharing across the international community (e.g. Amaro and Mulholland, J. Chem. Inf. Model. 2020, 60, 6, 2653-2656 https://doi.org/10.1021/acs.jcim.0c00319; Computing in Science & Engineering 2020, 22, 30-36 doi: 10.1109/MCSE.2020.3024155). There are important lessons to learn here, for simulations to be reproducible and reliable, for rapid testing, for exploiting data with machine learning, and for linking to data from other approaches. Tiemann et al. discuss how to develop these links, providing good perspectives and suggestions.

      I agree completely with the statement of the authors that "Even if MD data represents only 1 % of the total volume of data stored in Zenodo, we believe it is our responsibility, as a community, to develop a better sharing and reuse of MD simulation files - and it will neither have to be particularly cumbersome nor expensive. To this end, we are proposing two solutions. First, improve practices for sharing and depositing MD data in data repositories. Second, improve the FAIRness of already available MD data notably by improving the quality of the current metadata."

      This nicely states the challenge to the biomolecular simulation community. There is a clear need for standards for MD data and associated metadata. This will also help with the development of standards of best practice in simulations. The authors provide useful and detailed recommendations for MD metadata. These recommendations should contribute to discussions on the development of standards by researchers, funders, and publishers. Community organizations (such as CCP-BioSim and HECBioSim in the UK, BioExcel, CECAM, MolSSI, learned societies etc) have an important part to play in these developments, which are vital for the future of biomolecular simulation.

      We thank the reviewer for his/her comments. Beyond the points mentioned to Reviewers 1 and 2, as the reviewer suggested, it would be of great interest to combine innovative and immersive approaches to visualize and possibly interact with the data collected. This is indeed more and more amenable thanks to technologies such as WebGL and programs such as Mol*, or even - as also pointed out by the reviewer - through virtual reality, for example with the mentioned Narupa framework or with the UnityMol software. For a comprehensive review on MD trajectory visualization and associated challenges, we refer to our recent review article https://doi.org/10.3389/fbinf.2024.1356659.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Some minor text editing would improve the readability of the manuscript.

      It would be very useful if the authors could share their perspectives on the best and most efficient approach to sharing datasets and code associated with a publication. My concern lies in the fact that Github, which is currently the dominant platform for sharing code, is not well-suited for hosting large MD datasets. As a result, researchers often need to adopt a workflow where code is shared on Github and datasets are stored elsewhere (e.g., Zenodo). While this is feasible, it adds extra work. Ideally, a transparent process could be developed to seamlessly share code and datasets linked to a study through a unified interface.

      We thank the reviewer for this excellent suggestion. To our knowledge, there is not yet an easy framework to jointly store and share code and data linked to their scientific publication. Of course, code can be submitted to “generic” databases along with the data, but in their current state these do not provide useful features such as collaborative work and change tracking to the extent that GitHub does.

      Although GitHub is indeed a suitable platform to deposit code, we strongly advise researchers to archive their code in Software Heritage. In addition to preserving source code, Software Heritage provides a unique identifier called SWHID that unambiguously makes reference to a specific version of the source code.

      So far, it is the responsibility of the scientific publication authors to link datasets and source codes (whether in GitHub or Software Heritage) in their paper, but also to make the reverse link from the data and code sharing platforms to the paper after publication.

      As mentioned by the reviewer, a unified interface that could ease this process would significantly contribute to FAIR-ness in MD.

      Reviewer #2 (Recommendations For The Authors):

      L180: I am not aware that TRR files contain energy terms as stated here, my understanding was that EDR files primarily served that purpose.

      “…available in one dataset. Interestingly, we found 1,406 .trr files, which contain trajectory but also additional information such as velocities, energy of the system, etc. While the file is especially useful in terms of reusability, the large size (can go up to several 100GB) limits its deposition in most…”

      Indeed, our formulation was ambiguous. The EDR files contain the detailed information on energies, whereas TRR files contain numerous values from the trajectory such as coordinates, velocities, forces and to some extent also energies (https://manual.gromacs.org/current/reference-manual/file-formats.html#trr).

      L207: The text states that the total time was not available from XTC files, only the number of frames. However, XTC files record time stamps in addition to frame numbers. As long as these times are in the Gromacs standard of picoseconds, the simulation time ought to be available from XTCs.

      “…systems and the number of frames available in the files (Fig. 3-B). Of note, the frames do not directly translate to the simulation runtime - more information deposited in other files (e.g. .mdp files) is needed to determine the complete runtime of the simulation. The system was up…”.

      Thank you for the useful comment, we removed this sentence. We now mention that studying the simulation time would be of interest in the future, especially when we will perform an exhaustive analysis of XTC files.

      “Of note, as .xtc files also contain time stamps, it would be interesting to study the relationship between the time and the number of frames to get useful information about the sampling. Nevertheless, this analysis would be possible only for unbiased MD simulations. So, we would need to decipher if the .xtc file is coming from biased or unbiased simulations, which may not be trivial.”
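
      To make the relationship between frame count and simulated time concrete, here is a minimal sketch. It assumes an unbiased simulation with a constant output interval and a first frame written at t = 0; `dt` and `nstxout-compressed` are the GROMACS .mdp parameters for the time step (in ps) and the compressed-trajectory output interval (in steps). The function name and the example values are illustrative only and are not taken from the manuscript.

```python
def simulation_time_ns(n_frames, dt_ps, nstxout_compressed):
    """Estimate the simulated time (in ns) covered by a trajectory.

    Assumes frames are written every `nstxout_compressed` steps of length
    `dt_ps` picoseconds, with the first frame at t = 0.
    """
    if n_frames < 1:
        raise ValueError("a trajectory needs at least one frame")
    frame_interval_ps = dt_ps * nstxout_compressed
    return (n_frames - 1) * frame_interval_ps / 1000.0


# Hypothetical values: 2 fs time step, one frame every 5000 steps.
print(simulation_time_ns(n_frames=10001, dt_ps=0.002, nstxout_compressed=5000))
```

      When the time stamps stored in the .xtc file are trusted, the same quantity can instead be read as the difference between the last and the first time stamp, without any .mdp file.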

      Analysis of MDP files: Were these standard equilibrium MD or can you distinguish biased MD or free energy calculations?

      Currently we do not distinguish between biased and unbiased MD, but in the future we may attempt to do so, e.g. by correlating it with standard equilibration force-fields/parameters, timesteps or similar. Nevertheless, a true distinction will remain challenging.

      L336: typo: pikes -> spikes (or peaks?)

      “…simulations of Lennard-Jones models (Jeon et al., 2016). Interestingly, we noticed the appearance of several pikes at 400K, 600K and 800K, which were not present before the end of the year 2022. These peaks correspond to the same study related to the stability of hydrated crystals (Dybeck et al., 2023). Overall, this analysis revealed that a wide range of temperatures have been explored,…”

      Thank you. We have corrected this typo.

      Make clear how multiple versions of data sets are handled, e.g., if v1, v2, and v3 of a dataset are provided in Zenodo then which one is counted or are all counted?

      We collected only the latest version of each dataset, as exposed by default by the Zenodo API. To reflect this, we added the following sentence to the Methods and Materials section, Initial data collection sub-section:

      “By default, the last version of the datasets was collected.”
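
      For readers who gather records from several sources and need to apply the same rule themselves, the sketch below keeps one record per dataset family. The field names `conceptrecid` (an identifier shared by all versions of a dataset) and `created` (an ISO 8601 timestamp) are modeled on Zenodo-style record metadata and should be treated as assumptions; the records are invented for illustration, and this is not the collection code used in the study.

```python
def latest_versions(records):
    """Keep, for each concept ID, the record with the most recent timestamp.

    ISO 8601 timestamps in the same format and time zone sort correctly
    as plain strings, so no date parsing is needed here.
    """
    latest = {}
    for rec in records:
        key = rec["conceptrecid"]
        if key not in latest or rec["created"] > latest[key]["created"]:
            latest[key] = rec
    return list(latest.values())


# Hypothetical records: three versions of one dataset, one of another.
records = [
    {"id": 1, "conceptrecid": "100", "created": "2021-01-10T09:00:00"},
    {"id": 2, "conceptrecid": "100", "created": "2021-06-02T12:30:00"},
    {"id": 3, "conceptrecid": "100", "created": "2022-03-15T08:45:00"},
    {"id": 10, "conceptrecid": "200", "created": "2020-11-20T17:00:00"},
]
print(sorted(r["id"] for r in latest_versions(records)))
```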

      L248 Analysis of GRO files seems fairly narrow because PDB files are very often used for exactly the same purpose, even in the context of Gromacs simulations, not the least because it is familiar to structural biologists that may be interested in representative MD snapshots. Despite all the shortcomings of abusing the PDB format for MD, it is an attempt at increased interoperability. Perhaps the authors can make sure that readers understand that choosing GRO for analysis may give a somewhat skewed picture, even within Gromacs simulations.

      Thanks for this comment. We collected about 12,000 PDB files that could indeed be output from Gromacs simulations and easily be shared due to the universality of this format, but that could also come from other sources (such as other MD packages or the PDB database itself). We purposely limited our study to files strictly associated with the Gromacs package, such as the MDP and XTC file types. However, we will extend our survey to all other structure-like formats, especially the PDB file type. We reflected this intention in the following sentence (after line 281):

      “Beyond .gro files, we would like to analyze the ensemble of the ~12,000 .pdb files extracted in this study (see Figure 2-B) to better characterize the types of molecular structures deposited.”

      A simple template metadata file would be welcome (e.g., served from a GitHub/GitLab repository so that it can be improved with community input).

      Thank you for this suggestion, with which we fundamentally agree. However, generating such a file is a major task, and creating a metadata file template requires far-reaching considerations. It is therefore beyond the scope of this paper and should not be decided by a small group of researchers. Indeed, this topic requires a large consensus among different stakeholders, from users to MD program developers and journal editors. It would be especially useful to organize dedicated workshops with representatives of all these communities to tackle this specific issue, as mentioned by Reviewer 3 in their public review. As a basis for this discussion, we humbly proposed at the end of this manuscript a few non-constraining guidelines based on our experience retrieving the data.

      To emphasize this statement, we added the following sentence at the end of the “Guidelines for better sharing of MD simulation data” section (line 420):

      “Converging on a set of metadata and format requires a large consensus of different stakeholders from users, to MD program developers, and journal editors. It would be especially useful to organize specific workshops with representatives of all these communities to collectively tackle this specific issue.”

      In "Data and code availability" it would be good to specify licenses in addition to stating "open source". Thank you for pointing out that GitLab/GitHub are not archives and that everyone should be strongly encouraged to submit data to stable archival repositories.

      We added the corresponding licenses for code and data in the “Data and code availability” section.

      Reviewer #3 (Recommendations For The Authors)

      The paper is well written, with very few typographical or other minor errors.

      Minor points:

      Line 468-9 "can evolve being more user-friendly" should be "can evolve to being more user-friendly", I think.

      Thank you, we have changed the wording accordingly.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study reports on the packing of molecules in cellular compartments, such as actin-based protrusions. The study provides solid evidence for parameters that enable the building of a biophysical model of filopodia, which is required to gain a complete understanding of these important actin-based structures. Some areas of the manuscript require further clarification.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes an alternative method by SDS-PAGE calibration of Halo-Myo10 signals to quantify myosin molecules at specific subcellular locations, in this specific case filopodia, in epifluorescence datasets compared to the more laborious and troublesome single molecule approaches. Based on these preliminary estimates, the authors developed further their analysis and discussed different scenarios regarding myosin 10 working models to explain intracellular diffusion and targeting to filopodia.

      Strengths:

      Overall, the paper is elegantly written and the data analysis is appropriately presented.

      Weaknesses:

      While the methodology is intriguing in its descriptive potential and could be the beginning of an interesting story, a good portion of the paper is dedicated to the discussion of hypothetical working mechanisms to explain myosin diffusion, localization, and decoration of filopodial actin that is not accompanied by the mandatory gain/loss of function studies required to sustain these claims.

      To be fair, the detailed mechanisms that we raise related to diffusion, localization, and decoration are based on extensive work by others. Many prior papers use domain deletions of Myo10 and fall in the category of gain/loss-of-function studies. It is true that we have not repeated those extensive studies, but it seems appropriate to connect with and cite their work where appropriate.

      Reviewer #2 (Public Review):

      Summary:

      The paper sought to determine the number of myosin 10 molecules per cell and localized to filopodia, where they are known to be involved in formation, transport within, and dynamics of these important actin-based protrusions. The authors used a novel method to determine the number of molecules per cell. First, they expressed HALO tagged Myo10 in U2OS cells and generated cell lysates of a certain number of cells and detected Myo10 after SDS-PAGE, with fluorescence and a stain-free method. They used a purified HALO tagged standard protein to generate a standard curve which allowed for determining Myo10 concentration in cell lysates and thus an estimate of the number of Myo10 molecules per cell. They also examined the fluorescence intensity in fixed cell images to determine the average fluorescence intensity per Myo10 molecule, which allowed the number of Myo10 molecules per region of the cell to be determined. They found a relatively small fraction of Myo10 (6%) localizes to filopodia. There are hundreds of Myo10 in each filopodium, which suggests some filopodia have more Myo10 than actin binding sites. Thus, there may be crowding of Myo10 at the tips, which could impact transport, the morphology at the tips, and dynamics of the protrusions themselves. Overall, the study forms the basis for a novel technique to estimate the number of molecules per cell and their localization to actin-based structures. The implications are also broad for understanding the role of myosins in actin protrusions, which is important for cancer metastasis and wound healing.

      Strengths:

      The paper addresses an important fundamental biological question about how many molecular motors are localized to a specific cellular compartment and how that may relate to other aspects of the compartment such as the actin cytoskeleton and the membrane. The paper demonstrates a method of estimating the number of myosin molecules per cell using the fluorescently labeled HALO tag and SDS-PAGE analysis. There are several important conclusions from this work in that it estimates the number of Myo10 molecules localized to different regions of the filopodia and the minimum number required for filopodia formation. The authors also establish a correlation between the number of filopodia-localized Myo10 molecules and the number of filopodia in the cell. Only a small percentage of Myo10 is tip-localized relative to the total amount in the cell, suggesting Myo10 has to be activated to enter the filopodia compartment. The localization of Myo10 is log-normal, which suggests that clustering of Myo10 is a feature of this motor.

      Weaknesses:

      One main critique of this work is that the Myo10 was overexpressed. Thus, the amount in the cell body compared to the filopodia is difficult to compare to physiological conditions. The amount in the filopodia was relatively small (100s of molecules per filopodium), so this result is still interesting regardless of the overexpression. However, the overexpression should be addressed in the limitations.

      This is a reasonable perspective and we now note this caveat in the Limitations section so that readers will take note. Our goal here was to understand a system in which Myo10 is the limiting reagent for filopodia, rather than a native system that expresses high Myo10 on its own. Because U2OS cells do not express detectable levels of Myo10 (see below), the natural perturbation here is overexpressing Myo10 to stimulate filopodial growth.

      The authors have not addressed the potential for variability in transfection efficiency. The authors could examine the average fluorescence intensity per cell and if similar this may address this concern.

      Indeed, cells are heterogeneous and will naturally express different levels of Myo10 not only due to transfection efficiency, but also due to their state (cell cycle stage, motile behavior, and more). In fact, we measure the transfection efficiency of each bioreplicate and account for it in our calibration procedure. We also measure the fluorescence intensity per cell, which lets us calculate the total Myo10s per cell and the cell-to-cell variability. These Myo10 distributions across cells are shown in Fig. 1D-E.

      We note here an error that we made in applying this transfection efficiency correction in the first submission. When we obtain the total Myo10 molecules by SDS-PAGE, we should divide by the total number of transfected cells. However, due to an operator precedence error, the transfection efficiency appeared in the numerator rather than the denominator. We have now corrected this error, which has the effect of increasing the number of molecules in all of our measurements. The effect of this correction has strengthened one of the paper’s main conclusions, that Myo10 is frequently overloaded at filopodial tips.
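
      The kind of operator precedence slip described here is easy to reproduce. The sketch below is illustrative only: the variable names and numbers are hypothetical, and the language of the original analysis scripts is not stated. In Python, as in most languages, `/` and `*` have equal precedence and associate left to right, so the efficiency lands in the numerator unless the denominator is parenthesized.

```python
def molecules_per_transfected_cell(total_molecules, n_cells, transfection_efficiency):
    """Divide the lysate total by the number of cells that actually express the construct."""
    transfected_cells = n_cells * transfection_efficiency
    return total_molecules / transfected_cells


# Hypothetical numbers, for illustration only.
total, n_cells, eff = 1.0e12, 1.0e6, 0.25

buggy = total / n_cells * eff  # parsed as (total / n_cells) * eff
correct = molecules_per_transfected_cell(total, n_cells, eff)

print(buggy, correct, correct / buggy)
```

      Because the efficiency is at most 1, the buggy expression underestimates the count by a factor of the efficiency squared, which is consistent with the correction increasing all molecule numbers.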

      The SDS PAGE method of estimating the number of molecules is quite interesting. I really like this idea. However, I feel there are a few more things to consider. The fraction of HALO tag standard and Myo10 labeled with the HALO tagged ligand is not determined directly. It is suggested that since excess HALO tagged ligand was added we can assume nearly 100% labeling. If the HALO tag standard protein is purified it should be feasible to determine the fraction of HALO tagged standard that is labeled by examining the absorbance of the protein at 280 and fluorophore at its appropriate wavelength.

      This is a fair point raised by the reviewer, and we have now measured a labeling efficiency of 90% in Supplementary Figure 2A-C. We have adjusted all values according to this labeling efficiency.
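
      For reference, the absorbance-based estimate that the reviewer describes can be sketched as follows. This is a generic degree-of-labeling calculation under Beer-Lambert assumptions with a 1 cm path length; whether the authors used exactly this calculation is not stated here. The extinction coefficients, the correction factor for dye absorbance at 280 nm, and the absorbance readings are all hypothetical placeholders.

```python
def degree_of_labeling(a280, a_dye, eps_protein, eps_dye, cf280, path_cm=1.0):
    """Return moles of dye per mole of protein from two absorbance readings.

    cf280 is the fraction of the dye's peak absorbance that it also
    contributes at 280 nm; it is subtracted before computing protein.
    """
    dye_molar = a_dye / (eps_dye * path_cm)
    protein_molar = (a280 - cf280 * a_dye) / (eps_protein * path_cm)
    if protein_molar <= 0:
        raise ValueError("corrected A280 must be positive")
    return dye_molar / protein_molar


# Hypothetical readings and coefficients (M^-1 cm^-1), not measured values.
dol = degree_of_labeling(a280=0.281, a_dye=0.27,
                         eps_protein=60000.0, eps_dye=90000.0, cf280=0.3)
print(round(dol, 3))
```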

      The fraction of HALO tagged Myo10 labeled may be more challenging to determine, since it is in a cell lysate, but there may be some potential approaches (e.g. mass spec, HPLC).

      As noted, this value is considerably more challenging. Instead, we determined conditions under which labeling in cells is saturated. We have now stained with a concentration range for both fixed and live cell samples. Saturation occurs with ~0.5 μM HaloTag ligand-TMR in fixed/permeabilized cells and in live cells (Supplementary Figure 2D-E). This comparison of live cells vs. permeabilized cells allows us to say that the intact plasma membrane is not limiting labeling under these conditions.

      In Figure 1B, the stain free gel bands look relatively clean. The Myo10 is from cell lysates so it is surprising that there are not more bands. I am not surprised that the bands in the TMR fluorescence gel are clean, and I agree the fluorescence is the best way to quantitate.

      Figure 1B shows the focused view at high MW, and there is not much above Myo10. The full gel lanes shown in Supp. Fig. 1C show the expected number of bands from a cell lysate.

      In Figure 3C, the number of Myo10 molecules needed to initiate a filopodium was estimated. I wonder if the authors could have looked at live cell movies to determine that these events started with a puncta of Myo10 at the edge of the cell, and then went on to form a filopodia that elongated from the cell. How was the number of Myo10 molecules that were involved in the initiation determined? Please clarify the assumptions in making this conclusion.

      We thank the reviewer (and the other reviewers) for this excellent suggestion. We have now carried out these live cell experiments. These experiments were quite challenging, because we needed to collect snapshots of ~50 cells to measure the mean fluorescence intensity of transfected cells and then acquire movies of several cells for analysis. The U2OS cells were also highly temperature-sensitive and would retract their filopodia without objective heating.

      We have now analyzed filopodial initiation events and measured considerably more Myo10 at the first signs of accumulation, in the 100s of molecules. The dimmer spots that we measured in the first draft were likely unrelated to filopodial initiation, and we have corrected the discussion on this point.

      We now also track further growth from a stable filopodial tip (the phased-elongation mechanism from Ikebe and coworkers) and find approximately 500 molecules bud off in those events. We also track filopodial elongation rates as a function of Myo10 numbers. We have added additional live cell imaging sections that include these results.

      It is stated in the discussion that the amount of Myo10 in the filopodia exceeds the number of actin binding sites. However, since Myo10 contains membrane binding motifs and has been shown to interact with the membrane, it should be pointed out that the excess Myo10 at the tips may be interacting with the membrane and not actin, which may prevent traffic jams.

      This is also an excellent point to consider, and we have expanded the relevant discussion along these lines. We agree that the Myo10 at the filopodial tip is likely membrane-bound. We now estimate the 2D membrane area occupied by Myo10, and find that it reaches nearly full packing in many cases (under a number of assumptions that we spell out more fully in the manuscript).

      Reviewer #3 (Public Review):

      Summary:

      The unconventional myosin Myo10 (aka myosin X) is essential for filopodia formation in a number of mammalian cells. There is a good deal of interest in its role in filopodia formation and function. The manuscript describes a careful, quantitative analysis of Myo10 molecules in U2OS cells, a widely used model for studying filopodia, how many are present in the cytosol versus filopodia and the distribution of filopodia and molecules along the cell edge. Rigorous quantification of Myo10 protein amounts in a cell and cellular compartment is critical for ultimately deciphering the cellular mechanism of Myo10 action as well as understanding the molecular composition of a Myo10-generated filopodium.

      Consistent with what is seen in images of Myo10 localization in many papers, the vast majority of Myo10 is in the cell body with only a small percentage (approx. 5%) present in filopodia puncta. Interestingly, Myo10 is not uniformly distributed along the cell edge, but rather it is unevenly localized along the cell edge with one region preferentially extending filopodia, presumably via localized activation of Myo10 motors. Calculation of total molecules present in puncta based on measurement of puncta size and measured Halo-Myo10 signal intensity shows that the concentration of motor present can vary from 3 to 225 µM. Based on an estimation of available actin binding sites, it is possible that Myo10 can be present in excess over these binding sites.

      Strengths:

      The work represents an important first step towards defining the molecular stoichiometry of filopodial tip proteins. The observed range of Myo10 molecules at the tip suggests that it can accommodate a fairly wide range of Myo10 motors. There is great value in studies such as this and the approach taken by the authors gives one good confidence that the numbers obtained are in the right range.

      Weaknesses:

      One caveat (see below) is that these numbers are obtained for overexpressing cells and the relevance to native levels of Myo10 in a cell is unclear.

      A similar concern was raised by Reviewer 2; please see above.

      An interesting aspect of the work is quantification of the fraction of Myo10 molecules in the cytosol versus in filopodia tips showing that the vast majority of motors are inactive in the cytosol, as is seen in images of cells. This has implications for thinking about how cells maintain this large population in the off-state and what is the mechanism of motor activation. One question raised by this work is the distinction between cytosolic Myo10 and the population found at the ‘cell edge’ and the filopodia tip. The cortical population of Myo10 is partially activated, so to speak, as it is targeted to the cortex/membrane and presumably ready to go. Providing quantification of this population of motors, that one might think of as being in a waiting room, could provide additional insight into a potential step-by-step pathway where recruitment or binding to the cortical region/plasma membrane is not by itself sufficient for activation.

      As mentioned in our response to Reviewer 2, we have now carried out quantitation in live cells to capture Myo10 transitions from cell body into filopodial movement. We attempted to identify this membrane-bound population of motors in our new live cell experiments but were unable to make convincing measurements. Notably, we see no noticeable enrichment of Myo10 at the cortex relative to the cytosol. Although we believe there is a membrane-bound waiting room (akin to the 3D-2D-1D mechanism of Molloy and Peckham), we suspect that the 2D population is diffusing too rapidly to be detected under our imaging conditions.

      Specific comments:

      (1) It is not obvious whether the analysis of numbers of Myo10 molecules in a cell that is ectopically overexpressing Myo10 is relevant for wild type cells. It would appear to be a significant excess based on the total protein stained blot shown in Fig S1E where a prominent band the size of tagged Myo10 seen in the transfected sample is almost absent in the WT control lane.

      Even “wildtype” cells vary considerably in their Myo10 expression levels. For example, melanoma cells often heavily upregulate Myo10, while these U2OS cells produce nearly none (Supplementary Figure 1E). Thus, there is no single, widely acceptable target for Myo10 expression in wildtype cells.

      Please note that the new Supplementary Figure 1E is a Myo10 Western blot, not total protein staining as before.

      Ideally, and ultimately an important approach, would be to work with a cell line expressing endogenously tagged Myo10 via genome engineering. This can be complicated in transformed cells that often have chromosomal duplications.

      Indeed, we chose U2OS cells for this work because they do not express detectable levels of Myo10, and thus we can avoid all of these complications. Here we can examine how Myo10 levels control filopodial production through ectopic expression.

      However, even though there is an excess of Myo10 it would appear that activation is still under some type of control as the cytosolic pool is quite large and its localization to the cell edge is not uniform. But it is difficult to gauge whether the number of molecules in the filopodium is the same as would be seen in untransfected cells. Myo10 can readily walk up a filopodium and if excess numbers of this motor are activated they would accumulate in the tip in large numbers, possibly creating a bulge, and indeed it does appear that some tips are unusually large. Then how would that relate to the normal condition?

      As noted above, the normal condition depends on the cellular system. However, endogenous Myo10 also accumulates in bulges at filopodial tips, so this is not a phenotype unique to Myo10 overexpression. For example, the images from Figure 1 of the Berg and Cheney (2002) citation show bulges from endogenous Myo10 in endothelial cells.

      (2) Measurements of the localization of Myo10 focus in large part on ‘Myo10 punctae’. While it seems reasonable to presume that these are filopodia tips, the authors should provide readers with a clear definition of a puncta. Is it only filopodia tips, which seems to be the case? Does it include initiation sites at the cell membrane that often appear as punctae?

      We define puncta as any clusters/spots of Myo10 signal detected by segmentation, not limited to any location within the surface-attached filopodia. We exclude puncta that appear in the cell interior (~5 of which appear in Fig. 1A). These are likely dorsal filopodia, but there are few of these compared to the surface attached filopodia of U2OS cells. In Figure 2, “puncta” includes all Myo10 clusters along the filopodia shaft, though a majority happen to be tip-localized (please see Supplementary Figure 4B). We have edited the main text for clarification.

      Along those lines, the position of dim punctae along the length of a filopodium is measured (Fig 3D). The findings suggest that a given filopodium can have more than one puncta which seems at odds if a puncta is a filopodia tip. How frequently is a filopodium with two puncta seen? It would be helpful if the authors provided an example image showing the dim puncta that are not present at the tip.

      We have now provided an example image of dim puncta along filopodia in Supplementary Figure 4C.

      (3) The concentration of actin available to Myo10 is calculated based on the deduction from Nagy et al (2010) that only 4/13 of the actin monomers in a helical turn are accessible to the Myo10 motor (discussion on pg 9; Fig S4). Subsequent work (Ropars et al, 2016) has shown that the heads of the antiparallel Myo10 dimer are flattened, but the neck is rather flexible, meaning that the motor can have a variable reach (36 - 52 nm). Wouldn’t this mean that more actin could be accessible to the Myo10 motor than is calculated here?

      Although we see why the reviewer might believe otherwise, the 4/13 fraction of accessible actin holds. This fraction is obtained from consideration of the fascin-actin bundle structure alone, independent of the reach of any particular myosin motor. Every repeating layer of 13 actin subunits (or 36 nm) has 4 accessible myosin binding-sites. The remaining 9 sites are rejected because a single myosin motor domain will have a steric clash with a neighboring actin filament in the bundle. A myosin with an exceptionally long reach might reach the next 13 subunit layer, but that layer also has only 4 binding sites. Thus, we can calculate the number of binding sites per unit length along the filopodium. This number would hold for a dimeric myosin with any reach, including myosin-5 or myosin-2.
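
      The arithmetic behind this argument can be sketched as follows. The two constants come from the response above (4 accessible sites per 13-subunit, 36 nm repeat). Treating the count as per filament and scaling it by a number of filaments is an assumption made here for illustration; the filament count and filopodium length in the example are hypothetical.

```python
ACCESSIBLE_SITES_PER_REPEAT = 4  # of the 13 actin subunits in one repeat
REPEAT_LENGTH_NM = 36.0          # axial length of one 13-subunit repeat


def accessible_sites(length_nm, n_filaments=1):
    """Number of myosin-accessible actin sites along a stretch of bundle.

    The result does not depend on the reach of the motor: a longer reach
    only lets a head land on a different repeat, which again offers 4 sites.
    """
    repeats = length_nm / REPEAT_LENGTH_NM
    return ACCESSIBLE_SITES_PER_REPEAT * repeats * n_filaments


# Hypothetical example: a 360 nm stretch with 10 contributing filaments.
print(accessible_sites(360.0, n_filaments=10))
```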

      (4) Quantification of numbers of Myo10 molecules in filopodial puncta (Fig 3C) leads the authors to conclude that ‘only ten or fewer Myo10 molecules are necessary for filopodia initiation’ (pg 7, top). While this is reasonable based on the assumption that the formation of a puncta ultimately results from an initiation event, little is known about initiation events and without direct observation of coalescence of Myo10 at the cell edge that leads to formation of a filopodium, this seems rather speculative.

      As noted above, we have now performed the necessary live cell imaging of filopodial nucleation events and have updated our conclusions accordingly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have made a series of comments that might help the authors improve their manuscript:

      - A full calibration of the methodology would require testing a wider range of protein amounts, to exhaustively detect the dynamic range of the technique. The authors acknowledge in the discussion that “Furthermore, our estimates of molecules are predicated on the calibration curve of the Halo Standard Protein on the SDS-PAGE gels, which is likely the highest source of error on our molecule counts”. A good way of convincing a nasty reviewer is to provide a calibration with more than 3 reference points. At least this will help exclude from the analysis cells where Myo10 estimates are not in the linear regime of detection.

      We completely agree with the reviewer’s suggestion to build a robust calibration curve. The SDS gel shown in Figure 1C originally contained 4 reference points, but the highest HaloTag standard protein point oversaturated the detector at the set exposure in the TMR channel and was omitted. We have now re-run the SDS gel to include a HaloTag standard protein curve comprising 5 points, alongside all three bioreplicates from the fixed cell experiments and all three bioreplicates from the live cell experiments (updated in Figure 1B-C). We had saved frozen lysates from the original fixed cell work, so we were able to reanalyze our data with the new set of standards. The Myo10 quantities are consistent, but with much tighter CIs from the standard curve.
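
      A standard curve of this kind is typically used by fitting a line to the standards and inverting it for the unknown band, while refusing to extrapolate outside the range of the standards. The sketch below shows that idea with an ordinary least-squares fit; the amounts and signals are invented, and the function names are hypothetical rather than taken from the authors' analysis.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept, signal_range):
    """Invert the standard curve, rejecting signals outside the calibrated range."""
    low, high = signal_range
    if not low <= signal <= high:
        raise ValueError("signal lies outside the range covered by the standards")
    return (signal - intercept) / slope


# Hypothetical five-point standard curve (amount vs. band intensity).
standard_amounts = [1.0, 2.0, 4.0, 8.0, 16.0]
standard_signals = [150.0, 250.0, 450.0, 850.0, 1650.0]

slope, intercept = fit_line(standard_amounts, standard_signals)
calibrated = (min(standard_signals), max(standard_signals))
print(amount_from_signal(650.0, slope, intercept, calibrated))
```

      The range check is one simple way to address the reviewer's concern about excluding measurements that fall outside the linear regime of detection.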

      - As already said this methodology is intriguing, however, a correlative validation with a conventional SMLM approach to address the bona-fide of the method would be ideal.

      Unfortunately, single molecule approaches for validation are impractical for us. Due to the relatively high magnification of our TIRF microscope and the large spread area of the U2OS cells, single cells typically extend beyond the field of view. We acknowledge the benefits of SMLM quantitative techniques and other approaches cited in the introduction section. To avoid the use of special tools or instruments, we offer our methodology, based on the Pollard group’s quantitative Western blotting of GFP, as a simpler alternative accessible to anyone.

      - TMR is a small ligand likely interacting also with Halo in its denatured state. However, to clear any doubts a parallel Native-PAGE investigation should be included, or if existing a specific reference should be provided.

      Perhaps there is a misunderstanding here. One of the key advantages of the HaloTag labeling system is that the engineered dehalogenase is covalently modified by the ligand (the TMR-ligand is a suicide substrate). This means that the TMR remains bound even under denaturing conditions, which allows its detection in SDS-PAGE. Native gels are unnecessary here.

      - Moreover, SDS-PAGE is run at alkaline pH, have the authors considered these points when designing the methodology? Fluorescence images were taken in PBS, which has a different pH. Could the authors, or the literature, exclude these aspects as potential pitfalls in the methodology? Also temperature is affecting fluorescence emission, but it is easier to control with certain tolerance in the room-temperature regime.

      Our method does not compare fluorescence values that cross the experimental systems (SDS-PAGE vs. microscopy). Cellular proteins and HaloTag protein standards are compared in a single setting of SDS-PAGE to obtain the average number of Myo10s per transfected cell. Likewise, all measurements on intact (live or fixed) cells are conducted in that single setting to obtain average fluorescence per cell. Thus, there is no issue with the different buffers or temperatures affecting fluorescence emission.

      - The authors should test their approach also with truncation variants of Myosin10 (for instance lacking the PH or motor domain). This is a classical approach that might prove the potential of the technique when altering the capacity of the protein to interact with a main binding partner. Also, treatments that induced filopodia formation might prove useful (i.e., hypotonic media induce filopodia formation in some fibroblast cell lines in our hands).

      The reviewer raises interesting suggestions that we aim to address in future experiments, but truncation variants and environmental perturbations are beyond the focus of the current manuscript. Here, we report on the otherwise unperturbed state when we add exogenous full-length Myo10 to the U2OS cells. But indeed, experiments with Myo10 domain truncations, PI3K and PTEN inhibition, and cargo protein / activating cofactor knock-downs (among others) are on our drawing board.

      - Most of the mechanisms hypothesized in the discussion are sound and plausible. However, the authors have chosen an experimental model where transient transfection of exogenous Myo10 in U2OS is performed. This approach poses two main and fundamental questions that are not resolved by the data provided:

      A) how do different expression levels affect the Myo10 counting?

      Our counting procedure does not assume uniform expression across a population of cells– quite the opposite, in fact. We directly measure Myo10 expression levels on a cell-by-cell basis with microscopy, once we know the number of molecules in our total pool (see the Methods for details). As an example of the final output, Figs. 1D and 1E show the total number of Myo10 molecules per cell for fixed and live cells, respectively.

      B) how does endogenous, unlabeled Myo10 compromise the reliability of the counts? The authors claimed “U2OS cells express low levels of Myo10, so there is a small population of unlabeled endogenous Myo10 unaddressed by this paper”. As presented, the low level of endogenous Myo10 sounds like an arbitrary parameter, and no data are presented that can limit, if not exclude, this bias in the analysis. Producing data in a genetically modified cell line with a HaloTag on the endogenous protein would represent a much cleaner system. Alternatively, the authors should look for Myo10 KO cell lines in which they can back-transfect their HaloTagged Myo10 construct in a more consistent framework, focusing on cells with low-to-mid levels of expression.

      We agree, this is an important point to nail down (and is often neglected in the literature). We have now measured the endogenous Myo10 levels in U2OS cells by Western blotting and found that it is undetectable compared to our HaloTagged construct expression. Please see Supp. Fig 1E. Thus, for all intents and purposes, every Myo10 molecule in these experiments came from our expression plasmid. Accordingly, we have removed this caveat from the paper.

      Minor points

      - Figure 1B. To help the reader SDS-PAGE gels annotations should be clearer already from the figure.

      We have updated the annotations for clarity.

      - Methods should be organized in sections. As it stands, it is hard for the reader to look for technical details.

      We have expanded and added subsections to the Methods as requested.

      - The good practice of indicating the gene and transcript entry numbers and the primer used to amplify and clone into the backbone vectors is getting lost in many papers. I would strongly encourage the authors to add this information to the methods.

      We have added the gene entries to the Methods and will include a full FASTA file of the coding sequence as supplementary information to avoid any ambiguity here.

      The authors write “It is unclear how myosins navigate to the right place at the right time, but our results support an important interplay between Myo10 and the actin network.” It is a bit scholastic to say that Myo10 and actin have an important interplay, they are major binding partners. What is the new knowledge contained in this sentence?

      Agreed– we have deleted the sentence in question.

      Reviewer #2 (Recommendations For The Authors):

      The authors should address all the weaknesses indicated in the public review.

      There were a few other places that require clarification.

      On page 4, the last paragraph. It is stated that the targeting of Myo10 was reported/proposed based on previous work (ref 31). The next few sentences are not referenced and thus likely refer to ref 31. The authors did not measure the parameters discussed in these sentences, so it is important to clarify that they are referring to previous work and not the current study.

      Indeed, the next few sentences still refer to old reference 31, so we have now edited the paragraph for clarity.

      On page 7, the reference to Figure 3A indicates a trend of higher Myo10 correlating with more filopodia. The reference to Figure 3B, however, indicates that total intracellular Myo10 weakly correlates with more filopodia, yet the x-axis of Figure 3B is filopodial molecules, not intracellular Myo10. Please clarify.

      We appreciate the reviewer for catching our mistake. Those plots are now in Fig. 2 and have been edited accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The Discussion of results at the end of each section is rather brief and could be expanded on a bit more.

      Previously, we were operating under the constraints of an eLife Short Report. We have now expanded the discussion for a full article.

      The authors mention that actin filaments at the tips of filopodia could be frayed, citing Medalia et al, 2007 (ref 40). That paper describes an early cryoEM analysis of filopodia from the amoeba Dictyostelium. EM images of mammalian filopodia tips, e.g. Svitkina et al, 2003, JCB, do not show quite the same organization of actin as seen in the Dictyostelium filopodia tips. However, recent work from the Bershadsky lab, Li et al, 2023, presents a few cryoEM images of tips of left-bent filopodia that are tightly adhered to a substrate and there it looks like actin filaments become disorganized in tips, along with membrane bulging. The authors should consider expanding discussion of the filopodia tips to take into account what is known for mammalian filopodia.

      We thank the reviewer for bringing these enlightening papers to our attention. We have now included these citations in the discussion.

      Fig 1D - The x-axis is a bit odd, it goes from 0 then to 2.5e+06 with no indication of the bin size. Can this be re-labelled or the scale displayed a bit differently?

      We have double-checked the axis breaks, which are large because the underlying values are large. We have also provided the bin size as requested for all histograms.

      Fig 4A - What is the bin size for the histogram?

      As above, we have now updated the figure legends (now in Fig. 3) to include the bin size.

      Methods -

      - Please provide an accession number for the Myo10 nucleotide sequence used for this work as there are at least two known isoforms.

      Thank you for noting this. We are using the full-length, not the headless isoform. We have now updated the Methods accordingly.

      - No mention is made of the SDS sample buffer used, was that also added to the sample?

      We have now updated the Methods accordingly.

      - How are samples boiled at 70 deg C? Do the authors actually mean ‘heated’?

      Indeed. We have now corrected “boiled” to “heated.”

      - Could the authors please briefly explain the connected component analysis used to identify filopodia?

      We have now updated the Methods accordingly.

      - The intensity of filopodia was determined by dividing tip intensity by the total bioreplicate sum of intensities then multiplying it by the total pool, if this reviewer understands correctly. It sounds like intensities are being averaged across a whole cell population instead of cell-by-cell. Is that correct? If so, can the authors please provide the underlying rationale for this? If not, then please better describe what was actually done.

      We apologize for the confusion. Intensities are being averaged (summed) across a whole cell population, but importantly that step is only used to obtain a scale factor that converts the fluorescence signal at the microscope to the number of molecules. We then use that scale factor for all cells imaged in the bioreplicate, to both 1) find the total Myo10 in that cell, and 2) find the total amount of that Myo10 in any given location within that cell.

      To further clarify, each bioreplicate has a known total number of Myo10 molecules associated with the number of cells loaded onto the SDS gel. From the SDS gel, we have an average number of Myo10 molecules per positively transfected cell. If 50 cell images are analyzed, then there is a Myo10 ‘total pool’ of (50 cells) * (average Myo10 molecules/cell). The fluorescence signal intensities in microscopy were summed for all cells within the bioreplicate (50 cells in this example). However, due to variation in expression, not every cell has the same signal intensity when imaged under the same conditions. It would be inaccurate to assume each cell contains the average Myo10 molecules/cell. Therefore, to get the number of molecules within a given Myo10 cell (or punctum), the summed cell (punctum) intensity was divided by the bioreplicate fluorescence signal intensity sum and multiplied by ‘total pool.’
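
      The conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' in-house script: the function names and the intensity values are hypothetical, and the sketch assumes that each cell's summed fluorescence is proportional to its number of labeled Myo10 molecules.

```python
def scale_factor(cell_intensities, avg_molecules_per_cell):
    """Molecules per unit of fluorescence for one bioreplicate.

    cell_intensities: summed microscopy signal of each imaged cell (hypothetical values).
    avg_molecules_per_cell: average per transfected cell, taken from the SDS-PAGE gel.
    """
    total_pool = len(cell_intensities) * avg_molecules_per_cell
    return total_pool / sum(cell_intensities)


def molecules(intensity, scale):
    """Convert any summed intensity (whole cell or a single punctum) to molecules."""
    return intensity * scale


# Hypothetical bioreplicate: three cells with unequal expression.
intensities = [1.0, 3.0, 4.0]
scale = scale_factor(intensities, avg_molecules_per_cell=1000.0)
per_cell = [molecules(i, scale) for i in intensities]
```

      In this toy case the three cells receive 375, 1125, and 1500 molecules, which sum to the total pool of 3000 even though no single cell holds the average of 1000.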

      - The authors quantify Myo10 protein amounts by western blotting using Halo tag fluorescence, a method that should provide good accuracy. The results depend on the transfection efficiency and it is rarely the case that it is 100%. The authors state that they use a ‘value correction for positively transfected cells’ (pg 11). It is likely that there was a range of expression levels in the cells, how was a cut-off for classifying a cell as non-expressing determined or set?

      As described in the Methods, “microscopy was used to count the percentage of transfected cells from ~105-190 randomly surveyed cells per bioreplicate.” Cells were labeled and located with DAPI. If no TMR signal could be visually detected by microscopy, then the cell was deemed to be non-Myo10 expressing. We did not set a cutoff fluorescence value, as untransfected cells have no detectable signal. Please see Supplementary Figure 1F for examples.
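
      As a rough sketch of how a correction for transfection efficiency could enter the calculation, the snippet below divides the gel-derived total by the number of cells expected to be transfected. The exact form of the authors' correction is not given in this response, so the formula, the function name, and all numbers are assumptions for illustration only.

```python
def molecules_per_transfected_cell(total_molecules, cells_loaded, n_positive, n_surveyed):
    """Average molecules per transfected cell (assumed form of the correction).

    total_molecules: molecules in the gel lane, from the protein standard curve.
    cells_loaded: number of cells loaded in that lane.
    n_positive / n_surveyed: transfected fraction counted by microscopy.
    """
    if n_surveyed <= 0 or n_positive <= 0:
        raise ValueError("need at least one surveyed and one positive cell")
    fraction_transfected = n_positive / n_surveyed
    return total_molecules / (cells_loaded * fraction_transfected)


# Hypothetical values: 60 of 150 surveyed cells show TMR signal.
avg = molecules_per_transfected_cell(
    total_molecules=1e9, cells_loaded=1000, n_positive=60, n_surveyed=150
)
```

      Without the correction, the same lane would give 1e6 molecules per cell; attributing the signal only to the 40% of cells that express the construct raises the estimate to about 2.5e6.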

      - “In-house Python scripts” are used for image analysis. Will these be made publicly available?

      Yes, we will package these up on GitHub.

    1. Author response:

      a) that the investigation is very interesting and inventive, and has the potential to reveal some novel insights.

      We thank the reviewers and are excited to improve upon the manuscript through their suggestions.

      b) that the problem of temporal autocorrelation in the fMRI and behavioral data has not been dealt with clearly and convincingly

      We agree that convincingly accounting for fMRI temporal autocorrelation is important to our claims. To reduce its effects, we used field-standard methods: prewhitening and autocorrelation modeling with SPM’s FAST algorithm (shown by Olszowy et al. 2019 to be superior to SPM’s default setting), as well as a high-pass filter with a 128 s cutoff. There is still some first-order autocorrelation structure present across voxels in the left hippocampal beta series: across participants there is slightly positive autocorrelation between the betas of successive decision trials, which decays to ~0 at subsequent lags. We note that our task is a narrative, and some patterns over time are expected; instead of attempting to fully eliminate all temporal structure in the data, we aim to show that the temporal distance between trials is unlikely to explain our effects.
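
      One simple way to characterize such a profile is the sample autocorrelation of a trial-wise beta series at increasing lags. The sketch below is a generic illustration, not the authors' analysis code; the function name and the toy series are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a 1-D series at a given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant")
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Toy beta series for one voxel across six decision trials (hypothetical).
betas = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
profile = [autocorrelation(betas, lag) for lag in (1, 2, 3)]
```

      Averaging such profiles across voxels and participants gives a lag-by-lag summary in which a small positive value at lag 1 that falls to roughly zero at later lags would match the pattern described above.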

      In the within versus between social dimension representational similarity analysis, the average temporal distance between trials is the same within and between dimensions. The clustering analysis is a between subject analysis about individual differences–and the same overall temporal structure is experienced by all participants.

      The trajectory analysis does not focus on consecutive trials across characters, but rather on consecutive trials within characters, where the time gap between successive trials is relatively large and highly variable. An average of over a minute elapses between successive decision trials for a given character (versus ~20 seconds across characters), which is on average almost 11 narrative slides and 3 decision trials. Across characters, the temporal gap between decision trials ranges from 12 seconds to more than 10 minutes, reducing the likelihood that temporal autocorrelation drives character-related estimates. We also highlight the shuffled choices control model, which shares the same temporal autocorrelation structure as the model of interest but had significantly poorer social location decoding–a strong indication that temporal autocorrelation alone can’t explain these results. For each participant, we shuffled their choices and re-computed trajectories that preserved the origin and end locations but produced different locations along the way. Our model decoded location significantly better than this null model, and this difference in performance can't be explained by differences in temporal autocorrelation in the neural or behavioral data.
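
      The logic of the shuffled choices control can be illustrated with a short sketch. It assumes, for illustration only, that each decision moves a character by one step along one of two social dimensions and that a trajectory is the running sum of those steps; the function names and step values are hypothetical.

```python
import random


def trajectory(steps, origin=(0, 0)):
    """Cumulative locations visited when applying each step in order."""
    x, y = origin
    path = [(x, y)]
    for dx, dy in steps:
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def shuffled_trajectory(steps, rng):
    """Null trajectory: same steps in random order, so origin and end are unchanged."""
    shuffled = list(steps)
    rng.shuffle(shuffled)
    return trajectory(shuffled)


# Hypothetical choices for one character (one step per decision trial).
steps = [(1, 0), (0, 1), (-1, 0), (0, 1), (1, 0), (0, -1)]
observed = trajectory(steps)
null = shuffled_trajectory(steps, random.Random(0))
```

      Because addition is order-independent, every shuffle ends at the same location as the observed trajectory, while the intermediate locations generally differ. The trial timing is untouched, so the null model keeps the same temporal structure as the model of interest.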

      In the revision, we will further address this concern. For example, we will report more details on the task structure to aid in interpretation and will more precisely characterize the temporal autocorrelation profile. Where appropriate, we will also improve on and/or add more control analyses that preserve the autocorrelation structure.

      c) that a number of important interesting questions have not been addressed: Are the differences between social partners encoded in the hippocampus? Are the social dimensions encoded in a consistent manner across social partners?

      We believe that we should be able to decode other interesting task- and relationship-related features from the hippocampal patterns, as suggested by the reviewers. In the revision, we will attempt several such analyses, while taking care to control for temporal autocorrelation.

      d) that the cluster analysis in the brain-behavior correlation analysis is not well motivated or validated and should be clarified.

      We agree with the reviewers that this clustering analysis should be better described and validated. We aimed to ask whether less diverse and distinctive cognitive representations of the relationship trajectories relate to smaller real-world social networks. This question of impoverished cognitive maps was first raised by Edward Tolman; we think it is relevant here, as well. In the revision, we will clarify its motivations and implications, and better evaluate it for its robustness. Here, we address a few comments made by the reviewers.

      Reviewer 2 noted that other analyses could be used to ask whether social cognitive map complexity relates to real-world social network complexity. While the proposed alternatives are interesting (e.g., correlating decoding accuracy with social network size), we believe these analyses ask different questions. The current co-clustering analysis was intended to estimate map complexity jointly from the behavioral and neural signatures of the social map across characters. In contrast, the spline location decoding is within character; the accuracy of this decoding does not say much about representations across characters. And although we think character decoding is an interesting possible addition to this manuscript, its accuracy may reflect other aspects of the relationships, beyond just spatial representation. Thus, we will provide a clearer and better validated version of the current analysis to address this question.

      We would also like to clarify that we did not collect the Social Network Index questionnaire in the Initial sample; as such these results are more tentative than the other analyses, due to the inability to confirm them in a separate sample. Reviewer 2 also suggests that a single outlier could drive this effect; but estimating the effect with robust regression also returns a right-tailed p < 0.05, showing that the relationship is robust to outliers.
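
      To illustrate why a robust estimator is informative here, the sketch below compares an ordinary least-squares slope with a Theil–Sen slope (the median of all pairwise slopes) on toy data containing one extreme point. Theil–Sen is used only as a simple stand-in; the response does not state which robust regression method was used, and the data are hypothetical.

```python
from itertools import combinations
from statistics import median


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Median of the slopes between all pairs of points with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes)


# Toy data: a unit-slope trend plus one outlying participant (hypothetical).
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5, 30]
slope_ols = ols_slope(x, y)
slope_robust = theil_sen_slope(x, y)
```

      The single outlier inflates the least-squares slope well above 1, whereas the median-based estimate stays at the slope of the remaining points. An effect that survives this kind of estimator is unlikely to be driven by one participant.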

      References

      Olszowy, W., Aston, J., Rua, C. & Williams, W.B. Accurate autocorrelation modeling substantially improves fMRI reliability. Nature Communications. (2019).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides important new insights into how multisensory information is processed in the lateral cortex of the inferior colliculus, a poorly understood part of the auditory midbrain. By developing new imaging techniques that provide the first optical access to the lateral cortex in a living animal, the authors provide convincing in vivo evidence that this region contains separate subregions that can be distinguished by their sensory inputs and neurochemical profiles, as suggested by previous anatomical and in vitro studies. Additional information and analyses are needed, however, to allow readers to fully appreciate what was done, and the comparison of multisensory interactions between awake and anesthetized mice would benefit from being explored in more detail.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors provide a characterisation of auditory responses (tones, noise, and amplitude-modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher-order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group has previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from the auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude-modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low-frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice, somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little (possibly not at all) in the matrix. However, bimodal interactions in LC may differ between the awake and anesthetized states, which warrants deeper investigation by the authors: They find, under anesthesia, more bimodal enhancement in modules of LC compared to the matrix of LC and bimodal suppression dominating the matrix of LC. In contrast, under awake conditions bimodal enhancement is almost exclusively found in the matrix of LC, and bimodal suppression dominates both matrix and modules of LC.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher-order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      Strengths:

      The major strength of this study is undoubtedly the fact that the authors for the first time provide optical access to a subcortical region (the lateral cortex of the inferior colliculus, i.e. the higher order auditory midbrain), which we know (from previous work by the same group) has optically identifiable subdivisions with unique inputs and neurotransmitter release, and which plays a central role in auditory and multisensory processing. A description of basic auditory and multisensory properties of this structure is therefore very useful for understanding auditory processing and multisensory interactions in subcortical circuits.

      Weaknesses:

      I have divided my comments about weaknesses and improvements into major and minor comments, all of which I believe are addressable by the authors to provide a clearer picture of their characterisation of the higher-order auditory midbrain.

      Major comment:

      (1) The differences between multisensory interactions in LC in anaesthetised and awake preparations appear to be qualitatively different, though the authors claim they are similar (see also minor comment related to figure 10H for further explanation of what I mean). However, the findings in awake and anaesthetised conditions are summarised differently, and plotting of similar findings in the awake figures and anaesthetised figures are different - and different statistics are used for the same comparisons. This makes it very difficult to assess how multisensory integration in LC is different under awake and anaesthetised conditions. I suggest that the authors plot (and test with similar statistics) the summary plots in Figure 8 (i.e. Figure 8H-K) for awake data in Figure 10, and also make similar plots to Figures 10G-H for anaesthetised data. This will help the readers understand the differences between bimodal stimulation effects on awake and anaesthetised preparations - which in its current form, looks very distinct. In general, it is unclear to me why the awake data related to Figures 9 and 10 is presented in a different way for similar comparisons. Please streamline the presentation of results for anaesthetised and awake results to aid the comparison of results in different states, and explicitly state and discuss differences under awake and anaesthetised conditions.

      We thank the reviewer for the valuable suggestion. We highlighted only the similarities between the data obtained from anesthetized and awake preparations to show that the technique can be reproduced in awake animals for future assessment. Those similarities were identified by comparing modules vs. matrix or LC vs. DC within each experimental setup (awake or anesthetized). The statistics were therefore chosen separately for each setup based on the number of subjects (n) within each experimental preparation. However, we agree with the reviewer’s comment that there are differences between the anesthetized and awake data. To examine these differences, we ran the same statistics for Figure 5 (tonotopy of LC vs. DC, anesthetized animals) and Figure 9 (tonotopy of LC vs. DC, awake animals). In addition, we added a new figure after Figure 9 to separate the statistical analysis from the maps. Accordingly, Figures 4 and 5 (maps and analysis, respectively; anesthetized animals) now match Figures 9 and 10 (maps and analysis, respectively; awake animals). We did the same for Figure 7 (microprism imaging of the LC, anesthetized animals), Figure 8 (imaging of the LC from the dorsal surface, anesthetized animals), and Figure 11, formerly Figure 10 (microprism imaging of the LC, awake animals), to address the similarities and differences in the multisensory data between awake and anesthetized animals. We edited the text accordingly in the Results and Discussion sections.

      (2) The claim about the degree of tonotopy in LC and DC should be aided by summary statistics to understand the degree to which tonotopy is actually present. For example, the authors could demonstrate that it is not possible/or is possible to predict above chance a cell's BF based on the group of other cells in the area. This will help understand to what degree the tonotopy is topographic vs salt and pepper. Also, it would be good to know if the GABAergic modules have a higher propensity of particular BFs or tonotopic structure compared to matrix regions in LC, and also if general tuning properties (e.g. tuning width) are different from the matrix cells and the ones in DC.

      Thank you for the reviewer’s suggestion. We have examined the tonotopy of the LC and DC using two regression models (linear and quadratic polynomial) between the BFs of the cells and their location on the anatomical axis. Tonotopy is therefore indicated by a significant regression fit with a high R² between the BFs of the cells and their location within each structure. For the DC, there was a significant regression fit between the BFs of the cells and their locations over the rostromedial to caudolateral axis. Additionally, the R² of the quadratic polynomial fit was higher than that of the linear fit, which indicates a nonlinear distribution of cells based on their BFs, consistent with the presence of high-low-high tuning over the DC surface. Given that the microprism cannot image the whole area of the LC, and that it images a slightly different area in each animal, it was very difficult to obtain a consistent map for the LC or to draw a solid conclusion about LC tonotopy. However, we examined the regression fit between the BFs of cells and their location along the four main anatomical axes of the field of view obtained from each animal (dorsal to ventral, rostral to caudal, dorsocaudal to ventrorostral, and dorsorostral to ventrocaudal). Unlike the DC, the LC imaged via microprism showed a lower R² for both linear and quadratic regression, mostly in the dorsoventral axis. We show the fitting curves of these regressions in Figure 4-figure supplement 1 (anesthetized data) and Figure 9-figure supplement 1 (awake data). Despite the inconsistent tonotopy of the LC imaged via microprism, the modules had a higher median BF (10 kHz) than the matrix (7.1 kHz), which was consistent across the anesthetized and awake animals. We have added these results at the corresponding place in the Results section (lines 193-197 and 361-364).

      We have examined the tuning width using the binarized receptive field sum (RFS) method, in which each neuron is given a value of 1 if it responds to a single frequency (narrow RF), and this value increases if the neuron responds to more neighboring frequencies (wide RF). We did this calculation across all sound levels. Both the DC and LC of the anesthetized animals had a higher RFS mean and median than those of awake animals, given that ketamine is known to broaden the RF. However, in both preparations (anesthetized and awake), the DC had a higher RFS mean than the LC, which could be consistent with the finding that the DC had a relatively lower SMI than the LC. To show these new data, we made a new Figure 10-figure supplement 1, and we edited the text accordingly [lines 372-379 & 527-531].
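
      The comparison of linear and quadratic fits described in this response can be sketched as follows. This is a minimal pure-Python illustration, not the authors' analysis: the function names, cell positions, and BF values are hypothetical, and it reports only R², not the significance of each fit.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    m = degree + 1
    a = [[float(sum(xi ** (i + j) for xi in x)) for j in range(m)] for i in range(m)]
    b = [float(sum(yi * xi ** i for xi, yi in zip(x, y))) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            raise ValueError("singular system: not enough distinct x values")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def r_squared(x, y, coeffs):
    """Coefficient of determination of the fitted polynomial."""
    mean_y = sum(y) / len(y)
    predicted = [sum(c * xi ** k for k, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical cells along one anatomical axis with a high-low-high BF pattern.
position = [-3, -2, -1, 0, 1, 2, 3]
best_freq = [9, 4, 1, 0, 1, 4, 9]  # arbitrary units
r2_linear = r_squared(position, best_freq, polyfit(position, best_freq, 1))
r2_quadratic = r_squared(position, best_freq, polyfit(position, best_freq, 2))
```

      For this symmetric toy map the linear fit explains none of the variance and the quadratic fit explains all of it, which is the signature the response attributes to high-low-high tuning across the DC surface.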

      (3) Throughout the paper more information needs to be given about the number of cells, sessions, and animals used in each panel, and what level was used as n in the statistical tests. For example, in Figure 4 I can not tell if the 4 mice shown for LC imaging are the only 4 mice imaged, and used in the Figure 4E summary or if these are just examples. In general, throughout the paper, it is currently not possible to assess how many cells, sessions, and animals the data shown comes from.

      Thank you for the reviewer’s comment. We do apologize for not adding this information. We added all the information regarding the size of the statistical subjects (number of cells or number of animals used) for every test outcome. To keep the flow of the text, we added the details of the statistical tests in the legends of the figures.

      (4) Throughout the paper, to better understand the summary maps and plots, it would be helpful to see example responses of the different components investigated. For example, given that module cells appear to have more auditory offset responses, it would be helpful to see what the bimodal, sound-only, and somatosensory responses look like in example cells in LC modules. This also goes for just general examples of what the responses to auditory and somatosensory inputs look like in DC vs LC. In general example plots of what the responses actually look like are needed to better understand what is being summarised.

      Thank you for the reviewer’s comment and suggestion. We modified Figure 6 and the text accordingly to include all the significant examples of cells discussed throughout the work.

      Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      The main achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons), and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it. The writing is not as precise as it could be. Consequently, the manuscript is unclear in some places. For instance, the text is somewhat confusing as to whether there is a difference in the pattern (modules vs matrix) of somatosensory-auditory suppression between anesthetized and awake animals. Furthermore, there are aspects of the results which are potentially very interesting but have not been explored. For example, there is a remarkable degree of clustering of response properties evident in many of the maps included in the paper. Taking Figure 7 for instance, rather than a salt and pepper organization we can see auditory responsive neurons clumped together and non-responsive neurons clumped together and in the panels below we can see off-responsive neurons forming clusters (although it is not easy to make out the magenta dots against the black background). This degree of clustering seems much stronger than expected and deserves further attention.

      Thank you for the reviewer’s comment. We do apologize if some areas in the manuscript were imprecisely written. For anesthetized and awake data, we have only emphasized the similarities between the two setups to show the ability to use microprism in awake animals for future assessment. To highlight the differences between anesthetized and awake animals, we have now run uniform statistics for all the data collected from both setups. Accordingly, we have edited Figures 4 and 5 (tonotopy-anesthetized) to match Figures 9 and new Figure 10 (tonotopy-awake). Also, we edited Figures 7 and 8 (multisensory- anesthetized) to match Figure 11 or old Figure 10 (multisensory- awake). We edited the text accordingly in the results section and discussed the possible differences between anesthetized and awake data in the discussion section [lines 521-553].

      We agree with the reviewer’s comment that the cells were topographically clustered based on their responses. Some of these clusters include the somatosensory responsive cells, which were located mostly in the modules (Figures 7D and 8E). Also, the auditory responsive cells with offset responses were clustered mostly in the modules (Figures 7C and 8F). Accordingly, we have edited the text to emphasize this finding.

      We noticed also that some responsive cells to the tested stimulations were surrounded by nonresponsive cells. By comparing the response of the cells to different stimuli we found that while Figures 7 and 11 (old Figure 10) showed only the response of the cells to auditory stimulation (unmodulated broadband noise at 80 dB) and somatosensory stimulation (whisker deflection), some nonresponsive cells to these specific stimulations were found to be responsive to pure tones of different frequencies and amplitudes. As an indicator of the cells' viability, we additionally examined the spontaneous activity of the nonresponsive cells across different data sets. We note that spontaneous activity was rare for all cells even among the responsive cells to sound or somatosensory stimulations. This finding could be related to the possibility that the 2P imaging of calcium signals may not be sensitive enough to track spontaneous activity that may originate from single spikes. However, in some data sets, we have found that the cells that did not respond to any tested stimuli showed spontaneous activity when no stimulation was given indicating the viability of those cells. We have addressed the activity of the non-responsive cells in the text along with a new Figure 11-figure supplement 1.

      We changed the magenta into a green color to be suitable for the dark background. Also, we have completely changed the color palette of all of our images to be suitable for color-blind readers as suggested by reviewer 1.

      Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were far more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was reversed in the awake prep, where modular neurons became more responsive to somatosensory stimuli than auditory stimuli. Thus, to this reviewer, the most intriguing result of the present study is the dramatic extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggest that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, and the limitations of two-photon imaging for tracking neural activity are acknowledged. Appropriate statistical tests were used. There are three main issues the authors should address, but otherwise, this study represents an important advance in the field.

      (1) Please address whether the Thy1 mouse evenly expresses jRGECO1a in all LC neurons. It is known that these mice express jRGECO1a in subsets of neurons in the cerebral cortex, and similar biases in the LC could have biased the results here.

      Thank you for the reviewer’s comment. In the work published by Dana, et al, the expression of jRGECO1a in all Thy1 mouse lines was determined by the brightness of the jRGECO1a in the soma. Given that some cells do not show a detected level of jRGECO1a fluorescence until activated, the difference in expression shown in different brain regions could be related to the level of neuronal activity at the time of sample processing and not the expression levels of the indicator itself. To the best of our knowledge, there is no antibody for jRGECO1a, which can be used for detecting the expression levels of the indicator regardless of the neuronal activity. To test the hypothesis that DC and LC have different levels of jRGECO1a, we examined the expression levels of jRGECO1a after we perfused the mice with high potassium saline to elicit a general neuronal depolarization in the whole brain. Then we immunostained against NeuN (the neuronal marker) to quantify the percentage of the neurons expressing jRGECO1a to the total number of neurons (indicated by NeuN). To have a fair comparison, we restricted our analysis to include the areas imaged only by 2P as some regions were not accessible by microprism such as the deep ventral regions of the LC. There is a similar % of cells expressing jRGECO1a in DC and LC. As expected, the neurons expressing jRGECO1a were only nonGABAergic cells. We addressed these findings in the new Figure 3-figure Supplement 1 as well as the corresponding text in the results [lines 178-184] and methods sections [lines 878-892].

      (2) I suggest adding a paragraph or two to the discussion to address the large differences observed between the anesthetized and awake preparations. For example, somatosensory responses in the modules increased dramatically from 14.4% in the anesthetized prep to 63.6% in the awake prep. At the same time, auditory responses decreased from 52.1% to 22%. (Numbers for anesthetized prep include auditory responses and somatosensory + auditory responses.). In addition, the tonotopy of the DC shifted in the awake condition. These are intriguing changes that are not entirely expected from the switch to an awake prep and therefore warrant discussion.

      Thank you for the reviewer’s comment. To determine if differences exist between anesthetized and awake data, we have now used the same statistics and edited Figures 4, 5, 7, 8, 9, and 10 as well as added a new Figure 11. Accordingly, we have edited the result section and added a paragraph addressing the possible differences between the two preparations in the Discussion section [lines 521-553].

      (3) For somatosensory stimuli, the authors used whisker deflection, but based on the anatomy, this is presumably not the only somatosensory stimulus that affects LC. The authors could help readers place the present results in a broader context by discussing how other somatosensory stimuli might come into play. For example, might a larger percentage of modular neurons be activated by somatosensory stimuli if more diverse stimuli were used?

      We agree with the reviewer’s point. Indeed, the modules are receiving different inputs from different somatosensory sources such as somatosensory cortex and dorsal column nuclei, which could indicate that the activity of the cells in the modular areas could be evoked by different types of somatosensory stimulations, which is an open area for future studies. We have discussed this point in the revised Discussion section [lines 516-520].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Figure 3H: The lateral surface seems quite damaged by the prism. An example slice of the imaging area of each mouse would help the reader better understand the extent of damage the prism leaves in the area of interest.

      Thank you for the reviewer’s comment. We have already included such images in Figures 4A, 7A, and 9A to present the field of view of all prism experiments. However, we need to clarify the point of tissue damage. The insertion of the microprism may be associated with some tissue damage as a result of making the pocket for the microprism to be inserted, but it is not possible to get neuronal signals from a damaged field of view. Therefore, we do not believe that there is tissue damage to the parts of the LC imaged by microprism. However, there may be some areas where the microprism is not in direct contact with the LC surface. These areas are located mostly in the periphery of the field of view, and they are completely black as they are out of focus (i.e., the left side of Figure 3B). The right side of Figure 3B as well as Figure 3A have some black areas, which represent the vasculature, where there are no red signals because of the lack of jRGECO1a expression in those areas.

      (2) In relation to the data shown in Figure 4E it is claimed that LC is tuned to higher frequencies (lines 195-196). However, the majority of cells appear to be tuned to frequencies below 14kHz (with a median of 7.5 kHz), which is quite low for the mouse. I assume that the authors mean frequencies that are relatively higher than the DC, but it is worth mentioning in the text that the BFs found in the LC are quite low-frequency responses for the mouse.

      Thank you for the reviewer’s comment, which we agree with. We edited this part by acknowledging that around 50% of the LC cells had a low-frequency bias to 5 and 7.1 kHz. Then we mentioned that most of the LC cells are tuned to relatively higher frequencies than those of the DC [lines 215-218].

      (3) Figure 5A-C: Is it the tone-responsive cells plus an additional ~22% of cells that respond to AM, or are there also cells that respond to tones that do not respond to AM. Please break down to which degree the tone and AM responsive cells are overlapping.

      Thank you for the reviewer’s comment and suggestion. We broke down the responsive cells into cells responsive only to pure tone (tone selective cells or Tone-sel) or to only AM-noise (noise selective cells or Noise-sel) as well as cells responding to both sounds (nonselective cells or Non-sel). We examined the fractions of these categories of cells in both LC and DC within all responsive neurons. Accordingly, we have edited Figure 5A-C as well as the text [lines 229-243].

      (4) Figure 5D. It is unclear to me how a cell is classified as SMI or TMI responsive after computing the SMI or TMI for each cell. What statistic was used to determine if the cell was responsive or not?

      Thank you for the reviewer’s comment. We do apologize for the confusion caused by Figures 5D and E. These figures do not show the values of SMI or TMI, respectively. Rather, the figures show the percentage of the spectrally or temporally modulated cells, respectively. At each sound level, the cells were categorized into two main types. The spectrally modulated cells are those responsive to pure tones or unmodulated noise, so they can detect the spectral features of the sound (old Figure 5D or new Figure 5E). The temporally modulated cells are those responsive to AM-noise, so they can detect the temporal features of the sound of complex spectra like the broadband noise (old Figure 5E or new Figure 5F). To clear this confusion, we removed the words SMI and TMI from the figures, and then we renamed the x-axis label into “% of spectrally modulated cells” and “% of temporally modulated cells” for Figures 5D (new 5E) and E (new 5F), respectively.

      (5) Figure 5 D, E: Is the decrease in SMI and TMI modulated cells in the modules a result of simply lower sensitivity to sounds (i.e. higher response thresholds)? If a cell responds to neither tone, AM, or noise it will have a low SMI and TMI index. If this is the case that affects the interpretation, as it is then not a decrease in sensitivity to spectral or temporal modulation, but instead a difference in overall sound sensitivity.

      Thank you for the reviewer’s comment. We apologize for the confusion about Figures 5E and D, which did not show the SMI and TMI values. Rather, they show the percentage of spectrally or temporally modulated cells, respectively, as explained in our previous response. Therefore, Figure 5D shows the percentage of cells that can detect the spectral features of sound, while Figure 5E shows the percentage of cells that can detect the temporal features of sounds of complex spectra like broadband noise. Accordingly, Figures 5D and E show the sensitivity to different features of sound and not the overall sound sensitivity.

      (6) Figure 7 and 8: What is the false positive rate expected of the responsive cells using the correlation cell flagging criteria? Especially given that the fraction of cells responsive to somatosensory stimulation in LC (matrix) is 0.88% and 1.3% in DC, it is important to know what the expected false positive rate is in order to be able to state that there are actually somatosensory responses there or if this is what you would expect from false positives given the inclusion test used. Please provide an estimate of the false positive rate given your inclusion test and show that the rate found is statistically significantly above that level - and show this rate with a line in Figure 7 H, I.

      Thank you for the reviewer’s comment. To test the efficiency of the correlation method to determine the responsive cells, we initially ran an ROC curve comparing the automated method to a blinded human interpretation. The AUC of the ROC curve was 0.88. This high AUC value indicates that the correlation method ranks a randomly chosen responsive cell above a randomly chosen nonresponsive cell in most cases. At the correlation coefficient of 0.4, which was the cutoff value to determine the responsive cells for somatosensory stimulation, the specificity was 87%, the sensitivity was 72%, the positive predictive value was 73%, and the negative predictive value was 86%. Although the above percentages indicate the efficiency of the correlation method, we excluded all the false responsive cells from the analysis. Therefore, the fractions of cells in the graphs are the true responsive cells with no contamination of the non-responsive cells. We also modified Figures 7H and I to match the other data sets obtained from awake animals. Therefore, Figures 7H and I no longer show the average of the responsive cells. Instead, they show the % of different fractions of responsive cells within each cellular motif (modules and matrix). Accordingly, we believe that there is no need to include a rate line on the graph. We added the section describing the validation part to the methods section [lines 808-815].
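
      The validation metrics named above can be sketched as follows. This is a minimal illustration, not the analysis code used for the manuscript: the function names, the example scores, and the human labels are all hypothetical, and only the 0.4 cutoff comes from the response.

```python
def confusion_counts(scores, labels, cutoff):
    """Count TP, FP, TN, FN when cells with score >= cutoff are flagged responsive.

    scores: correlation coefficient per cell (hypothetical values in the usage note).
    labels: True where the blinded human rater called the cell responsive.
    """
    tp = fp = tn = fn = 0
    for score, is_responsive in zip(scores, labels):
        flagged = score >= cutoff
        if flagged and is_responsive:
            tp += 1
        elif flagged and not is_responsive:
            fp += 1
        elif not flagged and is_responsive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from the four confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(scores, labels):
    """Probability that a random responsive cell outscores a random
    nonresponsive cell (ties count as 0.5); equals the area under the ROC curve."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

      For example, with made-up scores `[0.9, 0.5, 0.3, 0.6, 0.2, 0.1]`, the first three cells labeled responsive, and a cutoff of 0.4, `confusion_counts` returns two true positives, one false positive, two true negatives, and one false negative. The sketch assumes every denominator is nonzero, which holds only when both classes and both decisions occur in the data.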

      (7) Figure 7: Please clarify what is meant by a cell responding to 'both responding to somatosensory and auditory stimulation'. Does it mean that the cell has responses to both auditory and somatosensory stimulation when presented individually or if it responds to both presented together? If it is the former, I don't understand how the number to both can be higher than the number of somatosensory alone (as both requires it also to respond to somatosensory alone). If it is the latter (combined auditory and somatosensory) then it seems that somatosensory inputs remove the responsiveness of most cells that were otherwise responsive to auditory alone (e.g. in the module while 42% respond to sound alone, combined stimulation would leave only 10% of cells responsive). Please clarify what exactly the authors are plotting and stating here.

      Thank you for the reviewer’s comment. The responsive cells in Figure 7 are divided into three categories. Each category has a completely different group of cells. The first category is for the cells responding only to auditory stimulation (auditory-selective cells or Aud-sel). The second category is for the cells that respond only to somatosensory stimulation (somatosensory selective cells or Som-sel). The third category is for the cells that respond to both auditory and somatosensory stimulations when both stimulations are presented individually (auditory/somatosensory nonselective cells or Aud/Som-nonsel). Accordingly, the number of cells may be different across all these categories. We have clarified this part in the text [lines 299-303]. We have modified Figures 7, 8, and 11 (old Figure 10) to match the data from anesthetized and awake animals, so Figures 7H and I now show the collective % of the cells from all animals within modules vs matrix.
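
      Because the three categories are mutually exclusive, the count of Aud/Som-nonsel cells is not bounded by the count of Som-sel cells. A minimal sketch of this bookkeeping is given below; the function names and the choice of denominator (all responsive cells) are assumptions made for illustration, and the category labels are the ones used in the response.

```python
from collections import Counter


def classify_cell(responds_to_auditory, responds_to_somatosensory):
    """Assign one mutually exclusive label from two single-modality tests."""
    if responds_to_auditory and responds_to_somatosensory:
        return "Aud/Som-nonsel"
    if responds_to_auditory:
        return "Aud-sel"
    if responds_to_somatosensory:
        return "Som-sel"
    return "Non-responsive"


def category_percentages(cells):
    """Percentage of each category among responsive cells.

    cells: iterable of (responds_to_auditory, responds_to_somatosensory) pairs.
    The denominator (responsive cells only) is an assumption of this sketch.
    """
    counts = Counter(classify_cell(aud, som) for aud, som in cells)
    responsive = sum(n for label, n in counts.items() if label != "Non-responsive")
    if responsive == 0:
        return {}
    return {
        label: 100.0 * n / responsive
        for label, n in counts.items()
        if label != "Non-responsive"
    }
```

      With hypothetical data of four auditory-only cells, one somatosensory-only cell, three cells passing both tests, and two passing neither, the nonselective group is larger than the somatosensory-selective group, which is the pattern the reviewer asked about.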

      (8) Why are the inferential statistics used in Figure 9F (chi-square test) and Figure 5A-C (t-test) when it tests the same thing (the only difference is one is anaesthetised data and the other awake)? Indeed, all Figure 9 and 10 (awake data figures) plots use chi-square tests to test differences in percentages instead of t-tests used in earlier (anaesthetised data figures) plots to test differences in percentages between groups. Please clarify the reason for this change in statistics used for similar comparisons.

      Thank you for the reviewer’s comment. Imaging the LC via microprism from awake animals confirmed the ability to run this technique with no interference to the ambulatory functions of the animals. Therefore, the main goal was to highlight the similarities between the data obtained from awake and anesthetized setups by highlighting the comparison between the LC and DC or between modules and matrix within each preparation (anesthetized vs awake). Accordingly, the statistics used to run these comparisons were chosen based on the number of the tested animals at each setup (7 anesthetized animals and 3 awake animals for prism insertion). The low number of animals used for awake data made us use the number of cells collectively from all animals instead of the number of animals, so we used the Chi-square test to examine the differences in percentages.
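
      A comparison of pooled cell percentages of this kind can be sketched as a Pearson chi-square test on a 2x2 table. The code below is only an illustration under stated assumptions: the function name and the counts in the usage note are hypothetical, no continuity correction is applied, and each cell is treated as an independent observation.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    For example, a and b could be responsive and nonresponsive cell counts in
    one region, and c and d the same counts in the other region.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

      As a usage example with made-up counts, `chi_square_2x2(30, 70, 50, 50)` compares 30% against 50% responsive cells in two pools of 100 cells each. The closed-form p-value applies only to the 2x2 case; larger tables need a general chi-square survival function.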

      (9) Figure 10H: The main text describes the results shown here as similar to what was seen in anaesthetised animals. But it looks to me like the results in awake animals are qualitatively different from the multisensory interaction seen in anaesthetised animals. In anaesthetised animals the authors find that there is a higher chance of auditory responses being enhanced by somatosensory inputs when cells are in the modules compared to in the matrix. However, in awake data, this relationship is flipped, with more bimodal enhancement found in the matrix compared to the modules. Furthermore, almost all cells in the modules are suppressed by combined somatosensory input which looks like it is different from what is found in anaesthetised mice and what is described in the discussion: 'we observed that combined auditory-somatosensory stimulation generally suppressed neural responses to auditory stimuli and that this suppression was most prominent in the LC matrix'.

      Thank you for the reviewer’s comment. Our statement was meant to show how the data obtained from awake and anesthetized animals were generally similar. However, we agree that the statement may not be suitable due to the possible differences between awake and anesthetized animals. To address a fair comparison between the anesthetized and awake preparations, we ran similar statistics and graphs for Figures 7, 8, and 11 (old Figure 10). Given that the areas occupied by modules and matrix are different across animals due to the irregular shape of the modules, we chose to run a chi-square test for all the data to quantify the collective % of responding cells within modules vs matrix from all tested animals for each experimental setup (anesthetized vs awake). The anesthetized and awake animals similarly showed that modules and matrix had higher fractions of auditory responsive cells. However, matrix had more cells responding to auditory stimulations than modules, while modules had more cells responding to somatosensory stimulation than matrix. In contrast, while the anesthetized animals showed higher fractions of offset auditory-responsive cells, which were mostly clustered in the modules, the offset auditory-responsive cells were very rare in awake animals (6 cells/one animal).

      Based on the fractions of cells with suppressed or enhanced auditory response induced by bimodal stimulation, the data obtained from anesthetized and awake animals showed that the auditory response in the matrix was suppressed more than enhanced by bimodal stimulation. In contrast, modules had different profiles across the experimental setups and locations. For instance, the modules imaged via microprism in the anesthetized and awake animals showed suppressed more than enhanced auditory responses, but modules imaged from the dorsal surface in anesthetized animals showed enhanced more than suppressed auditory responses. Additionally, modules had less suppressed and more enhanced auditory responses compared to matrix in the anesthetized animals regardless of the location of the modules (microprism or dorsal surface). Yet, modules from awake animals had more suppressed and less enhanced auditory responses compared to matrix. We have addressed these differences in the results and discussion section.

      Additional minor comments that I think the authors could use to aid their manuscript clarity:

      (1) The figure colour selection - especially in Figures 7 and 8 - is really hard to tell apart. Please choose more distinct colours, and a colour scheme that is appropriate for colour blind readers.

      Thank you for the reviewer’s suggestion. We have noticed that the magenta color assigned for the cells with offset responses was very difficult to distinguish from the black background. We have changed the magenta color to green to be different from the color of other cells. Using Photoshop, we chose a color scheme that is suitable for color-blind readers in all our maps.

      (2) The sentence in lines 331-334 should be rephrased for clarity.

      Thank you for the reviewer’s suggestion. We have rephrased the statement for clarity [lines 364-371].

      Reviewer #2 (Recommendations For The Authors):

      As mentioned in the public review the strong clustering evident in some of the maps (some of which may be related to module/matrix differences but certainly not all of it) seems worth scrutinizing further. Would we expect such a strong spatial segregation of auditory responsive and non-responsive neurons? Would we expect response properties (e.g. off-responsiveness) other than frequency tuning to show evidence of a topographic arrangement in the IC? In addressing this it would, of course, be important to rule out that this clustering is not down to some trivial experimental variables and truly reflects functional organization. For instance, are the patches of non-responsive neurons found in parts of the field of view with poor visibility, poor labelling, etc which may explain why it is difficult to pick up responses there? Are the neurons in non-responsive areas otherwise active (i.e. do they show spontaneous activity) or could they be 'dead'? Could the way neuropil signals are dealt with play a role here (it is weighted by 0.4 which strikes me as quite low)? In relation to this, I am also wondering to what extent the extreme overrepresentation (Figure 4) of neurons with a BF of 5kHz (some of this is, of course, down to the fact that the lower end of the frequency range was 5kHz and that the step size was 0.5 octaves), especially in the DC, is to be interpreted.

      Thank you for the reviewer’s comment. Before analysis, the ROIs of all cells were set around the cell bodies using the jRGECO1a signals as a reference, so all cells (responsive and nonresponsive) were collected from areas of good visibility of jRGECO1a signals. In other words, no cells were collected from regions having poor jRGECO1a signals. In Figures 7, 8, and 11 (old Figure 10), the cells showed response either only to unmodulated broadband noise at 80 dB as an auditory stimulus or to whisker deflection with specific speed and power as a somatosensory stimulus. Given that the two stimuli above had specific parameters, the remaining non-responsive cells may respond to auditory or somatosensory stimulations with other features. For instance, some nonresponsive cells to the unmodulated broadband noise were responding to pure tones with different amplitudes and frequencies or to different AM-noise with different amplitudes and modulation frequencies. Also, these nonresponsive cells may not respond to any of our tested stimuli and may respond to other sensory stimulations. Some of the non-responsive cells showed spontaneous activity when no stimulations were presented. However, we cannot rule out the possibility that some of these nonresponsive cells may not be viable. We have addressed the clustering properties in the revised version of the manuscript in the corresponding spots of the results and discussion sections. We have added a new supplementary figure (Figure 11-figure supplement 1) to show how the nonresponsive cells to the unmodulated noise may respond to other types of sound and to show the spontaneous activity of some non-responsive cells.

      For the neuropil, previous reports used the contamination factor (r) in a range of 0.3-0.7 (we referenced these studies in the method section [line 776]) based on the tissue or cells imaged, vasculature, and the objective used for imaging. Therefore, we optimized the contamination factor (r) to be 0.4 through a preliminary analysis based on the tissue we image (LC), and the objective used (16x with NA = 0.8 and 3 mm as a working distance).
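
      The role of the contamination factor can be sketched as below, assuming the commonly used subtraction form F_corrected = F_roi - r * F_neuropil. The response does not spell out the exact formula, so that form, the function name, and the trace values in the usage note are assumptions made for illustration; only r = 0.4 comes from the text.

```python
def correct_neuropil(roi_trace, neuropil_trace, r=0.4):
    """Subtract a scaled neuropil trace from the ROI trace, frame by frame.

    roi_trace: raw fluorescence of the cell body ROI, one value per frame.
    neuropil_trace: fluorescence of the surrounding neuropil, same length.
    r: contamination factor (0.4 in the response; 0.3-0.7 in earlier reports).
    """
    if len(roi_trace) != len(neuropil_trace):
        raise ValueError("traces must have the same number of frames")
    return [f_roi - r * f_np for f_roi, f_np in zip(roi_trace, neuropil_trace)]
```

      For instance, `correct_neuropil([10.0, 12.0], [5.0, 10.0])` removes 2.0 and 4.0 fluorescence units from the two frames. A larger r removes more of the surrounding signal, so the choice of r trades residual contamination against the risk of subtracting genuine somatic signal.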

      We agree that there is an overrepresentation of 5 kHz as the best tuning frequency for DC cells. The previous report (A. B. Wong & Borst, 2019) showed a large zone of the DC where cells were tuned to (2-8 kHz). Given that 5kHz was the lowest tested frequency in our experiment, we think that the low-frequency bias of the DC surface is consistent between studies. This finding also could be supported by the electrophysiology data obtained by spanning the recording electrodes through the IC tissue along the dorsoventral axis. In those experiments, the cells were tuned to lower frequencies at the dorsal surface of the IC.
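
      The frequency grid implied by a 5 kHz lower bound and a 0.5-octave step can be sketched as follows. The function name and the number of steps are assumptions, since the upper end of the tested range is not stated in this response.

```python
def half_octave_frequencies(start_khz=5.0, step_octaves=0.5, n_steps=4):
    """Frequencies in kHz spaced by a fixed octave step, rounded to 0.1 kHz.

    n_steps is a hypothetical value chosen for illustration only.
    """
    return [round(start_khz * 2 ** (i * step_octaves), 1) for i in range(n_steps)]
```

      The first two values of this grid are 5 and 7.1 kHz, the two frequencies named in the response to minor comment (2). With so coarse a grid, any cell whose true best frequency lies at or below 5 kHz is necessarily assigned to the 5 kHz bin.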

      We have changed the magenta-colored cells to green ones, so it will be easier to identify the cells. As required by another reviewer, we changed the color palettes of some images and cellular maps to be suitable for color-blind readers.

      The manuscript would benefit from more precise language in a number of places, especially in the results section.

      Line 220/221, for instance: "... a significant fraction of cells that did not respond to pure tones did respond to AM-noise" Strictly speaking, this sentence suggests that you considered here only the subset of neurons that did not respond to pure tones and then ran a test on that subset. The test that was done seems to suggest though that the authors tested whether the percentage of responsive cells was greater for pure tones or for AM noise.

      Thank you for the reviewer’s comment. We do apologize for the confusion. In the revised manuscript, we categorized the cells according to their response into cells responding to pure tone only (tone-selective cells or Tone-sel), AM-noise only (noise-selective cells or Noise-sel), and to both pure tone and AM-noise (nonselective cells or Non-sel). We have modified Figure 5 accordingly. We did the same thing for the data obtained from awake animals and showed that in a new figure to easily match the analysis done for the anesthetized animals.

      Please refer to the figure panels in the text in consecutive order. 2B, for instance, is mentioned after 2H.

      Thank you for the reviewer’s comment. Throughout the paper, we kept the consecutive order of the figure panels within each figure to be in a smooth flow with the text. Figure 2 was the only exception, for a good reason. Figure 2 is a complex one that includes many panels to show a parallel comparison between LC imaged via microprism and DC through single photon images, two-photon images, validating laser lesioning, and histology. Accordingly, we navigated many panels of the figure to efficiently highlight the aspects of this comparison. We prefer to keep Figure 2 as one figure with its current format to show this parallel comparison between LC and DC.

      The legend for Figure 2 could be clearer. For instance, there are two descriptions for panel D. Line 1009: "(C-E)" [i.e. C, D, E] and line 1010: "(D and F)".

      Thank you for the reviewer’s comment. It should be C and E, not C-E. We have fixed the mistake [line 1224].

      Line 275: What does 'with no preference' mean?

      Thank you for the reviewer’s comment. We do apologize for the confusion. There are three categories of cells. Some cells respond only to auditory stimulation, while others respond to only somatosensory stimulation. However, there is another group of cells that respond nonselectively to auditory and somatosensory stimulations or Aud/Som-nonsel cells. We edited the sentence to be clearer [lines 303-304].

      Line 281 (and other places): What does 'normalized against modules' mean?

      Thank you for the reviewer’s comment. This normalization was done by dividing the number of responsive cells of the same response type in the matrix by that in the modules. Therefore, the value taken by modules was always 1 and the value taken by the matrix is something around 1. Accordingly, the value for matrix could be > 1 if matrix had more cells than modules. In contrast, the value of matrix would be < 1 if matrix had fewer cells than modules. In the revised version, we used this normalization method to make the revised Figures 5C and 10C to describe the cell fractions responding to pure tone only, AM-noise only, or to both stimuli in the matrix vs modules. 
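
      The normalization described above amounts to a single ratio, sketched below. The function name and the counts in the usage note are hypothetical and serve only to illustrate the arithmetic.

```python
def normalize_against_modules(modules_count, matrix_count):
    """Express the matrix count relative to the modules count.

    Both counts refer to responsive cells of the same response type.
    Modules are fixed at 1.0 by construction; a matrix value above 1 means
    more such cells in the matrix, below 1 means fewer.
    """
    if modules_count == 0:
        raise ValueError("cannot normalize against zero cells in the modules")
    return {"modules": 1.0, "matrix": matrix_count / modules_count}
```

      With made-up counts of 20 cells in the modules and 30 in the matrix, the normalized matrix value is 1.5; with 10 cells in the matrix it is 0.5. The ratio is undefined when no cell of that response type is found in the modules, which the sketch reports as an error rather than returning a value.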

      Sentence starting on line 288. I don't find that point to be as obvious from the figures as the sentences seem to suggest. Are we to compare magenta points (auditory off cells) from 7C with green points in 7F?

      Thank you for the reviewer’s comment. We came to this conclusion based on our visual comparison of magenta points (now green in the revised version to increase the visibility) representing the auditory offset cells in Figure 7C and the green points in Figure 7F representing the cells responding to both somatosensory and auditory stimulations. In the revised manuscript, we statistically examined whether the percentages of onset and offset auditory responses differ among the cells responsive to both somatosensory and auditory stimulations in the modules vs matrix. We have found that most of the cells responding to both somatosensory and auditory stimulations inside the modules had offset auditory responses, which could indicate a level of multisensory integration between somatosensory input and the offset auditory responses in these cells. We have added the statistical results to the revised manuscript to address this effect [lines 312-317].

      Lines 300-302: "These data suggest that the module/matrix system permits preservation of distinct multimodal response properties in the face of massive integration of inputs in the LC". First, I'm not quite sure what that sentence means. Second, it would be more appropriate for the discussion. Third, the fact that we are more likely to find response enhancement in the modules than in the matrix is nicely consistent with the idea (supported by work from the senior author's lab and others) that excitatory somatosensory input predominantly targets neurons in the modules (which is why we see mostly response enhancement in the modules) and that this input targets GABAergic neurons which then project to and inhibit neurons both outside and inside of their module. Therefore, I would recommend that the authors replace the aforementioned sentence with one that interprets these results in light of what we know about this somatosensory-auditory circuitry.

      Thank you for the reviewer’s comment. Despite the massive multimodal inputs that the LC receives from auditory and nonauditory regions, the module/matrix system is a platform for distinct multimodal responses, as indicated by more somatosensory-responsive cells in the modules versus more auditory-responsive cells in the matrix, which matches the anatomical differences that were reported before. We edited the sentence in light of the comparison between the data obtained from awake and anesthetized animals and moved it to the discussion section [lines 503-506].

      The term 'LC imaged via microprism' is used dozens of times throughout the manuscript. Replacing it with a suitable acronym or initialism could improve the flow of the text and would make some of the sentences less cumbersome.

      Thank you for the reviewer’s suggestion. We changed the term “LC imaged via microprism” into LC (microprism) throughout the revised manuscript.

      5A-C: It is unclear what is being compared here. What are the Ns? Different animals?

      Thank you for the reviewer’s comment. We do apologize for this missing information. We have added the number of subjects used in every statistical test in each corresponding figure legend.

      5G: minus symbol missing on the y-axis.

      Thank you for the reviewer’s comment. We gladly have fixed that.

      Figure 6: Are these examples or population averages?

      Thank you for the reviewer’s question. Every figure panel of the old Figure 6 represents a single trace of an example cell. However, we modified Figure 6 to include more examples of cells showing different responses complying with another reviewer’s suggestion. Each panel of the new Figure 6 represents the average response of 5 stimulations of the corresponding stimulus type. We preferred to show the average signal because it was the one used for the subsequent analysis.

      How are module borders defined?

      Thank you for the reviewer’s question. The modules were defined based on the intensity of the green channel, which shows the expression of the GFP signal. The boundaries of the modules were drawn where the GFP signal changed from high to low intensity. This step was done before data analysis to avoid any bias.

      7JKL: How are these to be interpreted? Does panel 7K, for instance, indicate that the fraction of neurons showing 'on' responses was roughly twice as large in the matrix than in the modules and that the fraction of neurons showing 'off' responses was roughly 10 times larger in the modules than in the matrix (the mean seems to be at about 1/10).

      Thank you for the reviewer’s comment. The data represented by Figures 7J-L defined the normalization of the number of cells of the same response type in the matrix against the modules. This normalization was done per animal, and then the data of the matrix were plotted against the normalization line at 1 representing the modules. The matrix will be claimed to have more cells than modules if the median of the matrix values > 1. In contrast, the matrix will be claimed to have fewer cells than the modules if the median of the matrix values < 1. Finally, if the median of matrix values = 1, this means there is no difference between matrix and modules. However, to match the data obtained from anesthetized animals (Figures 7 and 8) with those obtained from awake animals (Figure 11 or old Figure 10), we ran all data through the Chi-square test in the revised manuscript. Therefore, the format of Figures 7K-L was changed in the revised manuscript, so they became new Figures 7I-K.
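      The per-animal normalization described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example counts are hypothetical, and it assumes that each animal contributes one matrix value and one module value for a given response type. It only reproduces the descriptive comparison of the median ratio against 1; it is not a statistical test (the revised manuscript uses the Chi-square test instead).

```python
from statistics import median

def matrix_to_module_ratios(counts_per_animal):
    """Normalize matrix counts against module counts, one ratio per animal.

    counts_per_animal: list of (matrix_count, module_count) tuples,
    one tuple per animal (hypothetical input format).
    Animals with a module count of zero are skipped to avoid division by zero.
    """
    return [matrix / module for matrix, module in counts_per_animal if module > 0]

def compare_matrix_to_modules(ratios):
    """Compare the median ratio with the normalization line at 1 (the modules)."""
    m = median(ratios)
    if m > 1:
        return "matrix has more cells than modules"
    if m < 1:
        return "matrix has fewer cells than modules"
    return "no difference between matrix and modules"

# Hypothetical counts for three animals: (matrix, modules)
example = [(40, 20), (30, 20), (50, 25)]
ratios = matrix_to_module_ratios(example)
verdict = compare_matrix_to_modules(ratios)
```

      With these made-up counts, every ratio exceeds 1, so the sketch reports more cells in the matrix than in the modules for that response type.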

      10A suggests that significantly more than half the neurons shown here are not auditory responsive. Perhaps I am misinterpreting something here but isn't that in contrast to what is shown in panel 9F?

      Thank you for the reviewer’s comment. The data shown in Figure 10A (or revised Figure 11A) represents the cellular response to only one stimulus (broadband noise at 80 dB with no modulation frequency), while Figure 9F (revised 10B) represents the cells responding to varieties of auditory stimulations of different combinations of frequencies and amplitudes (pure tones) as well as to AM-noise of different amplitudes and modulation frequencies. Accordingly, the old Figure 9F or revised Figure 10B shows different cell types based on their responses. For instance, some cells respond only to pure tone. Others respond only to AM-noise or to both pure tones and AM-noise. This may also support that the nonresponsive cells in Figure 10A (revised 11A) can respond to other types of sound features.

      The way I understood panels 7L and 8K there were more suppressed neurons in the matrix than in the modules (line 296: "cells in the modules had a higher odds of having an enhancement response to bimodal stimulation than matrix, while cells in the matrix had a higher odds of having a suppressive response to bimodal stimulation"). Now, panel 10F indicates that in awake mice there is a greater proportion of suppressed neurons in the modules than in the matrix. I may very well have overlooked or misread something but I may not be the only reader confused by this so please clarify.

      Thank you for the reviewer’s comment. We do apologize for this confusion. The ambiguity between Figures 7 and 8 (anesthetized animals) as well as Figure 10 (awake animals) comes from the fact that different statistics have been used for each preparation. In the revised version, we have fixed that by running the same statistics for all the data, and we accordingly revised Figures 7, 8, and 10 (new Figure 11). In brief, the matrix preserves a higher percentage of cells with suppressed auditory responses than those with enhanced auditory responses induced by bimodal stimulation in all conditions (anesthetized vs awake). In contrast, modules act differently across all tested conditions. While modules had more cells with enhanced auditory responses induced by bimodal interaction in anesthetized animals, they had more cells with suppressed response in awake animals indicating that modules could be sensitive to the effect of anesthesia compared to matrix. We addressed this effect in the discussion of the revised manuscript [lines 521-553].

      Line 438: ...as early AS...

      Thank you for the reviewer’s comment. We gladly fixed that [line 512].  

      Reviewer #3 (Recommendations For The Authors):

      My minor recommendations for the authors are as follows:

      (1) The text can be a bit difficult to follow in places. This is partly due to the convoluted nature of the results, but I suggest a careful read-through to look for opportunities to improve the prose. In particular, there is a tendency to use long sentences and long paragraphs. For example, the third paragraph of the introduction runs for almost fifty lines.

      Thank you for the reviewer’s comment and suggestion. We have fixed that.

      (2) This might be due to journal compression, but some of the bar graphs in the figures are difficult to read. For example, the individual data points, especially when filled with striped background colors get lost. Axes can become invisible, like the y-axis in 7L, and portions of bars, like in 7F, are not always rendered correctly. Error bars are sometimes hidden behind data points, as in 5C. Increasing line thickness and shifting individual data points away from error bars might help with this.

      Thank you for the reviewer’s comment and suggestion. We made all the data points with black color and filled circles to make the data points visible. We put all the data points behind the main columns, so they don’t block the error bars. We have fixed figures 7 and 5.

      (3) Throughout the manuscript, the authors use a higher SMI to indicate a preference of cells for auditory stimuli with "greater spectral... complexity" (e.g., lines 219 and 372). I find this interpretation a bit challenging since SMI compares a neuron's preference for tones over noise, and to me, tones seem like the least spectrally complex of all auditory stimuli. Perhaps some clarification of what the authors mean by this would help. For example, is the assumption that a neuron that prefers tones over noise is, either directly or indirectly, receiving input sculpted by inhibitory processes?

      Thank you for the reviewer’s comment. In general, higher SMI values indicate an increase in the preference of the cells to respond to pure tones than noise with no modulation (less spectral complexity). We will clarify this statement throughout the manuscript. However, the SMI value was not mentioned in lines 219 and 372. The statement mentioned in line 219 describes the revised figure 5C (old 5B), where more cells in matrix specifically respond to AM-noise compared to modules, which indicates the preference of the matrix to respond to sounds of greater spectral and temporal complexity. The statement in 372 in the discussion section refers to the finding in revised figures 5E and F (old 5D and E). In the revised figure 5E or old 5D, the data show that matrix has more cells responding to pure tones or noise with no modulation than modules, so matrix has a lower threshold to detect the spectral features of sound (revised figure 5E or old 5D). In the revised figure 5F or old 5E, the data show that matrix has more cells responding to AM-noise than modules, which indicates that matrix functions more to process the temporal features of sound. As explained above, all findings were related to the percentage of cells responding to specific sound stimuli and not the exact SMI values. We have revised the figures accordingly by removing the terms SMI and TMI from the figures, and we have clarified that in the text.

      (4) Lines 250-253: How does a decrease in SMI correspond to "an increase in pure tone responsiveness?" Doesn't a decrease suggest the opposite?

      Thank you for the reviewer’s comment, which we agree with. We do apologize for that. We have fixed this statement [lines 275-277] and any related findings accordingly.

      (5) Line 304: Add "imaged via microprism" or similar after "response profiles with the LC.".

      Thank you for the reviewer’s suggestion. We have fixed that. However, we changed the term “LC imaged via microprism” into “LC(microprism)” for simplicity as suggested by another reviewer [line 330].

      (6) Figure 5A and C: Both plots show that more neurons responded to AM-noise than tones, but it would be interesting to know how much the tone-responsive and AM-noise responsive populations overlapped. Were all tone-responsive neurons also responsive to AM-noise?

      Thank you for the reviewer’s comment. We have categorized the cells based on their response to pure tone only, AM-only, and both pure tone and AM-noise when each stimulus is presented individually. We have modified Figures 5A and C, and they are now Figures 5B and D.

      (7) Figure 5G: Missing negative sign before "0.5.".

      Thank you for the reviewer’s suggestion. We gladly have fixed that. However, old Figure 5G became a revised Figure 5H.  

      (8) Figure 7 legend, Line 1102: Missing period after "(C and E)".

      Thank you for the reviewer’s suggestion. We think that the period should be placed before (C and E) at the end of “respectively”. The parentheses refer to the statements after them. We gladly fixed that. [line 1394]

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This study presents valuable findings as it shows that sleep rhythm formation and memory capabilities depend on a balanced and rich diet in fly larvae. The evidence supporting the claims of the authors is convincing with rigorous behavioral assays and state-of-the-art genetic manipulations. The work will be of interest to researchers working on sleep and memory. 

      Public Reviews: 

      Summary: 

      This manuscript investigates how energetic demands affect the sleep-wake cycle in Drosophila larvae. L2 stage larvae do not show sleep rhythm or long-term memory (LTM), whereas L3 larvae do. The authors manipulate food content to provide insufficient nutrition, which leads to more feeding, no LTM, and no sleep even in older larvae. Similarly, activation of NPF neurons suppresses sleep rhythm. Furthermore, they try to induce a sleep-like state using pharmacology or genetic manipulations in L2 larvae, which can mimic some of the L3 behaviours. A key experimental finding is that activation of DN1a neurons activates the downstream DH44 neurons, as assayed by GCaMP calcium imaging. This occurs only in third instar and not in second instar, in keeping with the development of sleep-wake and feeding separation. The authors also show that glucose metabolic genes are required in Dh44 neurons to develop sleep rhythm and that DH44 neurons respond differently in malnutrition or younger larvae. 

      Strengths: 

      Previous studies from the same lab have shown that sleep is required for LTM formation in the larvae, and that this requires DN1a and DH44 neurons. The current work builds upon this observation and addresses in more detail when and how this might develop. The authors show that exposure to low-quality food and enhanced feeding during the larval stage of Drosophila affect the formation of sleep rhythm and long-term memory. This suggests that the development of sleep and LTM is only possible when fly larvae are well fed with balanced nutrition. Non-sleeping larvae were fed in low-sugar conditions and indeed, the authors also find glucose metabolic genes to be required for a proper sleep rhythm. The paper presents precise genetic manipulations of individual classes of neurons in fly larvae followed by careful behavioural analysis. The authors also combine thermogenetic or peptide bath application experiments with direct calcium imaging of specific neurons. 

      Weaknesses: 

      The authors tried to induce sleep in younger L2 larvae, however the behavioral results suggest that they were not able to induce proper sleep behaviour as in normal L3 larvae. Thus, they cannot show that sleep during L2 stage would be sufficient to form LTM. 

      We agree that the experiments with Gaboxadol feeding in L2 did not perfectly mimic L3 sleep behaviors. However, genetic induction of sleep in L2 was effective in increasing sleep duration and depth similar to that observed in normal L3. As noted below in response to specific reviewer comments, because gaboxadol feeding is standard in the field for adult sleep induction, we prefer to still include this data in the manuscript for transparency. Moreover, the gaboxadol manipulation did cause a significant decrease in arousal threshold compared to control larvae. Together these approaches support the hypothesis that sleeping more/more deeply is not sufficient to promote LTM in L2.

      The authors suggest that larval Dh44 neurons may integrate "information about the nutritional environment through the direct sensing of glucose levels to modulate sleep-wake rhythm development". They identify glucose metabolism genes (e.g., Glut1) in the downstream DH44 neurons as being required for the organization of the sleep-wake-feeding rhythm, and that CCHa signaling in DN1a signaling to the DH44 cells via the receptor. However, how this is connected is not well explained. Do the authors think that the nutrient sensing is only occurring in the DH44 neurons and not in DN1a or other neurons? Would not knocking down glucose metabolism in any neuron lead to a functional defect? What is the evidence that Dh44 neurons are specific sensors of nutritional state? For example, do the authors think that e.g. the overexpression of Glut1 in Dh44 neurons, a manipulation that can increase transport of glucose into cells, would rescue the effects of low-sugar food? 

      We thank the reviewer for these suggestions and have added the experiment proposed. We found that knockdown of Hex-C in DN1a neurons did not disrupt sleep-wake rhythms (Fig. S4G-I), suggesting that Dh44 neurons are specialized in requiring glucose metabolism to drive sleep-wake rhythms. We have also added further clarification in the text regarding the existing evidence that Dh44 neurons act as nutrient sensors.

      Some of the genetic controls seem to be inconsistent suggesting some genetic background effects. In Figure 2B, npf-gal4 flies without the UAS show no significant circadian change in sleep duration, whereas UAS-TrpA flies do. The genetic control data in Figure 2D are also inconsistent. Npf-Gal4 seems to have some effect by itself without the UAS. The same is not seen with R76G11-Gal4. Suppl Fig 2: Naïve OCT and AM preference in L3 expressing various combinations of the transgenes show significant differences. npf-Gal4 alone seems to influence preference. 

      The sleep duration and bout number/length data are highly variable. 

      All experiments are performed in an isogenized background, so the variability seen in genetic controls likely reflects the stochastic nature of behavioral experiments. Indeed, adult sleep data also show a great deal of variability within the same genetic background (PMID: 29228366). We agree it is an important point, and we attempt to minimize variability as much as possible by backcrossing flies and tightly controlling environmental conditions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Low sugar exposure and activation of NPF neurons might not induce the same behavioral changes. LS exposure does not enhance mouth hook movements, but overall food intake. NPF activation seems to enhance mouth hook movements, but the data for food intake is not shown. This information would be necessary to compare the two different manipulations. 

      We thank the reviewer for this suggestion. However, we elected not to perform food intake experiments with the NPF activation experiments. Since we are not directly comparing the low sugar and NPF manipulations to each other, we think that both experiments together support the conclusion that immature food acquisition strategies (whether food intake or feeding rate) limit LTM performance. 

      The authors write that the larval feeding assays run for 4 hours, can they explain why that long? Larvae should already have processed food within 4 hours, so that the measurement would not include all eaten food.

      We clarified the rationale for doing 4 hour feeding assays in the results section. We did 4 hours on blue dyed food because initial experiments of 1 hour with control L3 at CT1-4 were difficult to interpret. The measurement does not include all of the eaten food in the 4 hours but does reflect more long-term changes in food intake.

      Sleep induction with Gaboxadol seems to not really work - sleep duration, bout number and length are not enhanced, and arousal threshold is only slightly lower. Thus, the authors should not use this data as an example for inducing sleep behaviour. 

      We agree this approach did not have a large effect in larvae. However, because gaboxadol feeding is standard in the field for adult sleep induction, we prefer to still include this data in the manuscript for transparency. Moreover, the Gaboxadol manipulation did cause a mild (but significant) decrease in arousal threshold compared to control larvae. Gaboxadol feeding also caused a significant decrease in total body weight compared to control larvae indicating that even slightly deeper sleep could be detrimental to younger animals.

      Activation of R76G11 with TrpA1 seems to work better for inducing sleep like behaviour. However, the authors describe that they permanently activated neurons. To induce a "normal" sleep pattern, the authors might try to only activate these neurons during the normal enhanced sleep time in L3 (CT13?) and not during the whole day. This might also allow larvae to eat during day time and gain more weight. 

      We apologize that this point was not clearer, but we did do acute activation of R76G11(+) neurons, as proposed by the reviewer. We have clarified the text to make this point.

      It would be interesting to see how larvae fed with high sucrose and low protein diet would behave in this assay. Do the authors suggest that sugar is most important for the development of sleep behaviour or that it is a combination of sugar and protein that might be required? 

      We agree that feeding larvae a high sucrose and low protein diet would be interesting. However, we initially tried a low protein diet and observed significant developmental delays. Therefore, we are concerned that developmental defects on a high sucrose and low protein diet would confound behavioral results. Additionally, the Dh44 manipulations (glucose & GCN2 signaling) suggest that sugar is the most important for the development of sleep behaviors.

      Reviewer #3 (Recommendations For The Authors): 

      The authors could discuss whether the interaction between DN1a clock neurons and Dh44 neurons is mediated synaptically or by volume transmission following the extracellular release of the CCHa1 neuropeptide. They write that "the development of Dh44 neuronal competency to receive clock-driven cues" and that "DN1a clock neurons anatomically and functionally connect to Dh44" but a discussion about volume vs. synaptic signalling would be of interest. 

      We thank the reviewer for this suggestion. We revised the discussion to address this point.

      line 223 " demonstrating that post-synaptic processes likely". It would be interesting to read a discussion on whether it is known if these are postsynaptic or peptide-mediated volume effects? 

      We added additional text to the discussion to address these points.

      - The authors may want to include a schematic of the circuit and its position in the general anatomy of the fly larva. 

      We thank the reviewer for this suggestion. We have added a model figure to Fig. S6.

      "Dh44 neurons act through glucose metabolic genes" - consider rewording e.g. require glucose metabolic genes 

      We revised the text.

      - line 45 "Early in development, young animals must obtain enough nutrients to ensure proper growth" - this is too general, many animals do not feed in early life-cycle stages (e.g. lecithotrophic development), consider rewording 

      We revised the text to be more specific.

      - line 90 "however, L3 at CT1 consume more than L3 at CT12 (Figure S1A)" - typo CT13, also consider rewording to match the structure of the sentence before 'however, L3 consumed more at CT1 than at CT13' 

      We revised the text to fix this error.

      - Line 111 "and loss of deep sleep" - how is deep sleep defined and measured in the larvae? It is not clear from the data or the text. 

      We revised the text to define deep sleep in the results section. We also have a description of how arousal threshold is calculated in the methods.

      - In Figure 3B and G the individual data points are not shown 

      We did not show individual data points for those graphs because we are plotting the average percentage of 4 biological replicates.

      Typo: 

      Figure 1 legend "F, n= n=100-172 " 

      We revised the text to fix this typo.

    1. Reviewer #2 (Public Review):

      This manuscript is motivated by the question of what mechanisms cause overyielding in mixed-species communities relative to the corresponding monocultures. This is an important and timely question, given that the ultimate biological reasons for such biodiversity effects are not fully understood.

      As a starting point, the authors discuss the so-called "additive partitioning" (AP) method proposed by Loreau & Hector in 2001. The AP is the result of a mathematical rearrangement of the definition of overyielding, written in terms of relative yields (RY) of species in mixtures relative to monocultures. One term, the so-called complementarity effect (CE), is proportional to the average RY deviations from the null expectations that plants of both species "do the same" in monocultures and mixtures. The other term, the selection effect (SE), captures how these RY deviations are related to monoculture productivity. Overall, CE measures whether relative biomass gains differ from zero when averaged across all community members, and SE, whether the "relative advantage" species have in the mixture, is related to their productivity. In extreme cases, when all species benefit, CE becomes positive. When large species have large relative productivity increases, SE becomes positive. This is intuitively compatible with the idea that niche complementarity mitigates competition (CE>0), or that competitively superior species dominate mixtures and thereby drive overyielding (SE>0).
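      For readers less familiar with the AP method, the rearrangement can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the manuscript under review: the function name and the example yields are hypothetical, and the covariance is the population covariance (divided by the number of species N), which is what makes CE and SE sum exactly to the net effect.

```python
def additive_partition(mono_yields, mixture_yields, expected_ry):
    """Split the net biodiversity effect into CE and SE (Loreau & Hector 2001).

    mono_yields:    monoculture yield M_i of each species
    mixture_yields: observed yield of each species in the mixture
    expected_ry:    expected relative yield of each species (e.g. planted proportion)
    Returns (CE, SE); CE + SE equals observed minus expected mixture yield.
    """
    n = len(mono_yields)
    # Deviation of observed relative yield from its null expectation
    delta_ry = [obs / m - e
                for obs, m, e in zip(mixture_yields, mono_yields, expected_ry)]
    mean_delta_ry = sum(delta_ry) / n
    mean_mono = sum(mono_yields) / n
    # Complementarity effect: N * mean(dRY) * mean(M)
    ce = n * mean_delta_ry * mean_mono
    # Selection effect: N * population covariance between dRY and M
    cov = sum((d - mean_delta_ry) * (m - mean_mono)
              for d, m in zip(delta_ry, mono_yields)) / n
    se = n * cov
    return ce, se

# Hypothetical two-species mixture planted 50:50
mono = [10.0, 20.0]
mix = [5.0, 14.0]
ce, se = additive_partition(mono, mix, [0.5, 0.5])
net = sum(mix) - sum(e * m for e, m in zip([0.5, 0.5], mono))
```

      In this made-up example only the more productive species gains in relative yield, so both terms come out positive. That illustrates the point made in the next paragraph: CE and SE describe the statistical structure of the RY deviations, and a positive CE arises here without any assumption about niche complementarity.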

      However, it is very important to understand that CE and SE capture the "statistical structure" of RY that underlies overyielding. Specifically, CE and SE are not the ultimate biological mechanisms that drive overyielding, and never were meant to be. CE also does not describe niche complementarity. Interpreting CE and SE as directly quantifying niche complementarity or resource competition, is simply wrong, although it sometimes is done. The criticism of the AP method thus in large part seems unwarranted. The alternative methods the authors discuss (lines 108-123) are based on very similar principles.

      The authors now set out to develop a method that aims at linking response patterns to "more true" biological mechanisms.

      Assuming that "competitive dominance" is key to understanding mixture productivity, because "competitive interactions are the predominant type of interspecific relationships in plants", the authors introduce "partial density" monocultures, i.e. monocultures that have the same planting density for a species as in a mixture. The idea is that using these partial density monocultures as a reference would allow for isolating the effect of competition by the surrounding "species matrix".

      The authors argue that "To separate effects of competitive interactions from those of other species interactions, we would need the hypothesis that constituent species share an identical niche but differ in growth and competitive ability (i.e., absence of positive/negative interactions)." - I think the term interaction is not correctly used here, because clearly competition is an interaction, but the point made here is that this would be a zero-sum game.

      The authors use the ratio of productivity of partial density and full-density monocultures, divided by planting density, as a measure of "competitive growth response" (abbreviated as MG). This is the extra growth a plant individual produces when intraspecific competition is reduced.

      Here, I see two issues: first, this rests on the assumption that there is only "one mode" of competition if two species use the same resources, which may not be true, because intraspecific and interspecific competition may differ. Of course, one can argue that then somehow "niches" are different, but such a niche definition would be very broad and go beyond the "resource set" perspective the authors adopt. Second, this value will heavily depend on timing and the relationship between maximum initial growth rates and competitive abilities at high stand densities.

      The authors then progress to define relative competitive ability (RC), and this time simply use monoculture biomass as a measure of competitive ability. To express this biomass in a standardized way, they express it as the difference from the mean of the other species and then divide by the maximum monoculture biomass of all species.

      I have two concerns here: first, if competitive ability is the capability of a species to preempt resources from a pool also accessed by another species, as the authors argued before, then this seems wrong because one would expect that a species can simply be more productive because it has a broader niche space that it exploits. This contradicts the very narrow perspective on competitive ability the authors have adopted. This also is difficult to reconcile with the idea that specialist species with a narrow niche would outcompete generalist species with a broad niche. Second, I am concerned by the mathematical form. Standardizing by the maximum makes the scaling dependent on a single value.

      As a final step, the authors calculate a "competitive expectation" for a species' biomass in the mixture, by scaling deviations from the expected yield by the product MG ⨯ RC. This would mean a species does better in a mixture when (1) it benefits most from a conspecific density reduction, and (2) has a relatively high biomass.

      Put simply, the assumption would be that if a species is productive in monoculture (high RC), it effectively does not "see" the competitors and then grows like it would be the sole species in the community, i.e. like in the partial density monoculture.

      Overall, I am not very convinced by the proposed method.

      (1) The proposed method seems not very systematic but rather "ad hoc". It also is much less a partitioning method than the AP method because the other term is simply the difference. It would be good if the authors investigated the mathematical form of this remainder and explored its properties: when does complementarity occur? Would it capture complementarity and facilitation?

      (2) The justification for the calculation of MG and RC does not seem to follow the very strict assumptions of what competition (in the absence of complementarity) is. See my specific comments above.

      (3) Overall, the manuscript is hard to read. This is in part a problem of terminology and presentation, and it would be good to use more systematic terms for "response patterns" and "biological mechanisms".

      Examples:

      - on line 30, the authors write that CE is used to measure "positive" interactions and SE to measure "competitive interactions", and later name "positive" and "negative" interactions "mechanisms of species interactions". Here the authors first use "positive interaction" as any type of effect that results in a community-level biomass gain, but then they use "interaction" with reference to specific biological mechanisms (e.g. one species might attract a parasite that infests another species, which in turn may cause further changes that modify the growth of the first and other species).

      - on line 70, the authors state that "positive interaction" increases productivity relative to the null expectation, but it is clear that an interaction can have "negative" consequences for one interaction partner and "positive" ones for the other. Therefore, "positive" and "negative" interactions, when defined in this way, cannot be directly linked to "resource partitioning" and "facilitation", and "species interference" as the authors do. Also, these categories of mechanisms are still simple. For example, how do biotic interactions with enemies classify, see above?

      - line 145: "Under the null hypothesis, species in the mixture are assumed to be competitively equivalent (i.e., absence of interspecific interactions)". This is wrong. The assumption is that there are interspecific interactions, but that these are the same as the intraspecific ones. Weirdly, what follows is a description of the AP method, which does not belong here. This paragraph would better be moved to the introduction where the AP method is mentioned. Or omitted, since it is basically a repetition of the original Loreau & Hector paper.

      Other points:

      - line 66: community productivity, not ecosystem productivity.

      - line 68: community average responses are with respect to relative yields - this is important!

      - line 64: what are "species effects of species interactions"?

      - line 90: here "competitive" and "productive" are mixed up, and it is important to state that "suffers more" refers to relative changes, not yield changes.

      - line 92: "positive effect of competitive dominance": I don't understand what is meant here.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway from the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest.

      Strengths:

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc.

      Weaknesses:

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly studies by Wise, 1973 and returned to popular use by Simms et al 2010 and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010).) The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). Line 154 Theile & Navarro 2014 is a review and not the original demonstration.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another.
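
      To illustrate why this distinction matters, the short sketch below converts the same consumed volume into a dose for two body weights. All inputs (20% v/v ethanol, 0.5 mL consumed, 20 g and 25 g animals) and the function name are hypothetical placeholders, not values from the manuscript; only the density of ethanol is a physical constant.

```python
# Hypothetical illustration: equal volumes give unequal doses when body weights differ.
ETHANOL_DENSITY_G_PER_ML = 0.789  # density of pure ethanol near room temperature


def ethanol_dose_g_per_kg(volume_ml, ethanol_fraction_vv, body_weight_g):
    """Convert a consumed volume of ethanol solution into a dose in g/kg."""
    ethanol_g = volume_ml * ethanol_fraction_vv * ETHANOL_DENSITY_G_PER_ML
    return ethanol_g / (body_weight_g / 1000.0)


# Assumed values only: same volume and concentration, different body weights.
dose_smaller = ethanol_dose_g_per_kg(0.5, 0.20, 20.0)
dose_larger = ethanol_dose_g_per_kg(0.5, 0.20, 25.0)
print(f"20 g animal: {dose_smaller:.3f} g/kg; 25 g animal: {dose_larger:.3f} g/kg")
```

      With identical volumes, the lighter animal receives the higher g/kg dose, so a result described as "identical" in licks or millilitres is not identical in terms of exposure.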

      No conclusions regarding the photometry results can be drawn based on the histology provided. Localization and quantification of viral expression are required at a minimum to verify the efficacy of the dual virus approach (the panel in Supplementary Figure 1 is very small and doesn't allow terminals to be seen, and there is no quantification). Whether these might differ by sex is also necessary before we can be confident about any sex differences in neural activity.

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks but I think more caution or detail in the discussion of the correspondence of these events is required.

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      #1) Summary: The transport of effector proteins across membranes from the producing bacterium into a target cell is at the core of bacterial secretion systems. How an additional layer in the form of a capsule affects effector export and the susceptibility towards effector import is not fully understood. Here, Flaugnatti and colleagues combined bacterial genetics with phenotypic assays and electron microscopy to demonstrate a dual role of a bacterial capsule in preventing T6SS-mediated effector export and promoting protection from effector import by another bacterium's T6SS. The wide variety of methods used, complementation of the mutants, and validation of the findings across strains strengthen the authors' conclusions. Although the main conclusions seem straightforward, the authors unravel the unexpected complexity underlying these phenotypes with strong mechanistic work. In brief, a capsule-deficient mutant (∆itrA) is shown to assemble its T6SS similarly to the WT, yet secretes more Hcp than the WT and is better in T6SS-mediated killing of other bacteria. A capsule-overproducing mutant (∆bfmS) shows both a partial deficiency in T6SS assembly and an additional reduction in exported Hcp, and is worse in T6SS-mediated killing than the WT. A mutant with a capsule similar to WT and deficient in cell sensing (∆tslA) forms the fewest T6SS apparatuses and is yet better in T6SS-mediated killing than the overcapsulated mutant. Together, these data show an effect of the capsule on (i) T6SS apparatus assembly, (ii) effector export, (iii) effector import, and (iv) the need for clearance of accumulating non-secreted Hcp by ClpXP. The work on a clinical isolate of Acinetobacter tumefaciens and the data on an impaired T6SS activity on other cells by antibiotic-induced capsulation is a strong demonstration of the work's clinical relevance in addition to the findings' conceptual novelty.

      In my view, the manuscript is outstanding with very high quality of experimental data, very well written text and very clear presentation of the data in figures. A few minor comments and suggestions below that I think would strengthen the manuscript.

      __Authors’ reply #1:__ We thank the reviewer for their enthusiasm.


      Major comment:

      #2) OPTIONAL: Fig. 4c/l. 320: Having an indirect effect of an antibiotic on T6SS activity by antibiotic-induced capsule formation is very intriguing and contributes to the clinical relevance of the overall findings. When I saw the data in Fig. 4c, the graph instantaneously reminded me of the panel in Fig. 2a, where a similar phenotype is observed by changing the predator:prey ratio in the absence of any antibiotic. The authors themselves comment on the possibility of antibiotic-induced, reduced predator growth (and thereby a change in predator:prey ratio) as one factor impacting the phenotype here. I am wondering if this data could be strengthened or better disentangled to test more precisely if it is the antibiotic-induced capsule formation per se that affects T6SS-mediated killing by A. baumannii in the presence of antibiotics. Would it help to take the bfmS mutant along as a control for direct comparison to see if antibiotic-induced capsule formation of the WT to similar levels of the mutant results in the same killing phenotype? Would it help to test for T6SS-mediated killing in the presence and absence of antibiotics at multiple predator:prey ratios? Could the effect of the antibiotic on A. baumannii growth be measured and considered when choosing the ratio at which the bacteria are mixed?

      __Authors’ reply #2:__ The point raised by the reviewer is very important. As we have stated in the manuscript, inducing capsule production with antibiotics impacts the growth of A. baumannii and could therefore change the predator-prey ratio, potentially affecting the observed phenotype. However, the antibiotic is expected to equally impact the non-encapsulated ΔitrA strain, yet this strain maintains very strong T6SS killing activity in the presence of chloramphenicol. Thus, we do not believe the predator-prey ratio is causing the observed effect. To address this point more directly, we nonetheless propose to: i) repeat the experiments with different predator-prey ratios (1:1, 2:1, and 5:1), and ii) include a bfmS mutant as a control.

      Minor comments:

      #3) Figure 1D, l. 155, I might have missed this, do the authors happen to have the numbers of E. cloacae as well? This would strengthen the claim on A. baumannii survival because E. cloacae is being killed.

      __Authors’ reply #3:__ The reviewer is correct; we did not include the survival of E. cloacae in the initial manuscript due to technical reasons (counter-selection of E. cloacae). However, we propose to repeat the experiment using an E. cloacae strain carrying a plasmid conferring kanamycin resistance. This will allow us to counter-select E. cloacae after contact with the A. baumannii predator to determine if E. cloacae is killed by A. baumannii in a T6SS-dependent manner.


      #4) Figure 2, I suggest writing out the species name of the prey in the box with the ratio. With E. cloacae being referred to in the previous figure and starting with similar letters to E. coli, I wasn't sure at first sight what E. c. refers to.

      __Authors’ reply #4:__ We appreciate the comment and will revise the figure as suggested.

      #5) use of the term "T6SS activity" throughout the manuscript (e.g. l. 182, l. 187). I leave this up to the authors. To me, it seems like an umbrella term for the initial observation and I see that such a term can be very handy for the writing. I just would like to mention that the use of the term was not always intuitive to me and sometimes even a bit misleading. For example, l. 182 refers to "increased T6SS activity". As a reader, I only know about 'T6SS activity on other cells' or 'a T6SS-mediated effect on other cells' at this point. T6SS apparatus assembly/firing activity is tested for specifically later and it turns out to differ between mutants. By the time the term is used in the discussion, it captures multiple nuanced phenotypes described by then. The more precise definition of the term in l. 200 helped to capture what exactly is meant by the authors.

      __Authors’ reply #5:__ We propose rephrasing the sentences to include the term "T6SS-secretion activity" when referring to Hcp secretion assays and "T6SS-mediated killing activity" when referring to killing experiments.

      #6) l. 198-199 "Collectively, our findings indicate that CPS does not hinder the secretion process of the T6SS or the consequent elimination of competing cells". I might be missing something, I cannot entirely follow this sentence. Didn't the authors just show that the CPS does hinder T6SS-mediated elimination of competing cells in panel 2A and less secreted Hcp in the encapsulated WT compared to the non-encapsulated mutant in panel 2B?

      __Authors’ reply #6:__ We thank the reviewer for this comment. We realize that the sentence wasn’t well phrased, resulting in confusion. What we meant was that the T6SS is functional regarding its T6SS-mediated killing and secretion in the WT strain, while we also showed that the non-capsulated strain kills and secretes more T6SS material in the supernatant. Thus, there seems to be a balance between capsule production and T6SS activity in the WT. We will revise the sentence to better reflect this meaning.

      #7) l. 224, typo, "in"

      __Authors’ reply #7:__ We will correct this typo. Thank you.


      #8) Two connected comments: l. 338, Just a thought, I am wondering about the title of the section. After reading it a second time, I think it is technically correct. When reading it first, I was a bit confused when getting to the data because apparatus assembly is impaired in the capsule-overproducing strain and although "preserved", doesn't the data indicate that there is less T6SS assembly in the bfmS mutant and that this might be because of less cell sensing and isn't this a main point that there is a difference in apparatus assembly in the capsule overproducing strain compared to WT (other than no difference in apparatus assembly in the strain without capsule)? To me it seems not fully acknowledged as a finding in the interpretation of the data that fewer cells of the bfmS mutant have a T6SS apparatus. Isn't that interesting? A title along the lines of "Capsule-overproducing strain has preserved sensory function and assembles fewer T6SS apparatuses" would have been more intuitive for me. l. 352, In case I didn't miss a reference to this data earlier in the manuscript, I am wondering if it would be worth mentioning the finding on the reduced apparatus assembly of the bfmS mutant earlier, together with Figure 3 already. At least a sentence that mentions already that there is more coming later. When I got to this line in the manuscript and read the findings on the apparatus assembly, I first needed to go back to figure 3 and look at the data there again in light of this finding. It is mentioned here on the side but I think very important for the interpretation of the phenotypic data of the bfmS mutant shown earlier, isn't it? The tslA mutant is used beautifully here.

      __Authors’ reply #8:__ We thank the reviewer for the suggestion and the kind comment about the beautiful usage of the tslA mutant. We will modify the title of the corresponding paragraph as suggested to make it more intuitive.

              Regarding the comment about mentioning the T6SS apparatus assembly defect in the *bfmS* mutant earlier, we respectfully disagree. While we agree that this point is important and can partially explain the difference in killing activity, we believe that showing it together with the *tslA* mutant (Figure 5) makes more sense and is easier for the reader to understand.
      

      #9) Discussion: optional comment. On the one hand, I like the concise discussion. On the other hand, I see more potential here for bringing it all together (potentially at the expense of shortening some of the introduction). I think the subtleties of the findings are complex. For example, I could envision a graphical summary with a working model on all the effects of a capsule on the T6SS and its potential clinical relevance making the study accessible to even more readers.

      __Authors’ reply #9:__ In the revised manuscript, we will include a graphical summary/model.


      Significance

      #10) General assessment: I consider the story very strong in terms of novelty, experimental approaches used, quality of the data, quality of the writing and figures of the manuscript. In my view, the aspects that could be improved are optional/minor and concern only one figure and some phrasing.

      Advance: I see major advance in the findings (i, mechanistic) on the mechanism of how the capsule interferes with T6SS, (ii, fundamental) on the discovery of ClpXP degrading Hcp, and (iii, clinical) on the meaning of antibiotic treatment for the T6SS of this clinically relevant and often multi-drug resistant bacterial species, which strongly complements existing work on the T6SS and antibiotics in A. baumannii (e.g. of the Feldman group). As the authors write themselves, the starting points of the study of a capsule protecting from a T6SS and the effect of a T6SS on other cells being negatively impacted by a capsule were known, although not studied in one species and not understood mechanistically.

      Audience: I see the results being of interest to a broad audience in the fields of bacteria-bacteria interactions, Acinetobacter baumannii, type VI secretion, antimicrobial resistance, bacterial capsules.

      __Authors’ reply #10:__ We once again thank the reviewer and highly appreciate their positive and constructive feedback on our work. We hope the reviewer will be satisfied with the revised version of our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      #11) In the manuscript by Flaugnatti et al., the authors provide clear evidence of the interplay between capsule outer coat production and the Type VI secretion system (T6SS) in Acinetobacter baumannii. The authors demonstrate that the presence of the capsule or the activity of the T6SS enhances survival against attacking bacteria. However, they also show that in their model bacterium, the (over)production of the capsule likely hinders T6SS dynamics, thereby reducing overall killing efficiency. Additionally, they reveal that the amount of the T6SS component Hcp is regulated in cells that can no longer assemble and/or secrete via the T6SS, presumably by the ClpXP protease. Overall, the experiments are well designed, and most conclusions are supported by the data and appropriate controls. I have however some suggestions that could further strengthen the manuscript prior to publication.

      __Authors’ reply #11:__ We are grateful for the reviewer’s enthusiasm and will implement their comments and suggestions in the revised version of the manuscript.


      Major comments:

      #12) Line 164. The authors use E. coli as prey to test the T6SS activity of A. baumannii. Why not directly use the E. cloacae strain (with or without T6SS) for this purpose? This would provide direct evidence that A. baumannii uses its T6SS to kill E. cloacae, thus confirming the authors' conclusions in this section.

      __Authors’ reply #12:__ We thank the reviewer for this comment. We used E. coli to assess the functionality of the T6SS in different strains of A. baumannii, as it is commonly done in the T6SS field. However, as suggested by reviewer 1 (see comment #3) and in response to this query, we will also provide survival data of E. cloacae in the revised manuscript using a plasmid-carrying E. cloacae derivative that allows direct selection.

      #13) In Figure 2, the authors show that a non-capsulated strain kills more effectively and secretes more than a WT, but has a similar number of T6SS. They suggest in their conclusion that "the observed increase in T6SS activity in the non-capsulated strain suggests a compensatory mechanism for the absence of the protective capsule layer." This conclusion implies the presence of an "active" regulatory mechanism that would increase the number of successful T6SS firing events, which has not been demonstrated. Could it not simply be that the capsule blocks some shots that cannot penetrate and are therefore ineffective? This hypothesis is mentioned in lines 204-208. The authors should clarify the conclusion of this section. Given the challenge this may pose in A. baumannii, I suggest that the authors quantify the assembly/firing dynamics of the T6SS under WT and ΔitrA conditions. This would help distinguish between the two hypotheses explaining better firing in non-capsulated cells: i.e., if the number of assembled T6SS is the same in both strains (Fig 2C & 2D), do non-capsulated cells assemble/fire faster, indicating an adaptation in regulation, or do we observe the same dynamics, suggesting a simple physical barrier blocking the passage of certain T6SS firing events?

      __Authors’ reply #13:__ We realize that the sentence, and more specifically the word "compensatory," might have been misleading and thank the reviewer for bringing this to our attention. What we meant to convey is that there is a balance between capsule production and T6SS activity; if disturbed, the balance shifts in one direction or the other. Specifically, there is more protection through the production of a thicker capsule (e.g., in the ∆bfmS mutant or under sub-MIC conditions of antibiotics, regulated by the Bfm system, as mentioned in the text) or more T6SS activity when less capsule is present (e.g., in the ΔitrA mutant, which we propose is caused by the lack of steric hindrance). We will rephrase this sentence in the revised manuscript to better convey this message.

              Regarding the quantification of T6SS dynamic assembly/firing events between the capsulated (WT) and non-capsulated (ΔitrA) strains, we do not think this is required for this study, as the amount of secreted Hcp reflects the overall activity of the system. Importantly, we also do not have the technical means to quantify assembly/firing rates under Biosafety 2 conditions, as this requires specialized microscopes with very fast acquisition options (see, for instance, Basler, Pilhofer *et al.*, 2012, *Nature*). Indeed, very few labs in the T6SS field have been able to measure such rates.
      

      #14) Line 428-429. It is mentioned that the deletion of lon does not have a notable effect. However, I observe that the absence of Lon alone causes a more rapid degradation of Hcp in the cells compared to the WT strain (Fig 7B). How do the authors explain that the absence of this protease (whether under conditions of Hcp accumulation or not) increases the degradation of this protein in the cell? This explanation should be included in the manuscript.

      __Authors’ reply #14:__ That’s a fair point. We didn’t address this point further, as the deletion of lon didn’t resolve the issue of why Hcp is degraded. In fact, the opposite seems to be the case, as there is less Hcp in the ∆lon strain compared to the WT. While this observation is not directly relevant to the question of why Hcp is degraded late during growth in secretion-impaired strains, we will properly mention it in the revised manuscript.

              Please also note that a strong growth defect of a Δ*lon*Δ*clpXP* double mutant impaired further investigation in this direction.
      

      Minor comments:

      #15) Throughout the manuscript, the authors use the term "predator" to refer to A. baumannii. Predation is a specific phenomenon that involves killing for nourishment. To my knowledge, the T6SS has never been shown to be a predation weapon but rather a weapon for interbacterial competition, which is a different concept. If this has not been demonstrated in A. baumannii, the authors should replace the term "predator" with "attacker" (or an equivalent term) to clarify the context.

      __Authors’ reply #15:__ We thank the reviewer for this comment. The term “predator,” as highlighted by the reviewer, typically implies killing for nourishment/cellular products. In the context of T6SS, it facilitates the killing of competitors, releasing DNA into the environment that can subsequently be acquired through natural competence for transformation, as observed in species like Vibrio cholerae (our work by Borgeaud et al., 2015, Science) or other Acinetobacter species such as Acinetobacter baylyi (Ringel et al., 2017, Cell Rep.; Cooper et al., 2017, eLife). The acquisition of DNA reflects the killing for cellular products of the prey. As most A. baumannii strains are also naturally competent, this justifies the usage of the predator and prey nomenclature.

              Apart from this fact, it seems to be a matter of nomenclature, with many papers in the field using one term or the other. Yet, ultimately, this doesn’t change any of the scientific findings. Therefore, to satisfy the reviewer, we will change “predator” to “attacker” throughout the revised manuscript.
      

      #16) Line 274. Since the authors stated that in the Wzc mutant, the capsule is "predominantly found in the supernatant and only loosely attached to the cell," this result is not unexpected. This finding is also consistent with the previous results from Fig. 3A & B, which show sensitivity to complement-mediated killing and the weak amount of (ab)normal CPS produced in that strain, further confirmed by Fig. 3E.

      __Authors’ reply #16:__ We fully agree with the reviewer’s suggestion and will remove the statement.

      #17) Line 299. The authors speculate that "... T6SS may deploy through gaps akin to arrow-slits in the capsule's mesh...". However, this is very unlikely since a WT strain kills (Fig. 3C) and secretes (Fig. 2B & 3D) less effectively than the itrA mutant, suggesting that the T6SS is not assembled in the "right places" devoid of CPS; otherwise, we would expect similar T6SS activity. Based on the results in Fig. 2 (and my earlier comment), this implies that A. baumannii assembles its T6SS randomly, and in the presence of the capsule, its shots would need to be in the right place to penetrate the envelope and reach the target. Could the authors comment on this point and provide a model figure to better visualize the interplay between the capsule and T6SS under the three major conditions: WT, non-capsulated, and capsule overproduction?

      __Authors’ reply #17:__ We thank the reviewer and agree with their comment. We discussed the hypothesis of T6SS deployment through gaps, drawing a parallel to what was proposed for biofilm and T6SS in V. cholerae (Toska et al., 2018, PNAS). However, as mentioned earlier, we believe that the effect of the capsule on T6SS activity is primarily due to steric hindrance, which increases the distance between the T6SS apparatus and the prey cell. To clarify our findings further, we will include a model summarizing our results, as requested by reviewer 1 (see comment #9).


      #18) In Fig. 5A, the microscopy panels should be adjusted to the same dynamic range as the WT (which represents a true signal), which does not appear to be the case for the tslA mutant panel for instance. The image gives the impression of a large amount of free TssB-msfGFP in the cytoplasm. However, this effect is due to the dynamic range being adjusted to display a signal. This observation is consistent with the fact that the amount of TssB-msfGFP protein is identical across all strains (Fig. S2F).

      __Authors’ reply #18:__ We will adjust the images to the range of the WT in the revised manuscript, as suggested. However, regardless of how these images are presented, the enumeration of T6SS structures will remain unchanged, which was the sole point of this experiment.


      #19) Unless I am mistaken, the authors do not comment on the fact that in a ΔbfmS strain, the number of T6SS is halved compared to a WT or ΔitrA strain. If capsule overproduction only partially limits the TslA-dependent T6SS assembly, how can this result be explained? Is it related to the degradation of Hcp in this background, which ultimately limits the formation of T6SS? If so, it would be interesting to mention this connection in the section "Prolonged secretion inhibition triggers Hcp degradation".

      __Authors’ reply #19:__ We did mention that the T6SS assembly of the ΔbfmS mutant is reduced compared to the WT (or ΔitrA), likely due to the defect in sensing the prey (lines 369-374 and 468-472 of the initial manuscript). However, we will revise the sentence to improve clarity in the revised version of the manuscript.

      Significance

      #20) This work is highly intriguing as it not only delves into the specific mechanisms involved but also connects fundamental elements in bacterial competition, i.e., the necessity for self-protection and aggression for survival. The manuscript offers valuable insights into cellular dynamics at a microscale level and prompts new inquiries into the regulation of these systems on a population scale. The work is well-done and the writing is also clear. I am convinced that this work represents another significant step towards understanding bacterial mechanisms and will undoubtedly spark considerable interest in the field.

      __Authors’ reply #20:__ We sincerely thank reviewer #2 for their constructive inputs, which will improve our manuscript.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      #21) The manuscript by Flaugnatti et al investigates the relationship between functions of the T6SS in A. baumannii and production of capsular polysaccharide. The manuscript argues that (1) capsule protects A. baumannii against T6SS-mediated attack by other bacteria, (2) capsule also interferes with the bacterium's own T6SS activity, and (3) the T6SS inner tube protein Hcp is regulated by degradation by ClpXP. The main critiques regard the first two conclusions, which seem to be based solely on use of a mutant that has a confounding effect as described below; and to strengthen the third claim by further exploring the results of overexpressing Hcp and by determining whether there is a fitness benefit for Hcp regulation.

      __Authors’ reply #21:__ We thank reviewer #3 for their relevant input. We will conduct additional experiments based on their comments, and these will be incorporated into the revised manuscript.


      Main items:

      #22) Throughout the paper, an itrA deletion mutant is used as the capsule-deficient strain and conclusions are drawn about the role of capsule based on this mutant. However, itrA deletion also eliminates the protein O-glycosylation pathway (Lees-Miller et al 2013), a potential confounder. Analysis of mutants specifically deficient in the high-molecular weight capsule but not protein glycosylation, and/or mutants in the protein O-glycosylation enzyme, should be incorporated into the study to enhance the ability to make conclusions about the role of the capsule.

      __Authors’ reply #22:__ Fair point. We thank the reviewer for this important suggestion. To distinguish between the O-glycosylation pathway and capsule production, we will generate a ∆pglL strain (specific to O-glycosylation), as suggested, and will repeat the key experiments (similar to Fig. 2A and 2B). We are almost done with the engineering of this mutant strain and therefore don’t expect any major delays.

      #23) Evidence could be provided to support the idea raised in lines 482-483 that T6SS component accumulation is toxic ("degradation [of T6SS components] could serve as a strategy to alleviate proteotoxic stress..."). For example, growth curves of ∆clpXP strains with and without hcp could be analyzed, to determine how degrading Hcp is helping the bacteria.

      __Authors’ reply #23:__ We will perform growth curves of ΔclpXP strains with and without hcp, as suggested by the reviewer. However, we are uncertain whether we will be able to observe differences between these strains, as the conditions under which such degradation is significant may be challenging to replicate under standard laboratory conditions.

      #24) The possible ClpXP recognition sequence identified at the C terminus of Hcp is interesting. Does overexpression of an Hcp variant lacking/altered in this motif alter its protein levels compared to WT Hcp?

      __Authors’ reply #24:__ We thank the reviewer for this suggestion. We are in the process of performing the suggested experiment and will include the data in the manuscript.

      Minor items:

      #25) A better explanation could be provided for why overexpressing hcp in WT but not in ∆hcp leads to increased Hcp protein levels. There is a statement about Hcp being regulated post transcriptionally, possibly by degradation (lines 422-423), but would that not also result in regulation in the WT strain?

      __Authors’ reply #25:__ The reviewer is absolutely correct here. Despite careful genetic engineering, we believe that the hcp mutant used may have a polar effect, causing Hcp accumulation only in the ∆hcp + p-hcp strain but not in the WT + p-hcp strain, which remains capable of secretion. The ∆hcp strain therefore mimics the secretion-impaired tssB mutant. We will clarify this in the revised manuscript.

      #26) An untreated control is needed in Fig. 4B.

      Authors’ reply #26: The untreated samples were shown in all previous figures. However, we understand the reviewer's point and will repeat the experiment with the untreated control included in the same experiment.

      #27) line 179: please clarify "reflecting better invading bacteria"

      Authors’ reply #27: We appreciate the reviewer mentioning this oversight. We meant to compare this to a situation where a bacterium invades an already existing community, resulting in a predator-prey ratio below 1. We will clarify this further in the revised manuscript.

      #28) line 351: consider rewording the statement that ∆tslA results in decreased T6SS assembly and activity using the tssB-msfGFP microscopy assay; it is not clear that activity is measured in this assay.

      Authors’ reply #28: The reviewer is correct. We will revise the sentence accordingly to better reflect the T6SS assembly.

      #29) lines 260-265: This experiment could use clarifying, but it would seem that it requires analysis of the secreted capsule levels in the tssB mutant to show it does not produce extracellular capsule to the same extent that ∆bfmS does.

      Authors’ reply #29: We thank the reviewer for the suggestion and will include these experimental data in the revised manuscript.

      #30) Fig. 6C and 7A labelling could be improved to avoid potential confusion that the bar graphs are quantifying the western blot. E.g., could add a corresponding vertical label to the Western data, or consider changing "relative expression of hcp" to something reflecting analysis of transcript levels.

      Authors’ reply #30: We will improve this figure by splitting the qPCR and Western blot data into independent panels. This will eliminate any confusion.


      #31) lines 416-417 and Fig. 7A: states that "hcp mRNA levels increased significantly", but more careful wording could be used because the WT's transcript change is not significant after overexpression (though it is significant in ∆hcp).

      Authors’ reply #31: Point well taken. We will improve the sentence (and Figure) to make its meaning unambiguous.

      #32) lines 479-480 states that in secretion-impaired strains accumulation of Hcp is mitigated by ClpXP; while this was shown for ∆tssB, was this also the case for ∆bfmS?

      Authors’ reply #32: This is indeed an interesting suggestion. We are in the process of generating the double mutant ∆bfmSclpXP and will include the experimental results in the revised manuscript.


      Significance

      #33) The strengths of the study are the focus on a clinically significant pathogen, the potential novel roles for the important capsule virulence factor of A. baumannii, and the identification of novel points of control of the T6SS. The analyses of T6SS function are thorough and carefully performed.

      Authors’ reply #33: We thank the reviewer for their comments, which we believe will significantly strengthen our work, particularly regarding the capsule aspect.

    1. Author response:

      eLife assessment

      This valuable study uses single-cell transcriptomics to explore the mouse vomeronasal organ and represents an advance that enhances our understanding of neural diversity within this sensory system. Findings suggest a unique endoplasmic reticulum (ER) structure in Gnao1 neurons and allow for the synthesis of a developmental trajectory from stem cells to mature vomeronasal sensory neurons. Convincing methods, data, and analyses broadly support the claims, although experiments supporting the main ER-related claim are incomplete and lack quantification of co-expression and statistics on labeling intensity or coverage. Adding these data would greatly strengthen the conclusions of the paper.

      Public Reviews:

      Reviewer #1 (Public Review):

      Devakinandan and colleagues present a manuscript analyzing single-cell RNA-sequencing data from the mouse vomeronasal organ. The main advances in this manuscript are to identify and verify the differential expression of genes that distinguish apical and basal vomeronasal neurons. The authors also identify the enriched expression of ER-related genes in Gnao1 neurons, which they verify with in situ hybridizations and immunostaining, and also explore via electron microscopy. Finally, the results of this manuscript are presented in an online R shiny app. Overall, these data are a useful resource to the community. I have a few concerns about the manuscript, which I've listed below.

      General Concerns:

      (1) The authors mention that they were unable to identify the cells in cluster 13. This cluster looks similar to the "secretory VSN" subtype described in a recent preprint from C. Ron Yu's lab (10.1101/2024.02.22.581574). The authors could try comparing or integrating their data with this dataset (or that in Katreddi et al. 2022) to see if this is a common cell type across datasets (or arises from a specific type of cell doublets). In situ hybridizations for some of the marker genes for this cluster could also highlight where in the VNO these cells reside.

      Cluster 13 (Obp2a+) cells identified in our study share gene expression markers with the “putative secretory” cells described in the Hills et al. manuscript. By the time that manuscript became publicly available, ours had already been finalized and communicated. We welcome the suggestion to integrate the datasets, which we will attempt in our revision.

      (2) I found the UMAPs for the neurons somewhat difficult to interpret. Unlike Katreddi et al. 2022 or Hills et al. 2024, it's tricky to follow the developmental trajectories of the cells in the UMAP space. Perhaps the authors could try re-embedding the data using gene sets that don't include the receptors? It would also be interesting to see if the neuron clusters still cluster by receptor-type even when the receptors are excluded from the gene sets used for clustering. Plots relating the original clusters to the neuronal clusters, or dot plots showing marker gene expression for the neuronal clusters might both be useful. For example, right now it's difficult to interpret clusters like n8-13.

      We will present the UMAPs differently to make the developmental trajectory clearer. How neuron clusters are affected by the inclusion or exclusion of receptors is an interesting question that we will address in our revision, along with showing markers of each neuronal cluster, as suggested by the reviewer.

      Reviewer #2 (Public Review):

      Summary:

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The authors analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons. This reviewer has some minor concerns and some suggestions to improve the manuscript.

      Strengths:

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors.

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons.

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons.

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons.

      (6) The research provides a conceptually novel view of the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Weaknesses:

      (1) The connection between observations from scRNA-seq and EM is unclear.

      (2) The lack of quantification for the ER phenotype is a concern.

      We would like to point out that the connection between scRNA-seq and EM was made through our experiments investigating the localization of ER proteins by IHC (Figure 5). The intriguing observation that the levels of a number of ER luminal and membrane proteins were higher in Gnao1 neurons than in Gnai2 neurons led us to hypothesize a difference in ER content or ultrastructure, which we verified by EM. We agree that quantification of the ER phenotype would strengthen our observations, and we will add it to our revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report the enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and co-expression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns.

      Strengths:

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting of a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes.

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...).

      Weaknesses:

      The study still requires refined analyses of the data and rigorous quantification to support the main claims.

      The method description for filtering and clustering single-cell RNA-sequencing data is incomplete. The Seurat package has many available pipelines for single-cell RNA-seq analysis, with a significant impact on the output data. How did the authors pre-process and normalize the data? Was the pipeline used with default settings? What batch correction method was applied to the data to mitigate possible sampling or technical effects? Moreover, the authors do not describe how cell and gene filtering was performed.

      The data in Figure 7-Supplement 3 show that one-sixth of the V1Rs do not express any chemoreceptor, while over a hundred cells express more than one chemoreceptor. Do these cells have unusually high or low numbers of genes or counts? To exclude the possibility of a technical artifact in these observations, the authors should describe how they dealt with putative doublet cells or debris.

      Surprisingly, some clusters are characterized by the expression of specific chemoreceptors (VRs). Have these been used for clustering? If so, clustering should be repeated after excluding these receptors.

      The identification of the VSN types should be consistent across the different analyses and validated. The data presented in Figure 1 lists four mature VSN types, whereas the re-clustering of neurons presented in Figure 3 leads to a different subdivision. At present, it remains unclear whether these clusters reflect the biology of the system or are due to over-clustering of the data, and therefore correspond to either noise or arbitrary splitting of continua. Clusters should be merged if they do not correspond to discrete categories of cells, and correspondence should be established between the different clustering analyses. To validate the detected clusters as cell types, markers characteristic of each of these populations can be evaluated by ISH or IHC.

      There is a lack of quantification of imaging data, which provides little support for the ER-related main claim. Quantification of co-expression and statistics on labeling intensity or coverage would greatly strengthen the conclusions and the title of the paper.

      scRNA-seq data analysis methods: We agree with the reviewer and will elaborate on the various criteria, parameters, and methods in our revision. As described above, our revised manuscript will include an analysis of how inclusion or exclusion of VRs affects cell clusters, as well as quantification of the ER phenotype. We will also address the reviewer’s concern about over-clustering.

      We think that the cells expressing zero as well as two V1Rs are real and cannot be attributed to debris or doublets for the following reasons:

      a) Cells expressing no V1Rs are not necessarily debris because they express other neuronal markers at the same level as cells that express one or two V1Rs. The higher expression threshold used in our analysis may have somewhat increased the proportion of cells with zero V1Rs. We will modify Figure 7-Supplement 3c to add another group showing the Gnai2 level in cells expressing zero V1Rs.

      b) Cells co-expressing V1R genes: We listed the frequency of cells co-expressing V1R gene combinations in Supplementary Table 8. Among 134 cells that express two V1Rs, 44 cells express Vmn1r85+Vmn1r86, 21 express Vmn1r184+Vmn1r185, 13 express Vmn1r56+Vmn1r57, 6 express Vmn1r168+Vmn1r177, and so on. Doublets are generally a random combination of two cells. Here, each specific co-expression combination is represented by multiple cells, which is highly unlikely to arise by chance. Some of the co-expression combinations were identified earlier and verified experimentally in Lee et al., 2019 and Hills et al. Furthermore, Figure 7-Supplement 3c shows that the level of Gnai2 expression is comparable across cells expressing one or two V1Rs. If the cells expressing two V1Rs were doublets, we would expect their Gnai2 level to be higher than that of cells expressing a single V1R. We will elaborate on this in our revised manuscript.
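      The argument in b) can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' analysis; it only uses the counts quoted above and asks how often a specific receptor pair would recur if two-receptor cells were random doublets. The number of V1R genes (100) and the assumption that every receptor pair is equally likely are hypothetical simplifications introduced here for illustration; real receptor frequencies are uneven, so the numbers should be read as orders of magnitude only.

```python
import math

# Counts quoted in the response: cells expressing two V1Rs, and the
# four most frequent co-expression combinations among them.
TOTAL_TWO_V1R_CELLS = 134
OBSERVED_PAIRS = {
    ("Vmn1r85", "Vmn1r86"): 44,
    ("Vmn1r184", "Vmn1r185"): 21,
    ("Vmn1r56", "Vmn1r57"): 13,
    ("Vmn1r168", "Vmn1r177"): 6,
}

# ASSUMPTION (hypothetical, not from the source): number of V1R genes,
# all treated as equally frequent in the cell population.
N_RECEPTORS = 100


def pair_probability(n_receptors):
    """Chance that a random two-receptor doublet carries one specific pair,
    when every unordered pair of distinct receptors is equally likely."""
    return 1.0 / math.comb(n_receptors, 2)


def expected_pair_count(n_doublets, n_receptors):
    """Expected number of doublets showing one specific receptor pair."""
    return n_doublets * pair_probability(n_receptors)


def binomial_upper_tail(k_min, n, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


def top_pair_fraction(pairs, total):
    """Fraction of two-receptor cells explained by the listed combinations."""
    return sum(pairs.values()) / total


if __name__ == "__main__":
    p = pair_probability(N_RECEPTORS)
    print("expected cells per specific pair under random doublets:",
          expected_pair_count(TOTAL_TWO_V1R_CELLS, N_RECEPTORS))
    for pair, count in OBSERVED_PAIRS.items():
        tail = binomial_upper_tail(count, TOTAL_TWO_V1R_CELLS, p)
        print(pair, "observed", count, "P(>= observed | random) =", tail)
    print("fraction covered by the four listed pairs:",
          top_pair_fraction(OBSERVED_PAIRS, TOTAL_TWO_V1R_CELLS))
```

      Under these assumptions, far less than one cell is expected for any given pair, so repeated observation of the same few combinations is hard to reconcile with random doublet formation. A more careful version would replace the uniform assumption with the measured frequency of each receptor.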

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We thank all three reviewers for their insightful comments. Based on this feedback, we have performed additional experiments, and revised our manuscript. Below, we address each comment and describe the revisions.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Ponomarova et al. showed that a neomorphic idh-1 mutation results in increased levels of cellular D-2HG. The authors compared its high D-2HG phenotypes with those of the D-2HG dehydrogenase mutant and identified vitamin B12-dependent differences in vulnerability. Downregulation of glycine cleavage system genes, which produce one-carbon donor units, exacerbates the phenotype, while adding one-carbon donors suppresses it. They concluded that the idh-1neo mutation imposes a dependency on the one-carbon pool. The manuscript is very interesting, but I think it should be modified to be clearer for broad audiences.

      Concerns: The authors mention a number of examples for metabolic changes of D-2HG in the first paragraph of introduction. I think that a metabolic map explaining the changes helps readers to understand the questions proposed by the authors.

      Thank you for this suggestion. A figure illustrating the contributing factors in D-2HG metabolism has been added to the manuscript (Figure 1A).

      The authors say that D-2HG affects carcinogenesis in many ways, citing previous works. They should state whether a higher concentration of D-2HG also affects carcinogenesis in dhgd loss of function, if they assume that concentration is what matters most for carcinogenesis.

      Thank you for pointing this out. We have added this information in lines 70-72 of the revised manuscript: "Increased levels of D-2HG caused by the inhibition of D-2-hydroxyglutarate dehydrogenase activity have also been associated with different cancers (PMID: 29339485, PMID: 34296423, PMID: 35007759)."

      Line 110, mode should be read as model, I guess.

      Thank you - we have corrected this error.

      In Figure 4C, the concentrations of formate are shown as 0, 20, 40, 80, and 160 mM. Is this correct? Such high substrate concentrations change the osmotic pressure of the medium. Also, a high concentration of formic acid is toxic to animals. Considering that the concentration of vitamin B12 was 64 nM, I wonder whether the concentration unit of formate should also be nM.

      We confirm that we supplemented the media with formate in the millimolar range. The highest doses of supplemented formate somewhat slowed the development of P0 animals, but they consistently produced viable progeny. To clarify this we have added the following line to the text on lines 184-187: "The highest doses of supplemented formate somewhat slowed the development of P0 animals, but restored the survival of idh-1neo embryos to wild-type levels on a regular diet of E. coli OP50 as well as the diet of RNAi-competent E. coli HT115."

      Additionally, the use of sodium formate ensured that the pH of the media remained unchanged.

      I could not understand how embryonic and larval lethality relate mechanistically to animal carcinogenesis. Could you explain the logical link between the lethal mutation and carcinogenesis? Or do the two phenotypes share only part of the metabolic changes?

      Thank you for this suggestion. We have added this in lines 242-246 of the Discussion:

      "While our results have focused on how the neomorphic idh-1 mutation affects the developing embryo, proliferating cancer cells also have been shown to have increased demand for 1C units, for instance, to synthesize nucleosides (33)(PMID: 24657017). Thus, we can speculate that cancers with mutated IDH1 may be increasingly sensitive to depletion of the 1C pool, also."

      Vitamin B12 is an essential substance, and deficiency in humans results in severe diseases. Is the lethal phenotype caused by vitamin B12 treatment of idh-1neo mutants comparable to humans? Is the concentration of vitamin B12 similar in humans?

      The daily dose of human vitamin B12 (cobalamin) in supplements can reach 12.5 µg per kg (PMID: 18606874), while we supplement the media fed to worms with approximately 55 µg cobalamin per kg (64 nM adenosylcobalamin). No known adverse effects are associated with excessive intake of vitamin B12 by healthy individuals; therefore, no tolerable upper intake level has been set (PMID: 23193625). However, the impact of vitamin B12 on patients with IDH1neo-positive cancers has not been studied.

      Reviewer #1 (Significance (Required)):

      I think that the manuscript is interesting and may lead to important progress in this field. However, in general, metabolic disorders are difficult to understand for people outside the speciality. The authors should carefully explain the structure/property, pathways, enzyme functions, and concentration effects of the substances of interest.

      See above, we hope these edits are sufficient.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Increased levels of the metabolite D-2HG (derived from alpha-KG) are associated with multiple disorders. In a previous study, the authors showed that in C. elegans dhgd-1 deletion mutants, embryonic lethality resulting from the accumulation of D-2HG is caused by a lack of ketone bodies. In this study, the authors generated a new model of D-2HG accumulation in C. elegans, idh-1neo, in order to further understand how D-2HG exerts its toxic effects in different contexts. This allele mimics mutations found in neomorphic mutations of human IDH1 that lead to abnormal D-2HG production from alpha-KG. Interestingly, the authors find that idh-1neo mutants are distinct from animals lacking the D-2HG dehydrogenase dhgd-1 previously reported. Specifically, while vitamin B12 rescues the embryonic lethality in dhgd-1 deletion animals, it enhances the lethality of idh-1neo animals. Through an elegant genetic screen, and complementation studies with specific metabolites, they provide compelling evidence that this vitamin B12-dependent enhancement is due to depletion of the 1C pool. Specifically, a reverse genetic screen revealed that inactivation of components of the 1 C-producing glycine cleavage system (GCS) results in embryonic lethality in idh-1neo, but not wildtype animals. Complementation studies with specific metabolites show that replenishing C groups is sufficient to reverse embryonic lethality.

      This is a very clear, well written paper. Experiments are well controlled and executed, figures are of the highest quality and conclusions are convincing. Prior studies are appropriately referenced. No additional experiments are required by this reviewer.

      Minor points 1) In Figure 2A could authors explain how beta-alanine (increased) is different from alanine (decreased). As a non-specialist this is not clear to me.

      Thank you for pointing this out. We added this explanation to the figure legend (lines 510-512).

      2) Did the authors test whether inactivation of the lipoamide dehydrogenase (dld-1) has the same effect as the other identified components of the GCS?

      The dld-1 RNAi clone was present in the metabolic library that we screened but was not identified as a "hit." We have added the following in lines 164-168 of the revised manuscript: "Two other GCS genes, gcsh-2 and dld-1 were not identified as 'hits'. gcsh-2 is associated with the same reaction as gcsh-1, indicating that the latter encodes an active enzyme (30). dld-1 functions in other metabolic processes, particularly in lactate/pyruvate metabolism, and confers embryonic lethality when knocked down in wild type animals (31)".

      **Referees cross-commenting**

      Comments to Reviewer #3: 1/ The authors treat the idh-1neo worms with vitamin B12 to reduce 3HP concentrations. The authors should consider conducting experiments to reduce 3HP by other means also. This would help establish a causal relationship between the D-2HG accumulation and observed phenotypes.

      The authors show that adding vitamin B12 to the diet of the idh-1neo significantly increased their D-2HG levels. Furthermore, dhgd-1 RNAi drives a further increase in D-2HG in idh-1neo animals and led to 100% penetrant embryonic lethality among the F1 generation of idh-1neo animals. Together I think this provided strong evidence for a causal relationship between the D-2HG accumulation and observed phenotypes. Further characterizing these phenotypes would be interesting but is beyond the scope of this paper.

      4/ The authors should clarify whether it is really vitamin B12 or any other metabolite from the bacteria (like methionine) that is bringing about the phenotypes. Have they tested metabolically inactive bacteria?

      The authors show that supplementing B12-treated idh-1neo animals with formate (another 1C donor) restored the survival of idh-1neo embryos, supporting a role for B12 in depletion of the 1C pool. They also show that suppressing Met/SAM cycle genes in idh-1neo prevents 1C depletion and restores availability of 1C units. So the evidence that 1C unit depletion is at the core of the observed phenotypes is pretty convincing.

      7/ The authors should conduct metabolomic profiling to examine changes in metabolic pathways, including 1C, glycine metabolism, glucose metabolism etc, in idh-1neo animals subjected to GCS gene knockdown, and vitamin B12 supplementation.

      It is not clear how these experiments would add to this story. They would open up another line of research.

      8/ The audience will be limited to the field although the study pertains to an oncometabolite. The study value would have improved if the authors had included cancer cell data. Also, the phenotype studied has not been mechanistically linked to the oncometabolite function, making the study academic in nature.

      The interest of this study is that it is being carried out in an organismal context.

      Reviewer #2 (Significance (Required)):

      As a geneticist with a general interest in metabolomics I find this an elegant study that offers new insight into how IDH-1 and -2 neomorphic mutations affect metabolic rewiring in the context of a whole animal. Although similarities are observed between idh-1neo mutants and animals lacking the D-2HG dehydrogenase dhgd-1, both of which have increased levels of the metabolite D-2HG, specific metabolic differences are observed. The identification of 1C unit deficiency as a driver of lethality in idh-1neo mutants is highly significant given the central importance of 1C metabolism. This study should therefore be of interest to a wide audience.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Ponomarova et al presents a short follow up of their previous study to elucidate the role of a oncogenic variant of idh-1 that increases the 3HP levels, similar to the Ddhgd-1 mutant. Using a combination of metabolomics and genetics, they show that the defect in idh-1neo worms on high vitamin B12 diet is the draining of the 1C pool, distinct from the mechanisms of lethality observed in the Ddhgd-1 mutant. While the findings are interesting, there is a lack of mechanistic understanding of the basis of the phenotype observed. Moreover, the authors do not establish the link between the oncometabolite, that should support uncontrolled cell division, with the observed phenotype. Some control experiments are missing and should be included in the revised manuscript. there could be many other The comments on the manuscript are as follows, in no particular order:

      1. The authors treat the idh-1neo worms with vitamin B12 to reduce 3HP concentrations. The authors should consider conducting experiments to reduce 3HP by other means also. This would help establish a causal relationship between the D-2HG accumulation and observed phenotypes.

      To further examine the link between 3HP and idh-1neo embryonic lethality, we targeted hphd-1 by RNAi, which increases 3HP levels (Ponomarova et al., 2023). Knockdown of hphd-1 did not induce lethality in wild-type animals or exacerbate lethality in idh-1neo animals (Figure S3), further demonstrating that a lack of 3HP degradation is not linked to this phenotype (lines 143-145).

      Also, see cross-comments from Reviewer #2 above.

      The authors should investigate the functional impact of HPHD-1 inhibition on 3-hydroxypropionate levels and D-2HG accumulation by RNAi knockdown of HPHD-1 in idh-1neo animals.

      We have now performed the suggested experiment; please see our response to comment 1 above.

      The authors do not clearly state which diet was used in some of their experiments. This is important since the two diets used (OP50 and HT115) differ in their vitamin B12 content, and thus could have different consequences.

      We added this information in figures, figure legends, and lines 259-260 of the revised manuscript.

      The authors should clarify whether it is really vitamin B12 or any other metabolite from the bacteria (like methionine) that is bringing about the phenotypes. Have they tested metabolically inactive bacteria?

      The reviewer correctly points out that bacterial metabolism may play a role in the effects exerted by vitamin B12. We have not tested metabolically inactivated bacteria; however, our RNAi experiments (Figure 4E) demonstrate that supplemented vitamin B12 acts through the Met/SAM cycle in idh-1neo animals. Please also see cross-comments from Reviewer #2.

      The authors consistently use 64 nM of Vitamin B12. Will the hphd-1 mutant and the idh-1neo mutant have different vitamin B12 thresholds for the observed phenotypes?

      Thank you for raising this interesting point. While 64 nM vitamin B12 virtually eliminates 3HP accumulation in idh-1 animals (Figure 2D), we have not tested if this dose is sufficient to eliminate 3HP accumulation in hphd-1 mutant. However, potential differences in 3HP levels in idh-1neo and hphd-1 animals treated with vitamin B12 would not contradict our conclusion that 3HP is not the cause of embryonic lethality in idh-1neo mutant animals.

      Figure 3b: HT115 has inherently high levels of vitamin B12 so the RNAi effect of genes should be seen on the OP50 diet supplemented with B12.

      Despite reports of elevated B12 levels in E. coli HT115, vitamin B12-induced embryonic lethality of idh-1neo on a diet of OP50 is more severe than on a diet of HT115 bacteria (Figure 4C). Therefore, it may be harder to quantify the synthetic lethal interaction of idh-1neo with GCS RNAi knockdown using OP50 strains (which would need to be created).

      The authors should conduct metabolomic profiling to examine changes in metabolic pathways, including 1C, glycine metabolism, glucose metabolism etc, in idh-1neo animals subjected to GCS gene knockdown, and vitamin B12 supplementation.

      While these results would be interesting and further our understanding of metabolic changes that occur in idh-1neo mutant animals we think they are beyond the scope of the manuscript. Also, please see cross-comments from Reviewer #2.

      Perform rescue experiments using different one-carbon donors (e.g., formate, serine) to restore embryonic viability in idh-1neo mutants under conditions of vitamin B12-induced stress. Quantify the efficacy of these interventions using developmental assays.

      In addition to formate rescue experiments (Figure 4C), we supplemented idh-1neo animals with serine (Figure 4D and S7). Similar to formate, serine supplementation resulted in the rescue of idh-1neo embryonic lethality on an E. coli OP50 diet (lines 187-189). The lack of rescue on an HT115 diet could be due to HT115 bacteria containing more glycine (Gao et al., 2017), which might limit the efficiency of serine conversion to glycine needed for 1C unit production.

      Provide experimental evidence to show that idh-1neo animals possess an alternative source of energy.

      We have previously found that diminished production of ketone bodies in ∆dhgd-1 mutants causes embryonic lethality that can be rescued by exogenous supplementation of ketone body 3-hydroxybutyrate (Ponomarova et al., 2023). In contrast to dhgd-1 mutants, idh-1neo embryonic lethality fails to respond to supplemented 3-hydroxybutyrate (Figure S4), indicating the lethality associated with the idh-1neo mutation is caused by a different mechanism, i.e., a depletion in 1C-units.

      The authors use vitamin B12 to inhibit the shunt pathway (line 127). They should explore alternate strategies to do the same, like gene knockdown.

      Please see our response to comment 1 above where we discuss RNAi knock-down of the shunt pathway gene, hphd-1.

      It is not clear why the authors did not follow up with the other phenotypes of the idh-1neo that were visible without the Vitamin B12 supplementation. They should follow up with this and also other phenotypes to explore the broader physiological consequences of D-2HG accumulation.

      We agree that the other physiological consequences of D-2HG accumulation are interesting, and we plan to investigate them in our future studies.

      The authors should include control experiments without supplementation of vitamin B12, ketone bodies etc. in each of their figures.

      We thank the reviewer for this suggestion. We have added these data (Figures S5, 6, 7, and 8).

      The authors posit that the idh-1neo depletes the 1C pool leading to the observed lethality. So, when they supply formate to replenish it, they rescue the lethality of the B12-treated worms. Similar results are obtained by knocking down the enzymes. So where are the 1C units going? Understanding this will provide the much-needed mechanistic understanding to this study.

      We appreciate this insightful comment and expand our discussion to elaborate on this issue (lines 224-227). "We propose that a lack of 1C units in idh-1neo can impede pyrimidine biosynthesis via thymidylate synthase tyms-1, which uses 1C units to generate dTMP. Supporting this hypothesis, RNAi of tyms-1 causes embryonic lethality (36-38)."

      It may be important to measure the D-2HG levels in the mitochondria vs the cytosol.

      While this is an interesting point, we think that this line of inquiry is beyond the scope of this work (and is technically challenging).

      The idh-1neo is an oncometabolite. The authors do not show any data to indicate whether this mutant has any defect in cell division/cell cycle in the somatic tissue or germline.

      In this study we primarily focused on the molecular changes in the metabolic network that occur in idh-1neo mutant animals, which we think is an important advance in understanding the basis for how this mutation affects IDH function. Additional phenotypic outcomes of these perturbed metabolic processes will be the basis of future studies.

      Reviewer #3 (Significance (Required)):

      The audience will be limited to the field although the study pertains to an oncometabolite. The study value would have improved if the authors had included cancer cell data. Also, the phenotype studied has not been mechanistically linked to the oncometabolite function, making the study academic in nature.

      While we agree that the link between idh-1neo, 2HG production and oncometabolite function has not been directly shown, we think that our study adds important molecular understanding of the metabolic changes that occur in relation to idh-1neo function, which is important for future studies of how this mutation affects carcinogenesis. Also, please see cross-comments from Reviewer #2.

      In addition, we specified statistical significance in Figure 2, described statistical tests used (lines 361-363) and corrected a few grammatical errors throughout the text.

    1. Author Response

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      The manuscript by Sejour et al. is testing "translational ramp" model described previously by Tuller et al. in S. cerevisiae. Authors are using bioinformatics and reporter based experimental approaches to test whether "rare codons" in the first 40 codons of the gene coding sequences increase translation efficiency and regulate abundance of translation products in yeast cells. Authors conclude that "translation ramp" model does not have support using a new set of reporters and bioinformatics analyses. The strength of bioinformatic evidence and experimental analyses (even very limited) of the rare codons insertion in the reporter make a compelling case for the authors claims. However the major weakness of the manuscript is that authors do not take into account other models that previously disputed "rare or slow codon" model of Tuller et al. and overstate their own results that are rather limited. This maintains to be the weak part of the manuscript even in the revised form.

      We are glad the reviewer thinks our evidence makes “a compelling case for the authors claims”. This was our main aim, and we are satisfied with this.

      The reviewer believes the major weakness of the manuscript is that we do not take into account other models and do not (see below) cite numerous other relevant papers. The reviewer made essentially the same criticism at the first review, at which time we looked quite hard for papers generally meeting the reviewer’s description. We found a few, which we incorporated here. Still, we did not find the body of evidence whose existence the reviewer implies. We are citing every study we know to be relevant, though of course we will have inadvertently missed some, given the huge body of literature. After the first round of review, we wrote “the reviewer did not give specific references, and, though we looked, we weren’t always sure which papers the reviewer had in mind.” We hoped the reviewer would provide citations. But only two citations are provided here, both to A. Kochetov, and these don’t seem central to the reviewer’s points.

      The studies that authors do not mention argue with "translation ramp" model and show more thorough analyses of translation initiation to elongation transition as well as early elongation "slow down" in ribosome profiling data. Moreover several studies have used bioinformatical analyses to point out the evolution of N-terminal sequences in multiple model organisms including yeast, focusing on either upstream ORFs (uORFs) or already annotated ORFs. The authors did not mention multiple of these studies in their revised manuscript and did not comment on their own results in the context of these previous studies.

      Mostly, we do not know to what papers the reviewer is referring. This may be our failing, but it would have helped if the reviewer had cited one of them. There are papers discussing the evolution of N-terminal sequences, but as far as we know, these do not discuss translation speed or codon usage. Of course, we may have missed some papers.

      As such the authors approach to data presentation, writing and data discussion makes the manuscript rather biased, focused on criticizing Tuller et al. study and short on discussing multiple other possible reasons for slow translation elongation at the beginning of the protein synthesis. This all together makes the manuscript at the end very limited.

      We think the reviewer may be considering our paper as being generally about translation speeds, whereas in our minds, it is not. This difference in views as to what the paper is “about” is perhaps causing friction. To us, it is indeed a limited paper. We are narrowly focused on the finding of Tuller that there is an enrichment of rare, slow codons at the 5’ end of genes, and we have sought an explanation of this particular fact. This is not a paper about rates of translation generally—it is a limited paper about the reason for the 5’ enrichment of rare, slow codons.

      To expand on this, the encoded slow 5’ translation due to rare, slow codons (of Tuller et al.) is a small effect (1% to 3%). The possible unencoded slow 5’ translation of unknown mechanism discussed by some other papers (e.g., Weinberg et al. 2016, Shah et al. 2013) is a much larger effect (50% or more). Just from the different magnitudes, it seems likely these are different phenomena. And yet, despite the small size of the encoded effect, it is for some reason this paper by Tuller et al. that has captured the attention of the literature: as we point out below, Tuller et al. has been cited over 900 times. Partly because of the wide and continuing influence of this paper, it is worth specifically and narrowly addressing its findings.

      Reviewer #2 (Public Review):

      Tuller et al. first made the curious observation, that the first ∼30-50 codons in most organisms are encoded by scarce tRNAs and appear to be translated slower than the rest of the coding sequences (CDS). They speculated that this has evolved to pace ribosomes on CDS and prevent ribosome collisions during elongation - the "Ramp" hypothesis. Various aspects of this hypothesis, both factual and in terms of interpreting the results, have been challenged ever since. Sejour et al. present compelling results confirming the slower translation of the first ~40 codons in S. cerevisiae but providing an alternative explanation for this phenomenon. Specifically, they show that the higher amino acid sequence divergence of N-terminal ends of proteins and accompanying lower purifying selection (perhaps the result of de novo evolution) is sufficient to explain the prevalence of rare slow codons in these regions. These results are an important contribution in understanding how aspects of the evolution of protein coding regions can affect translation efficiency on these sequences and directly challenge the "Ramp" hypothesis proposed by Tuller et al.

      I believe the data is presented clearly and the results generally justify the conclusions.

      We thank the reviewer for his/her attention to the manuscript, and for his/her comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      As mentioned in the public review major weakness of the manuscript is the lack of analyses for confounding effects, overstatements of the results (using single amino acid sequence reporter) and the lack of discussion of previous work that argues against Tuller et al model. In my previous review I mentioned multiple other studies that addressed "slow codons" model in more detail.

      No, the reviewer did not cite any specific studies.

      While some of these studies are mentioned in the revised manuscript, authors are still rather biased and selective in their discussions. I should also point out that previous studies, that authors fail again to mention, were focused on either translation initiation, initiation to elongation transition or early elongation effects in relation to mRNA sequence, structure, codons as well as amino acid sequence. Also additional studies with bioinformatic analyses of N-terminal conservation and existence of start sites at the beginning of the protein sequences in multiple model organisms were also omitted.

      Again, we do not know to what papers the reviewer is referring. But this sounds like a lot. Our paper is aimed at a specific, narrow topic: Why is there an excess of rare, slow codons in the 5’ region of genes? We are not trying to make general statements about all things affecting and affected by translation speed; we are just trying to explain the excess of rare, slow codons.

      In general manuscript seems to be too much focused-on discussion of Tuller's paper . . .

      Yes, we are focused on the Tuller findings, the excess of rare slow codons in 5’ regions.

      . . . and arguing with the model that was already shown by multiple other studies to be limited and not correct.

      We find it unsatisfactory that the reviewer states in a public review that there are multiple other studies showing that the Tuller model is not correct, and yet does not cite any of them. Furthermore, for the reviewer to say that Tuller et al. is “not correct” is too sweeping. The core finding of Tuller et al. was the excess of rare, slow codons in the 5’ regions of genes. We confirm this; we believe it is correct; we are not aware of any literature disputing this. Then, Tuller interpreted this as an adaptation to promote translational efficiency. On the interpretation, we disagree with Tuller. But if one is to disagree with this interpretation, one needs an alternative explanation of the fact of the excess rare, slow codons. Providing such an alternative explanation, and doing an experiment to distinguish the explanations, is our contribution. We are not aware of any other paper making our interpretation.

      There are of course many papers that discuss various aspects of translation at the 5’ ends of genes, and we do cite quite a few such papers in our manuscript, though certainly not all. But papers of this general kind do not, and cannot, show that Tuller et al. is “not correct”. As far as we know, no paper provides an alternative explanation for the rare slow codons, and no paper does an experiment to modulate translation speed and look at the effect on gene expression. Notably, the slow translation phenomenon associated with the rare codons found by Tuller et al. is a very small effect—a change of about 1% to 3% of translation speed. Some other papers on translation speed are dealing with possible changes in the range of 50% or more. These are presumably some other phenomenon (if indeed they are even real changes in translation speed), and, whether they are true or not, the results and interpretations of Tuller et al. could still be true or not. Of course, if we knew of some previous paper showing the Tuller paper is not correct, we should and would cite it.

      To expand on the current view of Tuller in the literature, Tuller et al. has been cited 956 times according to Google Scholar. This makes it an extremely influential paper. After finding Tuller et al. in Entrez Pubmed, one can look under “Cited by” and see the five most recent papers that cite Tuller et al. The five papers given on May 23 2024 were Bharti . . . Ignatova 2024; Uddin 2024; Khandia . . . Choudhary 2024; Love and Nair 2024; and Oelschlaeger 2024. We went through these five most recent papers that cite Tuller et al., and asked, did these authors cite the Tuller results as fully correct, or did they mention any doubts about the results? All five of the papers cited the Tuller results as fully correct, with no mention of any kind of doubt. For instance, Khandia et al. 2024 state “The slow “ramp” present at 5’ end of mRNA forms an optimal and robust means to reduce ribosomal traffic jams, thus minimizing the cost of protein expression40.”, while Oelschlaeger (2024) states “Slow translation ramps have also been described elsewhere and proposed to prevent traffic jams along the mRNA [51,52,53].” Although Uddin (2024) cited Tuller as fully correct, Uddin seemed to think (it is a little unclear) that Tuller found an enrichment of highly-used codons, opposite to the actual finding. The multiple contrary studies mentioned by the reviewer do not seem to have been very influential.

      There are papers containing skepticism about the Tuller interpretation, and also papers with results that are difficult to reconcile in a common-sense way with the Tuller interpretation. But skepticism, and a difficulty to reconcile with common sense, are far from a demonstration that a paper is incorrect. Indeed, Tuller et al. may have been published in Cell, and may be so highly cited, exactly because the findings are counter-intuitive, colliding with common sense. Our contribution is to find a common-sense interpretation of the surprising but correct underlying fact of the 5’ enrichment of rare, slow codons.

      Having wrote that in the previous review, I have to admit that Sejour et al manuscript in the main text has a minimal amount of novelty with experimental evidence, the conclusions are based on three reporters with and without stalling/collision sequence with the same amino acid sequence and varying codons. Some more novelty is seen in bioinformatic analyses of multiple yeast sequences and sequence conservation at the N-termini of proteins. However, even this part of the manuscript is not discussed fully and with correct comparison to previous studies. Authors, based on my previous comments discuss further experimental shortcomings in their new and "expanded" discussion but the use of a single reporter in this case cannot relate to all differences that may be coming from ORFs seen in complete yeast transcriptome. There are multiple studies that used more reporters with more than one amino-acid and mRNA sequence as well as with similar variation of the rare or common codons. The handwaving argument about the influence of all other mechanisms that can arise from different start sites, RNA structure, peptide interaction with exit channel, peptidyl-tRNA drop-off, eIF3 complex initiation-elongation association, and etc, is just pointing up to a manuscript that is more about bashing up Tuller's model and old paper than trying to make a concise story about their own results and discuss their study in plethora of studies that indicated multiple other models for slow early elongation.

      We don’t understand why the reviewer is so grudging.

      Discussion of the ribosome's collisions and potential impact of such scenario in the author's manuscript is left completely without citation, even though such work has relevant results to the author's conclusions and Tuller's model.

      This is not true. We cite Dao Duc and Song (2018) “The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation.” PLoS Genet 14, and Tesina, . . . and Green (2020) “Molecular mechanism of translational stalling by inhibitory codon combinations and Poly(A) tracts.” EMBO J., which are two excellent papers on this subject. We also cite Gamble et al. (2016), who found the underlying result, but at that time did not attribute it to ribosome collisions.

      Previous studies (not cited) for example clearly indicate how the length from stalling sequence to start codon is related to ribosome collisions. Moreover such studies are pointing out differences in initiation vs elongation rates that may impact ribosome collisions and protein expression. Both of these topics would be very valuable in discussions of evolutionary changes in the current yeast ORFs. Not to mention that authors do not really discuss also possibilities for differences in 5'UTRs and uORFs in relation to downstream ORFs sequence and codon composition.

      It is not clear to us that such papers are highly relevant to the issue on which we are working.

      The argument about whether cycloheximide or not is doing 5' ribosome slowdown (lines 425-443) is just rambling about Weinberg's paper from 2016 without any real conclusion. In this section authors are just throwing down hypothesis that were more clearly explained in Weinberg's manuscript or shown experimentally in studies done after the Weinberg et al. paper was published.

      Earlier, the reviewer had the criticism that “The studies that authors do not mention argue with "translation ramp" model and show more thorough analyses of translation initiation to elongation transition as well as early elongation "slow down" in ribosome profiling data.” The main study we know of dealing with issues like these is that of Weinberg et al. 2016. In our opinion, this is a thoughtful paper on these issues. But now, at this point, the reviewer seems to criticize the fact that we do extensively cite results from Weinberg et al. It is true that there is no ultimate conclusion, but why there is no conclusion is a little bit interesting. Weinberg et al. show that even in studies that do not use cycloheximide as the first step in ribosome profiling, there is some left-over high density of ribosomes near 5’ ends. But all these ribosome profiling experiments do use cycloheximide at a later step in the procedure. Until someone does a ribosome profiling experiment without the use of any cycloheximide at any step, there will be no firm conclusion. This is not our fault, and it is also not the issue we are writing about. And the reason this paragraph is in the manuscript at all is that the reviewer (we thought) had asked for something like this in the first review.

      At the end, even in the limited novelty of evolutionary arguments about non-existing N-terminal conservation of codons or amino acids they fail to cite and discuss previous work by Kochetov (BioEssays, 2008 and NAR, 2011) which have additional explanation on evolution of N-terminal sequences in yeast, human or Drosophila.

      These two papers of Dr. Kochetov’s have some relevance and we now cite them. These are the only papers cited by the reviewer in his/her two reviews.

      Probably the reviewer would have preferred a paper on a different subject.


      The following is the authors’ response to the original reviews.

      Response to Reviewers:

      We thank the reviewers for their comments, and their evident close reading of the manuscript. Generally, we agree with the reviewers on the strengths and weaknesses of our manuscript. Our revised manuscript has a more extensive discussion of alternative explanations for the initial high ribosome density seen by ribosome profiling, and it points out the limitations of our work more specifically.

      As a preface to specific responses to the reviewers, we will say that we could divide observations of slow initial translation into two categories, which we will call “encoded slow codons”, and “increased ribosome density”. With respect to the first category, Tuller et al. documented initial “encoded slow codons”, that is, there is a statistical excess of rare, slowly-translated codons at the 5’ ends of genes. Although the size of this effect is small, statistical significance is extremely high, and the existence of this enrichment is not in any doubt. At first sight, this appears to be a strong indication of a preference for slow initial translation. In our opinion, our main contribution is to show that there is an alternative explanation for this initial enrichment of rare, slow codons—that they are a spandrel, a consequence of sequence plasticity at the 5’ (and 3’) ends of genes. The reviewers seem to generally agree with this, and we are not aware that any other work has provided an explanation for the 5’ enrichment of rare codons.
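
      To make the notion of “encoded slow codons” concrete, the following is a minimal sketch of how one might compare the pooled fraction of rare codons in the first 40 codons of each gene with the fraction in the remainder of the coding sequence. It is an illustration only and not the analysis used in the manuscript or by Tuller et al.: the rare-codon set, the window size as a parameter, the function names, and the toy sequence are all assumptions chosen for the example.

```python
# Illustrative sketch only. RARE_CODONS, WINDOW, and the toy sequence are
# hypothetical placeholders, not data or methods taken from the manuscript.

RARE_CODONS = frozenset({"CGG", "CGA", "CTC", "ACG"})  # assumed example set
WINDOW = 40  # number of codons treated as the 5' region


def split_codons(cds):
    """Split a coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def rare_fraction(codons):
    """Fraction of codons that belong to the assumed rare-codon set."""
    if not codons:
        return 0.0
    return sum(c in RARE_CODONS for c in codons) / len(codons)


def five_prime_enrichment(cds_list, window=WINDOW):
    """Return (rare fraction in the first `window` codons, rare fraction elsewhere).

    Codons are pooled across all genes; the start codon is skipped.
    """
    head, body = [], []
    for cds in cds_list:
        codons = split_codons(cds)[1:]  # skip the start codon
        head.extend(codons[:window])
        body.extend(codons[window:])
    return rare_fraction(head), rare_fraction(body)


if __name__ == "__main__":
    # Toy gene: start codon, 10 rare codons, then 130 common codons.
    toy = "ATG" + "CGG" * 10 + "GCT" * 130
    head, body = five_prime_enrichment([toy])
    print(head, body)  # 0.25 0.0
```

      A real analysis would define rare codons from genome-wide usage or tRNA adaptation measures and would test the difference statistically across all genes; the sketch only shows the shape of the comparison.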

      The second category of observations pertaining to slow initial translation is “increased ribosome density”. Early ribosome profiling studies used cycloheximide to arrest cell growth, and these studies showed a higher density of ribosomes near the 5’ end of genes than elsewhere. This high initial ribosome density helped motivate the paper of Tuller et al., though their finding of “encoded slow codons” could explain only a very small part of the increased ribosome density. More modern ribosome profiling studies do not use cycloheximide as the first step in arresting translation, and in these studies, the density of ribosomes near the 5’ end of genes is greatly reduced. And yet, there remains, even in the absence of cycloheximide at the first step, a significantly increased density of ribosomes near the 5’ end (e.g., Weinberg et al., 2016). (However, most or all of these studies do use cycloheximide at a later step in the protocol, and the possibility of a cycloheximide artefact is difficult to exclude.) Some of the reviewer’s concerns are that we do not explain the increased 5’ ribosome density seen by ribosome profiling. We agree; but we feel it is not the main point of our manuscript. In revision, we more extensively discuss other work on increased ribosome density, and more explicitly point out the limitations of our manuscript in this regard. We also note, though, that increased ribosome density is not a direct measure of translation speed—it can have other causes.

      Specific Responses.

      Reviewer 1 was concerned that we did not more fully discuss other work on possible reasons for slow initial translation. We discuss such work more extensively in our revision. However, as far as we know, none of this work proposes a reason for the 5’ enrichment of rare, slow codons, and this is the main point of our paper. Furthermore, it is not completely clear that there is any slow initial translation. The increase in ribosome density seen in flash-freeze ribosome profiling could be an artefact of the use of cycloheximide at the thaw step of the protocols; or it could be a real measure of high ribosome density that occurs for some other reason than slow translation (e.g., ribosomes might have low processivity at the 5’ end).

      Reviewer 1 was also concerned about confounding effects in our reporter gene analysis of the effects of different codons on efficiency of translation. We have two comments. First, it is important to remember that although we changed codons in our reporters, we did not change any amino acids. We changed codons only to synonymous codons. Thus at least one of the reviewer’s possible confounding effects—interactions of the nascent peptide chain with the exit channel of the ribosome—does not apply. However, of course, the mRNA nucleotide sequence is altered, and this would cause a change in mRNA structure or abundance, which could matter. We agree this is a limitation to our approach. However, to fully address it, we feel it would be necessary to examine a really large number of quite different sequences, which is beyond the scope of this work. Furthermore, mRNAs with low secondary structure at the 5’ end probably have relatively high rates of initiation, and also relatively high rates of elongation, and it might be quite difficult to disentangle these. But in neither case is there an argument that slow initial translation is efficient. Accurate measurement of mRNA levels would be helpful, but would not disentangle rates of initiation from rates of elongation as causes of changes in expression.

      Reviewer 2 was concerned that the conservation scores for the 5’ 40 amino acids, and the 3’ 40 amino acids were similar, but slow translation was only statistically significant for the 5’ 40 amino acids. As we say in the manuscript, we are also puzzled by this. We note that 3’ translation is statistically slow, if one looks over the last 100 amino acids. Our best effort at an explanation is a sort of reverse-Tuller explanation: that in the last 40 amino acids, the new slow codons created by genome plasticity are fairly quickly removed by purifying selection, but that in the first 40 amino acids, for genes that need to be expressed at low levels, purifying selection against slow codons is reduced, because poor translation is actually advantageous for these genes. To expand on this a bit, we feel that the 5000 or so proteins of the proteome have to be expressed in the correct stoichiometric ratios, and that poor translation can be a useful tool to help achieve this. In this explanation, slow translation at the 5’ end is bad for translation (in agreement with our reporter experiments), but can be good for the organism, when it occurs in front of a gene that needs to be expressed poorly. Whereas, in Tuller, slow translation at the 5’ end is good for translation.

      Reviewer 2 wondered whether the N-terminal fusion peptide affects GFP fluorescence in our reporter. This specific reporter, with this N-terminus, has been characterized by Dean and Grayhack (2012), and by Gamble et al. (2016), and the idea that a super-folder GFP reporter is not greatly affected by N-terminal fusions is based on the work of Pedelacq (2006). None of these papers show whether this N-terminal fusion might have some effect, but together, they provide good reason to think that any effect would be small. These citations have been added.

    1. Author response:

      Reviewer #1 (Public Review):

      Abbasi et al. assess in this MEG study the directed connectivity of both cortical and subcortical regions during continuous speech production and perception. The authors observed bidirectional connectivity patterns between speech-related cortical areas as well as subcortical areas in production and perception. Interestingly, they found in speaking low-frequency connectivity from subcortical (the right cerebellum) to cortical (left superior temporal) areas, while connectivity from the cortical to subcortical areas was in the high frequencies. In listening a similar cortico-subcortical connectivity pattern was observed for the low frequencies, but the reversed connectivity in the higher frequencies was absent.

      The work by Abbasi and colleagues addresses a relevant, novel topic, namely understanding the brain dynamics between speaking and listening. This is important because traditionally production and perception of speech and language are investigated in a modality-specific manner. To have a more complete understanding of the neurobiology underlying these different speech behaviors, it is key to also understand their similarities and differences. Furthermore, to do so, the authors utilize state-of-the-art directed connectivity analyses on MEG measurements, providing a quite detailed profile of cortical and subcortical interactions for the production and perception of speech. Importantly, and perhaps most interesting in my opinion, is that the authors find evidence for frequency-specific directed connectivity, which is (partially) different between speaking and listening. This could suggest that both speech behaviors rely (to some extent) on similar cortico-cortical and cortico-subcortical networks, but different frequency-specific dynamics.

      These elements mentioned above (investigation of both production and perception, both cortico-cortical and cortico-subcortical connectivity is considered, and observing frequency-specific connectivity profiles within and between speech behaviors), make for important novel contributions to the field. Notwithstanding these strengths, I find that they are especially centered on methodology and functional anatomical description, but that precise theoretical contributions for neurobiological and cognitive models of speech are less transparent. This is in part because the study compares speech production and perception in general, but no psychophysical or psycholinguistic manipulations are considered. I also have some critical questions about the design which may pose some confounds in interpreting the data, especially with regard to comparing production and perception.

      (1) While the cortico-cortical and cortico-subcortical connectivity profiles highlighted in this study and the depth of the analyses are impressive, what these data mean for models of speech processing remains on the surface. This is in part due, I believe, to the fact that the authors have decided to explore speaking and listening in general, without targeting specific manipulations that help elucidate which aspects of speech processing are relevant for the particular connectivity profiles they have uncovered. For example, the frequency-specific directed connectivity is it driven by low-level psychophysical attributes of the speech or by more cognitive linguistic properties? Does it relate to the monitoring of speech, timing information, and updating of sensory predictions? Without manipulations trying to target one or several of these components, as some of the referenced work has done (e.g., Floegel et al., 2020; Stockert et al., 2021; Todorović et al., 2023), it is difficult to draw concrete conclusions as to which representations and/or processes of speech are reflected by the connectivity profiles. An additional disadvantage of not having manipulations within each speech behavior is that it makes the comparison between listening and speaking harder. That is, speaking and listening have marked input-output differences which likely will dominate any comparison between them. These physically driven differences (or similarities for that matter; see below) can be strongly reduced by instead exploring the same manipulations/variables between speaking and listening. If possible (if not to consider for future work), it may be interesting to score psychophysical (e.g., acoustic properties) or psycholinguistic (e.g., lexical frequency) information of the speech and see whether and how the frequency-specific connectivity profiles are affected by it.

      We thank the reviewer for pointing this out. The current study is indeed part of a larger project investigating the role of the internal forward model in speech perception and production. In the original, more comprehensive study, we also included a masked condition where participants produced speech as usual, but their auditory perception was masked. This allowed us to examine how the internal forward model behaves when it doesn't receive the expected sensory consequences of generated speech. However, for the current study, we focused solely on data from the speaking and listening conditions due to its specific research question. We agree that further manipulations would be interesting. However, for this study our focus was on natural speech and we avoided other manipulations (beyond masked speech) so that we can have sufficiently long recording time for the main speaking and listening conditions.

      (2) Recent studies comparing the production and perception of language may be relevant to the current study and add some theoretical weight since their data and interpretations for the comparisons between production and perception fit quite well with the observations in the current work. These studies highlight that language processes between production and perception, specifically lexical and phonetic processing (Fairs et al., 2021), and syntactic processing (Giglio et al., 2024), may rely on the same neural representations, but are differentiated in their (temporal) dynamics upon those shared representations. This is relevant because it dispenses with the classical notion in neurobiological models of language where production and perception rely on (partially) dissociable networks (e.g., Price, 2010). Rather those data suggest shared networks where different language behaviors are dissociated in their dynamics. The speech results in this study nicely fit and extend those studies and their theoretical implications.

      We thank the reviewer for the suggestion and we will include these references and the points made by the reviewer in our revised manuscript.

      (3) The authors align the frequency-selective connectivity between the right cerebellum and left temporal speech areas with recent studies demonstrating a role for the right cerebellum for the internal modelling in speech production and monitoring (e.g., Stockert et al., 2021; Todorović et al., 2023). This link is indeed interesting, but it does seem relevant to point out that at a more specific scale, it does not concern the exact same regions between those studies and the current study. That is, in the current study the frequency-specific connectivity with temporal regions concerns lobule VI in the right cerebellum, while in the referenced work it concerns Crus I/II. The distinction seems relevant since Crus I/II has been linked to the internal modelling of more cognitive behavior, while lobule VI seems more motor-related and/or contextual-related (e.g., D'Mello et al., 2020; Runnqvist et al., 2021; Runnqvist, 2023).

      We thank the reviewer for their insightful comment. The reference was intended to provide evidence for the role of the cerebellum in internal modelling in speech. We do not claim that we have the spatial resolution with MEG to reliably spatially resolve specific parts of the cerebellum.

      (4) On the methodological side, my main concern is that for the listening condition, the authors have chosen to play back the speech produced by the participants in the production condition. Both the fixed order as well as hearing one's own speech as listening condition may produce confounds in data interpretation, especially with regard to the comparison between speech production and perception. Could order effects impact the observed connectivity profiles, and how would this impact the comparison between speaking and listening? In particular, I am thinking of repetition effects present in the listening condition as well as prediction, which will be much more elevated for the listening condition than the speaking condition. The fact that it also concerns their own voice furthermore adds to the possible predictability confound (e.g., Heinks-Maldonado et al., 2005). In addition, listening to one's speech which just before has been articulated may, potentially strategically even, enhance inner speech and "mouthing" in the participants, hereby thus engaging the production mechanism. Similarly, during production, the participants already hear their own voice (which serves as input in the subsequent listening condition). Taken together, both similarities or differences between speaking and listening connectivity may have been due to or influenced by these order effects, and the fact that the different speech behaviors are to some extent present in both conditions.

      This is a valid point raised by the reviewer. By listening to their own previously produced speech, our participants might have anticipated and predicted the sentences more easily. However, when designing our experiment, we took several steps to lower the chance of this anticipation. First, participants were measured in separate sessions for the speech production and perception tasks, and there was always an interval of several days between the two conditions. Second, our questions were mainly about common, general topics. Consequently, participants may not have remembered their answers completely.

      Importantly, using the same stimulus material for speaking and listening guaranteed that there was no difference in the low-level features of the material for both conditions that could have affected the results of our statistical comparison.

      Due to bone conduction, hearing one’s unaltered own speech from a recording may seem foreign and could lead to unwanted emotional reactions (e.g., embarrassment). Participants were therefore asked whether they had already heard their own voice in a recording (e.g., in a self-recorded WhatsApp voice message), which most of them confirmed. Participants were also informed that they were going to hear themselves during the measurement to further reduce unwanted psychophysiological responses.

      (5) The ability of the authors to analyze the spatiotemporal dynamics during continuous speech is a potentially important feat of this study, given that one of the reasons that speech production is much less investigated compared to perception concerns motor and movement artifacts due to articulation (e.g., Strijkers et al., 2010). Two questions did spring to mind when reading the authors' articulation artifact correction procedure: If I understood correctly, the approach comes from Abbasi et al. (2021) and is based on signal space projection (SSP) as used for eye movement corrections, which the authors successfully applied to speech production. However, in that study, it concerned the repeated production of three syllables, while here it concerns continuous speech of full words embedded in discourse. The articulation and muscular variance will be much higher in the current study compared to three syllables (or compared to eye movements which produce much more stable movement potentials compared to an entire discourse). Given this, I can imagine that corrections of the signal in the speaking condition were likely substantial and one may wonder (1) how much signal relevant to speech production behavior is lost?; (2) similar corrections are not necessary for perception, so how would this marked difference in signal processing affect the comparability between the modalities?

      One of the results of our previous study (Abbasi et al., 2021) was that the artefact correction was not specific to individual syllables but generalised across syllables. Also, the repeated production of syllables was associated with substantial movements of the articulators mimicking those observed during naturalistic speaking. We therefore believe that the artefact rejection is effective during speaking. We also checked this by investigating speech-related coherence in brain parcels in spatial proximity to the articulators. In our previous study we also showed that the correction method retains neural activity to a very large degree. We are therefore confident that the speaking and listening conditions can be compared and that the loss of true signal from correcting the speaking data will be minor.

      References:

      • Abbasi, O., Steingräber, N., & Gross, J. (2021). Correcting MEG artifacts caused by overt speech. Frontiers in Neuroscience, 15, 682419.

      • D'Mello, A. M., Gabrieli, J. D., & Nee, D. E. (2020). Evidence for hierarchical cognitive control in the human cerebellum. Current Biology, 30(10), 1881-1892.

      • Fairs, A., Michelas, A., Dufour, S., & Strijkers, K. (2021). The same ultra-rapid parallel brain dynamics underpin the production and perception of speech. Cerebral Cortex Communications, 2(3), tgab040.

      • Floegel, M., Fuchs, S., & Kell, C. A. (2020). Differential contributions of the two cerebral hemispheres to temporal and spectral speech feedback control. Nature Communications, 11(1), 2839.

      • Giglio, L., Ostarek, M., Sharoh, D., & Hagoort, P. (2024). Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proceedings of the National Academy of Sciences, 121(11), e2310766121.

      • Heinks‐Maldonado, T. H., Mathalon, D. H., Gray, M., & Ford, J. M. (2005). Fine‐tuning of auditory cortex during speech production. Psychophysiology, 42(2), 180-190.

      • Price, C. J. (2010). The anatomy of language: a review of 100 fMRI studies published in 2009. Annals of the New York Academy of Sciences, 1191(1), 62-88.

      • Runnqvist, E., Chanoine, V., Strijkers, K., Pattamadilok, C., Bonnard, M., Nazarian, B., ... & Alario, F. X. (2021). Cerebellar and cortical correlates of internal and external speech error monitoring. Cerebral Cortex Communications, 2(2), tgab038.

      • Runnqvist, E. (2023). Self-monitoring: The neurocognitive basis of error monitoring in language production. In Language production (pp. 168-190). Routledge.

      • Stockert, A., Schwartze, M., Poeppel, D., Anwander, A., & Kotz, S. A. (2021). Temporo-cerebellar connectivity underlies timing constraints in audition. eLife, 10, e67303.

      • Strijkers, K., Costa, A., & Thierry, G. (2010). Tracking lexical access in speech production: electrophysiological correlates of word frequency and cognate effects. Cerebral Cortex, 20(4), 912-928.

      • Todorović, S., Anton, J. L., Sein, J., Nazarian, B., Chanoine, V., Rauchbauer, B., ... & Runnqvist, E. (2023). Cortico-cerebellar monitoring of speech sequence production. Neurobiology of Language, 1-21.

      Reviewer #2 (Public Review):

      Summary:

      The authors re-analyse MEG data from a speech production and perception study and extend their previous Granger causality analysis to a larger number of cortical-cortical and in particular cortical-subcortical connections. Regions of interest were defined by means of a meta-analysis using Neurosynth.org and connectivity patterns were determined by calculating directed influence asymmetry indices from the Granger causality analysis results for each pair of brain regions. Abbasi et al. report feedforward signals communicated via fast rhythms and feedback signals via slow rhythms below 40 Hz, particularly during speaking. The authors highlight one of these connections between the right cerebellum lobule VI and auditory association area A5, where in addition the connection strength correlates negatively with the strength of speech tracking in the theta band during speaking (significant before multiple comparison correction). Results are interpreted within a framework of active inference by minimising prediction errors.

      While I find investigating the role of cortical-subcortical connections in speech production and perception interesting and relevant to the field, I am not yet convinced that the methods employed are fully suitable to this endeavour or that the results provide sufficient evidence to make the strong claim of dissociation of bottom-up and top-down information flow during speaking in distinct frequency bands.

      Strengths:

      The investigation of electrophysiological cortical-subcortical connections in speech production and perception is interesting and relevant to the field. The authors analyse a valuable dataset, where they spent a considerable amount of effort to correct for speech production-related artefacts. Overall, the manuscript is well-written and clearly structured.

      Weaknesses:

      The description of the multivariate Granger causality analysis did not allow me to fully grasp how the analysis was performed and I hence struggled to evaluate its appropriateness. Knowing that (1) filtered Granger causality is prone to false positives and (2) recent work demonstrates that significant Granger causality can simply arise from frequency-specific activity being present in the source but not the target area without functional relevance for communication (Schneider et al. 2021) raises doubts about the validity of the results, in particular with respect to their frequency specificity. These doubts are reinforced by what I perceive as an overemphasis on results that support the assumption of specific frequencies for feedforward and top-down connections, while findings not aligning with this hypothesis appear to be underreported. Furthermore, the authors report some main findings that I found difficult to reconcile with the data presented in the figures. Overall, I feel the conclusions with respect to frequency-specific bottom-up and top-down information flow need to be moderated and that some of the reported findings need to be checked and if necessary corrected.

      Major points

      (1) I think more details on the multivariate GC approach are needed. I found the reference to Schaum et al., 2021 not sufficient to understand what has been done in this paper. Some questions that remained for me are:

      (i) Does multivariate here refer to the use of the authors' three components per parcel or to the conditioning on the remaining twelve sources? I think the latter is implied when citing Schaum et al., but I'm not sure this is what was done here?

      If it was not: how can we account for spurious results based on indirect effects?

      Yes, multivariate refers to the three components.

      (ii) Did the authors check whether the GC of the source-target pairs was reliably above the bias level (as Schaum et al. did for each condition separately)? If not, can they argue why they think that their results would still be valid? Does it make sense to compute DAIs on connections that were below the bias level? Should the data be re-analysed to take this concern into account?

      We performed statistics on DAI and believe that this is a valid approach. We argue that random GC effects would not survive our cluster-corrected statistics.

      (iii) You may consider citing the paper that introduced the non-parametric GC analysis (which Schaum et al. then went on to apply): Dhamala M, Rangarajan G, Ding M. Analyzing Information Flow in Brain Networks with Nonparametric Granger Causality. Neuroimage. 2008; 41(2):354-362. https://doi.org/10.1016/j.neuroimage.2008.02.020

      Thanks, we will add this reference in the revised version.
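
      The directed asymmetry index (DAI) discussed in point (ii) can be illustrated with a minimal sketch. It assumes the commonly used definition, DAI = (GC(a→b) − GC(b→a)) / (GC(a→b) + GC(b→a)), evaluated per frequency bin; the manuscript's exact computation is not given in this exchange, so the function name, the `eps` guard, and the example values below are hypothetical.

```python
import numpy as np

def directed_asymmetry_index(gc_ab, gc_ba, eps=1e-12):
    """DAI per frequency bin: (GC(a->b) - GC(b->a)) / (GC(a->b) + GC(b->a)).

    Assumed definition for illustration; not taken from the manuscript.
    """
    gc_ab = np.asarray(gc_ab, dtype=float)
    gc_ba = np.asarray(gc_ba, dtype=float)
    total = gc_ab + gc_ba
    # Where both directions are (near) zero the ratio is undefined; return 0 there.
    safe_total = np.where(total > eps, total, 1.0)
    return np.where(total > eps, (gc_ab - gc_ba) / safe_total, 0.0)

# Made-up GC spectra for one pair of parcels (illustration only).
freqs = np.array([4.0, 10.0, 20.0, 70.0])      # Hz
gc_a_to_b = np.array([0.03, 0.02, 0.01, 0.01])
gc_b_to_a = np.array([0.01, 0.02, 0.01, 0.03])
dai = directed_asymmetry_index(gc_a_to_b, gc_b_to_a)
# Positive values: a -> b dominates; negative values: b -> a dominates.
```

      Because the index is a ratio, it becomes unstable when both GC values are small, which is one way to see why the reviewer asks whether the underlying GC values exceed the bias level.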

      (2) GC has been discouraged for filtered data as it gives rise to false positives due to phase distortions and the ineffectiveness of filtering in the information-theoretic setting as reducing the power of a signal does not reduce the information contained in it (Florin et al., 2010; Barnett and Seth, 2011; Weber et al. 2017; Pinzuti et al., 2020 - who also suggest an approach that would circumvent those filter-related issues). With this in mind, I am wondering whether the strong frequency-specific claims in this work still hold.

      This must be a misunderstanding. We are aware of the problem with GC on filtered data. But GC was here computed on broadband data and not in individual frequency bands.

      (3) I found it difficult to reconcile some statements in the manuscript with the data presented in the figures:

      (i) Most notably, the considerable number of feedforward connections from A5 and STS that project to areas further up the hierarchy at slower rhythms (e.g. L-A5 to R-PEF, R-Crus2, L-CB6, L-Tha, L-FOP and L-STS to R-PEF, L-FOP, L-TOPJ or R-A5 as well as R-STS both to R-Crus2, L-CB6, L-Th) contradict the authors' main message that 'feedback signals were communicated via slow rhythms below 40 Hz, whereas feedforward signals were communicated via faster rhythms'. I struggled to recognise a principled approach that determined which connections were highlighted and reported and which ones were not.

      (ii) "Our analysis also revealed robust connectivity between the right cerebellum and the left parietal cortex, evident in both speaking and listening conditions, with stronger connectivity observed during speaking. Notably, Figure 4 depicts a prominent frequency peak in the alpha band, illustrating the specific frequency range through which information flows from the cerebellum to the parietal areas." There are two peaks discernible in Figure 4, one notably lower than the alpha band (rather theta or even delta), the other at around 30 Hz. Nevertheless, the authors report and discuss a peak in the alpha band.

      (iii) In the abstract: "Notably, high-frequency connectivity was absent during the listening condition." and p.9 "In contrast with what we reported for the speaking condition, during listening, there is only a significant connectivity in low frequency to the left temporal area but not a reverse connection in the high frequencies."

      While Fig. 4 shows significant connectivity from R-CB6 to A5 in the gamma frequency range for the speaking, but not for the listening condition, interpreting comparisons between two effects without directly comparing them is a common statistical mistake (Makin and Orban de Xivry). The spectrally-resolved connectivity in the two conditions actually look remarkably similar and I would thus refrain from highlighting this statement and indicate clearly that there were no significant differences between the two conditions.

      (iv) "This result indicates that in low frequencies, the sensory-motor area and cerebellum predominantly transmit information, while in higher frequencies, they are more involved in receiving it."

      I don't think that this statement holds in its generality: L-CB6 and R-3b both show strong output at high frequencies, particularly in the speaking condition. While they seem to transmit information mainly to areas outside A5 and STS these effects are strong and should be discussed.

      We appreciate the reviewer's thoughtful comments. We acknowledge that not all connectivity patterns strictly adhere to the initial observation regarding feedback and feedforward communication. It's true that our primary focus was on interactions between brain regions known to be crucial for speech prediction, including auditory, somatosensory, and cerebellar areas. However, we also presented connectivity patterns across other regions to provide a more comprehensive picture of the speech network. We believe this broader perspective can be valuable for future research directions.

      Regarding the reviewer's observation about the alpha band peak in Figure 4, we agree that a closer examination reveals the connectivity from right cerebellum to the left parietal is in a wider low frequency range. We will refrain from solely emphasizing the alpha band and acknowledge the potential contribution of lower frequencies to cerebellar-parietal communication.

      We also appreciate the reviewer highlighting the need for a more nuanced interpretation of the listening condition connectivity compared to the speaking condition. The reviewer is correct in pointing out that while Figure 4 suggests a high-frequency connectivity from L-A5 to R-CB only in the speaking condition, a direct statistical comparison between conditions might not reveal a significant difference. We will revise the manuscript to clarify this point.

      Finally, a closer examination of Figure 3 revealed that the light purple and dark green edges in the speaking condition for R-CB6 and L-3b suggest outgoing connections at low frequencies, while other colored edges indicate information reception at high frequencies. We acknowledge that exceptions to this directional pattern might exist and warrant further investigation in future studies.

      (4) "However, definitive conclusions should be drawn with caution given recent studies raising concerns about the notion that top-down and bottom-up signals can only be transmitted via separate frequency channels (Ferro et al., 2021; Schneider et al., 2021; Vinck et al., 2023)."

      I appreciate this note of caution and think it would be useful if it were spelled out to the reader why this is the case so that they would be better able to grasp the main concerns here. For example, Schneider et al. make a strong point that we expect to find Granger-causality with a peak in a specific frequency band for areas that are anatomically connected when the sending area shows stronger activity in that band than the receiving one, simply because of the coherence of a signal with its own linear projection onto the other area. The direction of a Granger causal connection would in that case only indicate that one area shows stronger activity than the other in the given frequency band. I am wondering to what degree the reported connectivity pattern can be traced back to regional differences in frequency-specific source strength or to differences in source strength across the two conditions.

      This is indeed an important point. That is why we are discussing our results with great caution and specifically point the reader to the relevant literature. We are indeed thinking about a future study where we investigate this connectivity using other connectivity metrics and a detailed consideration of power.

      Reviewer #3 (Public Review):

      In the current paper, Abbasi et al. aimed to characterize and compare the patterns of functional connectivity across frequency bands (1 Hz - 90 Hz) between regions of a speech network derived from an online meta-analysis tool (Neurosynth.org) during speech production and perception. The authors present evidence for complex neural dynamics from which they highlight directional connectivity from the right cerebellum to left superior temporal areas in lower frequency bands (up to beta) and between the same regions in the opposite direction in the (lower) high gamma range (60-90 Hz). Abbasi et al. interpret their findings within the predictive coding framework, with the cerebellum and other "higher-order" (motor) regions transmitting top-down sensory predictions to "lower-order" (sensory) regions in the lower frequencies and prediction errors flowing in the opposite direction (i.e., bottom-up) from those sensory regions in the gamma band. They also report a negative correlation between the strength of this top-down functional connectivity and the alignment of superior temporal regions to the syllable rate of one's speech.

      Strengths:

      (1) The comprehensive characterization of functional connectivity during speaking and listening to speech may be valuable as a first step toward understanding the neural dynamics involved.

      (2) The inclusion of subcortical regions and connectivity profiles up to 90Hz using MEG is interesting and relatively novel.

      (3) The analysis pipeline is generally adequate for the exploratory nature of the work.

      Weaknesses:

      (1) The work is framed as a test of the predictive coding theory as it applies to speech production and perception, but the methodological approach is not suited to this endeavor.

      We agree that we cannot provide definite evidence for predictive coding in speech production and perception and we believe that we do not make that claim in the manuscript. However, our results are largely consistent with what can be expected based on predictive coding theory.

      (2) Because of their theoretical framework, the authors readily attribute roles or hierarchy to brain regions (e.g., higher- vs lower-order) and cognitive functions to observed connectivity patterns (e.g., feedforward vs feedback, predictions vs prediction errors) that cannot be determined from the data. Thus, many of the authors' claims are unsupported.

      We will revise the manuscript to more clearly differentiate our results (e.g. directed Granger-Causality from A to B) from their interpretation (potentially indicating feedforward or feedback signals).

      (3) The authors' theoretical stance seems to influence the presentation of the results, which may inadvertently misrepresent the (otherwise perfectly valid; cf. Abbasi et al., 2023) exploratory nature of the study. Thus, results about specific regions are often highlighted in figures (e.g., Figure 2 top row) and text without clear reasons.

      Our connectograms reveal a multitude of results that we hope are interesting to the community. At the same time, the wealth of findings poses a problem for describing them. We did not see a better way than to highlight specific connections of interest.

      (4) Some of the key findings (e.g., connectivity in opposite directions in distinct frequency bands) feature in a previous publication and are, therefore, interesting but not novel.

      We actually see this as a strength of the current manuscript. The computation of connectivity is here extended to a much larger sample of brain areas. It is reassuring to see that the previously reported results generalise to other brain areas.

      (5) The quantitative comparison between speech production and perception is interesting but insufficiently motivated.

      We thank the reviewer for this comment. We have addressed this in detail in our responses to points (1) and (4) of Reviewer #1.

      (6) Details about the Neurosynth meta-analysis and subsequent selection of brain regions for the functional connectivity analyses are incomplete. Moreover, the use of the term 'Speech' in Neurosynth seems inappropriate (i.e., includes irrelevant works, yielding questionable results). The approach of using separate meta-analyses for 'Speech production' and 'Speech perception' taken by Abbasi et al. (2023) seems more principled. This approach would result, for example, in the inclusion of brain areas such as M1 and the BG that are relevant for speech production.

      We agree that there are inherent limitations in automated meta-analysis tools such as Neurosynth. Papers are used in the meta-analysis that might not be directly relevant. However, Neurosynth has proven its usefulness over many years and has been used in many studies. We also agree that our selection of brain areas is not complete. But Granger Causality analysis of every pair of ROIs leads to complex results and we had to limit our selection of areas.

      (7) The results involving subcortical regions are central to the paper, but no steps are taken to address the challenges involved in the analysis of subcortical activity using MEG. Additional methodological detail and analyses would be required to make these results more compelling. For example, it would be important to know what the coverage of the MEG system is, what head model was used for the source localization of cerebellar activity, and if specific preprocessing or additional analyses were performed to ensure that the localized subcortical activity (in particular) is valid.

      There is a large body of evidence demonstrating that MEG can record signals from deep brain areas such as the thalamus and cerebellum (e.g., Attal & Schwarz, 2013; Andersen et al., 2020; Piastra et al., 2020; Schnitzler et al., 2009). These and other studies provide evidence that state-of-the-art recording (with multichannel SQUID systems) and analysis is sufficient to allow reconstruction of subcortical areas. However, spatial resolution is clearly reduced for these deep areas. We will add a statement in the revised manuscript to acknowledge this limitation.

      (8) The results and methods are often detailed with important omissions (a speech-brain coupling analysis section is missing) and imprecisions (e.g., re: Figure 5; the Connectivity Analysis section is copy-pasted from their previous work), which makes it difficult to understand what is being examined and how. (It is also not good practice to refer the reader to previous publications for basic methodological details, for example, about the experimental paradigm and key analyses.) Conversely, some methodological details are given, e.g., the acquisition of EMG data, without further explanation of how those data were used in the current paper.

      We will revise the relevant sections of the manuscript.

      (9) The examination of gamma functional connectivity in the 60 - 90 Hz range could be better motivated. Although some citations involving short-range connectivity in these frequencies are given (e.g., within the visual system), a more compelling argument for looking at this frequency range for longer-range connectivity may be required.

      Given previous evidence of connectivity in the gamma band we think that it would be a weakness to exclude this frequency band from analysis.

      (10) The choice of source localization method (linearly constrained minimum variance) could be explained, particularly given that other methods (e.g. dynamic imaging of coherent sources) were specifically designed and might potentially be a better alternative for the types of analyses performed in the study.

      Both LCMV and DICS are beamforming methods. We used LCMV because we wanted to use Granger causality, which requires broadband signals. DICS would only provide frequency-specific, band-limited signals.

      (11) The mGC analysis needs to be more comprehensively detailed for the reader to be able to assess what is being reported and the strength of the evidence. Relatedly, first-level statistics (e.g., via estimation of the noise level) would make the mGC and DAI results more compelling.

      We perform group-level cluster-based statistics on mGC while correcting for multiple comparisons across frequency bands and brain parcels, and report only significant results. This is an established approach that is routinely used in this type of study.
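
      For readers unfamiliar with cluster-based statistics, the general idea can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes a one-sample sign-flip permutation test on a hypothetical (subjects × frequencies) array for a single connection, clusters only along frequency (not across parcels), and uses an arbitrary cluster-forming threshold; all names and values are made up.

```python
import numpy as np

def one_sample_t(x):
    """One-sample t-value against zero for each frequency bin (column)."""
    n = x.shape[0]
    return x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n))

def cluster_masses(tvals, threshold):
    """Sum of |t| over contiguous supra-threshold runs that share a sign."""
    masses, current, sign = [], 0.0, 0
    for t in tvals:
        s = int(np.sign(t)) if abs(t) > threshold else 0
        if s != 0 and s == sign:
            current += abs(t)              # extend the running cluster
        else:
            if sign != 0:
                masses.append(current)     # close the previous cluster
            current, sign = (abs(t), s) if s != 0 else (0.0, 0)
    if sign != 0:
        masses.append(current)
    return masses

def cluster_permutation_pvalue(data, threshold=2.0, n_perm=1000, seed=0):
    """p-value of the largest observed cluster under random sign flips."""
    rng = np.random.default_rng(seed)
    observed = max(cluster_masses(one_sample_t(data), threshold), default=0.0)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Flip the sign of each subject's whole spectrum (null: symmetric around 0).
        signs = rng.choice([-1.0, 1.0], size=(data.shape[0], 1))
        null[i] = max(cluster_masses(one_sample_t(data * signs), threshold), default=0.0)
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Simulated data: 20 subjects, 30 frequency bins, an effect in bins 10-15.
sim_rng = np.random.default_rng(1)
simulated_dai = sim_rng.normal(0.0, 1.0, size=(20, 30))
simulated_dai[:, 10:16] += 1.5
p_value = cluster_permutation_pvalue(simulated_dai)
```

      Taking the maximum cluster mass in each permutation is what controls the family-wise error rate across frequency bins. Note that this addresses group-level consistency and is distinct from the first-level noise or bias estimation the reviewer asks about.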

      (12) Considering the exploratory nature of the study, it is essential for other researchers to continue investigating and validating the results presented in the current manuscript. Thus, it is concerning that data and scripts are not fully and openly available. Data need not be in its raw state to be shared and useful, which circumvents the stated data privacy concerns.

      We acknowledge the reviewer's concern regarding the full availability of the dataset. Due to privacy limitations on the collected data, we are unable to share it publicly at this time. However, to promote transparency and enable further exploration, we have provided the script used for data analysis and an example dataset. This example dataset should provide a clear understanding of the data structure and variables used in the analysis. Additionally, we are happy to share the complete dataset upon request from research teams interested in performing in-depth secondary analyses.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this manuscript, Jones et al. report on a potential role for fam83fa in zebrafish hatching, radiation response and autophagy. The authors are commended for generating multiple KO lines and maternal-zygotic embryos for analysis. However, important controls are lacking and the data is circumstantial throughout with very little mechanistic insight into the precise roles, if any, of fam83f in these processes.

      We thank the reviewer for recognizing the strengths of our manuscript, and highlighting areas we might improve. Please see the specific comments below addressing the points raised. In respect of mechanistic insight, while we agree that our manuscript does not provide this, it was not intended to. Rather, we aim to communicate our descriptive findings on the role of Fam83fa in vivo, providing data for follow-up studies by other researchers into the mechanistic role of Fam83fa.

      1. Validation of the KO phenotypes (hatching, IR sensitivity) requires rescue with WT fam83fa mRNA, but not 1-500 or fam83fb mRNA.

      We thank the reviewer for raising the issue of rescue experiments. Such experiments are frequently used in knock-down experiments, where non-specificity may be a problem, but they are used more rarely in genetic knock-outs, where the gene defect is well defined. In the case of Fam83fa, a particular difficulty is that overexpression of fam83fa itself causes a p53-mediated DNA damage response (DDR) (Salama et al., 2019). Moreover, we have shown by both qRT-PCR and western blotting that injection of fam83fa mRNA into zebrafish embryos (the traditional technique by which rescue experiments are performed) induces a p53-mediated DDR. As a result, it would be very difficult to interpret the results of any rescue experiment, because one would have to be absolutely certain that levels of fam83fa re-expression recapitulate and do not exceed endogenous levels. As a tool for specificity, we therefore used more than one fam83fa-/- mutant line, carrying a different genomic mutation, and validated that the same phenotype was present in both. We are happy to provide the qRT-PCR and western blot data confirming the results of fam83fa mRNA injection, if required. We have included an additional section into the manuscript detailing this issue.

      While the hatching phenotype (Fig 3) is convincing, there is no data on HG development in the null embryos. Does the HG develop normally in the absence of fam83fb? If so, this would support the authors' conclusions that the role of fam83fb is functional rather than developmental (indirect effect). In situs as in Fig.1 might be helpful here.

      Thank you to the reviewer for this helpful suggestion. We agree that we did not investigate whether the hatching gland develops normally in the MZ-fam83fa-/- mutant embryos. No gross morphological differences were observed that led us to investigate this, although we agree it is an interesting question for a future project. In terms of functional vs developmental effects, we are confident that MZ-fam83fa-/- mutant embryos develop at a normal temporal rate, as evidenced by the machine learning based classifier used to assess temporal developmental trajectory (Figure S3 and Jones et al., 2022, 2024). This strongly suggests that the effect of fam83fa KO is functional rather than indirect and caused by (for example) developmental delay.

      While the IR sensitivity phenotype (Fig S4) is convincing, IR-induced cell death/apoptosis was not analyzed. There is a large literature describing straightforward assays for cell death/apoptosis detection in zebrafish with assays such as acridine orange or TUNEL labeling, or active casp3 whole-mount IF. Is IR-induced cell death enhanced in fam83fa KOs?

      We thank the reviewer for their positive comments and agree that investigating the nature of the cell death occurring following IR would be very interesting. We did make use of both acridine orange and TUNEL labeling following injection of fam83fa mRNA (see 1 above), and whilst the assays themselves were relatively straightforward, due to technical issues the quantification of fluorescence intensity was not. Similarly, we suspect that a significant degree of necrosis is also occurring, which further complicates the issue of data interpretation from both these approaches. We do, however, think this is an important avenue of questioning, and hope that other researchers will explore the mechanism of IR induced cell death in the MZ-fam83fa-/- mutants in the future.

      Similarly, there are multiple tools to assay autophagy in zebrafish (e.g., Moss et al., Histochem Cell Biol 2020, PMC7609422; Mathai et al., Cells 2017, PMC5617967). Is autophagy affected in the KOs, with or without IR? These experiments might directly implicate fam83fa in autophagy.

      We agree that there are exciting tools with which to assay autophagy in zebrafish, and although we considered some of these, including caudal fin regeneration, we deemed these experiments to be beyond the descriptive scope of this paper, given the time and resources available to us. We hope that other researchers will use our data as a basis for investigating the role of Fam83fa in autophagy further, using assays such as those suggested by the reviewer.

      Figure 4: Isn't there a slight reduction in p53 induction at 10 hours?

      Although the western blot in Figure 4A gives this impression, this is probably due to loading variability (see the anti-β-actin loading control band). Moreover, over three independent experiments (Figure 4B), this apparent difference is not statistically significant. Taken together with other evidence that the p53-mediated DNA damage response is not affected in MZ-fam83fa-/- mutants, we are confident there is no detectable change in the level of stabilized p53 in the MZ-fam83fa-/- mutants compared to WT.

      Given the widely documented, dominant role of p53 in zebrafish IR-sensitivity, the authors should test if the IR sensitivity of fam83fa KO animals is p53-dependent, ideally via a cross into p53 null, but at least via injection of p53 morpholinos.

      We agree that p53 is widely documented as playing an essential role in the IR induced DNA damage response in zebrafish. All our experiments suggest there is no difference between the levels of p53 (protein or mRNA) or any of the p53-induced downstream effectors (that we tested) in MZ-fam83fa-/- mutants compared to WT embryos. This was true whether or not the embryos were subjected to genotoxic stressors, including IR treatment. We therefore conclude that the increased sensitivity phenotype we observe as a result of loss of Fam83fa is not caused by a change in p53 activity, at least not as part of the DNA damage response.

      Do autophagy inhibitors phenocopy the hatching and IR-sensitivity defects of fam83fa embryos? Do the inhibitors exacerbate the mutant phenotypes or synergize with M or Z mutant phenotypes? (I may have missed this but do M and Z fam83fa null embryos have any phenotype? Or do the phenotypes only manifest in MZ embryos?)

      This is an excellent question, and indeed one we attempted to address. We tried to optimize several autophagy inhibitors including bafilomycin A1, chloroquine and wortmannin, as well as the proteasomal inhibitor MG132. In addition, we tried to optimize the autophagy promoters Torin1 and rapamycin. Unfortunately, we regularly saw global effects in zebrafish embryos that were difficult to characterize and control by dosage. At the same time, we were also working to confirm the specific effects of these drugs on autophagy using p62 and LC3-I and LC3-II western blots, which themselves were difficult to optimize. We attempted to optimize these experiments for 6 months before the COVID lockdown occurred, at which point they were abandoned. We would be delighted for future researchers to continue these experiments, as we are now unable to pursue this further due to closure of the Smith lab, but we agree that these are very pertinent questions. We hope the descriptive data provided in our paper will prompt other researchers in the autophagy field to further explore the role of Fam83fa in autophagy. In response to the zygotic phenotype question, this was something we did not investigate. As there was no immediately apparent phenotype in the zygotic generation, for ease of screening larger numbers of embryos we proceeded immediately to the maternal-zygotic (MZ) generation.

      Reviewer #1 (Significance (Required)):

      The role of Fam83f is not known. This study in zebrafish might be the first to clarify the function of this protein in vivo.

      We thank the reviewer for this positive insight, and we agree that our work is the first to do so in vivo.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Fam83f is one of the proteins about which little is known. The authors Jones et al., tried to shed light on Fam83f function by knocking out the gene in zebrafish. Here they found that fam83 is expressed in the hatching gland and that larvae without Fam83f hatch significantly earlier than wild-type animals. The authors furthermore investigated the response of fam83f knock-out animals to DNA damage and found increased sensitivity to ionizing radiation and MMS. In order to find out more about Fam83f function in the DNA damage response, the authors performed RNA-seq after employing DNA damage and here they saw upregulation of several autophagy/lysosome-associated proteins and downregulation of some phosphatidylinositol-3-phosphate binding proteins, among others. Finally, the authors found that Fam83f is targeted to the lysosome. The manuscript is overall well written and clear in its general statement.

      We thank the reviewer for their encouraging comments.

      In the manuscript, the authors describe the investigation of several aspects of Fam83f function and particularly the role in hatching seems to be important for Fam83f as the gene is strongly expressed in the hatching gland and its absence leads to a clear and considerable earlier hatching. Unfortunately, all aspects of Fam83f function that are described in the manuscript are investigated very superficially, the conclusions are not supported by data and important controls are lacking. As such, the RNA-seq results are not confirmed by qRT-PCR, the role of the Fam83f LIR domain is not confirmed by co-IPs and it has not been investigated whether the presence of Fam83f in lysosomes is due to its degradation or whether it has a function in this cellular compartment.

      We thank the reviewer for their input and will address each point raised below:

      • All aspects of Fam83f function are investigated superficially.

      We agree that we have not provided an in-depth analysis of the mechanistic role of Fam83fa. It was because there were so many roles that we decided to make this paper rather descriptive in nature, hoping that the observations will prove useful to other researchers who may wish to define the mechanistic roles of Fam83fa more deeply. Even without in-depth investigation, our findings are previously unreported and the phenotypes we report are clear. We have amended our manuscript to make it apparent that this paper is intended to be descriptive in nature, and we hope this addresses this issue.

      • Important controls are lacking - RNA-seq results are not confirmed by qRT-PCR

      We thank the reviewer for their comment. We did not include qRT-PCR data as a control for the RNA-seq data because 1) each RNA-seq experiment was repeated on three biological replicates across three independent experiments and 2) we conducted RNA-seq on two different MZ-fam83fa-/- mutant lines and only considered genes that were mis-regulated in both mutants. Taken together, we considered this to be sufficient validation for the manuscript. However, we also performed confirmatory qRT-PCR for several of the differentially expressed genes identified, including the three main PI(3)P binding genes. We have now included these data in the supplementary information as an additional control - see Figure S6G which is now also referred to in the main text, and additional primer sequences have been added to Table S1.

      • The role of the Fam83f LIR domain is not confirmed by co-IPs

      We agree with the reviewer that this is an important experiment, and we worked closely with Dr Brian Ludwig and Dr Karen Vousden (The Francis Crick Institute) to test this. We tried to express zebrafish Atg8 and Gabarap (the two main ATG8 proteins that bind to LIR domains) but were unable to express sufficient levels of protein to perform the co-IPs. The text in the manuscript has now been amended to reflect that this experiment is required to confirm the role of the putative LIR domain in Fam83fa.

      • It has not been investigated whether the presence of Fam83f in lysosomes is due to its degradation or whether it has a function in this cellular compartment

      Whilst we agree with the reviewer that this is an important question, we did not intend this paper to expand beyond a descriptive role of the observations we made following the loss of Fam83fa in vivo. These are important questions to follow up on to determine the mechanism of action of Fam83fa, and we hope that other researchers will pursue these avenues of investigation following the publication of our observations.

      Also, there is no leading concept in the manuscript. Starting from a role in hatching, the authors go to the DNA damage response and finally to the presence of Fam83f in lysosomes. How are these different aspects linked? Is the presence of Fam83f in lysosomes important for the suppression of hatching and how does Fam83f delay this process? (One would have wished that the authors would not have been that broad and were more focused on a particular aspect which then could have been investigated in depth.)

      We agree with the reviewer that the paper gives a broad overview of our observations and does not examine the underlying mechanisms in detail. However, we believe that descriptive papers such as this, where observations following genetic perturbation are reported, are equally important, providing as they do important foundational data for other researchers to take forward. We do postulate on the links between the hatching, DNA damage and lysosomal phenotypes we observe in the discussion section, and we have expanded on this following the reviewers' comments, to make our hypothesized link between these phenomena clearer.

      Specific comments:

      • All materials should be described in material and methods including the antibodies that have been used

      The antibodies used, together with concentrations and catalog numbers, are now listed in Materials and Methods.

      • Abbreviations should be explained

      The manuscript has been revised to ensure all abbreviations are explained. We thank the reviewer for bringing this oversight to our attention.

      • Figure 4A: Levels of p53 should also be shown for untreated fam83f -/-KO1 and KO2 animals

      The authors thank the reviewers for raising this point. Extracts from untreated MZ-fam83fa-/- KO1 and KO2 embryos were not included on this particular blot, as p53 was observed to be undetectable in all embryos, across all our experiments (WT and both mutants) unless genotoxic stress was applied. No quantification could therefore be performed as the expression level was essentially zero. However, we have now included an example p53 western blot in Supplemental Figure 5A, which shows WT, MZ-fam83fa-/- KO1 and MZ-fam83fa-/- KO2 untreated blots for p53 (all undetectable) alongside treated embryos (detected).

      • Some references are missing (e.g. page 17, lane 320/321: As this group of cells arises....)

      This citation and reference have now been added; thank you to the reviewer for highlighting this omission.

      • Lane 369: The authors write about 4 KO lines but only two are shown in the figure.

      We thank the reviewer for this observation. In Figure 2B only KO1 and KO2 schematic diagrams are shown for simplicity (as these are the lines taken forward for further investigation). We have now amended the manuscript text to make this clear.

      • Lane 374/375: The NMD is not proven

      Absolutely - we have now revised the text to change this sentence accordingly and thank the reviewer for noting this.

      • Lane 380: how can RNA levels of fam83fa be upregulated when the gene has been knocked out? Why are these genes only upregulated in KO1? How relevant is this?

      This was a typographical error, and we are very grateful to the reviewer for picking up on this. It should have read 'fam83fb'. As nonsense-mediated decay and associated transcriptional adaptation have been previously reported in zebrafish, this finding may be of considerable interest to the community. It is a side observation, and not necessarily directly related to the role of Fam83fa in vivo, but we felt it important to include. Indeed, as a result of this observation we have recently shared our MZ-fam83fa-/- lines with another group who are planning to investigate precisely this question - why are fam83fb and fam83g only upregulated in KO1?

      • Figure 3C is not mentioned in the text and lacks any labelling

      Figure 3C is now clearly referred to in the text and a label added to the figure.

      • Lane 434/435: all relevant data should be shown (can be done as supplementary figure)

      We have now amended this to include an additional supplemental figure (Figure S5A).

      • Lane 434: The reference to the figure seems to be incorrect (5A4A)

      Amended accordingly - thank you for pointing out this mistake.

      • Figure 4C and 4D: what is the difference?

      Thank you to the reviewer for noticing this omission. These data are from t1 (+2hrs) and t2 (+10hrs) and have now been labelled accordingly.

      • S5C and S5D: why are there 3 clusters?

      We thank the reviewer for raising this as it has provided us with an opportunity to present our data more clearly. There are 3 clusters that represent the combination of the two first principal components, which are time and treatment. Therefore, the clusters represent i) untreated at t1, ii) treated at t1 and iii) treated at t2. However, having two plots with different color schemes made this confusing/misleading. We have now replaced the two PCA plots with one that is colored and labelled accordingly with the 3 aforementioned clusters.

      • Lane 495 to 505: What does this mean that the GO analysis shows upregulation and downregulation of endopeptidases and why "in contrast"?

      We thank the reviewer for this comment, and we agree that this paragraph was misleading/confusing. This has now been rewritten in the main text, clarifying that endopeptidases were consistently upregulated at both timepoints.

      Reviewer #2 (Significance (Required)):

      The strength of the manuscript is certainly that it provides insight into Fam83f function as there is not much known about Fam83f.

      We thank the reviewer for the positive comment, and we agree that very little is known about this highly conserved protein.

      This study is probably most interesting for people in the zebrafish and related fields as the authors convincingly show the expression of Fam83f in the hatching gland and also the earlier hatching in the absence of the protein is very clear.

      Thank you for the positive feedback.

      The weakness of the study is clearly that it does not provide an in-depth analysis. As such, it shows that Fam83f is involved in hatching and can delay the process but it remains elusive how this is achieved. Likewise, the investigation into the DNA damage response remains very superficial and does not prove a specific role for Fam83f in the DNA damage response or whether the increased sensitivity is more unspecifically caused by the absence of a gene or eventually even connected to the earlier hatching.

      Please refer to responses above (and changes made to the manuscript) clarifying that this study is intended to be descriptive, and provides important foundational data for further in-depth mechanistic studies by other researchers interested in the role of Fam83fa in vivo.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In their manuscript "Zebrafish reveal new roles for Fam83f in hatching and the DNA damage-mediated autophagic response", Jones et al. provide an interesting exploration for the function of a poorly studied protein, Fam83f in embryonic development. Using the zebrafish as a model organism, the study combines loss-of-function genetics, phenotypic analysis and RNA-sequencing to characterize and explore the result of Fam83f loss. Upon critical review of the manuscript and the results we offer suggestions to improve the manuscript (see 'minor technical issues'). Additionally, we would like to highlight a weakness of the study in making the connection between Fam83f to the observed phenotype (increased sensitivity to DNA damage), see 'major issues'.

      Major issues:

      Most of our concern stems from relatively incomplete connection of the loss of fam83f to increased sensitivity to DNA-damage and lysosome function.

      Please refer to comments above and changes made to the manuscript to clarify this is a descriptive paper that is not intended to provide in-depth mechanistic insight into the role of Fam83fa.

      Is the increased sensitivity in fam83f KO embryos a direct effect to fam83f loss? A rescue experiment (by introduction of Fam83fa mRNA into their KO2 fish line) in the presence of ionizing radiation would help us understand the functional role of this protein in this process. Furthermore, can overexpression of any of the down-regulated genes involved in lysosome function restore the early hatching phenotype or the sensitivity to DNA damage?

      Fam83fa rescue experiments would be very difficult to interpret; please see comments above and the corresponding changes to our manuscript.

      In terms of over-expressing some of the downregulated genes identified in the RNA-seq and qRT-PCR to see if the phenotype can be rescued, we feel these are excellent suggestions and we hope other researchers in future will attempt such experiments.

      Minor technical issues:

      -Methods line 203, clarify how many embryos were used per sample for RNA-seq (this was only described as 15 embryos in the main body results text).

      Text has been amended to clarify this. We thank the reviewer for noticing this oversight.

      -Comment about the expansion of fam83f orthologs in mammals (8) as opposed to only 2 in zebrafish

      We apologize for any confusion: mammals do not have 8 fam83f orthologs. Mammals and zebrafish both have 8 FAM83 genes (FAM83A-FAM83H). Zebrafish, unlike mammals, have undergone an additional genome duplication, so although mammals have only one FAM83F gene, zebrafish have two: Fam83fa and Fam83fb. We trust this clarifies this issue and believe this to be clear in our main text. However, we are happy to make any suggested amendments should the reviewer consider our wording confusing.

      -Supplementary figure 1C: please include representative images of secondary axis formation in fam83fa overexpressed Xenopus embryos.

      We have not included any images as these are already published in our related paper on FAM83F (Dunbar et al., 2020) which we refer to in the figure legend text. No additional images were captured specifically for this publication.

      -Provide more information about the mis-regulated genes in the RNA-seq analysis, how many are up or down regulated? Perhaps a better plot than a Venn diagram can be an MA-plot with the Venn diagram moved to a supplementary figure.

      The Venn diagrams in Figure 5A-C are to illustrate the number of differentially expressed genes that are shared between KO1 and KO2 (whether up or down regulated), and only those that are common to both lines are taken forward. Following the reviewer's comments, we have now displayed the behavior of the common genes across all replicates in one heatmap, with the data normalized to the WT untreated samples, and the normalized variance stabilized count indicates whether a gene is up or down regulated across each of the replicates and conditions. We believe this addresses the reviewer's comment as these data are now displayed in a more direct way and the genes that are consistently up or downregulated across all replicates (and indeed those that are not) can be clearly seen. We thank the reviewer for raising this and improving our data representation.
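      As an illustration only (this is not the authors' analysis code), the "common to both lines" filter described above could be sketched as follows. The sketch assumes each knock-out line has already been reduced to a table of significantly mis-regulated genes with a log2 fold change relative to WT; the gene names, values, and the function name `shared_degs` are hypothetical.

```python
from typing import Dict


def shared_degs(ko1: Dict[str, float], ko2: Dict[str, float]) -> Dict[str, str]:
    """Return genes mis-regulated in BOTH knock-out lines.

    ko1, ko2: gene -> log2 fold change versus WT, already filtered for
    significance (an assumption of this sketch).
    The value is "up", "down", or "discordant" if the two lines disagree.
    """
    shared: Dict[str, str] = {}
    for gene in sorted(ko1.keys() & ko2.keys()):
        dir1 = "up" if ko1[gene] > 0 else "down"
        dir2 = "up" if ko2[gene] > 0 else "down"
        shared[gene] = dir1 if dir1 == dir2 else "discordant"
    return shared


# Hypothetical example input (not data from the manuscript).
ko1_degs = {"gene_a": 1.8, "gene_b": -2.1, "gene_c": 0.9, "gene_d": -1.2}
ko2_degs = {"gene_a": 1.5, "gene_b": -1.7, "gene_d": 1.1, "gene_e": 2.4}

common = shared_degs(ko1_degs, ko2_degs)
```

      Genes that appear in only one line (here `gene_c` and `gene_e`) drop out, which mirrors the decision to take forward only genes common to KO1 and KO2; flagging discordant direction is an extra choice made in this sketch rather than something stated in the response.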

      -A better comparison of mis-regulated genes in the fam83f knockouts would be a comparison of KO2 and perhaps KO3, as the compensatory effects in KO1 can lead to additional indirect effect on the transcriptome. We understand the time and cost involved in this experiment and suggest that the differential gene expression analysis be performed individually on up or down regulated genes from KO2, or a comparison of such analysis will be provided with the differential gene expression analysis that was performed on shared mis-regulated genes between KO1 and KO2.

      The reviewer raises an excellent point. At the time of experimental design, we were concerned that omitting KO1 in favor of another line (e.g. KO3) would bias our results by excluding potentially important data. Similarly, as transcriptional adaptation occurs in a sequence specific manner, and the phenotype was present in KO1 regardless, we didn't want to exclude these data. However, with hindsight, we agree that it may have been prudent to exclude KO1 on this basis, and we may have seen an increased concordance of differentially expressed genes (DEGs) between KO2 and KO3. However, this is not possible to repeat now due to the Smith lab closing, and our documented findings are valid and important regardless. We acknowledge however that, with hindsight, what the reviewer suggests may have been better experimental design.

      -Can you confirm with the RNA-seq analysis that fam83g is upregulated in KO1 as opposed to KO2? (i.e. can the compensatory analysis you have observed with qRT-PCR be confirmed with the RNA-seq data?)

      This is an excellent question, and we thank the reviewer for raising this. fam83fb passed our threshold for significance to be deemed as differentially expressed (upregulated) in KO1 only, in accordance with our qRT-PCR data. fam83g did not pass the significance threshold, but perhaps this is not surprising as both fam83fb and fam83g are expressed at particularly low levels to start with and would probably require much greater sequencing depth to be detected.

      Reviewer #3 (Significance (Required)):

      There is fundamental value in clarifying the in vivo function of poorly characterized protein-coding genes. This study fills a gap in the literature, but the broader conceptual impact is limited. The authors do a thorough job at generating and characterizing CRISPR/Cas9 mediated knock-out zebrafish animals. It is further commended that the authors do a meticulous job in a quantitative description of the resulting phenotype. This is a thorough study, with the only major concern being the lack of rescue experiments that would be needed to substantiate the role of fam83f in sensitivity to DNA damage and lysosome function.

      We thank the reviewer for their comments and trust we have addressed the issues concerned with the changes described above.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Perampalam et al. describe novel methods for genome-wide CRISPR screening to identify and validate genes essential for HGSOC spheroid viability. In this study, they report that Netrin signaling is essential for maintaining disseminated cancer spheroid survival, wherein overexpression of Netrin pathway genes increases tumor burden in a xenograft model of ovarian cancer. They also show that high netrin expression correlates with poor survival outcomes in ovarian cancer patients. The study provides insights into the biology of netrin signaling in DTC cluster survival and warrants development of therapies to block netrin signaling for treating serous ovarian cancer.

      Strengths:

      - The study identifies Netrin signaling to be important in disseminated cancer spheroid survival

      - A Novel GO-CRISPR methodology was used to find key genes and pathways essential for disseminated cancer cell survival

      Thanks for the endorsement of our work and its importance to metastasis in ovarian cancer.

      Weaknesses:

      - The term dormancy is not fully validated and requires additional confirmation to claim the importance of Netrin signaling in "dormant" cancer survival.

      - Findings shown in the study largely relate to cancer dissemination and DTS survival rather than cancer dormancy.

      Much of the validation of dormancy and cell cycle arrest in HGSOC spheroids, as well as the culture model, has been published previously and hence was not repeated here. I think this reviewer will appreciate the updated citations and explanations to better illustrate the state of knowledge. We have also added new experiments that further emphasize the dormant state of spheroid cells in culture and xenografts, as well as patient derived spheroids used in this study.

      Reviewer #1 (Recommendations for Authors):

      (1) It is unclear what spheroid/adherent enrichment ratio is and how it ties into genes affecting cell viability. Why is an ER below 1 the criteria for selecting survival genes?

      Our screen uses the ‘guide only’ comparison in each culture condition to establish a gene score under that specific condition.  A low adherent score captures genes that are essential under standard culture conditions where cells are proliferating and this can include genes needed for proliferation or other basic functions in cell physiology.  A low spheroid score identifies the genes that are most depleted in suspension when cells are growth arrested and this is an indication of cell death in this condition.  Since gene knock outs are first established in adherent proliferating conditions, essential genes under these conditions will already start to become depleted from the population before suspension culture.  By selecting genes with a ratio of <1 we can identify those that are most relevant to dormant suspension culture conditions.  Ultimately, the lowest enrichment ratio scores represent genes whose loss of function is dispensable in the initial adherent condition, but critical for survival in suspension and this is what we aimed to identify. We’ve updated Figure 1B to illustrate this and we’ve updated the explanation of the enrichment ratio on page 6, lines 144 to 147 of the results.
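      As a rough illustration of the selection logic described above (a sketch, not the GO-CRISPR scoring pipeline itself), the enrichment ratio can be treated as the spheroid gene score divided by the adherent gene score, with genes kept when the ratio is below 1. The gene names, score values, and function names here are hypothetical, and the sketch assumes both scores are positive numbers where a lower score means stronger depletion.

```python
from typing import Dict, List, Tuple


def enrichment_ratio(spheroid_score: float, adherent_score: float) -> float:
    """Spheroid/adherent enrichment ratio for one gene."""
    if adherent_score <= 0:
        raise ValueError("adherent score must be positive")
    return spheroid_score / adherent_score


def select_survival_candidates(
    scores: Dict[str, Tuple[float, float]], cutoff: float = 1.0
) -> List[Tuple[str, float]]:
    """Return (gene, ratio) pairs with ratio < cutoff, lowest ratio first.

    scores maps gene -> (adherent_score, spheroid_score).
    """
    ratios = {
        gene: enrichment_ratio(spheroid, adherent)
        for gene, (adherent, spheroid) in scores.items()
    }
    hits = [(gene, ratio) for gene, ratio in ratios.items() if ratio < cutoff]
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical scores: (adherent, spheroid). Lower = more depleted.
example_scores = {
    "GENE_A": (0.90, 0.30),  # dispensable when adherent, depleted in spheroids
    "GENE_B": (0.20, 0.25),  # already depleted in adherent culture
    "GENE_C": (0.95, 0.97),  # unaffected in either condition
}

candidates = select_survival_candidates(example_scores)
```

      In this toy example only `GENE_A` is retained: it is the one gene whose loss is tolerated in adherent culture but strongly depleted in suspension, which is the pattern the lowest enrichment ratios are meant to capture.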

      (2) The WB for phospho-p38 in figure 1A for OVCAR8 line does not show increased phosphorylation in the spheroid relative to the adherent. If anything, phospho-p38 appears to be reduced in the spheroid. Can the authors provide a better western blot?

      We’ve updated this blot with a longer exposure, see Figure 1A.  Phosphorylation levels of p38 are essentially unchanged in OVCAR8 cells in suspension culture, although the overall levels of p38 may be slightly reduced in dormant culture conditions.

      (3) How did the authors confirm dormancy apart from western blot for phospho-ERK vs phospho-p38? Authors should add EdU/BrdU staining and/or Ki67 staining to confirm dormancy.

      Previous publications that appear as citations 7, 10, and 33 in the reference list established the growth arrest state of these cells in suspension culture. This included measuring other known markers of dormancy and quiescence such as p27, p130, and reduced cyclin/cdk activity and 3H-thymidine incorporation. In addition, other associated characteristics of dormancy such as EMT and catabolic metabolism have been demonstrated in these culture conditions (see citation 11 and Rafehi et al. Endocr. Relat. Cancer 23;147-59). We’ve added these additional citations to our descriptions of dormant spheroid culture to better clarify the status of these cells in our experiments (see page 6, lines 126-28). To ensure that cells are growth arrested in the experiments shown in this paper, we have updated Figure 1A to include blots of p130 and Ki67 to further emphasize that spheroid cells are not proliferating as the quiescence marker (p130) is high and the proliferative marker (Ki67) is lost in suspension culture.

      (4) Can the authors report spheroid volume over time in culture? How was viability measured?

      We’ve updated the methods (see page 27, line 574) to better highlight the description of cell survival that answers both of these questions. At the ends of experimental time points in both the screen and viability assays we captured live cells by replating on adherent plasticware. We fixed and stained with crystal violet and photographed plates to illustrate the sizes of spheroids (shown in Fig. 2 Supplement 1E, Fig. 6C, and 7D). We subsequently extracted the dye and quantitated it spectrophotometrically to compare biomass of viable cells between experiments irrespective of the relatively random shapes of spheroids. We found reattachment and staining in this manner to match traditional viability assays such as CellTiter-Glo in a previous paper (10). Furthermore, biomass never increases in culture and diminishes gradually over time, consistent with the non-proliferative state of these experiments. Fig. 2 Supplement 1E confirms both the equivalence of viability and reattached biomass measurements and the loss of biomass over time: it compares reattached crystal violet staining measurements with CellTiter-Glo for DYRK1A knock out cells over time in culture. In addition, we include a comparison of crystal violet staining of reattached spheroids with trypan blue dye exclusion in Fig. 5G and H. In both cases reattachment and more direct viability assays support the same conclusion that Netrin signaling supports viability in dormant culture.

      (5) Please show survival significance of Netrin signaling genes in recurrence/relapse free survival to claim importance in cancer dormancy.

      See Fig. 7 Supplement 1C where we include the recurrence free survival data. Netrin-1 and -3 high expressors also have a numerically shorter progression free survival, but the difference is not statistically significant. Netrin-1 overexpression alone is also shown, and it shows shorter survival with a P-value of 0.0735. Elevated survival of dormant cells in a residual disease state is expected to increase the chance of relapse and shorten this interval. Thus, these data are consistent with our model, but lack statistical significance.

      There are many alternative ways to interpret what shorter progression free survival, or overall survival, may mean biologically. Since survival of dormant cells is but one of them, we also added new data to experimentally investigate the role of endogenous Netrin signaling in dormant residual disease in Fig. 6 and described on page 12, lines 266-87.  We used xenograft experiments to show OVCAR8 spheroids form and withdraw from the cell cycle equivalently to suspension culture following intraperitoneal injection.  Furthermore, loss of Netrin signaling due to receptor deletions compromises survival during this early window before disseminated lesions form.  This argues that Netrin signaling contributes to survival during this window of dormancy.  In addition, mice engrafted with mutant cells experience prolonged survival when Netrin signaling is blocked.  Together, these experiments further argue that Netrin signaling supports survival in the dormant, non-proliferative phase, and leads to reduced survival of mice.

      (6) The authors show IHC staining of patient ascites derived HGSOC spheroids. However, no marker for dormancy is shown in these spheroids. Adding Ki67 staining or phospho-ERK vs phospho-p38 would be necessary to confirm cancer dormancy.

      We have added new staining for Ki67 and p130 that compares these markers in HGSOC tumors where Ki67 is high and p130 is low with ascites derived spheroids where staining is the opposite. Importantly, expression of p130 is linked to cellular quiescence and is not found to accumulate in the nucleus of cells that are just transiting through G1.  This confirms that the ascites derived spheroids are dormant.  See Fig. 4A-E and described on page 9, lines 201-7.

      (7) Overall, the findings are interesting in the context of cancer dissemination. There is not enough evidence for cancer dormancy and the importance of Netrin signaling in the survival of cancer dormancy. Overexpression of Netrin increases phosphorylation of ERK, leading one to expect an increase in proliferation. This suggests that Netrin breaks cancer cells out of dormancy, into a proliferative state.

      We have found that the discovery of Netrin activation of MEK-ERK in growth arrested cells is counterintuitive to many cancer researchers.  However, this axis exists in other paradigms of Netrin signaling in axon outgrowth that are not proliferation related (see citation 26, Forcet et al. Nature 417; 443-7 as an example).  We have added Fig. 5D and descriptions on page 11, lines 244-52 to better clarify that Netrins cannot induce cell proliferation through ERK.  Addition of recombinant Netrin-1 can only induce ERK phosphorylation in suspension culture conditions and not in quiescent adherent conditions.  The small magnitude of ERK phosphorylation induced by Netrin-1 in suspension compared to treating adherent, quiescent cells with the same concentration of mitogenic EGF further emphasizes that this is not a proliferative signal.  Lastly, the new xenograft experiment in Fig. 6A-D (described on page 12, lines 266-81) demonstrates the growth arrested context in which Netrin signaling in dormant spheroids supports viability.

      (8) If authors wish to claim cancer dormancy as the premise of their study, additional confirmatory experiments are required to support their claims. Alternatively, based on the current findings of the study, it would be best to change the premise of the article to Netrin signaling in cancer dissemination and survival of disseminated cancer spheroids rather than cancer dormancy.

      I expect that this reviewer will agree that we have added more than sufficient explanations of background work on HGSOC spheroid dormancy from the literature, as well as new experiments that address their questions about dormancy in our experiments.

      Reviewer #2 (Public Review):

      Summary:

      In this article, the authors employed modified CRISPR screens ["guide-only (GO)-CRISPR"] in the attempt to identify the genes which may mediate cancer cell dormancy in the high grade serous ovarian cancer (HGSOC) spheroid culture models. Using this approach, they observed that abrogation of several of the components of the netrin (e.g., DCC, UNC5Hs) and MAPK pathways compromise the survival of non-proliferative ovarian cancer cells. This strategy was complemented by the RNAseq approach which revealed that a number of the components of the netrin pathway are upregulated in non-proliferative ovarian cancer cells and that their overexpression is lost upon disruption of DYRK1A kinase that has been previously demonstrated to play a major role in survival of these cells. Perampalam et al. then employed a battery of cell biology approaches to support the model whereby the Netrin signaling governs the MEK-ERK axis to support survival of non-proliferative ovarian cancer cells. Moreover, the authors show that overexpression of Netrins 1 and 3 bolsters dissemination of ovarian cancer cells in the xenograft mouse model, while also providing evidence that high levels of the aforementioned factors are associated with poor prognosis of HGSOC patients.

      Strengths:

      Overall it was thought that this study is of potentially broad interest in as much as it provides previously unappreciated insights into the potential molecular underpinnings of cancer cell dormancy, which has been associated with therapy resistance, disease dissemination, and relapse as well as poor prognosis. Notwithstanding the potential limitations of cellular models in mimicking cancer cell dormancy, it was thought that the authors provided sufficient support for their model that netrin signaling drives survival of non-proliferating ovarian cancer cells and their dissemination. Collectively, it was thought that these findings hold a promise to significantly contribute to the understanding of the molecular mechanisms of cancer cell dormancy and in the long term may provide a molecular basis to address this emerging major issue in the clinical practice.

      Thanks for the kind words about the importance of our work in the broader challenges of cancer treatment.

      Weaknesses:

      Several issues were observed regarding methodology and data interpretation. The major concerns were related to the reliability of modelling cancer cell dormancy. To this end, it was relatively hard to appreciate how the employed spheroid model allows to distinguish between dormant and e.g., quiescent or even senescent cells. This was in contrast to solid evidence that netrin signaling stimulates abdominal dissemination of ovarian cancer cells in the mouse xenograft and their survival in organoid culture. Moreover, the role of ERK in mediating the effects of netrin signaling in the context of the survival of non-proliferative ovarian cancer cells was found to be somewhat underdeveloped.

      Experiments previously published in citation 7 show that growth arrest in patient ascites derived spheroids is fully reversible and that argued against non-proliferative spheroids being a form of senescence and moved this work into the dormancy field.  We have added extensive new support for our model systems and data to address the counterintuitive aspects of MEK-ERK signaling in survival instead of proliferation. 

      Reviewer #1 Recommendations for Authors

      (1) A better characterization of the spheroid model may be warranted, including staining for the markers of quiescence and senescence (including combining these markers with staining for the components of the netrin pathway)

      See Figure 1A and page 6, lines 126-36 where we have added blots for Ki67 and p130 to better emphasize the arrested proliferative state of cells in our screening conditions.  We have also added these same controls for patient ascites-derived spheroids in Figure 4 and described on page 9, lines 203-7.  One realization from this CRISPR screen, and others in our lab, is that it identifies functionally important aspects of cell physiology and not necessarily ones that are easily explored using commercially available antibodies.  Netrin-1 and -3 staining of patient derived spheroids in Fig. 4, as well as cell line spheroids stained in Fig. 4 Supplement 1 further support the relevance of this pathway in dormant cancer cells because Netrins are expressed in the right place at the right time.  The Netrin-1 stimulation experiments in Fig. 5C were originally carried out to probe HGSOC cells for functionality of Netrin receptors since we couldn’t reliably detect them by blotting or staining with available antibodies.  This demonstrates that this pathway is active in the various HGSOC cell lines we’ve used and specifically, using OVCAR8 cells, we show it is only active in suspension culture conditions.

      (2) In figure 1A it appears that total p38 levels are reduced in some cell lines in spheroid vs. adherent culture. The authors should comment on this.

      These blots have been updated to be more clear.  Overall p38 levels may be reduced in some cell lines and when compared with activation levels of phosphorylated p38 it suggests the fraction of activated p38 is higher. OVCAR8 cells may be an exception where the overall activity level remains approximately the same.

      (3) The authors should perhaps provide a clearer rationale for choosing to focus on the netrin signaling vs. e.g., GPCR signaling, and consider more explicit defining of "primary" vs. "tertiary" categories in Reactome gene set analysis.

      We’ve updated Fig. 1E and the text on page 7, lines 161-5 to illustrate which gene categories identified in the screen belong to which tiers of Reactome categories. It better visualizes why we have investigated the Axon guidance pathway that includes Netrin: it is a highly specific signaling pathway that scores similarly to the broader and less specific categories at the very top of the list. As an aside, the GPCR signaling and GPCR downstream signaling categories have proven to be fairly intractable.  As best we can tell, the GPCR downstream signaling category is full of MAPK family members and likely represents some redundancy with MAPK further down.

      (4) In figure 3A-C, including factors whose expression did not appear to change between adherent and suspension conditions may be warranted as the internal control. Figure 3D-F may benefit from some sort of quantification.

      The mRNA expression levels are normalized to GAPDH as an internal control. We have updated this figure and re-plotted it as fold change relative to adherent culture cells with statistical comparisons to indicate which are significantly upregulated in suspension culture.

      The IHC experiments are now in Fig. 4D-F and show positive staining for Netrin-1 and -3.  Netrin-3 is easiest to see, while Netrin-1 is trickier because the difference with the no primary antibody control isn’t intensity, but the tint of the DAB stain.  We had to counter stain the patient spheroids with Hematoxylin in order for the slide scanner to find the best focal plane and make image registration between sections possible.  This unfortunately makes the Netrin-1 staining rather subtle.  For cell line spheroids in the Fig. 4, Supplement 1 we didn’t need the slide scanner and show negative controls without counter stain that are much more convincing of Netrin-1 detection and reassure us that our staining detects the intended target.  We’ve updated the labels in Fig. 4 and Fig. 4, Supplement 1 for this to be more intuitive.  Unfortunately, relying on the tint of the DAB stain leaves this as a qualitative experiment.

      - In figure 4C-E the authors show that Netrin-1 stimulation induces ERK phosphorylation whereby it is argued that this is a "low-level" stimulation of ERK signaling required for the survival of ovarian cells in the suspension. This is however hard to appreciate, and it was thought that having adherent cells in parallel would be helpful to wage whether this indeed is a "low level" ERK activity. Moreover, the authors should likely include downstream substrates of ERK (e.g., RSKs) as well as p38 in these experiments. The control experiments for the effects of PD184352 on ERK phosphorylation also appear to be warranted. Finally, performing the experiments with PD184352 in the presence of Netrin-1 stimulation would also be advantageous.

      We have added a new Netrin-1 stimulation experiment in Fig. 5D (described on page 11, lines 244-52) that shows that Netrins can only activate very low levels of ERK phosphorylation in suspension when proliferation is arrested. Netrin-1 stimulation of quiescent adherent cells where stimulation of proliferation is possible shows that Netrins are unable to activate ERK phosphorylation in this condition.  In contrast, we also stimulate quiescent adherent OVCAR8 cells with an equal concentration of EGF (a known mitogen) to offer high level ERK phosphorylation as a side by side comparison.  We think that this offers clear evidence that Netrin signaling is inconsistent with inducing cell proliferation.  We’ve also updated citations in the introduction to include citation 26 that offers a previously reported paradigm of Netrin-ERK signaling in axon outgrowth that is a non-cancer, non-proliferative context to remind readers that Netrins utilize MEK-ERK differently.

      We highlight Netrin-MEK-ERK signaling as key to survival for a number of reasons.  First, Netrin signaling in this paradigm does not fit the dependence receptor paradigm where loss of Netrin receptors protects against cell death.  Fig. 5B rules this out as receptor loss never offers a survival advantage, but clearly receptor deletions compromise survival in suspension culture.  Second, positive Netrin signaling is known to support survival by inactivating phosphorylation of DAPK1.  We’ve added this experiment as Fig. 5 Supplement 1D and show that loss of Netrin receptors doesn’t reduce DAPK1 phosphorylation in a time course of suspension culture.  Consequently, we conclude this isn’t the survival signal either.  Since MEK and ERK family members scored in our screen, we investigated their role in survival.  We now show two different MEK inhibitors with different inhibitory mechanisms to confirm that MEK inhibition induces cell death. In addition to the previous PD184352 inhibitor in our first submission, we’ve added Trametinib as well and this is shown in Fig. 5G.  Since it is surprising that MEK inhibition can kill instead of just arrest proliferation, we’ve also added another cell death assay in which we show trypan blue dye exclusion as a second look at survival.  This is now Fig. 5H.  Lastly, we include Trametinib inhibition of ERK phosphorylation in these assays in Fig. 5I.  While we leave open what takes place downstream of ERK, our model in Fig. 5J offers a very detailed look at the components upstream.

      - Does inhibition of ERK prevent the abdominal spread of ovarian cancer cells? The authors may feel that this is out of the scope of the study, which I would agree with, but then the claims regarding ERK being the major mediator of the effects of netrin signaling should be perhaps slightly toned down.

      We agree that loss of function xenograft experiments will enhance our discovery of Netrin’s role in dormancy and metastasis.  We have added a new Fig. 6 that uses xenografts with Netrin receptor deficient OVCAR8 cells (UNC5 4KO).  It demonstrates that two weeks following IP engraftment we can isolate spheroids from abdominal washes and that cells have entered a state of reduced proliferation as determined by lowered expression of Ki67 as well as other proliferation inducing genes.  In the case of UNC5 4KO cells, there is significant attrition of these cells as determined by recovering spheroids in adherent culture (Fig. 6C) and by Alu PCR to detect human cells in abdominal washes (Fig. 6D).  Lastly, xenografts of UNC5 4KO cells cause much less aggressive disease and significantly extend survival of these mice (Fig. 6E,F).  This is not exactly the experiment that the reviewer is asking for, but it is a clear indication that Netrin signaling supports survival in a xenograft model of dormancy.

      - Notwithstanding that this could be deduced from figures 6D and F, it would be helpful if the number of mice used in each experimental group is clearly annotated in the corresponding figure legends. Moreover, indicating the precise statistical tests that were used in the figures would be helpful (e.g., specifying whether anova is one-way, two-way, or?)

      We have added labels to what is now Fig. 8B to indicate the number of animals used for each genotype of cells.  We have also updated figure legends to include more details of statistical tests used in each instance.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The majority of the conclusions are well supported by strong experimental evidence. The only area where that is not fully the case is the role of Pak1 as a downstream effector of FoxG1-FoxO6 and its effects on macropinocytosis. To further strengthen this claim, the authors should demonstrate that ablation of Pak1 can rescue the functional consequences of forced FoxO6 expression and whether overexpression of Pak1 rescues quiescence exit in FoxO6 knockout.

      Thank you to the reviewer for these helpful suggestions. To investigate the effects of Pak1 ablation, and therefore more directly the link between FOXG1 and FoxO6 and macropinocytosis, we tested the published Pak1 inhibitor IPA-3. Unfortunately, to distinguish the role of Pak1 in quiescence exit and macropinocytosis, we would need a dosage of IPA-3 that is efficacious but does not affect cell proliferation. It was not possible to optimise such a dosage: 10 µM has been shown to be efficacious at inhibiting Pak1 (Verma et al, 2020; Wong et al, 2013); however, even at 2.5 µM we see significant cell death in our cells. This is potentially due to pleiotropic roles for Pak1.

      Also, it is not feasible to overexpress Pak1 in the FoxO6 KO cells with inducible FOXG1. To ensure we are investigating quiescence exit this would need to be in an inducible manner; however, re-transfecting cells using the PiggyBac system would potentially alter FOXG1 transgene levels by excising the existing transgene.

      As shown in Figure S3, we do not observe clear vacuole formation in F6 (FOXG1-inducible) cells upon Dox addition. As detailed in the discussion, we hypothesise that FoxO6-induced macropinocytosis could represent a stalled state, with other pathways downstream of FOXG1 necessary to be activated concomitantly to ensure cell cycle re-entry, e.g., through increased pinocytic flux that cannot be assessed within our experimental timeframes. Indeed, active Pak1 has been found to modulate pinocytic cycling, enhancing both FITC-dextran uptake and efflux (Dharmawardhane et al, 2000). We therefore would not hypothesise that high Pak1 levels alone would be sufficient to drive quiescence exit.

      Alternatively, the macropinocytosis observed may be a metabolic stress response because of the hyperactivation of signalling pathways upon FoxO6 overexpression. Hyperactivation of Ras signalling, canonical Wnt and PI3K signalling have all been shown to play roles in inducing macropinocytosis (Overmeyer et al, 2008; Tejeda-Muñoz et al, 2019; Recouvreux & Commisso, 2017).

      We believe the observed macropinocytosis phenotype upon Foxo6 overexpression, and the changes in Pak1 expression upon Foxo6 loss or FOXG1 induction provide interesting insights into the function of this underexplored FoxO family member. However, currently we are unable to demonstrate a direct link between these processes and have therefore modified the text to reflect this (see lines 292-4, 330-3, 365-8).

      • The manuscript stresses the role of NSC quiescence exit in GBM and demonstrates that FoxG1 KO reduces FoxO6 levels in a murine GBM cell line but a BMP4-mediated quiescence and dox-induced FoxG1 over-expression or an abolishment of cell cycle re-entry thereof by reduced FoxO6 levels in the case of FoxG1 KO is lacking. But this would significantly substantiate the relevance of the findings.

      Mouse GBM cells have elevated levels of FoxG1 and have been shown to be refractory to BMP4-mediated quiescence entry, maintaining colony formation following BMP treatment (Bulstrode et al, 2017). It is therefore challenging to specifically investigate cell cycle re-entry/quiescence exit using these mouse GBM cells, or indeed any GBM cell line, due to their inability to respond fully to BMP cues (Caren et al, 2015). It has also been shown by Bulstrode et al, 2017 that Foxg1 null mouse neural stem cells show an increased propensity to exit cycle in response to BMP treatment, and reduced colony formation on return to EGF/FGF-2 growth factors. FOXG1 null cell lines therefore show a reduced response to BMP cues, making it difficult to explore quiescence exit per se. To navigate this, we instead investigated Dox-induced FOXG1 overexpression in FoxO6 WT and KO mouse NS cells, which display similar quiescence characteristics upon BMP treatment (Figure 4).

      • In the introduction and discussion, FoxO6 is mentioned for its oncogenic roles in various cancers but no reference to GBM specifically is cited. It feels like a missed opportunity to not show evidence of this in the IENS cell line that has reduced levels of FoxO6; is there an effect in their proliferative capacity? What are the expression levels of Pak1 following FoxG1 KO in IENS cells?

      Thank you for the helpful suggestion. It is indeed true that the literature on FoxO6 in GBM is lacking, which explains the absence of citations on this. On investigating expression of the proliferation marker Ki67 in these cells we found no significant difference, now shown in Figure 1H. This is in keeping with previous findings of our lab (Bulstrode et al, 2017) which show that FOXG1 is dispensable for the maintenance of continued NSC or GSC proliferation in vitro. We investigated the expression levels of Pak1 following FOXG1 KO in IENS and found a decrease in both KO lines compared to parental cells (updated Figure 6F).

      As explained in our discussion, these data suggest that Foxg1/FoxO6/Pak1 are not functionally important in sustaining GSC/NSC proliferation, as shown by the lack of proliferation defects upon Foxg1 or FoxO6 deletion (Bulstrode et al, 2017), but impact regulatory transitions, as cells prepare to exit quiescence into the proliferative radial-glia like state.

      Minor comments

      - Fig1A shows 4 and 2-fold respectively for the two mouse NSC lines, not 17 and 4-fold increase as written on manuscript, please adjust accordingly.

      The qRT-PCR data are presented as log2(fold change) or -ddCt, where this value equals zero for the calibrator sample, as indicated in the figure legends and axes. The data are presented in this way to enable accurate visualisation of up- and down-regulation of gene expression. Data are stated as ‘fold increase’ in the text for ease of reading, which we have clarified in the text and figure legends (e.g. lines 154 and 176).
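
      The relationship between the plotted values and the fold increases quoted in the text can be illustrated with a short sketch. It assumes the standard ddCt calculation with a doubling of product per cycle (100% amplification efficiency); the function names and Ct values are hypothetical and are not taken from the manuscript.

```python
def log2_fold_change(ct_target_sample: float, ct_ref_sample: float,
                     ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Return -ddCt, which equals log2(fold change) under the doubling assumption."""
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return -dd_ct


def fold_change_from_log2(log2_fc: float) -> float:
    """Convert a plotted log2(fold change) value back to a linear fold change."""
    return 2.0 ** log2_fc


# The calibrator compared with itself gives 0 on the log2 axis (1-fold).
print(log2_fold_change(24.0, 15.0, 24.0, 15.0))

# Plotted values of 4 and 2 correspond to 16-fold and 4-fold increases.
print(fold_change_from_log2(4.0), fold_change_from_log2(2.0))
```

      Read this way, an axis value near 4 is a roughly 16-fold increase and an axis value near 8 is a roughly 250-fold increase, which is why the figures and the text quote different-looking numbers for the same data.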

      • Fig2G manuscript reports a 235-fold upregulation, but graph looks more like a 7 or 8-fold as shown on Fig1A for the F6 NSC line. I would recommend checking the fold changes reported throughout the paper.

      See previous comment above. The qRT-PCR data are presented as log2(fold change) or -ddCt, where this value equals zero for the calibrator, as indicated in the figure legends and axes. The data are presented in this way to enable accurate visualisation of up- and down-regulation of gene expression. Data are stated as ‘fold increase’ in the text for ease of reading, which we have clarified in the text and figure legends (e.g. lines 154 and 176).

      • The manuscript describes the increase of FOXG1 after BMP4-induced cell cycle exit as compared to non-BMP4 treated cells (p.8 first paragraph), but I am wondering if this expression is rather compared to dox negative and not vs BMP4 negative treatment.

      Data are presented relative to the non-BMP treated (EGF/FGF-2) control throughout the manuscript for consistency. This is to enable changes in expression between -Dox and +Dox to be visualised throughout the quiescence-exit time course relative to the initial starting population in EGF/FGF-2 growth media, prior to BMP treatment.

      • In Fig2G it is interesting that FoxO6 is upregulated in BMP4 treated throughout the experiment with highest values at day10 post treatment. At the same time, non-BMP4 treated cells keep decreasing their FoxO6 levels dramatically but there is no mention or reference to this effect.

      In Figure 2G, all cells have been treated with BMP4, prior to return to growth media (EGF/FGF) with or without Dox. It is true that in the +Dox condition with FOXG1 induction, FoxO6 levels continue to increase up to Day 10, perhaps reflective of the expansion of a highly proliferative radial glia-like population.

      • Fig2 would benefit from a western blot like Fig1D where FoxG1 and FoxO6-HA protein levels are also shown in dox-treated comparing BMP4-treated vs non-treated.

      Due to the lack of specific FoxO6 antibodies and the absence of a FoxO6-HA tag in this cell line, it is not possible to perform protein analysis of FoxO6 levels in this figure as for Figure 1D.

      • The colonies in Fig3E should be quantified, as their ability to form neurospheres seems somewhat compromised upon FoxO6 KO. Fig3B and 3F could perhaps be consolidated into one panel in the interest of space and presentation.

      Good suggestion. We have now consolidated Fig 3B and 3F into one panel (now Figure 3F) as suggested by the reviewer. We performed additional replicates for Figure 3E to quantify the colony formation efficiency. This showed a small, non-significant decrease in colony forming ability in the KO cells (Figure 3E). Importantly, the FoxO6 null cells do form colonies, and our results show that FoxO6 is not essential for proliferation or colony formation of NSCs in EGF/FGF-2. This therefore does not account for the complete loss of colony formation we see in the FoxO6 KO cells upon FOXG1 induction.

      • Fig4A shows vs "parental" non-BMP on y axis but wouldn't this show fold change of dox+ parental vs parental. The authors should clarify this.

      All samples in Figure 4A are compared to parental cells in EGF/FGF-2, i.e. non-BMP treated, as the calibrator sample where log2(fold change) equals zero. We chose to set a single calibrator sample for all data (parental and FoxO6 KO cells included) to allow us to compare changes in FOXG1 transgene across the entire experiment.

      • Perhaps the authors can add a non-BMP4 treated count of % FOXG1 positive cells to Fig4C for reference.

      As shown in Figure 4A, both parental and FoxO6 KO cells show similar, i.e. negligible, FOXG1 transgene expression without Dox, compared to the parental non-BMP4 treated control, therefore negligible FOXG1-V5 positive cells are seen by ICC. We have edited Figure 4A to include a non-BMP treated and BMP-treated control to show the negligible FOXG1-V5 expression by qPCR as controls.

      • The sentence mentioning Fig5D for the first time (p.10 third paragraph) needs rephrasing for clarity and should also call out Fig5C for the mCherry expression live cell imaging data where appropriate. Fig5D does not appear to be live imaging as implied by the text. If vacuole formation is observed already as early as 10-11h after Dox induction, then it should be shown somewhere in Fig5. Vacuole formation is shown with a higher magnification image inset only in the 22h timepoint image. I think Fig5E should be more substantiated with some sort of quantification, e.g. % of vacuoles positive for EEA1 and/or LAMP1.

      We apologise for this. The first reference to Figure 5D on line 234 should refer to Figure 5C; this has now been corrected in the text. Vacuoles are visible in Figure 5C panel 10 h 30 min; however, to make this clearer we have also supplied an accompanying movie of the live imaging (Movie 1). The imaging in Fig 5E has not been quantified as this imaging was performed with the purpose of confirming that the vacuole structures seen are not simply enlarged lysosomes, due to their similarity in appearance to those published elsewhere (Ramosaj et al, 2021; Leeman et al, 2018). Instead, we have provided Western blotting data in Figure S5E to support the conclusion that there is no clear increase in EEA1 or LAMP1 (early endosomal or lysosomal) expression upon FoxO6-HA induction.

      - Could the authors comment on the lack of proliferative advantage of the FoxO6 overexpression. FigS3 shows Edu staining, but there is no proliferation assay in either Fig5 or S3. What would be the effect of FoxO6 overexpression on BMP4-mediated quiescence with or without FoxG1 over-expression?

      Induction of FoxO6-HA overexpression does not provide a proliferative advantage to the cells. Looking at individual cells, those with high FoxO6-HA levels seem to associate with EdU negativity. In Figure S3 we provide a quantitative EdU incorporation assay as a proliferation assay (quantification of the number of cells cycling, and therefore incorporating EdU, within a 24h pulse period). Quantification of the EdU staining in Figure S3G is provided in Figure S3H. We have now clarified this in the text on page 11, lines 263-4.

      Unfortunately, due to transgene overexpression using the PiggyBac transposon method, it is not feasible to overexpress FoxO6 and FOXG1 in the same cell line, as re-transfecting cells using the PiggyBac system would potentially alter FOXG1 transgene levels and make results difficult to interpret. Given the association of vacuolated cells with EdU negativity, we predict that FoxO6 overexpression would not give an advantage for quiescence exit. Indeed, BMP-treated cells with FoxO6 overexpression show a decrease in EdU positivity, as shown in Figure S3H. As discussed in the text, we hypothesise that cells with FoxO6 overexpression are in a stalled state, potentially due to signalling hyperactivation. While this may not be physiological, it gives us clues as to the function and downstream targets of FoxO6, which remain uncharacterised.

      *- Can the authors clarify if there is a proliferation change in F6 cells in Fig6F as in Fig2F? Fig6F shows Pak1 is already upregulated in quiescent NSCs, what are the expression levels of Pak1 in FoxO6 -/- ANS4 cells upon FoxG1-mediated quiescence exit as shown in Fig4? Is there a particular reason why the F6 cell line data is shown only up to day2 post Dox-induction rather than d4 or d10? For consistency with the rest of similar experimental data this timeline should be extended. Does Pak1 remain elevated, plateaus or keeps reducing further post day2? *

      The data in (previous) Figure 6F are from the same assay and cell line as presented in Figure 2, but at an early timepoint (Day 2) during the quiescence exit assay. We have provided in the panel qRT-PCR analysis of Ki67 to show that cells begin to show increased proliferation at this timepoint. Due to our hypothesis that Pak1 is required at an early transition point, we decided to analyse this expression at an earlier timepoint than Figure 2. We have also repeated this at D10 (data below), showing Pak1 levels continue to increase with time, along with FoxO6 and the proliferative marker Ki67. Due to technical issues with variable FOXG1 transgene levels we were unable to analyse Pak1 expression levels in FoxO6+/- ANS4 cells upon FOXG1-mediated quiescence exit.

      *Reviewer #1 (Significance (Required)): *

      The study provides a conceptual advance for exit from stem cell quiescence. There is strong evidence provided for murine neural stem cells, but the link to GBM cancer stem cells is less developed (but perhaps this is the subject of a separate manuscript).

      While FoxG1 is a known regulator of neurodevelopment and glioblastoma, the functions of FoxO6 have not been studied in the context of neural stem cells. In my view, this study should be of high interest to audiences in both neurodevelopment and cancer research. * Expertise: glioblastoma, cancer stem cells, neurodevelopment *

      We have edited the text and title to clarify that neural stem cells are used here as a model for GSCs with high levels of FOXG1 (e.g. lines 36 and 69).


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *Major comments: *

      -The choice of NSCs as a main experimental model to understand the effects of FoxG1 and FoxO6 is not fully justified. The authors had previously shown that FoxG1 is expressed at very low levels in NSCs (Fig. 1A in Bulstrode et al. 2017). FoxO6 also seems to be barely expressed in NSCs (Fig. 1 of the current manuscript) and, in addition, its levels seem to go further down as cells exit quiescence (-Dox line in Fig. 2H). Therefore, these two genes do not seem to play an important role in the normal exit from quiescence of NSCs, with FoxO6 only affecting FoxG1 overexpression-induced exit from quiescence. * * *If the aim is to mimic a GBM-like state by FoxG1 overexpression, this should be made much clearer in the text, including title and abstract. In that case, the authors should also show a direct comparison of the levels of FoxG1 in GBM and upon Dox-induced overexpression in NSCs. *

      We agree with this criticism and suggestion to fix this. It is indeed our aim to mimic a GBM-like state by inducing FOXG1 overexpression and we should have made that more explicit. All experiments are performed in the context of high FOXG1 level. Like Foxg1, FoxO6’s homeostatic roles may be subtle in adulthood, and mostly involved in neural plasticity (Yu et al, 2019). This is in keeping with our finding that basal FoxO6 levels are low in adult NSCs and not required for sustained proliferation but are important for cell state transitions. If the FoxO6 levels activated by elevated FOXG1 represent an acquired dependency of GBM, there may be a therapeutic window to target this pathway. However, given the poorly understood roles of FoxO6, further work is needed to determine its specific value as a therapeutic target. We have modified the title and the text to make this clearer. This is also stated in the first paragraph of the results section on page 7 (line 148).

      We have provided below a Western Blot (Bulstrode, 2016) in which FOXG1 levels in F6 cells induced with Dox (1000 ng/ml, the dosage used) are compared with those in the GBM cell lines G7 and G144 and the normal NS cell line U5. This shows that the FOXG1 levels induced are significantly higher than those found in normal neural stem cells (mouse or human). This model has been previously used and published in Bulstrode et al, 2017, upon which this manuscript expands.

      *-While the authors state that they aim to study NSC quiescence, they use a protocol that is closer to modelling astrocytic differentiation. In fact, in their previous work, they use this very same protocol (removal of growth factors and addition of BMP) to study the role of FoxG1 and Sox2 on astrocyte de-differentiation (Bulstrode et al. 2017). While there is arguably no perfect in vitro model of NSC quiescence, the current standard in the field is treatment with both BMP and FGF for 48 to 72 hours (e.g.: Mira et al., 2010, Martynoga et al., 2013, Knobloch et al., 2017, Leeman et al., 2020). BMP alone is regarded as a pro-astrocytic differentiation cue, and 24 hours might not be enough for NSCs to fully commit to either differentiation or quiescence. Therefore, either the claims in the paper are changed to match the astrocytic differentiation model, or a standard quiescence protocol should be used throughout to confirm the findings also apply to the exit from quiescence of NSCs. *

      We agree with the reviewer that there is indeed no perfect in vitro model of NSC quiescence and thank the reviewer for this useful discussion. Coincident with this project, this was an active area of research from our laboratory as explored by Marques-Torrejon et al, 2021 (Nature Comms). After 24 h BMP4 treatment, we found that adult mouse NS cells: exit cell cycle, are growth factor unresponsive, obtain an astrocytic morphology, upregulate astrocytic markers such as Gfap and Aqp4, and downregulate radial glia/NS cell markers such as Nestin and Olig2 (Figure 3).

      We therefore initially viewed them as terminally differentiated. However, the exact state of these cells is difficult to define due to the lack of definitive markers and transcriptional differences that can distinguish terminally differentiated GFAP-expressing astrocytes from quiescent type B SVZ NS cells (which also express GFAP) (Bulstrode et al, 2017; Doetsch et al, 1999; Codega et al, 2014). Findings from our laboratory later suggested some NS cell markers are maintained following BMP4 treatment and these cells can be forced back into cycle with combined Wnt/EGF signalling, or FGF/BMP signalling (Marques-Torrejon et al 2021). This suggests in vitro NS cells may lie along a continuous spectrum of states from dormant quiescent, activated quiescent (primed for cell cycle re-entry) to actively proliferating, similar to that observed in vivo in the mouse SVZ (Dulken et al, 2017). Indeed, after 24 h BMP4 treatment, we observe a minimal level of colony formation in no Dox controls following 10 days of exposure to the growth factors EGF/FGF-2 (Figure 2D-F).

      These non-cycling BMP4-induced astrocytic cells might therefore be better viewed as dormant quiescent NSCs, hence our reference as quiescent NSCs. The assay conditions used in this manuscript differ to those of Marques-Torrejon et al, in terms of density and length of BMP4 treatment; it is therefore likely that our BMP-treated cells are at different stages along the continuum between dormancy and primed quiescent states. Importantly, regardless of the exact cell type induced by 24 h BMP4 treatment, we have considered the changes induced by FOXG1 overexpression, in comparison to the effect of NS cell media alone.

      *-The FoxO6-induced vacuole formation in NSCs is a very interesting finding. However, so far it was only observed upon FoxO6 overexpression. To claim vacuolization is required for quiescence exit, the authors should show whether this phenomenon is also observed upon normal exit from quiescence and FoxG1-induced reactivation of NSCs. From the author's own data, Pak1 (which induces vacuolization) is unlikely to reactivate NSCs, as its expression is highest in BMP-treated cells (Figure 6F). The authors should show whether some vacuolization is present at these stage in NSCs and if not, discuss the possible interplay between Pak1 and FoxO6 in vacuole formation and quiescence exit. *

      As detailed in the discussion, we hypothesise that FoxO6-induced macropinocytosis could represent a stalled state, with other pathways downstream of FOXG1 necessary to be activated concomitantly to ensure cell cycle re-entry, e.g., through increased pinocytic flux that cannot be assessed within our experimental timeframes. Indeed, active Pak1 has been found to modulate pinocytic cycling, enhancing both FITC-dextran uptake and efflux (Dharmawardhane et al, 2000). Alternatively, the macropinocytosis observed may be a metabolic stress response because of hyperactivation of signalling pathways upon FoxO6 overexpression. Hyperactivation of Ras signalling, canonical Wnt and PI3K signalling have all been shown to play roles in inducing macropinocytosis (Overmeyer et al, 2008; Tejeda-Muñoz et al, 2019; Recouvreux & Commisso, 2017).

      We do not see clear evidence of vacuoles in FOXG1-induced reactivation of NSCs – this supports that the macropinocytosis seen upon FoxO6 overexpression is a stalled state or due to hyperactivation. While this may not be physiological, it gives us clues as to the function and downstream targets of FoxO6, which remain uncharacterised (such as a link of FoxO6 and FOXG1 with Pak1-related pathways). Demonstrating a requirement for vacuolisation in quiescence exit is outwith the scope of this manuscript and therefore we are careful not to claim this. We have modified the text to clarify this.

      As the reviewer noted, it is interesting that Pak1 is highest in BMP-treated cells; it seems that BMP signalling itself is triggering elevated Pak1 levels, likely as cells undergo extensive cell shape changes during the transition from proliferation to quiescence. However, in EGF/FGF-2, Pak1 levels decrease, and our data suggests that FOXG1/FoxO6 are required to increase or maintain Pak1, potentially to again enable the cell shape/metabolic changes required on quiescence exit. We have added to the text to expand upon this observation on page 14 (lines 330-333).

      -Finally, the data on the regulation of Pak1 expression by FoxO6 is insufficient to draw any strong conclusions. Downregulation of Pak1 in FoxO6 cells is not enough evidence to claim a direct regulation. The authors should show whether Pak1 levels are increased after FoxO6 overexpression and whether FoxG1 is downregulated in FoxO6 KO NSCs (indirectly affecting Pak1 expression).

      We have performed qRT-PCR analysis of Foxg1 expression in FoxO6 KO NSCs and see no consistent difference in expression, indicating this is not indirectly affecting Pak1 expression (see below, 1). We have also investigated Pak1 levels upon FoxO6 overexpression, over a time course following Dox addition (see below, 2). Interestingly, when FoxO6 is overexpressed, Pak1 is not clearly upregulated at any time-point. It may be that, as Pak1 is already expressed in the -Dox controls due to its roles in a variety of cellular functions, its levels are already saturated. It is clear that Pak1 expression decreases upon FoxO6 loss in EGF/FGF (without coincident Foxg1 downregulation) and in F6 cells, higher FOXG1 correlates with higher Pak1 in EGF/FGF. Together with the induction of macropinocytosis upon FoxO6 overexpression, these data provide interesting insights into the potential pathways downstream of FoxO6 in controlling quiescence exit, directly or indirectly related to Pak1 signalling. We have modified the text to reflect this on page 14 (lines 330-333).

      Minor comments: * Please state in the main text that NSCs are derived from the SVZ. *

      This has been added to the text on page 7 (line 149) and is in the methods ‘Cell Culture’ section.

      Reviewer #2 (Significance (Required)):

      As I said before, I find this work tackles a very important question, how is the exit from quiescence controlled in NSCs. This manuscript will be of interest to researchers in the fields of adult stem cell biology and adult neurogenesis. While my expertise lies mostly on NSC biology, this work is of potential great interest for the cancer field, particularly for brain cancer research. Elucidating the mechanisms GBM cells use to exit quiescence is crucial in order to avoid the relapse of this aggressive form of brain cancer. To increase the relevance of the work to the cancer community, some of the key findings should be reproduced with GBM cells. It would be particularly important to show whether Pak1 induced vacuolization and macropinocytosis can be observed in GBM cells.

      As detailed in the discussion, we hypothesise that FoxO6-induced macropinocytosis could represent a stalled state, with other pathways downstream of FOXG1 necessary to be activated concomitantly to ensure cell cycle re-entry, e.g., through increased pinocytic flux that cannot be assessed within our experimental timeframes. Alternatively, the macropinocytosis observed may be a metabolic stress response because of hyperactivation of signalling pathways upon FoxO6 overexpression. Hyperactivation of Ras signalling, canonical Wnt and PI3K signalling have all been shown to play roles in inducing macropinocytosis (Overmeyer et al, 2008; Tejeda-Muñoz et al, 2019; Recouvreux & Commisso, 2017). We do not see clear evidence of vacuoles in FOXG1-induced reactivation of NSCs – this supports that the macropinocytosis seen upon FoxO6 overexpression is a stalled state or due to hyperactivation. We do not therefore think macropinocytosis per se would be observed in quiescence exit of GBM cells – indeed a normal form of macropinocytosis-induced cell death called methuosis has been observed in GBM cells with hyperactivated Ras signalling (Overmeyer et al, 2008). However, this phenotype still gives us clues as to the function of FoxO6 in quiescence exit in GSCs and the downstream signalling pathways it may regulate, such as Pak1-related signalling (discussed on lines 330-3 and 366-9).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: * The overall objective of the paper is to investigate the mechanisms by which co-option of the activity of developmental master lineage regulators by cancer cells allows them to gain fitness. To answer this question, they focus on FOXG1. This TF acts during the specification of the telencephalon. Its expression can be increased in Glioblastoma (GBM) and, more importantly for the paper, FOXG1 has previously been shown to promote exit from quiescence of glioblastoma stem cells (GSCs) and non-transformed neural stem cells (NSCs). In a previous screen, the authors identified FoxO6 as a potential direct target gene of FOXG1. In this paper, they showed that with the gain of expression for FOXG1 in NSCs and loss of FOXG1 in GSCs, FoxO6 is increased or decreased, respectively. Loss of FoxO6 in NSCs does not alter their cell cycle or cell shape and specification. Yet, loss of FoxO6 in NSCs blocks FOXG1-mediated exit from quiescence. To understand the mechanisms, they decided to overexpress FoxO6 in NSCs and demonstrated that the cells undergo macropinocytosis, a process by which cells can engulf large amounts of nutrients from the external medium. It remains to be determined whether this macropinocytosis occurs in cells overexpressing FOXG1 and GSCs. The authors provide a first answer by showing that overexpression of FOXG1 induces not only FoxO6 but also the expression of PAK1, one of the key kinases that regulates the membrane engulfment of macropinocytosis in NSCs. In GSC lines, the decrease of FOXO6 decreases PAK1 levels. *

      Major comments: * The paper describes interesting and convincing results (number of cell lines, repeated experiments seems sufficient) but it is difficult to reconcile them all in a single model, and this diminishes the impact of the study. Epistatic interactions between FoxG1, FoxO6, PAK1 and macropinocytosis are not always studied in the same cell models. Whether FOXG1-induced exit from quiescence of NSCs is dependent on a FOXG1-->FOXO6-->PAK1-->Macropinocytosis axis remains to be demonstrated. Whether such an axis also operates in tumor cells remains to be fully assessed. In particular, if FoxO6 overexpression in NSCs can induce macropinocytosis, is this cellular process induced by FoxO6 downstream of FOXG1 activity during NSC quiescence exit? Is PAK1 a relay of FoxO6? Experiments looking at macropinocytosis and the involvement of PAK1 in the cell models of Figure 4 will definitely help to bridge the different results all together. *

      We thank the reviewer for this useful insight and discussion for future work.

      To directly investigate the effects of Pak1 ablation, and therefore more directly the link between FOXG1 and FoxO6 and macropinocytosis, we tested the published Pak1 inhibitor IPA-3. Unfortunately, to distinguish the role of Pak1 in quiescence exit and macropinocytosis, we would need a dosage of IPA-3 that is efficacious but does not affect cell proliferation. It was not possible to optimise such a dosage: a dosage of 10 µM has been shown to be efficacious at inhibiting Pak1 (Verma et al, 2020; Wong et al, 2013); however, even at 2.5 µM we see significant cell death in our cells. Indeed, this is potentially due to the variety of cellular functions Pak1 is involved in. Conversely, it is not feasible to overexpress Pak1 in the FoxO6 KO cells with inducible FOXG1. To ensure we are investigating quiescence exit this would need to be in an inducible manner; however, re-transfecting cells using the PiggyBac system would potentially alter FOXG1 transgene levels (through excision of the existing transgene) and therefore make results difficult to interpret.

      We hypothesise that FoxO6-induced macropinocytosis could represent a stalled state, with other pathways downstream of FOXG1 necessary to be activated concomitantly to ensure cell cycle re-entry, e.g., through increased pinocytic flux that cannot be assessed within our experimental timeframes (as detailed in the text discussion). Alternatively, the macropinocytosis observed may be a metabolic stress response because of hyperactivation of signalling pathways upon FoxO6 overexpression. Hyperactivation of Ras signalling, canonical Wnt and PI3K signalling have all been shown to play roles in inducing macropinocytosis (Overmeyer et al, 2008; Tejeda-Muñoz et al, 2019; Recouvreux & Commisso, 2017). We do not see clear evidence of vacuoles in FOXG1-induced reactivation of NSCs – this supports that the macropinocytosis seen upon FoxO6 overexpression is a stalled state or due to hyperactivation and therefore not a physiological process in quiescence exit. We do not therefore think macropinocytosis per se would be observed in quiescence exit of GBM cells – indeed a normal form of macropinocytosis-induced cell death called methuosis has been observed in GBM cells with hyperactivated Ras signalling (Overmeyer et al, 2008).

      However, we believe the observed macropinocytosis phenotype upon FoxO6 overexpression, and the changes in Pak1 expression upon FoxO6 loss or FOXG1 induction, provide interesting insights into the function of this underexplored FoxO family member in GSCs and the downstream signalling pathways it may control, such as Pak1-related signalling. We have modified the text to reflect the limitations of our current data and discuss this (lines 330-3 and 366-9).

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1 (Public Review):

      He et al. investigate the requirement and function of Blimp1 (encoded by Prdm1) in murine NK cells and ILC1. Employing a conditional knockout mouse model (Prdm1flox x Ncr1cre), the authors describe impaired abundance and maturation of Prdm1-deficient NK cells and ILC1 in different tissues. Blimp1-deficient NK cells have reduced expression of cytotoxic molecules (Gzmb, Prf1) and, in some instances, Ifng production, and Prdm1flox x Ncr1cre mice show impaired tumor control in experimental metastasis models. Using single-cell RNA sequencing analysis, the authors propose that Prdm1 regulates JunB expression and NK cell maturation. Based on in silico analyses, the authors suggest manifold intercellular communication between NK/ILC1 and macrophages. Without following up on any of these potentially interesting suggestions, the authors conclude their study reiterating that Prdm1 regulates IFNg-production of tumor-infiltrating NK cells and ILC1. Many of the reported functions of Blimp1 in NK cells have previously been identified using a mixed-chimera strategy comparing Prdm1 WT and KO NK cells (Kallies et al., Blood 2011). Here, the authors expand on these findings using a conditional model to delete Prdm1 in NK/ILC1 and single-cell sequencing and provide a more refined analysis of the functions of Blimp1 in these cells. Cell-chat analysis suggests close interactions of Blimp-dependent NK/ILC1 subsets with hepatic macrophages, but these suggestions are not followed up by experiments. Potentially interesting differences in the macrophage compartment of Ncr1-Cre x Prdm1-fl/fl mice are suggested by the scRNA-Seq data but are not validated e.g. by FACS. The study falls short in providing new mechanistic insights. Nevertheless, it is an interesting confirmation of "old" suggestions in a more refined setting, and the provided single-cell mRNA-Seq data represents a potentially valuable resource for the community. 
      There are some control analyses that are required to support the conclusions of the authors, and I have a few suggestions that would help to improve the manuscript.

      We sincerely appreciate your careful review and insightful feedback on our manuscript. We have carefully considered your comments and present the results of new experiments conducted in response to your suggestions. Please find the detailed responses below.

      Major comments

      Comment 1: The authors do not control for the potential effects of Cre expression. Expression of Cre from within the Ncr1 locus (using the mouse model established by Narni-Mancinelli et al.) has significant effects on NK cells and especially ILC1s (reducing their frequency and absolute numbers and altering their functionality). The authors should characterize the Ncr1cre mice used here (developed by Shanghai Model Organism Center) in this regard and should use proper controls (Ncr1Cre+ Prdm1wt/wt as control for Ncr1Cre+ Prdm1fl/fl, instead of WT littermates) for all of their key data, e.g. those depicted in Fig 1FG, 2ADFH, 7D, S2,3,4.

      Response 1: This is a very insightful question that has posed a challenge for many researchers, including us, engaged in conditional knockout studies. The expression of Cre and the insertion of loxP sequences both have the potential to influence gene expression. This is because the region where loxP is inserted may contain regulatory sequences for the gene of interest. Ncr1-Cre is a frequently used transgenic mouse model in our laboratory. In our prior research, we also had concerns about the possible impact of Cre on NKp46 expression, which could lead to a decline in NK cell function. Therefore, in our previous studies focused on Smad4 expression in NK cells, we conducted similar experiments. In Figure 6 of our published paper in the Journal of Clinical Investigation (Wang et al., J Clin Invest, 2018), we compared NKp46-iCreTgfbr2fl/flSmad4fl/WT with NKp46-iCreTgfbr2fl/flSmad4fl/fl. Although the primary purpose was to establish Smad4's independence from TGF-β, it also allows for a comparison between Smad4fl/fl and Smad4fl/WT in the presence of Cre. In the critical phenotype we assessed, NKp46-iCreTgfbr2fl/flSmad4fl/fl (compared with NKp46-iCreTgfbr2fl/flSmad4fl/WT) exhibited the same phenotype as NKp46-iCreSmad4fl/fl (compared with NKp46WTSmad4fl/fl). This suggests that Cre's influence on NK cells may be within a reasonable and controllable range. Furthermore, in contrast to the decrease in Ncr1 expression caused by Cre, the reduction in the expression levels of genes targeted by loxP knockout, such as Prdm1 in this study (Figure 1 E), is more significant. Therefore, with the current techniques and research methods, we believe that the data provided in this study can support the role of Prdm1 in NK cells.

      Comment 2: Several of the phenotypic findings on NK cells have been described before by Kallies et al. in 2011 (Ref 29), although using a different genetic Prdm1-ablation model (Prdm1-GFP/GFP knockin/knockout model). This study reported impaired NK cell maturation, reduced Gzmb expression, impaired in vivo cytotoxicity against subcutaneous RMA-S cells, impaired in vitro proliferation, comparable in vitro killing, increase in BM NK cell numbers. The authors should discuss/mention this more prominently in their manuscript, and highlight where they confirm or refine these previous findings, and where they actually provide new information.

      Response 2: We appreciate your valuable suggestions. The article you referred to, published in Blood, is indeed an excellent work. While we had cited this article, our discussion regarding its specific content was limited. Based on your advice, we have made revisions and included the following content in our discussion section (page 24; line 489-493):

      “In a study involving systemic knockout combined with competitive transplantation, it was found that Prdm1 promotes NK cell maturation and the expression of Gzmb. On the contrary, the same study also found that NK cells with Prdm1 deficiency exhibit heightened proliferation, increased survival, enhanced migratory abilities towards tumors, and greater cytotoxicity against subcutaneously implanted RMAS tumors (31).”.

      Comment 3: What is the reason to refer to the enriched cluster in Blimp1-deficient NK cells as "Junbhi"? There is no follow-up for a function of Junb, and there are many other genes upregulated in these cells. Most critically, these cells seem to represent exactly the c-Kithi cells that Kallies et al. already showed and discussed in their paper. The authors should stain for Kit, and also refer to this. Also, MacKay et al. performed Blimp1-Chip-Seq (in T cells), maybe it would be interesting to check to which of the identified DEGs Blimp1 can bind.

      Response 3: We appreciate the suggestion from the reviewer. We think a gene that supports the development of lymphocytes doesn't necessarily positively regulate their function. For example, JunB is essential for T cell development but can also induce T cell exhaustion (Lynn et al., Nature. 2019). Therefore, while Prdm1 has been shown to promote NK cell development, it cannot be assumed that it always positively regulates NK cell function, especially for anti-cancer immune surveillance. In this respect, we sought a driving factor for the impaired anti-tumor ability of Prdm1ΔNcr1 NK cells. Although many other genes are upregulated in this cluster (e.g. Kit), JunB is of greater interest to us because of its potential to regulate NK cell function in cancer, whereas c-Kit is more likely a marker of NK cell maturation, as has been well demonstrated by Kallies et al. and other studies. Our previous studies also showed that the expression of c-Kit was decreased in mature NK cells compared with immature NK cells (Wang et al., J Clin Invest, 2018).

      We did not perform follow-up experiments on Junb because we could not find suitable surface markers with which to investigate the function of the Junbhi cNK cluster. Using intracellular markers would more likely amount to an analysis of gene expression patterns, which is already well described by our RNA-seq data. As described above, our study did not aim to further investigate the role of Prdm1 in NK cell maturation, as c-Kit expression was upregulated in Prdm1-knockout NK cells and correlated with NK cell maturation, which has been validated by Kallies et al.

      We also have discussed the potential DEGs that could be bound and regulated by Prdm1 in our revised manuscript (page 27-28; line 561-571):

      “Prdm1 and Hobit directly bound and repressed Tcf7 (18), which encoded TCF-1, a TF binding and limiting the activity of Gzmb regulatory element (69). Gzmb has been demonstrated directly bound and activated by Junb in NK cells, which suggested Gzmb expression regulated by multiple Prdm1/Hobit downstream signals (26). In human T cells, binding motif of JUNB was enriched in the binding sites of PRDM1 (70), indicating the essential role of PRDM1-JUNB axis during NK cell and T cell development. In NK cells deficient in Prdm1 expression, we noted a decrease in Gzmb levels alongside with an elevation in Junb expression. This indicates that Prdm1 not only facilitates the expression of Gzmb in NK cells but also suppresses Junb expression. Given that Junb is recognized as a positive regulator of Gzmb (71), this presents a complex interplay that seems contradictory. Therefore, it is imperative to develop a theoretical framework to comprehensively understand and interpret this paradoxical relationship.”.

      Comment 4: cNK cells are considered circulating cells, that transiently pass through the liver.

      Previous studies have suggested almost identical gene expression patterns in hepatic and splenic NK cells. In functional tests, they often "perform" identically. I am therefore a bit surprised that the authors find a differential dependency of Blimp1 for the IFNg production of splenic (no role of Blimp1) versus hepatic (Blimp1 regulating IFNg production) NK cells (Fig S3). Do the authors have any suggestions on that? The analyses are performed by 12+4h stimulations with IL12/18, which could involve the effects of altered bystander cells (as suggested by Figure 6). Therefore, these analyses should be provided upon standard 4h stimulations with IL12/18 and also with PMA/I under BFA. Note: liver and splenic cNK cells look quite different in the chosen histograms in Figures 7 A, B, C, yet there is massive variability in these analyses - is there any systematic/technical problem?

      Response 4: We appreciate the valuable suggestion from the reviewer. Studies have suggested that, at the gene expression or transcriptomic level, liver NK cells exhibit more similarity to splenic NK cells while displaying greater divergence from liver ILC1s. However, we do not think that splenic NK cells or peripheral blood NK cells (which are more abundant in circulation) are entirely indistinguishable from liver NK cells. Notably, there are substantial differences in their maturity levels, with liver NK cells being more mature. Since we are examining the protein levels, a 4-hour stimulation period may not fully capture these distinctions. Even when considering the potential impact of bystander cells, the experimental design specifically targets Prdm1 knockout within NK cells, ensuring that the study accurately elucidates the role of Prdm1 in NK cells. For each experiment, we have implemented control measures, and any variances observed in the figures may be attributed to individual variations among the animals. It is also possible that the MFI values measured by flow cytometry exhibit larger variations than a percentage.

      Comment 5: Figure 4 H/I - In contrast to NK cells in Fig 4E, F, the KO and WT ILC1s seem to co-cluster largely. Authors should validate differentially expressed genes. How strong is the effect of Blimp1 in ILC1s? Or is Blimp1 a critical TF driving effector differentiation in NK cells, while it has only subtle effects in ILC1 (these may be regulated by Hobit?)? This seems an interesting finding that should at least be discussed. For these types of small differences in ILC1, FACS confirmation analyses should be performed and findings be reevaluated using Cre-expressing controls (see above).

      Response 5: We appreciate the suggestion from the reviewer. As requested, we analyzed the DEGs in liver cNK cells and ILC1s from our scRNA-seq data (revised Supplemental Figure 8, A and B). There were only a few notable DEGs in ILC1s compared with cNK cells. It is likely that Prdm1 has a more essential effect on the transcriptional program of cNK cells, while it plays a more important role in maintaining the homeostasis of the ILC1 population. We have discussed these points to better inform the readers (page 27; line 554-561):

      “Previous studies have identified Hobit and Prdm1 as central regulators instructing tissue-dependent programs and retention of diverse tissue-resident lymphocytes (18, 51, 53). Liver ILC1s required Hobit, but not necessary for cNK cells (6). Expression of Prdm1 was remarkably higher in cNK cells versus ILC1s (18). While in our study, cNK cells and liver ILC1s reduced simultaneously in Prdm1ΔNcr1 mice, and even more significant in ILC1s. This indicates that while Prdm1 is expressed at lower levels in ILC1s, its role in preserving the quantity of ILC1s may be more crucial. Thus, Prdm1 and Hobit may have parallel program in instructing ILC1s functional development and maturation.”. 

      We could not identify a suitable surface marker for evaluating the changes in ILC1s, as most of the changes involve intracellular markers.

      Comment 6: The authors describe and discuss some of Figure 1 and 2 data as if Blimp1 would be involved in alternative NK versus ILC1 fates, but there is no evidence for this.

      Response 6: There is no evidence that Prdm1 alters the fate decision of the progenitor towards liver cNK cells or ILC1s. Although some studies have reported conversion between cNK cells and ILC1s in special contexts, it is widely accepted that liver cNK cells and ILC1s originate from different progenitors. While we observed changes in the proportions of liver cNK cells and ILC1s in Prdm1 KO mice, we still lack sufficient evidence to support the relative independence of NK and ILC1 development, as well as evidence to indicate that Prdm1 is exclusively responsible for NK and ILC1.

      Regarding the changes in NK and ILC1 proportions after Prdm1 KO, we believe that both NK and ILC1 cells require Prdm1 to maintain their populations, with ILC1 possibly requiring it to a greater extent. This is the reason for the altered balance between NK and ILC1 cells following Prdm1 KO. We wish to clarify this point to prevent any misconceptions among readers. To address this, we have added the following content to the discussion section (page 25; line 509-516):

      “Furthermore, although both liver NK cells and liver ILC1s require Prdm1 to maintain their quantity, liver ILC1s demonstrate a more pronounced dependency on Prdm1. However, it is currently widely believed that liver NK cells and liver ILC1s originate from different progenitors. It is worth noting that while we observed changes in the NK and ILC1 proportions after Prdm1 knockout, our data does not support the hypothesis that Prdm1 affects progenitor differentiation decisions, thereby influencing the fate selection of NK and ILC1. Further research may be needed to elucidate how Prdm1 regulates the balance between NK cells and ILC1s.”.

      Comment 7: There are several recent studies suggesting a role for Hobit, homologue of Blimp1, in NK cells and in ILC1, and in the control of liver metastases. The authors should discuss similar and unique functions of Hobit and Blimp1, also in the regulation of gene expression patterns, and should refer to these studies.

      Response 7: We thank the reviewer for these insightful comments, which bring forth a critical perspective. In accordance with the reviewer's suggestion, we have updated our discussion to include the diverse functions of Hobit and Prdm1 in regulating the development and function of cNK cells and ILC1s (page 27; line 554-561):

      “Previous studies have identified Hobit and Prdm1 as central regulators instructing tissue-dependent programs and retention of diverse tissue-resident lymphocytes (18, 51, 53). Liver ILC1s required Hobit, but not necessary for cNK cells (6). Expression of Prdm1 was remarkably higher in cNK cells versus ILC1s (18). While in our study, cNK cells and liver ILC1s reduced simultaneously in Prdm1ΔNcr1 mice, and even more significant in ILC1s. This indicates that while Prdm1 is expressed at lower levels in ILC1s, its role in preserving the quantity of ILC1s may be more crucial. Thus, Prdm1 and Hobit may have parallel program in instructing ILC1s functional development and maturation.”.

      As shown in Supplemental Figure 8, we analyzed two published scRNA-seq datasets generated from HobitKO mice and integrated the DEGs in cNK cells and ILC1s with our data. We observed overlapping DEGs between Prdm1ΔNcr1 and HobitKO in cNK cells and ILC1s, such as Junb, Tcf7, Gzmb, and Prf1 (Supplemental Figure 8), indicating similar regulatory networks of Prdm1 and Hobit. These data are now described on page 19; lines 386-395:

      “We also compared the gene expression patterns between Prdm1 and Hobit (homologue of Blimp1) with two published scRNA-seq data (51, 53). Following the knockout of Hobit, the DEGs were primarily identified within ILC1s. Conversely, after the knockout of Prdm1, a greater number of DEGs were observed in cNK cells. This indicates that Prdm1 likely possesses a broader range of target genes within cNK cells, whereas Hobit appears to have a more pronounced impact on gene expression within ILC1s (Supplemental Figure 8, C-F). There are some overlaps between the downstream transcriptional profile of Prdm1 and Hobit in liver cNK cells and ILC1s (Supplemental Figure 8, G and H), such as Junb, Fosb, Tcf7, Kit, Gzmb, Prf1, and Cxcr6 was simultaneously upregulated or downregulated in both Prdm1ΔNcr1 and HobitKO liver cNK cells or ILC1s, indicating the similar regulatory networks of Prdm1 and Hobit.”.
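
      The overlap analysis described in this passage can be sketched as a set operation on two DEG tables. The following is a minimal illustration only, not the authors' pipeline: the gene names (GeneA to GeneD) and log fold-change values are hypothetical placeholders, and the function name is an assumption made for this example. A gene counts as shared only if it appears in both knockout comparisons and changes in the same direction.

```python
def concordant_overlap(degs_a, degs_b):
    """Return genes present in both DEG tables whose log fold changes
    have the same sign (both up or both down)."""
    shared = set(degs_a) & set(degs_b)
    return sorted(g for g in shared if (degs_a[g] > 0) == (degs_b[g] > 0))


# Hypothetical log2 fold changes (knockout vs. control); placeholder genes.
knockout_1 = {"GeneA": 1.2, "GeneB": -0.8, "GeneC": 0.5}
knockout_2 = {"GeneA": 0.9, "GeneB": 0.4, "GeneD": -1.1}

shared_genes = concordant_overlap(knockout_1, knockout_2)
print(shared_genes)  # GeneB is in both tables but changes in opposite directions
```

      Requiring the same direction of change is a design choice in this sketch; a looser definition would keep every gene found in both tables.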

      Comment 8: Figure 4: The authors should discuss (and cross-validate) their liver gene expression analyses in the context of published datasets of NK and ILC1, such as the ones by Lopez et al, Friedrich et al, Ducimetiere et al and Yomogida et al.

      Response 8: We thank the reviewer for raising this important point. To address this question, we have now analyzed the gene expression of liver cNK cells and ILC1s in two of the published datasets mentioned above, also in the context of Hobit knockout. We compared gene expression across clusters and described the results in our revised manuscript (page 19; lines 386-395):

      “We also compared the gene expression patterns between Prdm1 and Hobit (homologue of Blimp1) with two published scRNA-seq data (51, 53). Following the knockout of Hobit, the DEGs were primarily identified within ILC1s. Conversely, after the knockout of Prdm1, a greater number of DEGs were observed in cNK cells. This indicates that Prdm1 likely possesses a broader range of target genes within cNK cells, whereas Hobit appears to have a more pronounced impact on gene expression within ILC1s (Supplemental Figure 8, C-F). There are some overlaps between the downstream transcriptional profile of Prdm1 and Hobit in liver cNK cells and ILC1s (Supplemental Figure 8, G and H), such as Junb, Fosb, Tcf7, Kit, Gzmb, Prf1, and Cxcr6 was simultaneously upregulated or downregulated in both Prdm1ΔNcr1 and HobitKO liver cNK cells or ILC1s, indicating the similar regulatory networks of Prdm1 and Hobit.”.

      Recommendations For The Authors:

      Comment 9: The use of a paired t-test analysis when comparing cells/groups from different mice is not correct. Instead, the authors should consider using e.g. an unpaired t-test and re-test the indicated significance (e.g. Figure 1F, Figure 2H).

      Response 9: We thank the reviewer for the comment. Because we used littermates for the experiments and compared them side by side, we consider the paired t-test acceptable. We reanalyzed the significance of the results in Figure 1F and Figure 2H using an unpaired t-test. For Figure 1F, the statistical significance obtained with the unpaired t-test was the same as that obtained with the paired t-test. However, in Figure 2H, the reduction in IFN-γ production did not reach statistical significance with the unpaired t-test (Supplemental Figure 12B). This may be attributable to variation between different littermates, but the trend is still consistent with our conclusion. We believe that a paired t-test between littermates is also meaningful. We therefore kept both statistical methodologies to ensure a thorough evaluation.
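
      To illustrate why the two tests can disagree, the sketch below computes the paired and the unpaired (Student, pooled-variance) t statistics on the same made-up numbers. The values are hypothetical and are not data from the study; they are chosen so that the spread between littermate pairs is large while the within-pair difference is small and consistent. The function names are assumptions made for this example.

```python
import math
import statistics


def paired_t(x, y):
    """t statistic of a paired t-test (degrees of freedom: n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def unpaired_t(x, y):
    """Student's two-sample t statistic with pooled variance
    (degrees of freedom: n1 + n2 - 2)."""
    n1, n2 = len(x), len(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(
        pooled_var * (1 / n1 + 1 / n2)
    )


# Hypothetical percentages; each position is one littermate pair.
wild_type = [30.0, 45.0, 60.0, 75.0]
knockout = [25.0, 41.0, 54.0, 70.0]

t_paired = paired_t(wild_type, knockout)
t_unpaired = unpaired_t(wild_type, knockout)
print(f"paired t = {t_paired:.3f}, unpaired t = {t_unpaired:.3f}")
```

      With these numbers the paired statistic is far larger than the unpaired one, because pairing removes the between-pair variance from the error term. Whether pairing is justified depends on the experimental design, which is the point under discussion between the reviewer and the authors.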

      Comment 10: In several instances, it is unclear whether data are pooled or representative (and if so, of how many analyses). This information needs to be provided for all analyses. 

      Response 10: We apologize for the lack of detail and have now provided sufficient information in our figure legends.

      For example, we deleted the numbers in the original histograms to avoid confusion about whether the data are pooled or representative (e.g. original Figure 7 A-C; revised Figure 7 A-C). Furthermore, we added the word “representative” to the figure legends of all flow cytometric plots to better inform readers (e.g. original Figure 2, D and F; revised Figure 2, B and D).

      Comment 11: In the title and abstract authors use "type 1 ILCs" for both NK cells and ILC1, and it is difficult to understand which phenotypes correspond to cNK cells versus ILC1. Most of the analyses clearly separate these two different cell types. I would appreciate a lot being more accurate in the abstract, and describing cNK and ILC1 phenotypes in a clear way.

      Response 11: We apologize for our inaccurate descriptions. Following Spits et al. (Nature Reviews Immunology, 2013) and other related studies, we have now adopted a more appropriate nomenclature: “conventional NK cells” are referred to as “cNK cells”, “type 1 innate lymphoid cells” as “ILC1s”, and “group 1 ILCs” is used as the collective name for cNK cells and ILC1s.

      The definitions of these cells are given in the Introduction (page 4, line 52-53; line 58-62):

      “Group 1 ILCs consist of cNK cells and ILC1s (1, 2), with distinct developmental trajectories and effect molecules (3).”, “In a state of homeostasis, liver group 1 ILCs (CD45+CD3-NK1.1+NKp46+) can be discriminated into cNK cells and ILC1s by the differential expression of CD49a and CD49b (2): cNK cells are marked by the expression of CD49b, while liver ILC1s exhibit a distinctive positivity for CD49a. Tumor Necrosis Factor Related Apoptosis Inducing Ligand (TRAIL) is also expressed on liver ILC1s, but not on cNK cells (10, 11).”. 
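
      The gating logic in this definition can be written as a small decision rule. This is a minimal sketch under stated assumptions: each event is represented by already-thresholded booleans (the thresholds themselves are not given in the text), the function names are hypothetical, and leaving CD49a/CD49b double-positive or double-negative events unassigned is an assumption of this example rather than something the authors state.

```python
def is_group1_ilc(cd45, cd3, nk11, nkp46):
    """Pre-gate: CD45+ CD3- NK1.1+ NKp46+."""
    return cd45 and not cd3 and nk11 and nkp46


def classify_group1_ilc(cd49a, cd49b):
    """Split liver group 1 ILCs by CD49a / CD49b expression."""
    if cd49b and not cd49a:
        return "cNK"
    if cd49a and not cd49b:
        return "ILC1"
    return "unassigned"  # assumption: ambiguous events are not forced into a class


def classify_event(cd45, cd3, nk11, nkp46, cd49a, cd49b):
    """Apply the pre-gate first, then the CD49a / CD49b split."""
    if not is_group1_ilc(cd45, cd3, nk11, nkp46):
        return "not group 1 ILC"
    return classify_group1_ilc(cd49a, cd49b)


print(classify_event(True, False, True, True, cd49a=False, cd49b=True))
print(classify_event(True, False, True, True, cd49a=True, cd49b=False))
```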

      We also describe the cNK and ILC1 phenotypes in our scRNA-seq data, as shown on page 13; line 259-261:

      “cNK cells expressed high levels of Itga2 (CD49b) and Eomes, while ILC1s had high levels expression of Itga1 (CD49a) and Tnfsf10 (Supplemental Figure 5, F and G).”.

      Comment 12: In the abstract authors state "The present study unveiled a novel regulatory mechanism of Prdm1 in liver Type 1 ILCs, showing promising potential for developing innovative immune therapy strategies against liver cancer." - maybe authors should discuss how their findings could be used for therapeutic approaches?

      Response 12: We appreciate the comments from the reviewer. Until now, there has been no clear consensus on the role of Prdm1 in NK cells, and some studies have suggested that Prdm1 can inhibit cytokine secretion by NK cells. In particular, Kallies et al. (Blood, 2011) found that Prdm1 might suppress NK cell anti-tumor activity. Hence, no immunotherapy targeting Prdm1 in NK cells has been developed for cancer treatment. Our research demonstrates that Prdm1 enhances NK cell anti-tumor activity, providing theoretical support for NK cell therapies targeting Prdm1.

      We added the following content to the discussion section (page 29; line 605-609):

      “Further research may provide deeper insight into the role of PRDM1 in the anti-tumor function of human NK cells, enabling a more direct investigation of its application in cancer therapies. Given its important role in preserving liver cNK cells and ILC1s functional heterogeneity, enhancing Prdm1 function in human NK cells could potentially be a strategy to promote NK cell-based immunotherapy for cancer.”.

      Comment 13: The authors should explain or interpret their data a bit more (e.g. what is the consequence of GSEA enriched in negative regulation of Il6 production? (Fig. 3D)  do NK cells produce Il6 (Figure 3)? What's the impact of Il17 signaling in NK/ILC1 (Figure 5). Do the authors suggest JunB-driven metabolic reprogramming (Suppl. Fig 6D-F?).

      Response 13: We appreciate the comments from the reviewer. The question of IL-6 production in NK cells was also raised by another reviewer. We checked the GSEA results and found no notable genes related to IL-6 production in NK cells. Following the suggestion of the other reviewer (Response to Reviewer 2, Comment 14), it may be prudent to omit this figure.

      The IL-17 signaling signature points to plasticity of ILC1s, which may originate from the differentiation of ILC3s. We have added more discussion of this point (page 17; line 341-344):

      “Several ILC3 signature genes, such as Rora, Tmem176a, and Tmem176b (45), highly expressed in this cluster (Supplemental Figure 7D). Considering the close relationship between IL-17 mediated immunity response and ILC3 (1, 46), it is plausible that Il7rhi ILC1 cluster may be attributed, at least in part, to potential plasticity between ILC1 and ILC3 subsets.”.

      The decreased mitochondrial function may have more relevance to NK cell exhaustion in tumors. Our data suggest that the elevated expression of JunB in NK cells may predispose them to exhaustion. Currently, our hypothesis regarding the promotion of NK cell exhaustion by high JunB expression is based on the observed correlation between JunB expression levels and exhaustion phenotypes (at the gene expression and IFN-γ secretion levels) and the findings in reference 67 (Lynn et al., Nature, 2019), where JunB was found to promote T cell exhaustion. However, we have not demonstrated causation between high JunB expression and exhaustion in NK cells. We propose that in NK cells, especially mature NK cells, excessive JunB expression may make them more sensitive to exhaustion inducers. Nevertheless, further research is needed to confirm this. To clarify this, we added the following content in the discussion section (page 26; line 537-543): 

      “While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to Junbhi cluster, demonstrates an exhaustion-like phenotype.

      The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells(65), requires further investigation.”.

      Comment 14: Ref 25 and Ref 57 are the same publication?

      Response 14: We apologize for this careless mistake. We have checked all the references and corrected the formatting errors.

      Comment 15: Figure 1, E - The method description of RT-PCR is missing. I apologize if I have overlooked this information.

      Response 15: We have now added the description of RT-PCR in our revised method section (page 31; line 638-644):

      “RNA was extracted from FACS-sorted NK cells or splenocytes using RNASimple Total RNA Kit (TIANGEN Biotech, 4992858) and subsequently reverse transcribed to cDNA with SuperScript VILO Master Mix (Thermo Fisher Scientific, 11755050) according to manufacturer’s instructions. qPCR was performed with SYBR Green Mix (Thermo Fisher Scientific, A25742) and CFX Opus 96 Real-Time PCR System (Bio-Rad). The relative mRNA expression level was calculated using 2^(-ΔΔCt) method. Primer sequences: Prdm1: 5’-CAGAAACACTACTTGGTACA-3’; 5’-GATTGCTTGTGCTGCTAA-3’.”
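
      For readers unfamiliar with the calculation, the relative expression method named in this passage reduces to a few subtractions and one exponent. The sketch below is illustrative only: the Ct values are hypothetical, the function name is an assumption made for this example, and the reference (housekeeping) gene is left unnamed because the quoted methods text does not state which one was used.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative mRNA level of a target gene in a sample versus a control,
    normalized to a reference gene, using 2^(-ddCt)."""
    d_ct_sample = ct_target_sample - ct_ref_sample      # dCt in the sample
    d_ct_control = ct_target_control - ct_ref_control   # dCt in the control
    dd_ct = d_ct_sample - d_ct_control                  # ddCt
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target amplifies two cycles later in the
# sample than in the control, with an unchanged reference gene.
fold = relative_expression(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold)
```

      A ddCt of 2 corresponds to a fourfold lower relative expression, under the usual assumption that amplification efficiency is close to 100% for both primer pairs.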

      Comment 16: Figure 1, F - The NKp46+CD3- gate for the liver seems to cut the population, not all cells are included.

      Response 16: We appreciate the reviewer’s comment and apologize for our carelessness. We expanded our data with more samples and reanalyzed them with a more convincing gating strategy. We have now updated our figures (revised Figure 1G; revised Supplemental Figure 2A). Several changes have occurred in the data and conclusions as a result, and we have revised the corresponding content in our manuscript accordingly.

      The original text is:

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage of cNK cells (CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues except bone marrow and lymph nodes (Figure 1F; Supplemental Figure 2A). However, no significant difference was observed in the percentage of cNK cells among bone marrow-derived lymphocytes between Prdm1ΔNcr1 and Prdm1+/+ mice. The absolute number of cNK cells in blood, lung, liver, and spleen also decreased in Prdm1ΔNcr1 mice (Figure 1F; Supplemental Figure 2A). Only a slight decrease in the number of cNK cells was observed in the lymph nodes of Prdm1ΔNcr1 mice, which did not reach statistical significance either (Supplemental Figure 2A). In contrast, the absolute number of cNK cells in Prdm1fl/fl mice bone marrow is moderately higher than Prdm1ΔNcr1 mice (Figure 1F).”

      The revised text is (page 8; line 142-146):

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage and absolute number of NK cells (CD45+CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues, whereas increased number of NK cells were observed in bone marrow (Figure 1G; Supplemental Figure 2A).”

      Comment 17: Figure 1, The y-axis labeling of lung CD3-NKp46+ cells (x10^3) is not correct.

      Response 17: We apologize for our carelessness. We have now checked the labels and made sure they are correct.

      Comment 18: Figure 1, The statistical significance of absolute numbers of NKp46+ cells in the bone marrow should be reviewed.

      Response 18: We expanded our data with more samples and reanalyzed them with a more convincing gating strategy. We observed a significant increase in the number of bone marrow NK cells in our updated data. These changes are now described in our revised manuscript.

      The original text is: 

      “However, no significant difference was observed in the percentage of cNK cells among bone marrow-derived lymphocytes between Prdm1ΔNcr1 and Prdm1+/+ mice”, “In contrast, the absolute number of cNK cells in Prdm1fl/fl mice bone marrow is moderately higher than Prdm1ΔNcr1 mice (Figure 1F).”

      The revised text is (page 8; line 142-146):

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage and absolute number of NK cells (CD45+CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues, whereas increased number of NK cells were observed in bone marrow (Figure 1G; Supplemental Figure 2A).”

      Comment 19: Figure 1, G - CD27 and CD11b are used to define maturation stages within NK cells. Here the authors are analyzing group 1 ILC instead (containing both NK cells and ILC1, especially in the liver). It would be better to pre-gate on Eomes+ or CD49b+ NK cells for this analysis.

      Response 19: We apologize for the lack of detail in this analysis. We pre-gated on CD49b+ NK cells for the maturation stage analysis. We have now added this statement to our revised manuscript and figure legend (page 8; line 149-151):

      “The maturation of cNK cells (gated by CD45+CD3-NK1.1+NKp46+CD49b+) from blood, bone marrow, lung, liver, spleen, and lymph nodes were assessed, based on the expression of CD11b and CD27.”.

      Comment 20: Supplementary Figure 1, A - The NKp46+CD3- gate seems to cut the population, not all cells are included. y-axis labeling of spleen CD3-NKp46+ cells (%) is not correct.

      Response 20: Thank you. We have corrected these errors, as shown in our revised Supplemental Figure 2A.

      Comment 21: Figure 2, D-G - Did the authors analyse the ILC1/NK compartment of the tumor? What is the abundance and phenotype of these cells dependent on Prdm1 expression? Proper Cre-controls should be used (see above).

      Response 21: We appreciate the suggestions from the reviewer. As requested, we have now added an analysis of the cNK/ILC1 populations in the tumor context. The changes in the proportions of cNK cells and ILC1s in Prdm1ΔNcr1 mice were similar to those seen without tumor burden, while the numbers of both cNK cells and ILC1s decreased in the tumor-bearing liver (revised Figure 7D). This content has been updated in our revised manuscript (page 23; line 479-481):

      “The proportion changes of cNK cells and ILC1s in Prdm1ΔNcr1 mice was similar with the no tumor-burden condition, while the number of both cNK cells and ILC1s have significant decreased in tumor-bearing liver (Figure 7D).”.

      The reason why we did not use Cre-controls is described in our response to Comment 1.

      Comment 22: Figure 2, H - Prdm1-deficient NK and ILC1 produce less Ifng in response to in vitro stimulations with Il-12 and /or Il-18, and bulk Seq analysis (Fig 3F) shows reduced Il12rb2 expression. Does the expression of cytokine receptors correlate with the maturation of NK cells? This could be analyzed from the single-cell RNA-seq dataset. The statistical significance of %Ifng after Il12/Il18 stimulation should be revisited (see above).

      Response 22: We thank the reviewer for the suggestions. To address this question, we explored the expression of IL-12 and IL-18 receptors in cNK and ILC1 clusters. Within cNK clusters, Il12rb2, Il18r1 and Il18rap were highly expressed in the Prf1hi and Cxcr3hi cNK clusters (revised Supplemental Figure 6H), indicating that IL-18 receptor expression correlates with NK cell maturation. In ILC1s, these receptors were mostly expressed in the Il7rhi and Gzmbhi ILC1 clusters (revised Supplemental Figure 7C). The significant decrease in Il18r1 expression in Prdm1ΔNcr1 cNK cells and ILC1s may be associated with their impaired ability to produce IFN-γ. We have now added this analysis (page 18; line 364-368):

      “Within cNK cells, Il12rb2, Il18r1 and Il18rap was highly expressed in Prf1hi and Cxcr3hi cNK clusters (Supplemental Figure 6I), indicating the IL-18 receptor expression correlated with the NK cell maturation. While in ILC1, these receptors mostly expressed on Il7rhi and Gzmbhi ILC1 clusters (Supplemental Figure 7D). Significant decreased of Il18r1 expression in Prdm1ΔNcr1 cNK cells and ILC1s may associated with the impaired ability to produce IFN-γ.”.

      The unpaired t-test of IFN-γ production is shown in revised Supplemental Figure 12B. The difference in IFN-γ production in the original Figure 2H was not significant when analyzed using an unpaired t-test. However, significance was observed in tumor-bearing liver cNK cells and ILC1s, specifically in the context of IL-12/IL-18 stimulation, as depicted in the original Figure 7E using an unpaired t-test. These variations may be attributed to differences among littermates. Despite these variations, the trend remains consistent with our overall conclusions. We believe that a paired t-test between littermates is also meaningful. We therefore kept both statistical methodologies to ensure a thorough evaluation.

      Comment 23: Figure 3, A-E - For bulk sequencing analysis, splenic CD3-NK1.1+NKp46+ were isolated. This population also contains ILC1 in the spleen (e.g. Flommersfeld et al.), although much less abundant compared to NK cells, and compared to the liver compartment. However, have the authors tested the abundance of splenic ILC1 in Prdm1-deficient mice, which may impact the gene expression data? In line with this the detection of altered Cxcr6 expression in Figure F, which is usually expressed by ILC1 rather than NK cells, may indicate an alteration in ILC1 numbers. The authors should validate the altered expression of CXCR6, Itga1, and Cx3cr1 on NK cells by flow cytometry.

      Response 23: We have cited the work of Flommersfeld et al. in our manuscript and have expanded our Results section to include the following information (page 19; line 377-385):

      “Previous research found that spleen NK cells could be divided into three distinct groups based on their expression levels of CD27, CD62L, CD49a, and CD49b (52). CD27+CD62L- NK cells have remarkable high expression of Batf3, while it was only barely expressed in CD27+CD62L+ and CD27-CD62L+ NK cells (52). Based the sequencing data published by Flommersfeld et al., (GSE180978), a notable negative correlation was observed between the expression levels of Prdm1 and Batf3 (Supplemental Figure 8I). On top of that, our findings unveiled the negative regulatory influence of Prdm1 on Batf3 within both spleen and liver NK cells. This discovery highlights a potential upstream mechanism that may influence the hemostasis of the spleen NK cell subpopulations through Batf3.”.

      We validated the expression of CD49a (Itga1) and CX3CR1 in liver cNK cells and ILC1s, as described in our revised manuscript (page 9; line 170-174, page 14; line 231-233):

      “Increased CD49a expression was also observed in Prdm1ΔNcr1 liver ILC1s, while it showed decreased expression in NKp46+ cells in the liver, bone marrow, and lymph nodes (Supplemental Figure 2, F and G).”, “The percentage of CX3CR1+ cNK cells was significantly decreased in multiple tissues of Prdm1ΔNcr1 mice, while the proportion of CX3CR1+ ILC1 was increased in the liver (Figure 3F).”

      Comment 24: Figure 3, F - Tnfsf26: which gene is this? is this a typo? Is a function of this gene in NK cells reported? Altered Batf3 expression suggests an impact on ILC1-like NK cells (Flommersfeld et al).

      Response 24: We are very sorry for our mistakes. We have removed Tnfrsf26 from the heatmap.

      Comment 25: Figure 3, G-J refer to Kallies data?! 

      Response 25: Kallies et al. reported reduced GzmB expression in Blimp1gfp/gfp mice. Going beyond that study, we further analyzed GzmB and perforin expression at different maturation stages of NK cells. The reduced GzmB expression was not due solely to the less mature phenotype of Prdm1-deficient NK cells, highlighting the role of Prdm1 in regulating NK cell function. We have added this content to the revised manuscript (page 12; line 233-242):

      “Lower GZMB and PRF1 production was observed in Prdm1-deficient splenic cNK cells, liver cNK cells and ILC1s (Figure 3, H-K; Supplemental Figure 4, A-I). Notably, the proportion of GZMB+ and PRF1+ cNK cells was decreased among almost all of the maturation stages of cNK cells (Figure 3, J and K). The relative mean fluorescent intensities (MFIs) of GZMB and PRF1 consistently show a reduction across all developmental stages in PrdmΔNcr1 NK cells (Supplemental Figure 4, H and I). Yet, no statistical difference of PRF1 was found within the CD11b-CD27+ and CD11b+CD27+ subsets, likely due to the relatively lower perforin levels in these populations (Supplemental Figure 4I). These findings suggest that Prdm1 may directly influence cytotoxic molecule in NK cells, rather than impacting their anti-tumor abilities solely by affecting the maturation phenotype of Prdm1-deficient NK cells.”

      In the Discussion section, where Kallies’s work is cited in the revised manuscript (page 24; line 500-502):

      “Our results not only confirmed a decrease in cytotoxic molecules in Prdm1-deficient NK cells (31) but also showed that the reduction in Gzmb and perforin is not solely attributable to the diminished maturation of these cells.”

      Comment 26: Figure 3, G, I - How do the authors explain the high variability of GzmB and Prf1 in Prdm1+/+ cells? 2 samples have comparable values to Prdm1-deficient cells.

      Response 26: This may be due to the inherent differences in MFI among different samples. In the revised version, we have added data on percentages, which exhibit much less variability (Figure 3, H and I). The MFIs of GZMB and PRF1 have been moved to Supplemental Figure 4, E and F.

      Comment 27: Did the authors test the mice for potential germline recombination of the floxed allele, which has been suggested as a potential problem of Ncr1cre?

      Response 27: We appreciate the insightful comment from the reviewer; this is an important question. In Prdm1fl/fl mice, germline recombination typically results in a systemic knockout of Prdm1, which can lead to embryonic lethality. Given that mice were successfully born in the current study, it is highly unlikely that germline recombination of Prdm1 occurred due to leaky expression of Cre.

      To confirm this, we isolated splenocytes and assessed Prdm1 expression using qPCR. We observed no significant difference in Prdm1 expression between splenocytes from Prdm1+/+ and Prdm1ΔNcr1 mice (revised Figure 1F). This also indicates that germline recombination is unlikely to be present in the Prdm1ΔNcr1 mice.

      Comment 28: Histograms do not show MFI

      Response 28: We appreciate the comments provided by the reviewer. The MFI value was omitted.

      Comment 29: Supplementary Figure 4, B - FACS plot labelling: Typo, Histograms do not show MFI.

      Response 29: We sincerely thank the reviewer for careful reading. The typo in this figure was corrected. The MFI is omitted.

      Comment 30: Figure 4, A - What are the cells in the red cluster in the middle of the UMAP, do they belong to B cells? Why do they cluster so separately? It is interesting, but also surprising that NK and ILC1 cluster map so far apart from each other (rather with CD8 or B cells? or NKT cells) - do the authors have any comments?

      Response 30: We sincerely apologize for the mistakes in labeling a group of cells in our previous analysis. Upon a thorough re-evaluation, we have corrected the labels of several cell clusters that were previously misidentified. The revised heatmap (revised Supplemental Figure 5C) represents the marker genes for each cluster. Additionally, in our updated analysis (revised Figure 4A), we have included clusters for Epithelial cells, CD4+ T cells, NKT cells, and Kupffer cells. Please note, the red cluster identified in the center of the original heatmap corresponds to the CD4+ T cells.

      We checked the markers of the cNK cell and ILC1 clusters and confirmed that they are labeled correctly, as Ncr1 and Klrb1c (NK1.1) were highly expressed in these clusters compared with the others (revised Supplemental Figure 5E).

      Comment 31: Does Junb expression correlate with the maturation stages of NK cells?

      Response 31: Our previous research indicated that during the maturation process of NK cells, there was a decrease in the expression levels of Junb (negative correlation), whereas there was an increase in the expression levels of Prdm1 (Wang et al., J Clin Invest, 2018; Supplemental Figure 5c and Supplemental Figure 11).

      Comment 32: The authors may consider validating their scRNA-seq data (e.g. by FACS analysis for highlighted markers, eg. cKit, Tcf7, Gzma, Cxcr3).

      Response 32: We appreciate the suggestion from the reviewer. We validated several marker genes, including Gzmb, Prf1, and Cx3cr1 by FACS, as shown in the revised Figure 3 F-K. Currently, FACS cannot distinguish liver NK cells into as many distinct clusters as can be achieved through scRNAseq analysis. However, we expect that as technology progresses, we will be able to enhance our validation of the scRNA-seq data.

      Comment 33: It is a bit unclear to me why authors refer to Cxcr3hi NK cells as tissue-resident. This is based on Cxcr3 and Ccr2 expression. To make this statement, a much more detailed analysis would be required. How are CD69, CD49a, or CXCR6 expression of these cells?

      Response 34: We appreciate the suggestion from the reviewer. The primary reason for classifying this specific cluster of NK cells as tissue-resident comes from the differentially expressed genes (DEGs) and Gene Ontology (GO) analysis, which show significant enrichment of chemokine receptor activity within this cluster.

      To support this statement more clearly, we checked the expression of the markers mentioned above, but only Cd69 was expressed in cNK clusters; it was highly expressed in Junbhi and Cxcr3hi cNK cells (revised Supplemental Figure 6D). We also used the top 30 DEGs in ILC1s versus cNK cells to calculate a module score in all cNK clusters, and Cxcr3hi cNK cells had the highest score among these clusters (revised Supplemental Figure 6D). This part has been updated in our manuscript (page 15; line 298-308):

      “The tissue-resident marker Cd69 was also highly expressed in this cluster (Supplemental Figure 6D). The enrichment of chemokine receptors among the genes upregulated in the Cxcr3hi cluster implies a greater likelihood of this cluster being tissue-resident compared with other cNK cell clusters (Figure 4H). To further confirm the tissue-resident properties of this cluster, we calculated the module score based on the top 30 DEGs in ILC1 versus cNK clusters, including Cxcr6, Itga1, Cd160, Cd226, etc. The Cxcr3hi cNK cluster has the highest score among all cNK clusters (Supplemental Figure 6H), indicating its similarity to liver ILC1s. In the tumor microenvironment, reports indicated that NK cells could transform into ILC1s (25). If this conversion of cNK cells into ILC1s also occurred under normal physiological conditions, then the Cxcr3hi cNK cell cluster might be the most susceptible to such transformation.”

      Comment 35: The authors suggest that Prdm1 regulates chemokine receptor expression. An alternative explanation could be that this is an indirect effect of altering the abundance of NK cell subsets.

      Response 35: We apologize for the lack of detail in these figures. The input cell number of each genotype has now been added to the following figure legends.

      Figure 4F, “Proportions of cNK cells among total cNK cells (left; 211 cells in Prdm1+/+, and 141 cells in Prdm1ΔNcr1) and within clusters (right).”; Figure 5C, “Proportions of ILC1s among total ILC1s in different genotypes (left; 114 cells in Prdm1+/+, and 63 cells in Prdm1ΔNcr1) and within each cluster (right).”; Figure 6C, “Proportions of MDMs and KCs among total macrophages in different genotypes (510 cells in Prdm1+/+, and 624 cells in Prdm1ΔNcr1).”

      To minimize the effects of discrepancies in input numbers between samples with different genotypes, we represented the relative proportions of each cluster within its specific genotype (e.g. Supplemental Figure 6B; Supplemental Figure 7B; Supplemental Figure 9B).
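
      The within-genotype normalization described above can be illustrated with a minimal sketch. It assumes each cell is recorded as a (genotype, cluster) pair; the function name, the cluster labels "A" and "B", and the per-cluster counts are hypothetical, and only the totals (211 and 141 cells) are taken from the Figure 4F legend.

```python
from collections import Counter

def within_genotype_proportions(cells):
    """cells: list of (genotype, cluster) pairs, one per cell.

    Returns {genotype: {cluster: fraction of that genotype's cells}},
    so each genotype's fractions sum to 1 regardless of input cell number.
    """
    totals = Counter(genotype for genotype, _ in cells)  # cells per genotype
    counts = Counter(cells)                              # cells per (genotype, cluster)
    props = {}
    for (genotype, cluster), n in counts.items():
        props.setdefault(genotype, {})[cluster] = n / totals[genotype]
    return props

# Hypothetical per-cluster counts; only the totals 211 and 141 come from the legend.
cells = (
    [("Prdm1+/+", "A")] * 100 + [("Prdm1+/+", "B")] * 111
    + [("Prdm1dNcr1", "A")] * 47 + [("Prdm1dNcr1", "B")] * 94
)
props = within_genotype_proportions(cells)
```

      Because each fraction is divided by the genotype's own total, a cluster can be compared across genotypes even when the two samples contributed different numbers of cells.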

      Comment 36: Supplementary Figures 6 and 7, A - The formatting of gene annotations does not fit the heat maps (the gene names on the last rows are missing).

      Response 36: We apologize for our careless mistakes. We have now addressed these mistakes.

      Comment 37: Supplementary Figures 6 and 7, What is the consequence of compromised mitochondrial function? Increase apoptosis?

      Response 37: In our experiments, we did not find that Prdm1 affects the apoptosis of NK cells. Rather, previous studies have found that Prdm1 might inhibit the proliferation of NK cells (Kucuk et al., PNAS, 2011). We acknowledge that there is ongoing debate regarding the precise definition of NK cell exhaustion. In our experiments, no changes were detected in the expression levels of surface markers (TIGIT) associated with exhaustion on NK cells following the knockout of Prdm1. However, we did note a significant reduction in the cytokine secretion capacity and tumor control efficacy of NK cells after Prdm1 knockout. We prefer to say that the consequence of compromised mitochondrial function might be increased exhaustion. As we mentioned in the discussion (line 482-483), mitochondrial fragmentation has been confirmed to be closely associated with NK cell exhaustion in tumors (Zheng et al., Nature Immunology, 2019). Although the evidence is not sufficient to define Prdm1ΔNcr1 NK cells as exhausted, our data support that compromised mitochondrial function is, at least in part, associated with the exhaustion-like phenotype of Prdm1ΔNcr1 NK cells in cancer.

      We have discussed these points in our revised manuscript (page 26; line 529-543): 

      “Mitochondria are pivotal organelles crucial for cellular metabolism. Disruptions in mitochondrial function have been linked to T Cell exhaustion, attributed to glycolytic reprogramming (66). Similarly, mitochondrial fragmentation has been closely associated with NK cell exhaustion (67).

      However, the concept of NK cell exhaustion isn't as firmly established as it is for T cells. Exhausted NK cells should primarily exhibit diminished functions. This is characterized by a diminished ability to destroy tumor cells, a reduced capability to activate other components of the immune system, and compromised proliferation and survival rates. Additionally, this reduced functionality is associated with a decline in the expression of molecules responsible for cytotoxic activity, lower production of IFN-γ, and metabolic disturbances that may arise from mitochondrial dysfunction. While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to as the Junbhi cluster, demonstrates an exhaustion-like phenotype. The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells (65), requires further investigation.”.

      Comment 38: Figure 5, Describing the scRNA Seq data, the authors are switching a lot between Figure 4 and Figure 5. Maybe a reorganization of the Figures (Figure 4: NK cell; Figure 5: ILC1) could help.

      Response 38: We appreciate the reviewer’s suggestion. We have now reorganized Figure 4 and Figure 5.

      Comment 39: Figure 5, We suggest naming one of the ILC1 clusters "Gzmbhi" to keep it consistent with the FACS data.

      Response 39: We agree with this excellent suggestion and have now renamed the “Gzmahi” ILC1 cluster the “Gzmbhi” ILC1 cluster.

      Comment 40: Figure 5, C - How was the JunB score derived (which genes were used)?

      Response 40: The JunB score was calculated based on the expression of marker genes of the Junbhi cNK cluster (DEGs in the Junbhi cNK cluster compared with other clusters, as shown in revised Supplemental Figure 6A). The score was calculated using the “AddModuleScore” function in R.
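
      As a rough illustration of what a module score measures, the sketch below computes, for each cell, the mean expression of a gene module minus the mean expression of a set of control genes. This is a simplified stand-in and not the authors' pipeline: the R function named above additionally bins genes by average expression and samples control genes from matching bins, which is omitted here. The function name, the gene lists, and the expression values are hypothetical toy data.

```python
from statistics import mean

def module_score(expr, module_genes, control_genes):
    """expr: {gene: [expression value per cell]}, all lists the same length.

    Returns one score per cell: mean expression of the module genes
    minus mean expression of the control genes in that cell.
    """
    n_cells = len(next(iter(expr.values())))
    scores = []
    for i in range(n_cells):
        module = mean(expr[g][i] for g in module_genes)
        control = mean(expr[g][i] for g in control_genes)
        scores.append(module - control)
    return scores

# Toy data: two cells, two module genes, two control genes (all values invented).
expr = {
    "GeneA": [2.0, 0.0],
    "GeneB": [4.0, 0.0],
    "Ctrl1": [1.0, 1.0],
    "Ctrl2": [1.0, 1.0],
}
scores = module_score(expr, ["GeneA", "GeneB"], ["Ctrl1", "Ctrl2"])
```

      A positive score means the module genes are expressed above the control background in that cell, and a negative score means they are expressed below it.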

      Comment 41: Figure 5, G, I - The authors highlight Il17 signaling pathway, what is the impact of Il17 on NK/ILC1? Did the authors check for ILC3 (Rorc expression) within the ILC1 cluster?

      Response 41: The enrichment of the IL-17 signaling pathway in Il7rhi ILC1s indicated that this cluster encompasses ILC1s that originate from the conversion of Rorγt+ ILC3s. Although Rorc expression was undetectable in all ILC1 clusters, we found several ILC3 marker genes highly expressed in this cluster (e.g. Rora, Tmem176a, Tmem176b) according to the ILC3 transcriptomes (Robinette et al., Nature Immunology, 2015).

      We have added these contents in our revised manuscript (page 17; line 341-344): 

      “Several ILC3 signature genes, such as Rora, Tmem176a, and Tmem176b (45), were highly expressed in this cluster (Supplemental Figure 7D). Considering the close relationship between the IL-17-mediated immune response and ILC3s (1, 46), it is plausible that the Il7rhi ILC1 cluster may be attributed, at least in part, to potential plasticity between ILC1 and ILC3 subsets.”.

      Comment 42: Figure 5, The authors detect more Ly49E+ cytotoxic ILC1 in Prdm1fl Ncr1cre mice.

      How does this observation fit to the reduced cytotoxicity of NK cells?

      Response 42: The proportion of Klrahi ILC1s was increased, while that of Gzmbhi ILC1s was decreased in Prdm1ΔNcr1 mice. Moreover, the total number of cells in the three ILC1 clusters was reduced in Prdm1ΔNcr1 mice.

      Comment 43: Line 350/351: Citation required.

      Response 43: We added the respective references (references 55 and 56).

      Comment 44: Figure 6, The Cell-chat analysis provides interesting suggestions, but none are experimentally addressed. It is also difficult to evaluate these analyses: are any of the Mac subsets altered in frequency or phenotype in either genotype? This could be analyzed from the single-cell data in Fig 4. At the very least, flow cytometric validation of predicted shifts in the Mac compartment should be confirmed.

      Response 44: We are grateful for these valuable suggestions. As requested, we analyzed macrophages and validated some of the scRNA-seq data by flow cytometry. We have rewritten this part to include the analysis of the altered proportions of two macrophage clusters (Kupffer cells and monocyte-derived macrophages) (page 20-21; line 399-436):

      “The scRNA sequencing analysis identified two well-established subpopulations of liver macrophages: the resident Kupffer Cells (KCs) and the Monocyte-Derived Macrophages (MDMs) (Figure 6, A-C; Supplemental Figure 9A). When comparing the total proportion of macrophages within the immune cell population of the liver between WT and Prdm1ΔNcr1 mice, there is an increase in Prdm1ΔNcr1 mice (Figure 6C). To confirm these findings, we utilized flow cytometry to define macrophages, including both KCs and MDMs, gating by CD45+Ly6G-F4/80+CD11b+ (Figure 6D).

      Our analysis showed that, following the deletion of Prdm1 in Group 1 ILCs, there is a significant increase in both the proportion and number of macrophages in the liver (Figure 6D).

      According to the transcriptional profile, liver macrophages were further clustered and labeled as “Ly6c2hi”, “Cxcl2hi”, and “Ear2hi” MDMs, and “Mrc1hi” and “C1qhi” KCs (Figure 6A, Supplemental Figure 9, A-E). An increased proportion of MDMs and KCs was observed in Prdm1ΔNcr1 cells (Supplemental Figure 9B). Within MDM clusters, Ly6c2hi MDMs are mainly composed of Prdm1+/+ cells, while Prdm1ΔNcr1 cells are concentrated in the Cxcl2hi cluster (Figure 6C). The scRNA-seq data reveal that following Prdm1 knockout in NKp46+ cells, there is a decrease in the proportion of KCs within the macrophage population, while the proportion of MDMs increases (Figure 6D). CX3CR1, a chemokine receptor, is extensively utilized to distinguish KCs and MDMs within macrophages. Cells expressing CX3CR1 are identified as MDMs, whereas those without CX3CR1 expression are categorized as KCs (56). Employing flow cytometry and leveraging CX3CR1 expression, we assessed the ratios of KCs and MDMs. However, diverging from the scRNA-seq findings, flow cytometry indicates that post-Prdm1 knockout in group 1 ILCs, there is a minor increase in the proportion of KCs within the total liver macrophages, and a decrease in the proportion of MDMs (Figure 6D; Supplemental Figure 9B). This discrepancy could stem from the different bases of classification: scRNA-seq defines KCs based on gene expression profiles, whereas flow cytometry differentiates between KCs and MDMs using the single surface marker, CX3CR1. Analysis of the macrophage subsets identified by scRNA-seq reveals that, while MDM clusters generally show high CX3CR1 expression, there exists a subset within MDMs, labeled Mrc1hi, that also exhibits high levels of CX3CR1 (Supplemental Figure 9C). Consequently, if flow cytometry solely employs CX3CR1 for differentiating between KCs and MDMs, it could result in disparities when compared to scRNA-seq outcomes.
      Both KCs and MDMs were significantly increased in Prdm1ΔNcr1 mice, which was consistent with the scRNA-seq data (Supplemental Figure 9, B and F). Despite the decrease in the proportion of Ly6c2hi MDMs in Prdm1ΔNcr1 mice, the expression levels of Ly6c2 exhibited minimal variation between WT and Prdm1ΔNcr1 mice (Supplemental Figure 9D). Intriguingly, within certain cellular subsets, notably the Ear2hi cluster, the Ly6c2 expression levels in KO mice were found to be higher than those in WT mice. Additionally, we employed flow cytometry to examine Ly6C expression within the macrophages. Similar to the scRNA-seq findings, there were no notable differences in Ly6C expression levels between WT and KO mice (Figure 6E; Supplemental Figure 9G).”.

      The changes in the macrophage compartment indicate a potential influence of functional NK cells on macrophages. We have revised these parts in our results and discussion (line 590-601). However, further analysis of macrophages, although worthwhile, would go beyond the scope of this manuscript and will be a direction of our future work.

      Comment 45: Figure 6, C1qhi Mac only are few cells/events, and interactions (or cells?) seem to be gone in the Prdm1-floxed mice. Is that true? Does it make sense to perform cell-chat analysis on so few cells?

      Response 45: We have now added KCs to the cell-chat analysis, and this cluster belongs to the C1qhi KCs. We have revised the analysis of the corresponding parts in our manuscript (page 20-21; line 408-428):

      “According to the transcriptional profile, liver macrophages were further clustered and labeled as “Ly6c2hi”, “Cxcl2hi”, and “Ear2hi” MDMs, and “Mrc1hi” and “C1qhi” KCs (Figure 6A, Supplemental Figure 9, A-E). An increased proportion of MDMs and KCs was observed in Prdm1ΔNcr1 cells (Supplemental Figure 9B). Within MDM clusters, Ly6c2hi MDMs are mainly composed of Prdm1+/+ cells, while Prdm1ΔNcr1 cells are concentrated in the Cxcl2hi cluster (Figure 6C). The scRNA-seq data reveal that following Prdm1 knockout in NKp46+ cells, there is a decrease in the proportion of KCs within the macrophage population, while the proportion of MDMs increases (Figure 6D). CX3CR1, a chemokine receptor, is extensively utilized to distinguish KCs and MDMs within macrophages. Cells expressing CX3CR1 are identified as MDMs, whereas those without CX3CR1 expression are categorized as KCs (56). Employing flow cytometry and leveraging CX3CR1 expression, we assessed the ratios of KCs and MDMs. However, diverging from the scRNA-seq findings, flow cytometry indicates that post-Prdm1 knockout in group 1 ILCs, there is a minor increase in the proportion of KCs within the total liver macrophages, and a decrease in the proportion of MDMs (Figure 6D; Supplemental Figure 9B). This discrepancy could stem from the different bases of classification: scRNA-seq defines KCs based on gene expression profiles, whereas flow cytometry differentiates between KCs and MDMs using the single surface marker, CX3CR1. Analysis of the macrophage subsets identified by scRNA-seq reveals that, while MDM clusters generally show high CX3CR1 expression, there exists a subset within MDMs, labeled Mrc1hi, that also exhibits high levels of CX3CR1 (Supplemental Figure 9C). Consequently, if flow cytometry solely employs CX3CR1 for differentiating between KCs and MDMs, it could result in disparities when compared to scRNA-seq outcomes.”.

      Comment 46: Figure 6, C - Here the interactions of both Mac+ILC1 and Mac+NK are shown together. It would be interesting to separate this analysis (also Suppl. Fig 9A-B) into comparisons of Mac+ILC1 vs Mac1+NK from WT or Prdm1fl Ncr1 mice.

      Response 46: As requested, we re-analyzed this part in each genotype, as shown in Supplemental Figure 10. These data have now been described in the manuscript (page 22; line 445-447):

      “The reduction of interaction mostly occurred in the cross-talk of ILC1-MDM and ILC1-KC, whereas no difference was observed in cNK-MDM and cNK-KC interaction (Supplemental Figure 10, A-H)”

      Comment 47: Supplementary Figure 9, A, B - Is this analysis using WT and Prdm1fl Ncr1cre dataset together? 

      Response 47: Yes, we used WT and Prdm1ΔNcr1 data together. As requested above, we have separated this analysis by genotype (WT or Prdm1ΔNcr1 mice). These data have now been described in the manuscript (page 22; line 445-460):

      “The reduction of interaction mostly occurred in the cross-talk of ILC1-MDM and ILC1-KC, whereas no difference was observed in cNK-MDM and cNK-KC interaction (Supplemental Figure 10, A-H). A reduction in the interaction of ligand-receptor, such as Mif-CD74, Cxcl16-Cxcr6, and Cxcl10-Cxcr3 was observed in Prdm1ΔNcr1 mice compared to Prdm1+/+ mice (Supplemental Figure 11). Compared to Prdm1+/+ mice, the information flow of CXCL and MIF pathways significantly decreased in Prdm1ΔNcr1 mice (Figure 6, H and I; Supplemental Figure 10, B, D, F, and H). These pathways play a crucial role in facilitating macrophage migration. The CXCL signaling was sent from Ly6c2hi Cxcl2hi MDMs and C1qhi KC, targeting all ILC1 clusters and Cxcr3hi cNK cell clusters (Figure 6J). Of note, although the population of Cxcl2hi macrophage primarily comprised cells from Prdm1ΔNcr1 mice, the interaction within the CXCL pathway between macrophages and group 1 ILCs was obviously less than Prdm1+/+ sample (Figure 6J). These changes could be linked to a decreased population of ILC1s and Cxcr3hi cNK cell cluster in Prdm1ΔNcr1 mice, implying that the homeostasis of Cxcl2hi macrophages required sufficient signals from cNK cells and ILC1s. The impaired CXCL-CXCR interactions might subsequently lead to reduced recruitment and activation of group 1 ILCs and macrophages within the tumor microenvironment.”.

      Comment 48: Figure 7, A-C -What is the consequence/interpretation of reduced Mitotracker staining? Any metabolic assays performed? The definition of NK cell "exhaustion" is unclear, is reduced IFNg enough for that? Is the concept of NK cell exhaustion clearly established? Only shortly touched upon in the discussion, the rationale for suggesting an exhausted phenotype, should be explained.

      Response 48: MitoTracker was used to assess mitochondrial mass. The reduced staining indicated compromised mitochondrial function, which is associated with mitochondrial fragmentation.

      We believe that the exhaustion of NK cells is not as well-established a concept as it is for T cells. The purpose of detecting mitochondria in this study is to provide evidence for the relationship between Prdm1 and the exhaustion of NK cells. In the discussion section, we have added the following content (page 26; line 529-543):

      “Mitochondria are pivotal organelles crucial for cellular metabolism. Disruptions in mitochondrial function have been linked to T Cell exhaustion, attributed to glycolytic reprogramming (66). Similarly, mitochondrial fragmentation has been closely associated with NK cell exhaustion (67).

      However, the concept of NK cell exhaustion isn't as firmly established as it is for T cells. Exhausted NK cells should primarily exhibit diminished functions. This is characterized by a diminished ability to destroy tumor cells, a reduced capability to activate other components of the immune system, and compromised proliferation and survival rates. Additionally, this reduced functionality is associated with a decline in the expression of molecules responsible for cytotoxic activity, lower production of IFN-γ, and metabolic disturbances that may arise from mitochondrial dysfunction. While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to as the Junbhi cluster, demonstrates an exhaustion-like phenotype. The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells (65), requires further investigation.”.

      Comment 49: Figure 7, x-axis labelling (MFI) of histograms is not correct. Do bar graphs and FACS plots show the same data? Does the number in the FACS plots indicate the MFI? If so, the FACS plots do not show representative samples?

      Response 49: We appreciate the valuable comments provided by the reviewer. In the revised Figure 7, the MFI values have been removed. Bar graphs now display summary data from FACS histograms.

      A representative sample close to the group's mean value was chosen for display in the histograms.

      Comment 50: Figure 7, D - How are these data different from Figure 2H? Why is it now called "exhaustion", but not in 2H? Is the detected IFNg only driven by ex vivo stimulation with Il12/Il18? As above, a "standard" 4h assay should also be provided to allow better interpretation of potential differences. In the introduction, the authors cite the Ducimetiere study (Ref 5) highlighting "the primary function of ILC1 in suppressing the seeding of metastatic tumor cells in liver tissue". Thus, it would be interesting to test Ifng production by liver ILC1 and NK cells ex vivo at early time points of tumor inoculation.

      Response 50: Tumors grow and proliferate within tissues, constituting one of the major causes of lymphocyte exhaustion. This part of the current study aims to investigate whether Prdm1 aids NK cells or ILC1 in resisting the exhaustion induced by malignant tumors. Specifically, we seek to ascertain whether the absence of Prdm1 renders NK cells or ILC1 more susceptible to exhaustion within the tumor microenvironment. Therefore, we will consider the capacity to secrete IFN-γ upon IL-12/IL-18 stimulation as one indicative aspect of exhaustion. It's crucial to emphasize that this assessment serves as only one piece of evidence, not the sole determinant. Overnight stimulation is a conventional method for studying NK cells and has been widely used across different laboratories, including our lab (e.g. Bream et al., Blood, 2003; Yu et al., Immunity, 2006; Wang et al., J Clin Invest, 2018). It's essential to clarify that our approach does not involve stimulating with tumor cells to evaluate the secretion capacity of IFN-γ by NK cells or ILC1.

      Reviewer 2 (Public Review):

      Summary:

      This study offers a significant advancement in understanding liver innate lymphoid cell (ILC) biology by elucidating the role of the transcription factor Prdm1. It shows that Prdm1 is crucial in maintaining the balance between conventional natural killer (cNK) cells and ILC1s in the liver, with knockout models revealing a vital role in cancer defense mechanisms. Despite not affecting direct cytotoxicity, Prdm1 deficiency leads to increased cancer metastasis and reduced secretion of key molecules like IFN-γ, pointing to its importance in immune regulation. The use of single-cell RNA sequencing further underscores Prdm1's role in cellular communication within the liver's immune milieu. This study is a robust contribution to the field, providing insights that could inform new immunotherapy approaches for liver cancer.

      Strengths:

      The study's strength lies in its comprehensive approach, combining the specificity of Prdm1 conditional deletion in Ncr1-cre mice with integrative omics analyses and cutting-edge cytometry to delineate Prdm1's role in liver Type 1 ILC biology and its functional implications in tumor immunity. This multifaceted strategy not only clarifies Prdm1's influence on ILC composition and maturation but also conveys potential therapeutic insights for liver cancer immunotherapy.

      We sincerely appreciate your interest and critical assessment of our manuscript. We have carefully read your comments and suggestions, and we are truly grateful for your expert guidance. We have worked on addressing each of your concerns and comments, and below we provide a point-by-point response. Please find the detailed responses below:

      Weakness

      Comment 1: A notable weakness of the study is the limited scope of in vivo disease models, primarily relying on the B16F10 melanoma model, which may not fully capture the complex behavior of Type 1 ILCs across diverse cancer types. Furthermore, the absence of direct human data, such as the effects of PRDM1 deletion in human NK cells or stem cells during their differentiation into NK and ILC1, leaves a gap in translating these findings to clinical settings.

      Response 1: We appreciate the reviewer for raising these important points, which we see as a unique opportunity for future work to transform our understanding of Prdm1 and its targets as opposed to a weakness of the present study. 

      In our revised manuscript, we have discussed these limitations of our study (page 29; line 602-609):

      “While our findings underscore the importance of Prdm1 in tumor immune surveillance by liver cNK cells and ILC1s, they have not been validated in human NK cells, whereas previous studies have found that PRDM1 might inhibit the proliferation and function of human NK cells (33, 73). Furthermore, we have not provided an in-depth evaluation in multiple tumor models. Further research may provide deeper insight into the role of PRDM1 in the anti-tumor function of human NK cells, enabling a more direct investigation of its application in cancer therapies. Given its important role in preserving the functional heterogeneity of liver cNK cells and ILC1s, enhancing Prdm1 function in human NK cells could potentially be a strategy to promote NK cell-based immunotherapy for cancer.”.

      Recommendations For The Authors:

      (Introduction) 

      Comment 2: Reference 1 appears slightly misplaced. You might find the nomenclature discussion in Spits et al., Nature Reviews Immunology, 2013, more appropriate.

      Response 2: We apologize for our inaccurate descriptions. Following Spits et al. (Nature Reviews Immunology, 2013) and other related studies, we have now adopted a more appropriate nomenclature: “conventional NK cells” are referred to as “cNK cells”, “type 1 innate lymphoid cells” as “ILC1s”, and “group 1 ILCs” is used as the collective name for cNK cells and ILC1s.

      The definition of these cells is described in the introduction (page 4; line 52-53 and line 58-62):

      “Group 1 ILCs consist of cNK cells and ILC1s (1, 2), with distinct developmental trajectories and effect molecules (3).”, “In a state of homeostasis, liver group 1 ILCs (CD45+CD3-NK1.1+NKp46+) can be discriminated into cNK cells and ILC1s by the differential expression of CD49a and CD49b (2): cNK cells are marked by the expression of CD49b, while liver ILC1s exhibit a distinctive positivity for CD49a. Tumor Necrosis Factor Related Apoptosis Inducing Ligand (TRAIL) is also expressed on liver ILC1s, but not on cNK cells (10, 11).”. 

      We also describe cNK and ILC1 phenotypes in our scRNA-seq data, as shown in page 13; line 259-261: 

      “cNK cells expressed high levels of Itga2 (CD49b) and Eomes, while ILC1s had high levels expression of Itga1 (CD49a) and Tnfsf10 (Supplemental Figure 5, F and G).”.

      Comment 3: It has come to my attention that Reference 9 has been retracted. I recommend removing this citation to maintain the integrity of your references (https://doi.org/10.1182/blood.2023022801).

      Response 3: We thank the reviewer for the comment and have now removed this citation.

      Comment 4: For a more comprehensive context around reference 15, consider citing Thierry Walzer's work (https://rupress.org/jem/article/211/3/563/41636/T-bet-and-Eomes-instruct-thedevelopment-of-two), which aligns closely with your discussion.

      Response 4: We agree with the reviewer’s suggestion and have added this citation in our introduction (page 4; line 64-66):

      “Liver environment facilitated T-bet expression in the early stage of NK cells development, which results in Eomes repression. The repression of T-bet is required for Eomes+ NK cells (17).”.

      (Results) 

      Comment 5: The NK cell signature referenced in 32 has been questioned for its reliability as discussed by Cursons et al., CRI 2019 (https://pubmed.ncbi.nlm.nih.gov/31088844/). Reanalysis of data in Figure 1 B/C and Supplementary Figure 1 with the refined NK cell signature from Curson's work would be advantageous.

      Response 5: We thank the reviewer for the comment. As requested, we reanalyzed our data using the refined NK cell signature from Cursons et al. (revised Figure 1 A-C; revised Supplemental Figure 1). Of note, the overall survival of liver cancer (LIHC) patients only reached statistical significance when high and low expression of the refined PRDM1-NK signature were compared using a median cutoff (Figure 1, A-C). The overall survival analysis based on the top and bottom quartiles of refined PRDM1-NK signature expression was moved to Supplemental Figure 1, G-I.

      The original text is: “Examination of 363 liver hepatocellular carcinoma (LIHC) patient samples from The Cancer Genome Atlas (TCGA) revealed a positive correlation between the expression of NK cell-associated genes (NCR1, NCR3, KLRB1, CD160, and PRF1) (32) and PRDM1 expression (Figure 1A). Patients with top and bottom quartiles of NK-PRDM1 signature expression were chosen for survival analysis (Figure 1B). Notably, patients with the NK-PRDM1hi signature had better overall survival compared to those with the NK-PRDM1lo signature (Figure 1C). Similar results were also found in skin cutaneous melanoma (SKCM, n=454) and lung adenocarcinoma (LUAD, n=497) patients (Supplemental Figure 1, A-F). These data suggested that PRDM1 in NK cells might be essential for immune surveillance in some solid tumors, including liver cancer. These findings prompted us to investigate the impact and mechanism of PRDM1 in NK cells and ILC1 within the context of liver cancer.”

      We have rewritten this part in our revised manuscript (page 7; line 119-132): 

      “Examination of 363 liver hepatocellular carcinoma (LIHC) patient samples from The Cancer Genome Atlas (TCGA) revealed a positive correlation between the expression of NK cell-associated genes (34) (NCR1, KLRB1, CD160, PRF1, etc.) and PRDM1 expression (Figure 1A). The patients are ordered from highest to lowest based on the expression of NK-Prdm1 for survival analysis (Figure 1B). Notably, patients exhibiting higher levels of NK-PRDM1 expression (above the median) experienced better survival outcomes compared to those with lower levels of NK-PRDM1 expression (below the median) (Figure 1C). Similar results were also found in skin cutaneous melanoma (SKCM, n=454) and lung adenocarcinoma (LUAD, n=497) patients (Supplemental Figure 1, A-F). Patients within the highest quartile of NK-PRDM1 signature expression demonstrated enhanced overall survival, a result that achieved statistical significance in LUAD and SKCM patients (Supplemental Figure 1, G-I). These data suggested that PRDM1 in NK cells might be essential for immune surveillance in solid tumors, including liver cancer, and prompted us to investigate the function and mechanism of PRDM1 in NK cells and ILC1 within the context of liver cancer.”.
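
      The two stratification schemes mentioned in this response (a median cutoff versus top and bottom quartiles) can be sketched as follows. This is an illustrative sketch only: the function names, patient identifiers, and signature scores are hypothetical, and the survival comparison itself (e.g. a log-rank test) is not shown.

```python
from statistics import median, quantiles

def split_by_median(scores):
    """scores: {patient_id: signature score}. Returns (high, low) ID lists."""
    cut = median(scores.values())
    high = [pid for pid, s in scores.items() if s > cut]
    low = [pid for pid, s in scores.items() if s <= cut]
    return high, low

def split_by_quartiles(scores):
    """Keeps only the top and bottom quartiles; the middle half is excluded."""
    q1, _, q3 = quantiles(scores.values(), n=4)
    high = [pid for pid, s in scores.items() if s >= q3]
    low = [pid for pid, s in scores.items() if s <= q1]
    return high, low

# Hypothetical signature scores for eight patients.
scores = {f"p{i}": float(i) for i in range(1, 9)}
median_high, median_low = split_by_median(scores)
quartile_high, quartile_low = split_by_quartiles(scores)
```

      The median split keeps every patient in the comparison, whereas the quartile split contrasts only the extremes and therefore uses roughly half of the cohort, which can change whether a survival difference reaches statistical significance.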

      Comment 6: The origin of the Ncr1-cre mice utilised should be clarified; is this the line developed by Eric Vivier? (https://www.pnas.org/doi/10.1073/pnas.1112064108).

      Response 6: We did not use the line developed by Eric Vivier; our Ncr1-cre mice were purchased from Shanghai Model Organisms Center, Inc. We describe this in the Methods section (page 29-30; line 612-614): 

      “Prdm1fl/fl mice were purchased from The Jackson Laboratory. Ncr1-iCre and B2m-/- mice were purchased from Shanghai Model Organisms Center, Inc. Six- to twelve-week-old littermates were used for the experiment.”

      Comment 7: Considering the known reduction of Ncr1 expression in Ncr1-cre mice and its implications, it is recommended to repeat the B16F10 experiments with the correct control, Ncr1cre/+ Prdm1+/+.

      Response 7: This is an excellent question, and it has been raised by another reviewer and comprehensively answered (Reviewer 1, Comment 1). The answer is below: 

      The expression of Cre and the insertion of loxP sequences both have the potential to influence gene expression, because the region where loxP is inserted may contain regulatory sequences for the gene of interest. Ncr1-Cre is a frequently used transgenic mouse model in our laboratory. In our prior research, we also had concerns about the possible impact of Cre on NKp46 expression, which could lead to a decline in NK cell function. Therefore, in our previous studies focused on Smad4 expression in NK cells, we conducted similar experiments. In Figure 6 of our published paper in the Journal of Clinical Investigation (Wang et al., J Clin Invest, 2018), we compared NKp46-iCreTgfbr2fl/flSmad4fl/WT with NKp46-iCreTgfbr2fl/flSmad4fl/fl. Although the primary purpose was to establish Smad4's independence from TGF-β, this also allows for a comparison between Smad4fl/fl and Smad4fl/WT in the presence of Cre. In the critical phenotype we assessed, NKp46-iCreTgfbr2fl/flSmad4fl/fl (compared with NKp46-iCreTgfbr2fl/flSmad4fl/WT) exhibited the same phenotype as NKp46-iCreSmad4fl/fl (compared with NKp46WTSmad4fl/fl). This suggests that Cre's influence on NK cells may be within a reasonable and controllable range. Furthermore, compared with the decrease in Ncr1 expression caused by Cre, the reduction in the expression of genes targeted by loxP-mediated knockout, such as Prdm1 in this study (Figure 1E), is more significant. Therefore, with the current techniques and research methods, we believe that the data provided in this study can support the role of Prdm1 in NK cells.

      Comment 8: The proportion of ILC1 in wild-type mouse livers is notably higher than standard references. Could you confirm whether liver perfusion was performed before analysis? This procedure was not clearly detailed in the methods section.

      Response 8: We apologize for not providing enough detail on this point in our original Methods. We performed liver perfusion before analysis. This has now been clarified in the Methods section of the revised text (page 30-31; line 630-636): 

      “Mice were perfused with 1× PBS by portal vein puncture before harvesting tissues. Liver and lung was digested with 0.05% collagenase II for 30 minutes and filtered through 70 µm cell strainers, and mononuclear cells were isolated after subjected to density gradient using 30% and 70% percoll. Spleen were also removed and pressed through 70 µm filterers to obtain splenocytes. Peripheral blood mononuclear cells were obtained from peripheral blood after lysis of red blood cells (Biolegend, 420301). Flushing femurs and mechanical disruption of inguinal lymph nodes were performed to obtain cells from bone marrow and lymph nodes.”

      The lymphocyte proportions in mice from different laboratories may exhibit slight variations, possibly due to genetic background disparities. To minimize the influence of genetic backgrounds, paired littermates were used in the current study, wherein one is Prdm1 WT and the other has the Prdm1 gene knocked out in NK cells.

      Comment 9: There appears to be inconsistency in reference formatting; for instance, Ref 39 does not match the formatting of other references. A thorough review of your citation format is suggested.

      Response 9: We apologize for the inadvertent errors and have reviewed the citation format throughout.

      Comment 10: The information in Figures 2B and C may be better suited to the supplementary section as it does not significantly contribute to the main text.

      Response 10: We agree with the reviewer’s suggestion and these are now moved to supplementary figures (Supplemental Figure 2).

      Comment 11: The citation of reference 40 could be strengthened by including Sathe et al., 2014, which directly pertains to your findings (https://www.nature.com/articles/ncomms5539).

      Response 11: We added the suggested reference.

      Comment 12: Can the findings presented in Figure 2D/F be replicated using alternative models? This would substantiate the versatility of your results.

      Response 12: The current predominant in vivo tumor model for NK cells is primarily based on the use of B16F10 melanoma cells. These melanoma cells, with their low expression of MHC-I molecules, evade T cell-mediated immune surveillance, rendering them ideal targets for NK cells. Typically, this experimental melanoma metastasis assay involves tail vein injection, followed by detection of nodules in the lungs. To align with our investigation of liver-resident cNK and ILC1, we introduced splenic injection (via the portal vein) and evaluated melanoma metastasis in the liver to reflect the anti-tumor capabilities of liver group 1 ILCs. We also explored subcutaneous tumor models, but we believe they may not effectively support Prdm1's role in cNK cells, particularly liver-resident NK cells and ILC1. While we have experimented with models using mouse liver tumor cells such as Hepa 1-6, we found them less stable than B16F10 and less conducive to quantification. Should more suitable models or cell lines emerge, we remain open to exploring them in future research.

      Comment 13: The absence of in vitro killing assessments against B16F10 and YAC-1 leaves a gap in the NK cell characterisation which would be valuable to address.

      Response 13: Isolating NK cells for ex vivo cytotoxicity assays typically requires stimulation with high concentrations of IL-2. Under such high IL-2 stimulation, many intracellular differences that contribute to differences in cytotoxicity, such as changes in transcription factors, are often masked. Another issue is that current ex vivo NK cell cytotoxicity assays often isolate NK cells only from the spleen. Liver-resident NK cells, on the other hand, are often limited in quantity and by isolation methods, making it challenging to conduct ex vivo cytotoxicity assays effectively. If more sensitive detection methods become available, we will also incorporate ex vivo data into our future research endeavors.

      Comment 14: The suggestion that NK cells produce IL-6 is indeed a bold one, and without additional validation through intracellular cytokine detection or ELISA, it may be prudent to omit these claims.

      Response 14: We have checked the GSEA results and found no valuable genes related to IL-6 production. Therefore, we have removed this figure.

      Comment 15: The lack of fluorescence minus one (FMO) controls in Figure 3 and Supplementary Figure 4 is noted; including these would enhance the validity of your gating strategies.

      Response 15: As requested, we have added FMO controls to the aforementioned figures.

      Comment 16: There seems to be a minor mix-up in referring to Figure 4A in the scRNAseq results section, perhaps it was intended to refer to Figure 3A?

      Response 16: We have corrected this part (line 247). We also double-checked and corrected the inaccuracies in the references to the figures. We apologize for the inadvertent errors.

      Comment 17: The rich datasets generated from bulk and scRNAseq are commendable. However, I urge you to make these datasets publicly accessible with a GEO accession number.

      Response 17: We appreciate the suggestion from the reviewer. We plan to upload our datasets with the final version of our manuscript, as is also required by eLife policy.

      Comment 18: Figure 4K is insightful, yet a similar analysis of the ILC1 cluster could provide a more rounded understanding.

      Response 18: We thank the reviewer for the comments. We provide a similar analysis of ILC1s, as shown in revised Figure 5H. 

      Comment 19: The metabolic RNA signatures featured in Supplementary Figure 6 are intriguing and warrant further validation, perhaps through Seahorse analysis. Such validation could merit their inclusion in the main figures.

      Response 19: This is a very good suggestion. Currently, our data offer only limited indications in this context. We have chosen to validate some aspects of Prdm1's influence on cytotoxicity molecules. As for Prdm1's impact on other aspects of NK cells, such as metabolic functions, we may explore further in future research. Additionally, we hope that by publishing our research findings, laboratories worldwide can draw insights for their own studies and conduct relevant research based on this data.

      Comment 20: It is difficult to discern whether the cells depicted in Figure 7D are truly tumor-infiltrating ILC1 or NK cells that have adopted ILC1-like characteristics. Intravenous injection of CD45-PE could clarify this distinction, and if they are the latter, it may be more appropriate to refer to them as ILC1-like cells.

      Response 20: We completely agree with the reviewer's suggestion that "tumor-infiltrating lymphocytes" may not be accurate for the current experiment. Therefore, in the revised manuscript, we have changed it to "liver cNK or ILC1 from tumor-bearing livers".

    1. Author response:

      The following is the authors’ response to the previous reviews.

      All of the reviewers indicate that their major concerns have been adequately addressed, but they each have a few comments that the authors should consider before submitting a final version (without further review) for publication. For example, a statement about the sex of the mice used in the studies and whether any differences were noted if both sexes were used. The idea that the loss of glutamate transport might affect NA loading into vesicles is also worth considering. Finally, the authors might want to mention that the role of neuropeptide release from NA neurons needs further examination. 

      As noted in the previously submitted revision, all experiments included both males and females, and this was addressed in our re-submission. In our analysis of breathing and metabolism, sex was included in the analysis and no significant phenotypic difference was observed (the statement of no sex difference is in line 451-456). For the fate map and in situ experiments, although the group size is small, we did not see obvious differences in the expression patterns of the three glutamate transporters between females and males (line 347-350). All the anatomical and phenotypic data in this manuscript are presented as combined graphs (figure 1, figure 1 supplement 1, figure 2, figure 2 supplement 2, figure 4,5,6,7) and we have differentially labeled our data points by sex (female data is pink and male data is blue).

      The possibility that loss of Vglut2 might affect NA release has been added in the discussion (line 485-491) of the current revision. Dopamine Beta Hydroxylase (DBH) converts dopamine to noradrenaline in the vesicles, thus, glutamate may not directly affect noradrenaline loading into vesicles. However, since loss of Vglut2 reduced dopamine release in subsets of dopaminergic neurons, it remains possible that glutamate affects dopamine loading in NA neurons and in turn perturbs DA to NA conversion in the vesicle by DBH and subsequent noradrenaline release. Future work could examine this hypothesis using fast-scan cyclic voltammetry (FSCV) or microdialysis.

      The further examination of the role of neuropeptide release from NA neurons is mentioned in the discussion (line 491-494 and line 497-499 of the pre).

      eLife assessment

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments provide compelling evidence that conditional deletion of vesicular glutamate transporters from noradrenergic neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. This study provides an important contribution to our understanding of how noradrenergic neurons regulate respiratory homeostasis in conscious adult mice. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments show that conditional deletion of Vglut2 in NA neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. Their observations challenge the importance of glutamatergic signaling from Vglut2 expressing NA neurons in normal respiratory homeostasis in conscious adult mice. 

      Strengths:

      The comprehensive Vglut1, Vglut2, and Vglut3 co-expression profiles in the central noradrenergic system and the combined measurements of breathing and oxygen consumption are two major strengths of this study. Observations from these experiments provide previously undescribed insights into (1) expression patterns for subtypes of the vesicular glutamate transporter protein in the noradrenergic system and (2) the dispensable nature of Vglut2-dependent glutamate signaling from noradrenergic neurons to breathing responses to physiologically relevant gas challenges in adult conscious mice. 

      Weaknesses:

      Although the cellular expression profiles for the vesicular glutamate transporters are provided, the study does not document that glutamatergic-based signaling originating from noradrenergic neurons is evident at the cellular level under normal, hypoxic, and/or hypercapnic conditions. The authors effectively recognize this issue and appropriately discuss their findings in this context. 

      We thank the reviewer for the positive evaluation of our work.

      Reviewer #2 (Public Review):

      The authors characterized the recombinase-based cumulative fate maps for vesicular glutamate transporters (Vglut1, Vglut2 and Vglut3) expression and compared those maps to their real-time expression profiles in central NA neurons by RNA in situ hybridization in adult mice. Authors have revealed a new and intriguing expression pattern for Vglut2, along with an entirely uncharted co-expression domain for Vglut3 within central noradrenergic neurons. Interestingly, and in contrast to previous studies, the authors demonstrated that glutamatergic signaling in central noradrenergic neurons does not exert any influence on breathing and metabolic control either under normoxic/normocapnic conditions or after chemoreflex stimulation. Also, they showed for the first time the Vglut3-expressing NA population in C2/A2 nuclei. In addition, they were also able to demonstrate Vglut2 expression in anterior NA populations, such as LC neurons, by using more refined techniques, unlike previous studies. 

      A major strength of the study is the use of a set of techniques to investigate the participation of NA-based glutamatergic signaling in breathing and metabolic control. The authors provided a full characterization of the recombinase-based cumulative fate maps for Vglut transporters. They performed real-time mRNA expression of Vglut transporters in central NA neurons of adult mice. Further, they evaluated the effect of knocking down Vglut2 expression in NA neurons using DBH-Cre; Vglut2 cKO mice on breathing and control in unanesthetized mice. Finally, they injected the AAV virus containing Cre-dependent Td tomato into LC of v-Glut2 Cre mice to verify the VGlut2 expression in LC-NA neurons. A very positive aspect of the article is that the authors combined ventilation with metabolic measurements. This integration holds particular significance, especially when delving into the exploration of respiratory chemosensitivity. Furthermore, the sample size of the experiments is excellent.

      Despite the clear strengths of the paper, some weaknesses exist. It is not clear in the manuscript if the experiments were performed in males and females and if the data were combined. I believe that the study would have benefited from a more comprehensive analysis exploring the sex-specific differences. The reason I think this is particularly relevant is that the developmental disorders mentioned by the authors, such as SIDS and Rett syndrome, which could potentially arise from disruptions in central noradrenergic (NA) function, exhibit varying degrees of sex predominance. Moreover, some of the noradrenergic cell groups are sexually dimorphic. For instance, female Wistar rats exhibit a larger LC size and more LC-NA neurons than male subjects (Pinos et al., 2001; Garcia-Falgueras et al., 2005). More recently, a detailed transcriptional profiling investigation has unveiled the identities of over 3,000 genes in the LC. This revelation has highlighted significant sexual dimorphisms, with more than 100 genes exhibiting differential expression within LC-NA neurons at the transcript level. Furthermore, this investigation has convincingly showcased that these distinct gene expression patterns have the capacity to elicit disparate behavioral responses between sexes (Mulvey et al., 2018).

      Therefore, the authors should compare the fate maps, Vglut transporters in males and females, at least considering LC-NA neurons. Even in the absence of identified sex differences, this information retains significant importance. 

      An important point well raised by the authors is that although suggestive, these experiments do not definitively rule out that NA-Vglut2 based glutamatergic signaling has a role in breathing control. Subsequent experiments will be necessary to validate this hypothesis. 

      An improvement could be made in terms of measuring body temperature. Opting for implanted sensors over rectal probes would circumvent the need to open the chamber, thereby preventing alterations in gas composition during respiratory measurements. Further, what happens to body temperature phenotype in these animals under different gas exposures? These data should be included in the Tables. 

      Is it plausible that another neurotransmitter within NA neurons might be released in higher amounts in DBH-Cre; Vglut2 cKO mice to compensate for the deficiency in glutamate and prevent changes in ventilation? 

      Continuing along the same line of inquiry, is there a possibility that Vglut2 cKO from NA neurons not only eliminates glutamate release but also reduces NA release? A similar mechanism was found for VGLUT2 cKO from DA neurons in previous studies (Alsio et al., 2011; Fortin et al., 2012; Hnasko et al., 2010). Additionally, does glutamate play a role in the vesicular loading of NA? Therefore, could the lack of effect on breathing be explained by the lack of noradrenaline and not glutamate? 

      We thank the reviewer for the positive evaluation and further suggestions. Please see our response in “Author Response” to the previous version of Reviewer #2 (Public review).

      Reviewer #4 (Public Review): 

      Summary:

      Although previous research suggested that noradrenergic glutamatergic signaling could influence respiratory control, the work performed by Chang and colleagues reveals that excitatory (specifically Vglut2) neurons is dynamically and widely expressed throughout the central noradrenergic system, but it is not significantly crucial to change baseline breathing as well the hypercapnia and hypoxia ventilatory responses. The central point that will make a significant change in the field is how NA-glutamate transmission may influence breathing control and the dysfunction of NA neurons in respiratory disorders. 

      Strengths:

      There are several strengths such as the comprehensive analysis of Vglut1, Vglut2, and Vglut3 expression in the central noradrenergic system and the combined measurements of breathing parameters in conscious unrestrained mice. 

      Other considerations:

      These results strongly suggest that glutamate may not be necessary for modulating breathing under normal conditions or even when faced with high levels of carbon dioxide (hypercapnia) or low oxygen levels (hypoxia). This finding is unexpected, considering many studies have underscored glutamate's vital role in respiratory regulation, more so than catecholamines. This leads us to question the significance of catecholamines in controlling respiration. Moreover, if glutamate is not essential for this function, we need to explore its role in other physiological processes such as sympathetic nerve activity (SNA), thermoregulation, and sensory physiology. 

      We thank the reviewer for the positive evaluation and further suggestions. The potential role of noradrenergic-derived glutamate in other processes, which is beyond the scope of this study, should be addressed in the future.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      All of my concerns were effectively resolved, leading me to accept the paper. However, I suggest that the authors consider investing in a more reliable system for measuring body temperature, as accurate measurements of this parameter are crucial for whole body plethysmography. 

      Thank you for the suggestion. The real-time measurement of body temperature is a goal in future studies.

      Reviewer #4 (Recommendations For The Authors):

      Because I am reviewing a revised version, I believe the authors have addressed most, if not all, of the concerns already raised by three reviewers. In my understanding, the authors achieved their aims and the conclusions are fully supported by the results. The impact of this work on the respiratory field is significant and is likely to advance the field. The methods and data utilized, which combine standard techniques with genetic tools, will be highly beneficial to the research community. 

      I still have one concern: if glutamate is not critical, then what is? Could we potentially disable the noradrenergic (NA) system while preserving glutamate functionality to determine if the NA system is indeed crucial for respiratory physiology? This approach might provide clearer insights into the mechanisms underlying respiratory control. 

      We agree that there remain several exciting questions about the respective roles of noradrenaline, glutamate, and other neuropeptides such as Neuropeptide Y (NPY) and galanin. We are currently devising strategies to address the respective and combinatorial roles for all these candidates in breathing control. Most simply, we can conditionally mutagenize each of them in the central noradrenergic system in an acute manner using DBH-CreER mice to determine if any of them are critical to respiratory control, with the advantage of minimizing developmental compensatory events.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors evaluated a novel eIF2B activator, DNL343, in two mouse models representing different forms of the integrated stress response (ISR). They first assessed the pharmacokinetics of DNL343, demonstrating its ability to cross the blood-brain barrier and exhibit good bioavailability. In an acute ISR model induced by optic nerve crush (ONC) injury, DNL343 treatment reduced ISR-induced transcriptional changes and neuronal loss, demonstrating neuroprotective effects. Next, the authors generated an eIF2B loss-of-function mouse model by knocking in disease-causing Eif2b5 variants. The model presents a chronic ISR and mimics vanishing white matter disease (VWMD). DNL343 treatment from the pre-symptomatic stage improved body weight and motor functions, corrected transcriptional changes, and reversed proteomic and metabolomic alterations in the brain and cerebrospinal fluid. DNL343 treatment initiated at an advanced disease stage also showed positive effects, restoring body weight gain, suppressing ISR, reducing neurodegeneration biomarkers, and extending lifespan. These findings highlight DNL343 as an effective ISR inhibitor with potential applications in treating VWMD and other neurodegenerative disorders involving ISR.

      Strengths:

      The study's findings regarding the novel compound DNL343 offer significant promise in addressing VWMD, a condition currently lacking disease-modifying treatment. DNL343 directly targets eIF2B, the disease-causing complex in VWMD, and demonstrates notable efficacy in reversing the integrated stress response (ISR) and mitigating neurodegeneration in a VWMD mouse model. These results raise hope for the potential application of DNL343 in VWMD treatment, a development eagerly anticipated by patients and the VWMD research community. Moreover, the study hints at the broader potential of DNL343 in treating other ISR-related neurodegenerative disorders, such as amyotrophic lateral sclerosis, a prospect that holds broader interest. Additionally, the study's identification of potential biomarkers for VWMD represents a notable strength, potentially leading to improved disease progression assessment pending further confirmation in future research.

      Weaknesses:

      There are a couple of notable concerns in this study. Firstly, while the in vivo evidence strongly supports the efficacy of DNL343 in mitigating ISR and neurodegeneration, there is a lack of direct biochemical evidence to confirm its activity in eIF2B activation. Secondly, the potential for cardiovascular toxicity, which has been reported for a related eIF2B activator in a canine model (as mentioned in the manuscript), has not been evaluated for DNL343 in this study. This data gap regarding toxicity could be crucial for informing the future development of DNL343 for potential human use. Further investigation into these areas would be valuable for a comprehensive understanding of the compound's mechanisms and safety profile.

      We thank the reviewer for the thoughtful feedback and an opportunity to provide further clarification. To address the first question regarding biochemical evidence of the mechanism of action of DNL343, we agree that additional data is helpful to interpreting the results presented in this manuscript. We now include a citation to Craig et al (Craig, R.A., 2nd, J. De Vicente, A.A. Estrada, J.A. Feng, K.W. Lexa, M.J. Canet, W.E. Dowdle, R.I. Erickson, B.N. Flores, P.C.G. Haddick, L.A. Kane, J.W. Lewcock, N.J. Moerke, S.B. Poda, Z. Sweeney, R.H. Takahashi, V. Tong, J. Wang, E. Yulyaningsih, H. Solanoy, K. Scearce-Levie, P.E. Sanchez, L. Tang, M. Xu, R. Zhang and M. Osipov (2024). "Discovery of DNL343: A Potent, Selective, and Brain-Penetrant eIF2B Activator Designed for the Treatment of Neurodegenerative Diseases." J Med Chem.) which includes the full details on the discovery and characterization of DNL343.

      On the question of cardiovascular toxicity observed with previous eIF2B activating compounds, Craig et al also provides evidence in a non-human primate (cynomolgus monkey) model that DNL343 dosing did not result in QT prolongation or any functional cardiac changes. We have also completed Phase 1 (NCT04268784) and Phase 1B double-blind (NCT05006352) trials in healthy participants and participants with ALS, respectively, and these trials are referenced on page 4, lines 102-103. The safety profile observed in these clinical studies supported further development of DNL343 for ALS in the Healey Platform trial (NCT04297683, Regimen G).

      Reviewer #2 (Public Review):

      Summary:

      The authors developed DNL343, a CNS-penetrant small molecule integrated stress response (ISR) inhibitor, to treat neurodegenerative diseases caused by ISR.

      Strengths:

      DNL343 is an investigational CNS-penetrant small molecule integrated stress response (ISR) inhibitor designed to activate the eukaryotic initiation factor 2B (eIF2B) and suppress aberrant ISR activation. The therapeutic efficacy of DNL343 has been extensively characterized in two animal models. Importantly, plasma biomarkers of neuroinflammation and neurodegeneration can be reversed with DNL343 treatment. Remarkably, several of these biomarkers show differential levels in CSF and plasma from patients with vanishing white matter disease (VWMD) upon DNL343 treatment. Overall, this is a very exciting study to target ISR for therapeutic interventions.

      Weaknesses:

      My main questions center around the characterization of DNL343.

      (1) Is there any biochemical evidence showing DNL343 activates eIF2B, such as binding assays or in vitro biochemical activity assays? A conference presentation was cited - "Osipov, M. (2022). Discovery of DNL343: a Potent Selective and Brain-penetrant eIF2B Activator Designed for the Treatment of Neurodegenerative Diseases. Medicinal Chemistry Gordon Research Conference. New London, NH." However, there needs to be public information about this presentation.

      Information from this presentation and more details on the discovery and characterization of DNL343 can be found in Craig et al J Med Chem (2024) and this citation has been replaced.

      (2) How was the selectivity of DNL343 demonstrated? What are the off-targets of DNL343, in particular when DNL343 is administered at a high dose? Thermal proteome profiling or photoaffinity labeling experiments could be considered.

      Please see Craig et al J Med Chem (2024) for full details. In brief, there were no significant off target effects observed for DNL343 in a Cerep panel.

      (3) What are the total drug concentrations in the brain and plasma? What are the unbound ratios?

      Following a single oral dose of DNL343 in mice, unbound brain-to-unbound plasma exposure ratios (Kp,uu) of 0.8 to 1.1 were observed, indicating high CNS penetrance. This was further supported by CSF-to-unbound plasma exposure ratios of 0.9 in the same mouse study. The CNS penetrance was also confirmed in rats and NHP by CSF-to-unbound plasma ratios near unity, as reported in Craig et al J Med Chem (2024).
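
      For readers less familiar with the metric, Kp,uu is conventionally the ratio of unbound drug exposure in brain to unbound drug exposure in plasma. The sketch below illustrates that arithmetic only; the function name and all input values are hypothetical and are not DNL343 data.

```python
def kp_uu(total_brain, fu_brain, total_plasma, fu_plasma):
    """Unbound brain-to-unbound plasma ratio (Kp,uu).

    total_* are total exposures (e.g. concentration or AUC, same units);
    fu_* are the unbound fractions in each matrix (between 0 and 1).
    """
    unbound_brain = total_brain * fu_brain
    unbound_plasma = total_plasma * fu_plasma
    if unbound_plasma <= 0:
        raise ValueError("unbound plasma exposure must be positive")
    return unbound_brain / unbound_plasma


# Hypothetical inputs chosen so the unbound exposures are equal.
example_ratio = kp_uu(total_brain=100.0, fu_brain=0.02,
                      total_plasma=50.0, fu_plasma=0.04)
```

      A value near 1 is consistent with free equilibration of unbound drug across the blood-brain barrier, whereas values well below 1 would suggest restricted entry or active efflux.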

      (4) If DNL343 is given intravenously, what are the concentrations in the brain and plasma after 5 minutes and 1 hour or longer time points? In other words, does DNL343 cross BBB through passive diffusion or an active process?

      Unbound brain-to-unbound plasma exposure ratios following a single oral dose in the mouse were 0.8 to 1.1 and showed no time dependence. These measurements were made prior to, near, and following plasma tmax of DNL343, indicating unbound DNL343 crosses the BBB through passive diffusion and rapidly reached equilibrium between the brain and systemic circulation. Details can be found in Craig et al J Med Chem (2024).

      (5) What is the complete PK profile of DNL343 for intravenous and oral dosing?

      DNL343 administered orally to mice as a suspension formulation showed plasma PK consistent with prolonged absorption with tmax ranging from 3 to 4 h, and a terminal elimination half-life (t1/2) of ~10 h. Details can be found in Craig et al J Med Chem (2024).
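
      To put the reported terminal half-life in context, the sketch below shows the standard first-order relationship between half-life and the fraction of drug remaining. It assumes a simple mono-exponential terminal phase, which is an illustrative simplification and not a description of the actual DNL343 pharmacokinetic model; the function names are hypothetical.

```python
import math


def elimination_rate_constant(half_life_h):
    """First-order elimination rate constant (1/h) from half-life (h)."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """Fraction of drug remaining t_h hours into the terminal phase."""
    return math.exp(-elimination_rate_constant(half_life_h) * t_h)


# With a half-life of about 10 h, roughly one half-life and two half-lives.
after_10h = fraction_remaining(10.0, 10.0)
after_20h = fraction_remaining(20.0, 10.0)
```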

      (6) Are there any major drug metabolites that could be of concern?

      DNL343 metabolism is through Phase 1 biotransformation pathways. None of the in vivo circulating metabolites show potency towards eIF2B activation. Given that none of these metabolites are of concern, we believe this information is beyond the scope of the current manuscript.

      Reviewer #3 (Public Review):

      Summary:

      ISR contributes to the pathogenesis of multiple neurodegenerative diseases, such as ALS, FTD, VWMD, etc. Targeting ISR is a promising avenue for potential therapeutics. However, previously identified ways to target ISR present some challenges. PERK inhibitors suppress ISR by inhibiting eIF2alpha phosphorylation and cause pancreatic toxicity in mice. In order to bypass eIF2alpha, previous studies have identified ISR suppressors that target eIF2B, such as ISRIB and 2BAct. These molecules suppress neurodegeneration but do not cause detrimental effects in mouse models. However, ISRIB is water-insoluble, and 2BAct causes cardiovascular complications in dogs, preventing their use in clinics. Here, the authors showed that DNL343, a new ISR inhibitor targeting eIF2B, suppresses neurodegeneration in mouse models. Combined with their previous results of a clinical phase I trial showing the safety of DNL343, these findings suggest the promise of DNL343 as a potential drug for neurodegenerative diseases in which ISR contributes to pathogenesis.

      Strengths:

      The finding is important and has disease implications, and the conclusion is not surprising.

      Weaknesses:

      The experimental design and data are hard to comprehend for an audience with a basic research background. This reviewer suggests that the authors use the same way that previous studies on ISRIB and 2BAct (e.g., Wong et al; eLife, 2019) designed experiments and interpret data.

      We thank this reviewer for their feedback and recognition that DNL343 has a promising potential as treatment for neurodegenerative diseases. While our studies share some similarities to Wong et al., eLife (2019) and Abbink et al., ACTN (2019), our study design is intentionally distinct (e.g. inclusion of both prevention and treatment dosing paradigms, determining dose-response impact of drug treatment across biomarkers) which necessitates tailored data visualization to effectively communicate our findings. However, we understand the importance of clarity for a broader audience and to this end, we have made a number of changes to the data figures, in particular data from omics experiments in Figures 3 and 5. We also provided additional supplemental tables to aid data interpretation. This would hopefully cater to both audiences familiar with previous work and those with a less specialized background.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Demyelination is a significant pathological feature in the VWMD mouse model. The authors should clarify whether they observed similar demyelination in their study and if DNL343 had any impact on reversing this demyelination. These findings are crucial for assessing the compound's effectiveness in mitigating neurodegeneration.

      Demyelination is indeed an important feature in the eIF2B LOF (VWMD) mouse model. Given that this phenotype, and the ability to rescue the histological phenotype with this MOA (Wong et al; eLife, 2019, cited in introduction), is very well characterized, and given the limits imposed by the size and number of mouse tissues, we prioritized non-histological targeted and unbiased analyses aimed at identifying translatable biomarkers. Nonetheless, the totality of our data, in different mouse models and cell types, strongly supports DNL343 as a potent ISR inhibitor that is effective in attenuating neurodegeneration:

      · In the optic nerve crush model, DNL343 dose-dependently reduced retinal cell degeneration

      · In the VWMD mouse model, DNL343 attenuated the increase in a plasma biomarker of neurodegeneration, neurofilament-light, which corresponded to normalization in motor function.

      · Metabolomic and lipidomic analyses in the VWMD mouse model brain showed increases in oxysterols, such as 7-ketocholesterol, and cholesterol esters and these lipids are associated with demyelination (Nugent et al, 2020). DNL343 treatment attenuated the levels of these oxysterols, indicating decreased demyelination.

      · When initiated at an advanced disease stage, reversal of plasma biomarkers of neurodegeneration (Nf-L) and neuroinflammation (GFAP) by DNL343 in this model was accompanied by extension of the lifespan, which is otherwise shortened as the mutant animals succumb to disease.

      These data highlight the potential therapeutic benefits of DNL343 in the broader context of ISR-mediated neurodegeneration which can include but may not be limited to VWMD.

      (2) Figure 6 presents several biomarkers with significantly increased levels in VWMD mice and patient biofluids. However, these biomarkers are not reflected in the brain proteomics data presented in Figure 3. The discrepancy between these findings should be addressed and discussed in the manuscript to provide a more comprehensive understanding.

      Proteins detected in Figure 6 were not detected by TMT proteomics in the CSF. In the brain, only GFAP was detected, and its overall abundance in tissue was similar in both genetic groups. Cytokines such as TIMP1 and MCP1 are usually present at low abundance and are therefore challenging to detect with the broad discovery proteomics method applied in this study. Antibody-based immunoassays are better suited than mass-spectrometry-based proteomics to measuring specific low-abundance proteins, while mass-spectrometry-based methods offer a wider dynamic range for detecting more highly abundant proteins. Differences in detection sensitivity between immunoassays and mass spectrometry assays have been previously noted (Petrera et al, J Proteome Res, 2021). We have added new text to address this point in the revised manuscript (page 7, line 274-277).

      (3) Figure 7 discusses the effects of DNL343 treatment initiated at an advanced disease stage. Since the 4-week treatment did not rescue performance in the balance beam test (as shown in Figure 6A), it is important to clarify if a 20-week treatment had any impact on this parameter.

      This reviewer raised an important question that we were unfortunately unable to test. When the balance beam training was administered after 8 (out of 20) weeks of dosing, most animals of both wild-type and mutant genotypes struggled to remain or maintain balance on the beam and were unable to progress to traversing it, making the assay unsuccessful in this cohort. This impairment appeared to be driven by distinct factors in the two genotypes: age-associated obesity in wild-type animals and severe motor impairment in the eIF2B HOM mice, irrespective of treatment. While it is possible that other less demanding and more sensitive assays could reveal more nuanced differences, this, and our earlier data (Figure 4G-I), suggest that DNL343 could prevent but not reverse functional deterioration. This is in line with our understanding of the DNL343 mechanism of action, which does not include neuronal regeneration, a therapeutic effect that is likely required for functional recuperation. We have added this point to the manuscript (page 8, line 319-326).

      Additionally, considering the significant increase in Gdf15 levels in the disease model, it would be valuable to know if DNL343 treatment affected Gdf15 levels. If these assays were conducted, reporting the data would greatly assist in evaluating the compound's efficacy when administered at an advanced disease stage.

      We were not able to measure GDF15 levels in the 20-week study because the limited in-life collected plasma samples were dedicated to assessing biomarkers of neurodegeneration (Figure 7E-F). However, data from our 4-week treatment study, which was initiated at a similar age range to the 20-week treatment study (19-26 and 24-33 weeks of age, respectively), showed that DNL343 was able to reduce GDF15 levels in the brain (mRNA and protein) and CSF (protein) (Supplemental Figure 5A-C), suggesting that DNL343 reduces ISR activation at an advanced disease stage in the model. We expect that the reduction observed at 4 weeks of treatment would persist for the duration of the extended treatment in the 20-week cohort.

      (4) A minor point. In Figures 5A, 5C, and 5E, it appears that the red-colored group should likely be labeled as "HOM 0 mg/kg" instead of "HOM 3 mg/kg".

      This has been amended, thank you.

      Reviewer #3 (Recommendations For The Authors):

      Major concerns:

      (1) The cellular function of DNL343 needs to be clarified. The authors claim that it activates eIF2B, but no cellular or molecular evidence is provided. Does it bind to eIF2B? Does it not affect eIF2alpha phosphorylation? Does it restore translation upon stress that causes eIF2alpha phosphorylation? Does it suppress stress granule assembly? The authors cited Sun, Tsai et al. 2023 and Osipov et al., 2022. However, these citations are conference abstracts with no published figures available for review.

      We agree that additional data outlining the biochemical evidence of the mechanism of action of DNL343 was needed. We now include a citation to Craig et al J Med Chem (2024) that includes the full details on the discovery and molecular characterization of DNL343.

      (2) It needs to be clarified how the authors selected the ISR marker genes. ISR genes are more than those selected. How about others? How did the authors measure the mRNA levels, bulk RNA-seq or RT-PCR? If the former, have the authors verified their results using RT-PCR? Have the authors measured the protein levels for nerve crush experiments (by both proteomic and individual protein analyses)? Also, no statistical analyses were found for the heat maps.

      The ISR marker genes were selected by a combination of experimental and literature data. Transcriptomics analysis of the eIF2B HOM brains was conducted using untargeted RNAseq (Supplemental Figure 1B). Here, we found an enrichment of transcripts previously reported to be ISR dependent, namely Atf4, Chac1, Ddit3, Eif4ebp1, Ppp1r15a (Larhammar et al., 2017), Atf3, Asns, Mthfd2, Psat1, Sesn2, Slc1a5, Slc7a5, Slc7a11, Trib3 (Wong et al., 2019, Abbink et al., 2019).  These transcripts were assayed using targeted qPCR in the eIF2B HOM brains, spleen and PBMC (Supplemental Figure 1A, C, D) and in the retinas from the ONC experiments (Figure 2C). We have further clarified the analysis method for the gene expression data in the figure legends.

      We did not interrogate the proteome of the retina in the ONC model. Our study in this model was intended as a proof-of-concept evaluation of DNL343 effects in this acute ISR-dependent model of neurodegeneration. To this end, we performed gene expression (Figure 2C) and immunofluorescence analyses (Figure 2D-F). Each of these analyses were conducted using dedicated whole retinas; conducting additional protein analyses would necessitate a separate cohort of animals.

      We believe that heatmaps provide the best visualization of the data, particularly the dose dependent effects of DNL343 on multiple genes, but we understand the value for also providing statistical analyses. To address this, we provide additional Supplemental tables to show the outcome of statistical analyses undertaken. Statistical data relating to Figure 2C can be found on new Supplemental Tables 1 & 2; those relating to Supplemental Figures 1A, C, and D on new Supplemental Tables 3, 5, 6, respectively; that from Figure 4D on new Supplemental Table 8, and that from Figure 7D on new Supplemental Table 11.

      (3) Both the authors and Wong et al. (eLife, 2019) performed transcriptomic analyses on HOM mice. How do the authors compare the two data sets? Are they the same?

      In this work, a transcriptomic approach was applied to confirm induction of the ISR in our in vivo model. While the data are not identical, all of the top annotated genes shown in Supplemental Figure 1B were also deemed to be significant by Wong and coworkers (Bayes factor > 10). More importantly, as explained in our response to question #2 from Reviewer 3, the ISR genes highlighted in Supplemental Figure 1B were also confirmed in two other studies (Larhammar et al., 2017, Abbink et al., 2019). These data support our interpretation that eIF2B HOM mice have elevated ISR relative to WT mice. We have added new text to line 164 on page 5 to clarify this point.

      (4) Can the authors interpret their omic data using volcano plots for HOM rescue experiments, as Wong et al. did in eLife 2019? Heat maps with statistical analyses are more straightforward to comprehend. Can the authors verify some of these data using RT-PCR, Western blot, etc.?

      We added pathway interpretation to Figures 3 and 5 to highlight key biological processes altered in the brain, and the cellular compartment of origin of the CSF proteins changed in eIF2B HOM mice at baseline and following treatment with DNL343. Our treatment design employed multiple dose levels, so summarization by volcano plot would have required many figures, whereas the same information can be captured more easily in a single heat map. However, to provide additional quantitative information, we have now added supplementary tables showing the full statistical analysis for all heat maps, for added clarity and transparency.

      We demonstrated 100% correlation between the select genes we examined by qPCR in Supplemental Figure 1A and those identified from brain by RNA-seq. In addition, the reliability of RNA-seq data has previously been examined in great detail (Everaert et al, Sci Rep 2017); that study found ~85% concordance between RNA-seq and qPCR data, and the discordant genes tended to have < 2 log2FC and to be present in low abundance. The top core ISR genes identified in our study have > 2 log2FC and have been verified by other independent labs (Larhammar et al., 2017, Abbink et al., 2019, Wong et al., 2019). On this basis, we do not think that there is a rational need for technical confirmation of the RNA-seq data.

      The risk of mis-annotating proteins in the TMT data was further mitigated by removing proteins with coverage < 20% and fewer than 8 unique peptides detected, and by setting the protein annotation FDR to < 1%.
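
      The filtering criteria above can be sketched as a simple predicate. This is an illustrative reading only, not the authors' pipeline: the `ProteinHit` class, its field names, and the example entries are hypothetical, and the per-protein `q_value` check is a simplification of list-level FDR control. Because the text does not say whether a protein is dropped when it fails both the coverage and peptide criteria or either one, the sketch exposes both readings through the `drop_if_both` flag.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    name: str             # hypothetical identifier
    coverage_pct: float   # sequence coverage in percent
    unique_peptides: int  # number of unique peptides detected
    q_value: float        # protein-level FDR estimate


def passes_filters(hit, min_coverage_pct=20.0, min_unique_peptides=8,
                   max_fdr=0.01, drop_if_both=True):
    """Return True if the hit survives the coverage/peptide/FDR filters."""
    low_coverage = hit.coverage_pct < min_coverage_pct
    few_peptides = hit.unique_peptides < min_unique_peptides
    if drop_if_both:
        weak = low_coverage and few_peptides   # literal reading of the text
    else:
        weak = low_coverage or few_peptides    # stricter reading
    return (not weak) and hit.q_value < max_fdr


# Hypothetical hits for illustration only.
hits = [
    ProteinHit("protein_A", 45.0, 12, 0.001),
    ProteinHit("protein_B", 10.0, 3, 0.001),
    ProteinHit("protein_C", 10.0, 12, 0.001),
    ProteinHit("protein_D", 45.0, 12, 0.05),
]
kept_literal = [h.name for h in hits if passes_filters(h)]
kept_strict = [h.name for h in hits if passes_filters(h, drop_if_both=False)]
print(kept_literal, kept_strict)
```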

      Additionally, TMT-labelling-based proteomics offers a wider dynamic range and greater sensitivity than western blotting. Validating TMT logFC data by western blot, which is less quantitative and has a lower dynamic range of detection, may not be very informative. Furthermore, the similar trends in key ISR genes and proteins shown in Figures 4D and 5A (e.g., PSAT, SLC7A11, SLC7A5) provide additional support for the authenticity of the proteins identified in this work.

      Also, for Figures 4E and F, it is assumed that each line represents an individual animal, but why their body weight gains are so different for the wild type? Can the authors plot the mean and s.e.m.? Also, there are no data about neurodegeneration. The authors need to show microscopy images, count the numbers, and assess the morphology of nerve cells.

      The large data spread in body weight gain in our wild-type mice reflects the normal variability of this endpoint, which can be influenced by sex and age. Indeed, both factors are present in our cohorts, as animals of both sexes were included and there was a 7-week age range (10-17 weeks of age at dosing start). Each line in Figures 4E-F indeed represents data sampled from an individual animal over time. We chose to represent the data this way for transparency and have provided additional visualization (new Supplemental Figure 3) showing both body weight gain and plasma Nf-L levels as mean ± SEM, as requested by this reviewer.

      In this study we chose to use a clinically relevant biomarker of neurodegeneration, plasma neurofilament light chain (NfL) (Figure 4F). This allowed us to prioritize the tissue samples from these studies for comprehensive unbiased analyses and a more complete characterization of the phenotype of these eIF2B LoF mice. NfL is a biomarker that has been recognized as a sensitive measurement of neuronal/axonal damage regardless of cause (Gaetani et al., 2018, Khalil et al., 2018). Elevated plasma (and CSF) NfL levels have been demonstrated across neurodegenerative conditions such as Alzheimer’s disease (Giacomucci et al., 2022), multiple sclerosis (Ferreira-Atuesta et al., 2021), and ALS (Huang et al., 2018).

      (5) How ISR is connected to metabolomic changes? Can the authors explain it?

      ISR caused significant increases in the transcript and protein abundances of amino acid transporters and serine/glycine/1-carbon metabolism enzymes, as highlighted in Figure 3A and C and lines 237-255 in the main text. Similar patterns were also observed in prior published studies (Larhammar et al., 2017, Abbink et al., 2019, Wong et al., 2019). Consistent with these changes, we observed increased levels of alanine (transported by SLC3A2, SLC7A11, SLC7A3) and decreased cystathionine levels (associated with increased expression of CTH). ATF4 is one of the main orchestrators of the ISR response to stress (e.g., amino acid deprivation), and it is required for expression of amino acid transporters and of the enzymes required for synthesis of non-essential amino acids (PMID: 28494858). ATF4 increases cellular amino acid uptake and delivers the amino acids needed for the synthesis of proteins and glutathione required for survival.

      We also observed prominent changes in cholesterol esters (CE) in eIF2B HOM mice, and their normalization with DNL343 treatment, as shown in Figure 5C. We checked for changes in the expression levels of CEL, CES1, LCAT, LIPA, SOAT1, and NCEH1, proteins involved in CE metabolism, and failed to detect any changes in protein or RNA abundance. This suggests that rapid demyelination is a more likely trigger for CE accumulation, as reported in FTD-GRN (Marian OC et al., 2023 Acta Neuropathol Commun 11, 52) and in experimental demyelination models (Nugent AA et al., 2020 Neuron). We have added new text to the discussion section of the manuscript (page 9, lines 408-411) to discuss how these results relate to each other.

      (6) It is hard to understand the biomarker part. The authors said "potential translational biomarkers are elevated..." Do the authors mean they are elevated so they can be potential biomarkers? If their levels are unchanged (e.g., TIMP-1), how can they be biomarkers? Also, this part needs a conclusion/summary. Also, what does "reversed biomarkers..." mean?

      We have modified the text to clarify and included a concluding sentence for this section of the results (page 7, lines 297-299). In assessing whether a given protein could be a potential translational biomarker for human disease, we evaluated whether the following two conditions were met: (1) gene expression or protein levels of the biomarker are increased or decreased in the brain or biofluids (CSF or plasma) of Eif2b5 R191H homozygote mice relative to wild-type controls, and are modulated or normalized by administration of DNL343; and (2) protein levels in biofluids from VWMD patients differ from those of healthy controls in the same direction as seen in the mouse model. GDF-15, GFAP, and NfL meet these criteria, but TIMP-1 and MCP-1 do not.

      Minor concerns:

      (1) Please explain which multiple comparison tests the authors used.

      This information has been further clarified in the figure legends.

      (2) Administrating the drug at an advanced stage led to a trend of NfL reduction but did not rescue function. Can the authors discuss what this means?

      Further elaboration and discussion about this finding have been added to the results section on page 8, line 319-325.

      (3) For statistical analyses on the bar graphs, it would be better if the authors labeled the comparison pairs on the graphs.

      We agree that labelling comparisons in bar graphs could aid the readership and have added this modification. Additionally, comparisons are indicated in the figure legend.

      (4) The authors need to state clearly that 2BAct's cardiovascular toxicity was observed in dogs, not mice. The current study does not exclude similar DNL343 toxicity. However, previous clinical trials suggest that DNL343 may be safe for humans.

      The suggestion to specify cardiovascular toxicity in dogs has been added (page 3, line 101), thank you. We now include a citation to Craig et al J Med Chem (2024), which provides evidence in a non-human primate (cynomolgus monkey) model that DNL343 dosing did not result in QT prolongation or any functional cardiac changes. We have also completed Phase 1 (NCT04268784) and double-blind Phase 1B (NCT05006352) trials in healthy participants and participants with ALS, respectively, and now include reference to these trials on page 4, lines 102-104. The safety profile observed in these clinical studies supported further development of DNL343 for ALS in the Healey Platform trial (NCT04297683, Regimen G).

    1. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Last and colleagues describe Ais, an open-source software package for the semi-automated segmentation of cryo-electron tomography (cryo-ET) maps. Specifically, Ais provides a graphical user interface (GUI) for the manual segmentation and annotation of specific features of interest. These manual annotations are then used as input ground-truth data for training a convolutional neural network (CNN) model, which can then be used for automatic segmentation. Ais provides the option of several CNNs so that users can compare their performance on their structures of interest in order to determine the CNN that best suits their needs. Additionally, pre-trained models can be uploaded and shared to an online database.

      Algorithms are also provided to characterize "model interactions" which allows users to define heuristic rules on how the different segmentations interact. For instance, a membrane-adjacent protein can have rules where it must colocalize a certain distance away from a membrane segmentation. Such rules can help reduce false positives; as in the case above, false negatives predicted away from membranes are eliminated.
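
      A heuristic rule of the kind described above can be sketched as a distance filter. This is a minimal illustration, not the Ais implementation: it assumes the protein candidates and the membrane segmentation have both been reduced to lists of 3D coordinates, and the function name, parameter names, and coordinates are hypothetical.

```python
import math


def keep_near_membrane(candidates, membrane_points, max_dist):
    """Keep only candidate coordinates lying within max_dist of at least
    one membrane coordinate (brute-force search, fine for small inputs)."""
    kept = []
    for candidate in candidates:
        if any(math.dist(candidate, m) <= max_dist for m in membrane_points):
            kept.append(candidate)
    return kept


# Hypothetical voxel coordinates for illustration only.
membrane = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
picks = [(1.0, 0.0, 0.0), (50.0, 50.0, 50.0), (12.0, 3.0, 0.0)]
print(keep_near_membrane(picks, membrane, max_dist=5.0))
```

      Candidates far from any membrane coordinate are discarded, which is how such a rule would suppress spurious detections away from membranes.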

      The authors then show how Ais can be used for particle picking and subsequent subtomogram averaging and for the segmentation of cellular tomograms for visual analysis. For subtomogram averaging, they used a previously published dataset and compared the averages of their automated picking with the published manual picking. Analysis of cellular tomogram segmentation was primarily visual.

      Strengths:

      CNN-based segmentation of cryo-ET data is a rapidly developing area of research, as it promises substantially faster results than manual segmentation as well as the possibility of higher accuracy. However, this field is still very much in development, and the overall performance of these approaches, even across different algorithms, still leaves much to be desired. In this context, I think Ais is an interesting package, as it aims to provide both new and experienced users with streamlined approaches for manual annotation, access to a number of CNNs, and methods to refine the outputs of CNN models against each other. I think this can be quite useful for users, particularly as these methods develop.

      Weaknesses:

      Whilst overall I am enthusiastic about this manuscript, I still have a number of comments:

      On page 5, paragraph 1, there is a discussion on human judgement of these results. I think a more detailed discussion is required here, as from looking at the figures, I don't know that I agree with the authors' statement that Pix2pix is better. I acknowledge that this is extremely subjective, which is the problem. I think that a manual segmentation should also be shown in a figure so that the reader has a better way to gauge the performance of the automated segmentation.

      On page 7, the authors mention terms such as "emit" and "absorb" but never properly define them, such that I feel like I'm guessing at their meaning. Precise definitions of these terms should be provided.

      For Figure 3, it's unclear if the parent models shown (particularly the carbon model) are binary or not. The figure looks to be grey values, which would imply that it's the visualization of some prediction score. If so, how is this thresholded? This can also be made clearer in the text.

      Figure 3D was produced in ChimeraX using the hide dust function. I think some discussion on the nature of this "dust" is in order, e.g. how much is there and how large does it need to be to be considered dust? Given that these segmentations can be used for particle picking, this seems like it may be a major contributor to false positives.

      Page 9 contains the following sentence: "After selecting these values, we then launched a batch particle picking process to determine lists of particle coordinates based on the segmented volumes." Given how important this is, I feel like this requires significant description, e.g. how are densities thresholded, how are centers determined, and what if there are overlapping segmentations?

      The FSC shown in Figure S6 for the auto-picked maps is concerning. First, a horizontal line at FSC = 0 should be added. It seems that starting at a frequency of ~0.045, the FSC of the autopicked map increases above zero and stays there. Since this is not present in the FSC of the manually picked averages, this suggests the automatic approach is also finding some sort of consistent features. This needs to be discussed.

      Page 11 contains the statement "the segmented volumes found no immediately apparent false positive predictions of these pores". This is quite subjective and I don't know that I agree with this assessment. Unless the authors decide to quantify this through subtomogram classification, I don't think this statement is appropriate.

      In the methods, the authors note that particle picking is explained in detail in the online documentation. Given that this is a key feature of this software, such an explanation should be in the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study examines the spatial and temporal patterns of occurrence and the interspecific associations within a terrestrial mammalian community along human disturbance gradients. They conclude that human activity leads to a higher incidence of positive associations.

      Strengths:

      The theoretical framework of the study is brilliantly introduced. Solid data and sound methodology. This study is based on an extensive series of camera trap data. Good review of the literature on this topic.

      Weaknesses:

      The authors use the terms associations and interactions interchangeably.

      This is not the case. In fact, we state specifically that "... interspecific associations should not be directly interpreted as a signal of biotic interactions between pairs of species…" However, co-occurrence can be an important predictor of likely interactions, such as competition and predation. We stand by our original text.

      It is not clear what the authors mean by "associations". A brief clarification would be helpful.

      Our specific definition of what is meant here by spatial association can be found in the Methods section. To clarify, the calculation of the index of associations is based on the covariance for the two species of the residuals (epsilon) after consideration of all species-specific response to known environmental covariates. These covariances are modelled to allow them to vary with the level of human disturbance, measured as human presence and human modification. After normalization, the final index of association is a correlation value that varies between -1 (complete disassociation) and +1 (complete positive association).
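
      The normalization step described above can be illustrated with a short sketch that turns a residual covariance matrix into correlation-scale association indices. It assumes a fixed covariance matrix at a single level of human disturbance, whereas the model lets the covariances vary with disturbance; the matrix values and function name are hypothetical, and this is not the authors' model code.

```python
import math


def covariance_to_correlation(cov):
    """Normalize a square covariance matrix so that each off-diagonal
    entry becomes a correlation between -1 and +1."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical residual covariances for two species.
residual_cov = [[4.0, 1.0],
                [1.0, 1.0]]
association = covariance_to_correlation(residual_cov)
print(association[0][1])  # association index for the species pair
```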

      Also, the authors do not delve into the different types of association found in the study. A more ecological perspective explaining why certain species tend to exhibit negative associations and why others show the opposite pattern (and thus, can be used as indicator species) is missing.

      Suggesting the ecological underpinnings of the associations observed here would mainly be speculation at this point, but the associations demonstrated in this analysis do suggest promising areas for the more detailed research suggested.

      Also, the authors do not distinguish between significant (true) non-random associations and random associations. In my opinion, associations are those in which two species co-occur more or less than expected by chance. This is not well addressed in the present version of the manuscript.

      Results were considered to be non-random if correlation coefficients (for spatial association) or overlap (for temporal association) fell outside of 95% Confidence Intervals. This is now stated clearly in the Methods section. In Figure 3—figure supplement 1-3 and Figure 4—figure supplement 1-3, p<0.01 levels are also presented.

      The obtained results support the conclusions of the study.

      Anthropogenic pressures can shape species associations by increasing spatial and temporal co-occurrence, but above a certain threshold, the positive influence of human activity in terms of species associations could be reverted. This study can stimulate further work in this direction.

      Reviewer #2 (Public Review):

      Summary:

      This study analyses camera trapping information on the occurrence of forest mammals along a gradient of human modification of the environment. The key hypotheses are that human disturbance squeezes wildlife into a smaller area or their activity into only part of the day, leading to increased co-occurrence under modification. The method used is joint species distribution modelling (JSDM).

      Strengths:

      The data source seems to be very nice, although since very little information is presented, this is hard to be sure of. Also, the JSDM approach is, in principle, a nice way of simultaneously analysing the data.

      Weaknesses:

      The manuscript suffers from a mismatch of hypotheses and methods at two different levels.

      (1) At the lower level, we first need to understand what the individual species do and "like" (their environmental niche). That information is not presented, and the methods suggest that the representation of each species in the JSDM is likely to be extremely poor.

      The response of each species to the environmental covariates provides a window into their environmental niche, encapsulated in the beta coefficients for each environmental covariate. This information is presented in Figure 2.

      (2) The hypothesis clearly asks for an analysis of the statistical interaction between human disturbance and co-occurrence. Yet, the model is not set up this way, and the authors thus do a lot of indirect exploration, rather than direct hypothesis testing.

      Our JSDM model is set up specifically to examine the effect of human disturbance on co-occurrence, after controlling for shared responses to environmental variables. It directly tests the first hypothesis: if increases in the indices of human disturbance had not tended to increase the spatial correlations between species detected by the model, we would have rejected our stated hypothesis that human modification of habitats results in increased positive spatial associations between species.

      Even when the focus is not the individual species, but rather their association, we need to formulate what the expectation is. The hypotheses point towards presenting the spatial and the temporal niche, and how it changes, species for species, under human disturbance. To this, one can then add the layer of interspecific associations.

      Examining each species one by one and how each one responds to human disturbance would miss the effects of any meaningful interactions between species.  The analysis presented provides a means to highlight associations that would have been overlooked.  Future research could go on to analyze the strongest associations in the community and the strongest effects of human disturbance so as to uncover the underlying interactions that give rise to them and the mechanisms of human impact.  We believe that this will prove to be a much more productive approach than trying to tackle this problem species by species and pair by pair.

      The change in activity and space use can be analysed much simpler, by looking at the activity times and spatial distribution directly. It remains unclear what the contribution of the JSDM is, unless it is able to represent this activity and spatial information, and put it in a testable interaction with human disturbance.

      The topic is actually rather complicated. If biotic interactions change along the disturbance gradient, then observed data are already the outcome of such changed interactions. We thus cannot use the data to infer them! But we can show, for each species, that the habitat preferences change along the disturbance gradient - or not, as the case may be.

      Then, in the next step, one would have to formulate specific hypotheses about which species are likely to change their associations more, and which less (based e.g. on predator-prey or competitive interactions). The data and analyses presented do not answer any of these issues.

      We suggest that the so-called “simpler” approach described above is anything but simple, and this is precisely what the Joint Species Distribution Model improves upon.  As pointed out in the Introduction, simply examining spatial overlap is not enough to detect a signal of meaningful biotic interaction, since overlap could be the result of similar responses to environmental variables.  With the JSDM approach, this would not be considered a positive association and would then not imply the possible existence of meaningful interaction.

      Another more substantial point is that, according to my understanding of the methods, the per-species models are very inappropriate: the predictors are only linear, and there are no statistical interactions (L374). There is no conceivable species in the world whose niche would be described by such an oversimplified model.

      While interaction terms can be included in the JSDM, this would considerably increase the complexity of the models.  In previous work, we have found no strong evidence for the importance of interaction terms and they do not improve the performance of the models.

      We have no idea of even the most basic characteristics of the per-species models: prevalences, coefficient estimates, D2 of the model, and analysis of the temporal and spatial autocorrelation of the residuals, although they form the basis for the association analysis!

      The coefficient estimates for response to environmental variables used in the JSDM are provided in Figure 2 and Figure 2—source data 1.

      Why are times of day and day of the year not included as predictors IN INTERACTION with niche predictors and human disturbance, since they represent the temporal dimension on which niches are hypothesised to change?

      Also, all correlations among species should be shown for the raw data and for the model residuals: how much does that actually change and can thus be explained by the niche models?

      The discussion has little to add to the results. The complexity of the challenge (understanding a community-level response after accounting for species-level responses) is not met, and instead substantial room is given to general statements of how important this line of research is. I failed to see any advance in ecological understanding at the community level.

      We agree that the community-level response to human disturbance is a complex topic, and we believe it is also a very important one. This research and its support of the spatial compression hypothesis, while not providing definitive answers on detailed mechanisms, opens up new lines of inquiry that make it an important advance. For example, the strong effects of human disturbance on certain associations that were detected here could now be examined with the kind of detailed species by species and pair by pair analysis that this reviewer appears to demand.

      Reviewer #1 (Recommendations For The Authors):

      L27 indicates instead of "idicates".

      We thank the reviewer for catching that error.

      L64 I would refer to potential interactions or just associations. It is always hard to provide evidence for the existence of true interactions.

      We have revised to “potential interactions” to qualify this statement.

      L69 Suggestion: distort instead of upset.

      We thank the reviewer for catching that error.

      L70-71 Here, authors use the term associations. Please, be consistent with the terminology throughout the manuscript.

      We thank the reviewer for raising this important point. The term “co-occurrence” appears to be used inconsistently in the literature, so we have tried to refer to it only when referencing the work of others. For us, co-occurrence means “spatial overlap” without qualification as to whether it is caused by interaction or simply by similar responses to environmental factors (see Blanchet et al. 2020, Argument 1). In our view, interactions refer to biotic effects like predation, competition, commensalism, etc., while associations are the statistical footprint of these processes. In keeping with this understanding, in Line 73, we changed "association" to the stronger word "interaction," but in Line 76, we keep the words "spatiotemporal association", which is presumed to be the result of those interactions. In Line 91, we have changed “interactions” to “associations,” as we do not believe interactions were demonstrated in that study.

      L76 "Species associations are not necessarily fixed as positive or negative..." This sentence is misleading. I would say that species associations can vary across time and space, for instance along an environmental gradient.

      We thank the reviewer for pointing out the potential for confusion.  In Line 79, we have changed as suggested.

      L78 "Associations between free-ranging species are especially context-dependent" Loose sentence. Please, explain a bit further.

      We have changed the sentence to be more specific; ”Interactions are known to be context-dependent; for example, gradients in stress are associated with variation in the outcomes of pairwise species interactions.”

      L83-85 This would be a good place to introduce the 'stress gradient' hypothesis, which has also been applied to faunal communities in a few studies. According to this hypothesis, the incidence of positive associations should increase as environmental conditions harden.

      In our review of the literature, we find that the stress gradient hypothesis is somewhat controversial and does not receive strong support in vertebrates.  We have added the phrase “…the controversial stress-gradient hypothesis predicts that positive associations should increase as environmental conditions become more severe…”

      L86-88 Well, overall, the number of studies examining spatiotemporal associations in vertebrates is relatively small. That is, bird associations have not received much more attention than those of mammals. I find this introductory/appealing paragraph a bit rough. I think the authors can do better and find a better justification for their work.

      We thank the reviewer for the comments.  We have rewritten the paragraph extensively to make it clearer and to provide a stronger justification for the study.

      L106 "[...] resulting in increased positive spatial associations between species" I'd say that habitat shrinking would increase the level of species clustering or co-occurrence, but in my opinion, not necessarily the incidence of positive associations. It is not clear to me if the authors use positive associations as a term analogous to co-occurrence.

      We thank the reviewer for raising this very important distinction. Habitat shrinking would increase levels of species co-occurrence, but this is not of particular interest here. We wanted to test whether there were effects on species interactions, as revealed by associations. We find that the terms association and co-occurrence are used somewhat loosely in the literature and so have made a new effort to clarify and systematize this in the manuscript. For example, there appear to be differences in the way “co-occurrence” is used in Boron 2023 and in Blanchet 2020. We do not use the term "positive spatial association" as analogous to "spatial co-occurrence." Spatial co-occurrence, which for us has the meaning of spatial overlap, could simply be the result of similar reactions to environmental covariates, not reflecting any biotic interaction. Joint Species Distribution Models enable the partitioning of spatial overlap and segregation into that which can be explained by responses to known environmental factors, and that which cannot be explained and thus might be the result of biotic interactions. It is only the latter that we are calling spatial association, which can be positive or negative. These associations may be the statistical footprint of biotic interactions.

      Results:

      Difference between random and non-random association patterns. It is not clear to me if the reported associations are significant or not. The authors only report the sign of the association (either positive or negative) but do not clarify if these associations indicate that two species coexist more or less than expected by chance. In my opinion, that is the difference between true ecological associations (e.g., via facilitation or competition effects) and random co-existence patterns. This is paramount and should be addressed in a new version of the manuscript.

      This information is provided in Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3. This is referenced in the text as follows, “… correlation coefficients for 18 species pairs were positive and had a 95 % CI that did not overlap zero, and the number increased to 65 in moderate modifications but dropped to 29 at higher modifications" and so on. This criterion for significance (i.e., greater than expected by chance) is now stated at the end of the Materials and methods. In Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3, those correlations that were significant at p<0.01 are also shown.
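      The interval criterion quoted above can be illustrated with a minimal Python sketch. It assumes that a set of draws of one pairwise correlation coefficient is available (for example, from a fitted model); the function names and the sample values are hypothetical and are not taken from the authors' R scripts.

```python
def interval(samples, level=0.95):
    """Nearest-rank central interval of a list of draws (illustrative helper)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def excludes_zero(samples, level=0.95):
    """True when the central interval lies entirely above or below zero."""
    lower, upper = interval(samples, level)
    return lower > 0 or upper < 0


# Hypothetical draws: one clearly positive correlation, one centred on zero.
positive_pair = [0.2 + 0.0005 * i for i in range(1000)]
uncertain_pair = [-0.5 + 0.001 * i for i in range(1000)]

positive_is_significant = excludes_zero(positive_pair)
uncertain_is_significant = excludes_zero(uncertain_pair)
```

      Under this rule, only pairs whose interval lies entirely on one side of zero would be counted as positive or negative associations.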

      I am also missing a more ecological explanation for the observed findings. For instance, the top-ranked species in terms of negative associations is the red fox, whereas the muntjac seems to be the species whose presence can be used as an indicator for that of other species. What are the mechanisms underlying these patterns? Do red foxes compete for food with other species? Do the species that show positive associations (red goral, muntjac) have traits or a diet that are more different from those of other species? More discussion on these aspects (role of traits and the trophic niche) would be necessary to better understand the obtained results.

      The purpose of this paper was to test the compression hypotheses, and we have tried to keep that as the focus.  However, the analysis does open up interesting lines of inquiry for future research to decipher the details of the interactions between species and the mechanisms by which human disturbance facilitates or disrupts these interactions. The reviewer raises some interesting possibilities, but at this point, any discussion along these lines would be largely speculation and could lengthen the paper without great benefit. 

      Reviewer #2 (Recommendations For The Authors):

      The manuscript should be accompanied by all data and code of analysis.

      All data and RScripts have been made available in Science Data Bank: https://doi.org/10.57760/sciencedb.11804.

      The sentence "not much is known" is weak: it suggests the authors did not bother to quantify what IS known, and simply waved any previous knowledge aside. Surely we have some ideas about who preys on whom, and which species have overlapping resource requirements (e.g., due to jaw width). For those, we would expect a particularly strong signal, if the association is indeed indicative of interactions.

      We believe that the reviewer is referring to the statement in Line 90-92 about the lack of understanding of the resilience of terrestrial mammal associations to human disturbance.  We have added a reference to one very recent publication that addresses the issue (Boron et al., 2023), but otherwise we stand by our statement. We have, however, added a qualifier to make it clear that we did indeed look for previous knowledge; "However, a review of the literature indicates that ...."

      Figures:

      Fig. 1. This reviewer considers that this is too trivial and should be deleted.

      This is a graphical statement of the hypotheses and may be helpful to some readers.

      Fig. 2. Using points with error bars hides any potential information.

      Done as suggested.

      That only 4 predictors are presented is unacceptably oversimplified.

      Only 4 predictors are included because, in previous work, we found that adding additional predictors or interactions did little to improve the model’s performance (Li et al. 2018, 2021 and 2022) and could lead to over-fitting.

      Fig. 5. and 6. aggregate extremely strongly over species; it remains unclear which species contribute to the signal, and I guess most do not.

      The number of detection events presented in Table 1 should help to clarify the relative contribution of each species to the data presented in Figures 5 and 6.

      This reviewer considers that the introduction 'oversells' the paper.

      L55: can you give any such "unique ecological information"

      L60: Lyons et al. (Kathleen is the first name) has been challenged by Telford et al. (2016 Nature) as methodologically flawed.

      The first name has been deleted.  The methodological flaw has to do with interpretation of the fossil record and choice of samples, not with the need to partition shared environmental preferences and interactions.

      L61 contradicts line 64: Blanchet et al. (2022, specifying some arguments from Dormann et al. 2018 GEB) correctly point out that logically one cannot infer the existence or strength from co-occurrence data. It is thus wrong to then claim (citing Boron et al.) that such data "convey key information about interactions". The latter statement is incorrect. A tree and a beetle can have extremely high association and nothing to do with each other. Association does not mean anything in itself. When two species are spatially and temporally non-overlapping, they can exhibit perfect "anti-association", yet, by the authors' own definition, cannot interact.

      We believe that the reviewer’s concerns arise from a misunderstanding of how we use the term association.  In our usage, an association is not the same as co-occurrence or overlap, which may simply be the result of shared responses to environmental variables.  The co-occurring tree and beetle would not be found to have any association in our analysis, only shared environmental sensitivities.  In contrast, associations can be the statistical footprint of interactions, and would be overlaid onto any overlap due to similar responses to the environment.  In the case of negative associations, such as might be the result of competitive exclusion or avoidance of predators, the two species would share environmental responses but show lower than expected spatial overlap.  Even though they might be only rarely found in the same vicinity, they would indeed be interacting when they were together.

      Joint Species Distribution Models "allow the partitioning of the observed correlation into that which can be explained by species responses to environmental factors... and that which remains unexplained after controlling for environmental effects and which may reflect biotic interactions." (Garcia Navas et al. 2021). It is the latter that we are calling “associations.”
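      The distinction between overlap and association can be sketched with a small simulation. This is a simplified linear analogue written in Python, not the JSDM used in the study: it assumes continuous abundances, a single environmental covariate, and two hypothetical species that respond to that covariate in the same way but do not interact. All names and parameter values are illustrative assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def simulate(n_sites=2000, seed=1):
    """Two species sharing an environmental response, with no interaction."""
    rng = random.Random(seed)
    env = [rng.gauss(0, 1) for _ in range(n_sites)]
    sp_a = [1.5 * e + rng.gauss(0, 1) for e in env]
    sp_b = [1.5 * e + rng.gauss(0, 1) for e in env]
    return env, sp_a, sp_b


env, sp_a, sp_b = simulate()
# Raw correlation: high, because both species track the same covariate.
raw_corr = pearson(sp_a, sp_b)
# Residual correlation: what remains after removing the environmental response.
residual_corr = pearson(residuals(sp_a, env), residuals(sp_b, env))
```

      In this toy setting the raw correlation is strongly positive while the residual correlation is close to zero, which mirrors the tree-and-beetle case: shared environmental sensitivity without an association in the sense used here.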

      L63: Gilbert reference: Good to have a reference for this statement.

      This point is important, but the reviewer’s comments below have made it clear that it is even more important to point out that strong interactions should be expected to lead to significant associations.  We have added a statement to clarify this.

      L70-72: Incorrect, interactions play a role, not associations (which are merely statistical).

      In this, we agree, and we have revised the statement to refer to interactions, not associations. In our view, an interaction is a biological phenomenon, while an association is the resulting statistical signal that we can detect.

      L75: Associations tell us nothing, only interactions do. Since these can not be reliably inferred, this statement and this claim are wrong.

      We thank the reviewer for raising this point, but we beg to disagree. Strong interactions should be expected to lead to significant associations that can be detected in the data. Associations, which can be measured reliably, are the evidence of potential interactions, and hence associations can tell us a great deal.  We have added a note to this effect after the Gilbert reference above to clarify this point.

      However, we do accept that associations must be interpreted with caution. As Blanchet et al. 2020 explain, "…the co-occurrence signals (e.g. a significant positive or negative correlation value) estimated from these models could originate from any abiotic factors that impact species differently. Therefore, this correlation cannot be systematically interpreted as a signal of biotic interactions, as it could instead express potential non-measured environmental drivers (or combinations of them) that influence species distribution and co-distribution.” Alternatively, an association could be the result of interaction with a third species.

      L87: Regarding your claim, how would you know you DO understand? For that, you need to formulate an expectation before looking at the data and then show you cannot show what you actually measure. (Jaynes called this the "mind-projection fallacy".)

      We are not sure if the reviewer is criticizing our paper or the entire field of community ecology.  Perhaps it is the statement that “….resilience of interspecific spatiotemporal associations of terrestrial mammals to human activity remains poorly understood….”  Since we are confident that the reviewer believes that mammals do interact, we guess that it is the term “association” that is questioned.  We have revised this to “…the impacts of human activity on interspecific interactions of terrestrial mammals remains poorly understood…” 

      In this particular case, we did formulate an expectation before looking at the data, in the form of the two formal hypotheses that are clearly stated in the Introduction and illustrated in Figure 1. If the hypotheses had not been supported, then we would have accepted that we do not understand. But as the data are consistent with the hypotheses, we submit that we do understand a bit more now.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We would like to thank all reviewers for their detailed and constructive feedback, which substantially helped improve the manuscript. We apologise for the time taken for the revisions, which was partially due to the first author (successfully) writing and defending her PhD thesis in the same time frame. We would like to point out already here that, based on reviewers' feedback, main figure 6 is completely redone and the conclusions of this figure have changed substantially. We no longer suggest RNA chaperoning activity (it was identified as being due to the high concentration of TEV protease, in a control suggested by the reviewers). Instead, our refined assay conditions with lower TEV protease concentration identified ribonuclease activity of membrane-bound full-length 2C, which is consistent with a publication from 2022 (PMID: 35947700).


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Evidence, reproducibility, and clarity

      Summary:

      In this study by Shankar and colleagues, the authors aim to understand the structure and function of the enterovirus 2C protein, a putative viral helicase with AAA+ ATPase activity. Using poliovirus 2C (as a model enterovirus), the authors propose the protein contains two amphipathic helices (AH1 and AH2) at the N-terminus that are divided by a conserved glycine. Using purified MBP-tagged 2C and N-terminal 2C truncations, their data suggest AH1 is primarily responsible for clustering at membranes, whilst AH2 is the main mediator of 2C oligomerisation and membrane binding. Furthermore, 2C was suggested to be able to recruit RNA to membranes, with a preference for dsRNA, and the authors' data imply that the helicase activity of 2C is ATP-independent. Instead, the ATP activity appears to be required for 2C hexamer formation or chaperone activity. The manuscript is generally well written/presented and the authors present very interesting data which raise several questions, some of which require additional experimentation to help support the authors' conclusions. Specific comments are as follows.

      We thank the reviewer for the overall positive assessment, as well as the specific comments below.

      Major Comments:

      1. The authors use four main constructs throughout the paper: full-length 2C, 2C with deletion of AH1 (ΔAH1), 2C with both AH1 and AH2 deleted (ΔMBD) and 2C with an extended N-terminal deletion. From this, the authors draw conclusions on the function of both AH1 and AH2. One of the authors' main conclusions is that AH2 is the main mediator of 2C membrane association (e.g., in line 169). However, is it possible to conclude the relative importance of AH1 vs AH2 without testing a construct containing the deletion of AH2 only (ΔAH2)? This should be generated and used alongside this data to fully define the relative importance of AH1 and AH2 in these assays and remove the possibility that the deletion of AH1 changes the structure and/or function of AH2, which could also result in the observed differences.

      This was a very good suggestion. We expressed and purified the ΔAH2 protein requested by the reviewer and characterized its oligomeric state as well as its membrane binding. It turns out, as suspected, that the ΔAH2 protein behaves very similarly to the ΔMBD protein (i.e. it does not form higher order oligomers and does not bind membranes). The changes in the manuscript due to this addition are many but can primarily be found in main figures 2-3 and their associated supplementary figures.

      Previous structural predictions of 2C do not appear to have two separate AHs at the N-terminus. Are the AH1 and AH2 structures predicted to be formed in the context of the entire 2C protein, 2BC precursors and polyprotein? Are there structural approaches that could provide experimental evidence for two separate AH at the N-terminus?

      This is a good point. Previous predictions were not that detailed, partially since they were done in the pre-AlphaFold era. Unfortunately, we cannot think of a tractable experimental method that could verify the split nature of the amphipathic helix in the only context that would matter: the protein bound to a membrane. A long-term goal would be in situ structures of full-length 2C on membranes using cryo-electron tomography, but our current sample and data sets are not sufficient for this. We added a mention of the long-term need for experimental structures of full-length 2C on lines 315-318 in the discussion.

      Why are the 2C dimers (lines 137-138) not apparent on the mass photometry data presented (figure 2)?

      Different constructs were measured by mass photometry and SEC-MALS. Also, the required concentration is 100-1000x lower for mass photometry, which would shift a dynamic equilibrium even if the same construct were measured by the two methods.
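      The concentration argument can be illustrated with a rough Python sketch. It assumes a simple monomer-dimer equilibrium (2M ⇌ D) with a single dissociation constant; this model, the function name, and the numerical values are illustrative assumptions and are not measurements from the manuscript.

```python
import math


def dimer_fraction(total_conc, kd):
    """Fraction of protein chains found in dimers for 2M <=> D.

    Uses Kd = [M]^2 / [D] and mass balance total = [M] + 2[D],
    with total_conc and kd in the same (arbitrary) concentration units.
    """
    # Solve 2[M]^2 / Kd + [M] - total = 0 for the free monomer concentration.
    monomer = (kd / 4.0) * (-1.0 + math.sqrt(1.0 + 8.0 * total_conc / kd))
    return 1.0 - monomer / total_conc


kd = 1.0  # hypothetical dissociation constant (arbitrary units)
high = dimer_fraction(1.0, kd)    # concentration equal to Kd
low = dimer_fraction(0.001, kd)   # 1000-fold lower concentration
```

      With these assumed values, half of the protein is dimeric when the concentration equals the dissociation constant, whereas well under 1% is dimeric after a 1000-fold dilution. A dimer population visible at the higher concentration could therefore be essentially absent at the lower one.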

      It appeared that binding of ΔMBD-2C was better when POPS is in the membrane (line 174). What is the explanation for this and was this finding significant?

      Well spotted. It may mean that 2C has a second, lower affinity membrane-binding site which is charge-dependent somewhere outside the MBD. We now added a mention of this in the discussion, lines 321-323.

      From the author's data on lipid drop clustering they conclude ΔAH1 is more effective for clustering, however, the ΔAH1 construct produces pentamers not hexamers (from Figure 2). Is formation of hexamers related to or required for membrane clustering?

      ΔAH1 is LESS effective at clustering, not more. As for the mention of pentamers in the original submission: we now think this was an unfortunate choice of words. The mass photometry data for 2C(ΔAH1) could more parsimoniously be interpreted as a mix of hexamers and other (unknown to us) smaller oligomers such as trimers. We have removed all mentions of pentamers.

      The replicon data presented in Figure 7 should include a replication-defective control (e.g., polymerase mutant), in order to compare how defective in replication the ΔAH1 and ΔMBD deletions are compared to a fully-defective construct. Likewise, deletion of AH1 in this construct is likely to affect processing of the viral polyprotein, where several previous studies with picornaviruses have demonstrated that the residues in the P2'-P4' positions can change cleavage efficiency (e.g., PMID: 2542331), or the structure of 2C, leading to the reduction of replication.

      Thanks for these good comments. We made the polymerase-dead (GDD-to-GAA) replicon and remeasured it side by side with the 2C replicons. It has a similar luciferase activity, indicating that no replication takes place in the 2C deletion replicons. This is shown in the new figure 7. As for the possibility of processing defects, we mentioned this in the original discussion and have now cited the reference suggested by the reviewer in this context (line 324).

      How does the authors' model of ATPase-independent helicase activity and an ATP-dependent RNA chaperone activity fit with the two-step model for RNA binding and ATPase activity suggested by Yeager et al (PMID: 36399514)?

      Acting upon comments from other reviewers, we completely redid the "helicase assay" in the revised manuscript. It turns out that the ATP-independent unwinding activity in the original submission was an artefact of the assay conditions (specifically, of the TEV protease at the higher concentration we used in the old assay). In our improved assay we neither see helicase activity nor ATP-independent RNA chaperoning activity.

      Optional major comments that would increase the significance of the work:

      All of the optional comments below are exceptionally interesting. But given the long time needed for the several major changes to this manuscript (e.g. the ΔAH2 protein characterization and reoptimisation of the helicase assay) we believe it is more sensible to address them in future studies, for which the 2C reconstitution system can be used.

      The preference for dsRNA over ssRNA appears to be quite small (Figure 5d). In the context of a viral infection where ssRNA is likely to outnumber dsRNA at different times during infection is this preference physiologically relevant? In relation to this, what size stretch of dsRNA is required for preference, and could this correspond to cis-acting RNA structural elements, dsRNA as it escapes 3D polymerase or as part of the RF and RI forms (PMID: 9343205)? What is the proposed mechanism of how dsRNA outcompetes membrane tethering of 2C? OPTIONAL

      The authors' study has been conducted in the absence of other viral non-structural proteins. What is the physiological importance of the observations, such as membrane interaction/clustering or RNA binding when presented in the context of the other replication machinery? OPTIONAL

      Do 2C monomers, dimers and hexamers have different functions in viral replication perhaps at different stages of replication and which of these forms are relevant during viral infection or can they all be detected during infection? Can any suggested separate functional arrangements be separated by genetic complementation experiments? OPTIONAL

      Minor comments:

      1. The authors appear to interchange between naming/nomenclature of the constructs, which makes it confusing to follow (for example, ΔMBD is the same as 2C(41-329); likewise, 2C(Δ115) is sometimes called 2C(116-329)). It would be much easier to follow if the naming of constructs was consistent throughout (unless I am misunderstanding some subtlety in the difference between such constructs).

      Thanks very much for spotting this. We have fixed it.

      The authors suggest a pentamer arrangement for the ΔAH1 construct, however in the mass photometry data (figure 2D), a hexamer is indicated with the arrow. It would be helpful to change the label to indicate the size of the pentamer where this is being generated, not the hexamer.

      As mentioned above, we think the "pentamer" designation of the original manuscript was unfortunate. It is more parsimonious to interpret this as a mix of states: hexamers and undefined smaller oligomers.

      In most figures, data for full-length 2C, ΔAH1 and ΔMBD is shown. However, data for ΔMBD is missing in Figure 4. Using ΔMBD may demonstrate even lower clustering, hinting that AH2 is also involved in this process.

      Thanks for this comment. In our view, it can be derived from figure 3 (which shows lack of binding to PC/PE membranes) that the ΔMBD construct would not cluster membranes under the conditions of the assay (clustering requires concomitant binding to two membranes). We now describe our rationale for this on lines 220-222. However, we did include the ΔMBD protein in the new negative staining TEM supplementary figure where it and ΔAH2 show no signs of clustering (figure S10).

      I think it would be better to normalise the data in the flotation experiments such that the percentage of 2C in the upper fraction is presented as relative to the amount of lipid in the upper fraction (presented in Figure S4).

      The change suggested by the reviewer would make it impossible to show the important no-liposome control (leftmost bar in Fig. 3C) in the same plot as the other measurements. We believe that would unnecessarily complicate the figure. Thus, we opted to keep the measurements that are normalised by lipid fluorescence in the supplementary figure. Instead, we now added another mention of this supplementary figure in the legend to main figure 3.

      At several places (e.g., lines 232 and 272) the authors refer to "realistic systems". I think the term "physiologically relevant" might be more appropriate.

      Agreed and changed throughout.

      Line 237: I think "y" is a typo and should read "by".

      Thanks. This text was reworked due to the major changes to figure 6.

      Reviewer #1 (Significance (Required)):

      Significance

      I have limited expertise with structural biology but specialise my research on positive-sense RNA virus replication, structure and function. This research is of interest to a broad audience of researchers investigating many positive-sense RNA viruses, which extends beyond the viral family studied here. The work utilises novel techniques to begin to understand the specific roles of 2C in poliovirus replication. The authors' data add important incremental new insight into recent studies on viral helicase proteins as referenced in the study; however, a key limitation is understanding the importance/relevance of their observations during a viral infection.

      We thank the reviewer for this positive and nuanced appraisal of our work.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors present an alternative assay system to investigate picornavirus 2C, a protein that is tricky to analyze biochemically in its full-length form because of an amphipathic helix at the N-terminus. Poliovirus 2C is expressed with an N-terminal MBP tag, a 50 kDa protein that helps with solubility, as is commonly used for 2C investigations. A difference here is that liposomes are included to mimic membranes for 2C attachment. The key findings are that 2C induces clustering of liposomes, that double-stranded RNA binding by 2C impacts this clustering effect, and that a free N-terminus (after cleavage of MBP by TEV protease) is needed for RNA binding and an ATP-independent (i.e. non-helicase) RNA duplex separation activity.

      Major:

      In the flotation assays in figure 3 the authors use a system where MBP-2C is fluorophore-labeled with ATTO488 on exposed cysteines. Poliovirus and other enterovirus 2C proteins have a very well characterized zinc finger domain that has cysteines coordinating a zinc ion. Mutation experiments previously showed that these cysteines are necessary for viral replication and 2C stability. Have the authors controlled for disruption of the zinc finger domain by the labelling of cysteines with ATTO488 and checked if the protein remains folded?

      We completely agree with the reviewer and apologise for the omission in the original submission. We have now included a Zn content measurement, which shows unchanged levels between labelled and unlabelled 2C protein (Figure S7). Also, in the revised manuscript we now explicitly describe our original reasoning for labelling on native cysteines: the presence of two cysteines which are not necessary for viral replication and which are more solvent-exposed (and thus more likely to be labelled) in the crystal structure of the soluble fragment of 2C (lines 176-181).

      In the analysis of the amphipathic helix, did the authors include membranes in their structural predictions or just the free helix? How does inclusion of membranes impact the predictions? In the predictions in Figure 1D, only 2 of 4 show a kink and there doesn't seem to be a correlation between those that predict a kink or not and whether the hydrophobic side is aligned in Figure S1.

      Unfortunately, predicting a protein structure with the interacting membrane is beyond what is currently doable with protein prediction methods (one would have to combine protein structure predictions with molecular dynamics simulations including a membrane). Based on general principles of protein structure, it is likely that there is some flexibility around G17. Thus there may not be a single "kink angle" for any given virus, but we believe that the presence of the kink (and offset hydrophobic surfaces) for a number of viruses lends credibility and robustness to the observation. We added some descriptions of this thinking on lines 126-127.

      Based on previous structures of 2C from different viruses, the region containing the N-terminal amphipathic helix is predicted to localize on one face of the predicted hexameric structure, tethering 2C to the membrane. How does the authors' hypothesized model explain 2C-dependent clustering? Is there evidence that 2C hexamers can oligomerize further into dodecamers, for example, maintaining separate faces to enable N-terminal interaction with different membranes? What is the distance between the liposomes in figure 4 at the points of density attributed to 2C? How does this compare to the size of 2C determined in previous structural studies? Is it consistent with one hexamer/2 hexamers sitting on top of one another?

      These are very interesting questions but we believe it is prudent to limit our speculation at this point. Eventually, we hope that larger data sets of cryo-electron tomography, coupled to subtomogram averaging, may provide a more definitive answer. What we managed to do with our current cryo-electron tomography data set is to estimate the volume of individual protein densities, and from the volume calculate an estimated molecular mass of the individual complexes seen in the tomograms. This correlates very well with 2C hexamers (new figure 4D).
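
      The volume-to-mass conversion mentioned above can be illustrated with a short calculation. This is only a sketch and not necessarily the procedure used for figure 4D: it assumes a typical average protein density of about 1.35 g/cm³, and the monomer mass passed to `nearest_oligomer` is a hypothetical input that would have to be taken from the construct actually used.

```python
# Sketch: estimate molecular mass from the volume of a segmented density.
# Assumption: average protein density of ~1.35 g/cm^3 (a commonly used
# approximation); the real value depends on the protein and the threshold
# used to segment the density.

AVOGADRO = 6.02214076e23          # daltons per gram
PROTEIN_DENSITY_G_PER_CM3 = 1.35  # assumed average protein density
A3_PER_CM3 = 1e24                 # cubic angstroms per cubic centimetre


def mass_kda_from_volume(volume_nm3, density=PROTEIN_DENSITY_G_PER_CM3):
    """Convert a density volume in nm^3 to an estimated mass in kDa."""
    volume_a3 = volume_nm3 * 1000.0              # 1 nm^3 = 1000 A^3
    da_per_a3 = density * AVOGADRO / A3_PER_CM3  # about 0.81 Da per A^3
    return volume_a3 * da_per_a3 / 1000.0        # Da -> kDa


def nearest_oligomer(mass_kda, monomer_kda, max_n=12):
    """Return the subunit count whose total mass is closest to mass_kda."""
    return min(range(1, max_n + 1),
               key=lambda n: abs(n * monomer_kda - mass_kda))


if __name__ == "__main__":
    # Hypothetical example values, not measurements from the manuscript.
    estimated = mass_kda_from_volume(280.0)
    print(round(estimated, 1), nearest_oligomer(estimated, monomer_kda=38.0))
```

      Because the measured volume depends strongly on the chosen density threshold and on the resolution of the tomogram, such an estimate is best read as a consistency check rather than a precise mass determination.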

      In the Discussion lines 278-285 the authors suggest that having MBP attached may reflect the polyprotein condition. Can they make a construct with MBP-2B2C to examine interaction with liposomes and assess 2C function?

      This is a highly relevant question, but the biochemistry of 2BC is even more challenging than 2C, and we are unfortunately nowhere near being able to work with purified 2BC at the moment.

      Discussion lines 293-296: the possibility of two different populations of 2C, binding RNA or membranes, cannot be excluded. There is much more 2C around late in infection than is present in early infection; the model in figure 8 doesn't acknowledge/capture this.

      We have changed the model figure such that more 2C is seen later, and the clustering function is also seen late in infection. The original discussion text referred to (which is unchanged) talks about a "preferential role in RNA replication and particle assembly at later time points" specifically for this reason. We hope the new figure 8 is better at conveying this message.

      Discussion lines 313-317: the authors don't reference a study where a mutant of foot-and-mouth disease virus 2C lacking the N-terminal amphipathic helix, which could bind but not hydrolyze ATP, hexamerized in the presence of RNA. That seems pertinent here (PMID: 20507978).

      Thanks for the suggestion. However, after the extensive changes we made to the revised figure 6 based on excellent reviewer comments (essentially: the RNA chaperoning activity turned out to be an artefact; the improved assay shows no sign of RNA unwinding but instead shows 2C-mediated ribonuclease activity), these sentences of the original discussion lost most of their context and we opted to remove them.

      Some evidence of MBP-2C cleavage by TEV in the different assays used should be presented as this is a major focus of discussion and currently no gels show TEV cleavage is happening.

      Thanks for the suggestion - we agree. We now show these in the new supplementary figures S5 and S12.

      Reviewer #2 (Significance (Required)):

      The work presents an additional methodology to investigate a protein that has previously been difficult to study. The authors acknowledge that there is still a lot of 2C biology that remains to be discovered.

      Thanks, we agree.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript provides insights into the role of the N-terminus in membrane binding and its importance in the various functions of 2C.

      Major issues

      Line 103-119. Is this novel? I thought people had done a lot of bioinformatic analysis of PV 2C (especially Wimmer) who also did mutational work to analyse the importance of various amino acids in the N-terminal helix. I feel like the paper in general, and this section in particular, underplays the large body of work that has been done on the amphipathic helix by various groups.

      We apologise if our original manuscript didn't sufficiently acknowledge previous work in the field. In the first sentence of the mentioned paragraph (now lines 112-113), we did however cite several papers that have previously addressed the amphipathic nature of the N-terminus of 2C. We have now added two more references along the same line, and changed the wording in a way that we hope better conveys that the amphipathic nature per se has been studied before. We would be happy to add more specific references if the reviewer has any suggestions. However, the rest of our analysis IS indeed novel for the following reasons: (i) we show that the amphipathic region is not a simple, single amphipathic helix, but instead has a conserved glycine (helix breaker/destabiliser residue) and two distinct amphipathic stretches before and after this region; (ii) we use AlphaFold2 (not available at the time of the earlier work) to provide the first reliable structural models of the membrane-binding domain. These models consistently, across several enterovirus 2C proteins, reveal that the hydrophobic surfaces of the first and second amphipathic regions, on either side of the conserved glycine 17, are offset from one another. This lends additional credibility to the distinct nature of these regions, which have not previously been identified as such and which we also show in the biochemical assays to be functionally distinct. We have now also added a clarification to the Discussion that the N-terminus of 2C had previously been identified as its membrane-binding domain and we cite references for this. We hope that these changes will sufficiently acknowledge earlier work in the field while clearly pointing out the advance that our paper makes.

      Line 132. Did you validate your column with known MW standards? The peak for full length and deltaAH1 look fairly standard for 2C, in that you have a mixture of species. Not sure you can say it is a hexamer when it is such a broad peak. C doesn't really help you too much since the counts at 400 (pentamer) and 480 (hexamer) are almost the same with quite large error bars. Like most people that have worked with 2C I think the best you can say is that you are making some kind of oligomerized 2C that includes hexamer, pentamer, etc. Why no dimer for MBP-2C and MBP-2C(delta AH1) when compared to the other constructs?

      We did not calibrate the gel filtration column since the outcome would in any case be a cruder estimate of molecular mass than the mass photometry and SEC-MALS measurements. But we do agree with the reviewer on the broad mass photometry peaks. To address this experimentally, we compared the existing MBP-2C spectra to new recordings on apoferritin, a highly stable homomultimeric protein complex of a similar mass to an MBP-2C hexamer. The apoferritin mass estimate is overlaid with the full-length MBP-2C in the new figure 2D and the corresponding supplementary figure S3. This indeed shows that the MBP-2C peak is broader, i.e. consistent with a mix of species which are predominantly but not only hexamers. We describe and discuss this on lines 145-149. As for the mention of pentamers in the original submission: we now think this was an unfortunate choice of words. The mass photometry data for 2C(ΔAH1) could more parsimoniously be interpreted as a mix of hexamers and other (unknown to us) smaller oligomers such as trimers. We have removed all mentions of pentamers.

      Line 143. Does your data show that there are two amphipathic helices? Bioinformatics suggests it but your experiments just show the importance of the two areas in oligomerization, not that it is forming two helices.

      We agree that the choice of words was not ideal and have now changed it to "structure predictions indicate" (line 162).

      Figure S2. Your preps are still relatively dirty, which isn't ideal for biochemical assays. Especially lane 3, where you are looking at 50-60% purity. I don't want you to re-run experiments but I think you need to comment on the purity of the protein you are working with. Also I don't like that you removed the top and bottom of the SDS-PAGE. How much protein never entered the gel? Is there a big fat band at 20 kDa? You need to have the full gel here. Did you measure 260 nm of the preps as well to see if you had bound RNA to the 2C?

      Thanks for the comment, we agree that our original submission lacked detail in the description of the protein purification. This is now addressed with the new figure S2 which shows size exclusion chromatograms of the fluorophore-labelled proteins (same chromatograms as in figure 2) and the corresponding uncropped gels imaged both in the stain-free channel (showing all proteins) and in the fluorescence channel. The A260/A280 ratio measured for all proteins shows that they are free of nucleic acids at the point of imaging. The protein preps are not 100% homogeneous but we do believe that they are more than 50-60% pure.

      Lines 170. Wasn't this done in the recent "An Amphipathic Alpha-Helix Domain from Poliovirus 2C Protein Tubulate Lipid Vesicles"? I don't see it referenced. What is novel about the current work when compared to that paper? Any differences?

      Thanks for pointing this out. The referenced study worked with a synthesized, isolated peptide corresponding to AH2 (i.e. not with full protein). An amphipathic peptide outside the context of its protein cannot be expected to recapitulate the properties of the entire protein, e.g. since it is not spatially constrained in how it interacts with membranes. As one example (relating to the title of that paper), we don't see full-length 2C protein tubulating membranes the way the isolated peptide does. As for the reviewer's question about novelty, the paper mentioned does not identify the split nature of the amphipathic region, does not consider the role of AH1, does not characterise the membrane-binding properties of full-length 2C with respect to liposome membrane composition and size, and does not identify and characterise the membrane clustering properties of 2C or its interactions with nucleic acid when bound to a membrane. However, we do agree that we should have cited the paper in our manuscript. We now cite it in the discussion, lines 320-321.

      I'm surprised by the lack of electron microscopy (negative stain mostly) of both the oligomerized 2C and the various liposomes. I know the Carlson group is a microscopy group so why the lack of validation using electron microscopy of the various DLS experiments? I know you did cryo-ET for one of the constructs but I think negative stain electron microscopy of other constructs would be useful.

      Thanks for the suggestion. As suggested, we have now expanded the analysis with negative staining EM of several more constructs studied by DLS. It can be found in the new supplementary figure S10.

      Figure 4C. What evidence is there that this is 2C apart from you added it to the liposomes? It also comes back to the relative impurity of your protein prep. Could this be E.coli contamination?

      Thanks for this comment. We have now added a new supplementary figure (S5) showing SDS-PAGE gels of the reactions used for flotation and DLS assays - which are identical to the cryo-ET samples. In addition, we estimated the molecular mass of the individual, putative 2C densities in the cryo-electron tomograms by measuring their volume. This analysis, which can be found in the new figure 4D, shows that the estimated mass of individual protein densities is consistent with a hexamer of full-length 2C. In addition, we mention in the discussion the long-term need to determine high-resolution structures of membrane-bound 2C using cryo-ET and subtomogram averaging (lines 315-318).

      Figure 8. Is this model supported by the data in this paper? Your cryo-ET says that 2C is there but that isn't supported by any other data. How is the dsRNA protected from the innate immune system in this model? Is it just sat out in the cytosol? How is the nascent ssRNA packaged into the capsid? Is there competition between the dsRNA and capsid for 2C binding (which your model suggests)? I know it sounds like I am being overly critical of the model but in my opinion there are still too many unanswered questions in the field to come up with a half decent model.

      Thanks for this comment. We are the first to agree that our understanding of the roles of 2C is far from complete! We should have been clearer that the model figure represents some of the roles of 2C identified to date, and does not claim to be complete. However, we do feel that a model figure serves the purpose of putting our findings into context, and also of providing testable hypotheses for future research. As for the question, some of the roles of 2C shown in the model figure (in particular, particle assembly) are instead supported by earlier work of ourselves and others. We have now produced a new model figure and changed the figure legend to better reflect the incompleteness of the current understanding, and the origin of the different parts of the model figure. In addition, we extended the final paragraph of the discussion (which lists still-unknown aspects of 2C) with the reviewer's mention of dsRNA shielding from innate immunity (lines 374-375). The other aspects mentioned by the reviewer as not yet fully understood are already mentioned in that paragraph.

      Minor issues

      Lines 43-45: I feel like you underplay the success of the poliovirus vaccination program. Approximately 30 cases of WPV1 in 2022 and the full eradication of WPV2 and 3. Vaccine-derived polio is still an issue but even that is relatively low compared to where the world was in the 1950s.

      We agree that the previous wording was not ideal. We replaced it and added another recent reference - related to the type 2 vaccine switch (lines 47-49).

      Line 66. I agree there are 11 individual proteins but I feel like this leaves out the fact that some of the uncleaved precursors appear to have some functions, for example 2BC.

      Good point. We have now added a mention of 2BC and the fact that it has distinct functions to the introduction (lines 70-71). 2BC is also mentioned in the legend of the model figure (figure 8).

      Line 56: LD needs to be defined.

      Well spotted, thanks. Since the abbreviation was not used anywhere else we opted to spell it out instead (line 59).

      Line 75. I think you have misrepresented Xia et al here. They clearly say in their study that they show helicase and chaperone activity. I never managed to repeat that work but you should still report what they claim. One major thing is that they used insect-expressed protein, whereas most people (including myself and in the paper under review) use E.coli-expressed protein. Do post-translational modifications play an important role in function?

      You are right that the reference to their paper for this statement was incorrect. We have now made this part of the introduction more explicit (lines 82-83), and in the new discussion we also mention the possibility of e.g. post-translational modifications affecting 2C helicase activity, with reference to Xia et al (lines 359-361).

      Line 103. Need to make it clear here it is poliovirus 2C.

      Thanks, we added it (line 112).

      Line 135. I assume you mean kDa instead of uM?

      It should actually be μM. It is the solution concentration at which the assay was performed. We added some words to clarify this (line 154).

      Figure 3. What do you mean by "Only 2C"? Is that MBP-2C? Maybe I am reading the data wrong but adding TEV does nothing? How do you know TEV is removing the MBP? It looks like MBP-2C binds to the liposomes just the same as cleaved MBP-2C. I see in line 165 you acknowledge this. Could an alternative conclusion for line 168 be that MBP isn't being cleaved off but that AH2 is too small to be exposed in that construct? Did you do that construct without MBP being cleaved? I think you need to confirm that MBP is being cleaved off.

      Thanks for spotting this mistake. It should indeed be MBP-2C (in the absence of liposomes). We corrected figure 3. Also, in response to this comment and similar ones, we have now added a new supplementary figure showing SDS-PAGE gels of the reaction loaded onto flotation assays and DLS (figure S5). It shows that MBP-2C is cleaved.

      Line 184. Is there a reason you use the 2019 paper as a reference instead of the far earlier Bienz et al papers? I'd suggest they are the seminal papers on 2C membrane association. Once again how is this work different from the recent "An Amphipathic Alpha-Helix Domain from Poliovirus 2C Protein Tubulate Lipid Vesicles" paper?

      See our response above regarding the paper mentioned here (which we have now cited). As for why we cite the 2019 paper here: our statement pertains specifically to the contact sites between lipid droplets and replication organelles, not to the membrane binding of 2C per se. We have now added a more general mention of membrane remodelling by non-structural proteins in the introduction, where we cite one of the Bienz papers (lines 75-77).

      Figure 5D. So only 1-3% of RNA is found in the upper fraction? Is that significant enough to say that dsRNA was recruited significantly more than ssRNA? How confident are you in your quantification of the starting amounts of RNA?

      We agree that the fraction is low; however, the fluorescence signal is very clearly above background. We are thus confident in the measurement. The low percentage at the end of the experiment likely has a simple physico-chemical explanation: in a dynamic equilibrium in a density gradient, whatever RNA dissociates during the run will migrate away from the 2C-vesicle fraction and not be able to rebind. We still tried to address this concern by a complementary experiment where we used fluorescence anisotropy to measure binding of RNA to 2C on vesicles. While the measurements showed the same tendency, the curves were not clean enough to be published, which we think is due to the complex system with 2C bound to vesicles and clusters of vesicles. Still, in view of the relatively low percentage of measured recruitment we opted to adjust the paper title and the title of figure 5 (including the subheading related to figure 5) to put less emphasis on the dsRNA recruitment.

      Line 223. Any idea why the MBP needs to be cleaved off? Clearly the MBD is accessible or it would not bind to the liposomes.

      Since we have no data directly supporting this we prefer not to speculate in the paper. But one guess would be that the NTD of 2C, as implicated by previous publications, has a dual role in membrane binding and RNA binding. It may be that it can bind membrane while conjugated to MBP, but needs MBP to be removed in order to simultaneously bind membrane and RNA.

      Line 237: missing "b" in "by"

      Thanks. This paragraph was rewritten in the light of the changes to figure 6.

      Figure 6. I don't fully understand the results here. Earlier you showed that the delta MBD didn't really bind SUV. So presumably it isn't really membrane bound. Why does it have similar activity to full-length MBP in your helicase assay if membrane is important? Did you do SUV and TEV protease only control?

      We are very grateful to this reviewer (and others) for pointing out the need for a TEV control. When performing the control, we found that the TEV protease, at the high concentrations initially used, surprisingly had an artefactual RNA chaperone-like effect on its own. We then proceeded to titrate down the TEV protease concentration to the point where it no longer interfered. At this TEV protease concentration, although 2C was substantially cleaved (see the new supplementary figure S12), we could no longer detect an RNA chaperone activity. Thus, the contents of the new figure 6, and its conclusions, have been substantially changed. We have now focused our attention on the remaining effect that 2C has on RNA: single-strand ribonuclease activity. These experiments were all conducted in the presence of RNase inhibitors, and the presence of Mg2+-dependent ribonuclease activity parallels a recent publication that found this for truncated 2C from hepatitis A and several enteroviruses.

      Line 257: "staring"?

      Thanks, corrected. A staring glycine would indeed be something strange.

      Line 336. Need to change the u to mu.

      Thanks, corrected.

      Any discussion on your observation in Figure 1D that EV71 and CVB3 don't appear to have AH1 and AH2 or do you think that the domains are conserved across the different viruses?

      Thanks for bringing this up. Based on this and a comment from another reviewer, we have now clarified our thinking around this. Since the glycine will introduce some flexibility between AH1 and AH2, we cannot say from the single AlphaFold predictions that this is THE kink angle. The presence of the kink in the predictions of several MBDs lends more credibility to the robustness of the observation, but most importantly the hydrophobic surfaces in AH1 and AH2 are non-aligned for ALL sequences we looked at. This is now described on lines 126-128.

      Table 1 (and possibly elsewhere): an apostrophe is not the prime symbol. 5' compared to 5′.

      Thanks, we corrected this throughout.

      Line 702 "and" should be "an".

      Thanks, corrected.

      I couldn't open one of the movies (140844_0_supp_2820374_a2g272.avi).

      Sorry to hear this, we will check the movie again.

      Reviewer #3 (Significance (Required)):

      Overall I liked the paper and it is worth publishing. One of the issues in the 2C field is the difficulty in making pure 2C and carrying out in vitro assays that correlate with what is observed in the natural infection. I think this paper suffers from similar struggles with a 2C preparation that doesn't appear that pure. I think it also suffers from not having 2C from a wild-type infection. I don't think that it is feasible to get that kind of 2C but by once again using a recombinant protein from E.coli we are left with another manuscript that provides conflicting evidence of the functions of 2C without a definitive answer. The experiments are well done, although they are missing some controls, and the manuscript is laid out in a logical manner and is relatively easy to follow.

      We thank the reviewer for these comments. We believe that we have now provided better information regarding the purification of the recombinant 2C protein, and we do think that the controls present in the original manuscript and the revised manuscript alleviate the concerns about lack of specificity. Of course, isolating 2C vesicles from wild-type infection would be another interesting way of approaching its function, but such an approach would come with its own set of challenges related e.g. to the presence of confounding host factors.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      This is an interesting manuscript that reports the development of an in vitro membrane assay for probing the biochemical functions of the enterovirus 2C protein. The technique is interesting because it can be applied to 2C proteins from other members of the picornavirus family, an important group of mammalian pathogens. It has the capacity to probe different functions (e.g. membrane clustering, ATPase activity, RNA-binding and manipulation activities).

      Overall, the manuscript is well written and gives a clear account of the work undertaken. It adds insight to previous studies of enteroviral (and picornaviral) 2C proteins, providing confirmation of some earlier work in a more physiological context and some new insights, particularly into the membrane and RNA binding aspects of 2C.

      That said, there are a number of places where some amendment of the claims made is required to provide a more precise statement of the findings of this work. These are listed below.

      We thank the reviewer for this positive feedback on our work, as well as for the specific comments below.

      Line 21 (Abstract) - The authors claim to have shown that a conserved glycine divides the N-terminal membrane-binding domain into 2 helices. I would suggest instead what they have produced are computational predictions that this is the case - some way short of an experimental demonstration. Sequence analysis predicts helical secondary structure in the N-terminus and indeed Alphafold2 also predicts a helical structure, but these predictions require experimental verification. The authors should therefore rewrite sections that claim to have shown the presence of 2 helices. In doing so, they should perhaps also comment on the fact that Alphafold2 does not predict 2 helices in this region for all enteroviruses (see Fig 1D). Moreover, the sequence analysis in Fig. S1 shows the presence of two Lys residues in the segment 17-38; it would be interesting for the reader to have these indicated in the figures showing the Alphafold2 prediction - do they in any way interrupt the hydrophobic face of the predicted helix?

      Thanks very much for this comment, which is in line with what other reviewers also wrote. We agree, and changed the abstract sentence. We have also rewritten the manuscript in several places to address the limits of structure predictions and the eventual need for an experimental structure of full-length membrane-bound 2C (lines 126-128 and 315-318).

      Line 82 (Introduction) - The authors write that the membrane binding domain (MBD) of poliovirus has been shown to mediate hexamerisation, citing Adams et al (2009) - reference 43. However, that is not what this paper shows. Rather it provides evidence of aggregation of an MBP-2C fusion protein into forms that ranged from tetramer to octamer, but no evidence that these aggregates assume functional forms (e.g. the presumed hexameric ring structure characteristic of the AAA+ ATPase family to which 2C belongs). As far as I am aware the first demonstration of hexameric ring formation by a picornaviral 2C protein was for the 2C of foot-and-mouth disease virus (see Sweeney et al, JBC, 2010). Although this is not an enterovirus, this finding was later confirmed for Echovirus 30 (ref 51). I should declare an interest here: the Sweeney paper is from my lab. I will leave it to the editor and the authors to determine how to write a more precise account of the early observations of hexamerisation in picornaviral and enteroviral 2C proteins.

      Thanks very much for this insightful comment. As a response to this and other similar comments, we are much more cautious about our wording in the revised manuscript (see also the response to the comment below). In the part of the introduction discussed here (now lines 89-91) we now use the original wording of the Adams paper ("oligomerization"). In the context of that new text we didn't feel that the Sweeney et al paper was a suitable reference, but we now cite it in the later mention of 2C's oligomeric/hexameric state in the first part of the Results (lines 137-138).

      Line 132 - the authors used mass photometry to investigate oligomeric forms of their MBP-2C constructs and state that for the full length 2C protein "the high-mass peak closely corresponds to a hexamer". While it is true that the peak shown in Fig 2C aligns with the expected MW for an MBP-2C hexamer, the peak is very broad, indicative of the presence of other oligomeric states with lower and higher numbers of monomers. This should be commented on. Indeed, the finding seems to echo the early findings of Adams et al (ref 43) with poliovirus MBP-2C.

      Thanks for this comment, which was also made by another reviewer. We cite here what we replied to that reviewer:

      ...we do agree with the reviewer on the broad mass photometry peaks. To address this experimentally, we compared the existing MBP-2C spectra to new recordings on apoferritin, a highly stable homomultimeric protein complex of a similar mass to an MBP-2C hexamer. The apoferritin mass estimate is overlaid with the full-length MBP-2C in the new figure 2D and the corresponding supplementary figure S3. This indeed shows that the MBP-2C peak is broader, i.e. consistent with a mix of species which are predominantly but not only hexamers. We describe and discuss this on lines 145-149.

      Line 143 - for the reasons given above, this summary paragraph represents too strong a statement of what has been observed.

      We agree, and changed the paragraph. It now only refers to "oligomerization" (lines 162-164).

      Line 197 - I note that the authors did not test the membrane clustering capabilities of the 2C(41-329) construct. Although the 2C(deltaAH1) construct had already shown a significant loss of activity, the shorter construct could still have been a useful control. I don't think it is necessary for this experiment to be done, but if the authors have a rationale for not performing the experiment, perhaps they could include it in a revised manuscript.

      Thanks for the suggestion. The rationale is that a protein that doesn't bind a membrane in the first place will also not cluster them (an action that requires binding TWO membranes). We now describe our reasoning on lines 220-222. Nevertheless, we did test these constructs in the new supplementary figure showing negative staining TEM (figure S10).

      Line 223 - typo. I think you mean MBD.

      Thanks! Corrected (now line 257).

      Line 215 - the authors observed that the presence of ssDNA reduced membrane clustering and conclude that "nucleic acid binding partially outcompetes membrane tethering activity". Two things: (1) although I agree it is likely that this effect is due to binding of DNA to 2C, binding has not been demonstrated experimentally so the authors should be more careful in how they describe their result; (2) there is no data presented to show that RNA binding reduces membrane tethering so at best I think the conclusion has to be that the data are consistent with the notion that DNA binding reduces membrane tethering. It would of course be interesting to see the effects of RNA and I'm curious to know why the assay was not performed.

      Thanks for the comment. The honest answer is that previous publications (primarily Yeager et al, NAR 2022) convinced us that the outcome should be near-identical with DNA, so we chose DNA oligos because they are cheaper and easier to work with. But we agree with the reviewer that RNA is of course more relevant. We now present a comparison at 5 μM of ssDNA and ssRNA, which in fact shows a slightly stronger effect on membrane clustering by RNA (figure 5C). In the light of this additional experiment, we feel that some of the text changes suggested by the reviewer may no longer be necessary.

      Line 237 - typo: by, not y

      Thanks. In the light of the extensive changes to figure 6 this text was removed.

      Line 284 - the authors claim that 2C may only bind RNA after the N-terminus is liberated from 2B in infected cells, since cleavage of the MBP tag from their construct was needed for 2C to bind RNA in their in vitro assay. However, this does not automatically follow given the large structural differences between MBP and 2B and the fact that the authors have not tested the RNA binding capacity of a 2BC fusion protein. Their claim here is too strong and should be re-written.

      We agree, and have added a discussion along the lines suggested by the reviewer (lines 330-332).

      Line 293 - The authors speculate that RNA binding might cause a shift between the membrane clustering activities and the role of the protein in RNA replication. However, since they have not shown that RNA binding reduces membrane clustering, this is too speculative.

      In our revised manuscript we have studied the effect of RNA on membrane binding, thus we feel that this text is relevant in the context of the extended experiments.

      Line 299-317 - within this discussion is the assumption that in their assay system enterovirus 2C adopts the ring-like hexameric structure typical of AAA+ ATPases. While I agree this may well be the case, it has not been demonstrated in this study so the authors should make clear they are making this assumption. The same applies to the legend of Fig 8.

      This part of the discussion was extensively rewritten after our changes to figure 6. We now only refer to "hexamer" once in the corresponding part of the discussion, where we talk about structural models of hexamers produced by other groups who have crystallised fragments of 2C. There we believe we should refer to hexamers to accurately cite their work.

      We are not sure what the reviewer is referring to when it comes to the legend for figure 8: the original legend had no reference to the oligomeric state of 2C. We have substantially changed figure 8 and its legend and the new figure and legend make no references to hexamers/oligomers.

      Line 302 - the authors claim to have shown that 2C is 'selective' for dsRNA. I think at best they have shown a preference for binding dsRNA over ssRNA.

      We changed the wording (line 349). We have also changed the title of the paper where we removed "double-stranded".

      Line 313 - The sentence starting "A recent study..." needs a reference.

      The revised discussion no longer contains this sentence.

      Line 332 - the full sequence of the synthetic gene used in this study should be made available (e.g. as supplementary information or a deposited sequence with an accession number). This is a critical point before the paper can be published.

      We will of course submit the sequences as supplementary data. Thanks for the reminder.

      Line 362 - the authors should describe the likely points of attachment of fluorophores and comment on how this labelling might affect 2C function.

      Thanks for the comment. In response to this and a similar comment from another reviewer, we discuss the likely conjugation site of the fluorophore (lines 175-181), and also (due to the proximity to the Zn finger) provide a new measurement showing that equal amounts of Zn can be detected in the labelled and unlabelled protein (figure S7).

      Line 372 - Is a single protein standard (BSA) sufficient to calibrate the SEC-MALS system?

      Yes, it is the recommended procedure (note that SEC-MALS is only dependent on scattering, not elution volumes etc).

      Reviewer #4 (Significance (Required)):

      As stated above this is an interesting study that presents findings from a novel assay. It will be of interest to picornavirologists and the wider community interested in the mechanisms of AAA+ ATPases.

      We thank the reviewer for this positive appraisal of our work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review):

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors):

      Remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"

      We have rephrased this sentence.

      For same reason, remove "Obviously, " 

      Done

      What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      Remove "historically" 

      Done

      Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      Rephrase: "is not always 100% faithful"

      Removed “100%”

      What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that the pH scale is only applicable to the structures presented in this manuscript.

      Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of figure 1.

      Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better-quality map and model for each of the polymorph types.

      Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      Many references are incorrect, containing "Preprint at (20xx)" statements.  

      This has been corrected.

      Reviewer #3 (Public Review):

      Weaknesses: 

      (1) The authors reveal that the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorphs 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. However, even variation between identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common alpha-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization) we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However, the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease-relevant structures. Still, it is rare that a single polymorph appears at pH 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodologically robust approach to address this would be a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, is very important for our understanding of the aggregation pathways of alpha-synuclein and is currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not entirely clear from the G51D cross-seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding, since different conditions were used for the seedless wild-type and mutant aggregations. However, it appears that the wild-type without seeds was in Tris pH 7.5 (although at 37 °C the pH could have dropped to about 7) and the cross-seeded wild-type was in phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed-specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor:

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript provides a detailed analysis of RNA and protein dynamics during transmission of the rodent malaria model P. yoelii from the mouse host to an in vitro ookinete culture setting (mimicking the mosquito midgut environment). This group and others have shown experimentally that a substantial number of mRNAs is stored in the female Plasmodium gametocyte, ready to be translated following initiation of ookinete development. The process is akin to maternal deposition of mRNA in oocytes of metazoans. With this manuscript the authors provide a significant contribution to the field of translational control in Plasmodium parasites as they explore the translational activation during the early hours of zygote-to-ookinete development. The paper presents RNAseq and mass-spec analyses of female gametocytes and for the first time for 6-hour zygotes (ie a fertilized female gamete); the zygote datasets are much improved and more comprehensive than the only other performed in 2008 in P. gallinaceum. Using comparative analyses of transcriptome and proteome data (including published datasets) the authors arrive at a list of 198 transcripts that are translationally repressed in the gametocyte and translated within 6 hours of fertilization in the zygote. Many of these mRNAs are known to be involved in zygote to ookinete transformation. BioID is finally used to explore changes in mRNP protein composition between the female gametocyte and the zygote.

      The paper is generally well written. The authors present a lot of data (also in comparison with published data). Sometimes perhaps the main message could be simplified / streamlined in section titles (Quantitative Proteomics by DIA-MS is not very informative. The outcome of the analysis would be more telling).

      Response: We have revised section headers to clarify the content.

      A considerable proportion of the DIA mass-spec proteomics results section is very technical. The paper describes a biological phenomenon rather than a technical mass-spec advance. Can these technical details be moved to the methods section?

      Response: As this is one of the first published instances of applying DIA-MS to Plasmodium, we want to keep this information in the main text to help our community adopt these approaches. While these details are highly technical, they are also some of the major advances of this project.

      On the other hand, a bit more detail could be provided in the main text. For example, the age of the zygotes is never mentioned. This is important, please add this. The main manuscript text has 16 mentions of the word "many". As the authors are in possession of the data, please provide, if missing, (in parenthesis) the absolute numbers, maybe in an "x out y" format. Please clearly state the number of biological and/or technical replicates used for transcriptome and proteome analyses in the main text, figures and/or figure legends. How many protein coding genes are encoded in the P. yoelii genome?

      Response: Several of these requested details are noted in the materials and methods. We have added this information to the main manuscript now as well. We have also revised the manuscript to replace some instances of “many” with specific numbers unless it adversely impacted the flow of the sentence to do so.

      The authors claim that only zygotes (fertilized females) have surface-exposed Pys25 (a surface protein they use to affinity-purify zygotes) but not gametocytes. I could not find the experimental data for this in the paper. The cited reference #22 also does not appear to show this. In Figure 2C Pys25 is shown to be translated in gametocytes. In this context it may be important to note that in the related P. berghei the related protein P28 is expressed even in the absence of fertilization (Billker 2004; DOI: 10.1016/s0092-8674(04)00449-0). It may not be relevant whether translation requires fertilization, but the authors claim it affects trafficking of the Pys25 protein to the surface, so it needs to be shown. A reference to an infertile P. yoelii line would be great.

      Response: We have corrected the reference supporting the surface exposure of p25 on zygotes. The observation by Billker and colleagues about Pbs28 is also of interest, but outside of the scope of this study as we did not investigate the fertilization event itself here.

      It is highly commendable that all data is provided throughout the manuscript. For readability, may I suggest that the authors add labels to individual sheets within an excel file from A to Z, and do so also within the manuscript. That would really help; the most relevant data sets could then be identified quickly. For example, line 184 refers to 276 zygote proteins in which sheet of which table?

      Response: While this labeling system would also be effective, we have provided a README tab for our files that quickly directs the reader to the relevant tab (as we do for our previous publications).

      Section 176 onwards: here the authors combine P. falciparum and P. yoelii proteomics data. Please explain why you excluded any of the available P. berghei proteome data such as the male and female gametocyte proteome? The same question applies to 294 onwards.

      Response: We compared our datasets with those of Lasonder et al. NAR 2016 because that study was also focused on translational repression of mRNAs and provided both RNA-seq and proteomic datasets of female gametocytes (although not of zygotes).

      The comparative transcriptome-proteome analysis arrives at 198 translationally repressed mRNAs. Could the authors provide one or two alternatives using less stringent parameters? The list in P. falciparum and P. berghei is considerably larger (500+ and 700+).

      Response: We could have reduced the stringency of our thresholds to arrive at a far larger number, but prefer to retain higher confidence in those we are scoring as translationally repressed and then released for translation. We provide all of the pertinent data in the supplemental files if readers would like to adjust these thresholds to see which additional mRNAs may also be regulated.

      The turboID data is informative but somewhat speculative in regard to spatial rearrangements within these mRNPs. Figure 6 presents the RNA helicase to bind the 5' end of mRNAs that are associated with polyribosomes and I assume being translated. Is this association realistic? The RNA helicase DOZI homolog of yeast (Dhh1) is also involved in decapping.

      Response: We provide Figure 6 as our working model of how the reorganization of the DOZI/CITH/ALBA complex could occur based on available data from this study and others. Future studies are warranted to determine if DOZI remains associated with monosomes vs. polysomes, but current data indicate that DOZI can bind to eIF4E when translational repression is not imposed.

      Specific comments:

      title Is global the appropriate word? Some transcripts appear to be translated later.

      Response: We believe it does apply appropriately to these data.

      Line 30/32 Please re-phrase the sentence. There is: Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: We conclude that the sentence is correct as written, even in considering Sebastian et al. Cell Host & Microbe 2012.

      30 Perhaps add ookinete that establishes infection rather than the zygote. For a general readership, a brief description of the sexual life cycle might be useful

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      32 DOZI/CITH/ALBA complex would require some explanation for a more general reader

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      36-37 I believe zygotes were collected 6 hours after fertilization. Does that qualify as soon after fertilization? Motile ookinetes are generated within 20 hours and motility can be seen before that.

      Response: Yes, we think this qualifies as the process is not synchronous, but relies on when male gametes encounter and fuse with female gametes.

      37 Essential functions for what?

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      39 Is the spatial arrangement of this mRNP known?

      Response: Some interactions of members of this complex were known (DOZI with eIF4E, ALBA4 with PABP1), but not the overall spatial arrangement. These findings are novel to this study.

      40 Can you briefly allude to the "recent, paradigm-shifting models of translational control"

      Response: It is not possible to get into these nuances in the Abstract. This information is covered in the main text and the works that are cited.

      44 Products = mRNA

      Response: We have stated it as products because the maternal cell provides more than just mRNAs that are essential to further development post-fertilization.

      45 Oocyte in metazoans ?

      Response: Yes, this is the correct term. The context here is in higher eukaryotes.

      60/62 Please re-phrase the sentence. There is: Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: We conclude that the sentence is correct as written, even in considering Sebastian et al. Cell Host & Microbe 2012.

      81 PbDozi Plasmodium berghei DOZI

      Response: We have added this clarifying text here as suggested.

      84/85 Please rephrase and cite Nucleic Acids Res. 2008 Mar;36(4):1176-86. doi: 10.1093/nar/gkm1142. Epub 2007 Dec 23. and Cell Host Microbe 2012 Jul 19;12(1):9-19. doi: 10.1016/j.chom.2012.05.014.

      Response: As noted above for other comments, we hold that the current phrasing is accurate even when considering these important publications.

      88 Please define the timepoints throughout this manuscript. What age are the zygotes? How many hours post-induction? Please define the time for ookinete development somewhere in the introduction

      Response: The timepoint used for zygote collection is now included in the main text in addition to its previous inclusion in the Materials and Methods section. As we have not studied the ookinete stage here, we have opted to keep the introduction focused on the key details for this study.

      104 Please add the age (in hours) of these zygotes from the time of starting the in vitro cultures. From the methods section it looks like 6 hours.

      Response: The timepoint used for zygote collection is now included in the main text in addition to its previous inclusion in the Materials and Methods section.

      103/105 I can find no evidence for P25 (Pys25) expression relying on fertilization in the cited paper (22). The SOM has no reference to Pys25 either. Please show data or reference published data that there is no translation and trafficking of Pys25 in unfertilized female gametes, ie those that are placed in ookinete medium. In this respect it may be important to note that unfertilized Plasmodium berghei females placed in ookinete medium translate P28, the P25 paralog (https://www.sciencedirect.com/science/article/pii/S0092867404004490?via%3Dihub)

      Response: We have corrected the reference supporting the surface exposure of p25 on zygotes. The observation by Billker and colleagues about Pbs28 is also of interest, but outside of the scope of this study as we did not investigate the fertilization event itself here.

      104 What cell line was used for the zygotes?

      Response: The PyApiAP2-O::GFP transgenic parasite line was used here. These details are included in the manuscript and supporting information.

      114 The number of transcripts detected in gametocytes is quite small compared to the twice as large proteomics dataset. See for example also Lasonder 2016 for P. falciparum detected transcripts: 4477 different sense transcripts were identified, 98% of which were shared between MG and FG.

      Response: Yes, the number of mRNAs or proteins scored as detected differs based on thresholds applied. We prefer to err on the side of higher stringency as noted above.

      117 Does the 194 up-in-gametocytes dataset include the 81 not found in zygotes?

      Response: No, these 194 are detected in both datasets, but are more abundant in gametocytes than zygotes.

      117 Could you indicate some of the genes in the plot?

      Response: Several hits of special note are described in the text. We have opted to keep the figure clear and streamlined.

      Fig1 How were the upregulated transcripts identified? 1647 are shown to be specific to zygotes in 1B, yet only 685 are shown in 1C to be upregulated. Do the transcripts found exclusively in zygotes not count? Are these transcripts likely the result of de novo transcription? How old are these zygotes when the libraries are made?

      Response: The details of the RNA-seq processing are provided in the MakeFile, the supplementary tables, and the manuscript. The README tab provides descriptions of what processing occurred between sequential tabs. As noted above, zygotes were collected at 6 hours.

      132 Many? How many? Please provide a precise number.

      Response: These details are now in the revised manuscript.

      134 Please explain why p28 would be differentially abundant in the zygote rather than the female gametocyte. That would require de novo transcription of this gene. If there is experimental evidence for the de novo transcription of p28 and other translationally repressed transcripts in the zygote please cite the references. Can you name a few more examples here? P25 for example, ap2-o, or anything published and experimentally validated. What about AP2-o and AP2-Z? Both are known to be translationally repressed.

      Response: We state in the original manuscript that there is not a significantly different mRNA abundance of pys28.

      139 Please define how many members of the IMC?

      Response: We have now replaced “many” with the number of IMC members we have detected, which is also shown in supporting tables.

      156 Can you provide a number of how many parasites were used in total or per run. And how many biological and technical replicates were analysed?

      Response: These details are provided in the Materials and Methods.

      169 The number of proteins detected in the gametocyte sample is twice the size of transcripts. Is this to be expected?

      Response: This reflects the sensitivity of the assays run for transcriptomics and proteomics.

      170 How many samples were analyzed? One gametocyte and one zygote sample?

      Response: Yes, for the creation of the DIA-MS spectral library, a single biological replicate was used in addition to in silico library approaches. This information is provided in the next sentence.

      176 Why did you not include P. berghei in the meta-analysis?

      Response: We compared these results to all of the published Plasmodium proteomes in PlasmoDB.

      184 Please refer to an excel table here.

      Response: We have pointed to the relevant supporting files in this section.

      184 145 proteins: do you mean orthologs in general or orthologs with a gene/protein annotation other than unknown function?

      Response: We use the standard form of ortholog throughout the manuscript.

      190 142 proteins: do they all have orthologs in P. falciparum?

      Response: No, not all proteins in our dataset have unambiguous orthologues in P. falciparum, and this is accounted for in our data processing approaches.

      Figure 2C P25 is not exclusive to zygotes here and also found in the gametocyte sample.

      Response: That is correct. It is known that p25 is expressed in female gametocytes, but that the localization changes in the zygote.

      190 shortlist

      Response: The spelling of “short list” as two words is an appropriate American spelling of this term.

      219 onwards Does the list of 198 transcripts exclusively arise from your RNAseq and proteomics comparison? Or does it include falciparum data as outlined in section 176 onwards, i.e. the list of 276 proteins that are only detected in zygotes?

      Response: Yes, this list of 198 mRNAs is derived from our datasets only using our defined thresholds. The details of this are provided in the manuscript.

      224 Early zygote? At 6 hours do the parasites not start to transform, elongate?

      Response: This process is not synchronous, as it is affected by the timing of gamete fusion.

      225 >5-fold. Is this an arbitrary decision?

      Response: This threshold has been used by our group and others in prior studies, and was partially informed by the behavior of previously characterized transcripts.

      227 1417 mRNAs: they are from which dataset?

      Response: These are from our datasets with P. yoelii, as described in the manuscript.

      228/229 Please explain why DOZI and CITH are in the list of 198 repressed transcripts? They are present in the gametocyte. Are they upregulated>5 fold?

      Response: Yes, they meet our criteria for this regulation, and in the manuscript we note that we believe that they are self-regulated and likely have continuing roles in early mosquito stage development.

      259 ... as they are already translated in the gametocyte?

      Response: Yes. Translational repression allows for the existence of some of the protein in the initial timepoint. This differs from translational silencing which does not.

      295 Is this from the 198 TR list S4?

      Response: No. Transcripts that remain repressed would not be in the list of 198, as the protein was not detected in zygotes.

      294 onwards How many putatively falciparum transcripts are there? How many were identified in P. berghei? How many are common to all? A Venn diagram perhaps to compare the different studies

      Response: There is substantial overlap between the species with respect to the presence of syntenic orthologues in this dataset. However, because we did not conduct experiments with P. falciparum or P. berghei here, we do not want to make claims that they are similarly regulated or potentially have a reader misinterpret a figure to that effect.

      301 How many transcripts were found associated with Plasmodium berghei DOZI and/or CITH in female gametocytes? How many of those were abundantly detected as protein in zygotes, or had no difference in protein abundance between gametocytes and zygotes, or even greater abundance in female gametocytes?

      Response: These details are now provided in the revised manuscript.

      303/305 Please indicate the numbers of translationally repressed transcripts identified for P. falciparum and berghei.

      Response: These data are provided in Supporting Information Table 4.

      317/319 Please add the promoter used for tid-GFP

      Response: We have now added this information to the Materials and Methods.

      320 Please elaborate on the spatial organization of the DCA complex.

      Response: This has not been previously characterized, and this entire section is dedicated to the experimental data and interpretations of how the DOZI/CITH/ALBA complex may be organized.

      321/322 Have precise binding sites of DOZI and ALBA4 really been shown experimentally in the cited papers? In relation to 5' and 3' ends of the mRNA? Please cite Braks et al. paper.

      Response: Yes. The association of DOZI with eIF4E and ALBA4 with PABP1 are established in the literature, in some cases by multiple independent laboratories. The Braks publication does not address the binding of these proteins, and thus is not cited.

      323 What is the first generation BioID enzyme? BirA*

      Response: Yes. The first generation enzyme is called BirA*

      323 Please cite relevant Kyle Roux and Alice Ting for the original enzymes

      Response: We have now added these citations to this sentence.

      327 Could you show images of ALBA4::TurboID::GFP, DOZI::TurboID::GFP and cytosolic (free) TurboID? Perhaps stained with fluorescently labelled streptavidin and / or against GFP? In the gametocyte and zygote samples?

      Response: We attempted to stain with monoclonal antibodies that are reactive against biotin and there was insufficient specificity, hence why such data is not included. We conclude that all of the other data that supports this approach suffices to demonstrate its rigor.

      331 What is the age of these zygotes? Were they affinity purified?

      Response: As throughout the manuscript, zygotes were collected at 6 hours. Details of experimental purifications are provided in the materials and methods.

      Fig S4 Please indicate whether ALBA4 and DOZI were tagged endogenously

      Response: Yes. The endogenous loci for both ALBA4 and DOZI were modified to include the C-terminal TurboID and GFP tags.

      421/430 Please add a few references here

      Response: We do not believe that specific references are warranted for these general statements.

      429 translational repression?

      Response: Yes. These statements set the stage for the use of translational repression.

      445 966 proteins in gallinaceum? The zygote cultures in that study were 2-3 hours. How old were the cultures in your study?

      Response: As throughout the manuscript, zygotes were collected at 6 hours.

      481 Please explain / cite why repression is energetically costly.

      Response: These details are provided in both the introduction and discussion sections. The energetic cost of translational repression includes both the cost of producing the transcripts without immediately or fully utilizing them for translation and the cost of imposing the regulation.

      501 Please add the time-point of RNA and protein sampling. How many hours into ookinete development? What is the time from cardiac puncture through FACS sampling of gametocytes.

      Response: We have provided all of these details in the materials and methods for female gametocytes and zygotes. We did not look at ookinetes in this study.

      711/713 Do you have any images that show the successful purification of zygotes away from gametocytes? Secondly, please provide a reference for the statement that unfertilized female gametocyte do not express surface exposed Pys25.

      Response: We do not have captured images of these zygotes, but confirmed them during collection using microscopy. The reference for surface exposure of Pbs25 is now provided earlier in the manuscript as well.

      711/716 Were parasites lysed and mechanically homogenised?

      Response: We have provided all of these details in the materials and methods for female gametocytes and zygotes.

      Figure 6 What is the evidence that DOZI stays associated with mRNA that is being translated? Rather than mRNA that is being decapped. Please add the references that unequivocally show that DOZI and ALBA4 bind to opposite ends of repressed mRNAs.

      Response: This is our working model of these data. It is feasible that these complexes could form off of mRNA as well. Publications describing the interactions of DOZI with eIF4E and ALBA4 with PABP1 are provided in the manuscript. It is well established that eIF4E binds to the m7G cap of the 5’ end of mRNAs, and PABP1 binds to the poly(A) tail at the 3’ end of mRNAs.

      Reviewer #1 (Significance (Required)):

      The experiments in the manuscript are carefully conducted. Apart from a P. gallinaceum study from 2009 this is the first comprehensive analysis of the transcriptome and proteome of a Plasmodium zygote (developing ookinete) at 6 hours post-fertilization. The data are used to explore the temporal aspect of activation of translation during the first quarter of the 20-24 hour ookinete developmental period. The study will be of interest to the field, specifically those scientists working to understand translational control, ookinete development, and those developing intervention strategies to prevent mosquito infection and thus malaria transmission.

      Response: We appreciate Reviewer 1’s extensive feedback and positive remarks about the significance of our study. We have revised our manuscript to reflect this constructive feedback.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Main findings

      Taking a multi-omic approach, the authors provide quantitative evidence for translation repression of ~200 mRNAs in Plasmodium yoelii female gametocytes. These mRNAs are then translated, and proteins detected by 6 hours after activating gametocytes. They accomplish this by performing a comparative global analysis of the transcriptome and proteome between female gametocytes and early zygotes that provides an interesting resource. The authors also use proximity labelling of the DOZI/CITH/ALBA4 repression complex, and these data suggest the complex may disassemble in the zygote or change its composition.

      Major points

      Line 181-184: The authors state that there is no evidence of how the DCA complex selects specific mRNAs for translation repression. While the exact mechanisms have not been fully elucidated, Braks et al (2008, doi:10.1093/nar/gkm1142) suggested a role of the untranslated regions (UTRs) in translation repression of transcripts in Plasmodium berghei female gametocytes. They identified a uridine-rich 47-base element in the 5'UTR and/or 3'UTR that was associated with translationally repressed transcripts and validated it experimentally. Considering this finding, I would recommend an amendment of the statement and to include the earlier work. I would also like to see additional analysis to check if this U-rich motif or other motifs are associated with the translationally repressed transcripts identified in the current study. The current study should be better powered to conduct such an analysis.

      Response: We have now added a comment and citation in the revised text about this study in Lines 86-88. Understanding the full importance of this element is challenging, as the Plasmodium transcriptome is highly enriched in A’s and U’s due to the highly skewed A/T content of its genome. Perhaps for this reason, we did not see an association of this motif with the identified mRNAs.

      The authors used zygotes that expressed GFP tagged AP2-O, however, there is no explanation of the significance of using this line.

      Response: This line is described in the Materials and Methods and supporting information. It was used to provide further validation of the production of zygotes.

      Minor points

      In line 106-107, the authors refer to figure SI, this figure is about genomic locus and genotyping PCR for the PyApiAP2-O::GFP parasites but there is no intext description of why this specific line was used.

      Response: We have provided this information in the revised manuscript.

      Statement in line 122-124 "It is likely that....." should go into the discussion not results.

      Response: We have placed this single sentence immediately after presenting these data here to aid reader comprehension.

      Statement in line 171-175: "In addition to providing confirmatory...." Should be in the discussion not on the results.

      Response: We view this sentence as a concluding remark of this section of data that also places this information in context for the reader.

      In Fig. 4 A and B, could the colour scheme be changed so that the proteins that are not in both samples (and probably contain many unspecifically detected proteins) appear less prominent?

      Response: We appreciate this suggestion and have adjusted these plots accordingly in the revised manuscript.

      Reviewer #3 (Significance (Required)):

      Why is the paper interesting. Translation repression of mRNA at a global level in the female gametocytes has been studied previously in rodent malaria parasites, but prior to the current study, the release of mRNA from translation repression in the mosquito stages has only been demonstrated for specific transcripts. By characterizing and quantitating changes in protein abundance between macrogamete and zygote, coupled with transcriptomic analysis, the current work broadens our understanding of zygotic translation activation that is key to successful malaria parasite transmission to the mosquito.

      This dataset provides a useful resource for the Plasmodium research community as it provides a more comprehensive view of how transcripts behave during the transitions from the mammalian host to the vector. It is one step in a broader endeavour towards finding genes crucial for parasite transmission that could be targeted for interventions.

      How translational repression and derepression is regulated remains unknown, although some of the molecular players have been identified. This paper shows proximity labelling and expansion microscopy data of the ribonuclear protein complex thought to mediate repression. Although the specific mechanistic insights provided by the experiments shown here remain relatively limited, the work demonstrates interesting new avenues for how translational derepression in Plasmodium can be studied.

      Response: We also appreciate Reviewer 3’s excellent feedback and positive remarks about the significance of our study. The revised manuscript addresses these comments, and we believe it is further strengthened because of it.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements [optional]

      We thank all the reviewers for their constructive and critical comments. We provide a point-by-point response to the reviewers' comments, as detailed below. We believe that addressing them will significantly improve our revised manuscript, making it of interest to researchers in the fields of cell biology, signaling pathways, physiology, and nutrition.

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary: The manuscript by Yusuke Toyoda and co-workers reports that the phosphorylation of the α-arrestin Aly3 downstream of TORC2 and GAD8 (AKT) negatively regulates endocytosis of the hexose transporter Ght5 in S. pombe under glucose-limiting growth conditions.

      To arrive at these conclusions, the researchers define a set of redundant C-terminal phosphorylation sites in Aly3 that are downstream of GAD8. Phosphorylation of these sites reduces Ght5 ubiquitination and endocytosis. For ubiquitination, Aly3 interacts with the ubiquitin ligases Pub1/3.

      We thank the reviewer for their time and for reporting the strengths and issues of this study.

      Major points:

      Figure 3B: it would be interesting to compare Aly3 migration pattern (and hence potential phosphorylation) under glucose replete or limiting growth conditions. Can the authors provide direct evidence that Aly3 phosphorylation changes in response to glucose availability? Also please explain the 'smear' in lanes aly3(4th Ala), aly3(4th Ala, A584S), aly3(4th Ala, A586T).

      While it is an interesting possibility that the Aly3 migration pattern changes in response to glucose concentrations in medium, we think that this is unlikely and that examining this possibility is beyond the scope of this study. Because a phospho-proteomics study reported by Dr. Paul Nurse's lab showed Tor1-dependent phosphorylation of Aly3 at S584 under high glucose (2%) conditions (Mak et al, EMBO J, 2021), the Aly3 phosphorylation (migration) pattern is likely to be constant regardless of glucose conditions. Glucose conditions affect the mRNA and protein levels of Ght5, but supposedly not its endocytosis to vacuoles (Saitoh et al, Mol Biol Cell, 2015; Toyoda et al, J Cell Sci, 2021).

      As for the smear in Aly3(4th A), Aly3(4th A;A584S), Aly3(4th A; A586T), we suspect that some posttranslational modification occurs on these mutant Aly3 proteins, but the identity of the modification is unclear. We did not mention the smear signals in the original manuscript, because the presence or absence of the smear did not necessarily correlate with cell proliferation in low glucose and thus vacuolar localization of Ght5, which is the main topic of this study. In the revised manuscript, we will mention this point more clearly.

      Figure 4: Ght5 localization should be analyzed + / - thiamine and in media with different glucose levels. Also, a co-localization with a vacuolar marker (FM4-64) would be nice (but not necessary). Ideally, the authors should add WB analysis of Ght5 turnover to complement the imaging data. Also, would it be possible to measure directly the effects on glucose uptake (e.g., using 2-NBDG)?

      In this revision, we plan to observe Ght5 localization under the conditions indicated by the reviewer (+/- thiamine and high/low glucose levels) to unambiguously show that the vacuolar localization of Ght5 occurs in a manner dependent solely on expression of the mutant Aly3 protein.

      We thank the reviewer for the suggestion of co-staining with FM4-64. Indeed, because we previously reported that the cytoplasmic Ght5 signals were surrounded by FM4-64 signals in the TORC2-deficient tor1Δ mutant cells (Toyoda et al, J Cell Sci, 2021), the cytoplasmic Ght5-GFP signals in Figure 4 are very likely to co-localize with vacuoles. We will modify the text to clarify this point.

      As suggested, we plan to add Western blot analysis of Ght5 turnover in Aly3-expressing cells, to complement the imaging data (Figure 4) in the revised manuscript. Persistent appearance of GFP in Western blot would be a good support for vacuolar transport of Ght5-GFP.

      While regulation of glucose uptake is an important issue, measurement of Ght5-dependent glucose uptake using 2-NBDG was very difficult in our hands. Another reviewer (Reviewer #2) also mentioned the difficulty of this measurement in the Referees cross-commenting section.

      Figure 5: Given the localization of Ght5 shown in Figure 4, I'm surprised that it is possible to detect full length Ght5, and its ubiquitination in the phospho-mutants of Aly3. I expected that the majority of Ght5 would be constitutively degraded, and that one would need to prevent endocytosis and/or vacuolar degradation to detect full length Ght5 and ubiquitination. Please explain the discrepancy. Also it seems that the quantification in B was performed on a single experiment.

      As the aim of Figure 5 is to compare the ubiquitinated species of Ght5 among the samples expressing different species of Aly3, the loading amount of each sample was adjusted so that the abundance of immunoprecipitated Ght5 is the same across them. Therefore, as the reviewer points out, before the adjustment, the abundance of full-length Ght5 might be different in these samples. In the revised manuscript, we will add an explanation of this point, namely why the anti-GFP blot in Figure 5A shows similar intensities across those samples.

      In the revised manuscript, we will add two additional replicates of the same experiment as Figure 5 in Supplementary material to show reproducibility of the result.

      Figure 6: Which PPxY motif of Aly3 is used for interaction with Pub1/3 and does their interaction depend on (de)phosphorylation?

      In the revised manuscript, we will discuss that "both PY motifs of Aly3 might be required for full interaction with Pub1/3," by citing the following published knowledge:

      (a) Mutation of both PPxY motifs of budding yeast Rod1 and Rog3 (Aly3 homologs) diminished their interaction with the ubiquitin ligase Rsp5 (Andoh et al, FEBS Lett, 2002).

      (b) Mutating either one of the two PPxY motifs of budding yeast Cvs7/Art1 greatly decreased interaction with the WW domain, and mutating both abolished the interaction (Lin et al, Cell, 2008).

      Our preliminary results indicated that Pub3 interacted with Aly3, Aly3(4th A) and phospho-mimetic Aly3(4th D), and thus suggested that the Aly3-Pub1/3 interaction does not depend on the phosphorylation status of Aly3. Consistently, budding yeast Rod1 reportedly interacts with Rsp5 regardless of its phosphorylation status (e.g. Becuwe et al, J Cell Biol, 2012). While we have partially mentioned this point in the original manuscript (L499-503), we will discuss this point more clearly in the revised manuscript.

      Reviewer #1 (Significance):

      The results are well presented and clear cut (with few exceptions, please see major points). They provide further evidence that metabolic cues instruct the phosphorylation of α-arrestins. Phosphorylation then negatively regulates α-arrestin function in selective endocytosis and is essential to adjust nutrient uptake across the plasma membrane to the given biological context.

      We thank the reviewer for recognizing the significance of our study. We believe that adding new results from the requested experiments and responding to the comments raised will clarify the significance of our revised manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity):

      **Summary / background. This paper focuses on the regulation of endocytosis of the hexose transporter, Ght5, in S. pombe by nutrient limitation through the arrestin-like protein Aly3. Ght5 is induced when glucose is limiting and is required for growth and proliferation in these conditions. ght5+ encodes the only high-affinity glc transporter from fission yeast. ght5+ is induced in low glucose conditions at the transcriptional level and is translocated to the plasma membrane to allow glc import. Ght5 is targeted to the vacuole in conditions of N limitation. Mutations in the TORC2 pathway lead to the same process, thus preventing growth on low glucose medium, as shown in the gad8ts mutant, mutated for the Gad8 kinase acting downstream of TORC2. Previously, the authors demonstrated that the vacuolar delivery of Ght5 in the gad8ts mutant is suppressed by mutation of the arrestin-like protein Aly3. Arrestin-like proteins are in charge of recognising and ubiquitinating plasma membrane proteins to direct their vacuolar targeting by the endocytosis pathway. This suggested that Aly3 is hyperactive in TORC2 mutants, and accordingly, Ght5 ubiquitination was increased in gad8ts.

      **Overall statement This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand. However, I think the conclusions are not always rigorous and the conclusions are sometimes far-reaching. The main problem is that much of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments below.

      We thank the reviewer for pointing out both the strengths and the weaknesses of our manuscript.

      We admit that phosphorylation of Aly3 was not experimentally shown in our manuscript, although its phosphorylation has already been shown in phospho-proteomic studies by other groups. For this issue, we plan to add an experiment and modify the text, as explained below.

      The other major issue raised by this reviewer is that detection of Ght5 ubiquitination by immunoprecipitation in a native condition cannot be conclusive. We are aware that many studies perform affinity purification after denaturing and precipitating proteins with TCA or acetone to detect ubiquitination of the affinity-purified protein (e.g. Lin et al, Cell, 2008); nevertheless, we disagree with Reviewer #2 on this point. In a review article describing methods to study ubiquitination by immunoblotting (Emmerich and Cohen, Biochem Biophys Res Comm, 2015), affinity purification of the protein of interest in a native condition is mentioned as one major choice. Moreover, a denaturing condition was not applicable to the detection of ubiquitinated Ght5, because once the Ght5 protein has been denatured and precipitated with TCA, it cannot be re-solubilized for immunopurification and immunoblotting. As the reviewer points out, a pitfall of detecting ubiquitinated Ght5 in a native condition is the presence of co-immunoprecipitated proteins. In our previous study (Toyoda et al, J Cell Sci, 2021), we purified GFP-tagged Ght5 and showed that a 110 kDa band detected in an anti-Ub immunoblot was also recognized by an anti-GFP antibody, confirming that the detected 110 kDa band corresponded to a ubiquitinated species of Ght5 and not to a co-immunoprecipitated protein. Similarly, in the revised manuscript, we will add to Figure 5A a panel showing a high-contrast (over-exposed) anti-GFP immunoblot, in which the indicated 110 kDa band is clearly detected by the anti-GFP antibody.

      We appreciate the issues raised by Reviewer #2. By responding to them, we believe that the conclusions of our study will be more rigorous and better supported in the revised manuscript.

      **Major statements and criticism.

      *Fig 1. Based on the hypothesis that TORC2-mediated phosphorylation regulate Ght5 endocytosis, the authors first considered a possible phosphorylation of Ght5. They mutagenised 11 **possible** phosphorylation sites on the Ct of Ght5, but none affected the growth on low glucose in the absence of thiamine, suggesting that they don't contribute to the observed TORC2-mediated regulation. However, I disagree with the statement that "phosphorylation of Ght5 is dispensable for cell proliferation in low glucose", given that the authors do not show 1- that Ght5 is phosphorylated and 2-that this is abolished by these mutations. They should either provide data on this or tone down and say that these residues are not involved in the regulation, without implying phosphorylation which is not proven.

      Although we did not experimentally test in our hands whether these 11 residues of Ght5 were phosphorylated, these residues have been shown to be phosphorylated in phospho-proteomics studies by other groups (Kettenbach et al, Mol Cell Proteomics, 2015; Swaffer et al, Cell Rep, 2018; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021). In the revised manuscript, we plan to be more precise by replacing this conclusion with the following statement: "11 Ser/Thr residues of Ght5, which are reportedly phosphorylated, are not essential for cell proliferation in low glucose."

      In the presence of thiamine (Supp fig 1), it seems that the ST/A mutant grows better in low glucose, and this is neither explained nor commented on. Since the transporter is not expressed, could the authors provide an explanation for this? If the promoter is leaky and some ght5-ST/A is expressed, it may be more stable and allow better growth than the WT, which would tend to indicate that impairing phosphorylation prevents endocytosis (which is classical for many transporters, see the body of work on CK1-mediated phosphorylation of transporters). Have the authors tried to decrease the glc concentration below 0.14% in the absence of thiamine to see if this is also true when the transporter is strongly expressed? (OPTIONAL)

      Improved growth of Ght5(ST11A)-expressing cells in the presence of thiamine was mentioned in the legend of Supplementary Figure 1A. In the revised manuscript, we will mention this observation also in the main text for better description of the results.

      Adding thiamine to medium does not completely shut off transcription from the nmt1 promoter but allows some transcription, as previously reported (Maundrell, J Biol Chem, 1990; Forsburg, Nuc Acid Res, 1993). In the revised manuscript, we will mention this "leakiness" of the nmt1 promoter and, by citing the suggested studies, will discuss a possibility that the ST11A mutations might prevent endocytosis of Ght5 and consequently promote cell proliferation in low glucose conditions.

      We found that, in the absence of thiamine, cells expressing ght5+ and ght5(ST11A) proliferated to a comparable extent on medium containing 0.08% glucose. This result will be added to the revised manuscript.

      *Fig 2. The authors then follow the hypothesis that TORC2 exerts its Ght5-dependent regulation through the phosphorylation of Aly3. They mutagenised 18 **possible** phosphorylation sites on Aly3. This led to a strong defect in growth in low-glc medium. Mutation of the possible Gad8 site (S460) did not recapitulate this phenotype, suggesting that it is not sufficient, however, mutations of 4 ST residues in a CT cluster (582-586) mimicked the full 18ST/A mutation, suggesting these are the important residues for Ght5 endocytosis.

      We thank the reviewer for appreciating the results in Fig. 2. As we explain below, we plan to perform an additional experiment to show that the Aly3 C-terminus is phosphorylated. This result will provide additional experimental support for our model.

      *Fig 3A. Further dissection did not allow to pinpoint this regulation to a specific residue, beyond the dispensability of the T586 residue. Fig 3B. The authors look at the effects of mutation of Aly3 on these sites at the protein level. They had to develop an antibody because HA-epitope tagging did not lead to a functional protein (Supp fig 2). Whereas I agree that the mutations causing a phenotype lead to a change in the migration pattern, I disagree with the statement that "This observation indicated that slower migrating bands were phosphorylated species of Aly3" (p.9 l.271). First, lack of phosphorylation usually causes a slower mobility on gel, which is not clear to spot here. Second, a smear appears on top of the mutated proteins (eg. 4th Ala) which is possibly caused by another modification. There are many precedents in the literature about arrestins being ubiquitinated when they are not phosphorylated (see the work on Bul1, Rod1, Csr2 in baker's yeast from various labs). My gut feeling is that lack of phosphorylation unleashes Aly3 ubiquitination leading to change in pattern. All in all, it is impossible to state about the phosphorylation of a protein without addressing its phosphorylation properly by phosphatase treatment + change in migration, or MS/MS. Thus, whereas the data looks promising, this hypothesis that Aly3 is phosphorylated at the indicated sites is not properly demonstrated.

      We disagree with the reviewer's opinion that a lack of phosphorylation usually causes slower mobility on gel. There are many examples in which phosphorylation causes slower mobility on gel, including budding yeast Rod1 (Alvaro et al, Genetics, 2016), and mammalian TXNIP (Wu et al, Mol Cell, 2013). In the revised manuscript, we will cite these reports to support our interpretation that the slower migrating bands are likely phosphorylated species of Aly3 (L270-271).

      Smear-like signals in Aly3(4th Ala), Aly3(4th A;A584S) and Aly3(4th A;A586T) might result from some modification, but the identity of the modification is unknown. As Reviewer #2 mentioned, phosphorylation of Aly3 might negatively regulate another modification. Previous studies revealed that the budding yeast arrestins Rod1 and Rog3 tend to be ubiquitinated in snf1/AMPK-deficient cells (Becuwe et al, J Cell Biol, 2012; O'Donnell et al, Mol Cell Biol, 2015), and that the Bul1 arrestin is dephosphorylated and ubiquitinated in budding yeast cells deficient in the Npr1 kinase (Merhi and Andre, Mol Cell Biol, 2012). Also, the budding yeast Csr2 arrestin is deubiquitinated and phosphorylated upon glucose replenishment, while non-phosphorylated Csr2 is ubiquitinated and activated by Rsp5 (Hovsepian et al, J Cell Biol, 2012). While the smear-like signals are interesting, we noticed that they did not necessarily correlate with cell proliferation defects in low glucose. We therefore think that clarifying the identity of the smear-like signals is beyond the scope of this study. We will discuss them only briefly in the revised manuscript, and hope to address this issue in future work.

      While the 4 S/T residues at the C-terminus of Aly3, as well as the other 14 S/T residues, have already been shown to be phosphorylated in previous studies (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021), we will confirm by phosphatase treatment in the revised manuscript that the slower migrating Aly3 is indeed phosphorylated. This planned experiment will further strengthen our study and support our conclusion and model.

      *Fig 4. The authors now look at the functional consequences of these mutations in Aly3 on Ght5 localisation. The data clearly show that mutation of the 4 identified S/T residues (Aly3-4th A) causes aberrant localisation of the transporter to the vacuole, likely to cause the observed growth defect on low glucose. There is a nice correlation between the vacuolar localisation and growth in low glucose for the various aly3 mutants. (A final proof could be to express this in the context of an endocytic mutant, which should restore membrane localisation and suppress the aly3-4thA phenotype - OPTIONAL). However, I still disagree with the statement that "These results indicate that phosphorylation of Aly3 at the C-terminal 582nd, 584th, and/or 585th serine residues is required for cell-surface localization of Ght5." given that phosphorylation was not properly demonstrated.

      While phosphorylation of the 582nd, 584th and/or 585th serine residues of Aly3 is not experimentally demonstrated in our hands, they have been shown to be phosphorylated in phospho-proteomics studies by other groups (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021). Among them, the 584th serine residue (S584) was reported to be phosphorylated in a TORC2-dependent manner (Mak et al, EMBO J, 2021), consistent with our model. To explicitly demonstrate that S584 is phosphorylated, we plan to make a strain expressing a mutant Aly3 protein in which all the possible phosphorylation sites except S584 are replaced with alanine, namely Aly3(ST17A;S584). Hopefully, we can properly show the phosphorylation of S584 by measuring the mobility of the Aly3(ST17A;S584) on gel with/without phosphatase treatment or gad8 mutation.

      We thank the reviewer for suggesting the experiment using an endocytic mutant. Previously we reported that vacuolar localization of Ght5 in gad8 mutant cells was suppressed by mutations not only in aly3 but also in genes encoding ESCRT complexes (Toyoda et al, J Cell Sci, 2021). We therefore think that in cells expressing Aly3(ST18A) or Aly3(4th Ala), Ght5 is subject to endocytosis and ensuing selective transport to vacuoles via endosome-localized ESCRT complexes. We will discuss this point in the revised manuscript.

      *Fig 5. Here, the authors question the role of Aly3 mutations in Ght5 ubiquitination. They immunoprecipitate Ght5 and address its ubiquitination status in various Aly3 mutants. The data are encouraging for a role of Aly3 phosphorylation (?) in the negative control of Ght5 ubiquitination. My main problem with this experiment is that it seems that Ght5 immunoprecipitations were made in non-denaturing conditions, which leads to the question of what the anti-ubiquitin is revealing here (Ght5 or a co-immunoprecipitated protein, for example Aly3 itself, or the Pub ligases, or an unknown protein). It seems that this protocol was used in their previous paper, but I stand by my conclusion that ubiquitination of a given protein can only be examined in denaturing conditions. The experiments should be repeated in buffers classical for the study of protein ubiquitination to be able to conclude unambiguously that we are looking at Ght5 ubiquitination itself, especially in the absence of a non-ubiquitinable form of Ght5 as a negative control. Could the authors comment on the fact that S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination?

      As mentioned above, immunoprecipitation of Ght5 in denaturing conditions is not feasible. Ght5 can be affinity-purified only in a non-denaturing condition. In addition, affinity purification in a native condition is considered a major choice for examining ubiquitination, according to the review by Emmerich and Cohen (Biochem Biophys Res Comm, 2015). A drawback of the native condition is, as the reviewer points out, that the affinity-purified fraction might include non-bait (non-Ght5) proteins. The 110 kDa band indicated by an arrow in Fig. 5A was confirmed to be Ght5, not a non-bait protein, as a band at the identical position was detected in the immunoblot with anti-GFP antibody. Because this band in the anti-GFP immunoblot was too faint to be visible in Fig. 5A of the original manuscript, we will add a panel showing the contrast-enhanced anti-GFP immunoblot in which the 110 kDa band is clearly visible.

      As for the result that "S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination," we are afraid that Reviewer #2 misunderstood the labels of the samples. We apologize for the confusing notation used for the sample names. The full description of the samples is as follows. In Aly3(4th A), all of S582, S584, S585 and T586 are replaced with A. In Aly3(4th A;A584S), S582, S585 and T586 are replaced with A, whereas S584 remains intact. In Aly3(4th A;A584D), S582, S585 and T586 are replaced with A, and S584 is replaced with phospho-mimetic D. Because cells expressing Aly3(4th A;A584S) and Aly3(4th A;A584D) exhibited similarly low levels of Ght5 ubiquitination, we speculated that phosphorylation at S584 of Aly3 negatively regulates ubiquitination of Ght5.

      In the revised manuscript, we plan to add a table showing amino acid sequence of each species of Aly3 (just like Figure 3A) to avoid confusion.

      *Fig 6. The authors want to document the model whereby Aly3 may interact with some of the Nedd4 ligases (Pub1/2/3) to mediate its Ght5-ubiquitination function. They actually use the Aly3-4thA mutant, it should have been better with the WT protein. But the results indicate a clear interaction with at least Pub1 and Pub3. By the way, are the Pub1/2/3 fusions functional? Nedd4 proteins are notoriously affected in their function by C-terminal tagging and are usually tagged at their N-terminus (See Dunn et al. J Cell Biol 2004).

      We plan to test whether Pub1-myc is functional by comparing proliferation of the Pub1-myc-expressing strain and pub1Δ strain, as pub1Δ cells reportedly show proliferation defects at a high temperature (Tamai and Shimoda, J Cell Sci, 2002). As deletion of pub2 or pub3 reportedly exhibited no obvious defects (Tamai and Shimoda, J Cell Sci, 2002; Hayles et al, Open Biol, 2013), it is not easy to assess functionality of the myc-tagged genes.

      Please note that C-terminally tagged Pub1/2/3 proteins have been widely used in studies with fission yeast. Both Pub1-HA and non-tagged Pub1 were reported to be ubiquitinated (Nefsky and Beach, EMBO J, 1996; Strachan et al, J Cell Sci, 2023). Pub1-GFP, which complemented the high temperature sensitivity of pub1Δ, localized to cell surface and cytoplasmic bodies (Tamai and Shimoda, J Cell Sci, 2002). Pub2-GFP, overexpression of which arrested cell growth just like overexpression of non-tagged Pub2, localized to cell surface, and consistently Pub2-HA was detected in membrane-enriched pellet fractions after ultracentrifugation (Tamai and Shimoda, J Cell Sci, 2002). They also reported ubiquitin conjugation of the HECT domain of Pub2 fused with myc epitope at its C-terminus. Pub3-GFP localized to cell surface (Matsuyama et al, Nat Biotech, 2006).

      Regardless of functionality of the myc-tagged Pub1/2/3, we believe that results of this experiment (Figure 6) support our model, because the aim of this experiment, which is to identify the HECT-type and WW-domain containing ubiquitin ligase(s) that interact with Aly3, is irrelevant to functionality of the myc-tagged Pub proteins.

      *Fig 7. The authors want to provide genetic interaction between the Pub ligases and the growth defects in low glc due to alterations in Ght5 trafficking. It is unclear how the gad8ts pub1∆ mutant was generated since it doesn't seem to grow on regular glc concentration (Supp fig 5), could the authors provide some information about this? It is also not clear whether it can be stated that this mutant is "more sensitive" to glc depletion because of the low level of growth to begin with (even at 3%). Altogether, the data show that deletion of pub3+ is able to suppress the growth defect of the gad8ts mutant on low glc medium, suggesting it is the relevant ligase for Ght5 endocytosis. This is confirmed by microscopy observations of Ght5 localisation. However, I would again tone down the main conclusion, which I feel is far-reaching: "Combined with physical interaction data, these results strongly suggest that Aly3 recruits Pub3, but not Pub2, for ubiquitination of Ght5." Work on Rsp5 in baker's yeast has shown that Rsp5 function goes beyond cargo ubiquitination, including ubiquitination of arrestins (which is often required for their function as mentioned in the introduction) or other endocytic proteins (epsins, amphiphysin etc). I agree that the data are compatible with this model but there are other possible explanations. Anything that would block endocytosis would supposedly suppress the gad8ts phenotype.

      gad8ts pub1Δ was produced at 26 °C, a permissive temperature of the gad8ts mutant. While this is described in the Methods section of the original manuscript, we will mention this more clearly in the Results section of the revised manuscript.

      We did not draw a conclusion about the low-glucose sensitivity of gad8ts pub1Δ cells in the indicated part (L376-377). Rather, we compared proliferation of gad8ts single mutant and pub1Δ single mutant cells in low glucose, and found that the pub1Δ single mutant exhibited higher sensitivity. In the revised manuscript, we will correct the text to clarify that we compared proliferation of the two single mutants (not the gad8ts pub1Δ double mutant).

      We agree with the opinion that the recruited Pub3 may ubiquitinate proteins other than Ght5. In the revised manuscript, we will correct our conclusion of the Figure 7 experiment (L388-390), not to limit the possible ubiquitination target(s) to Ght5.

      In a genetic screen, we found that mutations in aly3+ and genes encoding ESCRT complexes suppressed low-glucose sensitivity and vacuolar transport of Ght5 of gad8ts mutant cells (Toyoda et al, J Cell Sci, 2021). This finding appears consistent with the reviewer's opinion that blocking endocytosis would supposedly suppress the gad8ts phenotype. We will mention this point in the revised manuscript.

      *Discussion Some analogy with the regulation of the Bul arrestins by TORC1/Npr1 and PP2A/Sit4 could be mentioned (Merhi et al. 2012), at the discretion of the authors. The possibility that phosphorylation may neutralise a basic patch on Aly3 Ct, possibly involved in electrostatic interactions with Ght5, is very interesting. Regarding the effect of the mutations on Aly3 localisation (p.15 l.498), did the authors tag Aly3 with GFP? There are examples where proteins tagged with HA are not functional whereas tagging with GFP does not alter their function (eg. Rod1, Laussel et al. 2022) - and here Supp Fig 2 only relates to HA-tagging. Proof of a change in Aly3 localisation upon mutation would definitely be a plus (OPTIONAL).

      We thank the reviewer for the suggestion of a reference. In the revised manuscript, we will cite the indicated report in the corresponding part for an additional support of TORC1-mediated control of Aly3 (de)phosphorylation.

      While examining the localization of Aly3 by GFP-tagging is interesting, we do not believe that it is necessary in this study. We would like to produce Aly3-GFP and examine its functionality and localization in a future study. We thank the reviewer for the insightful suggestion.

      **Minor comments.

      *Introduction: - I believe the text corresponding to the work on TXNIP is incorrect (p.5 l.127). TXNIP is degraded after its phosphorylation, not "rectracted" from the surface.

      In the revised manuscript, we will correct the text accordingly.

      • For the sake of completion, the authors could add other references concerning the regulation of Rod1 in budding yeast such as Becuwe et al. 2012 J Cell Biol and O'Donnell et al. 2015 Mol Cell Biol, in addition to Llopis-Torregrosa et al. 2016.

      In the revised manuscript, we will add the suggested references and correct the text in the corresponding part of the Introduction (L123-138).

      • Other examples of the requirement for arrestin ubiquitination beyond Art1 (p.5 l.136-137) are listed in the ref cited: Kahlhofer et al. 2021.

      We will cite the indicated review to direct readers to more examples of arrestin ubiquitination (and transporter ubiquitination).

      *Figures: In general, I think it would be clearer if the authors showed on the figures that the background strain in which the XXX gene is added (or its mutant forms) is a xxx∆ strain.

      We will modify the figures to clearly show the genetic background of the strains used.

      **Referees cross-commenting**

      Cross review of Reviewer 1 - *I don't believe that the authors "define a set of redundant c-terminal phosphorylation sites in Aly3", because phosphorylation is not proven. *I think the points raised for Fig 3B are valid but the authors should focus on making their story conclusive before expanding to other data (except for the explanation of the smear, see my review). Also, I don't think 2NBDG actually works to measure Glc uptake. *Same for Fig 6 - not sure the interaction site mapping between Aly3 and Pubs would bring much value since there are more urgent things to do to make the story solid.

      As mentioned above, we will experimentally show phosphorylation of the Aly3 C-terminus in the revised manuscript. Such experiments would make our story more solid and conclusive. We truly appreciate the comments and suggestions.

      We agree with the comments on the difficulty of measuring glucose uptake using 2-NBDG. In fact, we tried and failed to measure Ght5-mediated glucose uptake using 2-NBDG.


      Cross review of Reviewer 3 - we have many overlaps, so briefly: *I agree that the bibliography is incomplete (mentioned in my review) *I agree that there is no demonstration of the phospho-status of Aly3, and it is a problem *I agree that the results can be better quantified, esp. in the light of the points raised by this referee concerning the variability of expression of ST18A. Other specific comments: *I agree that the statement that dephosphorylation activates alpha-arrestins should be toned down - this was observed in several instances but there are examples of arrestin-mediated endocytosis which does not require their prior dephosphorylation. *I fully agree that efforts could be made regarding the classification/nomenclature of arrestins in S. pombe, this had escaped my attention

      As detailed in the individual point raised by the reviewers, we will add the suggested references and accordingly correct the text in the revised manuscript.

      In addition to experimentally showing Aly3 phosphorylation, we will quantify the immunoblot result.

      Our statement that dephosphorylation activates alpha-arrestins might be an overgeneralization. We will mention reports in which arrestin-mediated endocytosis does not require prior dephosphorylation (e.g. O'Donnell et al, Mol Biol Cell, 2010; Gournas et al, Mol Biol Cell, 2017; Savocco et al, PLoS Biol, 2019), and revise the text accordingly.

      Reviewer #2 (Significance):

      *strengths and limitations This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins in S. pombe. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand, including the discovery of Aly3 as the main arrestin for this regulation, and a signalling pathway (TORC2/Gad8) acting upstream. The main question is now to understand at the mechanistic level how TORC2 signaling impinges on the regulation of this arrestin.

      Overall, the authors nicely demonstrate that C-terminal Ser/Thr residues are crucial for the function of Aly3 in Ght5 endocytosis. They propose a model whereby Aly3 phosphorylation by an unknown kinase inhibits its function on Ght5 ubiquitination, which would favour its endocytosis. However, I think the conclusions are not always rigorous and are sometimes far-reaching. The main problem is that many of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments above.

      *Advance

      This study, if completed carefully, would provide among the first examples of mapping of phosphorylation sites on arrestins, which are usually phosphorylated at many sites and are thus difficult to study. Few studies went down to this level in this respect (see Ivshov et al. eLife 2020). There are no changes in paradigms or new conceptual insights, but this work is a nice example of the conservation of these regulatory mechanisms.

      We appreciate the reviewer's positive evaluation of this study. We understand the main problems raised by the reviewer and, as detailed above, we plan to perform an additional experiment and provide explanations to address them. With these issues resolved, we believe that the conclusions of the revised manuscript will be more rigorous.

      Our study reveals mechanisms regulating vacuolar transport of the Ght5 hexose transporter via the TORC2 pathway in fission yeast. The serine residues at the Aly3 C-terminus (582nd, 584th and 585th serine residues), which are probably phosphorylated in a manner dependent on the TORC2 pathway, are required for sustained Ght5 localization to cell surface and cellular adaptation to low glucose. To our knowledge, there is no such study, and thus we think that this study is novel. By responding to the reviewers' comments and adding new data as explained above, the revised manuscript will be able to present novelty of our study more clearly. Comparison of our study in fission yeast to related studies in other model organisms may reveal the conservation and diversity of these regulatory mechanisms.

      *Audience Should be of interest for people studying basic research in the field of cell biology, signalling pathways, transporter regulation by physiology. Reviewer background is on the regulation of transporter endocytosis by signalling pathways and arrestin-like proteins.

      Reviewer #3 (Evidence, reproducibility and clarity): (Authors' response in blue)

      In this manuscript, the authors work to address how phospho-regulation of a-arrestin Aly3 in S. pombe regulates the glucose transporter Ght5. The authors use a series of phospho-mutants in Aly3 and assess function of these mutants using growth assays and localization of Ght5. My main concerns with the manuscript are that 1) there is a lack of appreciation for the similar work that has been done in S. cerevisiae to define a-arrestin phospho-regulation, which is evidenced by the severe lack of referencing throughout the document, 2) the sites mutated on Aly3 are not demonstrated to change phospho-status of Aly3 and so all interpretations of these mutants need to be better contextualized and 3) almost none of the findings are quantified (imaging or immunoblots) making it difficult to assess the rigor of the outcomes. More detailed comments are provided below.

      We thank the reviewer for thorough reading of the manuscript and the detailed comments. As explained below, we will respond to the points raised by the reviewer and accordingly modify the manuscript.

      Minor Comments

      Immunoblotting or immunostaining to define the levels and localization of phospho-mutants - In Figure 1, an immunoblot or immunostaining to define the abundance/localization of WT Ght5 vs its ST11A mutant would be appreciated. It is very difficult to know if ST11A is as functional as WT or not without an assessment of the levels and localization of the WT and mutant proteins to accompany the spot assays. Perhaps a version of Ght5 that is a phospho-mimetic would be more useful here as well since that version should not be dephosphorylated and then presumably would be internalized and not allow for growth on low glucose medium.

      We plan to add fluorescence microscopy data of WT Ght5 and Ght5(ST11A) in the revised manuscript, to compare the localization and abundance of these two Ght5 species. In our preliminary observation, the localization and abundance of the two Ght5 species appeared indistinguishable.

      We would like to emphasize that the primary aim of this study is to reveal mechanisms regulating Ght5 localization and consequently ensuring cell proliferation in low glucose. While analyzing a phospho-mimetic Ght5 mutant (e.g. Ght5(ST11D)) is interesting in terms of understanding the nature of Ght5, we believe that such an analysis is beyond the scope of this study. As Ght5(ST11A)-expressing cells proliferated comparably to Ght5(WT)-expressing cells, and WT and ST11A Ght5 localize indistinguishably on the cell surface, phosphorylation of the ST residues of Ght5 is not likely to be the primary mechanism regulating Ght5 localization and function. We would like to assess a phospho-mimetic Ght5 mutant protein in our future studies.

      For the Aly3 mutants where the abundance of Aly3 appears lower via immunoblotting (i.e., 4thA-A582S or S582A), how is the near perfect functional readout explained when the levels of the protein are much lower than WT? For the ST18A mutant, this is a particularly important point since the authors indicate on lines 194-197 that, based on the functional data for ST18A, some of these ST residues are needed for phospho-regulation of Aly3. However, in Figure 3B the authors clearly show that there is very little ST18A protein in cells, and so these mutations have impacted Aly3 stability, which may or may not be linked to its phospho-status. The authors should be upfront about this finding on lines 194-197 and should not present this phospho-model as the only reason for why ST18A may not be functional. On lines 265-276 the authors indicate that ST18A is expressed equivalently to WT Aly3, which is just not the case in Figure 3B. Perhaps quantification of replicate data would help clarify this issue. Further, if the authors wish to conclude that the upper MW bands in Figure 3B are due to phosphorylation, perhaps they should perform phosphatase treatments of their extracts to collapse these bands. However, most certainly the overall abundance of the single band for ST18A is reduced compared to the total bands of WT Aly3.

      We disagree with the opinion that the levels of the mutant Aly3 are much lower than WT. For semi-quantitative measurement of the protein abundance, 2-fold dilution series of the WT Aly3 sample were loaded in the leftmost 3 lanes of Figure 3B. Although the levels of Aly3(4th A;A582S), Aly3(S582A) and Aly3(ST18A) were lower than that of WT Aly3, they are 50% or more of the WT level, judging from the intensities of the serially-diluted WT samples. To clearly show that the expression of these Aly3 proteins is within comparable levels, we plan to add a column chart of the quantified expression levels and to mention abundances of the Aly3 proteins more quantitatively in the revised text. We do not think that replicate data (of Western blots as in Figure 3B) would help clarify this issue, because nmt1 promoter-driven gene transcription is induced with a small variation (Forsburg, Nuc Acid Res, 1993). We will cite this report and mention this point in the revised text.
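
      The semi-quantitative reading described above (placing a sample band relative to a 2-fold WT dilution series) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `relative_abundance` and the band intensities are hypothetical and are not taken from Figure 3B, and interpolation on a log2 scale between neighbouring standards is an assumed convention, not the authors' stated method.

```python
import math


def relative_abundance(sample_intensity, standards):
    """Estimate abundance relative to undiluted WT from a dilution series.

    standards: (relative_amount, band_intensity) pairs, e.g. amounts
    1.0, 0.5 and 0.25 for a 2-fold series. Intensities must increase
    strictly with amount. Values outside the series are clamped to the
    nearest standard, so they should be read as bounds, not estimates.
    """
    pts = sorted(standards, key=lambda p: p[1])  # order by intensity
    for (a_lo, i_lo), (a_hi, i_hi) in zip(pts, pts[1:]):
        if i_hi <= i_lo or a_hi <= a_lo:
            raise ValueError("standards must be strictly increasing")
    if sample_intensity <= pts[0][1]:
        return pts[0][0]
    if sample_intensity >= pts[-1][1]:
        return pts[-1][0]
    for (a_lo, i_lo), (a_hi, i_hi) in zip(pts, pts[1:]):
        if i_lo <= sample_intensity <= i_hi:
            # Linear in intensity, logarithmic in amount (2-fold steps).
            frac = (sample_intensity - i_lo) / (i_hi - i_lo)
            log_amt = math.log2(a_lo) + frac * (math.log2(a_hi) - math.log2(a_lo))
            return 2.0 ** log_amt


# Hypothetical intensities (arbitrary units), for illustration only.
wt_series = [(1.0, 1000.0), (0.5, 520.0), (0.25, 260.0)]
mutant_estimate = relative_abundance(760.0, wt_series)
```

      With these made-up numbers, a band whose intensity falls midway between the 0.5x and 1x standards is placed midway between them on the log2 scale, that is, above the 50% level discussed in the response.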

      We are afraid that this reviewer seems to consider that Aly3(ST18A) is not functional, but this is not the case and we do not intend to claim so. While deletion of aly3 did not interfere with cell proliferation in low glucose (see vector controls in Figures 2B, 2C and 3A, -Thiamine), expression of the ST18A mutant clearly hinders cell proliferation in low glucose, indicating that ST18A performs a dominant negative function to inhibit cell proliferation. That is, even though the expression level and/or stability of ST18A is reduced, it is still sufficiently abundant to perform the dominant negative function. We propose the phospho-model not due to dysfunctionality of ST18A, but its dominant negative functionality. The 18 S/T residues of Aly3, which are shown to be phosphorylated in precedent phospho-proteomics studies, seem to be required to down-regulate Aly3's function to inhibit cell proliferation in low glucose. We apologize for this confusion, and we will modify the text and figures to clarify these points in the revised manuscript.

      To obtain experimental support for our description that the slower migrating bands in Figure 3B are due to phosphorylation, we plan to perform a phosphatase treatment experiment as suggested.

      Figure 2A - how do the phosphorylation sites identified in Aly3 compare to those identified in Rod1 from S. cerevisiae? See PMID 26920760 or SGD for more information. I am confused as to why the Aly3 protein has an arrowhead at the C-terminus. What does this denote?

      We will mention reported phosphorylation sites of Aly3 and budding yeast Rod1/Art4 in the revised manuscript, by referring to the indicated report and database. It should be noted that similarity between the amino acid sequences of Aly3 and S. cerevisiae Rod1 is not high and is limited to the Arrestin-N and -C domains. The C-terminal half of Aly3, in which most of the potential phosphorylation sites are found, is not similar to Rod1. Thus, these sites are unlikely to be conserved between them.

      An arrowhead indicates the direction of transcription (from N to C-terminus). We will describe it explicitly in the revised figure legend.

      Figure 2 - The WT and Aly3-ST18A are expressed in S. pombe from a non-endogenous locus under the control of the Nmt1 promoter. However, are these mutants present in cells that contain WT copies of Aly3 at other genomic loci? If so, this would surely muddy the interpretations of these data as α- and β-arrestins are capable of multimerizing and the effect of multimerization on their activities can vary.

      As mentioned in L188, an aly3 deletion mutant strain (aly3Δ) was used as a host, and thus all strains harboring an nmt1-driven aly3 gene lack the endogenous aly3 gene. We will add an illustration clearly showing that the host strain lacks the endogenous aly3+ gene and modify the legend of Figure 2.

      Functional readouts for Aly3 using Ght5 localization - The reduced surface levels of Ght5 do correspond to the spot assay growth in low glucose for the various Aly3 mutants used. However, it would be useful if these assays incorporated an endocytosis inhibitor to help prevent the activities of these Aly3 plasmids to see if the transporter is retained at the PM. At the end of these mutational analyses, the authors conclude that phosphorylation of Aly3 at any of 3 sites is required for Ght5 trafficking to the vacuole in low glucose, however no experiment is done to demonstrate that these sites are phosphorylated residues. A phosphatase assay would be useful to help demonstrate that the modifications in 3B really are phosphorylation and a quantification of the phosphorylated bands in these WBs would also be useful to solidify the statement made on lines 306-309.

      We thank the reviewer for suggesting the experiment using an endocytosis inhibitor. Previously we reported that vacuolar localization of Ght5 in gad8ts mutant cells was suppressed by mutations in not only aly3 but also genes encoding ESCRT complexes (Toyoda et al, J Cell Sci, 2021). We therefore think that, in cells expressing Aly3(ST18A) or Aly3(4th Ala), Ght5 is subject to endocytosis and subsequent selective transport to vacuoles via ESCRT complexes. We will mention these previous findings in the revised manuscript.

      As mentioned in our responses to the comments above and to those of the other reviewers, we will perform a phosphatase treatment experiment and its quantification in the revised manuscript. Here, we'd like to emphasize that these 3 sites have been shown to be phosphorylated in phospho-proteomic studies by other researchers (Kettenbach et al, Mol Cell Proteomics, 2015; Tay et al, Cell Rep, 2019; Halova et al, Open Biol, 2021; Mak et al, EMBO J, 2021), although we do not show it directly in this study.

      Phosphorylation assessments - in general, it would be good to not only build the non-phosphorylatable versions of Aly3 but also the phospho-mimetic forms.

      We produced a phospho-mimetic mutant Aly3 (i.e. Aly3(4th A;A584D)), and showed the result in Figure 5A; cells expressing Aly3(4th A;A584D) exhibited a low ubiquitination of Ght5, similarly to Aly3(WT)- and Aly3(4th A;A584S)-expressing cells. In our experience, replacing S/T with D/E does not necessarily mimic phosphorylation. Thus, we do not believe that systematic production of phospho-mimetic Aly3 mutants would help achieve the aim of this study.

      Pub1, 2, and 3 - It would be helpful if the authors indicated what genes Pubs 1-3 correspond to in S. cerevisiae, where Rsp5 is the predominant Ub ligase interacting with α-arrestins. Is there no ortholog of Rsp5 in S. pombe?

      Pub1, Pub2 and Pub3 are regarded as orthologs of budding yeast Rsp5, according to the fission yeast database PomBase. We will perform a homology search for these E3 proteins, and based on the result, we will add a description in the revised manuscript.

      Pub-Aly3 interactions - could the authors please comment on the reason why so very little Aly3 is copurified with Pub1 or Pub2? Can any clear conclusion be drawn about pub2 given how very little Pub2 is present in the IPs? Based on my understanding of these data I do not think that this can be cleanly interpreted. What is the identity of the ~50kDa MW band in Figure 6 in the upper MYC detection panel?

      We do not have an accurate answer for the result that a small amount of Aly3 is copurified with Pub1 or Pub3. The Pub1/3-Aly3 interaction may be weak or transient. We will discuss this point in the revised manuscript.

      Regarding whether Aly3 interacts with Pub2, we agree with the reviewer. As described in the Results (L360-362), we could not conclude anything about Aly3-Pub2 interaction by this immunoprecipitation experiment alone. On the other hand, the genetic interaction experiment (Figure 7A) suggests that pub2+ is not involved in defects caused by the gad8ts mutation (while pub3+ and aly3+ are). By this experiment, we think that Pub2 is not a partner of Aly3.

      In the revised manuscript, we will describe that Pub2 is not a partner of Aly3 in a paragraph describing the Figure 7A experiment.

      Because the 50 kDa band found in the IP fraction of all the samples appears even in "beads only" (Figure 6), it is presumably derived from mouse IgG dissociated from the beads used for immunoprecipitation. We will mention this in the legend of Figure 6.

      Phosphorylation and ubiquitination of α-arrestins - The paragraph from lines 123-138 is very superficial in addressing what is known about phosphorylation and ubiquitination of α-arrestins. The way this section is written, it feels misleading to the reader as it omits many of the details for regulation that would help place the current study in context. The discussion of Rod1 phosphorylation by AMPK for example, which is directly relevant to this study, is underdeveloped. I would recommend splitting this into two paragraphs and providing a more in depth, and accurate, view of the literature on this topic, with a focus on the regulation that is relevant for the ortholog of Aly3 in S. cerevisiae. For example, Rod1 phosphorylation by AMPK is greatly expanded upon in the following papers (PMID 22249293 and 25547292) and AMPK regulation of C-tail phosphorylation of α-arrestins is defined further in PMID 26920760. These references are each particularly important to compare with the current findings presented in this manuscript. Torc2 regulation of α-arrestins is also reviewed in PMID 36149412 and references therein should be considered.

      Because the primary aim of this study is to reveal mechanisms regulating Ght5 localization in fission yeast, but not to dissect modification and regulation of α-arrestins, we decided not to get into the details of phosphorylation and ubiquitination of α-arrestins. Furthermore, although budding yeast Rod1 and Rog3 are found to be downstream of the TORC2-Ypk1 signaling in the context of internalization of the Ste2 pheromone receptor, it is not clear whether TORC2-Ypk1 signaling also regulates α-arrestin-mediated internalization of hexose transporters in budding yeast. For these reasons, we focused on the limited literature essential for interpretation of the results and omitted many references describing the details of α-arrestin regulation. However, as this reviewer commented, we realize that our decision makes the discussion superficial and misleading to the reader. We sincerely apologize for this inconvenience.

      In the revised manuscript, we will reorganize the paragraphs in the discussion and include the suggested references. Regarding budding yeast Rod1, we will cite the study reporting Ypk1-mediated phosphorylation on Rod1 in mating pheromone response via regulation of Ste2 endocytosis (Alvaro et al, Genetics, 2016). We will also mention other reports (Becuwe et al, J Cell Biol, 2012; O'Donnell et al, Mol Cell Biol, 2015) about AMPK-dependent phosphorylation of Rod1 in the corresponding part (e.g. L129-130). In addition, we will mention that Aly2, Rod1 and Rog3 α-arrestins were found downstream of the TORC2-Ypk1 signaling (Muir et al, eLife, 2014; Thorner, Biochem J, 2022).

      As a further detailed example, there is far more work done on ubiquitination of α-arrestins in S. cerevisiae than the single citation provided by the authors on line 137. The way this section is written it feels misleading. Considerable effort has been spent on defining how mono- and poly-ubiquitination regulate α-arrestins and the authors should consider the data provided in the following citations and revise the two sentences they provide in this introduction to better reflect the breadth of our understanding rather than simply indicate that the 'mechanisms that regulate functions of α-arrestins are not fully understood'. (PMIDs 23824189; 22249293; 17028178; 28298493)

      Ubiquitination of α-arrestin itself is not the topic of this study, and physiological consequences of ubiquitination of Aly3 remain unknown. For these reasons, we did not describe the details of ubiquitination of α-arrestins in the original manuscript. However, we never intended to mislead the reader, and thus to avoid it, we will revise the indicated sentences and cite the suggested literature (O'Donnell et al, J Biol Chem, 2013; Becuwe et al, J Cell Biol, 2012; Kee et al, J Biol Chem, 2006; Ho et al, Mol Biol Cell, 2017) in the revised manuscript.

      Context of the findings and lack of citations - The referencing in this manuscript is very poor as many of the key papers that report analogous findings in the budding yeast Saccharomyces cerevisiae are not cited. This oversight in citing the appropriate literature must be remedied before this manuscript can be considered further for publication. Examples of these omissions occur at the following places:

      We will modify the text and carefully cite more of the literature describing analogous findings in budding yeast and other organisms in the revised manuscript. We appreciate the insightful suggestions by this reviewer. It should be noted, however, that it is not evident whether budding yeast Rod1 and Rog3 are orthologous to fission yeast Aly3. Although Rod1 and Aly3 share overlapping roles, their amino acid sequence similarity is not high and is limited to domains which are generally conserved among α-arrestin-family proteins.

      Line 90 - The Puca and Brou citation is one example of this, but the first examples come from Daniela Rotin's work looking at Rsp5 interactions in budding yeast, which is where the association between HECT-domain Ub ligases and α-arrestins is also documented by Scott Emr and Hugh Pelham's labs. Here are some PMID numbers to improve the citations of this section (PMID 17551511; 18976803; 19912579) and each of these references long predates the Puca and Brou publication.

      In the revised manuscript, we will improve the citations by including the suggested studies (Gupta et al, Mol Syst Biol, 2007; Lin et al, Cell, 2008; Nikko and Pelham, Traffic, 2009).

      Lines 123-126 - Phosphorylation can also increase vacuole-dependent degradation of alpha-arrestins as demonstrated in PMID 35454122. The interaction with 14-3-3 proteins that is driven by phosphorylation of α-arrestins was first demonstrated by the Leon group in PMID 22249293. Lines 129-132 - Here again the Leon reference that helps demonstrate the 14-3-3 inhibition of Rod1 is lacking (PMID 22249293).

      We will cite the suggested studies in description of these topics (Bowman et al, Biomolecules, 2022; Becuwe et al, J Cell Biol, 2012).

      Lines 130-132 - Please include references for the statement that dephosphorylation activates α-arrestin activity. There are no citations on this statement and there are many to choose from and I would urge the authors to cite the primary literature on these points.

      We will cite studies for the statement "Conversely, dephosphorylation is thought to activate α-arrestins and to promote selective endocytosis of transporter proteins" (L130-132).

      These are just a few examples from the Introduction, but the Discussion is similarly fraught with issues in referencing and framing the experimental results within the context of the larger field, including what is known about Rod1/Rog3 regulation in S. cerevisiae. For example, the Llopis-Torregrosa et al reference and statement on lines 508-510 is incorrect. There are other phosphorylation sites defined in the C-terminus of Rod1, as described in Alvaro et al. PMID: 26920760.

      We will carefully correct Discussion by citing the suggested references (e.g. Alvaro et al, Genetics, 2016) and framing the obtained results within the context of the larger field.

      Of note, a combination of α-arrestin, upstream kinase(s) and distinct phosphorylation sites appears to determine the target transporter (Kahlhofer et al, Biol Cell, 2021; Thorner, Biochem J, 2022), and it has not been explicitly proved that TORC2-Ypk1 signaling also regulates α-arrestin-mediated internalization of hexose transporters in budding yeast. For these reasons, we stated "S. cerevisiae Rod1 and Rog3 are phosphorylated solely by Snf1p/AMPK" in the context of internalization of hexose transporters. We will also discuss this point in the revised manuscript.

      Minor Comments Clarification needed - Lines 107-121 - The relationship between the S. pombe arrestins and those in other organisms is somewhat unclear. First, all the arrestins in humans and S. cerevisiae can be sorted into the alpha, beta and Vps26 classes. However, the authors indicate that the S. pombe genome has 11 arrestin-like proteins but only 4 of these are α-arrestins. What classes do the other 7 arrestins belong to? It would be appreciated if this point was clarified.

      To our knowledge, fission yeast arrestins are not well classified yet. We will perform a phylogenetic tree analysis to classify them, and modify the description of the indicated part accordingly. We will also cite our previous report (Toyoda et al, J Cell Sci, 2021), in which the overall protein structure and domains of 11 fission yeast arrestin-like proteins were reported.

      Next, for the 4 α-arrestins identified in S. pombe the authors indicate that Aly3 is the homolog of Rod1/Art4 and Rog3/Art7 from S. cerevisiae. What is the relationship of Rod1 in S. pombe to Rod1 in S. cerevisiae? Are these also homologs? You can see how the nomenclature is confusing and, given the functional overlap of S. cerevisiae Rod1/Rog3 proteins it is important to know if Aly3 is the only version of these α-arrestins or if there is an additional counterpart in S. pombe. This point becomes somewhat more confusing when on lines 134-136 the authors talk about Arn1/Any1 as an arrestin related protein in S. pombe yet this protein was not included on the list of α-arrestins in the preceding section. What class of arrestin is this protein?

      According to PomBase, both Aly3 and Rod1 are assigned as orthologues of budding yeast Rod1 and Rog3. However, as mentioned in responses above, it is unclear whether Aly3 is really orthologous to budding yeast Rod1/Rog3. In the revised manuscript, we will perform a homology search for these 4 proteins, and add information on how much these arrestins share homology.

      Arn1/Any1 is regarded as a β-arrestin (Nakase et al, J Cell Sci, 2013). We will also mention this in the revised manuscript.

      Alpha-arrestin homology - On lines 127-129 the authors indicate that TXNIP is the mammalian homolog of Aly3. To my knowledge, there are no evolutionary analyses that can draw these lines of homology between the α-arrestins in humans and those in yeasts. It would be appreciated if the authors could cite the work that leads to this conclusion or revise the sentence to more accurately reflect what is known on this topic. It certainly appears that, given their functional overlap in regulating glucose transporters, Txnip and Rod1/Rog3 in humans and S. cerevisiae are functionally connected. I urge the authors to use more caution when describing this protein family.

      Among human α-arrestins, ARRDC2 (22%) but not TXNIP (20%) has the highest amino acid identity to Aly3 (Toyoda et al, J Cell Sci, 2021). However, as TXNIP has been reported to regulate endocytosis of hexose transporters, GLUT1 and 4 (Wu et al, Mol Cell, 2013; Waldhart et al, Cell Rep, 2017), we think that TXNIP and Aly3 share physiological roles. We will revise the sentence (L127-129) more accurately.

      Text editing - The text could use editing as there are awkward and grammatically incorrect sentences in several places. Here are a few examples to help the authors:

      Please note that the original manuscript was edited before initial submission by a professional editor, who is a native English (American) speaker and has edited thousands of research papers. We will ask an editor to check the revised draft again before submission.

      Lines 57-60 - the protein is not expressed over the entire cell surface, but is localized to the entire cell surface.

      We will correct this wording.

      Lines 80-83 - this sentence is very confusing

      We will correct this part by changing the phrase "Unlike TORC1," into a clause.

      Line 86 - Is there more than one gene encoding Aly3 in S. pombe?

      No, there is only one gene encoding Aly3. We will correct this part so as to avoid being misunderstood.

      Lines 88, 109 - these sentences need to start with a capital so either capitalize the A in arrestin or write out Alpha with a capital A.

      We will correct the sentence as suggested.

      Lines 145-148 - unclear as written

      We will clarify the meaning of the sentence by changing the voice.

      Line 224 - why are these amino acids being referred to as hydroxylated? Perhaps hydroxyl-containing amino acids or 18 amino acids with hydroxyl side chains would be better choices?

      We will correct the word as suggested.

      Line 300 - very confusing sentence structure

      We will correct this part by simplifying the structure of the sentence.

      And elsewhere....

      We will carefully check the revised text before submission.

      Reviewer #3 (Significance):

      The authors provide some information as to the residues needed in the Aly3 C-tail for Ght5 trafficking in S. pombe. These results are not placed in the context of similar phospho-regulatory work done for α-arrestins in S. cerevisiae, and this is needed for appreciation of the significance of the study.

      Overall, it appears that the model put forth is very similar to the one already proposed in S. cerevisiae where phosphorylation impedes α-arrestin-mediated trafficking of glucose transporters. It is interesting to see this similarity hold in S. pombe, but it does not dramatically alter our appreciation of α-arrestin biology.

      The significance of the findings is somewhat undermined by the fact that very little quantification of data is presented, making the rigor of the work difficult to assess.

      We thank the reviewer for careful reading and evaluation of our study. As the reviewer states, the results are not placed in the context of similar phospho-regulatory works done for α-arrestins in S. cerevisiae. This may partly come from the fact that it remains unclear whether internalization of hexose transporters is regulated by TORC2-dependent phosphorylation in S. cerevisiae. We believe that our study is novel and significant for this reason. By performing the additional experiments/quantification and revising the text as suggested by the reviewers, the manuscript will be further strengthened, and we will be able to clearly conclude that TORC2-dependent phosphorylation of Aly3 regulates localization of the Ght5 hexose transporter and cellular responses to glucose shortage stress.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary/background.

      This paper focuses on the regulation of endocytosis of the hexose transporter, Ght5, in S. pombe by nutrient limitation through the arrestin-like protein Aly3. Ght5 is induced when glucose is limiting and is required for growth and proliferation in these conditions. ght5+ encodes the only high-affinity glc transporter from fission yeast. ght5+ is induced in low glucose conditions at the transcriptional level and is translocated to the plasma membrane to allow glc import. Ght5 is targeted to the vacuole in conditions of N limitation. Mutations in the TORC2 pathway lead to the same process, thus preventing growth on low glucose medium, as shown in the gad8ts mutant, mutated for the Gad8 kinase acting downstream of TORC2. Previously, the authors demonstrated that the vacuolar delivery of Ght5 in the gad8ts mutant is suppressed by mutation of the arrestin-like protein Aly3. Arrestin-like proteins are in charge of recognising and ubiquitinating plasma membrane proteins to direct their vacuolar targeting by the endocytosis pathway. This suggested that Aly3 is hyperactive in TORC2 mutants, and accordingly, Ght5 ubiquitination was increased in gad8ts.

      Overall statement

      This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand. However, I think the conclusions are not always rigorous and the conclusions are sometimes far-reaching. The main problem is that much of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments below.

      Major statements and criticism.

      • Fig 1. Based on the hypothesis that TORC2-mediated phosphorylation regulates Ght5 endocytosis, the authors first considered a possible phosphorylation of Ght5. They mutagenised 11 possible phosphorylation sites on the Ct of Ght5, but none affected the growth on low glucose in the absence of thiamine, suggesting that they don't contribute to the observed TORC2-mediated regulation. However, I disagree with the statement that "phosphorylation of Ght5 is dispensable for cell proliferation in low glucose", given that the authors do not show 1- that Ght5 is phosphorylated and 2- that this is abolished by these mutations. They should either provide data on this or tone down and say that these residues are not involved in the regulation, without implying phosphorylation which is not proven. In the presence of Thiamine (Supp fig 1), it seems that the ST/A mutant grows better in low glucose, and this is neither explained nor commented on. Since the transporter is not expressed, could the authors provide an explanation for this? If the promoter is leaky and some ght5-ST/A is expressed, it may be more stable and allow better growth than the WT, which would tend to indicate that impairing phosphorylation prevents endocytosis (which is classical for many transporters, see the body of work on CK1-mediated phosphorylation of transporters). Have the authors tried to decrease glc concentration lower than 0.14% in the absence of thiamine to see if this is also true when the transporter is strongly expressed? (OPTIONAL)
      • Fig 2. The authors then follow the hypothesis that TORC2 exerts its Ght5-dependent regulation through the phosphorylation of Aly3. They mutagenised 18 possible phosphorylation sites on Aly3. This led to a strong defect in growth in low-glc medium. Mutation of the possible Gad8 site (S460) did not recapitulate this phenotype, suggesting that it is not sufficient, however, mutations of 4 ST residues in a CT cluster (582-586) mimicked the full 18ST/A mutation, suggesting these are the important residues for Ght5 endocytosis.
      • Fig 3A. Further dissection did not allow to pinpoint this regulation to a specific residue, beyond the dispensability of the T586 residue. Fig 3B. The authors look at the effects of mutation of Aly3 on these sites at the protein level. They had to develop an antibody because HA-epitope tagging did not lead to a functional protein (Supp fig 2). Whereas I agree that the mutations causing a phenotype lead to a change in the migration pattern, I disagree with the statement that "This observation indicated that slower migrating bands were phosphorylated species of Aly3" (p.9 l.271). First, lack of phosphorylation usually causes a slower mobility on gel, which is not clear to spot here. Second, a smear appears on top of the mutated proteins (eg. 4th Ala) which is possibly caused by another modification. There are many precedents in the literature about arrestins being ubiquitinated when they are not phosphorylated (see the work on Bul1, Rod1, Csr2 in baker's yeast from various labs). My gut feeling is that lack of phosphorylation unleashes Aly3 ubiquitination leading to change in pattern. All in all, it is impossible to state about the phosphorylation of a protein without addressing its phosphorylation properly by phosphatase treatment + change in migration, or MS/MS. Thus, whereas the data looks promising, this hypothesis that Aly3 is phosphorylated at the indicated sites is not properly demonstrated.
      • Fig 4. The authors now look at the functional consequences of these mutations on Aly3 on Ght5 localisation. The data clearly shows that mutation of the 4 identified S/T residues (Aly3-4th A) causes aberrant localisation of the transporter to the vacuole, likely to cause the observed growth defect on low glucose. There is a nice correlation between the vacuolar localisation and growth in low-glucose for the various aly3 mutants. (A final proof could be to express this in the context of an endocytic mutant, which should restore membrane localisation and suppress the aly3-4thA phenotype - OPTIONAL). However, I still disagree with the statement that "These results indicate that phosphorylation of Aly3 at the C-terminal 582nd, 584th, and/or 585th serine residues is required for cell-surface localization of Ght5." given that phosphorylation was not properly demonstrated.
      • Fig 5. Here, the authors question the role of Aly3 mutations on Ght5 ubiquitination. They immunoprecipitate Ght5 and address its ubiquitination status in various Aly3 mutants. The data is encouraging for a role in Aly3 phosphorylation (?) in the negative control of Ght5 ubiquitination. My main problem with this experiment is that it seems that Ght5 immunoprecipitations were made in non-denaturing conditions, which leads to the question of what is the anti-ubiquitin revealing here (Ght5 or a co-immunoprecipitated protein, for example Aly3 itself, or the Pub ligases, or an unknown protein). It seems that this protocol was previously used in their previous paper, but I stand by my conclusion that ubiquitination of a given protein can only be looked in denaturing conditions. The experiments should be repeated in buffers classical for the study of protein ubiquitination to be able to conclude unambiguously that we are looking at Ght5 ubiquitination itself, especially in the absence of a non-ubiquitinable form of Ght5 as a negative control. Could the authors comment on the fact that S-A or S-D mutations display the same phenotype regarding the possible Ght5 ubiquitination?
      • Fig 6. The authors want to document the model whereby Aly3 may interact with some of the Nedd4 ligases (Pub1/2/3) to mediate its Ght5-ubiquitination function. They actually use the Aly3-4thA mutant; it would have been better to use the WT protein. But the results indicate a clear interaction with at least Pub1 and Pub3. By the way, are the Pub1/2/3 fusions functional? Nedd4 proteins are notoriously affected in their function by C-terminal tagging and are usually tagged at their N-terminus (See Dunn et al. J Cell Biol 2004).
      • Fig 7. The authors want to provide genetic interaction between the Pub ligases and the growth defects in low glc due to alterations in Ght5 trafficking. It is unclear how the gad8ts pub1∆ mutant was generated since it doesn't seem to grow on regular glc concentration (Supp fig 5), could the authors provide some information about this? It is also not clear whether it can be stated that this mutant is "more sensitive" to glc depletion because of the low level of growth to begin with (even at 3%). Altogether, the data show that deletion of pub3+ is able to suppress the growth defect of the gad8ts mutant on low glc medium, suggesting it is the relevant ligase for Ght5 endocytosis. This is confirmed by microscopy observations of Ght5 localisation. However, I would again tone down the main conclusion, which I feel is far-reaching: "Combined with physical interaction data, these results strongly suggest that Aly3 recruits Pub3, but not Pub2, for ubiquitination of Ght5." Work on Rsp5 in baker's yeast has shown that Rsp5 function goes beyond cargo ubiquitination, including ubiquitination of arrestins (which is often required for their function as mentioned in the introduction) or other endocytic proteins (epsins, amphiphysin etc). I agree that the data are compatible with this model but there are other possible explanations. Anything that would block endocytosis would supposedly suppress the gad8ts phenotype.

      Discussion

      Some analogy with the regulation of the Bul arrestins by TORC1/Npr1 and PP2A/Sit4 could be mentioned (Mehri et al. 2012), at the discretion of the authors. The possibility that phosphorylation may neutralise a basic patch on Aly3 Ct, possibly involved in electrostatic interactions with Ght5 is very interesting. Regarding the effect of the mutations on Aly3 localisation (p.15 l.498), did the authors tag Aly3 with GFP? There are examples where proteins tagged with HA are not functional whereas tagging with GFP does not alter their function (eg. Rod1, Laussel et al. 2022) - and here Supp Fig 2 only relates to HA-tagging. Proof of a change in Aly3 localisation upon mutation would definitely be a plus (OPTIONAL).

      Minor comments.

      Introduction:

      • I believe the text corresponding to the work on TXNIP is incorrect (p.5 l.127). TXNIP is degraded after its phosphorylation, not "retracted" from the surface.
      • For the sake of completion, the authors could add other references concerning the regulation of Rod1 in budding yeast such as Becuwe et al. 2012 J Cell Biol and O'Donnell et al. 2015 Mol Cell Biol, in addition to Llopis-Torregrosa et al. 2016.
      • Other examples of the requirement for arrestin ubiquitination beyond Art1 (p.5 l.136-137) are listed in the ref cited: Kahlhofer et al. 2021.

      Figures: In general, I think it would be clearer if the authors showed on the figures that the background strain in which the XXX gene is added (or its mutant forms) is a xxx∆ strain.

      Referees cross-commenting

      Cross review of Reviewer 1

      • I don't believe that the authors "define a set of redundant c-terminal phosphorylation sites in Aly3", because phosphorylation is not proven.
      • I think the points raised for Fig 3B are valid but the authors should focus on making their story conclusive before expanding to other data (except for the explanation of the smear, see my review). Also, I don't think 2NBDG actually works to measure Glc uptake.
      • Same for Fig 6 - not sure the interaction site mapping between Aly3 and Pubs would bring much value since there are more urgent things to do to make the story solid.

      Cross review of Reviewer 3 - we have many overlaps, so briefly:

      • I agree that the bibliography is incomplete (mentioned in my review)
      • I agree that there is no demonstration of the phospho-status of Aly3, and it is a problem
      • I agree that the results can be better quantified, esp. in the light of the points raised by this referee concerning the variability of expression of ST18A

      Other specific comments :

      • I agree that the statement that dephosphorylation activates alpha-arrestins should be toned down - this was observed in several instances but there are examples of arrestin-mediated endocytosis which does not require their prior dephosphorylation.
      • I fully agree that efforts could be made regarding the classification/nomenclature of arrestins in S. pombe, this had escaped my attention

      Significance

      strengths and limitations

      This study aims at deepening our understanding of the regulation of endocytosis by signalling pathways through arrestin-like proteins in S. pombe. Ght5 is a nice model to study a physiological regulation, and the authors have a great set of tools at hand, including the discovery of Aly3 as the main arrestin for this regulation, and a signalling pathway (TORC2/Gad8) acting upstream. The main question is now to understand at the mechanistic level how TORC2 signaling impinges on the regulation of this arrestin.

      Overall, the authors nicely demonstrate that C-terminal Ser/Thr residues are crucial for the function of Aly3 in Ght5 endocytosis. They propose a model whereby Aly3 phosphorylation by an unknown kinase inhibits its function on Ght5 ubiquitination, which would favour its endocytosis. However, I think the conclusions are not always rigorous and are sometimes far-reaching. The main problem is that many of the conclusions concern a potential phosphorylation of Aly3 which is not experimentally addressed. An additional issue is the fact that they look at Ght5 ubiquitination by co-immunoprecipitation in native conditions (or at least, it seems to me) which cannot be conclusive. Overall, I think some experiments should be performed to address (at least) these 2 points before the manuscript can be published, see detailed comments above.

      Advance

      This study, if completed carefully, would provide among the first examples of mapping of phosphorylation sites on arrestins, which are usually phosphorylated at many sites and are thus difficult to study. Few studies went down to this level in this respect (see Ivshov et al. eLife 2020). There are no changes in paradigms or new conceptual insights, but this work is a nice example of the conservation of these regulatory mechanisms.

      Audience

      Should be of interest for people studying basic research in the field of cell biology, signalling pathways, transporter regulation by physiology. Reviewer background is on the regulation of transporter endocytosis by signalling pathways and arrestin-like proteins.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary and Strengths:

      The ability of Wolbachia to be transmitted horizontally during parasitoid wasp infections is supported by phylogenetic data here and elsewhere. Experimental analyses have shown evidence of wasp-to-wasp transmission during coinfection (eg Huigens et al), host to wasp transmission (eg Heath et al), and mechanical ('dirty needle') transmission from host to host (Ahmed et al). To my knowledge this manuscript provides the first experimental evidence of wasp to host transmission. Given the strong phylogenetic pattern of host-parasitoid Wolbachia sharing, this may be of general importance in explaining the distribution of Wolbachia across arthropods. This is of interest as Wolbachia is extremely common in the natural world and influences many aspects of host biology.

      Weaknesses:

      The first observation of the manuscript is that the Wolbachia strains in hosts are more closely related to those in their parasitoids. This has been reported on multiple occasions before, dating back to the late 1990s. The introduction cites five such papers (the observation is made in other studies too that could be cited) but then dismisses them by stating "However, without quantitative tests, this observation could simply reflect a bias in research focus." As these studies include carefully collected datasets that were analysed appropriately, I felt this claim of novelty was rather strong. It is unclear why downloading every sequence in GenBank avoids any perceived biases, when presumably the authors are reanalysing the data in these papers.

      Thank you for bringing this to our attention, and we will make the necessary amendments in our revised manuscript.

      I do not doubt the observation that host-parasitoid pairs tend to share related Wolbachia, as it is corroborated by other studies, the effect size is large, and the case study of whitefly is clearcut. It is also novel to do this analysis on such a large dataset. However, the statistical analysis used is incorrect as the observations are pseudo-replicated due to phylogenetic non-independence. When analysing comparative data like this it is essential to correct for the confounding effects of related species tending to be similar due to common ancestry. In this case, it is well-known that this is an issue as it is a repeated observation that related hosts are infected by related Wolbachia. However, the authors treat every pairwise combination of species (nearly a million pairs) as an independent observation. Addressing this issue is made more complex because there are both the host and symbiont trees to consider. The additional analysis in lines 123-124 (including shuffling species pairs) does not explicitly address this issue.

      We concur with your observation regarding the non-independence of the data due to phylogenetic relationships. While common phylogenetic correction methods are indeed not directly applicable to wsp distances between species pairs, we are investigating the potential of phylogenetic mixed models to address this issue. We hope to include a revised analysis using this approach in our revised manuscript.
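
      The pseudo-replication concern raised above can be illustrated with a minimal sketch. It is not the analysis used in the manuscript and it is not a phylogenetic mixed model: it only shows the difference between shuffling species pairs independently and permuting whole species labels (a Mantel-style scheme), so that every pair sharing a species is moved together. The species names, distances, and the function name `label_permutation_test` are hypothetical, and the approach does not correct for shared ancestry among hosts or among symbionts.

```python
import random
from itertools import combinations


def mean_pair_distance(dist, pairs):
    """Mean symbiont distance over the given species pairs."""
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def label_permutation_test(species, dist, linked_pairs, n_perm=2000, seed=0):
    """One-sided test: are linked (host-parasitoid) pairs closer than expected?

    Species labels are permuted as a whole, so each species carries all of
    its pairwise distances with it. Pairs are never treated as independent.
    """
    rng = random.Random(seed)
    observed = mean_pair_distance(dist, linked_pairs)
    hits = 0
    for _ in range(n_perm):
        shuffled = species[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(species, shuffled))
        permuted = [(relabel[a], relabel[b]) for a, b in linked_pairs]
        if mean_pair_distance(dist, permuted) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): three hosts, three parasitoids.
species = ["h1", "h2", "h3", "p1", "p2", "p3"]
linked_pairs = [("h1", "p1"), ("h2", "p2"), ("h3", "p3")]
linked = {frozenset(p) for p in linked_pairs}
dist = {
    frozenset(p): (0.1 if frozenset(p) in linked else 0.9)
    for p in combinations(species, 2)
}

observed, p_value = label_permutation_test(species, dist, linked_pairs)
print(observed, p_value)
```

      With only six species the number of distinct relabellings is small, so the attainable p-value is bounded from below regardless of effect size; this is one reason a model-based approach may be preferable for the full dataset.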

      The sharing of Wolbachia between whitefly and their parasitoids is very striking, although this has been reported before (eg the authors recently published a paper entitled "Diversity and Phylogenetic Analyses Reveal Horizontal Transmission of Endosymbionts Between Whiteflies and Their Parasitoids"). In Lines 154-164 it is suggested that from the tree the direction of transfer between host and parasitoid can be inferred from the data. This is not obvious to me given the poor resolution of the tree due to low sequence divergence. There are established statistical approaches to test the direction of trait changes on a tree that could have been used (a common approach is to use the software BEAST).

      Thank you for your insightful comments regarding the transfer direction of Wolbachia between whiteflies and their parasitoids. We acknowledge the concern about the resolution of the phylogenetic tree and the inference of the direction of Wolbachia transmission based on the available data. We considered the high infection frequency and obligate nature of Wolbachia in En. formosa, which exhibits a 100% infection rate, as a strong indicator that recent transmission of Wolbachia in this clade likely occurred from En. formosa to B. tabaci. We appreciate your recommendation and will ensure that our conclusions are supported by a more statistically sound approach. As you suggested, we will employ the software BEAST to rigorously test the direction of transmission, and we will revise our statements accordingly.

      Reviewer #2 (Public Review):

      The paper by Yan et al. aims to provide evidence for horizontal transmission of the intracellular bacterial symbiont Wolbachia from parasitoid wasps to their whitefly hosts. In my opinion, the paper in its current form contains major flaws.

      Weaknesses:

      The dogma in the field is that although horizontal transmission events of Wolbachia occur, in most systems they are so rare that the chances of observing them in the lab are very slim.

      For the idea of bacteria moving from a parasitoid to its host, the authors have rightfully cited the paper by Hughes, et al. (2001), which presents the main arguments against the possibility of documenting such transmissions. Thus, if the authors want to provide data that contradict the large volume of evidence showing the opposite, they should present a very strong case.

      In my opinion, the paper fails to provide such concrete evidence. Moreover, it seems the work presented does not meet the basic scientific standards.

      We are grateful for your critical perspective on our work. Nonetheless, we are confident in the credibility of our findings regarding the horizontal transmission of Wolbachia from En. formosa to B. tabaci. Our study has documented this phenomenon through phylogenetic tree analyses, and we have further substantiated our observations with rigorous experiments in both cages and petri dishes. The horizontal transfer of Wolbachia was confirmed via PCR, with the wsp sequences in B. tabaci showing complete concordance with those in En. formosa. Additionally, we utilized FISH, vertical transmission experiments, and phenotypic assays to demonstrate that the transferred Wolbachia could be vertically transmitted and induce significant fitness cost in B. tabaci. All experiments were conducted with strict negative controls and a sufficient number of replicates to ensure reliability, thereby meeting basic scientific standards. The collective evidence we present points to a definitive case of Wolbachia transmission from the parasitoid En. formosa to the whitefly B. tabaci.

      My main reservations are:

      • I think the distribution pattern of bacteria stained by the probes in the FISH pictures presented in Figure 4 looks very much like Portiera, the primary symbiont found in the bacteriome of all whitefly species. In order to make a strong case, the authors need to include Portiera probes along with the Wolbachia ones.

      We are very grateful for your critical evaluation regarding the specificity of FISH in our study. We are confident in the reliability of our FISH results for several reasons.

      1) We implemented rigorous negative controls which exhibited no detectable signal, thereby affirming the specificity of our hybridization. 2) The central region of the whitefly nymphs is a typical oviposition site for En. formosa. Post-parasitism, we observed FISH signals around the introduced parasitoid eggs, distinct from bacteriocyte cells which are rich in endosymbionts including Portiera (FIG 3e-f). This observation supports the high specificity of our FISH method. 3) In the G3 whiteflies, we detected the presence of Wolbachia in bacteriocytes in nymphs and at the posterior end of eggs in adult females (FIG 4). This distribution pattern aligns with previously reported localizations of Wolbachia in B. tabaci (Shi et al., 2016; Skaljac et al., 2013). Furthermore, the distribution of Wolbachia in the whiteflies does indeed exhibit some overlap with that of Portiera (Skaljac et al., 2013; Bing et al., 2014). 4) The primers used in our FISH assays have been widely cited (Heddi et al., 1999) and validated in studies on B. tabaci and other systems (Guo et al., 2018; Hegde et al., 2024; Krafsur et al., 2020; Rasgon et al., 2006; Uribe-Alvarez et al., 2019; Zhao et al., 2013). Taking all these points into consideration, we stand by the reliability of our FISH results.

      References:

      Bing XL, Xia WQ, Gui JD, Yan GH, Wang XW, Liu SS. 2014. Diversity and evolution of the Wolbachia endosymbionts of Bemisia (Hemiptera: Aleyrodidae) whiteflies. Ecol Evol, 4(13): 2714-37.

      Guo, Y, Hoffmann, AA, Xu, XQ, Zhang X, Huang HJ, Ju JF, Gong JT, Hong XY. 2018. Wolbachia-induced apoptosis associated with increased fecundity in Laodelphax striatellus (Hemiptera: Delphacidae). Insect Mol Biol, 27: 796-807.

      Heddi A, Grenier AM, Khatchadourian C, Charles H, Nardon P. 1999. Four intracellular genomes direct weevil biology: Nuclear, mitochondrial, principal endosymbiont, and Wolbachia. Proc Natl Acad Sci USA, 96: 6814-6819.

      Hegde S, Marriott AE, Pionnier N, Steven A, Bulman C, Gunderson E, et al. 2024. Combinations of the azaquinazoline anti-Wolbachia agent, AWZ1066S, with benzimidazole anthelmintics synergise to mediate sub-seven-day sterilising and curative efficacies in experimental models of filariasis. Front Microbiol, 15: 1346068.

      Krafsur AM, Ghosh A, Brelsfoard CL. 2020. Phenotypic response of Wolbachia pipientis in a cell-free medium. Microorganisms, 8: 1060.

      Rasgon JL, Gamston, CE, Ren X. 2006. Survival of Wolbachia pipientis in cell-free medium. Appl Environ Microbiol, 72: 6934-6937.

      Shi P, He Z, Li S, An X, Lv N, Ghanim M, Cuthbertson AGS, Ren SX, Qiu BL. 2016. Wolbachia has two different localization patterns in whitefly Bemisia tabaci AsiaII7 species. PLoS One, 11: e0162558.

      Skaljac M, Zanić K, Hrnčić S, Radonjić S, Perović T, Ghanim M. 2013. Diversity and localization of bacterial symbionts in three whitefly species (Hemiptera: Aleyrodidae) from the east coast of the Adriatic Sea. Bull Entomol Res, 103(1): 48-59.

      Uribe-Alvarez C, Chiquete-Félix N, Morales-García L, Bohórquez-Hernández A, Delgado-Buenrostro N L, Vaca L, et al. 2019. Wolbachia pipientis grows in Saccharomyces cerevisiae evoking early death of the host and deregulation of mitochondrial metabolism. MicrobiologyOpen, 8: e00675.

      Zhao DX, Zhang XF, Chen DS, Zhang YK, Hong XY, 2013. Wolbachia-host interactions: Host mating patterns affect Wolbachia density dynamics. PLoS One, 8: e66373.

      • If I understand the methods correctly, the phylogeny presented in Figure 2a is supposed to be based on a wide search for Wolbachia wsp gene done on the NCBI dataset (p. 348). However, when I checked the origin of some of the sequences used in the tree to show the similarity of Wolbachia between Bemisia tabaci and its parasitoids, I found that most of them were deposited by the authors themselves in the course of the current study (I could not find this mentioned in the text), or originated in a couple of papers that in my opinion should not have been published to begin with.

      We appreciate your meticulous examination of the sources for our sequence data. All the sequences included in our phylogenetic analysis were indeed downloaded from the NCBI database as of July 2023. The sequences used to illustrate the similarity of Wolbachia between B. tabaci and its parasitoids include those from our previously published study (Qi et al., 2019), which were sequenced from field samples. Additionally, some sequences were also obtained from other laboratories (Ahmed et al., 2009; Baldo et al., 2006; Van Meer et al., 1999). We acknowledge that in our prior research (Qi et al., 2019), the sequences were directly submitted to NCBI and, regrettably, we did not update the corresponding publication information after the article was published. It is not uncommon for sequences on NCBI to be submitted directly, with some never being followed by a published paper (e.g., FJ710487-FJ710511 and JF426137-JF426149), or not having their associated publication details updated post-publication (for instance, sequences MH918776-MH918794 from Qi et al., 2019, and KF017873-KF017878 from Fattah-Hosseini et al., 2018). We recognize that this practice can lead to confusion and apologize for the oversight in our work.

      References:

      Ahmed MZ, Shatters RG, Ren, SX, Jin GH, Mandour NS, Qiu BL. 2009. Genetic distinctions among the Mediterranean and Chinese populations of Bemisia tabaci Q biotype and their endosymbiont Wolbachia populations. J Appl Entomol, 133: 733-741.

      Baldo L, Hotopp JCD, Jolley KA, Bordenstein SR, Biber SA, Choudhury RR, et al. 2006. Multilocus sequence typing system for the endosymbiont Wolbachia pipientis. Appl Environ Microbiol, 72: 7098-110.

      Fattah-Hosseini S, Karimi J, Allahyari H. 2018. Molecular characterization of Iranian Encarsia formosa Gahan populations with natural incidence of Wolbachia infection. J Entomol Res Soc, 20: 85-100.

      Qi LD, Sun JT, Hong XY, Li YX. 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112(2): 894-905.

      Van Meer MM, Witteveldt J, Stouthamer R. 1999. Phylogeny of the arthropod endosymbiont Wolbachia based on the wsp gene. Insect Mol Biol, 8: 399-408.

      • The authors fail to discuss or even acknowledge a number of published studies that specifically show no horizontal transmission, such as the one claimed to be detected in the study presented.

      Thank you for bringing this to our attention. We will address and discuss the published studies that report no evidence of horizontal transmission, as you've highlighted, in the revised version of our manuscript.

      Reviewer #3 (Public Review):

      This is a very ordinary research paper. The horizontal transmission of endosymbionts, including Wolbachia, Rickettsia etc. has been reported in detail in the last 10 years, and parasitoid vectored as well as plant vectored horizontal transmission is the mainstream of research. For example, Ahmed et al. 2013 PLoS One, 2015 PLoS Pathogens, Chiel et al. 2014 Environmental Entomology, Ahmed et al. 2016 BMC Evolutionary Biology, Qi et al. 2019 JEE, Liu et al. 2023 Frontiers in Cellular and Infection Microbiology, all of these reported the parasitoid vectored horizontal transmission of endosymbiont. While Caspi-Fluger et al. 2012 Proc Roy Soc B, Chrostek et al. 2017 Frontiers in Microbiology, Li et al. 2017 ISME Journal, Li et al. 2017 FEMS, Shi et al. 2024 mBio, all of these reported the plant vectored horizontal transmission of endosymbiont. For the effects of endosymbiont on the biology of the host, Ahmed et al. 2015 PLoS Pathogens explained the effects in detail.

      Thank you very much for your insightful comments and for highlighting the relevant literature in the field of horizontal transmission of endosymbionts, including Wolbachia and Rickettsia. After careful consideration of the studies you have mentioned, we believe that our work presents significant novel contributions to the field. 1) Regarding the parasitoid-mediated horizontal transmission of Wolbachia, most of the cited articles, such as Ahmed et al. 2013 in PLoS One and Ahmed et al. 2016 in BMC Evolutionary Biology, propose hypotheses but do not provide definitive evidence. The transmission of Wolbachia within the whitefly cryptic species complex (Ahmed et al. 2013) or between moths and butterflies (Ahmed et al. 2016) could be mediated by parasitoids, plants, or other unknown pathways. 2) Chiel et al. 2014 in Environmental Entomology reported “no evidence for horizontal transmission of Wolbachia between and within trophic levels” in their study system. 3) The literature you mentioned about Rickettsia, rather than Wolbachia, indirectly reflects the relative scarcity of evidence for Wolbachia horizontal transmission. For example, the evidence for plant-mediated transmission of Wolbachia remains isolated, with Li et al. 2017 in The ISME Journal being one of the few reports supporting this mode of transmission. 4) While the effects of endosymbionts on their hosts are not the central focus of our study, the effects of transgenerational Wolbachia on whiteflies are primarily demonstrated to confirm the infection of Wolbachia into whiteflies. Furthermore, the effects we report of Wolbachia on whiteflies are notably different from those reported by Ahmed et al. 2015 in PLoS Pathogens, likely due to different whitefly species and Wolbachia strains. 5) More importantly, our study reveals a mechanism of parasitoid-mediated horizontal transmission of Wolbachia that is distinct from the mechanical transmission suggested by Ahmed et al. 2015 in PLoS Pathogens. Their study implies transmission primarily through host-feeding contamination, without the need for Wolbachia to infect the parasitoid, suggesting host-to-host transmission at the same trophic level. In contrast, our findings demonstrate transmission from parasitoids to hosts through unsuccessful parasitism, which represents cross-trophic level transmission. To our knowledge, this is the first experimental evidence that Wolbachia can be transmitted from parasitoids to hosts. We believe these clarifications and the novel insights provided by our research contribute valuable knowledge to the field.

      References:

      Ahmed MZ, De Barro PJ, Ren SX, Greeff JM, Qiu BL. 2013. Evidence for horizontal transmission of secondary endosymbionts in the Bemisia tabaci cryptic species complex. PLoS One, 8: e53084.

      Ahmed MZ, Li SJ, Xue X, Yin XJ, Ren SX, Jiggins FM, Greeff JM, Qiu BL. 2015. The intracellular bacterium Wolbachia uses parasitoid wasps as phoretic vectors for efficient horizontal transmission. PLoS Pathog, 10: e1004672.

      Ahmed MZ, Breinholt JW, Kawahara AY. 2016. Evidence for common horizontal transmission of Wolbachia among butterflies and moths. BMC Evol Biol, 16: 118. doi.org/10.1186/s12862-016-0660-x.

      Caspi-Fluger A, Inbar M, Mozes-Daube N, Katzir N, Portnoy V, Belausov E, Hunter MS, Zchori-Fein E. 2012. Horizontal transmission of the insect symbiont Rickettsia is plant-mediated. Proc Biol Sci, 279(1734): 1791-6.

      Chiel E, Kelly SE, Harris AM, Gebiola M, Li X, Zchori-Fein E, Hunter MS. 2014. Characteristics, phenotype, and transmission of Wolbachia in the sweet potato whitefly, Bemisia tabaci (Hemiptera: Aleyrodidae), and its parasitoid Eretmocerus sp. nr. emiratus (Hymenoptera: Aphelinidae). Environ Entomol, 43(2): 353-62.

      Chrostek E, Pelz-Stelinski K, Hurst GDD, Hughes GL. 2017. Horizontal transmission of intracellular insect symbionts via plants. Front Microbiol, 8: 2237.

      Li SJ, Ahmed MZ, Lv N, Shi PQ, Wang XM, Huang JL, Qiu BL. 2017. Plant-mediated horizontal transmission of Wolbachia between whiteflies. ISME J, 11: 1019-1028.

      Li YH, Ahmed MZ, Li SJ, Lv N, Shi PQ, Chen XS, Qiu BL. 2017. Plant-mediated horizontal transmission of Rickettsia endosymbiont between different whitefly species. FEMS Microbiol Ecol, 93(12). doi: 10.1093/femsec/fix138.

      Liu Y, He ZQ, Wen Q, Peng J, Zhou YT, Mandour N, McKenzie CL, Ahmed MZ, Qiu BL. 2023. Parasitoid-mediated horizontal transmission of Rickettsia between whiteflies. Front Cell Infect Microbiol, 12: 1077494. DOI: 10.3389/fcimb.2022.1077494

      Qi LD, Sun JT, Hong XY, Li YX. 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112: 894-905.

      Shi PQ, Wang L, Chen XY, Wang K, Wu QJ, Turlings TCJ, Zhang PJ, Qiu BL. 2024. Rickettsia transmission from whitefly to plants benefits herbivore insects but is detrimental to fungal and viral pathogens. mBio, 15(3): e0244823.

      Weaknesses:

      In the current study, the authors downloaded the MLST or wsp genes from a public database and analyzed the data using other methods. I think the authors may not be familiar with the research progress in the field of insect symbiont transmission, and the manuscript at its current stage lacks sufficient novelty.

      We appreciate your critical perspective on our study. However, we respectfully disagree with the viewpoint that our manuscript lacks sufficient novelty.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      i) "Enhancers dependent on TPR during senescence are enriched for binding sites of inflammatory transcription factors". Proximity to genes does not confirm an enhancer role for that gene, although Tasdemir et al., 2016 suggested this. At that time, Hi-C and HiChIP techniques were not well-established. Nowadays, without combining Hi-C and H3K27ac ChIP, HiChIP alone cannot definitively identify actual enhancer regions. If we repeatedly use the Tasdemir et al., 2016 map, we risk incorrect mapping of enhancers of SASP. The authors should either use other public Hi-C databases to map the enhancer of SASP or temper their conclusions about enhancers. Otherwise, this could set a precedent for the SASP enhancer region that might not be entirely accurate.

      The enhancer mapping for SASP is outdated, as advancements in Hi-C have significantly developed this area. Therefore, the claimed enhancers of SASP may not be accurate.

      Response: We agree with the reviewer that enhancers are not easy to define, or to pair with their target gene(s). Indeed, we would argue that even combined Hi-C and H3K27ac does not define enhancers or enhancer-gene pairs and that the gold-standard evidence for an enhancer is genetics – does its deletion/mutation abrogate gene activation. We would also point out that we did not actually use the Tasdemir data to call enhancers. In response to the reviewer’s comment, we will temper our terminology and now refer to our inter- and intra-genic ATAC-seq peaks only as “putative enhancers”.

      ii) “Many of these include putative enhancers located close to key SASP genes, such as IL1B and IL8 (Figure 1D).” I have the same concern as mentioned above (i). However, I am interested in knowing the other key SASP genes where DNA is accessible near the genes. A supplementary table listing key SASP genes along with their distances to the TSS and affected by TPR knock-down would be helpful.

      Response: We thank the reviewer for this suggestion. We will provide tables listing the TPR-dependent, senescence-specific ATAC-seq peaks that are close to genes associated with the ‘positive regulation of inflammatory response’, ‘cytokine activity’ and ‘cytokine receptor binding’ gene ontology terms which were significant in the GREAT analysis, and which include many SASP genes. We will also provide distances of these regions from the associated genes.

      iii) "As we previously reported, knockdown of TPR (siTPR) in RAS cells blocks SAHF formation, but it also results in reduced nuclear localisation (decreased nucleocytoplasmic ratio) of NF-κB, consistent with decreased NF-κB activation (Figure 2A and B, Figure S2A)." TPR is required for CCF, SASP, and SAHF. The relationship between CCF and SASP is well established, but the relationship between SAHF and CCF/SASP remains elusive. Both SAHF and CCF are enriched with heterochromatin markers, suggesting that CCF might originate from SAHF. However, this has not been confirmed. Do the authors think that SAHF is a prerequisite for CCF in the OIS model, or is it an independent event?

      Response: We agree with the reviewer that CCFs likely originate from SAHF. Whilst we cannot definitively prove this in our ER-Ras OIS model, in the revised manuscript we intend to further investigate the relationship between SAHF and CCF by knocking down HMGA1 during RAS-induced senescence. Like TPR, HMGA1 depletion is known to lead to loss of SAHF (Narita et al., Cell, 2006) but, unlike TPR, HMGA1 is a chromatin protein enriched on heterochromatin itself. We will assess whether loss of HMGA1 also abrogates CCF formation.

      iv) The authors suggested that "it is plausible that the decrease in CCFs produced during the early phases of OIS upon TPR knockdown may be caused by an increase in the stability of the nuclear periphery due to the heterochromatin that remains there when SAHF are not formed." I do not completely agree with this explanation because CCF starts forming at day 3-4 but culminates at later time points. According to Figure 5A, only 5-6% of cells are positive for CCFs on day 5. What happens on day 8? By day 8, the percentage of CCF-positive cells could be 20-25%, or the number of CCFs per cell might be 0.2-0.3. If TPR is not required for CCF formation at this stage, then linking CCF to SASP at day 8 becomes critical. This suggests that another mechanism might be driving SASP expression and that TPR could be regulating downstream signaling of CCF. It is possible that changes in nuclear pore density affect the localization of cGAS from the nucleus to the cytoplasm.

      Response: In our hands and using this IMR90 ER-RAS system, CCF formation decreases later in senescence (d8 - only 2% of cells) hence our focus on early timepoints after oncogenic RAS activation. At later timepoints, cGAS activation is also mediated by retrotransposons (de Cecco et al., Nature, 2019; Liu et al., Cell, 2023), as well as leakage of mitochondrial DNA (Victorelli et al., Nature, 2023; Chen et al., Nat. Comms, 2024), and so it is difficult to disentangle the net contribution of these three inputs.

      v) Additionally, the authors did not address what happens in the later stages of CCF formation in the absence of TPR. If TPR is not required for CCF formation at later stages, it fails to explain the downstream processes at these time points adequately. This suggests that TPR may also have another mechanism of SASP regulation independent of CCF formation.

      Response: In our cellular system CCFs precede the SASP - CCFs are already present at day 3 but SASP factors are not secreted until day 5. However, CCFs are not necessarily required for maintenance of the SASP. Once initiated, the SASP is maintained by cytokine feedback loops.

      …………

      Reviewer #2:

      1. The claim that TPR knockdown does not affect NFkappaB nuclear translocation indeed stands, but it would be nice if the authors also compared data across conditions in Fig. 2F, i.e. siCTRL+Ras CM versus siTPR+Ras CM in RAS cells and provided a p-value as it seems to me that there is some dampening of translocation intensity, which is clearly not the case for STOP cells. The authors focus on this for d3 and d5, but it seems to be also the case for later time points.

      Response: As basal NF-κB translocation is lower in RAS cells on TPR knockdown, we would expect a dampening in NF-κB translocation between siCTRL+RAS CM and siTPR+Ras CM regardless of whether there is a transportation defect. Consistent with this, the p-value for this comparison is significant, but we did not show it because it is not important in considering whether NF-κB nuclear translocation is impeded by TPR knockdown, which is the focus here. We will add a table with median nuclear:cytoplasmic NF-κB ratios and 95% confidence intervals to make the changes in basal level (treatment with STOP CM) clearer.

      Also, a comment based on literature or from the authors' previous work on TPR, on the extent to which the structural integrity of the nuclear basket is at all affected upon TPR depletion would be helpful for data interpretation.

      Response: In the revised manuscript we will refer to the literature showing that TPR is the final component added to the nuclear pore and that its absence does not affect localisation of NUP153 to the nuclear basket (Hase and Cordes, Mol. Biol. Cell, 2003; Aksenova et al., Nat Comms, 2020).

      Magnification of representative cells per each condition in Fig. 2E would be welcome.

      Response: We will provide a revised figure 2E with the magnifications as requested.

      Regarding the data in Figs 3 and S3: I am a bit confused about how the obviously decreased NFkappaB nuclear signal (e.g., in Fig. 3D) does not translate into a skewed N/C ratio (e.g., in Fig. 3C)? The western blots indicate that overall NFkappaB levels remain essentially unchanged? Am I missing something?

      Response: As stated in the Methods section, we used a 50-pixel expansion of the detected nuclear area as our cytoplasmic area in the analysis (see image below). This was because we found detecting and segmenting the whole cytoplasmic area in the NF-κB channel to be unreliable. At day 3 and 5, the decrease in NF-κB nuclear signal in RAS cells on TPR knockdown was accompanied by a decrease in signal in the portion of the cytoplasm closest to the nucleus. This led to no change in the nuclear:cytoplasmic ratio. We believe the redistribution of NF-κB closer to the nucleus in the RAS siCTRL sample indicates early activation and will make this clearer in the revised text. We will also quantify the NF-κB western blots (see point 5), to help clarify this issue.
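      The arithmetic behind this answer can be illustrated with a toy calculation. The sketch below is not the authors' analysis pipeline, which is not shown in the response; the function names, the square (Chebyshev) neighbourhood, the 1-pixel expansion on a 7x7 grid and all intensity values are assumptions chosen only to make the point that a proportional drop in both the nucleus and the perinuclear ring leaves the ratio unchanged.

```python
def ring_mask(nucleus, expand):
    """Pixels outside the nucleus that lie within `expand` pixels of it
    (square neighbourhood; an assumption, the real structuring element is unknown)."""
    rows, cols = len(nucleus), len(nucleus[0])

    def near_nucleus(r, c):
        return any(
            nucleus[rr][cc]
            for rr in range(max(0, r - expand), min(rows, r + expand + 1))
            for cc in range(max(0, c - expand), min(cols, c + expand + 1))
        )

    return [
        [(not nucleus[r][c]) and near_nucleus(r, c) for c in range(cols)]
        for r in range(rows)
    ]


def mean_in(image, mask):
    """Mean intensity of the pixels selected by a boolean mask."""
    vals = [image[r][c] for r in range(len(image))
            for c in range(len(image[0])) if mask[r][c]]
    return sum(vals) / len(vals)


def nc_ratio(image, nucleus, expand):
    """Mean nuclear intensity divided by mean intensity in the expansion ring."""
    return mean_in(image, nucleus) / mean_in(image, ring_mask(nucleus, expand))


def toy_image(nuc_val, ring_val, far_val, nucleus, ring):
    """Build a synthetic image with one intensity per compartment."""
    return [
        [nuc_val if nucleus[r][c] else ring_val if ring[r][c] else far_val
         for c in range(len(nucleus[0]))]
        for r in range(len(nucleus))
    ]


# Hypothetical 7x7 cell with a 3x3 nucleus in the centre.
NUCLEUS = [[2 <= r <= 4 and 2 <= c <= 4 for c in range(7)] for r in range(7)]
RING = ring_mask(NUCLEUS, 1)

# Signal halves in both nucleus and perinuclear ring; distal cytoplasm is unchanged.
control = toy_image(8.0, 4.0, 1.0, NUCLEUS, RING)
knockdown = toy_image(4.0, 2.0, 1.0, NUCLEUS, RING)

print(nc_ratio(control, NUCLEUS, 1), nc_ratio(knockdown, NUCLEUS, 1))
```

      Because the ring samples only the cytoplasm nearest the nucleus, signal elsewhere in the cell does not enter the ratio, so a drop in total nuclear signal need not appear as a change in the nuclear:cytoplasmic ratio.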

      Also, along these lines, d8 western blots seem to portray an overall drop in NFkappaB levels. Is this indeed so? Can the authors maybe quantify their blots' replicates and provide a box plot and statistical testing?

      Response: We will provide quantification for the NF-κB western blots, though box plots would not be appropriate as we only have two replicates.

      Regarding the ATAC-seq data from d3, I think it could be mined a bit more. For example, compare to d8 (which the authors have apparently done, but don't present in detail) and discuss which are these early regions that also become accessible by d3 and what kind of genes and motifs are associated with them. Moreover, the focus in Fig. S3E is on ATAC sites shared with d8; how about d3-specific ones? How many of these are there (if any) and how might they be affected?

      Response: As shown in Table S2, TPR knockdown did not cause any changes in chromatin accessibility at day 3, so there are no day 3 specific TPR dependent peaks. We will edit the text to make this clearer. We will carry out motif analysis and GREAT analysis on the day 3 peaks that become accessible in RAS cells but are not accessible in STOP (RAS-specific peaks).

      I trust that the authors quantified their STING blots for the conclusions they present, but since it is difficult to assess these confidently by eye, again, some quantification plots would be welcome in Figs 4C,D and S4D,E.

      Response: We will provide quantification for the STING western blots.

      As controls for Fig. 5, it would be interesting to see if active histone readouts also mark CCFs in this system.

      Response: Ivanov et al., J. Cell Biol., 2013 showed the absence of H3K9 acetylation from chromatin in CCFs. Further exploration of the types of chromatin/sequences in CCFs is outside the scope of our current manuscript.

      The POM121 channel in Fig. 5C appears to have some small signal foci in the cytoplasm; could these be small CCFs? More generally, the authors focus on these large blobs that only appear in

      Response: The small signal foci the reviewer is highlighting are background from the POM121 antibody staining rather than CCFs – they do not show DAPI staining, and similar foci are evident in non-senescent cells where CCFs are generally not present. Our unpublished data (see response to Reviewer 1, point iv) from day 8 cells show that only ~2% of senescent cells are positive for CCF regardless of TPR knockdown, which is a similar number to that observed in non-senescent cells at earlier timepoints. Thus, in our hands CCF formation occurs earlier, triggering the SASP, rather than at day 8 when the SASP is already established and reinforced through positive feedback cytokine signalling.

      I wonder if there is a simple experiment the authors could do to test if this mechanism is only linked to senescence, specifically oncogene-induced senescence? I don't think this is needed to support the conclusions drawn here, but it could significantly broaden the scope of their discovery if, for example, this was true in other senescence models or during proinflammatory activation in general?

      Response: These are interesting suggestions, but setting up, characterising and quantifying other senescence models will take a substantial amount of time that would be outside the scope of our current manuscript.

      ………….

      Reviewer #3

      1. The study uses a single cell strain IMR90 undergoing a single form of senescence, induced by activated Ras. To show the generalizability of the finding, the authors are advised to inhibit TPR in other forms of senescence in addition to IMR90. For example, IR or etoposide induces a greater amount of CCF than in OIS of IMR90. BJ, MEFs, and ARPE-19 senescence also show prominent CCF.

      Response: These are interesting suggestions, but as we responded to reviewer 2, setting up, characterising and quantifying other senescence models will take a substantial amount of time that would be outside the scope of our current manuscript.

      To convincingly show the CCF pathway is involved, the authors need to measure the activity of the cGAS-STING pathway. Including cGAMP ELISA will be informative.

      Response: We thank the reviewer for this suggestion, and we will try to include this assay in our revised manuscript.

      The authors used conditioned media to show that TPR KD does not directly affect NFkB nuclear translocation. While this is helpful, conditions other than senescence will be more direct. For example, TNFa treatment or poly I:C transfection induces efficient NFkB nuclear translocation in IMR90 cells.

      Response: This experiment (Fig. 2E,F) was designed simply to show that knocking down TPR does not impair the ability of activated NFkB to enter the nucleus; it is not about senescence per se. Indeed, this is why we included the addition of SASP (RAS) conditioned media to non-senescent STOP cells in Fig. 2. We do not think investigating other methods of activating NFkB would add more to the question of whether TPR loss abrogates NFkB nuclear import.

      Fig. 4C and Fig. S4D are identical.

      Response: Though these STING immunoblots look similar, they are in fact not identical. Below we attach the raw original image in which both biological replicates (Fig 4C and S4D) for Day 3 were run on the same gel as proof of this claim.

      Figure legend for Fig. S4F is mislabeled.

      Response: We will correct this.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      DNA damage triggers senescence, inducing chromatin reorganization and SASP activation. The authors previously demonstrated that the TPR nucleoprotein at nuclear pores is crucial for both SAHF formation and SASP activation during senescence. Here they also showed that TPR is required for the formation of cytoplasmic chromatin fragments (CCF), which activate cGAS-STING-TBK1-NF-kB signaling to express SASP. While the mechanistic regulation of CCF formation by TPR remains unclear, their study provides compelling evidence of downstream processes involving CCF. This study offers new insights into CCF formation, suggesting a promising direction for further research. I endorse the manuscript; however, there are several concerns that need addressing before acceptance.

      i) "Enhancers dependent on TPR during senescence are enriched for binding sites of inflammatory transcription factors".

      Proximity to genes does not confirm an enhancer role for that gene, although Tasdemir et al., 2016 suggested this. At that time, Hi-C and HiChIP techniques were not well established. Nowadays, without combining Hi-C and H3K27ac ChIP, HiChIP alone cannot definitively identify actual enhancer regions. If we repeatedly use the Tasdemir et al., 2016 map, we risk incorrect mapping of SASP enhancers. The authors should either use other public Hi-C databases to map the enhancers of SASP genes or temper their conclusions about enhancers. Otherwise, this could set a precedent for SASP enhancer regions that might not be entirely accurate.

      ii) Many of these include putative enhancers located close to key SASP genes, such as IL1B and IL8 (Figure 1D).

      I have the same concern as mentioned earlier about enhancers. However, I am interested in knowing the other key SASP genes where DNA is accessible near the genes. A supplementary table listing key SASP genes along with their distances to the TSS and affected by TPR knock-down would be helpful.

      iii) "As we previously reported, knockdown of TPR (siTPR) in RAS cells blocks SAHF formation, but it also results in reduced nuclear localisation (decreased nucleocytoplasmic ratio) of NF-κB, consistent with decreased NF-κB activation (Figure 2A and B, Figure S2A)." TPR is required for CCF, SASP, and SAHF. The relationship between CCF and SASP is well established, but the relationship between SAHF and CCF/SASP remains elusive. Both SAHF and CCF are enriched with heterochromatin markers, suggesting that CCF might originate from SAHF. However, this has not been confirmed. Do the authors think that SAHF is a prerequisite for CCF in the OIS model, or is it an independent event?

      iv) The authors suggested that "it is plausible that the decrease in CCFs produced during the early phases of OIS upon TPR knockdown may be caused by an increase in the stability of the nuclear periphery due to the heterochromatin that remains there when SAHF are not formed." I do not completely agree with this explanation because CCF starts forming at day 3-4 but culminates at later time points. According to Figure 5A, only 5-6% of cells are positive for CCFs on day 5. What happens on day 8? By day 8, the percentage of CCF-positive cells could be 20-25%, or the number of CCFs per cell might be 0.2-0.3. If TPR is not required for CCF formation at this stage, then linking CCF to SASP at day 8 becomes critical. This suggests that another mechanism might be driving SASP expression and that TPR could be regulating downstream signaling of CCF. It is possible that changes in nuclear pore density affect the localization of cGAS from the nucleus to the cytoplasm.

      Significance

      The authors previously demonstrated that the TPR nucleoprotein at nuclear pores is crucial for both SAHF formation and SASP activation during senescence. Here they also showed that TPR is required for the formation of cytoplasmic chromatin fragments (CCF), which activate cGAS-STING-TBK1-NF-kB signaling to express SASP. While the mechanistic regulation of CCF formation by TPR remains unclear, their study provides compelling evidence of downstream processes involving CCF. This study offers new insights into CCF formation, suggesting a promising direction for further research.

      However, there are some limitations to this study. The enhancer mapping for SASP is outdated, as advancements in Hi-C have significantly developed this area. Therefore, the claimed enhancers of SASP may not be accurate. Additionally, the authors did not address what happens in the later stages of CCF formation in the absence of TPR. If TPR is not required for CCF formation at later stages, it fails to explain the downstream processes at these time points adequately. This suggests that TPR may also have another mechanism of SASP regulation independent of CCF formation.

    1. we used our words we used what words we had to weld, what words we had we wielded, kneeled, we knelt.

      I think the opening lines of the poem are the first of many examples of Choi employing parallelism in this poem. I think the repetitive nature of this parallelism may be a commentary on how society pushes us to fit into a mold, both in our daily routine and in our identities.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) The manuscript by Lu et al aims to study the effects of tubulin post-translational modification in C. elegans touch receptor neurons. Authors use gene editing to engineer various predicted PTM mutations in a-tubulin MEC-12 and b-tubulin MEC-7. Authors generate and analyze an impressive battery of mutants in predicted phosphorylation site and acetylation site of b-tubulin MEC-7, K40 acetylation site in a-tubulin MEC-12, enzymatic site of the a-tubulin acetyltransferase MEC-17, and PTM sites in the MEC-12 and MEC-7 C-tails (glutamylation, detyrosination, delta-tubulin). This represents a lot of work, and will appeal to a readership interested in C. elegans touch receptor neurons. The major concern/criticism of this manuscript is whether the introduced mutation(s) directly affects a specific PTM or whether the mutation affects gene expression, protein expression/stability/localization, etc. As such, this work does convincingly demonstrate, as stated in the title, that "Editing of endogenous tubulins reveals varying effects of tubulin posttranslational modifications on axonal growth and regeneration." 

      We thank the reviewer for the constructive comments. With regard to the major concern or criticism, we would like to point out that we have previously characterized ~100 missense mutations in mec-7 and mec-12 (Zheng et al., 2017, PMID: 28835377; Lee et al., 2021, PMID: 33378215). So, we are familiar with the phenotypes associated with mutations that affect gene expression or protein stability, which mostly result in a null phenotype. When analyzing the PTM site mutants, we compared their phenotypes with the previously categorized phenotypes of null alleles, neomorphic mutations that increase microtubule stability, and antimorphic mutations that prevent polymerization or disrupt microtubule stability. For example, in the case of mec-7 S172 mutations, we found that S172P mutants had the same phenotype as the mec-7 knockout (mild neurite growth defects), suggesting that S172P likely affects protein folding or stability, resulting in the loss of MEC-7. In contrast, S172A and S172E mutations showed phenotypes similar to neomorphic alleles (the emergence of ectopic ALM posterior neurite) and antimorphic alleles (the severe shortening of all neurites in the TRNs), respectively. These phenotypic differences suggested to us that the effects of S172A and S172E mutations cannot be simply attributed to the loss of protein expression and stability. Similar logic was applied to the studies of other PTM-inactivating or -mimicking mutations.

      (2) For example, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, to test the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that branch point GT335 and polyglutamylation polyE recognizing antibodies stain cultured embryonic touch receptor neurons (TRNs), but did not examine staining in C. elegans TRNs in situ. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers, raising the question of how these "glutamylation" mutations are affecting mec-12 and -7. The rationale for using cultured embryonic TRNs and the relevance of the data and its interpretation are not clear.

      The GT335 and polyE antibodies were used by previous studies (O’Hagan et al., 2011, PMID: 21982591; and O’Hagan et al., 2017, PMID: 29129530) to detect the polyglutamylation signals in the sensory cilia of C. elegans. We initially tried to stain the whole animals using these antibodies but could not get clear and distinct signals in the TRNs. We reason that the tubulin polyglutamylation signals in the TRNs may be weak, and the in situ staining method, which requires the antibodies to penetrate multiple layers of tissues (e.g., cuticles and epidermis) to reach the TRN axons, may not be sensitive enough to detect the signal. In fact, the TRN axons are located deeper in the worm body compared to the sensory cilia that are mostly exposed to the environment. Another reason could be that the tissues (mostly epidermis) surrounding the TRN axons also have polyglutamylation staining, which makes it difficult to recognize TRN axons. This is a situation different from the anti-K40 acetylation staining, which only occurs in the TRNs because MEC-12 is the only a-tubulin isotype that carries K40. Due to these technical difficulties, we decided to use the in vitro cultured TRNs for the staining experiment, which allows both easy access of the antibodies (thus higher sensitivity) and the dissociation of the TRNs from other tissues. The fact that we were able to observe reduced staining in the ttll mutants and the tubulin mutants that lost the glutamate residues suggests that these antibodies indeed detected glutamylation signals in the cells.

      (3) The final paragraph of the discussion is factually incorrect. The C. elegans homologs of the CCP carboxypeptidases are called CCPP-1 and CCPP-6. There are several publications on their functions in C. elegans.

      We thank the reviewer for pointing out the mistake in the text. We intended to say that “there is no C. elegans homolog of the known tubulin carboxypeptidases that catalyze detyrosination”, which is true given that homologs of the detyrosinase vasohibins (VASH1/VASH2) cannot be found in C. elegans. We are aware of the publications on CCPP-1 and CCPP-6; CCPP-1 is known to regulate tubulin deglutamylation in the cilia of C. elegans (O’Hagan et al., 2011 and 2017), while CCPP-6 may function in the PLM to regulate axonal regeneration (Ghosh-Roy et al., 2012). In the revised manuscript, we have corrected the error.

      Reviewer #2 (Public Review):

      Summary:

      The tubulin subunits that make up microtubules can be posttranslationally modified and these PTMs are proposed to regulate microtubule dynamics and the proteins that can interact with microtubules in many contexts. However, most studies investigating the roles of tubulin PTMs have been conducted in vitro either with purified components or in cultured cells. Lu et al. use CRISPR/Cas9 genome editing to mutate tubulin genes in C. elegans, testing the role of specific tubulin residues on neuronal development. This study is a real tour de force, tackling multiple proposed tubulin modifications and following the resulting phenotypes with respect to neurite outgrowth in vivo. There is a ton of data that experts in the field will likely reference for years to come as this is one of the most comprehensive in vivo analyses of tubulin PTMs.

      This paper will be very important to the field; however, it would be strengthened if: 1) the authors demonstrated that the mutations they introduced had the intended consequences on microtubule PTMs, 2) the authors explored how the various tubulin mutations directly affect microtubules, and 3) the findings are made generally more accessible to non-C. elegans neurobiologists.

      (1) The authors introduce several mutations to perturb tubulin PTMs. However, it is unclear to what extent the engineered mutations affect tubulin in the intended way, i.e., are the authors sure that the PTMs they want to perturb are actually present in C. elegans? Many of the antibodies used did not appear to be specific and antibody staining was not always impacted in the mutant cases as expected. For example, is there any evidence that S172 is phosphorylated in C. elegans, e.g. from available phosphoproteomic data? Given the significant amount of staining left in the S172A mutant, the antibody seems non-specific in this context and therefore not a reliable readout of whether MTs are actually phosphorylated at this residue. As another example, there is no evidence presented that K252 is acetylated in C. elegans. At the very least, the authors should consider demonstrating the conservation of these residues and the surrounding residues with other organisms where studies have demonstrated PTMs exist.

      We thank the reviewer for the comments. To our knowledge, there are very few phosphoproteomic data available for C. elegans. We searched a previously published dataset (Zielinska et al., 2009; PMID: 19530675) and did not find the S172 phosphorylation signal in MEC-7. This is not surprising, given that only six touch receptor neurons express MEC-7 and the abundance of MEC-7 in the whole animal lysate may be below the detection limit. However, this phosphorylation site S172 is highly conserved across species and tubulin isotypes (Figure 1-figure supplement 1 in the revised manuscript), suggesting that this site is likely phosphorylated in MEC-7.

      In the case of K252, the potential acetylation site and the flanking sequences are extremely conserved across species and isotypes. In fact, the 20 amino acids from 241-260 a.a. are identical among the tubulin genes of C. elegans, fruit flies, Xenopus, and humans (Figure 4-figure supplement 1B). Thus, although K252 acetylation was found in HeLa cells, this site can possibly be acetylated in C. elegans as well.

      In the case of K40, we observed sequence divergence at the PTM site and adjacent sequences among the tubulin isotypes in C. elegans. MEC-12 is the only C. elegans a-tubulin isotype that has the K40 residue, and the 40-50 a.a. region of MEC-12 appears to be more conserved than other isotypes when compared to Drosophila, frog, and human a-tubulins (Figure 4-figure supplement 1A).

      (2) Given that the authors have the mutants in hand, it would be incredibly valuable to assess the impact of these mutations on microtubules directly in all cases. MT phenotypes are inferred from neurite outgrowth phenotypes in several cases, the authors should look directly at microtubules and/or microtubule dynamics via EBP-2 when possible OR show evidence that the only way to derive the neurite phenotypes shown is through the inferred microtubule phenotypes. For example, the effect of the acetylation or detyrosination mutants on MTs was not assessed. 

      We thank the reviewer for the suggestions. In this study, we created >20 tubulin mutants. Due to limited time and resources, we were not able to examine microtubule dynamics in every mutant strain using EBP-2 kymographs. We assessed the effects of the tubulin mutations mostly based on the changes in neurite growth patterns. From our previous experience of analyzing ~100 mec-7 and mec-12 missense mutations (Zheng et al., 2017, MBoC; Lee et al., 2021, MBoC), we found that the changes in microtubule dynamics are correlated with the changes in neuronal morphologies. For example, the growth of ectopic ALM-PN is correlated with fewer EBP-2 comets and potentially reduced microtubule dynamics; this correlation holds true for several mec-7 neomorphic missense alleles we examined before (Lee et al., 2021, MBoC) and the PTM site mutants [e.g., mec-7(S172A) and mec-12(4Es-A)] analyzed in this study. Similarly, the shortening of TRN neurites is correlated with more EBP-2 comets and increased microtubule dynamics. For the mutants that don’t show neurite growth defects, our previous experience is that they are not likely to show altered microtubule dynamics in EBP-2 tracking experiments. So, we did not analyze the acetylation mutants (which had no defects in neurite growth) and the detyrosination mutants (which had a weak ALM-PN phenotype). Nevertheless, we agree with the reviewer that we could not rule out the possibility that there may be some slight changes to microtubule dynamics in these mutants.

      Using tannic acid staining and electron microscopy (EM), we previously examined the microtubule structure in several tubulin missense mutants (Zheng et al., 2017, MBoC) and found that the loss-of-function and antimorphic mutations significantly reduced the number of microtubules and altered microtubule organization by reducing protofilament numbers. These structural changes are consistent with highly unstable microtubules and defects in neurite growth. On the other hand, neomorphic mutants had only a slight decrease in microtubule abundance, maintained the 15-protofilament structure, and had more tightly packed microtubule bundles that filled up most of the space in the TRN neurite (Zheng et al., 2017, MBoC). These structural features are consistent with increased microtubule stability and ectopic neurite growth. Although we did not directly examine the microtubule abundance and structure using EM in this study, we would expect similar changes that are correlated with the neurite growth phenotypes in the PTM mutants. We agree with the reviewer that it would be informative to conduct a more comprehensive analysis of these mutants using EM and other structural biology methods.

      (3) There is a ton of data here that will be important for experts working in this field to dig into, however, for the more general cell biologist, some of the data are quite inaccessible. More cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment.

      Response: We thank the reviewer for the comment. In the revised manuscript, we added some cartoons to Figure 2G to show the location of the synaptic vesicles. The neurite growth phenotype should be quite straightforward. Nevertheless, we added one more Figure (Figure 8) to summarize all the results in the study with cartoons that depicted the changes to neuronal morphologies.

      (4) In addition, I am left unconvinced of the negative data demonstrating that MBK does not phosphorylate tubulin. First, the data described in lines 207-211 does not appear to be presented anywhere. Second, RNAi is notoriously finicky in neurons, thus necessitating tissue-specific degradation using either the ZF/ZIF-1 or AID/TIR1 systems which both work extremely well in C. elegans. Third, there appears to be increasing S172 phosphorylation in Figure 3 Supplement 2 with added MBK-2, but there is no anti-tubulin blot to show equal loading, so this experiment is hard to interpret.

      We added the results of mbk-1, mbk-2, and hpk-1 mutants and cell-specific knockdown of MBK-2 into Figure 3-figure supplement 1D. Considering the reviewer’s suggestion, we attempted to use a ZIF-1 system to remove the MBK-2 proteins specifically in the TRNs using a previously published method (PMID: 28619826). We fused endogenous MBK-2 with GFP by gene editing and then expressed an anti-GFP nanobody fused with ZIF-1 in the TRNs to induce the degradation of MBK-2::GFP. To our surprise, unlike the mbk-2p::GFP transcriptional reporter, the MBK-2::GFP did not show detectable expression in the TRNs, although expression can be seen in early embryos, which is consistent with the “embryonic lethal” phenotype of the mbk-2(-) mutants (Figure 3-figure supplement 2A-B in the revised manuscript). We reason that endogenous MBK-2 is either not expressed in the TRNs or expressed at a very low level. We then crossed mbk-2::GFP with ItSi953 [mec-18p::vhhGFP4::Zif-1] to trigger the degradation of any potential MBK-2 proteins and did not observe the ectopic growth of ALM-PN (Figure 3-figure supplement 2C). These results suggest that MBK-2 is not likely to regulate tubulin phosphorylation in the TRNs, which is consistent with the results of other genetic mutants and the RNAi experiments.

      For Figure 3 Supplement 2 (Figure 3-figure supplement 3 in the revised manuscript), because we added the same amount of purified MEC-12/MEC-7 to all reactions and had established equal loading in Figure 3E, we did not do the anti-tubulin staining in this experiment. Since a higher concentration (1742 nM) of MBK-2 did not produce a stronger signal than the condition with 1268 nM, we don’t think the 1268 nM band represents true phosphorylation. Moreover, the signal is not significantly stronger than the control without MBK-2 and is much lower than the signal generated by CDK1 in Figure 3E. Based on these results, we concluded that MBK-2 is not likely to phosphorylate MEC-7.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      General:

      A summary table would help the reader digest the vast amount of phenotypic data.

      Cartoons to help a non-C. elegans reader understand the figures. 

      We added Figure 8 to summarize and illustrate the effects of the various mutants analyzed in this study.

      Specific:

      The authors engineered mutations into the predicted phosphorylation site of b-tubulin mec-7. These CRISPR-alleles mutations phenocopied previously identified loss-of-function, gain-of-function, and neomorphic mec-7 alleles identified in genetic screens by the Chalfie lab. Next, the authors sought to identify the responsible kinase, taking a candidate gene approach. The most likely family - minibrain - had no effect when knocked down/out. The authors showed that cdk-1 mutants displayed ectopic ALM-PN outgrowth. Whether cdk-1 specifically acts in the TRNs was not demonstrated, calling into question whether CDK-1 phosphorylates S172 in vivo. In their introduction (lines 45-59), the authors built a case for engineering PTM mutations directly into tubulins, because the PTM enzymes may have multiple substrates. This logic applies to the cdk-1 experiment and its interpretation. 

      The reviewer is right. Since CDK1 and minibrain kinase are the only known kinases that catalyze S172 phosphorylation, our results suggest that CDK-1 is more likely to catalyze S172 phosphorylation in the TRNs compared to MBK-1/2. Genetic studies found that cdk-1(-); mec-7(S172A) double mutants did not show stronger phenotype than the two single mutants, suggesting that they function in the same pathway. Nevertheless, we could not rule out the possibility that other kinases may also control S172 phosphorylation, and the effect of CDK-1 is indirect. We mentioned this possibility in the revised manuscript.

      For a-tubulin MEC-12, acetyl-mimicking K40Q and unmodifiable K40R mutants failed to stain with the anti-acetyl-a-tubulin (K40) antibody and displayed subtle TRN phenotypes. The enzymatically dead MEC-17 had phenotypes similar to those described by Topalidou (2012), confirming the Chalfie lab finding that MEC-17 has functions in addition and independent of its acetyltransferase activity. The authors moved onto a predicted acetylation site in MEC-7 and observed TRN developmental defects, and acknowledged that this may be due to tubulin instability and not a PTM. This is a concern for all mutants, as there is no way to measure whether the protein is expressed, stable, or localized properly. 

      We acknowledge that this is a caveat of mutational studies. An amino acid substitution at the PTM site may have multiple effects, including a change of the PTM state and a potential alteration of protein conformation. Without direct evidence for enzymatic modification of the PTM site in the neurons, we could not rule out the possibility that the phenotype we observed is not related to the PTM and instead results from abnormal protein conformation and function caused by the mutation.

      Nevertheless, as stated in our above response to the first point in the public review, we can phenotypically differentiate loss-of-function and gain-of-function mutants. If the mutation reduces expression or general protein stability, it is more likely to cause a loss-of-function phenotype. For most PTM site mutants, this is not the case. We observed mostly gain-of-function phenotypes, suggesting that the missense mutations did not simply inactivate the tubulin protein and instead affected the functional properties of the protein.

      From here, the authors manipulate the C-terminal tail of MEC-12 and MEC-7, testing the idea that polyglutamylation may be an important PTM. These mutants displayed subtle phenotypes. The authors show that the branch point-recognizing GT335 and polyglutamylation-recognizing polyE antibodies stain cultured embryonic TRNs, but did not examine staining in TRNs. To my knowledge, these antibodies have not been shown to stain the TRNs in any published papers (see next point). The rationale for using cultured embryonic TRNs is not clear.

      See our response to the second point in the public review.

      Lines 548-553 There are several publications on CCPP-1 and CCPP-6 functions in TRNs and ciliated sensory neurons. See

      PMID: 20519502

      PMID: 21982591

      PMID: 21943602

      PMID: 23000142

      PMID: 29129530

      PMID: 33064774

      PMID: 36285326

      PMID: 37287505 

      We thank the reviewer for pointing out these references, some of which were cited in the revised manuscript. We made a mistake in the Discussion by saying that there are no C. elegans homologs of tubulin carboxypeptidases, while we intended to state that there is no homolog of tubulin detyrosinase in C. elegans. We are aware of the studies of CCPP-1 and CCPP-6 and have corrected the mistake in the revised manuscript (also see our response to the third point in the public review).

      Reviewer #2 (Recommendations For The Authors):

      Figures: 

      As stated in the public review, more cartoons and better labeling will be helpful as will consistent comparisons to control worms in each experiment. A good example of this issue is demonstrated in Figure 2 and Figure 4: 

      (1) Figure 2: Please label images with what is being probed in each panel. 

      We added labels to the panels.

      (2) Figure 2G is very hard to interpret - cartoon diagramming what is being observed would be helpful. 

      We added cartoons to help illustrate the images.

      (3) Line 182-185: is this referring to your data or to Wu et al? It is not clear in this paragraph when the authors are describing published work versus their own data presented here. 

      It is from our data. We have made it clear in the revised manuscript.

      (4) Figure 2 - 2K is not well described. What experiment is being done here? What is dlk-1 and why did you look at this mutant? 

      Figure 2K showed that both wild-type animals and S172A mutants could reconnect the severed axons after laser axotomy. Previous studies have found that dlk-1(-) mutants were not able to regenerate axons due to altered microtubule dynamics (PMID: 19737525; PMID: 23000142). We used dlk-1(-) mutants as a negative control, because DLK-1 promotes microtubule growth following axotomy, and the DLK-1 pathway is essential for regeneration (PMID: 23000142). We want to highlight the phenotypic difference between dlk-1(-) mutants and the S172E mutants. Although both mutants showed similar regrowth length, dlk-1(-) mutants showed unbranched regrowth probably due to the lack of microtubule polymerization, whereas the S172E mutants showed a mesh-like regrowth pattern likely due to highly dynamic and unstable microtubules. We explained the different phenotypes in the revised manuscript.

      (5) Figure 4C: this phenotype is hard to interpret. Where is the wt control? Where is the quantification? 

      In the Figure legend, we have referred the readers to Figure 1G for the wild-type image. Quantification is provided in the text (~20% of the animals showed the branching defects).

      (6) There are no WT comparison images in Figure 4I, making the quantification difficult to interpret 

      In the Figure legend, we have referred the readers to Figure 1A for the wild-type control. Moreover, we included a new Figure 8 to summarize the phenotypes of all mutants.

      Experimental:

      (1) Is it clear that MEC-7/MEC-12 are the only α- and β-tubulins present in the TRNs? The presence of other tubulins not mutated would complicate the interpretation of the results.

      According to the mRNA levels, the expression of MEC-7 and MEC-12 is >100 fold higher than that of other tubulin isotypes. For example, single-cell transcriptomic data (Taylor et al., 2021) showed that mec-7 mRNA is at 135,940 TPM in ALM neurons, whereas two other tubulin isotypes, tbb-1 and tbb-2, have expression values of 54 and 554 TPM, respectively, in the ALM. So, even if some other tubulin isotypes are present, their abundance is much lower than that of mec-7 and mec-12, and they are not likely to interfere with the effects of the mec-7 and mec-12 mutants.
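
      The fold differences implied by the TPM values quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the dictionary and variable names are illustrative, and only the three TPM values come from the response.

```python
# TPM values in ALM neurons as quoted in the response (Taylor et al., 2021).
alm_tpm = {"mec-7": 135_940, "tbb-1": 54, "tbb-2": 554}

# Fold difference of mec-7 over each of the other quoted beta-tubulin isotypes.
fold_over = {
    gene: alm_tpm["mec-7"] / tpm
    for gene, tpm in alm_tpm.items()
    if gene != "mec-7"
}

for gene, fold in fold_over.items():
    print(f"mec-7 / {gene}: {fold:.0f}-fold")
```

      Both ratios exceed 100, consistent with the ">100 fold" statement for mec-7. The response quotes no corresponding TPM value for mec-12, so that comparison is not reproduced here.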

      (2) The in vitro kinase assays should be quantified. 

      We have added the quantification.

      (3) The idea that Cdk1 phosphorylates tubulin in interphase is surprising and I am left wondering how the authors propose that Cdk1 is activated in interphase. Is cyclin B (or another cyclin) present in interphase in this cell type? Expression but not activation of Cdk1 is not discussed. 

      CDK1 can work with cyclin A and cyclin B. C. elegans has one cyclin A gene (cya-1) and four cyclin B genes (cyb-1, cyb-2.1, cyb-2.2, and cyb-3). According to single-cell transcriptomic data of L4 animals, cya-1 and cyb-1 showed weak expression in many postmitotic neurons (including the ALM neurons), while cyb-2.1, cyb-2.2, and cyb-3 had no expression in neurons. So, it is possible that cya-1/cyclin A and cyb-1/cyclin B have a low level of expression in the TRNs. A previous study also found the expression of cell cycle regulators (including cyclins) in postmitotic neurons in mouse brain (Akagawa et al., 2021; PMID: 34746147).

      (4) What is the significance of neurite swelling and looping in Figure 4H? The underlying cause of this phenotype is not described. 

      The neurite swelling and looping phenotype of mec-17(-) mutants was described by Topalidou et al. (2012; PMID: 22658602) and was caused by the bending of the microtubules. It appears that the loss of the α-tubulin acetyltransferase altered the organization of microtubules in the TRNs. These defects were partially rescued by the enzymatically dead MEC-17, suggesting that MEC-17 may play a non-enzymatic (and likely structural) role in regulating microtubule organization. We added more explanation in the revised manuscript.

      (5) It is quite surprising that polyglutamylation is not affected in the quintuple ttll mutant. Since the authors made the sextuple ttll mutant, could they demonstrate whether polyglutamylation is further reduced in this mutant via GT335 staining? 

      We did not compare the quintuple and sextuple ttll mutants because, for technical reasons, they were crossed with TRN markers of different colors. The quintuple mutants CGZ1475 carried uIs115 [mec-17p::TagRFP] IV, whereas the sextuple mutants CGZ1474 carried zdIs5 [mec-4p::GFP] I. As a result, we needed to use different secondary antibodies for the antibody staining, which makes the results not directly comparable.

      Polyglutamylation signal in the cell body was strongly affected by the ttll mutations. In fact, in the ttll-4(-); ttll-5(-); ttll-12(-) triple mutants, the signal is significantly reduced in the cell body of the TRNs, as well as in the cell bodies of other cells. What is surprising is that the signal in the axons persisted in the ttll triple and quintuple mutants. As the reviewers suggested, we also stained the sextuple mutants and found a pattern similar to that of the triple and quintuple mutants (new Figure 6-figure supplement 1C in the revised manuscript), although the results are not quantitatively comparable due to the use of secondary antibodies with different fluorophores.

      Writing:

      (1) The beginning of the results section is quite jarring. The information in lines 96-104 should be in the Introduction. 

      Due to the nature of this paper, each section deals with a particular PTM. We think it is helpful to discuss some background information before describing our results on each PTM rather than giving all in the introduction. Nevertheless, we modified the beginning of the results to make it more coherent and more connected with the preceding paragraphs.

      (2) Line 122-126: conclusions are not supported by the data: it is suggested from previous experiments, but authors do not look at MTs directly. 

      We have rephrased the statement to acknowledge that we drew this conclusion based on phenotypic similarity with mutants we previously examined.

      (3) I am confused by the usage of both mec-12(4EtoA) and mec-12(4Es-A). Are these the same mutations? If so, there needs to be consistency. If not, each case needs to be defined. 

      They are the same. We have corrected the mistake and are now using mec-12(4Es-A) to refer to the mutants.

      Line 105: phosphor --> phospho 

      Line 187: were --> was 

      Line 298: is --> are

      The above typos are corrected.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Recommendations For The Authors):

      I still find it really impressive that the Purkinje cell stimulation so closely mimics the pathogenic phenotypes - in my opinion, the strongest part of the paper. I would like just a little clarification on some of my previous questions.

      Major points:

      (1) Can the authors clarify where the new units came from? Are these units that were recorded before the initial submission and excluded, but are now included? If so, why were they excluded before? Or are these units that were recorded since the original submission?

      The number of units increased in Figure 1 for three reasons: 1) We have now plotted the classifier results in Figure 1 instead of the validation results, which have been moved to Figure 1 Supplement 3. 2) In response to reviewer comments, we no longer include units that had >60 s of recording in both our model creation and validation. We had previously used 30 s for creating the model and a different 30 s for validating the model, if an additional 30 s were available. 3) We changed our model creation and validation strategy based on previous reviewer comments. The new units in Figures 2-4 were taken from our pool of previously collected but unanalyzed data (we collect neural data on a rolling basis and thus these data were not initially available). We were fortunate to have these data to analyze in order to address the concerns about the number of cells included in the manuscript. The number of units increased in Figure 5 because new units were recorded in response to reviewer comments.

      (2) Why did some of the neuron counts go down? For example, in Pdx1Cre;Vglut2fl/fl mice, the fraction of units with the control signature went from 11/21 to 7/23. Is this because the classifier changed between the original submission and the revision?

      Yes, the proportion of cells matching each classification changed due to the different parameters and thresholds used in the updated classifier model.

      Minor points:

      In the Discussion: "We find some overlap and shared spike features between the different disease phenotypes and show that healthy cerebellar neurons can adapt multiple disease-associated spike train signatures." I think "adapt" should be "adopt"

      In the Discussion: "compare" is misspelled as "compared"

      Thank you for bringing these typos to our attention. We will upload a new version of the text with the typos corrected.


      The following is the authors’ response to the original reviews.

      We would like to thank the Reviewers for providing excellent and constructive suggestions that have enabled us to strengthen our overall presentation of our data. We have addressed each of the comments by altering the text, providing additional data, and revising the figures, as requested.

      Below are our explanations for how we have altered the manuscript in this revised version.

      Recommendations for the authors:

      I think you will have seen from the comments that there was great enthusiasm for the importance of this study. There were also shared concerns about how the classifier may be inadequate in its current format, as well as specific suggestions to consider to improve. I hope that you will consider a revision to really amplify the impact of the importance of this study.

      Reviewer #1 (Recommendations For The Authors):

      Distinct motor phenotypes are reflected in different neuronal firing patterns at different loci in motor circuits. However, it is difficult to determine if these altered firing patterns: 1) reflect the underlying neuropathology or phenotype, 2) whether these changes are intrinsic to the local cell population or caused by larger network changes, and 3) whether abnormal firing patterns cause or reflect abnormal movement patterns. This manuscript attempts to address these questions by recording neural firing patterns in deep cerebellar nucleus neurons in several models of cerebellar dysfunction with distinct phenotypes. They develop a classifier based on parameters of single unit spike trains that seems to do an inconsistent job of predicting phenotype (though it does fairly well for tremor). The major limitation of the recording/classifier experiments is the low number of single units recorded in each model, greatly limiting statistical power. However, the authors go on to show that specific patterns of Purkinje cell stimulation cause consistent changes in interposed nucleus activity that map remarkably well onto behavioral phenotypes. Overall, I did not find the recording/classifier results to be very convincing, while the stimulation results strongly indicate that interposed nucleus firing patterns are sufficient to drive distinct behavioral phenotypes.

      We thank the reviewer for their comments. We describe below how we have addressed the major concerns.

      Major concerns:

      (1) I don't think it's legitimate to use two 30-second samples from the same recording to train and validate the classifier. I would expect recordings from the same mouse, let alone the same unit, to be highly correlated with each other and therefore overestimate the accuracy of the classifier. How many of the recordings in the training and validation sets were the same unit recorded at two different times?

      We previously published a paper wherein we measured the correlation (or variability) between units recorded from the same mouse versus units recorded from different mice (see: Van der Heijden et al., 2022 – iScience, PMID: 36388953). In this paper we did not find that nuclei neuron recordings from the same mouse were more correlated or similar to each other than recordings from different mice. 

      Prompted by this reviewer comment, however, we did observe strong correlations between the two 30-second samples from the same recorded units. We therefore decided to no longer validate our classifier on training and validation sets that had overlapping units. Instead, we generated 12 training sets and 12 non-overlapping validation sets based on our entire database. We then trained 12 classifier models and ranked these based on their classification ability on the validation sets (Figure 1 – supplemental Figure 3). We found that the top two performing classifier models were the same, and used this model for the remainder of the paper.
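
      The idea of unit-level, non-overlapping training and validation sets can be sketched as follows. This is an illustration only: the response does not say how the 12 sets were drawn, so the function `split_units`, the 50/50 split fraction, the seeding scheme, and the unit IDs are all assumptions rather than the authors' procedure.

```python
import random


def split_units(unit_ids, train_fraction=0.5, seed=0):
    """Split unit IDs into disjoint training and validation lists.

    Splitting by unit (not by 30-s segment) guarantees that no unit
    contributes data to both sets.
    """
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical unit identifiers; the real database is not reproduced here.
unit_ids = [f"unit_{i:03d}" for i in range(100)]

# Twelve train/validation pairs, one per seed (an assumed scheme).
splits = [split_units(unit_ids, seed=s) for s in range(12)]

for train, validation in splits:
    assert not set(train) & set(validation)  # no unit appears in both sets
```

      Each of the 12 pairs could then be used to train one model and score it on its own validation set before ranking the models.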

      (2) The n's are not convincing for the spike signature analyses in different phenotypic models. For example, the claim is that Pdx1Cre;Vglut2fl/fl mice have more "control" neurons than ouabain infusion mice (more severe phenotype). However, the numbers are 11/21 and 7/20, respectively. The next claim is that 9/21 dystonic neurons are less than 11/20 dystonic neurons. A z-test for proportions gives a p-value of 0.26 for the first comparison and a p-value of 0.44 for the second. I do not think any conclusions can be drawn based on these data.

      We included more cells in our analyses and found that the z-test for the proportion of cells with the “control” and “dystonia” signature is indeed statistically significant.
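
      The reviewer's figures for the original counts can be reproduced with a standard two-sided, pooled two-proportion z-test. The sketch below assumes that this is the variant the reviewer used; the function name is illustrative, and the updated cell counts behind the authors' new result are not given in the response, so they are not tested here.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the reviewer's comment on the original submission.
z_control, p_control = two_proportion_z_test(11, 21, 7, 20)
z_dystonia, p_dystonia = two_proportion_z_test(9, 21, 11, 20)

print(f"control signature:  z = {z_control:.2f}, p = {p_control:.2f}")
print(f"dystonia signature: z = {z_dystonia:.2f}, p = {p_dystonia:.2f}")
```

      With these counts the test gives p ≈ 0.26 and p ≈ 0.44, matching the values the reviewer reports.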

      (3) Since the spiking pattern does not appear to predict an ataxic phenotype and the n's are too small to draw a conclusion for the dystonic mice, I think the title is very misleading - it does not appear to be true that "Neural spiking patterns predict behavioral phenotypes...", at least in these models.

      We have changed the title to: “Cerebellar nuclei cells produce distinct pathogenic spike signatures in mouse models of ataxia, dystonia, and tremor.” We feel that this new title captures the idea that we find differences between spike signatures associated with ataxia, dystonia, and tremor and that these signatures induce pathological movements.

      (4) I don't think it can be concluded from the optogenetic experiments that the spike train signatures do not depend on "developmental changes, ...the effect of transgene expression, ... or drug effects outside the cerebellum." The optogenetic experiments demonstrate that modulating Purkinje cell activity is sufficient to cause changes in DCN firing patterns and phenotypes (i.e., proof-of-principle). However, they do not prove that this is why DCN firing is abnormal in each model individually.

      Thank you for highlighting this section of the text. We agree that the optogenetic experiments cannot explain why the DCN is firing abnormally in each model. We have edited this section of the text to prevent this conclusion from being drawn by the readers.

      Minor points:

      (1) It would be nice to see neural recordings in the interposed nucleus during Purkinje terminal stimulation to verify that the firing patterns observed during direct Purkinje neuron illumination are reproduced with terminal activation. This should be the case, but I'm not 100% certain it is.

      We have edited the text to clarify that representative traces and analysis of interposed nucleus neurons in response to Purkinje terminal stimulation are the data in Figure 5.

      (2) How does the classifier validation (Fig. 1E) compare to chance? If I understand correctly, 24/30 neurons recorded in control mice are predicted to have come from control mice (for example). This seems fairly high, but it is hard to know how impressive this is. One approach would be to repeat the analysis many (1000s) of times with each recording randomly assigned to one of the four groups and see what the distribution of "correct" predictions is for each category, which can be compared against the actual outcome.

      We have now also included the proportion of spike signatures in the entire population of neurons and show that the spike signatures are enriched in each of the four groups (control, ataxia, dystonia, tremor) relative to the presence of these signatures in the population (Figure 1E). 
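
      The chance-level comparison the reviewer describes can be sketched as a label-shuffling test. The example below is a toy illustration under stated assumptions: the group sizes and predictions are invented (only the 24/30 control figure echoes the reviewer's example), the function names are hypothetical, and shuffling preserves group sizes, which is one common variant of the random assignment the reviewer proposes.

```python
import random
from collections import Counter


def shuffled_label_null(true_labels, predicted_labels, n_shuffles=1000, seed=0):
    """Per-group counts of 'correct' predictions after shuffling the true labels."""
    rng = random.Random(seed)
    labels = list(true_labels)
    null_counts = []
    for _ in range(n_shuffles):
        rng.shuffle(labels)  # break any link between recording and group
        null_counts.append(
            Counter(t for t, p in zip(labels, predicted_labels) if t == p)
        )
    return null_counts


def empirical_p(observed, null_values):
    """Share of shuffles at least as good as observed (add-one corrected)."""
    n_extreme = sum(v >= observed for v in null_values)
    return (n_extreme + 1) / (len(null_values) + 1)


# Hypothetical toy data, not the authors' recordings.
true_labels = ["control"] * 30 + ["ataxia"] * 20 + ["dystonia"] * 20 + ["tremor"] * 20
predicted_labels = (
    ["control"] * 24 + ["ataxia"] * 6  # 24 of 30 control units called control
    + ["ataxia"] * 20 + ["dystonia"] * 20 + ["tremor"] * 20
)

observed = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
null_counts = shuffled_label_null(true_labels, predicted_labels)
p_control = empirical_p(observed["control"], [c["control"] for c in null_counts])
print(f"control: {observed['control']}/30 correct, empirical p = {p_control:.4f}")
```

      The same empirical p-value can be computed for each of the four groups by swapping the key passed to `observed` and `null_counts`.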

      (3) I don't think this is absolutely necessary, but do the authors have ideas about how their identified firing patterns might lead to each of these phenotypes? Are there testable hypotheses for how different phenotypes caused by their stimulation paradigms arise at a network level?

      We have added some ideas about how these spike signatures might lead to their associated phenotypes to the discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) As mentioned earlier, my main concern pertains to the overall architecture and training of the classifier. Based on my reading of the methods and the documentation for the classifier model, I believe that the classifier boundaries may be biased by the unequal distribution of neurons across cerebellar disease groups (e.g., n=29 neurons in control versus n=19 in ataxics). As the classifier is trained to minimize the classification error across the entire sample, the actual thresholds on the parameters of interest may be influenced by the overrepresentation of neurons from control mice. To address this issue, one possible solution would be to reweight each group so that the overall weight across classes is equal. However, I suggest a better strategy might be to revise the classifier architecture altogether (as detailed below).

      We have retrained the classifier model based on equal numbers of ataxic, dystonic, and tremor cells (n=20) but we intentionally included more control cells (n=25). We included more control cells because we assume this is the baseline status for all cerebellar neurons and wanted to avoid assigning disease signatures to healthy neurons too easily. 
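
      For reference, the per-class reweighting the reviewer suggests, which the authors chose not to apply, is simple to compute. The sketch below uses the group sizes stated in the response and the common "balanced" weighting formula; the function name is illustrative and nothing here reflects the authors' actual classifier code.

```python
def balanced_class_weights(class_counts):
    """Weight each sample by N / (K * n_c) so every class has equal total weight."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: total / (n_classes * count)
        for label, count in class_counts.items()
    }


# Training-set sizes stated in the response.
counts = {"control": 25, "ataxia": 20, "dystonia": 20, "tremor": 20}
weights = balanced_class_weights(counts)

for label, weight in weights.items():
    print(f"{label}: per-sample weight {weight:.4f}, class total {weight * counts[label]:.2f}")
```

      Under this scheme each control cell would count slightly less than each disease-model cell, which is the opposite of the authors' stated intent to make disease signatures harder to assign to healthy neurons.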

      (2) As the authors make abundantly clear, one mouse model of disease could potentially exhibit multiple phenotypes (e.g., a mouse with both ataxia and tremor). To address this complexity, it might be more valuable to predict the probability of a certain CN recording producing specific behavioral phenotypes. In this revised approach, the output of the classifier wouldn't be a single classification (e.g., "this is an ataxic mouse") but rather the probability of a certain neural recording corresponding to ataxia-like symptoms (e.g., "the classifier suggests that this mouse has a 76% likelihood of exhibiting ataxic symptoms given this CN recording"). This modification wouldn't require additional data collection, and the exemplar disease models could still be used to train such a revised network/classifier, with each mouse model corresponding to 0% probability of observing all other behavioral phenotypes except for the specific output corresponding to the disease state (e.g., L7CreVgat-fl/fl would be 0% for all categories except ataxia, which would be trained to produce a score of 100%). This approach could enhance the validation results across other mouse models by allowing flexibility in a particular spike train parameter to produce a diverse set of phenotypes.

      This is a great comment. Unfortunately, our current dataset is too limited to fully address this comment, for the following reasons:

      - We have a limited number of neurons on which we can train our classifier. Further dividing up the groups of neurons or complicating the model limited the power of our analyses and resulted in overfitting of the model on too few neurons.

      - The recording durations (30 seconds) used to train our model are likely too short to find multiple disease signatures within a single recording. We feel that the complex phenotypes are likely resulting from cells within one mouse exhibiting a mix of disease signatures (as in the Car8wdl/wdl mice).

      We think this question would be great for a follow-up study that uses a large number of recordings from single mice to fully predict the mouse phenotype based on the population spike signatures. 

      To limit confusion about our classifier model, we have also altered the language of our manuscript and refer to the cells exhibiting a spike signature instead of predicting a phenotype. 

      However, the paper falls short in terms of the classifier model itself. The current implementation of this classifier appears to be rather weak. For instance, the cross-validated performance on the same disease line mouse model for tremor is only 56%. While I understand that the classifier aims to simplify a high-dimensional dataset into a more manageable decision tree, its rather poor performance undermines the authors' main objectives. In a similar vein, although focusing on three primary features of spiking statistics identified by the decision tree model (CV, CV2, and median ISI) is useful for understanding the primary differences between the firing statistics of different mouse models, it results in an overly simplistic view of this complex data. The classifier and its reliance on the reduced feature set are the weakest points of the paper and could benefit from further analysis and a different classification architecture. Nevertheless, it is commendable that the authors have collected high-quality data to validate their classifier. Particularly impressive is their inclusion of data from multiple mouse models of ataxia, dystonia, and tremor, enabling a true test of the classifier's generalizability.

      We intentionally simplified our parameter space from a high-dimensional dataset into a more manageable decision tree. We did this for the following reasons:

      - The parameters, even though they all measure different features, are highly correlated (see Figure 1 – supplemental Figure 2). Further, we were training our model on a limited number of recordings. We found that including all parameters (for example using a linear model) caused overfitting of the data and poor model performance.

      - Describing the spike signatures using a lower number of parameters allowed us to design optogenetic parameters that would mimic this parameter space. This would be infinitely more complex with a bigger parameter space. 

      We agree with the reviewer that the inclusion of multiple mouse models, in addition to the optogenetics experiments, provides a test of the classifier’s generalizability.
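
      The three features named in this exchange (CV, CV2, and median ISI) can be computed from spike times as sketched below. The definitions used are the standard ones (CV as the standard deviation of the inter-spike intervals divided by their mean; CV2 as the mean of 2|ISI_{i+1} - ISI_i| / (ISI_{i+1} + ISI_i) over adjacent intervals); the manuscript's exact definitions are not given here, so the population standard deviation, the function name, and the toy spike trains are assumptions.

```python
import statistics


def spike_train_features(spike_times):
    """Return CV, CV2, and median ISI for a sorted list of spike times."""
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    # CV: global irregularity (population SD of ISIs over their mean).
    cv = statistics.pstdev(isis) / statistics.fmean(isis)
    # CV2: local irregularity, comparing each ISI with the next one.
    cv2 = statistics.fmean(
        [2 * abs(b - a) / (a + b) for a, b in zip(isis, isis[1:])]
    )
    return {"cv": cv, "cv2": cv2, "median_isi": statistics.median(isis)}


# Toy spike trains in milliseconds (hypothetical, for illustration only).
regular = spike_train_features([0, 10, 20, 30, 40])
bursty = spike_train_features([0, 2, 4, 50, 52, 54])

print("regular:", regular)
print("bursty: ", bursty)
```

      A perfectly regular train gives CV and CV2 of zero, whereas the bursty train has a short median ISI but a high CV, which illustrates why the three features carry partly independent information.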

      Minor Comments:

      (1) The blown-up CN voltage traces in Figures 5C and Supplementary Figure 2B appear more like bar plots than voltage traces on my machine.

      Thank you for bringing this to our attention. We have improved the rendering of the traces.

      (2) The logic in lines 224-228 is somewhat confusing. The spike train signatures are undoubtedly affected by all the factors mentioned by the authors. What, I believe, the authors intend to convey is that because changes in CN firing rates can be driven by multiple factors, it is the CN firing properties themselves that likely drive disease-specific phenotypes.

      We agree that our discussion of the CN firing needs clarification. We have made the appropriate edits in the text.

      Reviewer #3 (Recommendations For The Authors):

      It's quite astounding that this can be done from single spike trains from what are almost certainly mixed populations of neurons. Could you add something to the discussion about this? Some questions that could be addressed: would multiple simultaneous recordings additionally help classify these diseases, or would non-simultaneous recordings from the same animal be useful? Also, more discussion about which cells you are likely recording from would be useful.

      Thank you for this suggestion. We have added discussion about multiple recordings, simultaneous vs non-simultaneous recordings, and our thoughts on the cell population recorded in this work.

      Data in figure 2 is difficult to understand - it appears that the majority of dysregulated cells in 2 ataxic models are classified as dystonia cells, not ataxic cells. This appears surprising as it seems to be at odds with earlier data from Fig 1. In my opinion, it is not discussed adequately in the Results or Discussion section.

      We have added further discussion of the ataxia models represented in Figures 1 and 2.

      Minor comment:

      The colours of the subdivisions of the bars in 2C and 3C, and the rest of the paper appear to be related to the groups in the middle (under "predicted"), but the colours are much paler in the figure than in the legend, although the colours in the bars and the legends match in the first figure (1E). Does this signify something?

      These figures were remade with the same colors across the board.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study by Prieto et al. addresses the increasingly serious problem of bacterial resistance to antimicrobial agents. This work has an important element of novelty, proposing a new approach to control antibiotic resistance spread by plasmids. Instead of targeting the resistance determinant, plasmid-borne proteins are used as antigens to be bound by specific nanobodies (Nbs). Once bound, plasmid transfer was inhibited and Salmonella infection blocked. This in-depth study is quite detailed and complex, with many experiments (9 figures with multiple panels), rigorously carried out. Results fully support the authors' conclusions. Specifically, the authors investigated the role of two large molecular weight proteins (RSP and RSP2) encoded by the IncHI1 derivative-plasmid R27 of Salmonella. These proteins have bacterial Ig-like (Big) domains and are expressed on the cell surface, creating the opportunity for them to serve as immunostimulatory antigens. Using a mouse infection model, the authors showed that RSP proteins can properly function as antigens, in Salmonella strains harboring the IncHI1 plasmid. The authors clearly showed increased levels of specific IgG and IgA antibodies against these RSP proteins in different tissues of immunized animals. In addition, non-immunized mice exhibited Salmonella colonization in the spleen and much more severe disease than immunized ones.

      However, the strength of this work is the selection and production of nanobodies (Nbs) that specifically interact with the extracellular domain of RSP proteins. The procedure to obtain Nbs is lengthy and complicated and includes the immunization of dromedaries with purified RSP and the construction of a VHH (H-chain antibody variable region) library in E. coli. As RSP is expressed on the surface of E. coli, specific Nbs were able to agglutinate Salmonella strains harboring the R27 plasmid encoding the RSP proteins.

      The authors demonstrated that Nbs-RSP reduced the conjugation frequency of R27, thus limiting the diffusion of the amp resistance harbored by the plasmid. This represents an innovative and promising strategy to fight antibiotic resistance, as it is not blocked by the mechanism that determines, in the specific case, the amp resistance of R27 but instead targets an antigen associated with IncHI-derivative plasmids. Thus, RSP vaccination could be effective not only against Salmonella but also against other enteric bacteria. A possible criticism could be that Nbs against RSP proteins reduce the severity of the disease but do not completely prevent the infection by Salmonella.

      It is true that vaccination of mice with purified RSP protein did not provide complete protection against infection with a Salmonella strain harboring an IncHI plasmid. As this finding is based on an animal model, further investigation is required to evaluate its clinical efficacy. In any case, even partial protection provided by nanobodies or by a vaccine could potentially improve survival rates among critically ill patients infected with a pathogenic bacterium harboring an IncHI plasmid. An additional beneficial aspect of our approach is that it will reduce dissemination of IncHI plasmids among pathogenic bacteria, which would reduce the presence of antibiotic resistance plasmids in the environment and in the bacteria infecting patients.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript aims to tackle antimicrobial resistance through the development of vaccines. Specifically, the authors test the potential of the RSP protein as a vaccine candidate. The RSP protein contains bacterial Ig-like domains that are typically carried in IncHI1 plasmids like R27. The extracellular location of the RSP protein and its role in the conjugation process make it a good candidate for a vaccine. The authors then use Salmonella carrying an IncHI plasmid to test the efficacy of the RSP protein as a vaccine antigen in providing protection against infection by antibiotic-resistant bacteria carrying the IncHI plasmid. The authors found no differences in total IgG or IgA levels, nor in pro-inflammatory cytokines between immunized and non-immunized mice. They however found differences in specific IgG and IgA, attenuated disease symptoms, and restricted systemic infection.

      The manuscript also evaluates the potential use of nanobodies specifically targeting the RSP protein by expressing them in E. coli and evaluating their interference in the conjugation of IncHI plasmids. The authors found that E. coli strains expressing RSP-specific nanobodies bind to Salmonella cells carrying the R27 plasmid, thereby reducing the conjugation efficacy of Salmonella.

      Strengths:

      The main strength of this manuscript is that it targets the mechanism of transmission of resistance genes carried by any bacterial species, thus making it broad.

      The experimental setup is sound and with proper replication.

      Weaknesses:

      The two main experiments, evaluating the potential of the RSP protein and the effects of nanobodies on conjugation, seem as parts of two different and unrelated strategies.

      In preparing our manuscript, we were aware that we included two different strategies to combat antimicrobial resistance. However, we deemed it valuable to include both in the paper. The development of new vaccines and the inhibition of the transfer of antibiotic resistance determinants are currently considered relevant approaches to combat antimicrobial resistance. Our intention in the article is to integrate these two strategies.

      The survival rates shown in Figure 1A and Figure 3A for Salmonella pHCM1 and non-immunized mice challenged with Salmonella, respectively, are substantially different. In the same figures, the challenge of immunized mice and Salmonella pHCM1 and mice challenged with Salmonella pHCM1 with and without ampicillin are virtually the same. While this is not the only measure of the effect of immunization, the inconsistencies in the resulting survival curves should be addressed by the authors more thoroughly as they can confound the effects found in other parameters, including total and specific IgG and IgA, and pro-inflammatory cytokines.

      Overall the results are inconsistent and provide only partial evidence of the effectiveness of the RSP protein as a vaccine target.

      To address the concerns regarding the disparities in survival rates depicted in Figures 1A and 3A, it is important to refer to several factors that contribute to these variations. Firstly, it should be noted that the data depicted in these figures stem from distinct experimental sets conducted at different times employing different batches of mice. Despite the use of the same strain and supplier, individual animals and their batches can exhibit variability in susceptibility to infection due to inherent biological differences.

      Unlike in vitro cell culture experiments, which can achieve high replicability due to the homogeneity of cell lines, in vivo animal studies often exhibit greater variability. This variability is influenced not only by genetic variations within animal populations, even if originating from the same supplier, but also by environmental factors within the animal facility. These factors include temperature variations, the concentration of non-pathogenic microorganisms in the facility, which can modify the immune responses, and the density of animals in the environment, which affects human traffic and can generate disturbances.

      When designing experiments with animals, it is desirable for the results to be consistent across different animal batches. If one bacterial strain exhibits higher mortality rates than another across multiple experimental series, this pattern should be reproducible despite the inherent variability in in vivo studies. It is more important to demonstrate consistency in trends than to focus on absolute figures when validating experimental results. 

      It is also important to clarify that when we refer to survival rates, it does not necessarily mean that the animals were found deceased. The animal procedures were approved by the Ethics Committee of Animal Experimentation of the Universitat de Barcelona, and they include an animal monitoring protocol. Our protocol requires close daily monitoring of several health and behavioral parameters, each evaluated according to specific criteria. When an animal reaches a predetermined score threshold indicating severe distress or suffering, euthanasia is administered to alleviate further suffering. At this point, biological samples are collected for subsequent analysis.

      The conjugative experiments use very long conjugation times, making it harder to assess if the resulting transconjugants are the direct result of conjugation or just the growth of transconjugants obtained at earlier points in time. While this could be assessed from the obtained results, it is not a direct or precise measure.

      In the conjugation experiments we utilized a reduced number of donor cells expressing the RSP protein and of recipient cells, as well as long conjugation times, to reflect more accurately a situation that may occur naturally in the environment. Short conjugation times are efficient in controlled laboratory conditions using high densities of donor and recipient cells, but these conditions are not commonly found in the environment. For the interference of the conjugative transfer of the IncHI plasmid we used an E. coli strain displaying the nanobody binding RSP to simulate a process that could also be scaled up in a natural environment (i.e., a probiotic strain in a livestock farm) and that could be cost effective. See discussion section, lines 326-328.

      While the potential outcomes of these experiments could be applied to any bacterial species carrying this type of plasmids, it is unclear why the authors use Salmonella strains to evaluate it. The introduction does a great job of explaining the importance of these plasmids but falls short in introducing their relevance in Salmonella.

      The prevalence of IncHI plasmids in Salmonella was indicated in the introduction section, lines 65-67. Nevertheless, we understand the reviewer’s criticisms and have modified both these sentences in the introduction section and also added comments in the results section (lines 118-128).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I understand working with mice can be challenging in terms of repeating experiments to further support the study's claims. For this reason, I think the authors need to discuss more thoroughly the following things:

      Can the authors comment on why the presence of Ampicillin leads to a lower upregulation of proinflammatory cytokines in the spleen despite harboring resistance against ampicillin?

      At the intestinal level, physiological inflammatory responses play a crucial role in enabling the host to identify foreign and commensal bacterial antigens and initiate a highly regulated and "controlled" immune response (Fiocchi, 2008. Inflamm Bowel Dis. 2008, 14 Suppl 2:S77-8). The administration of antibiotics such as ampicillin reduces the load of intestinal resident microbiota, thereby lowering the extent of intestinal immune activation. This decline in immune activation extends to systemic levels, potentially accounting for the reduced expression of proinflammatory cytokines observed in the spleen.

      There are inconsistent results in the survival rates in Figures 1A and 3A, please discuss how this could alter the observed differences in total and specific IgG and IgA, and pro-inflammatory cytokines.

      To address the reviewer's concerns regarding the discrepancies in survival rates shown in Figures 1A and 3A, and how these differences might influence the observed variations in total and specific IgG and IgA, as well as pro-inflammatory cytokines, it is important to clarify the terminology used in our study. In our context, "survival" does not solely refer to mortality per se, but encompasses the endpoints defined by our animal welfare protocols, which are rigorously supervised by the Animal Experimentation Ethics Committee of the University of Barcelona. Our protocol mandates close daily monitoring of several health and behavioral parameters, each scored according to specific criteria. When an animal reaches a predefined score threshold indicating severe distress or suffering, euthanasia is conducted to prevent further distress, at which point we collect biological samples for analysis.

      In contrast to in vitro cell culture experiments, which often achieve high replicability thanks to the homogeneity of cell lines, in vivo animal studies frequently display greater variability. This variability stems not only from genetic differences within animal populations, even if originating from the same supplier, but also from environmental factors within the animal facility. These factors encompass variations in temperature, the presence of non-pathogenic microorganisms in the facility (capable of altering immune responses) and the density of animals, which can impact human traffic and potentially lead to disturbances. 

      The experiments depicted in Figs. 1A and 3A were separated in time, and hence may be influenced by environmental factors within the animal facility. Nevertheless, in the comparative analysis performed between immunized and non-immunized animals, experiments were performed simultaneously and hence under similar environmental conditions in the animal facility. For several parameters (i.e., immunoglobulins and proinflammatory cytokines) statistically significant differences were observed. 

      Regarding the conjugation assays, it is not entirely clear to me why the conjugation times are so long. It would be beneficial to have more data about the conjugation efficacy between the donor and recipient without any E. coli expressing the nanobodies at different time intervals. This would help to differentiate between transconjugants and transconjugants obtained from early conjugation events.

      This comment is partially answered in a previous response, regarding the numbers of donor and recipient cells and duration of conjugation. We note here that in fig. 9, the requested experiment with donor and recipient cells without E. coli interferent cells is already present, corresponding to the label “none”. To avoid confusion, we have modified the legend in fig. 9.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      1. The authors conclude that RFP-Ac expression is restricted to emerging SOPs and surroundings cells at 18h APF, indicating that Ac is activated later than Sc. Can the authors provide images for RFP-Ac expression at 10h and 16h APF similar to GFP-Sc as shown in their figures. Do the SOPs that contain high levels of both Ac and Sc (as some SOPs have Sc expression but not Ac) undergo fate divergence and SB faster than the SOPs containing higher levels of only Sc?

      We now show the expression pattern of GFP-Sc and RFP-Ac/GFP-Ac in fixed samples also stained for E-cad at 13h, 16h and 18h APF (Fig 1I-K' and Fig S1E-G'). Ac and Sc were found to be activated around the same time. However, Ac appeared to accumulate at lower levels than Sc prior to SOP selection in the central domain of the ADHN (Fig 1J-K'). We also confirmed that Ac was more strongly expressed in SOPs. Additionally, SOPs appeared to accumulate both Ac and Sc, i.e. SOPs with high levels of GFP-Sc also showed a strong RFP-Ac signal (Fig S1H-H'). Finally, since RFP-Ac was not detectable in living pupae, possibly due to the rapid turn-over of Ac and the slow maturation of RFP, we could not study more precisely the relative dynamics of Ac and Sc. For the same reason, we could not address whether the rate of fate divergence (measured using GFP-Sc) varied with the level of Ac.

      2. It would be interesting to see the spatial and temporal dynamics of Ac and Sc in Notch mutants or even Notch dynamics in Sc and Ac mutants to better understand the progression fate divergence and its effect on lateral inhibition in real time.

      Following the reviewer's suggestion, we examined the expression pattern of NRE-deGFP, a Notch activity reporter, in ac sc double mutant pupae at 16h and 24h APF (Fig S3A-D). This showed that the initial pattern of NRE-deGFP at 16h APF (signal detected in posterior ADHN cells as well as in the ADHN) did not depend on Ac and Sc. By contrast, the second phase of NRE-deGFP expression (in cells of the proneural ADHN domain, around emerging SOPs) was found to depend on the activity of Ac and Sc. Thus, strong Notch activation observed in cells surrounding emerging SOPs was found to depend on the activity of Ac-Sc, presumably because Ac and Sc are required for SOP specification and SOPs produce Delta, serving as the local source to activate Notch (see also our response to reviewer 3, point #6). Thus, since NRE-deGFP was not up-regulated in the proneural ADHN domain of sc10-1 ac3 mutant pupae, a quantitative analysis of the dynamics of NRE-deGFP may not be informative.

      The reviewer also suggested that we study the dynamics of GFP-Sc in Notch mutants. One can easily predict that most Notch mutant cells would accumulate GFP-Sc, as observed in the notum (PMID: 28386027). Therefore, analysis of fate symmetry breaking is unlikely to be useful in that context. Likewise, an FDI analysis would not be relevant. From a technical point of view, live imaging of GFP-Sc would have to be performed in Notch mutant clones. This is because RNAi against Notch (strong 10xUAS-Notch hp2 construct, PMID: 19487563) driven by escargot-Gal4 to knock down Notch in larval histoblasts only led to a partial loss of Notch function (our unpublished data). Generation of Notch mutant clones in the abdomen would require constructing an appropriate GFP-Sc Notch FRT recombinant chromosome as well as generating a new FRT GFP-Sc chromosome with an infrared marker (not currently available) to compare the relative dynamics of GFP-Sc in wild-type and mutant cells. In sum, this proposed experiment would take a significant amount of time and is unlikely to shed new light. Given that this experiment is not essential to support the claims of the paper and that it is not clear to us what would be learnt from it, we opted not to perform it.

      Minor comments

      1. In figure 1F and F', the authors mention GFP-Sc is not expressed prior to 14h, however, there is still GFP signal detected in their imaging. Can the authors comment what would be the cause of this GFP signal or was it due to non-specific background signal during their imaging analysis?

      We thank the referee for raising this issue. Yes, a strong autofluorescence signal was detected prior to the onset of GFP-Sc expression. We provide below the results of our analysis of the autofluorescence signal (Fig R1) relative to the nuclear signal (Fig R2), and how normalization of the signal was used to measure the specific GFP-Sc signal.

      Analysis of the autofluorescence signal over time

      To estimate the autofluorescence signal, we measured the average intensity of the signal acquired in the GFP channel for each frame and plotted these values over time. The results are shown in Fig R1 below:

      Fig R1: temporal profile of the autofluorescence signal

      Each measurement corresponds to the average intensity measured in the GFP channel over the entire field at each z-section and for each time point. Mean and SD values are shown over time in black and grey, respectively. Time is in frame number (dt is 2.5 min). The data shown above correspond to movie 1 (see also Fig 2).

      This plot indicates that the autofluorescence signal was progressively bleached. We therefore excluded from our analysis the first 50 time points when the autofluorescence signal was initially strong. No nuclear GFP-Sc signal was detectable in these first 50 frames in the cells of the central area of the ADHN which are studied here (see Fig 2A', t=1:12, time frame #29).

      While revising the manuscript, we realized that t=0 corresponded to two distinct time points in the first version of our manuscript: it corresponded to the onset of imaging in Fig 2A-D', and to t=2:08 (time frame #51) in all other figures showing data following removal of the first 50 time points. We have now fixed this issue and are presenting all data with t=0 corresponding to the onset of imaging.

      Analysis of the nuclear fluorescence signal over time

      To detect the nuclear GFP-Sc signal, we measured the average intensity of the signal acquired in the GFP channel (raw intensity values corresponding to the sum of the GFP-Sc and autofluorescence signals) in segmented nuclei (in 3D, within the entire z-stack). These values were plotted over time (pink curve in Fig R2 below; the autofluorescence is plotted in black, as in Fig R1, for the sake of comparison). This showed that the intensity of the signal measured in nuclei was initially identical to the mean intensity measured across the entire field of view, indicative of autofluorescence only. A specific increase in signal intensity in nuclei (relative to the entire field of view) was detectable after 2h of imaging (time frame 48 in Fig R1; dt is 2.5 min). Importantly, mean intensity values of the autofluorescence signal appeared to be approximately 10-fold stronger than the mean intensity associated with the nuclear GFP-Sc signal.

      Fig R2: temporal profile of the GFP-Sc signal

      The plot in pink corresponds to the average intensity in the GFP channel (raw intensity values corresponding to the GFP-Sc and/or autofluorescence signals) per nucleus (within the entire z-stack) for each time point. Mean and SD values measured in each nucleus are shown over time (in pink; these data correspond to movie 1; shown also in Fig 3). This plot (pink) should be compared with the plot shown in Fig R1 (also in black in Fig R2). The intensity difference between the pink and black curves was attributed to the specific GFP-Sc signal.

      Signal normalization and analysis of the GFP-Sc signal

      In our study, we normalized the GFP-Sc signal by dividing the averaged value measured in each single nucleus (data corresponding to the pink curve in Fig R2) by the mean value of the signal measured at the same time point in the same channel in the entire image stack (data corresponding to the black curve in Fig R1/R2). Given the low intensity of the GFP-Sc signal, and the small number of pixels corresponding to Scute-expressing nuclei over the entire field of view, this mean value should closely reflect the autofluorescence noise. Thus, the normalized background autofluorescence signal should be close to 1. This was experimentally verified by measuring the normalized intensity values of the PDHN nuclei that did not express Scute: a mean intensity value of 0.96 +/- 0.10 was measured (at time frame #51; see Fig R1 below). In contrast, the normalized GFP-Sc values measured several hours before SB were found to be close to 1.1 (see Fig 3D). Whether these values reflect very low levels of nuclear GFP-Sc that cannot be detected visually or result from imperfect normalization of the signal remains unclear. Given the intensity and non-uniformity of the autofluorescence signal, we cannot exclude the latter. For this reason, we chose not to over-interpret the initial low intensity values of GFP-Sc.
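
      The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and all intensity values are hypothetical, chosen so that the ratios land near the 0.96 and 1.1 values quoted in the response; it is not the analysis code used for the manuscript.

```python
from statistics import mean


def normalize_nuclear_signal(nuclear_pixels, field_pixels):
    """Divide the mean intensity of one segmented nucleus by the mean
    intensity of the whole field at the same time point (same channel)."""
    field_mean = mean(field_pixels)
    if field_mean == 0:
        raise ValueError("field mean intensity is zero; cannot normalize")
    return mean(nuclear_pixels) / field_mean


# Hypothetical raw GFP-channel intensities for a single time point.
field = [100.0, 102.0, 98.0, 100.0]    # whole field: mostly autofluorescence
non_expressing = [96.0, 95.0, 97.0]    # nucleus without GFP-Sc
expressing = [110.0, 112.0, 108.0]     # nucleus with weak GFP-Sc

background_ratio = normalize_nuclear_signal(non_expressing, field)
expressing_ratio = normalize_nuclear_signal(expressing, field)
print(round(background_ratio, 2), round(expressing_ratio, 2))
```

      With this kind of normalization, a nucleus carrying only autofluorescence gives a ratio near 1, and any specific signal appears as a small excess over 1.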

      In the materials and methods, the authors mention that prior to imaging the larvae and pupae are grown at 18, 21 or 25°C. Is there a reason why the larvae and pupae are grown at different temperatures for different experiments? Can the authors specify (i.e. in the figure legends) in which experiments different temperatures were used?

      Larvae and pupae were grown at different temperatures for convenience, i.e. to adapt the time interval between staging at 0h APF and mounting for live imaging. Indeed, it is much easier to obtain 10-14h APF pupae by collecting staged pupae at 0h APF the day before and incubating them overnight at lower temperature to slow down development. However, all live imaging experiments were performed at 23-25°C, and we have no reason to think that this prior incubation would affect the process studied here.

      The citations need to have a better format as they show up as each citation within a single bracket which makes it a little hard to read when multiple references are cited in a single sentence. fixed

      In the abstract, the sentence 'Unexpectedly, we observed at low frequency (10%) pairs of cells that are in direct contact at the time of SB'. SB should be replaced with "Symmetry breaking", as it appeared for the first time in the manuscript and should be written out in full. fixed

      Throughout the manuscript there are instances where the abbreviations are written in full with the abbreviation in brackets after they have already been introduced in the introduction which can be changed to just the abbreviation itself. fixed

      In the discussion on page 11, 'our observation...', our needs to be changed to Our. fixed

      7. It would be nice to have arrow heads or dotted lines around the cells or areas on interest in both, all the figures and movies, so that it will be easier to follow the results. The videos have a lot of background due to fragmented apoptotic nuclei, etc. as mentioned by the authors, hence arrow heads or dotted lines would bring viewers focus on the areas of interest.

      fixed (see for instance Fig 1D, Fig 2A, Fig 5B, Fig 7A, Fig S3D, etc...)

      8. It would be helpful to have anterior - posterior axis (i.e. with an arrow) shown on top of all the figures.

      In our earlier version, we indicated that 'In this and all other figures, dorsal is up and anterior is left' in the legend of Fig 1B. We have now moved this sentence to the end of the Fig 1 legend to make it more apparent. Additionally, the AP axis is now clearly indicated in Fig 1C. We believe that it is not necessary to repeat this orientation in all figures.

      Scale bars are missing in all figures, videos, and figure legends. Added

      Only movies 1 and 3 are referenced in the text. All movies are now referenced in the text

      Keeping the colors in the movies and figures consistent and same would be helpful. For example, Movie 2 Histone3.3-mIFP marker is in blue but in figure 3 it is in magenta. fixed (H3.3-mIFP in magenta in this movie, now numbered 3)

      As mentioned above, it would be helpful if the authors have arrow heads or dotted lines around the cells or areas of interest in both the figures and movies for better representation of their data. For example, movie 1 shows a larger area of imaging than shown in figure 2A, which makes it hard to follow the cells of interest in the movie.

      An additional movie corresponding to the SOP shown in Fig 2A is now provided (new movie 2).

      --

      Reviewer #2

      1. Despite "symmetry breaking" being the main focus of the paper, in the Introduction, the authors do not explain what this term means and do not provide any description of this process. This is a critical point that makes understanding of the goals of the paper difficult. Therefore, the authors are encouraged to provide more information and a clear description of this term/phenomenon.

      We thank the reviewer for this suggestion. We now state in the introduction what symmetry breaking means in the context of lateral inhibition: 'To describe and study the process of SOP selection, we studied fate SB. The latter refers to the transition point when one cell, the future SOP, starts to stably accumulate a higher level of GFP-Sc relative to its immediate neighbors.'

      The role of Achaete in the story is not clear. Even though both factors are required for SOP determination, the authors mainly focus on Scute, so it is not very clear what the role of Achaete in this process is, if there is any. As shown in the paper, Achaete is expressed later when heterogeneity is promoting cell fate divergence. Is Achaete maybe contributing to cell heterogeneity/ cell fate divergence?

      We thank the reviewer for raising this point. We now show in Fig S1A-D that abdominal bristles develop in a protein null allele of sc (scM6) as well as in an ac mutant corresponding to a 45 kb deletion that removes ac but not sc (PMID: 16216235). Together with our analysis of sc10-1 ac3 mutant flies, we can now conclude that Sc and Ac act redundantly for SOP specification in the pupal abdomen. We have also further studied the expression of Ac relative to Sc and E(spl)HLH-m3 (see our response above to point #1 of reviewer 1). We fully agree with the reviewer that cell-to-cell variations in Ac expression might contribute to proneural heterogeneity and SB. This is now briefly discussed.

      Minor points:

      1. Symmetry Breaking (SB) should be abbreviated in the Abstract. The authors initially use the full term without abbreviation, and only on page 5, the abbreviation is finally defined; however, it should be introduced much earlier.

      fixed

      The second-to-last sentence in the abstract, "These lateral inhibition defects were correlated via cellular rearrangements," is unclear regarding what defects the authors are referring to.

      This sentence was rewritten: 'Live imaging showed that these patterning defects were corrected via cellular rearrangements associated with global tissue fluidity, not via cell fate change.'

      For clarity, being more specific in the text in regards to description of the figure panels would be beneficial (e.g. page 3 Fig 1C-E); referring to C-E together makes it hard to understand what each panel shows.

      fixed

      In many instances, the movies are not properly referenced (e.g. on page 5, third row simply states "movies"), making it difficult to discern which movie should be checked. On page 8, when authors refer to movie 3, they likely meant movie 5.

      fixed

      Figure S1 requires some corrections.

      We thank the reviewer for helping us improve the presentation of our results.

      The authors use the short name "scute" initially and then switch to the shortened version "sc".

      fixed

      Additionally, the nlsRFP (blue) is difficult to see; adjusting the levels or changing colors/showing separate channels may improve visibility.

      The authors mention clone borders, but none are shown. It would greatly help to outline the borders in all figures.

      The ubiquitous nlsRFP marker is now shown in magenta in Fig S1I that now shows only 2 channels to outline the ADHN (white dotted line) and the clones (yellow dotted lines).

      We also outlined the clone borders in Fig 4C,C'.

      Genotypes of the samples should be indicated, and clarification is needed regarding what "n" represents (number of cells, clones, or flies).

      The genotype studied in Fig S1 and Fig 4 (which is the only complex genotype studied here) is now indicated in the Methods section. We have clarified what the different 'n' meant, in Fig 4 (see text) and elsewhere (see legend of Fig S2 for instance).

      What do the arrows in the panel B show?

      Thanks for pointing this out. The arrows in Fig S1I' indicate Cut/Hnt-positive cells (SOPs) within the clones (as now explained in the legend).

      It is also recommended to display important channels as separate black and white images.

      Separate channels are now shown in Fig S1 and S3.

      Additionally, the use of RNAi against GFP instead of RNAi against scute should be justified; using RNAi GFP as the genotype on the graph could be interpreted as a control genotype rather than downregulation of scute.

      An RNAi construct against GFP was used because this construct was known to be very efficient and specific. Indeed, a strong knock-down of GFP-Sc was obtained by this approach (see Fig 4C'). We did not test sc RNAi constructs in the context of GFP-Sc. To avoid confusion, we are now indicating Sc downregulation (gfp RNAi) in Fig 4C'.

      In the Figure 2 Legend, the authors use "std" as an abbreviation to define standard deviation. Typically, this is abbreviated as SD.

      fixed

      In Figure 4E, the authors do not explain why there are points on the x-axis that correspond to a decimal number of cells.

      Since heterogeneity was calculated over a 20 min interval, we likewise calculated the number of neighbors over the same time interval. Thus, the number of neighbors for each SOP corresponds to an averaged value calculated over this time interval. This is now explained in the legend.

      --

      Reviewer #3

      1. First and foremost, the authors should state in the first paragraph of the Results that scGFP is a CRISPR knockin and thus it's the only source of Sc protein in the animals imaged (this is stated only in the Methods section).

      Thanks for this comment, we agree that this is one of the strengths of our work that we should emphasize. We now state in the results section: 'GFP-Sc is produced from the endogenous locus such that all Sc molecules produced in these pupae are GFP-tagged'.

      The magnitude of the Sc increase should be commented on. Based on the intensity and FDI plots in Fig. 3B-E, an increase of 15-17% in the amount of Sc is suggested (the FDI plateaus at 0.08, which gives 1.08/0.92 = 1.17x the level of Sc in the SOP vs the surrounding cells). However, in the stills shown in Fig. 2BCD and in Fig. 3A, the intensity differential between SOPs and neighbors seems at least 100% (ie at least double the intensity, which would yield an FDI of >1/3 =0.33). Why is this high contrast never seen in the quantitative measurements?

      Thanks for asking about the fold change of GFP-Sc levels in SOPs, from SB to its plateau. This fold change can be seen in Fig 3D: the normalized value of GFP-Sc is 1.12 at SB, and 1.26 three hours after SB (when the FDI plateaus), indicative of a 2.2-fold increase of GFP-Sc in SOPs (0.26/0.12 = 2.2, following subtraction of the normalized background value of 1; see our detailed response to reviewer #1, minor point 1, about background signal analysis and normalization of the signal). This fold-change value is now indicated in the legend of Fig 3D. Obviously, this fold-change value is highly sensitive to signal normalization. Since the autofluorescence signal was stronger than the GFP-Sc signal (see Fig R2 above) and varied over time (due to bleaching; see Figs R1 and R2 above), we feel that this fold-change value should be taken with a grain of salt.
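
      The arithmetic in this exchange can be reproduced with a short Python sketch. Two assumptions are made here and are not confirmed by the text: the normalized background is taken as exactly 1 (or 0.96 in the alternative case), and the FDI is taken to be (a - b) / (a + b) for an SOP intensity a and a neighbor intensity b, which is the form implied by the reviewer's calculation. The function names are hypothetical.

```python
def fold_change(value_start, value_end, background=1.0):
    """Fold change of a normalized signal after subtracting the background."""
    start = value_start - background
    end = value_end - background
    if start <= 0:
        raise ValueError("starting value must exceed the background")
    return end / start


def ratio_from_fdi(fdi):
    """SOP/neighbor intensity ratio, assuming FDI = (a - b) / (a + b)."""
    return (1 + fdi) / (1 - fdi)


def fdi_from_ratio(ratio):
    """Inverse of ratio_from_fdi under the same assumed FDI definition."""
    return (ratio - 1) / (ratio + 1)


# Values quoted in the response: 1.12 at SB, 1.26 three hours later.
print(round(fold_change(1.12, 1.26), 2))                    # background of 1
print(round(fold_change(1.12, 1.26, background=0.96), 3))   # background of 0.96

# Values quoted by the reviewer: FDI plateau of 0.08, and a 2x contrast.
print(round(ratio_from_fdi(0.08), 2))
print(round(fdi_from_ratio(2.0), 2))
```

      Changing the assumed background from 1 to 0.96 moves the fold change from about 2.2 to about 1.9, which illustrates why the value depends so strongly on normalization.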

      From Fig. 2A-D it appears that the ScGFP fluorescence intensity is at the same level or weaker than nearby autofluorescence. Please state (1) how you confirmed that the histoblast nest has lower autofluorescence than the larval epidermis and (2) how you corrected for histoblast nest autofluorescence in your quantifications.

      As detailed above (our response to reviewer #1, minor point 1), the specific GFP-Sc signal is ten times lower than the autofluorescence signal. We did not compare the autofluorescence signal produced by larval and imaginal cells (but note that larval epidermal cells had a stronger autofluorescence signal; see the yellow dots in Fig 2A). Normalization of the signal to correct for autofluorescence was explained in the Methods (and is also detailed above in our response to reviewer #1, minor point 1).

      The paradoxical result of Fig. S1B should be discussed. On the one hand it is stated that "Ac and Sc specify the fate of the Sensory Organ Precursor cells (SOPs)" (p.2) and on the other S1B shows SOP specification in the absence of Sc. Are the SOPs shown in Fig S1B rare exceptions? Do the authors believe that these rare exceptions are there because of inefficient RNAi (since in comparison with S1A, in the null condition almost no SOPs should be formed)? Or are the SOPs in RNAi clones as rare as the occasional bristles in S1A?

      We do not see the result of Fig S1B as paradoxical but interpreted this result assuming that Ac and Sc were redundant for SOP determination. We now provide clear genetic evidence in support of this view (see our response above to reviewer #2, point 2). Otherwise, we found that RNAi is efficient (see loss of the GFP signal in the clone in Fig. 4C'). In adult males, the density of bristles appeared to be quite normal over clonal patches of gfp RNAi cells (not shown), consistent with Ac being redundant with Sc.

      One figure that is not straightforward to interpret is Fig. 4E. It plots ScGFP heterogeneity vs. number of RNAi neighbors. Each point in the plot must be an individual SOP (165 total). Therefore, its neighbors (the x-axis) should take integral (not decimal) values. How can a single SOP have a decimal number of RNAi neighbors, especially since heterogeneity was sampled over a 10min time-window, when not much cell rearrangement can take place? Please explain.

      Since heterogeneity was calculated over a 20 min interval, we likewise calculated the number of neighbors over the same time interval. Thus, the number of neighbors for each SOP corresponds to an averaged value calculated over this time interval. This is now explained in the legend: 'Note that the number of neighbors was likewise calculated over this time interval, and the resulting number of neighbors may not take an integral value.'

      I found the discussion of the Notch reporter dynamics (Fig. 7) confusing in several places. (6a) Whereas it's clear that there is plenty of Notch signaling going on before SBN, the authors repeatedly imply that Notch signaling starts after SBN. For example, in the Results (p.9) they state "Thus, this quantitative approach failed to detect a phase of reciprocal Notch signaling during which proneural cluster cells would both send and receive a Delta-Notch signal prior to SOP emergence." The fact that the NRE-deGFP gave a robust signal before the start of the movies clearly means that mutual inhibition was going on for quite some time before SB. In fact, an FDI of 0 for >4h prior to SBN (Fig. 7G) means exactly this: that the level of Notch response among the cluster cells is equivalent ("mutual inhibition" lasts for at least 4h before SBN). (6b) In the first paragraph of this section (p.8) they comment that the pre-existence of Notch signaling is unexpected - why? I interpret it to simply be mutual inhibition (see above). Then they go on to quantitate the average Notch response intensity over the entire posterior ADHN (please define the borders of the "posterior" ADHN). I question the informational value of this analysis (averaging over a large region), when Notch signaling is known to have intense local cell-to-cell variability (also evident in the stills shown in Fig. 7A,B,C).

      We apologize for not describing well enough the data shown in Fig 7E, and for not explaining clearly our interpretation of the NRE-deGFP signal.

      While the observation of a strong NRE-deGFP signal indeed indicates that Notch signaling had been active prior to the time of observation (in this sense, Notch is indeed active long before SBN), this does not necessarily imply that Notch is still active at that time. This is because the deGFP protein produced by the NRE-deGFP reporter is stable relative to the time scale of the studied process. Its measured half-life in S2 cells cultured at 25°C is 2h (PMID: 31140975). Based on this data, the NRE-deGFP signal is likely to remain detectable several hours after Notch signaling has been switched off. If the rate of production of deGFP is lower than its rate of degradation, then the NRE-deGFP signal is expected to progressively decay over time. We believe that this is what we observed in our movies: while a strong signal was detected over the posterior half of the ADHN at 14-15h APF, this signal decreased over time (Fig 7D). To interpret the temporal dynamics of NRE-deGFP signal in terms of instantaneous Notch activity, we examined the Rate of Change (ROC): an increase of the NRE-deGFP signal over time (positive ROC) would indicate that Notch activity is increasing (more precisely, the production rate of deGFP is higher than its rate of degradation), whereas a decrease (negative ROC) indicates that Notch becomes less active (or inactive if the rate of decrease approximates the decay rate of the deGFP protein). Our data shown in Fig 7D showed that the NRE-deGFP signal (measured in the area indicated with a dotted line in Fig 7A,B; this area was defined by the initial pattern of NRE-deGFP) decreased over time (negative ROC) between t=1 and t=6.5h. We therefore conclude that Notch signaling is decreasing to reach a minimum at t=~3.5h, indicating that the level of Notch activity is at its lowest around the time of SB.
      At this minimum, the decay rate corresponds to a protein half-life of 4.4h, which is not so different from the measured half-life of deGFP in S2 cells (particularly if one assumes a 1.4x difference between the decay rates measured at 22 and 25°C, based on the known temperature-dependent speed of development). This is why we conclude that Notch signaling is very low at this stage. Additionally, no NRE-deGFP signal was detected before t=4:30h (movie 7) in the initially NRE-deGFP negative cells (located anterior to the area indicated with a dotted line in Fig 7A). This indicated that Notch was activated late in this area. Together, our observations are not consistent with the view that Notch mediates a strong mutual inhibition signal over a prolonged time interval prior to SB.
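
      The half-life reasoning above can be sketched numerically. This is a minimal illustration that assumes simple first-order kinetics (dS/dt = production - k * S), an assumption made here for clarity rather than a model stated in the response; the function names are hypothetical.

```python
import math


def decay_rate(half_life_h):
    """First-order decay constant (per hour) for a given half-life in hours."""
    return math.log(2) / half_life_h


def half_life(rate_per_h):
    """Half-life in hours for a given first-order decay constant."""
    return math.log(2) / rate_per_h


def rate_of_change(signal, production, half_life_h):
    """ROC of the reporter signal under dS/dt = production - k * S (assumed model)."""
    return production - decay_rate(half_life_h) * signal


# Measured half-life in S2 cells at 25 degrees C (from the response above).
k_25 = decay_rate(2.0)
# Assumed 1.4x slower decay at 22 degrees C, as proposed in the response.
k_22 = k_25 / 1.4
expected_half_life_22 = half_life(k_22)

# With no production at all, the signal can only decay (negative ROC).
roc_without_notch = rate_of_change(signal=1.0, production=0.0, half_life_h=2.0)
```

      Under these assumptions the expected half-life at 22°C is 2 h x 1.4 = 2.8 h, which is the value the apparent half-life of 4.4 h is being compared against.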

      To further study the pattern of Notch activity, we have monitored over time the accumulation pattern of GFP-tagged E(spl)m3-HLH (GFP-m3) (PMID: 31375669) in fixed samples (Fig S3F-G'). This confirmed that Notch was active in posterior ADHN cells and in the PDHN prior to 14h APF, i.e. prior to the onset of Ac and Sc, and that Notch activation extended to the central ADHN domain at 17-18h APF (Fig S3E-E' and G-G', and Fig 7I-I''), coinciding with SOP emergence.

      Otherwise, the reviewer is correct when stating that an FDI value close to 0 indicates that the level of measured fluorescence among the different cells of the considered cluster is similar. Such an FDI value would be measured if cells did not express NRE-deGFP or had decreasing but similar levels of NRE-deGFP. This FDI value does not, per se, imply that Notch is active.

      And then they move on to a (much more informative) cell-by-cell analysis, without even changing paragraphs, making it hard for the reader to follow. (6c) The conclusion at the end of the second paragraph (p. 9) "It also showed that SB was detected soon after the onset of Notch-mediated inhibitory signaling." is nowhere supported by data. If I understand well, SB refers to Sc and "the onset of Notch-mediated inhibitory signaling" refers to SBN (which is the onset of ASYMMETRY in Notch signaling, not the onset of Notch signaling, which has been going on for hours earlier). I don't see any data comparing SB with SBN. In fact, this is an important question to address (see below - comment 10).

      We apologize for the lack of clarity in our writing. We meant: "It also showed that SBN was detected soon after the onset of Notch-mediated inhibitory signaling."

      Yes, SBN refers to the onset of asymmetry in Notch signaling, as measured using NRE-deGFP. As explained above (but see also our response to point #7 below), our data do not provide evidence for a detectable Notch signal prior to SBN.

      We agree that comparing SB and SBN would be nice. Unfortunately, our current tools do not permit a detailed comparison (see our detailed response below, point #10).

      Mutual inhibition amongst neighboring cells has been proposed to involve (besides mutual Notch signaling) an increase in Sc levels in 2-3 cells in a cluster before the singularization of a single SOP. The authors seem rather biased against such a transient Sc hike based on their results in Fig. 2D, where the neighboring cells stay at rather constant basal Sc levels for several hours, while the Sc SB event happens. However, looking at an individual SOP in Fig. 2B, I do detect a mild hike in the pink curve right around SB in the blue curve. Could the average result from 160 SOPs (in Fig 2D) simply blur such transient Sc hikes, if they happen with different kinetics for different SOPs? Couldn't the 10% of SOP twins (shown in Fig. 6) represent a special case of this transient "subcluster" Sc hike? I would appreciate some discussion on this point. [Whether Sc is transiently upregulated or not, however, does not change my firm conclusion - from the data presented - that Notch-mediated mutual inhibition has been going on long before SBN.]

      First, our data are consistent with the notion that a few proneural cells progressively accumulate higher levels of Scute prior to SB (as proposed above). Indeed, the moderate increase in both GFP-Sc levels and coefficient of variation values (GFP-Sc heterogeneity) seen prior to SB corresponds to what the reviewer has in mind (higher levels of GFP-Sc in a few proneural cluster cells). We also appreciate the reviewer's comment about the plot shown in Fig 2D. However, we strongly feel that our quantitative analysis of a large dataset is a strength. Thus, we do not find it useful to discretize a continuous process by introducing the notion of 'subclusters' of 2-3 cells. Likewise, we believe that it is more informative to focus our analysis on the entire dataset using average and SD values, and we do not wish to base our interpretation of the process on selected tracks (the one shown in Fig 2B only served as an illustration of how we performed our analysis and has no interpretation value).

      The reviewer also states that "mutual inhibition amongst neighboring cells has been proposed to involve an increase in Sc levels in 2-3 cells in a cluster before the singularization of a single SOP". Since there is no published description of the pattern of accumulation of Scute in abdominal histoblasts (to the best of our knowledge), we hypothesize that this statement applies to the proneural clusters in the developing wing disc. This is because the accumulation pattern of Sc has been studied in detail in that context by the Modolell and Carroll labs (PMID: 2044965, PMID: 2044964). However, their description of the accumulation pattern of Scute (in fixed samples, using anti-Sc antibodies) did not refer to sub-clusters of 2-3 cells. We would appreciate it if the reviewer could direct us to the relevant published observation.

      Finally, we are not sure that we follow the reviewer's firm conclusion, drawn from our data, that Notch-mediated mutual inhibition has been going on long before SBN. Instead, our data clearly showed that the ADHN region that produced SOPs exhibited two distinct NRE-deGFP patterns, with Notch signaling being active prior to imaging (i.e. prior to 14h APF) and decreasing to reach a minimum of Notch activation around 17h APF (i.e. around the time of SB, as determined by GFP-Sc imaging) in the posterior area of the ADHN.

      Thus, our data do not show that mutual inhibition does not take place in this tissue but rather imply that the phase of mutual inhibition (or competition) must be relatively short, or transient, and that competition amongst proneural cluster cells operates at low Notch and Sc levels (probably contrary to what many people have in mind).

      Some minor points: 8. Please change Cad-GFP to Ecad-GFP or shg-GFP, as Cad misdirects to caudal.

      Thanks, changed to Ecad-GFP and Ecad-mKate.

      What is c in "(x,y,z,c,t) movies"? (a fifth independent variable?)

      c stands for channel. This is relatively standard nomenclature.

      The authors show that Sc displays a SB event leading to FDI of 0.08 and the Notch reporter displays another SB (SBN) leading to a much more pronounced FDI of -0.2. Are these two events (the hike of Sc levels and the plummeting of Notch signal) contemporaneous or does one precede the other? Having both tagged with GFP makes it impossible to image simultaneously, but the authors could register each reporter's dynamics relative to the time of SOP division (as done in Fig. 5C) to get a sense of their relative order.

      We do agree with the reviewer that it would be nice to be able to align in time these two data sets. Unfortunately, the temporal correlation between SB and the SOP division is too variable (4.7 +/- 1.1) to confidently align these two datasets using this event as a time reference. New tools are needed (see our response to point #11 below).

      Where in the above timeline is the SOP fate definitively adopted? neur-nlsGFP, Ac-RFP, m3Cherry and Sens detection in Figs. 1 and 7 give us a rough idea that these other markers appear around the time of Sc FDI peaking, around 3h after the initial SB. But this is not presented in an organized fashion - the reader collects this information sporadically. A reanalysis of the already existing data attempting to place these various markers in an integrated timeline would be of great importance in understanding the details of this cell fate specification process. Which is the earliest SB event? sc, neur or Notch? How long does it take from that early SB until definitive SOP markers (Sens) first appear?

      We agree with the reviewer that it would be interesting to extend the approach reported here for Scute to characterize SB and rate of FDI for other key factors governing the selection of SOPs. As pointed out by the reviewer (point #10 above), it would also be important to register in time these various events. Unfortunately, the maturation of RFP, mCherry, FP670, etc. appeared to be too slow relative to the rapid turnover of the Ac, Sc and E(spl)-HLH factors, which prevented us from performing two-color imaging. Hence, current tools do not permit us to determine which is the earliest SB event.

      More genetic perturbations could be performed to solidify the model of cell-cell communication during lateral inhibition. Two obvious ones come to mind: (a) How would the Sc-GFP dynamics change in a Notch-RNAi background? (b) How would the NRE-deGFP dynamics change in a sc-RNAi background?

      See our detailed response to reviewer #1, major point #2.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Your editorial guidance, reviews, and suggestions have led us to make substantial changes to our manuscript. While we detail point-by-point responses in typical fashion below, I wanted to outline, at a high level, what we’ve done.

      (1) Methods. Your suggestions led us to rethink our presentation of our methods, which are now described more cohesively in a new methods section in the main text.

      (2) Model Validation & Robustness. Reviewers suggested various validations and checks to ensure that our findings were not, for instance, the consequence of a particular choice of parameter. These can be found in the supplementary materials.

      (3) Data Cleaning & Inclusion/Exclusion. Finally, based on feedback, our new methods section fully describes the process by which we cleaned our original data, and on what grounds we included/excluded individual faculty records from analysis.

      eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

      Reviewer #1 (Public Review):

      Summary and strengths

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Weaknesses

      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The author should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:

      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741

      Brower and James. 2020. PLOS One 15:e0226392

      James and Brower. 2022. Royal Society Open Science 9:220785

      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]

      Danell and Hjerm. 2013. Scientometrics 94:999-1006

      We have expanded the description of methods in a new methods section of the paper which we hope will address the reviewer’s concerns.

      We agree that our model of faculty hiring and attrition resembles Leslie matrices. In results section B, we now mention Leslie matrices and cite Matrix Population Models by Caswell, noting a few key differences between Leslie matrices and the model of hiring and attrition presented in this work. Most notably, in the hiring and attrition model presented, the number of new hires is not based on per-capita fertility constants. Instead, population sizes are predetermined fixed values for each year, precluding exponential population growth or decay towards 0 that is commonly observed in the asymptotic behavior of linear Leslie Matrix models.
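
      The difference from a Leslie matrix can be illustrated with a minimal sketch of one annual update. It is an illustration of the idea described above, not the authors' implementation: the function name, the three career-age classes, and all numbers are assumptions chosen for readability, and the sketch tracks expected head counts rather than sampling individual faculty.

```python
def project_one_year(women, men, attr_w, attr_m, target_size, hire_frac_women):
    """Advance career-age-structured expected head counts by one year.

    women, men: expected head counts indexed by career age (index 0 = new hires).
    attr_w, attr_m: annual attrition probabilities indexed by career age.
    target_size: predetermined total head count for the next year.
    hire_frac_women: fraction of new hires who are women.
    """
    # Survivors move up one career-age class; survivors of the last class retire.
    aged_w = [n * (1 - p) for n, p in zip(women, attr_w)][:-1]
    aged_m = [n * (1 - p) for n, p in zip(men, attr_m)][:-1]
    # Hires fill the gap to the fixed target size. A Leslie matrix would instead
    # compute recruits from per-capita fertility terms applied to each age class.
    n_hires = max(target_size - sum(aged_w) - sum(aged_m), 0.0)
    new_women = [n_hires * hire_frac_women] + aged_w
    new_men = [n_hires * (1 - hire_frac_women)] + aged_m
    return new_women, new_men


# Illustrative (made-up) field: 25% women, 10% annual attrition in every class.
women, men = [10.0, 10.0, 10.0], [30.0, 30.0, 30.0]
attrition = [0.1, 0.1, 0.1]
women, men = project_one_year(women, men, attrition, attrition, 120.0, 0.5)
total_after = sum(women) + sum(men)
```

      Because the number of hires is whatever is needed to reach the predetermined size, the total cannot grow or shrink exponentially, which is the contrast with linear Leslie matrix models drawn above.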

      We have additionally revised the main text to cite the listed examples of similar studies (we had already cited James and Brower, 2022). We thank the reviewer for bringing these relevant works to our attention.

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.

      We have carefully examined the manuscript and revised the text to correctly differentiate between gender diversity and women’s representation.

      We have additionally added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Reviewer #2 (Public Review):

      Summary:

      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Strengths:

      In this study, LaBerge et al use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition, using both empirical data and their counterfactual model, that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses: they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find, by developing a number of model scenarios testing the impacts of hiring, attrition, or both, that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully - as the authors point out - include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses

      First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      We have expanded upon the description of methods in a new methods section of the paper.

      We have also added a detailed description of the data cleaning steps taken to produce the dataset used in these analyses, including the inclusion/exclusion criteria applied. This detailed description is at the beginning of the methods section. This addition has substantially enhanced the transparency of our data cleaning methods, so we thank the reviewer for this suggestion.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      We use a free, open-source, and open-data Python package called nomquamgender (Van Buskirk et al., 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenarios where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) show that a sole focus on hiring dampens the gains that can otherwise be addressed via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.

      Our rationale for omitting the GNA + ER scenario in the presented analysis is that we can reason about the outcomes of this scenario without the need for computation; if a field has equal inputs of women and men faculty (on average) and equal retention rates between women and men (on average), then, no matter the field’s initial age and gender distribution of faculty, the expected value for the percentage of women faculty after all of the prior faculty have retired (which may take 40+ years) is exactly 50%. We have updated the main text to discuss this point.
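
      The reasoning above can be checked with a small deterministic sketch of expected head counts. It is an illustration only: the 40 career-age classes, the 5% annual attrition, the 20% initial share of women, and the function name are all assumptions made for this example, not values from the manuscript.

```python
def simulate_gna_er(women, men, attrition, years):
    """Expected fraction of women after `years` under GNA + ER.

    GNA: women and men share the same attrition probability at each career age.
    ER: every year, half of the new hires are women.
    The total head count is held fixed at its initial value.
    """
    total = sum(women) + sum(men)
    for _ in range(years):
        # Identical attrition for both genders; the oldest class retires.
        aged_w = [n * (1 - p) for n, p in zip(women, attrition)][:-1]
        aged_m = [n * (1 - p) for n, p in zip(men, attrition)][:-1]
        hires = total - sum(aged_w) - sum(aged_m)
        women = [hires / 2] + aged_w
        men = [hires / 2] + aged_m
    return sum(women) / (sum(women) + sum(men))


n_classes = 40                 # assumed maximum career length in years
women0 = [2.0] * n_classes     # assumed field starting at 20% women
men0 = [8.0] * n_classes
attrition = [0.05] * n_classes

frac_after_10 = simulate_gna_er(women0, men0, attrition, 10)
frac_after_40 = simulate_gna_er(women0, men0, attrition, 40)
```

      In this sketch the fraction of women is still below 50% while members of the initial faculty remain, and reaches exactly 50% once every initial cohort has retired, whatever the starting composition.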

      Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts.

      We agree that the U.S. focus of this study limits the generalizability of our findings. The findings that we present in this work will only generalize to other populations–whether it be to an alternate industry, e.g., tech workers, or to faculty in different countries–to the extent that these other populations share similar hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript.

      Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Our projection analysis is intended to illustrate the potential gender representation outcomes of several possible counterfactual scenarios, with each projection being conditioned on transparent and simple assumptions. In this way, the projection analysis is not intended to predict or forecast the future.

      To resolve this point for our readers, we now introduce our projections in the context of the related terms of prediction and forecast, noting that they have distinct meanings as terms of art: On one hand, prediction and forecasting involve anticipating a specific outcome based on available information and analysis, and typically rely on patterns, trends, or historical data to make educated guesses about what will happen. Projections are based on assumptions and are often presented in a panel of possible future scenarios. While predictions and forecasts aim for precision, projections (which we make in our analysis) are more generalized and may involve a range of potential outcomes.

      Additionally, the study assumes that anyone who disappears from the dataset has left academia. In reality, these apparent attritions could be researchers who moved to another country or to another institution that is not included in the AARC (Academic Analytics Research Centre) dataset.

      In our revision, we have elevated this important point, and clarified it in the context of the various ways in which we count hires and attritions. We now explicitly state that “We define faculty hiring and faculty attrition to include all cases in which faculty join or leave a field or domain within our dataset.” Then, we enumerate the number of situations that could be counted as hires and attritions, including the reviewer’s example of faculty who move to another country.

      Reviewer #1 (Recommendations For The Authors):

      Section B: The authors use an age structured Leslie matrix model (see Caswell for a good reference to these) to test the effect of making the attrition rates or hiring rates equal for men and women. My main concern here is the fitting techniques for the parameters. These are described (a little too!) briefly in section S1B. Some specific questions that are left hanging include:

      A 5th order polynomial is an interesting choice. Some statistical evidence as to why it was the best fit would be useful. What other candidate models were compared? What was the "best fit" judgement made with: AIC, r^2? What are the estimates for how good this fit is? How many data points were fitted to? Was it the best fit choice for all of the 111 fields for men and women?

      We use a logistic regression model for each field to infer faculty attrition probabilities across career ages and time, and we include the career age predictor up to its fifth power to capture the career-age correlations observed in Spoon et al., Science Advances, 2023. For ease of reference, we reproduce the attrition risk curves in Fig. S4.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after earning a PhD, and then decline until around 15-20 years post-PhD, after which attrition rates increase as faculty approach retirement.

      This function shape starts low and ends high, and includes at least one local minimum, which indicates that career age should enter the model as an odd-order polynomial of at least order 3. However, including career age only up to its 3rd order term tended to miss some of the observed career-age/attrition correlations. We evaluated the fit using 5-fold cross-validation with a Brier score loss metric. Among polynomials of degree 1, 3, 5, or 7, we found that the 5th order performed well on average over all fields (even if it was not the best for every field), without overfitting in fields with fewer data. Example fits, reminiscent of the figure from Spoon et al., are now provided in Figs S4 and S5.
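
      The evaluation procedure described above (k-fold cross-validation scored by Brier loss) can be sketched as follows. This is a minimal illustration, not the authors' code: the two candidate models are hypothetical stand-ins for the polynomial logistic regressions of degree 1, 3, 5, and 7, and the data are synthetic.

```python
import random

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_brier(fit, xs, ys, k=5, seed=0):
    """Average held-out Brier score of the model returned by `fit` over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        predict = fit(train_x, train_y)
        probs = [predict(xs[i]) for i in held_out]
        scores.append(brier_score(probs, [ys[i] for i in held_out]))
    return sum(scores) / k

# Hypothetical candidate 1: a single attrition rate for everyone.
def fit_constant(train_x, train_y):
    rate = sum(train_y) / len(train_y)
    return lambda x: rate

# Hypothetical candidate 2: a separate attrition rate per decade of career age.
def fit_by_decade(train_x, train_y):
    totals, leavers = {}, {}
    for x, y in zip(train_x, train_y):
        decade = x // 10
        totals[decade] = totals.get(decade, 0) + 1
        leavers[decade] = leavers.get(decade, 0) + y
    overall = sum(train_y) / len(train_y)
    rates = {d: leavers[d] / totals[d] for d in totals}
    return lambda x: rates.get(x // 10, overall)

# Synthetic records: career ages 0..39, ten people each; only ages >= 30 leave.
ages = [a for a in range(40) for _ in range(10)]
left = [1 if a >= 30 else 0 for a in ages]

score_constant = cross_validated_brier(fit_constant, ages, left)
score_decade = cross_validated_brier(fit_by_decade, ages, left)
```

      A lower score is better, so the candidate with the smallest cross-validated Brier score would be preferred, subject to the overfitting concerns noted above.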

      While the model fit with fifth order terms may not be the best fit for all 111 fields (e.g., 7th order fits better in some cases), we wanted to avoid field-specific curves that might be overfitted to the field-specific data, especially due to low sample size (and thus larger fluctuations) on the high career age side of the function. Our main text and supplement now include justifications for our choice to include career age up to its fifth order terms.

      You used the 5th order logistic regression (bottom of page 11) to model attrition at different ages. The data in [24] shows that attrition increases sharply, then drops then increases again with career age. A fifth order polynomial on its own could plausibly do this but I associate logistic regression models like this as being monotonically increasing (or decreasing!), again more details as to how this worked would be useful.

      Our first submission did not explain this point well, but we hope that Supplementary Figures S4 and S5 provide clarity. In short, we agree of course that typical logistic regression assumes a linear relationship between the predictor variables and the log odds of the outcome variable. This means that the relationship between the predictor variables and the probability of the outcome variable follows a sigmoidal (S-shaped) curve. However, the relationship between the predictor variables and the outcome variable may not be linear.

      To capture more complex relationships, like the increasing, decreasing and then increasing attrition rates as a function of career age, higher-order terms can be added to the logistic regression model. These higher-order terms allow the model to capture nonlinear relationships between the predictor variables and the outcome variable — namely the non-monotonic relationship between rates of attrition and career age — while staying within a logistic regression framework.
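
      A small sketch of this idea: when the log-odds are a polynomial in career age, the resulting probability curve need not be monotonic. The coefficients below are hypothetical and chosen by hand (they are not fitted values from the paper), and a cubic is used for brevity where the analysis above uses terms up to the fifth power.

```python
import math

def attrition_probability(x, coefs):
    """Logistic model whose log-odds are a polynomial in x.

    coefs[k] multiplies x ** k, so len(coefs) - 1 is the polynomial degree.
    """
    log_odds = sum(c * x ** k for k, c in enumerate(coefs))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients on career age measured in decades.
# The log-odds have a local maximum near 6 years and a local minimum near 18 years.
coefs = [-3.0, 1.62, -1.8, 0.5]

p_start = attrition_probability(0 / 10, coefs)   # career age 0
p_peak = attrition_probability(6 / 10, coefs)    # early-career peak
p_dip = attrition_probability(18 / 10, coefs)    # mid-career dip
p_late = attrition_probability(30 / 10, coefs)   # rising again later in the career
```

      With only the intercept and the linear coefficient, the same function reduces to the familiar monotonic sigmoid.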

      "The career age of new hires follows the average career age distribution of hires" did you use the empirical distribution here or did you fit a standard statistical distribution e.g. Gamma?

      We used the empirical distribution. This information has been added to the updated methods section in the main text.
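
      Drawing from an empirical distribution amounts to resampling the observed values with replacement, so no parametric family (such as a Gamma) needs to be fitted. A minimal sketch, assuming a hypothetical list of observed career ages at hire:

```python
import random

def sample_hire_ages(observed_ages, n_hires, rng):
    """Draw career ages for simulated hires directly from the observed ages."""
    return rng.choices(observed_ages, k=n_hires)

# Hypothetical observed career ages of past hires in one field.
observed = [0, 0, 1, 2, 5, 12]
rng = random.Random(42)
simulated = sample_hire_ages(observed, 100, rng)
```

      Values that occur more often in the observed data are drawn proportionally more often, and values never observed are never drawn.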

      How did you account for institution (presumably available)? Your own work has shown that institution types plays a role which could be contributing to these results.

      See below.

      What other confounding variables could be at play here, what is available as part of the data and what happens if you do/don't account for them?

      A number of variables included in our data have been shown to correlate with faculty attrition, including PhD prestige, current institution prestige, PhD country, and whether or not an individual is a “self-hire,” i.e., trained and hired at the same institution (Wapman et al., Nature, 2022). Additional factors that faculty self-report as reasons for leaving academia include issues of work-life balance, workplace climate, and professional reasons, and in some cases to varying degrees between men and women faculty (Spoon et al., Sci. Adv., 2023).

      Our counterfactual analysis aims to address a specific question: how would women’s representation among faculty be different today if men and women were subjected to the same attrition patterns over the past decade? To answer this question, it is important to account for faculty career age, which we accept as a variable that will always correlate strongly with faculty attrition rates, as long as the tenure filter remains in place and faculty continue to naturally progress towards retirement age. On the other hand, it is less clear why PhD country, self-hire status, or any of the other mentioned variables should necessarily correlate with attrition rates and with gendered differences in attrition rates more specifically. While some or all of these variables may underlie the causal roots of gendered attrition rates, our analysis does not seek to answer causal questions about why faculty leave their jobs (e.g., by testing the impact of accounting for these variables in simulations per the reviewer’s suggestion). This is because we do not believe the data used in this analysis are sufficient to answer such questions, lacking comprehensive data on faculty stress (Spoon et al., Sci. Adv., 2023), parenthood status, etc.

      What career age range did the model use?

      The career age range observed in model outcomes is a function of the empirically derived attrition rates for faculty across academic fields. The highest career age observed in the AARC data was 80, and the faculty career ages that result from our model simulations and projections do not exceed 80.

      We have also added the distribution of faculty across career ages for the projection scenario model outputs in the supplemental materials Fig. S3 (see response to your later comment regarding career age for further details). Looking at these distributions, it is observed that very few faculty have career age > 60, both in observation and in our simulations.

      What was the initial condition for the model?

      Empirical 2011 faculty rosters are used as the initial conditions for the counterfactual analysis, and 2020 faculty rosters are used as the initial conditions for the projection analysis. This information has been added to the descriptions of methods in the main text.

      Starting the model in 2011 how well does it fit the available data up to 2020?

      Thank you for this suggestion. We ran this analysis for each field starting in 2011, and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields. This finding is not surprising, because the model is fit to the observed data, but it serves to validate the methods that we used to extract the model's parameters. We have added these results to the supplement (Fig. S2).

      What are the sensitivity analysis results for the model? If you have made different fitting decisions how much would the results change? All this applied to both the hiring and attrition parameters estimates.

      We model attrition and hiring using logistic regression, with career age included as an exogenous variable up to its fifth power. A natural question follows: what if we used a model with career age only to its first or third power? Or to higher powers? We performed this sensitivity analysis, and added three new figures to the supplement to present these findings:

      First, we show the observed attrition probabilities at each career age, and four model fits to attrition data (Supplementary Figs S4 and S5). The first model includes career age only to its first power, and this model clearly does not capture the full career age / attrition correlation structure. The second model includes career age to its third power, which does a better job of fitting to the observed patterns. The third model includes career age up to its fifth power, which appears to very modestly improve upon the former model. The fourth model includes career age up to its seventh power, and the patterns captured by this model are largely the same as the 5th-power model up to career age 50, beyond which there are some notable differences in the inferred attrition probabilities. These differences would have relatively little impact on model outcomes because the vast majority of faculty have a career age below 50.

      Second, we show the observed probability that hires are women, conditional on the career age of the hire. Once again, we fit four models to the data, and find that career age should be included at least up to its fifth order in order to capture the correlation structures between career age and the gender of new hires. However, limited differences result from including career age up to the 7th degree in the model (relative to the 5th degree).

      As a final sensitivity analysis, we reproduce Fig. 2, but rather than including career age as an exogenous variable up to its fifth power in our models for hiring and attrition, we include career age up to its third power. Findings under this parameterization are qualitatively very similar to those presented in Fig. 2, indicating that the results are robust to modest changes to model parameterization (shown in supplement Fig. S6).

      Far more detail in this and some interim results from each stage of the analysis would make the paper far more convincing. It currently has an air of "black box" too much of the analysis which would easily allow an unconvinced reader to discard the results.

      We have added more detailed descriptions of the methods to the main text. We hope that the changes made will address these concerns.

      Section C: You use the Leslie model to predict the future population. As the model is linear the population will either grow exponentially (most likely) or dwindle to zero. You mention you dealt with this by scaling the average value of H to keep the population at 2020 levels? This would change the ratio of hiring to attrition. How did this affect the timescale of the results. If a field had very minimal attrition (and hence grew massively over the time period of the dataset) the hiring rate would have to be very small too so there would be very little change in the gender balance. Did you consider running the model to steady state instead?

      We chose the 40 year window (2020-2060) for this projection analysis because 40 years is roughly the timespan of a full-length faculty career. In other words, it will take around 40 years for most of the pre-existing faculty from 2020 to retire, such that the new, simulated faculty will have almost entirely replaced all former faculty by 2060.

      For three out of five of our projection scenarios (OA, GNA, OA+ER), the point at which observed faculty are replaced by simulated faculty represents steady state. One way to check this intuition is to observe the asymptotic behavior of the trajectories in Fig. 3B; the slopes for these 3 scenarios nearly level out within 40 years.

      The other two scenarios (OA + IR, GNA+IR) represent situations where women’s representation among new hires is increasing each year. These scenarios will not reach steady state until women represent 100% of faculty. Accordingly, the steady state outcomes for these scenarios would yield uninteresting results; instead, we argue that it is the relative timescales that are interesting.
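
      The replacement dynamics described in this response can be illustrated with a deliberately simplified cohort sketch. It is not the authors' model: it tracks expected counts rather than stochastic simulations, applies one hypothetical attrition curve to both genders, fixes women's share of hires, and imposes a hard maximum career age. Each year, survivors advance one career age and just enough hires are added to hold the total headcount constant.

```python
def project(women, men, attrition, hire_age_dist, women_hire_share, years):
    """Advance expected faculty counts by career age, holding total size fixed.

    women, men: lists of counts indexed by career age.
    attrition: function mapping career age to an annual probability of leaving
        (identical for both genders in this sketch).
    hire_age_dist: probabilities over career age for new hires.
    women_hire_share: fraction of each year's hires who are women.
    """
    target = sum(women) + sum(men)
    max_age = len(women) - 1
    for _ in range(years):
        next_w = [0.0] * (max_age + 1)
        next_m = [0.0] * (max_age + 1)
        # Survivors move up one career age; anyone at max_age leaves.
        for a in range(max_age):
            stay = 1.0 - attrition(a)
            next_w[a + 1] = women[a] * stay
            next_m[a + 1] = men[a] * stay
        # Hire exactly enough people to restore the starting headcount.
        hires = target - sum(next_w) - sum(next_m)
        for a, p in enumerate(hire_age_dist):
            next_w[a] += hires * p * women_hire_share
            next_m[a] += hires * p * (1.0 - women_hire_share)
        women, men = next_w, next_m
    return women, men

# Hypothetical starting point: 25% women, career ages 0..40.
women0 = [10.0] * 41
men0 = [30.0] * 41
hire_ages = [1.0] + [0.0] * 40          # assumption: all hires enter at career age 0
women40, men40 = project(women0, men0, lambda a: 0.05, hire_ages, 0.5, years=40)

share_start = sum(women0) / (sum(women0) + sum(men0))
share_end = sum(women40) / (sum(women40) + sum(men40))
```

      In this toy setting nearly all of the starting faculty have been replaced after 40 years, so women's share ends close to their share among hires, which mirrors the intuition given above for the scenarios with a fixed hiring composition.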

      What did you do to check that your predictions at least felt realistic under the fitted parameters? (see above for presenting the goodness of fit over the 10 years of the data).

      We ran the analysis suggested in a prior comment (Starting the model in 2011 how well does it fit the available data up to 2020?) and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields, plus the “All STEM” and “All non-STEM” aggregations.

      You only present the final proportion of women for each scenario. As mentioned earlier, models of this type have a tendency to lead to strange population distributions with wild age predictions and huge (or zero populations). Presenting more results here would assuage any worries the reader had about these problems. What is the predicted age distribution of men and women in the long term scenarios? Would a different method of keeping the total population in check have yielded different results? Interim results, especially from a model as complex as this one, rather than just presenting a final single number answer are a convincing validation that your model is a good one! Again, presenting this result will go a long way to convincing readers that your results are sound and rigorous.

      Thank you for this suggestion. We now include a figure that presents faculty age distributions for each projection scenario at 2060 against the observed faculty age distribution in 2020 (pictured below, and as Fig. S3 in the supplementary materials). We find that the projected age distributions are very similar to the observed distributions for natural sciences (shown) and for the additional academic domains. We hope this additional validation will inspire confidence in our model of faculty hiring and attrition for the reviewer, and for future readers.

      In Fig S3, line widths for the simulated scenarios span the central 95% of simulations.

      Other people have reached almost identical conclusions (albeit with smaller data sets) that hiring is more important than attrition. It would be good to compare your conclusions with their work in the Discussion.

      We have revised the main text to cite the listed examples of similar studies. We thank the reviewer for bringing these relevant works to our attention.

      General comments:

      What thoughts have you given to non-binary individuals?

      Be careful how you use the term "gender diversity"! In many countries "Gender diverse" is a term used in data collection for non-binary individuals, i.e. Male, female, gender diverse. The phrase "hiring more gender diverse faculty" can be read in different ways! If you are only considering men and women then gender balance may be a better framework to use.

      We have added language to the main text which explicitly acknowledges that our analysis focuses on men and women due to limitations in our name-based gender tool, which only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      We have also taken additional care with referring to “gender diversity,” per reviewer 1’s point in their public review.

      Reviewer #2 (Recommendations For The Authors):

      Data availability: I did not see an indication that the dataset used here is publicly available, either in its raw format or as a summary dataset. Perhaps this is due to the sensitive nature of the data, but regardless of the underlying reason, the authors should include a note on data availability in the paper.

      The dataset used for these analyses was obtained under a data use agreement with the Academic Analytics Research Center (AARC). While these data are not publicly available, researchers may apply for data access here: https://aarcresearch.com/access-our-data.

      We also added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Additionally, a variety of summary statistics based on this dataset are available online, here: https://github.com/LarremoreLab/us-faculty-hiring-networks/tree/main

      Gender classification: Was an existing package used to classify gender from names in the dataset, or did the authors develop custom code to do so? Either way, this code should be cited. I would also be curious to know what the error rate of these classifications are, and suggest that additional information on potential biases that might result from automated classifications be included in the discussion, under the section describing data limitations. The reliability of name-based gender classification is particularly of interest, as external gender classifications such as those applied on the basis of an individual's name - may not reflect the gender with which an individual self-identifies. In other words, while for many people their names may reflect their true genders, for others those names may only reflect their gender assigned at birth and not their self-perceived or lived gender identity. Nonbinary faculty are in particular invisibilized here (and through any analysis that assigns binary gender on the basis of name). While these considerations do not detract from the main focus of the study - which was to utilize an existing dataset classified only on the basis of binary gender to assess trends for women faculty-these limitations should be addressed as they provide additional context for the interpretation of the results and suggest avenues for future research.

      We use a free, open-source, and open-data Python package called nomquamgender (Van Buskirk et al., 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      As we mentioned in response to the public review, we use a free and open-source Python package called nomquamgender to estimate the strengths of name-gender associations, and we apply gender labels to the names with sufficiently strong associations with a binary gender. This package is based on a paper by Van Buskirk et al., 2023, “An open-source cultural consensus approach to name-based gender classification,” which documents error rates and potential biases.

      Page 1: The sentence beginning "A trend towards greater women's representation could be caused..." is missing a conjunction. It should likely read: "A trend towards greater women's representation could be caused entirely by attrition, e.g., if relatively more men than women leave a field, OR entirely by hiring..."

      We have edited the paragraph to remove the sentence in question.

      Pages 1-2: The sentence beginning "Although both types of strategy..." and ending with "may ultimately achieve gender parity" is a bit of a run-on; perhaps it would be best to split this into multiple sentences for ease of reading.

      We have revised this run-on sentence.

      Page 2: See comments in the public review about a methods section, the addition of which may help to improve clarity for the readers. Within the existing descriptions of what I consider to be methods (i.e., the first three paragraphs currently under "results"), some minor corrections could be added here. First, consider citing the source of the dataset in the line where it is first described (in the sentence "For these analyses, we exploit a census-level dataset of employment and education records for tenured and tenure-track faculty in 12,112 PhD-granting departments in the United States from 2011-2020.") It also may be helpful to include context here (or above, in the discussion about institutional analyses) about how "departments" can be interpreted. For example, how many institutions are represented across these departments? More information on how the authors eliminated the gendered aspect of patterns in their counterfactual model would be helpful as well; this is currently hinted at on page 4, but could instead be included in the methods section with a call-out to the relevant supplemental information section (S2B).

      We have added a citation to Academic Analytics Research Center’s (AARC) list of available data elements to the data’s introduction sentence. We hope this will allow readers to familiarize themselves with the data used in our analysis.

      Faculty department membership was determined by AARC based on online faculty rosters. 392 institutions are represented across the 12,112 departments present in our dataset. We have updated the main text to include this information.

      Finally, we have added a methods section to the main text, which includes information on how the gendered aspect of attrition patterns was eliminated in the counterfactual model.

      Page 2: Perhaps some indication of how many transitions from an out-of-sample institution might be helpful to readers hoping to understand "edge cases."

      In our analysis, we consider all transitions from out-of-sample institutions to in-sample institutions as hires, and all transitions away from in-sample institutions (whether to an out-of-sample institution or out of academia entirely) as attritions. We restrict our analysis of hiring and attrition to PhD-granting institutions in the U.S. in this way because our data do not support an analysis of other, out-of-sample institutions.
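
      Under this definition, hires and attritions can be read off by comparing consecutive annual rosters for a field. The sketch below is illustrative only; the roster format and the person identifiers are hypothetical.

```python
def hires_and_attritions(roster_prev, roster_next):
    """Compare two consecutive annual rosters for one field.

    Each roster is a set of person identifiers. Anyone present only in the
    later roster counts as a hire; anyone present only in the earlier roster
    counts as an attrition, whatever the reason for the move.
    """
    hires = roster_next - roster_prev
    attritions = roster_prev - roster_next
    return hires, attritions

# Hypothetical rosters for one field in two consecutive years.
roster_2011 = {"p1", "p2", "p3"}
roster_2012 = {"p2", "p3", "p4", "p5"}
hires, attritions = hires_and_attritions(roster_2011, roster_2012)
```

      Because the rosters here are defined per field rather than per institution, a person who changes institutions but stays in the same in-sample field appears in both sets and is counted as neither a hire nor an attrition.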

      I also would have liked additional information on how many faculty switched institutions but remained "in-sample and in the same field" - and the gender breakdowns of these institutional changes, as this might be an interesting future direction for studies of gender parity. (For example, readers may be spurred to ask: if the majority of those who move institutions are women, what are the implications for tenure and promotion for these individuals?)

      While these mid-career moves are not counted as attritions in the present analysis, a study of faculty who switch institutions but remain (in-sample) as faculty could shed light on issues of gendered faculty retention at the level of institutions. We share the reviewer’s interest in a more in-depth study of mid-career moves and how these moves impact faculty careers, and we now discuss the potential value of such a study towards the end of the paper. In fact, this subject is the topic of a current investigation by the authors!

      Page 3: I was confused by the statement that "of the three types of stable points, only the first point represents an equitable steady-state, in which men and women faculty have equal average career lengths and are hired in unchanging proportions." Here, for example, computer science appears to be close to the origin on Figure 1, suggesting that hiring has occurred in "unchanging proportions" over the study interval. However, upon analysis of Table S2, it appears that changes in hiring in Computer Science (+2.26 pp) are relatively large over the study interval compared to other fields. Perhaps I am reading too literally into the phrase that "men and women faculty are hired in unchanging proportions" - but I (and likely others) would benefit from additional clarity here.

      We had created an arrow along with the computer science label in Fig. 1, but it was difficult to see, which is likely the source of this confusion. This was our fault, and we have moved the “Comp. Sci.” label and its corresponding arrow to be more visible in Figure 1.

      As the reviewer points out, the change in women’s representation in Computer Science due to hiring over 2011-2020 was +2.26 pp. However, consulting Fig. 1 and the corresponding table in the supplement, we observe that this is a relatively small change compared to most fields.

      Page 3: If possible it may be helpful to cite a study (or multiple) that shows that "changes in women's representation across academic fields have been mostly positive." What does "positive" mean here, particularly when the changes the authors observe are modest? Perhaps by "positive" you mean "perceived as positive"?

      We used the term positive in the mathematical sense, to mean greater than zero. We have reworded the sentence to read “women's representation across academic fields has been mostly increasing…” We hope this change clarifies our meaning to future readers.

      Page 3: The sentence that ends with "even though men are more likely to be at or near retirement age than women faculty due to historical demographic trends" may benefit from a citation (of either Figure S3 or another source).

      We now cite the corresponding figure in this sentence.

      Page 4: The two sentences that begin with "The empirical probability that a person leaves their academic career" would benefit from an added citation.

      We have added a citation to the sentences.

      Figure 3: Which 10 academic domains are represented in Panel 3B? The colors appear to correspond to the legend in Panel 3A, but no indication of which fields are represented is provided. If possible, please do so - it would be interesting and informative to be able to make these comparisons.

      This was not clear in the initial version of Fig. 3B, so we now label each domain. For reference, the domains represented in 3B are (from top to bottom):

      ● Health

      ● Education

      ● Journalism, Media, Communication

      ● Humanities

      ● Social Sciences

      ● Public Administration and Policy

      ● Medicine

      ● Business

      ● Natural Sciences

      ● Mathematics and Computing

      ● Engineering

      Page 6: Consider citing relevant figure(s) earlier up in paragraph 2 of the discussion. For example, the first sentence could refer to Figure 1 (rather than waiting until the bottom of the paragraph to cite it).

      Thank you for this suggestion. We now cite Fig. 1 earlier in this discussion paragraph.

      Page 10: A minor comment on the fraction of women faculty in any given year-the authors assume that the proportion of women in a field can be calculated from knowing the number of women in a field and the number of men. This is, again, true if assuming binary genders but not true if additional gender diversity is included. It is likely that the number of nonbinary faculty is quite low, and as such would not cause a large change in the overall proportions calculated here, but additional context within the first paragraph of S1 might be helpful for readers.

      We have added additional context in the first paragraph of S1, explaining that an additional term could be added to the equation to account for nonbinary faculty representation if our data included nonbinary gender annotations. Thank you for making this point.

      Page 10: Please include a range of values for the residual terms of the decomposition of hiring and attrition in the sentence that reads "In Figure S1 we show that the residual terms are small, and thus the decomposition is a good approximation of the total change in women's representation."

      These residual terms range from -0.51pp to 1.14pp (median = 0.2pp). We have added this information to the sentence in question.

      Page 12: It may be helpful to readers to include a description of the information contained in Table S2 in the supplemental text under section S3.

      We refer to Table S2 twice in the main text (once in the observational findings, and once for the counterfactual analysis), and the contents of Table S2 are described thoroughly in the table caption.

      Reviewer #3 (Recommendations For The Authors):

      (1) There is a potential limitation in the generalizability of the findings, as the study focuses exclusively on US academia. Including international perspectives could have provided a more global understanding of the issues at hand.

      The U.S. focus of this study limits the generalizability of our findings, as faculty outside the U.S. may exhibit differences in hiring patterns, retention patterns, and current demographic representations. We have added a discussion of this limitation to the manuscript. Unfortunately, our data do not support international analyses of hiring and attrition.

      (2) I am not sure that everyone who disappeared from the AARC dataset could be counted as "attrition" from academia. Indeed, some who disappeared might have completely left academia once they disappeared from the AARC dataset. Yet, there's also the possibility that some professors left for academic positions in countries outside of the US, or US institutions that are not included in the AARC dataset. These individuals didn't leave academia. Furthermore, it is also possible that moves to an institution outside of the US or not indexed by AARC are gender specific. Therefore, the analyses that this study conducts should find a way to test whether the assumption that anyone who disappeared from AARC left academia is indeed valid. If not, how will this potentially challenge the current conclusions?

      The reviewer makes an important point: faculty who move to faculty positions in other countries and faculty who move to non-PhD granting institutions, or to institutions that are otherwise not included in the AARC data are all counted as attritions in our analysis. We intentionally define hiring and attrition broadly to include all cases in which faculty join or leave a field or domain within our dataset.

      The types of transitions that faculty make out of the tenure track system at PhD granting institutions in the U.S. may correlate with faculty attributes, like gender. For example, women or men may be more likely to transition to tenure track positions at non-U.S. institutions. Nevertheless, these types of career transition represent an attrition for the system of study, and a hire for another system. Following this same logic, faculty who transition from one field to another field in our analysis are treated as an attrition from the first field and a hire into the new field.

      By focusing on “all-cause” attrition in this way, we are able to draw robust insights for the specific systems we consider (e.g., STEM and non-STEM faculty at U.S. PhD-granting institutions), without being roadblocked by the task of annotating faculty departures and arbitrating which should constitute “valid” attritions.

      (3) It would be very interesting to know how much of the attrition was due to tenure failure. Previous studies have suggested that women are less likely to be granted tenure, which makes me wonder about the role that tenure plays in the gendered patterns of attrition in academia.

      We note that faculty attrition rates start low and then reach a peak around 5-7 years after earning PhD, and then decline until around 15-20 years post-PhD, after which, attrition rates increase as faculty approach retirement. The first local maximum appears to coincide roughly with the tenure clock timing, but we can only speculate that these attritions are tenure related. Our dataset is unfortunately not equipped to determine the causal mechanisms driving attrition.

      We reproduce the attrition risk curve in the supplementary materials, Fig. S4.
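      As an illustration only (this is not the analysis code behind Fig. S4), an all-cause attrition risk curve of this kind could be computed as the share of faculty at each career age who leave the dataset. The record layout below, one `(career_age, left)` pair per faculty member per observed year, and the function name are hypothetical.

```python
from collections import Counter


def attrition_risk_by_career_age(records):
    """Return {career_age: departures / faculty at risk}.

    `records` holds one (career_age, left) pair per faculty member per
    observed year; `left` is True if the person is absent from the dataset
    afterwards, whatever the reason (all-cause attrition).
    """
    at_risk = Counter()
    departures = Counter()
    for career_age, left in records:
        at_risk[career_age] += 1
        if left:
            departures[career_age] += 1
    return {age: departures[age] / at_risk[age] for age in sorted(at_risk)}


# Toy input: four person-years at career age 1, two at career age 6.
toy_records = [(1, False), (1, False), (1, True), (1, False), (6, True), (6, False)]
toy_curve = attrition_risk_by_career_age(toy_records)
```

      Because the sketch counts every departure regardless of destination, it matches the broad definition of attrition described above and cannot by itself say whether a departure was tenure related.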

      (4) The dataset used doesn't fully capture the complexities of academic environments, particularly smaller or less research-intensive institutions (regional universities, historically black colleges and universities, and minority-serving institutions). This could be potentially added to the manuscript for discussions.

      We have added this point to the description of this study’s limitations in the discussion.

    1. Reviewer #1 (Public Review):

      Summary:

      In "Changes in wing morphology..." Roy et al investigate the potential allometric scaling in wing morphology and wing kinematics in 8 different hoverfly species. Their study nicely combines different new and classic techniques, investigating flight in an important, yet understudied alternative pollinator. I want to emphasize that I have been asked to review this from a hoverfly biology perspective, as I do not work on flight kinematics. I will thus not review that part of the work.

      Strengths:

      The paper is well-written and the figures are well laid out. The methods are easy to follow, and the rationale and logic for each experiment are easy to follow. The introduction sets the scene well, and the discussion is appropriate. The summary sentences throughout the text help the reader.

      Weaknesses:

      The ability to hover is described as useful for either feeding or mating. However, several of the North European species studied here would not use hovering for feeding, as they tend to land on the flowers that they feed from. I would therefore argue that the main selection pressure for hovering ability could be courtship and mating. If the authors disagree with this, they could back up their claims with the literature. On that note, a weakness of this paper is that the data for both sexes are merged. If we agree that hovering may be a sexually dimorphic behaviour, then merging flight dynamics from males and females could be an issue in the interpretation. I understand that separating males from females in the movies is difficult, but this could be addressed in the Discussion, to explain why you do not (or do) think that this could cause an issue in the interpretation.

      The flight arena is not very big. In my experience, it is very difficult to get hoverflies to fly properly in smaller spaces, and definitely almost impossible to get proper hovering. Do you have evidence that they were flying "normally" and not just bouncing between the walls? How long was each 'flight sequence'? You selected the parts with the slowest flight speed, presumably to get as close to hovering as possible, but how sure are you that this represented proper hovering and not a brief slowdown of thrust?

      Your 8 species are evolutionarily well-spaced, but as they were all selected from a similar habitat (your campus), their ecology is presumably very similar. Can this affect your interpretation of your data? I don't think all 6000 species of hoverflies could be said to have similar ecology - they live across too many different habitats. For example, on line 541 you say that wingbeat kinematics were stable across hoverfly species. Could this be caused by their similar habitat?

    1. Author response:

      Reviewer 1:

      Summary:

      In this manuscript by Bimbard et al., a new method to perform stable recordings over long periods of time with Neuropixels, as well as the technical details on how the electrodes can be explanted for follow-up reuse, is provided. I think the description of all parts of the method is very clear, and the validation analyses (n of units per day over time, RMS over recording days...) are very convincing. I however missed a stronger emphasis on why this could provide a big impact on the ephys community, by enabling new analyses, new behavior correlation studies, or neurophysiological mechanisms across temporal scales.

      Strengths:

      Open source method. Validation across laboratories. Across species (mice and rats) demonstration of its use and in different behavioral conditions (head-fixed and freely moving).

      Weaknesses:

      Weak emphasis on what can be enabled with this new method that didn't exist before.

      We thank the reviewer for highlighting the limited discussion around scientific impact. Our implant has several advantages which combine to make it much more accessible than previous solutions. This enables a variety of recording configurations that would not have been possible with previous designs, facilitating recordings from a wider range of brain regions, animals, and experimental setups. In short, there are three key advances:

      (1) Adaptability: The CAD files can be readily adapted to a wide range of configurations (implantation depth, angle, position of headstage, etc.). Labs have already modified the design to optimise for their needs, and re-shared with the community.

      (2) Weight: Because of the lightweight design, experimenters can i) perform complex and demanding freely moving tasks as we exemplify in the manuscript, and ii) implant female and water restricted mice while respecting animal welfare weight limitations.

      (3) Cost: At ~$10, our implant is significantly cheaper than published alternatives, which makes it affordable to more labs and means that testing modifications is cost-effective.

      We will make these features clearer in the manuscript.

      Reviewer 2:

      Summary:

      This work by Bimbard et al. introduces a new implant for Neuropixels probes. While Neuropixels probes have critically improved and extended our ability to record the activity of a large number of neurons with high temporal resolution, the use of these expensive devices in chronic experiments has so far been hampered by the difficulty of safely implanting them and, importantly, of explanting and reusing them after conclusion of the experiment. The authors present a newly designed two-part implant, consisting of a docking and a payload module, that allows for secure implantation and straightforward recovery of the probes. The implant is lightweight, making it amenable for use in mice and rats, and customizable. The authors provide schematics and files for printing of the implants, which can be easily modified and adapted to custom experiments by researchers with little to no design experience. Importantly, the authors demonstrate the successful use of this implant across multiple use cases, in head-fixed and freely moving experiments, in mice and rats, with different versions of Neuropixels probes, and across 8 different labs. Taken together, the presented implants promise to make chronic Neuropixels recordings and long-term studies of neuronal activity significantly easier and attainable for both current and future Neuropixels users.

      Strengths:

      - The implants have been successfully tested across 8 different laboratories, in mice and rats, in head-fixed and freely moving conditions, and have been adapted in multiple ways for a number of distinct experiments.

      - Implants are easily customizable and the authors provide a straightforward approach for customization across multiple design dimensions even for researchers not experienced in design.

      - The authors provide clear and straightforward descriptions of the construction, implantation, and explant of the described implants.

      - The split of the implant into a docking and payload module makes reuse even in different experiments (using different docking modules) easy.

      - The authors demonstrate that implants can be re-used multiple times and still allow for high-quality recordings.

      - The authors show that the chronic implantations allow for the tracking of individual neurons across days and weeks (using additional software tracking solutions), which is critical for a large number of experiments requiring the description of neuronal activity, e.g. throughout learning processes.

      - The authors show that implanted animals can even perform complex behavioral tasks, with no apparent reduction in their performance.

      Weaknesses:

      - While implanted animals can still perform complex behavioral tasks, the authors describe that the implants may reduce the animals' mobility, as measured by prolonged reaction times. However, the presented data does not allow us to judge whether this effect is specifically due to the presented implant or whether any implant or just tethering of the animals per se would have the same effects.

      The reviewer is correct: some of the differences in mouse reaction time could be due to the tether rather than the implant. As these experiments were also performed in water-restricted female mice with the heavier Neuropixels 1.0 implant, our data represent the maximal impact of the implant, and we will highlight this in the revision.

      - While the authors make certain comparisons to other, previously published approaches for chronic implantation and re-use of Neuropixels probes, it is hard to make conclusive comparisons and judge the advantages of the current implant. For example, while the authors emphasize that the lower weight of their implant allows them to perform recordings in mice (and is surely advantageous), the previously described, heavier implants they mention (Steinmetz et al., 2021; van Daal et al., 2021), have also been used in mice. Whether the weight difference makes a difference in practice therefore remains somewhat unclear.

      The reviewer is correct: without a direct comparison, we cannot be certain that our smaller, lighter implant improves behavioural results (although this is supported by the literature, e.g. Newman et al., 2023). However, the reduced weight of our implant is critical for several laboratories represented in this manuscript due to animal welfare requirements. Indeed, in van Daal et al. the authors “recommend a [mouse] weight of >25 g for implanting Neuropixels 1.0 probes.” This limit precludes using (the vast majority of) female mice, or water-restricted animals. Conversely, our implant can be routinely used with lighter, water-restricted male and female mice. We will emphasise this point in the revision.

      - The non-permanent integration of the headstages into the implant, while allowing for the use of the same headstage for multiple animals in parallel, requires repeated connections and does not provide strong protection for the implant. This may especially be an issue for the use in rats, requiring additional protective components as in the presented rat experiments.

      We apologise for not clarifying the various headstage options in the manuscript and we will address this in the revision. Our repository has headplate holder designs (in the XtraModifications/Mouse_FreelyMoving folder). This allows leaving the headstage on the implant, thus minimizing the number of connections (albeit increasing the weight for the mouse). Indeed, mice recorded while performing the task described in our manuscript had the headstage semi-permanently integrated into the implant, and we will highlight this in the revision.

      Reviewer 3:

      Summary:

      In this manuscript, Bimbard and colleagues describe a new implant apparatus called "Apollo Implant", which should facilitate recording in freely moving rodents (mice and rats) using Neuropixels probes. The authors collected data from both mice and rats and used 3 different versions of Neuropixels. Multiple labs have already adopted this method, which is impressive. They openly share their CAD designs and surgery protocol to further facilitate the adaptation of their method.

      Strengths:

      Overall, the "Apollo Implant" is easy to use and adapt, as it has been used in other laboratories successfully and custom modifications are already available. The device is reproducible using common 3D printing services and can be easily modified thanks to its CAD design (the video explaining this is extremely helpful). The weight and price are amazing compared to other systems for rigid silicon probes allowing a wide range of use of the "Apollo Implant".

      Weaknesses:

      The "Apollo Implant" can only handle Neuropixels probes. It cannot hold other widely used and commercially available silicon probes. Certain angles and distances are not possible in their current form (distance between probes 1.8 to 4mm, implantation depth 2-6.5 mm, or angle of insertion up to 20 degrees).

      We appreciate the reviewer’s points, but as we will discuss in the revised manuscript, one implant accommodating the diversity of the existing probes is beyond the scope of this project. However, because the design is adaptable, groups should be able to modify the current version of the implant to adapt to their electrodes’ size and format (and can highlight any issues in the Github “Discussions” area).

      With Neuropixels, the current range of depths covers practically all trajectories in the mouse brain. In rats, where deeper penetrations may be useful, the experimenter can attach the probe at a lower point in the payload module to increase the length of exposed shank. We now specify this in the Github repository.

      We have now extended the range of inter-probe distances from a maximum of 4 mm to 6.5 mm, and this will be reflected in the revised manuscript. Distances beyond this may be better served by 2 implants, and smaller distances could be achieved by attaching two probes on the same side of the docking module. In the next revision, we will add these points to the discussion.

    1. Reviewer #1 (Public Review):

      Summary:

      This paper presents a mechanistic study of rDNA origin regulation in yeast by SIR2. Each of the ~180 tandemly repeated rDNA gene copies contains a potential replication origin. Early-efficient initiation of these origins is suppressed by Sir2, reducing competition with origins distributed throughout the genome for rate-limiting initiation factors. Previous studies by these authors showed that SIR2 deletion advances replication timing of rDNA origins by a complex mechanism of transcriptional de-repression of a local PolII promoter causing licensed origin proteins (MCM complexes) to re-localize (slide along the DNA) to a different (and altered) chromatin environment. In this study, they identify a chromatin remodeler, FUN30, that suppresses the sir2∆ effect, and remarkably, results in a contraction of the rDNA to about one-quarter its normal length/number of repeats, implicating replication defects of the rDNA. Through examination of replication timing, MCM occupancy and nucleosome occupancy on the chromatin in sir2, fun30, and double mutants, they propose a model where nucleosome position relative to the licensed origin (MCM complexes) intrinsically determines origin timing/efficiency. While their interpretations of the data are largely reasonable and can be interpreted to support their model, a key weakness is the connection between Mcm ChEC signal disappearance and origin firing. While the cyclical chromatin association-dissociation of MCM proteins with potential origin sequences may be generally interpreted as licensing followed by firing, dissociation may also result from passive replication and as shown here, displacement by transcription and/or chromatin remodeling. Moreover, linking its disappearance from chromatin in the ChEC method with such precise resolution needs to be validated against an independent method to determine the initiation site(s). Differences in rDNA copy number and relative transcription levels also are not directly accounted for, obscuring a clearer interpretation of the results. Nevertheless, this paper makes a valuable advance with the finding of Fun30 involvement, which substantially reduces rDNA repeat number in sir2∆ background. The model they develop is compelling and I am inclined to agree, but I think the evidence on this specific point is purely correlative and a better method is needed to address the initiation site question. The authors deserve credit for their efforts to elucidate our obscure understanding of the intricacies of chromatin regulation. At a minimum, I suggest their conclusions on these points of concern should be softened and caveats discussed. Statistical analysis is lacking for some claims.

      Strengths are the identification of FUN30 as suppressor, examination of specific mutants of FUN30 to distinguish likely functional involvement. Use of multiple methods to analyze replication and protein occupancies on chromatin. Development of a coherent model.

      Weaknesses are failure to address copy number as a variable; insufficient validation of ChEC method relationship to exact initiation locus; lack of statistical analysis in some cases.

      Additional background and discussion for public review:

      This paper broadly addresses the mechanism(s) that regulate replication origin firing in different chromatin contexts. The rDNA origin is present in each of ~180 tandem repeats of the rDNA sequence, representing a high potential origin density per length of DNA (9.1kb repeat unit). However, the average origin efficiency of rDNA origins is relatively low (~20% in wild-type cells), which reduces the replication load on the overall genome by reducing competition with origins throughout the genome for limiting replication initiation factors. Deletion of histone deacetylase SIR2, which silences PolII transcription within the rDNA, results in increased early activation of the rDNA origins (and reduced rate of overall genome replication). Previous work by the authors showed that MCM complexes loaded onto the rDNA origins (origin licensing) were laterally displaced (sliding) along the rDNA, away from a well-positioned nucleosome on one side. The authors' major hypothesis throughout this work is that the new MCM location(s) are intrinsically more efficient configurations for origin firing. The authors identify a chromatin remodeling enzyme, FUN30, whose deletion appears to suppress the earlier activation of rDNA origins in sir2∆ cells. Indeed, it appears that the reduction of rDNA origin activity in sir2∆ fun30∆ cells is severe enough to result in a substantial reduction in the rDNA array repeat length (number of repeats); the reduced rDNA length presumably facilitates its more stable replication and maintenance.

      Analysis of replication by 2D gels is marginally convincing; using 2D gels for this purpose is very challenging and tricky to quantify. The more quantitative analysis by EdU incorporation is more convincing of the suppression of the earlier replication caused by SIR2 deletion.

      To address the mechanism of suppression, they analyze MCM positioning using ChEC, which in G1 cells shows partial displacement of MCM from normal position A to positions B and C in sir2∆ cells and similar but more complete displacement away from A to positions B and C in sir2fun30 cells. During S-phase in the presence of hydroxyurea, which slows replication progression considerably (and blocks later origin firing), MCM signals redistribute, which is interpreted to represent origin firing and bidirectional movement of MCMs (only one direction is shown), some of which accumulate near the replication fork barrier, consistent with their interpretation. They observe that MCMs displaced (in G1) to sites B or C in sir2∆ cells disappear more rapidly during S-phase, whereas the similar dynamic is not observed in sir2∆fun30∆. This is the main basis for their conclusion that the B and C sites are more permissive than A. While this may be the simplest interpretation, there are limitations with this assay that undermine a rigorous conclusion (additional points below). The main problem is that we know the MCM complexes are mobile, so disappearance may reflect displacement by other means including transcription, which is high in the sir2∆ background. Indeed, the double mutant has a greater level of transcription per repeat unit, which might explain more displacement from A in G1. Thus, displacement might not always represent origin firing. Because the sir2 background profoundly changes transcription, and the double mutant has a much smaller array length associated with higher transcription, how can we rule out greater accessibility at site A, for example in sir2∆, leading to more firing, which is suppressed in sir2 fun30 due to greater MCM displacement away from A?

      I think the critical missing data to solidly support their conclusions is a definitive determination of the site(s) of initiation using a more direct method, such as strand specific sequencing of EdU or nascent strand analysis. More direct comparisons of the strains with lower copy number are also needed to rule out this facet. As discussed in detail below, copy number reduction is known to suppress at least part of the sir2∆ effect so this looms over the interpretations. I think they are probably correct in their overall model based on the simplest interpretation of the data but I think it remains to be rigorously established. I think they should soften their conclusions in this respect.

    1. On BBT, all traditional and metacognitive accounts of the human are the product of extreme informatic poverty. Ironically enough, many have sought intentional asylum within that poverty in the form of apriori or pragmatic formalisms, confusing the lack of information for the lack of substantial commitment, and thus for immunity against whatever the sciences of the brain may have to say. But this just amounts to a different way of taking refuge in obscurity. What are ‘rules’? What are ‘inferences’? Unable to imagine how science could answer these questions, they presume either that science will never be able to answer them, or that it will answer them in a manner friendly to their metacognitive intuitions. Taking the history of science as its cue, BBT entertains no such hopes. It sees these arguments for what they happen to be: attempts to secure the sufficiency of low-dimensional, metacognitive information, to find gospel in a peephole glimpse.

      This describes the approach of Sellars, Brandom, and Brassier, all of whom Bakker has criticized in the blog.

      They admit that science has priority in the scientific realm, but hold that what we think we are is not something that can be true or false; it consists of games, rules, things we play, a game of "pretend as if we are persons".

      This is a much better position. It does not attempt to tell science that science is a building founded upon the ground of philosophy (unlike Kant, or Heidegger), and does not try to make scientifically testable predictions and get embarrassed in the process (unlike those who sought to study the "quantum of consciousness" because they thought free will is real, thus something quantum-mechanical must be true of the brain, or that philosopher who argued that Anton syndrome is impossible because it is philosophically impossible, or those psychoanalysts that try to interpret Cotard's syndrome as some manifestation of childhood trauma).

      The problem with this position is as follows:

      1. Is science really based on a game of giving and taking reasons? If not, then there's no guarantee that science would protect the game of "let's pretend we are persons who make decisions, have plans, hope for love, etc". The juggernaut of science may eventually crush the "manifest image of man" under its wheels, migrate to a society of unconscious biorobots, and run even faster as a result!

      2. Philosophers are unable to figure out what rules, games, normativity, etc, are! They can't agree, after centuries of disputation. Any working consensus will have to come from science, and what if science finally shows that rules and games are nothing like what Sellars, Brandom, etc, thought they were? If so, then not only is the manifest image not the scientific image, not only is it unnecessary for working scientists, it is not even what the philosophers say it is. It is as if the philosophers have been stuck in Plato's Cave, mistaking the shadow-play for optical-science.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      …I find the concept and execution of the study very interesting and elegant. The paper is also commendably clear and readable. The differences between primary and higher cortex are compelling and I am largely convinced by the authors' claim that they have found evidence that broadly supports a mixed selectivity model of neural disentanglement along the lines of Rigotti et al (2013). I think that the increasing body of evidence for these kinds of representations is a significant development in our understanding of higher sensory representations. I also think that the dDR method is likely to be useful to researchers in a variety of fields who are looking to perform similar types of neural decoding analysis.

      Thanks! We agree that questions around population coding and high-level representations are critical in the field of sensory systems.

      Reviewer #2 (Public Review):

      ... This is a well-carried out study with thoughtful analyses which in large part achieves its aims to evaluate how task-engagement changes neural activity across multiple auditory regions. As with all work, there are several caveats or areas for future study/analysis. First, the sounds used here (tones, and narrow-band noise) are relatively simple sounds; previous work suggests that exactly what activity is observed within each region (e.g., sensory only, decision-related, etc) may depend in part upon what stimuli are used. Therefore, while the current study adds importantly to the literature, future work may consider the use of more varied stimuli. Second, the animals here were engaged in a behavioral task; but apart from an initial calculation of behavioral d', the task performance (and its effect on neural activity) is largely unaddressed.

      The reviewer makes several important points that we hope we addressed in the specific changes detailed below. Indeed, it is important to recognize the possibility that the specific stimuli involved in a task may interact with the effects of behavioral state and that variability in task performance should be considered as an important aspect of behavioral state.

      Reviewer #1 (Recommendations For The Authors):

      I have a few minor comments and criticisms:

      (1) Figure 1c. The choice of low-contrast grey text (e.g. "Target vs. target") is unfortunate, especially when printed, and should be replaced (e.g. with dark grey).

      We have edited the figure to use a higher contrast (dark grey). Thanks for catching this.

      (2) Figure 2 and Supplementary Figure 3. I think some indication of error or significance is required in all panels. Without this, it's hard to interpret any of these panels.

      Thank you for this feedback. Including significance here was clarifying and helps to strengthen our claim that state-dependent changes in neural activity were smaller and more diverse for single neurons than at the population level. We modified Figure 2b-c to indicate whether each neuron’s response to the target stimulus was significantly different than its response to the catch stimulus. The same test was performed in Supplementary Figure 3. Additionally, we added a statistical test in Figure 2d-e to indicate, for each pair of target/catch stimuli, whether discrimination (d-prime) changed significantly between active and passive conditions. Furthermore, we modified the text of the second paragraph under the results heading: “Diverse effects of task engagement on single neurons in primary and non-primary auditory cortex” to reference and interpret the results of these significance tests. The new text reads as follows (L. 121):

      “Sound-evoked spiking activity was compared between active and passive states to study the impact of task engagement on sound representation. In both A1 and dPEG, responses to target and catch stimuli were significantly discriminable for a subset of single neurons (about 25% in both areas, Figure 2A-C, Supplemental Figures 3-5, bootstrap test). This supports the idea that stimulus identity can be decoded in both brain regions, regardless of task performance. However, the fact that the responses of most neurons in both brain areas could not significantly discriminate target vs. catch stimuli also highlights the diversity of sound encoding observed at the level of single neurons. The accuracy of catch vs. target discrimination for each neuron was quantified using neural d-prime, the z-scored difference in target minus catch spiking response for each neuron (Methods: Single neuron PSTHs and d-prime (Niwa et al., 2012a)). Task engagement was associated with significant changes in catch vs. target d-prime for roughly 10% of neurons in both A1 (40 / 481 neurons, bootstrap test) and dPEG (33 / 377 neurons, bootstrap test). This included neurons that both increased their discriminability and decreased their discriminability (Figure 2D-E). Thus, the effects of task engagement at the level of single neurons were relatively mild and inconsistent across the population; many neurons showed no significant change and of those that did, effects were bidirectional (Figure 2D-E).”

      We also included an additional methods paragraph in the “Statistical tests” section to describe the bootstrapping procedure used for these significance tests (L. 644):

      “The one exception to this general approach is in Figure 2, where we analyzed the sound discrimination abilities of single neurons. In this case, we computed p-values for each neuron and stimulus independently. First, for each neuron and catch vs. target stimulus pair, we measured d-prime (see Methods: Single neuron evoked activity and d-prime). We generated a null distribution of d-prime values for each neuron-stimulus pair, under each experimental condition by shuffling stimulus identity across trials before computing d-prime (100 resamples). A neuron was determined to have a significant d-prime for a given target vs. catch pair if its actual measured d-prime was greater than the 95th percentile of the null d-prime distribution. Second, for each neuron and catch vs. target stimulus pair, we tested if d-prime was significantly different between active and passive conditions. To test this, we followed a similar procedure as above, however, rather than shuffle stimulus identity, we shuffled active vs. passive trial labels. This allowed us to generate a null distribution of active vs. passive d-prime difference for each neuron and stimulus pair. A neuron was determined to have a significant change in d-prime between conditions if the actual Δ d-prime lay outside the 95% confidence interval of the null Δ d-prime distribution.”
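      The two shuffle tests described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: d-prime is taken to be the difference in mean response divided by the pooled standard deviation, percentiles use a nearest-rank rule, active vs. passive labels are shuffled within each stimulus, and all function names are hypothetical.

```python
import math
import random
import statistics


def d_prime(target, catch):
    """Mean difference over pooled standard deviation (an assumed definition)."""
    pooled_var = 0.5 * (statistics.pvariance(target) + statistics.pvariance(catch))
    if pooled_var == 0:
        return 0.0
    return (statistics.fmean(target) - statistics.fmean(catch)) / math.sqrt(pooled_var)


def percentile(values, q):
    """Nearest-rank percentile of `values` for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def _shuffle_split(rng, first, second):
    """Randomly reassign the pooled trials to two groups of the original sizes."""
    pooled = list(first) + list(second)
    rng.shuffle(pooled)
    return pooled[:len(first)], pooled[len(first):]


def has_significant_d_prime(target, catch, n_resamples=100, seed=0):
    """Test 1: shuffle stimulus identity, compare d-prime to the null 95th percentile."""
    rng = random.Random(seed)
    null = [d_prime(*_shuffle_split(rng, target, catch)) for _ in range(n_resamples)]
    return d_prime(target, catch) > percentile(null, 95)


def has_significant_change(active, passive, n_resamples=100, seed=0):
    """Test 2: shuffle active/passive labels, compare the change to the null 95% interval.

    `active` and `passive` are (target_trials, catch_trials) pairs.
    """
    rng = random.Random(seed)
    observed = d_prime(*active) - d_prime(*passive)
    null = []
    for _ in range(n_resamples):
        # Labels are shuffled within each stimulus (an assumption of this sketch).
        t_act, t_pas = _shuffle_split(rng, active[0], passive[0])
        c_act, c_pas = _shuffle_split(rng, active[1], passive[1])
        null.append(d_prime(t_act, c_act) - d_prime(t_pas, c_pas))
    return observed < percentile(null, 2.5) or observed > percentile(null, 97.5)
```

      With 100 resamples the smallest resolvable p-value is about 0.01, which is adequate for a 95% criterion; the first test is one-sided and the second two-sided, mirroring the wording of the methods text.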

      For Figure 2a, we chose not to indicate significance on the figure to avoid clutter, since the significance for all neurons in the population are shown in panels b-c anyway. Additionally, the difference plot shown in panel a is in units of z-scores, which we believe already gives a raw sense of the significance of the target vs. catch response change per neuron in this example dataset.

      (3) Figure 2 and Supplementary Figure 3. I would consider including some more examples as a Supplementary Figure (and perhaps combining Supp Fig 3 with Fig 2 as a main figure).

      We found no significant or apparent difference in single-neuron properties between A1 and dPEG. Therefore, we decided it is not helpful to plot both A1 and PEG examples in the main text. However, we agree that the ability to see more examples of the raw data could be useful. Therefore, we compiled two supplementary figures (Supplementary Figures 4 and 5) that replicate Figure 2a for all datasets, encompassing A1 and PEG.

      (4) Figure 2a and Supp Fig 3a. I was initially confused that the "delta-spk/sec (z-score)" values had themselves been z-scored, but now I think that they are simply the differences of the two left hand sub-panels. This could be made clear in the figure legend.

      The figure legends have been modified to state the procedure for computing “delta-spk/sec” more clearly. Specifically, we added the following information to the legend (L. 141):

      “Difference is computed as the z-scored response to the target minus the z-scored catch response (resulting in a difference shown in units of z-score).”

      (5) Figure 2b-e and Supp Fig 3b-e. Indicate the time window over which the responses were measured, and the number of neurons.

      Figure legends have been modified to include a sentence clearly stating the time window over which responses were measured. The number of neurons is also now included in the legend and on the figure itself. Furthermore, a brief description of the new statistical testing procedure has been added here (L. 144).

      “Responses were defined as the total number of spikes recorded during the 300 ms of sound presentation (area between dashed lines in panel A). Neurons with a significantly different response to the catch vs. target stimulus are indicated in black and quantified on the respective figure panel.”

      (6) Figure 2. "singe" should read "single"

      Typo in figure label has been fixed.

      (7) Line 144. Figure number is missing (Figure 3B-C).

      The missing figure number has been added to the text.

      (8) Figure 3. Again, the low-contrast grey should be replaced.

      The low-contrast grey has been replaced with dark grey.

      Reviewer #2 (Recommendations For The Authors):

      This study really nicely compares the activity and effects on activity in two areas of the auditory cortex in respect to task-engagement; I think it is, for the most part, very well done.

      A couple of specific recommendations:

      (1) Although I understand 'inf dB' as the SNR, including the actual dB level used in the experiments, would be useful, especially in the case of the inf dB.

      Thank you for this feedback. We agree that clarification about the overall sound level used here would be helpful. We have modified the methods section “Behavioral paradigm” to include the following sentence (L. 450):

      “That is, the masking noise (and distractor stimuli) were always presented with an overall sound level of 60 dB SPL. Infinite (inf) dB trials corresponded to trials where the target tone was presented at 60 dB SPL without any masking noise present, 0 dB to trials where the target was 60 dB SPL, -5 dB to trials where the target was presented at 55 dB SPL etc.”

      In addition, we have modified the main text (L. 82):

      “Animals reported the occurrence of a target tone in a sequence of narrowband noise distractors by licking a piezo spout (Figure 1A, Methods: Behavioral paradigm, distractor stimulus sound level: 60 dB SPL). … We describe SNR as the overall SPL of the target relative to distractor noise level. Thus, an SNR of –5 dB corresponds to a target level of 55 dB SPL while an Inf dB SNR corresponds to a target tone presented without any masking noise.”

      And Figure legend 1 now explicitly states the sound level used in the experiments (L. 104):

      “Variable SNR was achieved by varying overall SPL of the target relative to the fixed (60 dB SPL) distractor noise, e.g., -5 dB SNR corresponds to a 55 dB SPL target with 60 dB SPL masking noise. Infinite (inf) dB SNR corresponds to a target tone presented in isolation (60 dB SPL).”
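
      The SNR convention in the legend above reduces to simple arithmetic, sketched here for reference. The function name `target_level_db_spl` and the returned tuple layout are hypothetical; the 60 dB SPL values come from the text.

```python
import math

DISTRACTOR_LEVEL_DB_SPL = 60.0  # fixed masking-noise level stated in the text

def target_level_db_spl(snr_db):
    """Return (target level in dB SPL, whether masking noise is present)."""
    if math.isinf(snr_db):
        # Inf dB SNR: target tone presented alone at 60 dB SPL, no masker.
        return 60.0, False
    # Finite SNR: target level is the distractor level plus the SNR.
    return DISTRACTOR_LEVEL_DB_SPL + snr_db, True
```

      For example, an SNR of -5 dB maps to a 55 dB SPL target with the masker present, and an infinite SNR maps to a 60 dB SPL target with no masker.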

      (2) I very much appreciate the attempt to disentangle task engagement from generalized arousal state, and specifically, addressing this through the use of pupillometry. However, by focusing the discussion of pupil dynamics solely on the arousal-state aspects of pupil size, the paper doesn't address the increasing evidence suggesting that pupil size may fluctuate based upon a lot of other things, including perceptual events (see Kronemer et al, 2022 for a recent human paper; for auditory: Zekveld et al 2018 (review) and Montes-Lourido et al, 2021; but many many others, too). It would be nice to see either a bit more nuanced discussion of what pupil size may be indicating (easier), or an analysis of the behavior in the context of pupil dynamics (a heavier lift).

      This is a good point. We agree that it is worth mentioning these more nuanced aspects of cognition that may be reflected by pupil size. Therefore, we also analyzed pupil size in the context of behavioral performance (see Supplemental Figure 6) and added the following text to the results (L. 193).

      “In addition to reflecting overall arousal level, pupil size has also been reported to reflect more nuanced cognitive variables such as, for example, listening effort (Zekveld et al., 2014). Furthermore, rodent data suggests that optimal sensory detection is associated with intermediate pupil size (McGinley et al., 2015), consistent with the hypothesis of an inverted-U relationship between arousal and behavioral performance (Zekveld et al., 2014). To determine if this pattern was true for the animals in our task, we measured the dynamics of pupil size in the context of behavioral performance. Across animals, task stimuli evoked robust pupil dilation that varied with trial outcome (Supplemental Figure 6b-c). Notably, pre-trial pupil size was significantly different between correct (hit and correct reject), hit, and miss trials (Supplemental Figure 6b-c), recapitulating the finding of an inverted-U relationship to performance in rodents (McGinley et al., 2015). Since we focused only on correct trials in our decoding analysis, these outcome-dependent differences in pupil size are unlikely to contribute to the emergent decoding selectivity in dPEG.”

      (3) I think it would make this paper shine that much more if behavioral performance were not subsumed into the overall label of task engagement. You've already established you have performance that varies as a function of SNR; I would love to see the neural d' and covariability related to the behavioral d' (in the comparisons where this is possible). I would also love to see a more direct measure of choice for those stimuli that show variable behavior (e.g., a choice probability analysis or something of the like would seem to be easily applied to the target SNRs of -5 and 0 dB); and compare task engaged activity of hits vs misses vs passive listening to those same stimuli. You discuss previous studies looking at choice-related/decision-related activity and draw parallels to this work; given that there is the opportunity with this data set to *directly* assess choice-related activity, the absence of such an analysis seems like a missed opportunity.

      Thank you for this feedback. We agree that “task engagement” is not a unimodal state and that a more fine-grained analysis of task-engaged neural activity, according to behavioral choice, could be informative.

      First, we would like to point out that in Figure 4 we did already compare behavioral d’ to delta neural d’. We found that the two were significantly correlated in dPEG, but not in A1. This suggests that task-dependent changes in stimulus decoding in dPEG, but not A1, are predictive of behavioral performance. This is consistent with the finding that task-relevant stimulus representations were selectively enhanced in dPEG, but not in A1.

      Second, we added a choice decoding analysis to address whether auditory cortex represents the animal’s choice in our task. The results of this analysis are summarized in Supplemental Figure 8 and are discussed under the results section: “Behavioral performance is correlated with neural coding changes in non-primary auditory cortex only.” (L. 226):

      “The previous analysis suggests that the task-dependent increase in stimulus information present in dPEG population activity is predictive of overall task performance. Next, we asked whether the population activity in either brain region was directly predictive of behavioral choice on single hit vs. miss trials. To do this, we conducted a choice probability analysis (Methods). We found that in both brain regions choice could be decoded well above chance level (Supplemental Figure 8). Choice information was present throughout the entire trial and did not increase during the target stimulus presentation. This suggests that the difference in population activity primarily reflects a cognitive state associated with the probability of licking on a given trial, or “impulsivity” rather than “choice.” This interpretation is consistent with our finding that baseline pupil size on each trial is predictive of trial outcome (Supplemental Figure 6b).”

      To keep our decoding approach consistent throughout the manuscript, we followed the same approach for choice decoding as we did for stimulus decoding (perform dDR then calculate neural d-prime in the dimensionality reduced space). To make the results more interpretable, we converted choice d-prime to a choice probability (percent correctly decoded choices) using leave-one-out cross validation. (We note that d-prime and percent correct are very highly correlated statistics.) This is described in the methods as follows (L. 550):

      “We performed a choice decoding analysis on hit vs. miss trials. We followed the same procedure as described above for stimulus decoding, where instead of a pair of stimuli our two classes to be decoded were “hit trial” vs. “miss trial”. That is, for each target stimulus we computed the optimal linear discrimination axis separating hit vs. miss trials (Abbott and Dayan, 1999) in the reduced dimensionality space identified with dDR (Heller and David, 2022). For the sake of interpretability with respect to previous work we reported choice probability as the percentage of correctly decoded trial outcomes rather than d-prime. Percent correct was calculated by projecting the population activity onto the optimal discrimination axis and using leave-one-out cross validation to measure the number of correct classifications.”
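
      The leave-one-out step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the dDR step is omitted and the inputs are assumed to already be trial-by-dimension arrays in the reduced space, the discrimination axis is taken to be the standard linear discriminant (inverse pooled covariance times the difference of class means), and the name `loo_choice_probability` is hypothetical.

```python
import numpy as np

def loo_choice_probability(hit, miss):
    """Percent of hit/miss trials correctly decoded with leave-one-out CV.

    hit, miss: arrays of shape (n_trials, n_dims), assumed to be in the
    reduced-dimensionality space already.
    """
    X = np.vstack([hit, miss])
    is_hit = np.concatenate([np.ones(len(hit), bool), np.zeros(len(miss), bool)])
    n_correct = 0
    for i in range(len(X)):
        train = np.ones(len(X), bool)
        train[i] = False  # hold out one trial
        Xh, Xm = X[train & is_hit], X[train & ~is_hit]
        mu_h, mu_m = Xh.mean(axis=0), Xm.mean(axis=0)
        # Pooled covariance of the two classes, estimated without the held-out trial.
        cov = 0.5 * (np.atleast_2d(np.cov(Xh, rowvar=False))
                     + np.atleast_2d(np.cov(Xm, rowvar=False)))
        axis = np.linalg.pinv(cov) @ (mu_h - mu_m)  # linear discrimination axis
        threshold = 0.5 * (mu_h + mu_m) @ axis       # midpoint of projected means
        predicted_hit = (X[i] @ axis) > threshold
        n_correct += int(predicted_hit == is_hit[i])
    return 100.0 * n_correct / len(X)
```

      Fitting the axis without the held-out trial keeps the percent-correct estimate from being inflated by overfitting, which matters when the number of hit or miss trials is small.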

      (4) It would also be interesting to look at population coding across sessions (although the point is taken that within a session allows the opportunity to assess covariability). Minorly self-servingly but very much related to the above point, Christison-Lagay et al, 2017 employed a similar detect-in-noise task, analyzed single neurons and population level activity, and looked at putative choice-related activity. The current study has the opportunity to expand on that kind of analysis that much more by looking across multiple sites vs within a given recording site; and compare across regions.

      Thank you for highlighting this point, we agree that it is important. When studying population coding it is critical to consider the impact of covariability between neurons. Therefore, it is worthwhile to revisit our interpretations of prior results, e.g., Christison-Lagay et al, 2017, which studied population coding by combining neurons across different sessions, given that we now have access to simultaneously recorded population data.

      First, we would like to point out that this was the primary motivation for our simulation analyses presented in Figure 5. Using simulations, we found that task-dependent gain modulation (which can be observed across sessions) was sufficient to explain our primary finding – selective enhancement in decoding of behaviorally relevant sound stimuli in dPEG.

      Second, to address the question about how covariability affects choice-related information in auditory cortex and compare our findings with prior studies, we performed the same set of simulations for choice probability analysis. We found that, again, choice-dependent gain modulation was sufficient to explain our findings. That is, simulations with hit- vs. miss-dependent gain changes, but fixed covariability, closely mirrored the choice probability we observed in the raw data. An additional simulation where covariability between all neurons was set to zero also recapitulated our findings in the raw data. Collectively, this suggests that covariability does not play a significant role in shaping the choice information present in A1 and dPEG during this task. We have added the following text to the manuscript to summarize this finding (L. 293):

      “Finally, we used the same simulation approach to determine what aspects of population activity carry the “choice” related information we observed in A1 and dPEG (Figure 4 – figure supplement 1). Similar to our findings for stimulus decoding, we found that gain modulation alone was sufficient to recapitulate the choice information present in the raw data for this task. This helps frame prior work that pooled neurons across sessions to study population coding of choice in similar auditory discrimination tasks (Christison-Lagay et al, 2017).”
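
      One common way to build a surrogate dataset with no covariability between neurons, shown here only as a hedged sketch, is to shuffle trial order independently for each neuron within a single condition. This preserves each neuron's response distribution (and therefore any gain change in its mean) while removing trial-by-trial correlations. The simulations in the manuscript may have been constructed differently; the function name `remove_covariability` is hypothetical.

```python
import numpy as np

def remove_covariability(responses, seed=0):
    """Shuffle trials independently per neuron.

    responses: array of shape (n_trials, n_neurons) for ONE condition
    (for example, one stimulus on hit trials).
    """
    rng = np.random.default_rng(seed)
    responses = np.asarray(responses, dtype=float)
    shuffled = np.empty_like(responses)
    for n in range(responses.shape[1]):
        # Each neuron keeps its own values but in an independent trial order.
        shuffled[:, n] = rng.permutation(responses[:, n])
    return shuffled
```

      Decoding the shuffled data with the same procedure as the raw data then indicates how much of the decoded information depends on covariability rather than on single-neuron response changes.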

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript presents a solid and generally convincing set of experiments to address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly important in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents a valuable finding by pharmacologically mapping the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

      Thanks for the positive feedback on our study, as well as the assessment of the novelty of our investigation and the advancements to the field that these results will bring in the future.

      We have addressed the specific comments and made changes to the manuscript as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      The main focus of the current study is to identify the anatomical core of an expiratory oscillator in the medulla using pharmacological disinhibition. Although expiration is passive in normal eupneic conditions, activation of the parafacial (pFL) region is believed to evoke active expiration in conditions of elevated ventilatory demands. The authors and others in the field have previously attempted to map this region using pharmacological, optogenetic, and chemogenetic approaches, which present their own challenges.

      In the present study, the authors take a systematic approach to determine the precise anatomical location within the ventral medulla's rostrocaudal axis where the expiratory oscillator is located. The authors used a bicuculline (a GABA-A receptor antagonist) and fluorobeads solution at 5 distinct anatomical locations to study the effects on neuronal excitability and functional circuitry in the pFL. The effects of bicuculline on different phases of the respiratory cycle were characterized using a multidimensional cycle-by-cycle analysis. This analysis involved measuring the differences in airflow, diaphragm electromyography (EMG), and abdominal EMG signals, as well as using a phase-plane analysis to analyze the combined differences of these respiratory signals. Anatomical immunostaining techniques were also used to complement the functional mapping of the pFL.

      Major strengths of this work include a robust study design, complementary neurophysiological and immunohistochemical methods, and the use of a novel phase-plane analysis. The authors construct a comprehensive functional map revealing functional nuances in respiratory responses to bicuculline along the rostrocaudal axis of the parafacial region. They convincingly show that although bicuculline injections at all coordinates of the pFL generated an expiratory response, the most rostral locations in the lateral parafacial region play the strongest role in generating active expiration. These were characterized by a strong impact on the duration and strength of ABD activation and a robust change in tidal volume and minute ventilation. The authors also confirmed histologically that none of the injection sites overlapped grossly with PHOX2B+ neurons, thus confirming the specificity of the injections in the pFL and not the neighboring RTN.

      Collectively, these findings advance our understanding of the presumed expiratory oscillator, the pFL, and highlight the functional heterogeneity in the functional response of this anatomical structure.

      Thanks for the positive feedback on the results presented in the current manuscript.

      Reviewer #2 (Public Review):

      Summary:

      Pisanski and colleagues map regions of the brainstem that produce the rhythm for active expiratory breathing movements and influence their motor patterns. While the neural origins of inspiration are very well understood, the neural bases for expiration lag considerably. The problem is important and new knowledge pertaining to the neural origins of expiration is welcome.

      The authors perturb the parafacial lateral (pFL) respiratory group of the brainstem with microinjection of bicuculline, to elucidate how disinhibition in specific locations of the pFL influences active expiration (and breathing in general) in anesthetized rats. They provide valuable, if not definitive, evidence that the borders of the pFL appear to extend more rostrally than previously appreciated. Prior research suggests that the expiratory pFL exists at the caudal pole of the facial cranial nucleus (VIIc). Here, the authors show that its borders probably extend as much as 1 mm rostral to VIIc. The evidence is convincing albeit with caveats.

      Strengths:

      The authors achieve their aim in terms of showing that the borders of the expiratory pFL are not well understood at present and that it (the pFL) extends more rostrally. The results support that point. The data are strong enough to cause many respiratory neurobiologists to look at the sites rostral to the VIIc for expiratory rhythmogenic neurons and characterize their properties and mechanisms. At present my view is that most respiratory neurobiologists overlook the regions rostral to VIIc in their studies of expiratory rhythm and pattern.

      Weaknesses:

      The injection of bicuculline has indiscriminate effects on excitatory and inhibitory neurons, and the parafacial region is populated by excitatory neurons that are expiratory rhythmogenic and GABA and glycinergic neurons whose roles in producing active expiration are contradictory (Flor et al. J Physiol, 2020, DOI: 10.1113/JP280243). It remains unclear how the microinjections of bicuculline differentially affect all three populations. A more selective approach would be able to disinhibit the populations separately. Nevertheless, for the main point at hand, the data do suggest that we should reconsider the borders of the expiratory pFL nucleus and begin to examine its physiology up to 1 mm rostral to VIIc.

      The control experiment showed that bicuculline microinjections induced cFos expression in the pFL, which is good, but again we don't know which neurons were disinhibited: glutamatergic, GABAergic, or glycinergic.

      Thanks for sharing your excitement about the results of our study, and for appreciating the thorough investigation performed with the use of bicuculline, an approach that was originally used in Pagliardini et al. (2011; PMID: 21414911) and then used by many other groups to generate and study active expiration in vivo.

      In the current study we used the well-known effect of bicuculline to systematically identify the area that is most sensitive to this pharmacological effect, and hence may be the core for generating active expiration. While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor-expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very little distribution of GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139), and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017, PMID: 28004411; Silva et al., 2016, PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype. We can’t exclude that a small fraction of neurons in the pFL area are inhibitory, and that they could influence recruitment of adjacent late-E expiratory neurons. A more selective activation of neuronal populations with different phenotypes would indeed be interesting. Nonetheless, if local inhibitory neurons have a role in the generation of active expiration, then their disinhibition could either have an inhibitory effect on late-E activity or stimulate expiration in a more indirect fashion.

      While the effect of bicuculline on active expiration has been reported and replicated in multiple manuscripts, the source of inhibition across different phases of the respiratory cycle is still under investigation. Some studies suggest that GABAergic and glycinergic inhibition does not originate in the pFL but rather in the BötC and preBötC areas (Flor et al., 2020, PMID: 32621515; Magalhaes et al., 2021; PMID: 34510468), and the effects of this inhibition across the respiratory cycle are debated. Future studies will be key to identify the source of pFL inhibition.

      The manuscript characterizes how bicuculline microinjections affect breathing parameters such as tidal volume, frequency, ventilation, inspiratory and expiratory time, as well as oxygen consumption. Those aspects of the manuscript are a bit tedious and sometimes overanalyzed. Plus, there was no predictive framework established at the outset for how one should expect disinhibition to affect breathing parameters. In other words, if the authors are seeking to map the pFL borders, then why analyze the breathing patterns so much? Does doing so provide more insight into the borders of pFL? I did not think it was compellingly argued.

      We have edited the introduction to address this comment and emphasize the rationale for the study. We also edited the results section to summarize our findings.

      We continue to report our in-depth analysis of the perturbations induced by bicuculline injection over the various respiratory characteristics as this will be fundamental to determine the effects of our experiment not only on the activation of pFL and active expiration, but also on the respiratory network in general. In order to be fair and open about our findings we have reported the results of our analysis in detail. Of note, all sites generated active expiration, but since the objective of the study was to determine the sites with the most significant changes, a finer and multilevel analysis has been used.

      Further, lines 382-386 make a point about decreasing inspiratory time even though the data do not meet the statistical threshold. In lines 386-395, the reporting appears to reach significance (line 388) but not reach significance (line 389). I had trouble making sense of that disparity.

      The statistics were confirmed, and the lines edited as follows: “Interestingly, the duration of inspiration during the response was found to decrease in all groups relative to baseline respiration (Ti response = 0.279 ± 0.034s, Ti baseline = 0.318 ± 0.043s, Wilcoxon rank sum: Z = 3.24, p = 0.001). Contrary to this decrease in inspiratory duration, the total expiratory time was observed to increase in all groups and remained elevated compared to baseline (TE response = 1.313 ± 0.188s, TE baseline = 1.029 ± 0.161s, Wilcoxon rank sum: Z = 4.49, p = 0.001).”

      The other statistical hiccups include "tended towards significance" (line 454), "were found to only reach significance for a short portion of the response" (line 486-7), "did not reach the level of significance" (line 506), which gives one the sense of cherry picking or over-analysis. Frankly, this reviewer finds the paper much more compelling when just asking whether the microinjections evoke active expiration. If yes, then the site is probably part of the pFL.

      Statistical “tendencies” have been eliminated throughout the manuscript.

      We have analyzed our results in detail in order to determine changes and differential effects on respiration when comparing the 5 injection sites. Although the presentation of the results may seem tedious, it has allowed us to highlight some interesting effects. The first concerns respiratory frequency. It has been shown in the past that optogenetic stimulation of this area causes an increase in respiratory frequency (Pagliardini et al., 2011, PMID: 21414911), whereas disinhibition with this same approach or stimulation of AMPA receptors in the pFL has produced a reduction in frequency or no significant change in the response (Pagliardini et al., 2011, PMID: 21414911; Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151). Here, we suggest that the reduction in respiratory frequency is observed only at the caudal sites and could be attributed to BötC effects rather than to stimulation of the core of the pFL, since no respiratory change was observed where the effect was more potent (rostral side). The second concerns O2 consumption: although difficult to interpret at this point, we found it very interesting that hyperventilation occurred only at the most rostral injection sites.

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size.

      Thank you for the feedback on our description of the statistical results and the suggestion of incorporating effect size. We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
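
      For readers unfamiliar with the two effect-size measures named above, a minimal sketch follows. It assumes the simplest cases: eta squared for a one-way design (between-group sum of squares divided by total sum of squares) and the Mahalanobis distance of a single point from a reference cloud such as baseline breaths. The ANOVA design and the exact way the distance enters the phase-plane analysis are not specified in this response, and the function names are hypothetical.

```python
import numpy as np

def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total."""
    all_values = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand_mean = all_values.mean()
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = np.sum((all_values - grand_mean) ** 2)
    return float(ss_between / ss_total)

def mahalanobis_distance(point, reference):
    """Distance of `point` from the mean of `reference`, scaled by its covariance.

    reference: array of shape (n_observations, n_dims), n_dims >= 2.
    """
    reference = np.asarray(reference, dtype=float)
    diff = np.asarray(point, dtype=float) - reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))
```

      Eta squared ranges from 0 (group means identical) to 1 (all variance lies between groups), while the Mahalanobis distance expresses a multidimensional shift in units of the reference variability.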

      Reviewer #3 (Public Review):

      Summary:

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in controlling active expiration. Stereotactic injections of bicuculline were utilized to map various pFL sites and their impact on respiration. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study indicates that the rostrocaudal organization of the pFL and its influence on breathing is not simple and uniform.

      Strengths:

      The data provide novel insights into the importance of rostral locations in controlling active expiration. The authors use innovative analytic methods to characterize the respiratory effects of bicuculline injections into various areas of the pFL.

      Weaknesses:

      Bicuculline injections increase the excitability of neurons. Aside from blocking GABA receptors, bicuculline also inhibits calcium-activated potassium currents and potentiates NMDA current, thus insights into the role of GABAergic inhibition are limited.

      Increasing the excitability of neurons provides little insights into the activity pattern and function of the activated neurons. Without recording from the activated neurons, it is impossible to know whether an effect on active expiration or any other respiratory phase is caused by bicuculline acting on rhythmogenic neurons or tonic neurons that modulate respiration. While this approach is inappropriate to study the functional extent of the conditional "oscillator" for active expiration, it provides valuable insights into this region's complex role in controlling breathing.

      We have included a reflection of the weaknesses of our studies in the technical consideration section to address the possibility that bicuculline may induce active expiration through other mechanisms. Please note that the use of bicuculline was not to gain further insight on GABAergic inhibition of pFL but to adopt a tool to generate active expiration that has been extensively validated by our group and others.

      Multiple studies have shown recruitment of excitatory late expiratory neurons with bicuculline injections. Although we did not record from late-E neurons in this study, we infer from the body of literature that disinhibition of neurons in this area will activate late-E neurons (as previously demonstrated) and generate active expiration. Although we see value in recording activity of single neurons (especially to study mechanisms of rhythmogenesis), we opted to measure the physiological response from respiratory muscles as an indication of active expiration recruitment in vivo. Recording from single neurons after bicuculline injections in each site would confirm the presence of expiratory neurons along the parafacial area, which is probably not surprising, since every site tested promoted active expiration. The focus of the study though was to determine the site with the strongest physiological response to disinhibition. Future studies will be key to determine whether all neurons along this column have similar electrophysiological rhythmic properties to the ones recently reported (Magalhaes et al., 2021; PMID: 34510468), or some of them simply provide tonic drive to late-E neurons located elsewhere.

      We have discussed the issue as follows:

      “Our experiments focused on determining the area in the pFL that is most effective in generating active expiration as measured by ABD EMG activity and expiratory flow. We did not attempt to record single cell neuronal activity at various locations as previously shown in other studies (Pagliardini et al 2011; Magalhaes et al., 2021), as this approach would most likely find some late-E neurons across the pFL and thus not effectively discriminate between areas of the pFL. Future studies involving multi-unit recordings or imaging of cell population activities will help to determine the firing pattern and population density of bicuculline-activated cells and further determine differences in distribution and function of late-E neurons across the region of the pFL.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Overall, the manuscript addresses an important question in the field, the anatomical location of the expiratory oscillator. I commend the authors for a well-thought-out and clearly presented study. However, a few small concerns deserve attention to improve the clarity of the report.

      (1) The figures would benefit from a rostral-to-caudal representation of results instead of a caudal-to-rostral orientation. Example, Figure 2.

      We opted for a caudal to rostral representation to progressively move away from the inspiratory oscillator (preBötC) and the anatomical reference point (the caudal tip of the facial nucleus) with our series of injections. 

      (2) A discussion about how expiratory responses generated by these pharmacological approaches would compare to endogenous baseline conditions. The authors mention that bicuculline injections elicited a late-E downward inflection that was absent in baseline conditions. Thus, this raises the point of how these findings compare to awake freely moving animals or during different conditions of increased ventilatory demand.

      This is an interesting question that has not yet been addressed in the field. As far as we know, there are no recordings of pFL neurons in freely behaving animals, although recordings of pFL late-E neurons under elevated PaCO2 have shown late-E activity in in situ preparations (Britto & Moraes, 2017; PMID: 28004411; Magalhaes et al., 2021; PMID: 34510468).

      We have clarified this in the discussion as follows:

      “At rest, respiratory activity does not present with active expiration (i.e., expiratory flow below its functional residual capacity in conjunction with expiratory-related ABD muscle recruitment) and expiratory flow occurs due to passive recoil of the chest wall with no contribution of abdominal activity. Active expiration and abdominal recruitment can be spontaneously observed during sleep (in particular REM sleep; Andrews and Pagliardini, 2015; Pisanski et al., 2019) and can be triggered during increased respiratory drive (e.g., hypercapnia, RTN stimulation; Abbott et al., 2011). Although never assessed in freely moving, unanesthetized rodents, bicuculline has been extensively used to generate active expiration and late-E neuron activity in both juvenile and adult anesthetized rats (Pagliardini et al., 2011; Huckstepp et al., 2015; Huckstepp et al., 2016; Huckstepp et al., 2018; De Britto and Moraes, 2017; Magalhaes et al., 2021).”

      (3) In Figure 2A, there appears to be an injection site in the top right quadrant of the image, very distant from the intended site. Could the authors confirm if this is an artifact?

      Yes, it is an artifact of image acquisition; we should have marked that in the figure. To avoid confusion and follow other reviewers’ suggestions, we have edited the figure.

      (4) A stylistic suggestion would be to include the subpanel of Figure 2C saline control injection as a graph of its own and also include the control anatomical location in 2B.

      Thanks for the suggestion. Because of the complex organization of the figure, we opted to leave it as a subpanel so as not to distract the reader from the 5 injection sites, while still providing information about the vehicle injection and its lack of effect on the respiratory response.

      (5) The authors note that DIAm Area (norm.) during the inspiratory phase is increased in the +6 and +8mm groups. However, Figure 5E shows that the +8mm group is significantly reduced as compared to the +6mm group. Please clarify.

      During the inspiratory phase we did not observe any significant change in the DIA Area (norm.). We realize that the description of this part of the results was confusing and therefore we have eliminated that section.

      Reviewer #2 (Recommendations For The Authors):

      I encourage the authors to consider the fickleness of p-values in general and urge them to consider not just p but also effect size. There is a valuable editorial in this week's J Physiology (https://doi.org/10.1113/JP285575) that may provide helpful guidance.

      Thanks for these comments and the general assessment. We realized that the results section was dense and contained a lot of information. We have significantly slimmed the description of the results in order to make them easier to appreciate and to avoid confounding statements about significant vs non-significant results.

      We have now included measurements of effect size in the results section. Specifically, we calculated the effect size within each ANOVA using the value of eta squared for all data shown in Figures 3 and 4. Please note that in our phase-plane analysis (Fig. 5-6) the Mahalanobis distance is itself an effect size measure for multidimensional data. We also note that statistical evaluation using non-parametric analyses does not involve effect sizes.
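
      As a rough illustration of the effect size mentioned above, eta squared for a one-way ANOVA is the between-group sum of squares divided by the total sum of squares. The sketch below is not the authors' script; the function name and the example values are hypothetical and only show the arithmetic.

```python
from statistics import mean

def eta_squared(groups):
    """Eta squared for a one-way ANOVA: SS_between / SS_total."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Total sum of squares around the grand mean
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    return ss_between / ss_total

# Hypothetical values for three groups (not data from the study)
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(eta_squared(example_groups))
```

      Values close to 1 mean that most of the variance is explained by group membership (here, injection site), while values close to 0 mean that the groups overlap almost completely.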

      The equipment and resources should be clearly identified and use RRIDs whenever possible. Resources like antibodies and other reagents (e.g., cryoprotectants) should be identified, not just by manufacturer, but also by specific part or product numbers or identifiers.

      Manuscript has been edited to add these details.

      The manuscript makes reference to ImageJ and Matlab routines, which must be public through GitHub or another stable repository.

      Thanks for pointing this out. ImageJ analysis was performed using scripts already available to users (no custom scripts). The Matlab scripts used for the multivariate analysis are now available at: https://github.com/mprosteb/Pisanski2024

      The way that ABD-DIA coupling was assessed was unclear from the Methods.

      The following text has been added to the methods: “The coupling between ABD and DIA signals was measured as a ratio and analyzed by quantifying the number of bursts of activity observed for the ABD and DIA EMG signals during the first 10 minutes of the response, excluding time bins at end of the response (due to fading and waning of the ABD response in those instances).”
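
      The coupling measure described in this passage can be sketched as a ratio of burst counts. This is a minimal sketch under stated assumptions: burst onset times are taken as already detected, the analysis window is the first 10 minutes (600 s), and the function name, the `excluded_bins` parameter, and the example times are hypothetical rather than taken from the authors' analysis.

```python
def coupling_ratio(abd_burst_times, dia_burst_times, window_s=600.0, excluded_bins=()):
    """Ratio of ABD to DIA burst counts within the analysis window.

    excluded_bins is an iterable of (start_s, end_s) intervals to skip,
    for example bins at the end of a response where ABD activity fades.
    """
    def keep(t):
        if not 0.0 <= t < window_s:
            return False
        return not any(start <= t < end for start, end in excluded_bins)

    n_abd = sum(1 for t in abd_burst_times if keep(t))
    n_dia = sum(1 for t in dia_burst_times if keep(t))
    if n_dia == 0:
        raise ValueError("no DIA bursts in the analysis window")
    return n_abd / n_dia

# Hypothetical burst onset times in seconds (not data from the study)
dia_times = [i * 1.5 for i in range(10)]
abd_times = [i * 3.0 + 1.0 for i in range(5)]
print(coupling_ratio(abd_times, dia_times))
```

      A ratio of 1 would correspond to one ABD burst per DIA burst, and lower values to ABD bursts occurring on only some respiratory cycles.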

      Fig. 1A was never cited in the text.

      It has been cited now.

      Fig. 1A-C appears to be exactly the same as Fig. 5A-C.

      The reviewer is correct. We have used figure 1 to describe and explain our analytical methods with sample data and Figure 5 describes our results. We have clarified that in: “Figure 5: Rostral injections elicit more prominent changes to respiration in each signal and sub-period. A-C: Is the same as Method Figure 1, has been included here for further clarity when analyzing the results.”

      Late Expiratory airflow is given in units of volts (V) in lines 358-363 (Fig. 4C) but then in units of volts-seconds (V•s) in lines 363-367. Both units are problematic because the voltage is neither an air volume nor an air volume per unit time. Is there some conversion factor left out?

      In this section of the results we describe the changes in expiratory peak amplitude (V) and expiratory peak flow (V•s). Since calibration of airflow was performed on the positive flow and for larger volumes, we prefer to use the original units to guarantee precise assessment of the change and avoid introducing potential errors. Since the analysis considers changes from baseline readings, converting to ml or ml*s would not affect our analysis.

      Reviewer #3 (Recommendations For The Authors):

      The study conducted by Pisanski et al investigates the role of the lateral parafacial area (pFL) in respiratory control, specifically in modulating active expiration. The precise location of this expiratory oscillator within the ventral medulla remains uncertain, with some studies indicating that the caudal tip of the facial nucleus (VIIc) forms the core while others propose more rostral areas. Bicuculline injections were utilized at various pFL sites to explore the impact of these injections on respiration. The authors use innovative and impressive analytic methods to characterize the effect on respiratory activity. The results indicate that injections at more rostral pFL locations induce the most robust changes in tidal volume, minute ventilation, and combined respiratory responses. The study will contribute to an enhanced understanding of the neural mechanisms controlling active expiration. The main message of the study is that the rostro-caudal organization of the pFL is not simple and uniform. The data provides novel insights into the importance of rostral locations in controlling active expiration (see e.g. lines 738-740).

      The data and results of the paper are intriguing, and it appears that the experiments are well-managed and executed. However, there are several major and minor comments and suggestions that should be addressed by the authors:

      (1) The study relies heavily on local injections into specific areas that are confirmed histologically. One potential concern is the injection volume of 200 nL in such a tiny area. The authors suggest that the drug did not spread to rostral/caudal areas outside the specified coordinate partly based on their cFOS staining. For example, the lack of cFOS activation in TH+ cells and Phox2B cells is interpreted as proof that bicuculline did not spread to these somas (Figure 2). The authors seem to use a similar argument as evidence that the pFL does not include Phox2B neurons in the RTN as discussed in the Discussion section (lines 830-847). However, it is very surprising that bicuculline injections into an area that is known to contain Phox2B and Th+ neurons do not activate these neurons as assessed by the cFOS staining. It seems puzzling to me that none of their injections shown in Figure 2 activated Phox2B or Th neurons. I assume that in targeting the pFL the authors must have sometimes hit areas that included neurons that define the RTN, which would have activated Phox2B or Th+ neurons. Did the authors find that these activations did not activate active expiration? Such negative "controls" would strengthen their argument that pFL is a separate and distinct region that selectively controls active expiration.

      Thanks for the positive feedback on the manuscript. As has been demonstrated and discussed in several previous publications, PHOX2B-expressing neurons in this area of the brain comprise the RTN Neuromedin B positive neurons (more densely located in the ventral parafacial rather than the lateral parafacial, our site of injection), the TH+ C1 neurons (located in a somewhat more caudal and medial position compared to our sites of injection, around the BötC/preBötC area) and the large facial MNs (easily identifiable by their large size and compact location). Given this differential spatial distribution, and the controls described below, we believe we have reduced the possibility of direct activation of these neurons, although we can’t exclude it in full.

      There is now strong evidence for the lack of PHOX2B expression in late-E neurons in juvenile and adult rats (Magalhaes et al., 2021; PMID: 34510468). We realize that the microinjected solution could potentially diffuse in the brain and reach other areas, but we combined two strategies to verify that the injection was focal and activated only a restricted area of the brain (i.e., the pFL): i) localization of fluorobeads that were diluted in the bicuculline solution; ii) expression of cFos combined with anatomical markers, to identify activated cells. Fluorobeads have a very limited spread in the brain and therefore informed us of the site of the injection, allowing us to differentiate between the five injection locations. Although we can’t assume that bicuculline will have a similar spread (and it will also be quickly degraded in the tissue), the combination of this analysis with the localized expression of cFos has helped us to differentiate between injection sites. Because of the proximity of PHOX2B cells in the RTN and C1 neurons, we also combined cFos expression with immunohistochemistry to determine whether bicuculline activation was also visible in these two neuronal populations. Our results indicate that there is baseline cFos activity in RTN neurons (see vehicle injection), but the fraction of PHOX2B activated cells did not increase with bicuculline injections, suggesting that these neurons were not the target of our injections. Please note that cFos expression has been extensively used to determine RTN neuron activation, especially following chemoreflex responses.

      (2) The authors refer to "the expiratory oscillator" throughout the manuscript (e.g. lines 58, 62, 65) as if there is only one expiratory oscillator i.e. "the expiratory oscillator". For some reason, the authors avoided citing and mentioning PiCo (Anderson et al. 2016), which is considered the oscillator for postinspiration. Since the present study focuses on the role of expiration, and since the authors describe convincing effects on postinspiration, considering this oscillator which is located dorsomedial to the VRC seems relevant for the present study.

      Due to the limited and controversial literature currently describing PiCo as a third oscillator, and the fact that our studies do not directly assess post-inspiratory activity (as measured by the V nerve or laryngeal muscles) or PiCo activity and location (which would be even more distant than the RTN, for example), we prefer to avoid commenting on the effects of this injection on PiCo or the connectivity between PiCo and pFL.

      We have added this to the discussion:

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (3) The authors do not specify what type of bicuculline they injected. Bicuculline is known to have significant effects on potassium channels. Thus, the effects reported here could be due to a non-specific change in excitability, rather than caused by a specific GABAergic blockade.

      The authors also do not know what effects these injections cause in the neurons in vivo, since the injections are not accompanied by recordings from the respiratory neurons that they activate. This together with the non-specific bicuculline effects will affect the interpretation of the results. Thus, the authors need to be more careful when interpreting their effects as "GABAergic". The use of more specific blockers like gabazine could partly address this concern. The authors have to discuss this in a "limitation section".

      Thanks for pointing that out; we have now clarified in the methods section that we used bicuculline methochloride. We can’t exclude that some side effects could be present due to the use of this drug. For the purpose of this study, though, we focused on using bicuculline as a tool to consistently generate active expiration, since it has been extensively used by multiple laboratories to induce abdominal muscle recruitment and active expiration, as well as to directly record late-E neurons in this same area.

      We have included in the discussion the following statement:

      “Technical considerations

      Bicuculline methiodide has previously been observed to exhibit inhibitory effects on Ca2+-activated K+ currents, inducing non-specific potentiation of NMDA currents (Johnson and Seutin, 1997). Consequently, caution is warranted in attributing our findings solely to the GABAa antagonist properties of bicuculline. Previous work has demonstrated a temporal correlation between the onset of late-E neuron activity in the caudal parafacial region and ABD activity in response to bicuculline (Pagliardini et al., 2011; de Britto and Moraes, 2017; Magalhaes et al., 2021) as well as GABAergic sIPSCs in late-E neurons (Magalhaes et al., 2012). However, it is essential to note that the current study lacks single unit recording, preventing us from definitively confirming whether the observed activity stems from late-E neuronal GABAergic disinhibition or excitation through non-GABAergic mechanisms.”

      (4) I also caution the authors when stating that the bicuculline injections will reveal the precise location and functional boundaries of "the" expiratory oscillation within the pFL. Increasing the excitability with bicuculline is inappropriate to study the functional boundaries of an oscillator. It is particularly inappropriate to identify the boundaries of the pFL, a network that is normally inactive and activated only under certain behavioral and metabolic conditions. Because the injections are increasing the neuronal excitability unspecifically, and because the authors are not recording the activity of the neurons in the pFL region it is unclear what kind of neurons are activated. The cFOS staining may help to define whether these neurons are Phox2B or Th positive or negative, but they will not provide insights into the activity patterns of the activated neurons. Thus, it is fair to assume that these injections will likely include also tonic neurons that might indirectly control the activity of pFL neurons under certain metabolic or behavioral conditions without actually being involved in the rhythmogenesis of active expiration. Many of the effects peak after several minutes, and different regions cause differential effects with different time courses, which is difficult to interpret functionally. Thus, the "core" identified in the present study could consist of tonic neurons as opposed to rhythmic neurons generating active expiration.

      We agree with the reviewer that our local injections may have activated a heterogeneous population of neurons. We do not claim that we only activated late-E rhythmogenic neurons, but that our multiple sites of injection revealed the area that generates the strongest excitation of ABD muscles and active expiration.

      While the use of GABA receptor antagonists may have an indiscriminate effect on GABA receptor expressing neurons with various phenotypes, anatomical assessment of inhibitory cells has shown very little distribution of GABAergic and glycinergic cells in the parafacial area (Tanaka et al., 2003; PMID: 14512139) and it has been inferred in multiple publications (Huckstepp et al., 2015, PMID: 25609622; Huckstepp et al., 2016, PMID: 27300271; Huckstepp et al., 2018, PMID: 30096151; Flor et al., 2020, PMID: 32621515; Britto & Moraes, 2017; PMID: 28004411; Silva et al., 2016; PMID: 26900003) and demonstrated recently (Magalhaes et al., 2021; PMID: 34510468) that late-E neurons in the parafacial region are excitatory and have a glutamatergic phenotype.

      As suggested by the reviewer, it is possible that the bicuculline injection may have activated some tonic, non-rhythmogenic neurons, which could activate an expiratory oscillator located elsewhere.

      We have edited the introduction as follows:

      “By strategically administering localized volumes of bicuculline at multiple rostrocaudal levels of the ventral brainstem, we aimed to selectively enhance the excitability of neurons driving active expiration, thereby revealing the extension of the pharmacological response and the most efficient site in generating active expiration.”

      We have edited the results as follows:

      “Importantly, the group with injection sites at +0.6 mm from VIIc exhibited the swiftest response onset, suggesting that this area is the most critical for the generation of active expiration, either through direct activation of the expiratory oscillator or, alternatively, for providing a strong tonic drive to late-E neurons located elsewhere.”

      In the introduction, it should also be emphasized that the pharmacological approach used in the present study complements the existing elegant chemogenetic studies, rather than emphasizing primarily the limitations of the chemogenetic inhibitions. The conclusion should be that these studies together provide different, yet complementary insights: The chemogenetic approach by inhibiting neurons, the present study by exciting neurons, and all studies come with their own limitations.

      Thanks for the suggestion, we have updated the manuscript as follows:

      “Although both of these elegant chemogenetic studies have contributed extensively to our understanding of the pFL, the existing evidence suggests that the expiratory oscillator may expand beyond the limits of the viral expression achieved in said studies, as proposed by Huckstepp et al., (2015).”

      Throughout the manuscript, the authors have to be cautious when implying that an excitatory effect relates to the activity of rhythmogenic pFL neurons. For example, on line 710 the authors state that "it is conceivable to infer that the rostral pFL is in the closest proximity to the cells responsible for the generation of active expiration". While it may indeed be "conceivable", the bicuculline injections themselves provide no insights into the location of neurons responsible for rhythmogenesis. It is equally "conceivable" that the excited neurons provide a tonic drive to the neurons without being involved in the generation of active expiration. These tonic neurons could be located at a distance from the presumed rhythmogenic core.

      We have included the possibility of tonic excitation in the technical considerations section:

      “However, our study did not include recording from late-E neurons following bicuculline injections, preventing us from definitively confirming whether the observed activity stems from late-E neuronal excitation or the potentiation of a tonic drive, particularly in the rostral areas.”

      (5) It is intriguing that some of their injections (Fig.2D) evoked postinspiratory activity. This interesting finding should be discussed as it could provide important insights into the coordination of the different phases of expiration.

      Thanks for the suggestion. We have included the following to the discussion:

      “Therefore, although it has previously been described, the exact mechanism by which this post-I ABD activity is generated is unclear. This late-E/post-I pattern of activity is similar to what has been observed in in vitro preparations and in vivo recordings in juvenile rats (Janczewski et al., 2002; Janczewski et al., 2006).

      “Therefore, although it has previously been described, it is currently unknown the exact mechanism by which this post-I activity in the ABD muscles is generated. For example the interplay between the rostral pFL and brainstem structures generating post-inspiratory activity, such as the proposed post-inspiratory oscillator (PiCo; Anderson et al., 2016) or pontine respiratory networks, could be reasonably involved in this process.”

      (6) The authors conducted bilateral disinhibition of the pFL, but only a unilateral photomicrograph was shown. Figure 2 should include a representative bilateral photomicrograph along with a scatter plot for clarity and completeness.

      We have edited figure 2 to include representative images of bilateral injections.

      (7) Regarding the Bicuculline injections in the Methods section: Aside from specifying exactly what type of bicuculline was used, the authors should provide more information about the pFL location and landmarks used, including the missing medial-lateral coordinate. The fluorobead spread of approximately ~300 µm, as observed in Figure 2C, is crucial for the interpretation of the results and should be detailed. An alternative approach could involve e.g. calculating the area covered by fluorobeads in each group.

      We have included the following in the text:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5)”

      “These findings strongly suggest that bicuculline specifically activated cells within the vicinity of the injection sites which spread ~300 µm (Figure 2C, horizontal lines) and did not activate PHOX2B+ cells in the RTN area, beyond their baseline level of activity.”

      (8) In the Experimental Protocol, the authors should provide more details on how the parameters were determined. For example, specify the number of cycles included for Dia frequency/amplitude, Abd frequency/amplitude, and with regards to the averaging process, the authors should specify over how many cycles they obtained an average for Dia/Abd activity time and AUC. The authors should also provide information on the number of bicuculline injections that they repeated to average these values and they should report the coefficient of variation for repeated injections. Please clarify the method used to calculate AUC, considering the non-linear nature of the activity.

      Only one bicuculline injection per rat was performed and the number of rats used for each injection site is indicated in the methods as follows:

      “Each rat was injected at 2.8 mm lateral from the midline and at a specific RC coordinate based on the following groups: -0.2 mm from the caudal tip of the facial nucleus (VIIc) (n=5), +0.1 mm from VIIc (n=7), +0.4 mm from VIIc (n=5), +0.6 mm from VIIc (n=6), +0.8 mm from VIIc (n=5), and CTRL (n=7). We recorded the physiological responses to the injection for 20-25 min.”

      We have clarified in the methods section the following:

      “Respiratory data was tracked in time bins of 2-minute duration from the baseline period prior to injections and spanned 20 min of recording post-injection. Mean-cycle measurements for each signal were computed by averaging values across all cycles within a given time bin.”

      Additional clarifications have been added:

      “We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”

      “Mean-cycle measurements for each signal were computed by averaging across all cycles within a given time bin (~300 cycles in baseline, ~100 cycles per response time bin). We then used the average calculations of respiratory rate (RR), tidal volume (VT), Minute Ventilation (Ve), expiratory ABD amplitude, expiratory ABD area, VO2, VE/VO2 to obtain values relative to the baseline period. Peak responses were identified as the time bin that produced the strongest changes relative to baseline.”
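
      The binning and peak-detection steps quoted above can be sketched as follows. This is an illustrative sketch, not the authors' Matlab code: it assumes per-cycle measurements arrive as (time in seconds, value) pairs, that "relative to baseline" means a simple difference from the baseline mean, and that the "strongest" change is the largest absolute change. All names and example values are hypothetical.

```python
from statistics import mean

BIN_S = 120.0  # 2-minute bins, as stated in the quoted methods text

def bin_means(cycles, bin_s=BIN_S):
    """Group (time_s, value) cycle measurements into bins and average each bin."""
    bins = {}
    for t, value in cycles:
        bins.setdefault(int(t // bin_s), []).append(value)
    return {idx: mean(vals) for idx, vals in sorted(bins.items())}

def peak_response(baseline_values, response_cycles, bin_s=BIN_S):
    """Return (bin index, change) for the bin with the largest absolute change.

    Assumption: change is the bin mean minus the baseline mean.
    """
    baseline = mean(baseline_values)
    changes = {idx: m - baseline
               for idx, m in bin_means(response_cycles, bin_s).items()}
    peak_bin = max(changes, key=lambda idx: abs(changes[idx]))
    return peak_bin, changes[peak_bin]

# Hypothetical per-cycle values (not data from the study)
baseline = [1.0, 1.0, 1.0]
response = [(10.0, 1.0), (50.0, 1.0), (130.0, 2.0), (200.0, 3.0), (250.0, 1.5)]
print(peak_response(baseline, response))
```

      If the analysis instead expresses changes as a ratio or percentage of baseline, only the line computing `changes` would need to differ.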

      “The Area under the curve (AUC) was measured during baseline and was subtracted from the corresponding AUC of the response for each time bin (Figure 1C). This AUC measure was computed as the sum of the signal in a given respiratory phase as all signals were sampled at the same rate. Note that areas calculated below the zero (0) line, as would be expected from a negative airflow during expiration, yield negative AUC values.”
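
      The AUC calculation quoted above reduces to a plain sum of samples within a respiratory phase, followed by subtraction of the baseline value. The sketch below illustrates that reading only; the function names, the representation of a phase as a pair of sample indices, and the example traces are assumptions made for illustration.

```python
def phase_auc(signal, start, end):
    """AUC of one respiratory phase as a plain sum of samples in [start, end).

    The sampling interval is omitted: with a common sampling rate it is a
    constant factor shared by every signal.
    """
    return sum(signal[start:end])

def auc_change(baseline_signal, response_signal, phase):
    """Response AUC minus baseline AUC for the same phase (start, end) indices."""
    start, end = phase
    return (phase_auc(response_signal, start, end)
            - phase_auc(baseline_signal, start, end))

# Hypothetical mean-cycle airflow traces (not data from the study)
baseline_flow = [0.0, -0.5, -0.5, 0.0]
response_flow = [0.0, -1.0, -1.5, 0.0]
print(auc_change(baseline_flow, response_flow, (1, 3)))
```

      Because samples below the zero line are negative, a stronger expiratory (negative) airflow produces a more negative AUC, as the quoted text notes.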

      (9) The authors should explain how oxygen consumption was calculated-did it involve the Depocas & Hart (1957) formula? Please provide information on expiratory CO2, whether ventilation was adjusted to achieve consistent CO2 levels across animals, and ideally specify the end-tidal CO2 range for the experiments. Discuss the rationale behind the chosen CO2 levels and whether CO2-dependent pFL activity could have influenced results.

      We have clarified the measurement in the methods as follows:

      “The gas analyzer measured fractional concentration of O2. Based on this and the flow rate at the level of the trachea (minute ventilation), we calculated O2 consumption according to Depocas and Hart (1957).”

      We have also added to the methods section:

      “During the entire experimental procedure, rats breathed spontaneously and end tidal CO2 was not adjusted through the experimental protocol.”

      In terms of CO2-dependent pFL activity possibly influencing the results: inducing active expiration in conditions in which there is no physiological demand for it (i.e., no hypoxia or hypercapnia) likely reduces pCO2, which in turn decreases the drive for ABD activity. Our results are therefore likely an underestimation of the response that would have been produced had we maintained CO2 levels constant.

      (10) The authors should address the discrepancy in fos-activated neurons between the control (44 neurons) and experimental animals (90-120 neurons per hemisection). Please explain the activation in the control group. Please also provide insights into how the authors interpret this difference in cfos-activated neurons between control and experimental groups.

      The following paragraph has been added to the discussion:

      “The assessment of cellular activity, quantified through cFos staining, unveiled the existence of basal activity in control rats. This observed baseline activity is likely emanating from subthreshold physiological processes within the parafacial area which do not culminate in ABD activity. Analysis of the cFos staining confirmed focal activation of neurons in the pFL of rats injected with bicuculline and minimal cFos expression in the PHOX2B+ cells in all groups as compared to the control group. These results confirm the very limited mediolateral spread of the drug from the core site of injection and back previous findings supporting the hypothesis that the majority of PHOX2B+ cells are more ventrally located in the parafacial area (pFV, Huckstepp et al., 2015) and PHOX2B+ cell recruitment is not necessary for active expiration (de Britto & Moraes, 2017; Magalhães et al., 2021).”

      (11) In Figure 8, the authors plotted the relationship of each cycle correlated to the normalized area. Have you also calculated the same late-E, inspiratory, and post-I to fR or VT separately?

      No, we only did the separated breathing phase (late-E, I, Post-I) analysis in the calculations of the DIA, airflow and ABD area, as well as on the Euclidean and Mahalanobis distances.
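
      For readers unfamiliar with the two distance measures named here, the sketch below contrasts them for a two-dimensional case. It is only an illustration: the response does not specify the dimensionality or the reference distribution used in the phase-plane analysis, so the 2-D restriction, the function names, and the example points are assumptions.

```python
from math import sqrt
from statistics import mean

def euclidean_2d(point, cloud):
    """Euclidean distance of a 2-D point from the mean of a reference cloud."""
    mx = mean(p[0] for p in cloud)
    my = mean(p[1] for p in cloud)
    return sqrt((point[0] - mx) ** 2 + (point[1] - my) ** 2)

def mahalanobis_2d(point, cloud):
    """Mahalanobis distance of a 2-D point from the mean of a reference cloud."""
    xs = [p[0] for p in cloud]
    ys = [p[1] for p in cloud]
    mx, my = mean(xs), mean(ys)
    n = len(cloud)
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in cloud) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("singular covariance matrix")
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return sqrt(d2)

# Hypothetical baseline cloud and response point (not data from the study)
baseline_cloud = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(euclidean_2d((3.0, 1.0), baseline_cloud))
print(mahalanobis_2d((3.0, 1.0), baseline_cloud))
```

      The Euclidean distance is in the raw units of the signals, whereas the Mahalanobis distance scales each direction by the variability of the reference data, which is why it can be read as a multivariate effect size.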

      Minor comments:

      Is there any specific reason for conducting these experiments exclusively in males?

      No, we usually use male rats for this type of experiment. We use both male and female rats for other studies that concern the effects of sex hormones, but in this case we performed experiments only in male rats.

      Page 13, Line 320: What is the duration of the bicuculline-induced effects?

      This information is included in the results section as follows:

      “Similarly, the ABD response duration was longer at the two most rostral locations (+0.6 mm = 17.6 ± 2.7 min; +0.8 = 17.1 ± 3.3 min) compared to the most caudal group (-0.2 mm = 2.4 ± 1.1 min; One-Way ANOVA p = 0.043; Tukey -0.2 mm vs +0.6 mm: p = 0.048; -0.2 mm vs +0.8 mm: p = 0.041; Figure 3E).”

      Page 16, Line 400: Is there a rationale for the high tidal volume (VT) observed in these animals? A baseline VT of 7 ml/kg appears notably elevated.

      Please note that rats were vagotomised and spontaneously breathing, hence the tidal volume is increased compared to non-vagotomised rats as seen in previous studies (Ouahchi et al., 2011).

      Figure 2D: Could you provide longer recordings? Additionally, incorporating diaphragm (Dia) recordings would enhance the interpretation of abdominal (Abd) recordings.

      Figure 3A has a representative example of the 20-minute recordings for each location.

      Page 18, Line 458: Please rectify "Dunn: p , 0.001" to the appropriate format, perhaps "Dunn: p < 0.001."

      Thank you, edited.

    1. Author response:

      eLife assessment

      “…The evidence however is incomplete, since the tai loss-of-clone phenotype is based on one allele and the mechanism involved in cell competition through Dlp and Wg lacks adequate supporting data.”

      We agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by CRISPR.

      We also agree that additional functional data would help demonstrate that differences in Dlp levels are required for the mechanism of Tai cell competition. Experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      Reviewer #1:

      Overall Statements:

      “There is some data in the supplementary materials suggesting that Tai promotes dlp mRNA expression, but this was not compelling.”

      We are currently testing effects of Tai on dlp and dally transcription using qPCR and reporter transgenes. As noted below, the effects of Tai on Dlp trafficking are ‘strong’, so resolving effects on Dlp transcription will complement this localization data.

      “The authors don't further examine Dlp protein in tai clones.”

      As noted by the Reviewer, we do examine Dlp levels and localization in tai-low clones (see Figure 9), but these experiments are challenging due to their very small size and the hypomorphic nature of the tai allele (tai[k15101]) that was used. Experiments are in progress to examine the effect of our Crispr null allele of tai on Dlp levels and localization in wing clones.

      “In sum, the authors have uncovered some interesting results, but the story has some unresolved issues that, if addressed, could boost its impact. Additionally, the preprint seems to have 2 stories, one about tai and cell competition and the other about tai and Wg distribution. It would be helpful to reorder the figures and improve the narrative so that these are better integrated with each other.”

      We agree. The results of our modifier screen required that we first understand how Tai regulates the Wg pathway before we could apply this to understanding the competitive mechanism. Thus, the paper is composed of three sections: 1. the screen, 2. the Tai-Dlp-Wg connection in the absence of competition, and 3. the contribution of Dlp-Wg to the tai[low] ‘loser’ phenotype. These sections use different techniques (e.g., clonal mosaics with genomic alleles, Gal4/UAS and RNAi to define the effect of Tai loss on Wg and Dlp). Ongoing experiments return to clonal mosaics to test whether elevating Dlp can rescue tai lof clones in the same manner as Apc/Apc2 alleles (see Figs. 2-3), which elevate Wg pathway activity.

      Specifics:

      “It would be good to know whether the authors can rescue tai-low clones by over-expression UAS-Dlp.”

      As noted above, experiments are ongoing to test whether normalizing Dlp levels across clonal boundaries rescues elimination of Tai-low clones.

      “The data on Wg distribution seems disjointed from the data about cell competition. The authors could refocus the paper to emphasize the cell competition story. The role of Dlp in Wg distribution is well established, so the authors could remove or condense these results. The story really could be Figs 1, 2, 3 and 7 and keep the paper focused on cell competition. The authors could then discuss Dlp as needed for Wg signaling transduction, which is already established in the literature.”

      We appreciate the suggestion to reorganize the figures to focus the first part of the story on competition, and then follow with the role of Tai in controlling Dlp. We will consider this approach pending the results of ongoing experiments.  

      “The model of tai controlling dlp mRNA and Dlp protein distribution is confusing. In fact, the data for the former is weak, while the data for the latter is strong. I suggest that the authors focus on the altered Dlp protein distribution on tai-low clones. It would also be helpful to prove the Wg signaling is impeded in tai clones (see #5 below).”

      We agree but are currently testing how dlp reporters and mRNA respond to Tai in order to rigorously test a Dlp transcriptional mechanism. To complement the ‘strong’ evidence that Tai regulates Dlp distribution, we are testing Dlp in clones of our Tai Crispr null. Since submission, we have also assessed the effect of blocking the endocytic factor shibire/dynamin on Dlp distribution in Tai-deficient cells to complement the data on Pentagone that is already in the paper (see Fig. S3).

      “I don't know if the Fz3-RFP reported for Wg signaling works in imaginal discs, but if it does then the authors could make clones in this background to prove that cell-autonomous Wg signaling is reduced in tai-low clones.”

      We thank the reviewer for this suggestion, which we are now testing.

      Reviewer #2

      Overall Comments:

      “While the authors present good evidence in support of most of their conclusions, there are alternative explanations in many cases that have not been excluded.”

      We appreciate this point and are conducting experiments for a revised submission that will help test alternative mechanisms and clarify our conclusions.

      Specifics:

      “However, the experiments have been done with a single allele, and these experiments do not exclude the possibility that there is another mutation on the same chromosome arm that is responsible for the observed phenotype. Since the authors have a UAS-tai stock, they could strengthen their results using a MARCM experiment where they could test whether the expression of UAS-tai rescues the elimination of tai mutant clones. Alternatively, they could use a second (independent) allele to demonstrate that the phenotype can be attributed to a reduction in tai activity.”

      As noted above, we agree with the need for a second allele and are adding supporting data from a new tai lof allele we have generated by Crispr.

      The tai[k15101] allele acts as a tai hypomorph and has been shown to produce weaker phenotypes than the 61G1 strong lof in a number of papers (Bai et al, 2000; König et al, 2011; Luo et al, 2019; and Zhang et al, 2015). We agree that rescue of tai[k15101] with a UAS-Tai transgene would help rule out effects of second site mutations. We are currently pursuing the reviewer’s second suggestion of phenocopy with a different allele, our new tai Crispr lof.

      “The authors have screened a total of 21 chromosomes for modification and have not really explained which alleles are nulls and which are hypomorphs. The nature of each of the alleles screened needs to be explained better.”

      We will update the text to better reflect what type of alleles were chosen. In most cases we preferred amorphs or null alleles over hypomorphs; however, when the amorph option was not available, we used hypomorphs.

      “Also, the absence of a dominant modification does not necessarily exclude a function of that gene or pathway in the process. This is especially relevant for the Spz/Toll pathway which the authors have previously implicated in the ability of tai-overexpressing cells to kill wild-type cells.”

      We thank the reviewer for this completely accurate point. The dominant screen does not rule out effects of other pathways such as Spz/Toll. Indeed, we were surprised by the lack of dominant effects by Spz/Toll alleles on tai[low] competition given our prior work. The reciprocally clear dominant effect of Apc/Apc2 led us to consider that Wg signaling plays a role in this phenomenon, which then became the starting point of this study.

      “The most important discovery from this screen is the modification by the Apc alleles. This part of the paper would be strengthened by testing for modification by other components of the Wingless pathway. The authors show modification by Apc[MI01007] and the double mutant Apc[Q8] Apc2[N175A]. Without showing the Apc[Q8] and Apc2[N175A] alleles separately, it is hard to know if the effect of the double mutant is due to Apc, Apc2, or the combination.”

      We agree that testing for modification with other components of the Wg pathway would be helpful to strengthen the connection between Tai low clonal elimination and Wg pathway biology. We also agree that separating Apc[Q8] and Apc2[N175A] would be a good idea to check if both Apc proteins are equally important for rescuing Tai low cell death, and future experiments for the lab could investigate this distinction.

      “RNAi of tai seems to block the formation of the Wg gradient. If so, one might expect a reduction in wing size. Indeed, this could explain why the wings of tai/Df flies are smaller. The authors mention briefly that the posterior compartment size is reduced when tai-RNAi is expressed in that compartment. However, this observation merits more emphasis since it could explain why tai/Df flies are smaller (Are their wings smaller?).”

      We agree that this is an exciting possibility. Growth effects of Tai linked to interactions with Yorkie and EcR could be due to a distinct role in promoting Wg activity. Alternatively, Tai may cooperate with Yorkie or EcR to control the Wg pathway. These are exciting possibilities that we are pursuing in future work.

      With regard to the “small size” effect of reducing Tai, we have previously shown that RNAi of Tai using engrailed-Gal4 causes the posterior compartment to shrink (Zhang et al. 2015, Figure 1C-F, H). In that paper, we also showed that tai[k15101]/Df animals are proportionally smaller than wildtype animals and quantified this by measuring 2D wing size (Zhang et al. 2015, Figure 1A and 1B).

      “In Figure 7, the authors show the effect of manipulating Tai levels alone or in combination with increasing Dlp levels. However, they do not include images of Wg protein distribution upon increasing Dlp levels alone.”

      We thank the reviewer for this reminder and have already generated these control images to include in a revised submission.

      “In Figure 8, there is more Wg protein both at the DV boundary and spreading when tai is overexpressed in the source cells using bbg-Gal4. However, in an earlier experiment (Figure 5C) they show that the wg-lacZ reporter is downregulated at the DV boundary when tai is overexpressed using en-Gal4. They therefore conclude that wg is not transcriptionally upregulated but is, instead secreted at higher levels when tai is expressed in the source cells. Wg protein is reduced in the DV stripe with tai is overexpressed using the en-Gal4 driver (Figure 6B') and is increased at the same location when tai is overexpressed with the bbg-Gal4 driver. (Figure 8) I don't know how to reconcile these observations.”

      We thank the reviewer for pressing us to develop an overall model explaining our results and how we envision Tai regulating Dlp and Wg. We are preparing a graphic abstract that illustrates this model, which will be included in our revision.

      Briefly, we favor a model in which Tai controls the rate of Wg spread via Dlp, without a significant effect on wg transcription. For example, the induction of Dlp across the ‘engrailed’ domain of en>Tai discs (Fig 7B-B”) allows Wg to spread rapidly across the flanks and moderately depletes it from the DV margin (Fig 6B-B”) as noted by the reviewer. Adding a UAS-Dlp transgene in the en>Tai background dramatically accelerates Wg spread and causes it to be depleted from the DV margin and build up at the far end of the gradient adjacent to the dorsal and ventral hinge. Significantly, blocking endocytosis of Wg in en>Tai discs with a dominant negative shibire transgene also causes Wg to build up in the same location (new data to be added in a revision), consistent with enhanced spreading. The difference in the bbg-Gal4 experiment is that Tai is only overexpressed in DV margin cells, which constrains and concentrates Wg within this restricted domain; we are in the process of testing whether this effect on Wg is blocked by RNAi of Dlp in bbg>Tai discs.

      “In Figure 9, the tai-low clones have elevated levels of Dlp. How can this be reconciled with the tai-RNAi knockdown shown in Figure 7C' where reducing tai levels causes a strong reduction in Dlp levels?”

      We apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces Dlp overall levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or two roles, one in trafficking and one in Dlp expression. As noted, we are investigating both possibilities using dlp reporter lines and our new tai null Crispr allele.

      Reviewer #3:

      Overall Weaknesses:

      “The study has relatively weak evidence for the mechanism of cell competition mediated by Dlp and Wg.”

      The screen and middle section of the paper provide genetic evidence that elevating Wg pathway activity rescues Tai[low] loser cells and that Tai controls levels/localization of Dlp and distribution of Wg in the developing wing disc. Our current work is focused on linking these two findings together in Tai “loser” clones.

      “More evidence is required to support the claim that dlp transcription or endocytosis is affected in tai clones.”

      As noted above, we are testing whether normalizing Dlp levels across clonal boundaries rescues tai[low] loser clones and assessing effects of Tai on dlp transcription and Dlp trafficking.

      Specifics:

      “Most of the rest of the study is not in the clonal context, and mainly relies on RNAi KD of tai in the posterior compartment, which is a relatively large group of cells. I understand why the authors chose a different approach to investigate the role of tai in cell competition. However because ubiquitous loss of tai results in smaller organs, it is important to determine to what extent reducing levels of tai in the entire posterior compartment compares with clonal elimination i.e. cell competition. This is important in order to determine to what extent the paradigm of Tai-mediated regulation of Dlp levels and by extension, Wg availability, can be extended as a general mechanism underlying competitive elimination of tai-low clones. If the authors want to make a case for mechanisms involved in the competitive elimination of tai clones, then they need to show that the KD of tai in the posterior compartment shows hallmarks of cell competition. Is there cell death along the A/P boundary? Or is the compartment smaller because those cells are growing slower?”

      Based on data that cell competition does not occur over compartment boundaries (e.g., see review by L.A. Johnston, Science, 2009), we chose not to use UAS-Gal4 to assess competition, but rather to investigate underlying biology occurring between Tai, Wg, and Dlp.

      “Are the levels of Myc/DIAP1, proteins required for fitness, affected in en>tai RNAi cells?”

      This is, of course, an interesting question given that Myc is a well-studied competition factor and is proposed to be downstream of the Tai-interacting protein Yki. We are not currently focused on Myc, but plan to test its role in the Tai-Dlp-Wg pathway in future work.

      “The authors do not have direct/strong evidence of changes in dlp mRNA levels or intracellular trafficking. To back these claims, the authors should look for dlp mRNA levels and provide more evidence for Dlp endocytosis like an antibody uptake assay or at the very least, a higher resolution image analysis showing a change in the number of intracellular Dlp positive punctae. Also, do the authors think that loss of tai increases Dlp endocytosis, making it less available on the cell surface for maintaining adequate extracellular Wg levels?”

      As noted above, we have added experiments using a dominant-negative shibire/dynamin allele to test whether Tai controls Dlp endocytosis. These data will be added to a revised manuscript. We have also gathered reagents to test effects of Tai gain/loss on Dlp secretion.

      “The data shown in the last figure is at odds with the model (I think) the authors are trying to establish: When cells have lower Tai levels, this reduces Dlp levels (S2) presumably either by reducing dlp transcription and/or increasing (?) Dlp endocytosis. This in turn reduces Wg (availability) in cells away from source cells (Figure 6). The reduced Wg availability makes them less fit, targeting them for competitive elimination. But in tai clones, I do not see any change in cell-surface Dlp (9B) (I would have expected them to be down based on the proposed model). The authors also see more total Dlp (9A) (which is at odds with S2 assuming data in S2 were done under permeabilizing conditions.).”

      As noted above (under Rev #2 comments), we apologize for not explaining this data well enough. First, the tai[k15101] allele is a weak, viable hypomorph (as shown in our Zhang et al, 2015 paper) whereas the Tai RNAi line is lethal with most drivers (including en-Gal4) and thus a stronger lof. Second, Tai RNAi lowers Dlp levels (Fig 7C) while tai[k15101] causes Dlp to accumulate intracellularly (see Fig. 9A-C). These data indicate that reduced Tai leads to a defect in Dlp intracellular trafficking while its loss reduces Dlp overall levels; these data can be explained by a single role for Tai in Dlp traffic to or from the cell membrane, or two roles, one in trafficking and one in Dlp expression. We are investigating both possibilities using dlp reporter lines and our new tai null Crispr allele.

      “As a side note, because Dlp is GPI-anchored, the authors should consider the possibility that the 'total' Dlp staining observed in 9A may not be actually total Dlp (and possibly mostly intracellular Dlp, since the permeabilizing membranes with detergent will cause some (most?) Dlp molecules to be lost), and how this might be affecting the interpretation of the data. I think one way to address this would be to process the permeabilized and non-permeabilized samples simultaneously and then image them at the same settings and compare what membrane staining in these two conditions looks like. If membrane staining in the permeabilized condition is decreased compared to non-permeabilized conditions, and the signal intensity of Dlp in permeabilized conditions remains high, then the authors will have evidence to support increased endocytosis in tai clones. Of course, these data will still need to be reconciled with what is shown in S2.”

      We thank the reviewer for this excellent suggestion and are generating mosaic discs to test the proposed approach of synchronous analysis of total vs. intracellular Dlp.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Authors performed a metatranscriptomic analysis from publicly-available datasets of whole blood from 3 places in Indonesia. Their goal was to explore which pathogens were present on the blood of those 117 healthy individuals. It was interesting that reads from Flaviviridae and Plasmodium were detected in asymptomatic subjects.

      Major comments: 1) How did the authors assess and correct batch-effects between different datasets?

      Our response: We have sequencing batch information for the Indonesian dataset and saw no clear clustering based on batches in the first 8 PCs. We recognize that sampling variations may exist between islands, though the taxa matrix we acquired from the unmapped reads is so sparse that such variations did not have a strong enough effect to introduce batch effects in our microbiome analyses, and the signals were driven by pathogenic reads. For our comparative analyses between datasets, we made sure that all three datasets shared similar processing (collected using Tempus Blood RNA Tubes and subjected to globin depletion) and have trimmed both Indonesian and Malian reads to match the length of the UK reads (75 bp).

      2) Did the RNA-seq capture poly-A mRNAs? If so... these reads that did not map the human genome were captured because of internal priming. Can they find internal poly A sequences in the genome of Flaviviridae and Plasmodium pathogens? I would like to know that to understand the source of the reads and which other pathogens may be missing (due to the lack of internal priming).

      Our response: No, our dataset did not capture poly-A mRNAs. We performed ribosomal RNA (rRNA) and globin mRNA depletion.

      3) Principal coordinates analysis (PCoA) is often utilized in metagenomics analysis. Although they are equivalent, is there a reason for using PCA?

      Our response: Since we used CLR transformation, the resulting matrix lies in Euclidean space. PCA is just a form of PCoA in Euclidean space.
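
      The equivalence stated in this response can be checked numerically. The following is a minimal sketch on toy counts, not the authors' pipeline: the pseudocount of 0.5, the matrix size, and the function names `clr`, `pca_scores`, and `pcoa_scores` are assumptions chosen for illustration. It applies a centred log-ratio (CLR) transform, then compares PCA scores with classical PCoA scores computed from pairwise Euclidean distances on the same transformed matrix.

```python
import numpy as np

def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of a samples x taxa count matrix.
    The pseudocount (an assumed value) avoids log(0) for absent taxa."""
    logged = np.log(counts + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)

def pca_scores(x, k=2):
    """Sample scores on the first k principal components, via SVD."""
    centred = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :k] * s[:k]

def pcoa_scores(x, k=2):
    """Classical PCoA (metric MDS) on pairwise Euclidean distances."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T   # squared distances
    n = x.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n               # centring matrix
    b = -0.5 * j @ d2 @ j                             # Gower-centred matrix
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:k]                # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(12, 6))           # toy data: 12 samples, 6 taxa
z = clr(counts)
pca = pca_scores(z)
pcoa = pcoa_scores(z)

# Axes are only defined up to sign, so compare absolute coordinates.
same = np.allclose(np.abs(pca), np.abs(pcoa), atol=1e-8)
print("PCA and PCoA coordinates agree up to sign:", same)
```

      The agreement follows because the Gower-centred distance matrix equals the Gram matrix of the column-centred data, whose eigenvectors are the left singular vectors used by PCA. With a non-Euclidean dissimilarity (for example Bray-Curtis) the two ordinations would generally differ.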

      Minor comments: 1) "Indonesia is a country with large numbers of endemic and emerging infectious diseases [16], making it a crucially important location to monitor and understand the effects of pathogens on human hosts." Is there any epidemiological data that shows differences in infectious diseases across these 3 places? Can the authors provide a map and better explanation about the importance in comparing these 3 areas?

      Our response: We have added references to malaria infection being more prevalent in the eastern side of Indonesia in the discussion section.

      2) Why is it so hard to try to identify (only for Flaviviridae reads) reads that map to very relevant viruses, such as Zika, Dengue, and Yellow Fever? Why did the authors state that they "were unable to refine this assignment further" if this is one of the most interesting finding?

      Our response: Our reanalysis showed a small percentage of the Flaviviridae reads to be assigned to the Pegivirus genus. As more diverse microbial genomes are added to reference databases and identical regions become more common between them, it becomes harder for the classifier to further define reads to species level (https://link.springer.com/article/10.1186/s13059-018-1554-6). Flaviviridae has distinct species spread across six different genera (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=11050). In comparison, despite Plasmodiidae having more species recorded compared to Flaviviridae, an overwhelming majority of the species are part of the Plasmodium genus, hence we were able to refine them down to species-level (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=1639119).

      3) Is the script available at https://gitlab.unimelb.edu.au/igr-lab/Epi_Study ? This reviewer could not access it.

      Our response: We thank Reviewer 1 for pointing this out and have amended the link, now accessible here: https://gitlab.svi.edu.au/muhamad.fachrul/indo_blood_microbiome

      Reviewer #1 (Significance (Required)):

      Interesting paper that enable to extract additional knowledge from whole blood RNA-seq data. There are already several papers that do this and I think authors could go one step forward (for instance, PCR validation of additional individuals). I don't think this can be used for surveillance if it cannot identify species, it is more expensive than running targeted assays, and that may be many false negative pathogens in the samples.

      Our response: We thank Reviewer 1 for their comments. We have updated our manuscript to reflect our updated analyses, which minimize false positive taxa, and the project’s significance not as a mainline surveillance tool, but a retrospective one.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Bobowik and colleagues perform a computational analysis of whole blood RNA-seq datasets from healthy individuals of three different regions of Indonesia. Their goal is to identify infecting pathogens and other microbes and correlate their abundances to host gene expression patterns or health characteristics in these populations. They find a broad range of bacterial, viral and microeukaryote taxa. When comparing the three Indonesian populations, they find that the Korowai population is the most diverse and different from the other two, possibly driven by the higher prevalence and abundance of Plasmodium (Apicomplexa) in this population.

      Then, the authors conduct a statistical decomposition of human gene expression in these samples in independent factors using ICA, and correlate each of these factors to the abundances of the microbial taxa detected. This analysis allows researchers to associate specific patterns of gene expression, such as immune-related pathways, to the presence of members of the Apicomplexa and Kitrinoviricota phyla.

      Lastly, the authors use previously published data from other two cohorts (from Mali and the UK) to contextualize their blood microbiome findings. They find microbial reads in all datasets. The Mali cohort is characterized by a large abundance of archaea, not found in the other two populations, while the UK cohort has the lower diversity. Altogether, the authors propose the use of RNA-seq data from human whole blood as a way to study the blood microbiome and establish potential associations between blood resident microbes and host gene expression

      Major comments:

      1) The methodology to filter and remove reads from potential contaminants needs to be more stringent to ensure the results do not contain spurious contaminants and that the conclusions are correct. It has been described that genomic databases are heavily contaminated with human sequences (Steinegger and Salzberg, 2020), and in this manuscript, even after a two-pass alignment with STAR, reads mapping to helminths also corresponded to the human genome. Additionally, ad-hoc removal of specific taxa (Metazoa and Viridiplantae) was only performed after suspicion of contamination. However, this ad-hoc removal cannot be performed with microbial (bacterial, viral, etc.) contaminants as there is a risk of removing actual bacteria from the samples. But it has been confirmed that many microbial assemblies also suffer from human contamination. Possible actions to take are the following:

      a. Perform the human mapping with more lenient parameters to avoid human reads to map to other (likely contaminated) genomes in genome databases.

      b. Remove common contaminants that have been documented, for instance in blood (Chrisman et al., 2022).

      c. Run a tool to detect contaminated contigs in the database used to map reads to microbes and remove these problematic contigs from further analysis.

      Our response: We thank Reviewer 2 for the suggestions, especially to address contaminants. We have reanalyzed our data, which resulted in far fewer taxa while still retaining the main pathogenic findings.

      2) In line with the above, removing singletons (as I have understood these are taxa that are represented by a single read), is a way to minimize the risk of contamination. To take advantage of the functional profiling of RNA-seq, a measure to ensure that microbes found in blood are active would be to include in the analysis only taxa for which expression of more than a few genes is detected. This type of filtering has been previously applied in studies where very low microbial loads are expected (Lloréns-Rico et al., 2021). In this study, it has only been applied to the specific case of the archaeal taxon Methanocaldococcaceae. However, I would expect cleaner results if applied consistently to all taxa detected.

      Our response: We have reanalyzed the data and applied this to all taxa detected.
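
      The kind of filter the reviewer describes (dropping singletons and keeping only taxa supported by reads from more than a few genes) could look like the sketch below. This is an illustration under stated assumptions rather than the filter actually used in the reanalysis: the input format (one `(taxon, gene)` pair per classified read), the function name `taxa_with_gene_support`, the placeholder taxon and gene labels, and the thresholds `min_genes=3` and `min_reads=2` are all hypothetical.

```python
from collections import defaultdict

def taxa_with_gene_support(assignments, min_genes=3, min_reads=2):
    """Return taxa supported by at least `min_reads` reads spread over at
    least `min_genes` distinct genes. Both thresholds are assumed values."""
    genes = defaultdict(set)   # taxon -> distinct genes with mapped reads
    reads = defaultdict(int)   # taxon -> total number of assigned reads
    for taxon, gene in assignments:
        genes[taxon].add(gene)
        reads[taxon] += 1
    return sorted(
        taxon for taxon in genes
        if len(genes[taxon]) >= min_genes and reads[taxon] >= min_reads
    )

# Toy input with placeholder labels: one (taxon, gene) pair per read.
assignments = [
    ("Plasmodiidae", "geneA"), ("Plasmodiidae", "geneB"),
    ("Plasmodiidae", "geneC"), ("Plasmodiidae", "geneA"),
    ("TaxonX", "geneZ"),                       # singleton: one read only
    ("TaxonY", "geneA"), ("TaxonY", "geneA"),  # several reads, too few genes
    ("TaxonY", "geneB"),
]
kept = taxa_with_gene_support(assignments)
print(kept)
```

      Requiring reads from several distinct genes is stricter than a read-count cutoff alone, because a contaminant or a spike-in sequence tends to produce many reads concentrated in one or two regions.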

      3) The specificity of Methanocaldococcaceae in the samples from Mali is very striking. I am highly suspicious that this only occurs due to a batch effect, even though the authors were highly selective in their cohorts to avoid these. In fact, I extracted the genes spanning the regions highlighted in Supplementary Figure 9 of the Methanocaldococcus jannaschii genome. A BLAST search of these sequences returned, among Methanocaldococcus hits, hits from the ERCC synthetic spike-in sequences, used as internal controls in many RNA-seq experiments. ERCC synthetic spike-in hits appeared for all 4 regions in the genome of M. jannaschii highlighted in this figure. In the original publications of this dataset, there is no reference to the use of these ERCC controls, but given the observed matches, I suggest the authors to perform an extra step in their filtering pipeline to remove all reads mapping to these ERCC standards in all their three cohorts to prevent these sort of batch effects.

      Our response: We thank Reviewer 2 for pointing this out. Our reanalysis, which now used proper 2-pass mapping and further downstream classification with both pairs of the reads, no longer detected any archaea.

      4) I am puzzled by the inconsistencies shown between forward and reverse reads when mapping paired-end data. I expect these inconsistencies at lower taxonomic ranks (species or genus level) due to incomplete genomes, but not at higher taxonomic ranks. I wonder if, by performing more stringent filtering of contaminants as suggested above, the consistency between forward and reverse reads increases and both mates can be used, making the mapping more reliable.

      Our response: We have reanalyzed the data using both pairs of the reads for classification, resulting in fewer detected taxa. We believe the new results are more robust as they no longer include taxa that are not typically found in humans (such as the archaeon Methanocaldococcus and other environmental bacteria).

      In summary, my main concerns regarding this manuscript involve the possibility that contaminants in the sequencing data may be the cause of some of the results presented, and I tried to propose ways of dealing with these contaminants. While some of the results may not be affected by detection of contaminants (i.e. the association between Apicomplexa and some ICs), others such as the diversity measures or the comparison across cohorts may be severely affected. I will consider these results highly preliminary until a more thorough and stringent approach for contaminant removal is applied.

      Our response: We thank Reviewer 2 for the suggestions and have updated our manuscript with results from updated analyses that are more stringent towards contaminants, as can be seen from our updated findings.

      Minor comments:

      1) I would appreciate some of the analyses done at lower taxonomic levels if the sparsity of the data allows it, after removing contaminants. Given that the CLR transformation does not allow for zeros, other alternatives such as GMPR (Chen et al., 2018) or adding a pseudocount would allow these analyses?

      Our response: After our reanalysis, we ended up with even sparser data and therefore could not perform the analyses at lower taxonomic levels.

      2) In the PCA shown in figure 1, does the number of microbial reads detected correlate with any of the first two components?

      Our response: Yes, Plasmodiidae correlates well with PCs 1 and 2 (0.66 and 0.73), and Flaviviridae correlates very strongly with PC1 (0.917). We have added this detail to the results section.
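
      For readers who want to reproduce this kind of check, a correlation between one taxon's abundance and each principal component can be computed as in the sketch below. It assumes Pearson correlation and uses synthetic data; the response does not state which correlation coefficient or abundance scale was used, and the helper name `correlate_with_pcs` is hypothetical.

```python
import numpy as np

def correlate_with_pcs(abundance, scores):
    """Pearson r between one taxon's abundance vector (one value per sample)
    and each column of a samples x components score matrix."""
    return np.array([
        np.corrcoef(abundance, scores[:, i])[0, 1]
        for i in range(scores.shape[1])
    ])

rng = np.random.default_rng(1)
taxon = rng.normal(size=50)     # toy transformed abundance for 50 samples
noise = rng.normal(size=50)     # unrelated variation
# Toy score matrix: PC1 is a rescaled copy of the taxon, PC2 is noise.
scores = np.column_stack([2.0 * taxon, noise])

r = correlate_with_pcs(taxon, scores)
print("r with PC1, PC2:", r)
```

      Because the sign of a principal component is arbitrary, the magnitude of each coefficient is the informative quantity when comparing taxa.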

      3) In Figure 1C, the x axis is wrongly named PC2.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      4) There is a typo in the legend of Figure 1A ("showeing")

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      5) In the alpha diversity estimates comparison across the three different cohorts, after subsampling each population to achieve similar sample size in each cohort, it is stated that "after subsampling, each population had similar diversity estimates". However, the numbers shown afterwards corresponding to the mean values of alpha diversity, without confidence intervals or a boxplot/violin plot together with an accompanying statistical test, are not enough to assess similarity. I would appreciate a figure (similar to Figure 3E and F) or a test accompanying these mean values.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      6) In the volcano plots (Figure 3A, B and others throughout the manuscript) it would help the reader to add lines for the thresholds chosen for the effect size and -log10(p-value) to separate significant results.

      Our response: We thank Reviewer 2 for pointing this out and have amended this detail.

      7) In Figure 3E and F, I would appreciate having bars for the statistically significant comparisons.

      __Our response:__ We thank Reviewer 2 for pointing this out and have amended this detail.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The manuscript constitutes an important contribution to antimalarial drug discovery, employing diverse systems biology methodologies; with a focus on an improved M1 metalloprotease inhibitor, the study provides convincing evidence of the utility of chemoproteomics in elucidating the preferential targeting of PfA-M1. Additionally, metabolomic analysis effectively documents specific alterations in the final steps of hemoglobin breakdown. These findings underscore the potential of the developed methodology, not only in understanding PfA-M1 targeting but also in its broader applicability to diverse malarial proteins or pathways. Revisions are needed to further enhance overall clarity and detail the scope of these implications.

      We thank the editor and reviewers for recognising the contribution our work makes to understanding the selective targeting of aminopeptidase inhibitors in malaria parasites and the wider impact this multi-omic strategy can have for anti-parasitic drug discovery efforts. The reviewers have provided constructive feedback and raised important points that we have taken on-board to improve our manuscript. In particular, we have revised aspects of the text and figures to enhance clarity, performed additional analysis on the other possible MIPS2673 interacting proteins and more comprehensively analysed the effect of MIPS2673 on parasite morphology. NB: Specific responses to comments in the public reviews are provided within responses to the specific recommendations to authors.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The article "Chemoproteomics validates selective targeting of Plasmodium M1 alanyl aminopeptidase as a cross-species strategy to treat malaria" presents a series of biochemical methods based on proteomics and metabolomics, as a means to:

      (1) validate the specific targeting of biologically active molecules (MIPS2673) towards a defined (unique) protein target within a parasite and (2) explore whether, by quantifying the perturbations generated at the level of the parasite metabolome, it is possible to extrapolate which metabolic pathway has been disrupted by using this biologically active molecule and whether this may further confirm selective targeting in parasites of the expected (or in-vitro targeted) enzyme (here PfA-M1).

      The inhibitor used in this work by the authors (MIPS2673) is to my knowledge a novel one, although belonging to a chemical series previously explored by the authors, which recently enabled them to discover a specific PfA-M17 inhibitor, MIPS2571 (Edgard et al., 2022, ref 11 of this current work). Indeed, inhibitors specifically targeting either PfA-M1 or PfA-M17 (and not both, as has generally been the case in the past) are scarce today, and highly needed to functionally characterize these two zinc-aminopeptidases. MIPS2673 blocks the development of erythrocytic stages of Plasmodium falciparum with an EC50 of 324 nM, blocks the parasite development at the young trophozoite stage at 5x EC50 (but at ring stages at 10x EC50, figure 1E), and inhibits the enzymatic activity of PfA-M1 (and its ortholog Pv-M1) but not of the related malarial metallo-aminopeptidases (M17 and M18 families) nor the human metalloenzymes from closely related enzymatic families, supporting its selective targeting of PfA-M1 (and Pv-M1).

      All experiments are carried out in vitro (e.g. biochemical studies such as enzymology, proteomics, metabolomics) and on cultured parasites (erythrocyte stages of Plasmodium falciparum and several gametocytes stages obtained in vitro); there are no in vivo manipulations. The work related to Plasmodium vivax, which justifies the "cross-species" indication in the title of the article, is restricted to using a recombinant form of the M1-family aminopeptidase in enzymatic assays. The rest of the work concerns only Plasmodium falciparum. While I found globally that this work is original and brings new data and above all proposes chemical validation approaches that could be used for other target validations under similar limiting conditions (impossibility of KO of the gene), I have some specific questions to address to the authors.

      Strengths and weaknesses:

      - The chemoproteomic approach, that explores the ability of MIPS2673 to more significantly "protect" the putative target (PfA-M1) against thermal degradation or enzymatic attack (by proteinase K), to document its selective targeting towards PfA-M1 (the inhibitor, once associated with its target, is expected to stabilize its structure or prevent the action of end proteases), uses several concentrations of MIPS2673 and provides convincing results. My main criticism is that these tests are carried out with parasite extracts enriched in 30-38 hours old forms, and restricted to the fraction of soluble proteins isolated from these parasitic forms, which still limits the scope of the analysis. It is clear that this methodological approach is a choice that can be argued both biologically (PfA-M1 is well expressed in these stages of the parasite development) and biochemically (it is difficult to do proteomic analyses on insoluble proteins) but I regret that the authors do not discuss these limitations further, notably, I would have expected (from Figure 1E) some targets to be also present at ring stages.

      - The metabolomic approach, by documenting the ability of MIPS2673 to selectively increase the number of non-hydrolyzed dipeptides in treated versus untreated parasites is another argument in favor of the selective targeting of PfA-M1 by MIPS2673, in particular by its broad-spectrum aminopeptidase action preferentially targeting peptides resulting from the degradation of hemoglobin by the parasite. The relative contribution of peptides derived from host hemoglobin versus other parasite proteins is, however, little discussed.

      The work as a whole remains highly interesting, both for the specific topic of PfA-M1's role in parasite biology and for the method, applicable to other malarial drug contexts.

      Reviewer #2 (Public Review):

      In this manuscript, the authors first developed a new small molecular inhibitor that could target specifically the M1 metalloproteases of both important malaria parasite species Plasmodium falciparum and P. vivax. This was done by a chemical modification of a previously developed molecule that targets PfM1 as well as PfM17 and possibly other Plasmodial metalloproteases. After the successful chemical synthesis, the authors showed that the derived inhibitor, named MIPS2673, has a strong antiparasitic activity with IC50 324 nM and it is highly specific for M1. With this in mind, the authors first carried out two large-scale proteomics experiments to confirm the MIPS2673 interaction with PfM1 in the context of the total P. falciparum protein lysate. This was done first by using thermal shift profiling and subsequently limited proteolysis. While the first demonstrated overall interaction, the latter (limited proteolysis) could map more specifically the site of MIPS2673-PfM1 interaction, presumably the active site. Subsequent metabolomics analysis showed that MIPS2673 cytotoxic inhibitory effect leads to the accumulation of short peptides many of which originate from hemoglobin. Based on that the authors argue that the MIPS2673 mode of action (MOA) involves inhibition of hemoglobin digestion that in turn inhibits the parasite growth and development.

      Reviewer #3 (Public Review):

      This is a manuscript that attempts to validate Plasmodium M1 alanyl aminopeptidase as a target for antimalarial drug development. The authors provide evidence that MIPS2673 inhibits recombinant enzymes from both Pf and Pv and is selective over other proteases. There is in vitro antimalarial activity. Chemoproteomic experiments demonstrate selective targeting of the PfA-M1 protease.

      This is a continuation of previous work focused on designing inhibitors for aminopeptidases by a subset of these authors. Medicinal chemistry explorations resulted in the synthesis of MIPS2673 which has improved properties including potent inhibition of PfA-M1 and PvA-M1 with selectivity over a closely related peptidase. The compound also demonstrated selectivity over several human aminopeptidases and was not toxic to HEK293 cells at 40 µM. The activity against P. falciparum blood-stage parasites was about 300 nM.

      Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      In the thermal stability and LiP-MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      Summary:

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of the Plasmodium M1 alanyl aminopeptidases, PfA-M1 and PvA-M1.

      Strengths:

      The main strengths include the synthesis of MIPS2673 which is selectively active against the enzymes and in whole-cell assay.

      Weaknesses:

      The weaknesses include the lack of additional analysis of additional targets identified in the chemoproteomic approaches.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Question 1. Line 737 (and elsewhere). Why are Plasmodium vivax orthologs of PfA-M1 and PfA-M17 called Pv-M1 and Pv-M17 and not PvA-M1 and PvA-M17, where A stands for Aminopeptidase? I would recommend changing the names if possible, although the mention of Pv-M1 and Pv-M17 is now current in the literature (which is kind of regrettable). See also Supplemental Table S1 where PfA-M1 is named Pf-M1.

      Supplemental Table S1 was updated to PfA-M1. Nomenclature for the Plasmodium vivax aminopeptidase orthologs was amended to PvA-M1 and PvA-M17 as suggested by the reviewer.

      Question 2. Figure 1. Observation of parasite culture slide smears in Figure 1E strongly suggests that an important target of MIPS2673 appears to be expressed at the ring stage or very young trophozoites, whereas the authors, in their proteomic and metabolomic analyses, performed studies focused on late trophozoites stages (30-38h post-invasion). This difference in the targeting of Plasmodium stages puzzles me and deserves some explanations from the authors, and is related to my question 3.

      As the reviewer indicates, ring-stage parasite growth appears to be affected at high concentrations (5x and 10x EC50) of MIPS2673. Under these conditions, parasite growth appears to stall during late rings/early trophs at ~16-22 h post invasion when haemoglobin digestion is increasing and when one presumes PfA-M1 (the primary target of MIPS2673) is increasing in both expression and activity (see references 26 and 28 of this manuscript). Thus, whilst it is unsurprising that MIPS2673 has some activity against ring-stage parasites, we focused on the trophozoite stage for our proteomics studies as we showed this to be the stage most susceptible to MIPS2673 (Fig. 1D) and reasoned that we would most likely identify the primary MIPS2673 target, and other interacting proteins, from a complex biological mixture at this stage. The same reasoning underpinned our decision to perform metabolomics on drug-treated trophozoites, as we reasoned we would see a greater functional effect on this stage. Furthermore, performing these experiments on trophozoites rather than rings minimises the interference from the host red blood cell. While we cannot rule out additional targets in rings, repeating all experiments during this parasite stage is beyond the scope of this study.

      Question 3. Figure 2. Although Figure 2 is insightful and somehow self-explanatory, I think it misses two specific pieces of information. First, it is indicated in line 618 (M&M) that parasite material for thermal stability and limited proteolysis studies corresponds to synchronized parasites (30-38h post-invasion) but this information is not given in Figure 2. In addition, if I fully understand the experimental protocol of obtaining parasite extracts, they strictly correspond to the soluble protein fraction of the erythrocytic stages of Plasmodium at the late trophozoite stage, and not to all parasitic proteins as the scheme of Figure 2 might suggest. I would appreciate it very much if these two points (parasite stages and soluble proteins) were clearly indicated in the scheme as indeed, not the whole parasite blood stage proteome is investigated in the study but just a part of it (~47%, as the authors indeed indicate on line 406). Please, edit also the legend of the figure accordingly.

      This is correct, the soluble protein fraction from synchronised trophozoites was used in our proteomics studies. These details have been included in an updated Figure 2 and in the corresponding figure legend.

      Question 4. Thermal stabilization. Figure 3B. Could the authors explain how they calculated or measured "absolute" protein abundances, and how this refers to a number of parasites in initial assays as this is not clear to me. Notably, abundance for PfA-M1 is much higher than for PF3D7_0604300, which are interesting "absolute" values.

      Protein abundance was calculated using the mean peptide quantity of the stripped peptide sequence, with only precursors passing the Q-value threshold (0.01) considered for relative quantification. Within independent experiments, normalisation was based on total protein amount (determined by the BCA assay) rather than the initial number of parasites.
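
      The roll-up described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the row layout, the protein and peptide names and the quantities are hypothetical, "passing" is taken to mean a Q-value at or below 0.01, and precursors sharing a stripped sequence are assumed to be averaged into one peptide quantity before the protein-level mean is taken.

```python
from collections import defaultdict
from statistics import mean

Q_VALUE_THRESHOLD = 0.01  # threshold quoted in the response

def protein_abundance(precursors, q_threshold=Q_VALUE_THRESHOLD):
    """Mean peptide quantity per protein from precursor-level rows.

    Each row is (protein, stripped_sequence, quantity, q_value).
    """
    per_peptide = defaultdict(list)
    for protein, stripped_seq, quantity, q_value in precursors:
        # Keep only precursors that pass the Q-value filter (assumed <=).
        if q_value <= q_threshold:
            per_peptide[(protein, stripped_seq)].append(quantity)

    per_protein = defaultdict(list)
    for (protein, _seq), quantities in per_peptide.items():
        # Assumption: precursors of one stripped sequence are averaged.
        per_protein[protein].append(mean(quantities))

    return {protein: mean(values) for protein, values in per_protein.items()}

# Hypothetical precursor table.
rows = [
    ("PROT_A", "PEPTIDEK", 100.0, 0.001),
    ("PROT_A", "PEPTIDEK", 300.0, 0.005),
    ("PROT_A", "ANOTHERK", 400.0, 0.002),
    ("PROT_A", "NOISYPEPK", 9000.0, 0.05),  # fails the Q-value filter
    ("PROT_B", "SAMPLER", 50.0, 0.009),
]
abundances = protein_abundance(rows)
```

      Because such values depend on ionisation and digestion efficiency, they are best compared for the same protein across conditions rather than between proteins.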

      PfA-M1 is known to be a highly abundant protein and PF3D7_0604300 (as well as the other protein hits identified by thermal stability proteomics) are likely less abundant. It is noted that abundance is also dependent on ionisation efficiency and trypsin digestion efficiency. Therefore, we avoid comparing absolute abundances across proteins and use relative differences across conditions instead.

      NB: the word “absolute” in the text (“absolute fold-change”) refers to the absolute value of the fold-change (i.e. its magnitude, irrespective of whether the change is positive or negative), and not to absolute quantification of proteins. The preceding text in each case clarifies that these are based on “relative peptide abundance”.

      Question 5. Figure 5A. How do the authors explain peptides whose abundances are decreasing instead of increasing? Figure 5C. Could the authors provide digital cues (aa numbers or positions) on the ribbon representation of the PfA-M1 sequence? It is difficult to correlate the position of the 3D domains with respect to the primary structure of the protein. Also, the "yellow" supposed to show the "drug ligand" is really not very visible.

      LiP-MS is based on the principle that ligand binding alters the local proteolytic susceptibility of a protein to a non-specific protease (in this case proteinase K, PK). In this sense, in LiP-MS we are not looking at variations in the stability of whole proteins (as is the case with thermal stability proteomics, where proteins detected with significantly higher abundance in treated relative to control samples reflect thermal stabilisation of the target due to ligand binding), but differences in peptide patterns between treated and control samples that reflect a change in the ability of PK to cleave the target. Thus, in the bound state, the ligand prevents proteolysis with PK. This results in decreased abundance of peptides with non-tryptic ends (as PK cannot access the region around where the ligand is bound) and increased abundance of the corresponding fully tryptic peptide, when compared to the free target. This concept is demonstrated in Fig. 4A and is explained in the text (lines 279-282) and Fig. 4 figure legend.
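
      The distinction between fully tryptic peptides and peptides with non-tryptic ends can be sketched as below. This is a simplified illustration, not the authors' analysis code: the protein sequence is a made-up toy string, trypsin is assumed to cleave after every K or R (the proline exception is ignored), and only the first occurrence of a peptide in the sequence is considered.

```python
def classify_termini(protein_seq, peptide):
    """Label a peptide as fully tryptic, semi-tryptic or non-tryptic."""
    start = protein_seq.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(peptide)
    # N-terminal end is tryptic if it starts the protein or follows K/R.
    n_tryptic = start == 0 or protein_seq[start - 1] in "KR"
    # C-terminal end is tryptic if the peptide ends in K/R or ends the protein.
    c_tryptic = end == len(protein_seq) or peptide[-1] in "KR"
    if n_tryptic and c_tryptic:
        return "fully tryptic"
    if n_tryptic or c_tryptic:
        return "semi-tryptic"
    return "non-tryptic"

# Hypothetical toy sequence, not a real Plasmodium protein.
toy_protein = "MAGLKTEVSPRDNWYK"
labels = {p: classify_termini(toy_protein, p) for p in ("TEVSPR", "TEVS", "EVS")}
```

      Under the principle described above, ligand binding would be expected to lower the abundance of peptides such as the semi-tryptic "TEVS" and raise that of the fully tryptic "TEVSPR" covering the same region.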

      To aid visualisation, we have not added amino acid positions on the PfA-M1 sequence in Fig. 5, but have provided amino acid positions for all peptides in Supplementary File 3. We have also changed the colour of the ligand in Fig. 5C to blue and increased transparency of the binding and centre of mass neighbourhoods.

      Question 6. Gametocyte assays. Line 824 states that several compounds were used as positive controls for anti-gametocyte activity (chloroquine, artesunate, pyronaridine, pyrimethamine, dihydroartemisinin, and methylene blue) and line 821 states that the biological effects are measured against puromycin. This is not very clear to me, could the authors comment on this?

      This wording has been clarified in the methods to reflect that 5 µM puromycin was used as the positive control to calculate percent viability, whereas the other antimalarials were run in parallel as reference compounds with known anti-gametocyte activity (line 862).

      Question 7. Metabolomics. Metabolomic assays were done on parasites at 28h pi, incubated for 1h with 3x EC50 of MIPS2673. You mention applying the drug on 2x10E8 infected red blood cells (line 838) but you do not explain how you isolate these infected red blood cells from non-infected red blood cells. Could you please specify this?

      Metabolomics studies were performed such that cultures at 2% haematocrit and 6% trophozoite-stage parasitaemia (representing 2 × 10⁸ cells in total, rather than 2 × 10⁸ infected cells) were treated with compound or vehicle and after 1 h metabolites were extracted. This methodological detail has been clarified in the methods (line 875).
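
      As a worked illustration of the distinction between total and infected cells, the arithmetic below takes parasitaemia to mean the percentage of red blood cells that are infected and uses the figures quoted in this response. The resulting infected-cell number is a derived estimate, not a value reported by the authors.

```python
# Figures quoted in the response.
total_cells = 2 * 10**8        # cells per sample (infected plus uninfected)
parasitaemia_percent = 6       # trophozoite-stage parasitaemia

# Integer arithmetic avoids floating-point rounding in the illustration.
infected_cells = total_cells * parasitaemia_percent // 100
uninfected_cells = total_cells - infected_cells
```

      On these assumptions each sample contains about 1.2 × 10⁷ infected cells among 2 × 10⁸ cells in total.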

      Question 8. Figure 3B. Does this diagram come from the experimental 3D structure created by the authors (8SLO) or from molecular modeling? Please specify in the legend (line 1305).

      The diagram showing the binding mode of MIPS2673 bound to PfA-M1 comes from the experimentally determined 3D structure (PDB ID: 8SLO). This has now been stated in the figure legend. Note that the structural diagram refers to Fig. 1B (not Fig. 3B as indicated by the reviewer). The experimentally determined PfA-M1 structure with MIPS2673 bound (PDB ID: 8SLO) was also used to map LiP peptides and estimate the MIPS2673 binding site in Fig. 5, which is also now reflected in the appropriate section of the text (line 308) and Fig. 5 legend.

      Question 9. Line 745. Why not indicate the µM concentration for this H-Leu-NHMec substrate while it is indicated for the other substrates mentioned in the rest of the paragraph (H-Ala-NHMec, 20 μM, etc.)? Also in this section (Enzyme assays) the pH at which the various enzymatic assays were done is missing.

      All enzyme assays were performed at pH 8.0. The concentration of H-Leu-NHMec varied depending on the enzyme assayed, as follows: 20 µM for PfA-M1, 40 µM for PvA-M1 and 100 µM for ERAP1 and ERAP2. This information is now clearly stated in the methods section (lines 782 and 787) and as a footnote for Supplemental Table S1.

      Question 10. Line 830, please define FBS.

      Fetal bovine serum (FBS) has been added where appropriate (line 867).

      Question 11. The authors mention in the title the targeting of several Plasmodium species, but the only experimental study on the Plasmodium vivax species concerns the use of the recombinant enzyme Pv-M1. Authors also mention "multi-stage targets", but ultimately only look at erythrocyte stages and three different gametocyte stages.

      We have now removed the words “cross-species” and “multi-stage” from the manuscript title and abstract so as not to overstate these findings. We have also added the word “potential” in the manuscript text to clarify that selective M1 inhibition could offer a potential multistage and cross species strategy for malaria.

      Question 12. Supplemental Table S1. I would suggest replacing "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidases compared to selected human M1 homologues" with "Percent inhibition by MIPS2673 of PfA-M1 and Pv-M1 aminopeptidase activities compared to selected human M1 homologues".

      Done.

      Question 13. Supplemental Table S3. Here you indicate IC50 while in text and Figure 1 you quote EC50. Why this difference?

      This has now been changed to EC50 in Supplemental Table S3.

      Reviewer #2 (Recommendations For The Authors):

      Amendments that I would recommend in order to improve the presentation include all four parts of the study:

      (1) In vitro antiparasitic activity of MIPS2673.

      The authors showed that MIPS2673 inhibits parasite growth with IC50 of 324 nM measured by a standard drug sensitivity assay, Fig 1C. This is all well and good, but it would be helpful to include at least one if not more other compounds such as antimalarial drugs and/or their earlier inhibitors (e.g. inhibitor 1) for comparisons. This is typically done to show that the assay in this manuscript is fully compatible with previous studies. It will also give a better view of how the selective inhibition of PfM1 kills the parasite, specifically.

      Alongside MIPS2673, we also analysed the potency of the known antimalarial artesunate, which was found to have an EC50 of 4 nM. This value agrees with the expected potency of artesunate and indicates our MIPS2673 value of 324 nM is indeed compatible with previous studies. We have now reported the artesunate EC50 value for reference (lines 197-198 and Fig. S1).

      Next, the authors proceeded to investigate the stage-specific effect of MIPS2673 but this time doing a survival assay instead of proper IC50 estimations (Figure 1). I wonder why? Drug survival assays typically have very limited information content and measuring proper IC50 in stage-specific wash-off assays would be much more informative.

      We performed single-concentration stage specificity assays to determine the parasite asexual stage at which MIPS2673 is most active. This involved washing off the compound after a 24 h exposure in rings or trophozoites and determining parasite viability in the next asexual lifecycle. While a full dose response curve would allow generation of an EC50 value against the respective parasite stages, this information is unlikely to change the interpretation that MIPS2673 is more active against trophozoite stages than against rings.

      Finally, in Figure 1E, the authors present the fact that the MIPS2673 arrests the parasite development. This is done by presenting a single (presumably representative) cell per time point. This is in my view highly insufficient. I recommend this figure be supplemented by parasite stage counts or other more comprehensive data representation. Also, the authors mention that while there is a growth arrest, hemoglobin is still being made. From the cell images, I can not see anything that supports this statement.

      We thank the reviewer for this constructive comment; they are correct in their assessment that these are representative parasite images at the respective time points. To address the reviewer's concerns we have now provided cell counts from each treatment condition (Fig. 1E) at selected time points, which show parasite stalling at the ring to trophozoite transition under drug treatment. On reflection, we agree that it is difficult to determine the presence of haemozoin from our images and have removed this statement.

      (2) Protein thermal shift profiling. In the next step, the authors proceed to carry out cellular thermal shift profiling to show that PfM1 indeed interacts with MIPS2673, this time in the context of the total protein lysates from P. falciparum. This section of the study is in my view quite solid and indeed it is nice to see that the inhibitor causes a thermal shift of PfM1 which further supports what was already expected: interaction.

      I have no problem with this study in terms of the technical outcome but I would urge the authors to tone down the interpretation of these results in two ways.

      Four other proteins were found to be shifted by the inhibitor which also indicates interactions. Calling it simply "off-target" interactions might not represent the truth. The authors should explore and in some way comment that interactions with these proteins could contribute to the MIPS2673 MOA. I do not suggest conducting any more studies but simply acknowledge this situation. Identifying more than one target is indeed very common in CETSA studies and it would be helpful to acknowledge this here as well.

      We agree that identifying binding proteins in addition to the “expected” target is commonplace, and is indeed one of the benefits of this unbiased and proteome-wide approach. In the results and discussion, we have now amended our language to refer to these additional hits as MIPS2673-interacting proteins. In our original manuscript we dedicate a paragraph in the discussion to these additional interacting proteins and the likelihood of them being targets that contribute to antimalarial activity. Of these four additional interacting proteins, only the putative AP2 domain transcription factor (PF3D7_1239200) is predicted to be essential for blood stage growth and is therefore the only protein from this additional four that would likely contribute to antimalarial activity. These points are explicitly stated in the discussion (lines 530-550). Notably, all of the other interacting proteins identified in our thermal stability dataset were detected in our LiP-MS experiment but were not identified as interacting proteins by this method. The remaining three proteins were two non-essential P. falciparum proteins with unknown functions (PF3D7_1026000 and PF3D7_0604300) that are poorly described in the literature and a human protein (RAB39A). Further analysis of these other thermal stability proteomics hits in our LiP-MS dataset (see responses to Reviewer #3) identified none or only 1 significant LiP peptide from these proteins across our LiP-MS datasets, indicating they are likely to be false positive hits. Caveats around identifying protein targets by different deconvolution methods are also now addressed (lines 545-550).

      At some point, the author argues that causing shifts of only four/five proteins including PfM1 shows that MIPS2673 does not interact with other (off) targets. Here one must be careful to present the lack of shifts in the CETSA as proof of no interaction. There are many reasons why thermal shifts are not observed including the physical properties of the individual proteins, detection limit etc. Again I suggest adjusting these statements accordingly.

      We thank the reviewer for raising this important point and have now included additional discussion around this comment (lines 545-550).

      Finally, I am not convinced that Figure 2 presents anything more than the overall experimental scheme, with not much new information. Many such schemes were published previously in the original publications on thermal profiling. I would suggest omitting it from the main text and shifting it into supplementary methods etc.

      We agree that similar schemes have been published previously, especially for thermal proteome profiling, and acknowledge the reviewer’s suggestion of moving this figure to the supplemental material. However, we have kept Fig. 2 in the main text as this scheme also incorporates a LiP-MS workflow for malaria drug target deconvolution (the first to do so) and also to satisfy the additional details requested for this figure by Reviewer #1 (question 3).

      (3) Identification of MIPS2673 target proteins using LiP-MS. In the next step, the authors carried out the limited proteolysis analysis with the rationale that protein peptides that are near the inhibitor binding site will exhibit higher resilience to proteolysis. The authors did a very good job of showing this for PfM1-MIPS2673 interaction. This part is very impressive from a technological perspective, and I congratulate the authors on such achievement. I imagine these types of studies require very precise optimizations and performance.

      Here, however, I struggle with the meaning of this experiment for the overall flow of the manuscript. It seems that the binding pocket of MIPS2673 is already known since the inhibitor was designed for it. In fact, the authors mentioned that the crystal structure of PfM1 is available. From this perspective, the LiP-MS study represents more of a technical proof of concept for future drug target analysis but has limited contribution to the already quite well-established PfM1-MIPS2673 interaction. Perhaps this could be presented in this way in the text.

      We thank the reviewer for this comment and they are correct that we solved the crystal structure of PfA-M1 bound to MIPS2673. We wish to highlight that the primary reason for performing the LiP-MS study was as an independent and complementary target deconvolution method to narrow down the shortlist of targets identified with thermal stability proteomics, and validate with high confidence that PfA-M1 is indeed the primary target of MIPS2673 in parasites. The use of a complementary approach based on a different biophysical principle (proteolytic susceptibility vs thermal stability) would also allow us to identify MIPS2673 interacting proteins that may not be detectable by thermal stability proteomics, for example targets that do not alter their thermal stability upon ligand binding. The text in the results and discussion has been amended to clarify these points (lines 266-268 and 545-550).

      Furthermore, we agree that correctly predicting the MIPS2673 binding site on PfA-M1 using our LiP-MS peptide data is a technical proof of concept. Indeed, we wished to highlight the potential utility of LiP-MS for identifying both the protein targets of drugs and predicting their binding site, which is not possible with many other target deconvolution approaches. This point has been updated in the text (lines 303-304, 459-461).

      (4) Metabolomic profiling of MIPS2673 inhibition showed a massive accumulation of short peptides which clearly indicates that this inhibitor blocks some proteolytic activity of short peptides, presumably products of upstream proteolytic activities. Here the authors argue that because many of these detected short (di-/tri-) peptides could be mapped on the hemoglobin protein sequence, this must be their origin. Although this might be the case, the authors could not exclude the fact that at least some of these come from other sources (e.g. Plasmodium proteins). It would be quite helpful to comment on such a possibility as well. In particular, it was mentioned that the main subcellular localization of PfM1 is in the cytoplasm while most if not all hemoglobin digestion occurs in the digestive vacuole...?

      Indeed, we agree that PfA-M1 is likely processing both Hb and non-Hb peptides and do not definitively conclude that all dysregulated peptides must be derived from haemoglobin. A subset of dysregulated peptides cannot be mapped to haemoglobin and must have an alternative source such as other host proteins or turnover of parasite proteins. We have amended the discussion to better reflect these possible alternate peptide sources (lines 480-482). Although the peptides detected in the metabolomics study (2-5 amino acids) are too short to be definitively assigned to any specific parasite or RBC protein, it is important to note that our analysis strongly indicates that the majority, but not all, of dysregulated peptides are more likely to originate from haemoglobin than other human or parasite proteins. This is based on sequence mapping, which was aided by acquiring MS/MS data for a subset of dysregulated peptides from which we derive accurate sequences (as opposed to residue composition inferred from total peptide mass) to more directly link dysregulated peptides to haemoglobin. We further quantified the sequence similarity of dysregulated peptides to all detectable proteins in the P. falciparum infected erythrocyte proteome (~4700 proteins), showing that these peptides are statistically more similar to haemoglobin than other host or parasite proteins.
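
      As an illustration only, and not the authors' actual pipeline, the sequence-mapping idea described above can be sketched as an exact substring search of short peptides against candidate protein sequences. The protein names and sequences below are made-up toy data, and the function name `map_peptides` is hypothetical. The sketch also shows why very short peptides are ambiguous: a dipeptide can match more than one protein.

```python
def map_peptides(peptides, proteins):
    """Map each peptide to the proteins that contain it as a contiguous substring.

    peptides: iterable of one-letter amino-acid strings
    proteins: dict of {protein_name: sequence}
    """
    return {
        pep: [name for name, seq in proteins.items() if pep in seq]
        for pep in peptides
    }


# Toy sequences, invented for this example (not real haemoglobin or parasite proteins).
proteins = {
    "toy_protein_A": "MVLSPADKTNVK",
    "toy_protein_B": "MKTAYIAKQR",
}
peptides = ["LSP", "AYI", "KT", "WWW"]

hits = map_peptides(peptides, proteins)
for pep, names in hits.items():
    print(pep, names)
```

      In this toy case the dipeptide maps to both proteins, which is the kind of ambiguity that motivates using MS/MS-derived sequences and a proteome-wide similarity comparison rather than exact matching alone.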

      The apparent disconnect between PfA-M1 localisation (cytosol) and the predominant site of haemoglobin digestion (digestive vacuole, DV) is explained by the fact that peptides originating from digestion of haemoglobin in the DV are required to be transported into the cytoplasm for further cleavage by peptidases, including PfA-M1. This point has now been clarified in the discussion (lines 473-474).

      Reviewer #3 (Recommendations For The Authors):

      (1) Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      We thank the reviewer for this comment. The possibility of other targets contributing to MIPS2673 activity was also raised by Reviewer #2 (question 2) and is addressed above. Further to our response to Reviewer #2, we agree that the inability to generate resistant parasites in vitro could indicate that inhibition of multiple essential parasite proteins (including PfA-M1) contributes to MIPS2673 activity and do not rule out this possibility. It may also indicate the target has a very high barrier for resistance and is unable to tolerate resistance-causing mutations as they are deleterious to function. Indeed, previous attempts to mutate PfA-M1 (references 12 and 50), and our own attempts to generate MIPS2673 resistant parasites in vitro (unpublished), were unsuccessful. It is important to note that of the hits reproducibly identified using thermal stability proteomics, only PfA-M1 and a putative AP2 domain transcription factor (PF3D7_1239200) are predicted to be essential for blood stage growth. We have explicitly stated that PF3D7_1239200 could also contribute to activity (lines 533 and 537).

      As we identified multiple hits with thermal stability proteomics we employed the complementary LiP-MS method to further investigate the target landscape of MIPS2673. PfA-M1 was the only protein reproducibly identified as the target through this approach. Importantly, the five proteins identified as hits by thermal stability proteomics were also detected in our LiP-MS datasets, but only PfA-M1 was identified as a target by both target deconvolution methods, strongly indicating it is the primary target of MIPS2673 in parasites. An important caveat is that we profiled the soluble proteome (we did not include detergents necessary for extracting membrane proteins as they may interfere with these stability assays) and other factors (e.g. the biophysical properties of the protein) will impact on whether ligand induced stabilisation events are detected. We have added additional text in the discussion around the above points (lines 545-550).

      While we do not definitively rule out other MIPS2673 interacting proteins existing in parasites (that possibly also contribute to activity), our metabolomics studies indicated no functional impact by MIPS2673 outside of elevated levels of short peptides. This is indicative of aminopeptidase inhibition and the profile of peptide accumulation was distinct from a known PfA-M17 inhibitor, and other antimalarials, further pointing to selective inhibition of the PfA-M1 enzyme by MIPS2673 being responsible for antimalarial activity.

      (2) The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      As PfA-M1 was the only protein reproducibly identified as an interacting protein across both LiP-MS experiments (and by thermal stability proteomics) we focused our analysis on this protein. However, we agree that further analysis of the other putative interacting proteins would be valuable. Additional analysis was performed (see new figure S4) on the other interacting proteins identified by thermal stability proteomics and the other interacting proteins identified in LiP-MS experiment one, as no other proteins (apart from PfA-M1) were identified as hits in the second LiP-MS experiment (lines 314-318, 495-505, 740-762 and Fig. S4). Using the common peptides detected across both LiP-MS experiments we mapped significant LiP peptides to the structures of the other putative MIPS2673-interacting proteins, where a structure was available and significant LiP-MS peptides were detected, and measured the minimum distance to expected binding sites. It is noted that when using the same criteria for a significant LiP peptide that we used for our PfA-M1 analysis, only one significant LiP peptide is identified from these other putative interacting proteins (YSPSFMSFK from PfADA). Therefore, we used less stringent criteria for defining significant LiP peptides for these other proteins (see methods and Fig. S4 legend) in order to identify significant LiP peptides to map to structures. This analysis showed that, with the exception of PfA-M17, significant LiP-MS peptides for these other proteins are not significantly closer to binding sites than all other detected peptides, supporting our assertion that these other proteins are likely to be false positives or not functionally relevant MIPS2673 interacting proteins. Although significant peptides from PfA-M17 were closer to the binding site, our thermal stability and metabolomics data, combined with our previous work on the PfA-M17 enzyme, argue against this being a functionally relevant target (see lines 362-374 and 486-529 for a more detailed discussion). Another possible explanation for this result is that peptide substrates accumulating due to primary inhibition of PfA-M1 interact with PfA-M17, leading to structural changes around the enzyme active site that are detected by LiP-MS.
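
      For readers unfamiliar with this kind of measurement, a minimal sketch of a "minimum distance to binding site" calculation follows. It is an assumption-laden illustration, not the authors' method: the coordinates are invented, the function name `min_distance` is hypothetical, and a real analysis would read atom coordinates from a structure file and compare significant peptides against all detected peptides statistically.

```python
import math


def min_distance(peptide_coords, site_coords):
    """Smallest Euclidean distance between any peptide atom and any binding-site atom.

    Both arguments are lists of (x, y, z) tuples, in the same units (e.g. angstroms).
    """
    return min(math.dist(a, b) for a in peptide_coords for b in site_coords)


# Invented coordinates for illustration only.
site = [(0.0, 0.0, 0.0)]
near_peptide = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
far_peptide = [(0.0, 30.0, 40.0)]

near = min_distance(near_peptide, site)
far = min_distance(far_peptide, site)
print(near, far)
```

      Comparing such distances for significant peptides against the distribution for all detected peptides of the same protein is one way to ask whether the significant peptides cluster around the expected binding site.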

      (3) The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      As indicated, the accumulation of short peptides identified by metabolomics suggests MIPS2673 perturbs aminopeptidase function. Many of these peptides (but not all) likely map to haemoglobin and are more haemoglobin-like than other proteins in the infected red blood cell proteome. An effect on a subset of non-haemoglobin peptides is also apparent and we have added this to our discussion (also refer to our response to question 4 from Reviewer #2). A direct comparison to our previous metabolomics analysis of a specific PfA-M17 inhibitor (MIPS2571, reference 11) revealed MIPS2673 induces a unique metabolomic profile. The extent of peptide accumulation differed and a subset of short basic peptides (containing Lys or Arg) were elevated only by MIPS2673, consistent with the broad substrate preference of PfA-M1. Importantly, the metabolomics profile induced by MIPS2673 is the opposite of many other antimalarials, which cause depletion of haemoglobin peptides. Taken together, the profile of short peptide accumulation induced by MIPS2673 is consistent with specific inhibition of PfA-M1.

      (4) Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme.

      Contrary to this comment, we solved the crystal structure of PfA-M1 bound to MIPS2673, determining its binding mechanism to the enzyme. This was further supported through proteomics-based structural analysis by LiP-MS. Undertaking site-specific mutagenesis would be interesting to further probe the binding dynamics of MIPS2673 to the M1 protein. However, we believe it is beyond the scope of this study and would not change our conclusion that MIPS2673 binds to PfA-M1, which we have shown using multiple unbiased proteomics-based methods, enzyme assays and X-ray crystallography.

      (5) In the thermal stability and LiP-MS analysis, other proteins were consistently identified in addition to PfA-M1 and yet no additional analysis was undertaken to explore these as potential targets.

      As addressed in our previous responses, across independent thermal stability proteomics experiments we consistently identified 5 interacting proteins, including the expected target PfA-M1. In contrast, only PfA-M1 was reproducible across independent LiP-MS experiments. While several plausible putative targets (including aminopeptidases and metalloproteins) were identified in one of our LiP-MS experiments, they appear to be false discoveries and not responsible for the antiparasitic activity of MIPS2673, as peptide-level stabilisation was not consistent across independent LiP-MS experiments, and an interaction is refuted by our thermal stability, metabolomics and recombinant enzyme inhibition data. We have now performed further analysis of these other putative interacting proteins, which also argues against them being likely interacting proteins (see also response to question 2). We have also added to our existing discussion on possible MIPS2673 targets and the likelihood of these proteins contributing to antimalarial activity (lines 486-550).

      (6) The metabolomics experiments were potentially interesting, but without significant additional work including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion - either directly or indirectly and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion.

      Our metabolomics studies were performed using typical experimental conditions for investigating the antimalarial mechanisms of compounds by metabolomics (see references 11, 39, 40 and 55-57). We used a short 1 h incubation at 3x EC50 allowing us to profile the primary parasite pathways affected by MIPS2673 and avoid a nonspecific death phenotype associated with longer incubations. As addressed in our response to Reviewer #1 (question 2) we focused on trophozoite infected red blood cells as this is the stage most susceptible to MIPS2673 and when one presumes the greatest functional impact would be seen. It is possible that an expanded kinetic metabolomics analysis may reveal secondary mechanisms involved in MIPS2673 activity and we have now acknowledged this in the manuscript (lines 515-516). However, even though secondary mechanisms may become apparent at longer incubations it also becomes difficult to uncouple drug specific responses from nonspecific death effects. We believe any additional information provided by an expanded metabolomics analysis is unlikely to outweigh the significant extra financial cost associated with this type of experiment.

      It is correct that many antimalarial compounds appear to disrupt haemoglobin digestion when analysed by metabolomics. However, as indicated in our manuscript (lines 369-373) and previous responses, the profile of elevated haemoglobin peptides induced by MIPS2673 is substantially different to the profile caused by other antimalarials. For example, artemisinins and mefloquine cause haemoglobin peptide depletion (references 55-57) and chloroquine results in increased levels of a different subset of non-haemoglobin peptides (see Creek et al. 2016). While there is some overlap in profile with a selective M17 inhibitor (our previous work, reference 11), the level of enrichment of these peptides is different and MIPS2673 also induces accumulation of a distinct set of basic peptides consistent with the substrate preference of the PfA-M1 enzyme. As we show that MIPS2673 does not inhibit other parasite aminopeptidases, a likely explanation for the profile overlap is that the build-up of substrates that cannot be processed by PfA-M1 leads to secondary dysregulation of other aminopeptidases. Our analyses (sequence mapping, MS/MS analysis and sequence similarities to all infected red blood cell proteins) strongly indicate that the majority of elevated peptides (but not all) originate from haemoglobin. Combined with our proteomics and recombinant enzyme data indicating direct engagement of PfA-M1, and with previous literature indicating the enzyme functions to cleave amino acids from haemoglobin-derived peptides, our data indicates MIPS2673 likely directly perturbs the haemoglobin digestion pathway through PfA-M1 inhibition.

      (7) Finally, the potency of this compound on parasites grown in vitro is 300 nM - this would need improvements in potency and demonstration of in vivo efficacy in the SCID mouse model to consider this a candidate for a drug.

      We do not propose MIPS2673 as an antimalarial candidate. The experiments presented here were centred on target validation rather than identification of an antimalarial lead, which may be the focus of future studies. To avoid this confusion, we have amended the manuscript title and language throughout to clarify this point.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study advances our understanding of the allosteric regulation of anaerobic ribonucleotide reductases (RNRs) by nucleotides, providing valuable new structural insight into class III RNRs containing ATP cones. The cryo-EM structural characterization of the system is solid, but some open questions remain about the interpretation of activity/binding assays and the newly incorporated HDX-MS results. The work will be of interest to biochemists and structural biologists working on ribonucleotide reductases and other allosterically regulated enzymes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The goal of this study is to understand the allosteric mechanism of overall activity regulation in an anaerobic ribonucleotide reductase (RNR) that contains an ATP-cone domain. Through cryo-EM structural analysis of various nucleotide-bound states of the RNR, the mechanism of dATP inhibition is found to involve order-disorder transitions in the active site. These effects appear to prevent binding of substrate and a radical transfer needed to initiate the reaction.

      Strengths of the manuscript include the comprehensive nature of the work - including both numerous structures of different forms of the RNR and detailed characterization of enzyme activity to establish the parameters of dATP inhibition. The manuscript has been improved in a revision by performing additional experiments to help corroborate certain aspects of the study. But these new experiments do not address all of the open questions about the structural basis for mechanism. Additionally, some questions about the strength of biochemical data and fit of binding or kinetic curves to data that were raised by other referees still remain. Some experimental observations are not consistent with the proposed model. For example, why does dATP enhance Gly radical formation when the proposed mechanism of dATP inhibition involves disorder in the Gly radical domain?

      The work is impactful because it reports initial observations about a potentially new mode of allosteric inhibition in this enzyme class. It also sets the stage for future work to understand the molecular basis for this phenomenon in more detail.

      We express our gratitude to the reviewer for dedicating time to review our work and for the overall favorable assessment. We agree that the question of exactly how much the glycyl radical domain becomes more mobile without losing the glycyl radical entirely is an unresolved one but we also think that our work sets a solid basis for future experiments by us and others.

      Reviewer #3 (Public Review):

      The manuscript by Bimai et al. describes a structural and functional characterization of an anaerobic ribonucleotide reductase (RNR) enzyme from the human microbe, P. copri. More specifically, the authors aimed to characterize the mechanism by which (d)ATP modulates nucleotide reduction in this anaerobic RNR, using a combination of enzyme kinetics, binding thermodynamics, and cryo-EM structural determination, complemented by hydrogen-deuterium exchange (HDX). One of the principal findings of this paper is the ordering of a NxN 'flap' in the presence of ATP that promotes RNR catalysis and the disordering (or increased protein dynamics) of both this flap and the glycyl radical domain (GRD) when the inhibitory effector, dATP, binds. The latter is correlated with a loss of substrate binding, which is the likely mechanism for dATP inhibition. It is important to note that the GRD is remote (>30 Å) from the binding site of the dATP molecule, suggesting long-range communication of the structural (dis)ordering. The authors also present evidence for a shift in oligomerization in the presence of dATP. The work does provide evidence for new insights/views into the subtle differences of nucleotide modulation (allostery) of RNR, in a class III system, through long-range interactions.

      The strengths of the work are the impressive, in-depth structural analysis of the various regulated forms of PcRNR by (d)ATP using cryo-EM. The authors present seven different models in total, with striking differences in oligomerization and (dis)ordering of select structural features, including the GRD that is integral to catalysis. The authors present several, complementary biochemical experiments (ITC, MST, EPR, kinetics) aimed at resolving the binding and regulatory mechanism of the enzyme by various nucleotides. The authors present a good breadth of the literature in which the focus of allosteric regulation of RNRs has been on the aerobic orthologues.

      The addition of hydrogen-deuterium exchange mass spectrometry (HDX-MS) complements the results originating from cryo-EM data. Most notable is the observation of the enhanced exchange (albeit quite subtle) of the GRD domain in the presence of dATP that matches the loss of structural information in this region in the cryo-EM data. The most pronounced and compelling HDX results are seen in the form of dATP-induced protection of peptides immediately adjacent to the β-hairpin at the s-site, where dATP is expected to bind based on cryo-EM. It is clear that the presence of dATP increases the rigidity of this region.

      We are happy that both reviewers find the HDX-MS experiments to be a valuable addition to the existing data.

      Weaknesses:

      The discussion of the change in peptide mobility in the N-terminal region is complicated by the presence of bimodal mass spectral features and this may prevent detailed interpretation of the data, especially for the select peptide region that shows opposite trends upon nucleotide association.

      Further, the HDX data in the NxN flap is unchanged upon nucleotide binding (ATP, dATP, or CTP), despite changes observed in the cryo-EM data.

      We are grateful to the reviewer for the comprehensive feedback on the HDX-MS part and for identifying areas for improvement. The HDX analysis was of course undertaken with the intention of identifying differences in disorder of the NxN flap and GRD region. From an HDX perspective both regions were found to be highly susceptible to HDX regardless of state/ligand, due to surface accessibility and/or very fast dynamics. However, this does not mean that there is no difference in the degree of order of these regions upon ligand addition, simply that, with HDX-MS in the limited time span of 30-3000 seconds, we could not conclusively support an increased disorder. We have rephrased the discussion text to reflect this fact.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      On page 5 (and throughout the manuscript) there are some inconsistencies in how dissociation constants for effectors and inhibitors are described - for example, D in KD is sometimes subscripted and sometimes not.

      Thank you for noticing these remaining errors. We hope that we have fixed all of them now.

      Reviewer #3 (Recommendations For The Authors):

      The authors addressed many of the initial concerns raised. The addition of the HDX-MS data in this revision is a welcomed contribution to the work and complements the cryo-EM data. In select cases, the data may be over-interpreted. This reviewer suggests that the authors revise the text in this section so that it is more consistent with the presented data.

      Specific points:

      (1) The bimodal mass spectral features in the N-terminal domain complicate the data interpretation. Specifically for peptides in the 81-99 region, the fast exchanging feature shows protection in the presence of (d)ATP/CTP, but the opposite trend is observed for the slow exchanging species. It is therefore advisable to not make absolutes about the HDX results in this region, as the data are complicated.

      As stated by the reviewer, it is not possible from the presented HDX data to deduce if this is a result of 50% loaded dimer or the oligomerization state of the protein. We have remedied this by removing mentions of a difference between the dATP and ATP in bimodality. Also, we have addressed this in the text by stating that the main reason is most likely the different oligomerization states present in solution. Nevertheless, it is clear from the HDX data that the N-terminal region and 81-99 are very interesting, and it was somewhat disappointing that due to the dynamics of the oligomerization it was not possible to SEC-purify pure dimer or tetramer samples for HDX-MS, in order to deconvolute the cause.

      (2) Related to #1, the authors assign the bimodal HDX behavior to EX1 mechanism, but this is not necessarily (and unlikely) true based on the limited time points. The authors also state that it originates from the heterogeneity of the sample: "a mixture of states" which could reflect the mixture of oligomerization states. The authors should be careful assigning EX1 mechanism unless there are compelling results to support it.

      We apologize for the unfortunate phrasing. It was not our intention to imply that the bimodality is due to true EX1 kinetics. See the above answer. The mention of EX1 has been removed from the discussion text.

      (3) The deuterium uptake for peptide 118-126 is very small (~1Da) compared to the length of the peptide. The change in deuterium uptake (<0.25Da) from dATP is very small; the authors should proceed with caution when presenting interpretations of such small differences.

      We agree with the reviewer that extra caution should be taken when dealing with such a small difference. However, the 118-126 peptide has been significance tested in both HDExaminer and Deuteros 2.0, and we also observed this for more than one run. The difference in uptake is small but increases to significance at the longer labelling times. The proximity to the NxN flap makes it interesting in the context of an allosteric conformational change, i.e. the dynamics of the NxN flap might be too fast, so we can only see some secondary effects. We would like to keep the data in Figure 10 for reasons of transparency. In essence this is similar to the observed bimodality mentioned above: we cannot fully explain the observation but present the data as it was observed.

      (4) On p. 22, the authors should consider revising the following statement: "confirming dATP binding to the s-site." Even though the HDX data are most compelling for the protection of peptides 178-204 and 330-348 that are adjacent to the beta-hairpin at the s-site, these data cannot "confirm" a binding site for a small molecule, such as dATP.

      We appreciate that the reviewer has pointed out that the statement can be misleading, and we agree that the binding site of small molecules can’t be confirmed based solely on HDX data. The sentence has been reformulated to clarify that the binding site was confirmed based on the combined evidence of HDX data and the previously presented biochemical and structural data on the s-site.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This manuscript reports important in vitro biochemical and in planta experiments to study the receptor activation mechanism of plant membrane receptor kinase complexes with non-catalytic intracellular kinase domains. Several lines of evidence convincingly show that one such putative pseudokinase, the immune receptor EFR achieves an active conformation following phosphorylation by a co-receptor kinase, and then in turn activates the co-receptor kinase allosterically to enable it to phosphorylate down-stream signaling components. This manuscript will be of interest to scientists focusing on cell signalling and allosteric regulation.

      We wish to clarify that EFR itself is not a pseudokinase. We could show in previous work (Bender et al., 2021; https://doi.org/10.1073/pnas.2108242118) that EFR has catalytic activity in vitro. This catalytic activity is, however, not required for elf18-induced immune signaling in planta.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      The authors use an elegant but somewhat artificial heterodimerisation approach to activate the isolated cytoplasmic domains of different receptor kinases (RKs) including the receptor kinase BRI1 and EFR. The developmental RK BRI1 is known to be activated by the co-receptor BAK1. Active BRI1 is then able to phosphorylate downstream substrates. The immune receptor EFR is also an active protein kinase also activated by the co-receptor BAK1. EFR however appears to have little or no kinase activity but seems to use an allosteric mechanism to in turn enable BAK1 to phosphorylate the substrate kinase BIK1. EFR tyrosine phosphorylation by BAK1 appears to trigger a conformational change in EFR, activating the receptor. Likewise, kinase activating mutations can cause similar conformational transitions in EFR and also in BAK1 in vitro and in planta.

      We wish to clarify that we make no strong link between tyrosine phosphorylation and the conformational change leading to activation of the complex. Rather, the HDX-MS data demonstrate the structural importance of Tyr836 for the activation mechanism. At present, we do not know how phosphorylation of the residue would affect the activation process.

      Strengths:

      I particularly liked the HDX experiments coupled with mutational analysis (Fig. 2) and the design and testing of the kinase activating mutations (Fig. 3), as they provide novel mechanistic insights into the activation mechanisms of EFR and of BAK1. These findings are nicely extended by the large-scale identification of EFR-related RKs from different species with potentially similar activation mechanisms (Fig. 5).

      Weaknesses:

      In my opinion, there are currently two major issues with the present manuscript. (1) The authors have previously reported that the EFR kinase activity is dispensable for immune signaling (https://pubmed.ncbi.nlm.nih.gov/34531323/) but the wild-type EFR receptor still leads to a much better phosphorylation of the BIK1 substrate when compared to the kinase inactive D849N mutant protein (Fig. 1). (2) How the active-like conformation of EFR is in turn activating BAK1 is poorly characterized, but appears to be the main step in the activation of the receptor complex. Extending the HDX analyses to resting and Rap-activated receptor complexes could be a first step to address this question, but these HDX studies were not carried out due to technical limitations.

      Overall this is an interesting study that aims to advance our understanding of the activation mechanisms of different plant receptor kinases with important functions in plant immunity.

      Reviewer #2 (Public Review):

      Summary:

      Transmembrane signaling in plants is crucial for homeostasis. In this study, the authors set out to understand to what extent catalytic activity in the EFR tyrosine kinase is required in order to transmit a signal. This work was driven by mounting data that suggest many eukaryotic kinases do not rely on catalysis for signal transduction, relying instead on conformational switching to relay information. The crucial findings reported here involve the realisation that a kinase-inactive EFR can still activate (i.e. lead to downstream phosphorylation of) its partner protein BAK1. Using a convincing set of biochemical, mass spectrometric (HD-exchange) and in vivo assays, the team suggest a model in which EFR is likely phosphorylated in the canonical activation segment (where two Ser residues are present), which is sufficient to generate a conformation that can activate BAK1 through dimerisation. A model is put forward involving C-helix positioning in BAK1, and the model is extended to other 'non-RD' kinases in Arabidopsis that likely do not require kinase activity for signaling.

      We prefer not to describe EFR as a tyrosine kinase. It may be the case that EFR can function under certain conditions as a dual-specificity protein kinase, but this has never been demonstrated experimentally. We therefore describe EFR as a Ser/Thr protein kinase, since it is known that the isolated cytoplasmic domain can phosphorylate on Ser and Thr residues (Wang et al., 2014; https://doi.org/10.1016/j.jprot.2014.06.009).

      Strengths:

      The work uses logical and well-controlled approaches throughout, and is clear and convincing in most areas, linking data from IPs, kinase assays (including clear 32P-based biochemistry), HDX-MS data (from non-phosphorylated EFR), structural biology, oxidative burst data and infectivity assays. Repetitions and statistical analysis all appear appropriate.

      Overall, the work builds a convincing story and the discussion does a clear job of explaining the potential impact of these findings (and perhaps an explanation of why so many Arabidopsis kinases are 'pseudokinases', including XPS1 and XIIa6, where this is shown explicitly).

      Weaknesses:

      No major weaknesses are noted from reviewing the data and the paper follows a logical course built on solid foundations; the use of Tables to explain various experimental data pertinent to the reported studies is appreciated.

      (1) The use of a, b, c, d in Figures 2C and 3C etc. is confusing to this referee, and is now addressed in the latest version.

      (2) The debate about kinase v pseudokinases is well over a decade old. For non-experts, the kinase alignments/issues raised are in PMID: 23863165 and might prove useful if cited.

      We have cited the suggested reference in the second paragraph of the discussion.

      (3) Early on in the paper, the concept of kinases and pseudokinases related to R-spine (and extended R-spine) stability and regulation really needs to be more adequately introduced to explain what comes next; e.g. some of the key work in this area for RAF and Tyr kinases where mutual F-helix Phe amino acid changes are evaluated (conceptually similar to this study of the E-helix Tyr to Phe changes in EFR) should be cited (PMID: 17095602, 24567368 and 26925779).

      As an alternative, we have amended the text in several places to focus on conformational toggling between active/inactive states rather than R-spine stability. We think that this keeps the message of our manuscript focused. We hope that the reviewer finds this acceptable.

      (4) In my version, some of the experimental text is also currently in the wrong order (and no page numbers, so hard for me to state exactly where in the manuscript); However, I am certain that Figure 2C is mentioned in the text when the data are actually shown in Figure 3C for the EFR-SSAA protein.

      Indeed, some references to Figure 2 in the text were incorrect. We have corrected these. References in the text to Figure 3 and the data reported therein are correct.

      (5) Tyr 156 in PKA is not shown in Supplement 1, 2A as suggested in the text; for readers, it will be important to show the alignment of the Tyr residue in other kinases; this has been updated in the second version. Although it is clearly challenging to generate phosphorylated EFR (seemingly through Codon-expansion here?), it appears unlikely that a phosphorylated EFR protein, even semi-pure, couldn't have been assayed to test the idea that the phosphorylation drives/supports downstream signaling. What about a DD or EE mutation, as commonly used (perhaps over-used) in MEK-type studies?

      Our aim with codon expansion was to generate recombinant protein carrying high-stoichiometry phosphorylation at sites which we have previously documented to be required for downstream signaling (Macho et al., 2014; Bender et al., 2021). We additionally demonstrated previously that a DD mutant of the activation loop sites in EFR does not fully complement the efr-1 mutant (Bender et al., 2021), suggesting that the Asp mutations are not good phospho-mimics in this context. We therefore did not generate DD or EE mutations for in vitro studies.

      Impact:

      The work is an important new step in the huge amount of follow-up work needed to examine how kinases and pseudokinases 'talk' to each other in (especially) the plant kingdom, where significant genetic expansions have occurred. The broader impact is that we might understand better how to manipulate signaling for the benefit of plants and mankind; as the authors suggest, their study is a natural progression both of their own work, and the kingdom-wide study of the Kannan group.

      Reviewer #3 (Public Review):

      The study presents strong evidence for allosteric activation of plant receptor kinases, which enhances our understanding of the non-catalytic mechanisms employed by this large family of receptors.

      Plant receptor kinases (RKs) play a critical role in transducing extracellular signals. The activation of RKs involves homo- or heterodimerization of the RKs, and it is believed that mutual phosphorylation of their intracellular kinase domains initiates downstream signaling. However, this model faces a challenge in cases where the kinase domain exhibits pseudokinase characteristics. In their recent study, Mühlenbeck et al. reveal the non-catalytic activation mechanisms of the EFR-BAK1 complex in plant receptor kinase signaling. Specifically, they aimed to determine that the EFR kinase domain activates BAK1 not through its kinase activity, but rather by utilizing a "conformational toggle" mechanism to enter an active-like state, enabling allosteric trans-activation of BAK1. The study sought to elucidate the structural elements and mutations of EFR that affect this conformational switch, as well as explore the implications for immune signaling in plants. To investigate the activation mechanisms of the EFR-BAK1 complex, the research team employed a combination of mutational analysis, structural studies, and hydrogen-deuterium exchange mass spectrometry (HDX-MS) analysis. For instance, through HDX-MS analysis, Mühlenbeck et al. discovered that the EFR (Y836F) mutation impairs the accessibility of the active-like conformation. On the other hand, they identified the EFR (F761H) mutation as a potent intragenic suppressor capable of stabilizing the active-like conformation, highlighting the pivotal role of allosteric regulation in BAK1 kinase activation. The data obtained from this methodology strengthens their major conclusion. Moreover, the researchers propose that the allosteric activation mechanism may extend beyond the EFR-BAK1 complex, as it may also be partially conserved in the Arabidopsis LRR-RK XIIa kinases. This suggests a broader role for non-catalytic mechanisms in plant RK signaling.

      The allosteric activation mechanism was demonstrated for receptor tyrosine kinases (RTKs) many years ago. A similar mechanism has been suggested for the activation of plant RKs, but experimental evidence for this conclusion is lacking. Data in this study represent a significant advancement in our understanding of non-catalytic mechanisms in plant RK signaling. By shedding light on the allosteric regulation of BAK1, the study provides a new paradigm for future research in this area.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have considered points 1-5 raised in my initial review and the revised manuscript contains a more balanced discussion and limitation section. No additional experiments have been performed to substantiate the envisioned allosteric activation mechanism of the co-receptor kinase BAK1 by the receptor EFR. I rewrote the public statement accordingly.

      Reviewer #2 (Recommendations For The Authors):

      Thanks for responding to my comments.

      Reviewer #3 (Recommendations For The Authors):

      The revised manuscript has fully addressed my previous concerns and is now suitable for publication in eLife.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      Using concurrent in vivo whole-cell patch clamp and dendritic calcium imaging, the authors characterized how functional synaptic inputs across dendritic arborizations of mouse primary visual cortex layer 2/3 neurons emerge during the second postnatal week. They were able to identify spatially and functionally separated domains of clustered synapses in these neurons even before eye-opening and characterize how the clustering changes from P8 to P13. 

      Strengths: 

      The work is technically challenging and the findings are novel. The results support previous EM and immunostaining studies but provide in vivo evidence on the time course and the trajectory of how functional synaptic input develops. 

      Weaknesses: 

      There are some missing details about how the experiments were performed, and I also have some questions about the analyses. 

      We have now added a more detailed description of the methods and added new supplemental figures and descriptions to clarify our analyses. Please find our responses to the specific points of this reviewer in the section “Recommendations for the authors” below.

      Reviewer #2 (Public Review):

      In this study, Leighton et al. performed remarkable experiments by combining in vivo patch-clamp recording with two-photon dendritic Ca2+ imaging. The voltage-clamp mode is a major improvement over the pioneering versions of this combinatorial experiment that has led to major breakthroughs in the neuroscience field for visualizing and understanding synaptic input activities in single cells in vivo (sharp electrodes: Svoboda et al., Nature 1997; Helmchen et al., Nature Neurosci 1999; whole-cell current-clamp: Jia et al., Nature 2010; Chen et al., Nature 2011. I suggest that these papers be cited). This is because voltage-clamp mode, although full control of membrane voltage in vivo is not realistic, is nevertheless most effective in preventing back-propagating action potentials, which would severely confound the measurement of individual synaptically-induced Ca2+ influx events. Furthermore, clamping the cell body at a strongly depolarized potential (here the authors used -30 mV) also facilitates the detection of synaptically-induced Ca2+ influx. As a result, the authors successfully recorded high-quality Ca2+ imaging data that can be used for precise analysis. To date, even in view of the rapid progress of voltage-sensitive indicators and relevant imaging technologies in recent years, this very old 'art' of combining single-cell electrophysiology and two-photon imaging (ordinary, raster-scanned, video-rate imaging) of Ca2+ signals still enables measurements of the highest precision.

      We thank the reviewer for reminding us of these important previous studies that we cite now in the revised manuscript. 

      On the other hand, the interpretation of data in this study is a bit narrow-minded and lacks a comprehensive picture. Some suggestions to improve the manuscript are as follows: 

      (1) The authors made a segregation of 'spine synapse' and 'shaft synapse' based solely on the two photon images in-vivo. However, caution shall be taken here, because the optical resolution under in vivo imaging conditions like this cannot reliably tell apart whether a bright spot within or partially overlapping a segment of the dendrite is a spine on top of (or below) it. Therefore, what the authors consider as a 'shaft synapse' (by detecting Ca2+ hotspots) has an unknown probability of being just a spine on top or below the dendrite. If there is other imaging data of higher axial resolution to validate or calibrate, the authors shall take some further considerations or analysis to check the consistency of their data, as the authors do need such a segregation between spine and shaft synapses to show how they evolve over the brain development stages. 

      We agree with the reviewer that the differentiation between spine and shaft synapses can be difficult for those spines that are located above or below the dendritic shaft in the z-dimension because of the lower resolution of 2-photon microscopy in the z-dimension compared to the image plane. We have now added a new paragraph to the Methods section to describe in more detail how we identify spine and shaft synapses and provide more examples in a new supplementary figure (Fig S5). We believe that we can identify spine and shaft synapses reliably in most cases, but added a cautionary note to make the reader aware of potential misidentifications.

      (2) The use of terminology 'bursts of spontaneous inputs' for describing voltage-clamp data seems improper. Conventionally, 'burst' refers to suprathreshold spike firing events, but here, the authors use 'burst' to refer to inward synaptic currents collected at the cell body. Not every excitatory synaptic input (or ensemble of inputs) activation will lead to spike firing under naturalistic conditions, therefore, these two concepts are not equivalent. It is recommended to use 'barrage of inputs' instead of 'burst of inputs'. Imagine a full picture of the entire dendritic tree, the fact that the authors could always capture spontaneous Ca2+ events here and there within a few pieces of dendrites within an arbitrary field-of-view suggests that, the whole dendritic tree must have many more such events going on as a barrage while the author's patch electrode picks up the summed current flow from the whole dendritic tree. 

      We agree with the reviewer that “barrage” is a clearer term for multiple synaptic inputs occurring simultaneously and therefore we changed the terminology throughout the manuscript.

      (3) Following the above issue, an analysis of the temporal correlation between synaptic (not segregating 'spine' or 'shaft') Ca2+ events and EPSCs is absent. Again, the authors drew arbitrary time windows to clump the events for statistical analysis. However, the demonstrated example data already shows that the onset times of individual synaptic Ca2+ events do not necessarily align with the beginning of a 'barrage' inward current event. 

      The reviewer writes that “an analysis of the temporal correlation between synaptic calcium events and EPSCs is absent”. We would like to point out that we did determine the percentage of calcium transients that occurred during barrages of synaptic inputs (~60%, page 7). This is important, since the barrages in our patch-clamp recordings most likely reflect spontaneous network events as described in the developing cortex previously by us and many other labs. The time window we chose was not “arbitrary” as the reviewer suggests, but based on the duration of the barrages of synaptic inputs as defined in the Methods section.
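
      As an illustration only, and not the authors' analysis code, the percentage described above can be sketched as the fraction of calcium-transient onsets that fall inside any barrage window. The function name, the inclusive window boundaries and the example times are assumptions made for this sketch; the actual barrage definition is the one given in the Methods section of the manuscript.

```python
def fraction_in_barrages(transient_onsets, barrages):
    """Fraction of transient onset times (s) that fall inside any barrage window.

    transient_onsets: list of onset times in seconds (hypothetical input)
    barrages: list of (start, end) tuples in seconds (hypothetical input)
    """
    if not transient_onsets:
        return 0.0
    inside = sum(
        any(start <= t <= end for start, end in barrages)
        for t in transient_onsets
    )
    return inside / len(transient_onsets)


# Made-up example values, chosen only to exercise the function
onsets = [0.5, 1.2, 3.0, 4.1, 7.5]
barrages = [(1.0, 2.0), (4.0, 5.0), (7.0, 8.0)]
fraction = fraction_in_barrages(onsets, barrages)
```

      In this toy example three of the five onsets fall inside a barrage window.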

      The reason why we did not perform a more in-depth analysis of the temporal relationship between synaptic calcium transients and synaptic input currents is that it is essentially impossible to relate calcium transients at individual synapses to specific synaptic input events. First, during barrages of synaptic inputs many synapses are active simultaneously, both in the mapped dendrites as well as in the unobserved parts of the dendritic arborization as the reviewer notes above. Thus, barrages cannot be broken down into individual synaptic transmission events. Second, since our acquisition frequency is ~10 Hz, we can identify the onset of individual synaptic calcium transients with 100-200 ms precision (1 or 2 frames). However, throughout any 100-200 ms period of recording, several synapses are active across the entire dendritic arborization such that we cannot assign a given calcium transient to a specific EPSC within a 100-200 ms epoch. Third, due to the limited clamping capacity of in vivo patch recordings, we cannot be certain that individual transmission events in distal dendrites can be resolved in the patch recording.
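
      The 100-200 ms figure quoted above follows directly from the frame period. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
def onset_uncertainty_ms(frame_rate_hz, n_frames):
    """Temporal uncertainty (ms) of an onset localized to within n_frames frames."""
    frame_period_ms = 1000.0 / frame_rate_hz
    return n_frames * frame_period_ms


# ~10 Hz acquisition, onset localized to within 1 or 2 frames
one_frame = onset_uncertainty_ms(10.0, 1)
two_frames = onset_uncertainty_ms(10.0, 2)
```

      At a slower acquisition rate the frame period, and therefore the uncertainty, grows in proportion.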

      (4) The authors claim that "these observations indicate that the activity patterns investigated here are not or only slightly affected by low-level anesthesia". It would be nice to show some of the recordings in this work without any anesthesia to support this claim. 

      Indeed, the conclusion that the patterns of activity are only slightly affected by low levels of anesthesia is based on our previous recordings on the network level. Unfortunately, we are still not able to record calcium imaging with single synapse resolution in unanesthetized developing mice (and no one else is as far as we know), because the skull of these young animals is not yet firm. As a consequence, movements cannot be reduced sufficiently for patching and imaging with single synapse resolution. Our previously published (Siegel et al., 2012) and unpublished work on the cellular level suggests that activity patterns during light anesthesia are very similar to those during sleep in mouse pups at this age.

      Reviewer #3 (Public Review):

      Summary: 

      There is a growing body of literature on the clustering of co-active synapses in adult mice, which has important implications for understanding dendritic integration and sensory processing more broadly. However, it has been unclear when this spatial organization of co-active synapses arises during development. In this manuscript, Leighton et al. investigate the emergence of spatially organized, co-active synapses on pyramidal dendrites in the mouse visual cortex before eye-opening. They find that some dendrite segments contain highly active synapses that are co-active with their neighbors as early as postnatal day (P) 8-10, and that these domains of co-active synapses increase their coverage of the dendritic arbor by P12-13. Interestingly, Leighton et al. demonstrate that synapses co-active with their neighbors are more likely to increase their activity across a single recording session, compared to synapses that are not co-active with their neighbors, suggesting local plasticity driven by coincident activity before eye-opening.

      The current manuscript includes some replication of earlier results from the same research group (Winnubst et al., 2015), including the presence of clustered, co-active synapses in the visual cortex of mouse pups, and the finding that synapses co-active with their neighbors show an increase in transmission frequency during a recording session. The main novelty in the current study compared to Winnubst et al. (2015) is the inclusion of younger animals (P8-13 in the current study compared to P10-15 in Winnubst et al., 2015). The current manuscript is the first demonstration that active synapses are clustered on specific dendrite segments as early as P8-10 in the mouse visual cortex, and the first to show the progression in active synapse distribution along the dendrite during the 2nd postnatal week. These results from the visual cortex may help inform our understanding of sensory development more broadly. 

      Strengths: 

      The authors ask a novel question about the emergence of synaptic spatial organization, and they use well-chosen techniques that directly address their questions despite the challenging nature of these techniques. To capture both structural and functional information from dendrites simultaneously, the authors performed a whole-cell voltage clamp to record synaptic currents arriving at the soma while imaging calcium influx at individual synaptic sites on dendrites. The simultaneous voltage clamp and calcium imaging allowed the authors to isolate individual synaptic inputs without their occlusion by widespread calcium influx from back-propagating action potentials. Achieving in vivo dendrite imaging in live mice that are as young as P8 is challenging, and the resulting data provides a unique view of synaptic activity along individual dendrites in the visual cortex at an early stage in development that is otherwise difficult to assess. 

      The authors provide convincing evidence that synapses are more likely to be co-active with their neighbors compared to synapses located farther away (Fig. 6F-H), and that synapses co-active with their neighbors increase their transmission frequency during a recording session (Figure 7C). These findings are particularly interesting given that the recordings occur before eye-opening, suggesting a relationship between co-activity and local synaptic plasticity even before the onset of detailed visual input. These results replicate previously published findings from P10-15 pups (Winnubst et al., 2015), increasing confidence in the reproducibility of the data. 

      The authors also provide novel data documenting for the first time spatially organized, co-active synapses in pups as young as P8. Comparing the younger (P8-10) and older (P12-13) pups, provides insight into how clusters of co-active synapses might emerge during development. 

      Weaknesses: 

      This manuscript provides insufficient detail for assessing the rigor and reproducibility of the methods, particularly for age comparisons. The P8-10 vs P12-13 age comparisons are the primary novel finding in this manuscript, and it is, therefore, critical to avoid systematic age differences in the methods and analysis whenever possible. Specific concerns related to the age comparisons are listed below: 

      (1) Given that the same research group previously published P12-13 data (Winnubst et al., 2015), it is unclear whether both age groups in the current study were imaged/analyzed in parallel by the same researcher(s), or whether previous data was used for the P12-13 group. 

      While indeed the approach in the present study is similar to that of our previous study (Winnubst et al. 2015), the data set presented here is entirely new. The current study was made possible by a new microscope that allows combining resonant scanning with piezo-focusing to image large fractions of the dendritic arborization. In fact, we could now image almost 10 times larger dendritic segments including branch points than in our previous study. One author contributed to the experiments in both studies. Image analysis of all experiments was performed by the first author of the present study who was not involved in the Winnubst et al. work.

      (2) The authors mention that they used 2 different microscopes, and used a fairly wide range of imaging frame rates (5-15 Hz). It is unclear from the current manuscript whether the same imaging parameters were used across the two age groups. If data for the two experimental groups was collected separately, perhaps at different times, by a different person, or on a different microscope, there is a concern that some differences between the groups may not necessarily be due to age. 

      The reviewer mentions that the experimental settings are not identical across the experiments of this study. In the original manuscript we erroneously reported in the Methods section that 2 different setups were used for this study; however, all experiments were performed on the same microscope. We have corrected this in the new manuscript. We took timelapse recordings of small stacks of varying depth to cover as many dendrites as possible in each recording, therefore, we needed to adjust the rate of acquired stacks within a certain range as the reviewer points out. The data were acquired by two scientists during an overlapping period. And while the different ages were not recorded in a strictly randomized fashion, they were not acquired in sequence according to ages, but rather involved many attempts on animals of different ages from many different litters. For each litter a small percentage of animals would generate successful recordings, and the ages of these successes were random. Therefore, we believe that neither the collection of data nor the analysis (see point above) affected the differences we describe here for the two age groups.

      (3) It is unclear whether the image analysis was performed blind to age. Blinding to age during analysis is particularly important for this study, in which it was not possible to blind to age during imaging due to visible differences in size and developmental stage between younger and older pups. 

      The analysis was not set up to be performed blind to age. Not only is the age of the animal apparent at the stage (as the reviewer points out), but the number of spines and the activity levels also clearly show differences between neurons only a few days apart. However, all age-related findings reported in this study - except the increase in synapse density and activity - became apparent to us only after the full set of synaptic transmission events was determined and the analysis was performed on the entire data set, making it very unlikely that event detection was biased.

      (4) The relatively low N (where N is the number of dendrites or the number of mice) in this study is acceptable due to the challenging nature of the techniques used, but unintentional sampling bias is a concern. For example, if higher-order dendrites from the apical tuft were imaged at P12-13, while more segments of the apical trunk were imaged at P8-10, this could inadvertently create apparent age differences that were in fact due to dendrite location on the arbor or dendrite depth. 

      The reviewer points out that sampling bias with respect to synapse location along dendrites in the dataset could lead to falsely apparent age differences. In all experiments we imaged dendrites of layer 2/3 neurons that were relatively close to the cortical surface to optimize image quality. In addition, we confirmed that the mean distance of the imaged dendritic stretches from the cell body was similar between the dendrites of each age group (Young: 392 +/- 104 µm, Old: 323 +/- 118 µm; mean +/- STD). Therefore, we do not think that sampling bias affected these results.

      Additional general methodological concerns, which are not specifically related to the age comparisons, are listed below: 

      (5) The authors assert that clustered, co-active synapses emerge in the visual cortex before eye-opening, which is an important finding in that it suggests this phenomenon is driven by spontaneous activity rather than visual input. However, this finding hinges on the imaged cells being reliably located in the visual cortex, which is difficult to identify with certainty in animals that have not yet opened their eyes and therefore cannot undergo intrinsic signal imaging to demarcate the boundaries of the visual cortex. If the imaged cells were in, for example, nearby somatosensory cortex, then the observed spatial organization could be due to sensory input rather than spontaneous activity.

      The reviewer argues that if the neurons included in our analysis were located in non-visual sensory cortex, e.g. the somatosensory cortex, sensory experience might have shaped clustered inputs instead of spontaneous activity. We are, however, certain that the neurons were located inside the primary visual cortex. In previous experiments where we performed the same craniotomies, we mapped spontaneous activity across the sensory areas in the occipital neocortex and we know the exact location of V1 which is already very consistent during the second postnatal week. (See for example Supplemental Figure 4 in Leighton et al., 2021).  

      (6) It is unclear how the authors defined a synaptic transmission event in the GCaMP signal (e.g. whether there was a quantitative deltaF/F threshold). 

      In the revised manuscript, we describe the procedure of identifying synaptic calcium transients in more detail and added a new supplemental figure to clarify this aspect of the analysis. In short, we use an automated detection with a 2x standard deviation threshold and a subsequent manual control and selection step. Please, find all details in the Methods section and Figure S4 of the revised manuscript.
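
      For readers unfamiliar with this kind of thresholding, the automated step can be sketched roughly as follows. This is an illustration under stated assumptions and not the authors' pipeline: here the baseline and standard deviation are taken over the whole trace, an event is the first frame of each run above threshold, and the function name and toy trace are hypothetical. The manual control and selection step described above is not modeled.

```python
from statistics import mean, pstdev


def detect_candidate_onsets(dff, n_sd=2.0):
    """Return (onset_indices, threshold) for a deltaF/F trace.

    A frame counts as above threshold when it exceeds
    mean(trace) + n_sd * SD(trace); an onset is the first frame
    of each contiguous run above threshold.
    Assumption: whole-trace mean and SD stand in for the baseline.
    """
    threshold = mean(dff) + n_sd * pstdev(dff)
    onsets = []
    above = False
    for i, value in enumerate(dff):
        if value > threshold and not above:
            onsets.append(i)
        above = value > threshold
    return onsets, threshold


# Toy trace: flat baseline with two brief transients (made-up values)
trace = [0.0] * 20
trace[5] = trace[6] = 1.0
trace[15] = 1.0
onsets, threshold = detect_candidate_onsets(trace)
```

      Candidate events found this way would still need the manual inspection step before being accepted as synaptic transients.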

      (7) The authors' division of synapses into spine vs shaft is unconvincing due to the difficulty of identifying Z-projecting spines in images from 2-photon microscopy, where the Z resolution is insufficient to definitively identify Z-projecting spines, and the fact that spines in young animals may be thin and dim. The authors' examples of spine synapses (e.g. in Fig. 2A) are convincing, but some of the putative shaft synapses may in fact be on spines. 

      We agree with the reviewer that the differentiation between spine and shaft synapses can be difficult for those spines that are located above or below the dendritic shaft in the z-dimension because of the lower resolution of 2-photon microscopy in the z-dimension compared to the image plane (see also response to Reviewer 2, point 1). We have now added a new paragraph to the Methods section to describe in more detail how we identify spine and shaft synapses and provide more examples in a new supplementary figure (Fig S5). We believe that we can identify spine and shaft synapses reliably in most cases, but added a cautionary note to make the reader aware of potential misidentifications.

      Reviewer #1 (Recommendations For The Authors):

      I think the experiments performed were very technically challenging (probably one of the few labs that can do this in the field), and the findings provide in vivo evidence on how structured synaptic inputs are assembled during development that has never been reported. 

      I suggest improving the writing and presentation and really explaining how they conducted the experiments and how they defined shaft synapses. 

      Line 96: 12 dendritic areas from 11 mice at ages between postnatal day 8 to 13. 

      - Do the authors know how many neurons were imaged? It is unclear if the authors patch on all the imaged neurons and only imaged (or analyzed) the dendrites of those patched neurons. If yes, how sparse are the neurons labelled from IUE? From 1B, it looks like there are two cells adjacent to each other. Can the authors really distinguish whether the imaged dendrites are from the patched neuron? 

      The reviewer wonders whether we can tell apart dendrites of patched cells from those of neighboring neurons that were not patched. This is actually very straightforward: the experiment included a depolarization step (see Methods section) which leads to an immediate, but temporary, increase in fluorescence in all of the patched neurons’ dendrites, but none of the neighboring dendrites. We have added this information to the Methods section of the new manuscript and now provide an example (Fig S3). Furthermore, as these cells normally fire frequently, it would immediately become clear that an unpatched cell is being imaged if backpropagating action potentials are predominantly observed rather than synaptic signals. The visualization of these synaptic signals is only possible due to the blockade of Na+ channels with QX314 in the intracellular solution (see Methods).

      - In the methods section, it says 'dendrites were imaged in single plane or small stacks with plane...'. How do the authors do calcium imaging with small stacks of plane using Nikon MP scope? 

      Small stacks were acquired by using the piezo focusing device of our Nikon A1 microscope. Since we combined this fast focusing approach with resonant scanning, we were able to acquire z-stacks of 3-5 frames at a rate of up to 15 Hz (per stack).

      - I also assume this is not chronic imaging, and there are different mice for each postnatal day. If it's true, this is somewhat important for all the correlation analysis as there are only 2 mice for each postnatal day (other than day 12) and day 13 only has 1 animal. 

      Yes, indeed these are not chronic experiments and dendrites imaged on different days are from different neurons and different mice. We agree with the reviewer that if it had been possible to image the same neurons across these developmental stages, we would have detected even clearer correlations. Therefore, we see our results as conservative estimates of the developmental trajectory of the analyzed parameters.

      Line 104 - 109: I don't understand why the authors need to hold at -30mV to facilitate calcium influx through NMDA receptors? I assume this helps them to visualize as many synapses as possible? but wouldn't that also make the 'event frequency' not reflect the true value? 

      Indeed, depolarizing the imaged neurons to -30 mV was necessary to get sufficient calcium influx to map synaptic inputs. We don’t think that this affects the frequency of inputs, because the frequency of synaptic inputs is determined by the presynaptic firing rate and the release probability of the presynaptic terminal, neither of which is affected by the depolarization of the dendrite.

      Figure 2A - It says in the method section that ROIs are manually selected. However, it's not explained what the criteria are. For spine synapses, it's easy to define but for shaft synapses like in Fig 2B, why are there 2 synapses on the shaft? And in Fig 4a, 5a, Fig S1 P13, some of the dendrites are packed with ROIs. What's the distance between those shaft synapses? Can the imaging resolution really separate them? 

      The reviewer asks for a better description of how we identified individual ROIs and thus synapse locations and whether this is actually feasible. We have now added a more detailed description of how we select synaptic sites based on the occurrence of synaptic calcium transients. In addition, we have added a new supplemental Figure (S4) to give the reader an impression of the image quality and the ability to locate individual synapses reliably. We find that separating shaft synapses was possible for inter-synapse distances of ~4 µm or more. The mean shaft synapse distance in our data set is 21 µm.

      - Similar issue applies to Figure 4A that I'm not sure what's the resolution of each 'hot spot'. They all seem very close together. Maybe additional raw dendrite images with fluorescence changes like 1C or 2A could be helpful (or movies in the supplementary?) 

      As the reviewer suggests, we have now added additional supplemental figures to better illustrate how we identify synaptic transmission events as well as spine and shaft synapses.

      - Also for line 164, it says that 76% of high-activity synapses were located on spines. This could also maybe support that only the spine synapses are real synapses and many shaft synapses are actually not synapses and they were just categorized as shaft synapses from manual ROI? 

      We are actually quite sure that shaft synapses are real synapses based on our analysis, since they show repeated synaptic calcium transients that co-occur with barrages of synaptic inputs as measured by patch-clamp recordings. Indeed, one would expect to see a number of excitatory synapses on dendritic shafts of pyramidal neurons at these ages based on previous EM studies (Miller and Peters, 1981; Wildenberg et al., 2023).

      - While this might not impact the overall novelty of the paper, I would be curious to know if the authors can still observe the same findings if they only analyze spine synapses. 

      We repeated several analyses with a dataset that contained only spine synapses. For most analyses we observed the expected result: the effect sizes were similar compared to the entire data set, but the power was reduced. For example the effect of distance to closest high-activity neighbor and own activity (Fig 5E, F) was similar, but p-values were around 0.1 (Similar results for Figure 7B). In contrast, the co-activity with synapses within a domain was significantly higher than the co-activity with synapses in other domains also for the spine-synapse only data set. 

      Fig 6 - Does the domain co-activity also contribute to the synaptic current recorded (related to Fig 4). 

      Yes, the synaptic activity measured by calcium imaging contributes to the recorded EPSCs. However, the exact relationship between synaptic inputs measured by calcium imaging and those measured by patch-clamping is complicated by 3 factors: first, during barrages of synaptic inputs many synapses are active simultaneously, both in the mapped dendrites as well as in the un-observed parts of the dendritic arborization. Thus, barrages cannot be broken down into individual events. Second, since our acquisition frequency is ~10 Hz, we can identify the onset of individual synaptic calcium transients with 100-200 ms precision (1 or 2 frames). However, throughout any 100-200 ms period of recording several synapses are active across the entire dendritic arborization such that we cannot assign a given calcium transient to a specific EPSC within a 100-200 ms epoch. Third, due to the limited clamping capacity of in vivo patch recordings, we cannot be certain that individual transmission events in distal dendrites can be resolved in the patch recording as EPSCs.
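
      The timing precision quoted in the second point follows directly from the frame rate. A minimal sketch of that arithmetic, assuming an acquisition rate of exactly 10 Hz (the text gives ~10 Hz); the function name `onset_precision_ms` is hypothetical and used only for illustration:

```python
def onset_precision_ms(frame_rate_hz: float, n_frames: int) -> float:
    """Duration in milliseconds spanned by n_frames at the given frame rate."""
    return 1000.0 * n_frames / frame_rate_hz


# Assumed acquisition rate of 10 Hz (the text states ~10 Hz).
one_frame_ms = onset_precision_ms(10.0, 1)   # onset localized to 1 frame
two_frames_ms = onset_precision_ms(10.0, 2)  # onset localized to 2 frames
print(one_frame_ms, two_frames_ms)
```

      Any EPSC occurring within such a window is a candidate match for a given calcium transient, which is why individual transients cannot be assigned to individual EPSCs.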

      Reviewer #2 (Recommendations For The Authors):

      (1) I suggest the authors should provide the number of cells and mice recorded in the figure legends. 

      The number of dendrites and mice is the same across all analyses: 12 dendrites from 11 mice for all experiments, 6/6 for P8-10 and 6/5 for P12-13. All dendrites and synapses (and their ages) are shown in the supplemental figures S1 and S2. We now mention the number of imaged dendrites at the beginning of the Results section and when we split ages for the first time.

      (2) Instead of showing only cartoon illustrations of dendrites in Figures 3-6, I suggest showing the two photon images as well together with the cartoon. 

      The 2-photon images of all dendrites of the dataset are available in Figure S1. To allow the reader to compare the cartoon representations in the main figures and the 2-photon images of each neuron, we have now labeled each dendrite in the dataset (D1-D12, see figures S1 and S2). For every figure, where we show example neurons (cartoons or zoom ins) we now provide this identifier.

      Reviewer #3 (Recommendations For The Authors):

      To address the weaknesses outlined above, we recommend that the authors do the following: 

      • To address concerns about the rigor and reproducibility of the methods specifically related to age comparisons, please confirm the following: 

      - Both age groups were run in parallel by the same researcher(s). 

      The experiments partly overlapped in time, and both researchers performed experiments from the different age groups in parallel.

      - Both age groups were imaged on the same microscope, or animals from each age group were imaged on both microscopes. If it was necessary to use different microscopes for the different age groups for biological or practical reasons, please explain. 

      All experiments were run on the same microscope, a Nikon A1 2-photon microscope. In the original methods description we erroneously mentioned two microscopes (copy and paste error from a previous publication). We corrected that in the revised manuscript.

      - There was no difference in imaging frame rates or other imaging parameters between age groups. If it was necessary to use different parameters for different age groups for biological reasons, please explain. 

      We varied the frame rates somewhat to allow larger z-stacks for some experiments where dendrites traversed different depths; however the mean frame rates were similar between the experiments in P8-10 vs P12-13 dendrites, 8.5 vs 10 Hz, respectively.

      - Images were analyzed blind to age. 

      The analysis was not set up to be performed blind to age. The number of spines and the activity levels clearly show obvious differences between neurons only a few days apart. However, all findings reported in this study related to age - except the increase in synapse density and activity - became apparent to us only after the full set of synaptic transmission events was determined and the analysis was performed on the entire data set, making it unlikely that event detection was biased.

      - There was no difference in the location of analyzed dendrites (e.g. depth from the pia, branch order) between age groups. 

      In all experiments we imaged dendrites of layer 2/3 neurons that were relatively close to the cortical surface to optimize image quality. In addition, we determined the mean distance of the imaged dendritic stretches from the cell body and found that this distance was similar between the dendrites of each age group (Young: 392 +/- 104 µm, Old: 323 +/- 118 µm; mean +/- STD). Therefore, we do not think that sampling bias affected these results.

      • To address general methodological concerns, please provide additional description of the following points: 

      - Please clarify how the visual cortex was identified in P8-13 pups. If there was ambiguity about identifying the visual cortex in these pups, please discuss the implications of this ambiguity. 

      The reviewer asks how we identified V1 in these experiments. We are indeed certain that the neurons were located inside the primary visual cortex. We have ample experience with mapping V1 in these animals based on patterns of spontaneous activity as well as post-hoc stainings. V1 is already quite large at these ages (> 2 mm long and > 1 mm wide) and its extent is very consistent across animals. Thus, we would argue it is actually hard to miss.

      - Please clarify how synaptic transmission events were identified in the GCaMP signal. 

      We have now added a more detailed description of how we identify synaptic calcium transients. In addition, we have added a new supplemental Figure (S3) to give the reader an impression of the image quality and the ability to locate individual synapses reliably. 

      - It is acceptable to use the spine vs shaft analysis despite the inevitable difficulty resolving Z-projecting spines, but this caveat should be mentioned in the discussion of the spine vs shaft results. 

      We added a more detailed description of spine and shaft synapse identification, a new supplemental figure (S5) and we now mention the caveat related to the limited z-resolution of 2-photon microscopy in the revised manuscript.

      • Two additional minor details should be clarified in the text of the manuscript: 

      - Please specify the volume of DNA solution injected into each embryo. 

      The injected volume was 1 µl. We added this information in the Methods section of the revised manuscript.

      - In Fig S1, please specify whether the scale bar applies to all images. 

      The scale bar applies to all images. This information was added to the figure legend.

      References

      Leighton AH, Cheyne JE, Houwen GJ, Maldonado PP, De Winter F, Levelt CN, Lohmann C. 2021. Somatostatin interneurons restrict cell recruitment to retinally driven spontaneous activity in the developing cortex. Cell Rep 36:109316. doi:10.1016/j.celrep.2021.109316

      Miller M, Peters A. 1981. Maturation of rat visual cortex. II. A combined Golgi-electron microscope study of pyramidal neurons. J Comp Neurol 203:555–573.

      Siegel F, Heimel JA, Peters J, Lohmann C. 2012. Peripheral and central inputs shape network dynamics in the developing visual cortex in vivo. Curr Biol 22:253–258.

      Wildenberg G, Li H, Sampathkumar V, Sorokina A, Kasthuri N. 2023. Isochronic development of cortical synapses in primates and mice. Nat Commun 14:8018. doi:10.1038/s41467-023-43088-3

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This is an interesting and well-written paper reporting on a novel approach to studying cerebellar function based on the idea of selective recruitment using fMRI. The study is well-designed and executed. Analyses are sound and results are properly discussed. The paper makes a significant contribution to broadening our understanding of the role of the cerebellum in human behavior.

      We thank the reviewer for the positive assessment of our paper.

      (1) While the authors provide a compelling case for the link between BOLD and the cerebellar cortical input layer, there remains considerable unexplained variance. Perhaps the authors could elaborate a bit more on the assumption that BOLD signals mainly reflect the input side of the cerebellum (see for example King et al., elife. 2023 Apr 21;12:e81511).

      Our paper is based on the assumption that the cerebellar BOLD signal reflects solely the input to the cerebellum and does not reflect the changes in firing rates of Purkinje cells. This assumption relies on two lines of arguments: Studies that have directly looked at the mechanism of vasodilation in the cerebellum, and studies that try to infer the contributions of different neurophysiological mechanisms to overall cerebellar metabolism (Attwell and Iadecola, 2002).

      Vasodilatory considerations: The mechanism that causes vasodilation in the cerebellum, and hence BOLD signal increases, has been extensively studied: Electrical stimulation of mossy fibers (Gagliano et al., 2022; Mapelli et al., 2017), as well as parallel fibers (Akgören et al., 1994; Iadecola et al., 1996; Mathiesen et al., 1998; Yang and Iadecola, 1997), leads to robust increases in cerebellar blood flow. In contrast to the neocortex, the regulation of blood flow in the cerebellum depends nearly purely on the vasodilator Nitric Oxide (NO) (Akgören et al., 1994; Yang and Iadecola, 1997) with stellate cells playing a key role in the signaling cascade (Yang et al., 2000).

      Electrical (Mathiesen et al., 2000) and pharmacological (Yang and Iadecola, 1998) stimulation of climbing fibers also leads to robust increases in blood flow. Simultaneous parallel and climbing fiber stimulation seems to combine sub-additively to determine the blood flow changes (K. Caesar et al., 2003).

      Importantly, even dramatic changes in spiking rate of Purkinje cells do not lead to changes in vasodilation. For starters, parallel fiber stimulation leads to blood flow increases, even though the net effect on Purkinje cell firing is inhibitory (Mathiesen et al., 1998). More importantly, complete inhibition of the Purkinje cell using a GABA agonist does not change baseline cerebellar blood flow (Kirsten Caesar et al., 2003). Conversely, even a 200-300% increase in simple (and complex) spike firing rate through application of a GABA antagonist does not show any measurable consequences for blood flow, even though it clearly increases the metabolic rate of oxygen consumption in the tissue (Thomsen et al., 2009, 2004).

      In sum, this extensive set of studies clearly argues that the cerebellar blood flow response is mostly dictated by synaptic input, and that the firing rate of Purkinje cells does not influence vasodilation. Because the BOLD signal is caused by a supply of oxygen over and above the level of oxygen consumption, this would argue that increases in Purkinje cell firing would not lead to BOLD increases. What is less clear is the degree to which changes in BOLD signal during normal activity are determined by changes in mossy fiber or climbing fiber input. Disruption of either pathway leads to 60-70% reductions in the evoked blood flow response during whisker stimulation (Yang et al., 2000; Zhang et al., 2003), but it remains unclear to what degree this reflects the distribution of contributions in the healthy animal, as these powerful disruptions may have a number of side-effects.

      Metabolic considerations: To estimate the relative contributions of climbing fiber and mossy fiber input to the variations in BOLD signal under natural conditions, it is useful to consider the contributions of different cerebellar processes to the overall metabolism of the cerebellum. Assuming an average firing rate of 40 Hz for mossy fibers, ~3 Hz for granule cells, and 1 Hz for climbing fibers, Howarth et al. (2012, 2010) estimated that the transmission from mossy fibers to granule cells dominates the energy budget with 53%. The subsequent stage, encompassing the transfer of information from granule cells to Purkinje cells, accounts for 32% of energy expenditure. In contrast, integration within Purkinje cells and the spiking (simple and complex) of these cells represent only 15% of the total energy consumption.

      More important for the BOLD signal, however, are the activity-induced variations in metabolic consumption: Purkinje cells fire relatively constantly at a very high frequency (~50 Hz) both during awake periods and during sleep (Shin et al., 2007). When providing a signal to the neocortex, firing rate decreases, actually lowering the metabolic demand. Climbing fibers normally fire at ~0.5 Hz and even during activity rarely fire much above 2 Hz (Streng et al., 2017). In contrast, granule cells show low firing rates during rest (typically <1 Hz) and can spike well above 100 Hz during activity. Combined with the sheer number of granule cells, these considerations suggest that the vast majority of the variation in metabolic demand is due to mossy fiber input and granule cell activity.

      Overall, we therefore think it is likely that the main determinant of the cerebellar cortical BOLD signal is mossy fiber input and the transmission of information from mossy fibers to granule cells to Purkinje cells. We admit that the degree to which climbing fiber input contributes to BOLD signal changes is much less clear. We can be quite certain, however, that the firing rate of Purkinje cells does not contribute to the cerebellar BOLD signal, as even dramatic changes in the firing rate do not cause any changes in vasodilation. We have clarified our line of reasoning in the paper, and hope this more extensive response here will give the reader a better overview of the pertinent literature.

      (2) The current approach does not appear to take the non-linear relationships between BOLD and neural activity into account.

      Thank you for raising this concern. We did not stress this point in the paper, but one big advantage of our selective recruitment approach is that it is, to some degree, robust against non-linearities in the relationship between neural activity and BOLD signal. This is the case as long as the shape of the non-linearity is similar in the cerebellum and the neocortex. The results of our motor task (Figure 3) provide a clear example of this: The BOLD signal in both the neocortex and the cerebellum increases non-linearly as a function of force; the increase from 2.5N to 6N (a 3.5N increase) is larger than the increase from 6N to 10N (a 4N increase). A similar non-linearity can be observed for tapping speed (6, 10 to 18 taps / s). However, within each condition, the relationship between cortical and cerebellar activity is nearly perfectly linear, reflecting the fact that the shape of the non-linearity for the cerebellum and cortex is very similar.

      Most importantly, even if the non-linearity across the two structures is different, any non-linear relationship between neural activity and BOLD signal (of vasodilatory nature) should apply to different conditions (here force and speed increases) similarly. Therefore, if two conditions show overlapping activity levels (as observed for force and speed across medium and high levels, Figure 3), an offset between conditions cannot be caused by a non-linearity in the relationship of cortical and cerebellar activity. Because all conditions are subject to the same non-linearity, all points should lie on a single (likely monotonically increasing) non-linear function. Both for the motor and working memory task, the pattern of results clearly violates this assumption.
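
      The logic of this argument can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis used in the paper: the saturating function `bold`, the transmission gain of 0.8, the half-saturation constants, and the 1.5x gating factor are all hypothetical values chosen for illustration. It shows that when two conditions share the same cortico-cerebellar transmission, their points fall on one monotonic curve even if the neural-to-BOLD non-linearity differs between cortex and cerebellum, whereas condition-specific gating produces an offset at matched cortical activity.

```python
import numpy as np


def bold(neural, half_sat):
    """Hypothetical saturating (monotonic) mapping from neural activity to BOLD."""
    return neural / (neural + half_sat)


def offset_at_matched_cortex(cortex_ref, cereb_ref, cortex_test, cereb_test):
    """Mean cerebellar BOLD difference (test minus reference) at the cortical
    BOLD levels of the test condition, using linear interpolation of the
    reference condition's cortex-to-cerebellum curve."""
    order = np.argsort(cortex_ref)
    expected = np.interp(cortex_test, cortex_ref[order], cereb_ref[order])
    return float(np.mean(cereb_test - expected))


GAIN = 0.8  # assumed fixed cortex-to-cerebellum transmission gain

# Reference condition: densely sampled neural activity levels.
neural_ref = np.linspace(1.0, 10.0, 50)
cortex_ref = bold(neural_ref, half_sat=5.0)
cereb_ref = bold(GAIN * neural_ref, half_sat=3.0)  # different non-linearity

# Test condition with overlapping activity levels.
neural_test = np.array([3.0, 5.0, 7.0])
cortex_test = bold(neural_test, half_sat=5.0)

# Null case: same transmission, so test points lie on the reference curve.
cereb_null = bold(GAIN * neural_test, half_sat=3.0)
null_offset = offset_at_matched_cortex(cortex_ref, cereb_ref, cortex_test, cereb_null)

# Gated case: cerebellar input is up-regulated 1.5x in the test condition.
cereb_gated = bold(1.5 * GAIN * neural_test, half_sat=3.0)
gated_offset = offset_at_matched_cortex(cortex_ref, cereb_ref, cortex_test, cereb_gated)

print(f"offset without gating: {null_offset:.5f}")
print(f"offset with gating:    {gated_offset:.5f}")
```

      In this sketch the offset without gating is close to zero (limited only by interpolation error), while the gated condition shows a clear positive offset, which is the signature that a shared non-linearity alone cannot produce.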

      (3) The authors may want to address a bit more the issue of closed loops as well as the underlying neuroanatomy including the deep cerebellar nuclei and pontine nuclei in the context of their current cerebello-cortical correlational approach. But also the contribution of other brain areas such as the basal ganglia and hippocampus. 

      Cortical-cerebellar communication is of course bi-directional. As discussed in King et al. (2023), however, we are restricting our model to the connections from the neocortex to the cerebellum for the following reasons: First, cerebellar BOLD activity likely reflects mostly neocortical input (see our answer to pt. 1), whereas neocortical activity is determined by a much wider array of projections, including striato-thalamo-cortical and cortico-cortical connections. Secondly, the output of the cerebellum cannot be predicted from the BOLD signal of the cerebellar cortex, as it is unlikely that the firing rate of Purkinje cells contributes to the cerebellar BOLD signal (see pt. 1). For these reasons we believe that the relationship between neocortical and cerebellar activity patterns is mostly dictated by the connectivity from cortex to cerebellum, and is therefore best modelled as such. This is now more clearly discussed in a new paragraph (lines 318-323) of the revised manuscript.

      We are also ignoring other inputs to the cerebellum, including the spinal cord, the basal ganglia (Bhuvanasundaram et al., 2022; Bostan and Strick, 2018), hippocampus (Froula et al., 2023; Watson et al., 2019), and amygdala (Farley et al., 2016; Jung et al., 2022; Terburg et al., 2024). In humans, however, the neocortex remains the primary source of input to pontine nuclei. Consequently, it stands as the main structure shaping activity within the cerebellar cortex. While it is an interesting question to what degree the consideration of subcortical structures can improve the prediction of cerebellar activity patterns, we believe that considering the neocortex provides a good first approximation.

      Reviewer #1 (Recommendations):

      (4)  A few sentences to clarify the used models as was done in the King et al. (2024) paper may improve readability.

      We have now added the sentences in the introduction (line 25ff):

      To approach this problem, we have recently developed and tested a range of cortical-cerebellar connectivity models (King et al., 2023), designed to capture fixed, or task-invariant, transmission between neocortex and cerebellum. For each cerebellar voxel, we estimated a regularized multiple regression model to predict its activity level across a range of task conditions (King et al., 2019) from the activity pattern observed in the neocortex for the same conditions. The models were then evaluated in their ability to predict cerebellar activity in novel tasks, again based only on the corresponding neocortical activity pattern. Two key results emerged from this work. First, while rs-FC studies (Buckner et al., 2011; Ji et al., 2019; Marek et al., 2018) have assumed a 1:1 mapping between neocortical and cerebellar networks, models which allowed for convergent input from multiple neocortical regions to a single cerebellar region performed better in predicting cerebellar activity patterns for novel tasks. Second, when given a cortical activation pattern, the best performing model could predict about 50% of the reliable variance in the cerebellar cortex across tasks (King et al., 2023).
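
      The modelling idea described in this passage can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming plain ridge (L2-regularized) regression solved in closed form; the function names, the regularization strength `alpha`, and all array sizes are hypothetical and do not reproduce the models or data of King et al. (2023).

```python
import numpy as np


def fit_ridge_connectivity(X, Y, alpha=1.0):
    """Fit one ridge regression per cerebellar voxel.

    X: (n_conditions, n_cortical_regions) neocortical activity patterns.
    Y: (n_conditions, n_cerebellar_voxels) cerebellar activity patterns.
    Returns W with shape (n_cortical_regions, n_cerebellar_voxels).
    """
    n_regions = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_regions), X.T @ Y)


def predict_cerebellum(X_new, W):
    """Predict cerebellar activity from neocortical activity alone."""
    return X_new @ W


def prediction_correlation(Y_true, Y_pred):
    """Pearson correlation between observed and predicted activity,
    pooled over conditions and voxels."""
    return float(np.corrcoef(Y_true.ravel(), Y_pred.ravel())[0, 1])


# Synthetic example: a fixed (task-invariant) linear mapping plus noise.
rng = np.random.default_rng(0)
n_train, n_test, n_regions, n_voxels = 40, 10, 20, 50
W_true = rng.normal(size=(n_regions, n_voxels))

X_train = rng.normal(size=(n_train, n_regions))  # training task conditions
X_test = rng.normal(size=(n_test, n_regions))    # novel task conditions
Y_train = X_train @ W_true + 0.5 * rng.normal(size=(n_train, n_voxels))
Y_test = X_test @ W_true + 0.5 * rng.normal(size=(n_test, n_voxels))

W_hat = fit_ridge_connectivity(X_train, Y_train, alpha=1.0)
r = prediction_correlation(Y_test, predict_cerebellum(X_test, W_hat))
print(f"prediction correlation on novel conditions: {r:.3f}")
```

      Because each column of the weight matrix can draw on several cortical regions, a model of this form naturally allows convergent input from multiple neocortical regions onto a single cerebellar voxel, in contrast to a strict 1:1 mapping.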

      (5) To what extent does this paper demonstrate the limitations of BOLD in neuroscientific research? 

      The primary objective of this study was to shed light on the problems of interpreting BOLD activation within the cerebellum. The problem that the BOLD signal mostly reflects input to a region is not unique to the cerebellum, but also applies (albeit likely to a lesser degree) to other brain structures. However, the solution we propose here critically hinges on three features of the cerebellar circuitry: a) the mossy fiber input for the cerebellar hemispheres mostly arises from the neocortex, b) the BOLD signal is likely dominated by this mossy fiber input (see pt. 1), and c) there is very little excitatory recurrent activity in the cerebellum, so output activity in the cerebellum does not cause direct activity in other parts of the cerebellum.

      These features motivate us to use a directed cortex->cerebellum connectivity model, which does not allow for any direct connectivity within the cerebellum. While the same approach can also be applied to other brain structures, it is less clear that the approach would yield valid results there. For example, due to the local excitatory recurrent connectivity within neocortical columns, the activity there will also relate to local processing.

      (6) What if the authors reversed their line of reasoning as in that cerebellum activity is matched to map changes in cerebral cortical activity? Perhaps this could provide further evidence for the assumed directional specificity of the task-dependent gating of neocortical inputs. 

      Given (a) that the cerebellar BOLD signal tells us very little about cerebellar output signals, (b) that there are many other input signals to the neocortex that are more powerful than cerebellar inputs, and (c) that there are strong cortical-cortical connections, we believe that this model would be hard to interpret (see also our answer to pt. 3).

      Therefore, while the inversion of the linear task-invariant mapping between cortical and cerebellar activity is a potentially interesting exercise, it is unclear to us at this point what strong predictions we would be able to test with this approach.

      (7) The statement that cerebellar fMRI activity may simply reflect the transmission of neocortical activity through fixed connections can be better explained. Also in the context of using the epiphenomenon (on page 11) in the paper. To what extent is the issue of epiphenomenon not a general problem of fMRI research?

      We have rephrased the introduction of this idea (line 17):

      This means that increases in the cerebellar BOLD signal could simply reflect the automatic transmission of neocortical activity through fixed anatomical connections. As such, whenever a task activates a neocortical region, the corresponding cerebellar region would also be activated, regardless of whether the cerebellum is directly involved in the task or not.

      Epiphenomenal activity: This is indeed a general problem in fMRI research (and indeed research that uses neurophysiological recordings, rather than manipulations of activity). Indeed, we have discussed similar issues in the context of motor activity in ipsilateral motor cortex (Diedrichsen et al., 2009). However, given that we only offer a possible approach to address this issue for the cerebellum (see pt. 5), we thought it best to keep the scope of the discussion focused on this structure.

      Reviewer #2 (Public Review):

      Summary:

      Shahshahani and colleagues used a combination of statistical modelling and whole-brain fMRI data in an attempt to separate the contributions of cortical and cerebellar regions in different cognitive contexts.

      Strengths:

      The manuscript uses a sophisticated integration of statistical methods, cognitive neuroscience, and systems neurobiology.

      The authors use multiple statistical approaches to ensure robustness in their conclusions.

      The consideration of the cerebellum as not a purely 'motor' structure is excellent and important.

      We thank the reviewer for their positive evaluation.

      Weaknesses:

      (1) Two of the foundation assumptions of the model - that cerebellar BOLD signals reflect granule cells > purkinje neurons and that corticocerebellar connections are relatively invariant - are still open topics of investigation. It might be helpful for the reader if these ideas could be presented in a more nuanced light.

      Please see the response to comment 1 of Reviewer 1 for a more extensive and detailed justification of this assumption. We have now also clarified our rationale for this assumption better in the paper on lines 10-14. Finally, we now also raise explicitly the possibility that some of the violations of the task-invariant model could be caused by selective increases of climbing fiber activity in some tasks (line 340).

      (2) The assumption that cortical BOLD responses in cognitive tasks should be matched irrespective of cerebellar involvement does not cohere with the idea of 'forcing functions' introduced by Houk and Wise. 

      We are assuming that you refer to the idea that cerebellar output is an important determinant of the dynamics (and likely also of the magnitude) of neocortical activity. We agree most certainly here. However, we also believe that in the context of our paper, it is justified to restrict the model to the connectivity between the neocortex and the cerebellum only (see reviewer 1, comment 3).

      Furthermore, if increased cerebellar output indeed occurs during the conditions for which we identified unusually high cerebellar activity, it should increase neocortical activity, and bring the relationship of the cerebellar and cortical activity again closer to the predictions of the linear model. Therefore, the identification of functions for which cerebellar regions show selective recruitment is rather conservative.

      Reviewer #2 (Recommendations):

      (3) One of the assumptions stated in the abstract -- that the inputs to the cerebellum may simply be a somewhat passive relay of the outputs of the cerebral cortex -- has been challenged recently by work from Litwin-Kumar (Muscinelli et al., 2023 Nature Neuroscience), which argues for complex computational relationships between cortical pyramidal neurons, pontine nuclei and granule cells, which in turn would have a non-linear impact on the relationship between cortical and cerebellar BOLD. The modelling is based on empirical recordings from Wagner (2019, Cell) which show that the synaptic connections between the cortex and granule cells change as a function of learning, further raising concerns about the assumption that the signals inherent within these two systems should be identical. Whether these micro-scale features are indicative of the macroscopic patterns observed in BOLD is an interesting question for future research, but I worry that the assumption of direct similarity is perhaps not reflective of the current literature. The authors do speak to these cells in their discussion, but I believe that they could also help to refine the authors' hypotheses in the manuscript writ large.

      We absolutely agree with your point. However, we want to make extremely clear here that our hypothesis (that the inputs to the cerebellum are a linear task-invariant function of the outputs of the cerebral cortex) is the Null-hypothesis that we are testing in our paper. In fact, our results show the first empirical evidence that task-dependent gating may indeed occur. In this sense, our paper is consistent with the theoretical suggestion of (Muscinelli et al., 2023).

      You may ask whether a linear task-invariant model of cortical-cerebellar connectivity is not a strawman, given that it is most likely incorrect. However, as we stress in the discussion (line 298-), a good Null-model is a useful model, even if it is (as all models) ultimately incorrect. Without it, we would not be able to determine which cerebellar activity outstrips the linear prediction. The fact that this Null-model itself can predict nearly 50% of the variance in cerebellar activity patterns across tasks at a group level means that it is actually a very powerful model, and hence is a much more stringent criterion for evidence for functional involvement than just the presence of activity.

      (4) Further to this point, I didn't follow the authors' logic that the majority of the BOLD response in the cerebellum is reflective of granule cells rather than Purkinje cells. I read through each of the papers that were cited in defense of the comment: "The cerebellar BOLD signal is dominated by mossy fiber input with very little contribution from the output of the cerebellar cortex, the activity of Purkinje cells" and found that none of these studies made this same direct conclusion. As such, I suggest that the authors soften this statement, or provide a different set of references that directly confirm this hypothesis. 

      Please see our response to Reviewer 1, comment 1. We hope the answer provides a more comprehensive overview of the literature, which DOES show that spiking behavior of Purkinje cells does not influence vasodilation (as opposed to mossy fiber input). We have now clarified our rationale and the exact cited literature on lines 9-14 of the paper.

      (5) Regarding the statement: "As such, whenever a task activates a neocortical region, we might observe activity in the corresponding cerebellar regions regardless of whether the cerebellum is directly involved in the task or not." -- what if this is a feature, rather than a bug? That is, the organisation of the nervous system has been shaped over phylogeny such that every action, via efference copies of motor outputs, is filtered through the complex architecture of the cerebellum in order to provide a feed-forward signal to the thalamus/cortex (and other connected structures). Houk and Wise made compelling arguments in their 1995 Cerebral Cortex paper arguing that these outputs (among other systems) could act as 'forcing functions' on the kinds of dynamics that arise in the cerebral cortex. I am inclined to agree with their hypothesis, where the implication is that there are no tasks that don't (in some way) depend on cerebellar activity, albeit to a lesser or greater extent, depending on the contexts/requirements of the task. I realise that this is a somewhat philosophical point, but I do think it is important to be clear about the assumptions that form the basis of the reasoning in the paper. 

      This is an interesting point. Our way of thinking about cerebellar function does indeed correspond quite well to the idea of forcing functions: the idea that cerebellar output can “steer” cortical dynamics in a particular way. However, based on patient and lesion data, it is also clear that some cortical functions rely much more critically on cerebellar input than others. We hypothesize here that cerebellar activity is higher (as compared to the neocortical activity) when the functions require cerebellar computation.

      We also agree with the notion that cerebellar contribution is likely not an all-or-none issue, but rather a matter of gradation (line 324ff).

      (6) Regarding the logic of expecting the cortical patterns for speed vs. force to be matched -- surely if the cerebellum was involved more in speed than force production, the feedback from the cerebellum to the cortex (via thalamus) could also contribute to the observed differences? How could the authors control for this possibility? 

      Our model currently does not attempt to quantify the contributions of cerebellar output to cortical activity. However, given that cerebellar output is not visible in the BOLD signal of the cerebellum (see Reviewer 1, comment 1), we believe that this is a rational approach. As argued in our response to your comment 2, increased cerebellar output in the speed compared to the force condition should bring the activity relationship closer to the linear model prediction. The fact that we find increased cerebellar (as compared to neocortical) activity in the speed conditions suggests that there is indeed task-dependent gating of cortical projections to the cerebellum.

      Akgören N, Fabricius M, Lauritzen M. 1994. Importance of nitric oxide for local increases of blood flow in rat cerebellar cortex during electrical stimulation. Proc Natl Acad Sci U S A 91:5903–5907.

      Attwell D, Iadecola C. 2002. The neural basis of functional brain imaging signals. Trends Neurosci 25:621–625.

      Bhuvanasundaram R, Krzyspiak J, Khodakhah K. 2022. Subthalamic Nucleus Modulation of the Pontine Nuclei and Its Targeting of the Cerebellar Cortex. J Neurosci 42:5538–5551.

      Bostan AC, Strick PL. 2018. The basal ganglia and the cerebellum: nodes in an integrated network. Nat Rev Neurosci 19:338–350.

      Buckner RL, Krienen FM, Castellanos A, Diaz JC, Yeo BTT. 2011. The organization of the human cerebellum estimated by intrinsic functional connectivity. J Neurophysiol 106:2322–2345.

      Caesar K, Gold L, Lauritzen M. 2003. Context sensitivity of activity-dependent increases in cerebral blood flow. Proc Natl Acad Sci U S A 100:4239–4244.

      Caesar K, Thomsen K, Lauritzen M. 2003. Dissociation of spikes, synaptic activity, and activity-dependent increments in rat cerebellar blood flow by tonic synaptic inhibition. Proc Natl Acad Sci U S A 100:16000–16005.

      Farley SJ, Radley JJ, Freeman JH. 2016. Amygdala Modulation of Cerebellar Learning. J Neurosci 36:2190–2201.

      Froula JM, Hastings SD, Krook-Magnuson E. 2023. The little brain and the seahorse: Cerebellar-hippocampal interactions. Front Syst Neurosci 17:1158492.

      Gagliano G, Monteverdi A, Casali S, Laforenza U, Gandini Wheeler-Kingshott CAM, D’Angelo E, Mapelli L. 2022. Non-linear frequency dependence of neurovascular coupling in the cerebellar cortex implies vasodilation-vasoconstriction competition. Cells 11:1047.

      Howarth C, Gleeson P, Attwell D. 2012. Updated energy budgets for neural computation in the neocortex and cerebellum. J Cereb Blood Flow Metab 32:1222–1232.

      Howarth C, Peppiatt-Wildman CM, Attwell D. 2010. The energy use associated with neural computation in the cerebellum. J Cereb Blood Flow Metab 30:403–414.

      Iadecola C, Li J, Xu S, Yang G. 1996. Neural mechanisms of blood flow regulation during synaptic activity in cerebellar cortex. J Neurophysiol 75:940–950.

      Ji JL, Spronk M, Kulkarni K, Repovš G, Anticevic A, Cole MW. 2019. Mapping the human brain’s cortical-subcortical functional network organization. Neuroimage 185:35–57.

      Jung SJ, Vlasov K, D’Ambra AF, Parigi A, Baya M, Frez EP, Villalobos J, Fernandez-Frentzel M, Anguiano M, Ideguchi Y, Antzoulatos EG, Fioravante D. 2022. Novel Cerebello-Amygdala Connections Provide Missing Link Between Cerebellum and Limbic System. Front Syst Neurosci 16:879634.

      King M, Hernandez-Castillo CR, Poldrack RA, Ivry RB, Diedrichsen J. 2019. Functional boundaries in the human cerebellum revealed by a multi-domain task battery. Nat Neurosci 22:1371–1378.

      King M, Shahshahani L, Ivry RB, Diedrichsen J. 2023. A task-general connectivity model reveals variation in convergence of cortical inputs to functional regions of the cerebellum. Elife 12:e81511.

      Mapelli L, Gagliano G, Soda T, Laforenza U, Moccia F, D’Angelo EU. 2017. Granular layer neurons control cerebellar neurovascular coupling through an NMDA receptor/NO-dependent system. J Neurosci 37:1340–1351.

      Marek S, Siegel JS, Gordon EM, Raut RV, Gratton C, Newbold DJ, Ortega M, Laumann TO, Adeyemo B, Miller DB, Zheng A, Lopez KC, Berg JJ, Coalson RS, Nguyen AL, Dierker D, Van AN, Hoyt CR, McDermott KB, Norris SA, Shimony JS, Snyder AZ, Nelson SM, Barch DM, Schlaggar BL, Raichle ME, Petersen SE, Greene DJ, Dosenbach NUF. 2018. Spatial and Temporal Organization of the Individual Human Cerebellum. Neuron 100:977-993.e7.

      Mathiesen C, Caesar K, Akgören N, Lauritzen M. 1998. Modification of activity-dependent increases of cerebral blood flow by excitatory synaptic activity and spikes in rat cerebellar cortex. J Physiol 512 (Pt 2):555–566.

      Mathiesen C, Caesar K, Lauritzen M. 2000. Temporal coupling between neuronal activity and blood flow in rat cerebellar cortex as indicated by field potential analysis. J Physiol 523:235–246.

      Muscinelli SP, Wagner MJ, Litwin-Kumar A. 2023. Optimal routing to cerebellum-like structures. Nat Neurosci 26:1630–1641.

      Shin S-L, Hoebeek FE, Schonewille M, De Zeeuw CI, Aertsen A, De Schutter E. 2007. Regular patterns in cerebellar Purkinje cell simple spike trains. PLoS One 2:e485.

      Streng ML, Popa LS, Ebner TJ. 2017. Climbing Fibers Control Purkinje Cell Representations of Behavior. J Neurosci 37:1997.

      Terburg D, van Honk J, Schutter DJLG. 2024. Doubling down on dual systems: A cerebellum–amygdala route towards action- and outcome-based social and affective behavior. Cortex 173:175–186.

      Thomsen K, Offenhauser N, Lauritzen M. 2004. Principal neuron spiking: neither necessary nor sufficient for cerebral blood flow in rat cerebellum. J Physiol 560:181–189.

      Thomsen K, Piilgaard H, Gjedde A, Bonvento G, Lauritzen M. 2009. Principal cell spiking, postsynaptic excitation, and oxygen consumption in the rat cerebellar cortex. J Neurophysiol 102:1503–1512.

      Watson TC, Obiang P, Torres-Herraez A, Watilliaux A, Coulon P, Rochefort C, Rondi-Reig L. 2019. Anatomical and physiological foundations of cerebello-hippocampal interaction. Elife 8:e41896.

      Yang G, Huard JM, Beitz AJ, Ross ME, Iadecola C. 2000. Stellate neurons mediate functional hyperemia in the cerebellar molecular layer. J Neurosci 20:6968–6973.

      Yang G, Iadecola C. 1998. Activation of cerebellar climbing fibers increases cerebellar blood flow: role of glutamate receptors, nitric oxide, and cGMP. Stroke 29:499–507; discussion 507-8.

      Yang G, Iadecola C. 1997. Obligatory role of NO in glutamate-dependent hyperemia evoked from cerebellar parallel fibers. Am J Physiol 272:R1155-61.

      Zhang Y, Forster C, Milner TA, Iadecola C. 2003. Attenuation of activity-induced increases in cerebellar blood flow by lesion of the inferior olive. Am J Physiol Heart Circ Physiol 285:H1177-82.

    1. Author response:

      eLife assessment

      This valuable study reveals how a rhizobial effector protein cleaves and inhibits a key plant receptor for symbiosis signaling, while the host plant counters by phosphorylating the effector. The molecular evidence for the protein-protein interaction and modification is solid, though biological evidence directly linking effector cleavage to rhizobial infection is incomplete. With additional functional data, this work could have implications for understanding intricate plant-microbe dynamics during mutualistic interactions.

      Thank you for this helpful comment. In the revised manuscript version, we will be more prudent with directly linking cleavage of Nod factor receptors by NopT and rhizobial infection.

      We plan to modify the Title, the One-Sentence Summary, Abstract, and Discussion regarding this point.

      Public Reviews:

      Reviewer #1 (Public Review):

      Bacterial effectors that interfere with the inner molecular workings of eukaryotic host cells are of great biological significance across disciplines. On the one hand they help us to understand the molecular strategies that bacteria use to manipulate host cells. On the other hand they can be used as research tools to reveal molecular details of the intricate workings of the host machinery that is relevant for the interaction/defence/symbiosis with bacteria. The authors investigate the function and biological impact of a rhizobial effector that interacts with and modifies, and curiously is modified by, legume receptors essential for symbiosis. The molecular analysis revealed a bacterial effector that cleaves a plant symbiosis signaling receptor to inhibit signaling and the host counterplay by phosphorylation via a receptor kinase. These findings have potential implications beyond bacterial interactions with plants.

      Thank you for highlighting the broad significance of rhizobial effectors in understanding legume-rhizobium interactions. We fully agree with your assessment and will emphasize these points in the revised Introduction and Discussion sections of our manuscript. Specifically, we will expand our Discussion regarding the potential impact of the NopT interaction with symbiotic receptor kinases on plant immune signaling and regarding the general significance of our work.

      Bao and colleagues investigated how rhizobial effector proteins can regulate the legume root nodule symbiosis. A rhizobial effector is described to directly modify symbiosis-related signaling proteins, altering the outcome of the symbiosis. Overall, the paper presents findings that will have a wide appeal beyond its primary field.

      Out of 15 identified effectors from Sinorhizobium fredii, they focus on the effector NopT, which exhibits proteolytic activity and may therefore cleave specific target proteins of the host plant. They focus on two Nod factor receptors of the legume Lotus japonicus, NFR1 and NFR5, both of which were previously found to be essential for the perception of rhizobial Nod factor and the induction of symbiotic responses such as bacterial infection thread formation in root hairs and root nodule development (Madsen et al., 2003, Nature; Tirichine et al., 2003, Nature). The authors present evidence for an interaction of NopT with NFR1 and NFR5. The paper aims to characterize the biochemical and functional consequences of these interactions and the phenotype that arises when the effector is mutated.

      Thank you for your positive feedback on our manuscript. In the revised Introduction and Discussion sections, we plan to better emphasize the interdisciplinary significance of our work. We will show how the knowledge gained from our study can contribute to a better understanding of microbial interactions with eukaryotic hosts in general, which may have a stimulating effect on future research in various research areas such as pathogenesis and immunity.

      To ensure that readers can easily follow the rationale behind our experiments, we will improve the Results section and provide a more detailed explanation of how NopT was selected among the 15 examined effectors. Additionally, we will provide more background information on NopT and the roles of NFR1 and NFR5 in symbiotic signaling in the Introduction section. As suggested, we will include the references Madsen et al. (2003) and Tirichine et al. (2003), as well as additional references on rhizobial NopT proteins, in our revised manuscript version.

      Evidence is presented that in vitro NopT can cleave NFR5 at its juxtamembrane region. NFR5 also appears to be cleaved in vivo, and NFR1 appears to inhibit the proteolytic activity of NopT by phosphorylating NopT. When NFR5 and NFR1 are ectopically over-expressed in leaves of the non-legume Nicotiana benthamiana, they induce cell death (Madsen et al., 2011, Plant Journal). Bao et al. found that this cell death response is inhibited by the coexpression of nopT. Mutation of nopT alters the outcome of rhizobial infection in L. japonicus. These conclusions are well supported by the data.

      We appreciate that you recognize the value of our data.

      The authors present evidence supporting the interaction of NopT with NFR1 and NFR5. In particular, there is solid support for cleavage of NFR5 by NopT (Figure 3) and the identification of NopT phosphorylation sites that inhibit its proteolytic activity (Figure 4C). Cleavage of NFR5 upon expression in N. benthamiana (Figure 3A) requires appropriate controls (inactive mutant versions), which have been provided, since Agrobacterium, as a bacterium closely related to rhizobia, might increase defense-related proteolytic activity in the plant host cells.

      Thank you for recognizing the use of an inactive NopT variant in Figure 3A. In fact, increased activity of plant proteases induced by Agrobacterium is an important point that should not be neglected. We plan to mention this aspect in our revised Discussion.

      In the context of your comments, we are planning to make the following improvements to the manuscript:

      (1) We will add a more detailed description of the experimental conditions under which the cleavage of NFR5 by NopT was observed in vitro and in vivo.

      (2) We plan to provide more comprehensive data on the phosphorylation of NopT by NFR1, including phosphorylation assays and mass spectrometry results. These additional data will support the proposed mechanism by which NFR1 inhibits the proteolytic activity of NopT.

      (3) We will expand the Discussion on the cell death response induced by ectopic expression of NFR1 and NFR5 in Nicotiana benthamiana. We will include more details from Madsen et al. (2011) to contextualize our findings with published literature.

      We believe these additions and clarifications will enhance the clarity and impact of our findings.

      Key results from N. benthamiana appear consistent with data from recombinant protein expression in bacteria. For the analysis in the host legume L. japonicus transgenic hairy roots were included. To demonstrate that the cleavage of NFR5 occurs during the interaction in plant cells the authors build largely on western blots. Regardless of whether Nicotiana leaf cells or Lotus root cells are used as the test platform, the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with nopT, and most of the NFR5 persists in its full-length form (Figures 3A-D). It is not quite clear how the authors explain the loss of NFR5 function (loss of cell death, impact on symbiosis), as a vast excess of the tested target remains intact. It is also not clear why a large proportion of NFR5 is unaffected by the proteolytic activity of NopT. This is particularly interesting in Nicotiana in the absence of Nod factor that could trigger NFR1 kinase activity.

      Thank you for your comments regarding the cleavage of NFR5 and its functional implications. In the revised version, we will change our manuscript taking into account the following considerations:

      (1) We acknowledge that the Western blots indicate that only a small proportion of NFR5 is cleaved when co-expressed with NopT. It is worth noting in this context that the proteins were expressed at high levels, which likely do not reflect the natural situation in L. japonicus. The low proportion of cleaved NFR5 in our Western blots with transformed N. benthamiana or L. japonicus cells may thus simply reflect an experimental effect due to high NFR5 protein synthesis. We suggest that the presence of high amounts of intact NFR5 does not have a significant functional impact on plant responses (cell death in N. benthamiana, rhizobial infection of L. japonicus), whereas NFR5 cleavage (or formation of NFR5 cleavage products) may be crucial for the observed phenotypic changes. The fraction of cleaved NFR5, although small, may be sufficient to disrupt crucial signaling pathways. We will address possible differences between experimental and natural protein levels in our revised Discussion.

      (2) In our work, we studied three biochemical aspects of NopT: (i) physical binding of NopT to NFR1 and NFR5, (ii) proteolytic cleavage of NFR5 by NopT, and (iii) phosphorylation of NopT by NFR1. These three biochemical properties appear to influence each other. Phosphorylation of NopT by NFR1 appears to reduce its proteolytic activity, thereby counteracting NFR5 degradation by NopT (NFR5 homeostasis). Moreover, as NopT is a phosphorylation substrate for NFR1, NopT probably interferes with kinase-mediated downstream responses of NFR1. Thus, NFR5 cleavage activity appears to be only one feature of NopT. We plan to mention these considerations in our revised Discussion.

      It is also difficult to evaluate how the ratios of cleaved and full-length protein change when different versions of NopT are present without a quantification of band strengths normalized to loading controls (Figure 3C, 3D, 3F). The same is true for the blots supporting NFR1 phosphorylation of NopT (Figure 4A).

      Thank you for pointing out this aspect. Following your recommendation, we will quantify the band intensities for cleaved and full-length NFR5 in the experiments with different versions of NopT. These values will be normalized to loading controls. Similarly, the Western blots supporting NFR1 phosphorylation of NopT will be quantified. The data for normalized band intensities will be included in the revised figures. The quantifications will provide a clearer understanding of how the ratios of cleaved to full-length proteins change with different NopT variants and will also indicate the extent to which NopT is phosphorylated by NFR1.
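      The planned normalization can be illustrated with the following sketch. It assumes that densitometry values for each band are already available from image analysis; the intensity values, lane labels, and function names are invented placeholders for illustration and are not measurements from our blots.

```python
def normalized_intensity(band, loading):
    """Band intensity divided by the loading-control intensity of the same lane."""
    if loading <= 0:
        raise ValueError("loading-control intensity must be positive")
    return band / loading

def summarize_lane(full_length, cleaved, loading):
    """Normalized full-length and cleaved signals plus the cleaved fraction."""
    full_norm = normalized_intensity(full_length, loading)
    cleaved_norm = normalized_intensity(cleaved, loading)
    total = full_norm + cleaved_norm
    cleaved_fraction = cleaved_norm / total if total > 0 else 0.0
    return {
        "full_norm": full_norm,
        "cleaved_norm": cleaved_norm,
        "cleaved_fraction": cleaved_fraction,
    }

# Hypothetical densitometry values (arbitrary units), one entry per lane.
lanes = {
    "NFR5 + NopT": {"full_length": 900.0, "cleaved": 100.0, "loading": 1000.0},
    "NFR5 + inactive NopT": {"full_length": 1100.0, "cleaved": 0.0, "loading": 1100.0},
}

summary = {name: summarize_lane(**values) for name, values in lanes.items()}
for name, result in summary.items():
    print(name, result)
```

      Normalizing to the loading control makes band intensities comparable between lanes, while the cleaved fraction describes the ratio of cleaved to total NFR5 signal within a lane.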

      It is clear that mutation of nopT results in a quantitative infection phenotype. Nodule primordia and infection threads are still formed when L. japonicus plants are inoculated with ∆nopT mutant bacteria, but it is not clear if these primordia are infected or develop into fully functional nodules (Figure 5). A quantification of the ratio of infected and non-infected nodules and primordia would reveal whether NopT is only active at the transition from infection focus to thread or perhaps also later in the bacterial infection process of the developing root nodule.

      Thank you for pointing this out. In the revised version of our manuscript, we will provide data showing that there are no obvious differences in nodule formation in plants inoculated with ∆nopT and wild-type NGR234, respectively. However, quantification of infection threads containing our GFP-labeled rhizobia in primordia and nodules would be difficult to perform due to strong autofluorescence signals in these tissues. The main goal of our study was to identify and characterize the interaction between NopT and Nod factor receptors. We therefore believe that an in-depth analysis of the bacterial infection process at later symbiotic stages is out of the scope of the present work.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript presents data demonstrating NopT's interaction with Nod Factor Receptors NFR1 and NFR5 and its impact on cell death inhibition and rhizobial infection. The identification of a truncated NopT variant in certain Sinorhizobium species adds an interesting dimension to the study. These data try to bridge the gaps between classical Nod-factor-dependent nodulation and T3SS NopT effector-dependent nodulation in legume-rhizobium symbiosis. Overall, the research provides interesting insights into the molecular mechanisms underlying symbiotic interactions between rhizobia and legumes.

      Strengths:

      The manuscript nicely demonstrates NopT's proteolytic cleavage of NFR5, regulated by NFR1 phosphorylation, promoting rhizobial infection in L. japonicus. Intriguingly, authors also identify a truncated NopT variant in certain Sinorhizobium species, maintaining NFR5 cleavage but lacking NFR1 interaction. These findings bridge the T3SS effector with the classical Nod-factor-dependent nodulation pathway, offering novel insights into symbiotic interactions.

      We appreciate that you recognize the value of our manuscript.

      Weaknesses:

      (1) In a previous study, when NopT alone was transiently expressed in tobacco (Nicotiana tabacum) plants, proteolytically active NopT elicited a rapid hypersensitive reaction. However, this phenotype was not observed when expressing the same NopT in Nicotiana benthamiana (Figure 1A). Conversely, cell death and a hypersensitive reaction were observed in Figure S8. This raises questions about the suitability of the exogenous expression system for studying NopT proteolysis specificity.

      We appreciate your attention to these plant-specific differences. In view of your comments, we plan to revise the Discussion and explain the different expression systems used for studying NopT effects in planta. Previous studies showed that NopT expressed in tobacco (N. tabacum) or in specific Arabidopsis thaliana ecotypes (with PBS1/RPS5 genes) causes rapid cell death (Dai et al. 2008; Khan et al. 2022). Our data shown in Fig. S8 confirm these findings. As cell death (effector-triggered immunity) is usually associated with induction of protease activities, we considered N. tabacum and A. thaliana plants unsuitable for testing NFR5 cleavage by NopT. In fact, no NopT/NFR5 experiments were performed with these plants in our study. In contrast, the expression of NopT in Nicotiana benthamiana did not lead to cell death in our experiments. Khan et al. (2022) also reported that cell death does not occur in N. benthamiana unless the cells are transformed with PBS1/RPS5 constructs. Thus, N. benthamiana is a suitable expression system to analyze NopT protease activity on co-expressed substrates. Our revision aims to better explain the advantages of the N. benthamiana expression system for studying NopT-mediated proteolysis of NFR5.

      (2) NFR5 loss-of-function mutants do not produce nodules in the presence of rhizobia in Lotus roots, and overexpression of NFR1 and NFR5 produces spontaneous nodules. In this regard, if the direct proteolysis target of NopT is NFR5, one could expect that NGR234's infection will not be very successful because of native NopT's specific proteolysis of NFR5 and NFR1. Conversely, in Figure 5, the authors observed different results.

      Our inoculation experiments clearly show that NopT of NGR234 has a negative effect on formation of infection foci (Fig. 5A) and nodule primordia (Fig. 5E). Our biochemical analysis indicates that NopT targets the NFR1/NFR5 complex, which most likely impairs activation of downstream responses such as NIN gene expression. Accordingly, NIN promoter activity was found to be higher in roots inoculated with the ∆nopT mutant as compared to the NGR234 wild-type (Fig. 5B and 5D). It is therefore plausible that NopT impairs rhizobial infection of L. japonicus due to inhibition of NFR1/NFR5 functions. We agree with this Reviewer that it can be expected that “NGR234's infection will not be very successful”. Fig. 5 confirms that the ∆nopT mutant is indeed a better symbiont, and we do not think that we obtained “unexpectedly different results”. In the revised version, we will try to formulate our discussion text better in order to avoid any misunderstandings. Furthermore, we will use the figure title “NopT dampens rhizobial infection…” instead of “NopT regulates rhizobial infection…”. We are also considering changing the title of our manuscript.

      (3) In Figure 6E, the model illustrates how NopT digests NFR5 to regulate rhizobia infection. However, it raises the question of whether it is reasonable for NGR234 to produce an effector that restricts its own colonization in host plants.

      We acknowledge the potential paradox of NGR234 producing an effector that appears to restrict its own colonization in host plants. In fact, depending on the host plant, most rhizobial effectors are “double-edged swords” that play either a positive or negative role in the symbiosis. In response to your comment, we will discuss the possibility that NopT may confer selective advantages in interactions between NGR234 and host plants where NopT plays a positive symbiotic role (Dai et al. 2008; Kambara et al. 2009). Inhibition of NFR1/NFR5 functions by NopT in these host plants could be a feedback response in cells in which symbiotic signaling has already started. It is tempting to speculate that the interaction between NopT and Nod factor receptors reduces Nod factor perception and downstream signaling to avoid a possible overreaction of symbiotic signaling, which may result in hypernodulation or formation of empty nodules without bacteria. Furthermore, it is tempting to speculate that NopT targets not only Nod factor receptors but also other host proteins to promote symbiosis, e.g. by suppressing excessive immune responses triggered by hyperinfection of rhizobia. In our revised manuscript, we will highlight the need for further investigations to elucidate the precise mechanisms underlying the observed infection phenotype and the role of NopT in modulating symbiotic signaling pathways.

      (4) The failure to generate stable transgenic plants expressing NopT in Lotus japonicus is surprising, considering the manuscript's claim that NopT specifically proteolyzes NFR5, a major player in the response to nodule symbiosis, without being essential for plant development.

      Thank you for your comments. The failure to obtain L. japonicus plants constitutively expressing NopT was indeed surprising and suggests that NopT targets not only NFR5 but also other proteins in L. japonicus. The number of NopT substrates in plants could be greater than assumed. For example, we show in our work that NopT can cleave AtLYK5 and LjLYS11. In our manuscript, we do not provide protocols and data on our efforts to construct L. japonicus plants stably expressing NopT. Indeed, it cannot be completely ruled out that the observed failure is due not to NopT expression but rather to other factors that influence the transformation and regeneration of explants into whole plants. Our results should therefore not be over-interpreted. We consider a discussion of our failed transformation experiments to be somewhat preliminary and not central to this manuscript. Therefore, we plan to modify our Discussion and delete the sentence reporting that stable transgenic plants expressing NopT have not been successfully generated.

    1. Reviewer #2 (Public Review):

      Summary:

      The goal of the paper was to trace the transitions hippocampal microglia undergo during aging. ScRNA-seq analysis allowed the authors to predict a trajectory and hypothesize about possible molecular checkpoints, which keep the pace of microglial aging. For example, TGF1b was predicted as a molecule slowing down the microglial aging path and indeed, loss of TGF1 in microglia led to premature microglia aging, which was associated with premature loss of cognitive ability. The authors also used the parabiosis model to show how peripheral, blood-derived signals from the old organism can "push" microglia forward on the aging path.

      Strengths:

      A major strength and uniqueness of this work is the in-depth single-cell dataset, which may be a useful resource for the community, as well as the data showing what happens to young microglia in heterochronic parabiosis setting and upon loss of TGFb in their environment.

      Weaknesses:

      That said, given what we recently learned about microglia isolation for RNA-seq analysis, there is a danger that some of the observations are a result not of age but of cell stress from sample preparation (enzymatic digestion for 10 min at 37 °C; e.g. PMID: 35260865). Conclusions about changes in cell state distribution during aging were based on scRNA-seq and were not corroborated by any other method, such as imaging of cluster-specific marker expression in microglia at different ages. This analysis would allow confirming the scRNA-seq data and would also give us an idea of where the subsets are present within the hippocampus, and whether there is any interesting distribution of cell states (e.g. some are present closer to stem cells?). Since TGFb is thought to be crucial to microglia biology, it would be valuable to include more analysis of the mice with microglia-specific Tgfb deletion, e.g. what was the efficiency of recombination in microglia? Did their numbers change after induction of Tgfb deletion in Cx3cr1-creERT2::Tgfb-flox mice?

      Overall:

      In general, I think the authors did a good job following the initial observations and devised clever ways to test the emerging hypotheses. The resulting data are an important addition to what we know about microglial aging and can be fruitfully used by other researchers, e.g. those working on microglia in a disease context.

    1. There's a lot of really great content here. But, for readers like me (the technical/design/engineering/research side of the visualization community), I think the writing isn't landing with quite the impact that it could for a few reasons:

      (1) In my interdisciplinary collaborations, I've noticed a difference in writing styles/norms between the humanities and the design/engineering disciplines. The latter tend to favor a top-down argument structure (e.g., a crisply articulated thesis that is then unpacked via clearly signposted topic sentences). I think that's because readers like me are trying to figure out how to operationalize the things we're reading/learning. So, right from the get-go, we need a clearly articulated conceptual model so that, over the course of the rest of the writing, we can figure out how to integrate it with our existing mental models of practice/research.

      In contrast, this piece takes a very bottom-up approach to the argument. For me, the experience of reading bottom-up writing is of assembling a mental model that feels more like a wobbly house of cards: ad hoc, duct taped together, and needing to constantly swap/rearrange it as more pieces of the conceptual contribution reveal themselves to me.

      As a concrete example, for the first third of this chapter, I wasn't actually sure what I was supposed to be taking away. I almost wondered whether I should suggest titling the chapter "preface" instead of "introduction" because it opens by being focused inwardly (i.e., on the presentation of the homepage) in a way I'm more accustomed to with prefaces than introductions. Although that whole chunk of writing was very pleasant to read (which may also be a function of the fact that I had the pleasure of meeting y'all and learning about how the project came together!), I wasn't entirely sure what this chunk was hoping to do/communicate—or how it was hoping to influence my thinking.

      (2) Related to the first point, while I personally find the exploration of a visualization counterhistory exciting and thought-provoking, I wonder if the writing could motivate the goals of the counterhistory more explicitly and clearly? That is, if someone isn't already bought into valuing the history of the field (or doesn't know how a counterhistory may/should affect their current practice today), how might the writing persuade them to care? Or, put another way, how can the writing speak and evangelize to an audience who is open-minded, but not yet "on side"? To me, this feels like a particularly important thing for an introductory chapter to do, and it seems missing in the current iteration.

      (3) I'm on the fence about how central a role Tufte is given here. I think this depends on the audience you are trying to reach—I'm not sure that many (most?) visualization researchers/designers/practitioners (i.e., visualization "thought leaders") consider Tufte to play as influential a role as this chapter purports him to do. If this was the core audience, then I think the focus on Tufte could be watered down without losing much of the overall framing of "counterhistory"—because I think what this chapter describes is very much the history the field tells itself (relatively independently of Tufte, I think?).

      On the other hand, if the intended audience are the folks one hop removed (i.e., people who produce/consume visualizations in their daily lives/jobs, but aren't necessarily plugged into conversations on the bleeding edge), I think Tufte serves as a useful foil. But, something about his treatment in this chapter feels a little caricatured to me (and I say this as someone relatively ambivalent about his role). I'm not quite able to put my finger on what specifically about the writing left me with that feeling, though.

      (4) Starting with the "Two Stories of Data Visualization" (and particularly the subsequent chapter on "Every Datapoint is a Person"), I wondered whether the target of the book's critique is indeed visualization (i.e., the graphic representation of data) or whether it's more fundamental and broader practices of data (i.e., definition, collection, etc. more similar to the set of issues y'all discussed in Data Feminism). I really enjoyed all of the detail and discussion here, and I was convinced about the role that data played. But, I was perhaps less convinced about visualization's central/facilitating/empowering role in it. It's likely impossible to fully disentangle data from its representation (as the data table examples do a great job of showing) but, if the book wants to maintain visualization as its target, I wonder if the writing could be refined a little to make its focus clearer/crisper?

      (5) I wonder if the writing can be more explicit about its positionality? I think some of the early sections (and occasional passages throughout) set up an incorrect expectation for me of a much broader (i.e., more global) counterhistory. So, I was then surprised that this chapter maintains a relatively fixed focus on Western history. In fact, I might go further to say that the writing seems to be particularly fixated on an American point of view (e.g., I raised an eyebrow at the description of the United States as "the exemplary" colonial state; as a non-American and citizen of a former colonized nation, I would consider the British Empire to be the ultimate colonial power...). I think this focus is fine if the writing is explicit that it is primarily concerned with developing a counterhistory rooted in the West (and, at that, the United States).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study identifies differential Orsay virus infection of C. elegans when animals are fed on different bacteria. The evidence for this is, however, incomplete, as experiments to control for feeding rate and bacterial pathogenicity are needed as well as direct quantification of viral load.

      We appreciate that the editors and reviewers felt that our manuscript addressed an important problem. We appreciate the constructive critiques provided by the reviewers and have worked to address all of the concerns, including a number of additional experiments as indicated below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      This manuscript explores the importance of food type on virus infection dynamics using a nematode virus as a model system. The authors demonstrate that susceptibility to viral infection can change by several orders of magnitude based on the type of bacterial food that potential hosts consume. They go on to show that, for the bacterial food source that reduces susceptibility, the effect is modulated by quorum sensing molecules that the bacteria produce. 

      Strengths: 

      This manuscript shows convincingly that nematode susceptibility to viral infection changes by several orders of magnitude (i.e. doses must be increased by several orders of magnitude to infect the same fraction of the population) depending on the bacterial food source on which hosts are reared. The authors then focus on the bacteria that reduce host susceptibility to viral infection and demonstrate that certain bacterial quorum-sensing compounds are required to see this effect of reduced susceptibility. Overall, sample sizes are large, methods are generally rigorous, experiments are repeated, and patterns are clear. 

      Weaknesses: 

      Although the molecular correlate of reduced susceptibility is identified (i.e. quorum sensing compounds) the mechanisms underlying this effect are missing. For example, there are changes in susceptibility due to altered nutrition, host condition, the microbiome, feeding rate, mortality of infected hosts, etc. In addition, the authors focus almost entirely on the reduction in susceptibility even though I personally find the increased susceptibility generated when reared on Ochrobactrum to be much more exciting. 

      I was a bit surprised that there was no data on basic factors that could have led to reductions in susceptibility. In particular, data on feeding rates and mortality rates seem really important. I would expect that feeding rates are reduced in the presence of Pseudomonas. Reduced feeding rates would translate to lower consumed doses, and so even though the same concentration of virus is on a plate, it doesn't mean that the same quantity of virus is consumed. Likewise, if Pseudomonas is causing mortality of virus-infected hosts, it could give the impression of lower infection rates. Perhaps mortality rates are too small in the experimental setup to explain this pattern, but that isn't clear in the current version of the manuscript. Is mortality greatly impacted by knocking out quorum-sensing genes? Also, the authors explored susceptibility to infection, but completely ignored variation in virus shedding. 

      We have added data on feeding rates (Line numbers 141-148 and 176-182, Supplementary Figure 4). After six hours of exposure, no differences in feeding rate were observed. After 24 hours, minor differences emerged between O. vermis MYb71 and each Pseudomonas species; however, feeding rate inversely correlated with susceptibility to Orsay virus, in that O. vermis MYb71 displayed the lowest feeding rate while P. aeruginosa PA14 displayed the highest feeding rate.

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature and as our assays are done at 20°C rather than 25°C this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although it does not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.  

      The reviewer is correct to assert that differences in viral shedding could exist. However, our susceptibility assays using exogenous Orsay virus remove this source of variation and yet we still observe the same trends such that O. vermis MYb71 promotes infection while P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 attenuate infection. Further, we measured the amount of virus shed into the lawns in the presence of different bacteria and did not observe differences in shed virus that could account for the differences we observe in incidence proportion (Line numbers 241-254, Fig. 3F). Viral stability could be an issue in both the transmission and susceptibility assays. We therefore tested viral stability in the presence of E. coli, P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 and successfully recovered virus from all lawns, suggesting virus is not rapidly degraded in the presence of any bacterium (Fig. 3D and 3E). However, we noted that the recovery of Orsay virus from lawns of E. coli OP50 and P. lurida MYb11 within 30 minutes was decreased compared to a spike-in control, suggesting recovery from each lawn is not equivalent. This complicates a comparison of viral stability and shedding rates between different bacteria, but our ability to recover substantial amounts of virus in the shedding assay from the three Pseudomonas strains we examined precludes a substantial decrease in shedding rates as an explanation for the robust attenuation of Orsay virus observed in transmission assays.

      I was also curious why the authors did not further explore the mechanism behind the quorum-sensing effect. Not sure whether this is possible, but would it be possible to add spent media to the infection plates where the spent media was from Pseudomonas that produce the quorum sensing compound but the plates contain OP50, Pseudomonas, or the quorum sensing knockout of Pseudomonas? That would reveal whether it is the compound itself vs. something that the compound does.

      We observed that quorum sensing mutants suppressed the attenuation of Orsay virus infection and we agree that this could be a consequence of the compounds themselves, or more likely an effect of the downstream consequences of quorum signaling. We added culture supernatant from each bacterium to lawns of E. coli OP50 to assess the effect on host susceptibility and did not observe any potent effect (Line numbers 311-318, Supplementary Figure 9). This supports an interpretation that it is not the compound itself that is responsible; however, we cannot rule out that the compounds themselves may be responsible if provided at a higher concentration.

      In addition, I was surprised by how much focus there was on the attenuation of infection and how little there was on the enhancement of infection. To me, enhancement seems like the more obvious thing to find a mechanism for -- is the bacteria suppressing immunity, preventing entry to gut cells, etc? 

      We are also intrigued by the enhancement of infection by Ochrobactrum spp.; however, we chose to focus on attenuation given the availability of Pseudomonas aeruginosa genetic mutants for study. We have added data (Line numbers 371-402, Figure 7, and Supplemental Figure 12) that inform our current hypothesis regarding Ochrobactrum-mediated enhancement of Orsay virus infection.

      I was a bit concerned about the "arbitrary units", which were used without any effort to normalize them. David Wang and Hongbing Jiang have developed a method based on tissue culture infectious dose 50 (TCID50) that can be used to measure infectious doses in a somewhat repeatable way. Without some type of normalization, it is hard to imagine how this study could be repeated. The 24-hour time period between exposure and glowing suggests very high doses, but it is still unclear precisely how high. Also, it is clear that multiple batches of virus were used in this study, but it is entirely unclear how variable these batches were. 

      We have clarified that we also measured the (TC)ID50 for every batch of virus used similar to the methods suggested by the Wang laboratory (Line numbers 107-119 and 499-506). We have added a figure showing the virus batch variability for all batches used in this study (Supp. Fig. 2). We have further clarified that the arbitrary units correspond to the actual microliters of viral filtrate used during infection and provided clear methods to replicate our viral batch production to assist with issues of reproducibility (Line numbers 107-119 and 499-506).

      The authors in several places discuss high variability or low variability in incidence as though it is a feature of the virus or a feature of the host. It isn't. For infection data (or any type of binomial data) results are highly variable in the middle (close to 50% infection) and lowly variable at the ends (close to 0% or 100% infection). This is a result that is derived from a binomial distribution and it should not be taken as evidence that the bacteria or the host affect randomness. If you were to conduct dose-response experiments, on any of your bacterial food source treatments, you would find that variability is lowest at the extremely high and extremely low doses and it is most variable in the middle when you are at doses where about 50% of hosts are infected. 

      Thank you for pointing this out. We have removed all references to this throughout the manuscript.
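
      The binomial argument in the comment above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes each animal on a plate is infected independently with the same probability, and the number of animals scored per plate (`n_worms`) is a hypothetical value, not one taken from the manuscript.

```python
import math


def binomial_sd_of_proportion(p: float, n: int) -> float:
    """Standard deviation of an observed infected fraction when each of
    n hosts is infected independently with probability p."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical number of animals scored per plate (an assumption).
n_worms = 100

# The spread is widest near 50% infection and shrinks toward 0% and 100%,
# regardless of which bacterium or host strain is being tested.
for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    sd = binomial_sd_of_proportion(p, n_worms)
    print(f"true infection probability {p:.2f} -> SD of observed fraction {sd:.3f}")
```

      Under these assumptions the spread depends only on the infection probability and the number of animals scored, which is why low variability near 0% or 100% infection says nothing by itself about the bacterium or the host.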

      Reviewer #2 (Public Review):

      Summary and Major Findings/Strengths:

      Across diverse hosts, microbiota can influence viral infection and transmission. C. elegans is naturally infected by the Orsay virus, which infects intestinal cells and is transmitted via the fecal-oral route. Previous work has demonstrated that host immune defense pathways, such as antiviral RNAi and the intracellular pathogen response (IPR), can influence host susceptibility to virus infection. However, little is known about how bacteria modulate viral transmission and host susceptibility. 

      In this study, the authors investigate how diverse bacterial species influence Orsay virus transmission and host susceptibility in C. elegans. When C. elegans is grown in the presence of two Ochrobactrum species, the authors find that animals exhibit increased viral transmission, as measured by the increased proportion of newly infected worms (relative to growth on E. coli OP50). The presence of the two Ochrobactrum species also resulted in increased host susceptibility to the virus, which is reflected by the increased fraction of infected animals following exposure to the exogenous Orsay virus. In contrast, the presence of Pseudomonas lurida MYb11, as well as Pseudomonas aeruginosa PA01 or PA14, attenuates viral transmission and host susceptibility relative to E. coli OP50. For growth in the presence of P. aeruginosa PA01 and PA14, the attenuated transmission and susceptibility are suppressed by mutations in regulators of quorum sensing and the gacA two-component system. The authors also identify six virulence genes in P. aeruginosa PA14 that modulate host susceptibility to virus and viral transmission, albeit to a lesser extent. Based on the findings in P. aeruginosa, the authors further demonstrate that deletion of the gacA ortholog in P. lurida results in loss of the attenuation of viral transmission and host susceptibility.

      Taken together, these findings provide important insights into the species-specific effects that bacteria can have on viral infection in C. elegans. The authors also describe a role for Pseudomonas quorum sensing and virulence genes in influencing viral transmission and host susceptibility. 

      Major weaknesses: 

      The manuscript has several issues that need to be addressed, such as insufficient rigor of the experiments performed and questions about the reproducibility of the data presented in some places. In addition, confounding variables complicate the interpretations that can be made from the authors' findings and weaken some of the conclusions that are stated in the manuscript. 

      (1) The authors sometimes use pals-5p::GFP expression to indicate infection, however, this is not necessarily an accurate measure of the infection rate. Specifically, in Figures 4-6, the authors should include measurements of viral RNA, either by FISH staining or qRT-PCR, to support the claims related to differences in infection rate. 

      Following the reviewer's comment, we have corroborated our pals-5p::GFP data using FISH staining (Line numbers 291-292 and 357-359, Figure 4D & 4E, and Figure 6C).

      (2) In several instances, the experimental setup and presentation of data lack sufficient rigor. For example, Fig 1D and Fig 2B only display data from one experimental replicate. The authors should include information from all 3 experimental replicates for more transparency. In Fig 3B, the authors should include a control that demonstrates how RNA1 levels change in the presence of E. coli OP50 for comparison with the results showing replication in the presence of PA14. In order to support the claim that "P. aeruginosa and P. lurida MYb11 do not eliminate Orsay virus infection", the authors should also measure RNA1 fold change in the presence of PA01 and P. lurida in the context of exogenous Orsay virus. Additionally, the authors should standardize the amount of bacteria added to the plate and specify how this was done in the Methods, as differing concentrations of bacteria could be the reason for species-specific effects on infection. 

      All experimental replicates are now included within the supplementary information. 

      We have also measured RNA1 fold change following infection in the presence of P. aeruginosa PA01 and P. lurida MYb11 (Fig 3B and 3C) and found that these bacteria also do not eliminate Orsay virus replication.

      We thank the reviewer for their comment on controlling the amount of bacteria and have clarified our methods section to more clearly explain that we seed our plates with equivalent amounts (based on volume) of overnight bacterial culture before allowing the bacteria to grow on the plates for 48 hours.  

      (3) The authors should be more careful about conclusions that are made from experiments involving PA14, which is a P. aeruginosa strain (isolated from humans), that can rapidly kill C. elegans. To eliminate confounding factors that are introduced by the pathogenicity of PA14, the authors should address how PA14 affects the health of the worms in their assays. For example, the authors should perform bead-feeding assays to demonstrate that feeding rates are unaffected when worms are grown in the presence of PA14. Because Orsay virus infection occurs through feeding, a decrease in C. elegans feeding rates can influence the outcome of viral infection. The authors should also address whether or not the presence of PA14 affects the stability of viral particles because that could be another trivial reason for the attenuation of viral infection that occurs in the presence of PA14. 

      We have added data on feeding rates (Line numbers 141-148 and 176-182, Supplementary Figure 4). After six hours of exposure, no differences in feeding rate were observed. After 24 hours, minor differences emerged between O. vermis MYb71 and each Pseudomonas species; however, feeding rate inversely correlated with susceptibility to Orsay virus, in that O. vermis MYb71 displayed the lowest feeding rate while P. aeruginosa PA14 displayed the highest feeding rate.

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature and as our assays are done at 20°C rather than 25°C this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although it does not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.  

      We tested viral stability in the presence of E. coli OP50 and Pseudomonas spp. and successfully recovered virus from all lawns, suggesting virus is not rapidly degraded in the presence of P. lurida MYb11, P. aeruginosa PA01, and P. aeruginosa PA14 (Line numbers 241-249, Fig 3D and Fig 3E). However, we noted that the recovery of Orsay virus from lawns of E. coli OP50 and P. lurida MYb11 within 30 minutes was decreased compared to a spike-in control suggesting recovery from each lawn is not equivalent. This complicates a comparison of viral stability and shedding rates between different bacteria, but our ability to recover substantial amounts of virus in the shedding assay from each Pseudomonas species precludes a substantial decrease in shedding rates as an explanation for the robust attenuation of Orsay virus observed in transmission assays.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Overall, I really liked this manuscript, though I do think there are areas for improvement.

      Some smaller things: 

      Line 84: "can be observed spreading from a single animal" -- this isn't really great wording because the virus itself can't be observed (at least not very easily) -- even infection is hard to see. 

      The wording in lines 84-85 has now been adjusted to read “can spread from a single animal”.

      Fig 1C: which groups are statistically significantly different from each other? 

      Statistics have now been added to Figure 1C. 

      Line 154: not necessary to do for this paper, but this sentence made me curious whether the effect would have been seen with mixtures of bacteria (i.e. what if 50% were OP50 and 50% were Pseudomonas?) 

      This data has now been added in Line numbers 372-378, Figure 7A, and Supp. Fig. 12A and 12B.

      Line 262-264: I don't find this interesting at all for the reasons mentioned earlier about binomial data being the most variable in the middle. 

      These lines have been removed.

      Figure 4 B: The labels for the first two tick marks on the x-axis are switched I suspect. Otherwise, the controls did not behave as expected. 

      Figure 4B has been corrected.

      Line 288, 297 and several other places: "Orsay Virus" should be "Orsay virus". 

      We have corrected these instances.

      Supplemental Figure 2: Labels in the figure legend are B and C instead of A and B. 

      These labels have been adjusted for their placement within Figure 6.

      Line 411: I suspect this was supposed to be 13,200 xg rather than 13.2 xg. 

      This error has been corrected.

      Line 416-417: This sentence is very hard to interpret. More details are needed. This is the ID50 in which host strain? Is this averaged over all batches of virus? How variable are the batches? 

      This sentence (line number 114) has been amended to clarify that all ID50 values referred to here were calculated for ZD2611 populations in the presence of E. coli OP50. Further, Supplementary Figure 2 now shows all the ID50 values measured for each batch of virus used in this manuscript resulting in an average ID50 of 3.6.

      Lines 467-469: Why exclude these instead of counting them as zeros in the analysis? How many plates fit this description -- were there lots or only a few over the course of all experiments? 

      We have chosen to exclude these plates as these samples lost spreaders at some point during the course of the assay potentially skewing the eventual number of new infections counted depending on when the infected spreader animal crawled off the plate.  We have detailed the number of plates that fit this description in lines 559-562. 

      Line 476: A critical detail that is missing here is what number of worms were counted to score infection. Please say here or in the figure legends. 

      We have added the total number of worms counted and the minimum number counted per plate for each assay in the figure legends.

      Line 546: Why was only a single representative experiment shown? I'm asking for a justification, not necessarily for you to show all the data. 

      We chose to show a single representative experiment for two reasons. First, we noted variability between susceptibility assays even when using the same batch of virus, such that we could not combine experiments into a single plot as we did for transmission assays. Second, while we could normalize to a control within each experiment and expect to see similar relative differences across experiments, we believe this makes it more difficult to interpret the underlying data. For example, an increase in the infection rate of 80% compared to 10% within a population has only a single interpretation, while a relative increase in the infection rate by 8x within a population could have several underlying meanings (e.g. 80% vs 10%, 64% vs 8%, 24% vs 3%). We have now included all experimental replicates in the supplementary material.
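
      The ambiguity of a normalized fold change can be made concrete with the three example pairs given above. This is a minimal sketch, not an analysis from the manuscript; the helper name `fold_change` is introduced here purely for illustration.

```python
def fold_change(treated: float, control: float) -> float:
    """Relative infection rate: treated fraction divided by control fraction."""
    return treated / control


# (treated fraction, control fraction) pairs quoted in the response above.
pairs = [(0.80, 0.10), (0.64, 0.08), (0.24, 0.03)]

for treated, control in pairs:
    ratio = fold_change(treated, control)
    difference = treated - control
    # Each pair gives roughly the same 8x ratio, yet the absolute change in
    # the infected fraction differs substantially between pairs.
    print(
        f"{treated:.0%} vs {control:.0%}: "
        f"fold change = {ratio:.1f}x, absolute difference = {difference:.0%}"
    )
```

      All three pairs collapse to the same ratio once normalized, while the absolute difference in the infected fraction ranges from 21 to 70 percentage points, which is the information a reader loses when only relative values are plotted.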

      Reviewer #2 (Recommendations For The Authors):

      Minor concerns: 

      (1) Lines 86-87: "utilized a collection of bacteria isolated from the environment with wild C. elegans". The authors should provide more context on the source of these bacterial strains. 

      More references for the sources of these bacteria have been added to Supplementary Table 2.

      (2) The presentation of data in Fig 1 could be improved. The authors should include the text "pals-5p::GFP" on the images shown in Fig 1B. The red dashed line in Fig. 1D should intersect the dose-response curve at y = 0.5. The column heading for Fig 1E states "ID50 +/- SD (a.u.)", but should read "ID50 ratio" and should not have units. It also might be more intuitive to normalize the ID50 value for O. vermis to E. coli OP50. This way, having an ID50 ratio >1 indicates decreased transmission relative to E. coli, and ID50 ratio <1 indicates increased transmission relative to E. coli. To increase the transparency and rigor of 1E, the authors should plot the ratios from all 3 experimental replicates. The authors should also briefly explain why different viral doses were used in Fig 1D and 1F. 

      The text “pals-5p::GFP” has now been added to Figure 1B and throughout the text. The red dashed line in figure 1D has been corrected. Figure 1E has been adjusted to an actual figure as suggested and the y-axis label is “ID50 Ratio Compared to E. coli OP50”. The ID50 replicates have been plotted in Supplementary Figure 2. We have clarified that the doses used are the same. Briefly, the technical replicates of individual doses from Figure 1D and Supplementary Figure 3A and 3B were pooled and processed for FISH staining to provide each experimental replicate of Figure 1F. 

      (3) Line 110: The claim is that Ochrobactrum and P. lurida MYb11 reduce the variability of infection levels. However, another possibility is that there's simply less dynamic range in the assay because the infection levels have been compressed to 100% and 0% under these conditions. 

      This line has been removed.

      (4) There are discrepancies between what is shown in Fig 2C and what is described in the text. Lines 163-164: "P. aeruginosa PA01 and P. lurida MYb11 attenuated average infection to 33% and 62% of the population respectively". In Fig 2C, the mean for PA01 is ~25% whereas the mean for P. lurida appears to be less than 62%. 

      These values have been corrected.

      (5) Line 196: Provide more context for why rde-1 mutants were tested. This is the first time rde-1 is mentioned in the text (i.e. why show results in rde-1 mutants when the results are in Fig 2). 

      More context has been provided for why rde-1 mutants were tested (Line numbers 228-232). Briefly, using the rde-1 mutant, which has defective antiviral immunity and therefore supports higher viral replication levels than the wild-type (Félix et al. 2011), allows us to potentiate our infection assay in Figure 3B and 3C such that we maximize our chances of detecting viral replication in the presence of the Pseudomonas species, and especially P. aeruginosa PA14, where fewer animals might be expected to get infected based upon Figure 2B and Supplementary Figure 5.

      (6) Lines 228-229: "Mutations of any the regulators of the las, rhl, or pqs quorum sensing systems suppressed the attenuation of Orsay virus infection caused by the presence of wild-type P. aeruginosa PA01". Based on this description, PA01 should have a lower fraction of GFP positive relative to the quorum sensing mutants in Fig 4B. It seems that the x-axis labels OP50 and PA01 are swapped. 

      The x-axis labels of Figure 4B have been corrected. 

      (7) To improve clarity, for any figures that have data showing the "fraction of individuals GFP positive", the authors should include "pals-5p::GFP" in the y-axis title and legend. 

      The y-axis labels, legends, and text have been corrected throughout.  

      (8) To improve overall clarity and flow, the order in which the data is presented could be reordered. In particular, Fig. 6 could be better positioned instead of being the last figure, as no further characterization is performed on the mutants, and the findings are not conserved in strains that are more relevant to the C. elegans microbiota, such as P. lurida. The overall story could be strengthened if the authors ended the manuscript with more details related to the mechanism by which regulators of quorum sensing modulate the outcome of viral infection. 

      Figure 5 and Figure 6 have now been swapped.

      (9) Fig 5A: Make arrow sizes consistent across diagrams (i.e. the diagram for gacA deletion). 

      This figure (now Figure 6A) has been adjusted to make arrow sizes consistent across diagrams.  

      (10) Lines 280-282: "These data suggest that gacA has a conserved role across distant Pseudomonas species..." Here, the authors can provide more context on how well-conserved gacA is across Pseudomonas species (i.e. phylogenetic analysis of gacA sequences across different Pseudomonas species/strains). Furthermore, the data in Fig 5 does not provide strong enough support for the conclusion that gacA has a conserved role broadly across Pseudomonas species, as the authors only assess the effects of a gacA deletion in two species, P. aeruginosa and P. lurida. 

      We have adjusted lines 361-362 to “These data suggest that gacA has a conserved role between P. aeruginosa and P. lurida Myb11 in the attenuation of Orsay virus transmission and infection of C. elegans.” to reflect that we only assessed the effects of the gacA deletion in P. aeruginosa and P. lurida MYb11.

      (11) The manuscript can be strengthened by performing additional experiments to elucidate the mechanism by which Pseudomonas modulates viral infection. Does the attenuation of viral transmission and host susceptibility by P. lurida and P. aeruginosa require C. elegans to be in the presence of live bacteria? For example, the authors could measure viral transmission and susceptibility of C. elegans grown on heat-killed Pseudomonas. Additionally, it would be interesting to determine if modulation of viral infection is dependent on a secreted molecule. To assess this, the authors could perform viral infections in the context of Pseudomonas culture supernatant. 

      We added bacterial culture supernatant from each bacterium to lawns of E. coli OP50 to assess the effect on host susceptibility and did not observe any potent effect (Line numbers 311-318, Supplementary Figure 9). This supports an interpretation that attenuation is not mediated by a secreted molecule; however, we cannot rule out that attenuation activity would become apparent if supernatant were provided at a higher concentration.

      We have found substantial challenges in appropriately controlling live vs. heat-killed experiments, particularly with the specifics of our susceptibility experiments. With regard to the underlying question of mechanism, we believe that the genetic mutants (e.g. rhlR/gacA) are equally informative and that further comparison of these mutants' interaction with the C. elegans host as compared to wild-type may be informative.

      (12) The authors should include a discussion on the relative virulence potential of PA01, PA14, and P. lurida and the relationship between bacterial virulence potential and the outcome of viral infection. 

      We have also added data on mortality rates (Line numbers 183-200, Supplementary Figure 6). No significant mortality was observed within the 24-hour exposure period used for our Orsay infection and transmission assays. P. aeruginosa virulence is dependent upon temperature, and as our assays are done at 20°C rather than 25°C, this may account for reduced mortality compared to other published results. Regardless, we noted that O. vermis MYb71 killed C. elegans as quickly as P. aeruginosa PA14 under these conditions and these two bacteria led to the shortest lifespan compared to the other tested bacteria. Interestingly, P. lurida MYb11 was observed to be more virulent than P. aeruginosa PA01 under these conditions. These results suggest that there is no direct correlation between mortality and susceptibility to Orsay virus, although they do not rule out that virulence effects unique to each bacterium could contribute to alterations in host susceptibility.

      (13) More information is needed on strains listed in Supplementary Table 2, particularly when there is no reference listed and the strain is "Gift of XXX lab". For example, the Troemel lab previously published about an Ochrobactrum strain in Troemel et al PLOS Biology 2008 PMID: 19071962 - is this the same strain? Please ensure that there is adequate information about each strain with as many published references as possible so that the work can be more easily reproduced. 

      We have added additional information and references to the strain table in Supplementary Table 2. The strain listed as Ochrobactrum sp. has been amended to Ochrobactrum BH3 as it is the strain described in Troemel et al. 2008.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript uses C. elegans as a model to interrogate the effects of autism-associated variants of previously unknown function in the RNA-binding protein RBM-26/RBM27.

      Despite its potential impact, there are several concerns related to the technical rigor and specificity of the observed effects.

      Major concerns: 1. The effects on PLM are interesting, but why was this neuron selected for study? Was this a lucky guess or are other axons also affected? It is important to clarify whether the effects of RBM-26 are specific to this neuron or act pleiotropically across many or all neurons. According to CeNGEN, rbm-26 is strongly expressed in the well-characterized neurons ASE, PVD, and HSN. Are there morphological defects in these neurons, or others? As a note, there are also functional assays for these neurons (salt sensing, touch response, and egg laying, respectively).

      We have added new data to the supplemental materials showing that loss of rbm-26 function also causes the beading phenotype in the axons and dendrites of the PVD neuron (Figure S4 and lines 196-199). We have focused on the PLM neuron because our preliminary studies indicated that it had a higher penetrance of axon defects relative to the PVD neuron. Moreover, we observed expression of endogenously tagged RBM-26 in the PLM neuron (Figure 3A-C and lines 210-215).

      Similarly, the choice of the MALSU homolog seemed like a shot in the dark. It is ranked 46th (out of 63 genes) for fold-enrichment following RBM-26 pull-down, and 9th for p-value. Were any of the mRNAs with greater fold-enrichment or smaller p-values examined further? It is important to determine whether many or all of these interacting genes are overexpressed in the absence of RBM-26 and whether they are also required for the phenotypic effects of RBM-26 mutants, or if the MALSU homolog is special.

      We have clarified our reasoning for selecting the MALS-1 ortholog of MALSU1 for further study (see lines 283-284 and Table S2). Amongst binding partners with human orthologs, MALS-1 was by far the top ranked candidate. The adjusted p-value for MALS-1 was 0.0008. The next smallest adjusted p-value was more than an order of magnitude larger (0.028 for dpy-4). Moreover, the log2 fold enrichment for MALS-1 was 1.98, about the same as the largest (ACADS with 2.13). Nonetheless, we agree that some of the other interactors may also be of interest and have thus included them in supplemental Table S2. Although these other potential binding partners are outside the scope of this study, we expect that future studies by ourselves or others may focus on the roles of these other binding partners.

      In addition to the specificity controls mentioned above, positive and negative controls are needed throughout the results. While each of these may be relatively minor by itself, as a group they raise questions about the technical rigor of the study. Briefly these include: Fig 1C. Missing loading controls and negative control (rbm-26 null allele). Additional exposures should be included to show whether RBM-26(P80L) protein or the lower band for RBM-26(L13V) are present at all, relative to the null allele.

      We have added no-stain loading controls to Figure 1C. We have also switched to using ECL detection, which is much more sensitive and reveals faint bands for RBM-26(P80L) and additional faint bands for RBM-26(L13V). In addition, we have included a longer exposure for the blot (Figure S1). We are unable to test the null, as we can only produce a limited number of small maternally rescued progeny, thereby precluding western blot analysis.

      Fig 2. Controls to distinguish overextension of PLM axon from posterior mispositioning of ALM cell body are needed. Quantification of PLM axon lengths in microns (or normalized to body size) with standard deviation, not error of proportion, should be shown. Measurement of "beading phenotype" should be more rigorous, see for example the approach in Rawson et al. Curr. Biol. 2017 https://doi.org/10.1016/j.cub.2014.02.025 . The developmental stage examined, and the reason for choosing that stage, should be described for this and all figures.

      We have added new data that show PLM axon length relative to body length for each of the RBM-26 mutants (Figure S2 and lines 183-185). These results indicate that the PLM axon has a larger axon length to body length ratio, suggesting that the PLM/ALM overlap phenotype is a result of PLM axon overextension. For most experiments, we retain penetrance, as this has been standard practice in the field and allows for a much larger sample size (see examples listed below). We have also added examples of how the beading phenotype was measured (Figure S3). Moreover, we have now analyzed this phenotype and others at multiple developmental stages (Figures 2D-H and Table S1). In general, we have conducted experiments at the L3 stage because the rbm-26(null) mutants don't survive past this stage. However, for many of our experiments we have included additional stages as well. We have added this explanation to the methods section on phenotype analysis and also at various locations throughout the text. We have also labeled all graphs to clearly indicate the developmental stages.

      10.1038/s41467-019-12804-3 Article by laboratory of Brock Grill

      10.1371/journal.pgen.1002513 Article by laboratory of Ian Chin-Sang

      doi.org/10.1073/pnas.1410263111 Article by laboratory of Chun-Liang Pan

      10.1016/j.neuron.2007.07.009 Article by laboratory of Yishi Jin

      doi.org/10.1523/JNEUROSCI.5536-07.2008 Article by laboratory of William Wadsworth
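      The two measures contrasted in this exchange, penetrance with its standard error of proportion and axon length normalized to body length, can be sketched as follows. This is a minimal illustration only: the function names and the numbers are hypothetical and are not taken from the study.

```python
import math


def penetrance_with_se(n_affected: int, n_scored: int) -> tuple[float, float]:
    """Return penetrance (fraction of animals showing the phenotype)
    and the standard error of that proportion, sqrt(p * (1 - p) / n)."""
    if n_scored <= 0:
        raise ValueError("at least one animal must be scored")
    if not 0 <= n_affected <= n_scored:
        raise ValueError("affected count must lie between 0 and n_scored")
    p = n_affected / n_scored
    se = math.sqrt(p * (1.0 - p) / n_scored)
    return p, se


def normalized_axon_length(axon_length_um: float, body_length_um: float) -> float:
    """Return axon length as a fraction of body length."""
    if body_length_um <= 0:
        raise ValueError("body length must be positive")
    return axon_length_um / body_length_um


# Hypothetical example: 20 of 100 scored animals show PLM/ALM overlap.
p, se = penetrance_with_se(20, 100)
# Hypothetical example: a 450 µm axon in a 900 µm animal.
ratio = normalized_axon_length(450.0, 900.0)
```

      Penetrance needs only a yes/no score per animal, which is why it scales to large sample sizes, whereas the normalized length requires a measurement per animal but separates axon overextension from differences in body size.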

      Fig 3. Controls without auxin and with neuronal TIR1 expression alone should be included. Controls demonstrating successful RBM-26 depletion, in larvae as well as in embryos at the time of PLM extension, should be included (weak embryonic depletion might explain why the overextension phenotype is only 14% instead of 40% as in the null). According to CeNGEN, rbm-26 expression in PLM is barely detected, thus depletion with a PLM-specific TIR1 should also be tested. To confirm the authors' identification of the cell marked "N" as the PLM cell body, co-expression of rbm-26 and a PLM-specific marker should be added. Rescue of the rbm-26 mutants with neuronal (and PLM-only) expression should be included to test sufficiency in PLM, and as a further control for potential artifacts of the AID system.

      We have added new data showing that an endogenously tagged RBM-26::Scarlet protein is expressed in the PLM neuron (Figure 3A-C). Moreover, we have added rescue experiments, showing that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (Figure 3F-G). We have also added controls without auxin (Figure S7) and without the rbm-26::scarlet::aid gene (Figure S8). We have added a new figure showing auxin-mediated depletion of RBM-26::Scarlet::AID in the PLM neuron (Figure S10). We examined auxin-mediated depletion at the L3 stage for consistency with our auxin-mediated phenotypic experiments. Moreover, these were done at the L3 stage for consistency with other experiments that included the rbm-26(null) mutants, which don't survive past this stage.

      In general, auxin-mediated knockdown tends to be hypomorphic in neurons. This is likely due to the fact that the neuronal TIR1 driver is expressed at much lower levels relative to the other drivers. In addition, the lower penetrance observed in auxin-mediated PLM/ALM overlap phenotype could reflect the fact that this phenotype resolves by the L4 stage in the hypomorphic mutants. For example, in P80L mutants at the L3 stage we see only about a 20% penetrance of the PLM/ALM overlap phenotype (relative to about 15% in auxin-mediated knockdown).

      Fig 4. More rigorous quantification of the distribution of mitochondria along the axon should be included, not only total number, and it should be clarified what region of the axon the images are taken from. Including the AID-depletion strain with and without auxin would further add to the sense of rigor. For the mitoTimer experiments, why is RBM-26(L13V) not included and why do wild-type values differ ~5-fold between experiments (despite error bars being almost non-existent)? A more rigorous approach to standardizing imaging conditions may be needed. Positive controls using compounds that affect oxidation should be included. Measurements of individual mitochondria with standard deviations should be shown, rather than aggregate averages with error of proportion.

      We have changed our methodology for measuring mitochondria, so that we now report the density of mitochondria in the axon (number per 100 µm; Figure 4E-F). We agree that this method is much better than counting the total number of mitochondria per axon, as it corrects for differences in body length and axon length. We also now include data for the whole axon (Figure 4E), proximal axon (Figure 4G), and distal axon (Figure 4H). These data suggest that the mitochondrial density defects occur in the proximal axon but not in the distal axon. Using the null allele, we have also examined the timing of mitochondria defects in the axon and report that the defects begin in the L1 stage and continue throughout larval development (Figure 4F). Individual datapoints have been added for all graphs in Figure 4.
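      The density measure described above (mitochondria per 100 µm of axon) can be illustrated with a short sketch. The function name and the counts and lengths below are hypothetical and serve only to show why density, unlike a raw count, is insensitive to axon length.

```python
def mitochondrial_density(n_mitochondria: int, axon_length_um: float) -> float:
    """Return the number of mitochondria per 100 µm of axon."""
    if axon_length_um <= 0:
        raise ValueError("axon length must be positive")
    return 100.0 * n_mitochondria / axon_length_um


# Hypothetical values: two axons with different lengths and raw counts.
short_axon = mitochondrial_density(12, 300.0)
long_axon = mitochondrial_density(16, 400.0)

# The raw counts differ (12 vs 16), but the densities are identical,
# so a longer axon does not register as having "more" mitochondria.
same_density = abs(short_axon - long_axon) < 1e-9
```

      The same function can be applied separately to the proximal and distal segments of an axon, given the count and length of each segment.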

      For the mitoTimer experiments (Figure 5), we have added data for L13V and have added the individual datapoints to the graph. In the prior version, the values did not differ 5-fold between experiments with the same stage, rather the different graphs were from different stages (as noted in the figure legends/main text) and the L4 stage has much more oxidation than the L2 stage. To clear this up, we have added labels to the graphs to indicate the stages for each experiment. We have also added new data, so that we now show results for the L2, L3, and L4 stages for all three rbm-26 mutants (see Figure 5C-E). We didn't test the L1 stage because the signal was not sufficient for accurate quantitation.

      Fig 5. Additional positive and negative controls should be added, including additional rbm-26 alleles, the AID-tagged strain with and without auxin, and a rescued mutant.

      The old Figure 5 has become Figure 6 in the new version. We have added the rbm-26(L13V) allele to each experiment (Figure 6B-D). We have also added the loading controls for the western blot along with quantification for 3 biological replicates of the western blot analysis (Figure 6D). We agree that these additions significantly strengthen the data because they show that two independent alleles of rbm-26 cause a very substantial increase in the expression of mals-1 at both the mRNA and protein levels. We did not do these experiments with the rescuing transgene or with the AID-tagged strain because these experiments are done on whole worm lysates, whereas the AID-tagged and rescuing transgenes are neuron-specific.

      Fig 6. Controls showing whether the Scarlet-tagged protein is functional are needed, to rule out dominant negative or toxicity-related effects.

      This is Figure 7 in the new version. For this experiment, we are showing that overexpression of MALS-1 does cause defects. The idea is that excessive amounts of MALS-1 cause deleterious effects on the mitochondria. In fact, these defects could be considered as dominant negative or toxic. We considered the possibility of crossing the Pmec-7::mals-1::scarlet transgene with rbm-26; mals-1 double mutants. However, this does not seem workable, because the single copy Pmec-7::mals-1::scarlet transgene produces the phenotypes at penetrances that are similar to what we observe in rbm-26; mals-1 double mutants. We concede that the results of the overexpression experiments in Figure 7 are limited when considered in isolation. However, we think that they are meaningful when considered in combination with the results on the mals-1;rbm-26 double mutants in Figure 8.

      Fig 8. Controls for other mitochondrial components need to be included. It is important to determine if the decrease in ribosomes is specific or reflects a general decrease in mitochondria. If there are fewer mitochondria as suggested in Fig. 4, then of course mitochondrial ribosomal protein levels are also reduced. Additional rbm-26 alleles should be included here as well. Is this effect dependent on the MALSU homolog?

      This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357). Given these limitations we have elected not to try additional mitochondrial markers and have also not included additional rbm-26 alleles for this experiment.

      Finally the authors should address concerns about image manipulation, which amplify the concerns about technical rigor outlined above. The image in Fig. 2A appears to have a black box placed over the lower-right portion of the field to hide some features. Black boxes also appear to have been placed over the tops of images in Fig. 4B and 4D and at the left of Fig. 6A, 6B, and 6C. While these manipulations probably do not affect the conclusions, they further undermine confidence in data integrity and experimental rigor.

      We have corrected all of these image processing errors. The box in 2A was for the purpose of squaring off a corner that was clipped during image rotation. The boxes in Figures 4 and 6 (of the prior version) were added to give space for labels (without obscuring image features). We have now used alternative methods to accomplish the same goals. For example, in Figures 4-D we have placed the labels outside of the images.

      Minor points. 1. C. elegans nomenclature conventions should be followed: - C. elegans gene names have three or four letters, thus the MALSU homolog cannot be named "malsu-1". Please have new gene names approved by WormBase BEFORE submitting for publication http://tazendra.caltech.edu/~azurebrd/cgi-bin/forms/gene_name.cgi

      We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by WormBase and will be listed on the website upon its next update.

      • If two sequential CRISPR edits are made on the same gene then they should be listed as a compound allele, such as rbm-26(cue22cue25)

      We have updated our gene names to reflect this convention.

      • Genes on the same chromosome should not be separated with a semicolon, for example rbm-26(cue40) K12H4.2(syb6330)

      We have updated our gene names to reflect this convention.

      Describing the defects as "neurodevelopmental" is misleading in the case of axon beading or degeneration. Similarly, there is no evidence for an "axon targeting" defect as stated in the abstract.

      We have revised such that instead of referring to degeneration phenotypes as neurodevelopmental, we now refer to axon degeneration phenotypes that occur during development. For example, in the abstract we now say, "These observations reveal a mechanism that regulates expression of a mitoribosomal assembly factor to protect against axon degeneration during neurodevelopment."

      Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term "targeting defects". Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      In Fig. 5A, the symbol that appears to correspond to F59C6.15 (lowest p-value) is a different size than the others and is colored as ncRNA, whereas WormBase annotates this gene as snoRNA.

      This error has been corrected.

      In the Introduction, the last sentences of the first two paragraphs should be varied ("However, little is known about the [...] mechanisms that protect [...] during neurodevelopment.")

      This has been done.

      Why is RBM-26 protein running as a doublet at both sizes?

      We have improved our western blotting methodology by using a 12% gel, allowing for better resolution. We have also switched from colorimetric detection to ECL detection, allowing for greater sensitivity. In our new blots, we identify 6 different RBM-26 protein bands. We don't know the reason for these bands, but speculate that they are the result of post-translational processing (lines 148-150).

      When showing the RBM-26 expression pattern (Fig. 3) please include a lower-magnification image of the entire animal.

      This has been done (Figure S6).

      It is confusing to refer to the RNA IP experiments as an "unbiased screen", which in C. elegans typically refers to a genetic screen.

      We now refer to this as a "biochemical screen".

      The relationship between axon overextension, beading, and mitochondrial localization is not clear. What causal connection between these is being proposed? The causal connections between these phenotypes, if any, should be clarified experimentally. For example, if the axon extension defects develop before mitochondrial localization defects, then it is unlikely that mitochondrial defects cause axon overextension.

      We have added new data showing that the reduction in mitochondrial density within the axon begins during the L1 stage and increases throughout larval development (Figure 4F). We have also added additional data showing that the increase in mitochondrial oxidation is weak in the L2 stage and surges in the L3 stage (Figure 5C-E), coincident with the beginning of the axon degeneration phenotypes. We propose (lines 383-391) that a low level of mitochondrial defects is present in L1 larvae, giving rise to the axon tiling defects. In the L3 stage there is a surge in excessive mitochondrial oxidation, giving rise to the axon degeneration phenotypes. We have added a new section to the discussion that addresses the relationship between defects in axon development and axon degeneration (lines 375-405).

      Please explain how to interpret the difference in axon beading in the two deletion alleles of the MALSU homolog (axon beading defects in tm12122 but not in syb6330). Is syb6330 not a null allele? Or are the defects in tm12122 due to other mutations in this strain background?

      One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas syb6330 is a full deletion. Thus, tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation.

      Are mitochondria reduced in number or mislocalized? If they are reduced in number, is this due to altered balance of fission/fusion?

      We have adjusted our methods for quantifying mitochondria and have also analyzed the proximal vs distal axon (Figure 4). We find that the density of mitochondria is decreased in the proximal axon, but not in the distal axon. We speculate that this might reflect a higher demand on mitochondria in the proximal axon, due to a higher amount of trafficking activity in the proximal axon (lines 255-257). We propose that the loss of RBM-26 causes dysfunction in mitochondria. Since fission and fusion are mechanisms that can help to repair damaged mitochondria, it is likely that they would be involved in the phenotypes that we observe.

      In Fig. 3A-D, please keep the labels in the same position in all panels and do not alter brightness settings between single-color and merged panels.

      These images have been moved to the supplemental data section (Figure S5). We have adjusted the labels as suggested. We have not changed the brightness settings, as they were already the same in all panels. However, the blue signal in the merged panel does obscure some of the red signal, giving an appearance of an alteration in color balance.

      The claim that rbm-26 acts cell-autonomously requires PLM-specific depletion and rescue experiments.

      We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype (Figure 3F-G).

      **Referees cross-commenting** I appreciate the use of the consultation session to resolve differences between reviewers, but in this case I fully agree with the content and tone of all the comments from the other reviewer -- I think our remarks are very well aligned!

      Reviewer #1 (Significance (Required)):

      The study engineers autism-associated variants in conserved residues of RBM27 into the C. elegans homolog RBM-26 and identifies neuronal phenotypes potentially relevant to autism and a potential molecular mechanism involving regulation of mitochondrial ribosome assembly.

      The key claims of the study are 1) that autism-associated variants in RBM-26 decrease its protein expression; 2) that impaired RBM-26 function leads to a variety of defects in development and maintenance of a single neuron called PLM, including altered axonal localization of mitochondria; 3) that RBM-26 normally binds the mRNA for the C. elegans homolog of MALSU, a mitochondrial ribosomal assembly factor; 4) that loss of RBM-26 leads to overexpression of the MALSU homolog; and 5) that MALSU is required for some of the deleterious effects on the PLM neuron seen in RBM-26 mutants.

      This study will be of interest to the autism research community because it bolsters the idea that variants in RBM27 are likely to disrupt gene function and to affect neuronal health. It will also be of interest to the broader cell biology community because it suggests an interesting potential nucleus-to-mitochondria signaling mechanism, in which a nuclear RNA-binding protein might regulate assembly of mitochondrial ribosomes.

      My field of expertise is developmental biology in C. elegans.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary In this manuscript, the authors studied an ASD-associated gene, rbm-26 in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss-of-function rbp-27 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology. Major comments:

      • The authors describe RBM27 is associated with ASD and ID while they only cite SFARI paper that describes a weak association of RBM27 to ASD. The appropriate referenced that show link between RBM27 and ID should be provided.

      The link with ID was an error. We had meant to say "ASD or other neurodevelopmental disorders." This has been corrected.

      • SFARI database only has three (P79L, R190Q, G348D) mutations listed as ASD-associated. Where are other mutations L13V and R455H, particularly L13V that the authors used to generate the C. elegans mutant come from? Are they associated with intellectual disabilities?

      The others came from denovo-db. We have added a reference for this database and have also added the primary source references for each of the five de novo variants (see line 121).

      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes without validation to be causal genes.

      We have revised accordingly. For example, on lines 433-435, we now say, "For example, mutations in the EXOSC3, EXOSC8 and EXOSC9 are thought to cause syndromes that include defects in brain development such as hypoplasia of the cerebellum and the corpus callosum". We have decided to use the phrase "thought to cause" because three of the five referenced articles on these genes use titles that indicate causation.

      • The authors refer PLM axon beading and overextension phenotypes to 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also the term 'targeting defects' is misleading as the authors did not examine if overextension of the PLM axon causes targeting defects. At least they should examine some synaptic markers.

      To provide more evidence of degeneration we have analyzed several additional phenotypes at multiple developmental stages (Figure 2 and Table S1). Regarding targeting defects, this was meant to refer to the misplacement of the PLM axon tip (which contains electrical synapses). However, our subsequent analysis has revealed that these defects are transient in P80L and L13V mutants, as they resolve by the L4 stage. The rbm-26 null axon development defects do not resolve, though these mutants die prior to the L4 stage. Given these findings, we have decided not to use the term "targeting defects". Instead, we now refer to this as an axon tiling defect or PLM/ALM overlap phenotype.

      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects.

      We have included new data to observe all of these phenotypes at multiple developmental time points (Figure 2 and Table S1).

      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration.

      We have included new data on multiple degenerative phenotypes in axons including: blebbing, beading, waviness and breaks (Table S1).

      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals.

      We have added data on the density of beads in rbm-26(null), rbm-26(P80L), and rbm-26(L13V) mutants (Figure S3). For most experiments we have decided to use penetrance to measure axon degeneration because this is a standard in the field and allows for a larger sample size. For examples please see:

      10.1523/JNEUROSCI.1494-11.2012 (Toth et al, 2012)

      https://doi.org/10.1016/j.cub.2014.02.025 (Rawson et al, 2014)

      10.1073/pnas.1011711108 (Pan et al, 2012)

      https://doi.org/10.7554/eLife.80856 (Czech et al, 2023)

      https://doi.org/10.1016/j.celrep.2016.01.050 (Nichols et al, 2016)

      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo?

      We have added new data showing that the RBM-26::Scarlet signal is diminished by the P80L mutation in vivo (Figure 1E-F). We have also added quantification from 3 biological replicate blots (Figure 1D). Finally, we have improved the sensitivity of our blots by using ECL detection and also show various exposures to highlight the fainter bands (Figures 1C and S1). Therefore, we are now able to detect low level expression of RBM-26(P80L) mutant protein. It is likely that the low level of RBM-26(P80L) and RBM-26(L13V) seen on western blots is sufficient to prevent the lethal phenotype.

      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD. We have added the citations for this work (line 81). We also note that the titles for both of the cited articles indicate causation. To be on the safe side we have revised this line to say, "Moreover, loss of either the SPTBN1 or ADD1 genes are thought to cause a neurodevelopmental syndrome that includes autism and ADHD"

      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency. We have added new data indicating that a Pmec-7::rbm-26::scarlet transgene can rescue the beading phenotype and the PLM/ALM overlap phenotype (see Figure 3F-G).

      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify. We have added the L13V data to this experiment and now show the individual data points. In addition, we have now conducted this analysis at the L2, L3 and L4 stages (Figure 5C-E). We have also revised the text to indicate that loss of rbm-26 function causes mitochondrial dysfunction in the cell body which could potentially cause a reduction of mitochondria in the axon (see lines 100-101 and 268-270). We speculate that mitochondria in the axon are also dysfunctional. However, the mitoTimer signal is not bright enough in axons to allow for quantification.

      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots). This is Figure 6 in the new version. We have added new data for expression of mals-1 mRNA and protein in rbm-26(L13V) mutants (Figure 6B-D). We have also included quantifications from 3 biological replicates (Figure 6D).

      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided. Our Pmec-7::mals-1::scarlet transgene uses the tbb-2 3'UTR and causes an overexpression phenotype. To address the question posed by the reviewer, we would need to express MALS-1 at endogenous levels. Given that endogenous levels of MALS-1 are very low, it is unlikely that we would be able to visualize its expression. Nonetheless, as a way to address this question we have attempted to create a single copy Pmec-7::mals-1::scarlet transgene that utilizes the mals-1 endogenous 3'UTR. We have tried multiple approaches for generating this construct, but all have failed, likely due to sequence complexities within the mals-1 3'UTR. While we cannot say where the extra MALS-1 protein goes, we think that it is likely overloaded into the remaining mitochondria and could also be in the cytosol as well.

      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail. We have added a paragraph to the discussion explaining that mitochondria function could be disrupted by either MALS-1 overexpression or by MALS-1 loss of function (lines 471-480).

      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully. One likely reason for this difference is that tm12122 is predicted to cause a partial deletion of the mals-1 coding sequence, whereas the syb6330 is a full deletion. Thus, the tm12122 could be acting as a dominant negative. In fact, prior work on the MALSU1 ortholog has indicated that this protein is subject to interference by a dominant negative construct (see Rorbach et al, Nucleic Acids Res 2012). Nonetheless, we cannot rule out the possibility of a linked second mutation in tm12122. However, since we have found similar phenotypes and genetic interactions with both alleles, we can conclude that these phenotypes and interactions are due to loss of MALS-1, rather than a second mutation (albeit at a slightly different penetrance). We have added these considerations to the results section (lines 342-244).

      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided. This is Figure 8D in the new version. We have added the malsu-1 and rbm-26;malsu-1 double mutants to this experiment. We have also added quantification from multiple biological replicate blots. As pointed out by the other reviewer, we think that this experiment does not give specific information about mitoribosomes, but is an alternative approach to looking at the reduction in mitochondria. Given this limitation and considering that we have added L13V data to the mitochondria experiment in Figure 8B, we have elected not to add additional data on L13V to the western blot experiment in Figure 8D

      Minor comments: • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria are associated with neurodevelopmental disorders? A better explanation would help.

      We have changed this sentence to, "Some neurodevelopmental syndromes feature neurodegenerative phenotypes that occur during neuronal development."

      • L13V is generally more severe in axon overextension phenotype than P80L while protein level is more abundant. The authors should discuss this. We have also added a time course for the PLM/ALM overlap phenotype mutants (Figure 2D). These new data show that the PLM/ALM overlap is quite similar overall between the P80L and L13V mutants. Both of these mutations cause an increase in PLM/ALM overlap in early larval development that is resolved by the L4 stage. The P80L phenotype resolves slightly sooner for reasons that are unknown. This could reflect differences in expression within the PLM that are not reflected in the whole worm lysate. This could also be due to a slight difference in the genetic background or other stochastic factors. The key point is that these two independent alleles cause a similar phenotype overall, indicating that this phenotype is the result of loss of RBM-26 function.

      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided. We have added example measurements to the supplemental section (Figure S3). Additional detail on the measurements are in the Methods section (lines 517-518).

      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown. We have added a low magnification image (Figure S6) and have also added images of endogenously tagged RBM-26::Scarlet in the PLM (Figure 3A-C). The transgenic label for the hypodermis has been added to the legend of Figure S5.

      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section. This information has been added to the Methods section, "Auxin protein degradation".

      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used. Figure 4 has become Figures 4 and 5 in the revised version. We have updated the graphs to include dots for individual data points. We have added quantifications of the mitoTimer experiments for the L2, L3 and L4 stages (Figure 5C-E). We note that our other experiments were done at the L1, L2, L3, L4 and adult stages. The mitoTimer signal is not sufficient at the L1 stage for quantification. At the adult stage, the red signal becomes saturated. We have added representative images for mitoTimer in P80L and L13V mutants (Figure S9).

      • The genes malsu-1 and mrpl-58 are not listed on WormBase. If the authors would like to designate names to these genes, they should clearly indicate that along with the sequence name. We have changed malsu-1 to mals-1. In addition, both mals-1 and mrpl-58 have now been approved by WormBase and will be listed on the website upon its next update.

      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which requires additional verification). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly? This is Figure 8D-E in the new version. We have added new data showing that the decrease in MRPL-58 expression that is caused by the rbm-26(P80L) mutation is dependent on MALS-1. We concede that these experiments cannot be used to determine anything about the mitoribosomes per se, but rather serve as an alternative way of testing the effect of rbm-26 on mitochondria. We have revised the text accordingly (lines 355-357).

      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1. We have revised to, "MALS-1 is an ortholog of the MALSU1 mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module"

      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is quite a bit of a stretch.

      This has been replaced with, "Therefore, we speculate that human RBM26/27 could function with the RNA exosome complex to protect against neurodevelopmental defects and axon degeneration in infants." (lines 371-373)

      **Referees cross-commenting** Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too. Reviewer #2 (Significance (Required)):

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusion that rbm-26 inhibits malsu-1 and would be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature, which made my reading experience less pleasant. The phenotypic analyses are superficial compared to similar works and are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published. The reviewer's expertise: neurodevelopment in model organisms.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, the authors studied the role of an ASD-associated gene, rbm-26, in neuronal morphology using the touch receptor neuron PLM in C. elegans, and found that loss of function of rbm-26 causes overextension and the formation of bulb-like structures in the axon. Using UV-crosslinking RNA immunoprecipitation and RNA-Seq, they identify malsu-1 as a target of rbm-26. Genetic analyses suggest malsu-1 likely functions downstream of rbm-26 in controlling the PLM morphology.

      Major comments:

      • The authors describe RBM27 as associated with ASD and ID, while they only cite the SFARI paper that describes a weak association of RBM27 with ASD. Appropriate references that show a link between RBM27 and ID should be provided.
      • The SFARI database only has three mutations (P79L, R190Q, G348D) listed as ASD-associated. Where do the other mutations, L13V and R455H, come from, particularly L13V, which the authors used to generate the C. elegans mutant? Are they associated with intellectual disabilities?
      • The authors should be very careful when describing 'gene X causes Y diseases'. Many (if not all) of the examples described in this manuscript are disease-associated genes that have not been validated as causal genes.
      • The authors refer to the PLM axon beading and overextension phenotypes as 'axon degeneration and targeting defects'. The authors must provide additional evidence of axon degeneration (see below). Also, the term 'targeting defects' is misleading, as the authors did not examine whether overextension of the PLM axon causes targeting defects. At the least, they should examine some synaptic markers.
      • Neuronal phenotypes (axon overextension and beading) should be examined at different developmental timepoints (larval, young adult, and aged animals) to test if these phenotypes are indeed degenerative instead of developmental defects.
      • The authors use the blebbing (beading) phenotype in the axon as the sole evidence of neurodegenerative properties of the PLM neuron. A more thorough analysis of this phenotype as done by others (Pan PNAS 2006) must be provided to support the authors' claim that this phenotype represents neurodegeneration.
      • The number of beads per axon should be quantified to better represent the severity of rbm-26 mutant. Individual samples should be plotted in the quantification instead of showing the percentage of animals.
      • Based on the single gel image in Fig. 1C with no loading control, the P80L mutant appears to have no protein expression. How is the P80L viable while the null mutant is lethal? The authors should quantify the protein expression levels from multiple blots with proper loading controls. If P80L mutation is introduced into RBM-26::mScarlet strain can it cause depletion of the signal in vivo?
      • 'Moreover, loss of either the SPTBN1 or ADD1 genes causes a neurodevelopmental syndrome that includes autism and ADHD' References are missing, and as described above, be extra careful when indicating causality. Very few genes are known to cause ASD and ADHD.
      • Fig. 3E F, the authors should use the strains that express TIR1 specifically in the touch receptor neurons to argue cell autonomous function of RBM-26. Alternatively, the authors may conduct PLM neuron-specific rescue experiments to test the sufficiency.
      • 'Loss of RBM-26 causes mitochondria dysfunction in axons.' The authors did not examine mitochondria function in axons. They only examined the number of mitochondria, and ROS production in the soma. The authors should provide additional evidence to support the idea that elevated ROS production in the soma is due to mitochondrial dysfunction in axons. Also, the authors should use both P80L and L13V for this experiment, and indicate individual datapoint as dots. Here, they quantified at the L4 stage, which the authors should justify.
      • Figure 5B and C: the authors should also use L13V to quantify malsu-1 mRNA and protein level, and include quantifications in panel C (from multiple blots).
      • In the rbm-26 mutant, the number of mitochondria is reduced, while the amount of MALSU-1 protein is increased. If MALSU-1 is specifically localized at mitochondria in wild type, where does the excessive MALSU-1 go in the rbm-26 mutants? Quantification of MALSU-1 signal intensity should be provided.
      • Figure 7C: malsu-1 knockout mutants exhibit PLM overextension phenotype, which is not consistent with their model. The authors should discuss this in detail.
      • 'To validate these findings, we also repeated these experiments with an independent allele of malsu-1, malsu-1(tm12122) and found similar results (Fig. 7A-C).' The malsu-1(tm12122) exhibits beading phenotype and more severe overextension phenotype which the authors must describe and discuss more carefully.
      • Figure 8: The authors should include data from L13V, malsu-1 and rbm-26; malsu-1 mutants. Quantification from multiple blots should be provided.

      Minor comments:

      • 'Consistent with a role for mitochondria in neurodevelopmental disorders, some of these disorders include a neurodegenerative phenotype.' Why is it consistent to have neurodegenerative phenotypes if mitochondria are associated with neurodevelopmental disorders? A better explanation would help.
      • L13V is generally more severe in axon overextension phenotype than P80L while protein level is more abundant. The authors should discuss this.
      • Fig. 2E, F: 'Beading refers to focal enlargement or bubble-like lesions which were at least twice the diameter of the axon in size.' How are the diameters of axons measured? A more detailed quantification method, and examples of measurement should be provided.
      • Figure 3: The authors should also include low-magnification images to show where RBM-26 is expressed. The current image does not allow identifying cells. The transgene that labels the nuclei of hypodermis should be indicated in the manuscript. Specifically, the expression of the RBM-26 in the PLM should be shown.
      • Figure 3: 'Tissue specific degradation of RBM-26::SCARLET::AID was achieved due to cell-type specific TIR-1 driver lines (see methods for details).' This information is not provided in the method section.
      • Fig. 4 E. Values from individual samples should be indicated as dots. Representative images of P80L and L13V should be included. Conduct quantifications at adult stage as the authors use in other quantifications, or justify use of specific developmental stage (L3) they used.
      • The genes malsu-1 and mrpl-58 are not listed on WormBase. If the authors would like to designate names to these genes, they should clearly indicate that along with the sequence name.
      • The authors found that MRPL-58 amount is reduced in rbm-26 mutants (which requires additional verification). This can be explained by the fact that axonal mitochondria number is reduced in the rbm-26 mutants. How did the authors confirm that the reduction in MRPL-58 level is due to the disruption of mitoribosome assembly?
      • 'MALSU-1 is a mitoribosomal assembly factor that functions as part of the MALSU1:LOR8F8:mtACP anti-association module [37-39].' I don't think these are known for C. elegans MALSU-1.
      • 'Moreover, our results also suggest that disruption of this process can give rise to neurodevelopmental disorders.' I feel this is quite a bit of a stretch.

      Referees cross-commenting Yes, many of our comments overlap, and I fully agree with all comments from the other reviewer too.

      Significance

      I found the manuscript interesting, particularly the use of innovative techniques in identifying the target of RBM-26. The genetic analyses of rbm-26 and malsu-1 generally support the authors' main conclusion that rbm-26 inhibits malsu-1 and would be of potential interest to basic neuroscientists and cell biologists. However, the current manuscript looked premature, which made my reading experience less pleasant. The phenotypic analyses are superficial compared to similar works and are insufficient to support the authors' claim of 'axon degeneration and targeting defects'. A number of issues listed above should be addressed before this manuscript is published.

      The reviewer's expertise: neurodevelopment in model organisms.

    1. Author response

      The following is the authors’ response to the previous reviews

      eLife assessment 

      This work is an attempt to establish conditions that accurately and efficiently mimic a drought response in Arabidopsis grown on defined agar-solidified media - an admirable goal as a reliable experimental system is key to conducting successful low water potential experiments and would enable high-throughput genetic screening (and GWAS) to assess the impacts of environmental perturbations on various genetic backgrounds. The authors compare transcriptome patterns of plants subjected to water limitation imposed with different experimental systems. The work is valuable in that it lays out the challenges of such an endeavor and points out shortcomings of previous attempts. There was concern, however, that a purely gene expression-based approach may not provide sufficient physiologically relevant information about plant responses to drought, and therefore, despite improvements from a previous version, the new methodology championed by this work remains inadequate.

      Molecular biologists who study drought stress must make choices about which assays to use in their investigation. Serious resources and effort are put into their endeavor, and choice of assay matters. Our manuscript’s goal was largely practical: to guide molecular biologists employing transcriptomics in their choice of drought stress assay, and thus help ensure their work will discover transcriptional signatures of importance, and not those that may be an artifact from lowering water potential using chemical agents on agar plates.  

      We examine how different approaches of reducing water potential impact the Arabidopsis root and shoot transcriptome. Our manuscript shows that each method of reducing water potential has a different effect on Arabidopsis root transcriptome responses. We acknowledge that drought stress induces a complex physiological response that can vary depending on the method used. However, by comparing across assays, we find instances where a gene is downregulated by low water potential in one assay, and upregulated by low water potential in another assay. We feel it is only natural to question why this could be, and to hypothesize that it may result from secondary effects of the way low water potential is imposed. We note that comparative transcriptomics has been a standard approach for decades. We take it as the reviewer’s opinion that it may not be insightful, but it does not factually impact our findings.

      Reviewer #2 (Public Review): 

      This manuscript purports to develop a new system to study low water potential (drought) stress responses in agar plates. They make numerous problematic comparisons among transcriptome datasets, particularly to transcriptome data from a vermiculite drying experiment which they inappropriately present as representing an authentic "drought response" to the exclusion of all other data. For some reason, which the reviewer cannot fully understand, the authors seem intent on asserting the superiority of their experimental system to all others. They do not succeed in this and such an effort is ultimately a disservice to the field of drought research as a whole. 

      While they devote considerable effort to comparing transcriptome data among various experimental systems, the potentially more informative experiment at the end of the manuscript of testing growth responses of a number of Arabidopsis accessions is only done for their "LW" system. The focus of this manuscript on transcriptome data to the almost complete exclusion of other types of data is a symptom of a broader over-emphasis on transcriptome that unfortunately is quite prevalent in plant science now. It is worth reminding that for protein coding genes, which constitute the vast majority of genes, transcriptome data is a proxy measurement. The really important thing is protein amount, and even more so protein activity/function, which we know has an imperfect, at best, correlation with transcript level. We measure transcriptomes because we can, not because it is inherently the most informative thing to do. The authors' quixotic quest to see if the transcriptomes of different stress treatments match is of limited value and further diminished by their misleading presentation of one particular transcriptome data set (from their vermiculite drying experiments) as somehow a special data set that everything else must be evaluated against. This study sheds no new light on how to do relevant drought (low water potential) experiments in the lab.

      Although the reviewer acknowledges that the authors have made some effort to respond to previous comments, the fundamental flaws remain and the present version of this study is little improved from the first submission. 

      One challenge faced by the drought community is establishing consensus regarding the definition of drought itself. According to the criteria followed by the reviewer, any method leading to a reduction in water potential qualifies as drought stress. However, the findings presented in this manuscript demonstrate that transcriptional responses in roots vary considerably across five different methods of reducing water potential. This indicates that beyond responding to a change in water potential itself, root transcriptomes will also respond to the specific way low water potential is introduced. We believe this variability is of interest to the drought research community. 

      Of the five methods we explore, we hold the view that the gene expression changes induced by vermiculite drying are the most analogous to the expression signatures Arabidopsis would exhibit in response to low water potential in the natural environment. In contrast, we posit that Arabidopsis grown on agar plates - where the root system is exposed to air and light, and where water potential is lowered using chemical agents - may contain gene expression signatures that plant molecular biologists may not find particularly relevant. However, we acknowledge that this is our opinion, and will make this more explicit in our revised text.

      More broadly, we believe that the reviewer’s observation regarding the ‘over-emphasis’ on transcriptomics that is prevalent within the plant science community justifies, rather than diminishes, the work presented here. If transcriptomics is a commonly employed method, then we anticipate that the outcomes of this study will hold value for a broad audience. Such researchers are likely not only using transcriptomics as a proxy measure for protein abundance, as the reviewer suggests, but also because it is one of the more straightforward genomic techniques biologists can use to identify candidate genes that may be chosen for further scrutiny. 

      Reviewer #3 (Public Review): 

      Comments on revised version: 

      Specific previous criticisms that were addressed are: 

      (1) that gene expression changes were only compared between the highest dose of each stress assay. In the revised version, the authors changed their framework and are now using linear modelling to detect genes that display a dose response to each specific treatment. I agree that this might be a more robust approach to selecting genes that are specific to a certain treatment. 

      (2) that concentrations of PEG, mannitol, NaCl, and the "low water" agar which were chosen are not comparable in regards to their specific osmotic component. I appreciate that the authors measured the osmotic potential of each treatment. It revealed that both PEG and NaCl at their highest concentration had a much more negative osmotic potential compared to the other treatments. The authors claim that using ANCOVA they did not detect any significant differences between the treatments (lines 113, 114). I do believe that ANCOVA is not the appropriate test in this case. ANCOVA has an assumption of linearity, while the dose response between concentration and osmotic potential is non-linear. This is particularly evident for PEG (Steuter AA. Water potential of aqueous polyethylene glycol. Plant Physiol. 1981 Jan;67(1):64-7. doi: 10.1104/pp.67.1.64.). Since the treatments are not the same at the highest level, I think this could have effects on the validity of comparisons by linear model. One approach could be to remove the treatment level with the highest concentration and compare the results or adjust the treatments to the same osmolarity.

      (3) that only two biological replicates were collected for RNA sequencing which makes it impossible to know how much variance exists between samples. The authors added a third replicate in the revised version for most treatments. However, some treatments still have only two replicates, which cannot be easily seen from the text or the figure. I would prefer that those differences are pointed out. 

      (4) that the original manuscript did not explore what effect the increase of agar and nutrient concentration in the "low water" agar had on water potentials. The authors conducted additional experiments showing that changes in water potential were exclusively caused by changes in the nutrient concentration (Figure 2-figure supplement 5; lines 222-224). However, the increase in agar strength also had some effect on gene expression. While this is not further discussed in the text, I believe this effect of agar on gene expression could be similar to root responses to soil compaction.

      (5) That the lower volume of media in the "low water" agar could have an effect on plants. The authors compared these effects in Figure 2-figure supplement 7. They claim that "different volumes of LW agar media do not play a significant part in modulating gene expression". While I can see that they detected 313 overlapping DEGs, there were still 146 and 412 non-overlapping DEGs. The heatmap in subpanel E also shows that there were differences in particular in the up-regulated genes. My conclusion would be that the change in volume does play a role and this should be a consideration in the manuscript. 
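To put the overlap quoted in point (5) in proportion, the three counts can be turned into simple set-overlap fractions. This is only an illustrative calculation: it assumes that 146 and 412 are the DEGs unique to each of the two media volumes, and the review does not say which count belongs to which volume.

```python
# Counts quoted in point (5). Which volume each unique count belongs to
# is not stated in the review, so the names below are placeholders.
shared = 313
unique_first = 146   # assumed: DEGs detected in only one volume condition
unique_second = 412  # assumed: DEGs detected only in the other condition

union = shared + unique_first + unique_second

# Jaccard index: shared DEGs as a fraction of all DEGs seen in either set.
jaccard = shared / union

# Fraction of each condition's own DEG list that is shared with the other.
frac_first = shared / (shared + unique_first)
frac_second = shared / (shared + unique_second)

print(f"union={union}, Jaccard={jaccard:.2f}")
print(f"shared fraction of each set: {frac_first:.2f}, {frac_second:.2f}")
```

Under these assumptions, roughly a third of all DEGs are common to both volumes, which is the quantitative basis for the concern that volume is not negligible.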

      We thank the reviewer for their suggestions. We plan to resubmit the manuscript reflecting the requested changes. Specifically, we will:

      -       Detail more thoroughly the effects of agar volume on gene expression changes elicited by LW agar treatment.

      -       Investigate whether the tensile stress introduced by hard agar is similar to soil compaction by an analysis with existing literature.

      -       Assess more rigorously the suitability of the ANCOVA model for assessing water potential changes of different media types.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer 1

      The paper is overall convincing. However, a little more attention to data presentation and possibly the addition of at least another technique (see below) would greatly strengthen the findings.

      As we hope to demonstrate below, we have taken steps to improve our manuscript on both fronts (data presentation and experimental evidence).

      The absence of statistics immediately catches the eye. I am sure that the shown differences are statistically significant (thanks to the number of analyzed cells), but reporting the result of some statistical test would help the reader identify the relevant data in a plot. This is somewhat necessary considering that sometimes in the text something is deemed to be "significant" or "not significant", and I felt that I really needed that when looking at the plot in Fig. 3D.

      To facilitate the interpretation of figures that contain data from multiple strains (such as the one mentioned by the reviewer), we have carried out a nonparametric single-step multiple comparison test (Games-Howell) to identify mutants whose means differ significantly from each other. To avoid overcrowding the figures, we have graphically summarized the p-values of all pairwise comparisons in a small matrix within the corresponding panel, and provided 99% confidence intervals and p-values of all differences in the Supplement.

      Related to the previous point: for every N/C distribution analysis, a number of analyzed cells is reported. By the way it is written, it seems that the replication relies solely on the cells in that specific population, i.e.: each cell is treated as a replicate. At least I could not find if that is not the case in the legends or in the methods. I wonder what the results would be (and their significance) if each replicate were a new assay on another population.

      Cell populations exhibit significant variability in their phenotypic characteristics. Consequently, the quantification of a specific feature (e.g., the Sfp1 nuclear/cytoplasmic ratio) across a sample of cells from a given population results in a distribution rather than a single fixed value. For each quantification, we report the number of cells that were used to construct the corresponding distribution, i.e. the sample size. To compare samples from different populations (e.g., different Sfp1 mutant strains), we run them in parallel during microscopy experiments and compare their means, as described above. Throughout our study, we have tried to ensure that we quantify a sufficiently large number of cells to overcome cell-to-cell variability and enhance the reliability of our results.

      In this context, the question of the reviewer is not entirely clear to us, as individual measurements of a sample are not replicates. However, one can replicate the entire experiment on a different day by re-growing the different strains, running microscopy, quantifying the new movies etc. In this sense, the experiments shown in the manuscript consist of single replicates, i.e. experiments that were carried out on the same day, with all the relevant mutants and controls quantified together. However, we have monitored many of our mutants multiple times over the course of our work. For example, Fig. 1 below shows replicates of the Sfp1 N/C ratio distributions at steady-state in the analog-sensitive (A) and wild-type (B) background, which were quantified several times across various experiments. Day-to-day variability in the empirical distributions of the same mutant exists, but it is quite small.

      The scale of x axes in N/C ratio plots. Besides not being consistent throughout the figures, it originates from 1, visually enhancing the differences.

      We believe the reviewer was referring to the y-axes, as the x-axes represent time. Summarizing the N/C ratio dynamics of different Sfp1 mutants has been challenging. First, the average N/C ratios at steady-state vary considerably across different mutants, as shown in the panels that summarize steady-state N/C ratios. To compare the magnitude and features of their responses, normalization is necessary. We chose to normalize the time series of each mutant to have a mean of 1 prior to the onset of a perturbation. This allows the normalized time series to represent the percentage-wise changes in the Sfp1 N/C ratio upon perturbation.
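      The normalization described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function name, the time points, and the sample N/C values are hypothetical, and the only assumption taken from the text is that each time series is divided by its mean before the perturbation onset.

```python
from statistics import fmean


def normalize_to_baseline(times, ratios, onset):
    """Scale an N/C ratio time series so its pre-perturbation mean equals 1."""
    # Samples recorded strictly before the perturbation define the baseline.
    baseline = [r for t, r in zip(times, ratios) if t < onset]
    if not baseline:
        raise ValueError("no samples before the perturbation onset")
    scale = fmean(baseline)
    return [r / scale for r in ratios]


# Hypothetical series: perturbation applied at t = 3 (arbitrary time units).
times = [0, 1, 2, 3, 4, 5]
ratios = [3.0, 3.2, 2.8, 2.4, 2.1, 2.1]
normalized = normalize_to_baseline(times, ratios, onset=3)
print([round(v, 3) for v in normalized])
```

      After this scaling, a normalized value of 0.8 reads directly as a 20% drop relative to the pre-perturbation mean, regardless of the absolute steady-state N/C ratio of the strain.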

      Using a common y-axis scale for all plots of N/C ratio dynamics is not ideal, as some responses are subtler than others. Additionally, we do not believe that N/C dynamics across different figures need to (or should) be compared to each other. However, within a figure, panels that require comparison are placed in the same row and share the same y-axis scale. We believe that this approach optimizes data visualization and facilitates important visual comparisons.

      Related to the previous point: it is evident from the plots that the N/C ratio is always positive, even in the most deficient of the analyzed mutants. This implies that a relevant fraction of Sfp1 is still nuclear. I thus wonder what the impact of these mutations would be on the actual function of Sfp1. For this reason, I feel that qPCR evaluation of transcripts of Sfp1 target genes is particularly needed. Since lack of Sfp1 is known to yield some of the smallest cells possible, it would also be cool to have an estimate of the size of mutants where Sfp1 is less nuclear. These analyses could confer phenotypical relevance to the data, but would also help in assessing a currently unexplored possibility, that phosphorylation events by PKA influence Sfp1 function besides its localization, i.e.: the still somehow nuclear fraction is not as functional as wt Sfp1 in promoting transcription.

      It is indeed the case that the recorded N/C ratios are larger than 1 in all strains that we have monitored. We have never observed an N/C ratio smaller than 1 using widefield microscopy for two main reasons: first, out-of-focus light from the cytosol above and below the nucleus is added to the nuclear signal, causing the nuclear signal to always be non-zero, even for predominantly cytosolic proteins. Second, both in-focus and out-of-focus vacuoles are devoid of the fluorescent protein fusions that we quantify, which reduces the average brightness of the cytosol. For these reasons, even when a protein is largely cytosolic, the average N/C ratio over a cell population is no lower than around 1.5. Keeping these points in mind, one can observe that our most delocalized Sfp1 mutants have an N/C ratio that is around 1.6-1.7, which is very close to the lower limit. This means that these Sfp1 mutants are largely cytosolic, and the nuclear fraction (if non-zero) is quite small.

      We agree that assessing the phenotypic relevance of Sfp1 mutations is of interest. However, this was impossible with our original strains, as we introduced each Sfp1 mutant as an extra copy in the HO locus while leaving the endogenous Sfp1 locus intact. This was done in order to avoid any phenotypic changes that might result from changes in Sfp1 activity.

      To address the suggestion of the reviewer, we therefore deleted the endogenous Sfp1 copy in strains carrying sfp1PKA2A, sfp1PKA2D and sfp113A, leaving only the mutated Sfp1 copy at the HO locus. Surprisingly, the growth rate and drug sensitivity (determined by halo assays) of these single-copy mutants differed little from those of the mutants carrying the functional Sfp1 copy or of the wild type (Supp. Figs. 4J and 7). This observation aligns with findings for the single-copy sfp1-1 mutant in [Lempiäinen et al. 2009], which corresponds to sfp1TOR7A in our work. [Lempiäinen et al. 2009] had suggested that Sch9 compensates for the loss of Sfp1 activity via a feedback mechanism, which could explain our results as well. If this is the case, acute depletion of wild-type Sfp1 could unveil transient changes in cell growth before the compensatory effect of Sch9 is established. Unfortunately, we were unable to efficiently degrade wild-type Sfp1 carrying a C-terminal auxin-inducible degron. Instead, we followed the same approach as [Lempiäinen et al. 2009] and deleted SCH9.

      As we describe in the last section of Results, the difference was dramatic for sfp113A mutants, which were extremely slow-growing in the absence of Sch9 (doubling time was around 4 hours, but it was hard to estimate because we could not grow the cells consistently). Interestingly, SCH9 deletion had a negative impact on sfp1PKA2D but not sfp1PKA2A cells (Supp. Fig. 7). Overall, these results demonstrate that Sch9 can compensate for loss of Sfp1 activity, which makes it challenging to study the impact of Sfp1 mutations on cellular phenotypes.

      To further understand to what extent Sch9 compensates for loss of Sfp1 phosphorylation, we carried out RNA-seq on WT cells and cells carrying a single copy of sfp113A (with the endogenous SFP1 copy removed). Despite the fact that sfp113A cells grow as well as WT, RNA-seq picked up several differentially expressed genes related to amino acid biosynthesis. This surprising finding is presented in the last section of Results, and in Supplementary Figures 8, 9 and 10. We explore the relevance of these results and their connection with past literature on Sfp1 and Sch9 in the Discussion section.

      I found some typos here and there, and it would greatly help in reporting them if line numbers were included in the manuscript.

      We apologize for the typos. We have tried to eliminate them, and we have also added line numbers to the manuscript.

      Reviewer 2

      There is no biochemical evidence presented that the putative PKA sites (S105 and S136) are genuinely phosphorylated by PKA. The fact that they match the PKA consensus motif, alone, does not guarantee this. In order to claim that they are looking at the effect of PKA by mutagenizing these residues, the authors have to demonstrate the PKA-dependency of S105 and S136 phosphorylation by, for example, mass spec experiments or western blotting with phospho-specific antibodies (Cell Signaling Technology #9624 for example). Also, is the band-shift caused by PKA inhibition (Fig 3C) canceled by the S105A/S136A mutation?

      We took several actions to demonstrate that the putative PKA sites are indeed phosphorylated by PKA. We first tried to detect Sfp1 phosphorylation using the antibody mentioned by the reviewer, but failed as the sensitivity of this antibody appears to be quite low. On the other hand, mass spectrometry did not produce the right fragments to detect the sites of interest. We therefore resorted to an in vitro kinase assay using [γ-32P]ATP together with purified PKA and Sfp1. Unfortunately, bacterial overexpression of MBP-tagged Tpk1, Tpk2 and Tpk3 (the catalytic subunits of PKA) was quite challenging and we were unable to produce soluble protein. We therefore resorted to commercially available bovine PKA (bPKA, PKA catalytic subunit, Sigma-Aldrich 539576), which shows high homology to the yeast Tpk kinases [Toda et al. 1987]. Moreover, 87% of bPKA substrates have been shown to also be Tpk1 substrates [Ptacek et al. 2005], and bPKA has been used to identify new Tpk substrates in budding yeast [Budovskaya et al. 2005]. As we show in the revised manuscript, bovine PKA does phosphorylate Sfp1. Moreover, phosphorylation is reduced by 50% in the double S105A, S136A mutant (Fig.1F), and becomes undetectable in the 13A mutant (Supp Fig. 6). Together with the rapid response of Sfp1 localization to acute PKA inhibition which we had already reported, we believe that these results provide strong evidence that Sfp1 is a direct PKA substrate, and that the two phosphosites that we identified are functional.

      As the above in vivo experiments do not exclude S105/S136 phosphorylation by other kinases downstream of PKA, in order to claim the direct phosphorylation, the authors need in vitro PKA kinase assay. These biochemical experiments are not trivial, but I think absolutely necessary for this story.

      One cannot exclude that S105/S136 are also phosphorylated by other kinases of the AGC family (note that [Lempiäinen et al. 2009] has already excluded Sch9). However, as we hope to have shown, PKA indeed phosphorylates Sfp1. Examining if other kinases besides PKA and TORC1 target Sfp1 is a very interesting question that should be addressed in future work.

      The authors only look at the localization of Sfp1. To assess its functionality and so physiological impact, it would be informative to measure the mRNA level of target ribosomal genes in various Sfp1 mutants they created.

      As we described in our response to Reviewer 1 above, we did perform RNA-seq on WT and cells carrying a single copy of sfp113A. We observed a notable absence of differentially expressed ribosomal genes and ribosome-related categories in the GO analysis (Supp. Figs. 8, 9 and 10). Together with our observations on SCH9 deletion (Supp. Fig. 7), these results suggest that Sch9 can largely compensate for the loss of Sfp1 activity. On the other hand, the emergence of differentially expressed amino acid biosynthesis genes is a finding that merits further investigation, as it connects with previous observations made with Sch9 deletion mutants and the [ISP+] prion form of Sfp1 (cf. Discussion).

      In the experiments using analog-sensitive PKA (Fig 1D and E for example), they directly compare wildtype-PKA versus analog sensitive-PKA, or with 1-NM-PP1 versus without 1-NM-PP1. This makes interpretation difficult, particularly because 1-NM-PP1 itself has a significant impact even in the wild-type PKA strain. The real question is the difference between wild-type Sfp1 versus mutant Sfp1. In the current form, they compare Fig 1D versus 1E, but these two do not look like a single, side-by-side experiment. They should compare wild-type Sfp1 versus mutant Sfp1 side-by-side.

      Figure 1D shows that 1-NM-PP1 has a transient off-target effect on Sfp1 localization in WT cells, which could also affect Sfp1 mutants. This observation prompted us to use wild-type PKA as a control when testing the effect of 1-NM-PP1 on sfp1PKA2D in cells carrying PKAas (Figure 1E). As Fig. 1E shows, the effect of 1-NM-PP1 on sfp1PKA2D localization in PKAas cells is quite similar to the off-target effect in cells carrying sfp1PKA2D and wild-type PKA. This behavior of sfp1PKA2D is clearly different from the response of wild-type Sfp1 to PKAas inhibition, which results in sustained delocalization. We have made the latter observation repeatedly, both in this study and our previously published work [Guerra et al. 2021].

      In Figure 3, the argument around the additive effects of PKA and TORC1 is confusing. The authors say they are additive, referring to Figure 3E, but say they are not additive, referring to Figure 3B. Which is true? In fact, Figure 3B appears to show an additive effect as well.

      We did not use the word "additive" in the text, because we find it difficult to interpret. Instead, we state that PKA and TORC1 appear to control Sfp1 phosphorylation independently of each other. PKA and TORC1 phosphorylation converges to the same response, affecting Sfp1 localization. It appears that loss of either kinase delocalizes Sfp1, while loss of both kinases may only have a small additional effect.

    1. Josh is, by the way, a philosopher and a neuroscientist, so this gives him special powers. He doesn't sort of sit back in a chair, smoke a pipe and think, "Now why do you have these differences?" He says, "No, I would like to look inside people's heads, because in our heads we may find clues as to where these feelings of revulsion or acceptance come from." In our brains.

      AUTHORITY ??

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      * Summary *

      * The nucleus is recognised as a core component of mechanotransduction with many mechano-sensitive proteins shuttling between the nucleus and cytoplasm in response to mechanical stimuli. In this work, Granero-Moya et al characterise a live fluorescent marker of nucleocytoplasmic transport (NCT) and how it responds to a variety of cues. This work follows on from the authors' previous study (Andreu 2022) where they examined the response of passive and active NCT to mechanical signalling using a series of artificial constructs. One of these constructs (here named Sencyt) showed a differential localisation depending on substrate stiffness, accumulating in the nucleus on stiffer substrates (which the authors previously showed was due to differences in mechano-sensitivity of passive versus facilitated NCT). Here the authors use Sencyt as a tool to probe how different cues affect NCT and thus nuclear force-sensing in two different cell lines (one epithelial, one mesenchymal). *

      * They have established a 3D image segmentation pipeline to measure both the nuclear/cytoplasmic ratio of Sencyt and 3D nuclear shape parameters. As a proof-of-principle, they show that hypo-osmotic shock (which inflates the nucleus and would be expected to increase nuclear tension) and hyper-osmotic shock (which shrinks and deforms the nucleus) alter Sencyt nuclear-cytoplasmic ratio as expected. They then show that inhibiting acto-myosin, which would be expected to block force transduction to the nucleus, reduces NCT, although interestingly this is without any changes to nuclear morphology. They then examine how cell density affects NCT and show that Sencyt localisation correlates only weakly with density but much more strongly with nuclear deformation (especially as measured by solidity). This is surprising considering that mechano-sensitive transcription factors such as YAP have been shown to exit the nucleus at high cell densities. Therefore, the authors directly compare Sencyt and YAP nucleo/cytoplasmic localisation and show that Sencyt behaves differently to YAP with YAP localisation correlating strongly with cell density. This reveals an added layer of complexity in YAP regulation beyond pure changes to NCT. *

      * Major points *

      The data presented throughout this work are high quality and rigorous. The controls used are appropriate (including the use of a freely diffusing mCherry to illustrate the specificity of the Sencyt probe in osmotic shock experiments - figure S2). Experiments are properly replicated and the statistical analysis is appropriate. The data are beautifully presented in figures and the manuscript is well written and very clear. Overall this is a high quality work.

      We thank the reviewer for the positive assessment of the manuscript.

      * The discussion is careful and the conclusions are supported by the data. My only small concern is that the authors place too much emphasis on how this work is in 'multicellular systems' as opposed to their previous work in single cells (for example "Here, we demonstrate that mechanics also plays a role in multicellular systems, in response to both hypo and hyper-osmotic shocks, and to cell contractility. L212). Cell density is only controlled in figures 3 and 4 and in some of the earlier experiments, cells look quite sparse (eg Figure 2). It's also debatable how far a monolayer of cancer cells, which lack contact inhibition of growth, is a multicellular system. Furthermore, the authors don't specifically look at cell/cell adhesion or observe major differences between the epithelial or mesenchymal lines. For this reason, the authors should tone down this discussion before publication. *


      We agree with the reviewer that properly assessing cell-cell adhesion is important in the context of the work. To this end, we have stained for E-cadherin in both cell lines. As expected and as described previously, the results confirm that MCF7 cells do have clear cadherin-mediated cell-cell adhesions, with a cadherin staining localized specifically in cell-cell junctions. Also as expected, C26 cells show much lower cadherin expression, without a clear pattern. Further confirming this difference, MCF7 cells show clearly distinct actin organizations in their apical and basal sides, whereas C26 cells do not. Thus, we believe that the two cell models do represent a reasonable assessment of epithelial versus mesenchymal phenotypes, in a multicellular context. The data are presented in new supplementary fig. 1, and discussed in page 3 of the manuscript (first paragraph). We have also included a paragraph in the discussion to comment on the differences between cell types (page 7, 2nd paragraph).

      * Optional experimental suggestions: For me, the most compelling finding is that nuclear deformation has a greater correlation with NCT than cell density and that this is different from the behaviour of YAP. To cement the importance of nuclear deformation, the authors could induce deformation in single cells, for example by culture on very thin micropatterned lines and assess the localisation of Sencyt and YAP. It would also be interesting to assess the role of force transduction in this context or in different densities by removing actin, which affects NCT without inducing nuclear shape changes. These functional experiments would allow the authors to draw stronger conclusions about the role of nuclear shape and deformation but they aren't necessary for publication. *


      This is a very interesting suggestion. Following the reviewer's advice, we have now carried out experiments in which we have seeded cells on micropatterns of different sizes, and measured both sencyt and YAP ratios. In C26 cells, we have found as expected that increasing spreading leads to progressive nuclear deformation (as measured through nuclear solidity) and progressive increase in both sencyt and YAP ratios. Interestingly, cell spreading in MCF7 did not affect nuclear solidity, sencyt ratios, or YAP ratios. This further confirms the relationship between nuclear deformation and nucleocytoplasmic transport, and shows as well that different cell lines have different sensitivities. The lack of response of MCF7 cells is consistent with the lower sencyt response, and lower sencyt/nuclear shape correlation measured in fig. 4. It suggests that MCF7 cells may have mechanisms to shield the nucleus from deformation, something which we have reported in a different context (Kechagia et al., Nat. Mater. 2023). The new results are reported in new fig. 3, and supplementary fig. 8, and discussed in pages 5 (1st paragraph) and 6 (1st paragraph) of the manuscript results.
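      To make the solidity measure mentioned above concrete, the sketch below computes it for a 2D outline as the enclosed area divided by the area of its convex hull. This is a simplified illustration only: the manuscript measures solidity on 3D segmented nuclei, where the corresponding quantity is volume over convex-hull volume, and the function names and example outlines here are hypothetical rather than taken from the authors' pipeline.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; vertices must be ordered along the outline."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def solidity(outline):
    """Area of the outline divided by the area of its convex hull."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Hypothetical outlines: a convex square and the same square with a notch.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
notched = [(0, 0), (2, 0), (2, 2), (1, 1), (0, 2)]
print(solidity(square), solidity(notched))
```

      A convex outline has solidity 1, and indentations lower the value: the square gives 1.0 and the notched outline 0.75.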


      Minor points

      * - I'd like to see better examples of 3D reconstructions of nuclei (ie fig 1C but bigger) in different conditions. This is especially important in figure 3 where it would be helpful to see examples of nuclei with high or low solidity. The differences in oblateness are clear to see from the images in 3a and 3f but solidity could be better illustrated. *


      We have now added 3D reconstructions as requested, which illustrate the nuclear shape changes that take place. This is shown in figs. 1, 4 (which corresponds to figure 3 in the previous version of the manuscript), s3, and s7.

      * - Where Sencyt index is plotted, it would be clearer to add labels to at least figure 1 which indicate whether it is more cytoplasmic or nuclear. *

      We have done this as requested in figure 1.

      * Reviewer #1 (Significance (Required)): *

      * In this work, Granero-Moya et al characterise a new tool for measuring NCT and show that it is mechanically regulated. Given the importance of NCT in mechano-transduction, this tool will be a great asset to the mechano-biology community and will likely be adopted by multiple groups in the future. The findings about the effects of cell density on NCT and differences from YAP are interesting but could be further fleshed out. This work is likely to be of greatest interest to a specialised audience working in the fields of mechano-biology and nuclear transport. *


      We thank the reviewer for the positive assessment.


      *Reviewer #2 (Evidence, reproducibility and clarity (Required)): *

      * The study conducted by Granero Moya and colleagues describes the application of a synthetic protein which is observed to enter the nucleus in response to mechanical strains, rather than being influenced by cell density. However, the novelty of this work is minimal since the conceptual framework and the utilization of this identical or similar tool have been previously reported by the same team in earlier publications. *


      We respectfully disagree with the assessment of the reviewer. Please see below for a detailed response regarding novelty.


      *In their experiments, they employ this GFP-based sensor, referred to as Sencyt, in cells subjected to osmotic shocks. These shocks are highly stressful and impact a range of cellular processes, including stress response pathways MAPK and others; Osmoregulatory pathways; cell cycle regulations, autophagy and death pathway; ion channel regulations and others. The second findings are on cells treated with a combo of drugs affecting the actin cytoskeleton. The justification for using a combination of two specific drugs remains unclear, as the study does not adequately explain the rationale behind this choice. Additionally, there is a lack of information regarding the full range of targets these drugs affect. This raises questions about the comprehensiveness and applicability of the findings, as understanding the complete scope of the drugs' targets is crucial for interpreting the results within a minimal frame of physiological context. *


      The two drugs used are paranitroblebbistatin (a photostable version of blebbistatin) and CK666. We apologize for not explaining in more detail the action of these drugs, both of which have been characterized and used extensively in the literature. Paranitroblebbistatin binds to myosin, preventing its ATPase activity and therefore impairing actomyosin contractility (https://doi.org/10.1002/anie.201403540). It acts on different myosin isoforms, including non-muscle myosin II, the main type of myosin responsible for actomyosin contractility in non-muscle cells. CK666 binds to and inhibits Arp2/3, a protein complex responsible for nucleating branched actin (https://doi.org/10.1016/j.chembiol.2013.03.019). This impairs lamellipodial formation and therefore cell spreading (see for instance https://doi.org/10.1371/journal.pone.0100943).

      The rationale for using both drugs in combination was explained in page 4 of the manuscript. In our previous work, we determined that myosin inhibition with blebbistatin is not sufficient to inhibit nuclear mechanotransduction. Indeed, in an epithelial context, we observed that due to reduced contractility, blebbistatin-treated epithelial cells in fact spread more on their substrate. This leads to more deformed (flattened) nuclei, leading to the counterintuitive result that YAP nuclear localization increases rather than decreases. If cell spreading is impaired by interfering with branched actin nucleation, then this spreading is prevented, and the combination of drugs leads to reduced nuclear deformation, and reduced YAP nuclear localization (see supplementary fig. 7 in Kechagia et al, Nat. Mater. 2023, https://doi.org/10.1038/s41563-023-01657-3). Similar results had been published previously by the group of Clare Waterman (https://doi.org/10.1074/jbc.M115.708313).

      Thus, the combination of drugs was designed to ensure that we were impairing nuclear mechanotransduction. Of course, we agree with the reviewer that all perturbations have potential side effects. Osmotic shocks will affect a range of cellular processes (as mentioned in the discussion of the manuscript), and any drug treatment can potentially have off-target effects. However, the fact that two orthogonal perturbations with different potential side effects (osmotic shocks versus actomyosin-targeting drugs) lead to the same effects in sencyt strongly suggests that the effect is mediated by mechanics, and not other factors. To reinforce this, we have now added an additional mechanical manipulation: seeding cells on micropatterned islands of different sizes. As spreading increases, cells are known to increase actomyosin contractility, and nuclear deformation (https://doi.org/10.1529/biophysj.107.116863, https://doi.org/10.1073/pnas.0235407100, https://www.nature.com/articles/ncomms1668, https://doi.org/10.1073/pnas.1902035116). As expected, nuclear solidity, sencyt ratios, and YAP ratios all increased with cell spreading. Interestingly, this occurred only for C26 and not MCF7 cells, where no changes were measured in solidity, sencyt, or YAP. The lack of response of MCF7 cells is consistent with the lower sencyt response, and lower sencyt/nuclear shape correlation measured in fig. 4. It suggests that MCF7 cells may have mechanisms to shield the nucleus from deformation, something which we have reported in a different context (Kechagia et al., Nat. Mater. 2023).

      The new results are shown in figs. 3 and s8. We have also expanded the explanation of drug treatments in page 4 (3rd paragraph).

      * The novelty is on the specificity of this synthetic fusion protein for these manipulations and not on cell density. Yet, the reasons behind this selective response remain unexplained, potentially attributable to the unique characteristics or sensitivity thresholds of their synthetic probe. As a comparison, they examine YAP localization, which is sensitive to both inputs, but this is also already published (fig4). The focus is anyway on Sencyt, for which they offer simple observations and quantifications. *


      The main novelty of the work lies in the characterization of the role of nucleocytoplasmic transport in mechanotransduction, in the context of multicellular systems. We and others had shown that nucleocytoplasmic transport responds to mechanical force in the context of single cells (see for instance Andreu et al. 2022 from our group, but also https://doi.org/10.1126/science.abd9776 from the Martin Beck group). However, to what extent this applies to multicellular systems was unknown. It is true that in multicellular systems, the response of YAP and other mechanosensitive transcription factors has been characterized (such as in our Elosegui-Artola 2017 paper, mostly done at the single cell level but including one figure panel on epithelial cell monolayers). The reviewer argues here and in the consultation comments with other reviewers (see below) that this demonstrated the role of nucleocytoplasmic transport in multicellular systems. However, we respectfully disagree. As also noted by reviewer 3 in the consultation, the response of YAP, and of any transcription factor, may include effects on nucleocytoplasmic transport, but will also likely include effects caused by the complex biochemical signalling pathways that regulate them. Disentangling such effects requires a sensor that only responds to nucleocytoplasmic transport, and this is precisely what Sencyt provides.

      The reviewer also states that our manuscript does not explain why sencyt responds to mechanics and not cell density. We disagree: sencyt responds to mechanics for the reasons explained in our previous work (Andreu et al., Nat. Cell Biol. 2022), and there is no reason to expect a specific response to cell density. In this regard, we don't think there are any sensitivity thresholds to detect cell density, as the probe is not designed to sense this parameter in the first place. The fact that YAP responds to both mechanics and cell density shows that the response to density cannot be merely explained by mechanics, and is rather due to signalling through other means. Of course, we agree that we do not explain the mechanism by which YAP senses cell density, but we think this lies clearly out of the scope of our manuscript.

      In terms of novelty, our work also characterizes a tool to assess nucleocytoplasmic transport live in cells. We agree with the reviewer that the specific construct had been reported in our previous paper, but it had not been characterized in detail. This is done here, enabling its use by the community as a tool to measure nucleocytoplasmic transport in any context, be it related to mechanics or not.

      • *

      When reviewing the figures presented, I find it challenging to detected marked differences, despite their quantitative data suggesting otherwise.


      We assume here that the reviewer refers to differences in sencyt nuclear localization, that is, the sencyt index. We have now checked the example images showing changes in sencyt index, in figures 1 and 2. In figure 1, the example cells under hypo-osmotic shocks increase their sencyt index from 1.2 to 1.45 (C26). In figure 1, the example cells under hyper-osmotic shocks decrease their sencyt index from 0.9 to 0.3 (MCF7) and from 1.4 to 0.5 (C26). In figure 2, the example cells increase their sencyt index upon drug washout from 0.2 to 1.4 (MCF7) and from 0 to 0.9 (C26). Of course, these individual values do not exactly match the average values, but they do faithfully reflect the reported average trends and their magnitudes. Here we note that even though sencyt changes with the different treatments, it is always more nuclear than cytosolic (sencyt index >0, as it has an NLS). Thus, to the naked eye, sencyt always seems to show a "bright" nucleus, and it is hard to intuitively see changes in its localization. Further, we also note that osmotic shocks lead to overall changes in fluorescence levels due to volume changes (as GFP molecules get diluted or concentrated in hypo- or hyper-osmotic shocks, respectively). This does not affect ratiometric quantifications, as assessed with our mCherry control, but means that changes in ratios are hard to see by eye. To help in this visualization, we have now changed the images from green to grayscale, which is better perceived by the human eye. We have also specified the issue of fluorescence intensity changes in the legend of the figure.

      In addition to this, we have seen that there is indeed a case in which the examples did not follow average trends. In the case of hypo-osmotic shocks in figure 1, example MCF7 cells were barely changing their sencyt index with treatment. We apologize for choosing this non-representative image, and we have now changed the figure to show more representative cells.

      Furthermore, the study attempts to correlate the behavior of Sencyt with the nuclear geometric parameter of solidity, a connection that seems to lack a clear basis in cell biology and could potentially lead to misconceptions.

      Mechanical effects on nucleocytoplasmic transport are mediated by mechanical tension application to nuclear pores, which are embedded in the nuclear membrane (nuclear envelope). Whereas nuclear envelope tension is very challenging to measure directly, it can be indirectly related to nuclear shape. Indeed, a tense membrane will tend to even out membrane irregularities and appear rounded, whereas a membrane under low tension will tend to show wrinkles. Nuclear solidity is a geometric parameter that compares actual nuclear volume to the volume of the convex hull (intuitively, the volume of the smallest wrinkle-free object containing all of the nucleus). Thus, it is the geometric parameter that best reflects the presence of wrinkles, folds or irregularities, and as such the one that should best correlate to membrane tension. Of course, this correlation is not perfect, and there could be many situations in which changes in membrane tension may not directly affect nuclear solidity. But we do believe that solidity is the geometrical parameter that should best reflect membrane tension, and this is why we focus on it. Consistent with our hypothesis, solidity is the geometrical parameter that best correlates with sencyt. To further clarify this, we now explain this rationale in detail in page 4 of the manuscript (1st paragraph).
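      To make the definition of solidity concrete, the following is a minimal sketch in two dimensions (area of an outline divided by the area of its convex hull), whereas the measurements discussed here are described in terms of volumes. The function names and the two toy outlines are assumptions chosen for illustration and are not taken from the manuscript or its analysis pipeline.

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon given as ordered (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a counter-clockwise turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(outline):
    """Area of the outline divided by the area of its convex hull (1.0 = no indentations)."""
    return polygon_area(outline) / polygon_area(convex_hull(outline))


# Hypothetical outlines: a smooth square and the same square with one inward fold.
smooth = [(0, 0), (4, 0), (4, 4), (0, 4)]
wrinkled = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]

print(solidity(smooth))    # 1.0
print(solidity(wrinkled))  # 0.75
```

      In this toy case the fold removes a quarter of the hull area, so solidity drops from 1.0 to 0.75, which illustrates why the parameter is sensitive to wrinkles and folds rather than to overall size.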

      * Reviewer #2 (Significance (Required)): *

      * In sum, I think the MS is of interest for a very specialistic audience. There are no clear interpretations. The work is done in one or two cellular model systems in vitro; and the general significance of these observations is of very limited impact and no novelty. *

      We strongly disagree. The study is done on two cellular models, one with epithelial and the other with mesenchymal phenotype, and is thus highly relevant for multicellular systems. Following suggestions by reviewers 1 and 2, we have now characterized the epithelial/mesenchymal behaviour of the cell types in detail (see supp. fig. 1). The results are novel in that they demonstrate the role of nucleocytoplasmic transport in multicellular systems, something which, as argued above, had not been done before. The difference with YAP, and the disentanglement between transport and signalling, is also novel. Finally, we believe the manuscript will be impactful because of this novelty, but also because of the availability of sencyt as a tool for the community. In fact, since posting this manuscript on bioRxiv, we have received many requests (directly and through Addgene) to share sencyt, which is currently being used in several labs across the world.


      *Reviewer #3 (Evidence, reproducibility and clarity (Required)): *


      In this very well-written manuscript, Pere Roca-Cusachs and colleagues investigated the response of nucleocytoplasmic transport (NCT) to mechanical stress and tested whether this response is similar in epithelial and mesenchymal cells using a combination of quantitative approaches. This study builds upon their earlier findings, which elegantly demonstrated that NCT is sensitive to mechanical forces transmitted to the nuclear membrane. Using a similar approach to their recent work, they quantitatively analyzed NCT and compared the two cell types using various treatments that impact nuclear membrane tension. The study is straightforward and experimentally sound, with an adequate number of replicates and independent experiments. While one might consider the limitations given their previous work, none have demonstrated that NCT is mechanosensitive in epithelial cells. Additionally, they provide a simple approach to measure NCT, which should be of interest in the field. However, it is unclear how the authors defined the epithelial phenotype in this work and whether they solely based this characterization on the tissue/cell's origin. Epithelia can be defined ultrastructurally with reference to their apico-basal polarity and specific cell-cell junctions (Alberts et al., 1994; Davies and Garrods, 1997). Changing cell density should affect cell/cell adhesion, but the authors provide no evidence that the cells tested in the study are attached to their neighbors on all sides and form an epithelium. While I recognize that the objective of this study is not to mimic the in vivo behavior of epithelial tissue, the authors should at least ensure that cells form a monolayer by quantitatively assessing cell-cell junctions (or they should adjust their conclusions adequately). This control is specifically important for Figure 3 and 4, whose objective is to test the impact of cell/cell contacts. 
      But it would also be important to provide this essential control for Figures 1 and 2, as it is unclear from the images provided if MCF7 cells are forming an epithelium (and form cell/cell junctions).


      We thank the reviewer for the positive assessment of our work. We fully agree with the reviewer that properly assessing cell-cell adhesion is important in the context of the work. To this end, we have stained for E-cadherin in both cell lines. As expected and as described previously, the results confirm that MCF7 cells do have clear cadherin-mediated cell-cell adhesions, with a cadherin staining localized specifically in cell-cell junctions. Also as expected, C26 cells show much lower cadherin expression, without a clear pattern. Further confirming this difference, MCF7 cells (but not C26 cells) show a clear apico-basal polarization, with distinct actin organizations in their apical and basal sides. Thus, we believe that the two cell models do represent a reasonable assessment of epithelial versus mesenchymal phenotypes, in a multicellular context. The data are presented in new supplementary fig. 1. We have also included a paragraph in the discussion to comment on the differences between cell types (page 7, 2nd paragraph).

      * Reviewer #3 (Significance (Required)): *


      The mechanosensitivity of NCT is an important question central to many aspects of cell biology. One might consider the impact of the proposed work limited, given their previous research. However, none have demonstrated that NCT is mechanosensitive in epithelial cells, making it a crucial question that needs to be addressed. Additionally, they provide a simple approach to measure NCT, which should be of interest to a broad audience.

      We thank again the reviewer for this positive assessment.


      *Referees cross-commenting *

      * Here comments from all 3 reviewers are reported *

      * Reviewer 1: *

      * I disagree with R2's comment that there is 'no novelty' here. Although this work is going to be of greater interest to a specialised rather than general audience, it characterises in depth a simple tool to measure NCT which will be useful for mechanobiology field. Also, using 'two cellular model systems in vitro' is very standard in the field when assessing subcellular processes like NCT. Using this approach in vivo would be very interesting but challenging and would be an entirely different study . *


      *I agree with R2's comments that the authors should better justify their combination of two actin inhibitors and R3s point on better assessing cell/cell junctions. *


      We thank the reviewer for these comments. Both issues have been addressed, as described in the response to reviewers above.

      * Reviewer 2 *

      * About Reviewer 3's comments, I believe it's a stretch to highlight the strength and novelty based on "NCT's mechanosensitivity in epithelial cells has not been demonstrated". There are thousands of papers on the Hippo pathway, which is known to be mechanosensitive, and on the regulation of YAP, which enters the nucleus in Hippo-inhibited conditions and exits to the cytoplasm in Hippo-induced cells, including downstream of mechanical signals. The phenomenon of nuclear-cytoplasmic shuttling being a common event from neurons to endothelial and multiple types of epithelial, immune, and fibroblast cells is already established through NCT of this and other endogenous proteins. This is simply an accepted fact. Moreover, the Nature Cell Biology 2022 paper made a very general claim, with no warning that its conclusions could be cell-type specific. In the Artola 2017 Cell paper they also showed NCT in mammary epithelial cells. We should definitively conclude that NCT's mechanosensitivity in epithelial cells has been well demonstrated. *


      We disagree with this assessment, for the same reasons also exposed by reviewer 3 below. Previous work on YAP and other transcription factors cannot be seen as a demonstration of the role of nucleocytoplasmic transport per se. The localization of any transcription factor is highly regulated by complex signalling pathways, and can be affected by many factors. One of them is nucleocytoplasmic transport, but signalling events (for instance through phosphorylation) could change localization by promoting binding to cytosolic or nuclear binding partners, by promoting protein degradation, by masking nuclear localization signals, and others. To isolate the role of nucleocytoplasmic transport, a probe sensitive only to this factor should be designed. This is exactly what sencyt provides. In fact, this has allowed us to answer an important open question: is the sensitivity of YAP to cell density mediated by mechanics and nucleocytoplasmic transport, or is it mediated by some other factor? Our results suggest that some other factor, likely mediated by the Hippo pathway and not necessarily mechanotransduction, explains this sensing of cell density. This is a novel finding, which was not provided in either our Elosegui-Artola 2017 paper or our Andreu 2022 paper.

      * About Reviewer 1: I find it challenging to grasp the point made in the comment. On novelty, in their previous study in NBC 2022 Sencyt was already shown to undergo NCT. The reviewer states that the study presents "a simple tool to measure nuclear-cytoplasmic transport (NCT) beneficial for the mechanobiology field", and evidence that this demonstrates a novel layer of regulation in hippo signaling (also because this is observational and not a mechanistic study). The tool in question is far from simple. Its application requires transfection into cell cultures, conducting live imaging, etc. If one aims to measure NCT of endogenous proteins, straightforward immunofluorescence or live imaging of endogenous proteins (like GFP-tagged YAP, Twist, Smads, etc.) using the same experimental setup should suffice to demonstrate relevance, without necessitating any additional experiments. What then, is the unique benefit of this proposed tool? Given it's an artificial construct combining NLS-GFP with a bacterial protein, questions arise about the effects of the forced nuclear localization signal (NLS) or the bacterial component. It is an empirical artificial construct and there is no mechanism to explain its behavior. The comparison of Sencyt with YAP seems to me questionable and of limited utility. *

      As also noted by reviewer 3 below, the use of genetically encoded fluorescent sensors that require transfection is by now absolutely standard in biology, and cannot be considered to be "far from simple". And as stated above, imaging of endogenous transcription factors (which also requires transfection if it is done live) does not isolate the role of nucleocytoplasmic transport. We also disagree that "there is no mechanism to explain its behaviour". Sencyt was developed in our previous Andreu et al. 2022 paper, where the mechanism is explained in detail.


      *It's unsurprising that an artificial construct only mirrors some aspects of what is considered a genuine mechanosensitive protein. The utility of a synthetic tool lies in its ability to replicate actual phenomena, not in what it fails to do. In comparison to their NBC 2022 study, this manuscript focuses on what their reporter fails to detect. *

      We disagree that a synthetic tool is only useful if it replicates the behaviour of endogenous proteins. A synthetic tool, precisely due to its engineered, artificial nature, can be made to respond only to specific factors (in this case, nucleocytoplasmic transport). This can then be used to disentangle the role of such specific factors, as done here.

      The osmotic shock was the assay in their 2017 Cell paper. Here they demonstrate that a combination of Blebbistatin+CK (an unclear choice of drugs) is ineffective, as is cell density. Are there other specific peculiarities associated with this construct?

      Here, we note that our osmotic shock experiments in our 2017 paper were done for YAP (not nucleocytoplasmic transport in general). Regarding the choice of drugs, please refer to our answer to the reviewer comments above for a full explanation. Also, we want to clarify that this combination is not ineffective, as it leads to clear changes in sencyt.

      * My other concern is on the minor quantitative changes reported, which seem inconsistent with the provided representative images, where significant differences are difficult to appreciate. For instance, the claim that the transfected sensor differs from an endogenous NCT protein, YAP, after cell density treatment, is hard to detect in their images. In Figure 4, comparing YAP and Sencyt in C26 cells, YAP appears uniformly nuclear at high cell density, potentially more nuclear than the synthetic sensor, which is not coherent with their claim. *


      Regarding the concern of the minor changes seen in images, please refer to our full response to the reviewer comments above. Regarding the comparison between sencyt and YAP, we want to clarify that in our manuscript we do not compare the absolute values of nuclear localization between YAP and sencyt. As the reviewer notes, these are two different proteins, so which one is more nuclear does not really provide useful information. So whether YAP is more or less nuclear than sencyt is unrelated to (not incoherent with) our claim. What we state in figure 4 is that YAP responds to cell density, whereas sencyt does not. This is clear from the quantifications and also from the images.

      * From the Hippo perspective, there is really an unusual amount of nuclear YAP left in their cells. This should be almost completely cytoplasmic from prior contact inhibition studies in the Hippo field. Sencyt could be simply less sensitive than YAP in these borderline conditions. Although there's a more noticeable cytoplasmic noise in dense cells with YAP compared to Sencyt, this could be attributed to several factors, including differences in protein degradation rates, which I suspect to be quicker for a synthetic protein. From a technical perspective it is complex to draw strong conclusions after comparing two things so unrelated to each other. One is a live GFP detection and the other is a staining by immunofluorescence. The nature of the background is also different, and so conclusions from comparisons between unrelated systems are not justified. *

      In conditions of high density, average YAP ratios are close to one (zero in logarithmic scale, as reported in the figures) for MCF10A cells, so there is no nuclear localization. This is similar to what we and others have previously reported in similar conditions (Elosegui-Artola et al. 2017, Kechagia et al. 2023, for example). In C26 cells, YAP levels at high density are a bit higher. This is likely due to their mesenchymal nature, and therefore diminished cell-cell contact inhibition (as assessed in detail in this revision). This in fact further suggests that the response of YAP to cell-cell contacts is different from a mere mechanical factor, supporting our hypothesis. Regarding the issue of noise, background noise is removed from quantifications, and potential noise coming from non-specificities or autofluorescence is also cancelled by the fact that we compute fluorescence ratios between nucleus and cytoplasm (and not absolute values). Thus, we don't think noise is an issue. Further, we note again that we do not directly compare values between sencyt and YAP.
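      As an illustration of why a nuclear-to-cytoplasmic ratio of one corresponds to zero on a logarithmic scale, and of how background subtraction enters before the ratio is taken, a minimal sketch follows. The function name, the choice of a base-2 logarithm, and the intensity values are assumptions made for this example only; the exact quantification is described in the methods of the manuscript.

```python
import math


def localization_index(nuclear, cytoplasmic, background=0.0):
    """Log2 of the background-subtracted nuclear/cytoplasmic intensity ratio.

    0 means equal intensity in both compartments; positive values mean the
    signal is more nuclear, negative values more cytoplasmic.
    The base-2 logarithm is an assumption of this sketch.
    """
    nuc = nuclear - background
    cyt = cytoplasmic - background
    if nuc <= 0 or cyt <= 0:
        raise ValueError("intensities must exceed the background level")
    return math.log2(nuc / cyt)


# Hypothetical mean intensities (arbitrary units), with a background of 50.
print(localization_index(150, 150, background=50))  # 0.0: ratio of one
print(localization_index(250, 150, background=50))  # 1.0: twice as nuclear
```

      Because the index depends only on the ratio, a uniform change in brightness that scales both compartments by the same factor after background subtraction leaves it unchanged.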

      * This suggests caution on what is heralded as the main claim here put forward. *

      * Reviewer 1: *

      *I do have some sympathy with R2s comments in the consultation. I agree that showing that NCT is mechanosensitive in an epithelium is not new. I also agree that sometimes it is difficult to see the quantitative differences by eye. This second point could be addressed by including more details of the segmentation and analysis in the supplemental material (along with some example images). *


      We thank the reviewer for the suggestions. Regarding the novelty, please see above for a detailed discussion, and also the comments of reviewer 3 below (previous work studied not NCT but transcription factors, affected by many parameters). Regarding quantitative differences, we have now addressed this issue by showing images in grayscale rather than green, and also by replacing one example cell in figure 1 which indeed did not reflect the average measured trends. We now also show examples of 3D rendered images of the nuclei in different conditions. We have also gone through the methods and clarified in detail how ratios are calculated and how segmentation is performed.

      * Regarding novelty, I would be interested to know if R2 thinks that there are experiments that the authors could do to improve the work. Or do they need to simply tone down their claims? It's perfectly acceptable to publish a well characterised tool with a series of observations and it's beneficial to the community to do so.*

      * Reviewer 3 *

      * Thanks to Reviewers #1 and #2 for using this consultation option; I truly appreciate their feedback on my comments and find it extremely valuable. I agree with Reviewer #1 that the method proposed here is relatively simple. Transfecting cells and conducting live fluorescent imaging can hardly be considered difficult. I believe the construct used/designed by the authors is the main advantage as it provides a specific way to quantitatively assess NCT and not limit the analysis to a single nuclear protein (such as YAP). Reviewer #2 suggests using immunofluorescence staining of YAP or live imaging of fusion fluorescent protein (following transfection) to analyze NCT, but this approach would yield a readout not only based on NCT but also on the many other interacting partners/mechanisms that regulate the candidate localization, resulting in an unspecific readout (and similar transfection/live imaging set-up). *


      We thank the reviewer for this comment. We fully agree and have elaborated on this in our responses above.

      * Regarding the impact of the study, I agree that it is certainly not as impactful as previous publications on this topic. Although I find reviewer#2 argument on Yap irrelevant, as YAP is not the main focus of this paper. Some experiments have been done with cells of epithelial origin, but NCT mechanosensitivity has not been clearly tested in epithelial monolayer, which is the main claim of the proposed study here. The 2017 Cell paper focused on YAP transport into the nucleus (and not NCT in general) and they showed a correlation between YAP nuclear localization and traction force in MCF10A. I am not sure if one would say that "NCT mechanosensitivity has been well demonstrated in epithelial cells" based on this single panel. The impact of the proposed study is certainly not outstanding but offering a thorough analysis in epithelial cells (as monolayers and not as individual cells) and presenting a well-defined experimental approach should be of interest in the field. I agree with comments from reviewer#2 that some reported effects in graph are unclear on main images. More experimental details should hopefully clarify this aspect.*


      We fully agree with the reviewer. Regarding quantitative differences, we have now addressed this issue by showing images in grayscale rather than green, and also by replacing one example cell in figure 1 which indeed did not reflect the average measured trends.

    1. Author response:

      The following is the authors’ response to the original reviews.

      (1) The authors should show i) whether the variants exhibit the same surface expression as wildtype and ii) whether changes of surface expression (e.g. wt transporter expressed low and high) alters growth rates under conditions where growth depends on amino acid uptake. The authors say that the uptake of radioactive substrate and the overall fitness coincide (Figures 5 and 6), but it would be good to quantify the correlation, perhaps by using a scatterplot and linear regression.

      We thank the reviewer for the questions and proposals. We have added a comparison of surface expression between the transporter variants to the manuscript (Figure 3- Figure supplement 1 and 2). For the AGP1 variants, surface expression of the evolved mutants is similar to that of the wild-type, indicating that transporter overexpression has no impact on the growth rate per se. The same analysis for the PUT4 variants showed a significant difference, with the PUT4-S variant seemingly expressed more than the wild-type. However, this does not seem to account for the effect of the mutation on uptake of the original substrates Ala, Gly and GABA, since transporter activity for these substrates is substantially decreased in the evolved variant (Figure 5). Thus, the variation in surface expression between the mutant and the wild-type, which could be attributed to the small sample size and the inherent limitations of the analysis (imaging of a culture with cells in different planes), is not expected to interfere with the reported results.

      Additionally, as advised, we added to the manuscript a scatterplot with a linear regression line relating overall fitness to the uptake of 2 mM radioactive substrates (Figure 5- Figure supplement 2). For both 2 mM Phe and 2 mM Glu, the regression model, with fitness as the predictor and uptake rate as the dependent variable, explains 60-70% of the variation in amino acid uptake rate across the different variants.
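      For readers who want to see what "explains 60-70% of the variation" means operationally, the sketch below fits an ordinary least-squares line and reports the coefficient of determination (R²). The function name and the numbers are hypothetical placeholders, not data from the manuscript; they only show how the reported quantity is computed.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x, plus R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Placeholder values: relative fitness (predictor) and uptake rate (response)
# for a handful of imaginary transporter variants.
fitness = [0.2, 0.4, 0.5, 0.7, 0.9, 1.0]
uptake = [0.5, 0.3, 0.9, 0.8, 1.6, 1.2]

slope, intercept, r_squared = linear_fit(fitness, uptake)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

      An R² of 0.6 to 0.7 in such a fit means that roughly two thirds of the variance in uptake rate across variants is captured by a straight-line dependence on fitness, with the remainder left unexplained by the model.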

      (2) The authors should further investigate to what extent the (over)expression of wildtype versus variant transporters impacts growth rates. I would recommend such experiments being done under conditions where nitrogen uptake does not depend on amino acid uptake. I could imagine that some of the fitness data are confounded by the general effects of mutations on growth rates. More concretely, I could imagine that overexpression of e.g. the AGP1-G variant is less of a burden for the yeast cells and would allow to grow them better in general. This could explain why its overall fitness is close to wt, whereas other variants exhibit diminished fitness (Fig. 4A).

      The growth curves of all transporter variant cultures in the absence of selection for amino acid uptake have been presented in Figure 4- Figure supplement 1. As proposed, the growth rates of the variants in medium with ammonium as nitrogen source were calculated and presented in Figure 3- Figure supplement 1 and 2. For both AGP1- and PUT4-expressing variants, statistical analysis showed no significant difference between the mutants and the wild-type.

      (3) It is quite remarkable that the PUT4-S variant has such a dramatically enlarged substrate spectrum. In addition, the fitness losses for Alanine and GABA are rather small. This striking finding asks the question of why yeast has not evolved this much better/more efficient variant in the first place?

      We thank the reviewer for this very good question. We now included an explanation in the Discussion, but to give a short answer here: One should keep in mind that we used a 10-gene deletion strain to select for given mutants. Wild-type cells have a wide spectrum of substrates through the use of many amino acid transporters, and their regulation is intricately tuned to achieve optimum transport under any environmental circumstance. Broadening the spectrum of a single transporter thus would not lead to increased fitness. On the contrary, it would probably throw off this fine balance.

      (4) It would be generally interesting which types of selections (transporter/amino acid combinations) were tried (maybe as part of the methods section). I could imagine that the examples that are shown in the paper are the "tip of the iceberg", and that many other trials may have failed either because the cultures died, or the identified clones would grow faster due to mutations outside of the plasmid. It would be helpful for researchers planning such experiments in the future to be made aware of potential stepping stones.

      The issues raised here are spot-on, as we actually did test the evolution of PUT4 towards transport of amino acids other than the two mentioned in the report. Aside from the successful Asp and Glu, we ran parallel cultures selecting for transport of Gln, Thr, Trp, Tyr, and Cit. None of these evolution regimes led to increased growth phenotypes that were linked to the evolved gene, and we did not investigate these cultures further. At this point, we cannot fully explain this result, which is why we decided to omit it from the report. The L207S variant of PUT4 was later shown to indeed support growth on Gln, Thr, and Cit. Therefore, we speculate that this mutant did not emerge in the respective evolution cultures because the fitness gain on these amino acids was not large enough for it to be sufficiently enriched in the course of the evolution trial. Given that the Δ10AA strain still harbors nine amino acid transporter genes in its genome, it is conceivable that upregulation of some of these genes causes growth on some amino acids, prohibiting the selection of mutations in PUT4 (e.g., by mutations outside the plasmid, as the reviewer aptly suggested). We deemed these (negative) results not appropriate for the manuscript, as our main focus was characterizing the fitness effects of single mutations, not the laboratory evolution process of obtaining the mutants.

      (5) The authors took a genetic gain-of-function approach based on random mutagenesis of the transporter. In such approaches, it is difficult to know which mutation space is finally covered/tested, and information that can be gained from loss-of-function analyses is missed. Accordingly, the outcome is somewhat anecdotal. To provide an idea of the mutational landscape accessible, the authors could perform NGS of cultures without any selective pressure, and report the distribution of missense variants in the population.

      We very much appreciate the interest in the details of the mutagenesis. Based on the information given in the original OrthoRep publications (e.g., Ravikumar et al., DOI: 10.1016/j.cell.2018.10.021; mutation rate approx. 10⁻⁵ per generation and nucleotide), we calculated the expected number of mutations per passage in our experiments. For AGP1, it is about 5000 mutational events per passage (10 mL culture volume and 1:200 dilution), and for PUT4, it is about 1000 mutational events per passage (2 mL culture volume and 1:100 dilution). At a gene length of about 2000 bp, we expect to cover most single mutations already in the first or second passage (in the absence of selection). This is reflected in the result that the strongly beneficial mutation L207S in PUT4 was recovered in every selection on Asp or Glu we tested. We included this information in the Methods section.

      That said, the present study was consciously designed to research gain-of-function mutations, as we wanted to know if and how membrane transporters can evolve new substrate specificities without losing the original functions. Our approach was chosen to reflect as close as possible a natural scenario where a microorganism encounters a new ecological niche (a new nutrient to be transported). At the same time, we included selective pressure to keep the capacity to thrive in the original niche (to assimilate an ancestral nutrient). This approach is designed to specifically select against any loss-of-function mutations, which is in line with most modern theories about evolution of protein function (excellently reviewed in Soskine and Tawfik, DOI: 10.1038/nrg2808). We find that this approach gives a good idea how transporters could evolve new functions in a natural setting. By engineering single mutations in the wild-type background of the transporters, we show the fitness effects of different single mutations - this finding thus does not depend on the mutational landscape that is covered in the experiment.

      (6) The authors do not discuss the impact of these mutations on transport rates/kinetics, which are known to play a role in substrate selection in solute carriers (https://www.nature.com/articles/s41467-023-39711-y). Do the authors think ligand binding/recognition is more important than kinetic selection in the evolution of function?

      Indeed, the observed phenotypes can stem from both changes in transport rate and changes in substrate binding. In our opinion, both are perfectly possible explanations for the behavior of evolved transporter variants. We do not discuss this in the manuscript, as the weak transport of the novel substrates in the wild-type transporters did not allow us to unambiguously assign one or the other. Yet, we can offer minor circumstantial evidence pointing towards substrate affinity being the more important factor in evolving a new activity in transporters: overall transport rate (for original substrates) declined in most evolved transporters. Therefore, it is somewhat less likely that an improved transport rate allowed novel substrates to be used as nutrients. However, this is not to say that both processes cannot occur (even side by side).

      (7) Ultimately, what are the selective pressures that drive transporter function? The authors pose this question but don't fully develop the idea. Would promiscuous variants still be selected for if the limiting nitrogen source was taken up by the cell via a different pathway (i.e. ammonium or perhaps arginine)?

      Evolution and regulation of transporters is a very complex system, and we simplify this system in our single-transporter/single-amino acid approach. In nature, the selective forces are assumed to be much smaller than in our system, and multiple selective pressures might occur at the same time (maybe even in opposite directions). Therefore, such predictions are beyond the scope of the present study. To put it shortly, yeasts (and other organisms) have evolved the capacity to transport all natural amino acids. Yet, to actually allow fine-tuned regulation of transport of each individual amino acid, narrow- and broad-range transporters have evolved, including a lot of redundancy. This means that the question posed cannot be answered by yes or no, but by “it depends”.

      (8) Amino acids are a special class of metabolites, in that they all have the same basic structure. Thus, transport systems really only need to recognize the amino and carboxyl groups with high fidelity, and can modulate the side chain binding site to increase specificity. This was demonstrated in a bacterial APC transporter (https://www.nature.com/articles/s41467-018-03066-6#Sec2). Is this why the APC fold is largely responsible for AA uptake in biology?

      Indeed, typically, APC-type amino acid transporters bind the amino and carboxyl groups in the same position by backbone interactions. Therefore, this might be an ancestral feature of the APC superfamily and explain why this group represents the main group of amino acid transporters.

      (9) There isn't much discussion on the location of the mutations with respect to binding site vs. gating helices. Are there hotspots of mutations within the APC, and areas where variation is poorly tolerated? It would be helpful to briefly review what is known about mutations that change amino acid specificity in the APC family. My impression is that other studies applying rational mutagenesis have also shown that single-site mutations in the binding pocket alter substrate specificity - are these analogous to the L207 in PUT4? PUT4: I64T comes up in 3 of 5 selections. Did the authors consider a closer analysis of this mutation, and if not, why?

      We agree that it would be helpful to determine hotspots of mutations in APC transporters that lead to changes in selectivity. However, we feel that the current literature does not provide enough data to support an extended analysis of such hotspots. Likewise, the natural sequences of APC transporters are not similar enough to determine which residues are responsible for a certain selectivity profile. There are, however, some studies on site-directed mutagenesis, as mentioned by the reviewer. A short summary of those is discussed in the revised paper. Interpreting the previous studies in the light of our results suggests that the sites that emerged from our evolution experiments play a significant role in substrate selectivity and transporter function within the APC superfamily.

      As to the question why we did not include the I64T mutation in our experiments: this mutation lies within the poorly defined N-terminus of the protein, which is not part of the transmembrane core. We therefore deemed this residue as probably not connected to the specificity of the protein; it might be related to the protein’s stability in the cell, as the termini of transporters are known to be important for post-translational regulation, especially vacuolar degradation.

      (10) What do we learn about the APC fold that informs our understanding of where substrate specificity arises in this fold? Do the authors think all SLC folds are equally capable of adaption, or are some more evolutionary-ready than others? An evolutionary analysis of these transporters to gain insights into whether the identified substitutions also occurred during natural evolution under real-life conditions would further strengthen the manuscript. Could the authors provide a sense of how similar the 18 yeast amino acid transporters are, such as sequence alignments or a matrix of pairwise sequence identity/similarity? Are they very diverged, or is the complement of amino acid substrates covered by a rather conserved suite of transporters?

      We do not want to make bold statements about adaptive evolution in other SLC folds, but we consider it not unlikely that a similar approach will lead to similar conclusions in other transporters. As advised, a pairwise identity matrix was added to the manuscript (Figure 1–figure supplement 2).
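
      As a rough illustration of what a pairwise identity matrix contains, the sketch below computes percent identity between already-aligned sequences. The sequence names and residues are hypothetical toy data, not the yeast transporters from the manuscript, and the choice to skip gapped columns is an assumption; the authors' actual alignment and scoring method is not stated here.

```python
from itertools import combinations


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(aligned: dict) -> dict:
    """Symmetric matrix of pairwise identities, keyed by (name, name)."""
    names = list(aligned)
    matrix = {(name, name): 1.0 for name in names}
    for a, b in combinations(names, 2):
        value = pairwise_identity(aligned[a], aligned[b])
        matrix[(a, b)] = value
        matrix[(b, a)] = value
    return matrix


# Hypothetical toy alignment (not real transporter sequences).
toy_alignment = {
    "transporter_A": "MKT-LLGA",
    "transporter_B": "MKTALLGA",
    "transporter_C": "MRS-LIGA",
}
matrix = identity_matrix(toy_alignment)
```

      In this toy case, transporter_A and transporter_B are identical over all ungapped columns, whereas each shares 4 of 7 compared positions with transporter_C.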

      As to the proposed analysis focusing on natural occurrence of the mutations we found: we have indeed looked into this, but have not found evidence of such mutations. This is actually expected, as our selection regime puts “unnatural” selective pressures on a single transporter in isolation, which in reality co-evolved with a whole suite of other transporters that already have the capacity to transport all amino acids. Therefore, it is unlikely that the same mutations would happen in a natural setting. Our study is designed to capture evolution where a completely novel substrate is encountered, for which no transport mechanism has evolved yet.

      (11) Throughout: some of the bar graphs show individual data points, but others do not (Figure 3, Figure 5). These should be shown for all experiments.

      We thank the reviewer for the comment. In the revised version of the manuscript, we included individual data points in all bar graphs.

      (12) For bar graphs in which no indication of significance is shown, does this mean that p>0.05? Comparisons that are not significant (p>0.05) should be indicated as such.

      We thank the reviewer for the comment. In the revised version of the manuscript, we indicated in the legends that in cases of no significant difference (p > 0.05) between the wild-type and the evolved variants, no asterisks are shown.

      (13) Figure 5, Figure 6: Are the three confocal images just three different fields of view? It might be useful to include a zoom-in on a single representative cell, as it is hard for the reader to evaluate the membrane localization.

      In the revised version of the manuscript, we clarified that the three confocal images represent three different cultures, as each variant was tested in triplicates. We also included a zoom-in of a representative cell, as suggested.

      (14) In the main text, page 9, the conditions used for each experimental evolution are not clear ("nitrogen limiting mixture of amino acids (1 mM final concentration)"). I think this is an important detail, since the mixtures are quite different for the more promiscuous vs. the more selective transporter, and it would be helpful if this was described more clearly in the main text.

      We thank the reviewer for the comment. We have included further clarification in the revised manuscript.

      (15) Figure 1-Supplement 1 and Figure 4 Supplement 4 - can't read the figure labels. Try labeling columns and rows rather than individual plots.

      We have taken the proposal into account and revised the proposed Figures accordingly.

      (16) Page 9: "The transporter gene was sequenced and re-introduced into Delta-10AA cells." Was the plasmid isolated, sequenced, and re-introduced, or was the gene cut-and-pasted into a new vector backbone?

      In the revised manuscript we have clarified that the gene was sequenced and then cloned into the expression vector and re-introduced into naïve Δ10AA cells.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors report a high-quality genome assembly for a member of Xenacoelomorpha, a taxon that is at the center of the last remaining great controversies in animal evolution. The taxon and the species in question have "jumped around" the animal tree of life over the past 25 years, and seemed to have found their place as a sister-group to all remaining bilaterians. This hypothesis posits that the earliest split within Bilateria includes Xenacoelomorpha on the one hand and a clade known as Nephrozoa (Protostomia + Deuterostomia) on the other, and is thus referred to as the Nephrozoa hypothesis. Nephrozoa is supported by phylogenomic evidence, by a number of synapomorphic morphological characters in the Nephrozoa (namely, the presence of nephridia) and lack of some key bilaterian characters in Xenacoelomorpha, and by the presence of unique miRNAs in Nephrozoa.

      The Nephrozoa hypothesis has been challenged several times by the authors' groups who alternatively suggest placing Xenacoelomorpha within Deuterostomia as a sister group to a clade known as Ambulacraria. This hypothesis (the Xenambulacraria hypothesis) is supported by alternative phylogenomic datasets and by the shared presence of a number of unique molecular signatures. In this contribution, the authors aim to strengthen their case by providing full genome data for Xenoturbella bocki.

      The actual sequencing and analysis are technically and methodologically excellent. Some of the analyses were done several years ago using approaches that may now seem obsolete, but there is no reason not to include them. As a detailed report of a newly sequenced genome, the manuscript meets the highest standards.

      The authors emphasize a number of key findings. One is the fact that the genome is not as simple as one might expect from a "basal" taxon, and is on par with other bilaterian genomes and even more complex than the genome of secondarily simplified bilaterians. There is an implicit expectation here that the sister group to all Bilateria would represent the primitive state. This is of course not true, and the authors are aware of this, but it sometimes feels as though they are using this implicit assumption as a straw dog argument to say that since the genome is not as simple as expected, X. bocki must be nested within Bilateria. The authors get around this by acknowledging that their finding is consistent with a "weak version of the Nephrozoa hypothesis", which is essentially the Nephrozoa phylogenetic hypothesis without implicit assumptions of simplicity.

      We were NOT suggesting that Xenacoels are ‘basal’, though others have certainly done so. We were testing, instead, whether their supposed simplicity is reflected in the composition of the genome.

      Another finding is a refutation of the miRNA data supporting Nephrozoa. This is an important finding although it is somewhat flogging a dead horse, since there is already a fair amount of skepticism about the validity of the miRNA data (now over 20 years old) for higher-level phylogenetics.

      The absence of bilaterian microRNAs was one of the early pieces of evidence excluding the Xenacoelomorpha from Nephrozoa. Our new data are an important refutation of this source of evidence and add to the picture that this phylum is not lacking characters of Bilateria, as had been suggested (missing microRNAs and Hox genes were explicitly interpreted in this way).

      The finding that the authors feel is most important is gene presence-absence data that recovers a topology in which X. bocki is sister to Ambulacraria. The problem is that the same tree does not support the monophyly of Xenacoelomorpha. This may be an artifact of fast-evolving acoel genomes, as the authors suggest, but it still raises questions about the robustness of the data.

      In sum, the authors' results and analyses leave an open window for the Xenambulacraria hypothesis, but do not refute the Nephrozoa hypothesis. The manuscript is a valuable contribution to the debate but does not go a significant way towards its resolution.

      The manuscript has gone through several rounds of review and revision on a preprint server and is thus fairly clear of typos, inconsistencies and lack of clarity. The authors are honest and open in their interpretation of the results and their strengths.

      We thank the reviewer for their assessment of our manuscript. We have responded to some of the points they make above. As there were no specific points to edit or change raised by reviewer 1, we are replying in detail only to reviewer 2. We would like to note that we have modified the text, and thus the focus of our manuscript, in accordance with what we think reviewer 1 is suggesting in the last two paragraphs of their review.

      Reviewer #2 (Public Review):

      The manuscript describes the genome assembly and analysis of Xenoturbella bocki, a worm that bears many morphological features ascribed to basal bilateria. The authors aim to analyse this genome in an attempt to determine the phylogenetic position of X. bocki as a representative of Xenacoelomorpha and its associated acoelomorphs. In doing so, they want to inform the debate as to whether xenacoelomorph belong among, or is in fact paraphyletic to all bilaterians.

      This paper presents a high-quality assembly of the X. bocki genome. By virtue of the phylogenetic position of this species, this genome has considerable scientific interest. This assembly appears to be highly complete and is a strength of the paper. The further characterisation of the genome is well executed and presented. Solid results from this paper include a comprehensive description of the Hox genes, miRNA and neuropeptide repertoire, as well as a description of the linkage groups and how they relate to the ancestral linkage groups.

      Where this paper is weaker is that for the central claims and questions of this paper, i.e,. the question of the phylogenetic position of xenacoelomorph and whether X. bocki is a slowly evolving, but otherwise representative member of this clade, remains insufficiently resolved.

      The authors have achieved the goal of describing the X. bocki genome very well. By contrast, it is unclear, based on the presented evidence, whether xenacoelomorph is truly a monophyletic group. The balance of the evidence seems to suggest that the X. bocki genome belongs within the bilateria group. However, it is unclear as to what is driving the position of the other acoels. Assuming that X. bocki and the other two species in that group are monophyletic, then the evidence will favour the authors' conclusion (but without clearly rejecting the alternatives).

      This paper will likely further animate the debate regarding this basal species, and also questions related to the ancestral characters of bilateria as a whole. In particular the results from the HOX and paraHOX clusters, may provide an interesting counterpoint to the previous results based on the acoels.

      We thank the Reviewer for their extended comments on our manuscript. We would first like to point out that our work did not aim to resolve the phylogenetic position of X. bocki. We discussed this question at length, as it was and is a major and important question in evolutionary biology; however, we think that we phrased our conclusions in this regard very cautiously, as we are well aware of the limitations of our data in resolving the conundrum.

      In this revision we have further modified our text, specifically in the Introduction and Abstract, to make it clear that we are contributing to the understanding of the evolution and biology of a fascinating organism that cannot easily be cultured in the laboratory.

      In addition, we have supplied more explanation on why Xenacoelomorpha are generally seen as a monophyletic group and which lines of evidence point to this. Again, it should be noted here that colleagues who regard the Nephrozoa hypothesis as true, do not doubt the monophyly of Xenacoelomorpha.

    1. Scene One: A Typical Day in English Class, Tuesday, 12:20 p.m. When I walk into English class, there are only two students in the classroom; the tables are set up in a U-shape. The room is not organized, your desk is messy, and the room has trash everywhere. There is one TV in the back of the room. The room smells like scented board markers. I walk to my seat and wait for you to get everyone settled in the classroom. After more students arrive, you ask us to read our independent reading book for about 25 minutes. Some of us do what you ask while you work on your computer. Then three students get kicked out because they didn’t do what you wanted them to do, they were talking back, or maybe you were just having a bad day. We don’t have a journal to write about our books and you do not ask us what we are reading during this time. When independent reading time is over, you tell us to take out our Hamlet books. We read Hamlet as a class for the rest of the period. While we are reading, we have to take notes about what is happening or write summaries in our Hamlet notebook. You tell us what you think about the text and what is happening in the play. Most often, we simply write what you tell us to write. This happens every single day. Class is over and you didn’t assign any homework — you rarely do

      The disorderly environment suggests a lack of organization and may impact the learning atmosphere negatively. While reading "Hamlet" provides valuable literary exposure, the lack of student input or discussion beyond teacher-directed notes may limit critical thinking and analysis.

    2. The Letter-Writing Process with Students. I wanted to do this project not only for the experience of improving my writing but also I think that the students’ voice is not always heard entirely, even through dialogue. I feel that by doing this journal we can make a difference with our personal experience and touch the heart of someone who is willing to stand by us. I also wanted to get the attention of other students who may be feeling the same frustration I have felt

      In the letter-writing process with students, Rashida Registe expresses her motivation for the project. She sees it as an opportunity not only to enhance her writing skills but also to amplify the voices of students, which she feels may not always be fully heard even through dialogue. Rashida believes that by sharing their personal experiences through the journal, they can make a difference and touch the hearts of those willing to support them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study assesses homeostatic plasticity mechanisms driven by inhibitory GABAergic synapses in cultured cortical neurons. The authors report that up- or down-regulation of GABAergic synaptic strength, rather than excitatory glutamatergic synaptic strength, is critical for homeostatic regulation of neuronal firing rates. The reviewers noted that the findings are potentially important, but they also raised questions. In particular, the evidence supporting the findings is currently incomplete and demonstration of independent regulation of mEPSCs and mIPSCs is a necessary experiment to support the major claims of the study. 

      We appreciate the detailed, thoughtful assessment of our paper by the reviewers and editors and now submit a revised version that addresses the reviewers’ comments as detailed below in response to each concern. We include a more open discussion of alternative possibilities and have added experiments demonstrating that AMPAergic scaling in our mouse cortical cultures is triggered differently than GABAergic scaling. We treated the cultured neurons exactly as described for triggering GABAergic scaling (20 µM CNQX for 24 hours); however, this did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking/bursting activity. Below we explain the result further, but ultimately this does demonstrate independent regulation of mEPSCs and mIPSCs as requested by the editor/reviewer (spike reductions induced by CNQX reduced mIPSC amplitude, but had no effect on mEPSC amplitude).

      Reviewer #1 (Public Review):

      While the paper is ambitious in its rhetorical scope and certainly presents intriguing findings, there are several serious concerns that need to be addressed to substantiate the interpretations of the data. For example, the CTZ data do not support the interpretations and conclusions drawn by the authors. Summarily, the authors argue that GABAergic scaling is measuring spiking (at the time scale of the homeostatic response, which they suggest is a key feature of a homeostat) yet their data in figure 5B show more convincingly that CTZ does not influence spiking levels - only one out of four time points is marginally significant (also, I suspect that the bootstrapping method mentioned in line 454-459 was conducted as a pairwise comparison of distributions. There is no mention of multiple comparisons corrections, and I have to assume that the significance at 3h would disappear with correction).

      We certainly understand the criticism here (similar to reviewer 2’s third point). We now discuss these complications in a more detailed description in the manuscript (CTZ section of results and at end of the discussion). First, we are presenting our entire dataset to be as transparent as possible. Unlike most synaptic scaling studies (including our own) that apply drugs to alter activity and assess mPSC amplitude at the final time point, here we are actually showing CTZ’s effect on spiking activity within the culture over time. This is critical because it has informed us of the drug’s true effect on spiking, the variability that is associated with these perturbations, and the ability and timing of the cultured network to homeostatically recover initial levels. This was important because it revealed that the drugs do not always influence activity in the way we assume, and this provides greater context to our results. Second, we are showing all of our data, and presenting it using estimation statistics, which go beyond the dichotomy of a simple p value yes or no (Ho J, Tumkaya T, Aryal S, Choi H, Claridge-Chang A. 2019. Moving beyond P values: data analysis with estimation graphics. Nat Methods 16: 565-66). Estimation statistics have become a more standard statistical approach in the last 15 years and are the preferred method for the Society for Neuroscience’s eNeuro Journal. This method shows the effect size and the confidence interval of the distribution. For the 3 hr time point in Fig. 5B the CTZ/ethanol vs. ethanol data points exhibit very little overlap, the effect size demonstrates a near doubling of spike frequency, and the confidence interval shows a clear separation from 0. This was a pairwise comparison as we compared values at each time point after the addition of ethanol or ethanol/CTZ. Third, the plots illustrate an upward trend in spike frequency at 1 and 6 hrs, but there is also clear variability. It is important to note that these are multiunit recordings, not recordings purely from the excitatory principal neurons that we target for mPSC recordings. This complication, along with the variability inherent in these cultures, could make simple comparisons difficult to interpret and we now discuss this (end of discussion). Regardless, we do see some increase in spiking with CTZ and we clearly see increases in mIPSC amplitude, thus providing some support for the idea that spiking could be a critical player in terms of GABAergic scaling, particularly when put in the context of all of our findings. Future work will be necessary to determine how alterations in spiking lead to changes in mIPSC amplitude and we now discuss this (2nd to last paragraph in discussion).
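
      To make the estimation-statistics idea concrete, the sketch below computes an effect size (difference in group means) with a percentile bootstrap confidence interval. It is a minimal illustration only: the spike-frequency values are hypothetical, the function name and parameters are assumptions, and published estimation-graphics tools may use a bias-corrected bootstrap rather than the plain percentile method shown here.

```python
import random
from statistics import mean


def bootstrap_mean_difference(control, treated, n_resamples=5000, alpha=0.05, seed=0):
    """Return (observed difference in means, CI lower bound, CI upper bound)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(treated) - mean(control)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, with replacement.
        resampled_control = rng.choices(control, k=len(control))
        resampled_treated = rng.choices(treated, k=len(treated))
        diffs.append(mean(resampled_treated) - mean(resampled_control))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return observed, lower, upper


# Hypothetical normalized spike frequencies (not data from the study).
control = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8]
treated = [1.9, 2.3, 1.7, 2.4, 2.0, 2.2]
effect, ci_low, ci_high = bootstrap_mean_difference(control, treated)
```

      A confidence interval that excludes 0, as in this toy case, is what is meant above by a clear separation from 0; the width of the interval conveys the variability that a single p value hides.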

      Then, the fact that TTX applied on top of CTZ drives an increase in mIPSC amplitude is interpreted as a conclusive demonstration that GABAergic scaling is sensing spiking. It is inevitable, however, that TTX will also severely reduce AMPAR activation - a very plausible alternative explanation is that the augmentation of AMPAR activation caused by CTZ is not sufficient to overcome the dramatic impact of TTX. Altogether, these data do not provide substantial evidence for the conclusion drawn by the authors.

      We believe that the most parsimonious explanation for our results is that spiking activity, not AMPAR activation, triggers GABAergic downscaling. GABAergic scaling is no different when comparing 24hr TTX treatment vs TTX+CTZ, and optogenetic restoration of spiking activity while continuing to block AMPAR activation was able to restore GABAergic mPSC amplitudes to control levels. It is important to emphasize that our results with TTX vs. TTX+CTZ are different for GABAergic scaling (no difference in this study) and AMPAergic scaling (CTZ diminished upward scaling in previous study – Fong et al., 2015 - PMID: 25751516) suggesting different triggers for the two forms of scaling. While we strongly believe we have demonstrated that GABAergic downscaling is dependent on spiking (not AMPAergic transmission), we now acknowledge that we cannot rule out the possibility that upward GABAergic scaling may be influenced by AMPAR activation (2nd paragraph discussion), although we have no evidence in support of this.

      Specific points:

      - The logic of the basis for the argument is somewhat flawed: A homeostat does not require a multiplicative mechanism, nor does it even need to be synaptic. Membrane excitability is a locus of homeostatic regulation of firing, for example. In addition, synapse-specific modulation can also be homeostatic. The only requirement of the homeostat is that its deployment subserves the stabilization of a biological parameter (e.g., firing rate). 

      We largely agree with the reviewer and should not have implied that this was a necessary requirement for a spike rate homeostat. What we should have said was that historically this definition has been applied to AMPAergic scaling, which is thought to be a spike rate homeostat. We have now corrected this (introduction and discussion).

      - Line 63 parenthetically references an important, but contradictory study as a brief "however". Given the tone of the writing, it would be more balanced to give this study at least a full sentence of exposition. 

      Agreed, and we have now done this.

      - The authors state (line 11) that expression of a hyperpolarizing conductance did not trigger scaling. More recent work ('Homeostatic synaptic scaling establishes the specificity of an associative memory') does this via expression of DREADDs and finds robust scaling.

      The purpose of citing this study was to argue that the spike rate homeostat hypothesis doesn’t make sense for AMPAergic scaling based on a study that hyperpolarized an individual cell while leaving the rest of the network unaltered and therefore leaving network activity and neurotransmission largely normal. In this previous study scaling was not triggered, suggesting reduced spike rate within an individual cell was insufficient to trigger scaling in that cell. The more recent study mentioned by the reviewer achieved scaling by hyperpolarizing a majority of cells in the network. Importantly, this approach alters neurotransmission throughout the network, making it challenging to isolate the specific contributions of spiking vs. receptor activation. Unlike the previous study, which focused on the impact within individual cells, this newer study involves global alterations in network activity, complicating the interpretation of the role of spiking versus receptor activation in triggering scaling.

      - Supplemental figure 1 looks largely linear to me? Out of curiosity, wouldn't you expect the left end to be aberrant because scaling up should theoretically increase the strength of some synapses that would have been previously below threshold for detection?

      We agree that the scaling ratio plot is largely linear. To be clear, the linearity of the ratio plot was not our point here; rather, our point was that there was a positive slope, meaning that the ratios (CNQX mEPSC amplitudes/control mEPSC amplitudes) got bigger for the larger CNQX-treated mEPSCs. By contrast, a multiplicative relationship in which mEPSCs are all increased by a single factor (e.g. 2X) would be a flat line with 0 slope at the multiplicative value (e.g. 2). In terms of the left side of the plot, we do see values that rise abruptly from 1 - this was partially obstructed by the Y axis in this figure and we have adjusted this. This left part of the plot is likely due to the CNQX-induced increases in the amplitudes of minis that were below our detection threshold of 5 pA, as suggested by the reviewer. Therefore, minis that were 4 pA could now be 5 pA after CNQX treatment, and these are then divided by the smallest control mEPSCs, which are 5 pA (ratio of 1). We tried to do a better job describing this in the resubmission (1st paragraph of results).
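
      The logic of the ratio plot can be sketched with toy numbers. The example below compares rank-ordered amplitudes at matched quantiles, with and without a detection threshold. The amplitudes, the 2X factor, and the quantile-matching step are illustrative assumptions rather than the analysis code used in the study.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] * (1 - fraction) + sorted_values[high] * fraction


def scaling_ratios(control, treated, threshold, n_points=5):
    """Ratio treated/control at matched quantiles, using only detectable events."""
    detected_control = sorted(a for a in control if a >= threshold)
    detected_treated = sorted(a for a in treated if a >= threshold)
    ratios = []
    for i in range(n_points):
        q = i / (n_points - 1)
        ratios.append(quantile(detected_treated, q) / quantile(detected_control, q))
    return ratios


# Hypothetical "true" amplitudes in pA, including events below a 5 pA threshold.
control_true = [2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 15.0, 20.0, 40.0]
treated_true = [2.0 * a for a in control_true]  # purely multiplicative, factor 2

flat = scaling_ratios(control_true, treated_true, threshold=0.0)
thresholded = scaling_ratios(control_true, treated_true, threshold=5.0)
```

      With no threshold, every ratio equals the factor (a flat line at 2). With the 5 pA threshold, previously undetectable events enter the treated distribution, so the left end of the plot starts at a ratio of 1 (5 pA / 5 pA) and rises toward the factor at the right end, even though the underlying scaling is purely multiplicative.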

      - Given that figure 2B also shows warping at the tail ends of similar distributions, how is this to be interpreted? 

      The left side of the ratio plot shows evidence consistent with the idea that mIPSCs are dropping into the noise after CNQX treatment (the smallest GABA mIPSCs that don’t fall into the noise are 5 pA, and these are divided by the smallest control GABA mPSCs of 5 pA, and therefore the ratio is 1). The rest of the distribution will then approach the scaling factor (50% in this case). On the right side of the ratio plot the values appear to slightly increase. We are not sure why this is happening, but it may be that a small percentage of mIPSCs are not purely multiplicative at 0.5; however, the biggest mPSCs can vary to a great degree from one cell to the next, and in other cases we do not see this (Figure 4B, Figure 5E). We tried to do a better job describing this in the resubmission (results describing Figure 2).

      - The readability of the figures is poor. Some of them have inconsistent boundary boxes, bizarre axes, text that appears skewed as if the figures were quickly thrown together and stretched to fit. 

      We have adjusted the figures to be more consistent throughout the manuscript.

      - I'm concerned about the optogenetic restoration of activity experiment. Cortical pyramidal neuron mean firing rates are log normally distributed and span multiple orders of magnitude. The stimulation experiments can only address the total firing at a network-level - given that a network level "mean" is meaningless in a lognormal distribution, how are we to think about the effect of this manipulation when it comes to individual neurons homeostatically stabilizing their own activities? In essence, the argument is made at the single-neuron level, but the experiment is conducted with a network-level resolution.

      As described above, we do not have the capacity to know what the actual firing rate of a particular neuron was before and after perturbing the system, and certainly not for the specific cells we recorded from to obtain mPSC amplitudes, and so we cannot say that we have perfectly restored the original firing rates of neurons. However, there is reason to believe that this is achieved to some extent. Our optogenetic stimulation is only 50-100 ms long activating a subset of neurons. This is sufficient to provide a synaptic barrage that then triggers a full blown network burst where the majority of spikes occur, but this is after the light is off. In other words, the optogenetic light pulse only initiates what becomes a relatively normal network burst that fortunately allows the individual cells to express their relatively normal (pre-drug) activity pattern. In our previous study using optogenetic activity restoration (Fong et al., 2015) we were able to show that this was the case for individual units - the spiking of an individual unit during a burst is similar before and after CNQX/optogenetic stimulation (see Figure 4b and Suppl. Fig 4 in Fong et al. 2015). We are not claiming that we have restored spiking to exactly the pre-drug state, but bring it back toward those levels and we see this is associated with a return of the mIPSC amplitude to near control levels. We now include a brief description of this in the manuscript (results describing Figure 3).

      - Line 198-99: multiplicativity is not a requirement of a homeostatic mechanism.

      - Line 264-265 - again, neither multiplicativity nor synaptic mechanisms are fundamentally any more necessary for a homeostatic locus than anything else that can modulate firing rate via negative feedback.

      As mentioned above, the multiplicative nature of scaling has been a historical proposal for AMPAergic scaling and we have now found such a relationship for GABAergic scaling. This is important for understanding how this plasticity works, but we agree that it is not necessary for a homeostat and we have adjusted the manuscript accordingly.
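      One common way to probe whether a change in amplitude distributions is multiplicative is to compare matched percentiles of the control and treated distributions: under purely multiplicative scaling, the ratio is the same at every percentile. The sketch below is a hedged illustration of that idea on simulated amplitudes. The function name `percentile_ratios`, the amplitude values, and the factor 0.7 are all hypothetical, and this is not presented as the analysis used in the manuscript.

```python
import random
import statistics

def percentile_ratios(control, treated, n_bins=20):
    """Ratio treated/control at matched cut points of the two distributions."""
    c_q = statistics.quantiles(control, n=n_bins)
    t_q = statistics.quantiles(treated, n=n_bins)
    return [t / c for c, t in zip(c_q, t_q)]

rng = random.Random(1)
# Hypothetical control mIPSC amplitudes (pA), lognormally distributed.
control = [rng.lognormvariate(3.2, 0.4) for _ in range(500)]

scaled = [0.7 * a for a in control]   # purely multiplicative change
shifted = [a - 5.0 for a in control]  # purely additive change, for contrast

ratios_scaled = percentile_ratios(control, scaled)
ratios_shifted = percentile_ratios(control, shifted)

# A flat ratio profile indicates multiplicative scaling;
# a ratio that drifts with amplitude does not.
spread_scaled = max(ratios_scaled) - min(ratios_scaled)
spread_shifted = max(ratios_shifted) - min(ratios_shifted)
print(f"ratio spread, multiplicative: {spread_scaled:.3g}")
print(f"ratio spread, additive:       {spread_shifted:.3g}")
```

      Real recordings add complications that this sketch ignores, such as detection thresholds that truncate the smallest events and unequal numbers of events per cell.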

      - 277: do you mean AMPAR? 

      We were not clear enough here. We actually do mean GABAR. The idea was that CTZ increases network activity and thus increases both AMPAergic and GABAergic transmission. We have rewritten this part of the discussion to avoid any confusion (2nd paragraph discussion).

      - Example: Figure 1A is frustratingly unreadable. The axes on the raster insets are microscopic, the arrows are strangely large, and it seems unnecessary to fill so much real estate with 4 rasters. Only one is necessary to show the concept of a network burst. The effect of time+CNQX on the frequency of bursts is shown in B and C.

      - Example: Figure 2 appears warped and hastily assembled. Statistical indications are shown within and outside of bounding boxes. Axes are not aligned. Labels are not aligned. Font sizes are not equal on equivalent axes. 

      These figures were generated by the estimation statistics website and text may have been resized inappropriately. We have tried to adjust this and now have attempted to standardize the axes text to the best of our ability.

      - The discussion should include mention of the limitations and/or constraints of drawing general conclusions from cell culture. 

      We have added this consideration at the end of the discussion. Further, this is why we cited studies that argue GABAergic neurons have a particularly important role in homeostatic regulation of firing following sensory deprivations in vivo.

      - The discussion should include mention of the role of developmental age in the expression of specific mechanisms. It is highly likely that what is studied at ~P14 is specific to early postnatal development. 

      We now discuss caveats of cortical cultures at the end of the discussion.

      It is essential to ensure that the data presented in the paper adequately supports the conclusions drawn. A more cautious approach in interpreting the results may lead to a stronger argument and a more robust understanding of the underlying mechanisms at play. 

      We have broadened our discussion of alternative interpretations throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      While I am hesitant to judge a paper based on its tone, I would personally recommend revision of some of the subjective words and statements, as the manuscript undermines its own effectiveness by making unnecessarily strong statements. The text repeatedly paints an "either A or B" picture, and if there's any general lesson in biology, it's that it's always A and B. Global, multiplicative glutamatergic scaling could quite conceivably occur alongside GABAergic scaling, as well as synapse-specific homeostatic modifications. It seems that it would be wise to acknowledge that, while the data presented here point in one direction, in vivo results in an adult brain (for example) might present an entirely different set of patterns. This will not only enhance the readability of the paper but also ensure that the scientific community can engage with the work in a constructive and collaborative manner. Again, I present this as only a constructive and supportive suggestion. I am a big fan of work from this laboratory, and I would love to see this paper in an improved form - it's an important set of ideas and I do believe that these data are rigorously collected. 

      We have attempted to provide a more comprehensive interpretation of our results. We agree that a homeostat can come in many flavors, but do believe that GABAergic scaling is a strong candidate, whereas AMPAergic scaling does not currently fit such a role. We do now discuss caveats with our work and are open to other interpretations that need to be fleshed out in future work.

      Reviewer #2 (Public Review):

      Major points:

      (1) The reason why CNQX does not completely eliminate spiking is unclear (Fig. 1). What is the circuit mechanism by which spiking continues, although at lower frequency, in the absence of AMPA-mediated transmission, and what is the mechanism by which spiking frequency grows back after 24h (still in the absence of AMPA transmission)?

      Is it possible that NMDA-mediated transmission takes over and triggers a different type of network plasticity?

      The bursting in AMPAR blockade is due to the remaining NMDA receptor-mediated transmission. We showed this in our previous study in Suppl. Figure 2 and 6 of Fong et al., 2015 (PMID: 25751516). Our ability to optically induce normal-looking bursts of spikes was also dependent on NMDAR activation (Fong et al 2015 and Figure 6 Newman et al., 2015 - PMID: 26140329). Further, in Dr Fong’s PhD dissertation it was shown that the bursting activity was abolished when AMPA and NMDA receptors were both blocked. There are likely many factors that contribute to the recovery of activity, and certainly one of them is likely to be the weakening of inhibitory GABAergic currents as we had mentioned. We have now added the point about NMDARs mediating the remaining bursts in the manuscript (results associated with Figure 1). We are not clear on what the reviewer has in mind in terms of “NMDA-mediated transmission takes over and triggers a different kind of network plasticity”, but we do discuss the possibility that spiking triggers GABAergic scaling through its effect on NMDAergic transmission, which we cannot rule out but for which we also have no supporting evidence (3rd and 5th paragraph of discussion). We do plan on addressing this in a future work.

      (2) A possible activation of NMDARs should be considered. One would think that experiments involving chronic glutamatergic blockade could have been conducted in the presence of NMDAR blockers. Why was this not the case?

      Unfortunately, it was not possible to optogenetically restore normal bursting in the presence of NMDAR blockade (even when AMPAergic transmission was intact), as NMDARs appeared to be critical for the optical restoration of the normal duration and form of the burst in rat cortical cultures (see Suppl. Figure 6 Fong et al., 2015 Nat Comm and Figure 6 Newman et al., 2015). Even high concentrations of CNQX (40µM) prevented us from restoring spiking in mouse cultures in the current study, which is why we moved to 20µM CNQX for this study. The reviewer raises an excellent point about a possible NMDAR contribution to altered synaptic strength, however. It is likely that NMDAR signaling is reduced in the presence of CNQX since burst frequency was dramatically reduced along with AMPAR-mediated depolarizations. We cannot rule out the possibility that NMDAR signaling could contribute to the alterations in GABAergic mIPSCs and discuss this in the resubmission (3rd and 5th paragraph of the discussion). We had not considered this previously because prior work suggested that a 24/48 hour block of NMDARs (APV) did not trigger AMPAergic scaling in cortical or hippocampal cultures (see Figure 1 Turrigiano et al., 1998 Nature and Suppl. Figure 4 Sutton et al., 2006 Cell); moreover, our previous study showed that restoring NMDAergic transmission optogenetically, at least to some extent, had no influence on AMPAergic scaling (Fong et al., 2015).

      Also, experiments with global ChR2 stimulation with coincident pre and postsynaptic firing might also activate NMDARs and result in additional effects that should be taken into consideration for the global scaling mechanism.

      To be clear, our optical stimulation was of short duration (duration 50-100 ms) and was turned off before the vast majority of spiking that occurred in the bursts. So the light flash was a trigger that allowed a relatively normal looking burst to occur after the light was off (see lower panel of Figure 3B optogenetic stimulation – short duration only at onset of burst – we now make this clearer in resubmission). Therefore, we were unlikely to trigger significant synchronous activation that does not normally occur in network bursts.

      (3) Cultures exposed to CTZ to enhance AMPA receptors generated variable results (Fig. 5), somewhat increasing spiking activity in a non-significant manner but, at the same time, strengthening mIPSC amplitude. This result seems to suggest that spiking might be involved in GABAergic scaling, but it does not seem to prove it. Then, addition of TTX that blocked spiking reduced mIPSC amplitude. It was concluded here that the ability of CTZ to enhance GABAergic currents was primarily due to spiking, rather than the increase in AMPA-mediated currents. However, in addition to blocking action potentials, TTX would also prevent activation of AMPARs in the presence of CTZ due to the lack of glutamatergic release. Therefore, under these conditions, an effect of glutamatergic activation on GABAergic scaling cannot be ruled out.

      These concerns were very similar to reviewer 1’s first comments (see above). To be clear, we are going a step beyond most scaling studies by assessing MEA-wide firing rate, but this still provides an incomplete picture of the particular cells that we target for patch recordings in terms of their firing before and after a drug. Further, we see considerable variability in the effect on firing rate from culture to culture, which we now discuss in the resubmission (final paragraph discussion). The fact that mIPSCs are no different after TTX treatment vs CTZ+TTX treatment suggests that AMPAergic transmission is not so influential on GABAergic downscaling. While the CTZ results are not conclusive by themselves, taken together with the optogenetic results, where restoration of spiking in AMPAR blockade reverses scaling, they are most consistent with the idea that GABAergic scaling is triggered by spiking rather than AMPAR activation, and they place GABAergic scaling as a strong candidate for a spike rate homeostat. Although we do feel that we have demonstrated that downward GABAergic scaling is dependent on spiking, we cannot rule out the possibility that upward GABAergic scaling could be influenced by AMPAR activation to some extent. We now acknowledge this possibility (2nd paragraph discussion).

      (4) The sample size is not mentioned in any figure. How many cells/culture dishes were used in each condition?

      The individual dots represent either individual cells for mIPSC amplitude or individual cultures in MEA experiments. Number of cultures and cells are now stated in the figure legends.

      (5) Cortical cultures may typically contain about 5-10% GABAergic interneurons and 90-95 % pyramidal cells. One would think that scaling mechanisms occurring in pyramidal cells and interneurons could be distinct, with different impact on the network. Although for whole-cell recordings the authors selected pyramidal looking cells, which might bias recordings towards excitatory neurons, naked eye selection of recording cells is quite difficult in primary cultures. Some of the variability in mIPSC amplitude values (Fig. 2A for example) might be attributed to the cell type? One could use cultures where interneurons are fluorescently labeled to obtain an accurate representation. The issue of the possible differential effects of scaling in pyramidal cells vs. interneurons and the consequences in the network should be discussed.

      We now include this discussion in the resubmission (final paragraph discussion). Briefly, we chose large cells, which will be predominantly glutamatergic neurons as suggested by the reviewer. Ultimately, even among glutamatergic principal cells there may be variability in the response to drug application. All of these issues could contribute to variability and we have expanded our description of the variability in our results, including that based on cellular heterogeneity. 

      Reviewer #2 (Recommendations For The Authors):

      Minor comments –

      Fig S3: Please quantify changes in frequency

      We have done this (Supplemental Figure 5).

      Fig 2: please choose colors with higher contrast for CNQX/TTX

      We have done this.

      Fig. 3C: Why doesn't CNQX+PhotoStim reach control levels of bursting at 2h?

      The program was designed to follow and maintain total spike frequency and so it does a better job at this than maintaining burst frequency.

      Fig. 5A: please include a comparison between control and Ethanol

      We now do this in Figure 5C. Both are around 26 pA.

      Fig. 5C: where is the Etoh condition?

      We have made this figure more clear in terms of controls (Figure 5C & D).

      Reviewer #3 (Public Review):

      This paper concerns whether scaling (or homeostatic synaptic plasticity; HSP) occurs similarly at GABA and Glu synapses and comes to the surprising conclusion that these are regulated separately. This is surprising because these were thought to be co-regulated during HSP and in fact, the major mechanisms thought to underlie downscaling (TTX or CNQX driven), retinoic acid and TNF, have been shown to regulate both GABARs and AMPARs directly. (As a side note, it is unclear that the manipulations used in Joseph and Turrigiano represent HSP, and so might not be relevant). Thus the main result, that GABA HSP is dissociable from Glu HSP, is novel and exciting. This suggests either different mechanisms underlie the two processes, or that under certain conditions, another mechanism is engaged that scales one type of synapse and not the other.

      However, strong claims require strong evidence, and the results presented here only address GABA HSP, relying on previous work from this lab on Glu HSP (Fong, et al., 2015). But the previous experiments were done in rat cultures, while these experiments are done in mice and at somewhat different ages (DIV). Even identical culture systems can drift over time (possibly due to changes in the components of B27 or other media and supplements). Therefore it is necessary to demonstrate in the same system the dissociation. To be convincing, they need to show the mEPSCs for Fig 4, clearly showing the dissociation. Doing the same for Fig 5 would be great, but I think Fig 4 is the key.

      We understand the concern of the reviewer as we do see significant variability within our cultures and they were plated in different places, by different people, in different species (rat vs mouse). Therefore, we have attempted to redo the study on AMPAergic scaling on these mouse cortical neurons. Surprisingly, we found that 20µM CNQX did not trigger AMPAergic upscaling (new Figure 7), even though it did reduce spiking activity and was able to produce GABAergic downscaling. We did not carry out the optogenetic restoration of activity, because we did not trigger upscaling. The result does, however, show that the reductions in spiking/bursting that trigger GABAergic downscaling did not trigger AMPAergic upscaling and therefore dissociate the 2 forms of scaling in these mouse cultures. We do not know why 20 µM CNQX did not trigger scaling in these cultures since it does reduce spiking and AMPAR activation. In the Fong study we used 40µM CNQX because intracellular recordings from rat cortical neurons suggested this was required to completely block AMPAergic currents. Our initial studies in the current manuscript examining GABAergic scaling in mouse cortical cultures used 40µM CNQX; however, this concentration of CNQX prevented us from restoring spiking through optogenetic activation, so we reduced our concentration to 20µM CNQX, which did trigger GABAergic downscaling and allowed the restoration of spiking. We now show and discuss this result (Figure 7 and 3rd paragraph discussion).

      The paper also suggests that only receptor function or spiking could control HSP, and therefore if it is not receptor function then it must be spiking. This seems like a false dichotomy; there are of course other options. Details in the data may suggest that spiking is not the (or the only) homeostat, as TTX and CNQX cause identical changes in mIPSC amplitude but have different effects on spiking. Further, in Fig 5, CTZ had a minimal effect on spiking but a large effect on mIPSCs. Similar issues appear in Fig 6, where the induction of increased spiking is highly variable, with many cells showing control levels or lower spiking rates. Yet the synaptic changes are robust, across all cells. Overall, this is not persuasive that spiking is necessarily the homeostat for GABA synapses.

      Together our results argue against AMPAR or GABAR activation as a trigger for GABAergic scaling and show that this is different from our results for AMPAergic scaling. These points alone are important to recognize. While changes in spiking do not perfectly follow the changes in GABAergic scaling, they do always trend in the right direction. As mentioned above, total spiking activity is only one measure of spiking. It is possible that these drugs alter the pattern of spiking in a way that translates into altered calcium transients, which may be important for triggering the plasticity. Further, we acknowledge that we cannot rule out a role for NMDARs contributing to GABAergic scaling (3rd and 5th paragraph of discussion). Based on the variability that we observe and the nature of our MEA recordings we cannot precisely determine how the total activity or pattern of activity changes with drug application in the specific cells that we target for whole cell recordings, and this is now discussed (final paragraph of discussion). Again, it is important to note that we are going a step beyond most homeostatic plasticity studies that add a drug and simply assume it is having an effect on spiking (e.g. CNQX was initially thought to completely abolish spiking, but clearly does not). However, we believe that the most parsimonious explanation of our results supports our proposal that GABAergic scaling is a strong candidate as a spike rate homeostat. Regardless, in the resubmission we have included a broader discussion about these possibilities, and recognize that we cannot rule out the possibility that AMPAergic transmission could contribute to upward GABAergic scaling (2nd paragraph discussion).

      The paper also suggests that the timing of the GABA changes coincides with the spiking changes, but while they have the time course of the spiking changes and recovery, they only have the 24h time point for synaptic changes. It is impossible to conclude how the time courses align without more data.

      We can only say that by the 24 hour CNQX time point, when overall spiking is recovered in some but not all cultures and bursts have not recovered, that GABAergic scaling has already occurred. We now state this more clearly in the resubmission (near the end of the 2nd paragraph of the discussion).

      Reviewer #3 (Recommendations For The Authors):

      The statistics are inadequately described. The full information including actual p values should be given, particularly for the non-significant trends reported.

      We have done this in Figure legends.

      The abstract and introduction give the impression that GABA and Glu HSP are independent, though most work links them as occurring simultaneously and in a coordinated fashion to achieve homeostasis.

      While it is true that many studies have triggered both forms of scaling with activity or transmission blockade, these studies have not addressed whether these forms of scaling are actually triggered in the same way mechanistically, except potentially for the one study that we mentioned (Joseph et al.,). Our results suggest they are independent. We now do mention the idea that these two forms of scaling have been assumed to be commonly triggered (3rd paragraph introduction).

      The data in Fig 6 is presented as if BIC treatment is a novel result, although BIC/Gabazine/PTX have been used to induce down-scaling in many previous papers. While it's good to have the results, they should be put in proper context. As suggested in the paper, testing if decreased GABAR function would lead to upscaling does not make sense given all the previous data. 

      Figure 6 shows GABAergic upscaling in response to GABAR block (bicuculline), but we are aware of only two other studies that looked at GABAergic scaling after treating with a GABAR blocker and they found upscaling but this was in hippocampal cultures, not cortical cultures (Peng et al., 2010 - PMID: 21123568, Pribiag et al., 2014 - PMID: 24753587). We now mention this in the results section describing Figure 6. While many studies have blocked GABARs and find AMPAergic downscaling, we are addressing the triggers for GABAergic scaling in Figure 6.

      Is Fig S4B mislabeled? The title says spike rate but the graph axis says burst frequency.

      The reviewer is correct and we have now adjusted this.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as the careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what, if any, the significance of each of these improvements was.

      Indeed, it is difficult (embarrassingly) to benchmark against our past work due to the dependencies on different Python packages and the lack of software engineering. With the infrastructure we’ve laid down with this paper, made possible by an EOSS grant from CZI, that will not be a problem going forward. Not only is the code more reliable and standardized, but we have developed several scientific test sets that can be used as a basis for broad comparisons to judge whether improvements are substantial. We’ve also changed “substantial improvement” to “several modifications” to indicate the lack of comparison to past versions.

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      qFit can indeed handle both. We left out multiple chains for simplicity in constructing a dataset enriched for small proteins while still covering diversity to speed the ability to rapidly iterate and test our approaches. Improvements to qFit ligand handling will be discussed in a forthcoming work as we face similar technical debt to what we saw in proteins and are undergoing a process of introducing “several modifications” that we hope will lead to “substantial improvement” - but at the very least will accelerate further development.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      We added to the abstract:

      “Importantly, unlike ensemble models, the multiconformer models produced by qFit can be manually modified in most major model building software (e.g. Coot)  and fit can be further improved by refinement using standard pipelines (e.g. Phenix, Refmac, Buster).”

      and introduction:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      and results:

      “This model can then be examined and edited in Coot12 or other visualization software, and further refined using software such as phenix.refine, refmac, or buster as the modeler sees fit.”

      and discussion

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms-an area where there are still few good solutions. That is, I expect qFit to find widespread use.

      Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such a multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Different from structure prediction methods, this approach is strongly based on high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - An automation of construction of multiple conformations when they can be identified visually;

      - A determination of multiple conformations when their visual identification is difficult or impossible.

      We often think about this problem similarly to the reviewer. However, in building qFit, we do not want to separate these problems - but rather use the first category (obvious visual identification) to build an approach that can accomplish part of the second category (difficult to visualize) without building “impossible”/nonexistent conformations - with a consistent approach/bias.

      The first one is a known problem, when missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time being quite efficient, as the test results show.

      We agree with the reviewers' assessment here. The “floor” in terms of impact is automating a tedious part of high resolution model building and improving model quality.

      The second problem is important from the physical point of view and was first addressed by Burling & Brunger (1994; https://doi.org/10.1002/ijch.199400022). The new procedure deals with a second-order variation in the R-factors, of about 1% or less, like placing riding hydrogen atoms, modeling density deformation or variation of the bulk solvent. In such situations, it is hard to justify model improvement. Unchanged or marginally decreased Rfree values can be considered a sign that the model does not overfit the data, but hardly a strong argument in favor of the model.

      We agree with the overall sentiment of this comment. What is a significant variation in R-free is an important question that we have looked at previously (http://dx.doi.org/10.1101/448795) and others have suggested an R-sleep for further cross validation (https://pubmed.ncbi.nlm.nih.gov/17704561/). For these reasons it is important to get at the significance of the changes to model types from large and diverse test sets, as we have here and in other works, and from careful examination of the biological significance of alternative conformations with experiments designed to test their importance in mechanism.

      In general, overall targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, Cruickshank (1956; https://doi.org/10.1107/S0365110X56002059) already showed that averaged density images may lead to a shortening of covalent bonds when interpreting such maps by a single model. However, a total absence of geometric outliers is not necessarily required for structures solved at high resolution, where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      Again, we agree—geometric outliers should not be completely absent, but it is comforting when they and model/experiment agreement both improve.

      The key local characteristic for multiconformer models is the closeness of the model map to the experimental one. Actually, the procedure uses one such measure, the Bayesian information criterion (BIC). Unfortunately, there is no information about how sharply it identifies the best model or how much it changes between the initial and final models; overall, the reader gets no feeling for its values. The Q-score (page 17) can be a tool for the first problem, where the multiple conformations are clearly separated, but not for the second problem, where the contributions from neighboring conformations are merged. In addition to BIC or to even more conventional target functions such as LS or local map correlation, the extreme and mean values of the local difference maps may help to validate the models.

      We agree with the reviewer that the problem of “best” model determination is poorly posed here. We have been thinking a lot about this in the context of Bayesian methods (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278553/); however, a major stumbling block is in how variable representations of alternative conformations (and compositions) are handled. The answers are more (but by no means simply) straightforward for ensemble representations where the entire system is constantly represented but with multiple copies.
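
      To give a feeling for how a BIC-style score trades goodness of fit against the number of parameters, here is a minimal sketch. It assumes the textbook Gaussian-error form, BIC = n ln(RSS/n) + k ln(n); the exact expression used inside qFit is not stated in this response, and the function name and toy residuals below are hypothetical.

```python
import math


def bic_gaussian(residuals, n_params):
    """Textbook BIC for a least-squares fit with Gaussian errors.

    BIC = n * ln(RSS / n) + k * ln(n); lower values are preferred.
    This is a generic form, not necessarily the one used by qFit.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)


# Toy residuals (observed minus calculated density) over 100 voxels.
single = [0.10] * 100          # one conformer, k = 1
double_marginal = [0.099] * 100  # two conformers, barely better fit, k = 2
double_clear = [0.08] * 100      # two conformers, clearly better fit, k = 2

bic_single = bic_gaussian(single, 1)
bic_marginal = bic_gaussian(double_marginal, 2)
bic_clear = bic_gaussian(double_clear, 2)
```

      In this toy case the marginal improvement does not pay for the extra parameter, so the single-conformer model keeps the lower BIC, whereas the clearly better two-conformer fit wins.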

      This method with its results is a strong argument for a need in experimental data and information they contain, differently from a pure structure prediction. At the same time, absence of strong density-based proofs may limit its impact.

      We agree - indeed we think it will be difficult to further improve structure prediction methods without much more interaction with the experimental data.

      Strengths:

      Addressing an important problem and automatization of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible and essentially missing local real-space validation indicators.

      Although these are not perfect real-space indicators, local real-space validation is implicit in the MIQP selection step and explicit when we employ Q-score metrics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A point of clarification: I don't understand why waters seem to be handled differently for cryo-EM and crystallography datasets. I am interested in the statement on page 19 that the Molprobity Clashscore gets worse for cryo-EM datasets, primarily due to clashes with waters. But the qFit algorithm includes a round of refinement to optimize placement of ordered waters, and the clashscore improves for the qFit refinement in crystallography test cases. Why/how is this different for cryo-EM?

      We agree that this was not an appropriate point. We believe that the high clash score is coming from side chains being incorrectly modeled. We have updated this in the manuscript and it will be a focus of future improvements.

      Reviewer #2 (Recommendations For The Authors):

      - It would be instructive to the reader to explain how qFit handles the chromophore in the PYP (1OTA) example. To this end, it would be helpful to include deposition of the multiconformer model of PYP. This might also be a suitable occasion for discussion of potential hurdles in the deposition of multiconformer models in the PDB (if any!). Such concerns may be real concerns causing hesitation among potential users.

      Thank you for this comment. qFit does not alter the position or connectivity of any HETATM records (like the chromophore in this structure). Handling covalent modifications like this is an area of future development.

      Regarding deposition, we have noted above that the discussion now includes:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Finally, we have placed all PDBs in a Zenodo deposition (XXX) and have included that language in the manuscript. It is currently under a separate data availability section (page XXX). We will defer to the editor as to the best header for this section.

      - It may be advisable to take the description of true/false pos/negatives out of the caption of Figure 4, and include it in a box or so, since these terms are important in the main text too, and the caption becomes very cluttered.

      We think adding the description of true/false pos/negatives to the Figure panel would make it very cluttered and wordy. We would like to retain this description within the caption. We have also briefly described each in the main text.

      - page 21, line 4: some issue with citation formatting.

      We have updated these citations.

      - page 25, second paragraph: cardinality is the number of members of a set. Perhaps "minimal occupancy" is more appropriate.

      Thank you for pointing this out. This was a mistake and should have been called the occupancy threshold.

      - page 26: it's - its

      Thank you, we have made this change. 

      - Font sizes in Supplementary Figures 5-7 are too small to be readable.

      We agree and will make this change. 

      Reviewer #3 (Recommendations For The Authors):

      General remarks

      (1) As I understand, the procedure starts from shifting residues one by one (page 4; A.1). Then, geometry reconstruction (e.g., B1) may be difficult in some cases joining back the shifted residues. It seems that such backbone perturbation can be done more efficiently by shifting groups of residues ("potential coupled motions") as mentioned at the bottom of page 9. Did I miss its description?

      We would describe the algorithm as sampling (which includes minimal shifts) in the backbone residues to ensure we can link neighboring residues. We agree that future iterations of qFit should include more effective backbone sampling by exploring motion along the Cβ-Cα, C-N, and (Cβ-Cα × C-N) bonds and exploring correlated backbone movements.

      (2) While the paper is well split into clear parts, some of them do not seem to be in their optimal place and would be better moved to "Methods" (the detailed "Overview of the qFit protein algorithm" as a whole) or to a "Data" section, which is currently missing (the first two paragraphs of "qFit improves overall fit...", page 8, "Generating the qFit test set", page 22, and "Generating synthetic data ..." at page 26; description of the test data set). To my personal taste, the description of tests with simulated data (page 15) would be better placed before that of tests with real data.

      Thank you for this comment, but we stand by our original decision to keep the general flow of the paper as it was submitted.

      (3) I wonder if the term "quadratic programming" (e.g., A3, page 5) is appropriate. It supposes optimization of a quadratic function of the independent parameters and not of "some" parameters. This is like the crystallographic LS which is not a quadratic function of atomic coordinates, and I think this is a similar case here. Whatever the answer on this remark is, an example of the function and its parameters is certainly missed.

      We think that the term quadratic programming is appropriate. We fit a function with a loss function (observed density - calculated density), while satisfying the independent parameters. We fit the coefficients minimizing a quadratic loss. We agree that the quadratic function is missing from the paper, and we have now included it in the Methods section.
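
      As an illustration of what "fitting the coefficients minimizing a quadratic loss" can look like, the sketch below chooses occupancies w for a few candidate conformers by minimizing ||observed - sum_i w_i * candidate_i||^2 subject to w_i >= 0 and sum(w) <= 1. This is a toy under stated assumptions, not qFit's implementation: the densities, the function names, and the projected-gradient solver (used here in place of a dedicated QP/MIQP solver) are all hypothetical choices made to keep the example dependency-free.

```python
def project_capped_simplex(w):
    """Euclidean projection onto {w_i >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in w]
    if sum(clipped) <= 1.0:
        return clipped
    # Otherwise project onto the simplex sum(w) == 1 (sort-based method).
    ordered = sorted(w, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, value in enumerate(ordered, start=1):
        cumulative += value
        shift = (cumulative - 1.0) / j
        if value - shift > 0:
            theta = shift
    return [max(x - theta, 0.0) for x in w]


def fit_occupancies(observed, candidates, steps=2000):
    """Minimise the quadratic loss by projected gradient descent."""
    n = len(candidates)
    # Quadratic term (Gram matrix) and linear term of the objective.
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in candidates]
            for ci in candidates]
    lin = [sum(a * b for a, b in zip(ci, observed)) for ci in candidates]
    # The trace bounds the largest eigenvalue of the (PSD) Gram matrix,
    # so 1 / trace is a safe step size. Assumes non-zero candidates.
    step = 1.0 / sum(gram[i][i] for i in range(n))
    w = [1.0 / n] * n
    for _ in range(steps):
        grad = [sum(gram[i][j] * w[j] for j in range(n)) - lin[i]
                for i in range(n)]
        w = project_capped_simplex([w[i] - step * grad[i] for i in range(n)])
    return w


# Hypothetical calculated densities of three candidate conformers on 4 voxels.
candidates = [
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.5, 1.0],
]
# Synthetic "observed" density: 70% of conformer 0 plus 30% of conformer 1.
observed = [0.7, 0.5, 0.3, 0.0]

weights = fit_occupancies(observed, candidates)
```

      For this synthetic input the fitted weights should come out close to 0.7, 0.3 and 0.0, that is, the third candidate is effectively discarded. The objective is quadratic in the weights, and only in the weights, because the candidate densities are computed beforehand and held fixed.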

      Technical remarks to be answered by the authors :

      (1) Page 1, Abstract, line 3. The ensemble modeling is not the only existing frontier, and saying "one of the frontiers" may be better. Also, this phrase gives a confusing impression that the authors aim to predict the ensemble models while they do it with experimental data.

      We agree with this statement and have re-worded the abstract to reflect this.

      (2) Page 2. Burling & Brunger (1994) should be cited as predecessors. On the contrary, an excellent paper by Pearce & Gros (2021) is not relevant here.

      We agree that we should mention the Burling & Brunger paper. However, we do not think that Pearce & Gros (2021) should be removed, as it is not discussing the method of ensemble refinement.

      (3) Page 2, bottom. "Further, when compared to ..." The preference to such approach sounds too much affirmative.

      We have amended this sentence to state:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot (Emsley et al. 2010), unlike ensemble methods that generate multiple complete protein copies (Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      The point we were trying to make in this sentence was that ensemble-based models are much harder to manually manipulate in Coot or other similar software compared to multiconformer models. We think that the new version of this sentence states this point more clearly.

      (4) Page 2, last paragraph. I do not see an obvious relation of references 15-17 to the phrase they are associated with.

      We disagree with this statement, and think that these references are appropriate.

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot [12], unlike ensemble methods that generate multiple complete protein copies (Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      (5) Page 3, paragraph 2. Cryo-EM maps should be also "high-resolution"; it does not read like this from the phrase.

      We agree that high-resolution should be added, and the sentence now states:

      “However, many factors make manually creating multiconformer models difficult and time-consuming. Interpreting weak density is complicated by noise arising from many sources, including crystal imperfections, radiation damage, and poor modeling in X-ray crystallography, and errors in particle alignment and classification, poor modeling of beam-induced motion, and imperfect Detector Quantum Efficiency (DQE) in high-resolution cryo-EM.”

      (6) Page 3, last paragraph before "results". The words "... in both individual cases and large structural bioinformatic projects" do not have much meaning, except introducing a self-reference. Also, repeating "better than 2 A" looks not necessary.

      We agree that this was unnecessary and have simplified the last sentence to state:

      “With the improvements in model quality outlined here, qFit can now be increasingly used for finalizing high-resolution models to derive ensemble-function insights.”

      (7) Page 3. "Results". Could "experimental" be replaced by a synonym, like "trial", to avoid confusing with the meaning "using experimental data"?

      We have replaced experimental with exploratory to describe the use of qFit on CryoEM data. The statement now reads:

      “For cryo-EM modeling applications, equivalent metrics of map and model quality are still developing, rendering the use of qFit for cryo-EM more exploratory.”

      (8) Page 4, A.1. Should it be "steps +/- 0.1" and "coordinate" be "coordinate axis"? One can modify coordinates and not shift them. I do not understand how, with the given steps, the authors calculated the number of combinations ("from 9 to 81"). Could a long "Alternatively, ...absent" be reduced simply to "Otherwise"?

      We have simplified and clarified the sentence on the sampling of backbone coordinates to state:

      “If anisotropic B-factors are absent, the translation of coordinates occurs in the X, Y, and Z directions. Each translation takes place in steps of 0.1 Å along each coordinate axis, extending to 0.3 Å, resulting in 9 (if isotropic) or 81 (if anisotropic) distinct backbone conformations for further analysis.”

      (9) Page 6, B.1, line 2. Word "linearly" is meaningless here.

      We have modified this to read:

      “Moving from N- to C- terminus along the protein,”

      (10) Page 9, line 2. It should be explained which data set is considered as the test set to calculate Rfree.

      We think this is clear and would be repetitive if we duplicated it.

      (11) Page 9, line 7. It should be "a valuable metric" and not "an"

      We agree and have updated the sentence to read:

      “Rfree is a valuable metric for monitoring overfitting, which is an important concern when increasing model parameters as is done in multiconformer modeling.”

      (12) Page 10, paragraph 3. "... as a string (Methods)". I did not find any other mention of this term "string", including in "Methods" where it supposed to be explained. Either this should be explained (and an example is given?), or be avoided.

      We agree that string is not necessary (discussing the programmatic datatype). We have removed this from the sentence. It now reads:

      “To quantify how often qFit models new rotameric states, we analyzed the qFit models with phenix.rotalyze, which outputs the rotamer state for each conformer (Methods).”

      (13) Page10, lines 3-4 from bottom. Are these two alternative conformations justified?

      We are unsure what this is referring to.

      (14) Page 12, Fig. 2A. In comparison with Supplement Fig 2C, the direction of axes is changed. Could they be similar in both Figures?

      We have updated Supplementary Figure 2C to have the same direction of axes as Figure 2A.

      (15) Page 15, section's title. Choose a single verb in "demonstrate indicate".

      We have amended the title of this section to be:

      “Simulated data demonstrate qFit is appropriate for high-resolution data.”

      (16) Page 15, paragraph 2. "Structure factors from 0.8 to 3.0 A resolution" does not mean what the author wanted apparently to tell: "(complete?) data sets with the high-resolution limit which varied from 0.8 to 3.0 A ...". Also, a phrase of "random noise increasing" is not illustrated by Figs.5 as it is referred to.

      We have edited this sentence to now read:

      “To create the dataset for resolution dependence, we used the ground truth 7KR0 model, including all alternative conformations, and generated artificial structure factors with a high-resolution limit ranging from 0.8 to 3.0 Å (in increments of 0.1 Å).”
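
      For concreteness, the series of high-resolution limits described in this sentence can be enumerated as below. This is only a sketch of the grid itself; the function name is hypothetical, and the generation of structure factors at each limit is not shown.

```python
def resolution_limits(start=0.8, stop=3.0, step=0.1):
    """High-resolution limits (in Angstrom) from start to stop, inclusive.

    Values are built from an integer index and rounded to one decimal
    (which suits a 0.1 step) to avoid accumulating floating-point error.
    """
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 1) for i in range(count)]


limits = resolution_limits()
```

      With the defaults this yields 23 limits, one synthetic dataset per 0.1 Å step from 0.8 Å to 3.0 Å.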

      (17) Page 15, last paragraph is written in a rather formal and confusing way while a clearer description is given in the figure legend and repeated once more in Methods. I would suggest to remove this paragraph.

      We agree that this is confusing. Instead of creating a true positive/false positive/true negative/false negative matrix, we have just called things as they are: multiconformer or single conformer, and match or no match. We have edited the language in the manuscript and figure legends to reflect these changes.

      (18) Page 16. Last two paragraphs start talking about a new story and it would help to separate them somehow from the previous ones (sub-title?).

      We agree that this could use a subtitle. We have included the following subtitle above this section:

      “Simulated multiconformer data illustrate the convergence of qFit.”

      (19) Page 20. "or static" and "we determined that" seem to be not necessary.

      We have removed static and only used single conformer models. However, as one of the main conclusions of this paper is determining that qFit can pick up on alternative conformers that were modeled manually, we have decided to keep the “we determined that”.

      (20) Page 21, first paragraph. "Data" are plural; it should be "show" and "require"

      We have made these edits. The sentence now reads:

      “However, our data here show that not only does qFit need a high-resolution map to be able to detect signal from noise, it also requires a very well-modeled structure as input.”

      (21) Page 21, References should be indicated as [41-45], [35,46-48], [55-57]. A similar remark to [58-63] at page 22.

      We have fixed the reference layout to reflect this change.

      (22) Page 21, last paragraph. "Further reduce R-factors" (moreover repeated twice) is not correct neither by "further", since here it is rather marginal, nor as a goal; the variations of R-factors are not much significant. A more general statement like "improving fit to experimental data" (keeping in mind density maps) may be safer.

      We agree with the duplicative nature of these statements. We have amended the sentence to now read:

      “Automated detection and refinement of partial-occupancy waters should help improve fit to experimental data, further reduce Rfree [15], and provide additional insights into hydrogen-bond patterns and the influence of solvent on alternative conformations.”

      (23) Page 22. Sub-sections of "Methods" are given in a little bit random order; "Parallelization of large maps" in the middle of the text is an example. Put them in a better order may help.

      We have moved some sections of the Methods around and made better headings by using an underscore to highlight the subsections (Generating and running the qFit test set, qFit improved features, Analysis metrics, Generating synthetic data for resolution dependence).

      (24) Page 24. Non-convex solution is a strange term. There exist non-convex problems and functions and not solutions.

      We agree and we have changed the language to reflect that we present the algorithm with non-convex problems which it cannot solve.

      (25) Page 26, "Metrics". It is worthy to describe explicitly the metrics and not (only) the references to the scripts.

      For all metrics, we provide a sentence or two on what each metric describes. As these metrics are well known in the structural biology field, we do not feel that we need to elaborate on them more.

      (26) Page 26. Multiplying B by occupancy does not have much sense. A better option would be to refer to the density value in the atomic center as occ*(4*pi/B)^1.5 which gives a relation between these two entities.

      We agree and have updated the B-factor figures and metrics to reflect this.
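
      The relation suggested by the reviewer follows from the Gaussian atom model: an isotropic B-factor smears a point atom into a Gaussian whose value at the atomic center scales as (4π/B)^1.5, so the occupancy-weighted peak height is occ*(4π/B)^1.5 (up to the atom's scattering power, which is omitted here). A small sketch, with a hypothetical function name and made-up values:

```python
import math


def peak_density(occupancy, b_factor):
    """Relative density at the atomic center: occ * (4*pi / B) ** 1.5.

    b_factor is the isotropic B-factor in square Angstrom; the atom's
    scattering power is left out, so values are only comparable
    between atoms of the same element.
    """
    return occupancy * (4.0 * math.pi / b_factor) ** 1.5


# Two atoms with the same occupancy * B product (10) ...
full_sharp = peak_density(1.0, 10.0)
half_broad = peak_density(0.5, 20.0)
# ... nevertheless have very different peak heights.
ratio = full_sharp / half_broad
```

      The two atoms would be indistinguishable by the occupancy-times-B product, yet the fully occupied, low-B atom has a peak several times higher, which is why the peak-density form is the more informative metric.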

      (27) Page 40, suppl. Fig. 5. Due to the color choice, it is difficult to distinguish the green and blue curves in the diagram.

      We have amended this by switching the colors of the curves.

      (28) Page 42, Suppl. Fig. 7. (A) How the width of shaded regions is defined? (B) What the blue regions stand for? Input Rfree range goes up to 0.26 and not to 0.25; there is a point at the right bound. (C) Bounds for the "orange" occupancy are inversed in the legend.

      (A) The width of the shaded region denotes the standard deviation among the values at every resolution. We have made this clearer in the caption.

      (B) The blue region denotes the confidence interval for the regression estimate. The size of the confidence interval was set to 95%. We have made this clearer in the caption.

      (C) This has now been fixed.

      The maximum R-free value is 0.2543, which we rounded down to 0.25.

      (29) Page 43. Letters E-H in the legend are erroneously substituted by B-E.

      We apologize for this mistake. It is now corrected.