10,000 Matching Annotations
  1. Jul 2025
    1. Navigating Failures in Pods With Devices

      Summary: Navigating Failures in Pods With Devices

      This article examines the unique challenges Kubernetes faces in managing specialized hardware (e.g., GPUs, accelerators) within AI/ML workloads, and explores current pain points, DIY solutions, and the future roadmap for more robust device failure handling.

      Why AI/ML Workloads Are Different

      • Heavy Dependence on Specialized Hardware: AI/ML jobs require devices like GPUs, with hardware failures causing significant disruptions.
      • Complex Scheduling: Tasks may consume entire machines or need coordinated scheduling across nodes due to device interconnects.
      • High Running Costs: Specialized nodes are expensive; idle time is wasteful.
      • Non-Traditional Failure Models: Standard Kubernetes assumptions (like treating nodes as fungible, or pods as easily replaceable) don’t apply well; failures can trigger large-scale restarts or job aborts.

      Major Failure Modes in Kubernetes With Devices

      1. Kubernetes Infrastructure Failures

        • Multiple actors (device plugin, kubelet, scheduler) must work together; failures can occur at any stage.
        • Issues include pods failing admission, poor scheduling, or pods unable to run despite healthy hardware.
        • Best Practices: Early restarts, close monitoring, canary deployments, use of verified device plugins and drivers.
      2. Device Failures

        • Kubernetes has limited built-in ability to handle device failures—unhealthy devices simply reduce the allocatable count.
        • Lacks correlation between device failure and pod/container failure.
        • DIY Solutions:
          • Node Health Controllers: Restart nodes if device capacity drops, but these can be slow and blunt.
          • Pod Failure Policies: Pods exit with special codes for device errors, but support is limited and mostly for batch jobs.
          • Custom Pod Watchers: Scripts or controllers watch pod/device status, forcibly delete pods attached to failed devices, prompting rescheduling.
      3. Container Code Failures

        • Kubernetes can only restart containers or reschedule pods, with limited expressiveness about what counts as failure.
        • For large AI/ML jobs: Orchestration wrappers restart failed main executables, aiming to avoid expensive full job restart cycles.
      4. Device Degradation

        • Not all device issues result in outright failure; degraded performance now occurs more frequently (e.g., one slow GPU dragging down training).
        • Detection and remediation are largely DIY; Kubernetes does not yet natively express "degraded" status.
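
      The "Custom Pod Watchers" idea listed under Device Failures can be sketched as a simple selection step: given which devices each pod holds and which devices are unhealthy, pick the pods to delete so they get rescheduled. This is a minimal sketch under stated assumptions. `PodRecord`, `pods_to_delete`, and the plain-dict health map are hypothetical names for illustration, not Kubernetes API objects, and a real watcher would still need to obtain this data from the cluster and issue the deletions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodRecord:
    """Hypothetical, simplified view of a pod; not a Kubernetes API object."""
    name: str
    node: str
    device_ids: frozenset


def pods_to_delete(pods, unhealthy_devices):
    """Return names of pods that hold at least one unhealthy device.

    unhealthy_devices maps a node name to the set of unhealthy device IDs
    on that node (an assumed input format).
    """
    selected = []
    for pod in pods:
        bad_on_node = unhealthy_devices.get(pod.node, set())
        # A non-empty intersection means the pod is attached to a failed device.
        if pod.device_ids & bad_on_node:
            selected.append(pod.name)
    return selected


pods = [
    PodRecord("trainer-0", "node-a", frozenset({"gpu-0", "gpu-1"})),
    PodRecord("trainer-1", "node-a", frozenset({"gpu-2"})),
    PodRecord("trainer-2", "node-b", frozenset({"gpu-0"})),
]
unhealthy = {"node-a": {"gpu-1"}}
doomed = pods_to_delete(pods, unhealthy)
print(doomed)
```

      Keeping the selection logic separate from the cluster calls makes it easy to test, which matters because forcibly deleting the wrong pod is expensive for large training jobs.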

      Current Workarounds & Limitations

      • Most device-failure strategies are manual or require high privileges.
      • Workarounds are often fragile, costly, or disruptive.
      • Kubernetes lacks standardized abstractions for device health and device importance at pod or cluster level.

      Roadmap: What’s Next for Kubernetes

      SIG Node and the Kubernetes community are focusing on:

      • Improving core reliability: Ensuring kubelet, device manager, and plugins handle failures gracefully.
      • Making Failure Signals Visible: Initiatives like KEP 4680 aim to expose device health at pod status level.
      • Integration With Pod Failure Policies: Plans to recognize device failures as first-class events for triggering recovery.
      • Pod Descheduling: Enabling pods to be rescheduled off failed/unhealthy devices, even with restartPolicy: Always.
      • Better Handling for Large-Scale AI/ML Workloads: More granular recovery, fast in-place restarts, state snapshotting.
      • Device Degradation Signals: Early discussions on tracking performance degradation, but no mature standard yet.

      Key Takeaway

      Kubernetes remains the platform of choice for AI/ML, but device- and hardware-aware failure handling is still evolving. Most robust solutions are still "DIY," but community and upstream investment is underway to standardize and automate recovery and resilience for workloads depending on specialized hardware.

  2. edtechbooks.org
    1. develop strategies to “debug” or find mistakes in their program

      This is one of the biggest skills taught through coding, because you are constantly problem-solving to make your code work better.

    2. It is not a language like ours, with vocabularies or alphabets, but special commands and abbreviations that are used to write computer software.

      Coding languages are very structured, and specific commands greatly alter the outcome of the code. A single missed letter can cause the whole program to crash.

    3. Coding involves problem-solving, perseverance, collaboration, mathematical logic, and reasoning skills.

      As a computer science teacher, I completely agree that these skills can be learned through coding. Code is a logical structure that demands strong problem-solving skills.

    1. Reviewer #2 (Public review):

      The revised manuscript by Altan et al. includes some real improvements to the visualizations and explanations of the authors' thesis statement with respect to fMRI measurements of pRF sizes. In particular, the deposition of the paper's data has allowed me to probe and refine several of my previous concerns. While I still have major concerns about how the data are presented in the current draft of the manuscript, my skepticism about data quality overall has been much alleviated. Note that this review focuses almost exclusively on the fMRI data as I was satisfied with the quality of the psychophysical data and analyses in my previous review.

      Major Concerns

      (I) Statistical Analysis

      In my previous review, I raised the concern that the small sample size combined with the noisiness of the fMRI data, a lack of clarity about some of the statistics, and a lack of code/data likely combine to make this paper difficult or impossible to reproduce as it stands. The authors have since addressed several aspects of this concern, most importantly by depositing their data. However their response leaves some major questions, which I detail below.

      First of all, the authors claim in their response to the previous review that the small sample size is not an issue because large samples are not necessary to obtain "conclusive" results. They are, of course, technically correct that a small sample size can yield significant results, but the response misses the point entirely. In fact, small samples are more likely than large samples to erroneously yield a significant result (Button et al., 2013, DOI:10.1038/nrn3475), especially when noise is high. The response by the authors cites Schwarzkopf & Huang (2024) to support their methods on this front. After reading the paper, I fail to see how it is at all relevant to the manuscript at hand or the criticism raised in the previous review. Schwarzkopf & Huang propose a statistical framework that is narrowly tailored to situations where one is already certain that some phenomenon (like the adaptation of pRF size to spatial frequency) either always occurs or never occurs. Such a framework is invalid if one cannot be certain that, for example, pRF size adapts in 98% of people but not the remaining 2%. Even if the paper were relevant to the current study, the authors don't cite this paper, use its framework, or admit the assumptions it requires in the current manuscript. The observation that a small dataset can theoretically lead to significance under a set of assumptions not appropriate for the current manuscript is not a serious response to the concern that this manuscript may not be reproducible.

      To overcome this concern, the authors should provide clear descriptions of their statistical analyses and explanations of why these analyses are appropriate for the data. Ideally, source code should be published that demonstrates how the statistical tests were run on the published data. (I was unable to find any such source code in the OSF repository.) If the effects in the paper were much stronger, this level of rigor might not be strictly necessary, but the data currently give the impression of being right near the boundary of significance, and the manuscript's analyses need to reflect that. The descriptions in the text were helpful, but I was only able to approximately reproduce the authors' analyses based on these descriptions alone. Specifically, I attempted to reproduce the Mood's median tests described in the second paragraph of section 3.2 after filtering the data based on the criteria described in the final paragraph of section 3.1. I found that 7/8 (V1), 7/8 (V2), 5/8 (V3), 5/8 (V4), and 4/8 (V3A) subjects passed the median test when accounting for the (40) multiple comparisons. These results are reasonably close to those reported in the manuscript and might just differ based on the multiple comparisons strategy used (which I did not find documented in the manuscript). However, Mood's median test does not test the direction of the difference, just whether the medians are different, so I additionally required that the median sigma of the high-adapted pRFs be greater than that of the low-adapted pRFs. Surprisingly, in V1 and V3, one subject each (not the same subject) failed this part of the test, meaning that they had significant differences between conditions but in the wrong direction. This leaves 6/8 (V1), 7/8 (V2), 4/8 (V3), 5/8 (V4), and 4/8 (V3A) subjects that appear to support the authors' conclusions.
      As the authors mention, however, this set of analyses runs the risk of comparing different parts of cortex, so I also performed Wilcoxon signed-rank tests on the (paired) vertex data for which both the high-adapted and low-adapted conditions passed all the authors' stated thresholds. These results largely agreed with the median test (only 5/8 subjects significant in V1 but 6/8 in V3A, other areas the same, though the two tests did not always agree which subjects had significant differences). These analyses were of course performed by a reviewer with a reviewer's time commitment to the project and shouldn't be considered a replacement for the authors' expertise with their own data. If the authors think that I have made a mistake in these calculations, then the best way to refute them would be to publish the source code they used to threshold the data and to perform the same tests.

      Setting aside the precise values of the relevant tests, we should also consider whether 5 of 8 subjects showing a significant effect (as they report for V3, for example) should count as significant evidence of the effect. If one assumes, as a null hypothesis, that there is no difference between the two conditions in V3 and that all differences are purely noise, then a binomial test across subjects would be appropriate. Even if 6 of 8 subjects show the effect, however (and ignoring multiple comparisons), the p-value of a one-sided binomial test is not significant at the 0.05 level (7 of 8 subjects is barely significant). Of course, a more rigorous way to approach this question could be something like an ANOVA, and the authors use an ANOVA analysis of the medians in the paragraph following their use of Mood's median test. However, ANOVA assumes normality, and the authors state in the previous paragraph that they employed Mood's median test because "the distribution of the pRF sizes is zero-bounded and highly skewed" so this choice does not make sense. The Central Limit Theorem might be applied to the medians in theory, but with only 8 subjects and with an underlying distribution of pRF sizes that is non-negative, the relevant data will almost certainly not be normally distributed. These tests should probably be something like a Kruskal-Wallis ANOVA on ranks.
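
      The binomial argument above can be checked with a few lines of exact arithmetic. This is a minimal sketch that assumes each subject independently shows the effect with probability 0.5 under the null hypothesis; that probability and the helper name `binomial_tail` are assumptions made for illustration, not something stated in the review or the manuscript.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """One-sided p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Tabulate the one-sided p-value for 5 to 8 of 8 subjects showing the effect.
for k in (5, 6, 7, 8):
    print(f"{k}/8 subjects: one-sided p = {binomial_tail(k, 8):.4f}")
```

      Under that assumption, 6 of 8 gives 37/256 (about 0.145) and 7 of 8 gives 9/256 (about 0.035), which matches the statement that 6 of 8 is not significant at the 0.05 level while 7 of 8 barely is.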

      All of the above said, my intuition about the data is currently that there are significant changes to the adapted pRF size in V2. I am not currently convinced that the effects in other visual areas are significant, and I suspect that the paper would be improved if the authors abandoned their claims that areas other than V2 show a substantial effect. Importantly, I don't think this causes the paper to lose any impact; in fact, if the authors agree with my assessments, then the paper might be improved by focusing on V2. Specifically, the authors already discuss psychophysical work related to the perception of texture on pages 18 and 19 and link it to their results. V2 is also implicated in the perception of texture (see, for example, Freeman et al., 2013, DOI:10.1038/nn.3402; Ziemba et al., 2016, DOI:10.1073/pnas.1510847113; Ziemba et al., 2019, DOI:10.1523/JNEUROSCI.1743-19.2019) and so would naturally be the part of the visual cortex where one might predict that spatial frequency adaptation would have a strong effect on pRF size. This neatly connects the psychophysical and imaging sides of this project and could make a very nice story out of the present work.

      (II) Visualizations

      The manuscript's visual evidence regarding the pRF data also remains fairly weak (but I found the pRF size comparisons in the OSF repository and Figure S1 to be better evidence; more in the next paragraph). The first line of the Results section still states, "A visual inspection on the pRF size maps in Figure 4c clearly shows a difference between the two conditions, which is evident in all regions." As I mentioned in my previous review, I don't agree with this claim (specifically, that it is clear). My impression when I look at these plots is of similarity between the maps, and, where there is dissimilarity, of likely artifacts. For example, the splotch of cortex near the upper vertical meridian (ventral boundary) of V1 that shows up in yellow in the upper plot but not the lower plot also has a weirdly high eccentricity and a polar angle near the opposite vertical meridian: almost certainly not the actual tuning of that patch of cortex. If this is the clearest example subject in the dataset, then the effect looks to me to be very small and inconsistently distributed across the visual areas. That said, I'm not convinced that the problem here is the data; rather, I think it's just very hard to communicate a small difference in parameter tuning across a visual area using this kind of side-by-side figure. I think that Figure S2, though noisy (as pRF maps typically are), is more convincing than Figure 4c, personally. For what it's worth, when looking at the data myself, I found that plotting log(𝜎(H) / 𝜎(L)), which will be unstable when noise causes 𝜎(H) or 𝜎(L) to approach zero, was less useful than plotting (𝜎(H) - 𝜎(L)) / (𝜎(H) + 𝜎(L)). This latter quantity will be constrained between -1 and 1 and shows something like a proportional change in the pRF size (and thus should be more comparable across eccentricity).
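
      The difference between the two plotting quantities can be seen in a small numeric sketch. The function names and the example sigma values below are hypothetical, chosen only to show how the log ratio grows without bound as one sigma approaches zero while the normalized difference stays within -1 and 1 for positive sigmas.

```python
import math


def log_ratio(sigma_h, sigma_l):
    """log(sigma(H) / sigma(L)); unbounded as either value approaches zero."""
    return math.log(sigma_h / sigma_l)


def contrast_index(sigma_h, sigma_l):
    """(sigma(H) - sigma(L)) / (sigma(H) + sigma(L)); within [-1, 1] for positive inputs."""
    return (sigma_h - sigma_l) / (sigma_h + sigma_l)


# Hypothetical pRF sizes: two ordinary pairs and two with a near-zero sigma.
pairs = [(2.0, 1.0), (1.0, 2.0), (1.0, 0.001), (0.001, 1.0)]
for sigma_h, sigma_l in pairs:
    print(
        f"H={sigma_h:<6} L={sigma_l:<6} "
        f"log ratio={log_ratio(sigma_h, sigma_l):+.3f} "
        f"contrast={contrast_index(sigma_h, sigma_l):+.3f}"
    )
```

      Both quantities are antisymmetric when the two conditions are swapped, so the choice mainly affects how much noisy near-zero vertices dominate the color scale or axis range.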

      In my opinion, the inclusion of the pRF size comparison plots in the OSF repository and Figure S1 made a stronger case than any of the plots of the cortical surface. I would suggest putting these on log-log plots since the distribution of pRF size (like eccentricity) is approximately exponential on the cortical surface. As-is, it's clear in many plots that there is a big splotch of data in the compressed lower left corner, but it's hard to get a sense for how these should be compared to the upper right expanse of the plots. It is frequently hard to tell whether there is a greater concentration of points above or below the line of equality in the lower left corner as well, and this is fairly central to the paper's claims. My intuition is that the upper right is showing relatively little data (maybe 10%?), but these data are very emphasized by the current plots.
      The authors might even want to consider putting a collection of these scatter-plots (or maybe just subject 007, or possibly all subjects' pRFs on a single scatter-plot) in the main paper and using these visualizations to provide intuitive support for the main conclusions about the fMRI data (where the manuscript currently uses Figure 4c for visual intuition).

      Minor Comments

      (1) Although eLife does not strictly require it, I would like to see more of the authors' code deposited along with the data (especially the code for calculating the statistics that were mentioned above). I do appreciate the simulation code that the authors added in the latest submission (largely added in response to my criticism in the previous reviews), and I'll admit that it helped me understand where the authors were coming from, but it also contains a bug and thus makes a good example of why I'd like to see more of the authors' code. If we set aside the scientific question of whether the simulation is representative of an fMRI voxel (more in Minor Comment 5, below), Figure 1A and the "AdaptaionEffectSimulated.png" file from the repository (https://osf.io/d5agf) imply that only small RFs were excluded in the high-adapted condition and only large RFs were excluded in the low-adapted condition. However, the script provided (SimlatePrfAdaptation.m: https://osf.io/u4d2h) does not do this. Lines 7 and 8 of the script set the small and large cutoffs at the 30th and 70th percentiles, respectively, then exclude everything greater than the 30th percentile in the "Large RFs adapted out" condition (lines 19-21) and exclude anything less than the 70th percentile in the "Small RFs adapted out" condition (lines 27-29). So the figures imply that they are representing 70% of the data but they are in fact representing only the most extreme 30% of the data. (Moreover, I was unable to run the script because it contains hard-coded paths to code in someone's home directory.) Just to be clear, these kinds of bugs are quite common in scientific code, and this bug was almost certainly an honest mistake.

      (2) I also noticed that the individual subject scatter-plots of high versus low adapted pRF sizes on the OSF seem to occasionally have a large concentration of values on the x=0 and y=0 axes. This isn't really a big deal in the plots, but the manuscript states that "we denoised the pRF data to remove artifactual vertices where at least one of the following criteria was met: (1) sigma values were equal to or less than zero ..." so I would encourage the authors to double-check that the rest of their analysis code was run with the stated filtering.

      (3) The manuscript also says that the median test was performed "on the raw pRF size values". I'm not really sure what the "raw" means here. Does this refer to pRF sizes without thresholding applied?

      (4) The eccentricity data are much clearer now with the additional comments from the authors and the full set of maps; my concerns about this point have been met.

      (5) Regarding the simulation of RFs in a voxel (setting aside the bug), I will admit both to hoping for a more biologically-grounded situation and to nonetheless understanding where the authors are coming from based on the provided example. What I mean by biologically-grounded: something like, assume a 2.5-mm isotropic voxel aligned to the surface of V1 at 4° of eccentricity; the voxel would span X to Y degrees of eccentricity, and we predict Z neurons with RFs in this voxel with a distribution of RF sizes at that eccentricity from [reference], etc., eventually demonstrating a plausible pRF size change commensurate to the paper's measurements. I do think that a simulation like this would make the paper more compelling, but I'll acknowledge that it probably isn't necessary and might be beyond the scope here.
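
      The thresholding bug described in Minor Comment 1 can be illustrated with a short sketch. This is not the authors' MATLAB script; it is a hypothetical Python restatement of the two filtering rules as the review describes them, using made-up RF sizes of 1 to 100 and a simple nearest-rank percentile, so that the fraction of data each rule keeps is easy to see.

```python
def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def filter_as_figures_imply(sizes, small_cut, large_cut):
    """Each condition drops only one tail, keeping roughly 70% of the RFs."""
    return {
        "large_adapted_out": [s for s in sizes if s <= large_cut],
        "small_adapted_out": [s for s in sizes if s >= small_cut],
    }


def filter_as_review_describes_script(sizes, small_cut, large_cut):
    """Cutoffs swapped, so each condition keeps only an extreme ~30% tail."""
    return {
        "large_adapted_out": [s for s in sizes if s <= small_cut],
        "small_adapted_out": [s for s in sizes if s >= large_cut],
    }


sizes = list(range(1, 101))  # hypothetical RF sizes
small_cut = nearest_rank_percentile(sizes, 30)
large_cut = nearest_rank_percentile(sizes, 70)

implied = filter_as_figures_imply(sizes, small_cut, large_cut)
scripted = filter_as_review_describes_script(sizes, small_cut, large_cut)
for name in ("large_adapted_out", "small_adapted_out"):
    print(name, "implied:", len(implied[name]), "scripted:", len(scripted[name]))
```

      With these toy values the implied rules keep about 70 of 100 RFs per condition, while the rules as described for the script keep only about 30, which is the discrepancy the comment points out.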

    1. If, as an applicant for international protection, you were assigned to an OCMW (code 207 OCMW), you hold an annex 25 or 26 and a registration certificate (immatriculatieattest). During your asylum procedure you are entitled to social services (maatschappelijke dienstverlening).

      This needs to be updated!

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The study by Osato and Hamada aims at computationally identifying a set of novel putative insulator-associated DNA binding proteins (DBPs) by estimating their contribution to the expression of genes in the same chromosomal region as their binding sites (±1 Mbp from the TSS). To achieve this, the authors leverage a previously published deep learning architecture in which ChIP-seq peaks of DBPs in the TSS of a given gene are used to predict its expression level in four human cell lines.

      Building on this, the authors used another tool called DeepLIFT to evaluate the weight of each DBP binding site on the final gene expression value. Hence they made the assumption that if a given DBP had an insulator function they could restrict the prediction of the gene's expression to the region included between pairs of that DBP's binding sites, and evaluate the pair's motif directionality bias in the distribution of weights. They exemplify their approach's validity by the fact that they can predict the known directionality bias of CTCF/cohesin-bound sites as the highest of the lot, with the F-R orientation of the pairs the most enriched, recapitulating what is already known in the literature, i.e., that F-R chromatin interaction peaks are the most enriched. In addition, they find several new DBPs showing significant directionality bias; hence they could be candidates for insulation activity. They then provide correlation between these putative insulator binding sites and sites of transition between euchromatin and heterochromatin by independently using histone mark and gene expression datasets. This, of course, is not surprising because (a) there is insulation between regions with heterotypic chromatin identities, and (b) it was already known from the first papers describing insulated chromatin domains that their boundaries were well-enriched for active transcription and transcriptional regulators (e.g., Dixon et al., Nature 2012).

      Finally, they use chromatin interaction (looping) sites to check the overlap between CTCF and all other DBPs and define a subset of putative insulator DBPs not overlapping CTCF peaks, suggesting potentially new insulatory mechanisms. These factors were all known transcriptional activators, but this part of the findings carries most of the novelty in the work and has the potential of opening up new directions for research in chromatin organization.

      Overall, the methodology applied here is adequate, clear, and reproducible. The major issue, in our view, is that the entire manuscript's findings rely on the usage of DeepLIFT, a tool which was not benchmarked previously or by the current study. In fact, DeepLIFT is public as regards its code, and also appears as a preprint from 2017 on bioRxiv and was published in the Proceedings of Machine Learning Research conference. Also, this key tool was developed by the Kundaje lab (who produce high quality algorithms), and not by the authors. Therefore, the manuscript is predominantly based on the application of existing workflows to publicly available data. This does not take anything away from the interesting question posed here, but at the same time does not provide the community with any new algorithm/workflow.

      Finally, although I appreciate that the authors are purely computational and have likely no capacity for experimental validation of their claims of new DBPs having insulator roles, I would assume that there are RNA-seq and/or ChIP-seq data out there produced after knockdown of one or more of these DBPs that show directional positioning. Using this kind of data, effects on gene expression can at least be tested in regard to the authors' predictions. Moreover, in terms of validation, Figure 6 should be expanded to incorporate analysis of DBPs not overlapping CTCF/cohesin in chromatin interaction data that is important and potentially more interesting than the simple DBPs enrichment reported in the present form of the figure. Critically, I would like to see use of Micro-C/Hi-C data and ChIP-seq from these factors, where insulation scores around their directionally-bound sites show some sort of an effect like that presumed by the authors - and many such datasets are publicly-available and can be put to good use here.

      As secondary issues, we would point out that:

      • The suggested alternative transcripts function, also highlighted in the manuscript's abstract, is only supported by visual inspection of a few cases for several putative DBPs. I believe this is insufficient to support what looks like one of the major claims of the paper when reading the abstract, and a more quantitative and genome-wide analysis must be adopted, although the authors mention it as just an 'observation'.
      • Figure 1 serves no purpose in my opinion and can be removed, while figures can generally be improved (e.g., the browser screenshots in Figs 4 and 5) for interpretability from readers outside the immediate research field.
      • Similarly, the text is rather convoluted in places and should be revised for clarity with less specialized readers in mind.

      Significance

      The scientific novelty of the work lies primarily in the identification of a set of DBPs that are proposed to confer insulator activity genome-wide. This has been long sought after in human data (whilst it is well understood and defined in Drosophila). The authors produce a quantitative ranking of the putative insulation effect of these DBPs and, most importantly, go on to identify a smaller subset that are apparently non-overlapping with CTCF-cohesin loop anchors; the presence of strong motif orientation biases in many DBPs can also be of broad interest, especially those that cannot be trivially ascribed to the loop extrusion process.

      However, although these findings open the way for speculation on multiple insulation mechanisms via proteins with multiple regulatory functions, the manuscript provides no experimental or computational means to test the proposed roles of these DBPs and, as such, this limits the potential impact of the work and mostly targets researchers in the field of genome organization who can test these findings. Having said this, if validated, this work can significantly broaden our understanding of how chromatin is organized in 3D nuclear space.

      I typically identify myself to the authors: A. Papantonis, expertise in 3D genome architecture, chromatin biology, and genomics/bioinformatics.

    1. Now let's consider two popular recipes for reproducibility

      Imagine if there were a readily accessible World Wide Wruntime that anyone with any commodity computing device could use to run arbitrary code (and its corresponding documentation) written by other people...

    2. Alice: I couldn't compile your code. Look at this error message! Bob: It works for me! You use Debian 12? I still run Debian 9. That's surely what makes the difference. But I also have good news: I managed to run your code on my machine. The only problem is that... I get 0.8 nm. Alice: I use libode version 3.4. The documentation says it must be compiled with gcc 10 or later. You probably have an older gcc. Bob: Uhhh... Well... I will have to install a virtual machine with Debian 12, and you with Debian 9. Shall we meet again in a week?

    1. From Wikipedia, the free encyclopedia: A closed curve divides the plane into two regions

      hmm

    1. Author Response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      (1) The use of single-cell RNA and TCR sequencing is appropriate for addressing potential relationships between gene expression and dual TCR.

      Thank you for your detailed review and suggestions. The main advantages of scRNA+TCR-seq are as follows: (1) It enables comparative analysis of features such as the ratio of single TCR paired T cells to dual TCR paired T cells at the level of a large number of individual T cells, through mRNA expression of the α and β chains. In the past, this analysis was limited to a small number of T cells, requiring isolation of single T cells, PCR amplification of the α and β chains, and Sanger sequencing; (2) While analyzing TCR paired T cell characteristics, it also allows examination of mRNA expression levels of transcription factors in corresponding T cells through scRNA-seq.

      (2) The data confirm the presence of dual TCR Tregs in various tissues, with proportions ranging from 10.1% to 21.4%, aligning with earlier observations in αβ T cells.

      Thank you very much for your detailed review and suggestions. Early studies on dual TCR αβ T cells have been very limited in number, with reported proportions of dual TCR T cells ranging widely from 0.1% to over 30%. In contrast, scRNA+TCR-seq can monitor over 5,000 single and paired TCRs, including dual paired TCRs, in each sample, enabling more precise examination of the overall proportion of dual TCR αβ T cells. It is important to note that our analysis focuses on T cells paired with functional α and β chains, while T cells with non-functional chain pairings and those with a single functional chain without pairing were excluded from the total cell proportion analysis. Previous studies generally lacked the ability to determine expression levels of specific chains in T cells without dual TCR pairings.

      (3) Tissue-specific patterns of TCR gene usage are reported, which could be of interest to researchers studying T cell adaptation, although these were more rigorously analyzed in the original works.

      Thank you very much for your detailed review and suggestions. T cell subpopulations exhibit tissue specificity; thus, we conducted a thorough investigation into Treg cells from different tissue sites. This study builds upon the original by innovatively analyzing the differences in VDJ rearrangement and CDR3 characteristics of dual TCR Treg cells across various tissues. This provides new insights and directions for the potential existence of “new Treg cell subpopulations” in different tissue locations. The results of this analysis suggest the necessity of conducting functional experiments on dual TCR Treg cells at both the TCR protein level and the level of effector functional molecules.

      (4) Lack of Novelty: The primary findings do not substantially advance our understanding of dual TCR expression, as similar results have been reported previously in other contexts.

      Thank you for your detailed review and suggestions. Early research on dual TCR T cells primarily relied on transgenic mouse models and in vitro experiments, using limited TCR alpha chain or TCR beta chain antibody pairings. Flow cytometry was used to analyze a small number of T cells to estimate dual TCR T cell proportion. No studies have yet analyzed dual TCR Treg cell proportion, V(D)J recombination, and CDR3 characteristics at high throughput in physiological conditions. The scRNA+TCR-seq approach offers an opportunity to conduct extensive studies from an mRNA perspective. With high-throughput advantages of single-cell sequencing technology, researchers can analyze transcriptomic and TCR sequence characteristics of all dual TCR Treg cells within a study sample, providing new ideas and technical means for investigating dual TCR T cell proportions, characteristics, and origins under different physiological and pathological states.

      (5) Incomplete Evidence: The claims about tissue-specific differences lack sufficient controls (e.g., comparison with conventional T cells) and functional validation (e.g., cell surface expression of dual TCRs).

      Thank you for your detailed review and suggestions. This study indeed only analyzed dual TCR Treg cells from different tissue locations based on the original manuscript, without a comparative analysis of other dual TCR T cell subsets corresponding to these tissue locations. The main reason for this is that, in current scRNA+TCR-seq studies of different tissue locations, unless specific T cell subsets are sorted and enriched, the number of T cells obtained from each subset is very low, making a detailed comparative analysis impossible. In the results of the original manuscript, we observed a relatively high proportion of dual TCR Treg cell populations in various tissues, with differences in TCR composition and transcription factor expression. Following the suggestions, we have included additional descriptions in R1, citing the study by Tuovinen et al., which indicates that the proportion of dual TCR Tregs in lymphoid tissues is higher than other T cell types. This will help understand the distribution characteristics of dual TCR Treg cells in different tissues and provide a basis for mRNA expression levels to conduct functional experiments on dual TCR Treg cells in different tissue locations.

      (6) Methodological Weaknesses: The diversity analysis does not account for sample size differences, and the clonal analysis conflates counts and clonotypes, leading to potential misinterpretation.

      We thank you for your review and suggestions. In response to your question about whether the diversity analysis considered the sample size issue, we conducted a detailed review and analysis. This study utilized the inverse Simpson index to evaluate TCR diversity of Treg cells. A preliminary analysis compared the richness and evenness of single TCR Treg cell and dual TCR Treg cell repertoires. The two datasets analyzed were from four mouse samples with consistent processing and sequencing conditions. However, when analyzing single TCR Tregs and dual TCR Tregs from various tissues, differences in detected T cell numbers by sequencing cannot be excluded from the diversity analysis. Following recommendations, we provided additional explanations in R1: CDR3 diversity analysis indicates TCR composition of dual TCR Treg cells exhibits diversity, similar to single TCR Treg cells; however, diversity indices of single TCR Tregs and dual TCR Tregs are not suitable for statistical comparison. Regarding the "clonal analysis" you mentioned, we define clonality based on unique TCR sequences; cells with identical TCR sequences are part of the same clone, with ≥2 counts defined as expansion. For example, in Blood, there are 958 clonal types and 1,228 cells, of which 449 are expansion cells. In R1, we systematically verified and revised clonal expansion cells across all tissue samples according to a unified standard.
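
      The quantities discussed in this response can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each cell is represented by a clonotype identifier (a hypothetical string label), computes the inverse Simpson index, counts expanded cells using the stated rule (clonotypes with ≥2 cells), and adds a simple down-sampling step as one possible way to compare repertoires of unequal size. The function names and the down-sampling approach are assumptions made for illustration.

```python
from collections import Counter
import random


def inverse_simpson(clonotypes):
    """Inverse Simpson index: 1 / sum(p_i^2) over clonotype frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return 1.0 / sum((n / total) ** 2 for n in counts.values())


def expansion_summary(clonotypes, min_count=2):
    """Count clonotypes, cells, and cells that belong to expanded clonotypes.

    A clonotype is treated as expanded when it is seen in at least
    `min_count` cells (>= 2 in the response above).
    """
    counts = Counter(clonotypes)
    expanded_cells = sum(n for n in counts.values() if n >= min_count)
    return {
        "clonotypes": len(counts),
        "cells": sum(counts.values()),
        "expanded_cells": expanded_cells,
    }


def rarefied_inverse_simpson(clonotypes, depth, n_iter=100, seed=0):
    """Average inverse Simpson index over random subsamples of `depth` cells.

    Down-sampling every repertoire to the same depth is one common way to
    reduce the effect of unequal cell numbers (an assumption here, not a
    method reported in the manuscript).
    """
    cells = list(clonotypes)
    if depth > len(cells):
        raise ValueError("depth exceeds the number of cells")
    rng = random.Random(seed)
    values = [inverse_simpson(rng.sample(cells, depth)) for _ in range(n_iter)]
    return sum(values) / len(values)


if __name__ == "__main__":
    toy = ["A", "A", "A", "B", "C"]  # hypothetical clonotype labels
    print(inverse_simpson(toy))
    print(expansion_summary(toy))
```

      Reporting clonotype counts, cell counts, and expanded-cell counts as separate fields keeps the distinction between clonotypes and cells explicit, which is the point raised by the reviewer.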

      (7) Insufficient Transparency: The sequence analysis pipeline is inadequately described, and the study lacks reproducibility features such as shared code and data.

      Thank you for your review and suggestions. Based on the original manuscript, we have made corresponding detailed additions in R1, providing further elaboration on the analysis process of shared data, screening methods, research codes, and tools. This aims to offer readers a comprehensive understanding of the analytical procedures and results.

      (8) Weak Gene Expression Analysis: No statistical validation is provided for differential gene expression, and the UMAP plots fail to reveal meaningful clustering patterns.

      Thank you very much for your review and suggestions. Based on your recommendations, we conducted an initial differential expression analysis of the top 10 mRNA molecules in single TCR Treg and dual TCR Treg cells using the DESeq2 R package in R1, with statistical significance determined by Padj < 0.05. Regarding the clustering patterns in the UMAP plots, since the analyzed samples consisted of isolated Treg cell subpopulations that highly express immune suppression-related genes, we did not perform a more detailed analysis of subtypes and expression gene differences. This study primarily aims to explore the proportions of single TCR and dual TCR Treg cells from different tissue sources, as well as the characteristics of CDR3 composition, with a focus on showcasing the clustering patterns of samples from different tissue origins and various TCR pairing types.
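
      For readers unfamiliar with the Padj threshold: DESeq2 by default reports p-values adjusted with the Benjamini-Hochberg procedure. The sketch below shows only that adjustment step in plain Python, assuming a list of raw p-values (hypothetical numbers in the example); it does not reproduce DESeq2's model fitting or its independent filtering.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    padj = benjamini_hochberg(raw)
    significant = [p < 0.05 for p in padj]
    print(padj, significant)
```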

      (9) A quick online search reveals that the same authors have repeated their approach of reanalysing other scientists' publicly available scRNA-VDJ-seq data in six other publications. In other words, the approach used here seems to be focused on quick re-analyses of publicly available data without further validation and/or exploration.

      Thank you for your review and suggestions. Most current studies utilizing scRNA+TCR-seq overlook analysis of TCR pairing types and related research on single TCR and dual TCR T cell characteristics. Through in-depth analysis of shared scRNA+TCR-seq data from multiple laboratories, we discovered a significant presence of dual TCR T cells in high-throughput T cell research results that cannot be ignored. In this study, we highlight the higher proportion of dual TCR Tregs in different tissue locations, which exhibits a certain degree of tissue specificity, suggesting these cells may participate in complex functional regulation of Tregs. This finding provides new ideas and a foundation for further research into dual TCR Treg functions. However, as reviewers pointed out, findings from scRNA+TCR-seq at the mRNA level require additional functional experiments on dual TCR T cells at the protein level. We have supplemented our discussion in R1 based on these suggestions.

      Reviewer #2 (Public review):

      (1) The existence of dual TCR expression by Tregs has previously been demonstrated in mice and humans (Reference #18 and Tuovinen. 2006. Blood. 108:4063; Schuldt. 2017. J Immunol. 199:33, both omitted from references). The presented results should be considered in the context of these prior important findings.

      Thank you very much for your review and suggestions. Based on the original manuscript, we have supplemented our reading, understanding, and citation of closely related literature (Tuovinen, 2006, Blood, 108:4063 (line 44, line 175 in R1); Schuldt, 2017, J Immunol, 199:33 (line 44, line 178 in R1)). We once again appreciate the valuable comments from the reviewers, and we will refer to these in our subsequent dual TCR T cell research.

      (2) This demonstration of dual TCR Tregs is notable, though the authors do not compare the frequency of dual TCR co-expression by Tregs with non-Tregs. This limits interpreting the findings in the context of what is known about dual TCR co-expression in T cells.

      Thank you very much for your review and suggestions. This analysis is primarily based on the scRNA+TCR-seq study of sorted Treg cells, in which we determined the proportions and distinguishing features of dual TCR Treg cells in different tissue sites. Given the diversity and complexity of Treg function, conducting a comparative analysis of the origins of dual TCR Treg cells and non-Treg cells with dual TCRs will be a meaningful direction. Currently, peripheral induced Treg cells can originate from the conversion of non-Treg cells; however, little is known about the sources and functions of dual TCR Treg cell subsets in both central and peripheral sites. In R1, we have supplemented the discussion regarding the possible origins and potential applications of the "novel dual TCR Treg" subsets.

      (3) Comparison of gene expression by single- and dual TCR Tregs is of interest, but as presented is difficult to interpret. Statistical analyses need to be performed to provide statistical confidence that the observed differences are true.

      Thank you very much for your review and suggestions. Based on your recommendations, we performed an initial differential expression analysis of the top 10 mRNA molecules in single TCR Treg and dual TCR Treg cells using the DESeq2 R package in R1, with a statistical significance threshold of Padj<0.05 for comparisons.

      (4) The interpretations of the gene expression analyses are somewhat simplistic, focusing on the single-gene expression of some genes known to have a function in Tregs. However, the investigators miss an opportunity to examine larger patterns of coordinated gene expression associated with developmental pathways and differential function in Tregs (Yang. 2015. Science. 348:589; Li. 2016. Nat Rev Immunol. 16:220; Wyss. 2016. Nat Immunol. 17:1093; Zemmour. 2018. Nat Immunol. 19:291).

      Thank you for your review and suggestions. This study is based on publicly available scRNA+TCR-seq data from different organ sites generated by the original authors, focusing on sorted and enriched Treg cells within each tissue sample. However, there was no corresponding research on other cell types in each tissue sample, preventing analysis of other cells and factors involved in the development and differentiation of single TCR Treg and dual TCR Treg cells. The literature suggested by the reviewer indicates that the development, differentiation, and function of Treg cells have been extensively studied, resulting in significant advances. It also highlights the complexity and diversity of Treg origins and functions. This research aims to investigate "novel dual TCR Treg cell subpopulations" that may exhibit tissue-specific differences found in the original authors' studies of Treg cells across different organ sites. This suggests further experimental research into their development, differentiation, origin, and functional gene expression as an important direction, which we have supplemented in the discussion section of R1.

      Reviewer #3 (Public review):

      (1) Definition of Dual TCR and Validity of Doublet Removal: This study analyzes Treg cells with Dual TCR, but it is not clearly stated how the possibility of doublet cells was eliminated. The authors mention using DoubletFinder for detecting doublets in scRNA-seq data, but is this method alone sufficient? We strongly recommend reporting the details of doublet removal and data quality assessment in the Supplementary Data.

      Thank you very much for your review and suggestions. In the analysis of the shared scRNA+TCR-seq data across multiple laboratories, as you mentioned, this study employed the DoubletFinder R package to exclude suspected doublets. Additionally, we used the nCount values of individual cells (i.e., the total sequencing reads or UMI counts for each cell) as auxiliary parameters to further optimize the assessment of cell quality. Because doublets may contain gene expression information from two or more cells, their nCount values are often abnormally high. In this study, all cells included in the analysis had nCount values not exceeding 20,000. Among the five tissue sample datasets, we further utilized hashtag oligonucleotide (HTO) labeling to eliminate doublets and negative cells. HTO labeling tags each cell with a barcode that distinguishes cells from different tissue sources; by analyzing the HTO labels, doublets and negative cells can be accurately identified. After the removal of chimeric cells, all samples exhibited T cells that possessed two or more TCR clones. This phenomenon validates the reliability of the methodological approach employed in this study and indicates that the analytical results accurately reflect the proportion of dual TCR T cells. Based on the recommendations of the reviewers, we have supplemented and clarified the methods and discussion sections in the manuscript. It is particularly noteworthy that in our analysis, the discussed dual TCR Treg cells and single TCR Treg cells specifically refer to those T cells that possess both functional α and β chains, which are capable of forming a TCR. We have excluded from this analysis any Treg cells that possess only a single functional α or β chain and do not form TCR pairs, as well as those Treg cells in which the α or β chains involved in TCR pairing are non-functional.
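
      The filtering and classification logic described in this response can be summarized in a short sketch. It is a hypothetical illustration, not the authors' code: the `Cell` record, its field names, and the HTO call labels ("singlet", "doublet", "negative") are assumptions, while the nCount ceiling of 20,000 and the requirement for at least one functional α and one functional β chain come from the text above.

```python
from dataclasses import dataclass

MAX_NCOUNT = 20_000  # upper nCount bound stated in the response


@dataclass
class Cell:
    barcode: str
    n_count: int           # total UMI counts for the cell
    hto_call: str          # assumed labels: "singlet", "doublet", "negative"
    productive_alpha: int  # number of functional TCR alpha chains
    productive_beta: int   # number of functional TCR beta chains


def passes_qc(cell):
    """Keep HTO singlets whose nCount does not exceed the ceiling."""
    return cell.n_count <= MAX_NCOUNT and cell.hto_call == "singlet"


def tcr_class(cell):
    """Classify a cell as 'single', 'dual', or 'excluded'.

    Cells lacking a functional alpha or beta chain cannot form a TCR pair
    and are excluded; one alpha plus one beta is a single TCR cell; any
    additional functional chain makes it a dual TCR cell.
    """
    a, b = cell.productive_alpha, cell.productive_beta
    if a == 0 or b == 0:
        return "excluded"
    if a == 1 and b == 1:
        return "single"
    return "dual"


def pairing_label(cell):
    """Build a label such as 'A1+A2+B1' from the chain counts."""
    alphas = [f"A{i}" for i in range(1, cell.productive_alpha + 1)]
    betas = [f"B{i}" for i in range(1, cell.productive_beta + 1)]
    return "+".join(alphas + betas)


if __name__ == "__main__":
    cells = [
        Cell("c1", 5_000, "singlet", 2, 1),
        Cell("c2", 25_000, "singlet", 1, 1),
        Cell("c3", 8_000, "singlet", 1, 0),
    ]
    for c in cells:
        if passes_qc(c):
            print(c.barcode, tcr_class(c), pairing_label(c))
```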

      (2) In Figure 3D, the proportion of Dual TCR T cells (A1+A2+B1+B2) in the skin is reported to be very high compared to other tissues. However, in Figure 4C, the proportion appears lower than in other tissues, which may be due to contamination by non-Tregs. The authors should clarify why it was necessary to include non-Tregs as a target for analysis in this study. Additionally, the sensitivity of scRNA-seq and TCR-seq may vary between tissues and may also be affected by RNA quality and sequencing depth in skin samples, so the impact of measurement bias should be assessed.

      We deeply appreciate your review and constructive comments. Based on the original manuscript, we have further supplemented and elaborated on the uniqueness and relative proportions of dual TCR T cell pairs in skin tissue samples in R1. Due to the scarcity of T cells in skin samples, we included some non-Treg cells during single-cell RNA sequencing and TCR sequencing to obtain a sufficient number of cells for effective analysis. The presence of non-regulatory T cells may indeed impact the statistical representation of dual TCR T cells as well as the related comparative analyses, as noted by the reviewer. T cells with A1+A2+B1+B2 type dual TCR pairings are primarily found within the non-regulatory T cell population in the skin. In response to this point, we have provided a detailed explanation of this analytical result in the revised manuscript R1. Furthermore, concerning the two datasets included in the study, we conducted a comparative analysis in R1, exploring how factors such as sequencing depth at different tissue sites might introduce biases in our findings, which we have thoroughly elaborated upon in the discussion section. We thank you once again for your valuable suggestions.

      (3) Issue of Cell Contamination: In Figure 2A, the data suggest a high overlap between blood, kidney, and liver samples, likely due to contamination. Can the authors effectively remove this effect? If the dataset allows, distinguishing between blood-derived and tissue-resident Tregs would significantly enhance the reliability of the findings. Otherwise, it would be difficult to separate biological signals from contamination noise, making interpretation challenging.

      We thank you for your review and suggestions. We have carefully verified data sources for tissues such as blood, kidneys, and liver. In the study by Oliver T et al., various techniques were employed to differentiate between leukocytes from blood and those from tissues, ensuring accurate identification of leukocytes from tissue samples. First, anti-CD45 antibody was injected intravenously to label cells in the vasculature, verifying that analyzed cells were indeed resident in the tissue. Second, prior to dissection and cell collection, authors performed perfusion on anesthetized mice to reduce contamination of tissue samples by leukocytes from the vasculature. Additionally, during single-cell sequencing, authors utilized HTO technology to avoid overlap between cells from different tissues.

      Analysis of the scRNA+TCR-seq data shared by the original authors revealed highly overlapping TCR sequences in blood, kidney, and liver, despite distinct cell labels associated with each tissue. While these techniques minimize overlap of cells from different sources, they cannot completely rule out the potential impact of this technical issue. As suggested, we have provided additional clarification in R1 of the manuscript regarding this phenomenon of high overlap in the kidney, liver, and blood, indicating that the possibility of Treg migration from blood to kidney and liver cannot be entirely excluded.

      (4) Inconsistency Between CDR3 Overlap and TCR Diversity: The manuscript states that Single TCR Tregs have a higher CDR3 overlap, but this contradicts the reported data that Dual TCR Tregs exhibit lower TCR diversity (higher 1/DS score). Typically, when TCR diversity is low (i.e., specific clones are concentrated), CDR3 overlap is expected to increase. The authors should carefully address this discrepancy and discuss possible explanations.

      Thank you for your review and suggestions. Regarding the potential relationship between CDR3 overlap and TCR diversity, in samples with consistent sequencing depth, lower diversity indeed corresponds to a higher proportion of CDR3 overlap. In our analysis of scRNA+TCR-seq data, we found that single TCR Tregs exhibit both higher diversity and CDR3 overlap, seemingly presenting contradictory analytical results (i.e., dual TCR Tregs show lower TCR diversity and CDR3 overlap). In R1, we supplemented the analysis of possible reasons: the presence of multiple TCR chains in dual TCR Treg cells may lead to a higher uniqueness of CDR3 due to multiple rearrangements and selections, resulting in lower CDR3 overlap; the lower diversity of dual TCR Tregs may be related to the number of T cells sequenced in each sample. The CDR3 diversity analysis in this study merely suggests that the TCR composition of dual TCR Treg cells is diverse, similar to that of single TCR Tregs. However, the diversity indices of single TCR Tregs and dual TCR Tregs are not suitable for statistical comparative analysis. A more in-depth and specific analysis of the diversity and overlap of the VDJ recombination mechanisms and CDR3 composition in dual TCR Tregs during development will be an important technical means to elucidate the function of dual TCR Treg cells.
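
      To make the distinction between overlap and diversity concrete, the sketch below computes a Jaccard index on sets of CDR3 sequences from two samples. The choice of the Jaccard index is an assumption for illustration (the overlap measure used in the manuscript is not specified here), and the sequences are made-up toy strings. Because overlap is computed between samples while diversity is computed within one sample, the two quantities need not move together.

```python
def jaccard_overlap(cdr3_a, cdr3_b):
    """Shared unique CDR3 sequences divided by all unique CDR3 sequences."""
    set_a, set_b = set(cdr3_a), set(cdr3_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Toy CDR3 strings, not real data.
    blood = ["CASSL", "CASSQ", "CASSP"]
    liver = ["CASSL", "CASSQ", "CASRT"]
    print(jaccard_overlap(blood, liver))
```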

      (5) Functional Evaluation of Dual TCR Tregs: This study indicates gene expression differences among tissue-resident Dual TCR T cells, but there is no experimental validation of their functional significance. Including functional assays, such as suppression assays or cytokine secretion analysis, would greatly enhance the study's impact.

      We sincerely appreciate your review and suggestions. In this analysis of scRNA+TCR-seq data, we found a higher proportion of dual TCR Treg cells in different tissue sites, with characteristics that differed between tissues. Furthermore, we conducted a comparative analysis of the homogeneity and heterogeneity between single TCR Treg and dual TCR Treg cells. This result provides a foundation for further research on the origin and characteristics of dual TCR Treg cells in different tissue sites, offering new insights for understanding the complexity and functional diversity of Treg cells. Based on your suggestions, we have supplemented R1 with the feasibility of further exploring the functions of tissue-resident dual TCR T cells and the necessity for potential application research.

      (6) Appropriateness of Statistical Analysis: When discussing increases or decreases in gene expression and cell proportions (e.g., Figure 2D), the statistical methods used (e.g., t-test, Wilcoxon, FDR correction) should be explicitly described. They should provide detailed information on the statistical tests applied to each analysis.

      Thank you for your review and suggestions. Based on the original manuscript, we have supplemented the specific statistical methods for the differences in cell proportions and gene expression in R1.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The authors provide a detailed characterization of the tumor microenvironment (TME) of 91 ovarian cancer patients, broken down into long- and short-term survivors (post 5 years). The focus is on the role of a subgroup of T cells, gamma/delta (γδ) T cells, with reported anti- but also pro-tumorigenic properties. Prior work of the lab has established a link between a subgroup of γδ T cells expressing CD73 and poor prognosis, due to the ability of these cells to produce immunosuppressive cytokines, such as IL10 or IL8, and the production of adenosine, by CD73, in the micromilieu. The data is further backed up by the analysis of fresh tumor specimens and tissue culture work.

      Here they continue this story by investigating the TME using tumor microarrays (91 samples), single cell RNA seq (12 patients), imaging mass cytometry (> 30 samples) and flow cytometry (for confirmatory purposes) to define cellular neighborhoods of CD73+ and CD73- γδ T cells. This revealed differences in cellular composition and spatial transcriptome analysis further helped to define the transcriptomes in γδ T cells, cancer cells and cancer associated fibroblasts.

      The authors conclude that in ovarian cancer γδ T cells expressing CD73 dampen anti-tumor immunity and propose detection and evaluation of CD73+ γδ T cells as a prognostic marker.

      The manuscript is well written, and despite its descriptive nature, easy to follow. Data is presented in a clear and easy to read fashion.

      Reviewer #1 (Significance (Required)):

      Using a well characterized cohort of ovarian cancer patients with detailed clinical follow up the authors report on the predictive power of a subset of γδ T cells expressing CD73, with immune suppressive / regulatory capacity, reading out patient survival in high grade serous ovarian cancer, a still deadly disease. As such the identification of reliable markers predicting survival is a clear medical need. These findings contrast others made in different solid cancers, suggesting tumor type specific differences, which are only starting to emerge, but are of clear clinical relevance.

      What is unclear to me and needs to be addressed is whether these patient specimens were taken before or after initial therapy, whether the samples have been stratified according to the treatment that they got (assuming it will be mostly platinum compounds, but maybe not), and what the p53 status of the tumors is. If genetics are available, this would help to add some granularity to the study, which, as it stands, is largely descriptive, even though with extremely high resolution. This data should be available and could be integrated.

      We thank the reviewer for this insightful and constructive comment. We agree that clinical context and treatment stratification are essential to strengthen the interpretation and translational value of our findings.

      We confirm that all tumor samples used in this study were obtained prior to any systemic treatment, i.e., before first-line chemotherapy, at the time of the diagnostic biopsy. This information has now been clearly stated in the Methods and Results section (page 4, line 103) and also in Table S1.

      Although our primary aim was not to evaluate correlations with mutational status, we recognize the critical role that tumor genetics play in shaping the immune microenvironment. Using available clinical genomics data, we found that the TP53 mutational status of our cohort aligns with that of previous analyses. As expected for high-grade serous ovarian cancer (HGSOC), nearly all tumors exhibited TP53 mutations (present in 95% of patients). Due to the lack of variability in TP53 status, no meaningful stratification was observed based on this factor. This information has been added to the Materials and Methods section (page 4, lines 104 to 106).

      Some minor issues

      • I would stick to CD73, and not mix it with NT5E, which is confusing at first (Fig 2).

      We appreciate this suggestion. To clarify the nomenclature and avoid confusion, we have consistently indicated throughout the text and figure legends that NT5E refers to the CD73 gene.

      • I would ask to compare the overall survival of CD73+ between densities - is it still significantly different in fig 1 - meaning is it about density, CD73 expression, or both. Comparing survival of tumors with a low density of CD73+ γδ T cells does not seem to be different from those having a low density of CD73- γδ T cells, which could be considered in data interpretation. Same for high density tumors.

      In the manuscript, the term “density” specifically refers to the density of γδ T cells and not the density of CD73 molecules expressed by these cells. Additionally, it is not feasible to conduct a density analysis of molecules using the data obtained from immunofluorescence (IF) staining of sample sections.

      Kaplan-Meier analyses were performed to assess patient survival based on the density of total γδ T cells, as well as the subsets of CD73⁺ and CD73⁻ γδ T cells. The results indicate that a higher density of γδ T cells is associated with poorer patient survival, with a more pronounced effect seen in those with a high density of CD73⁺ γδ T cells compared to those with CD73⁻ γδ T cells.

      As the reviewer pointed out, patients with a low density of CD73⁺ γδ T cells do not show significantly different survival outcomes compared to those with a low density of CD73⁻ γδ T cells (IC50 for low CD73⁺ = 6.0 years vs. IC50 for low CD73⁻ = 6.2 years). In response, we have revised the corresponding sentence in the text and included the IC50 values for greater clarity and informativeness (page 9).
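
      For readers who want to see how a Kaplan-Meier comparison of this kind is built, the following is a minimal sketch of the product-limit estimator in plain Python. It assumes follow-up times in years and event flags (1 = death observed, 0 = censored); the data in the example are invented, and the median read off the curve is shown only as one common summary statistic, not as the value reported in the manuscript.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


if __name__ == "__main__":
    # Invented follow-up data for one hypothetical patient group.
    times = [1, 2, 3, 4, 5]
    events = [1, 0, 1, 1, 0]
    curve = kaplan_meier(times, events)
    print(curve, median_survival(curve))
```

      Running the estimator separately for each density group and comparing the resulting curves mirrors the grouping described above; a formal comparison would additionally need a test such as the log-rank test, which is not sketched here.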

      • figure 1, caption should include the word "patients" at the end, I guess.

      The modification has been done.

      • labelling and font can be improved in many panels, eg. the dot plots in Fig 2, panel B, right, same for panel C and D

      We appreciate the feedback on figure presentation. We have now updated Figure 2 with improved labeling, consistent font size, and enhanced resolution to ensure better readability across all panels, particularly panels B–D. The revised figure has been updated in the main manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript ("Deciphering the tumor-infiltrating CD73+ regulatory γδ T cell ecosystem associated with poor survival of patients with ovarian cancer"), Chabab et al. report on the phenotype and location of CD73+ γδ T cells in ovarian cancer. CD73+ γδ T cells can be immunosuppressive via the production of cytokines (IL-8, IL-10) and the expression of PD-L1. Here, the authors investigated the phenotype and location of CD73+ and -neg γδ T cells in ovarian cancers with a particular focus on the cells surrounding the γδ T cells in the tumour.

      Overall, the study is informative and well-performed. However, the way some of the data are presented does not allow to fully evaluate them. Besides this, this reviewer only has some minor comments.

      General comments:

      • The data provided in this manuscript are descriptive/correlative, and as such, causation cannot be inferred. Therefore, the language needs to reflect this; statements like "we investigated the impact of CD73+ regulatory γδ T cells in ovarian cancer" (L89) and "CD73+ γδ T cells were in close contact with more aggressive tumor cells" (L426), among others, are incorrect without functional data. The authors are advised to adjust the text throughout.

      We thank the reviewer for this thoughtful point. We have amended the text to make it consistent with the data.

      • Please make the figure legends self-explanatory without the need to search for the information in the M&M. For example, the graphs in fig1 and 3 contain many dots, but it is not explained what these dots represent. Please also add n for each experiment shown and state how often the experiment was performed independently.

      As requested by the reviewer, we have revised the figure legends to make them more explicit. We have indicated the number of biological replicates (n) and how many times each experiment was performed independently. This information has been added to each legend where consistent and relevant, to ensure clarity and reproducibility.

      • It would be helpful for the reader if abbreviations introduced in the M&M were also explained the first time they appeared in the results section.

      This point has been addressed as requested by the reviewer.

      • Please explain all abbreviations, e.g. FIGO, CST, NT5E, etc.
      • L235: typo 'that' instead of 'than'; L258 'reduced'; L259 'fig1d-f'; L451f twice 'CD73+'; 'naive' instead of 'naïve' throughout; SF2 legend: '2f' instead of '3f', SF9 legend: '1.105'.
      • L280ff: "Tumor cells ... were the most important cell type" - it may be clearer to use 'most frequent';

      All these points have been addressed.

      M&M - Please be consistent, if you provide catalogue numbers or dilutions (antibody, reagents) [which is good, maybe even adding the RRID number], do so for all items.

      This point has been addressed as requested by the reviewer.

      • The M&M does not state for the CAFs how long they were cultured before the supernatant was taken for the cytokine measurements.

      This point has been added in the M&M section.

      • For the IL-6 ELISA, it is stated that the "cells were harvested"; what happened to them, and how do you get any SN from these cells?

      We have amended the protocol of the IL-6 ELISA in the M&M section for clarification.

      Figures Fig.1: - The authors used the word 'predict' in the heading, which seems not appropriate for a retrospective study; something like 'correlate' seems better.

      The word “predict” has been replaced by “correlate” as suggested by the reviewer.

      • Similarly, the title of the figure legends claims that the 'impact of γδ T cells' is shown, while only a correlation is presented.

      The title of the figure has been modified.

      • For Fig1a-c, only summary data are presented. Please add exemplary pictures as well.

      Pictures of IF have been added as Supplementary Fig 1.

      • For Fig1d-f, the label for the x-axis is missing.

      The figure has been corrected.

      Fig.2 - It seems funny to call the patients 'naïve', maybe 'untreated' is clearer.

      We appreciate this suggestion and agree that ‘untreated’ is a clearer and more appropriate term in this context. We have replaced all instances of ‘naïve’ with ‘untreated’ throughout the manuscript to avoid ambiguity.

      • The graph in Fig2e does not allow comparing the cell frequencies properly. This would require either bar graphs or a table. Furthermore, the statistical analysis is missing. Without that, a statement like "associated with higher proportion of CAFs" (L265) is not supported.

      We thank the reviewer for this valuable observation. In response, we have replaced the original visualization in Figure 2E with grouped bar graphs showing the mean ± SEM of the relative proportions of each major cell type in the NT5E_low and NT5E_high groups, based on the median split. This format allows for clearer visual comparison of cell frequencies across conditions.

      Furthermore, we performed statistical comparisons using a t-test (a parametric test) on each population to evaluate differences in cell type proportions between the two groups. The results indicate a significantly higher proportion of CAFs and γδ T cells in the NT5E_high tumor profile. The corresponding p-values are provided in the figure legend. We hope this revised analysis and clearer presentation address the reviewer’s concerns.

      Fig.3 - For Fig3b+c, the IMC are derived from 4 patients (not clear for the flow data)

      As stated in both the figure legend and the text, the IMC analysis was conducted on 38 ROIs from four patient samples, while the flow cytometry analysis was performed on tumor samples from seven ovarian cancer patients.

      • did the authors notice differences between patients?

      As shown in new Figures 3b and 3c, no significant differences were observed between patients. Each individual patient is represented by a different color.

      • For Fig3e, the description in the text does not reflect the figure, e.g. cluster 1 does not show LAG3 expression, but this is claimed in the text (other descriptions are off as well).

      The text describing Fig. 3e has been amended in the new version of the manuscript.

      • In Fig3h, the authors stain cytokines in γδ T cells purified from ovarian cancer samples. The text seems to imply that the cytokine staining was performed directly ex vivo, without an in vitro stimulation of the cells, e.g. with PMA/ionomycin (if so, the description is missing). In any case, the values appear surprisingly high. Exemplary data are needed to clarify how the gating was done (for γδ T cells and the cytokines) and what the primary data looked like.

      The protocol has been amended in the “Materials and Methods” section. A gating strategy and primary data analysis from one representative patient are included in Supplementary Figure 4c.

      We agree with the reviewer’s comment that it is surprising that γδ T cell stimulation is not required for IL-8, IL-10 and IFNγ production. However, one possible explanation is the high reactivity of γδ T cells compared to other T cell subsets, as well as their localization in the tumor microenvironment rather than in healthy tissue or blood.

      • In Fig3h, it is not clear what is meant with "IL-8 / IL-10", please explain.

      This analysis shows the percentage of cells that are positive for both IL-8 and IL-10.

      The figure and its legend have been amended for clarity.

      Fig.4 - Please provide the values and the statistical analyses for all cell populations.

      We performed statistical analyses (Wilcoxon signed-rank test) for all cell populations and provide the data in Supplementary Fig. 5A. However, due to the heterogeneity of ROIs, a significant difference was observed for tumor cells, which were more prevalent in the neighborhood of CD73- than CD73+ γδ T cells (p

      Fig.5/6 - In Fig5, the authors state that 8 cell populations were differentially enriched around CD73+ or -neg γδ T cells. However, in Fig4, only 4 of these populations are mentioned. Please add the remaining 4 to fig4 and name the 8 clusters in fig5 in line with the gating strategy used in fig4.

      We thank the reviewer for highlighting that the description of Figure 5 in our text was unclear. We have revised the text to specify that, based on Supplementary Figure 7, which shows the number of cells of each cell type found in the neighborhood of all γδ T cell subsets (CD73- and CD73+) in all ROIs, we decided to perform phenotypic analysis on only four cell types (those with sufficient cell counts), setting the cutoff at 700 cells.

      The four cell types are analyzed in Figures 5 and 6. Figure 5A shows tumor cells, with eight clusters identified, while Figure 5B represents fibroblasts, with seven clusters identified. Figure 6A shows CD4 T cells, with eight clusters, and Figure 6B CD8 T cells, with ten clusters.

      • Furthermore, the authors want to show in fig5 how the phenotype of these 8 cell populations differs depending on whether they are close to CD73+/- γδT cells. tSNE plots do not allow illustrating this (BTW: the plots lack the colour code). The frequencies of the cell types/phenotypes in the vicinity of CD73+/- γδ T cells need to be depicted differently (e.g. bar graphs). Furthermore, the claim that differences are observed, needs to be supported by showing the statistical values obtained. The same argument applies to Fig6 and SF8.

      We have added the color code of the tSNE plots in Figures 5, 6, and SF9. The tables in Supplementary Figure 8 show the percentage of cells in each cluster within the vicinity of CD73+/- γδ T cells, allowing for an investigation of the neighborhood of each γδ T cell subset.

      • Fig6: This reviewer disagrees with the notion that the expression of HLA-DR or CD279 is enough to imply a functional state of the cell.

      As requested by the reviewer, we have amended the text to clarify that: “Cluster analysis revealed that CD4+ T cells in contact with effector γδ T cells (i.e., the CD73- subset) express HLA-DR and/or PD-1, both activation markers.”

      Supplements - SF2a: please check the labels; how can CD8+ CD4+ cells be labelled 'CD8 T cells' and why do the authors exclude the possibility that e.g. B cells could express HLA-DR?

      We thank the reviewer for pointing out the error in Figure 2a, which has now been corrected. The CD8+ cells have been relabeled as 'CD8 T cells,' and the B cells are now shown expressing HLA-DR.

      • SF7 is not clear to this reviewer. If the clusters represent different cell types, how can e.g. tumours be found in all of them?

      We believe the reviewer is referring to SF9 rather than SF7 in this comment. SF9 analyzes γδ T cells in proximity to CD73+ and CD73- γδ T cells. As in Figures 5 and 6, γδ T cell neighbors of CD73+ and CD73- γδ T cells were identified, and a clustering analysis revealed five distinct clusters. Tumor cells were not analyzed in this figure. We have clarified the text to prevent confusion.

      • SF9b lacks a negative control and a statistical analysis, and SF9c lacks the summary data and statistical analysis.

      As requested by the reviewer, we have performed statistical analysis for SF9b and added a negative control. Additionally, we have included summary data with a statistical analysis in SF9c.

      • In the text, the authors state, "We and others reported that in ovarian tumors, IL-6 is mainly produced by CAFs and induces CD73 expression by γδ T cells (Extended Data Fig. 9 and 15)." The data in SF9b are not enough to make this claim and reference 15 is a review article that does not even mention 'IL-6'. This needs to be corrected.

      We have updated Supplementary Figure 9B to provide more robust data. We thank the reviewer for pointing out our error. The publication we intended to cite is a research article, not a review: Hu G, Cheng P, Pan J, Wang S, Ding Q, Jiang Z, et al. An IL6-Adenosine Positive Feedback Loop between CD73+ γδ Tregs and CAFs Promotes Tumor Progression in Human Breast Cancer. Cancer Immunol Res. 2020;8:1273–86. We have made the correction in the manuscript.

      Reviewer #2 (Significance (Required)):

      In this manuscript ("Deciphering the tumor-infiltrating CD73+ regulatory γδ T cell ecosystem associated with poor survival of patients with ovarian cancer"), Chabab et al. report on the phenotype and location of CD73+ γδ T cells in ovarian cancer.

      CD73+ γδ T cells can be immunosuppressive via the production of cytokines (IL-8, IL-10) and the expression of PD-L1. Here, the authors investigated the phenotype and location of CD73+ and -neg γδ T cells in ovarian cancers with a particular focus on the cells surrounding the γδ T cells in the tumour.

      Overall, the study is informative and well-performed. However, the way some of the data are presented does not allow to fully evaluate them. Besides this, this reviewer only has some minor comments.

      To enable a full evaluation of the data, we have added new figures, amended others, and clarified certain points in the text, hoping that the reviewer will find these modifications sufficient to consider our manuscript for publication.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this article, Chabab et al. analyze sample from ovarian cancer patients, with a specific focus on gamma-delta T cells (Tγδ). The authors claim that CD73+ cells are associated with poor prognosis in ovarian cancer, and that CD73 expression is correlated with the composition and polarization of the microenvironment. Using imaging mass cytometry data, they also claim that the neighborhoods of CD73+ and CD73- Tγδ cells differs in composition.

      Major comments: - The prognostic value of CD73/NT5E is analyzed in TCGA-Ovarian RNAseq data. In the context of this article, it is implied that this should reflect CD73 expression by Tγδ but it is likely that other cell types are contributing to bulk CD73 expression.

      We appreciate the reviewer’s insightful comment. Because of the low proportion of Tγδ in the TME, we stratified patients on total NT5E expression. We agree that this signal likely includes contributions from multiple cell types beyond γδ T cells, such as cancer-associated fibroblasts and endothelial cells, which are also known to express CD73 (NT5E gene).

      Stratifying patients by total NT5E expression showed that high NT5E expression was associated with poorer overall survival and with an increase in Tγδ gene markers (TRDC, TRGC1/2) and percentage of cells (Fig2E) in the patient cohort (Fig2C). To clarify this point, we have revised the Results and Discussion sections to explicitly state that the TCGA-based survival analysis reflects total intratumoral NT5E enrichment and cannot be attributed specifically to γδ T cells. We now refer to this analysis as an independent validation of the clinical relevance of CD73, while noting that its cell-type-specific contribution remains to be resolved in future studies using spatial transcriptomics or deconvolution approaches.

      • In the analysis of scRNAseq data, multiple public datasets are aggregated and the overall level of CD73 is used for stratification. Is this stratification confounded by dataset of origin?

      We thank the reviewer for raising this critical point regarding potential batch effects and dataset-driven bias in our stratification strategy. To address this, we performed additional analyses to assess whether NT5E (CD73) expression is confounded by dataset of origin.

      First, we verified that all single-cell datasets (GSE147082, GSE241221, and GSE235931) were processed using a harmonized integration workflow, including SCTransform normalization and integration using Seurat’s reciprocal PCA approach, which effectively minimizes batch-related variability.

      • The last part of the results discusses the role of IL6 produced by CAFs on Tγδ, but very little data is shown to support the proposed mechanisms. The authors report expression of CD73 by flow cytometry on blood-sorted Tγδ following culture with IL2, IL6, IL21. The data shown however only represents one donor and should therefore be repeated on multiple donors.

      We appreciate the reviewer’s insightful comment. We have added data and updated Supplementary Figure 9 to provide more robust findings. Regarding the role of IL-6, our data in ovarian cancer are consistent with the study by Hu et al., which reports an IL-6-adenosine positive feedback loop between CD73+ γδ Tregs and CAFs that promotes tumor progression in human breast cancer.

      Minor comments:

      • The authors stratify their cohort by Tγδ density but I could not find the threshold used for stratification

      The threshold has been added to the figure and the text.

      • Labels for CD8+ and CD4+CD8+ T cells are swapped in Extended Data Fig 2A

      The figure has been corrected.

      • The legend of graphs shown in multiple panels (for instance: Fig 3F) are not very clear: is each dot representing the average expression of one cluster in one patient?
      • In figure 3G there is no color scale, the authors need to add it with appropriate units so that readers can interpret the data shown

      These points have all been amended and corrected in the next version of the manuscript.

      Reviewer #3 (Significance (Required)):

      This paper shows interesting imaging mass cytometry data of ovarian cancer specimens. The focus on CD73 expression by Tγδ is fairly specific, although the ectonucleotidase pathway involving CD73 is currently extensively studied for its immunosuppressive role.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript ("Deciphering the tumor-infiltrating CD73+ regulatory γδ T cell ecosystem associated with poor survival of patients with ovarian cancer"), Chabab et al. report on the phenotype and location of CD73+ gd T cells in ovarian cancer. CD73+ gd T cells can be immunosuppressive via the production of cytokines (IL-8, IL-10) and the expression of PD-L1. Here, the authors investigated the phenotype and location of CD73+ and -neg gd T cells in ovarian cancers with a particular focus on the cells surrounding the gd T cells in the tumour. Overall, the study is informative and well-performed. However, the way some of the data are presented does not allow to fully evaluate them. Besides this, this reviewer only has some minor comments.

      General comments:

      • The data provided in this manuscript are descriptive/correlative, and as such, causation cannot be inferred. Therefore, the language needs to reflect this; statements like "we investigated the impact of CD73+ regulatory γδ T cells in ovarian cancer" (L89) and "CD73+ γδ T cells were in close contact with more aggressive tumor cells" (L426), among others, are incorrect without functional data. The authors are advised to adjust the text throughout.
      • Please make the figure legends self-explanatory without the need to search for the information in the M&M. For example, the graphs in fig1 and 3 contain many dots, but it is not explained what these dots represent. Please also add n for each experiment shown and state how often the experiment was performed independently.
      • It would be helpful for the reader if abbreviations introduced in the M&M were also explained the first time they appeared in the results section.
      • Please explain all abbreviations, e.g. FIGO, CST, NT5E, etc.
      • L235: typo 'that' instead of 'than'; L258 'reduced'; L259 'fig1d-f'; L451f twice 'CD73+'; 'naive' instead of 'naïve' throughout; SF2 legend: '2f' instead of '3f', SF9 legend: '1.105'.
      • L280ff: "Tumor cells ... were the most important cell type" - it may be clearer to use 'most frequent';

      M&M

      • Please be consistent, if you provide catalogue numbers or dilutions (antibody, reagents) [which is good, maybe even adding the RRID number], do so for all items.
      • The M&M does not state for the CAFs how long they were cultured before the supernatant was taken for the cytokine measurements.
      • For the IL-6 ELISA, it is stated that the "cells were harvested"; what happened to them, and how do you get any SN from these cells?

      Figures

      Fig.1:

      • The authors used the word 'predict' in the heading, which seems not appropriate for a retrospective study; something like 'correlate' seems better.
      • Similarly, the title of the figure legends claims that the 'impact of gd T cells' is shown, while only a correlation is presented.
      • For Fig1a-c, only summary data are presented. Please add exemplary pictures as well.
      • For Fig1d-f, the label for the x-axis is missing.

      Fig.2

      • It seems funny to call the patients 'naïve', maybe 'untreated' is clearer.
      • The graph in Fig2e does not allow comparing the cell frequencies properly. This would require either bar graphs or a table. Furthermore, the statistical analysis is missing. Without that, a statement like "associated with higher proportion of CAFs" (L265) is not supported.

      Fig.3

      • For Fig3b+c, the IMC are derived from 4 patients (not clear for the flow data) - did the authors noticed differences between patients?
      • For Fig3e, the description in the text does not reflect the figure, e.g. cluster 1 does not show LAG3 expression, but this is claimed in the text (other descriptions are off as well).
      • In Fig3h, the authors stain cytokines in gd T cells purified from ovarian cancer samples. The text seems to imply that the cytokine staining was performed directly ex vivo, without an in vitro stimulation of the cells, e.g. with PMA/ionomycin (if so, the description is missing). In any case, the values appear surprisingly high. Exemplary data are needed to clarify how the gating was done (for gd T cells and the cytokines) and what the primary data looked like.
      • In Fig3h, it is not clear what is meant with "IL-8 / IL-10", please explain.

      Fig.4

      • Please provide the values and the statistical analyses for all cell populations.

      Fig.5/6

      • In Fig5, the authors state that 8 cell populations were differentially enriched around CD73+ or -neg gd T cells. However, in Fig4, only 4 of these populations are mentioned. Please add the remaining 4 to fig4 and name the 8 clusters in fig5 in line with the gating strategy used in fig4.
      • Furthermore, the authors want to show in fig5 how the phenotype of these 8 cell populations differs depending on whether they are close to CD73+/- gdT cells. tSNE plots do not allow illustrating this (BTW: the plots lack the colour code). The frequencies of the cell types/phenotypes in the vicinity of CD73+/- gd T cells need to be depicted differently (e.g. bar graphs). Furthermore, the claim that differences are observed, needs to be supported by showing the statistical values obtained. The same argument applies to Fig6 and SF8.
      • Fig6: This reviewer disagrees with the notion that the expression of HLA-DR or CD279 is enough to imply a functional state of the cell.

      Supplements

      • SF2a: please check the labels; how can CD8+ CD4+ cells be labelled 'CD8 T cells' and why do the authors exclude the possibility that e.g. B cells could express HLA-DR?
      • SF7 is not clear to this reviewer. If the clusters represent different cell types, how can e.g. tumours be found in all of them?
      • SF9b lacks a negative control and a statistical analysis, and SF9c lacks the summary data and statistical analysis.
      • In the text, the authors state, "We and others reported that in ovarian tumors, IL-6 is mainly produced by CAFs and induces CD73 expression by γδ T cells (Extended Data Fig. 9 and 15)." The data in SF9b are not enough to make this claim and reference 15 is a review article that does not even mention 'IL-6'. This needs to be corrected.

      Significance

      In this manuscript ("Deciphering the tumor-infiltrating CD73+ regulatory γδ T cell ecosystem associated with poor survival of patients with ovarian cancer"), Chabab et al. report on the phenotype and location of CD73+ gd T cells in ovarian cancer. CD73+ gd T cells can be immunosuppressive via the production of cytokines (IL-8, IL-10) and the expression of PD-L1. Here, the authors investigated the phenotype and location of CD73+ and -neg gd T cells in ovarian cancers with a particular focus on the cells surrounding the gd T cells in the tumour. Overall, the study is informative and well-performed. However, the way some of the data are presented does not allow to fully evaluate them. Besides this, this reviewer only has some minor comments.

    1. Jerry Howell: Old guy who likes to code. Privacy advocate. I make Peersuite, a decentralized Discord alternative. Work: dicabled. Location: Martinsville, VA, USA. Joined: May 13, 2025. Post (May 13): Peersuite, a decentralized workspace #opensource #webdev #javascript #community

    1. nextflow pull nf-core/demo Nextflow will pull the pipeline code, meaning it will download the full repository to your local drive.

      This is the same as git pull; It just organizes the module into modules/nf-core/ for you.

      This only works for the demo module which is in the nf-core repo. For all other modules found in nf-core, you need to install and use nf-core modules install ..

    2. nf-core project enforces strong guidelines for how pipelines are structured, and how the code is organized, configured and documented.
    1. Development Company Build your Crypto business empire with the most redefined crypto wallet in the digital space. Dappfort, a promising crypto wallet development company, offers you next-gen-powered wallet solutions. Let's work together Crypto Wallet Development Dappfort is a promising leader in crypto wallet development, offering an advanced wallet solution that helps wallet users buy, sell, trade, and swap a variety of cryptocurrencies at any time. Are you a crypto enthusiast looking to launch your own business in the emerging crypto market? Then Dappfort’s crypto wallet would be a great solution. The certified blockchain developers of Dappfort work on your crypto wallet idea to bring it to reality. Crypto wallet development requires a team of experienced developers from UI design to smart contract programming. Dappfort takes your wallet development process through a sequence of workflows to bring the best idea to the digital world. We, as a team, make sure every wallet is hassle-free, compatible, secure, and user-friendly. Also, we are keen on providing feature-rich crypto wallet development services tailored to your business needs. Crypto Wallet Development Services We offer a wide range of crypto wallet development services based on the requirement of client. Here is a list of the top services we work for our clients NFT Wallet A dedicated NFT wallet that allows users to hold only non-fungible tokens, the wallet is specifically designed to buy, sell, trade, and hold any kind of NFT safely. Multi-Chain Wallet The futuristic wallet that attracts most users around the globe is this one, as this allows users to trade and hold multiple cryptocurrencies across various blockchain networks. Coin-Based Wallet We also work on dedicated crypto wallet projects, especially designed for a particular crypto coin based on its dedicated blockchain network. White Label Solution Looking to launch a similar wallet like Trust Wallet? 
Avail our white label crypto wallet solution and launch in the crypto space in no time. Wallet as a Service Our professionals are also employed to develop robust digital solutions to handle every aspect of functions and transactions in the wallet. Browser Extensions Are you looking for something like Metamask? We have got you; our professionals are ready to launch your next browser extension wallet like Metamask. Features of our Cryptocurrency Wallet Solutions We make sure every wallet developed by our experts is planned with the most prominent features that users look for and helps admin manage their business. Admin features User features Effective Data It provides all the data of every transaction and trade that took place through the wallet platform, and also the activity of the Dapps and staking details. API Admin has control over the API integrated to the platform and can modify, add, or remove the API based on the needs of the users of the crypto wallet. Analytics The analytics dashboard helps the admin to analyze the users' access to the features of the wallet, and this helps to upgrade the features that users opt for the most. Security Management The admin holds the key position to manage the security of the wallet, where he enables various security features to protect the crypto wallet solution. Data Backup On a daily basis, the wallet records all the required data and stores it for future analysis, and also helps to transfer data in the future at any time. Notifications Admins will be able to send notifications to a specific set of users or to all users of the crypto wallet from the dashboard. Trade History The user will be able to go through the history of trades done before for analysis, and this helps to plan future trades and also the previous assets he/she was holding. 
Multiple Cryptos The wallet is designed to hold multiple cryptocurrencies, which allows users to explore various assets and let them choose the best crypto they want to buy/trade/sell. Referral & Reward The referral and reward system allows users to earn commission when a user makes any transaction with the wallet through his/her referral ID after creating their wallet. User Dashboard The user dashboard holds all the information of the user since when he/she made their first transaction in the wallet, which helps them to analyze their digital assets. Staking Staking helps users to lock their digital assets for a certain period and allows them to earn rewards and crypto based on their locked assets and the time period. Multiple Payment Gateways Users can deposit/withdraw their funds through any of their desired payment gateways, which makes their process easier. Addon Features Apart from the above-listed features, below are a few additional features that help the wallet reach out to maximum users around the world. Token Listing Multiple tokens can be listed in the wallet to attract multiple users around the globe and reach your business goals. Browser Extension Are you looking to add browser extensions like Metamask for your wallet? We have got you, our developers will bring it for you. Fiat Support We also bring you Fiat support, which allows users around the globe to deposit their Fiat into the wallet for the purchase of any asset. Cross-Chain The cross-chain options allow users to trade across multiple blockchain networks, allowing users to connect to various Dapps they want. Dapp Dapps play a major role in the crypto space, so with the wallet, we integrate the Dapp to ease the process for users for further needs. Lightning Network BTC Lightning Network allows instant, low-cost transactions that are completed using off-chain payment techniques, enabling fast Bitcoin transfers. 
Security Features We are keen on the security features of the crypto wallet to avoid any kind of disruption in the service and to avoid any hacks, where we add multiple layers of security, and here are a few among them Multi-Factor Authentication Multiple kinds of authentication are used, including passwords, hardware keys, and more. This provides extra security levels even if one aspect is compromised. Encryption Integrates strong encryption algorithms such as AES or ECC to protect sensitive data. It ensures that encrypted data is not readable without the decryption key. Biometric Authentication requires the use of unique physical traits such as fingerprints or facial recognition, providing an additional security layer that prevents any hacks. Anti-Phishing software Anti-phishing software detects fraudulent websites and programs. It avoids scams and the exposure of private keys or login information. DDoS Protection DDoS protection blocks overwhelming unwanted traffic that aims to disrupt or break the system; it does not directly prevent data breaches. Cryptocurrency Wallet Development Process Project Scope Analysis Determine Tech Stack UI/UX Design Backend Development Smart Contract Coding Set up Security Layers Dapp Integration API Integration Testing and Deployment Project Scope Analysis We assign experts on our team to analyze the project scope and proceed with the project's vision and mission, thereby informing the development process and project outcome. Determine Tech Stack Once the process is set up, we identify the required tools and tech stack for developing the project, presenting the best solution for the digital space. UI/UX Design Then our designers will start designing an impressive UI for your wallet that helps the users to find and access every feature of the wallet. 
Backend Development Simultaneously, our developers will proceed with the backend development process and set up the core functionality of the wallet and additional features as the project demands. Smart Contract Coding On the other hand, our certified blockchain developers will be coding smart contracts that play a major role in the crypto wallet and secure every transaction. Set up Security Layers Once the core functionality and features are set up, we will add multiple layers of security features to the wallet to safeguard the users and digital assets of the wallet. Dapp Integration If the client requires any decentralized application to be integrated with the wallet to provide additional services, our developers would work to integrate DApps with the wallet. API Integration For various additional features and functions, APIs are integrated with the wallet based on the project demand to ease multiple operations for the users. Testing and Deployment Once the development process is done, the wallet is taken through a series of tests where the bugs & vulnerabilities are fixed and removed before deployment. Benefits of Launching a Crypto Wallet Looking to launch a crypto wallet but unsure why to launch one and what benefits you can get? Here are a few things that may help before you move on with the wallet development process. Global Reach A crypto wallet allows you to take your business to a global audience and helps to build your crypto business empire further in the digital space. Multiple Revenue Streams A crypto wallet lets you earn revenue from multiple streams, including transaction fees, token listing fees, and fees for staking, exchange, and more. DeFi Users can access DeFi platforms and can benefit from staking, lending, and borrowing, making it a one-stop solution. Market Demand The crypto market is growing, and the crypto space demands that every user have a wallet for any operation in the crypto space, which allows you to reach more users. 
Secured Crypto wallets are a more secure solution in the crypto space because of multiple security layers and deployed smart contracts, which is an added advantage. Scalability The wallet offers hassle-free performance to the users, even if more than 50,000 active users access the wallet at the same time. Tech Stack we use for Cryptocurrency Wallet Development Our certified professionals work on various tech stacks to present the best crypto wallet with top-notch features, and here are a few on which our professionals most commonly work. Frontend React Next.js Vue Backend Node.js Nest.js Express.js Socket.io Blockchain Ledger Bitcoin-core Pinata Cloud Hardhat IPFS Alchemy TronWeb Blockchain network Avalanche Ethereum Bitcoin Solana Polygon Fantom Blockchain platforms Solidity Stellar Hyperledger fabric Rust Additional tools Apollo Docker REST/GraphQL ClickUp/ Jira GitHub What Makes Dappfort the Best Cryptocurrency Wallet Development Company? Dappfort is the best cryptocurrency wallet development company that works with clients around the world on various requirements and provides the best solution that helps their business grow in the crypto space. If you are looking to launch your wallet in the crypto space or looking to give your existing wallet an upgrade, connect with our experts, and they will help you reach your business goal with advanced solutions. Dappfort’s crypto wallet solution offers a wide range of opportunities in digital assets that attract a lot of users around the globe and help you reach your business goals. The crypto wallet is the gateway to every transaction and every operation that takes place in the crypto space, so if you have an idea to enter the crypto space, then launching a wallet would be the best solution at this time. Contact us! Book a call or fill out the form below and we'll get back to you once we've processed your request. 
FAQs Related to Crypto Wallet App Development Frequently asked questions regarding Crypto Wallet App What are the benefits of developing a crypto wallet app? The benefits of creating a cryptocurrency wallet app include convenience, security, quick transactions, connection with other services, and easy access to digital payments. How can AI be integrated into crypto wallet application development? AI can improve crypto wallet apps by providing tailored suggestions, fraud detection, speech recognition, and chatbots, which improves user experience, security, and transaction efficiency. What technologies are commonly used in crypto wallet app development? They comprise mobile app frameworks, backend technology, payment gateway integrations, and AI technologies. What are the essential components to add in a cryptocurrency wallet app? They include user registration, account linking, wallet balance management, transaction history, QR code scanning, peer-to-peer money transfers, bill payments, and security features such as authentication. How can I monetize a crypto wallet app? A crypto wallet app's monetization options include transaction fees, merchant partnerships, in-app advertising, premium feature subscription plans, and app licensing to other businesses. How can I ensure the security of a crypto wallet app? Implement data encryption, two-factor authentication, frequent security audits, industry standard compliance, user education on secure practices, and AI-powered fraud detection to ensure the security of your crypto wallet software. Explore all of our free resources Discover guides on wallet security, reviews of the best options, and the latest trends in the cryptocurrency space. Stay informed and make the most of your digital assets! Get in touch with us for all your crypto wallet inquiries! 
Whether you have questions, need support, or want to share feedback, we’re here to help you navigate your digital asset journey. Boost your business with our customized web3 digital solutions. Partner with Dappfort to turn your vision into reality. Business Enquiry: +91 88385 34884 Email Us: sales@dappfort.com LONDON 71-75 Shelton Street, Covent Garden, London, England WC2H 9JQ, GB MADURAI Baskar Complex, Besant Road, Chinna Chokkikulam Madurai, TN-625002 India Disclaimer: logos and other registered trademarks of blockchain networks & other popular application used on this platform are held by their respective owners. Dappfort does not claim ownership or association on them, and their use is purely for informational and illustrative purposes to understand by our audience. 
Privacy Policy Terms & Conditions © 2025 Dappfort, All Rights Reserved

      Crypto Wallet Development Company: worth consulting if you are looking to develop your own crypto wallet.

    1. Author Response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      The authors report four cryoEM structures (2.99 to 3.65 Å resolution) of the 180 kDa, full-length, glycosylated, soluble Angiotensin-I converting enzyme (sACE) dimer, with two homologous catalytic domains at the N- and C-terminal ends (ACE-N and ACE-C). ACE is a protease capable of effectively degrading Aβ. The four structures are C2 pseudo-symmetric homodimers and provide insight into sACE dimerization. These structures were obtained using discrete classification in cryoSPARC and show different combinations of open, intermediate, and closed states of the catalytic domains, resulting in varying degrees of solvent accessibility to the active sites. 

      To deepen the understanding of the gradient of heterogeneity (from closed to open states) observed with discrete classification, the authors performed all-atom MD simulations and continuous conformational analysis of cryo-EM data using cryoSPARC 3DVA, cryoDRGN, and RECOVAR. cryoDRGN and cryoSPARC 3DVA revealed coordinated open-closed transitions across four catalytic domains, whereas RECOVAR revealed independent motion of two ACE-N domains, also observed with cryoSPARC-focused classification. The authors suggest that the discrepancy in the results of the different methods for continuous conformational analysis in cryo-EM could result from different approaches used for dimensionality reduction and trajectory generation in these methods. 

      Strengths: 

      This is an important study that shows, for the first time, the structure and the snapshots of the dynamics of the full-length sACE dimer. Moreover, the study highlights the importance of combining insights from different cryo-EM methods that address questions difficult or impossible to tackle experimentally while lacking ground truth for validation. 

      Weaknesses: 

      The open, closed, and intermediate states of ACE-N and ACE-C in the four cryo-EM structures from discrete classification were designated quantitatively (based on measured atomic distances on the models fitted into cryo-EM maps, Figure 2D). Unfortunately, atomic models were not fitted into cryo-EM maps obtained with cryoSPARC 3DVA, cryoDRGN, and RECOVAR, and the open/closed states in these cases were designated based on qualitative analysis. As the authors clearly pointed out, there are many other methods for continuous conformational heterogeneity analysis in cryo-EM. Among these methods, some allow analyzing particle images in terms of atomic models, like MDSPACE (Vuillemot et al., J. Mol. Biol. 2023, 435:167951), which result in one atomic model per particle image and can help in analyzing cooperativity of domain motions through measuring atomic distances or angular differences between different domains (Valimehr et al., Int. J. Mol. Sci. 2024, 25: 3371). This could be discussed in the article. 

      Reviewer #2 (Public review): 

      Summary: 

      The manuscript presents a valuable contribution to the field of ACE structural biology and dynamics by providing the first complete full-length dimeric ACE structure in four distinct states. The study integrates cryo-EM and molecular dynamics simulations to offer important insights into ACE dynamics. The depth of analysis is commendable, and the combination of structural and computational approaches enhances our understanding of the protein's conformational landscape. However, the strength of evidence supporting the conclusions needs refinement, particularly in defining key terms, improving structural validation, and ensuring consistency in data analysis. Addressing these points through major revisions will significantly improve the clarity, rigor, and accessibility of the study to a broader audience, allowing it to make a stronger impact in the field. 

      Strengths: 

      The integration of cryo-EM and MD simulations provides valuable insights into ACE dynamics, showcasing the authors' commitment to exploring complex aspects of protein structure and function. This is a commendable effort, and the depth of analysis is appreciated. 

      Weaknesses: 

      Several aspects of the manuscript require further refinement to improve clarity and scientific rigor as detailed in my recommendations for the authors. 

      Reviewer #3 (Public review): 

      Summary: 

      Mancl et al. report four cryo-EM structures of the glycosylated, soluble Angiotensin-I converting enzyme (sACE) dimer. This moves forward the structural understanding of ACE, as previous analysis yielded partially denatured or individual ACE domains. By performing a heterogeneity analysis, the authors identify three structural conformations (open, intermediate open, and closed) that define the openness of the catalytic chamber and structural features governing the dimerization interface. They show that the dimer interface of soluble ACE consists of an N-terminal glycan and protein-protein interaction region, as well as C-terminal protein-protein interactions. Further heterogeneity mining and all-atom molecular dynamics simulations show structural rearrangements that lead to the opening and closing of the catalytic pocket, which could explain how ACE binds its substrate. These studies could contribute to future drug design targeting the active site or dimerization interface of ACE. 

      Strengths: 

      The authors make significant efforts to address ACE denaturation on cryo-EM grids, testing various buffers and grid preparation techniques. These strategies successfully reduce denaturation and greatly enhance the quality of the structural analysis. The integration of cryoDRGN, 3DVA, RECOVAR, and all-atom simulations for heterogeneity analysis proves to be a powerful approach, further strengthening the overall experimental methodology. 

      Weaknesses: 

      In general, the findings are supported by experimental data, but some experimental details and approaches could be improved. For example, cryoDRGN analysis is limited to the top 5 PCA components for ease of comparison with cryoSPARC 3DVA, but wouldn't an expansion to more components with cryoDRGN potentially identify further conformational states? The authors also say that they performed heterogeneity analysis on both datasets but only show data for one. The results for the first dataset should be shown and can be included in supplementary figures. In addition, the authors mention that they were not successful in performing cryoSPARC 3DFlex analysis, but they do not show their data or describe the conditions they used in the methods section. These data should be added and clearly described in the experimental section. 

      Some cryo-EM data processing details are missing. Please add local resolution maps, box sizes, and Euler angle distributions and reference the initial PDB model used for model building. 

      Reviewer #1 (Recommendations for the authors): 

      Major point: 

      The authors could discuss the use of continuous conformational heterogeneity analysis methods that analyze particle images in terms of atomic models, based on MD simulations, like MDSPACE (Vuillemot et al., J. Mol. Biol. 2023, 435:167951). MDSPACE can be used on a dataset preprocessed with cryoSPARC or Relion by discrete classification to reduce compositional heterogeneity and obtain initial particle poses. It results in one atomic model per particle image and can help in analyzing the cooperativity of domain motions by measuring atomic distances or angular differences between different domains (Valimehr et al., Int. J. Mol. Sci. 2024, 25: 3371). 

      We agree that MDSPACE is a promising and useful tool for analysis, and we are excited to implement such a method. Prior to manuscript submission, we discussed with the primary author, Slavica Jonic, how we might employ her software in our analysis. Unfortunately, we were unable to overcome significant computational issues, notably MDSPACE’s lack of GPU functionality, which prevented us from employing MDSPACE in a reasonable manner for our dataset. We hope to employ MDSPACE in future work, once the computational issues have been addressed. Because we feel it is an exciting approach that deserves more visibility, we have added a substantial discussion of MDSPACE as follows:

      line 565-574

      Similarly, MDSPACE holds tremendous promise as a method for investigating conformational dynamics from cryo-EM data (61). MDSPACE integrates cryo-EM particle data with short MD simulations to fit atomic models into each particle image through an iterative process which extracts dynamic information. However, the lack of GPU-enabled processing for MDSPACE requires either a dedicated computational setup that diverges from most other cryo-EM software, or access to a CPU-based supercomputer, which severely limits the accessibility of such software. Despite these challenges, both 3DFlex and MDSPACE use promising approaches to study protein conformational dynamics. We look forward to exploring effective methods to incorporate these strategies into our future research.

      Minor points: 

      (1) Lines 348-350: "The discrepancy in population size between these clusters is likely due to bias in the initial particle poses, rather than a subunit-specific preference for the open state." Which bias? The cluster size is related to conformations, not to poses. 

      We hope to emphasize that the assignment of particles to either the OC or CO cluster is likely due to the particle orientation within the complete dimer refinement, and the discrepancy in size between OC and CO clusters does not necessarily indicate a domain specific preference for one state or another, which would carry allosteric implications. This remains a possibility, but we hope to avoid over-interpretation of our results with the statement above.

      The statement was altered to now read:

      Line 418-423

      “The discrepancy in population size between these clusters is likely due to bias in the initial particle orientation, rather than a subunit-specific preference for the open state. As the O/C state and the C/O state are 180 degree rotations of each other, particle assignment to either cluster is likely influenced by the initial particle orientation of the complete dimer, and we currently lack the data to discern any allosteric implication to the orientation assignment.”

      (2) Line 519: "Micrographs with a max CTF value worse than 4Å were removed from the dataset,..." (also, lines 822-823 in supplementary material). Do you want to say that micrographs with a resolution worse than 4 Å were removed? 

      Max CTF value was replaced with CTF fit resolution to properly match the parameter used in cryoSPARC.

      (3) Figure 2C: The black lines are barely visible. Can you make them thicker and in red color? 

      The figure has been amended.

      (4) Figure 2D: The values for Chain A and Chain B in the second row (ACE-C) of sACE-3.05 columns are 17.9 (I) (Chain A) and 13.9 (C) (Chain B). Shouldn't they be reversed (13.9 (C) (Chain A) and 17.9 (I) (Chain B))? 

      The values are now correct. sACE-3.65 chains were flipped in the table, and the updated color scheme should make it easier to map the values from the table to their corresponding structure.

      Reviewer #2 (Recommendations for the authors): 

      The manuscript presents the first complete full-length dimeric ACE structure. The integration of cryo-EM and MD simulations provides valuable insights into ACE dynamics, showcasing the authors' commitment to exploring complex aspects of protein structure and function. This is a commendable effort, and the depth of analysis is appreciated. However, several aspects of the manuscript require further refinement to improve clarity and scientific rigor. In the view of this reviewer, a major revision is necessary. Please see the detailed comments below: 

      (1) Definition of "Conformational Heterogeneity": The term "conformational heterogeneity" should be clearly defined when citing references 27-29. References 27 and 29 use MD simulations, which reveal "conformational flexibility" rather than "conformational heterogeneity" as observed in cryo-EM data. A more precise distinction should be made. 

      We have changed the term “conformational heterogeneity” to the broader “conformational dynamics”.

      (2) Figure Adjustments for Clarity: 

      Figure 1B: A scale bar is needed for accurate representation. 

      A 100 Angstrom scale bar was added to figure 1B.

      Figure 2A, B: Using a Cα trace representation would improve clarity and make structural differences more apparent. 

      We found using a Cα trace representation makes the figure too confusing and impossible to determine individual structural elements. Everything just becomes a jumble of lines.

      Additionally, a Cα displacement vs. residue index plot (with Figure 1A placed along the x-axis) should be included alongside Figures 2A and B to provide quantitative insight into structural variations. 

      This analysis has been combined with several other suggestions and now comprises a new figure 4.

      (3) Structural Resolution and Validation: Euler angle distribution and 3D-FSC analysis should be provided to help the audience assess how these factors influence the resolution of each structure. Local resolution analysis in Relion should be included to determine if there are dynamic differences among the four structures. To enhance structural interpretation, the manuscript would benefit from showcasing examples of bulky side-chain densities (e.g., Trp, Phe, Tyr) for each of the four structures. 

      This information is included in Figures S3 and S5.

      (4) Glycan Modeling Considerations: Since the resolution of cryo-EM does not allow for precise glycan composition determination, additional experimental validation (e.g., Glyco-MS) would strengthen the modeling. If experimental support is unavailable, appropriate references should be cited to justify the modeled glycans. 

      Minimal glycan modeling was performed with the goal of demonstrating that the protein is glycosylated. We have highlighted in the manuscript that we chose 12 N-linked glycosylation sites that show extra density, an indication that a glycan is present, and modeled them with complex glycans.

      (5) Advanced Cryo-EM and MD Analyses: 3DFlex Analysis: It is recommended that the authors explore 3DFlex to better capture conformational variability. CryoSPARC's community support can assist in proper implementation. 

      We have incorporated our 3DFlex analysis into the discussion as follows:

      Line 553-565

      Surprisingly, we did not observe such motion using cryoSPARC 3DFlex, a neural network-based method analyzing our cryo-EM data of sACE (54). Central to the working of cryoSPARC 3DFlex is the generation of a tetrahedral mesh used to calculate deformations within the particle population. Proper generation of the mesh is critical for obtaining useful results and must often be determined empirically. Despite several attempts, we were unable to obtain results from 3DFlex comparable to what we observed with our other methods. Even using the results from our 3DVA as prior input to 3DFlex, the largest conformational change we observed was a slight wiggling at the bottom of the D3a subdomain (Movie S12). The authors of 3DFlex note that 3DFlex struggles to model intricate motions, and the implementation of custom tetrahedral meshes currently requires a non-cyclical fusion strategy between mesh segments. Given these limitations, and the complexity of sACE conformational dynamics, it appears that sACE, as a system, is not well-suited to analysis via 3DFlex in its current implementation.

      (6) Movie Consistency: The MD simulation movies should use the same color coding as the first four movies for consistency. Similarly, the 3DVar analysis map should be color-coded to enhance interpretability. 

      The MD simulation movies have been re-colored.

      (7) MD Simulations - Data Extraction and Validation: The manuscript includes several long-timescale MD simulations, but further analysis is needed to extract meaningful dynamic information. Suggested analyses include: 

      a. RMSF (Root Mean Square Fluctuation) Analysis: Calculate RMSF from MD trajectories and compare it with local resolution variations in cryo-EM maps. 

      RMSF values were included in the new figure 4 along with structural depictions colored by RMSF value to localize variation to the structure.

      b. Assess whether regions exhibiting lower dynamics correspond to higher resolution in cryo-EM. 

      This information has been added to Figure 4 and Figures S3, S5, and S6.

      c. Compare RMSF between simulations with and without glycans to identify potential effects. 

      This has been done in Figure 4.

      d. Clustering Analysis: Use the four solved structures as reference states to cluster MD simulation trajectories. Determine if the population states observed in MD simulations align with cryo-EM findings. 

      This has been done in supplementary figure S10.

      e. Principal Component Analysis (PCA): Perform PCA on MD trajectories and compare with dynamics inferred from cryo-EM analyses (3DVar, cryoDRGN, and RECOVAR) to ensure consistency. 

      This has been done in supplementary figure S11.

      f. Correction of RMSF Analysis or the y-axis label in Figure S9: The RMSF values cannot be negative by definition. The authors should carefully review the code used for this calculation or explicitly define the metric being measured. 

      The Y-axis label has been corrected to clarify that the plot depicts the change in RMSF values when comparing the glycosylated and non-glycosylated MD simulations.
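      For readers unfamiliar with the quantities discussed in points (7a), (7c), and (7f), the following is a minimal sketch of how a per-atom RMSF and a difference in RMSF between two simulations could be computed. It is not the authors' analysis code. The function names `rmsf` and `delta_rmsf` and the toy coordinates are hypothetical, and the sketch assumes that every frame has already been superposed onto a common reference. It illustrates why RMSF itself cannot be negative while a difference between two RMSF profiles can.

```python
import math


def rmsf(trajectory):
    """Per-atom root mean square fluctuation.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom. Frames are assumed to be pre-aligned (an assumption of
    this sketch, not something the function checks).
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [
            sum(frame[atom][k] for frame in trajectory) / n_frames
            for k in range(3)
        ]
        # Mean squared deviation from the averaged position.
        msd = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


def delta_rmsf(traj_a, traj_b):
    """Per-atom RMSF(traj_a) - RMSF(traj_b); may be negative."""
    return [a - b for a, b in zip(rmsf(traj_a), rmsf(traj_b))]


# Toy data (hypothetical): two atoms, two frames per simulation.
with_glycans = [[(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
                [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)]]
without_glycans = [[(-2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
                   [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]]

rmsf_with = rmsf(with_glycans)
rmsf_without = rmsf(without_glycans)
difference = delta_rmsf(with_glycans, without_glycans)
```

      In this toy case the first atom fluctuates less in the first simulation than in the second, so its difference value is negative even though both RMSF values are non-negative.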

      (8) Discussion on Coordinated Motion and Allostery: The discussion of coordinated motion and allosteric regulation between sACE-N domains should be explicitly connected to experimental evidence mentioned in the introduction: "Enzyme kinetics analysis suggests negative cooperativity between two catalytic domains (31-33). However, ACE also exhibits positive synergy toward Aβ cleavage and allostery to enhance the activity of its binding partner, the bradykinin receptor (11, 34)." 

      (9) The authors should elaborate on how their new insights provide a mechanistic explanation for these experimental observations. 

      (10) Connection to Therapeutic Implications: The discussion section should more explicitly connect the structural findings to potential therapeutic applications, which would significantly enhance the impact of the study. 

      These three points (8-10) were addressed in a significant overhaul to the discussion section.

      In summary, this study makes a valuable contribution to the field of ACE structural biology and dynamics. The combination of cryo-EM and MD simulations is particularly powerful, and with major revisions, this manuscript has the potential to make a strong impact. Addressing the points outlined above will significantly improve clarity, strengthen the scientific claims, and enhance the manuscript's accessibility to a broader audience. I appreciate the authors' rigorous approach to this complex topic and encourage them to refine their work to fully highlight the significance of their findings. 

      Reviewer #3 (Recommendations for the authors): 

      (1) The authors incorrectly refer to their ACE construct as full-length throughout the manuscript. Given that they are purifying the soluble region (aa 1-1231), saying full-length ACE is not the correct nomenclature. I suggest removing full-length and using soluble ACE (sACE) throughout the text. 

      We utilize the term full-length to highlight the fact that our structures contain both the N and C domains for both subunits in the dimer, in contrast to the previously published ACE cryo-EM structure. We have clarified in the text that we refer to the full-length soluble region of ACE (sACE), and sACE is used to specifically refer to our construct throughout the text, except when referring to ACE in a more generalized biological context in the introduction and discussion.

      (2) The authors could show differences between the different structural states by measuring and displaying the alpha carbon distances. For example, in Figures 2A, B, 3A, and 4B and C. 

      Alpha carbon displacements for each residue have been added to the new figure 4.
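      As an illustration of the per-residue measurement described here, the sketch below computes the Cα displacement between two conformational states. It is a hedged example, not the procedure used for figure 4. The function name `ca_displacement`, the dictionary layout, and the coordinates are all hypothetical, and the two structures are assumed to be superposed already. Residues missing from either model are skipped.

```python
import math


def ca_displacement(ca_state_a, ca_state_b):
    """Distance between matching C-alpha atoms of two superposed models.

    Each argument maps residue number -> (x, y, z) of that residue's
    C-alpha atom. Only residues present in both models are compared.
    """
    shared = sorted(set(ca_state_a) & set(ca_state_b))
    return {res: math.dist(ca_state_a[res], ca_state_b[res]) for res in shared}


# Toy coordinates in angstroms (hypothetical, not taken from the sACE models).
closed_state = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
open_state = {1: (0.0, 0.0, 0.0), 2: (3.8, 3.0, 4.0), 4: (11.4, 0.0, 0.0)}

displacement = ca_displacement(closed_state, open_state)
```

      Plotting the resulting values against residue number gives the kind of displacement-versus-residue-index profile that both Reviewer #2 and Reviewer #3 asked for.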

      (3) Most figures, with a few exceptions (Figures 2 and S11), are of low quality. Perhaps they are not saved in the same format. In addition, the color schemes used throughout the figures and movies are not consistent. For example, in Figure 1 D2 domains are in green, while they appear yellow in Figure 2 and later. Please double-check all coloring schemes and keep them consistent throughout the manuscript. In addition, it would be good to keep the labeling of the domains in the subsequent figures, as it is difficult to remember which domain is which throughout the manuscript. 

      We are unsure of how to address the low-quality issue; our files and the online versions appear to be of suitably high quality. We will work with editorial staff to ensure all files are of suitable quality. The color scheme has been revised throughout the manuscript to ensure consistency and better differentiate between domains and chains.

      (4) Figure 1. Indicate exactly where in panel A ACE-N ends and ACE-C starts. Also, the pink and magenta, as well as aqua vs. light blue, are hard to distinguish. 

      We have updated the coloring scheme.

      (5) Figure 2. In the figure legend, the use of brackets for defining closed, intermediate, and open states is confusing, given that the panels are also described with brackets, and some letters match between them. Using a hyphen or bolding the abbreviations could help. Also, define chains A and B, make the black lines that I assume indicate distances in C bold or thicker as they are very hard to see in the figure, and add to the legend what those lines mean. 

      The abbreviations have been changed from parentheses to quotes, and suggestions have been implemented.

      (6) Figure 4 is confusing as shown. Since the authors mention the general range of motion in sACE-N first in the text, wouldn't it make more sense to show panel B first and then panel A? Also, can you point and label the "tip connecting the two long helices of the D1a subdomain" in the figure? It is not clear to me where this region is in B. In addition, add a description of the arrows in B and C to the figure legend. 

      Most changes incorporated. The order should make more sense now in light of other changes.

      (7) Figure 5. Can the authors add a description to the legend as to what the arrows indicate and their thickness? 

      Done

      (8) Add a scale bar to the micrograph images in the supplementary figures. 

      Figure S2 and S4 need the scale bar.

      (9) Provide a more comprehensive description of buffers used in the DF analysis, as this information could be useful to others. 

      We have included the data in Table S1.

      (10) Line 51: Reference format not consistent with other references: (Wu et al., 2023). 

      Fixed

      (11) Line 66: Define "ADAM". 

      The definition has been added.

      (12) Line 90: The authors say: "Recent open state structures of sACE-N, sACE monomer, and a sACE-N dimer, along with molecular dynamics (MD) simulations of sACE-C, have begun to reveal the conformational heterogeneity, though it remains under-studied (27-29)." Can the authors clarify what "it" refers to? The full-length ACE, sACE, or its specific domains? 

      The sentence now reads: Recent open state structures of sACE-N, sACE monomer, and a sACE-N dimer, along with molecular dynamics (MD) simulations of sACE-C, have begun to reveal ACE conformational dynamics, though they remain under-studied (29-31).

      (13) Line 204: "The comparison of our dimeric sACE cryoEM structures of reveals the conformational dynamics of sACE catalytic domains." The second "of" should be removed. 

      Fixed

      (14) Line 268: "From room mean square fluctuation (RMSF) analysis..." "room" should be replaced with "root."

      Fixed

    1. All non-mandatory command-line tool non-file arguments MUST be provided as a string via the $task.ext.args variable.

      Explanation from seqera AI

      Why this convention exists:

      Flexibility: Users can easily customize module behavior without modifying the module code itself
      Separation of concerns: Module logic stays separate from parameter configuration
      Reusability: The same module can be used with different parameters across different pipelines
      Consistency: All nf-core modules follow the same pattern for optional arguments
      

      What goes in ext.args vs. input channels:

      Input channels: Mandatory non-file arguments that are essential for the tool to function (e.g., required modes, essential parameters)
      ext.args: Optional flags, parameters with defaults, or any non-essential command-line options
      
    1. Reviewer #2 (Public review):

      Summary:

      The authors developed a cell-type specific fluorescence-tagging approach using a CRISPR/Cas9 induced split-GFP reconstitution system to visualize endogenous Bruchpilot (BRP) clusters as presynaptic active zones (AZ) in specific cell types of the mushroom body (MB) in the adult Drosophila brain. This AZ profiling approach was implemented in a high-throughput quantification process, allowing for the comparison of synapse profiles within single cells, cell types, MB compartments, and between different individuals. The aim is to analyse in more detail neuronal connectivity and circuits in this centre of associative learning. These are notoriously difficult to investigate due to the density of cells and structures within a cell. The authors detect and characterize cell-type-specific differences in BRP-dependent profiling of presynapses in different compartments of the MB, while intracellular AZ distribution was found to be stereotyped. Next to the descriptive part characterizing various AZ profiles in the MB, the authors apply an associative learning assay and detect consequent AZ re-organisation.

      Strengths:

      The strength of this study lies in the outstanding resolution of synapse profiling in the extremely dense compartments of the MB. This detailed analysis will be the entry point for many future analyses of synapse diversity in connection with functional specificity to uncover the molecular mechanisms underlying learning and memory formation and neuronal network logics. Therefore, this approach is of high importance for the scientific community and a valuable tool to investigate and correlate AZ architecture and synapse function in the CNS.

      Weaknesses:

      The results and conclusions presented in this study are, in many aspects, well-supported by the data presented. To further support the key findings of the manuscript, additional controls, comments, and possibly broader functional analysis would be helpful. In particular:

      (1) All experiments in the study are based on split-GFP lines (BRP:GFP11 and UAS-GFP1-10). The Materials and Methods section does not contain any cloning strategy (gRNA, primer, PCR/sequencing validation, exact position of tag insertion, etc.) and only refers to a bioRxiv publication. It might be helpful to add a Materials and Methods section (at least for the BRP:GFP11 line). Additionally, as this is an on-locus insertion in the BRP ORF, this line needs a general validation, including controls (Western Blot and correlative antibody staining against BRP) showing that overall BRP expression is not compromised by the GFP insertion and that the tagged protein localizes as BRP does in wild-type flies, that flies are viable and have no defects in locomotion or in learning and memory formation, and that MB morphology is not affected compared to wild-type animals.

      (2) Several aspects of image acquisition and high-throughput quantification data analysis would benefit from a more detailed clarification.

      a) For BRP cluster segmentation, the Materials and Methods state that intensity threshold and noise tolerance were "set". This setting has a large effect on the quantification, so it should be specified and the setting criteria named and justified (if set manually, how and why; if set automatically, to what). Additionally, if Python was used for "Nearest Neighbor" analysis, the code should be made available within this manuscript; otherwise, it is difficult to judge the quality of this quantification step.

      b) To better evaluate the quality of both the imaging analysis and image presentation, it would be important to state whether the presented and analysed images are deconvolved. If so, at least one proof-of-principle comparison of an original and a deconvolved file should be shown and quantified to demonstrate the impact of deconvolution on the output quality, as this is central to this study.

      (3) The major part of this study focuses on the description and comparison of the divergent synapse parameters across cell-types in MB compartments, which is highly relevant and interesting. Yet it would be very interesting to connect this new method with functional aspects of the heterogeneous synapses. This is done in Figure 7 with an associative learning approach, which is, in part, not trivial to follow for the reader and would profit from a more comprehensive analysis.

      a) It would be important for the understanding and validation of the learning induced changes, if not (only) a ratio (of AZ density/local intensity) would be presented, but both values on their own, especially to allow a comparison to the quoted, previous AZ remodelling analysis quantifying BRP intensities (ref. 17, 18). It should be elucidated in more detail why only the ratio was presented here.

      b) The reason why a single instead of a dual odour conditioning was performed could be clarified and discussed (would that have the same effects?).

      c) Additionally, "controls" for the unpaired values, that is, flies receiving neither shock nor odour, would help to evaluate the unpaired control values in the different MB compartments.

      d) The temporal resolution of the effect is very interesting (Figure 7D), and more time points, especially between 90 and 270 min, might yield interesting results.

      e) Additionally, it would be very interesting and rewarding to have at least one additional assay relating structure and function, e.g. on a molecular level by a correlative analysis of BRP and synaptic vesicles (by staining or co-expression of SV-protein markers) or calcium activity imaging, or on a functional level by additional learning assays.

    1. Writing a new test requires more effort than examining an existing one, mostly because you don’t write tests in a vacuum: you have to take into account the underlying code. And so although I focus on unit tests, I also devote a significant portion of this book to discussing code design.

      Tests are tightly coupled to the design of the code.

      A good "test" is good code design.

      It is something like permeable, in the sense that it comes out

    2. This is not to say that coverage metrics should take into account code paths in external libraries (they shouldn’t), but rather to show you that you can’t rely on those metrics to see how good or bad your unit tests are.

      When writing a test, you also have to keep in mind that however much we test external libraries or modules, there are certain hidden cases that are not taken into consideration.

      For example, a .parse() method has certain paths that are not visible from the outside, such as edge cases: what happens if it receives a null, an empty string, or a very long string?

    3. Figure 1.4 The branch metric is calculated as the ratio of the number of code branches exercised by the test suite and the total number of branches in the production code base

      Prompt: What is branch coverage and how is it calculated?
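
As a rough answer to the prompt, the ratio in the caption can be sketched in Python. The function names `branch_coverage` and `is_string_long` are hypothetical and chosen only for illustration; they do not come from the book or from any coverage tool.

```python
def branch_coverage(branches_exercised: int, total_branches: int) -> float:
    """Branch coverage = branches exercised by the tests / total branches."""
    if total_branches <= 0:
        raise ValueError("total_branches must be positive")
    if not 0 <= branches_exercised <= total_branches:
        raise ValueError("branches_exercised must lie between 0 and total_branches")
    return branches_exercised / total_branches


def is_string_long(text: str) -> bool:
    # A single `if` contributes two branches: the True path and the False path.
    if len(text) > 5:
        return True
    return False


# A test suite that only calls is_string_long("abc") exercises the False
# branch alone, so one of the two branches is covered.
print(branch_coverage(1, 2))
```

A high value of this ratio only says that branches were executed; as the earlier annotation notes, it does not show whether the tests verify anything useful along those branches.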

    4. In software, entropy manifests in the form of code that tends to deteriorate. Each time you change something in a code base, the amount of disorder in it, or entropy, increases. If left without proper care, such as constant cleaning and refactoring, the system becomes increasingly complex and disorganized.

      At first you think writing tests is unnecessary, but as the system grows, tests become a kind of safety net for the changes introduced into the system.

    1. Key modules. The action module (left) executes tasks such as retrieving reference datasets, converting gene names, verifying ligand–receptor interactions using existing databases, processing data with established software packages (e.g., numpy) or generating and executing custom code, while reasoning over and aggregating information from multiple sources
    1. To migrate this code to DSL2, you need to move all of your channel logic throughout the script into a workflow definition

      seqscreen was written in DSL1, needs to be migrated (Todd)

    1. has deep knowledge of the errors

      What could be the source of this knowledge? - Maybe a human in the loop training with automated code gen + linter use? - Grazing on forums?

      able to identify the root cause of errors, help troubleshoot, and suggest edits

    2. not only give you the initial conversion, but also run the stages of the code that it generates with sample data and iteratively correct any code that yields runtime errors
    3. convert a pipeline from Bash/CWL/WDL to Nextflow

      use cases

      can not only give you the initial conversion, but also run the stages of the code that it generates with sample data and iteratively correct any code that yields runtime errors

    4. Seqera AI – a bioinformatics agent purpose-built for the scientific lifecycle

      Seqera-AI can - Suggest pipelines (tested and validated) - Answer bioinformatics questions with context - Generate Nextflow code + validate/self-correct (when would someone use this?)

      context retrieved: - Can retrieve context for writing and testing nextflow code - context of pipeline results to aid interpretation

      source: Summarized from text below

    1. This new specification enables more specific error reporting, ensures more consistent code, and will allow the Nextflow language to evolve independently of Groovy.
    2. strict syntax will eventually become the only way to write Nextflow code, and new language features will be implemented only in the strict syntax
    1. Improvements to NanoPlot and NanoComp are, among code optimizations, the generation of additional plots, using dynamic HTML plots from the Plotly library, and enabling further exploration by the end users
    1. We prefer to be explicit to aid code clarity, as such the $it syntax is discouraged and will slowly be phased out of the Nextflow language.
    1. write another pipeline that calls on one of those processes, you just need to type one short import statement to use the relevant module. This is better than just copy-pasting the code, because if later you decide to improve the module, all your pipelines will inherit the improvements.
    1. I rarely wait, because I'm juggling multiple projects. When one agent instance is working, I switch to another window. Sometimes it's a separate git worktree of the same codebase. Yes, context switching is tiring, but it also seems to help me overcome ADHD-related activation energy barriers? Over the years, there've been days when I just sit there staring at the IDE window, poking my brain with a stick saying "c'mon, do something" and nothing happens for an hour or more. I'm not planning my next move, I'm just dissociating. My executive function doesn't, like, function. Often. My own brain makes me wait long periods of time before it starts generating useful results. 😅 Maybe it's the cycling novelty that keeps me going? I enjoy task switching between prosing and coding. I enjoy finding that the model appears to have "read" everything—evidenced by it echoing my intent back in code or follow-up questions. I enjoy discovering that while I was in another window, new things happened in the background for me to review.

      It'd be interesting to see if people can e.g. work assisted for more hours before getting tired in a day. I do suspect that the perception of going faster is maybe a shift in distribution over tasks that feel tedious and tasks that feel like they go quickly (independently of reality). But: most of productivity is really emotional management, so...

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      A) The presentation of the paper must be strengthened. Inconsistencies, mislabelling, duplicated text, typos, and inappropriate colour code should be changed.

      We spotted and corrected several inconsistencies and mislabelling issues throughout the text and figures. Thanks!  

      B) Some claims are not supported by the data. For example, the sentence that says that "adolescent mice showed lower discrimination performance than adults" (l.22) should be rewritten, as the data does not show that for the easy task (Figure 1F and Figure 1H).

      We carefully reviewed the specific claims and fixed some of the wording so it adheres to the data shown.

      C) In Figure 7 for example, are the quantified properties not distinct across primary and secondary areas?

      We now carried out additional analysis to test this. We found that while AUDp and AUDv exhibit distinct tuning properties, they show similar differences between adolescent and adult neurons (see Supplementary Table 6, Fig. S7-1a-h). Note that TEa and AUDd could not be evaluated due to low numbers of modulated neurons in this protocol.

      D) Some analysis interpretations should be more cautious. (..) A lower lick rate in general could reflect a weaker ability to withhold licking- as indicated on l.164, but also so many other things, like a lower frustration threshold, lower satiation, more energy, etc).

      That is a fair comment, and we refined our interpretations. Moreover, we also addressed whether impulsiveness impacted lick rates. In the Educage, we found that adolescent mice had shorter ITIs only after FAs (Fig. S2-1). In the head-fixed setup, we examined (1) the proportion of ITIs where licks occurred (Fig. S3-1c) and (2) the number of licks in these ITIs (Fig. S3-1d). We found no differences between adolescents and adults, indicating that the differences observed in the main task are not due to general differences in impulsiveness (Fig. S2-1, Fig. S3-1c, d). Finally, we note that potential differences in satiation were already addressed in the original manuscript by carefully examining the number of trials completed across the session. See also Reviewer 3, comment #1 below.

      Reviewer #2 (Public review):

      A) For some of the analyses that the authors conducted it is unclear what the rationale behind them is and, consequently, what conclusion we can draw from them.

      We reviewed the manuscript carefully and revised the relevant sections to clarify the rationale behind the analyses. See detailed responses to all the reviewer’s specific comments.

      B) The results of optogenetic manipulation, while very interesting, warrant a more in-depth discussion.

      We expanded our discussion on these experiments (L495-511) and also added an additional analysis to strengthen our findings (Fig. S3-2e).

      Reviewer #3 (Public review):

      (1) The authors report that "adolescent mice showed lower auditory discrimination performance compared to adults" and that this performance deficit was due to (among other things) "weaker cognitive control". I'm not fully convinced of this interpretation, for a few reasons. First, the adolescents may simply have been thirstier, and therefore more willing to lick indiscriminately. The high false alarm rates in that case would not reflect a "weaker cognitive control" but rather, an elevated homeostatic drive to obtain water. Second, even the adult animals had relatively high (~40%) false alarm rates on the freely moving version of the task, suggesting that their behavior was not particularly well controlled either. One fact that could help shed light on this would be to know how often the animals licked the spout in between trials. Finally, for the head-fixed version of the task, only d' values are reported. Without the corresponding hit and false alarm rates (and frequency of licking in the intertrial interval), it's hard to know what exactly the animals were doing.

      First, as requested, we added the Hit rates and FA rates for the head-fixed task (Fig. S3-1a). Second, as requested by the reviewer, we performed additional analyses in both the Educage and head-fixed versions of the task. Specifically, we analyzed the ITI duration following each trial outcome. We found that adolescent mice had shorter ITIs only after FAs (Fig. S2-1). In the head-fixed setup, we examined (1) the proportion of ITIs during which licks occurred (Fig. S3-1c) and (2) the number of licks in these ITIs (Fig. S3-1d). We found no differences between adolescents and adults, indicating that the differences observed in the main task are not due to general differences in impulsiveness (Fig. S2-1, Fig. S3-1c, d). See also comment #D of reviewer #1 above.

      B) There are some instances where the citations provided do not support the preceding claim. For example, in lines 64-66, the authors highlight the fact that the critical period for pure tone processing in the auditory cortex closes relatively early (by ~P15). However, one of the references cited (ref 14) used FM sweeps, not pure tones, and even provided evidence that the critical period for this more complex stimulus occurred later in development (P31-38). Similarly, on lines 72-74, the authors state that "ACx neurons in adolescents exhibit high neuronal variability and lower tone sensitivity as compared to adults." The reference cited here (ref 4) used AM noise with a broadband carrier, not tones.

      We carefully checked the text to ensure that each claim is accurately supported by the corresponding reference.

      C) Given that the authors report that neuronal firing properties differ across auditory cortical subregions (as many others have previously reported), why did the authors choose to pool neurons indiscriminately across so many different brain regions?

      We appreciate the reviewer’s concern. While we acknowledge that pooling neurons across auditory cortical subregions may obscure region-specific effects, our primary focus in this study is on developmental differences between adolescents and adults, which were far more pronounced than subregional differences.

      To address this potential limitation: (1) We analyzed firing differences across subregions during task engagement (see Fig. S4-1, S4-2, S4-3; Supplementary Tables 2 and 3). (2) We have now added new analyses for the passive listening condition in AUDp and AUDv (Fig. S7-1; Supplementary Table 6).

      These analyses support our conclusion that developmental stage has a greater impact on auditory cortical activity than subregional location in the contexts examined. For clarity and cohesion, the main text emphasizes developmental differences, while subregional analyses are presented in the Supplement.

      D) And why did they focus on layers 5/6? (Is there some reason to think that age-related differences would be more pronounced in the output layers of the auditory cortex than in other layers?)

      We agree that other cortical layers, particularly supragranular layers, are important for auditory processing and plasticity. Our focus on layers 5/6 was driven by both methodological and biological considerations. Methodologically, our electrode penetrations were optimized to span multiple auditory cortical areas, and deeper layers provided greater mechanical stability for chronic recordings. Biologically, layers 5/6 contain the principal output neurons of the auditory cortex and are well-positioned to influence downstream decision-making circuits. We acknowledge the limitation of our recordings to these layers in the manuscript (L268; L464-8).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The presentation of the paper must be strengthened. As it is now, it makes it difficult to appreciate the strengths of the results. Here are some points that should be addressed:

      a) The manuscript is full of inconsistencies that should be fixed to improve the reader's understanding. For example, the description on l.217 and the Figure. S3-1b, the D' value of 0 rounded to 0.01 on l. 735 (isn't it rather the z-scored value that is rounded? A D' of 0 is not a problem), the definition of lick bias on l. 750 and the values in Fig.2, the legend of Figure 7F and what is displayed on the graph (is it population sparseness or responsiveness?), etc.

      We adjusted the legend and description of former Fig. S3-1b (now Fig. S3-2b).

      We now clarify that the rounded values refer to z-scored hit and false alarm rates that we used in the d’ calculation. We adjusted the definition of the lick bias in Fig. 2 and Fig. S3-1b (L804).
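
For readers unfamiliar with the measure, d' is conventionally the difference between the z-transformed hit rate and false-alarm rate. The following is a minimal sketch of that standard formula, not the analysis code used in the manuscript. The function name `d_prime` and the clipping value `floor=0.01` are assumptions made for illustration; clipping is one common way to keep the z-transform finite when a rate is exactly 0 or 1.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float, floor: float = 0.01) -> float:
    """d' = z(hit rate) - z(false-alarm rate).

    Rates are clipped to [floor, 1 - floor] (an assumed convention) because
    the inverse normal CDF is undefined at exactly 0 and 1.
    """
    ceiling = 1.0 - floor
    hit = min(max(hit_rate, floor), ceiling)
    fa = min(max(fa_rate, floor), ceiling)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit) - z(fa)


print(d_prime(0.9, 0.1))  # clearly discriminating observer
print(d_prime(0.5, 0.5))  # equal hit and false-alarm rates give d' = 0
```

In this sketch, equal hit and false-alarm rates yield a d' of exactly 0 without any adjustment; only rates at the extremes are altered by the clipping.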

      We replaced ‘population responsiveness’ with ‘population sparseness’ throughout the figures, legend and the text.

      b) References to figures are sometimes wrong (for example on l. 737,739).

      c) Some text is duplicated (for example l. 814 and l. 837).

      d) Typos should be corrected (for example l. 127, 'the', l. 787, 'upto').

      We deleted the incorrect references of this section, removed the duplicated text, and corrected the typos.

      e) Color code should be changed (for example the shades of blue for easy and hard tasks - they are extremely difficult to differentiate).

      After consideration, we decided to retain the blue color code (i.e., Fig. 1d, Fig. 3d, Fig. 4e-g, Fig. 5c, Fig. 6d–g), where the distinction between the shades of blue appears sufficiently clear and maintains visual consistency and aesthetic appeal. We did, however, make changes to the other color codes (Fig. 4, Fig. 5, Fig. 6, Fig. 7).

      f) Figure design should be improved. For example, why is a different logic used for displaying Figure 5A or B and Figure 1E?

      We adjusted the color scheme in Fig. 5. We chose to represent the data in Fig. 5 according to task difficulty, as this arrangement best illustrates the more pronounced deficits in population decoding in adolescents during the hard task.

      f) Why use a 3D representation in Figure 4G?

      The 3D representation in Fig. 4g was chosen to illustrate the 3-way interactions between onset-latency, maximal discriminability, and duration of discrimination.

      g) Figure 1A, lower right panel- should "response" not be completed by "lick", "no lick"?

      We changed the labels to “Lick” and “No Lick” in Fig. 1a.

      h) l.18 the age mentioned is misleading, because the learning itself actually started 20 days earlier than what is cited here.

      Corrected.

      i) Explain what AAV5-... is on l.212.

      We added an explanation of virus components (see L216-220).

      (2) The comparison of CV in Figure 2 H-J is interesting. I am curious to know whether the differences in the easy and hard tasks could be due to a decrease in CV in adults, rather than an increase in CV in adolescents? Also, could the difference in J be due to 3 outliers?

      We agree that the observed CV differences may reflect a reduction in variability in adults rather than an increase in adolescents. We have revised the Results section accordingly to acknowledge this interpretation.

      Regarding the concern about potential outliers in Fig. 2J, we tested the data for outliers using the isoutlier function in MATLAB (defining outliers as values exceeding three standard deviations from the mean) and found no such cases.
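
The three-standard-deviation rule described above can be sketched in Python as follows. This is an illustrative re-expression of the stated criterion, not the MATLAB code that was run; the function name `flag_outliers_three_sd` and the example values are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers_three_sd(values, n_sd=3.0):
    """Return one boolean per value: True if it lies more than n_sd
    sample standard deviations from the mean."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (N - 1 denominator)
    return [abs(v - mu) > n_sd * sd for v in values]


# Hypothetical data: twenty identical values and one extreme value.
example = [1.0] * 20 + [100.0]
flags = flag_outliers_three_sd(example)
print(sum(flags))  # number of flagged values
```

Because the mean and standard deviation are themselves computed from the data, an extreme value inflates the threshold it is compared against, so this criterion is conservative for small samples.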

      (3) Figure 2c shows that there is no difference in perceptual sensitivity between adolescents and adults, whereas the conclusion from Figure 4 is that adolescents exhibit lower discriminability in stimulus-related activity. Aren't these results contradictory?

      This is a nuanced point. The similar slopes of the psychometric functions (Fig. 2c) indicating comparable perceptual sensitivity and the lower AUC observed in the ACx of adolescents (Fig. 4) do not necessarily contradict each other. These two measures capture related but distinct issues: psychometric slopes reflect behavioral output, which integrates both sensory encoding and processing downstream to ACx, while the AUC analysis reflects stimulus-related neural activity in ACx, which may still include decision-related components.

      Note that stimulus-related neural discriminability outside the context of the task is not different between adolescent and adult experts (Fig. 7h; p = 0.9374, Kruskal-Wallis test after Tukey-Kramer correction for multiple comparisons; not discussed in the manuscript). This suggests that there are differences that emerge when we measure during behavior. Also note that behavior may rely on processing beyond ACx, and it is possible that downstream areas compensate for weaker cortical discriminability in adolescents, but this issue merits further investigation.

      (4) Why do you think that the discrimination in hard tasks decreases with learning (Figure 6D vs Figure 6F)?

      This is another nuanced point, and we can only speculate at this stage. While it may appear counterintuitive that single-neuron discriminability (AUC) for the hard task is reduced after learning (Fig. 6D vs. 6F), we believe this may reflect a shift in sensory coding in expert animals. In a recent study (Haimson et al., 2024; Science Advances), we found that learning alters single-neuron responses in the easy versus hard task in complex and distinct ways, which may account for this result. It is also possible that, in expert mice, top-down mechanisms such as feedback from higher-order areas act to suppress or stabilize sensory responses in auditory cortex, reducing the apparent stimulus selectivity of single neurons (e.g., AUC), even as behaviorally relevant information is preserved or enhanced at the population level.

      Reviewer #2 (Recommendations for the authors):

      This is very interesting work and I enjoyed reading the manuscript. See below for my comments, queries and suggestions, which I hope will help you improve an already very good paper.

      We thank the reviewer for the meticulous and thoughtful review.

      (1) Line 107: x-axis of panel 1e says 'pre-adolescent'.

      (2) Line 130: replace 'less' with 'fewer'.

      (3) Line 153: 'both learned and catch trials': I find the terminology here a bit confusing. I would typically understand a catch trial to be a trial without a stimulus but these 'catch' trials here have a stimulus. It's just that they are not rewarded/punished. What about calling them probe trials instead?

      We corrected the labelling (1), reworded to ‘fewer’ and ‘probe trials’ (2,3).

      (4) Line 210: The results of the optogenetics experiments are very interesting. In particular, because the effect is so dramatic and much bigger than what has been reported in the literature previously, I believe. Lick rates are dramatically reduced suggesting that the mice have pretty much stopped engaging in the task and the authors very rightly state that the 'execution' of the behavior is affected. I think it would be worth discussing the implications of these results more thoroughly, perhaps also with respect to some of the lesion work. Useful discussions on the topic can be found, for instance, in Otchy et al., 2015; Hong et al., 2018; O'Sullivan et al., 2019; Ceballo et al., 2019 and Lee et al., 2024. Are the mice unable to hear anything in laser trials and that is why they stopped licking? If they merely had trouble distinguishing them then we would perhaps expect the psychometric curves to approach chance level, i.e. to be flat near the line indicating a lick rate of 0.5. Could the dramatic decrease in lick rate be a motor issue? Can we rule out spillover of the virus to relevant motor areas? (I understand all of the 200nL of the virus were injected at a single location) Or are the effects much more dramatic than what has been reported previously simply because the GtACR2 is much more effective at silencing the auditory cortex? Could the effect be down to off-target effects, e.g. by removing excitation from a target area of the auditory cortex, rather than the disruption of cortical processing?

      We have now expanded the discussion in the manuscript to more thoroughly consider alternative interpretations of the strong behavioral effect observed during ACx silencing (L495–511). In particular, we acknowledge that the suppression of licking may reflect not only impaired sensory discrimination but also broader disruptions to arousal, motivation, or motor readiness. We also discuss the potential impact of viral spread, circuit-level off-target effects, and the potency of GtACR2 as possible contributors. We highlight the need for future work using more graded or temporally precise manipulations to resolve these issues.

      (5) Line 226: Reference 19 (Talwar and Gerstein 2001) is not particularly relevant as it is mostly concerned with microstimulation-induced A1 plasticity. There are, however, several other papers that should be cited (and potentially discussed) in this context. In particular, O'Sullivan et al., 2019 and Ceballo et al., 2019 as these papers investigate the effects of optogenetic silencing on frequency discrimination in head-fixed mice and find relatively modest impairments. Also relevant may be Kato et al., 2015 and Lee et al., 2024, although they look at sound detection rather than discrimination.

      We changed the references and pointed the reader to the new Discussion section.

      (6) Line 253: 'engaged [in] the task'.

      (7) Figure 4: It appears that panel S4-1d is not referred to anywhere in the main text.

      Fixed.

      (8) Line 260: Might be useful to explain a bit more about the motivation behind focusing on L5/L6. Are there mostly theoretical considerations, i.e. would we expect the infragranular layers to be more relevant for understanding the difference in task performance? Or were there also practical considerations, e.g. did the data set contain mostly L5/L6 neurons because those were easier to record from given the angle at which the probe was inserted? If those kinds of practical considerations played a role, then there is nothing wrong with that but it would be helpful to explain them for the benefit of others who might try a similar recording approach.

      There were no deep theoretical considerations for targeting L5/6.  Our focus on layers 5/6 was driven by both methodological and biological considerations. Methodologically, our electrode penetrations were optimized to span multiple auditory cortical areas, and deeper layers provided greater mechanical stability for chronic recordings. Biologically, layers 5/6 contain the principal output neurons of the auditory cortex and are well-positioned to influence downstream decision-making circuits. We acknowledge the limitation of our recordings to these layers in the manuscript (L268; L463–467). See also comment D of reviewer 3.

      (9) Supplementary Table 2: The numbers in brackets indicate fractions rather than percentages.

      Fixed.

      (10) Figure S4-3: The figure legend implies that the number of neurons with significant discriminability for the hard stimulus and significant discriminability for choice was identical. (adolescent neurons = 368, mice = 5, recordings = 10; adult n = 544, mice = 6, recordings = 12 in both cases). Presumably, that is not actually the case and rather the result of a copy/paste operation gone wrong. Furthermore, I think it would be helpful to state the fractions of neurons that can discriminate between the stimuli and between the choices that the animal made in the main text.

      Thank you for spotting the mistake. We corrected the n’s and added the percentage of neurons that discriminate stimulus and choice in the main text and the figure legend.

      (11) Line 301: 'We used a ... decoder to quantify hit versus correct reject trial outcomes': I'm not sure I understand the rationale here. For the single unit analysis hit and false alarm trials were compared to assess their ability to discriminate the stimuli. FA and CR trials were compared to assess whether neurons can encode the choice of the mice. But the hit and CR trials which are contrasted here differ in terms of both stimulus and behavior/choice so what is supposed to be decoded here, what is supposed to be achieved with this analysis?

      Thank you for this important point. You're correct that comparing hit and CR trials captures differences in both stimulus and choice, or task-related differences. We chose this contrast for the population decoding analysis to achieve higher trial counts per session and similar numbers of trials, both of which are necessary for the reliability of the analysis. While this approach does not isolate stimulus from choice encoding, it provides an overall measure of how well population activity distinguishes task-relevant outcomes. We explicitly acknowledge this issue in L313-314.

      (12) Line 332: What do you mean when you say the novice mice were 'otherwise fully engaged' in the task when they were not trained to do the task and are not doing the task?

      By "otherwise fully engaged," we mean that novice mice were actively participating in the task environment, similar to expert mice — they were motivated by thirst and licked the spout to obtain water. The key distinction is that novice mice had not yet learned the task rules and likely relied on trial-and-error strategies, rather than performing the task proficiently.

      (13) Line 334: 'regardless of trial outcome': Why is the trial outcome not taken into account? What is the rationale for this analysis? Furthermore, in novice mice a substantial proportion of the 'go' trials are misses. In expert mice, however, the proportion of 'miss trials' (and presumably false alarms) will by definition be much smaller. Given this, I find it difficult to interpret the results of this section.

      This approach was chosen to reliably decode a sufficient number of trials for each task difficulty (i.e. expert mice predominantly performed CRs on No-Go trials and novice mice often showed FAs). Utilizing all trial outcomes ensured that we had enough trials for each stimulus type to accurately estimate the AUCs. This approach avoids introducing biases due to uneven trial numbers across learning stages.

      (14) Line 378: 'differences between adolescents and adults arise primarily from age': Are there differences in any of the metrics shown in 7e-h between adolescents and adults?

      We confirm that differences between adolescents and adults are indeed present in some metrics but not others in Figure 7e–h. Specifically, while tuning bandwidth was similar in novice animals, it was significantly lower in adult experts (Fig. 7e; novice: p = 0.0882; expert: p = 0.0001, Kruskal-Wallis test after Tukey-Kramer correction for multiple comparisons; not discussed in the manuscript). The population sparseness was similar in both novice and expert adolescent and adult neurons (Fig. 7f; novice: p = 0.2873; expert: p = 0.1017, Kruskal-Wallis test after Tukey-Kramer correction for multiple comparisons; not discussed in the manuscript). The distance to the easy go stimulus was similar in novice animals, but lower in adult experts (Fig. 7g; novice: p = 0.7727; expert: p = 0.0001, Kruskal-Wallis test after Tukey-Kramer correction for multiple comparisons; not discussed in the manuscript). The neuronal d-prime was similar in both novice and expert adolescent and adult neurons (Fig. 7h; novice: p = 0.7727; expert: p = 0.0001, Kruskal-Wallis test after Tukey-Kramer correction for multiple comparisons; not discussed in the manuscript).

      (15) Line 475: '...well and beyond...': something seems to be missing in this statement.

      (16) Line 487: 'onto' should be 'into', I think.

      (17) Line 610 and 613: '3 seconds' ... '2.5 seconds': Was the response window 3s or 2.5s?

      (18) Line 638: 'set' should be 'setup', I believe.

      All the mistakes mentioned above were fixed. Thanks.

      (19) Line 643: 'Reward-reinforcement was delayed to 0.5 seconds after the tone offset': Presumably, if they completed their fifth lick later than 0.5 seconds after the tone, the reward delivery was also delayed?

      Apologies for the lack of clarity. In the head-fixed version, there was no lick threshold. Mice were reinforced after a single lick. If that lick occurred after the 0.5-second reinforcement delay following tone offset, the reward or punishment was delivered immediately upon licking.

      (20) Line 661: 'effect [of] ACx'.

      (21) Line 680: 'a base-station connected to chassis'. The sentence sounds incomplete.

      (22) Line 746: 'infliction', I believe, should say 'inflection'.

      (23) Line 769: 'non-auditory responsive units': Shouldn't that simply say 'non-responsive units'? The way it is currently written I understand it to mean that these units were responsive (to some other modality perhaps) but not to auditory stimulation.

      (24) Line 791: 'bins [of] 50ms'.

      (25) Line 811: 'all of' > 'of all'.

      (26) Line 814: Looks like the previous paragraph on single unit analysis was accidentally repeated under the wrong heading.

      (27) Line 817: 'encoded' should say 'calculated', I believe.

      All the mistakes mentioned above were fixed. Thanks.

      (28) Line 869: 'bandwidth of excited units': Not sure I understand how exactly the bandwidth, i.e. tuning width was measured.

      We acknowledge that our previous answer was unclear and expanded the Methods section. To calculate bandwidth, we identified significant tone-evoked responses by comparing activity during the tone window to baseline firing rates at 62 dB SPL (p < 0.05). For each neuron, we counted the number of contiguous frequencies with significant excitatory responses, subtracting isolated false positives to correct for chance. We then converted this count into an octave-based bandwidth by multiplying the number of frequency bins by the octave spacing between them (0.1661 octaves per step).
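
      The conversion described above can be sketched in a few lines. This is only an illustration of the counting step, not the authors' code: the function names and the example significance flags are hypothetical, and taking the longest contiguous run (so that isolated significant bins are ignored) is one possible reading of "subtracting isolated false positives". Only the 0.1661-octave spacing comes from the response.

```python
OCTAVE_STEP = 0.1661  # octave spacing between adjacent tone frequencies (from the response)


def longest_run(significant):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in significant:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def bandwidth_octaves(significant, octave_step=OCTAVE_STEP):
    """Bandwidth = number of contiguous significant frequency bins x octave spacing.

    Assumption: isolated significant bins outside the longest run are treated
    as chance responses and do not contribute to the count.
    """
    return longest_run(significant) * octave_step


# Hypothetical neuron: significance flags (p < 0.05) for 10 ascending tone frequencies.
flags = [False, True, False, False, True, True, True, True, False, False]
print(round(bandwidth_octaves(flags), 4))  # 4 contiguous bins x 0.1661 octaves
```

      In this made-up example the single significant bin at the second frequency is ignored, and the four adjacent bins give a bandwidth of roughly 0.66 octaves.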

      (29) Line 871: 'population sparseness': Is that the fraction of tone frequencies that produced a significant response? I would have thought that this measure is very highly correlated to your measure of bandwidth, to the point of being redundant, but I may have misunderstood how one or the other is calculated. Furthermore, the Y label of Figure 7f says 'responsiveness' rather than sparseness and that would seem to be the more appropriate term because, unless I am misunderstanding this, a larger value here implies that the neuron responded to more frequencies, i.e. in a less sparse manner.

      We have clarified the use of the term "population sparseness" and updated the Y-axis label in Figure 7f to better reflect this measure. This metric reflects the fraction of tone–attenuation combinations that elicited a significant excitatory response across the entire population of neurons, not within individual units.

      While this measure is related to bandwidth, it captures a distinct property of the data. Bandwidth quantifies how broadly or narrowly a single neuron responds across frequencies at a fixed intensity, whereas population sparseness reflects how distributed responsiveness is across the population as a whole. Although the two measures are related, since broadly tuned neurons often contribute to lower population sparseness, they capture distinct aspects of neural coding and are not redundant.

      (30) Line 881: I think this line should refer to Figure 7h rather than 7g.

      Fixed.

      Reviewer #3 (Recommendations for the authors):

      (1) In the Educage, water was only available when animals engaged in the task; however, there is no mention of whether/how animal weight was monitored.

      In the Educage, mice had continuous access to water by voluntarily engaging in the task, which they could perform at any time. Although body weight was not directly monitored, water access was essentially ad libitum, and mice performed hundreds of trials per day, thereby ensuring sufficient daily intake. This approach allowed us to monitor hydration (ad libitum food is supplied in the home cage). The 24/7 setup, including automated monitoring of trial counts and water consumption, was reviewed and approved by our institutional animal care and use committee (IACUC).

      (2) In Figure 2B-C and Figure 2E, the y-axis reads "lick rate". At first glance, I took this to mean "the frequency of licking" (i.e. an animal typically licks at a rate of 5 Hz). However, what the authors actually are plotting here is the proportion of trials on which an animal elicited >= 5 licks during the response window (i.e. the proportion of "yes" responses). I recommend editing the y-axis and the text for clarity.

      We replaced the y-label and adjusted the figure legend (Fig. 2).

      (3) I didn't see any examples of raw (filtered) voltage traces. It would be worth including some to demonstrate the quality of the data.

      We have added an example of a filtered voltage trace aligned to tone onset in Fig. S4-1a to illustrate data quality. In addition, all raw and processed voltage traces, along with relevant analysis code, are available through our GitHub repository and the corresponding dataset on Zenodo.

      (4) The description of the calculation of bias (C) in the methods section (lines 749-750) is incorrect. The correct formula is C = -0.5 * [z(hit rate) + z(fa rate)]. I believe this is the formula that the authors used, as they report negative C values. Please clarify or correct.

      Thanks for spotting this. It is now corrected.
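
      For readers who want to check the sign convention, the reviewer's formula can be written out as a minimal sketch. The function names and the example rates are hypothetical; d' is included only as the standard signal-detection companion to C, and rates of exactly 0 or 1 would need a correction before the z-transform, which is not shown here.

```python
from statistics import NormalDist


def z(p):
    """Inverse of the standard normal CDF (z-transform); requires 0 < p < 1."""
    return NormalDist().inv_cdf(p)


def d_prime(hit_rate, fa_rate):
    """Sensitivity: d' = z(hit rate) - z(false-alarm rate)."""
    return z(hit_rate) - z(fa_rate)


def criterion(hit_rate, fa_rate):
    """Bias: C = -0.5 * [z(hit rate) + z(false-alarm rate)]."""
    return -0.5 * (z(hit_rate) + z(fa_rate))


# Hypothetical session: the animal licks on 90% of Go and 30% of No-Go trials.
print(d_prime(0.9, 0.3), criterion(0.9, 0.3))
```

      With this convention, an animal that tends to lick (high hit rate and high false-alarm rate) gets a negative C, which matches the negative values the reviewer notes in the manuscript.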

      (5) The authors use the terms 'naïve' and 'novice' interchangeably. I suggest sticking with one term to avoid potential confusion.

      (6) Multiple instances: "less trials/day" should be "fewer trials/day"

      (7) Supplementary Table 2: The values reported are proportions, not percentages. Please correct.

      (8) Line 270: Table 2 does not show the number of neurons in the dataset categorized by region. Perhaps the authors meant Supplementary Table 2?

      Fixed. Thank you for pointing these mistakes out.

      (9) Figure 5C: the data from the hard task are entirely obscured by the data from the easy task. I recommend splitting it into two different plots.

      We agree and split the decoding of the easy and the hard task into two graphs (left: easy task; right: hard task). Thank you!

      (10) How many mice contributed to each analyzed data set? Could the authors provide a breakdown in a table somewhere of how many neurons were recorded in each mouse and which ones were included in which analyses?

      We added an overview of the analyzed datasets in supplementary Table 7. Please note that the number of mice and neurons used in each analysis is also reported in the main text and legends. Importantly, all primary analyses were conducted using LME models, which explicitly account for hierarchical data structure and inter-mouse variability, thereby addressing potential concerns about data imbalance or bias.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors recorded activity in the posterior parietal cortex (PPC) of monkeys performing a perceptual decision-making task. The monkeys were first shown two choice dots of two different colors. Then, they saw a random dot motion stimulus. They had to learn to categorize the direction of motion as referring to either the right or left dot. However, the rule was based on the color of the dot and not its location. So, the red dot could either be to the right or left, but the rule itself remained the same. It is known from past work that PPC neurons would code the learned categorization. Here, the authors showed that the categorization signal depended on whether the executed saccade was in the same hemifield as the recorded PPC neuron or in the opposite one. That is, if a neuron categorized the two motion directions such that it responded stronger for one than the other, then this differential motion direction coding effect was amplified if the subsequent choice saccade was in the same hemifield. The authors then built a computational RNN to replicate the results and make further tests by simulated "lesions".

      Strengths:

      Linking the results to RNN simulations and simulated lesions.

      Weaknesses:

      Potential interpretational issues due to a lack of explicit evidence on the sizes and locations of the response fields of the neurons. For example, is the contra/ipsi effect explained by the fact that in the contra condition, the response target and the saccade might have infringed on the outer edges of the response fields?

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      This paper summarises responses from a survey completed by around 5,000 academics on their manuscript submission behaviours. The authors find several interesting stylised facts, including (but not limited to):

      - Women are less likely to submit their papers to highly influential journals (*e.g.*, Nature, Science and PNAS).

      - Women are more likely to cite the demands of co-authors as a reason why they didn't submit to highly influential journals.

      - Women are also more likely to say that they were advised not to submit to highly influential journals.

      Recommendation

      This paper highlights an important point, namely that the submission behaviours of men and women scientists may not be the same (either due to preferences that vary by gender, selection effects that arise earlier in scientists' careers or social factors that affect men and women differently and also influence submission patterns). As a result, simply observing gender differences in acceptance rates---or a lack thereof---should not be automatically interpreted as evidence for or against discrimination (broadly defined) in the peer review process. I do, however, make a few suggestions below that the authors may (or may not) wish to address.

      We thank the referee for this comment and for the following suggestions, which we take into account in our revision of the manuscript.

      Major comments

      What do you mean by bias?

      In the second paragraph of the introduction, it is claimed that "if no biases were present in the case of peer review, then 'we should expect the rate with which members of less powerful social groups enjoy successful peer review outcomes to be proportionate to their representation in submission rates." There are a couple of issues with this statement.

      - First, the authors are implicitly making a normative assumption that manuscript submission and acceptance rates *should* be equalised across groups. This may very well be the case, but there can also be important reasons why not -- e.g., if men are more likely to submit their less ground-breaking work, then one might reasonably expect that they experience higher rejection rates compared to women, conditional on submission.

      We do assume that normative statement: unless we believe that men’s papers are intrinsically better than women’s papers, the acceptance rate should be the same. But the referee is right: we have no way of controlling for the intrinsic quality of the work of men and women. That said, our manuscript does not show that there is a different acceptance rate for men and women; it shows that women are less likely to submit papers to a subset of journals that are of a lower Journal Impact Factor, controlling for their most cited paper, in an attempt to control for intrinsic quality of the manuscripts.

      - Second, I assume by "bias", the authors are taking a broad definition, i.e., they are not only including factors that specifically relate to gender but also factors that are themselves independent of gender but nevertheless disproportionately are associated with one gender or another (e.g., perhaps women are more likely to write on certain topics and those topics are rated more poorly by (more prevalent) male referees; alternatively, referees may be more likely to accept articles by authors they've met before, most referees are men and men are more likely to have met a given author if he's male instead of female). If that is the case, I would define more clearly what you mean by bias. (And if that isn't the case, then I would encourage the authors to consider a broader definition of "bias"!)

      Yes, the referee is right that we are taking a broad definition of bias. We provide a definition of bias on page 3, line 92. This definition is focused on differential evaluation which leads to differential outcomes. We also hedge our conversation (e.g., page 3, line 104) to acknowledge that observations of disparities may only be an indicator of potential bias, as many other things could explain the disparity. In short, disparities are a necessary but insufficient indicator of bias. We add a line in the introduction to reinforce this. The only other reference to the term bias comes on page 10, line 276. We add a reference to Lee here to contextualize.

      Identifying policy interventions is not a major contribution of this paper

      In my opinion, the survey evidence reported here isn't really strong enough to support definitive policy interventions to address the issue and, indeed, providing policy advice is not a major -- or even minor -- contribution of your paper, so I would not mention policy interventions in the abstract. (Basically, I would hope that someone interested in policy interventions would consult another paper that much more thoughtfully and comprehensively discusses the costs and benefits of various interventions!)

      We thank the referee for this comment. While we agree that our results do not lead to definitive policy interventions, we believe that our findings point to a phenomenon that should be addressed through policy interventions. Given that some interventions are proposed in our conclusion, we feel that mentioning this in the abstract is consistent.

      Minor comments

      - What is the rationale for conditioning on academic rank and does this have explanatory power on its own---i.e., does it at least superficially potentially explain part of the gender gap in intention to submit?

      The referee is right: academic rank was added to control for career age of researchers, with the assumption that this variable would influence submission behavior. However, the rank information we collected was for the time that the individual respondent took the survey, which could be different from the rank they held at the time of the submission behaviors mentioned in the survey. That is why we didn't consider rank as an independent variable of interest. We do agree with the reviewer, however, that it could be related to their submission behaviors in some cases. Our initial analysis shows that academic rank is not a significant predictor of whether researchers submitted to SNP, but does contribute significantly to the SNP acceptance rates and desk rejection rates of individuals in Medical Sciences.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Basson et al. study the representation of women in "high-impact" journals through the lens of gendered submission behavior. This work is clear and thorough, and it provides new insights into gender disparities in submissions, such as that women were more likely to avoid submitting to one of these journals based on advice from a colleague/mentor. The results have broad implications for all academic communities and may help toward reducing gender disparities in "high-impact" journal submissions. I enjoyed reading this article, and I have several recommendations regarding the methodology/reporting details that could help to enhance this work.

      We thank the referee for their comments.

      Strengths:

      This is an important area of investigation that is often overlooked in the study of gender bias in publishing. Several strengths of the paper include:

      (1) A comprehensive survey of thousands of academics. It is admirable that the authors retroactively reached out to other researchers and collected an extensive amount of data.

      (2) Overall, the modeling procedures appear thorough, and many different questions are modeled.

      (3) There are interesting new results, as well as a thoughtful discussion. This work will likely spark further investigation into gender bias in submission behavior, particularly regarding the possible gendered effect of mentorship on article submission.

      Thank you for those comments.

      Weaknesses:

      (1) The GitHub page should be further clarified. A detailed description of how to run the analysis and the location of the data would be helpful. For example, although the paper says that "Aggregated and de-identified data by gender, discipline, and rank for analyses are available on GitHub," I was unable to find such data.

      We added the link to the GitHub page, as well as more details on how to run the statistical analysis. Unfortunately, our IRB approval does not allow for the sharing of the raw data.

      (2) Why is desk rejection rate defined as "the number of manuscripts that did not go out for peer review divided by the number of manuscripts rejected for each survey respondent"? For example, in your Grossman 2020 reference, it appears that manuscripts are categorized as "reviewed" or "desk-rejected" (Grossman Figure 2). If there are gender differences in the denominator, then this could affect the results.

      We thank the referee for pointing this out. Actually, what the referee is proposing is how we calculated it in the manuscript; the calculation mentioned in the manuscript was a mistake. We corrected the manuscript.

      (3) Have you considered correcting for multiple comparisons? Alternatively, you could consider reporting P-values and effect sizes in the main text. Otherwise, sometimes the conclusions can be misleading. For example, in Figure 3 (and Table S28), the effect is described as significant in Social Sciences (p=0.04) but not in Medical Sciences (p=0.07).

      We highly appreciate the suggestion. We’ve added Odds Ratio values and p-values to the main manuscript.
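
      To illustrate the reviewer's point about multiple comparisons, the sketch below applies a Holm-Bonferroni adjustment to the two p-values quoted in the comment. It is not part of the authors' analysis: the function name is hypothetical, Holm's method is only one of several possible corrections, and treating these two tests as the whole family is an assumption made for the sake of a small example.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# p-values quoted by the reviewer for Figure 3 / Table S28.
p_raw = {"Social Sciences": 0.04, "Medical Sciences": 0.07}
p_adj = dict(zip(p_raw, holm_adjust(list(p_raw.values()))))
print(p_adj)
```

      Under these assumptions both adjusted values come out at about 0.08, so neither discipline would pass a 0.05 threshold. This is the kind of discrepancy the reviewer warns about when a result is described as significant in one field but not in the other.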

      (4) More detail about the models could be included. It may be helpful to include this in each table caption so that it is clear what all the terms of the model were. For instance, I was wondering if journal or discipline are included in the models.

      We appreciate the suggestion. We’ve added model details to the figure and table captions in the manuscript and the supplemental materials.

      Reviewer #3 (Public Review):

      Summary:

      This is a strong manuscript by Basson and colleagues which contributes to our understanding of gender disparities in scientific publishing. The authors examine attitudes and behaviors related to manuscript submission in influential journals (specifically, Science, Nature and PNAS). The authors rightly note that much attention has been paid to gender disparities in work that is already published, but this fails to capture the unseen hurdles that occur prior to publication (which include decisions about where to publish, desk rejections, revisions and resubmissions, etc.). They conducted a survey study to address some of these components and their results are interesting:

      They find that women are less likely to submit their manuscript to Science, Nature or PNAS. While both men and women feel their work would be better suited for more specialized journals, women were more likely to think their work was 'less novel or groundbreaking.'

      A smaller proportion of respondents indicated that they were actively discouraged from submitting their manuscripts to these journals. In this instance, women were more likely to receive this advice than men.

      Lastly, the authors also looked at self-reported acceptance and rejection rates and found that there were no gender differences in acceptance or rejection rates.

      These data are helpful in developing strategies to mitigate gender disparities in influential journals.

      We thank the referee for their comments.

      Comments:

      The methods the authors used are appropriate for this study. The low response rate is common for this type of recruitment strategy. The authors provide a thoughtful interpretation of their data in the Discussion.

      We thank the referee for their comments.

      Reviewer #4 (Public Review):

      This manuscript covers an important topic of gender biases in the authorship of scientific publications. Specifically, it investigates potential mechanisms behind these biases, using a solid approach, based on a survey of researchers.

      Main strengths

      The topic of the MS is very relevant given that across sciences/academia representation of genders is uneven, and identified as concerning. To change this, we need to have evidence on what mechanisms cause this pattern. Given that promotion and merit in academia are still largely based on the number of publications and impact factor, one part of the gap likely originates from differences in publication rates of women compared to men.

      Women are underrepresented compared to men in journals with high impact factor. While previous work has detected this gap, as well as some potential mechanisms, the current MS provides strong evidence, based on a survey of close to 5000 authors, that this gap might be due to lower submission rates of women compared to men, rather than the rejection rates. The data analysis is appropriate to address the main research aims. The results interestingly show that there is no gender bias in rejection rates (desk rejection or overall) in three high-impact journals (Science, Nature, PNAS). However, submission rates are lower for women compared to men, indicating that gender biases might act through this pathway. The survey also showed that women are more likely to rate their work as not groundbreaking, and to be advised not to submit to prestigious journals.

      With these results, the MS has the potential to inform actions to reduce gender bias in publishing, and actions to include other forms of measuring scientific impact and merit.

      We thank the referee for their comments.

      Main weakness and suggestions for improvement

      (1) The main message/further actions: I feel that the MS fails to sufficiently emphasise the need for a different evaluation system for researchers (and their research). While we might act to support women to submit more to high-impact journals, we could also (and several initiatives do this) consider a broader spectrum of merits (e.g. see https://coara.eu/ ). Thus, I suggest more space to discuss this route in the Discussion. Also, I would suggest replacing the terms that imply that prestigious journals have a better quality of research or the highest scientific impact (line 40: journals of the highest scientific impact) with terms that state what we definitely know (i.e. that they have the highest impact factor). I think this could broaden the impact of the MS.

      We agree with the referee. We changed the wording on impact and added a few lines on this in the discussion.

      (2) Methods: while methods are all sound, in places it is difficult to understand what has been done or measured. For example, only quite late (as far as I can find, it's in the supplement) we learn the type of authorship considered in the MS is the corresponding authorship. This information should be clear from the very start (including the Abstract).

      We performed the suggested edits.

      Second, I am unclear about the question on the perceived quality of research work. Was this quality defined for researchers, as quality can mean different things (e.g. how robust their set-up was, how important their research question was)? If researchers have different definitions of what quality means, this can cause additional heterogeneity in responses. Given that the survey cannot be repeated now, maybe this can be discussed as a limitation.

      We agree that quality can mean different things to different researchers; it probably varies by discipline, but also by gender. But that was precisely the point: whether men and women considered their “best work” to be published in a higher-impact venue. While there may be heterogeneity in those perceptions, the fact that 1) men and women rate their research at the same level and 2) we control for disciplinary differences should mitigate some of that.

      I was surprised to see that discipline was considered as a moderator for some of the analyses but not for the main analysis on the acceptance and rejection rates.

      We appreciate the attention to detail. In our analysis of acceptance and rejection rates, we conducted separate regression analyses for each discipline to capture any field-specific patterns that might otherwise be obscured.

      We added more details on this to clarify.

      I was also surprised not to see publication charges as one of the reasons asked for not submitting to selected journals. Low and middle-income countries often have more women in science but are also less likely to support high publication charges.

      That is a good point. However, both Science and Nature have subscription options, which do not require any APCs.

      Finally, academic rank was asked of respondents but was not taken as a moderator.

      Academic rank is included in the regression as a control variable (Figure 1).

      Reviewer #2 (Recommendations For The Authors):

      In addition to the points in the "Weaknesses" section of my Public Review above, I have several suggestions to improve this work.

      (1) Can you please indicate what the error bars mean in each plot? I am assuming that they are 95% confidence intervals.

      We appreciate the attention to detail. Yes, they are 95% confidence intervals. We’ve clarified this in the captions of the corresponding figures. 

      (2) Can you provide a more detailed explanation for why the 7 journals were separated? I see that on page 3 of the supporting information you write that "Due to limited responses, analysis per journal was not always viable. The results pertaining to the journals were aggregated, with new categories based on the shared similarities in disciplinary foci of the journals and their prestige." Specifically, why did you divide the data into (somewhat arbitrary) categories as opposed to using all the data and including a journal term in your model?

      The survey covered 7 journals:

      • Science, Nature, and PNAS (S.N.P.)

      • Nature Communications and Science Advances (NC.SA.)

      • NEJM and Cell (NEJM.C.)

      We believe that the first three are a class of their own: they cover all fields (while NEJM and Cell are limited to (bio)medical sciences), and have a much higher symbolic capital than both Nature Comms and Science Advances (which are receiving cascading papers from Nature and Science, respectively). We believe that factors leading to submission to S.N.P. are much different than those leading to submission to the other groups of journals, which is why we separated the analysis in that manner.

      (3) You included random effects for linear regression but not for logistic regression. Please justify this choice or include additional logistic regression models with random effects.

      We used mixed-effect models for linear regressions (where number of submissions, acceptance rate, or rejection rate is the dependent variable). As mentioned in the previous comment, we tested using rank as the control variable and found it had a potential impact on the variables we analyzed using linear regressions in some disciplines. Therefore, we introduced it as a random effect for all the linear regression models.

      Reviewer #3 (Recommendations For The Authors):

      The limitations of this work are currently described in the Supplement. It may be helpful to bring several of these items into the Discussion so that they can be addressed more prominently.

      Added content

      Reviewer #4 (Recommendations For The Authors):

      (1) Line 40: add 'as leading authors of papers published in' before ' 'journals'

      Done

      (2) Explain what the direction in the ' relationship between' line 62 is

      Added

      (3) Lines 101-102 - this is a bit unclear. Please, provide some more info, also including what did these studies find.

      Added

      (4) Is 'sociodemographic' the best term in line 120

      Yes, we believe so.

      (5) Results would benefit from a short intro with the info on the number of respondents, also by gender.

      Those are present at the end of the intro (and in the methods, at the end). We nonetheless added gender.

      (6) Line 134: add how many women and men submitted to Science, Nature, and PNAS

      Added. In all disciplines combined, 552 women and 1,583 men ever submitted to these three elite journals. More details can be found in SI Table 9

      (7) Add 'Self-' before reported, line 141

      Added

      (8) Add sample sizes to Figs 1 and 2

      Those are in the appendix

      (9) Line 168 - unclear if this is ever or as their first choice

      We do not distinguish between the two; the question is whether they considered it at all.

      (10) Add sample size in line 177

      Added. 480 women and 1404 men across all disciplines reported desk rejections by S.N.P. journals.

      (11) I would like to see some discussion on the fact that the highest citation paper will also be a paper that the authors have submitted earlier in their careers given that citations will pile up over time.

      Those are actually quite evenly distributed. We modified the supplementary materials.

      (12) Data availability - be clear that supporting info contains only summary data. Also, while the Data availability statement refers to de-identified data on GitHub, the GitHub page only contains the code, and the note that 'The STAT code used for our analyses is shared. We are unable to share the survey response details publicly per IRB protocols.' Why were de-identified data not shared? This is extremely important to allow for the reproducibility of MS results. I would also suggest sharing data in a trusted repository (e.g. Dryad, ZENODO...) rather than on GitHub, as per current recommendations on the best practices for data sharing.

      Thank you for your careful reading and for highlighting the importance of clear data availability. We will revise our Data Availability Statement to explicitly state that the supporting information contains only summary data and that the complete analysis code is available on GitHub.

      We understand the importance of sharing de-identified data for reproducibility. However, our IRB strictly prohibits the sharing of any individual-level data, including de-identified files, to protect participant confidentiality. Consequently, the summary data included in the supporting information, together with the provided code, is intended to facilitate the verification of our core findings. Our previous statement regarding “de-identified” data sharing was inaccurate and thus has been removed. We apologize for the confusion.

      In light of your suggestion, we are also exploring depositing the summary data and code in a trusted repository (e.g., Dryad or Zenodo) to further align with current best practices for data sharing.

      • Core Thesis: Functional programming is an excellent fit for creating reactive or "situated programs" that must continuously interact with their environment, contrary to some earlier views.

        "we're going to use functional programming to make situated programs um i'm going to show you that rich is wrong about it functional programming is actually a very good fit for situated programs and we're going to see why."

      • Functional Effect Systems: The fundamental principle is to describe computations as values rather than executing them immediately. An effect is a description of an action to be performed.

        "an effect is a description of something to be done that's the essence of functional programming instead of doing we describe."

      • Core Operators: The system is built on operators like pure for turning any value into an effect and bind for composing effects sequentially.

        "pure takes an arbitrary value turns that into an effect... bind is about sequential composition so we want to do something and then do something else."

      • Concurrency and Supervision Trees: Concurrent operations are managed within a process tree. This structure is critical for handling failures and managing resources. When one parallel process fails, its siblings must be canceled to prevent wasted resources.

        "what we get is not a list of process it's a tree of processes and that's that's very important... when it happens you want the error to be processed by another process and this process is the supervisor."

      • Structured Concurrency: Functional programming naturally enforces a well-defined supervision hierarchy, where combinators also define the supervision strategy, preventing orphaned processes.

        "in functional programming it's impossible to do that so you get structured concurrency by default that is you are forced to build your program in such a way that the supervision tree is properly structured."

      • Streams vs. Signals: The talk distinguishes between two types of long-lived effects:

        • Event Streams: A discrete series of events where each event is critical and must be processed. They require backpressure to prevent data loss.

          "an event stream is not defined when events don't happen it's discrete time... losing an event is really bad... to represent streams as effects the effect representation must implement backpressure."

        • Signals: A continuous value representing the state of an identity (e.g., mouse position). They are always defined, only the latest value matters, and they benefit from lazy sampling.

          "the signals represent the state of an identity... at any point in time you can take a snapshot... only the latest value matters... if you want to represent signals as effects you want lazy sampling."

      • Missionary Library: A Clojure library that implements these concepts, providing operators for two effect types: tasks (for single values) and flows (for multiple values).

        "this is Missionary it's a Clojure library that works in Clojure and ClojureScript and it's a collection of purely functional operators that work on effects and there's two kinds of effects there is tasks and flows."

      • Language Extensions: Missionary avoids "callback hell" through language extensions that provide a functional version of async/await, allowing for more readable, sequential-looking code that is internally transformed into callbacks.

        "the solution is to extend the language so we extend Clojure with another operator that's the idea of async await but now it works in functional effects."

      • Awaiting Flows: A powerful and unique feature is the ability to "await" a flow (a stream of multiple values). This reruns a computation for each new value while automatically canceling the previous computation if it's still running.

        "if we get a new value of the state and the previous value is still being computed we want to interrupt this previous computation and start the new one. we are only interested in the latest value so we want to discard the previous computation."

      • Key Differentiators: Missionary stands out due to its powerful language extensions (an expressive alternative to monads) and its first-class support for both discrete-time (streams) and continuous-time (signals) effects.

        "what makes it unique is language extensions uh it's alternative to monad it's as much as powerful but it's much more expressive... discrete versus continuous time we need both."

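The "describe instead of do" idea behind pure and bind can be sketched in a few lines. This is a minimal Python illustration, not Missionary's API (Missionary is a Clojure library); the names `Effect`, `pure`, `bind`, and `say` are hypothetical and exist only for this example. An effect here is simply a value wrapping a deferred action, so building a program performs nothing until it is explicitly run.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Effect(Generic[A]):
    """A description of an action; nothing happens until run() is called."""
    thunk: Callable[[], A]

    def run(self) -> A:
        return self.thunk()


def pure(value: A) -> Effect[A]:
    """Lift a plain value into an effect that simply yields it."""
    return Effect(lambda: value)


def bind(effect: Effect[A], f: Callable[[A], Effect[B]]) -> Effect[B]:
    """Sequential composition: run `effect`, pass its result to `f`, run that."""
    return Effect(lambda: f(effect.run()).run())


log = []  # records side effects so we can see when they actually happen


def say(message: str) -> Effect[None]:
    """Hypothetical effect that appends a message to the log when run."""
    return Effect(lambda: log.append(message))


# Build a program: start from 20, log it, then yield 20 + 1.
program = bind(pure(20), lambda x: bind(say(f"got {x}"), lambda _: pure(x + 1)))

assert log == []        # constructing the description performed no action
result = program.run()  # the actions happen only at this point
```

Because `program` is an ordinary value, it can be passed around, composed further, or run more than once. The sketch leaves out cancellation and supervision, which the talk treats as essential.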
      • Computational Challenges:

        • We lack effective computational methods: "i think we actually don't have the foggiest idea how to compute very well."
        • Importance of fast, efficient processes: "it took only 100 milliseconds... we don't understand how to do."
      • Genomic Complexity:

        • Human genome's complexity and flexibility: "with a small change to it you make a cow instead of a person."
        • High-level language for biological processes is unknown: "what I'm interested in is the high-level language that's necessary to do things like that."
      • Programming Evolution:

        • Legacy of programming assumptions based on scarcity: "all of our sort of intuitions from being programmers have come from a time of assuming a kind of scarcity."
        • Current abundance of resources shifts the focus: "memory is free, computing is free."
      • Security and Correctness:

        • Traditional concerns of correctness and security are secondary: "people worry about correctness... is it the real problem? maybe... most things don't have to work."
        • Evolution and adaptability of code are crucial: "we spend all our time modifying existing code."
      • Programming Constraints:

        • Early decisions in programming constrain future changes: "we make decisions early in some process that spread all over our system."
        • Need for flexibility in modifying systems: "organize systems so that the consequences of decisions we make are not expensive to change."
      • Generic Operators and Extensions:

        • Dynamically extensible operations: "dynamically extend things while my program is running."
        • Symbolic algebra as an extension of arithmetic: "expand this arithmetic on functions... it's a classical thing people can do in algebra."
      • Propagators and Parallelism:

        • Concept of propagators for parallel computation: "propagators are independent little stateful machines."
        • Parallelism and monotonic information merging: "we don't actually put values in these cells we put information about a value in a cell."
      • Truth Maintenance Systems (TMS):

        • Maintaining and improving data consistency: "truth maintenance systems... maintain the best estimates of what's going on."
        • Dependency-directed backtracking for efficient problem-solving: "automatically find for me the... sub world views that are consistent."
      • Historical and Educational Insights:

        • Historical evolution of computation: "when I started computing in 1961... the total amount of memory is probably about 10 kilobytes."
        • Educational gaps between theory and practical engineering: "what we taught the students wasn't at all what the students actually were expected to learn."
      • Vision for the Future:

        • Future computing systems must be inherently parallel, redundant, and flexible: "future... computers are so cheap and so easy to make... they can talk to each other and do useful things."
        • Importance of evolving current computational thinking: "we have to throw away our current ways of thinking if we ever expect to solve these problems."
      • Summary and Call to Action:

        • Main challenge is evolvability, not correctness: "problem facing us as computer engineers is not correctness it's evolvability."
        • Proposals include extensible operations and new architectural paradigms: "extensible generic operations... a more radical proposal is maybe there are freedoms that we can unlock by throwing away our idea of architecture."
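The propagator idea summarized above (cells that hold information about a value rather than the value itself, connected by small independent machines) can be illustrated with a short sketch. This is an assumption-laden toy, not the implementation discussed in the talk: the names `Cell`, `merge`, and `propagator` are hypothetical, and closed numeric intervals are chosen here as one simple kind of partial information whose merge (intersection) can only narrow what is known.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Interval = Tuple[float, float]


def merge(old: Optional[Interval], new: Interval) -> Interval:
    """Combine two pieces of information about one value by intersecting them."""
    if old is None:
        return new
    low, high = max(old[0], new[0]), min(old[1], new[1])
    if low > high:
        raise ValueError(f"contradiction: {old} vs {new}")
    return (low, high)


@dataclass
class Cell:
    """Holds the best current information about a value, not the value itself."""
    content: Optional[Interval] = None
    watchers: List[Callable[[], None]] = field(default_factory=list)

    def add(self, info: Interval) -> None:
        merged = merge(self.content, info)
        if merged != self.content:  # wake neighbours only on new information
            self.content = merged
            for watcher in list(self.watchers):
                watcher()


def propagator(inputs: List[Cell], output: Cell, fn: Callable[..., Interval]) -> None:
    """Attach a small machine that recomputes `output` when any input changes."""
    def run() -> None:
        if all(c.content is not None for c in inputs):
            output.add(fn(*(c.content for c in inputs)))
    for c in inputs:
        c.watchers.append(run)
    run()


def add_intervals(a: Interval, b: Interval) -> Interval:
    return (a[0] + b[0], a[1] + b[1])


x, y, total = Cell(), Cell(), Cell()
propagator([x, y], total, add_intervals)

x.add((1.0, 3.0))
y.add((10.0, 20.0))   # total now knows the sum lies in (11.0, 23.0)
x.add((2.0, 5.0))     # x narrows to (2.0, 3.0); total narrows accordingly
```

Information arriving in any order only refines a cell, which is what makes this style friendly to parallel and redundant computation; a contradiction surfaces as an error instead of silently overwriting earlier knowledge.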
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their helpful comments and suggestions. Below you may find the point-by-point replies to their concerns.

      Reviewer #1

      “The research is meticulously conducted, and the data are compelling, as they demonstrate that the Nova-agrin-Lrp4-MuSK axis is also operational in non-vertebrates. The conclusions drawn by the authors are generally adequate; however, I find some instances of "it is the first time..." to be unnecessary.”

      We have removed all unnecessary claims to that effect.

      “The work also presents an unexpected finding that mouse Nova protein is unable to splice the Ciona agrin mini-gene (Figure 3). I believe the inability of mouse Nova1 and Nova2 to splice the Ciona agrin could also be due to insufficient expression levels of the mouse proteins. Therefore, the authors should include either a positive control (e.g., mouse agrin mini-gene) or demonstrate that the proteins are expressed at comparable levels.”

      We have now included two additional datasets supporting our conclusion. First, we have included the positive control with the mouse Agrin minigene as suggested by the reviewer, which shows that mouse Nova1 and Nova2 are indeed still able to splice the mouse Agrin minigene in our assay (Figure 3C). Second, we included fluorescence images of the GFP-fused mouse Nova1 and Nova2 showing their proper expression in the cells (Figure S7).

      “I am also not fully convinced that the model of autoinhibition for Ciona Nova is supported by sufficient experimental data. Again, there are no data showing that the levels of the various deletion mutants of Nova are consistent and hence, there could be issues with the stability of some of the deletion mutants and this could explain the observed difference in activity.”

      We have added a few more datasets to further investigate the model. First, we have added an independent biological replicate of the “MLN” Nova isoform deletion mutant assays (Figure S8), as well as a separate assay using deletion mutants based on the “MMM” isoform (Figure S9). The results were consistent in both cases, confirming our initial observations. Next, we tested more directly the idea proposed by the reviewer that there are issues with stability, by looking at the fluorescence of the GFP-fused mutants. We did notice that the N/C-terminal deletion mutants were not expressed as well, but this was always mitigated by concurrent deletion of the KH3 domain. We have now expanded our discussion in the text to propose that there may be a negative effect of the KH3 domain on Nova expression/stability in the absence of the N/C termini. Although different from the model in which KH3 directly inhibits KH1/KH2, there does seem to be some inhibitory effect of KH3 on Nova expression/stability.

      “- In all schematic presentations, exon Z6 appears larger than exon Z5. However, Z6 is only 24 bp long, while Z5 is 3434 bp long. Please adjust this representation.”

      To clarify, in Ciona Z6 is 18 bp long, and Z5 is 15 bp long, hence they code for 6 and 5 amino-acids, respectively. This is different from the mammalian Z exons, which may be the source of the confusion here. In our schematics, we are only representing the Ciona Z exons.

      “- Is there consistency in the relative proportions of the 24-bp (Z6), 33-bp (Z5), and 57-bp (Z6 + Z5) PCR products? Studies in vertebrates have shown that AChR clustering activity is highest with the Z8 and Z19 products, while the Z11 product appears to be somewhat less active. It would be nice to also point out the different splice products are detected in Ciona.”

      It was not clear if there was any consistency in the relative proportions of Ciona Agrin splice products in the minigene assays as performed in cultured mammalian cells, though in Figure 1 we have pointed out a more detailed characterization of the different splice products in vivo in Ciona. The different splice products’ confirmed sequences are also shown in the supplemental sequences file.

      “Line 111: 'Z11' Agrin should be corrected to 'Z19' Agrin.”

      To clarify again, we are only referring to the Ciona Agrin Z exons, which are not the same sizes as the mammalian Z exons. While Z19 would refer to the combination of exons Z8 and Z11 (8+11 = 19) in mammals, here in Ciona the equivalent combination is Z11 (Z5 + Z6).

      “Line 168: "Figure H" should be updated to "Figure 2H."”

      Fixed.

      Reviewer #2:

      “44 - ALS, references 8-12. These are old papers. A new review should be cited, either instead or in addition.”

      We have read some newer reviews and cite three more recent reviews (references 10-12) now.

      56 - "many" cases of CMS - some are not due to mutations in this pathway

      We have altered this to say “many”.

      57 - refs 29-46. This is a very large number of references for a point that is quite unimportant to the story. It would be better to cite recent reviews.

      We have removed some references and also cited more recent reviews here (references 38, 39).

      168 - should be 2H

      Fixed.

      205 - make N terminal extension more apparent in Figure 3D

      We have recolored the N terminus red to make it more apparent in Figures 3, S8, and S9.

      235 - not a complete sentence

      Fixed.

      308+ - can the authors clarify whether EBF knockdown has a selective effect on Nova vs general failure of the neurons to acquire a MN phenotype

      We have been investigating this in a separate study on MN specification and differentiation in Ciona, which will be published as a preprint soon. EBF does not have a selective effect on Nova expression, as it appears to be regulating multiple aspects of neuronal differentiation, consistent with its role as previously studied in Ciona and other organisms (e.g. Kratsios et al. 2012, Catela et al. 2019, etc).

      614 - explain in figure legend the decrease in apparent MR from left to right in 4B

      This is just an example of “bowed” or “curved” bands frequently seen in electrophoresis, usually due to uneven heat dissipation or other electrophoresis issues. However, the bands all correspond to the same products (Z+). We added an explanation in the legend.

      General - three other key components of the pathway are MuSK, rapsyn, and DOK-7. Functional studies of these genes fall beyond the scope of this paper, but it would be helpful to know whether they are expressed in muscle and, if so, whether expression is muscle-specific.

      We have added this to the discussion. While Musk and Dok-7 remain unstudied in Ciona, it has been shown that Rapsyn is muscle-specific in Ciona (Nishino et al. 2011).

      Reviewer #3:

      “1) The authors report two main Nova isoforms that seem to be produced by alternative promoters. They also claim that the MLN isoform is more strongly expressed in two of the studied conditions compared to the MMM protein (eggs and heart in Fig 1G), while both are equally abundant in st. 22.5 embryos and brain (Fig S1 and line 130). Therefore, both isoforms are likely involved in the regulation of the Agrin AS event. When performing the experiments that require to express the Nova protein, the authors choose to work with the "MLN" isoform arguing that it is more "ubiquitous" than the "MMM" isoform, although the last has a more evident nuclear localization signal (NLS) sequence. In the minigene analysis, the MLN isoform fails to produce transcripts with Z6 exon (which seems to be the most common Z+ isoform in the brain), and the amount of Z11-containing transcripts is very low compared to st. 22.5. Given that the N-terminal domain has a regulatory influence, as demonstrated by the authors, and that the MMM isoform is potentially more "neural-restricted" than the MLN, an intriguing possibility is that the MMM isoform might enhance the inclusion of Z6 and Z11 isoforms. To solve this issue, I suggest two experiments:

      • Perform the minigene assay with the MMM isoform of Nova and the wild type version of the minigene to check the level of inclusion of Z6 and Z6+Z5 (Z11) exons.”

      We have added additional minigene assay data using the MMM isoform (S9). We did not detect Z6 isoforms with MMM, though there may be slight differences in the ratio of Z5 and Z11 compared to the MLN assay. We believe this indicates that nuclear localization is not rate-limiting in our heterologous mammalian cell minigene assay, although it very well may regulate splicing activity more meaningfully in vivo in Ciona. This may be especially true in post-mitotic cells, as opposed to during embryogenesis when actively proliferating cells will break down and then reconstitute their nuclear envelopes over and over again, thus potentially allowing some of the MLN isoform to find its way into the nucleus. We still believe the production of the Z6 isoform may depend on additional Ciona-specific factors missing from the mammalian cells in our heterologous assay.

      “- Test the regulatory activity of the upstream genomic region of exon 1a, in an equivalent way as for exon 1b in Fig 7A and B, to explore whether the promoter of the MMM isoform has a neural-restricted expression that could explain the AS pattern observed in st. 22.5 and brain.”

      We have done this, shown in Figure S15, which revealed that the promoter upstream of exon 1a (encoding the MMM isoform) drives only expression in mesenchyme and some epidermal cells, with no neuronal expression visible. This suggests that the majority of the neural expression is due to the cis-regulatory elements in the region between exons 1a and 1b. However, this region does not necessarily activate transcription only at exon 1b (encoding MLN isoform), as intronic elements can loop back and regulate transcription off “upstream” promoters. Thus we propose that the Nova [1b] -2011/+6 region drives expression of both MLN and MMM isoforms, though this remains to be fully tested. We believe the regulation and function of the different Nova isoforms in Ciona is beyond the scope of the current paper, though we are interested in investigating this more thoroughly in follow-up studies.

      “2) The authors unveil the conservation of an Agrin AS event between mammals and a tunicate species with similar functional consequences for AChR clustering. While this is absolutely correct, the relatively low similarity of the AS exons between Ciona and mammals shown in Fig 1A may raise confusion or doubts in the readers regarding the homology of the event (as it did in my own case before I checked it in more detail). Therefore, an explicit alignment of both constitutive and alternative exons in a supplementary figure to clearly demonstrate the homology of the AS event across major taxonomic groups (with a few vertebrate and tunicate species) might help.

      Furthermore, expression of Nova in motor neurons of amphioxus (Branchiostoma lanceolatum) was previously reported (ref. 60), and a quick look into publicly available Agrin transcripts (https://www.ncbi.nlm.nih.gov/gene/136443694) reveals a homologous AS event in this cephalochordate species.

      C1 "Z7/Z6/Z8" C2 (partial)

      Bla QADPAPLRQEGVG--LDGTTILNYPNAINK ... E-SNSIRE ... QEPNQDDNHFEVTFRTTSDHGLLLWNHKPGGG-DFIALAI
      Cro HSTDLLQDEQATAIYLDGTTKIMYRNAVKA ... --PNDFRE ... SRART-HNNYEIVFRTTARHGLLLMVGKAREGVDYIALAI
      Mmu IVEKSVGDLETLA--FDGRTYIEYLNAVTE ... ELTNEIPA ... EKALQ-SNHFELSLRTEATQGLVLWIGKVGERADYMALAI
          : . :** * : * **:. .*.: ... *::*: :** : :**:* * *::****

      These two facts suggest a potential origin of the Nova-Agrin regulation at the base of the chordate phylum (and not restricted to Olfactores), which could be mentioned in the discussion as a relevant possibility.”

      We thank the reviewer for this suggestion. Indeed, we have now added a more detailed alignment with Agrin sequences from more species in Figures S2 and S3, including amphioxus as so helpfully identified by the reviewer. We have added the observation that amphioxus Agrin appears to have a single Z exon encoding the NxI/V motif (no evidence for two Z exons as in tunicates or vertebrates). This indeed suggests that this pathway may be a chordate innovation, as we now discuss. We also add AlphaFold-assisted predictions of the NxF motif binding to the equivalent pocket in Lrp4 in both Ciona and mammals (Figure S1).

      Line 168: Figure 2H instead of Figure H.

      Fixed.

      Line 287: "Taken together, these results reveal that a Nova-Agrin-Lrp4 pathway for AChR receptor clustering at the neuromuscular synapse is conserved from mammals to tunicates." While this sentence might be true, from mammals to tunicates might imply that it is conserved in all vertebrate and tunicate lineages, and this is not explored in the manuscript (there might be secondary losses). It would be more technically correct to say something similar like "...the neuromuscular synapse is conserved in the studied mammalian and tunicate lineages" or "...the neuromuscular synapse originated before the evolutionary divergence of tunicates and vertebrates"

      We have fixed this now in a few places.

      “Line 342. At the end of this paragraph, the possibility of conservation of the mechanism also in amphioxus could be discussed.”

      We now discuss the amphioxus sequence and the idea that this mechanism was present in the last common chordate ancestor.

      “Line 383: "the the apparent".”

      Fixed.

      “I agree that the mouse-specific agrin minigene to test the functionality of Nova1 and Nova2 would be a suitable positive control to discard protein stability/expression issues.”

      We have tested this now with GFP fusion images (Figure S7) and using the mouse Agrin minigene (Figure 3C). Both indicate proper expression/splicing activity of mouse Nova1 and Nova2, supporting the idea that there is still some type of cross-species incompatibility as tested in mammalian cells.

      “The only minor limitation, in my opinion, is that it lacks testing of the MMM Nova isoform in the minigene assay, to explore whether it has (or not) a complementary function to the MLN isoform that could fully explain the endogenous AS pattern.”

      We have added MMM minigene assays, and these were largely identical to MLN assays. We propose that the N-terminus and nuclear localization do not significantly impact the activity of Ciona Nova as tested in mammalian cells; however, we cannot exclude the possibility that things may be different in vivo in Ciona.

    1. Reviewer #3 (Public review):

      Summary

      The paper presents an imaging and analysis pipeline for whole-mount gastruloid imaging with two-photon microscopy. The presented pipeline includes spectral unmixing, registration, segmentation, and a wavelength-dependent intensity normalization step, followed by quantitative analysis of spatial gene expression patterns and nuclear morphometry on a tissue level. The utility of the approach is demonstrated by several experimental findings, such as establishing spatial correlations between local nuclear deformation and tissue density changes, as well as the radial distribution pattern of mesoderm markers. The pipeline is distributed as a Python package, notebooks, and multiple napari plugins.

      Strengths

      The paper is well-written with detailed methodological descriptions, which I think would make it a valuable reference for researchers performing similar volumetric tissue imaging experiments (gastruloids/organoids). The pipeline itself addresses many practical challenges, including resolution loss within tissue, registration of large volumes, nuclear segmentation, and intensity normalization. Especially the intensity decay measurements and wavelength-dependent intensity normalization approach using nuclear (Hoechst) signal as reference are very interesting and should be applicable to other imaging contexts. The morphometric analysis is equally well done, with the correlation between nuclear shape deformation and tissue density changes being an interesting finding. The paper is quite thorough in its technical description of the methods (which are a lot), and their experimental validation is appropriate. Finally, the provided code and napari plugins seem to be well done (I installed a selected list of the plugins and they ran without issues) and should be very helpful for the community.

      Weaknesses

      I don't see any major weaknesses, and I would only have two issues that I think should be addressed in a revision:

      (1) The demonstration notebooks lack accompanying sample datasets, preventing users from running them immediately and limiting the pipeline's accessibility. I would suggest including a (selective) demo data set that can be used to run the notebooks (e.g. for spectral unmixing) and/or providing easily accessible demo input sample data for the napari plugins (I saw that there is some sample data for the processing plugin, so maybe this could already be used for the notebooks?).

      (2) The results for the morphometric analysis (Figure 4) seem to be only shown in lateral (xy) views without the corresponding axial (z) views. I would suggest adding this to the figure and showing the density/strain/angle distributions for those axial views as well.

    1. attr = LayerIntegratedGradients(vqa_resnet, [vqa_resnet.module.input_maps["v"], vqa_resnet.module.module.text.embedding])

      The use of .module.module in your code suggests that vqa_resnet is wrapped inside a module container (likely using torch.nn.DataParallel or torch.nn.parallel.DistributedDataParallel), which is a common practice when working with multi-GPU setups in PyTorch. Let me break this down more clearly:

      .module in PyTorch:

      • When you use torch.nn.DataParallel or torch.nn.parallel.DistributedDataParallel, PyTorch wraps the original model (vqa_resnet in this case) inside a container. The container has a .module attribute that points to the actual model.

        For example:

        model = torch.nn.DataParallel(vqa_resnet)  # or torch.nn.parallel.DistributedDataParallel(vqa_resnet)

        This means that:

        • vqa_resnet is now inside a DataParallel (or DistributedDataParallel) container.

        • To access the original vqa_resnet model, you need to use .module.

      .module.module:

      Now, based on the code you provided:

      vqa_resnet.module.module.text.embedding

      It suggests that the vqa_resnet model has been wrapped twice in a container (perhaps a custom wrapper inside your codebase). This would mean:

      1. The first .module accesses the model wrapped by DataParallel or DistributedDataParallel. <span style="color:#92d050 !important;">Here, this is captum's ModelInputWrapper.</span>

      2. The second .module accesses another level of encapsulation or custom module (like another wrapper or submodule) around vqa_resnet.

      There are indeed two layers of wrapper here: the first is ModelInputWrapper(vqa_resnet), and the second is torch.nn.DataParallel(vqa_resnet).


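      According to the pytorch-vqa source code, in text.embedding, self.text is an instance of the TextProcessor class, and embedding is a PyTorch nn.Embedding layer that maps the input sequence of word indices (the token ids of the question) to word vectors (embeddings).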
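
      The two-level `.module.module` access described above can be sketched in plain Python. This is only an illustration under stated assumptions: `Wrapper`, `Net`, `TextProcessor` and `unwrap` are hypothetical stand-ins, not the real `torch.nn.DataParallel`, captum `ModelInputWrapper` or pytorch-vqa classes. The only behaviour it relies on is that each wrapper stores the object it wraps in an attribute named `module`.

```python
class Wrapper:
    """Hypothetical stand-in for a container that keeps the wrapped object in `.module`."""

    def __init__(self, module, name):
        self.module = module
        self.name = name


class TextProcessor:
    """Hypothetical stand-in for the text branch; `embedding` is just a placeholder."""

    def __init__(self):
        self.embedding = "embedding-layer-placeholder"


class Net:
    """Hypothetical stand-in for the original, unwrapped model."""

    def __init__(self):
        self.text = TextProcessor()


def unwrap(model, depth):
    """Follow `.module` exactly `depth` times and return the object reached."""
    for level in range(depth):
        if not hasattr(model, "module"):
            raise AttributeError(f"no .module attribute at wrapper level {level}")
        model = model.module
    return model


base = Net()
input_wrapped = Wrapper(base, "ModelInputWrapper")   # inner wrapper, applied first
vqa_resnet = Wrapper(input_wrapped, "DataParallel")  # outer wrapper, applied second

# The first .module peels off the outer wrapper, the second reaches the original model.
assert unwrap(vqa_resnet, 1) is input_wrapped
assert unwrap(vqa_resnet, 2) is base
assert vqa_resnet.module.module.text.embedding is base.text.embedding
```

      Passing the depth explicitly, rather than looping until `.module` disappears, avoids unwrapping too far if the original model happens to define its own `module` attribute.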

    1. verbfit = glm(RealizationOfRec ~ Verb + AnimacyOfRec + AnimacyOfTheme + LengthOfTheme, family = binomial, data = verbs)

      I get an error message here. I have followed all the explanations of the R code mentioned here step by step. I've checked the spelling and it's OK. I don't know what the problem is. The error is: Error in eval(family$initialize) : y values must be 0 <= y <= 1

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Note: The original preprint version of our manuscript has been reviewed by 3 subject experts for Review Commons. All three reviewers’ comments on the original version of our manuscript have been fully addressed. Their input was extremely valuable in helping us clarify and refine the presentation of our results and conclusions. Their feedback contributed to making the study both more thoroughly developed and more accessible to a broad readership, while preserving its mechanistic depth. We believe that this revised version more effectively highlights the conceptual advances brought by our findings.

      Reviewer #1

      Evidence, reproducibility and clarity

      The manuscript "Key roles of the zona pellucida and perivitelline space in promoting gamete fusion and fast block to polyspermy inferred from the choreography of spermatozoa in mice oocytes" by Dr. Gourier and colleagues explores the poorly understood process of gamete fusion and the subsequent block to polyspermy by live-cell imaging of mouse oocytes with intact zona pellucida in vitro. The new component in this study is the presence of the ZP, which had been removed in prior live-cell imaging studies. This allowed the authors to examine contributions of the ZP to the block to polyspermy in relation to the timing of sperm penetrating the ZP and sperm fusing with the oocyte. By carefully analysing the timing of the cascade of events, the authors find that the first sperm that reaches the membrane of the mouse oocyte is not necessarily the one that fertilizes the oocyte, revealing that other mechanisms post-ZP-penetration influence the success of individual sperm. While the rate of ZP penetration remains constant in unfertilized oocytes, it decreases upon fertilization for subsequent sperm, providing direct evidence for the known 'slow block to polyspermy' provided by changes to the ZP adhesion/ability to be penetrated. Careful statistical analyses allow the authors to revisit the role of the ZP in preventing polyspermy: they show that the ZP block resulting from the cortical reaction is too slow (in the range of an hour) to contribute to the immediate prevention of polyspermy in mice. The presented analyses reveal that the ZP does contribute to the block to polyspermy in two other ways, namely by effectively limiting the number of sperm that reach the oocyte surface in a fertilization-independent manner, and by retaining components like JUNO and CD9, which are shed from the oocyte plasma membrane after fertilization, in the perivitelline space, which may help neutralize surplus spermatozoa that are already present in the PVS. Lastly, the authors report that the ZP may also contribute to channeling the flagellar oscillations of spermatozoa in the PVS to promote their fusion competence.

      Major comments:

      • Are the key conclusions convincing?

      The authors provide a careful analysis of the dynamics of events, though the analyses are correlative, and can only be suggestive of causation. While this is a limitation of the study, it provides important analysis for future research. Moreover, by analysing also control oocytes without fertilization and the timing of events, the authors have in some instances clear 'negative controls' for comparison.

      Some claims would benefit from rewording or rephrasing to put the findings better in the context of what is already known and what is novel:

      • the phrasing 'challenging prior dogma' might be too strong since it had been observed before that it is not necessarily the first sperm that gets through the ZP that fertilizes the egg (though I am afraid that I do not have any citations or references for this). However, given that in the field people generally think it is not necessarily and always the first sperm, the authors may want to consider weakening this claim.

      Only real-time imaging of in vitro fertilization of zona pellucida-intact oocytes, as performed in our study, can determine which spermatozoon crossing the zona pellucida fuses with the oocyte. However, such studies are rare, and most do not specifically address this question. Like Reviewers 1 and 3, we have not found any citation or reference stating or showing that it is not necessarily the first spermatozoon to penetrate the zona pellucida that fertilizes the egg. In contrast, at least one reference (Sato et al., 1979) explicitly reports the opposite. If, as suggested by Reviewers 1 and 3, it has indeed been observed before that the first sperm to pass the ZP is not always the one that fertilizes, and if this idea is generally accepted in the field, then it is all the more important that a study demonstrates and publishes this point. This is precisely what our study makes possible. However, in case we have overlooked a previous reference making the same observation as ours, we have removed the phrasing ‘challenging prior dogma’. That being said, the key issue is not so much that it is not necessarily the first spermatozoon penetrating the perivitelline space that fertilizes, but rather why spermatozoa that successfully reach the PVS of an unfertilized oocyte may fail to achieve fertilization. This is one of the central questions our study sought to address.

      • I do think the cortical granule release could still contribute to the block to polyspermy though - as the authors here nicely show - at a later time-point only, and thus not the major and not the immediate block as previously thought. The wording in the abstract should therefore be adjusted (since it could still contribute...)

      We are concerned that we may disagree on this point. The penetration block resulting from cortical granule release progressively reduces the permeability of the zona pellucida to spermatozoa, relative to its baseline permeability prior to sperm–oocyte fusion. Any decrease in this baseline permeability occurring before the fusion block becomes fully effective can contribute to the prevention of polyspermy by limiting the number of sperm that can access the oolemma at a time when fusion is still possible. In contrast, once the fusion block is fully established, limiting the number of spermatozoa traversing the ZP becomes irrelevant regarding the block to polyspermy, as the fusion block alone is sufficient to prevent additional fertilizations, rendering the penetration block obsolete. The only scenario that could challenge this obsolescence is if the fusion block were transient. In that case, as Reviewer 1 suggests, the penetration block could indeed play a role at a later time-point. However, taken together, our study and that of Nozawa et al. (2018) support the conclusion that this is not the case in mice:

      • Our in vitro study using kinetic tracking shows that the time constant for completion of the fusion block is typically 6.2 ± 1.3 minutes. During this time window, we observe that the permeability of the zona pellucida to spermatozoa does not yet decrease significantly from the baseline level it exhibited prior to sperm–oocyte fusion (see Figures 5B and S1B in the revised manuscript, and Figures 5A and 5B in the initial version). Consequently, before the fusion block is fully established, the penetration block can contribute only marginally, if at all, to the prevention of polyspermy. In contrast, the naturally low baseline permeability of the ZP (independent of any fertilization-triggered penetration block), as well as the relatively long timing of fusion ( minutes on average) after sperm penetration in the perivitelline space, are factors that contribute to the preservation of monospermy while the fusion block is still being established.
      • Our in vitro study using kinetic tracking shows that once the fusion block is completed following the first fusion event, no additional spermatozoa are able to fuse with the oocyte until the end of the experiment, 4 hours post-insemination (see blue points and fitting curve in Figure 5C). Meanwhile, one or more additional spermatozoa—most of them motile and therefore viable—are present in the perivitelline space in 50% of the oocytes analyzed (purple point in Figure 5C). This demonstrates that, once established, the fusion block remains effective for at least the entire duration of the experiment, supporting the idea of a fully functional and long-lasting fusion block.
      • Nozawa et al. (2018) found that female mice lacking ovastacin—the protease released during the cortical reaction that renders the zona pellucida impenetrable—are normally fertile. They additionally reported that the oocytes recovered from these females after mating are monospermic despite the systematic presence of additional spermatozoa in the perivitelline space. These findings further support the conclusion that in mice the fusion block is both permanent and sufficient to prevent polyspermy. For all these reasons, we believe that even at a later time-point, the penetration block does not contribute to the prevention of polyspermy in mice.

      To clarify the fact that the penetration block does not necessarily contribute to prevent polyspermy, which indeed challenges the commonly accepted view, we have substantially revised the discussion. Furthermore, Figure 9 from the initial version of the manuscript has been replaced by Figure 8 in the revised version. This new figure provides a more didactic illustration of the inefficacy of the penetration block in preventing polyspermy in mice, by showing the respective impact of the fusion block, the penetration block, as well as fusion timing and the natural baseline permeability of the zona pellucida, on the occurrence of polyspermy.

      As for the abstract, it has also been thoroughly revised. The content related to this section is now expressed in a way that emphasizes the factors that actively contribute to the prevention of polyspermy in mice, rather than those with no or marginal contribution (such as the penetration block in this case).

      • release of OPM components - in the abstract it's unclear what the authors mean by this - in the results part it becomes clear. Please already make it clear in the abstract that it is the fertility factors JUNO/CD9 that could bind to sperm heads upon their release and thus 'neutralize' them? I would also recommend not referring to it as 'outer' plasma membrane (there is no 'inner plasma membrane'). Moreover, in the abstract please clarify that this release is happening only after fusion of the first sperm and not all the time. In the abstract it sounds as if this was a completely new idea, but there is good prior evidence that this is in fact happening (as also then cited in the results part) - maybe frame it more as the retention inside the PVS as new finding.

      We thank reviewer 1 for pointing out the lack of precision in the abstract regarding the “components” released from the oolemma, and the fact that our phrasing may have given the impression that the post-fertilization release of CD9 and JUNO is a novel observation. The new observation is that CD9 and JUNO, which are known to be massively released from the oolemma after fertilization, bind to spermatozoa in the perivitelline space. However, we cannot rule out the possibility that other oocyte-derived molecules not investigated here may undergo a similar process. This is why we employed the broader term “components”, which encompasses both CD9 and JUNO as well as potential additional molecules. That said, we acknowledge the lack of precision introduced by this terminology. To address this, we have revised the corresponding sentence in the abstract to better reflect our new findings relative to previous ones, and to eliminate the ambiguity introduced by the word “component”.

      The revised sentence of the abstract reads as follows:

      “Our observation that non-fertilizing spermatozoa in the perivitelline space are coated with CD9 and JUNO oocyte’s proteins, which are known to be massively released from the oolemma after gamete fusion, supports the hypothesis that the fusion block involves an effective perivitelline space-block contribution consisting in the neutralization of supernumerary spermatozoa in the perivitelline space by these and potentially other oocyte-derived factors.”

      Moreover, we cannot state in the abstract that the release of CD9 and JUNO occurs only after the fusion of the first spermatozoon and not before, since some CD9 and JUNO are already detectable in the perivitelline space (PVS) prior to fusion. What our study shows is that, before fertilization, CD9 and JUNO are predominantly localized at the oocyte membrane. In contrast, after fusion (four hours post-insemination), oocyte CD9 is distributed between the membrane and the PVS, and the only JUNO signal detectable in the oocyte is found in the PVS. This is what we describe in the Results section on page 15.

      Regarding the acronym “OPM” in the initial version of the manuscript, although it was defined in the introduction as referring to the oocyte plasma membrane and not the outer plasma membrane (which, indeed, would not be meaningful), we acknowledge that it may have caused confusion to people in the field due to its resemblance to the commonly used meaningful acronym “OAM” for outer acrosomal membrane. To avoid any ambiguity, we have replaced the acronym “OPM” throughout the revised manuscript with the term “oolemma”, which unambiguously refers to the plasma membrane of the oocyte.

      It is unclear to me what the relevance is of dividing the post-fusion/post-engulfment period into different phases, as done in Fig 2 (phase 1 and phase 2). For the conclusions of this paper this also seems rather irrelevant and overly complicated, since the authors never get back to it and don't need it (it's not related to the polyspermy block analyses). I would remove it from the main figures and not divide into those phases, since it distracts from the main focus.

      Sperm engulfment and PB2 extrusion are two processes that follow sperm–oocyte fusion. As such, they are clear indicators that fusion has occurred and that meiosis has resumed. Their progression over time is readily identifiable in bright-field imaging: sperm engulfment is characterized by the gradual disappearance of the spermatozoon head from the oolemma, whereas PB2 extrusion is observed as the progressive emergence of a rounded protrusion from the oocyte membrane (Figure 2 in the initial manuscript and Figure S2 A&B in the revised version). The kinetics of these events, measured from the arrest of “push-up–like” movement of the sperm head against the oolemma —assumed to coincide with sperm-oocyte fusion, as further justified in a later response to Reviewer 1—provide reliable temporal landmarks for estimating the timing of fusion when the fusion event itself is not directly observed in real time (Figure S2 C&D).

      The four landmarks used in this estimation are:

      (i) the disappearance of the sperm head from the oolemma due to internalization (28 ± 2 minutes post-arrest, mean ± SD);

      (ii) the onset of PB2 protrusion from the oolemma (28 ± 2 minutes post-arrest);

      (iii) the moment when the contact angle between the PB2 protrusion and the oolemma shifts from greater than to less than 90° (49 ± 6 minutes post-arrest);

      (iv) the completion of PB2 extrusion (73 ± 10 minutes post-arrest).

      The approach used to determine the fusion time window of a fertilizing spermatozoon from these landmarks is detailed in the “Determination of the Fertilization Time Windows” section of the Materials and Methods. Compared to the initial version of the manuscript, we have added a paragraph explaining the rationale for using the arrest of the push-up–like movement as a reliable indicator for sperm–oocyte fusion and have clarified the description of the approach used to determine fertilization timing.

      The timed characterization of sperm engulfment and PB2 extrusion kinetics is highly relevant to the analysis of the penetration and fusion blocks; however, we agree that it is more appropriately placed in the Supplementary Information than in the main text. In accordance with the reviewer’s recommendation, this section has therefore been moved to the Supplementary Information SI2.

      For the statistical analysis, I am not sure whether the "assumption that the probability distribution of penetration or fertilization is uniform within a given time window" is in fact true, since the probability of fertilizing decreases after the first fertilization event. Maybe I misunderstood this, but this needs to be explained (or clarified) better, or the limitation of this assumption needs to be highlighted.

      During in vitro fertilization experiments with kinetic tracking, the oocytes are observed one after another in turn. As a result, sperm penetration into the perivitelline space or fusion with the oolemma may occur either during an observation round or in the interval between two rounds. In the former case, penetration or fusion is directly observed in real time, allowing for high temporal precision in determining the moment of the event. In contrast, when penetration or fusion occurs between two observation rounds, the precise timing cannot be directly determined. We can only ascertain that the event took place within the time window we have determined. Because, within a given penetration or fusion time window, we do not know the exact moment at which the event occurred, there is no reason to favor one time over another. This justifies the assumption that all time points within the window are equally probable. This explanation has been added to the section "Statistical treatment of penetration and fertilization chronograms to study the kinetics of fertilization, penetration block and fusion block" of the main text and to the section "Statistical treatment of penetrations and fertilizations chronograms to study penetration and fusion blocks" of the Materials and Methods.
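
      The uniform-window assumption described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `expected_count`, the example window values, and the convention of writing a directly observed event as a zero-width window are all assumptions made for illustration. Each event contributes to the expected cumulative count the fraction of its time window that has elapsed by time `t`.

```python
def expected_count(windows, t):
    """Expected number of events at or before time t (minutes).

    Each event is assumed to be uniformly distributed within its own
    (start, end) window; a zero-width window is an event observed directly.
    """
    total = 0.0
    for start, end in windows:
        if end <= start:
            # Directly observed event: it has either happened by t or not.
            total += 1.0 if t >= start else 0.0
        else:
            # Uniform assumption: probability that the event occurred by t,
            # clipped to the range [0, 1].
            fraction = (t - start) / (end - start)
            total += min(1.0, max(0.0, fraction))
    return total


# Hypothetical time windows in minutes post-insemination (illustrative values only).
windows = [(10.0, 10.0), (20.0, 40.0), (30.0, 50.0)]

assert expected_count(windows, 5.0) == 0.0
assert expected_count(windows, 30.0) == 1.5   # 1 observed + half of the second window
assert expected_count(windows, 50.0) == 3.0
```

      Evaluating `expected_count` on a grid of times gives a smoothed cumulative chronogram in which events with an uncertain time are spread evenly over their windows instead of being assigned to an arbitrary instant.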

      -Suggestion for additional experiments:

      If I understood correctly, the onset of fusion in Fig 2C is defined by the sperm ceasing to beat? If it is defined by the sudden stop of the beating flagellum, it should be confirmed in this situation (with the ZP intact) that this correctly defines the time-point of fusion, since as far as I understand this has not been measured in this set-up before. To obtain those numbers (the time from fusion to the end of engulfment), the authors will need to measure this accurately, e.g. by pre-loading the oocyte with Hoechst so that Hoechst is transferred to the fusing sperm upon membrane fusion.

      The nuclear dye Hoechst is widely used as a marker of gamete fusion, as it transfers from the ooplasm (when preloaded with the dye) into the sperm nucleus upon membrane fusion, thereby signaling that fusion has occurred. This technique is applicable in the context of in vitro fertilization using ZP-free oocytes. However, it is not suitable when cumulus–oocyte complexes are inseminated, as is the case in both in vitro experimental conditions of the present study (standard IVF and IVF with kinetic tracking). Indeed, when cumulus–oocyte complexes are incubated with Hoechst to preload the oocytes, the numerous surrounding cumulus cells also take up the dye. Consequently, upon insemination, spermatozoa acquire fluorescence while traversing and dispersing the cumulus mass (before reaching the ZP), thus rendering Hoechst labeling ineffective as a specific marker of membrane fusion. This remains true even under optimized conditions involving brief Hoechst incubation of cumulus–oocyte complexes ( Nonetheless, we have strong evidence supporting the use of the arrest of sperm movement as a surrogate marker for the moment of fusion. In our previous study (Ravaux et al., 2016; ref. 4 in the revised manuscript), we investigated the temporal relationship between the abrupt cessation of sperm head movement on the oolemma (resulting from the arrest of strong flagellar beating) and the fusion event, using ZP-free oocytes preloaded with Hoechst. That study revealed a temporal delay of less than one minute between the cessation of sperm oscillations and the actual membrane fusion, thereby supporting the conclusion that in ZP-free oocytes, the arrest of vigorous sperm movement at the oolemma is a reliable indicator of the moment at which fusion occurs. In the same study, the kinetics of sperm head internalization into the ooplasm were also characterized, typically concluding within 20–30 minutes after movement cessation. These findings are fully consistent with our current observations in ZP-intact oocytes, where sperm head engulfment was completed approximately 24 ± 3 minutes after the arrest of sperm oscillations. Taken together, these results strongly support the conclusion that, in both ZP-free and ZP-intact oocytes, the arrest of sperm movement is a reliable indicator of the fusion event. This assumption formed the basis for our determination of fertilization time points in the present study.

      These justifications were not fully detailed in the original version of the manuscript. We have addressed this in the revised version by explicitly presenting this rationale in the Materials and Methods section under Determination of the Fertilization Time Windows.

      Fig 8: 2 comments

      • To better show JUNO/CD9 pre-fusion attachment to the oocyte surface and post-fusion loss from the oocyte surface (but persistence in the PVS), an image after removal of the ZP (both for pre-fertilization and post-fertilization) would be helpful - the combination of those images with the ones you have (ZP intact) would make your point more visible.

      We have followed this recommendation. Figure 8 of the initial manuscript has been replaced by Figure 6 in the revised manuscript, which illustrates the four situations encountered in this study: fertilized and unfertilized oocytes, each with and without unfused spermatozoa in their PVS. To better show the pre-fusion presence of JUNO/CD9 at the oocyte plasma membrane, as well as their post-fusion partial (for CD9) and near-complete (for JUNO) loss from the oocyte membrane (but persistence in the PVS), paired images of the same oocyte before and after ZP removal are now provided, both for unfertilized (Figure 6A) and fertilized oocytes (Figure 6C).

      • You show that the heads of spermatozoa post fusion are covered in CD9 and JUNO, yet I was missing an image of sperm in the PVS pre-fertilization (which should then not yet be covered).

      As staining and confocal imaging of the oocytes were performed 4 hours after insemination, images of sperm in the PVS of an oocyte “pre-fertilization” cannot strictly be obtained. However, we can obtain images of spermatozoa present in the PVS of oocytes that remained unfertilized. This situation, now illustrated in Figure 6B of the revised manuscript, shows that these spermatozoa are also covered in JUNO and CD9, which they may have progressively acquired over time from the baseline presence of these proteins in the PVS of unfertilized oocytes. This may also provide a mechanistic explanation for their inability to fuse with the oolemma and, consequently, for the failure of fertilization in these oocytes.

      Minor comments:

      • The videos were remarkable to look at, and great to view in full. However, for the sake of time, the authors might want to consider cropping them for the individual phases to have a shorter video (with clear crop indicators) with the most important different stages visible in a for example 1 min video (e.g. video.

      We have followed this recommendation. The videos have been cropped and annotated in order to highlight the key events that support the points made in the result section from page 9 to 11 in the revised manuscript.

      • In general, given that the ZP, PVS and oocyte membrane are important components, a general scheme at the very beginning outlining the relative positioning of each before and during fertilization (and then possibly also including the second polar body release) would be extremely helpful for the reader to orient themselves.

      A general scheme addressing Reviewer 1's request, summarizing the key components and concepts discussed in the article and intended to help guide the reader, has been added to the introduction of the revised manuscript as Figure 1.

      • first header results "Multi-penetration and polyspermy under in vivo conditions and standard and kinetics in vitro fertilization conditions" is hard to understand - simplify/make clearer (comparison of in vivo and in vitro conditions? Establishing the in vitro condition as assay?)

      The title of the first Results section has been revised in accordance with Reviewer 1's suggestion. It now reads: Comparative study of penetration and fertilization rates under in vivo and two distinct in vitro fertilization conditions.

      • Large parts of the statistical analysis (the more technical parts) could be moved to the methods part since it disrupts the flow of the text.

      In the revised version of our manuscript, we have restructured this part of the analysis to ensure that more technical or secondary elements do not disrupt the flow of the main text. Accordingly, the equations have been reduced to only what is strictly necessary to understand our approach, their notation has been greatly simplified, and the statistical analysis of unfertilized oocytes whose zona pellucida was traversed by one or more spermatozoa has been moved to the Supplementary Information (SI1).

      • To me, one of the main conclusions was given in the text of the results part, namely that "This suggests that first fertilization contributes effectively to the fertilization-block, but less so to the penetration block". I would suggest that the authors use this conclusion to strengthen their rationale and storyline in the abstract.

      We agree with Reviewer 1's suggestion. Accordingly, we have thoroughly revised not only our abstract but also the introduction and discussion, in order to better highlight the rationale of our study, its storyline, and the new findings, which not only challenge certain established views but also open new research directions in the mechanisms of gamete fusion and polyspermy prevention.

      • Wording: in "To characterize the kinetics with which penetration of spermatozoa in the PVS falls down after a first fertilization," "falls down" should be replaced with "decreases" (pages 10 and 12).

      "Falls down" has been removed from the new version and replaced with "decreases".


      Significance

      Overall, this manuscript provides very interesting and carefully obtained data which provides important new insights particularly for reproductive biology. I applaud the authors on first establishing the in vivo conditions (how often do multiple sperm even penetrate the ZP in vivo) since studies have usually just started with in vitro condition where sperm at much higher concentration is added to isolated oocyte complexes. Thank you for providing an in vivo benchmark for the frequency of multiple sperm being in the PVS. While this frequency is rather low (somewhat expectedly, with 16% showing 2-3 sperm in the PVS), this condition clearly exists, providing a clear rationale for the investigation of mechanisms that can prevent additional sperm from entering.

      My own expertise is experimental, so I don't have sufficient expertise to evaluate the statistical methods employed here.



      Reviewer #2

      Evidence, reproducibility and clarity

      Overall, this is a very interesting and relevant work for the field of fertilization. In general, the experimental strategies are adequate and well carried out. I have some questions and suggestions that should be considered before the work is published.

      1) Why are the cumulus cells not mentioned when the AR is triggered before or while the sperms cross it? It seems the paper assumes from previous work that all sperm that reach ZP and the OPM have carried out the acrosome reaction. This, though probably correct, is still a matter of controversy and should be discussed. It is in a way strange that the authors do not make some controls using sperm from mice expressing GFP in the acrosome, as they have used in their previous work.

      We do not mention the cumulus cells or whether the acrosome reaction is triggered before, during, or after their traversal (i.e., upon sperm binding to the ZP), as this question, while scientifically relevant, pertains to a distinct line of investigation that lies beyond the scope of the present study. Even with the use of spermatozoa expressing GFP in the acrosome, addressing this question would require a complete redesign of our kinetic tracking protocol, which was specifically conceived to monitor in bright field the dynamic behavior of spermatozoa from the moment they begin to penetrate the perivitelline space of an oocyte. Accordingly, we imaged oocytes that were isolated 15 minutes after insemination of the cumulus–oocyte complexes, by which time most (if not all) cumulus cells had detached from the oocytes, as explained in the fourth paragraph of the material and methods of both the initial and revised versions of the manuscript. The spermatozoa we had access to were therefore already bound to the zona pellucida at the time of removal from the insemination medium, and had thus necessarily passed through the cumulus layer. It is unclear for us why Reviewer 2 believes that we “assume from previous work that all sperm that reach ZP has carried out the acrosome reaction”. We could not find any statement in our manuscript suggesting, let alone asserting, such an assumption, which we know to be incorrect. Based on both published work from Hirohashi’s group in 2011 (Jin et al., 2011, DOI: 10.1073/pnas.1018202108) and our own unpublished observation (both involving cumulus-oocyte masses inseminated with spermatozoa expressing GFP in the acrosome), it is established that only a subset of spermatozoa reaching the ZP after crossing the cumulus layer has undergone acrosome reaction. 
Moreover, from the same sources, as well as from a recent publication by Buffone’s group (Jabloñsky et al., 2023, DOI: 10.7554/eLife.93792), which is the one to which Reviewer 2 refers in her/his 3rd comment, it is also well established that spermatozoa have all undergone acrosome reaction when they enter the PVS. To the best of our knowledge, this latter point has long been widely accepted and is not questioned. Therefore, stating this in the first paragraph of the Discussion in the revised manuscript, while referencing the two aforementioned published studies, should be appropriate. What remains a matter of ongoing debate, however, is the timing and the physiological trigger(s) of the acrosome reaction in fertilizing spermatozoa. The 2011 study by Hirohashi’s group challenged the previously accepted view that ZP binding induces the acrosome reaction, showing instead that most spermatozoa capable of crossing the ZP and fertilizing the oocyte had already undergone the acrosome reaction prior to ZP binding. However, as this issue lies beyond the scope of our study, we do not consider it appropriate to include a discussion of it in the manuscript.

      2) In the penetration block equations, it is not clear to me why (tPF1) refers to both PIPF1 and σPIPF1. Is it as a function of?

      That is correct: (tPF1) means "as a function of the time post-first fertilization". Both the post-first fertilization penetration index (i.e., PIPF1) and its uncertainty (i.e., σPIPF1) vary as a function of this time. However, as mentioned in a previous response to Reviewer 1, this section has been rewritten to improve clarity and readability. The equations have been limited to those strictly necessary for understanding our approach, and their notation has been significantly simplified.

      3) Why do the authors think that the flagellum stops? The submission date was 2024-10-01 07:27:26 and there has been a paper in bioRxiv for a while that merits mention and discussion in this work (bioRxiv [Preprint]. 2024 Jul 2:2023.06.22.546073. doi: 10.1101/2023.06.22.546073. PMID: 37904966).

      Our experimental approach allows us to determine when the spermatozoon stops moving, but not why it stops. We thank Reviewer 2 for pointing out this very relevant paper from Buffone’s group (doi: 10.7554/eLife.93792), which shows the existence of two distinct populations of live, acrosome-reacted spermatozoa. These correspond to two successive stages, which occur either immediately upon acrosome reaction in a subset of spermatozoa, or after a variable delay in others, during which the sperm transitions from a motile to an immotile state. The transition from the first to the second stage was shown to follow a defined sequence: an increase in the sperm calcium concentration, followed by midpiece contraction associated with a local reorganization of the helical actin cortex, and ultimately the arrest of sperm motility. For fertilizing spermatozoa in the PVS, this transition was shown to occur upon fusion. However, it was also reported that in some non-fertilizing spermatozoa this transition took place within the PVS. These findings are consistent with the requirement for sperm motility in order to achieve fusion with the oolemma. Moreover, the fact that some spermatozoa may prematurely transition to the immotile state within the PVS can be added to the list of possible reasons why a spermatozoon that penetrates the PVS of an oocyte might fail to fuse.

      This discussion has been added to the first paragraph of the Discussion section of our revised manuscript.

      4) Please correct at the beginning of Materials and Methods: Sperm was obtained from WT male mice, it should say were.

      Thank you, the correction has been made.

      5) This is also the case in the fourth paragraph of this section: oocyte were not was.

      The sentence in question has been modified as follows: “In the in vitro fertilization experiments with kinetic tracking, a subset of oocytes, together with their associated ZP-bound spermatozoa, was isolated 15 minutes post-insemination and transferred individually into microdrops of fertilization medium to enable identification.”


      Significance

      Understanding mammalian gamete fusion and polyspermy inhibition has not been fully achieved. The authors examined real time brightfield and confocal images of inseminated ZP-intact mouse oocytes and used statistical analyses to accurately determine the dynamics of the events that lead to fusion and involve polyspermy prevention under conditions as physiological as possible. Their kinetic observations in mice gamete interactions challenge present paradigms, as they document that the first sperm is not necessarily the one that fertilizes, suggesting the existence of other post-penetration fertilization factors. The authors find that the zona pellucida (ZP) block triggered by the cortical reaction is too slow to prevent polyspermy in this species. In contrast, their findings indicate that ZP directly contributes to the polyspermy block operating as a naturally effective entry barrier inhibiting the exit from the perivitelline space (PVS) of components released from the oocyte plasma membrane (OPM), neutralizing unwanted sperm fusion, aside from any block caused by fertilization. Furthermore, the authors unveil a new important ZP role regulating flagellar beat in fertilization by promoting sperm fusion in the PVS.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      SUMMARY: This study by Dubois et al. utilizes live-cell imaging studies of mouse oocytes undergoing fertilization. A strength of this study is their use of three different conditions for analyses of events of fertilization: (1) eggs undergoing fertilization retrieved from females at 15 hr after mating (n = 211 oocytes); (2) cumulus-oocyte complexes inseminated in vitro (n = 220 oocytes), and (3) zona pellucida (ZP)-intact eggs inseminated in vitro, transferred from insemination culture once sperm were observed bound to the ZP for subsequent live-cell imaging (93 oocytes). This dataset and these analyses are valuable for the field of fertilization biology. Limitations of this manuscript are the challenges that arise with some conclusions, and the presentation of the manuscript. There are some factual errors, and also some places where clearer explanations should be provided in the text, potentially augmented with illustrations to provide more clarity on the models that the authors interpret from their data.

      MAJOR COMMENTS:

      The authors are congratulated on their impressive collection of data from live-cell imaging. However, the writing in several sections is challenging to understand or seems to be of questionable accuracy. The lack of accuracy is suspected to be more an effect of overly ambitious attempts with writing style, rather than to mislead readers. Nevertheless, these aspects of the writing should be corrected. There also are multiple places where the manuscript contradicts itself. These contradictions should be corrected. Finally, there are factual points from previous studies that need correction.

      Second, certain claims and the conclusions as presented are not always clearly supported by the data. This may be connected to the issues with writing style, word and phrasing choices, etc. The conclusions could be expressed more clearly, and thus may not require additional experiments or analyses to support them. The authors might also consider illustrations as ways to highlight the points they wish to make. (Figure 7 is a strong example of how they use illustrations to complement the text).

      In response to Reviewer 3's concern about the writing style, which made several sections difficult to understand, we have thoroughly revised the entire manuscript to improve clarity and precision. To further enhance comprehension, we have added illustrations in the revised version of the manuscript:

      • Figure 1A presents the gamete components; Figure 1B depicts the main steps of fertilization considered in the present study; and Figure 1C illustrates the penetration and fusion blocks, along with the respective contributing mechanisms: the ZP-block for the penetration block, and the membrane-block and PVS-block for the fusion block.

      • Figure 2A provides a description of the three experimental protocols used in this study: Condition 1, in vivo fertilization after mating; Condition 2, standard in vitro fertilization following insemination of cumulus-oocyte complexes; and Condition 3, in vitro fertilization with kinetic tracking of oocytes isolated from the insemination medium 15 min after insemination of the cumulus-oocyte complexes.

      • Figure 4 (formerly Figure 7 in the initial version) now highlights all fusing and non-fusing situations documented in videos 1-6 and associated paragraphs of the Results section.

      • In the Discussion, Figure 9 from the original version has been replaced by Figure 8, which now provides a more pedagogical illustration of the inefficacy of the penetration block in preventing polyspermy in mice. This figure illustrates the respective contributions of the fusion block, the penetration block, fusion timing, and the intrinsic permeability of the zona pellucida to the occurrence of polyspermy.

      We hope that this revised version of the article will guide the reader smoothly throughout, without causing confusion.

      Regarding the various points that Reviewer 3 perceives as contradictions or factual errors, or the claims and conclusions which, as presented, are not always supported by the data, we will provide our perspective on each of them as they are raised in the review.

      SPECIFIC COMMENTS:

      (1) The authors should use greater care in describing the blocks to polyspermy, particularly because they appear to be wishing to reframe views about prevention of polyspermic fertilization. The title mentions "the fast block to polyspermy"; this is problematic for a couple of different reasons. There is no strong evidence for a block to polyspermy in mammals that occurs quickly, particularly not in the same time scale as the first-characterized fast block to polyspermy. To many biologists, the term "fast block to polyspermy" refers to the block that has been described in species like sea urchins and frogs, meaning a rapid depolarization of the egg plasma membrane. However, such depolarization events of the egg membrane have not been detected in multiple mammalian species. Moreover, the change in the egg membrane after fertilization does not occur in as fast a time scale as the membrane block in sea urchins and frogs (i.e., is not "fast" per se), and instead occurs in a comparable time frame as the conversion of the ZP associated with the cleavage of ZP2. Thus, it is misleading to use the terms "fast block" and "slow block" when talking about mammalian fertilization. This also is an instance of where the authors contradict themselves in the manuscript, stating, "the membrane block and the ZP block are established in approximatively the same time frame" (third paragraph of Introduction). This statement is indeed accurate, unlike the reference to a fast block to polyspermy in mammals.

      We fully agree with Reviewer 3 on the importance of clearly defining the two blocks examined in the present study—the penetration block and the fusion block (as referred to in the revised version) —and of situating them in relation to the three blocks described in the literature: the ZP-block, membrane-block, and PVS-block. We acknowledge that this distinction was not sufficiently clear in the original version of the manuscript. In the revised version, these two blocks and their relationship to the ZP-, membrane-, and PVS-blocks are now clearly introduced in the second paragraph of the Introduction section and illustrated in the first figure of the manuscript (Fig. 1C). They are then discussed in detail in two dedicated paragraphs of the Discussion, entitled Relation between the penetration block and the ZP-block and Relation between the fusion block and the membrane- and PVS-blocks.

      The penetration block refers to the time-dependent decrease in the number of spermatozoa penetrating the perivitelline space (PVS) following fertilization, whereas the fusion block refers to the time-dependent decrease in sperm-oolemma fusion events after fertilization. It is precisely these two blocks that our in vitro fertilization experiments with kinetic tracking allow us to characterize.

      In this study, as in the literature, fusion-triggered modifications of the ZP that hinder sperm traversal of the ZP are referred to as the ZP-block (also known as ZP hardening). The ZP-block thus contributes to the post-fertilization reduction in sperm penetration into the PVS and thereby underlies the penetration block. Similarly, fusion-triggered alterations of the PVS and the oolemma that reduce the likelihood that spermatozoa which have reached the PVS will fuse with the oolemma are referred to as the PVS-block and membrane-block, respectively. These two blocks act together to reduce the probability of sperm-oolemma fusion after fertilization, and thus contribute to the fusion block.

      The time constant of the penetration block was found to be 48.3 ± 9.7 minutes, which is consistent with the typical timeframe of ZP-block completion—approximately one hour post-fertilization in mice—as reported in the literature. By contrast, the time constant of the fusion block was determined to be 6.2 ± 1.3 minutes, which is markedly faster than the time typically reported in the literature for the completion of the fusion-block (more than one hour in mice). This strongly suggests that the kinetics of the fusion block are not primarily governed by its membrane-block component, but rather by its PVS-block component—about which little to nothing was previously known.
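
      As a rough illustration of what these two time constants imply, the sketch below assumes that each block reduces the corresponding event rate as a single exponential, exp(-t/τ). That functional form, and the helper names `remaining_fraction` and `time_to_fraction`, are assumptions made here for illustration only; the fitting model actually used in the manuscript is not restated in this response. Only the two central values (48.3 min and 6.2 min) are taken from the text above.

```python
import math

# Central values of the time constants quoted above, in minutes.
TAU_PENETRATION = 48.3  # penetration block
TAU_FUSION = 6.2        # fusion block


def remaining_fraction(t_min: float, tau_min: float) -> float:
    """Fraction of the initial rate still present after t_min minutes.

    ASSUMPTION: single-exponential decay exp(-t / tau); illustrative only.
    """
    return math.exp(-t_min / tau_min)


def time_to_fraction(fraction: float, tau_min: float) -> float:
    """Time (minutes) needed for the rate to fall to `fraction` of its start."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -tau_min * math.log(fraction)


# How much faster the fusion block is, in terms of time constants.
ratio = TAU_PENETRATION / TAU_FUSION

if __name__ == "__main__":
    print(f"tau ratio (penetration / fusion): {ratio:.1f}")
    for t in (5, 15, 60):
        print(
            f"t = {t:>2} min: "
            f"fusion rate left = {remaining_fraction(t, TAU_FUSION):.3f}, "
            f"penetration rate left = {remaining_fraction(t, TAU_PENETRATION):.3f}"
        )
```

      Under this assumed form, the fusion block has largely developed within the first quarter of an hour, while the penetration block has barely begun, which is the qualitative contrast the two time constants express.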

      Contrary to what Reviewer 3 appears to have understood from our initial formulation, there is therefore no contradiction or error in stating that "the membrane block and the ZP block are established within approximately the same timeframe", while the fusion block, which proceeds much more rapidly, is likely to rely predominantly on the PVS-block. We have thoroughly revised the manuscript to clarify this key message of the study.

      However, we understand Reviewer 3’s objection to referring to the fusion block (or the PVS-block) as a fast block, given that this term is conventionally reserved for the immediate fertilization-triggered membrane depolarization occurring in sea urchins and frogs. Although the kinetics we report for the fusion block are considerably faster than those of the penetration block, they occur on the scale of minutes, and not seconds. In line with the reviewer's recommendation, we have therefore modified both the title and the relevant passages in the text to remove all references to the term fast block in the revised version.

      (2) The authors aim to make the case that events occurring in the perivitelline space (PVS) prevent polyspermic fertilization, but the data that they present are not strong enough to support this conclusion. Additional experiments would be optional for this study, but data from such additional experiments are needed to support the authors' claims regarding these functions in fertilization. Without additional data, the authors need to be much more conservative in interpretations of their data. The authors have indeed observed phenomena (the presence of CD9 and JUNO in the PVS) that could be consistent with a molecular basis of a means to prevent fertilization by a second sperm. However, the authors would need additional data from additional experimental studies, such as interfering with the release of CD9 and JUNO and showing that this experimental manipulation leads to increased polyspermy, or creating an experimental situation that mimics the presence of CD9 and JUNO (in essence, what the authors call "sperm inhibiting medium" on page 20) and showing that this prevents fertilization.

      A major section of the Results section here (starting with "The consequence is that ... ") is speculation. Rather than being in the Results section, this should be in the Discussion. The language should also be softened regarding the roles of these proteins in the perivitelline space in other portions of the manuscript, such as the abstract and the introduction.

      Finally, the authors should do more to discuss their results with the results of Miyado et al. (2008), which interestingly, posited that CD9 is released from the oocytes and that this facilitates fertilization by rendering sperm more fusion-competent. There admittedly are two reports that present data that suggest lack of detection of CD9-containing exosomes from eggs (as proposed by Miyado et al.), but nevertheless, the authors should put their results in context with previous findings.

      We generally agree with all the remarks and suggestions made here. In the revised version of the manuscript, we have retained in the Results section (pp. 14–15) only the factual data concerning the localization of CD9 and JUNO in unfertilized and fertilized oocytes, as well as in the spermatozoa present in the PVS of these oocytes. We have taken care not to include any interpretive elements in this section, which are now presented exclusively in a dedicated paragraph of the Discussion, entitled “Possible molecular bases of the membrane-block and ZP-block contributing to the fusion block” (p. 21). There, we develop our hypothesis and discuss it in light of both the findings from the present study and previous work by other groups. In doing so, we also address the data reported by Miyado et al. (2008, https://doi.org/10.1073/pnas.0710608105), as well as subsequent studies by two other groups—Gupta et al. (2009, https://doi.org/10.1002/mrd.21040) and Barraud-Lange et al. (2012, https://doi.org/10.1530/REP-12-0040)—that have challenged Miyado’s findings.

      We are fully aware that our interpretation of the coverage of unfused sperm heads in the perivitelline space (PVS) by CD9 and JUNO, released from the oolemma—as a potential mechanism of sperm neutralization contributing to the PVS block—remains, at this stage, a plausible hypothesis or working model that, as such, warrants further experimental investigation. It is precisely in this spirit that we present it—first in the abstract (p.1), then in the Discussion section (p. 21), and subsequently in the perspective part of the Conclusion section (p. 22).

      (3) Many of the authors' conclusions focus on their prior analyses of sperm interaction - beautifully illustrated in Figure 7. However, the authors need to be cautious in their interpretations of these data and generalizing them to mammalian fertilization as a whole, because mouse and other rodent sperm have sperm head morphology that is quite different from most other mammalian species.

      In a similar vein, the authors should be cautious in their interpretations regarding the extension of these results to mammalian species other than mouse, given data on numbers of perivitelline sperm (ranging from 100s in some species to virtually none in other species), suggesting that different species rely on different egg-based blocks to polyspermy to varying extents. While these observations of embryos from natural matings are subject to numerous nuances, they nevertheless suggest that conclusions from mouse might not be able to be extended to all mammalian species.

      It is not clear to us whether Reviewer 3’s comment implies that we have, at some point in the manuscript, generalized conclusions obtained in mice to other mammalian species—which we have not—or whether it is simply a general, common-sense remark with which we fully agree: that findings established in one species cannot, by default, be assumed to apply to another.

      We would like to emphasize that throughout the manuscript, we have taken care to restrict our interpretations and conclusions to the mouse model, and we have avoided any unwarranted extrapolation to other species.

      To definitively close this matter—if there is indeed a matter—we have added the following clarifying statements in the revised version of the manuscript:

      In the introduction, second paragraph (pp. 2–3): "The variability across mammalian species in both the rate of fertilized oocytes with additional spermatozoa in their PVS (from 0 to more than 80%) after natural mating and the number of spermatozoa present in the PVS of these oocytes (from 0 to more than a hundred) suggests that the time for completion of the penetration block and thus its efficiency to prevent polyspermy can vary significantly between species."

      At the end of the preamble to the Results section (p. 4): "This experimental study was conducted in mice, which are the most widely used model for studying fertilization and polyspermy blocks in mammals. While there are many interspecies similarities, the findings presented here should not be directly extrapolated to humans or other mammalian species without species-specific validation."

      In the Conclusion, the first sentence (p. 22) is: “This study sheds new light on the complex mechanisms that enable fertilization and ensure monospermy in mouse model.”

      Within the Conclusion section, among the perspectives of this work (p. 22): "In parallel, comparative studies in other mammalian species will be needed to assess the generality of the PVS-block and its contribution relative to the membrane-block and ZP-blocks, as well as the generality of the mechanical role played by flagellar beating and ZP mechanical constraint in membrane fusion."

      (4) Results, page 4 - It is very valuable that the authors clearly define what they mean by a penetrating spermatozoon and a fertilizing spermatozoon. However, they sometimes appear not to adhere to these definitions in other parts of the manuscript. An example of this is on page 10; the description of penetration of spermatozoon seems to be referring to membrane fusion with the oocyte plasma membrane, which the authors have alternatively called "fertilizing" or fertilization - although this is not entirely clear. The authors should go through all parts of the manuscript very carefully and ensure consistent use of their intended terminology.

      Overall, while these definitions on page 4 are valuable, it is still recommended that the authors explicitly state when they are addressing penetration of the ZP and fertilization via fusion of the sperm with the oocyte plasma membrane. This would help significantly in comprehension by readers. An example is the section header in the middle of page 9 - this could be "Spermatozoa can penetrate the ZP after the fertilization, but have very low chances to fertilize."

      We chose to define our use of the term penetration at the beginning of the Results section because, as readers of fertilization studies, we have encountered on multiple occasions ambiguity as to whether this term was referring to sperm entry into the perivitelline space following zona pellucida traversal, or to the fusion of the sperm with the oolemma. To avoid such ambiguity, we were particularly careful throughout the writing of our original manuscript to use the term penetration exclusively to describe sperm entry into the PVS. The terms fertilizing and fusion were reserved specifically for membrane fusion between the gametes. However, as occasional lapses are always possible, we followed Reviewer 3’s recommendation and carefully re-examined the entire manuscript to ensure consistent use of our intended terminology. We did not identify any inconsistencies, including on page 10, which was cited as an example by Reviewer 3. We therefore confirm that, in accordance with our predefined terminology, all uses of the term penetration, on that page and anywhere else in our original manuscript, refer exclusively to sperm entry into the PVS and do not pertain to fusion with the oolemma.

      That said, it is important that all readers— including those who may only consult selected parts of the article—are able to understand it clearly. Therefore, despite the potential risk of slightly overloading the text, Reviewer 3’s suggestion to systematically associate the term penetration with ZP seems to us a sound one. However, we have opted instead to associate penetration with PVS, as our study focuses on the timing of sperm penetration into the perivitelline space, rather than on the traversal of the zona pellucida itself. Accordingly, except in a few rare instances where ambiguity seemed impossible, we have systematically used the phrasing “penetration into the PVS” throughout the revised version of the manuscript.

      Another variation of this is in the middle of page 9, where the authors use the terms "fertilization block" and "penetration block." These are not conventional terms, and venture into being jargon, which could leave some readers confused. The authors could clearly define what they mean, particularly with respect to "penetration block."

      This point has already been addressed in our response to Comment 1 from Reviewer 3. We invite Reviewer 3 to refer to that response.

      This extends to other portions of the manuscript as well, such as Figure 2C, with the label on the y-axis being "Time after fertilization." It seems that what the authors actually observed here was the cessation of sperm tail motility. (It is not evident that they did an assessment of sperm-oocyte fusion here.)

      Regarding Figure 2C (original version), it has been merged with Figure 2B (original version) to form a single figure (Figure S2D), now included in Supplementary Information SI2. This new figure retains all the information originally presented in Figure 2C and indicates the time axis origin as the time when oscillatory movements of the sperm cease.

      That said, our response to Reviewer 1 and the Materials and Methods explain why it is legitimate to use the cessation of sperm head oscillations on the oolemma as a marker for the timing of the fusion event. We invite the reviewers to refer to that response for a full explanation of our rationale.

      (5) Several points that the authors try to make with several pieces of data do not come across clearly in the text, including Figure 2 on page 6, Figure 4 on page 9, and the various states utilized for the statistical treatment, "post-first penetration, post-first fertilization, no fertilization, penetration block and polyspermy block" on page 10. Either re-writing and clearer definitions/explanations are needed, and/or schematic illustrations could be considered to augment re-written text. Illustrations could be a valuable way to present the intended concepts to readers more clearly and accurately. For example, Figure 4 and the associated text on page 9 get particularly confusing - although this sounds like a quite impressive dataset with observations of 138 sperm. Illustrations could be helpful, in the spirit of "a picture is worth 1000 words," to show what seem to be three different situations of sequences of events with the sperm they observed. Finally, the text in the Results about the 138 sperm is quite difficult to follow. It also might help comprehension to augment the percentages with the actual numbers of sperm - e.g., is 48.6% referring to 67 of the total 138 sperm analyzed? Does the 85.1% refer to 57 of these 67 sperm?
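
      The arithmetic behind the reviewer's question can be checked directly. The minimal sketch below tests whether the counts proposed in the comment (67 of 138, and 57 of 67) reproduce the two quoted percentages when rounded to one decimal place. The counts are the reviewer's guesses, not values confirmed by the authors here, and the helper name `percent` is hypothetical.

```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * part / total, 1)


# Counts suggested in the reviewer's comment (unconfirmed guesses).
TOTAL_SPERM = 138
SUBSET = 67       # candidate count behind the quoted 48.6%
SUB_SUBSET = 57   # candidate count behind the quoted 85.1%

if __name__ == "__main__":
    print(f"{SUBSET}/{TOTAL_SPERM} = {percent(SUBSET, TOTAL_SPERM)}%")
    print(f"{SUB_SUBSET}/{SUBSET} = {percent(SUB_SUBSET, SUBSET)}%")
```

      Both guessed counts are numerically consistent with the quoted percentages, which supports the suggestion of reporting the counts alongside the percentages so that readers do not need to reconstruct them.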

      Figure 2 in the original version of our manuscript concerns sperm engulfment and PB2 extrusion. As already mentioned in our response to Reviewer 1, the characterization of sperm engulfment and PB2 extrusion kinetics is highly relevant to the analysis of the penetration and fusion blocks. However, we agree that its presence in the main text may distract the reader from the main focus of the study. Therefore, this figure and the associated text have been moved to the Supplementary Information in the revised manuscript (SI 2, pages 26–27).

      Regarding Figure 4 (original version), in response to Reviewer 3’s concern about the difficulty in grasping the message conveyed in its three graphs and associated text we have completely rethought the way these data are presented. Since the three graphs of Figure 4 were directly derived from the experimental timing data of sperm entry in the PVS and fusion with the oolemma in fertilized oocytes (originally shown in Figure 3A), we have combined them into a single figure in the revised manuscript: Figure 3 (page 8). This new Figure 3 now comprises three components:

      • Figure 3A remains unchanged from the original version and shows the timing of sperm penetration and fusion in fertilized oocytes. Each sperm category (fused or non-fused, penetrated in the PVS before fusion or after fusion) is represented using a color code clearly explained in the main text (last paragraph of page 7).
      • Figure 3B focuses specifically on the first spermatozoon to penetrate the PVS of each oocyte. It reports how many of these first-penetrating spermatozoa succeeded in fusing versus how many failed to do so, highlighting that being the first to arrive is not sufficient for fusion—other factors are involved. This is explained simply in the first paragraph of page 9.
      • Figure 3C considers all spermatozoa that entered the PVS of fertilized oocytes, classifying them into three categories: those that penetrated the PVS before fertilization, those that did so after fertilization, and those for which the timing could not be precisely determined. Such classification makes it apparent that the number of spermatozoa penetrating before and after fertilization is of the same order of magnitude, indicating that fertilization is not very effective at preventing further sperm entry into the PVS for the duration of our observations (~4 hours). To facilitate the identification of these three categories, the same color code used in Figure 3A is applied. In addition, within each category, the number of spermatozoa that successfully fused is indicated in black. This allows the reader to quickly assess the fertilization probability for each category: high for sperm entering before fertilization, very low or null for those entering after fertilization. This analysis shows that fertilization is far more effective at blocking sperm fusion than at blocking sperm penetration. This is clearly explained in the second paragraph of page 9.

      Regarding statistical analysis, as already mentioned in our responses to Reviewers 1 and 2, this section has been rewritten to improve clarity and readability. The notation has also been significantly simplified. To improve the overall fluidity of the text related to the statistical analysis, Figure 3B (original version), which presented the timing of penetration into the perivitelline space of oocytes that remained unfertilized, and its associated statistical analysis (previously in Figure 5B) have been revised and combined into a single Figure S1 of the Supplementary Information (SI1, page 26; now Figures S1A and S1B).

      (6) Introduction, page 2 - it is inaccurate to state that only diploid zygotes can develop into a "new being." Triploid zygotes typically fail early in development, but can survive and, for example, contribute to molar pregnancies. Additionally, it would be beneficial to use a more scientifically precise term than saying "development into a new being." This is recommended not only for scientific accuracy, but also due to current debates, including in lay public circles, about what defines "life" or human life.

      In response to Reviewer 3’s comment, we no longer state in the revised version of the manuscript that only diploid zygotes can develop into a new being. We have modified our wording as follows, on page 2, second paragraph: “In mammals, oocytes fertilized by more than one spermatozoon cannot develop into viable offspring.”

      (7) Introduction, page 2 - The mammalian sperm must pass through three layers, not just two as stated in the first paragraph of the Introduction. The authors should include the cumulus layer in this list of events of fertilization.

      The sentence from the introduction from the original manuscript mentioned by Reviewer 3 was: “To fertilize, a spermatozoon must successively pass two oocyte’s barriers.” This statement is accurate in the sense that the cumulus cell layer is not part of the oocyte itself, unlike the two oocyte’s barriers: the zona pellucida and the oolemma. Moreover, the traversal of the cumulus layer is not within the scope of our study, unlike the traversal of the zona pellucida and fusion with the oolemma. However, it is also correct that in our study the spermatozoa have passed through the cumulus layer before reaching the oocyte. Therefore, in response to Reviewer 3’s comment, we have revised the sentence to clarify this point as follows:

      “Once a spermatozoon has passed through the cumulus cell layer surrounding the oocyte, it still must overcome two oocyte’s barriers to complete fertilization.”

      (8) Introduction, page 2 - While there is evidence that zinc is released from mouse egg upon fertilization, the evidence is not convincing or conclusive that zinc is released from cortical granules or via cortical granule exocytosis.

      To better highlight the rationale, storyline, and scope of our study, the introduction has been thoroughly streamlined. In this context, the section discussing the cortical reaction and zinc release seemed more appropriate in the Discussion, specifically within the paragraph titled “Relationship between the penetration block and the ZP-block.”

      To address the uncertainty raised by Reviewer 3 regarding the origin of the zinc spark release, we have rephrased this part as follows:

      “The fertilization-triggered processes responsible for the changes in ZP properties are generally attributed to the cortical reaction—a calcium-induced exocytosis of secretory granules (cortical granules) present in the cortex of unfertilized mammalian oocytes—and to zinc sparks. As a result, proteases, glycosidases, lectins, and zinc are released into the perivitelline space (PVS), where they act on the components of the zona pellucida. This leads to a series of modifications collectively referred to as ZP hardening or the ZP-block”.

      (9) The authors inaccurately state, "only if monospermic multi-penetrated oocytes are able to develop normally, which to our knowledge has never been proven in mice" (page 4) - This was demonstrated with the Astl knockout, assuming that the authors' use of "multi-penetrated oocytes" here refers to the definition of penetration that they use, namely penetrating the ZP. This is also one of the instances where the authors contradict themselves, as they note the results with this knockout on page 18.

      Thank you for bringing this point to our attention. Nozawa et al. (2018) found that female mice lacking ovastacin (Astl)—the protease released during the cortical reaction that plays a key role in rendering the zona pellucida impenetrable—are normally fertile. They also reported that oocytes recovered from these females after mating were monospermic, despite the consistent presence of additional spermatozoa in the perivitelline space. We can indeed consider that taken together these findings demonstrate that the presence of multiple spermatozoa in the PVS does not impair normal development, as long as the oocyte remains monospermic. In our study, we re-demonstrated this in a different way (by reimplantation of monospermic oocytes with additional spermatozoa in their PVS) in a more physiological context of WT oocytes, but we agree that we cannot state: “which to our knowledge has never been proven in mice.” This part of the sentence has therefore been removed. In the revised version of the manuscript, the sentence is now formulated in the first paragraph of page 5 as follows: “However, the contribution of the fusion block to prevent polyspermy has physiological significance only if monospermic oocytes with additional spermatozoa in their PVS can develop into viable pups.”

      Minor comments:

      There are numerous places where this reader marked places of confusion in the text. A sample of some of these:

      We will indicate hereinafter how we have modified the text in the specific examples provided by Reviewer 3. Beyond these, however, we would like to emphasize that we have thoroughly revised the entire manuscript to improve clarity and precision.

      Page 4 - "continuously relayed by other if they detach" - don't know what this means

      Replaced now p 5 by “can be replaced by others if they detach”

      Page 6 - "hernia" - do the authors mean "protrusion" on the oocyte surface?

      The paragraph from the Results section in question has now been moved to the Supplementary Information, on pages 26 and 27. The term hernia has been systematically replaced with protrusion, including in the Materials and Methods section on page 24.

      Page 10 - "penetration of spermatozoa in the PVS falls down" - don't know what this means

      Falls down has been removed from the new version and replaced with decreases

      Page 12 - "spermatozoa linked to the oocyte ZP" - not clear what "linked" means here

      Replaced now page 16 by “spermatozoa bound to the oocyte ZP”

      Page 14 - "by dint of oscillations" - don't know what this means

      Replaced now page 10 by “the persistent flagellum movements”

      Specifics for Materials and Methods:

      Exact timing of females receiving hCG and then being put with males for mating - assume this was immediate but this is an important detail regarding the timing for the creation of embryos in vivo.

      That is correct: females were placed with males for mating immediately after receiving hCG. This clarification has been added in the revised version of the manuscript.

      Please provide the volumes in which inseminations occurred, and how many eggs were placed in this volume with the 10^6 sperm/ml.

      The number of eggs may vary from one cumulus–oocyte complex to another. It is therefore not possible to specify exactly how many eggs were inseminated. However, we now indicate on page 23 the number of cumulus–oocyte complexes inseminated (4 per experiment), the volume in which insemination was performed (200 mL), and the sperm concentration used (10^6 sperm/mL).

      **Referees cross-commenting**

      I concur with Reviewer 1's comment, that the 'challenging prior dogma' about the first sperm not always being the one to fertilize the egg is too strong. As Reviewer 1 notes, "it had been observed before that it is not necessarily the first sperm that gets through the ZP that fertilizes the egg." I even thought about adding this comment to my review, although held off (I was hoping to find references, but that was taking too long).

      Please refer to our response to Reviewer 1 regarding this point.

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript by Majnik and colleagues introduces "Track2p", a new tool designed to track neurons across imaging sessions of two-photon calcium imaging in developing mice. The method addresses the challenge of tracking cells in the growing brain of developing mice. The authors showed that "Track2p" successfully tracks hundreds of neurons in the barrel cortex across multiple days during the second postnatal week. This enabled the identification of the emergence of behavioral state modulation and desynchronization of spontaneous network activity around postnatal day 11.

      Strengths:

      The manuscript is well written, and the analysis pipeline is clearly described. Moreover, the dataset used for validation is of high quality, considering the technical challenges associated with longitudinal two-photon recordings in mouse pups. The authors provide a convincing comparison of both manual annotation and "CellReg" to demonstrate the tracking performance of "Track2p". Applying this tracking algorithm, Majnik and colleagues characterized hallmark developmental changes in spontaneous network activity, highlighting the impact of longitudinal imaging approaches in developmental neuroscience. Additionally, the code is available on GitHub, along with helpful documentation, which will facilitate accessibility and usability by other researchers.

      Weaknesses:

      (1) The main critique of the "Track2p" package is that, in its current implementation, it is dependent on the outputs of "Suite2p". This limits adoption by researchers who use alternative pipelines or custom code. One potential solution would be to generalize the accepted inputs beyond the fixed format of "Suite2p", for instance, by accepting NumPy arrays (e.g., ROIs, deltaF/F traces, images, etc.) from files generated by other software. Otherwise, the tool may remain more of a useful add-on to "Suite2p" (see https://github.com/MouseLand/suite2p/issues/933) rather than a fully standalone tool.

      (2) Further benchmarking would strengthen the validation of "Track2p", particularly against "CaImAn" (Giovannucci et al., eLife, 2019), which is widely used in the field and implements a distinct registration approach.

      (3) The authors might also consider evaluating performance using non-consecutive recordings (e.g., alternate days or only three time points across the week) to demonstrate utility in other experimental designs.
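      To make the suggestion in weakness (1) concrete, the following is a minimal sketch of what a pipeline-agnostic input could look like. It is not part of "Track2p" or "Suite2p"; the class name `SessionData` and its fields are hypothetical and only illustrate how ROIs, traces, and a reference image could be passed as plain NumPy arrays with basic shape checks.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SessionData:
    """Pipeline-agnostic inputs for one imaging session (hypothetical container)."""

    roi_masks: np.ndarray   # (n_rois, height, width) boolean masks
    traces: np.ndarray      # (n_rois, n_frames) fluorescence or deltaF/F traces
    mean_image: np.ndarray  # (height, width) reference image

    def __post_init__(self):
        # Validate shapes early so that errors surface before any tracking starts.
        if self.roi_masks.ndim != 3:
            raise ValueError("roi_masks must have shape (n_rois, height, width)")
        if self.traces.ndim != 2 or self.traces.shape[0] != self.roi_masks.shape[0]:
            raise ValueError("traces must have one row per ROI")
        if self.mean_image.shape != self.roi_masks.shape[1:]:
            raise ValueError("mean_image must match the ROI mask dimensions")
```

      Any segmentation pipeline that can export arrays of these shapes could then feed the same downstream tracking code.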

    1. The presented preprint is a well-researched study on a relevant topic that could be of interest to a broad audience. The study's strengths include a well-structured and clearly presented methodology. The code and data used in the research are openly available on Figshare, in line with best practices for transparency. Furthermore, the findings are presented in a clear and organized manner, with visualizations that aid understanding.

      At the same time, I would like to draw your attention to a few points that could potentially improve the work.

      1. I think it would be beneficial to expand the abstract to approximately 250 words.

      2. The introduction starts with a very broad context, but the connection between this context and the object of the research is not immediately clear. There are few references in this section, making it difficult to determine whether the authors are citing others or their own findings.

      3. The transition to the main topic of the study is not well-defined, and there is no description of the gap in the literature regarding the object of study. Additionally, "bibliometric archaeology" appears at the end of the introduction but is only mentioned again later in the discussion, which may cause confusion for the reader.

      4. It would be helpful to clearly state the purpose and objectives of the study both in the Introduction and in the abstract as well.

      5. Besides, it is important to elaborate on the contribution of this study in the introduction section.

      6. The same applies to the background - a very broad context, but the connection with the object of the research is not entirely clear.

      7. Page 4 - as far as I understand, these are conclusions from a literature review, while point 3 (Reflective Richness of Data) does not follow from the previous analysis.

      8. The overall impression of the introduction and background is that it is an interesting text, but it is not well related to the objectives of the study. I would recommend shortening these sections by making the introduction and literature review more pragmatic and structured. At the same time, this text could be published as a standalone contribution.

      9. As I mentioned above, the methodology refers to the strengths of the study. However, in this section, it would be helpful to introduce and justify the structure of presenting the results.

      10. In the methodology section, the authors could also provide a footnote with a link to the code and dataset (currently, it is only given at the end).

      11. With regard to the discussion, I would like to encourage the authors to place their results more clearly in the academic context. Ideally, references from the introduction and/or literature review would reappear in this section to help clarify the research contribution.

      12. Although Discussion C is an interesting read, it seems more related to the introduction than the results. Again, the text itself is rather interesting, but it would benefit from a more thorough justification.

      Remarks on the images:

      1. At least the data source for the images should be specified in the background, because it is not obvious to the reader before describing the methodology.

      2. The color distinction between China and Russia in Figure 8 is not very clear.

      3. The gray lines in Figures 9-11 make the figures difficult to read. Additionally, the meaning of these lines is not clearly indicated in the legends of Figures 10 and 11. These issues should be addressed. 

      All comments and suggestions are intended to improve the article. Overall, I have a very positive impression of the work.

      Sincerely,

      Dmitry Kochetkov

    1. A summary of what the authors were trying to achieve (address the entire article, not just individual points or sections)

      This short paper provides an introduction to two issues: the “credibility revolution” of practices in psychology research following the reproducibility crisis, and the state of psychology research in Africa and the factors which are crucial to its development. The paper claims that there are mutual benefits: efforts to support credible and accessible research can benefit psychology in Africa, and African psychology can expand and enhance the credibility of psychology research in the rest of the world.

      An account of the major strengths and weaknesses of the conceptual framework, methods and results

      The paper serves as a very accessible introduction to the “reproducibility crisis” in psychology, and the subsequent “credibility revolution” of research practices which are often (but not exclusively) focussed on transparency and accessibility, and which are applicable in many fields beyond psychology.

      The taxonomy of open science innovations into four categories - Accessibility, Infrastructure, Credibility, Community - is a nice way of organising initiatives.

      It may be beyond the scope of the remit the authors set themselves, but from a metascience perspective there is an unanswered question of how progress on the challenges set out by the paper would be measured. What are the indicators which we could use to evaluate progress in the different challenge areas or against which to measure benefits?

      One paragraph summarising progress on reproducibility (p8 “The result was an explosion of research on practices to improve the credibility of psychology research”) seems to imply that credibility efforts are coextensive with replication studies (which is surely not what the authors mean) and further to imply that credibility practices are limited in applicability to a restricted domain of mostly online studies (which undersells the benefits of the credibility practices developed within psychology and admirably showcased in this paper).

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions

      I am not qualified to comment on whether the portrayal of African psychology is fair or comprehensive. I note that six of the nine authors have affiliations with African institutions.

      A discussion of the potential likely impact of the article on the field, and the utility of the conceptual framework, methods and empirical materials/data to the community

      The contribution of this paper is to signpost the valuable work that is being done on credibility mechanisms and on research development in Africa.

      Any additional context that might help readers interpret or understand the significance of the article

      None

      Any issues the authors need to address about the availability of data, code, research ethics, or other issues pertaining to the adherence of the article to MetaROR’s publishing policies

      N/A

      Positionality:

      As an experimental psychologist I have been involved in the discussion around credibility since at least 2011. I have no experience or familiarity with African psychology or the human development issues mentioned by the article.

      Reviewers are asked to provide specific guidance on the following:

      Does the article contribute new insights to the relevant fields?

      Yes. Both topics - credibility in research and research in Africa - are huge topics. The brief introductions here are valuable and there is added benefit of bringing the two into explicit dialogue.

      Are the key insights clearly communicated in the abstract, introduction, and conclusion?

      Yes

      Does the introduction section adequately explain necessary background information? Does it set out and justify the motivation for and aim of the study?

      Yes

      Does the literature review (where applicable) include the relevant research including the most recent research?

      Yes, with the caveat that the topics are so large that it is impossible in this amount of space to be comprehensive.

      Are any analytical concepts or theoretical frameworks used appropriately introduced and taken up in the empirical analysis (where applicable)?

      N/A

      Are all research methods clearly described and appropriate? In the case of quantitative submissions, are the methods rigorous and does the study include or point to all materials required to attempt a replication of the results?

      N/A

      Do the results make sense? Are they clearly formatted and presented? Are graphs easily readable and clearly labeled? Are all figures and tables understandable without reference to the main body of the article? Do they contain proper captions?

      N/A

      Are the results discussed in the context of previous findings? Are the results similar to previously reported findings? Are differences explained?

      There is no mention of previous work on this exact topic (the synergy). Perhaps there isn’t any? Maybe an explicit statement to this effect would be good.

      Are limitations of the study and their implications for interpretation of the results clearly described (where applicable)?

      On a similar line, maybe readers would benefit from a statement from the authors on their backgrounds and/or how the author team came together to address this topic?

      Are interpretations and conclusions consistent with the empirical materials and data?

      N/A

      Are all references appropriate? Are necessary references present? Are all references cited in the text included in the reference list?

      Nosek et al (2021) is missing or should be Nosek et al (2022), which additionally appears slightly out of alphabetical order in the bibliography

      If one or more studies in the article were preregistered, are the hypotheses, research methods, and inference criteria in line with the preregistration?

      N/A

    1. Reviewer #2 (Public review):

      Summary:

      The authors developed a computational pipeline named CHROMAS to track and analyze chromatophore dynamics, which provides a wide range of biological analysis tools without requiring the user to write code.

      Strengths:

      (1) CHROMAS is an integrated toolbox that provides tools for different biological tasks such as: segment, classify, track and measure individual chromatophores, cluster small groups of chromatophores, analyze full-body patterns, etc.

      (2) It could be used to investigate different species. The authors have already applied it to analyze the skin of the bobtail squid Euprymna berryi and the European cuttlefish Sepia officinalis.

      (3) The tool is open-source and easy to install. The paper describes in detail the experiment requirements, command to complete each task and provides relevant sample figures, which are easy to follow.

      Weaknesses:

      (1) There are some known limitations for the current version. The users should read the "Discussion" chapter carefully before preparing their experiments and using CHROMAS.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This study provides comprehensive instructions for using the chromatophore tracking software, Chromas, to track and analyse the dynamics of large numbers of cephalopod chromatophores across various spatiotemporal scales. This software addresses a long-standing challenge faced by many researchers who study these soft-bodied creatures, known for their remarkable ability to change colour rapidly. The updated software features a user-friendly interface that can be applied to a wide range of applications, making it an essential tool for biologists focused on animal dynamic signalling. It will also be of interest to professionals in the fields of computer vision and image analysis.

      Strengths:

      This work provides detailed instructions for this toolkit along with examples for potential users to try. The GitLab repository hosts the software package, installation documentation, and tutorials, which should give potential users a less steep learning curve.

      Weaknesses:

      The evidence supporting the authors' claims is solid, particularly demonstrated through the use of cuttlefish and squid. However, it may not be applicable to all coleoid cephalopods yet, such as octopuses, which have an incredibly versatile ability to change their body forms.

      The reviewer is right to highlight this limitation. We clarified, in the revised manuscript, that CHROMAS relies on the assumption that chromatophore activity occurs primarily in a plane — a condition that is valid most of the time in squid and cuttlefish, where the majority of skin deformations are in-plane (with small occasional papillae). In cephalopods such as octopuses, however, in which the skin may undergo large 3-dimensional deformations through the action of papillary musculature, this assumption may not always hold. Although octopods’ bodies are more spherical (less flat) than those of squid and cuttlefish, CHROMAS should still be usable and useful if applied to smaller skin areas, especially because chromatophore density is often even higher in octopoda than in sepiidae.

      We added the following paragraph in the discussion:

      Another known limitation concerns the biological assumptions underlying the current version of CHROMAS. The pipeline is designed for surfaces that remain reasonably planar and undergo deformations primarily in two dimensions. In cephalopods such as octopuses, in which the skin can undergo substantial three-dimensional morphological changes, analysing chromatophore dynamics may require complementary three-dimensional tracking of the skin surface to correct for out-of-plane deformations and maintain accurate measurement of chromatophore activity.

      Reviewer #2 (Public review):

      Summary:

      The authors developed a computational pipeline named CHROMAS to track and analyse chromatophore dynamics, which provides a wide range of biological analysis tools without requiring the user to write code.

      Strengths:

      (1) CHROMAS is an integrated toolbox that provides tools for different biological tasks such as: segment, classify, track and measure individual chromatophores, cluster small groups of chromatophores, analyse full-body patterns, etc.

      (2) It could be used to investigate different species. The authors have already applied it to analyse the skin of the bobtail squid Euprymna berryi and the European cuttlefish Sepia officinalis.

      (3) The tool is open-source and easy to install. The paper describes in detail the command format to complete each task and provides relevant sample figures.

      Weaknesses:

      (1) The generality and robustness of the proposed pipeline need to be verified through more experimental evaluations. For example, the implementation algorithm depends on relatively specific or obvious image features, clean backgrounds, and objects that do not move too fast.

      (2) The pipeline lacks some kind of self-correction mechanism. If at one moment there is a conflicting match with the previous frames, how does the system automatically handle it to ensure that the tracking results are accurate over a long period of time?

      We thank the reviewer for raising this important point. CHROMAS does rely on relatively clean imaging conditions for optimal performance. However, the computational features of the pipeline — segmentation, tracking, and downstream analysis — have been designed to perform reliably as long as the segmentation models are trained on frames that reflect the diversity of the dataset (e.g., variations in lighting or minor background noise). It is correct, however, that acquiring the necessary quality of input data is both important and non-trivial. The pipeline is designed to work best with high-resolution footage of chromatophores under clear imaging conditions — specifically, with minimal water surface distortion, minimal particulate matter in the water column, and stable focus.

      To mitigate issues arising from motion blur or focus loss, CHROMAS includes an automatic frame quality control step that detects and discards frames that are out of focus, including those where the animal moves too fast for reliable tracking.
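      The response does not state which focus measure the quality control step uses. As an illustration only, one common approach is the variance of a discrete Laplacian: blurred frames have weak edges and therefore a low score. The function names and the threshold below are hypothetical and are not taken from CHROMAS.

```python
import numpy as np


def laplacian_variance(frame: np.ndarray) -> float:
    """Focus score: variance of a 4-neighbour discrete Laplacian over interior pixels."""
    f = frame.astype(np.float64)
    lap = (
        f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
        - 4.0 * f[1:-1, 1:-1]
    )
    return float(lap.var())


def keep_sharp_frames(frames, threshold: float):
    """Return the indices of frames whose focus score reaches the (hypothetical) threshold."""
    return [i for i, frame in enumerate(frames) if laplacian_variance(frame) >= threshold]
```

      In practice the threshold would have to be calibrated on a few frames judged sharp and blurred by eye, since the score depends on resolution, contrast, and subject.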

      To assist future users, we have now added a section under Discussion detailing the recommended recording conditions and video characteristics for effective analysis with CHROMAS. It reads:

      Recommended Video Parameters for Optimal Use of CHROMAS

      The performance of CHROMAS depends on the quality of the input videos. Although the pipeline analyses each frame independently and has no frame rate requirement, we recommend recording at no fewer than 20 frames per second to capture chromatophore dynamics accurately. Sharp, in-focus frames are critical, particularly for moving subjects, where higher shutter speeds help minimize motion blur. For reliable segmentation, each chromatophore should cover at least 10 pixels across its fully expanded diameter. Higher spatial resolution, with chromatophores covering around 50 pixels in diameter, is recommended if sub-chromatophore dynamics are of interest. Recording conditions should minimize background noise, and the water column should be as clear as possible, free of particles or debris. The water surface should be kept as calm and planar as possible to avoid optical artifacts. If wide-angle lenses or other optics that may introduce distortion are used, lens correction algorithms should be applied during preprocessing to compensate for the optical distortions. For long-term tracking applications (e.g., developmental studies), frequent imaging sessions are recommended. Newly differentiated chromatophores are initially light colored (e.g., yellow) and thus visually distinct from mature chromatophores (which are dark); over days to weeks, however, the light chromatophores darken and become increasingly difficult to differentiate from older ones. Recording at appropriate and regular intervals thus helps track individual chromatophores across developmental stages and improves the reliability of long-term analyses. Following these recommendations will help segmentation, tracking, and analysis with CHROMAS.

      CHROMAS does not implement an active self-correction mechanism in the sense of real-time error recovery. Yet, several steps are in place to ensure the reliability of registration and tracking over time. During registration, a set of points is tracked across frames using optical flow. If the displacement of a point between two frames exceeds a biologically plausible threshold, that point is automatically discarded from the registration calculation to prevent error propagation. If too many points are discarded, the registration step fails, preventing the acceptance of a poor alignment.
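      The displacement check described above can be sketched as follows. This is a simplified illustration, not the CHROMAS implementation: it assumes that matching point coordinates in two frames are already available (for example from an optical-flow tracker), and `max_displacement` and `min_fraction_kept` are hypothetical parameter names.

```python
import numpy as np


class RegistrationError(RuntimeError):
    """Raised when too few tracked points survive the displacement filter."""


def filter_tracked_points(prev_pts, next_pts, max_displacement, min_fraction_kept=0.5):
    """Drop point pairs that moved implausibly far between two frames.

    prev_pts and next_pts are (N, 2) arrays of matching coordinates.
    Registration is rejected if fewer than min_fraction_kept of the points remain.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    next_pts = np.asarray(next_pts, dtype=float)
    if prev_pts.shape[0] == 0:
        raise RegistrationError("no tracked points supplied")
    displacement = np.linalg.norm(next_pts - prev_pts, axis=1)
    keep = displacement <= max_displacement
    if keep.mean() < min_fraction_kept:
        raise RegistrationError(
            f"only {int(keep.sum())} of {keep.size} points kept; registration rejected"
        )
    return prev_pts[keep], next_pts[keep]
```

      Failing loudly when too many points are discarded mirrors the behaviour described in the response: a poor alignment is rejected rather than silently accepted.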

      In addition, masterframes (the averages of all aligned frames in a chunk) are generated at the end of the registration process to enable the visual verification of the quality of the mapping.

      During stitching, CHROMAS calculates reprojection errors between chunks, providing a quantitative measure of stitching validity and allowing users to detect and correct potential mismatches.
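      The response does not give the formula used for the reprojection error. A common definition, shown here as an assumption, maps matched points from one chunk into the other with the estimated transform and measures the distance to their observed positions. The 2x3 affine model and the function name are illustrative; the transform model used by CHROMAS may differ.

```python
import numpy as np


def reprojection_errors(src_pts, dst_pts, affine):
    """Per-point distance between transformed src_pts and observed dst_pts.

    affine is a 2x3 matrix [A | t], so a point p maps to A @ p + t.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    affine = np.asarray(affine, dtype=float)
    projected = src @ affine[:, :2].T + affine[:, 2]
    return np.linalg.norm(projected - dst, axis=1)
```

      A summary such as the mean or maximum of these distances, in pixels, gives a single number that can be compared against the typical chromatophore spacing to judge whether a stitch is trustworthy.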

      We have revised the Results section to explicitly highlight the error-checking mechanisms implemented during registration and stitching to maintain tracking accuracy over time.

      Reviewer #1 (Recommendations for the authors):

      (1) Figures 2, 3, 5, 6, and 8 show the bobtail squid; however, all command lines for these figures refer to "sepia_example.dataset".

      We thank the reviewer for noticing this inconsistency. We have corrected the labeling of the dataset name in the command line examples from "sepia_example.dataset" to the neutral term "example.dataset" to avoid any confusion regarding the species used in the figures.

      (2) It's excellent that Chromas includes a manual pre-alignment function. However, it's unclear how the authors determined the registration of selected chromatophores across different ages in the long-term tracking session. Given the rapid growth of cephalopods and presumably skin expansion with increased chromatophores, it would be helpful to provide more details or examples on this process.

      The manual pre-alignment function provides an interactive interface allowing the user to select a set of matching chromatophores across frames from different developmental stages. The accuracy of this process depends on the user's ability to recognize individual chromatophores reliably over time. Critically, it is not necessary to identify all those chromatophores; a representative subset is sufficient to interpolate the spatial mapping and align the surrounding chromatophores.
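      How the spatial mapping is interpolated from the selected subset is not specified here. As a minimal sketch under that caveat, the simplest option is a least-squares affine fit to the manually matched chromatophores, which can then be applied to all other positions. Non-uniform skin growth may call for a more flexible model; the function names below are hypothetical.

```python
import numpy as np


def fit_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine transform mapping src_pts onto dst_pts.

    Requires at least three matched, non-collinear point pairs.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape[0] < 3:
        raise ValueError("at least three matched points are required")
    design = np.hstack([src, np.ones((src.shape[0], 1))])   # (N, 3)
    coeffs, *_ = np.linalg.lstsq(design, dst, rcond=None)   # (3, 2)
    return coeffs.T                                          # (2, 3)


def apply_affine(affine, pts):
    """Map (N, 2) points with a 2x3 affine transform."""
    pts = np.asarray(pts, dtype=float)
    return pts @ affine[:, :2].T + affine[:, 2]
```

      With a handful of well-spread matches, the fitted transform predicts where the remaining chromatophores from the earlier session should appear in the later one.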

      To limit the potential challenges associated with chromatophore development, frequent imaging sessions (every few days) are recommended initially. Excessive intervals between recordings can result in relative displacements among existing chromatophores and the sudden appearance of newly matured chromatophores, both of which complicate manual matching.

      It should be noted that these challenges are not limitations of the CHROMAS pipeline itself, but rather relate to experimental design choices that affect the quality and traceability of the dataset. The exact parameters (e.g., size/duration of the datasets, spatial resolution, frame rate and intervals between recording sessions) to be used must be adapted to each experimental animal, each age, and ultimately, each question.

      Recommended video acquisition parameters, including guidance on recording frequency for long-term chromatophore tracking, have been added to the Discussion section.

      Reviewer #2 (Recommendations for the authors):

      (1) More detailed information should be given, such as operating system requirements, camera frame rate requirements, target size and speed limitations, when chunking videos into usable segments, the minimum length of each segment, etc.

      CHROMAS is platform-independent and requires only a functioning Python 3.9+ environment, regardless of the operating system or OS version, as described in “Methods – Implementation details”.

      Although CHROMAS does not require a specific frame rate, it analyses each frame independently, so the quality of each image, and thus the choice of imaging parameters, is critical for reliable chromatophore segmentation. If an animal remains relatively calm during recording, low shutter speeds will be adequate for image sharpness. Conversely, if the animal moves frequently or rapidly, it will be preferable to use a higher frame rate and a higher shutter speed to minimize motion blur. Recording parameters should therefore be adjusted accordingly, primarily to optimize image clarity and maintain frames in sharp focus.

      The frame rate should be sufficiently high also to capture the fast dynamics of chromatophore expansions and contractions. Although the pipeline has no specific frame rate requirement, we recommend image rates of at least 20 frames per second to sample the temporal patterns of chromatophore activity adequately, based on biological considerations.

      Each chromatophore should be represented by a sufficiently large number of pixels in each recorded image to enable the reliable estimation of its size, shape, and dynamics. If the spatial resolution is too low, individual chromatophores may appear as small pixel clusters, reducing the accuracy of area and shape measurements and introducing quantization artifacts. Based on our experience, we recommend recording conditions that result in each chromatophore covering at least 10 pixels across its diameter when fully expanded to ensure accurate segmentation and quantitative whole-chromatophore analysis. For sub-chromatophore motion analysis, we recommend a minimum of 50 pixels across the fully expanded diameter.

      These considerations relate to optimizing biological sampling and image quality for analysis, and are not technical requirements imposed by CHROMAS itself.

      We added a Discussion section outlining the recommended recording conditions and video parameters to facilitate effective use of CHROMAS.
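
      The recommendations above (at least 20 frames per second, at least 10 pixels across a fully expanded chromatophore, and at least 50 pixels for sub-chromatophore analysis) can be turned into a quick check before recording. The sketch below is only an illustration: `RecordingSetup` and `check_setup` are hypothetical helpers that are not part of CHROMAS, and the field of view and chromatophore size in the usage example are made-up values.

```python
from dataclasses import dataclass

# Thresholds quoted in the response above. They are recommendations for
# biological sampling and image quality, not requirements of CHROMAS.
MIN_FPS = 20.0
MIN_DIAMETER_PX_WHOLE = 10.0   # whole-chromatophore analysis
MIN_DIAMETER_PX_SUB = 50.0     # sub-chromatophore motion analysis


@dataclass
class RecordingSetup:
    """Hypothetical description of a recording setup (not a CHROMAS class)."""
    fps: float
    field_of_view_mm: float           # width of the imaged area
    image_width_px: int               # sensor width in pixels
    chromatophore_diameter_mm: float  # fully expanded diameter

    @property
    def diameter_px(self) -> float:
        # Pixels spanned by one fully expanded chromatophore.
        return (self.chromatophore_diameter_mm
                * self.image_width_px / self.field_of_view_mm)


def check_setup(setup: RecordingSetup, sub_chromatophore: bool = False) -> list:
    """Return a list of human-readable warnings (empty if all checks pass)."""
    warnings = []
    if setup.fps < MIN_FPS:
        warnings.append(f"frame rate {setup.fps:g} fps is below {MIN_FPS:g} fps")
    required = MIN_DIAMETER_PX_SUB if sub_chromatophore else MIN_DIAMETER_PX_WHOLE
    if setup.diameter_px < required:
        warnings.append(
            f"chromatophore spans {setup.diameter_px:.1f} px, "
            f"below the recommended {required:g} px"
        )
    return warnings


# Usage with made-up numbers: a 20 mm field of view imaged at 4000 px width.
setup = RecordingSetup(fps=25, field_of_view_mm=20.0,
                       image_width_px=4000, chromatophore_diameter_mm=0.1)
for message in check_setup(setup, sub_chromatophore=True):
    print(message)
```

      In this made-up setup the resolution is sufficient for whole-chromatophore analysis but not for sub-chromatophore analysis, which is the kind of trade-off the recommendations are meant to surface.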

      (2) This pipeline does not include functionality to correct for lens distortion, which may affect the results when accurate measurement of single chromatophore morphology is required.

      We thank the reviewer for this observation. We agree that lens distortion can affect the accurate measurement of chromatophore morphology if present. However, the current datasets analysed with CHROMAS were recorded using a long macro lens with minimal distortion, and visual inspections as well as quantitative assessments of chromatophore geometry did not indicate measurable optical deformation. We acknowledge that for other imaging setups —particularly those relying on the use of wide-angle lenses— lens distortion could introduce artifacts. In such cases, we recommend applying standard lens distortion correction during preprocessing, prior to analysis with CHROMAS.

      We have also addressed this point in the newly added section under the Discussion.

      (3) How to perform expansion for single chromatophores shown in Figure 6, and how to keep the expansion area consistent?

      The graph in Figure 6 illustrates the expansion of a single chromatophore over time and was generated entirely using the "areas" command and visualization tools available within CHROMAS.

      Spatial consistency is maintained because CHROMAS, through its registration and area extraction steps, tracks the identity of each chromatophore across the video, allowing the same individual to be followed reliably over time.

      (4) Tables 1 and 2: it's better to add the units of the values in each column.

      We thank the reviewer for the suggestion. We have added the appropriate units to each column in Tables 1 and 2 to improve clarity.

    1. In recent years, cell segmentation techniques have played a critical role in the analysis of biological images, especially for quantitative studies. Deep learning-based cell segmentation models have demonstrated remarkable performance in segmenting cell and nucleus boundaries; however, they are typically tailored to specific modalities or require manual tuning of hyperparameters, limiting their generalizability to unseen data. Comprehensive datasets that support both the training of universal models and the evaluation of various segmentation techniques are essential for overcoming these limitations and promoting the development of more versatile cell segmentation solutions. Here, we present CellBinDB, a large-scale multimodal annotated dataset established for these purposes. CellBinDB contains more than 1,000 annotated images, each labeled to identify the boundaries of cells or nuclei, including 4’,6-Diamidino-2-Phenylindole (DAPI), Single-stranded DNA (ssDNA), Hematoxylin and Eosin (H&E), and Multiplex Immunofluorescence (mIF) staining, covering over 30 normal and diseased tissue types from human and mouse samples. Based on CellBinDB, we benchmarked seven state-of-the-art and widely used cell segmentation technologies/methods, and further analyzed the effects of four cell morphology indicators and image gradient on the segmentation results.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf069 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer: Shan Raza

      The paper presents a multimodal data set for cell segmentation and benchmarking. The major strengths of the data set are its multimodal nature and its inclusion of both mouse and human tissue. The paper analyses existing data sets and the performance of state-of-the-art methods. However, the authors missed one of the biggest data sets for cell segmentation and classification, which includes more than 500,000 annotated nuclei in H&E (https://www.sciencedirect.com/science/article/pii/S1361841523003079).

      The CoNIC challenge paper also analyses state-of-the-art nuclei segmentation and classification methods. The authors should add one of the best-performing models to their analysis. I would also suggest that the authors include PQ and FROC among the metrics used to analyse the results, as these are commonly used in this domain for comparison. I would also suggest comparing the results with HoVerNet or HoVerNext (https://github.com/digitalpathologybern/hover_next_train), which are state-of-the-art algorithms for nuclei instance segmentation. The code for these algorithms is publicly available.
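
      For readers unfamiliar with the PQ metric mentioned in this review, the following is a minimal sketch of panoptic quality in its usual form: the sum of the IoU values of matched instance pairs divided by TP + 0.5 FP + 0.5 FN, where a predicted and a ground-truth instance match if their IoU exceeds 0.5. It assumes instances are given as non-overlapping sets of pixel coordinates; the function name and the toy masks are illustrative and are not taken from the paper or from any benchmark code.

```python
def panoptic_quality(gt_instances, pred_instances, iou_threshold=0.5):
    """Panoptic quality for one image.

    gt_instances / pred_instances: lists of sets of (row, col) pixels,
    one set per instance. With non-overlapping instances and a threshold
    of 0.5, each ground-truth instance can match at most one prediction.
    """
    matched_ious = []
    used_predictions = set()
    for gt in gt_instances:
        for j, pred in enumerate(pred_instances):
            if j in used_predictions:
                continue
            union = len(gt | pred)
            iou = len(gt & pred) / union if union else 0.0
            if iou > iou_threshold:
                matched_ious.append(iou)
                used_predictions.add(j)
                break
    tp = len(matched_ious)
    fp = len(pred_instances) - tp   # predictions without a match
    fn = len(gt_instances) - tp     # ground-truth instances without a match
    denominator = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denominator if denominator else 0.0


# Toy example: one good match (IoU 0.75), one missed cell, one spurious cell.
gt = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
pred = [{(0, 0), (0, 1), (1, 0)}, {(9, 9)}]
print(panoptic_quality(gt, pred))  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

      Unlike a plain F1 score, PQ penalizes both detection errors (through the denominator) and imprecise boundaries (through the IoU sum), which is likely why it is suggested here for instance segmentation.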

    1. BioSample is a comprehensive repository of experimental sample metadata, playing a crucial role in providing a comprehensive archive and enabling experiment searches regardless of type. However, the difficulty in comprehensively defining the rules for describing metadata and limited user awareness of best practices for metadata have resulted in substantial variability depending on the submitter. This inconsistency poses significant challenges to the findability and reusability of the data. Given the vast scale of BioSample, which hosts over 40 million records, manual curation is impractical. Rule-based automatic ontology mapping methods have been proposed to address this issue, but their effectiveness is limited by the heterogeneity of BioSample metadata. Recently, large language models (LLMs) have gained attention in natural language processing and are expected to be promising tools for automating metadata curation. In this study, we evaluated the performance of LLMs in extracting cell line names from BioSample descriptions using a gold-standard dataset derived from ChIP-Atlas, a secondary database of epigenomics experiment data, which manually curates samples. Our results demonstrated that LLM-assisted methods outperformed traditional approaches, achieving higher accuracy and coverage. We further extended this approach to extraction of information about experimentally manipulated genes from metadata where manual curation had not yet been applied in ChIP-Atlas. This also yielded successful results for the usage of the database, which facilitates more precise filtering of data and prevents misinterpretation caused by inclusion of unintended data. These findings underscore the potential of LLMs to improve the findability and reusability of experimental data in general, significantly reducing user workload and enabling more effective scientific data management.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf070 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer: Christopher Tabone

      This manuscript evaluates the use of large language models (LLMs) to improve the consistency and usefulness of BioSample metadata. The authors focus on extracting specific biological terms from free-text sample descriptions: first, identifying cell line names (using a curated gold-standard for evaluation), and second, identifying experimentally modulated gene names (in a scenario without prior manual curation). An open-source 70B LLM (Llama 3.1) was used and its performance was compared against a conventional ontology-mapping pipeline (MetaSRA). Overall, the study is well-motivated - addressing the challenge of heterogeneous metadata - and the approach is generally sound and well documented. Below, I address specific aspects of the work in detail:

      Methodological Appropriateness and Controls: The methods are appropriate to the study's aims and are described with detail. The two-part evaluation (cell line extraction and gene name extraction without prior curation) aligns well with the goal of demonstrating LLM utility in metadata curation. The authors took care to construct a gold-standard dataset for cell line extraction by leveraging ChIP-Atlas's manually curated sample annotations. This approach avoids starting from scratch and ensures the evaluation is grounded in experimental metadata. The sample selection strategy is well justified: using equal numbers of ChIP-seq and ATAC-seq samples to control for the presence/absence of protein names (a potential confounder for detecting cell lines), avoiding duplicate projects and identical terms, and restricting to human samples to leverage the Cellosaurus ontology. These controls strengthen the evaluation by preventing bias (e.g. one project dominating results or trivial cases duplicating answers). The LLM pipeline is clearly outlined (Figure 2) - the model is prompted with BioSample attributes to extract a representative cell line term. Importantly, the authors compare this LLM-assisted pipeline against an existing rule-based method (the MetaSRA ontology mapping pipeline). This serves as an essential control/baseline to quantify the improvement gained by using an LLM. For the second task (extracting modulated gene names), where no curated baseline exists, the authors sample thousands of BioSample entries and perform manual evaluation of the LLM's outputs. While manual checking is necessary here, the manuscript could clarify the evaluation procedure (e.g. how many evaluators or what criteria were used) to assure readers of consistency. Overall, the experimental design is solid. The necessary details (model used, prompt design, parameter settings like temperature=0 for reproducibility) are all provided, and the authors have made their code publicly available, which aids reproducibility. The methodology is transparent and should allow others to replicate or build upon the work.

      Support for Conclusions by Data: The conclusions are, for the most part, well supported by the data presented. In the cell line extraction task, the LLM-based method clearly outperforms the traditional MetaSRA pipeline in both accuracy and coverage (Table 4). For example, the LLM pipeline achieved substantially higher coverage (93.0% vs 72.1% for MetaSRA) without sacrificing accuracy (~92.3% vs 90.3%), and it also showed improved precision in identifying non-cell line samples. These results validate the authors' claim that LLMs can more flexibly and comprehensively interpret metadata, mapping many more actual cell line samples to ontology terms while maintaining low false-positive rates. The data support the conclusion that the LLM approach enhances metadata findability (since far more samples get correctly annotated) and does so with high reliability. The authors appropriately note that the conventional method's conservative strategy yields high precision at the cost of leaving many samples unmapped, whereas the LLM can confidently map a greater portion of samples. This finding is well substantiated by the numbers and the error analysis in Table 5 (which categorizes the few failure cases of the LLM, such as confusion with derivative cell lines or missing a cell line when certain keywords were absent). In the gene name extraction task, the authors report that the LLM identified at least one gene in 600 out of 3,723 tested samples, with an overall accuracy of ~80.3% for those outputs (about 91.6% accuracy on gene names themselves, and 84.7% on the associated modulation method). This demonstrates that the LLM can successfully parse complex descriptions to find gene perturbations in a majority of cases. While there is no baseline for direct comparison here, these results are consistent with the idea that LLMs can extend curation to new information types not yet curated (in this case, finding manipulated genes where an ontology or curated list didn't exist). The authors' conclusions about the utility of this - for example, that it could allow users to filter out experiments with gene knockouts/knockdowns to avoid confounding effects - are reasonable extrapolations from the data. The discussion correctly notes that coverage for this gene task wasn't evaluated (since no gold standard exists) and acknowledges that some fraction of relevant cases might be missed. All major conclusions (LLM outperforms rule-based methods; LLM extraction of new metadata is feasible and useful) are backed by the evidence provided. The authors also contextualize their findings by noting limitations and practical considerations (e.g. the processing throughput of ~400 samples/hour and the challenge of scaling to 40 million records). This adds credibility to their interpretation that LLM-based curation will need further resources or model improvements to handle the entire database. In summary, the data presented are analyzed in depth (with relevant tables, figures, and a breakdown of error types), and they support the paper's conclusions well. I have no concerns that the authors are overstating their results.

      Language Clarity and Quality: The manuscript is written in generally clear and professional English. The authors note that they translated the draft from Japanese with assistance from ChatGPT, and the result is readable and scientifically appropriate. The overall clarity is good - important terms are defined, and the narrative flows logically from the motivation to methods, results, and discussion. I did not encounter ambiguities that impede understanding of the science. There are only a few minor issues in language usage and grammar that require attention. For example, there is a small typo in the description of gene overexpression ("achieved by trasfection of a plasmid…" on page 19) - "trasfection" should be "transfection" (unless this typo was carried over from the original prompt). Another example is the sentence "the outcomes of this study can handle these errors to rescue the affected published data for further use," which is a bit awkward in phrasing - perhaps reword to clarify that the methods developed can help correct metadata errors from submitted data. These are relatively minor edits; the manuscript does not require heavy language revision, just light editing for a few misspellings and stylistic "smoothing". The structure of the paper is appropriate, with a clear Introduction and well-labeled sections (Methods, Results/Discussion, Limitations, etc.). Data presentation is also clear: figures and tables are easy to interpret, and captions are explanatory. For example, the flowchart in Figure 2 and the definitions in Figure 3 clearly help in the understanding of the pipeline and metrics. In summary, with minor editorial changes, the quality of language and presentation will be suitable for publication.

      Statistical Analysis and Data Presentation: I am able to assess all the statistics and quantitative analyses in the manuscript, and they appear appropriate. The study primarily uses descriptive performance metrics (accuracy, coverage, precision, recall) to evaluate the extraction tasks - these are standard and well defined (the text and Figure 3 provide clear definitions of each metric in the context of the task). The comparisons between the LLM pipeline and the MetaSRA pipeline are straightforward to interpret. The authors did not perform complex statistical tests (e.g., no p-values are reported), which can be justified given that the magnitude and consistency of the improvements are evident and the evaluation emphasizes practical performance metrics rather than hypothesis testing. However, the manuscript states in Supplementary Table 1 that "no significant differences were observed" between ChIP-seq and ATAC-seq subsets. If the authors intend "significant" to indicate statistical significance, it would be necessary to include the specific statistical test used along with associated test statistics and p-values to substantiate this claim. If no formal statistical testing was conducted, it would be more accurate and clearer to rephrase this as a qualitative observation rather than implying formal statistical support. All underlying data needed to interpret the results are provided either in the main figures/tables or supplementary material. The presentation of results is clear and transparent: Table 4 quantitatively summarizes the performance of each pipeline, and Table 5 qualitatively categorizes the errors made by the LLM. I have no other concerns about the appropriateness of statistical methods used - the evaluation metrics are suitable for information extraction tasks, and the sample sizes (600 samples for the cell line task, and thousands scanned for the gene task) are adequate to support the conclusions. In terms of data transparency, the manuscript indicates that outputs and code are available (with a GitHub repository provided), which will allow others to reproduce the analysis.

      Additional comments and suggestions: Beyond the points above, I have a few minor suggestions to further strengthen the manuscript. First, it would be helpful if the authors could clarify in the Methods how the manual evaluation of gene name extraction was performed—for example, whether multiple curators independently reviewed the outputs or if any consensus procedure was employed to resolve ambiguous cases. Providing this detail would add transparency to the accuracy figures reported, although the existing explanation about handling ambiguous cases (e.g., fusion genes) is already helpful. Second, given the manuscript's emphasis on a zero-shot LLM approach, it would be beneficial for the authors to briefly discuss whether alternative strategies, such as fine-tuning smaller language models, were considered. This would more clearly position the study within the broader landscape of metadata curation techniques. Third, the authors describe the use of the locally deployed Llama 3.1 model and emphasize its advantages regarding data privacy and scalability. Since these benefits are significant for practical adoption, it would further strengthen the manuscript if the authors explicitly highlight practical considerations, such as specific hardware requirements (in addition to the graphics card usage already included) and runtime performance benchmarks. Finally, as mentioned earlier, the authors mention in Supplementary Table 1 that "no significant differences were observed" between ChIP-seq and ATAC-seq samples. If the term "significant" here is meant to indicate statistical significance, please include details of the specific statistical test and associated values (e.g., test statistics and p-values) that substantiate this conclusion. If no formal statistical testing was performed, it would be more appropriate to rephrase this statement to indicate a qualitative observation rather than imply statistical testing. These points are relatively minor and do not indicate fundamental issues with the manuscript.

      Recommendation: In summary, this is a strong manuscript that addresses a pertinent problem in biological data management using modern LLM tools. The methods are sound and well controlled, the results are convincing, and the authors have been appropriately cautious and thorough in their analysis. I recommend minor revisions for this manuscript. The revisions needed are primarily editorial (minor language fixes and clarifications), with one note about statistics, and do not require additional experiments. With those addressed, the work should be suitable for publication in GigaScience.
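
      To make the statistical point in this review concrete: one simple way to back a claim of "no significant differences" between two subsets is a two-proportion z-test on the number of correctly handled samples in each subset. The sketch below uses only the Python standard library. The counts in the usage example are hypothetical placeholders, not values from the manuscript, and the choice of a z-test is an assumption made for illustration; for small counts an exact test such as Fisher's would be more appropriate.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Uses the pooled standard error, which is the
    usual choice under the null hypothesis of equal proportions.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        # Both groups are all-success or all-failure: no observable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: correct extractions out of 300 samples per subset.
z, p = two_proportion_z_test(success_a=279, n_a=300, success_b=275, n_b=300)
print(f"z = {z:.3f}, p = {p:.3f}")
```

      Reporting the test used, the statistic, and the p-value in this form would address the reviewer's request; if no such test was run, the wording can instead be changed to a qualitative statement.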

    1. Despite the surge in data acquisition, there is a limited availability of tools capable of effectively analyzing microbiome data that identify correlations between taxonomic compositions and continuous environmental factors. Furthermore, existing tools also do not predict the environmental factors in new samples, underscoring the pressing need for innovative solutions to enhance our understanding of microbiome dynamics and fill the prediction gap. Here, we introduce CODARFE, a novel tool for sparse compositional microbiome-predictors selection and prediction of continuous environmental factors. We tested CODARFE against four state-of-the-art tools in two experiments. First, CODARFE outperformed predictor selection in 21 out of 24 databases in terms of correlation. Second, among all the tools, CODARFE achieved the highest number of previously identified bacteria linked to environmental factors for human data (that is, at least 7% more). We also tested CODARFE in a cross-study, using the same biome but under different external effects (e.g., ginseng field and cattle for arable soil, and HIV and Crohn’s disease for human gut), using a model trained on one dataset to predict environmental factors on another dataset, achieving a mean absolute percentage error of 11%. Finally, CODARFE is available in five formats, ranging from a Windows version with a graphical interface to installable source code for Linux servers and an embedded Jupyter notebook available at MGnify: https://github.com/alerpaschoal/CODARFE.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf055), which carries out open, named peer-review. The following review is published under a CC-BY 4.0 license:

      Reviewer: Jaak Truu

      This manuscript addresses key aspects of microbiome data analysis, particularly in relating continuous variables to microbiome data and utilizing microbiome data to predict variables of interest. The data analysis approach is well-articulated; however, there is a notable omission regarding the derivation of the microbiome datasets. While the sources of these datasets are mentioned, it remains unclear whether the authors processed the initial data to produce the count tables used as input or if these tables were directly adopted from the original publications. Given that the data in the main text are derived from studies based on 16S rDNA sequencing, variations in data processing pipelines between publications could introduce significant variability. Although the manuscript discusses the importance of the sequenced 16S rDNA region and the similarity of the environments from which the samples were obtained, it does not address the impact of the initial data processing pipeline (including taxonomy assignment).

      Additionally, the number of samples in each dataset is not provided in the tables.

      The manuscript includes a comparison of the proposed method with other tools; however, it omits MaAsLin (Microbiome Multivariable Association with Linear Models), which has been applied far more extensively in microbiome data analysis than the tools included in the current manuscript. Incorporating a comparison with MaAsLin would enhance the comprehensiveness of the evaluation.

    1. Background: Understanding genotype-environment interactions of plants is crucial for crop improvement, yet limited by the scarcity of quality phenotyping data. This data note presents the Field Phenotyping Platform 1.0 data set, a comprehensive resource for winter wheat research that combines imaging, trait, environmental, and genetic data. Findings: We provide time series data for more than 4,000 wheat plots, including aligned high-resolution image sequences totaling more than 153,000 aligned images across six years. Measurement data for eight key wheat traits is included, namely canopy cover values, plant heights, wheat head counts, senescence ratings, heading date, final plant height, grain yield, and protein content. Genetic marker information and environmental data complement the time series. Data quality is demonstrated through heritability analyses and genomic prediction models, achieving accuracies aligned with previous research. Conclusions: This extensive data set offers opportunities for advancing crop modeling and phenotyping techniques, enabling researchers to develop novel approaches for understanding genotype-environment interactions, analyzing growth dynamics, and predicting crop performance. By making this resource publicly available, we aim to accelerate research in climate-adaptive agriculture and foster collaboration between plant science and machine learning communities.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf051), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer: Abhishek Gogna

      Thank you for the submission. The dataset surely holds value for the plant breeding community, but my major concerns are (1) the availability of genetic data and (2) non-conformity to MIAPPE standards (https://www.miappe.org/). These restrict the value of the otherwise excellent publication. I would welcome a submission addressing these major points. In addition, I have some minor points for specific sections. Please use the strings in quotation marks ("") to locate the specific sections.

      1. Context
         * Change of Equipment: Please indicate how the change of equipment from TLS to drone affects data interoperability.
         * "Figure 2, gray bars": Kindly update Figure 2 to clarify the representation of the gray bars.
         * "Heads were annotated": Does this mean that not all relevant images were annotated? If so, please modify the title to avoid confusion.

      2. Description of FAIR
         * Please revise this section. Both links listed under "Findable" and "Accessible" are eligible for these tags. Please modify "Interoperability" with reference to the publication listed in the "Re-use Potential."

      3. Reference measurements
         * "Senescence was": Was this measurement done for all relevant images? Please include this information.
         * "Adjusted genotype means with year calculation": Please add variance decomposition data for traits.

      4. Compilation as Data set
         * "pure GABI-WHEAT set for the extended set": Please revise this sentence for clarity.

      5. Heritabilities of intermediate and target traits
         * "y of the public marker": Please revise the sentence for clarity.

      6. Genomic prediction ability of unseen multi-environment trial
         * Is the CDC data part of the data publication? Please add this information.

      7. Example 1 to 6
         * Please revise all code for consistency and updated results. Also, include the necessary packages required to run the code.

      8. Availability of Source code and Requirement
         * Please create connectivity between repositories and add descriptive README files outlining their usage. Additionally, please provide instructions on how individual repositories may be used.

      I appreciate your attention to these points and believe that addressing them will strengthen your manuscript.

    1. Background: Variant Call Format (VCF) is the standard file format for interchanging genetic variation data and associated quality control metrics. The usual row-wise encoding of the VCF data model (either as text or packed binary) emphasises efficient retrieval of all data for a given variant, but accessing data on a field or sample basis is inefficient. Biobank scale datasets currently available consist of hundreds of thousands of whole genomes and hundreds of terabytes of compressed VCF. Row-wise data storage is fundamentally unsuitable and a more scalable approach is needed. Results: Zarr is a format for storing multi-dimensional data that is widely used across the sciences, and is ideally suited to massively parallel processing. We present the VCF Zarr specification, an encoding of the VCF data model using Zarr, along with fundamental software infrastructure for efficient and reliable conversion at scale. We show how this format is far more efficient than standard VCF based approaches, and competitive with specialised methods for storing genotype data in terms of compression ratios and single-threaded calculation performance. We present case studies on subsets of three large human datasets (Genomics England: n=78,195; Our Future Health: n=651,050; All of Us: n=245,394) along with whole genome datasets for Norway Spruce (n=1,063) and SARS-CoV-2 (n=4,484,157). We demonstrate the potential for VCF Zarr to enable a new generation of high-performance and cost-effective applications via illustrative examples using cloud computing and GPUs. Conclusions: Large row-encoded VCF files are a major bottleneck for current research, and storing and processing these files incurs a substantial cost. The VCF Zarr specification, building on widely-used, open-source technologies, has the potential to greatly reduce these costs, and may enable a diverse ecosystem of next-generation tools for analysing genetic variation data directly from cloud-based object stores, while maintaining compatibility with existing file-oriented workflows.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf049), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer: Nezar Abdennur

      The authors present VCF Zarr, a specification that translates the variant call format (VCF) data model into an array-based representation for the Zarr storage format. They also present the vcf2zarr utility to convert large VCFs to Zarr. They provide data compression and analysis benchmarks comparing VCF Zarr to existing variant storage technologies using simulated genotype data. They also present a case study on real world Genomics England aggV2 data.

      The authors' benchmarks overall show that VCF Zarr has superior compression and computational analysis performance at scale relative to data stored as row-oriented VCF and that VCF Zarr is competitive with specialized storage solutions that require similarly specialized tools and access libraries for querying. An attractive feature is that VCF Zarr allows for variant annotation workflows that do not require full dataset copy and conversion. Another key point is that Zarr is a high-level spec and data model for the chunked storage of n-d arrays, rather than a byte-level encoding designed specifically around the genomic variant data type. I personally have used Zarr productively for several applications unrelated to statistical genetics. While Zarr VCF mildly underperforms some of the specialized formats (Savvy in compute, Genozip in compression) in a few instances, I believe the accessibility, interoperability, and reusability gains of Zarr make the small tradeoff well worthwhile.

      Because Zarr has seen heavy adoption in other scientific communities like the geospatial and Earth sciences, and is well integrated in the scientific Python stack, I think it holds potential for greater reusability across the ecosystem. As such, I think the VCF Zarr spec is a highly valuable if not overdue contribution to an entrenched field that has recently been confronted by a scalability wall.

      Overall, the paper is clear, comprehensive, and well written.

      Some high-level comments:

      * The benefits for large scientific datasets to be analysis-ready cloud-optimized (ARCO) have been well articulated by Abernathey et al., 2021. However, I do think that the "local"/HPC single-file use case is still important and won't disappear any time soon, and for some file system use cases, expansive and deep hierarchies can be performance limiting (this was hinted at in one of the benchmarks). In this scenario would a large Zarr VCF perform reasonably well (or even better on some file systems) via a single local zip store?
      * The description of the intermediate columnar format (ICF) used by vcf2zarr is missing some detail. At first I got the impression it might be based on something like Parquet, but running the provided code showed that it consists of a similar file-based chunk layout to Zarr. This should be clarified in the manuscript.
      * The authors discuss the possibility of storing an index mapping genomic coordinates to chunk indexes. Have Zarr-based formats in other fields like geospatial introduced their own indexing approaches to take inspiration from?
      * Since VCF Zarr is still a draft proposal, it could be useful to indicate where community discussions are happening and how potential new contributors can get involved, if possible. This doesn't need to be in the paper per se, but perhaps documented in the spec repo.

      Minor comments:

      * In the background: "For the representation to be FAIR, it must also be accessible," -- A is for "accessible", so "also" doesn't make sense.
      * "There is currently no efficient, FAIR representation...". Just a nit and feel free to ignore, but the solution you present is technically "current".
      * In Figure 2, the zarr line is occluded by the sav line and hard to see.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study uses a cell-based computational model to simulate and study T cell development in the thymus. They initially applied this model to assess the effect of the thymic epithelial cells (TECs) network on thymocyte proliferation and demonstrated that increasing TEC size, density, or protrusions increased the number of thymocytes. They postulated and confirmed that this was due to changes in IL7 signalling and then expanded this work to encompass various environmental and cell-based parameters, including Notch signalling, cell cycle duration, and cell motility. Critical outcomes from the computational model were tested in vivo using medaka fish, such as the role of IL-7 signalling and minimal effect of Notch signalling.

      Strengths:

      The strength of the paper is the use of computational modelling to obtain unique insights into the niche parameters that control T cell development, such as the role of TEC architecture, while anchoring those findings with in vivo experiments. I can't comment on the model itself, as I am not an expert in modelling; however, the conclusions of the paper seem to be well supported by the model.

      Weaknesses:

      One potential issue is that many of the conclusions are drawn from the number of thymocytes, or related parameters such as the thymic size or proliferation of the thymocytes. The study only touches briefly on the influence of the thymic niche on other aspects of thymocyte behaviour, such as their differentiation and death.

      We thank the reviewer for this constructive feedback. Indeed, the strength of our approach lies in the close cooperation between modellers and experimentalists. One advantage of the model is its ability to manipulate variables that are challenging or even impossible to vary experimentally with current tools, such as TEC dimensions.

      The reviewer rightly pointed out that our validation focuses on comparing cell numbers or organ size as a proxy for cell numbers.

      In our previous study (Aghaallaei et al., Science Advances, 2021), we focused more on differentiation and used the computational model to predict how proportions of T-cell sublineages would vary according to different parameter values, including the IL-7 availability. One of the initial inspirations for the focus on proliferation in this manuscript was the observation in this previous work that overexpression of IL-7 in the niche resulted in overproliferation. We also focused on proliferation and organ size because these are more easily measured in experimental conditions with the tools that we have available in medaka, allowing better comparisons to the computational results.

      Regarding cell death, our experimental observations do not suggest that it plays a role before the final stages of T cell maturation. Hence, the model does not include apoptosis before this stage either.

      However, we do agree that taking a closer look at the regulation of differentiation and cell death would be an exciting avenue for future study!

      Please see our response to author recommendations below for more information on these points. Moreover, to make the model more accessible to non-experts, we have created new schematic figures, which can be found in the Appendix of the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors have worked up a "virtual thymus" using EPISIM, which has already been published. Attractive features of the computational model are stochasticity, cell-to-cell variability, and spatial heterogeneity. They seek to explore the role of TECs, which release IL-7, an important factor in thymocyte division.

      In the model, ordinary clones have IL7R levels chosen from a distribution, while "lesioned" clones have an IL7R value set to the maximum. The observation is that the lesioned clones are larger families, but the difference is not dramatic. This might be called a cell-intrinsic mechanism. One promising cell-extrinsic mechanism is mentioned: if a lesioned clone happens to be near a source of IL-7 and begins to proliferate, the progeny can crowd out cells of other clones and monopolise the IL-7 source. The effect will be more noticeable if sources are rare, so is seen when the TEC network is sparse.

      Strengths:

      Thymic dysfunctions are of interest, not least because of T-ALL. New cells are added, one at a time, to simulate the conveyor belt of thymocytes on a background of stationary cells. They are thus able to follow cell lineages, which is interesting because one progenitor can give rise to many progeny.

      There are some experimental results in Figures 4, 5 and 6. For example, il7 crispant embryos have fewer thymocytes and smaller thymi, but increasing IL-7 availability produces large thymi.

      Weaknesses:

      On the negative side, like most agent-based models, there are dozens of parameters and assumptions whose values and validity are hard to ascertain.

      The stated aim is to mimic a 2.5-to-11 day-old medaka thymus, but the constructed model is a geometrical subset that holds about 100 cells at a time in a steady state. The manuscript contains very many figures and lengthy descriptions of simulations run with different parameter values and assumptions. The abstract and conclusion did not help me understand what exactly has been done and learned. No attempt to synthesise observations in any mathematical formula is made.

      The reviewer raises several important points to consider when working with mathematical or computational models.

      As in many other agent-based models, we agree that our model makes use of many parameters. Many of these parameters summarize multiple steps and are treated as phenomenological, i.e. they do not represent a microscopic event such as the rate of an individual chemical reaction, but more high-level processes such as "rate of differentiation". Realistically, this process should consist of cascades of pathway components that regulate transcription factors.

      In the supplementary material of our previous work (Aghaallaei et al., Science Advances, 2021) we provided an in-depth explanation of the mathematical formulation and rationale behind our choices in relation to the available biological data to select assumptions and restrict parameter value ranges. Four parameters that could not be characterized with pre-existing data, but which were crucial to the model's predictions, were studied in detail in that publication. Hence, the submitted manuscript starts with a well-calibrated model that has been tailored for the medaka thymus. The submitted manuscript explores the robustness of the system to lesions, which we conceptualize as alterations in parameter values. We were surprised by how well the model recapitulated the time scales of overproliferation in the thymus of medaka embryos, which further supports the notion that our previous model calibration was successful.

      Another important point raised by the reviewer is that the "validity [of parameters and assumptions is] hard to ascertain". We agree, which is precisely the reason why we aim to test the model's predictions through experimentation. Importantly, a model does not need to be perfect to be useful. For example, in the submitted manuscript we observed a discrepancy between model predictions and experimental results that led us to hypothesize negative feedback regulation from the proliferative state to differentiation. 

      Thus, a major strength of modelling approaches is that they allow us to identify erroneous or missing assumptions about the structure of the regulatory interaction network and its parametrization, which can advance our scientific understanding of the underlying biology. Using models as an investigative tool is fundamental to the philosophy of systems biology (Kitano, Science, 2002), and is what we strive for.

      The reviewer rightfully points out that we only represent a geometric subset of the organ. In our preliminary work, we considered representing the full three-dimensional thymus; however, we later simplified our approach, as the organ is a symmetric ellipsoid at this developmental stage. This decision vastly reduced our computational costs, enabling us to explore parameter space more effectively.

      Nevertheless, we apologize if the submitted manuscript did not sufficiently emphasize the main insights of the paper, model limitations, and model construction. In the revised manuscript, we have improved the abstract and discussion sections to explicitly highlight the main results and limitations. We have also provided further details of the model's structure and underlying logic in the appendix.

      Reviewer #3 (Public review):

      Summary:

      Tsingos et al. seek to advance beyond the current paradigm that proliferation of malignant cells in T-cell acute lymphoblastic leukemia occurs in a cell-autonomous fashion. Using a computational agent-based model and experimental validation, they show instead that cell proliferation also depends on interaction with thymic epithelial cells (TEC) in the thymic niche. One key finding is that a dense TEC network inhibits the proliferation of malignant cells and favors the proliferation of normal cells, whereas a sparse TEC network leads to rapid expansion of malignant thymocytes.

      Strengths:

      A key strength of this study is that it combines computational modeling using an agent-based model with experimental work. The original modeling and novel experimental work strengthen each other well. In the agent-based model, the authors also tested the effects of varying a few key parameters of cell proliferation.

      Weaknesses:

      A minor weakness is that the authors did not conduct a global sensitivity analysis of all parameters in their agent-based model to show that the model is robust to variation, which would demonstrate that their results would still hold under a reasonable level of variation in the model and model parameters. This is a minor point, and such a supporting study would end in an appendix or supplement.

      The reviewer highlights the lack of a global sensitivity analysis as a minor weakness. 

      In our previous work (Aghaallaei et al., Science Advances, 2021), we studied parameter sensitivity for some parameters, while in the submitted manuscript, we extended this exploration to parameters that we expected to be the most meaningful for cell proliferation.

      In the revised version of the manuscript, we have included an additional supplementary figure alongside Figure 4 to show the effect of changing parameters in "control" simulations lacking a lesioned clone. These data are also provided in the source data to Figure 4. While this does not constitute an exhaustive exploration of all parameter space, it provides a useful overview of the effect of the studied parameters on thymocyte population size in the absence of lesioned clones.

      Response to reviewer recommendations

      In the revision, we have improved the manuscript to address the reviewers’ points. The following is an overview of the changes to the manuscript:

      • We wrote an extensive Appendix to better explain the model implementation.

      • The Abstract was rewritten to improve clarity on what was done and to highlight the main findings.

      • Subheadings to paragraphs were rewritten to better emphasize the main findings.

      • Font sizes in Figure 2J and Figure 4E were increased to improve readability.

      • The spacing of graphical elements in the legend of Figure 4E was improved.

      • An error in Figure 5B was corrected (the legend labels had been accidentally swapped).

      • A new supplementary figure to Figure 4 shows the sensitivity of clone size in control simulations for a subset of the tested parameter combinations.

      • The Conclusion section was rewritten to better highlight limitations of the study and improve the summary of the main findings.

      • Minor wording improvements were done throughout the text to improve readability.

      In the following we respond to the reviewers’ individual recommendations.

      Reviewer #1 (Recommendations for the authors):

      I am not an expert in modelling, so I apologise if I missed these points in the manuscript. I am slightly confused about how differentiation and death are included in the model. At the beginning of the results you mention that you model a 5 um slice, is it known which stages of development occur in that section of the thymus? 

      We thank the reviewer for this question and appreciate the opportunity to clarify. Our virtual thymus is based on the medaka embryonic thymus, which we have extensively characterized using functional analyses and noninvasive in toto imaging (Bajoghli et al., Cell, 2009; Bajoghli et al., J Immunology, 2015; Aghaallaei et al., Science Advances, 2021; Aghaallaei, Eur J Immunology, 2022). These studies allowed us to map thymocyte developmental stages and migratory trajectories within the spatial context of a fully functional medaka thymus (see Figure 7 in Bajoghli et al., J Immunology, 2015).

      To simplify the biological system without compromising model fidelity, we chose to simulate a representative 5 µm slice from the ventral half of the thymus. Importantly, the medaka thymus is a symmetric organ (Bajoghli et al., J Immunology 2015), hence this slice captures all key events of T-cell development, including thymus homing, differentiation, proliferation, selection, and egress akin to our in vivo observations (see Figure 7 in Bajoghli et al., 2015 and Figure 7a in Aghaallaei et al., Science Advances, 2021).

      Furthermore, our model incorporates the spatial organization of the thymic cortex and medulla by including two types of thymic epithelial cells (TECs): cortical TECs positioned on the outer side, and medullary TECs on the inner side (see Figure Supplement 7 in Aghaallaei et al., Science Advances, 2021). Differentiation and cell death are modeled as discrete steps along the developmental trajectory, informed by our in vivo observations.

      We apologize to the reviewer if the workings of the model were not sufficiently clear in the original manuscript. To address this, and as also requested by reviewer 2, we provided an extensive Appendix in the revised version of the manuscript that also includes visual summaries of the model logic in the form of intuitive flowcharts.

      And is it known, or do you factor in, whether there are changes in the responsiveness of the thymocytes to signals, such as notch and IL7, depending on their state of differentiation?

      We have previously examined the roles of IL-7 (Aghaallaei et al., Science Advances, 2021) and Notch1 (Aghaallaei et al., Europ J Immunology, 2022) signaling in the medaka thymus. These studies demonstrated that T cell progenitors are responsive to both IL-7 and Notch signaling, whereas more differentiated, non-proliferative thymocytes are unresponsive to IL-7. Our in vivo observations further suggest that mature thymocytes require Notch signaling during the thymic selection process. This appears to be a species-specific phenomenon (Aghaallaei et al., Europ J Immunology, 2022).

      In the computational model, we include this state-specific responsiveness by incorporating a dependence on IL-7 and Notch signaling in the cellular decision to commit to the cell cycle (see Appendix Figure 6, and Appendix section X.) and in the decision of differentiating into αβ<sup>+</sup> or γδ<sup>+</sup> T cell subtypes (see Appendix Figure 5, and Appendix section IX.). Although the model still calculates pathway signaling activity for thymocytes in the differentiated stage belonging to the αβ<sup>+</sup> or γδ<sup>+</sup> subtype, this signaling activity has no downstream consequences for the cells’ behavior in the model.

      Note that in the computational model we do not incorporate feedback loops that regulate pathway activity (for example, it could be that thymocytes upregulate the IL7R receptor at some point in their differentiation trajectory; in the absence of species-specific knowledge of such regulatory feedbacks, we have chosen not to include any in our model).

      And you mention the stages of development are incorporated into the model but the main output that you discuss is thymocyte number or proliferation. It would be interesting to use the model to explore how parameters related to differentiation are changed by, for example, the level of IL7 signalling.

      We agree that examining how factors like IL-7 signaling influence thymocyte differentiation is a promising direction for future work. Based on our previous modelling work (Aghaallaei et al., Science Advances, 2021), we expect that increased IL-7 availability or sensitivity should result in an increase of cells differentiating into the γδ<sup>+</sup> T cell subtype. As molecular tools for medaka continue to advance, we anticipate being able to refine and expand the model accordingly.

      Moreover, we see strong potential for adapting the current computational framework to model thymopoiesis in other species, such as mouse or human, where stage-specific markers are well characterized. We have now explicitly mentioned this opportunity for future development in the conclusion section of the revised manuscript (see page #26).

      It is also mentioned in the description of the model that the cells can die at the end of the development process. However, is death incorporated into the earlier stages of development? For instance, it is possible that when signals, such as Notch, are at low levels the thymocytes at certain stages of development will die.

      We thank the reviewer for this comment. In a previous study, we mapped the spatial distribution of apoptotic cells within the medaka thymus and did not observe cell death in the region where ETPs enter the cortical thymus (Bajoghli et al., J Immunology, 2015) and where Notch1 signaling becomes activated (Aghaallaei et al., Europ J Immunology, 2021). Notch mutants exhibit a markedly reduced number of thymocytes; this reduction could be attributed either to impaired thymus homing or to increased cell death within the thymus. However, our unpublished data show that the total number of apoptotic cells in the Notch1b-deficient thymus is comparable to that of wild-type siblings. In fact, our in vivo observations revealed that the frequency of thymus colonization by progenitors is significantly reduced in the notch1b mutant (Aghaallaei et al., J E Immunol., 2021). Based on these in vivo observations, our computational model incorporates cell death only at the end of the thymocyte developmental trajectory. The current model does not consider cell death at earlier stages.

      Overall, the manuscript was well-written and the figures were clear and well-presented. A minor point would be that the writing in some of the figures was too small and difficult to read, such as in Figure 4. I also sometimes struggled to find the definition of the acronyms in the figures, for example in Figure 3 it would be helpful if the definitions for D, SD, and SA were given in the figure legend as well as in the figure itself.

      We thank the reviewer for the kind words. We have reworked the figures to have larger, more readable font sizes and improved the figure legends as suggested.

      Reviewer #2 (Recommendations for the authors):

      Suppose the computational results did throw up an important new phenomenon. How might researchers seek to replicate it? If no mathematical relations can be given, can at least the code be made publicly available?

      We apologize to the reviewer if the workings of the model were not sufficiently clear in the submitted manuscript. However, we believe there may have been a misunderstanding, and we would like to clarify that both the mathematical formulations and the code used in this study were publicly available in the scientific record at the time of submission.

      Specifically, the full source code for the virtual thymus model is hosted in a permanent Zenodo repository (accessible here: https://zenodo.org/records/11656320), which includes:

      - Model files and links to source codes for the simulation environment;

      - Pre-compiled binary versions of the simulation environment (EPISIM) for both Windows and Linux platforms;

      - Detailed documentation, including step-by-step instructions on how to install and use the provided files.

      The repository link is cited in the manuscript (see page 38) and in the section “Data and materials availability”.  

      In addition, the mathematical framework that underpins the computational model has already been published and described in detail in our previous work (Aghaallaei, et al. Science Advances, 2021). In the supplementary material of this publication, we provide extensive documentation of the model, including:

      - A 13-page textual explanation of the design rationale;

      - 44 equations describing model implementation;

      - Parameter choices, partial sensitivity analysis, additional simulations, and supporting data presented in two figures and four tables.

      Nonetheless, to improve transparency, we have added an extensive Appendix in the revised version of the manuscript that also includes visual summaries of the model logic in the form of intuitive flowcharts. We hope this clarification and the newly provided Appendix assure the reviewer that both reproducibility and transparency have been central to our approach.

      What about the growth of the animal and its thymus over weeks 2-11?

      We thank the reviewer for this insightful question. Indeed, our current computational model does not incorporate thymus growth over time. We decided not to model the dynamic increase in TEC numbers or organ size over time because we wanted to maintain simplicity and computational tractability. Therefore, we assumed a steady-state thymic environment. Consequently, the model is limited to representing thymopoiesis under homeostatic conditions, as it appears to stabilize by day 11. This is a recognized limitation of the current model. Looking ahead, we plan to develop a more advanced computational framework that incorporates thymic growth and dynamic changes in cellular composition over time. We have now included a brief note on this limitation in the conclusion of the revised manuscript (see page #26).

    1. Many JavaScript websites will advise you to never use the “==” and “!=” JavaScript operators, because when they compare variables containing different data types, JavaScript will coerce one of the operands to a matching type, sometimes in unexpected ways. We can thank the early days of JavaScript for this feature, when it was trying to be extraordinarily forgiving of sloppy code. I’m not going to list all the odd results that can arise from JavaScript’s operand coercion, because there are more than enough examples on the web already. To avoid unexpected type coercion, and thus unexpected matches and/or mismatches, the usual advice is to always use strict equality operators (“===” and “!==”). I disagree.
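
      To make the coercion behaviour described above concrete, here is a minimal sketch comparing the loose and strict operators on a handful of operand pairs. The pairs and the `results` array are illustrative choices, not taken from the quoted article, and the sketch takes no position on which operator should be preferred.

```javascript
// Loose equality (==) may coerce operands to a common type before comparing;
// strict equality (===) is false whenever the operand types differ.
// The pairs below are illustrative examples chosen for this sketch.
const pairs = [
  [1, "1"],          // the string is converted to a number
  [0, ""],           // the empty string converts to 0
  [0, false],        // the boolean converts to a number
  [null, undefined], // loosely equal to each other by a special rule
  [null, 0],         // null is not coerced to 0 by ==
];

const results = pairs.map(([left, right]) => ({
  left,
  right,
  loose: left == right,
  strict: left === right,
}));

for (const r of results) {
  console.log(
    `${JSON.stringify(r.left)} vs ${JSON.stringify(r.right)}: == ${r.loose}, === ${r.strict}`
  );
}
```

      Each pair mixes two different types, so the strict comparison is false for all of them; only the loose comparison depends on the coercion rules.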
    1. snippet

      Definition: A snippet is a small reusable portion of source code or text, most often formally defined units to be incorporated into larger modules.

    1. Q31. In a code language, TIGE A written as SUHJFHDFQS. How GINPQSRTDF be written in language? (a) HERBS (b) HORSE (c) HOESR (d) HORTF

      Q32. In a code language, TAN written as 7-26-13-16. How will CA be written in that language? (a) 24-26-9-20-15 (b) 24-26-9-20-1 (c) 24-26-18-20-12 (d) 23-01-9-20-

      Q33. In a certain language, CADI written as 31457. How will DEFE written as in the same language? (a) 45678 (b) 45769 (c) 35658 (d) 45659

      Q34. If in a given code lang WATER is coded as XZUDS, then v word will be coded as BMHKF? (a) ALIEN (b) CLING (c) ANGLE (d) EARTH

      Q35. In a code language CERTAI written as DFSUBJO How SUMMER be written in this languag (Hindi text of the same question, translated: In a code language, CERTAIN is written as DFSUB; how will SUM be written in this language?) CPO, 16/03/2019 (Morning) (a) TVNNFS (b) TVNNFT (c) RVNNFS (d) TUNMFS

      Q36. A = 2 C = 4 then PARTICLE = __________. (a) 1721921104136 (b) 172172094136 (c) 172192094126 (d) 1621820104136

      Q37. If AT = 20 and BEG = 70, then BANK (a) 318 (b) 308 (c) 228 (d) 282

      Q38. In a certain code language, SON is written as 345 and ROAM is written as 6412. How will RANSOM be written in the same language? (a) 615342 (b) 651342 (c) 615324 (d) 612435

      Good to know

    Annotators

    1. Reviewer #1 (Public review):

      The authors conducted an fMRI study to investigate the neural effects of sustaining attention to areas of different sizes. Participants were instructed to attend to alphanumeric characters arranged in a circular array. The size of attention field was manipulated in four levels, ranging from small (18 deg) to large (162 deg). They used a model-based method to visualize attentional modulation in early visual cortex V1 to V3, and found spatially congruent modulations of the BOLD response, i.e., as the attended area increased in size, the neural modulation also increased in size in the visual cortex. They suggest that this result is a neural manifestation of the zoom-lens model of attention and that the model-based method can effectively reconstruct the neural modulation in the cortical space.

      The study is well-designed with sophisticated and comprehensive data analysis. The results are robust and show strong support for a well-known model of spatial attention, the zoom-lens model. Overall, I find the results interesting and useful for the field of visual attention research.

      Comments on revisions:

      The authors have addressed my previous comments satisfactorily. I would encourage the authors to make data and code publicly available, which appears to be the custom in this era.

    1. The third principle calls on social workers to value the dignity and worth of the person, and states that social workers should actively consider individual differences and cultural and ethnic diversity and treat each person with care and respect.

      I think this should be the main focus of anyone involved in advocacy or policy making. Is there a similar code of ethics for lawmakers?

    1. By defining and separating the concepts involved in state management and enforcing rules that maintain independence between views and states, we give our code more structure and maintainability.

      Instead of everything being jumbled together in one place (the UI, the data, and the logic),

      we separate the main concepts:

      The UI is responsible for display only

      The state (the data) is kept in a separate place (the store)

      Updates to the state come only through clear rules (actions + reducers)

      ⬅️ It is as if you had put each part of your project in its own clear, separate file.
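
      The separation described in this note can be sketched in a few lines. This is a minimal illustration under stated assumptions: `createStore` here is a hypothetical hand-written helper, not the implementation of any particular state management library, and `counterReducer`, the action names, and the `rendered` array standing in for a view are all invented for the example.

```javascript
// Reducer: a pure function that computes the next state from the current
// state and an action. It knows nothing about the view.
function counterReducer(state = { count: 0 }, action) {
  switch (action.type) {
    case "increment":
      return { count: state.count + 1 };
    case "add":
      return { count: state.count + action.payload };
    default:
      return state;
  }
}

// Hypothetical minimal store: holds the state and only changes it by
// passing dispatched actions through the reducer.
function createStore(reducer) {
  let state = reducer(undefined, { type: "@@init" });
  const listeners = new Set();
  return {
    getState: () => state,
    dispatch(action) {
      state = reducer(state, action);
      listeners.forEach((listener) => listener());
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener); // call to unsubscribe
    },
  };
}

// "View": it only reads the state and renders it; here rendering means
// appending a string to an array.
const store = createStore(counterReducer);
const rendered = [];
const unsubscribe = store.subscribe(() => {
  rendered.push(`count = ${store.getState().count}`);
});

store.dispatch({ type: "increment" });
store.dispatch({ type: "add", payload: 4 });
unsubscribe();
store.dispatch({ type: "increment" }); // state still updates; the view is detached

console.log(rendered, store.getState());
```

      Because the view never mutates the state directly, detaching it leaves the store fully functional, which is the independence between views and state that the quoted passage refers to.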

    1. NpmNix includes a very simple Golang parser, parser.go (~70 lines of code), that parses the package-lock.json and generates the complete Nix expression.

      Gist of dynamic nature of this feature

    1. Summary Note: Police/Population Relations in France – 2024 Findings and Trends

      Source: Excerpts from "https://www.defenseurdesdroits.fr/sites/default/files/2025-06/ddd_EAD-2024_volume-1_relations-police-population.pdf" (Défenseur des droits, "Relations police/population : contrôles d’identité et dépôts de plainte", June 2025).

      Introduction and Context

      The Défenseur des droits (Defender of Rights), as the external body overseeing the professional ethics of the security forces, has published the second edition of its "Accès aux droits" (Access to Rights) survey (EAD 2024), updating a study first conducted in 2016.

      The aim is to deepen knowledge of rights violations, particularly regarding the professional ethics of the security forces and police-population relations.

      This publication focuses on three key aspects: the experience of identity checks, the experience of filing a complaint or a police log entry (main courante), and trust in the police as an institution.

      The 2016 study had already shown generally satisfactory relations, but noted more mixed experiences for certain social groups, notably young men perceived as Black, Arab, or North African, who were subjected to more frequent checks that were often conducted in a degraded manner.

      These negative experiences were correlated with low trust in the security forces.

      A key recommendation of the Défenseur des droits in 2016 was to introduce traceability of identity checks in order to combat discrimination.

      The 2024 edition, conducted between October 2024 and January 2025 among 5,030 people representative of the population of metropolitan France (aged 18-79), uses a methodology comparable to that of 2016, enriched with new topics (notably on filing complaints).

      It includes detailed sociodemographic variables (age, sex, perceived origin, religion, sexual orientation, disability) for an intersectional analysis of discrimination.

      Thèmes Principaux et Idées Clés

      1. L'Expérience des Contrôles d'Identité

      Les contrôles d'identité sont un point de contact majeur entre la police et la population, avec environ 47 millions estimés en 2021.

      Leur cadre juridique est jugé "complexe et flou", laissant une "large marge d'interprétation aux forces de sécurité, ouvrant la voie à des usages divers, et parfois controversés".

      L'existence de discriminations dans ce cadre a été reconnue à plusieurs reprises par la justice.

      • Significant increase in the frequency of checks: The proportion of people checked at least once in the past 5 years rose from 16% in 2016 to 26% in 2024, an increase of 63%.

      • This rise affects all population categories, including those "previously rarely checked": +81% for managers and professionals (cadres), +148% for 55-64 year-olds, and +79% for people perceived "as exclusively white".

      • In 2024, multiple checks (several times in the past 5 years) are the majority (15% of the population versus 11% for a single check).

      • Conduct of and justifications for checks: 90% of checks reported in 2024 involved verification of identity documents (versus 68% in 2016).

      • However, a significant share of checks are "extended" ("poussés"): 22% of people checked were searched, 11% were ordered to leave the area, 6% were pushed up against a wall or a car, and 3% were taken to the station.

      • For more than one in two people checked, the security forces did not explain the reason for the check. Only 42% of people subjected to an "extended" check were given a justification.

      • Inappropriate behavior: 19% of people checked report having faced inappropriate behavior (being addressed with the informal "tu", provocation, insults, brutality), compared with 28% in 2016 (although the questions may have changed).

      • 14% were addressed with the informal "tu", 7% were provoked or insulted, and 7% experienced brutal behavior.

      • Socio-demographic disparities and discrimination: Young men perceived as Black, Arab, or North African are 4 times more likely to have been checked than the rest of the population, and 12 times more likely to be subjected to an "extended" check (search, pat-down, being taken to the station, order to leave the area).

      • They also report inappropriate behavior more often: 30% of them versus 15% of people perceived as white only.

      • People in financial hardship (32%) are also checked more often than those who are financially comfortable (22%).

      • Non-heterosexual people are 50% more likely to face inappropriate behavior during an identity check.

      • The "margin of discretion afforded by current law leaves police officers and gendarmes alone with their own instinct and their possible prejudices", which "can lead to discriminatory behavior, intentional or not, and cast suspicion on all checks".

      • The lack of traceability of identity checks remains a persistent obstacle to proving discrimination and to an effective right to a remedy.

      • Reactions to inappropriate behavior: Only 8% of people who experienced inappropriate behavior tried to have the situation recognized (through an association, a lawyer, the Défenseur des droits, or the police/gendarmerie).

      • The majority (73%) spoke about it to people close to them.
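
      The relative increases quoted in this list can be checked with simple arithmetic. As a minimal sketch (the function name `relative_increase` is an illustrative assumption, not something taken from the report), the rise from 16% to 26% of people checked works out to 62.5%, which is consistent with the reported 63% after rounding:

```python
def relative_increase(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


# Share of people checked at least once in 5 years: 16% (2016) -> 26% (2024)
increase = relative_increase(16, 26)
print(f"{increase:.1f}%")  # prints "62.5%"
```

      The same function can be applied to any before/after pair of percentages; it measures change relative to the starting value, not the difference in percentage points (which here would be 10).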

      2. The Experience of Filing a Complaint or an Incident Report (Main Courante)

      Filing a complaint is another crucial form of interaction with the security forces.

      • Frequency and profile of complainants: 35% of respondents went to a police station or gendarmerie to file a complaint or an incident report in the past 5 years.

      • People in financial difficulty, with disabilities, or with chronic illnesses are more likely to file a complaint.

      • Conduct contrary to professional ethics during complaint filing: 21% of people who wished to file a complaint were refused, even though refusing to take a complaint is prohibited by law (Article 15-3 of the Code de procédure pénale).

      • Refusals more often affect people with disabilities (37%), those wearing a religious symbol (33%), the unemployed (30%), residents of a priority urban policy neighborhood (quartier prioritaire de la politique de la ville) (30%), or people perceived as Black, Arab, or North African (28%).

      • 10% of people who sought to file a complaint report inappropriate behavior by the security forces (informal "tu", insults, humiliation, intimidation).

      • People with disabilities are twice as likely to be exposed to inappropriate behavior when filing a complaint.

      • Young people (18-24) and people perceived as non-white are also 80% more likely to face it.

      • Negative experiences across contexts: Certain factors, such as perceived origin (Black, Arab, North African), age (young people aged 18-24), and unemployment, increase exposure to inappropriate behavior "both during a check and when filing a complaint".

      This "suggests the existence of discriminatory behavior, since it targets certain social groups rather than others."

      3. Trust in the Police as an Institution

      Trust is divided into "diffuse" trust (in the general missions of the police) and "specific" support (an assessment based on concrete experiences).

      The survey focuses on specific support.

      • Levels of trust: 50% of the population say they feel confident or reassured in the presence of a police officer or gendarme in public spaces.

      • 28% are indifferent and 22% feel distrustful or worried.

      • Link with concrete experiences: Trust is "closely linked" to lived experience: 51% of people who were able to register their complaint without incident say they are confident, versus only 37% of those who were refused.

      • 59% of people who have experienced discrimination during a police check feel worried or distrustful, versus 21% of those who think discrimination exists but have not experienced it personally, and 5% of those who do not acknowledge its existence.

      • People who have experienced inappropriate behavior (whether during a check or when filing a complaint) more often say they are distrustful or worried (61% and 51% respectively).

      • Consequences of the lack of trust: Lack of trust more often leads to questioning the legitimacy of police intervention: 16% of distrustful people protest during a check, versus 4% of confident people.

      • Distrustful people are more likely to perceive the check as unjustified (59% versus 18% of confident people).

      • Lower trust goes with lower recourse to the police: 21% of distrustful people say that, after experiencing discrimination or harassment, they did not contact the security forces for lack of trust, versus 3% of confident people.

      • This creates a "harmful dynamic" that "fuels mutual distrust in police/population interactions" and "can lead to an escalation of tensions during interventions".

      General Conclusion

      The 2024 "Accès aux droits" (Access to Rights) survey highlights a "dualization of relations" between citizens and the security forces in France.

      While the experience of identity checks has spread to a larger part of the population, the way these interactions unfold varies considerably with individuals' social characteristics.

      Population categories "traditionally" less checked (women, managers and professionals, older people) are now checked more often, but generally through "simple identity checks, generally one-off, courteous, and perceived as justified."

      By contrast, for people perceived as Black, Arab, or North African, young people, men, and people in precarious situations, checks remain more frequent, more intrusive ("extended"), and accompanied by conduct contrary to professional ethics.

      These groups are also more exposed to refusals to take complaints and to inappropriate behavior during these procedures.

      These negative and discriminatory experiences have a direct and significant impact on trust in the security forces, leading to heightened distrust, questioning of the legitimacy of police action, and reduced recourse to the police.

      The study stresses that this "erosion of trust" can "fuel tensions between the population and the security forces and, ultimately, can lead to an escalation of tensions during interventions."

      The Défenseur des droits hopes that this report will "encourage reflection on establishing calmer relations" between the police and the population.

    1. Information report: The right to guidance in secondary education in France

      This detailed report by the Défenseur des droits examines the right to educational guidance (orientation scolaire) in France, highlighting the persistent challenges and inequalities that hinder young people's development.

      It draws on existing literature, referrals to and decisions of the Défenseur des droits, hearings with a range of stakeholders, and contributions from young people.

      I. Legal framework and definitions of guidance

      Educational guidance is a fundamental right recognized internationally and nationally.

      • Convention on the Rights of the Child (CIDE in French): guarantees the child's right to education, and makes "educational and vocational information and guidance available and accessible to every child" (Art. 28).

      It also aims to "foster the development of the child's personality [...] and of the child's talents and mental and physical abilities to their fullest potential" (Art. 29).

      • Council of the European Union (2008): defines guidance as "a continuous process that enables citizens, at any age and throughout their lives, to identify their capacities, competences, and interests, to make decisions on education, training, and employment, and to manage their personal life paths".

      • French domestic law (Code de l'éducation): defines guidance as "the result of the continuous process of developing and carrying out the personal plan for training and for social and professional integration that the pupil pursues in collège, then in lycée, according to their aspirations and abilities" (Art. D. 331-23).

      It also recognizes the "right to guidance counseling and to information on courses of study, on obtaining a professional qualification [...] on occupations, and on career openings and prospects" (Art. L. 313-1).

      • Since the 1960s, guidance has become a public policy aimed at reducing inequalities in access to education, with the creation of bodies such as Onisep (1970) and the CIOs (1971).

      II. Governance and coordination constraints among guidance stakeholders

      Guidance policy is fragmented and hard to follow, despite the involvement of many actors (the State, regions, local authorities, académies, schools, associations, parents).

      A split and fragmented remit:

      • State: defines national public policy, steers guidance support, and makes decisions on pupils' guidance and placement. It manages Onisep and the CIOs.

      • Regions: are on the front line for implementation, acting on information and its dissemination in connection with the local economic context.

      • Coordination difficulties: "absence of national steering", "a poorly identified lead actor", "a multiplicity of actors, which leads at once to duplicated actions, an unreadable guidance system, and a dilution of responsibility and of the capacity to evaluate respective contributions" (quotations from various reports).

      • Lack of regional coordination: The Regional Public Guidance Services (Services Publics Régionaux de l'Orientation, SPRO) struggle to coordinate actors that answer to different supervisory bodies and funding streams.

      The information on offer is segmented.

      • Transition from lycée to higher education: There is no dedicated steering, with each level passing responsibility to the other. The Parcoursup platform and the Mission de l’orientation du scolaire vers le supérieur (MOSS) have not fully resolved this problem.

      • Cost and uncertainties of the division of responsibilities: The new arrangement between Onisep and the regions raises difficulties, notably the "scattering of resources" and the "lack of educational continuity".

      • Avenir(s) platform: Despite its ambitions, its launch was confused, and local authorities have expressed doubts about how they were involved and about the risk of duplication with their own tools.

      • Territorial inequalities and funding: Budgets allocated to guidance vary widely between regions, and data are scarce and hard to access.

      III. Insufficient support despite a wealth of information

      • Young people face abundant but hard-to-read information and a lack of personalized support.

      • Abundant but hard-to-read digital information: A multitude of sites and platforms (Onisep, Parcoursup, CIDJ) exist, but young people struggle to navigate them.

      Shortage of guidance experts:

      • National education psychologists (psychologues de l’Éducation nationale, PsyEN) specializing in guidance (EDO) are the only staff specifically trained, but there are too few of them.

      Their "psy" label can stigmatize guidance counseling, making pupils fear being seen as "struggling".

      "Pupils are afraid to make an appointment."

      • Recommendation: Set up a sufficiently staffed team of professionals and designate a lead contact (référent pilote) trained in guidance to coordinate and provide individualized follow-up.

      • Schools insufficiently open to outside actors: Although initiatives exist to open up to the business world, these efforts need to be extended to all tracks and the range of outside contributors diversified.

      • Lack of dedicated spaces: The Information and Guidance Centers (Centres d'Information et d'Orientation, CIO) are the only dedicated physical venues, but their accessibility and funding strategy are in question.

      • Recommendations: Create a guidance office in every school with an identified lead, and strengthen the CIOs at département level.

      IV. A guidance pathway that must be chosen and informed

      The report highlights gaps in how guidance is integrated into school curricula, the impact of social and territorial inequalities, and the need to recognize a genuine right to change track (réorientation).

      A token presence of guidance in school curricula:

      • The hours set aside for guidance are rarely delivered. One young person interviewed regrets: "I was not supported; my parents had other worries and were far away.

      I would have liked guidance hours in my timetable and dedicated school staff."

      • Observation internships in 3ème (the final year of collège): Widely praised by young people as an effective way to discover the working world.

      However, access is unequal, with the "weight of family network and environment" being decisive. "It was easy to find, but I was helped by family."

      Pupils from disadvantaged backgrounds often accept internships "by default". "I ended up at my mother's workplace because nobody replied."

      • Discrimination in access to internships: The Défenseur des droits has received referrals concerning discrimination based on physical appearance, health status, or origin.

      The practice of "reserved internships" (for employees' children) is still widespread.

      • Vocational track: Pupils in the vocational track, often from working-class backgrounds, have longer internships and struggle to find them, sometimes accepting uninteresting assignments.

      The weight of social and territorial inequalities:

      • Social fatalism: Young people in precarious situations have "lower academic ambitions, even with equivalent grades". What they hear at school can discourage them: "I would have liked to do a prépa, but unfortunately in suburban (banlieue) lycées they don't offer all the options that exist."

      • School segregation: Low social mix holds back pupils' ambitions, a problem compounded by residential patterns.

      Young people from priority urban neighborhoods (quartiers prioritaires de la ville, QPV) or Priority Education Networks (Réseaux d'Éducation Prioritaire, REP) accumulate factors of inequality.

      • Self-censorship: Young people from disadvantaged backgrounds express a will to succeed but also "self-censorship": "Because of the class environment I stop myself from doing things."

      • Discrimination against unaccompanied minors (Mineurs Non Accompagnés, MNA): The Défenseur des droits has found that they are steered toward short courses to ensure rapid independence, without always taking their wishes and abilities into account.

      • Territorial inequalities and mobility: Pupils in rural areas are less likely to go into the general track.

      Lack of financial means is a major obstacle to pursuing studies away from the family home.

      Distance from places of study "fuels a form of self-censorship among young people, who are more likely to feel that these courses 'are not for them'".

      Girl-boy inequalities and gender bias:

      • A well-known finding: "Girls go into general and technological education more than boys but are proportionally fewer in going into scientific tracks." (Ministry of National Education, DEPP). In 2022, there were only "24% women among engineers".

      • A societal phenomenon: Gender stereotypes, often internalized as early as collège ("Although I really liked science, as I grew up I was made to feel that it was more for men. I put up barriers for myself"), influence choices. Highly feminized fields are often less valued.

      • Effect of the lycée reform: The free choice of subjects has "reinforced the weight of stereotypes", pushing girls further away from the most demanding scientific pathways.

      The share of girls in the "mathematics" specialty in 2021-2022 was at its lowest since 1994-1995.

      • Recommendations: Introduce positive action on gender and support pupils in general and technological lycées to combat gendered perceptions.

      Right to change track and to an actual placement:

      • Bridges (passerelles) vs. "right to make mistakes" (droit à l'erreur): The bridging scheme (changing track during or at the end of the year) is rarely treated as an ordinary, generally available option and is often presented as a "reaction to what is experienced as a failure".

      The school system attributes pupils' unsuccessful guidance outcomes to "strictly personal" choices, downplaying its own shortcomings.

      The term "right to make mistakes" is stigmatizing, particularly when it moves pupils from the general track to the vocational track, suggesting that their initial ambitions were "oversized".

      • Recommendation: Drop the label "right to make mistakes" and prefer the terms "bridges" or "change of track" (réorientation).

      • Lycée pupils without a lycée: The Défenseur des droits is "regularly contacted about pupils who are refused a place in a course they nevertheless chose and had approved [...] for lack of available places."

      In 2024, "23,600" pupils had no placement at the start of the school year. Priority is often given to pupils who are not repeating a year, creating an inequality.

      • Recommendation: Plan ahead and allocate the human, financial, and material resources needed to end the recurring situations of pupils without a placement, and increase the number of teachers, classes (divisions), and overall teaching-hour allocations (dotations horaires globales).

      • Right to remain in the original class: The law allows pupils who did not obtain the track they wanted to remain in their original class for one year.

      "It doesn't matter if you lose a year or two.

      You have to take the time to get it wrong, and to reflect on your choices."

      This right is crucial for limiting abrupt exits from the school system (sorties sèches).

      However, it is threatened by "unilateral termination clauses" in the enrollment contracts of private schools under contract with the State, and by a "phenomenon of pushing out pupils deemed insufficiently high-performing" in order to secure better statistics.

      V. General Recommendations

      The report concludes by stressing the urgency of defining clear, shared ambitions for educational guidance and of giving professionals the resources and a clear framework for action.

      Among the many recommendations made, the following stand out:

      • Set up consolidated annual monitoring, both quantitative and qualitative, of the guidance actions carried out in each region.

      • Enable every pupil to be supported by a sufficiently staffed team of professionals and designate a lead contact (référent pilote).

      • Guarantee the existence of physical venues dedicated to information and guidance (offices in schools, strengthening the CIOs).

      • Make the annual guidance hours in timetables a reality.

      • Combat self-censorship by developing broad, non-stereotyped information.

      • Bring courses closer to young people by developing a balanced offer across the country.

      • Take pupils' geographic remoteness into account when calculating grants (bourses).

      • Promote gender balance by introducing positive action on gender in courses of study.

      • Plan resources ahead to end the situation of pupils without a placement and guarantee the right to remain in the original class.

      • End unfair clauses in the enrollment contracts of private schools.

    1. There are people who use these, apparently. And it just feels so… depressing. There are people I once respected who, apparently, don’t actually enjoy doing the thing. They would like to describe what they want and receive Whatever — some beige sludge that vaguely resembles it. That isn’t programming, though. That’s management, a fairly different job. I’m not interested in managing. I’m certainly not interested in managing this bizarre polite lying daydream machine. It feels like a vizier who has definitely been spending some time plotting my demise. It makes programming spaces feel bleaker. I don’t want to help someone who opens with “I don’t know how to do this so I asked ChatGPT and it gave me these 200 lines but it doesn’t work”. I don’t want to know how much code wasn’t actually written by anyone. I don’t want to hear how many of my colleagues think Whatever is equivalent to their own output. I don’t want to keep watching people fall for a carnival trick.

      the management thing seems so off and yet the part of me that wants to object to it is so right there with them about the experience of dealing with a colleague using it

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors have assembled a cohort of 10 SiNET, 1 SiAdeno, and 1 lung MiNEN samples to explore the biology of neuroendocrine neoplasms. They employ single-cell RNA sequencing to profile 5 samples (siAdeno, SiNETs 1-3, MiNEN) and single-nuclei RNA sequencing to profile seven frozen samples (SiNET 4-10).

      They identify two subtypes of siNETs, characterized by either epithelial or neuronal NE cells, through a series of DE analyses. They also report findings of higher proliferation in non-malignant cell types across both subtypes. Additionally, they identify a potential progenitor cell population in a single-lung MiNEN sample.

      Strengths:

      Overall, this study adds interesting insights into this set of rare cancers that could be very informative for the cancer research community. The team probes an understudied cancer type and provides thoughtful investigations and observations that may have translational relevance.

      Weaknesses:

      The study could be improved by clarifying some of the technical approaches and aspects as currently presented, toward enhancing the support of the conclusions:

      (1) Methods: As currently presented, it is possible that the separation of samples by program may be impacted by tissue source (fresh vs. frozen) and/or the associated sequencing modality (single cell vs. single nuclei). For instance, two (SiNET1 and SiNET2) of the three fresh tissues are categorized into the same subtype, while the third (SiNET9) has very few neuroendocrine cells. Additionally, samples from patient 1 (SiNET1 and SiNET6) are separated into different subtypes based on fresh and frozen tissue. The current text alludes to investigations (i.e.: "Technical effects (e.g., fresh vs. frozen samples) could also impact the capture of distinct cell types, although we did not observe a clear pattern of such bias."), but the study would be strengthened with more detail.

      We thank the reviewer for the thoughtful and constructive review. Due to the difficulty in obtaining enough SiNET samples, we used two platforms to generate data - single cell analysis of fresh samples, and single nuclei analysis of frozen samples. We opted to combine both sample types in our analysis while being fully aware of the potential for batch effects. We therefore agree that this is a limitation of our work, and that differences between samples should be interpreted with caution.

      Nevertheless, we argue that the two SiNET subtypes that we have identified are very unlikely to be due to such batch effect. First, the epithelial SiNET subtype was not only detected in two fresh samples but also in one frozen sample (albeit with relatively few cells, as the reviewer correctly noted). Second, and more importantly, the epithelial SiNET subtype was also identified in analysis of an external and much larger cohort of bulk RNA-seq SiNET samples that does not share the issue of two platforms (as seen in Fig. 2f). Moreover, the proportion of samples assigned to the two subtypes is similar between our data and the external data. We therefore argue that the identification of two SiNET subtypes cannot be explained by the use of two data platforms. However, we agree that the results should be further investigated and validated by future studies.

      The reviewer also commented that two samples from the same patient which were profiled by different platforms (SiNET1 and SiNET6) were separated into different subtypes. We would like to clarify that this is not the case, since SiNET6 was not included in the subtype analysis due to too few detected Neuroendocrine cells, and was not assigned to any subtype, as noted in the text and as can be seen by its exclusion from Figure 2 where subtypes are defined. We apologize that our manuscript may have given the wrong impression about SiNET6 classification (it was labeled in Fig. 4a in a misleading manner). In the revised manuscript, we corrected the labeling in Fig. 4a and clarified that SiNET6 is not assigned to any subtype. We also further acknowledge the limitation of the two platforms and the arguments in favor of the existence of two SiNET subtypes.     

      (Additional specific recommendations for the authors are provided below)

      (2) Results:

      Heterogeneity in the SiNET tumor microenvironment: It is unclear if the current analysis of intratumor heterogeneity distinguishes the subtypes. It may be informative if patterns of tumor microenvironment (TME) heterogeneity were identified between samples of the same subtype. The team could also evaluate this in an extension cohort of published SiNET tumors (i.e. revisiting additional analyses using the SiNET bulk RNAseq from Alvarez et al 2018, a subset of single-cell data from Hoffman et al 2023, or additional bulk RNAseq validation cohorts for this cancer type if they exist [if they do not, then this could be mentioned as a need in Discussion])

      We agree that analysis of an independent cohort will assist in defining the association between TME and the SiNET subtype. However, the sample size required for that is significantly larger than the data available. In the revised manuscript we note that as a direction for future studies.

      (3) Proliferation of NE and immune cells in SiNETs: The observed proliferation of NE and immune cells in SiNETs may also be influenced by technical factors (including those noted above). For instance, prior studies have shown that scRNA-seq tends to capture a higher proportion of immune cells compared to snRNA-seq, which should be considered in the interpretation of these results. Could the team clarify this element?

      We agree that different platforms could affect the observed proportions of immune cells, and more generally the proportions of specific cell types. However, the low proliferation of Neuroendocrine cells and the higher proliferation of immune cells (especially B cells, but also T cells and macrophages) is consistently observed in both platforms, as shown in Fig. 4a, and therefore appears to be reliable despite the limitations of our work. We clarify this consistency in the revised manuscript. 

      (4) Putative progenitors in mixed tumors: As written, the identification of putative progenitors in a single lung MiNEN sample feels somewhat disconnected from the rest of the study. These findings are interesting - are similar progenitor cell populations identified in SiNET samples? Recognizing that ideally additional validation is needed to confidently label and characterize these cells beyond gene expression data in this rare tumor, this limitation could be addressed in a revised Discussion.

      We do not find evidence for similar progenitors in the SiNET samples, but they also do not contain two co-existing lineages of cancer cells within the same tumor, so this is harder to define. We agree about the need for additional validation for this specific finding and have noted that in the revised Discussion.

      Reviewer #2 (Public review):

      Summary:

      The research identifies two main SiNET subtypes (epithelial-like and neuronal-like) and reveals heterogeneity in non-neuroendocrine cells within the tumor microenvironment. The study validates findings using external datasets and explores unexpected proliferation patterns. While it contributes to understanding SiNET oncogenic processes, the limited sample size and depth of analysis present challenges to the robustness of the conclusions.

      Strengths:

      The studies effectively identified two subtypes of SiNET based on epithelial and neuronal markers. Key findings include the low proliferation rates of neuroendocrine (NE) cells and the role of the tumor microenvironment (TME), such as the impact of Macrophage Migration Inhibitory Factor (MIF).

      Weaknesses:

      However, the analysis faces challenges such as a small sample size, lack of clear biological interpretation in some analyses, and concerns about batch effects and statistical significance.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors set out to profile small intestine neuroendocrine tumors (siNETs) using single-cell/nucleus RNA sequencing, an established method to characterize the diversity of cell types and states in a tumor. Leveraging this dataset, they identified distinct malignant subtypes (epithelial-like versus neuronal-like) and characterized the proliferative index of malignant neuroendocrine cells versus non-malignant microenvironment cells. They found that malignant neuroendocrine cells were far less proliferative than some of their non-malignant counterparts (e.g., B cells, plasma cells, epithelial cells) and there was a strong subtype association such that epithelial-like siNETs were linked to high B/plasma cell proliferation, potentially mediated by MIF signaling, whereas neuronal-like siNETs were correlated with low B/plasma cell proliferation. The authors also examined a single case of a mixed lung tumor (neuroendocrine and squamous) and found evidence of intermediate/mixed and stem-like progenitor states that suggest the two differentiated tumor types may arise from the same progenitor.

      Strengths:

      The strengths of the paper include the unique dataset, which is the largest to date for siNETs, and the potentially clinically relevant hypotheses generated by their analysis of the data.

      Weaknesses:

      The weaknesses of the paper include the relatively small number of independent patients (n = 8 for siNETs), lack of direct comparison to other published single-cell NET datasets, mixing of two distinct methods (single-cell and single-nucleus RNA-seq), lack of direct cell-cell interaction analyses and spatially-resolved data, and lack of in vitro or in vivo functional validation of their findings.

      The analytical methods applied in this study appear to be appropriate, but the methods used are fairly standard to the field of single-cell omics without significant methodological innovation. As the authors bring forth in the Discussion, the results of the study do raise several compelling questions related to the possibility of distinct biology underlying the epithelial-like and neuronal-like subtypes, the origin of mixed tumors, drivers of proliferation, and microenvironmental heterogeneity. However, this study was not able to further explore these questions through spatially-resolved data or functional experiments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Methods:

      a) Could the team clarify the discrepancy in subtype assignment between two samples from the same patient? i.e. are these samples from the same tumor? If so, what does the team think is the explanation for the difference in subtype assignment?

      As noted above in response to the public review of reviewer #1, SiNET6 was in fact not assigned to any subtype (due to insufficient NE cells) and hence there was no discrepancy. We apologize for the misleading labeling of SiNET6 in the previous version and have corrected this in the revised version of Figure 4.

      b) What is the rationale for scoring tumor-derived programs on samples with no tumor cells? For instance, SiNET3 does not contain NE cells, and SiNET9 has a very low fraction of NE cells. Please clarify how the scoring was performed on these samples, as the program assignments may be driven by other cell types in samples with little to no NE cells.

      Scoring for tumor-derived programs was done only for the NE cells. Accordingly, SiNET3 was not scored or assigned to any of the programs. SiNET9 was included in this analysis: although it had a relatively small fraction of NE cells, the absolute number of profiled cells was particularly high in this sample, and therefore the number of NE cells was 130, higher than our cutoff of 100 cells.

      c) Given the heterogeneity of cell types within each sample, would there be a way to provide a refined sense of confidence for certain cell type annotations? This would be helpful given the heterogeneity in marker gene expression and the absence of gold-standard markers for fibroblasts and endothelial cells in this cancer type. Additionally, there seems to be an unusually large proportion of NK and T cells - was there selection for this (given that these tumors are largely not immune infiltrated)?

      Author Response: Except for the Neuroendocrine cells, there are six TME cell types that we consistently find in multiple SiNET samples: macrophages, T cells, B/plasma cells, fibroblasts, endothelial and epithelial cells. Each of these cell types are identified as discrete clusters in analysis of the respective tumors (as shown in Fig. 1a,b and Fig. S1), and these are exactly the six most common non-malignant cell types that we and others found in single cell analysis across various other tumor types (e.g. see Gavish et al. 2023, ref. #15). The signatures used to annotate these cell types are shown in Table S2, and they primarily consist of classical markers that are traditionally used to define those cell types. We therefore believe that the annotation of these typical tumor-associated cell types is robust and does not include major uncertainties. In addition to these five common cell types, there are three cell types that we find only in 1-2 of the samples – epithelial cells, plasma cells and NK cells. Again, we believe that their annotation is robust, and these cell types are primarily not used for further analysis.

      There was no selection for any specific cell types in this study. Nevertheless, single cell (or single nuclei) analysis may lead to biases towards specific cell types that we cannot evaluate directly from the data. NK cells were detected in only one tumor. T cells were detected in eight of the ten samples, but in four of those samples the frequency of T cells was lower than 5%, and in only one sample was the frequency above 20%. Therefore, while we cannot exclude a technical bias towards a high frequency of T/NK cells, we do not consider these frequencies high enough to suggest this specific type of bias. In the revised manuscript, we clarify that the commonly observed cell types in SiNETs are the same as those commonly observed in other tumors and we acknowledge the possibility of a technical bias in cell type capture.

      d) Evaluating the expression of one gene at a time may not effectively demonstrate subtype-specific patterns, particularly when comparing NE cells from one tumor to non-NE cells from another, which may not be an appropriate approach for identifying differentially expressed genes. DE analysis coupled with concordance analysis, for example, could strengthen the results.

      We apologize, but we do not fully understand this comment. We note that the initial normalization by non-NE cells was done in order to decrease batch effects when combining the data of the two platforms. We also note that the two subtypes were identified by two distinct approaches, as shown in Fig. 2c and in Fig. 2f.

      (2) Results:

      See the above public review.

      (3) Minor Comments:

      a) Results: Single cell and single nuclei RNA-seq profiling of SiNETs

      The results say ten primary tumor samples from eight patients. Later in the paragraph it says, "After initial quality controls, we retained 29,198 cells from the ten patients." Please clarify to either ten samples or eight patients.

      Indeed these are ten samples rather than ten patients. We corrected that in the revised version and thank the reviewer for noticing our error.

      b) Methods:

      - Please specify which computational tools were used to perform quality control, signature scoring, etc.

      The approaches for quality control, scoring etc. are described in the methods. We implemented these approaches with R code and did not use other computational tools.

      - Minor point but be consistent with naming convention (ie, siAdeno vs SiAdeno) throughout the paper. For example, under "Sample Normalization, Filtering and annotations" change "siAdeno" to "SiAdeno."

      Thank you for noting this, we corrected that.

      - Add processing and analysis of MiNEN sample to the methods section. It is not mentioned in the methods at all.

      As noted in the revised manuscript, the MiNEN sample was analyzed in the same way as the SiNET fresh samples.

      c) Supplementary Figures:

      Figure S1: Change (A-H) to (A-I) to account for all panels in the figure.

      Figure S4: Add (C) after "the siAdeno sample" in the legend.

      Thank you for noting this, we corrected that.

      (4) Font size is quite small in the main figures.

      We enlarged the font in selected figure panels.

      Reviewer #2 (Recommendations for the authors):

      (1) The small number of samples used in some analyses affects the robustness of the findings. Increasing the sample size or including more validation data could improve the statistical reliability and make the results more convincing. The authors should consider expanding the cohort size or integrating additional external datasets to increase statistical power.

      We agree with the reviewer that adding more samples would improve the reliability of the results. However, the external data that we found was not comparable enough to enable integration with our data, and we are unable to profile additional SiNET samples in our lab. We hope that future studies would support our results and extend them further.

      (2) The biological significance of differentially expressed genes needs more depth, limiting the insights into SiNET biology. The authors should perform a comprehensive pathway enrichment analysis and integrate findings with existing literature. Tools like Gene Set Enrichment Analysis (GSEA) or Overrepresentation Analysis (ORA) could provide a more holistic view of altered biological processes.

      We thank the reviewer for this suggestion. We did examine the functional enrichment of differentially expressed genes and did not find additional enrichments that we felt were important to highlight beyond what we described. We report the genes in supplementary tables, enabling other researchers to examine these lists further. 

      (3) The unexpected finding of higher proliferation in non-malignant cells requires further investigation and plausible biological explanation. The authors should perform additional analyses to explore potential mechanisms, such as investigating cell cycle regulators or performing in vitro validation experiments. The authors should consider single-cell trajectory analysis to explore these highly proliferative non-malignant cells' potential differentiation or activation states.

      We agree that our results are descriptive and that we do not fully explain the mechanism for the high level of non-malignant cell proliferation. We did attempt to perform follow up computational analysis. These analyses raised the hypothesis that high levels of MIF are causing the proliferation of immune cells. Additional analyses that we performed were not sufficient to conclusively identify a mechanism, and we felt that they were not informative enough to be included in the manuscript. Further in vitro (or in vivo) studies are beyond the scope of the current work.

      (3) More details are required on methods used for p-value adjustment, and criteria for statistical significance should be clearly defined. Additionally, integrating scRNA-seq and snRNA-seq data needs a more thorough explanation, including batch effect mitigation and more explicit cell clustering representation. The authors should clearly describe p-value adjustments (e.g., FDR) and batch correction methods (e.g., Harmony, FastMNN integration) and include additional figures showing corrected UMAP plots or heatmaps post-batch correction to enhance the confidence in results.

      We now clarify in the Methods our use of FDR for p-value adjustments. As for batch correction, we have avoided the use of integration methods as we believe that they tend to distort the data and decrease tumor-specific signals. Instead, we primarily analyzed one tumor at a time and never directly compared cell profiles across distinct tumors but only compared the differences between subpopulations; specifically, we normalized the expression of NE cells by subtracting the expression of reference non-NE cells from the same tumor as a method to decrease batch effects. We now clarify this point in the Methods section.
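
The within-tumor normalization described in this response can be illustrated with a minimal sketch. It assumes log-scale expression values held as one dictionary per cell; the function name `normalize_to_reference` and the toy values are hypothetical and are not taken from the authors' R code.

```python
from statistics import mean

def normalize_to_reference(ne_cells, ref_cells):
    """Subtract the mean expression of same-tumor reference (non-NE) cells
    from each NE cell, gene by gene. Values are assumed to be log-scale."""
    genes = list(ne_cells[0].keys())
    ref_mean = {g: mean(cell[g] for cell in ref_cells) for g in genes}
    return [{g: cell[g] - ref_mean[g] for g in genes} for cell in ne_cells]

# Toy data for a single tumor (hypothetical log-expression values).
ne = [{"CHGA": 8.0, "ACTB": 10.0}, {"CHGA": 7.0, "ACTB": 11.0}]
non_ne = [{"CHGA": 1.0, "ACTB": 10.0}, {"CHGA": 2.0, "ACTB": 11.0}]

relative = normalize_to_reference(ne, non_ne)
print(relative)
```

Because each tumor is expressed relative to its own reference cells, a platform-wide shift that affects NE and non-NE cells equally cancels out before tumors are compared.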

      (4) The lack of analysis of interactions between different cell types limits understanding of tumor microenvironment dynamics. The authors should employ cell-cell interaction analysis tools (e.g., CellPhoneDB, NicheNet) to explore potential communication networks within the tumor microenvironment. This could provide valuable insights into how different cell types influence tumor progression and maintenance.

      We thank the reviewer for this suggestion. We have tried to use such methods but found the results difficult to interpret since these approaches generated very long lists of potential cell-cell interactions that are largely not unique to the SiNET context and their relevance remains unclear without follow up experiments, which are beyond the scope of this work. We therefore focused only on ligand/receptors that came up robustly through specific analyses such as the differences between SiNET subtypes. In particular, MIF is highly expressed in the epithelial subtype, and remarkably, MIF upregulation is shared across multiple cell types. Thus, the cell-cell interactions that are suggested by the SiNET data as somewhat unique to this context are those involving MIF and its receptor (CD74 on immune cell types), while other interactions detected by the proposed methods primarily reflect the generic ligand/receptors expressed by corresponding TME cell types.   

      Reviewer #3 (Recommendations for the authors):

      (1) For a relatively small dataset, the mixing of single-cell versus single-nucleus RNA-seq should be discussed more. It would be nice to have 1-2 tumors that are analyzed by both methods to compare and increase our understanding of how these different approaches may affect the results. This could be accomplished by splitting a fresh tumor into two parts, processing it fresh for single-cell RNA-seq, and freezing the other part for single-nucleus RNA-seq.

      We agree with the reviewer that the different techniques may bias our results and we refer to this limitation in the Results and Discussion sections. However, it is important to note that we do not directly integrate the primary data across these modalities, but rather analyze each tumor separately and only combine the results across tumors. For example, we first compare the NE cells from each tumor to control non-NE cells from the same tumor and then only compare the sets of NE-specific genes across tumors. Moreover, the subtypes that we detect cannot be explained by these modalities, as the first subtype contains samples from both methods and these subtypes are further demonstrated in external bulk data. Similarly, the results regarding low proliferation of NE cells and high proliferation of B/plasma cells are observed across both modalities. We therefore argue that, while the combination of methods is a limitation of this work, it does not account for the main results.

      (2) The authors state that they defined the siNET transcriptomic signature by comparing their siNET single-cell/nucleus data to other NETs profiled by bulk RNA-seq. Some of the genes in the signature, such as CHGA, are widely used as markers for NETs (and not specific for siNET). The authors should address this in more detail.

      To define the SiNET transcriptomic signature we first analyzed each tumor separately and compared the expression of Neuroendocrine (NE) cells to that of non-NE cells to detect NE-specific genes. Next, we compared the lists of NE-specific genes across the 8 SiNET patients and found a subset of 26 genes which were shared across most of the analyzed SiNET samples (Fig. 2a). Thus, the signature was defined only from analysis of SiNETs and not based on comparison to other types of NETs and hence it is expected that the signature could contain both SiNET-specific genes and more generic NET genes such as CHGA.

      Only after defining this signature, we went on to compare it between SiNETs and other types of NETs (pancreatic and rectal) based on external bulk RNA-seq data. In this comparison, we observed that the signature was clearly higher in SiNETs than in the other NETs (Fig. 2b). This result supports the accuracy of the signature and further suggests that it contains a fraction of SiNET-specific genes and not only generic NET genes such as CHGA. Thus, we would expect this signature to perform well also for distinguishing between SiNETs and other types of NETs, but it does contain a subset of genes that would also be high in the other NETs. Finally, we note that even though CHGA is a generic NET marker, the bulk RNA-seq data would suggest that, at least at the mRNA level, this gene is still more highly expressed in SiNETs than in other NETs. To avoid confusion regarding the definition and specificity of the SiNET transcriptomic signature we have extended the description of this section in the revised manuscript.
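
The two-step signature definition in this response (NE-specific genes per tumor, then genes shared across most tumors) can be sketched as follows. This is an illustration under stated assumptions: "most" is taken to mean more than half of the tumors, the function name `shared_signature` is hypothetical, and apart from CHGA the gene names are placeholders.

```python
from collections import Counter

def shared_signature(ne_specific_by_tumor, min_fraction=0.5):
    """Return genes that are NE-specific in more than `min_fraction` of tumors.
    The 0.5 threshold is an assumption standing in for 'most samples'."""
    counts = Counter(
        gene for genes in ne_specific_by_tumor.values() for gene in set(genes)
    )
    n_tumors = len(ne_specific_by_tumor)
    return sorted(g for g, c in counts.items() if c / n_tumors > min_fraction)

# Hypothetical output of step 1: NE-specific genes detected in each tumor.
per_tumor = {
    "tumor_1": {"CHGA", "GENE_A", "GENE_B"},
    "tumor_2": {"CHGA", "GENE_A"},
    "tumor_3": {"CHGA", "GENE_C"},
}

signature = shared_signature(per_tumor)
print(signature)
```

Nothing in this procedure compares against other NET types, which is why a signature built this way can contain both tissue-specific genes and generic NET markers.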

      (3) The authors only compare their data to bulk transcriptomic data on NETs. While in some instances this makes sense given the bulk dataset has >80 tumors, they should at least cite and do some comparison to other published single-cell RNA-seq datasets of NETs (e.g., PMID: 37756410, 34671197). The former study listed has 3 siNETs, 4 pNETs, and 1 gNET. Do the epithelial-like and neuronal-like signatures show up in this dataset too?

      We examined these studies but concluded that their data was inadequate to identify the two SiNET subtypes. The latter study was of pNETs, while the former study had 3 SiNET samples but only from 2 patients, and furthermore it was enriching for immune cells with only very low amounts of NE cells. Therefore, we now cite this work in the discussion but cannot use it to extend the results from our work.

      (4) How did the authors statistically handle patients with more than one tumor sample (true for n = 2)? These tumor samples would not be truly independent.

      In both cases where we had two distinct samples of the same patient, only one sample had sufficient NE cells to be included in NE-related analysis and therefore the other samples (SiNET3 and SiNET6) were excluded from all analysis of NE differential expression and subtypes. These samples were only included in the initial analysis (Fig. 1) and in TME-related analysis (Fig. 3-4) in which there was no statistical analysis of differences between patients and hence no problem with the inclusion of 2 samples for the same patient. We clarified this issue in the revised version.

      (5) The association between siNET subtype and B/plasma cell proliferation is very interesting, as is the hypothesis regarding MIF signaling. It would be illuminating for the authors to perform cell-cell interaction analyses with methods such as CellChat in this context rather than just relying on DE. Spatial mapping would be helpful too and while this may be outside the scope of this study, it should at least be expounded upon in the Discussion section.

      Indeed, spatial transcriptomic analysis would add interesting insight to our data and to SiNET biology. Unfortunately, this is not within the scope of the current project but we note this interesting possibility in the Discussion. Regarding additional methods for cell-cell interactions, we have performed such analysis but found it not informative as it highlighted a large number of interactions that are not unique to SiNETs and are difficult to interpret, and therefore we do not include this in the revised version.

      (6) The authors note that in the mixed lung tumor, the NE component was more proliferative than that observed with siNETs. How does the proliferation compare to pNETs, gNETs, in other published studies? How about assessing the clonality of the SCC and LNET malignant cells with various genomic or combined genomic/transcriptomic methods?

      The percentage of proliferating NE cells in the mixed lung tumor was higher than 60%. This is extremely high, approximately four-fold higher than the average that we found in a pan-cancer analysis and higher than the average of any of the >20 cancer types that we analyzed (Gavish et al. 2023, ref. #15). This remarkably high proliferation serves as a control for the low proliferation that we found in SiNET NE cells.

      (7) In the Discussion on page 13, the authors write "Second, proliferation of NE cells may be inhibited by prior treatments with somatostatin analogues." How many patients were treated in this manner? This information should be made more explicit in the manuscript.

      Details on pretreatment with somatostatin analogues are provided in Table S1. All patients were pretreated with somatostatin analogues, with the possible exception of one patient (P8, SiNET10) for which we could not confidently obtain this information.

      (8) On page 5, "bone-fide" is misspelled.

      (9) On page 8, "exact identify" is misspelled.

      We thank the reviewer and have corrected the typos.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors provide a study among healthy individuals, general medical patients and patients receiving haematopoietic cell transplants (HCT) to study the gut microbiome through shotgun metagenomic sequencing of stool samples. The first two groups were sampled once, while the patients receiving HCT were sampled longitudinally. A range of metadata (including current and previous (up to 1 year before sampling) antibiotic use) was recorded for all sampled individuals. The authors then performed shotgun metagenomic sequencing (using the Illumina platform) and performed bioinformatic analyses on these data to determine the composition and diversity of the gut microbiota and the antibiotic resistance genes therein. The authors conclude, on the basis of these analyses, that some antibiotics had a large impact on gut microbiota diversity, and could select opportunistic pathogens and/or antibiotic resistance genes in the gut microbiota.

      Strengths:

      The major strength of this study is the considerable achievement of performing this observational study in a large cohort of individuals. Studies into the impact of antibiotic therapy on the gut microbiota are difficult to organise, perform and interpret, and this work follows state-of-the-art methodologies to achieve its goals. The authors have achieved their objectives, and the conclusions they draw on the impact of different antibiotics on the gut microbiota and its antibiotic resistance genes (the 'resistome', in short) are supported by the data presented in this work.

      Weaknesses:

      The weaknesses are the lack of information on the different resistance genes that have been identified and which could have been supplied as Supplementary Data.

      We have now supplied a list of individual resistance genes as supplementary data.

      In addition, no attempt is made to assess whether the identified resistance genes are associated with mobile genetic elements and/or (opportunistic) pathogens in the gut. While this is challenging with short-read data, alternative approaches like long-read metagenomics, Hi-C and/or culture-based profiling of bacterial communities could have been employed to further strengthen this work.

      We agree this is a limitation, and we now refer to this in the discussion. Unfortunately we did not have funding to perform additional profiling of the samples that would have provided more information about the genetic context of the AMR genes identified.

      Unfortunately, the authors have not attempted to perform corrections for multiple testing because many antibiotic exposures were correlated.

      The reviewer is correct that we did not perform formal correction for multiple testing. This was because correlation between antimicrobial exposures meant we could not determine what correction would be appropriate and not overly conservative. We now describe this more clearly in the statistical analysis section.

      Impact:

      The work may impact policies on the use of antibiotics, as those drugs that have major impacts on the diversity of the gut microbiota and select for antibiotic resistance genes in the gut are better avoided. However, the primary rationale for antibiotic therapy will remain the clinical effectiveness of antimicrobial drugs, and the impact on the gut microbiota and resistome will be secondary to these considerations.

      We agree that the primary consideration guiding antimicrobial therapy will usually be clinical effectiveness. However antimicrobial stewardship to minimise microbiome disruption and AMR selection is an increasingly important consideration, particularly as choices can often be made between different antibiotics that are likely to be equally clinically effective.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript by Peto et al., the authors describe the impact of different antimicrobials on gut microbiota in a prospective observational study of 225 participants (healthy volunteers, inpatients and outpatients). Both cross-sectional data (all participants) and longitudinal data (a subset of 79 haematopoietic cell transplant patients) were used. Using metagenomic sequencing, they estimated the impact of antibiotic exposure on gut microbiota composition and resistance genes. In their models, the authors aim to correct for potential confounders (e.g. demographics, non-antimicrobial exposures and physiological abnormalities), and for differences in the recency and total duration of antibiotic exposure. I consider these comprehensive models an important strength of this observational study. Yet, the underlying assumptions of such models may have impacted the study findings (detailed below). Other strengths include the presence of both cross-sectional and longitudinal exposure data and the presence of both healthy volunteers and patients. Together, these observational findings expand on previous studies (both observational and RCTs) describing the impact of antimicrobials on gut microbiota.

      Weaknesses:

      (1) The main weaknesses result from the observational design. This hampers causal interpretation and makes correction for potential confounding necessary. The authors have used comprehensive models to correct for potential confounders and for differences between participants in duration of antibiotic exposure and time between exposure and sample collection. I wonder if some of the choices made by the authors did affect these findings. For example, the authors did not include travel in the final model, but travel (most importantly, south Asia) may result in the acquisition of AMR genes [Worby et al., Lancet Microbe 2023; PMID 37716364]. Moreover, non-antimicrobial drugs (such as proton pump inhibitors) were not included but these have a well-known impact on gut microbiota and might be linked with exposure to antimicrobial drugs. Residual confounding may underlie some of the unexplained discrepancies between the cross-sectional and longitudinal data (e.g. for vancomycin).

      We agree that the observational design means there is the potential for confounding, which, as the reviewer notes, we attempt to account for as far as possible in the multivariable models presented. We cannot exclude the possibility of residual confounding, and we highlight this as a limitation in the discussion. We have expanded on this limitation, and mention it as a possible explanation for inconsistencies between longitudinal and cross sectional models. Conducting randomised trials to assess the impacts of multiple antimicrobials in sick, hospitalised patients would be exceptionally difficult, and so it is hard to avoid reliance on observational data in these settings.

      We did record participants’ foreign travel and diet, but these exposures were not included in our models as they were not independently associated with an impact on the microbiome and their inclusion did not materially affect other estimates. However, because most participants were recruited from a healthcare setting, few had recent foreign travel and so this study was not well powered to assess the effects of travel on AMR carriage. We have added this as a limitation.

      In addition, the authors found a disruption half-life of 6 days to be the best fit based on Shannon diversity. If I'm understanding correctly, this results in a near-zero modelled exposure of a 14-day-course after 70 days (purple line; Supplementary Figure 2). However, it has been described that microbiota composition and resistome (not Shannon diversity!) remain altered for longer periods of time after (certain) antibiotic exposures (e.g. Anthony et al., Cell Reports 2022; PMID 35417701). The authors did not assess whether extending the disruption half-life would alter their conclusions.

      The reviewer is correct that the best fit disruption half-life of 6 days means the model assumes near-zero exposure by 70 days. We appreciate that antimicrobials can cause longer-term disruption than is represented in our model, and we refer to this in the discussion (we had cited two papers supporting this, and we are grateful for the additional reference above, which we have added). We agree that it is useful to clarify that the longer term effects may be seen in individual components of the microbiome or AMR genes, but not in overall measures of diversity, so have added this to the discussion.
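
To make the "near-zero exposure by 70 days" point concrete, here is a small arithmetic sketch. It assumes simple exponential decay of the modelled disruption with time since exposure ended; the function name `remaining_effect` is hypothetical, and the exposure model in the manuscript may be parameterised differently.

```python
def remaining_effect(days_since_exposure, half_life_days=6.0):
    """Fraction of the modelled disruption remaining, assuming simple
    exponential decay with the given half-life (6 days is the best fit
    reported in the response)."""
    return 0.5 ** (days_since_exposure / half_life_days)

for days in (0, 6, 14, 70):
    print(f"day {days:>2}: {remaining_effect(days):.5f}")

# Hypothetical comparison: a longer half-life leaves far more residual effect.
slow = remaining_effect(70, half_life_days=30.0)
```

Under this assumption, 70 days corresponds to more than eleven half-lives, so less than a thousandth of the initial effect remains, whereas a longer assumed half-life would retain a substantial fraction over the same period.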

      (2) Another consequence of the observational design of this study is the relatively small number of participants available for some comparisons (e.g. oral clindamycin was only used by 6 participants). Care should be taken when drawing any conclusions from such small numbers.

      We agree. Although our participants received a large number of different antimicrobial exposures, these were dependent on routine clinical practice at our centre and we lack data on many potentially important exposures. We had mentioned this in relation to antimicrobials not used at our centre, and have now clarified in the discussion that this also limits reliability of estimates for antimicrobials that were rarely used in study participants.

      (3) The authors assessed log-transformed relative abundances of specific bacteria after subsampling to 3.5 million reads. While I agree that some kind of data transformation is probably preferable, these methods do not address the compositional nature of microbiome data, and using a pseudocount (10<sup>-6</sup>) is necessary for absent (i.e. undetected) taxa [Gloor et al., Front Microbiol 2017; PMID 29187837]. Given the centrality of these relative abundances to their conclusions, a sensitivity analysis using compositionally-aware methods (such as a centred log-ratio (clr) transformation) would have added robustness to their findings.
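
For readers unfamiliar with the centred log-ratio transformation mentioned in this comment, a minimal sketch follows. It uses the standard definition (log of each part minus the mean log across parts); replacing zeros with a pseudocount of 1e-6 is an assumption made here for illustration, and the function name `clr` and the toy sample are hypothetical.

```python
import math

def clr(abundances, pseudocount=1e-6):
    """Centred log-ratio: log of each part minus the mean log of all parts.
    Zeros are replaced by `pseudocount` so the logarithm is defined."""
    logs = [math.log(max(a, pseudocount)) for a in abundances]
    centre = sum(logs) / len(logs)
    return [value - centre for value in logs]

# Toy relative abundances for four taxa; the last taxon is undetected.
sample = [0.6, 0.3, 0.1, 0.0]
transformed = clr(sample)
print(transformed)
```

The transformed values sum to zero by construction, which is what makes each value interpretable relative to the sample's geometric mean rather than to a fixed total.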

      We agree that using a pseudocount is necessary for undetected taxa, which we have done assuming undetected taxa had an abundance of 10<sup>-6</sup> (based on the lower limit of detection at the depth we sequenced). We refer to this as truncation in the methods section, but for clarity we have now also described this as a pseudocount. Because our analysis focusses on major taxa that are almost ubiquitous in the human gut microbiome, a pseudocount was only used for 3 samples that had no detectable Enterobacteriaceae.

      We are aware that compositionally-aware methods are often used with microbiome data, and for some analyses these are necessary to avoid introducing spurious correlations. However the flaws in non-compositional analyses outlined in Gloor et al do not affect the analyses in this paper:

      (1) The problems related to differing sequence depths or inadequate normalisation do not apply to our dataset, as we took a random subset of 3.5 million reads from all samples (Gloor et al correctly point out that this method has the drawback of losing some information, but it avoids problems related to variable sequencing depth)

      (2) The remainder of Gloor et al critiques multivariate analyses that assess correlations between multiple microbiome measurements made on the same sample, starting with a dissimilarity matrix. With compositional data these can lead to spurious correlations, as measurements on an individual sample are not independent of other measurements made on the same sample. In contrast, our analyses do not use a dissimilarity matrix, but evaluate the association of multiple non-microbiome covariates (e.g. antibiotic exposures, age) with single microbiome measures. We use a separate model for each of 11 specified microbiome components, and display these results side by side. This does not lead to the same problem of spurious correlation as analyses of dissimilarity matrices. However, it does mean that estimates of effects on each taxon have to be interpreted in the context of estimates on the other taxa. Specifically, in our models, the associations of antimicrobial exposure with different taxa/AMR genes are not necessarily independent of each other (e.g. if an antimicrobial eradicated only one taxon then it would be associated with an increase in others). This is not a spurious correlation, and makes intuitive sense when using relative abundance as outcome. However, we agree this should be made more explicit.

      For these reasons, at this stage we would prefer not to increase the complexity of the manuscript by adding a sensitivity analysis.
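
The dependence between taxa noted in point (2) above can be shown with a toy example. The counts and taxon names are hypothetical; the sketch only illustrates that removing one taxon raises the relative abundance of the others even though their absolute counts are unchanged.

```python
def relative_abundance(counts):
    """Convert absolute counts per taxon into proportions of the total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical absolute counts before and after an antimicrobial
# that eradicates taxon_A and leaves the other taxa untouched.
before = {"taxon_A": 500, "taxon_B": 300, "taxon_C": 200}
after = dict(before, taxon_A=0)

rel_before = relative_abundance(before)
rel_after = relative_abundance(after)

for taxon in before:
    print(taxon, rel_before[taxon], "->", rel_after[taxon])
```

Here taxon_B moves from 0.3 to 0.6 of the community without any change in its own count, which is why per-taxon estimates on a relative scale need to be read alongside one another.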

      (4) An overall description of gut microbiota composition and resistome of the included participants is missing. This makes it difficult to compare the current study population to other studies. In addition, for correct interpretation of the findings, it would have been helpful if the reasons for hospital visits of the general medical patients were provided.

      We have added a summary of microbiome and resistome composition in the results section (and new supplementary table 2), and we also now include microbiome and resistome profiles of all samples in the supplementary data. We also provide some more detail about the types of general medical patients included. We are not able to provide a breakdown of the initial reason for admission as this was not collected.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Provide a supplementary table with information on the abundance of individual genes in the samples.

      This supplementary data is now included.

      (2) Engage with an expert in statistics to discuss how statistical analyses can be improved.

      An experienced biostatistician has been involved in this study since its conception, and was involved in planning the analysis and in the responses to these comments.

      (3) Typos and other minor corrections:

      Methods: it is my understanding that litre should be abbreviated with a lowercase l.

      Different journals have different house styles: we are happy to follow Editorial guidance.

      p. 9: abuindance should be corrected to abundance.

      Corrected

      p. 9: relative species should be relevant species?  

      Yes, corrected. Thank you.

      p. 9 - 10: can the apparent lack of effect of beta-lactams on beta-lactamase gene abundance be explained by the focus on a small number of beta-lactamase resistance genes that are found in Enterobacteriaceae and which are not particularly prevalent, while other classes of resistance genes (e.g. Bacteroidal beta-lactamases) were excluded?

      It is possible that including other beta-lactamases would have led to different results, but as a small number of beta-lactamases in Enterobacteriaceae are of major clinical importance we decided to focus on these (already justified in the Methods). A full list of AMR genes identified is now provided in the supplementary data.

      p. 10: beta-lactamse should be beta-lactamase

      Corrected

      Figure 3A: could the data shown for tetracycline resistance genes be skewed by tetQ, which is probably one of the most abundant resistance genes in the human gut and acts through ribosome protection?

      TetQ was included, but only accounted for 23% of reads assigned to tetracycline resistance genes so is unlikely to have skewed the overall result. We limited the analysis to a few major categories of AMR genes and, other than VanA, have avoided presenting results for single genes to limit the degree of multiple testing. We now include the resistome profile for each sample in the supplementary data so that readers can explore the data if desired.

      Reviewer #2 (Recommendations For The Authors):

      (1) Given the importance of obligate anaerobic gut microbiota for human health, it might be interesting to divide antibiotics into categories based on their anti-anaerobic activity and assess whether these antibiotics differ in their effects on gut microbiota.

      The large majority of antibiotics used in clinical practice have activity against aerobic bacteria and anaerobic bacteria, so it is not possible to easily categorise them this way. There are two main exceptions (metronidazole and aminoglycosides) but there was insufficient use of these drugs to clearly detect or rule out a difference between them, even when categorising antimicrobials by class, so we prefer not to frame the results in these terms. Also see our comments on this categorisation below.

      (2) For estimating the abundance of anaerobic bacteria, three major groups were assessed: Bacteroidetes, Actinobacteria and Clostridia. To me, this seems a bit aspecific. For example, the phylum Bacteroidetes contains some aerobic bacteria (e.g. Flavobacteriia). Would it be possible to provide a more accurate estimation of anaerobic bacteria?

      We think that an emphasis on a binary aerobic/anaerobic classification is less biologically meaningful than the more granular genetic classification we use, and its use largely reflects the previous reliance on culture-based methods for bacterial identification. Although some important opportunistic human pathogens are aerobic, it is not clear that the benefit or harm of most gut commensals relates to their oxygen tolerance, and all luminal bacteria exist in an anaerobic environment. As such we prefer not to perform an additional analysis using this category. We are also not sure that this could be done reliably, as many of the taxa are characterised poorly, or not at all.

      We appreciate that Bacteroidetes, Actinobacteria and Clostridia are diverse taxa that include many different species, so may seem non-specific, but these were chosen because:

      i) they are non-overlapping with Enterobacteriaceae and Enterococcus, the major opportunistic pathogens of clinical relevance, so could be used in parallel, and

      ii) they make up the large majority of the gut microbiome in most people and most species are of low pathogenicity, so it is plausible that their disruption might drive colonisation with more pathogenic organisms (or those carrying important AMR genes).

      We have more clearly stated this rationale.

      (3) A statement on the availability of data and code for analysis is missing. I would highly recommend public sharing of raw sequence data and R code for analysis. If possible, it would be very valuable if processed microbiome data and patient metadata could be shared.

      We agree, and these have been submitted as supplementary data. We have added the following statement “The data and code used to produce this manuscript are available in the supplementary material, including processed microbiome data, and pseudonymised patient metadata. The sequence data for this study have been deposited in the European Nucleotide Archive (ENA) at EMBL-EBI under accession number PRJEB86785.”

    1. In case of sales of tangible products, Import/Export code is required. If you have past international transaction experience, the following documents are accepted: Bank statement for inward remittance. Settlement record from the current payment partn

      None of this is required

  3. Jun 2025
    1. mpound/cid/4485,4499,5026,5734,8082" pugoper = "property/HBondDonorCount,HBondDonorCount,XLogP,TPSA" pugout = "csv"

      Test to see if you can annotate code in a jupyter book

    1. Reviewer #3 (Public review):

      A bias in how people infer the amount of control they have over their environment is widely believed to be a key component of several mental illnesses including depression, anxiety, and addiction. Accordingly, this bias has been a major focus in computational models of those disorders. However, all of these models treat control as a unidimensional property, roughly, how strongly outcomes depend on action. This paper proposes---correctly, I think---that the intuitive notion of "control" captures multiple dimensions in the relationship between action and outcome. In particular, the authors identify one key dimension: the degree to which outcome depends on how much *effort* we exert, calling this dimension the "elasticity of control". They additionally argue that this dimension (rather than the more holistic notion of controllability) may be specifically impaired in certain types of psychopathology. This idea has the potential to change how we think about several major mental disorders in a substantial way and can additionally help us better understand how healthy people navigate challenging decision-making problems. More concisely, it is a very good idea.

      Unfortunately, my view is that neither the theoretical nor empirical aspects of the paper really deliver on that promise. In particular, most (perhaps all) of the interesting claims in the paper have weak empirical support.

      Starting with theory, the authors do not provide a strong formal characterization of the proposed notion of elasticity. There are existing, highly general models of controllability (e.g., Huys & Dayan, 2009; Ligneul, 2021) and the elasticity idea could naturally be embedded within one of these frameworks. The authors gesture at this in the introduction; however, this formalization is not reflected in the implemented model, which is highly task-specific. Moreover, the authors present elasticity as if it is somehow "outside of" the more general notion of controllability. However, effort and investment are just specific dimensions of action; and resources like money, strength, and skill (the "highly trained birke") are just specific dimensions of state. Accordingly, the notion of elasticity is necessarily implicitly captured by the standard model. Personally, I am compelled by the idea that effort and resource (and therefore elasticity) are particularly important dimensions, ones that people are uniquely tuned to. However, by framing elasticity as a property that is different in kind from controllability (rather than just a dimension of controllability), the authors only make it more difficult to integrate this exciting idea into generalizable models.

      Turning to experiment, the authors make two key claims: (1) people infer the elasticity of control, and (2) individual differences in how people make this inference are importantly related to psychopathology.

      Starting with claim 1, there are three subclaims here; implicitly, the authors make all three. (1A) People's behavior is sensitive to differences in elasticity, (1B) people actually represent/track something like elasticity, and (1C) people do so naturally as they go about their daily lives. The results clearly support 1A. However, 1B and 1C are not strongly supported.

      (1B) The experiment cannot support the claim that people represent or track elasticity because effort is the only dimension over which participants can engage in any meaningful decision-making. The other dimension, selecting which destination to visit, simply amounts to selecting the location where you were just told the treasure lies. Thus, any adaptive behavior will necessarily come out in a sensitivity to how outcomes depend on effort.

      Notes on rebuttal: The argument that vehicle/destination choice is not trivial because people occasionally didn't choose the instructed location is not compelling to me; if anything, the exclusion rate is unusually low for online studies. The finding that people learn more from non-random outcomes is helpful, but this could easily be cast as standard model-based learning very much like what one measures with the Daw two-step task (nothing specific to control here). Their final argument is the strongest, that to explain behavior the model must assume "a priori that increased effort could enhance control." However, more literally, the necessary assumption is that each attempt increases the probability of success (e.g. you're more likely to get a heads in two flips than one). I suppose you can call that "elasticity inference", but I would call it basic probabilistic reasoning.

      For 1C, the claim that people infer elasticity outside of the experimental task cannot be supported because the authors explicitly tell people about the two notions of control as part of the training phase: "To reinforce participants' understanding of how elasticity and controllability were manifested in each planet, [participants] were informed of the planet type they had visited after every 15 trips." (line 384).

      Notes on rebuttal: The authors try to retreat, saying "our research question was whether people can distinguish between elastic and inelastic controllability." I struggle to reconcile this with the claim in the abstract "These findings establish the elasticity of control as a distinct cognitive construct guiding adaptive behavior". That claim is the interesting one, and the one I am evaluating the evidence in light of.

      Finally, I turn to claim 2, that individual differences in how people infer elasticity are importantly related to psychopathology. There is much to say about the decision to treat psychopathology as a unidimensional construct (the authors claim otherwise, but see Fig 6C). However, I will keep it concrete and simply note that CCA (by design) obscures the relationship between any two variables. Thus, as suggestive as Figure 6B is, we cannot conclude that there is a strong relationship between Sense of Agency (SOA) and the elasticity bias---this result is consistent with any possible relationship (even a negative one). As it turns out, Figure S3 shows that there is effectively no relationship (r=0.03).

      Notes on rebuttal: The authors argue for CCA by appeal to the need to "account for the substantial variance that is typically shared among different forms of psychopathology". I agree. A simple correlation would indeed be fairly weak evidence. Strong evidence would show a significant correlation after *controlling for* other factors (e.g. a regression predicting elasticity bias from all subscales simultaneously). CCA effectively does the opposite, asking whether, with the help of all the parameters and all the surveys, one can find any correlation between the two sets of variables. The results are certainly suggestive, but they provide very little statistical evidence that the elasticity parameter is meaningfully related to any particular dimension of psychopathology.

      There is also a feature of the task that limits our ability to draw strong conclusions about individual differences in elasticity inference. In the original submission, the authors stated that the study was designed to be "especially sensitive to overestimation of elasticity". A straightforward consequence of this is that the resulting *empirical* estimate of estimation bias (i.e., the gamma_elasticity parameter) is itself biased. This immediately undermines any claim that references the directionality of the elasticity bias (e.g. in the abstract). Concretely, an undirected deficit such as slower learning of elasticity would appear as a directed overestimation bias.

      When we further consider that elasticity inference is the only meaningful learning/decision-making problem in the task (argued above), the situation becomes much worse. Many general deficits in learning or decision-making would be captured by the elasticity bias parameter. Thus, a conservative interpretation of the results is simply that psychopathology is associated with impaired learning and decision-making.

      Notes on rebuttal: I am very concerned to see that the authors removed the discussion of this limitation in response to my first review. I quote the original explanation here:

      - In interpreting the present findings, it needs to be noted that we designed our task to be especially sensitive to overestimation of elasticity. We did so by giving participants free 3 tickets at their initial visits to each planet, which meant that upon success with 3 tickets, people who overestimate elasticity were more likely to continue purchasing extra tickets unnecessarily. Following the same logic, had we first had participants experience 1 ticket trips, this could have increased the sensitivity of our task to underestimation of elasticity in elastic environments. Such underestimation could potentially relate to a distinct psychopathological profile that more heavily loads on depressive symptoms. Thus, by altering the initial exposure, future studies could disambiguate the dissociable contributions of overestimating versus underestimating elasticity to different forms of psychopathology.

      The logic of this paragraph makes perfect sense to me. If you assume low elasticity, you will infer that you could catch the train with just one ticket. However, when elasticity is in fact high, you would find that you don't catch the train, leading you to quickly infer high elasticity, eliminating the bias. In contrast, if you assume high elasticity, you will continue purchasing three tickets and will never have the opportunity to learn that you could be purchasing only one, so the bias remains.

      The authors attempt to argue that this isn't happening using parameter recovery. However, they only report the *correlation* in the parameter, whereas the critical measure is the *bias*. Furthermore, in parameter recovery, the data-generating and data-fitting models are identical; this will yield the best possible recovery results. Although finding no bias in this setting would support the claims, it cannot outweigh the logical argument for the bias that they originally laid out. Finally, parameter recovery should be performed across the full range of plausible parameter values; using fitted parameters (a detail I could only determine by reading the code) yields biased results because the fitted parameters are themselves subject to the bias (if present). That is, if true low elasticity is inferred as high elasticity, then you will not have any examples of low elasticity in the fitted parameters and will not detect the inability to recover them.
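      The distinction between correlation and bias in parameter recovery can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' model: the "estimator" below is a hypothetical stand-in that simply adds a constant offset of 0.3 plus noise to each true value, and the parameter range of -1 to 1 is arbitrary.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Sample true values across the full (assumed) plausible range.
true_values = [rng.uniform(-1.0, 1.0) for _ in range(500)]

# Hypothetical estimator: systematically overestimates by 0.3, plus noise.
recovered = [t + 0.3 + rng.gauss(0.0, 0.1) for t in true_values]

r = pearson(true_values, recovered)
bias = sum(rec - t for rec, t in zip(recovered, true_values)) / len(true_values)

print(f"recovery correlation: {r:.3f}")
print(f"mean bias (recovered - true): {bias:.3f}")
```

      Under these assumptions the correlation is close to 1 while the mean bias stays near the injected offset, so a high recovery correlation on its own says nothing about whether the estimate is directionally biased.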

      Minor comments:

      Below are things to keep in mind.

      The statistical structure of the task is inconsistent with the framing. In the framing, participants can make either one or two second boarding attempts (jumps) by purchasing extra tickets. The additional attempt(s) will thus succeed with probability p for one ticket and 2p - p^2 for two tickets; the p^2 captures the fact that you only take the second attempt if you fail on the first. A consequence of this is buying more tickets has diminishing returns. In contrast, in the task, participants always jumped twice after purchasing two tickets, and the probability of success with two tickets was exactly double that with one ticket. Thus, if participants are applying an intuitive causal model to the task, they will appear to "underestimate" the elasticity of control. I don't think this seriously jeopardizes the key results, but any follow-up work should ensure that the task's structure is consistent with the intuitive causal model.
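      The gap between the two success structures can be checked numerically. The sketch below assumes a hypothetical per-attempt success probability p = 0.3 (not a value from the paper); `framed_success` follows the sequential-attempt framing, and `task_success` follows the linear doubling described for the implemented task.

```python
def framed_success(p, extra_tickets):
    """Sequential attempts: a later attempt only happens after a failure,
    so success probability is 1 - (1 - p)^n, i.e. 2p - p^2 for n = 2."""
    return 1 - (1 - p) ** extra_tickets

def task_success(p, extra_tickets):
    """Implemented task as described above: two tickets give exactly
    double the one-ticket probability (valid here for 1 or 2 tickets)."""
    return extra_tickets * p

p = 0.3  # hypothetical per-attempt probability, for illustration only
for n in (1, 2):
    print(n, round(framed_success(p, n), 4), round(task_success(p, n), 4))

# Marginal gain from the second ticket under each structure.
framed_gain = framed_success(p, 2) - framed_success(p, 1)
task_gain = task_success(p, 2) - task_success(p, 1)
print(round(framed_gain, 4), round(task_gain, 4))
```

      With these numbers the second ticket adds p(1 - p) under the framing but a full p in the task, which is the sense in which a participant applying the intuitive causal model would appear to underestimate elasticity.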

      The model is heuristically defined and does not reflect Bayesian updating. For example, it over-estimates maximum control by not using losses with less than 3 tickets (intuitively, the inference here depends on your beliefs about elasticity). Including forced three-ticket trials at the beginning of each round makes this less of an issue; but if you want to remove those trials, you might need to adjust the model. The need to introduce the modified model with kappa is likely another symptom of the heuristic nature of the model updating equations.

    1. LATE was estimated using a model implemented in rstan with non-informative Jeffreys priors. A detailed description of the estimand and estimator is provided in the methodological supplement. The rstan code is available in the article notebook available online.

      This is almost identical to the section above on using rstan with n.i. Jeffreys priors. Consider revising.

    1. proficient

      proficient

      English Explanation

      The term "proficient" is an adjective that describes a person's level of skill or competence in a particular area or activity. Being proficient means that an individual has a high degree of ability and is capable of performing tasks effectively and efficiently. This term is often used in various contexts, such as language skills (e.g., someone might be proficient in English or Mandarin), sports, technical skills, artistic abilities, and more.

      In evaluating proficiency, one can consider factors such as knowledge, experience, and performance. For instance, a proficient musician can play their instrument well and understand music theory, while a proficient programmer can write and debug code with expertise.

      In summary, being proficient indicates a significant level of skill and effectiveness in a particular discipline or task.

      Chinese Explanation

      “proficient”是一个形容词,用于描述一个人在特定领域或活动中的技能或能力水平。熟练意味着个人具备较高的能力,能够有效且高效地完成任务。这个术语通常应用于各种上下文中,例如语言技能(例如,有人可能精通英语或普通话)、体育、技术技能、艺术才能等。

      在评估熟练程度时,通常考虑知识、经验和表现等因素。例如,一个熟练的音乐家可以很好地演奏乐器,并理解音乐理论,而一个熟练的程序员能够熟练地编写和调试代码。

      总之,熟练表明在某一学科或任务中具有显著的技能和高效的表现。

    2. dot-dash flashing pattern

      dot-dash flashing pattern

      English Explanation:

      The phrase "dot-dash flashing pattern" refers to a specific sequence or arrangement of visual signals that alternate between short and long flashes. In this context:

      • Dot refers to a short flash, typically representing the shorter signal in the pattern.
      • Dash refers to a longer flash, representing the extended signal.
      • Flashing Pattern implies that these dots and dashes are organized in a particular order to convey information.

      This type of pattern is commonly used in various signaling systems, including Morse code, where a dot represents a 'short' signal (a single quick flash) and a dash represents a 'long' signal (a single longer flash). Such patterns can be utilized in different communication methods, like visual signals (such as lights) or acoustic signals (like beeps).

      Chinese Explanation (中文解释):

      “点划闪烁模式”是指一种特定的视觉信号序列或排列,这些信号在短闪和长闪之间交替。在这个上下文中:

      • 点(Dot) 是指一个短暂的闪烁,通常代表模式中的短信号。
      • 划(Dash) 是指一个较长的闪烁,代表扩展信号。
      • 闪烁模式(Flashing Pattern) 意味着这些点和划以特定顺序排列,用以传递信息。

      这种类型的模式常用于各种信号系统,包括摩尔斯电码,其中点表示“短”信号(一个快速的闪烁),而划表示“长”信号(一次较长的闪烁)。这样的模式可以用于不同的通信方式,例如视觉信号(像灯光)或声学信号(像鸣叫声)。
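      As a small illustration of a dot-dash pattern, the sketch below encodes a few letters in International Morse code and converts each symbol to a flash length. The helper names are hypothetical; the timing uses the standard convention that a dash lasts three times as long as a dot, and only four letters are included to keep the example short.

```python
# A tiny subset of International Morse code (enough for the example).
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

DOT_UNITS = 1   # a dot is one time unit
DASH_UNITS = 3  # a dash is three time units

def to_pattern(text):
    """Encode text as dot-dash groups separated by spaces."""
    return " ".join(MORSE[ch] for ch in text.upper())

def flash_durations(code):
    """Map each dot or dash in a code string to its flash length in units."""
    return [DOT_UNITS if s == "." else DASH_UNITS for s in code if s in ".-"]

pattern = to_pattern("SOS")
print(pattern)
print(flash_durations(MORSE["O"]))
```

      Here each dash is one long flash rather than several flashes, and the order of dots and dashes within a group is what carries the information.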

    3. allele

      allele

      English Explanation

      An allele is a variant form of a gene that is found at a specific location on a chromosome. Genes, which are segments of DNA responsible for coding for proteins and determining traits, can exist in different versions called alleles. For example, if we consider a gene that influences flower color in plants, one allele might code for red flowers, while another might code for white flowers.

      Alleles can be classified as dominant or recessive. A dominant allele will express its trait even if there is only one copy present in the organism (heterozygous condition), whereas a recessive allele needs to be present in two copies (homozygous condition) for its trait to be expressed. In the flower color example, if red is a dominant allele and white is recessive, a plant with a red allele and a white allele will have red flowers.

      In summary, alleles contribute to the genetic variation observed in populations, and they play a key role in heredity — the process through which traits are passed from parents to offspring.

      Chinese Explanation

      等位基因是指位于染色体特定位置上的基因的变异形式。基因是负责编码蛋白质和决定性状的DNA片段,等位基因则是该基因的不同版本。例如,如果我们考虑一个影响植物花色的基因,一个等位基因可能编码红花,而另一个可能编码白花。

      等位基因可以被分类为显性或隐性。显性等位基因即使在生物体内只有一份拷贝(杂合状态)也会表达其特征,而隐性等位基因需要在两份拷贝(纯合状态)中才会表达其特征。在花色的例子中,如果红色是显性等位基因而白色是隐性等位基因,那么拥有红色和白色等位基因的植物会开红花。

      总之,等位基因为种群中观察到的遗传变异做出了贡献,并在遗传过程中发挥着关键作用——即性状从父母传递给后代的过程。

    4. allele

      allele

      English Explanation:

      An allele is a variant form of a gene that is located at a specific position on a specific chromosome. Genes, which are segments of DNA, are responsible for the hereditary traits of an organism, and alleles determine the variations of those traits. For example, in humans, the gene for eye color has multiple alleles, such as those that code for brown, blue, or green eyes.

      Alleles can be classified as:

      1. Dominant Alleles: These are expressed in the phenotype even if only one copy is present. For example, if the allele for brown eyes is dominant, a person with one brown eye allele and one blue eye allele will have brown eyes.

      2. Recessive Alleles: These require two copies to be expressed in the phenotype. Using the same example, a person will have blue eyes only if they carry two blue eye alleles and therefore no dominant brown eye allele.

      3. Homozygous: An individual with two identical alleles for a trait (e.g., two brown eye alleles).

      4. Heterozygous: An individual with two different alleles for a trait (e.g., one brown eye allele and one blue eye allele).

      Alleles play a crucial role in genetics and evolution as they contribute to the diversity of traits in populations and can affect the likelihood of certain traits being passed down to future generations.


      中文解释:

      等位基因是位于特定染色体上特定位置的基因变体。基因是DNA的片段,负责有机体的遗传特征,而等位基因决定了这些特征的变异。例如,在人类中,负责眼睛颜色的基因有多种等位基因,例如编码棕色、蓝色或绿色眼睛的基因。

      等位基因可以分类为:

      1. 显性等位基因: 这些等位基因即使只有一个副本存在,也会在表现型中表达出来。例如,如果棕色眼睛的等位基因是显性的,那么一个有一个棕色眼睛等位基因和一个蓝色眼睛等位基因的人会有棕色眼睛。

      2. 隐性等位基因: 这些等位基因需要两个副本才能在表现型中表达出来。以同样的例子,一个人只有携带两个蓝色眼睛等位基因(因而没有显性棕色眼睛等位基因)时才会有蓝色眼睛。

      3. 同型合子: 对于某一特征,一个个体有两个相同的等位基因(例如,两个棕色眼睛等位基因)。

      4. 异型合子: 对于某一特征,一个个体有两个不同的等位基因(例如,一个棕色眼睛等位基因和一个蓝色眼睛等位基因)。

      等位基因在遗传学和进化中扮演着重要角色,因为它们有助于种群特征的多样性,并可能影响特定特征被传递给后代的可能性。
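      The four terms above can be tied together in a short sketch. It assumes a deliberately simplified single-gene model with complete dominance and only two alleles; real human eye colour involves several genes, so the allele names here are purely illustrative and the function names are hypothetical.

```python
def zygosity(allele_1, allele_2):
    """Homozygous if both alleles are identical, otherwise heterozygous."""
    return "homozygous" if allele_1 == allele_2 else "heterozygous"

def phenotype(allele_1, allele_2, dominant):
    """Expressed trait under complete dominance with two possible alleles.

    One copy of the dominant allele is enough for it to be expressed;
    otherwise both copies are the recessive allele and it is expressed.
    """
    if dominant in (allele_1, allele_2):
        return dominant
    return allele_1

# Simplified model: "brown" is treated as dominant over "blue".
for pair in [("brown", "brown"), ("brown", "blue"), ("blue", "blue")]:
    print(pair, zygosity(*pair), phenotype(*pair, dominant="brown"))
```

      In this model only the homozygous recessive pair shows the recessive trait, which matches the dominant and recessive definitions given in the list.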

    5. nucleotides

      nucleotides

      English Explanation:

      Nucleotides are the basic building blocks of nucleic acids, which are essential biomolecules in all living organisms. They consist of three primary components:

      1. A Nitrogenous Base: This can be one of five bases—adenine (A), thymine (T), cytosine (C), guanine (G), or uracil (U). In DNA, the bases are A, T, C, and G, while in RNA, uracil replaces thymine.

      2. A Sugar Molecule: This is usually a five-carbon sugar called ribose in RNA or deoxyribose in DNA.

      3. A Phosphate Group: This is a phosphate molecule (PO₄) that can connect nucleotides together, creating a sugar-phosphate backbone that forms the structural framework of DNA and RNA.

      Nucleotides play several crucial roles in biological processes, such as:

      • Genetic Information: The nucleotide sequence forms the genetic code that determines the characteristics of an organism.
      • Energy Transfer: Adenosine triphosphate (ATP), a nucleotide, acts as the primary energy carrier in cells.
      • Signaling: Certain nucleotides function as signaling molecules in cellular processes.

      In summary, nucleotides are vital components that not only build nucleic acids (DNA and RNA) but also play essential roles in metabolism and cellular signaling.


      Chinese Explanation (中文解释):

      核苷酸是核酸的基本构建单元,而核酸则是所有生物体中必不可少的生物大分子。核苷酸主要由三部分组成:

      1. 含氮碱基(Nitrogenous Base): 这可以是五种碱基之一:腺嘌呤(A)、胸腺嘧啶(T)、胞嘧啶(C)、鸟嘌呤(G)或尿嘧啶(U)。在DNA中,这些碱基是A、T、C和G,而在RNA中,尿嘧啶替代胸腺嘧啶。

      2. 糖分子(Sugar Molecule): 通常是五碳糖,RNA中的糖是核糖,而DNA中的糖是脱氧核糖。

      3. 磷酸基团(Phosphate Group): 这是一个磷酸分子(PO₄),可以将核苷酸连接在一起,形成组成DNA和RNA结构框架的糖-磷酸骨架。

      核苷酸在生物过程中的几大重要作用包括:

      • 遗传信息(Genetic Information): 核苷酸的序列形成遗传密码,决定一个生物体的特征。
      • 能量转移(Energy Transfer): 三磷酸腺苷(ATP)是一种核苷酸,充当细胞中主要的能量载体。
      • 信号传递(Signaling): 某些核苷酸在细胞过程中充当信号分子。

      总而言之,核苷酸不仅是构建核酸(DNA和RNA)的重要成分,还在新陈代谢和细胞信号传递中发挥着重要作用。

    6. contravened

      "Contravened" is the past tense of the verb "contravene," which means to violate or go against a law, rule, or code of conduct. It can also mean to act in opposition to something. For example, if someone ignores traffic regulations, they may be said to have contravened those regulations. If you need information or examples related to this term, feel free to ask!

    1. D’s static reflection and code generation capabilities make it an ideal candidate to implement a codebase that needs to be called from several different languages and environments (e.g. Python, Excel, R, …). Traditionally this is done by specifying data structures and RPC calls in an Interface Definition Language (IDL) then translating that to the supported languages, with a wire protocol to go along with it. With D, none of that is necessary.
    1. And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outranged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.

      his trails do not fade

    1. Discussion

      Thank you for developing what appears to be a really useful set of tools! I appreciate how you've made the techniques accessible and reproducible. The 96-well format and automated analysis should make it relatively straightforward for other researchers to adopt these methods.

      I'm particularly grateful for your commitment to open science - making both your code and data freely available on GitHub is exactly what the research community needs more of. This transparency will help other labs build on your work and get the most value from your efforts.

      Thank you for the thorough work and for sharing it so openly with the community!

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer 1 (Public review):

      Weaknesses:

      While the data generally supports the authors' conclusions, a weakness of this manuscript lies in their analytical approach where EEG feature-space comparisons used the number of spontaneous or evoked seizures as their replicates as opposed to the number of IHK mice; these large data sets tend to identify relatively small effects of uncertain biological significance as being highly statistically significant. Furthermore, the clinical relevance of similarly small differences in EEG feature space measurements between seizure-naïve and epileptic mice is also uncertain.

      In this work, we used a linear mixed-effects model to address two levels of variability: between animals and within animals. The interactive linear mixed-effects model shows that most (~90%) of the variability in our data comes from within animals (Residual), the random effect that the model accounts for, rather than between animals. Since variability between animals is low, the model identifies common changes in seizure propagation across animals, while accounting for the variability in seizures within each animal. Therefore, the results we find are of changes that happen across animals, not of individual seizures. We made text edits to clarify the use of the linear mixed-effects model (page 6, second paragraph, and page 11, first paragraph).

      Finally, the multiple surgeries and long timetable to generate these mice may limit the value compared to existing models in drug-testing paradigms.

      Thank you for the suggestion. We added a discussion in the ‘Comparison to other seizure models…’ section on pages 15 and 16. In an existing model investigating spontaneous tonic-clonic seizures (such as the intra-amygdala kainate injection model), the time investment is back-loaded, requiring two to three weeks per condition while counting spontaneous seizures, which may occur only once a day. In contrast, our model requires a front-loaded time investment. Once the animals are set up, we can test multiple drugs within a few weeks, providing significant time savings. Additionally, we did not pre-screen animals in our study. Existing models often pre-select mice with high rates of spontaneous seizures, whereas in our model, seizures can be induced even in animals with few spontaneous seizures. We believe that bypassing the need for pre-screening also is a key advantage of our induced seizure model.  

      Reviewer 1 (Recommendations for the authors):

      (1) Address why the EEG data comparisons were performed between seizures and not between animals (as explicitly described in the public review). Further, a discussion of the biological significance (or lack thereof) of the effect size differences observed is warranted. This is especially concerning when the authors make the claim that spontaneous and induced seizures are essentially the same while their analysis shows all evaluated feature space parameters were significantly different in the initial 1/3 of the EEG waveforms.

      We made text edits to clarify the use of the linear mixed-effects model (page 6, second paragraph, and page 11, first paragraph).

      (2) The authors place great emphasis on the use of clinically/etiologically relevant epilepsy models in drug discovery research. There is discussion criticizing the time points required to enact kindling and the artificial nature of acute seizure induction methods. However, the combination IHK-opto seizure induction model also requires a lengthy timeline. A more tempered discussion of this novel model's strengths may benefit readers.

      Thank you for the suggestion. We added a discussion in the ‘Comparison to other seizure models…’ section on pages 15 and 16.

      (3) The authors should further emphasize the benefit of having an inducible seizure model of focal epilepsy since other mouse models (e.g., genetic or TBI models) may have superior etiological relevance (construct and face validity) but may not be amenable to their optogenetic stimulation approach.

      Thank you for the suggestion. We revised the manuscript to better emphasize the potential significance of our approach. We added a discussion in the 'Application of Models...' section on page 15, second paragraph. The on-demand seizure model can be applied to address biologically and clinically relevant questions beyond its utility in drug screening. For example, crossing the Thy1-ChR2 mouse line with genetic epilepsy models, such as Scn1a mutants, could reveal how optogenetic stimulation differentially induces seizures in mutant versus non-mutant mice, providing insights into seizure generation and propagation in Dravet syndrome. Due to the cellular specificity of optogenetics, we also envision this approach being used to study circuit-specific mechanisms of seizure generation and propagation.

      (4) Suggestion: Provide immunolabeled imagery demonstrating ChR2 presence in Thy1 cells.

      Thank you for the suggestion. We added a fluorescence image showing ChR2 expression in Fig. 2A.

      (5) It might be prudent to mention any potential effects of laser heat on hippocampal cell damage, although the 10 Hz, ~10 mW, and 6 s stim is unlikely to cause any substantial burns. Without knowing the diameter and material of the optic fiber, this is left up to some interpretation.

      Thank you for the comments. In the Methods section, we listed the optical fiber diameter as 400 microns (page 17, EEG and Fiber Implantation section). Using 5–18 mW laser power with a relatively large fiber diameter of 400 microns, the power density falls within the range of commonly employed channelrhodopsin activation conditions in vivo. That said, we would like to investigate potential heat effects or cell damage in a follow-up study.
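
      As a rough check on the figures quoted above, the irradiance at the fiber tip can be estimated by dividing the laser power by the cross-sectional area of the fiber core. The sketch below assumes that light exits uniformly across the full 400-micron core and ignores scattering and absorption in tissue. The function name `tip_irradiance_mw_per_mm2` is illustrative and does not come from the manuscript.

```python
import math


def tip_irradiance_mw_per_mm2(power_mw: float, core_diameter_um: float) -> float:
    """Estimate irradiance at the fiber tip in mW/mm^2.

    Assumption (not from the manuscript): light is emitted uniformly
    over the whole fiber core, with no losses at the tip.
    """
    if power_mw < 0 or core_diameter_um <= 0:
        raise ValueError("power must be >= 0 and core diameter must be > 0")
    radius_mm = core_diameter_um / 1000.0 / 2.0  # microns -> mm, diameter -> radius
    area_mm2 = math.pi * radius_mm ** 2          # cross-sectional area of the core
    return power_mw / area_mm2


if __name__ == "__main__":
    # Power range (5 to 18 mW) and core diameter (400 microns) quoted in the response.
    for power_mw in (5.0, 18.0):
        irradiance = tip_irradiance_mw_per_mm2(power_mw, 400.0)
        print(f"{power_mw:4.1f} mW -> {irradiance:6.1f} mW/mm^2 at the fiber tip")
```

      Under these assumptions, 5 mW and 18 mW correspond to roughly 40 and 143 mW/mm² at the tip. Irradiance drops with distance from the tip as light scatters and is absorbed in tissue, so these values are upper bounds for the tissue further away.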

      (6) There are instances in the manuscript where the authors describe experimental and analytical parameters vaguely (e.g. "Seizures were induced several times a day", "stimulation was performed every 1 - 3 hours over many days"). These descriptions can and should be more precise.

      Thank you for the comments. To enhance clarity, we added the stimulation protocol in a flowchart format in Fig. S2A, describing how we determined the threshold and proceeded to the drug test. Following this protocol, there was variability in the number of stimulations per day.

      (7) In the second to last paragraph of the discussion, the authors state "However, HPDs are not generalizable across species - they are specific to the mouse model (55)." This statement is inaccurate. The paper cited comes from Dr. Corinne Roucard's lab at Synapcell. In fact, Dr. Roucard argues the opposite (See Neurochem Res (2017) 42:1919-1925).

      Thank you for pointing out the mistake. On page 16, in the first paragraph, reference 55 (now 58 in the revised version) was intended to refer to 'quickly produce dose-response curves with high confidence.' In the revision, we cited another paper reporting that hippocampal spikes were not reproduced in the rat IHK model: R. Klee, C. Brandt, K. Töllner, W. Löscher, Various modifications of the intrahippocampal kainate model of mesial temporal lobe epilepsy in rats fail to resolve the marked rat-to-mouse differences in type and frequency of spontaneous seizures in this model. Epilepsy Behav. 68, 129–140 (2017).

      (8) In the discussion, Levetiracetam is highlighted as an ASM that would not be detected in acute induced seizure models; the authors point out its lack of effect in MES and PTZ. However, LEV is effective in the 6Hz test (also an acute-induced seizure model). This should be stated.

      Thank you for the comments. We highlighted the discussion on LEV in the 'Application of Model to Testing Multiple Classes of ASMs...' section on page 14.

      (9) The results text indicates that 9 epileptic mice were used to test LEV and DZP. However, the individual data points illustrated in Figure 5B show N=8 mice. Please correct.

      Thank you for the comments. A total of nine epileptic mice were used to assess two drugs, with the animals being re-used as indicated in the schematic. A total of eight assessments were conducted for DZP with six mice and eight assessments for LEV with five mice. Each assessment included hourly ChR2 activations without an ASM and hourly ChR2 activations after ASM injection.

      (10) Figure 4D: Naïve mice are labeled as solid blue circles in the legend while the data points are solid blue triangles. Please correct.

      Thank you. We corrected the marker in Fig. 4D.

      Reviewer 2 (Public Review):

      Weaknesses:

      (1) Although the figures provide excellent examples of individual electrographic seizures and compare induced seizures in epileptic and naïve animals, it is unclear which criteria were used to identify an actual seizure induced by the optogenetic stimulus, versus a hippocampal paroxysmal discharge (HPD), an "afterdischarge", an "electrophysiological epileptiform event" (EEE, Ref #36, D'Ambrosio et al., 2010 Epilepsy Currents), or a so-called "spike-wave-discharge" (SWD). Were HPDs or these other non-seizure events ever induced using stimulation in animals with IH-KA? A critical issue is that these other electrical events are not actual seizures, and it is unclear whether they were included in the column showing data on "electrographic afterdischarges" in Figure 5 for the studies on ASDs. This seems to be a problem in other areas of the paper, also.

      Thank you for pointing out the unclear definition of the seizures analyzed. We added sentences at the beginning of the Results section (page 3) to clarify the terminology we used. We analyzed animal behavior during evoked events, and a high percentage of induced electrographic events were accompanied by behavioral seizures with a Racine scale of three or above. We added Supplemental Figure S9, which shows behavioral seizure severity scores observed before and during ASM testing. We hope these changes address the reviewer’s concern and improve the clarity of the manuscript.

      (2) The differences between the optogenetically evoked seizures in IH-KA vs naïve mice are interpreted to be due to the "epileptogenesis" that had occurred, but the lesion from the KA-induced injury would be expected to cause differences in the electrically and behaviorally recorded seizures - even if epileptogenesis had not occurred. This is not adequately addressed.

      Thank you for the comments. IHK-injected mice had spontaneous tonic-clonic seizures before the start of optical stimulation, as shown in Figure S1.

      (3) The authors offer little mention of other research using animal models of TLE to screen ASDs, of which there are many published studies - many of them with other strengths and/or weaknesses. For example, although Grabenstatter and Dudek (2019, Epilepsia) used a version of the systemic KA model to obtain dose-response data on the effects of carbamazepine on spontaneous seizures, that work required use of KA-treated rats selected to have very high rates of spontaneous seizures, which requires careful and tedious selection of animals. The ETSP has published studies with an intra-amygdala kainic acid (IA-KA) model (West et al., 2022, Exp Neurol), where the authors claim that they can use spontaneous seizures to identify ASDs for DRE; however, their lack of a drug effect of carbamazepine may have been a false negative secondary to low seizure rates. The approach described in this paper may help with confounds caused by low or variable seizure rates. These types of issues should be discussed, along with others.

      We appreciate the reviewer’s insights. We added a discussion comparing our model with other existing models in the Discussion section (pages 15 and 16, 'Comparison to Other Seizure Models Used in Pharmacologic Screening' section). In an existing model investigating spontaneous tonic-clonic seizures (such as the intra-amygdala kainate injection model), the time investment is back-loaded, requiring two to three weeks per condition while counting spontaneous seizures, which may occur only once a day. In contrast, our model requires a front-loaded time investment. Once the animals are set up, we can test multiple drugs within a few weeks, providing significant time savings. Additionally, we did not pre-screen animals in our study. Existing models often pre-select mice with high rates of spontaneous seizures, whereas in our model, seizures can be induced even in animals with few spontaneous seizures. We believe that bypassing the need for pre-screening is a key advantage of our induced seizure model.

      (4) The outcome measure for testing LEV and DZP on seizures was essentially the fraction of unsuccessful or successful activations of seizures, where high ASD efficacy is based on showing that the optogenetic stimulation causes fewer seizures when the drug is present. The final outcome measure is thus a percentage, which would still lead to a large number of tests to be assured of adequate statistical power. Thus, there is a concern about whether this proposed approach will have high enough resolution to be more useful than conventional screening methods so that one can obtain actual dose-response data on ASDs.

      Thank you for the comments. In this revision, we added Supplemental Figure S9, showing the severity of behavioral seizures observed before and during ASM testing for each animal. We observed a reduction in behavioral seizure severity for each subject. We would like to explore using behavioral severity as an outcome measure in a follow-up study.

      (5) The authors state that this approach should be used to test for and discover new ASDs for DRE, and also used for various open/closed loop protocols with deep-brain stimulation; however, the paper does not actually discuss rigorously or critically the background literature on other published studies in these areas or how this approach will improve future research for a broader audience than the ETSP and CROs. Thus, it is not clear whether the utility will apply more widely and how extensive a readership will be attracted to this work.

      We appreciate the reviewer’s insights. We revised the manuscript to better emphasize the potential significance of our approach (page 15, second paragraph). The on-demand seizure model can be applied to address biologically and clinically relevant questions beyond its utility in drug screening. For example, crossing the Thy1-ChR2 mouse line with genetic epilepsy models, such as Scn1a mutants, could reveal how optogenetic stimulation differentially induces seizures in mutant versus non-mutant mice, providing insights into seizure generation and propagation in Dravet syndrome. Due to the cellular specificity of optogenetics, we also envision this approach being used to study circuit-specific mechanisms of seizure generation and propagation. Regarding drug-resistant epilepsy (DRE) and anti-seizure drug (ASD) screening, we agree with the reviewer that probing new classes of ASDs for DRE represents a critical goal. However, we believe that a full exploration of additional ASD classes and/or modeling DRE lies outside the scope of this manuscript, and we would like to explore it in a follow-up study.

      Reviewer 2 (Recommendations for the authors):

      (1) The authors should explain why 10 Hz was chosen as the stimulation frequency.

      Thank you for the comment. A frequency of 10 Hz was determined based on previous work using anesthetized animals prepared in an acute in vivo setting. To simplify the paper and avoid confusion, we did not include a discussion on how we determined the frequency. Instead, we added a detailed description of how we optimized the power in a flowchart format in Supplemental Figure S2. We hope this improves reproducibility.

      (2) After micro-injection of KA, morphological changes were observed in the hippocampus, but no comparison of ChR2 expression was made in naïve animals vs KA-injected animals. Presumably, the Thy1-ChR2 mouse expresses GFP in cells that express ChR2. Thus, it may be useful to show the expression of ChR2 in animals with hippocampal sclerosis. This may explain the lack of dramatic difference between stimulation parameters in naïve vs epileptic animals, as shown in supplemental Figure S2.

      Thank you for the suggestion. We added a fluorescence image of ChR2 expression in CA1, ipsilateral to the KA-injected site, in Fig. 2A.

      (3) The authors state that "During epileptogenesis, neural networks in the brain undergo various changes ranging from modification of membrane receptors to the formation of new synapses" and that these changes are critical for successful "on-demand" seizure induction. However, it is not clear or well-discussed whether changes in neuronal cell densities that occur during sclerosis are important for "on-demand" seizure induction as well. Also, the authors showed that naïve animals exhibit a kindling-like effect, but it was unclear whether a similar effect was present in epileptic animals (i.e. do stimulation thresholds to seizure induction change as the animal gets more induction stimulations)? If present, would the secondary kindling affect drug-testing studies (e.g., would the drug effect be different on induced seizure #2 vs induced seizure #20)?

      Thank you for the suggestion. Since this is an important aspect of the model, we would like to address the kindling effect, the secondary kindling effect, and histopathology in a longer-term setting (several weeks) in a follow-up study.

      (4) The authors show that in their model, LEV and DZP were both efficacious. The authors do not seem to mention that, over 25 years ago, LEV was originally missed in the standard ETSP screens; and, it was only discovered outside of the ETSP with the kindling model. The kindling model is now used to screen ASDs. The authors should consider adding this point to the Discussion. It remains unclear, however, if the authors' screening strategy shows advantages over kindling and other such approaches in the field.

      Thank you for the suggestion. We added a discussion on LEV in the 'Application of Model to Testing Multiple Classes of ASMs...' section on page 14.

      (5) P8 paragraph 2. The authors state values for naïve animals, but they should also provide values for epileptic animals since they state that the groups were not significantly different (p>0.05). It would be useful to show values for both and state the actual p-value from the test. This issue of stating mean/median values with SD and sample size should be addressed for all data throughout the paper. Additionally, Figure S2 should be added to the manuscript and discussed, as it has data that may be valuable for the reproducibility of the paper.

      Thank you for the suggestion. Figure S2 shows the threshold power required to induce electrographic activity for n = 10 epileptic animals (9.14 ± 4.75 mW) and n = 6 naïve animals (6.17 ± 1.58 mW) (Wilcoxon rank-sum test, p = 0.137). The threshold duration was comparable between the same epileptic animals (6.30 ± 1.64 s) and naïve animals (5.67 ± 1.03 s) (Wilcoxon rank-sum test, p = 0.7133). 
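
      For readers who want to reproduce this kind of comparison, the sketch below implements a two-sided Wilcoxon rank-sum test using the normal approximation, with tied values given their average rank. The response does not state which software or which variant of the test (exact or approximate) was used, so this is an illustration of the technique rather than the authors' procedure. The function names and the example threshold values are made-up placeholders, not the study's measurements.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided rank-sum test (normal approximation, no tie or continuity correction).

    Returns (z statistic, p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2.0
    std_dev = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / std_dev
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical threshold powers in mW, for illustration only.
    epileptic = [4.0, 6.5, 8.0, 9.0, 12.0, 15.0]
    naive = [5.0, 5.5, 6.0, 7.0]
    z, p = rank_sum_test(epileptic, naive)
    print(f"z = {z:.3f}, p = {p:.3f}")
```

      With group sizes as small as those reported here, the normal approximation is only approximate, and an exact rank-sum test may give a somewhat different p-value.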

      (6) In addition to the other stated references on synaptic reorganization in the CA1 area, the authors should mention similar studies from Esclapez et al. (1999, J Comp Neurol).

      Thank you. We have included the reference in the revision.

      (7) All of the raw EEG data on the seizures should be accessible to the readers.

      Thank you for the suggestion. We will consider depositing EEG data in a publicly accessible site.

      Reviewer 3 (Public review):

      Weaknesses:

      (1) Evaluation of seizure similarity using the SVM modeling and clustering is not sufficiently explained to show if there are meaningful differences between induced and spontaneous seizures. SVM modeling did not include analysis to assess the overfitting of each classifier since mice were modeled individually for classification.

      Thank you for the comment. We made text edits to clarify the purpose of the SVM analysis. It was not intended to identify meaningful differences between induced and spontaneous seizures. Rather, it was used to classify EEG epochs as 'seizures' based on spontaneous seizures as the training set, demonstrating the gross similarity between induced and spontaneous seizures.

      (2) The difference between seizures and epileptiform discharges or trains of spikes (which are not seizures) is not made clear.

      Thank you for pointing out the unclear definition of the seizures analyzed. We added sentences at the beginning of the Results section (page 3) to clarify the terminology we used. We analyzed animal behavior during evoked events, and a high percentage of induced electrographic events were accompanied by behavioral seizures with a Racine scale of three or above. We added Supplemental Figure S9 to show the types of seizures observed before and during ASM testing. We hope these changes address the reviewer’s concern and improve the clarity of the manuscript.

      (3) The utility of increasing the number of seizures for enhancing statistical power is limited unless the sample size under evaluation is the number of seizures. However, the standard practice is for the sample size to be the number of mice.

      In this work, we used a linear mixed-effects model to address two levels of variability—between animals and within animals. The interactive linear mixed-effects model shows that most (~90%) of the variability in our data comes from within animals (residual), the random effect that the model accounts for, rather than between animals. Since variability between animals is low, the model identifies common changes in seizure propagation across animals while accounting for the variability in seizures within each animal. Therefore, the results we find reflect changes that occur across animals, not individual seizures. We made text edits to clarify the use of the linear mixed-effects model.
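
      To make concrete what it means for most of the variability to come from within animals, the sketch below splits the variance of a per-seizure measurement into a between-animal part and a within-animal part. It uses a simple one-way random-effects ANOVA (method-of-moments) estimator on a balanced design, which is a simpler stand-in for the linear mixed-effects model described above and not the authors' actual analysis. The function names and the example values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple


def variance_components(groups: Dict[str, List[float]]) -> Tuple[float, float]:
    """Return (between-animal variance, within-animal variance).

    Assumes a balanced design: every animal has the same number (>= 2)
    of observations. This is a simplification chosen for the sketch.
    """
    sizes = {len(values) for values in groups.values()}
    if len(groups) < 2 or len(sizes) != 1 or min(sizes) < 2:
        raise ValueError("need >= 2 animals with equal group sizes >= 2")
    n = sizes.pop()   # observations per animal
    k = len(groups)   # number of animals
    grand_mean = mean(x for values in groups.values() for x in values)
    animal_means = {animal: mean(values) for animal, values in groups.items()}

    ss_between = n * sum((m - grand_mean) ** 2 for m in animal_means.values())
    ss_within = sum(
        (x - animal_means[animal]) ** 2
        for animal, values in groups.items()
        for x in values
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    # Method-of-moments estimate, truncated at zero when it would be negative.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return var_between, ms_within


def within_fraction(groups: Dict[str, List[float]]) -> float:
    """Share of the total variance attributed to within-animal variability."""
    var_between, var_within = variance_components(groups)
    return var_within / (var_between + var_within)


if __name__ == "__main__":
    # Hypothetical per-seizure feature values for three animals.
    data = {
        "mouse_a": [1.0, 4.0, 2.5, 5.5],
        "mouse_b": [2.0, 5.0, 1.5, 4.5],
        "mouse_c": [3.0, 0.5, 4.0, 6.0],
    }
    print(f"within-animal share of variance: {within_fraction(data):.2f}")
```

      A within-animal share close to 1 corresponds to the situation described in the response, where seizures differ more from one another inside an animal than animals differ from each other.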

      (4) Seizure burden is not easily tested.

      Thank you for the comment. We added Supplemental Figure S9 to summarize the severity of behavioral seizures before and during ASM testing. This addresses the reviewer’s comment on seizure burden. In a follow-up study, we would like to explore this type of outcome measure for drug screening.

      Reviewer 3 (Recommendations for the authors):

      (1) Provide a stronger rationale to use area CA1. For example, the authors mention that CA1 is active during seizure activity, but can seizures originate from CA1? That would make the approach logical and also explain why induced and spontaneous seizures are similar.

      Thank you for the comment. We discussed it in the Discussion section (page 14, first and second paragraphs).

      (2) Explain the use of SVM classifiers so it is more convincing that induced and spontaneous seizures are similar. Or, if they are not similar, explain that this is a limitation.

      We made text edits to clarify the purpose of the SVM analysis. It was not intended to identify meaningful differences between induced and spontaneous seizures. Rather, it was used to classify EEG epochs as 'seizures' based on spontaneous seizures as the training set, demonstrating the gross similarity between induced and spontaneous seizures.

      (3) If feasible, extend the duration over which seizure induction reliability is assessed so that the long-term utility of the model can be demonstrated.

      Thank you for the suggestion. We would like to assess long-term utility in a follow-up study.

      (4) The GitHub link is not yet active. The authors will be required to supply their relevant code for peer evaluation as well as publication.

      Thank you. The GitHub repository is now active.

      (5) State and assess the impacts of sex as a biological variable.

      Thank you for pointing this out. Both female and male animals were included in this study. Epileptic cohort: 7 males, 3 females; naïve cohort: 3 males, 4 females.

    1. Myers-Briggs Type Indicator

      May benefit from this during interviews. Getting to know yourself better: strengths, weaknesses.

      Email Raylea for the code. The other two, Strong and Focus2, are more for undergrads.

    1. Detailed report on the morning session: AI, the Citizens' Way (L'IA, la voie citoyenne). Date: [Not specified, but refers to events in 2023 and 2024]. Venue: Palais d'Iéna, seat of the Conseil Économique Social et Environnemental (CESE, the French Economic, Social and Environmental Council). Organizers: CESE and Conseil National du Numérique (CNNum, the French National Digital Council), in partnership with Make.org, The Future Society, Sciences Po, and ENS.

      1. Introduction and Objectives of the Morning. The morning session, held at the CESE, the assembly of civil society and citizen participation, aims to discuss the place of citizens in artificial intelligence (AI). The event takes place in the run-up to the Paris global AI summit, with the ambition of giving citizens a voice on the impacts and stakes of AI.

      Key points:

      • Role of the CESE and the CNNum: As the assembly of civil society and citizen participation, the CESE is the "entirely natural" venue for this event. The CNNum is extending its "Cafés IA" (AI cafés), launched in 2024, which are occasions for listening and debate on the stakes of AI.
      • Purpose of the event: To give a voice to those who are not always heard, so they can "express their hopes and fears about the deployment of AI and see, through debate, whether we can find common answers."
      • Global context of AI: AI is compared to the arrival of the Internet in the 2000s and of electricity in the 1900s, arousing "the same passions".
      • Nature of AI: AI is presented as "a social and political object resulting from collective, human choices even before it is a technological object", being "neither a bearer of miracles nor of danger" in itself.
      • Focus of the CESE's work: Bringing the subject of AI "to the level of civil society" by examining social issues: risks to fundamental rights and individual freedoms, environmental footprint, strategic autonomy and sovereignty, impacts on employment, integration in companies, public services, education, health, disability, and guaranteed access to non-digital alternatives.
      • User expertise: The need to bring in "expertise d'usage" (expertise drawn from everyday use) to complement specialist knowledge and political/commercial strategies, which is crucial for citizens' acceptance of these changes.
      • Shared mission: AI must not replace collective intelligence. The goal is to make AI "more democratic", so that "citizens, employees, and users of public services carry weight in the decisions and the calculations".
        1. Government and Institutional Perspectives
      • Clara Chappaz, Minister Delegate for Artificial Intelligence and Digital Affairs:
      • Inclusiveness of the Paris summit: She stresses the importance of making the Paris AI summit "as inclusive as possible", by "bringing civil society on board" and answering citizens' questions.
      • AI serving the general interest: The main goal is to put this technology "at the service of the general interest".
      • AI as a democratic and political question: AI is not only a matter of economics or competitiveness, but an "absolutely democratic and even political question".
      • Trust as the cement: "Trust must be the cement of this technology's development". Without trust, the innovation will not be adopted.
      • Collective responsibility: The need for an "absolutely collective responsibility" so that AI does not become a source of "social divide", "frustration", or "territorial divide", but a "tool for progress".
      • Listening to citizens: She cites the CNNum's "Cafés IA" and the "élu.ai" workshops as examples of initiatives for listening to and exchanging with French people about their perceptions and fears of AI.
      • Balance: Striking a balance between developing the technology ecosystem (for sovereignty) and supporting citizens in adopting AI, while respecting fundamental rights, individual freedoms, equality, and planetary boundaries.
      • French and European values: Making France an AI power compatible with its "core values" and the "specific features of our French and European culture".
        1. Citizen Testimonies and Local Experiences
      • Martine (a citizen who took part in the CESE's temporary commission on AI):
      • A formative experience: She describes the experience as "formative and revealing", despite an initial feeling of "illegitimacy".
      • Importance of dialogue: Enriching exchanges and a diversity of perspectives led to a better understanding of the issues.
      • Role of the CESE: The CESE is a "bridge where public decision-makers and citizens... can meet and exchange on an equal footing", fostering inclusive dialogue and strengthening the legitimacy of decisions.
      • AI as a tool: She reaffirms that AI "is neither an autonomous entity nor truly intelligent", but "a tool shaped by humans".
      • Collective responsibility: She stresses the immense responsibility of those who create and use AI, and the role of public decision-makers in regulating it and anticipating abuses.
      • Axel Docher (Make.org) and Constance (The Future Society) on the public consultation:
      • Broad participation: More than 11,000 participants and 120,000 votes, showing a "high level of understanding" and "fairly strong points of convergence".
      • Active vigilance: Citizens are "open to AI" but call for "very active vigilance over how it is implemented".
      • AI in public services: AI is accepted in public services (e.g., health diagnostics), but there is a "breaking point" around "human decision-making": AI must be a tool in support of decisions, not a substitute for them.
      • Rationalization vs. contribution: AI must not serve solely to rationalize services; it should be a contributing element.
      • Democratic fears: Fear of AI being used for disinformation and to weaken democracy.
      • Opportunity for democracy: AI can "strengthen the link between citizens and democratic processes", notably by "making the world less complex for citizens".
      • The AI-democracy link: "There will be no open innovation" and "no AI serving the common good" without democracy.
      • Alice Rousset (City of Paris):
      • Gradual approach: The City of Paris approached AI through experimentation to improve public services (analysis of public space, information on social benefits, urban planning).
      • Participatory approach: Faced with the rise of generative AI, the city adopted a "participatory and inclusive" approach (expert hearings, a citizen consultation, a citizens' day).
      • Lessons learned: Parisians want the city to take up AI "at its own level" in a "responsible" way, with "real oversight".
      • Priority areas: The need for a "framework for transparency and oversight of AI projects" (prior assessment, monitoring of deployment together with civil society) and a "training and awareness-raising effort".
      • Reminder: AI must not "replace human decision-making".
      • Pierre Jannin (City of Rennes):
      • AI as a political issue: AI must be "in the service of social, ecological, and democratic transitions in the general interest".
      • An alternative path: Creating an "alternative path that controls and regulates" in contrast to an "ultra-liberal, deregulated" model.
      • Conseil Citoyen du Numérique Responsable (citizens' council on responsible digital technology): Creation of a body of 30 citizens selected by lot to work on AI issues (impact on jobs, AI serving the local area, ethical issues, freedom, security, justice).
      • Points of vigilance: The citizens of Rennes identified points of vigilance consistent with national reports: control, transparency, regulation, risks to employment, the public-private relationship, and opportunities.
      • Territorial consultations: The national, bottom-up initiative "Concertations territoriales de l'intelligence artificielle" (territorial consultations on artificial intelligence), involving 33 cities, to draw out major issues and recommendations.
      • Co-construction and accountability: "We are convinced that we must build technology with citizens", by training them, consulting them, and above all "reporting back on how their recommendations... are taken into account".
      • Didier Mino (Changer de Cap) on AI in public services (CAF, the family benefits fund):
      • Problems with automation: An alarming account of the shift to online-only services at the CAF, which has produced "institutional mistreatment" and a "lack of access to rights" for the most vulnerable.
      • Illegal or discriminatory practices: Suspension of benefits without notice, errors classified as fraud, checks targeted by discriminatory algorithms, no open-ended questions on forms, and complex regulations.
      • Loss of technical control: IT departments have lost control of the code (Crystal, written in COBOL in the 1990s), leading to discrepancies with the law and unexplainable decisions.
      • Human consequences: "Serious consequences for the physical and mental health of people in difficulty", people pushed into poverty, and a loss of meaning for staff.
      • Political directives: Budget-driven reforms (e.g., housing benefits) caused "IT disasters".
      • Call to action: Legal action against the algorithm that targets checks at the most vulnerable.
      • Solution: The "possibility for users to freely choose how they interact with public services", and the need for transparency and dialogue.
      • Soasick Penico and Estelle Hary (Observatoire des Algorithmes Publics, ODAP):
      • Algorithmic transparency: The need to make the algorithms used by the administration visible and transparent, because they are "fundamentally political".
      • Algorithms are not neutral: Algorithms are "absolutely not objective objects" but result from "human and institutional choices" (the decision to deploy, criteria, resources, private contractors).
      • Lack of documentation: There is no comprehensive overview of these algorithms, and administrations document them "very little publicly".
      • Citizen inventory: Creation of an inventory of 72 algorithms from public sources, showing "very little transparency" and rare evaluation (published internal evaluations: 4%).
      • AI and automation: AI is "the tree that hides the forest of the long history of automation in public services". Older critical systems (e.g., tax calculation) matter just as much in the democratic debate.
      • Transparency in the service of social justice: Transparency is a "tool in the service of other individuals and groups fighting for social justice, human rights, workers' rights, and environmental justice".
      • An essential struggle: It is essential for civil society to take up the subject of AI as a "partner in debate but also as a strong counter-power", because "everyone has the legitimacy to do so" even without technical knowledge, since it is "above all a political subject".
      • Gabrielle Dubois (Défenseur des Droits):
      • Report on AI and public services: Recalls the Défenseur des Droits report published last November on automated administrative decisions.
      • Key issues: Human intervention and transparency are crucial.
      • Individual issue: Uphold the constitutional principle of transparency and make it usable by the people concerned.
      • Collective issue: Give concrete effect to the obligation to publish the rules governing algorithmic processing, so that decisions can be understood and challenged.
      • Intelligibility within the administration: Ensure that staff understand how the tools they use work.
      • Recommendations: Compliance with publication obligations, establishment of a "right to an explanation of individual administrative decisions" (beyond high-risk AI), and involvement of public service users at every level.
      • Thomas Peron (Law professor) on the cooperative public service:
      • Rethinking public services through the commons: Consider the power structure of public services through the lens of communities.
      • Citizen jury: The jury is the only case in which a public decision is made by randomly selected citizens.
      • Digital technology and democratization: Digital tools give access to decision-making and make real-time decisions possible, which allows public services to be democratized.
      • The craft of citizenship: Citizenship should be learned first and foremost within public services.
      • Democratizing public services: The question of democratic AI must go hand in hand with thinking about how to democratize public services.
      • Radical decentralization: This implies a "radical decentralization of the places of power and of decision-making processes," as close as possible to the service relationship.
      • Sid Sako and Hélène Mazela (citizens from the Make.org consultation):
      • Citizens' Convention on AI: Proposal to launch a citizens' convention on AI covering all contemporary challenges (ecology, equity, social justice, education, employment, health, ethics).
      • Taking the time to understand: Citizens have never really been consulted on digitization. The convention would provide the time needed to align IT and AI issues with the public interest.
      • Sharing responsibility: Bringing citizens on board means sharing responsibility for future decisions, because the subject is not only technical but political ("what kind of society do we want?").
      • Environmental and CSR standards for AI: Proposal to introduce environmental and CSR (corporate social responsibility) standards for AI to encourage resource-efficient models (frugal AI), promote energy transparency, and build in ethical criteria for inclusion and accessibility.
      • Sovereignty: Promote collaboration protocols to avoid domination by international AI systems.
        1. Public Debate on AI and Work
      • Thomas Fournaise (Nantes, organizer of the Data IA trade show):
      • Transparency of decisions: The problem of decision transparency predates algorithms and AI. Digital technology exposes this long-standing lack of transparency.
      • Human responsibility: Prioritization decisions (e.g., married couples vs. couples in a PACS civil union) are made by humans. "We must make AI ethical: for me that poses a problem, namely that we make it human, we anthropomorphize it, and in a way we absolve ourselves of responsibility."
      • AI as a tool: AI is a tool that answers questions. What matters is "the questions we ask it, the way we ask them."
      • Societal use: The issue is "what societal use we want to put it to."
      • Marine André (Mother and AI designer):
      • Risk of anthropomorphism: Confirms the risk of believing there is a person behind the AI.
      • AI education: Is concerned about the absence of education on AI use in high schools and the lack of critical-thinking training for young people.
      • Laure Lucchesi (Former director of Etalab):
      • Legal transparency obligations: Stresses the importance of transparency of algorithmic processing in public services and Etalab's role in supporting government agencies.
      • Dismantling of dedicated teams: Regrets the dismantling of the teams that supported agencies on these ethical questions.
      • Right of access to administrative documents: Recalls the importance of this right, which dates from 1978, in allowing civil society and journalists to question how algorithms are designed and to obtain source code.
      • Guilaine Giersau (Les Petits Débrouillards):
      • Education and critical thinking: Stresses the importance of science education and critical thinking, especially in rural areas and the overseas territories, despite the lack of resources.
      • Role of associations: Associations play a crucial role in this education outside the school walls.
      • Business awareness: Businesses also need to understand these issues.
      • Consistency of programs: Digital education programs lack continuity and coherence.
      • Urgency: A democratic approach is all the more urgent in light of world events.
      • Didier Cornel (Legal expert, Belgian public institution):
      • Problem not tied to AI: When the law is applied, rights are easier to grant with IT tools than without. The problem is non-compliance with existing rules.
      • Duty of assistance: Proposes a legal obligation to assist, with an obligation of result, for people unable to complete the formalities.
      • Existential risks: Expresses "surprise and disappointment" at the absence of discussion of AI's existential risks, citing an average probability of 10% of a "fatal outcome for humanity" according to specialists.
      • Desire to halt AI: Is surprised that the citizen consultation surfaced a proposal to stop the use of AI (49% for, 39% against) without it being discussed further.
      • Franck Bataille (President, Loir et Cher Tech):
      • AI cafés in local communities: Reports on the success of the "AI cafés" in Loir-et-Cher, which reached 300 people in 2024 and aim for 1,000 in 2025, notably among young people who have dropped out of school.
      • Digital inclusion: His association, active for 10 years in digital culture and inclusion, has taken up AI with a variety of audiences.
      • Patrick Allard (Former entrepreneur, citizen):
      • Sovereignty: Raises the question of sovereignty in the face of American and Chinese players, and of what France is doing about it.
      • Aziz Kizou (Founder of iPublic):
      • Private AI in network industries: Asks about the "blind spot" of private AI systems in network industries (energy, transport), which, despite their size, can wield systemic power over citizens' lives.
      • Insufficient regulatory framework: Apart from the GDPR and the AI Act, there is no regulatory framework sufficient to control these platforms.
      • Nationalization?: Wonders whether civil society will be enough or whether nationalizing AI platforms will have to be considered.
      • Eden Carou (Data Scientist):
      • Understanding how it works: A democratic AI is effective only if citizens understand how it works and what is at stake.
      • Education and awareness: AI education must go beyond the technical dimension and address how AI interacts with individuals and society.
      • Étienne Brevet (Data governance, Agglomération du Pays Basque):
      • Importance of data: Stresses the quality of the data that feeds AI. "No algorithm will be effective if the underlying data we collect is not good."
      • Volume of data: Reflects on the astronomical quantities of data stored and the small percentage actually used.
      • Regulatory framework: The regulatory framework for data needs to be thought through.
        1. AI at Work: Impacts and Social Dialogue
      • Caroline Jeanmaire (Make.org consultation):
      • Urgency to act: 200 civil society organizations warn of the urgent need to understand and prevent the risks AI poses to the future of work.
      • Concretely protecting jobs: AI risks worsening inequality. Solutions: an international observatory to anticipate disruption, innovative company agreements (e.g., Volkswagen's zero AI layoffs), a worker protection kit (practical guides, human-oversight standards).
      • Developing digital training and critical thinking: A free multilingual platform, labs for equity in AI.
      • Investing in tomorrow's talent: Access to AI careers is unequal. A global training program with local partners, scholarships, mentoring, support for underrepresented communities.
      • AI that serves everyone: "Let us act now for an AI that serves everyone and reduces inequalities instead of widening them."
      • Eric Meyer (CESE counselor, trade unionist) and Solidaire Finances Publiques:
      • AI deployment at the DGFiP (Direction Générale des Finances Publiques): Example of the CFVR project (ciblage fraude et valorisation requêtes, i.e. fraud targeting and query exploitation) for detecting tax fraud.
      • Cost and effectiveness: 34.5 million euros; 52% of business audits in 2022, but only 13.6% of the amounts recovered.
      • Job cuts: "Productivity gains" of 500 jobs, or one quarter of the staff assigned to tax audits.
      • Impacts on missions and working conditions: Narrower scope of work, with less contact with the field and AI-generated lists given priority at the expense of everything else.
      • Loss of autonomy and technical skill: "Single-task, very repetitive" work, with no latitude left for staff.
      • AI errors: Staff spend time justifying why audits proposed by the AI cannot be opened.
      • No more interesting tasks: 85.4% of staff feel that AI does not free them up for more interesting tasks.
      • Loss of meaning at work.
      • Opacity and lack of social dialogue: A "forced-march" rollout, a total absence of information and consultation, and little training. Results of pilot programs are not discussed.
      • Black box and outsourcing: Design is often outsourced to private firms, which reinforces the unexplainability.
      • Reinventing trade unionism: Use of legal means (referrals to the CADA), partnerships (journalists, researchers), political alerts, internal surveys. An internal ethics committee was obtained after participation in the external one was refused.
      • Techno-critical discourse: Staff must be systematically involved in designing their tools, through a transparent process.
      • David Gaborio (Sociologist) on logistics workers:
      • Tool: voice picking: Software that dictates every task through a headset and microphone.
      • Outcome: Loss of autonomy, work intensification (10 to 15% faster), individualization, increased monitoring.
      • Modern Taylorism: Constrained, repetitive, physical work, with "accelerated wear on bodies." A surge in workplace accidents and occupational diseases.
      • Lack of foresight and broken promises: The promise of more skilled, liberated work was not kept.
      • Ineffective oversight: The CNIL's reports on surveillance did not prevent an extreme standardization of work.
      • Discourse on automation: It renders work invisible and undermines workers' legitimacy.
      • Very weak oversight: Lack of citizen and democratic control within companies (e.g., the disappearance of the CHSCT committees).
      • Dominance of top-down discourse: Working-class voices are barely present.
      • Polarization of work: New technologies will not eliminate arduous jobs but will lead to a "very strong polarization" between skilled occupations and the working classes who bear the consequences.
      • Eric Drouin (CNIL) on regulation:
      • Regulation: Regulation can work, as in the Amazon Logistique France case. The GDPR is "fully relevant today" and "very robust" thanks to its "technological neutrality."
      • The CNIL's mission: "Information technology must serve every citizen. It must not infringe on human identity, human rights, privacy, or individual or public liberties."
      • Amazon Logistique France case: A fine of 32 million euros (Dec. 2023) for an excessive surveillance system (measuring interruptions and scanner speed, retaining data for too long).
      • No block on innovation: The GDPR is not a "blocking law" but a framework that "slows down" excessive uses so that development stays consistent with fundamental rights.
      • Principle of proportionality: A balance between the company's performance objectives and infringements on fundamental rights and freedoms.
      • Additional safeguard: The GDPR (along with the AI Act) is a safeguard against the abuses of technologies that process personal data on a massive scale, particularly in the workplace.
      • Frank Fasalina Madinier (Opinion in Brussels on algorithmic management):
      • Democracy and social dialogue: Democracy at work means dialogue with workers, especially when they are affected.
      • Role of unions: Collective organization is a "genuine counter-power" for ensuring that tools are deployed in a fair and chosen way, without eliminating jobs or worsening working conditions.
      • Algorithmic management: This phenomenon is spreading beyond Uber-style platforms.
      • Strengthened social dialogue: Social dialogue needs to be strengthened, because the parties are not always prepared.
      • Suitable regulations: European regulations exist but are not always suited to the workplace (e.g., consent under the GDPR).
      • Algorithmic transparency: Transparency requirements (inspired by the platform work directive) should extend to all workers.
      • Negotiation and discussion: Adapt legislation to help social dialogue participants negotiate and discuss these issues.
        1. Discussion and Outlook from the Public Debate
      • Christophe Moraux (FSU Emploi at France Travail):
      • False freedom of choice: Unattainable targets and shrinking resources end up imposing AI on staff.
      • Generative AI and complex cases: With AI handling the simple tasks, staff are left exclusively with complex cases, leading to "heightened precarity for the people served."
      • Standardized responses: AI imposes standardization on complex responses.
      • Example of Match FT and ChatDoc: Matching and document-search tools that mask the lack of human resources and of time given to staff.
      • Loss of autonomy and meaning: AI leads to a loss of autonomy and of meaning at work, and to increased monitoring.
      • Refusal of participation: Management refused to include the unions in the external AI ethics committee, which made it necessary to create an internal ethics committee.
      • Margaux Prod (Translator, En Chair et en Os collective):
      • AI is not neutral, is ecologically unsustainable, and rests on exploitation: Recalls that AI is not neutral, consumes large amounts of energy and water, and relies on the exploitation of workers (click work, mining) around the world.
      • AI in translation: post-editing: A "sabotage" of skills and an "Uberization" of the profession. It consists of correcting machine-generated texts (often faulty, smoothed over, standardized) for 30 to 50% less pay.
      • Absence of human intention: Generated text lacks "intellectual depth" and artistic intention.
      • Artists' opposition: Most artist-authors oppose the use of their works to feed AI software, even with financial compensation.
      • AI: not progress: For translation, AI is a "disastrous automation of cultural professions."
      • Alice Dragon (Independent, formerly in interministerial services):
      • Management deficit: Points to a "major management deficit" in ministries and agencies that predates AI.
      • Invisibility of 15 years of optimization: Calls for more visibility on the staff cuts and digital optimization of the past 15 years.
      • Opportunities of generative AI: Potential for "extraordinary social mobility" and access to training for the middle class.
      • Valuing invisible skills: Asks how to better value the skills made invisible by AI.
      • France and regulation: Proud of the French position on regulation (e.g., the CNIL on Amazon).
      • Citizen experience and autonomy: Citizens will come on board if AI leaves them "the autonomy to put the tech at their service and not the other way around."
      • Agnès de Tamarana (Unbias, Twisting):
      • Union involvement: Invites unions to take up the technical questions of AI, because the technology is "not that complicated to understand."
      • Demand transparency: Demand algorithm registers, even if past engineering did not document everything.
      • Institutional structuring: Institutions and companies must organize themselves to manage these technical risks.
      • Employee training: Demand employee training that is "internal" and not "pushed by solution providers such as Microsoft or Google."
      • The battle inside companies: The "commercial push" from American companies targets European companies by making them believe they will miss an opportunity if they do not equip themselves quickly.
      • AI and collective augmentation: AI is great for "increasing collective power, collective action, collective dialogue, but certainly not at the individual level."
      • Sandra Lem (Independent, business consulting):
      • Race for digital tools: Observes a "race for digital tools" rolled out in two stages (executive-worker, executive-technician) that overlooks the "technician-end user" link.
      • Lack of support: Users are given no time to change their practices, which leads to isolation, work overload, and a loss of collective intelligence.
      • Marc Malenfer (INRS):
      • Social dialogue and risk prevention: Social dialogue is crucial to preventing occupational risks.
      • Listening to employees: Recalls the CESE report (Assises du travail 2023), which proposed adding listening to employees as a general principle of prevention.
      • Consultation of representative bodies: Systems that change how work is organized must be subject to consultation with the IRP (employee representative bodies) and to direct input from workers.
      • Inequality between companies: Small companies are "more helpless" against commercial pressure from AI solution vendors.
      • Developer training: AI developers need training on occupational health issues.
      • Arthur Talan (PhD student in philosophy):
      • Non-neutrality of technology: There is a philosophical consensus that technology is never neutral. AI cannot be considered apart from its design, its uses, and its purposes.
      • The neutrality excuse: Promoting neutrality is an "excuse to justify the development" of these technologies and to shed responsibility.
      • Identified responsibilities: AI development involves responsible parties and responsibilities that must be clearly identified.
      • Christophe Gernet (Radical Exchange):
      • Totalitarian AI vs. other forms: AI is not neutral, but there are ways of developing it other than the totalitarian model.
      • Responsibility in deployment: Stresses responsibility in the choice of AI projects.
      • Algorithmic management: Managers, too, find themselves taking orders from an AI.
      • Collective bargaining over data: Argues that data should be part of collective bargaining, because its value is not shared.
      • Eden Carou (Data Scientist):
      • AI and expertise: "AI does not belong everywhere," especially without collaboration with users, because "an AI without expertise is a rotten AI."
      • Professional dialogue: A dialogue is needed between those who use the tool and those who develop it.
      • Quentin Pignon (Digital advisor):
      • Web and emancipation vs. recommendation algorithms: The web is emancipatory, but recommendation algorithms make non-monetizable content invisible.
      • "Bullshitization" of the web: Generative AI makes it possible to mass-produce "bullshit" videos (weight-loss influencers, self-help), making the web "unlivable" and making scientific or artistic outreach content harder to find.
      • Dependence and loss of bearings: Bearings will become harder to find in digital advisory work.
      • Antoine Lata (Sociology student):
      • Risk of shutdown: What happens if AI stops or no longer works?
      • Dispossession of knowledge: AI can lead to a dispossession of knowledge and a dependence on tools.
      • Marline de Banque (The Shift Project):
      • Energy and climate implications: Asks about the gigawatts and terawatts required by AI and digital technology, and the new greenhouse gas emissions.
      • Pollution: Which sectors can pollute less so that digital technology can pollute more?
      • Guilaine Giersau (Les Petits Débrouillards):
      • Thank you, Europe: Thanks Europe for its digital values but urges against naivety.
      • Social and environmental responsibility: Stresses social and environmental responsibility, particularly regarding water, energy, and pressure on workers (platform workers, miners).
      • Consumers: Consumer responsibility matters.
      • Knowledge and its dissemination: Knowledge is essential and spreading it is a priority.
      • Fanny Legal (SNMI):
      • Impact in the Missions Locales: The arrival of a "tiny bit of AI" in the Missions Locales since January 20 has had direct consequences: staff are unable to work, and the tool becomes a "screen" between the counselor and the young person.
      • Lack of support: Colleagues were neither supported nor trained.
      • Sabine Vannek (Lawyer, Doctor of Law):
      • Data sovereignty: Asks about the drive by China and the United States to capture European and French data, turning Europe into a "digital twin," which raises an "essential question of our sovereignty" to settle before joining the "frantic race" for AI.
        1. Conclusion of the Morning
      • Eric Meyer:
      • CESE white paper: Recalls the publication of the white paper "Pour une intelligence artificielle au service de l'intérêt général" (For an artificial intelligence serving the public interest), adopted by a very large majority of civil society, with 30 recommendations.
      • Key questions: The questions raised in the debate overlap with the CESE's work on democracy and on workers' "grip" on the AI tool.
      • AI: a political tool: AI is not a tool like any other but "a very political tool."
      • Stakes for companies: AI is not neutral, requires investment, and can cost companies their sovereignty and decision-making power.
      • Impacts on employment: Jobs eliminated or transformed, gender inequality (female-dominated occupations potentially the most affected), intensification, loss of meaning and recognition.
      • Recommendations: Swift talks between the social partners and the government toward a national cross-industry agreement on AI. Put social dialogue first before any introduction of AI in a company, with impact studies and maturity grids.
      • Regulation: Civil society must do more, and more forcefully, on regulation and oversight to keep "big tech" or "politico-financial" interests from imposing their own rules.
      • [Unidentified speaker, interim conclusion]:
      • Vitality of civil society: Taken together, the contributions show the "vitality of civil society," its "clear-sightedness," and its "expertise."
      • AI: a decoy or a reservoir of energy?: AI can be a decoy that masks power structures, but in a citizens' assembly it becomes an "unsurpassable reservoir of energy and strength."
      • Citizen action on political power: AI leads to citizen action on political power, because it is not only technologies that are political, but decisions and actions as well.
      • Thanks: Thanks to the CESE and CNNum teams.
      • Stéphane Brelman (Digital anthropologist):
      • Looking at the details: Experience shows the need to look "in detail" at what AI introduces into work and practices.
      • Not fearing the technical side: Understanding the technical aspects is essential in order to intervene.
      • Example of the nuclear plant operator: The story of the operator who "feels in his gut" what decision to make illustrates the importance of understanding micro-decisions and non-explicit factors.
      • Lack of in-depth studies: Regrets the lack of in-depth studies on micro-decisions and the concrete impacts of AI.
      • Fine granularity: It is necessary to go down to a "very, very, very, very detailed level of granularity" to understand the issues and impacts.
      • French tradition of analysis: The French tradition of analyzing specific work activities can be put to use for AI.
      • Dispelling fears and fantasies: Understanding the details will remove "quite a few fears and fantasies."
      • Closing thanks: The speakers and organizers thank one another for the quality of the discussion and the direction set for future work.
    1. Detailed report: Health prevention, time to act! ("La prévention en santé, passons aux actes !")

      • This summary is based on the discussions and presentations at the CESE plenary session devoted to health prevention, with a particular focus on occupational health.

      It aims to identify the main themes, key ideas, and notable facts raised by the various speakers, with relevant quotations.

      1. Prevention: A Major and Underestimated Societal Issue

      • All the speakers agree on the crucial importance of health prevention, which extends well beyond the medical sphere to society as a whole. Despite this, prevention too often remains the "poor relation of public policy."

      1.1 Prevention Rather Than Cure: An Obvious Principle That Goes Unapplied

      The finding is unanimous: "Prevention rather than cure seems obvious, and yet prevention is still too often the poor relation of public policy." (Opening statement).

      It is stressed that health is not limited to hospitals, doctors, and medicines; it is a matter for society as a whole.

      1.2 An Investment, Not a Cost

      • Investing in prevention is presented as a "strategy for the future," not a cost. The benefits are many: "less avoidable suffering, less public spending over the long term, better quality of life." (Opening statement).

      Moreover, it gives citizens back "power over their own health," positioning them as "actors in everything" rather than as patients.

      1.3 History and Concepts: Prevention vs. Health Promotion

      • Professor Emmanuel Ruche, President of the Conférence Nationale de Santé, highlights a French particularity: an approach that has historically been "very focused on prevention and perhaps a little less on health promotion." He stresses that the two approaches are complementary and must be "articulated." He quotes the Director-General of the WHO: "Health does not begin in clinics or hospitals any more than justice begins in courts or peace begins on the battlefield. Health begins in the conditions in which we are born and grow up, in schools, streets, workplaces…". This broader view underlines that health is shaped by "commercial determinants" (tobacco, alcohol, processed foods, fossil fuels), which are responsible for one third of deaths worldwide.

      1.4 Effectiveness and Return on Investment

      • The effectiveness of prevention measures "no longer needs to be demonstrated" (Professor Ruche), based on "well-established evidence."

      The return on investment is "self-evident" in the scientific literature; the example of smoking prevention shows a "1,900% return on investment."

      Despite this, funding remains difficult and calls for multi-year "incentive-based and sustainable funding mechanisms."

      2. The Determinants of Health and Inequalities

      The discussion highlights the many determinants that influence health and underscores their role in creating and worsening inequalities.

      2.1 Social and Economic Determinants

      • Emmanuel Cambois, Research Director at INED, explains that health inequalities arise not only from individual behaviors but also from factors "that are in a sense imposed on individuals and that can combine with others." These factors include "socioeconomic situation," "social circle, social support, and conversely isolation," "mental load," "trauma," and "phenomena of exclusion." Inequalities also show up in access to care and in career paths (arduous working conditions, interrupted careers). A "life course" approach is essential, because risks "accumulate over the course of life," leaving some groups "much more at risk of health problems and much less able to fight those risks." Prevention must therefore "cover the different spheres of activity, whether domestic, professional, or social, and above all follow all the ages of life."

      2.2 Environmental Determinants and Emerging Risks

      • Jean-François Guégan, Research Director at INRAE, addresses the impact of the environment on health, particularly in the face of "climate change". He points to an "impressive confusion" and a "lack of culture" regarding the links between biodiversity and health. Human activities such as deforestation and livestock farming are identified as major factors in the emergence of zoonotic pandemics. He warns against a "naive, idyllic and truncated" view of nature, illustrating that even the "reintroduction of nature into cities" can bring "microbiological hazards" (mosquitoes, rodents, pathogenic germs). Infectious risk is the product of "hazards" (microorganisms) and "human exposure and the vulnerability of populations".

      2.3 Commercial Determinants and Industry Influence

      • Karine Galopel Morvent, Professor at EHESP, highlights the role of "commercial actors" who "deleteriously influence the health and equity of the population". She cites marketing and lobbying as worrying commercial practices, particularly in the tobacco, alcohol, ultra-processed food and fossil fuel industries, which are responsible for about "a third of deaths". She denounces the "increased power of multinationals" and the gap between marketing budgets and prevention campaign budgets (e.g., 250 million euros per year for alcohol versus 3 million for prevention). Lobbying is "very strong" and blocks advances such as higher tobacco taxes or the generalization of the Nutri-Score. Solutions include "regulation of conflicts of interest", "transparency on lobbying", an "advertising ban" and "information and education on these commercial determinants".

      2.4 The Gendered Approach to Health

      • The question of a gendered approach in health policy is raised.

      Emmanuel Cambois and Lormier stress that women's health and the challenges women face (musculoskeletal disorders, anxiety and depressive disorders, interrupted careers) are often underestimated or poorly understood.

      It is crucial to adopt "differentiated approaches for men and women" in prevention and in the personalization of care, because symptoms and life courses can vary considerably.

      3. Innovations and Challenges in Prevention

      The discussion explores new methods and tools, digital technology in particular, while identifying the persistent obstacles to effective prevention.

      3.1 Digital Technology: Opportunity and Challenge

      • Lormier, an expert at the Institut Montaigne, presents digital technology as an "indispensable response to the current challenge of prevention", offering "personalization", "improved targeting", "better patient adherence" and "anticipation of risks".

      Large-scale health data and artificial intelligence enable "early detection" (e.g., radiology), "personalized support" (mobile applications, chatbots) and "remote monitoring" of vital signs.

      However, "obstacles" persist: a "cultural and organizational gap" in a health system focused on curative care, the need to "train" health professionals, and the "digital determinants of health" (access, connectivity, trust). The goal is to move "from episodic medicine to continuous monitoring".

      3.2 Funding and Political Will

      • Pierre-Louis Bra, Inspector General of Social Affairs, qualifies the funding question, stating that prevention is not "simply a matter of funding" but of "the capacity to challenge private interests".

      The success of the fight against smoking, achieved mainly through tax increases, is the proof. He points out that "it does not require public funding; on the contrary, these are taxes, which bring in public funding".

      He criticizes the reliance on "common sense" rather than "evidence" for some costly prevention initiatives (e.g., periodic health check-ups).

      He insists on the need to invest in the basic prevention networks (school health services, PMI maternal and child health services, occupational medicine), which are "in difficulty".

      3.3 Governance and Coordination

      • Several speakers call for better governance and coordination of public policies.

      Professor Ruche and Emmanuel Cambois insist on the need for "intersectoral and interministerial action" at the national level, and for "territorial implementation as close as possible to territories and populations".

      Health promotion argues for "introducing health into all public policies".

      The CNS recommends a 10-year "national health strategy" and territorial "prevention and health promotion roadmaps" with "accountability reporting".

      4. Occupational Health: A Pillar of Prevention

      The second part of the session is devoted specifically to occupational health, its challenges and the avenues for improvement.

      4.1 Alarming Figures

      • The figures presented by Cécile Gondard Lalane and Jean-Christophe Repont are striking: "1287 work-related deaths per year", "5800 occupational diseases accidents and 47400 occupational diseases" in 2022.

      This shows that "despite a national interprofessional agreement on workplace prevention, a law in December 2020, a law in August 2021 on workplace prevention, we are at a level that is stagnating in terms of primary prevention at work".

      4.2 Upheavals That Weigh Heavily

      The world of work faces major "upheavals":

      • Global warming: Heat has "physiological effects" and "fatal consequences", leading to "productivity losses" and "psychosocial risks".
      • Gendered approach: Occupational health is still "too centered on men". While workplace accidents fell by 27% for men over 20 years, they rose by "more than 41%" for women. Musculoskeletal disorders, the leading cause of occupational disease, affect "three women in 5 and one man in two". The "gendered distribution of domestic work" also affects women's mental health.
      • Mental health: The main risk factors are "chronic stress" (80%: mental overload, burnout, sleep disorders, suicide) and "internal or external violence" (20%: incivility, harassment, discrimination). "Fatigue linked to tools and to the use of digital tools" is a new challenge.
      • Managerial practices: These are "decisive" but appear "too vertical and too hierarchical" in France, with a lack of "trust in employees" and a "still very strong need for command and control" (Dr. Florence Bénichou).

      4.3 The New Faces of Work

      • The study highlights the situation of self-employed and platform workers. Delivery riders and ride-hailing (VTC) drivers face "high risks" (accidents, musculoskeletal disorders, mental health problems caused by the "pressure" of algorithms and the anxiety of losing income). Access to insurance is "very little known" and little used.

      4.4 Avenues for Improvement: Toward Strengthened Primary Prevention

      The rapporteurs propose three lines of action to improve prevention at work:

      • Train and raise awareness: Strengthen medical students' training in occupational and environmental health to attract young professionals. Extend occupational health training to the "actors of social dialogue", employees and employers alike, through "joint training".
      • Identify and prevent: Support managers of very small enterprises (TPE) in implementing the "document unique" (the mandatory single risk assessment document). Emphasize the role of "occupational health prevention services" and "professional branches". Insist on the "gendered approach" and on integrating "management" into prevention.
      • Anticipate through social dialogue and listening: Enshrine "listening to employees" in the general principles of prevention of the Labor Code, because "it is indeed the workers who best know the risks to which they are exposed". Take the "balance of life's different times" and the "deployment of AI" into account in social dialogue.

      4.5 Success Stories and Political Will

      • Bernard Tibba, co-chair of the social charter of the 2024 Olympic Games, testifies to the success of this initiative, which cut the accident rate by a factor of four on an "enormous construction site".

      This approach, which combines "political will, resources, and a mobilization of the various public and private actors", shows that there is "no inevitability when it comes to accidents".

      Ms. Astrid Panosian Bouvet, Minister for Labor and Employment, welcomes the report and confirms the importance of the subject. She notes that occupational health is not "high enough on the public agenda".

      She insists on the "fight against serious and fatal workplace accidents", a phenomenon that is "not inevitable" and many of which are "avoidable".

      She confirms the intention to "capitalize on this success" of the Olympic Games and to "replicate the method", notably through "social dialogue within the branches" and "better interministerial cooperation".

      She stresses that prevention must be at the "heart of concerns" and not a "codicil to the employment contract".

      5. General Conclusion

      • Prevention in health is a strategic, economic and social imperative. It requires a paradigm shift, from a curative logic to a proactive culture.

      This implies a comprehensive, cross-cutting approach that integrates social, environmental and commercial determinants.

      Digital technology offers promising tools, but their deployment must be inclusive and supported.

      Funding is not the only obstacle; the capacity to challenge private interests and political will are paramount.

      Occupational health, with its challenges tied to climate change, gender inequalities and new forms of work, is a glaring example of the need for strengthened primary prevention based on social dialogue and listening to workers.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public Review:

      Summary:

      The authors sought to elucidate the mechanism by which infections increase sleep in Drosophila. Their work is important because it further supports the idea that the blood-brain barrier is involved in brain-body communication, and because it advances the field of sleep research. Using knock-down and knock-out of cytokines and cytokine receptors specifically in the endocrine cells of the gut (cytokines) as well as in the glia forming the blood-brain barrier (BBB) (cytokines receptors), the authors show that cytokines, upd2 and upd3, secreted by entero-endocrine cells in response to infections increase sleep through the Dome receptor in the BBB. They also show that gut-derived Allatostatin (Alst) A promotes wakefulness by inhibiting Alst A signaling that is mediated by Alst receptors expressed in BBB glia. Their results suggest there may be additional mechanisms that promote elevated sleep during gut inflammation.

      The authors suggest that upd3 is more critical than upd2, which is not sufficiently addressed or explained. In addition, the study uses the gut's response to reactive oxygen molecules as a proxy for infection, which is not sufficiently justified. Finally, further verification of some fundamental tools used in this paper would further solidify these findings making them more convincing.

      Strengths:

      (1) The work addresses an important topic and proposes an intriguing mechanism that involves several interconnected tissues. The authors place their research in the appropriate context and reference related work, such as literature about sickness-induced sleep, ROS, the effect of nutritional deprivation on sleep, sleep deprivation and sleep rebound, upregulated receptor expression as a compensatory mechanism in response to low levels of a ligand, and information about Alst A.

      (2) The work is, in general, supported by well-performed experiments that use a variety of different tools, including multiple RNAi lines, CRISPR, and mutants, to dissect both signal-sending and receiving sides of the signaling pathway.

      (3) The authors provide compelling evidence that shows that endocrine cells from the gut are the source of the upd cytokines that increase daytime sleep, that the glial cells of the BBB are the targets of these upds, and that upd action causes the downregulation of Alst receptors in the BBB via the Jak/Stat pathways.

      We are pleased that the reviewers recognized the strength and significance of our findings describing a gut-to-brain cytokine signaling mechanism involving the blood-brain barrier (BBB) and its role in regulating sleep, and we thank them for their comments.

      Weaknesses:

      (1) There is a limited characterization of cell types in the midgut which are classically associated with upd cytokine production.

      We thank the reviewer for raising this point. Although several midgut cell types (including the absorptive enterocytes) may indeed produce Unpaired (Upd) cytokines, our study specifically focused on enteroendocrine cells (EECs), which are well-characterized as secretory endocrine cells capable of exerting systemic effects. As detailed in our response to Results point #2 (please see below), we show that EEC-specific manipulation of Upd signaling is both necessary and sufficient to regulate sleep in response to intestinal oxidative stress. These findings support the role of EECs as a primary source of gut-derived cytokine signaling to the brain. To acknowledge the possible involvement of other sources, we have also added a statement to the Discussion in the revised manuscript noting that other, non-endocrine gut cell types may contribute to systemic Unpaired signaling that modulates sleep.

      (2) Some of the main tools used in this manuscript to manipulate the gut while not influencing the brain (e.g., Voilà and Voilà + R57C10-GAL80), are not directly shown to not affect gene expression in the brain. This is critical for a manuscript delving into intra-organ communication, as even limited expression in the brain may lead to wrong conclusions.

      We agree with the reviewer that this is an important point. To address it, we performed additional validation experiments to assess whether the voilà-GAL4 driver in combination with R57C10-GAL80 (EEC>) influences upd2 or upd3 expression in the brain. Our results show that manipulation using EEC> alters upd2 and upd3 expression in the gut (Fig. 1a,b), with new data showing that this does not affect their expression levels in neuronal tissues (Fig. S1a), supporting the specificity of our approach. These new data are now included in the revised manuscript and described in the Results section. This additional validation strengthens our conclusion that the observed sleep phenotypes result from gut-specific cytokine signaling, rather than from effects on Unpaired cytokines produced in the brain.

      (3) The model of gut inflammation used by the authors is based on the increase in reactive oxygen species (ROS) obtained by feeding flies food containing 1% H2O2. The use of this model is supported by the authors rather weakly in two papers (refs. 26 and 27): The paper by Jiang et al. (ref. 26) shows that the infection by Pseudomonas entomophila induces cytokine responses upd2 and 3, which are also induced by the Jnk pathway. In addition, no mention of ROS could be found in Buchon et al. (ref. 27); this is a review that refers to results showing that ROS are produced by the NADPH oxidase DUOX as part of the immune response to pathogens in the gut. Thus, there is no strong support for the use of this model.

      We thank the reviewer for raising this point. We agree that the references originally cited did not sufficiently justify the use of H<sub>2</sub>O<sub>2</sub> feeding as a model of gut inflammation. To address this, we have revised the Results section to clarify that we use H<sub>2</sub>O<sub>2</sub> feeding as a controlled method to elevate intestinal ROS levels, rather than as a general model of inflammation. This approach allows us to investigate the specific effects of ROS-induced cytokine signaling in the gut. We have also added citations to support the physiological relevance of this model. For instance, Tamamouna et al. (2021) demonstrated that H<sub>2</sub>O<sub>2</sub> feeding induces intestinal stem-cell proliferation – a response also observed during bacterial infection – and Jiang et al. (2009) showed that enteric infections increase upd2 and upd3 expression, which we similarly observe following H<sub>2</sub>O<sub>2</sub> feeding (Fig. 3a). These findings support the use of H<sub>2</sub>O<sub>2</sub> as a tool to mimic specific ROS-linked responses in the gut. We believe this targeted and tractable model is a strength of our study, enabling us to dissect how intestinal ROS modulates systemic physiology through cytokine signaling.

      Additionally, we have included a statement in the Discussion acknowledging that ROS generated during infection may activate signaling mechanisms distinct from those triggered by chemically induced oxidative stress, and that exploring these differences in future studies may yield important insights into gut–brain communication. These revisions provide a stronger justification for our model while more accurately conveying both its relevance and its limitations.

      (4) Likewise, there is no support for the use of ROS in the food instead of a direct infection by pathogenic bacteria. Furthermore, it is known that ROS damages the gut epithelium, which in turn induces the expression of the cytokines studied. Thus the effects observed may not reflect the response to infection. In addition, Majcin Dorcikova et al. (2023). Circadian clock disruption promotes the degeneration of dopaminergic neurons in male Drosophila. Nat Commun. 2023 14(1):5908. doi: 10.1038/s41467-023-41540-y report that the feeding of adult flies with H2O2 results in neurodegeneration if associated with circadian clock defects. Thus, it would be important to discuss or present controls that show that the feeding of H2O2 does not cause neuronal damage.

      We thank the reviewer for this thoughtful follow-up point. We would like to clarify that we do not claim that the effects observed in our study directly reflect the full response to enteric infection. As outlined in our revised response to comment 3, we have updated the manuscript to more precisely describe the H<sub>2</sub>O<sub>2</sub>-feeding paradigm as a model that induces local intestinal ROS responses comparable to, but not equivalent to, those observed during pathogenic challenges. This revised framing highlights both the potential similarities and differences between chemically induced oxidative stress and infection-induced responses. Indeed, in the revised Discussion, we now explicitly acknowledge that ROS generated during infection may engage distinct signaling mechanisms compared to exogenous H<sub>2</sub>O<sub>2</sub> and emphasize the value of future studies in delineating these pathways. We are currently pursuing this direction in an independent ongoing study investigating the effects of enteric infections. However, for the present work, we chose to focus on the effects of ROS-induced responses in isolation, as this provides a clean and well-controlled context to dissect the specific contribution of oxidative stress to cytokine signaling and sleep regulation.

      To further address the reviewer’s concern, we have also included new data (a TUNEL stain for apoptotic DNA fragmentation) in the revised manuscript showing that H<sub>2</sub>O<sub>2</sub> feeding does not damage neuronal tissues under our experimental conditions (Fig. S3f,g). This addresses the point raised regarding the potential neurotoxicity of H<sub>2</sub>O<sub>2</sub>, as described by Majcin Dorcikova et al. (2023), and supports the specificity of the sleep phenotypes observed in our study. We believe these revisions and clarifications strengthen the manuscript and make our interpretation more precise.

      (5) The novelty of the work is difficult to evaluate because of the numerous publications on sleep in Drosophila. Thus, it would be very helpful to read from the authors how this work is different and novel from other closely related works such as: Li et al. (2023) Gut AstA mediates sleep deprivation-induced energy wasting in Drosophila. Cell Discov. 23;9(1):49. doi: 10.1038/s41421-023-00541-3.

      Our work highlights a distinct role for gut-derived AstA in sleep regulation compared to findings by Lin et al. (Cell Discovery, 2023)[1], who showed that gut AstA mediates energy wasting during sleep deprivation. Their study focused on the metabolic consequences of sleep loss, proposing that sleep deprivation increases ROS in the gut, which then promotes the release of the glucagon-like hormone adipokinetic hormone (AKH) through gut AstA signaling, thereby triggering energy expenditure.

      In contrast, our study addresses the inverse question – how ROS in the gut influences sleep. In our model, intestinal ROS promotes sleep, raising the intriguing possibility – cleverly pointed out by the reviewers – that ROS generated during sleep deprivation might promote sleep by inducing Unpaired cytokine signaling in the gut. According to our findings, this suppresses wake-promoting AstA signaling in the BBB, providing a mechanism to promote sleep as a restorative response to gut-derived oxidative stress and potentially limiting further ROS accumulation. Importantly, our findings support a wake-promoting role for EEC-derived AstA, demonstrated by several lines of evidence. First, EEC-specific knockdown of AstA increases sleep. Second, activation of AstA<sup>+</sup> EECs using the heat-sensitive cation channel Transient Receptor Potential A1 (TrpA1) reduces sleep, and this effect is abolished by simultaneous knockdown of AstA, indicating that the sleep-suppressing effect is mediated by AstA and not by other peptides or secreted factors released by these cells. Third, downregulation of AstA receptor expression in BBB glial cells increases sleep, further supporting the existence of a functional gut AstA–glia arousal pathway. We have now included new data in the revised manuscript showing that AstA release from EECs is downregulated during intestinal oxidative stress (Fig. 7k,l,m). This suggests that this wake-promoting signal is suppressed both at its source (the gut endocrine cells), by unknown means, and at its target, the BBB, via Unpaired cytokine signaling that downregulates AstA receptor expression. This coordinated downregulation may serve to efficiently silence this arousal-promoting pathway and facilitate sleep during intestinal stress. These new data, along with an expanded discussion, provide further mechanistic insight into gut-derived AstA signaling and strengthen our proposed model.

      This contrasts with the interpretation by Lin et al., who observed increased AstA peptide levels in EECs after antioxidant treatment and interpreted this as peptide retention. However, peptide accumulation may result from either increased production or decreased release, and peptide levels alone are insufficient to distinguish between these possibilities. To resolve this, we examined AstA transcript levels, which can serve as a proxy for production. Following oxidative stress (24 h of 1% H<sub>2</sub>O<sub>2</sub> feeding and the following day), when animals show increased sleep (Fig. 7e), we observed a decrease in AstA transcript levels followed by an increase in peptide levels (Fig. 7k,l,m), suggesting that oxidative stress leads to reduced gut AstA production and release. Furthermore, we recently found that a class of EECs that produce the hormone Tachykinin (Tk) and are distinct from the AstA<sup>+</sup> EECs express the ROS-sensitive cation channel TrpA1 (Ahrentløv et al., 2025, Nature Metabolism[2]). In these Tk<sup>+</sup> EECs, TrpA1 mediates ROS-induced Tk hormone release. In contrast, single-cell RNA-seq data[3] do not support TrpA1 expression in AstA<sup>+</sup> EECs, consistent with our findings that ROS does not promote AstA release – an effect that would be expected if TrpA1 were functionally expressed in AstA<sup>+</sup> EECs. This contradicts the findings of Lin et al., who reported TrpA1 expression in AstA<sup>+</sup> EECs. We have now included relevant single-cell data in the revised manuscript (Fig. S6f) showing that TrpA1 is specifically expressed in Tk<sup>+</sup> EECs, but not in AstA<sup>+</sup> EECs, and we have expanded the discussion to address discrepancies in TrpA1 expression and AstA regulation.

      Taken together, our results reveal a dual-site regulatory mechanism in which Unpaired cytokines released from the gut act at the BBB to downregulate AstA receptor expression, while AstA release from EECs is simultaneously suppressed. We thank the reviewers for raising this important point. We have also included a discussion of the other point raised by the reviewers – the possibility that ROS generated during sleep deprivation may engage the same signaling pathways described here, providing a mechanistic link between sleep deprivation, intestinal stress, and sleep regulation.

      Recommendations for the authors:

      A- Material and Methods:

      (1) Feeding Assay: The cited publication (doi.org:10.1371/journal.pone.0006063) states: "For the amount of label in the fly to reflect feeding, measurements must therefore be confined to the time period before label egestion commences, about 40 minutes in Drosophila, a time period during which disturbance of the flies affects their feeding behavior. There is thus a requirement for a method of measuring feeding in undisturbed conditions." Was blue fecal matter already present on the tube when flies were homogenized at 1 hour? If so, the assay may reflect gut capacity rather than food passage (as a proxy for food intake). In addition, was the variability of food intake among flies in the same tube tested (to make sure that 1-2 flies are a good proxy for the whole population)?

      We agree that this is an important point for feeding experiments. We are aware of the methodological considerations highlighted in the cited study and have extensive experience using a range of feeding assays in Drosophila, including both short- and long-term consumption assays (e.g., dye-based and CAFE assays), as well as automated platforms such as FLIC and FlyPAD (Nature Communications, 2022; Nature Metabolism, 2022; and Nature Metabolism, 2025)[2,4,5].

      For the dye-based assay, we carefully selected a 1-hour feeding window based on prior optimization. Since animals were not starved prior to the assay, shorter time points (e.g., 30 minutes) typically result in insufficient ingestion for reliable quantification. A 1-hour period provides a robust readout while remaining within the timeframe before significant label excretion occurs under our experimental conditions. To support the robustness of our findings, we complemented the dye-based assay with data from FLIC, which enables automated, high-resolution monitoring of feeding behavior in undisturbed animals over extended periods. The FLIC results were consistent with the dye-based data, strengthening our confidence in the conclusions. To minimize variability and ensure consistency across experiments, all feeding assays were performed at the same circadian time – Zeitgeber Time 0 (ZT0), corresponding to 10:00 AM when lights are turned on in our incubators. This time point coincides with the animals' natural morning feeding peak, allowing for reproducible comparisons across conditions. Regarding variability among flies within tubes, each biological replicate in the dye assay consisted of 1–2 flies, and results were averaged across multiple replicates. We observed good consistency across samples, suggesting that these small groups reliably reflect group-level feeding behavior under our conditions.

      (2) Biological replicates: whereas the number of samples is clearly reported in each figure, the number of biological replicates is not indicated. Please include this information either in Material and methods or in the relevant figure legends. Please also include a description of what was considered a biological replicate.

      We have now clarified in the Materials and Methods section under Statistics that all replicates represent independent biological samples, as suggested by the reviewers.

      (3) Control Lines: please indicate which control lines were used instead of citing another publication. If preferred, this information could be supplied as a supplementary table.

      We now provide a clear description of the control lines used in the Materials and Methods section. Specifically, all GAL4 and GAL80 lines used in this study were backcrossed for several generations into a shared w<sup>1118</sup> background and then crossed to the same w<sup>1118</sup> strain used as the genetic background for the UAS-RNAi, CRISPR, or overexpression lines. This approach ensures, to a strong approximation, that the only difference between control and experimental animals is the presence or absence of the UAS transgene.

      (4) Statistical analyses: for some results (e.g., those shown in Figure 3d), it could be useful to test the interaction between genotype and treatment.

      We thank the reviewer for this helpful suggestion. In response, we have now performed two-way ANOVA analyses to assess genotype × treatment (diet) interaction effects for the relevant data, including those shown in Figure 3d as well as additional panels where animals were exposed to oxidative stress and sleep phenotypes were measured. We have added the corresponding interaction p-values in the updated figure legends for Figures 3d, 3k, 5a–c, 5f, 5h, 5i, 6c, 6e, and 7e. All of these tests revealed significant interaction effects, supporting the conclusion that the observed differences in sleep phenotypes are specifically dependent on the interaction between genetic manipulation (e.g., cytokine or receptor knockdown) and oxidative stress. These additions reinforce the interpretation that Unpaired cytokine signaling, glial JAK-STAT pathway activity, and AstA receptor regulation functionally interact with intestinal ROS exposure to modulate sleep. We thank the reviewer for suggesting this improvement.
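      To illustrate what a genotype × treatment interaction test measures, here is a minimal sketch in pure Python for a balanced two-factor design. It is not the authors' analysis (they report using Prism); the function name `interaction_f`, the cell labels, and the sleep values are hypothetical and chosen only for the example. The sketch returns the interaction F statistic and its degrees of freedom; a p-value would require an F-distribution routine, which is left out here.

```python
from statistics import mean


def interaction_f(cells):
    """Interaction F statistic for a balanced two-way design.

    cells maps (genotype, treatment) -> list of observations.
    Every cell must hold the same number of observations (at least 2).
    Returns (F, df_interaction, df_error).
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("balanced design with >= 2 observations per cell required")

    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Interaction: the part of each cell mean not explained by the two main effects.
    ss_int = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    # Error: scatter of observations around their own cell mean.
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_int = (len(a_levels) - 1) * (len(b_levels) - 1)
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    return (ss_int / df_int) / (ss_err / df_err), df_int, df_err


# Hypothetical daytime sleep (minutes): controls respond to H2O2, knockdowns do not.
example = {
    ("control", "normal"): [400, 410, 420],
    ("control", "h2o2"): [500, 510, 520],
    ("knockdown", "normal"): [450, 460, 470],
    ("knockdown", "h2o2"): [450, 460, 470],
}
f_stat, df_int, df_err = interaction_f(example)
```

      A large F here means the effect of treatment differs between genotypes, which is the pattern the response describes; two genotypes that respond equally to treatment would give an interaction F of zero.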

      (5) Reporting of p values. Some are reported as specific values whereas others are reported as less than a specific value. Please make this reporting consistent across different figures.

      All p-values reported in the manuscript are exact, except where values fall below 0.0001. In those instances, we report p < 0.0001 because the Prism software package (GraphPad, version 10), which was used for all statistical analyses, does not report more precise values. We believe this reporting approach reflects standard practice in the field.
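      The reporting convention described above can be sketched as a small helper. This is an illustration only: the function name `format_p` and the four-decimal display are assumptions, not something taken from the manuscript or from Prism.

```python
def format_p(p, floor=1e-4):
    """Format a p-value: exact to four decimals, or an inequality below the floor."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p < floor:
        # Below the reporting floor, only the bound is stated.
        return f"p < {floor:.4f}"
    return f"p = {p:.4f}"


labels = [format_p(0.03217), format_p(0.00003)]
```

      Applying one rule to every figure legend gives the consistency the reviewers asked for, while keeping the inequality only where the exact value is unavailable.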

      (6) Please include the color code used in each figure, either in the figure itself or in the legend.

      We have now clarified the color coding in all relevant figures. In particular, we acknowledge that the meaning of the half-colored circles used to indicate H<sub>2</sub>O<sub>2</sub> treatment was not previously explained. These have now been clearly labeled in each figure to indicate treatment conditions.

      (7) The scheme describing the experimental conditions and the associated chart is confusing. Please improve.

      We have improved the schematic by replacing “ROS” with “H<sub>2</sub>O<sub>2</sub>” to more clearly indicate the experimental condition used. Additionally, we have added the corresponding circle annotations so that they now also appear consistently above the relevant charts. This revised layout enhances clarity and helps readers more easily interpret the experimental conditions. We believe these changes address the reviewer’s concern and make the figure significantly more intuitive.

      (8) Please indicate which line was used for upd-Gal4 and the evidence that it faithfully reflects upd3 expression.

      We have now clarified in the Materials and Methods section that the upd3-GAL4 line used in our study is Bloomington stock #98420, which drives GAL4 expression under the control of approximately 2 kb of sequence upstream of the upd3 start codon. This line has previously been used as a transcriptional reporter for upd3 activity. The only use of this line was to illustrate reporter expression in the EECs. To support this aspect of Upd3 expression, we now include new data in the revised manuscript using fluorescent in situ hybridization (FISH) against upd3, which confirms the presence of upd3 transcripts in prospero-positive EECs of the adult midgut (Fig. S1b). Additionally, we show that upd3 transcript levels are significantly reduced in dissected midguts following EEC-specific knockdown using multiple independent RNAi lines driven by voilà-GAL4, both alone and in combination with R57C10-GAL80, consistent with endogenous expression in these cells (Fig. 1a,b).

      To further address the reviewer’s concern and provide additional support for the endogenous expression of upd3 in EECs, we performed targeted knockdown experiments focusing on molecularly defined EEC subpopulations. The adult Drosophila midgut contains two major EEC subtypes characterized by their expression of Allatostatin C (AstC) or Tachykinin (Tk), which together encompass the vast majority of EECs. To selectively manipulate these populations, we used AstC-GAL4 and Tk-GAL4 drivers – both knock-in lines in which GAL4 is inserted at the respective endogenous hormone loci. This design enables precise GAL4 expression in AstC- or Tk-expressing EECs based on their native transcriptional profile. To eliminate confounding neuronal expression, we combined these drivers with R57C10-GAL80, restricting GAL4 activity to the gut and generating AstC<sup>Gut</sup>> and Tk<sup>Gut</sup>> drivers. Using these tools, we knocked down upd2 and upd3 selectively in the AstC- or Tk-positive EECs. Knockdown of either cytokine in AstC-positive EECs significantly increased sleep under homeostatic conditions, recapitulating the phenotype observed with knockdown in all EECs (Fig. 1m-o). In contrast, knockdown of upd2 or upd3 in Tk-positive EECs had no effect on sleep (Fig. 1p-r). Furthermore, we show in the revised manuscript that selective knockdown of upd2 or upd3 in AstC-positive EECs abolishes the H<sub>2</sub>O<sub>2</sub>-induced increase in sleep (Fig. 3f–h). These findings demonstrate that Unpaired cytokine signaling from AstC-positive EECs is essential for mediating the sleep response to intestinal oxidative stress, highlighting this specific EEC subtype as a key source of cytokine-driven regulation in this context. These new results indicate that AstC-positive EECs are a primary source of the Unpaired cytokines that regulate sleep, while Tk-positive EECs do not appear to contribute to this function. Importantly, upd3 transcript levels were significantly reduced in dissected midguts following AstC<sup>Gut</sup>-driven knockdown (Fig. S1r), further confirming that upd3 is endogenously expressed in AstC-positive EECs. Thus, several independent lines of evidence now support the conclusion that upd3 is indeed expressed in EECs, as illustrated by the reporter line.

      (9) Please indicate which GFP line was used with upd-Gal4 (CD8, NLS, un-tagged, etc). The Material and Methods section states that it was "UAS-mCD8::GFP (#5137);", however, the stain does not seem to match a cell membrane pattern but rather a nuclear or cytoplasmic pattern. This information would help the interpretation of Figure 1C.

      We confirm that the GFP reporter line used with upd3-GAL4 was obtained from Bloomington stock #98420. As noted by the Bloomington Drosophila Stock Center, “the identity of the UAS-GFP transgene is a guess,” and the subcellular localization of the GFP fusion is therefore uncertain. We agree with the reviewer that the signal observed in Figure 1c does not display clear membrane localization and instead appears diffuse, consistent with cytoplasmic or partially nuclear localization. In any case, what we find most salient is the reporter’s labeling of Prospero-positive EECs in the adult midgut, consistent with upd3 expression in these cells. This conclusion is further supported by multiple lines of evidence presented in the revised manuscript, as mentioned above in response to question #8: (1) fluorescent in situ hybridization (FISH) for upd3 confirms expression in EECs (Fig. S1b), (2) EEC-specific RNAi knockdown of upd3 reduces transcript levels in dissected midguts, and (3) publicly available single-cell RNA sequencing datasets[3] also indicate that upd3 is expressed at low levels in a subset of adult midgut EECs under normal conditions. We have also clarified in the revised Materials and Methods section that GFP localization is undefined in the upd3-GAL4 line, to guide interpretation of the reporter signal.

      B- Results

      (1) Figure 1: According to previous work (10.1016/j.celrep.2015.06.009, http://flygutseq.buchonlab.com/data?gene=upd3%0D%0A), in basal conditions upd3 is expressed as follows: ISC (35 RPKM), EB (98 RPKM), EC (57 RPKM), and EEC (8 RPKM). Accordingly, even complete KO in EECs should eliminate only a small fraction of upd3 from whole guts, even less considering the greater abundance of other cell types such as ECs compared to EECs. It would be useful to understand where this discrepancy comes from, in case it is affecting the conclusion of the manuscript. While this point per se does not affect the main conclusions of the manuscript, it makes the interpretation of the results more difficult.

      We acknowledge the previously reported low expression of upd3 in EECs. However, the FlyGut-seq site appears to be no longer available, so we could not directly compare other related genes. Nonetheless, our data – based on in situ hybridization, reporter expression, and multiple RNAi knockdowns – consistently support upd3 expression in EECs. These complementary approaches strengthen the conclusion that EECs are an important source of systemic upd3 under the conditions tested.
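
      The arithmetic behind the reviewer's point can be made explicit with a short sketch that computes each cell type's share of the summed upd3 signal from the RPKM values quoted in the comment. It assumes, purely for illustration, that all cell types are weighted equally; actual contributions also depend on cell abundance and RNA content, which are not given here. The function name `expression_share` is hypothetical.

```python
# RPKM values for upd3 under basal conditions, as quoted in the reviewer's comment.
upd3_rpkm = {"ISC": 35, "EB": 98, "EC": 57, "EEC": 8}

def expression_share(rpkm_by_cell_type):
    """Return each cell type's fraction of the summed RPKM signal.

    Assumption (not from the source): cell types are weighted equally,
    ignoring differences in cell number and RNA content.
    """
    total = sum(rpkm_by_cell_type.values())
    return {cell: value / total for cell, value in rpkm_by_cell_type.items()}

shares = expression_share(upd3_rpkm)
for cell, share in shares.items():
    print(f"{cell}: {share:.1%}")
```

      Under this equal-weighting assumption, EECs account for roughly 4% of the summed signal. That is why the reviewer expects an EEC-specific knockout to remove only a small fraction of whole-gut upd3.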

      (2) Figure 1: The upd2-3 mutants show sleep defects very similar to those of EEC>RNAi and >Cas9. It would thus be helpful to try to KO upd3 with other midgut drivers (An EC driver like Myo1A or 5966GS and a progenitor driver like Esg or 5961GS) to validate these results. Such experiments might identify precisely which cells are involved in the gut-brain signaling reported here.

      We appreciate the reviewer’s suggestion and agree that exploring other potential sources of Upd3 in the gut is an interesting direction. In this study, we have focused on EECs, which are the primary hormone-secreting cells in the intestine and thus the most likely candidates for mediating systemic effects such as gut-to-brain signaling. While it is possible that other gut cell types – such as enterocytes (e.g., Myo1A<sup>+</sup>) or intestinal progenitors (e.g., Esg<sup>+</sup>) – also contribute to Upd3 production, these cells are not typically endocrine in nature. Demonstrating their involvement in gut-to-brain communication would therefore require additional, extensive validation beyond the scope of the current study. Importantly, our data show that manipulating Upd3 specifically in EECs is both necessary and sufficient to modulate sleep in response to intestinal ROS, strongly supporting the conclusion that EEC-derived cytokine signaling underlies the observed phenotype. In contrast, manipulating cytokines in other gut cells could produce indirect effects – such as altered proliferation, epithelial integrity, or immune responses – that complicate the interpretation of behavioral outcomes like sleep. For these reasons, we chose to focus on EECs as the source of endocrine signals mediating gut-to-brain communication. However, to address this point raised by the reviewer, we have now included a statement in the Discussion acknowledging that other non-endocrine gut cell types may also contribute to the systemic Unpaired signaling that modulates sleep in response to intestinal oxidative stress.

      (3) Figure 3: "This effect mirrored the upregulation observed with EEC-specific overexpression of upd3, indicating that it reflects physiologically relevant production of upd3 by the gut in response to oxidative stress." Please add (Figure 3a) at the end of this sentence.

      We have now added “(Figure 3a)” at the end of the sentence to clearly reference the relevant data.

      (4) For Figure 3b, do you have data showing that the increased amount of sleep was due to the addition of H2O2 per se, rather than the procedure of adding it?

      We have added new data to address this point. To ensure that the observed sleep increase was specifically due to the presence of H<sub>2</sub>O<sub>2</sub> and not an effect of the food replacement procedure, we performed a control experiment in which animals were fed standard food prepared using the same protocol and replaced daily, but without H<sub>2</sub>O<sub>2</sub>. These animals did not exhibit increased sleep, confirming that the sleep effect is attributable to intestinal ROS rather than the supplementation procedure itself (Fig. S3a). Thanks for the suggestion.

      (5) In the text it is stated that "Since 1% H2O2 feeding induced robust responses both in upd3 expression and in sleep behavior, we asked whether gut-derived Unpaired signaling might be essential for the observed ROS-induced sleep modulation. Indeed, EEC-specific RNAi targeting upd2 or upd3 abolished the sleep response to 1% H2O2 feeding." While it is indeed true that there is no additional increase in sleep time due to EEC>upd3 RNAi, it is also true that EEC>upd3 RNAi flies, without any treatment, have already increased their sleep in the first place. It is then possible that rather than unpaired signaling being essential, an upper threshold for maximum sleep allowed by manipulation of these processes was reached. It would be useful to discuss this point.

      Several findings argue against a ceiling effect and instead support a requirement for Unpaired signaling in mediating ROS-induced sleep. Animals with EEC-specific upd2 or upd3 knockdown or null mutation not only fail to increase sleep following H<sub>2</sub>O<sub>2</sub> treatment but actually exhibit reduced sleep during oxidative stress (Fig. 3e, k, l; Fig. 5e, f), suggesting that Unpaired signaling is required to sustain sleep under these conditions. Similarly, animals with glial dome knockdown also show reduced sleep under oxidative stress, closely mirroring the phenotype of EEC-specific upd3 RNAi animals (Fig. 5a–c, g–i). These results support the conclusion that gut-to-glia Unpaired cytokine signaling is necessary for maintaining elevated sleep during oxidative stress. In the absence of this signaling, animals exhibit increased wakefulness. We identify AstA as one such wake-promoting signal that is suppressed during intestinal stress. In the revised manuscript, we present new data showing that this pathway is downregulated not only via Unpaired-JAK/STAT signaling in glial cells but also through reduced AstA release from the gut. This model, in which Unpaired cytokines promote sleep during intestinal stress by suppressing arousal pathways, is discussed throughout the manuscript to address the reviewer’s point.

      (6) In Figure 3k, the dots highlighting the experiment show an empty profile, a full one, and a half one. Please define what the half dots represent.

      We have now clarified the color coding in all relevant figures. Specifically, we acknowledge that the meaning of the half-colored circles indicating H<sub>2</sub>O<sub>2</sub> treatment was not previously defined – it indicates washout or recovery time. In the revised version, these symbols are now clearly labeled in each figure to indicate the treatment condition, ensuring consistent and intuitive interpretation across all panels.

      (7) The authors used appropriate GAL4 and RNAi lines to knock down dome, a upd2/3 JAK-STAT-linked receptor, specifically in neurons and glia, respectively, in order to identify the CNS targets of upd2/3 cytokines produced by enteroendocrine cells (EECs). Pan-neuronal dome knockdown did not alter daytime sleep in adult females, yet pan-glial dome knockdown phenocopied effects of upd2/3 knockdown in EECs. They also observed that EEC-specific knockdown of upd2 and upd3 led to a decrease in JAK-STAT reporter activity in repo-positive glial cells. This supports the authors' conclusion that glial cells, not neurons, are the targets by which unpaired cytokines regulate sleep via JAK-STAT signaling. However, they do not show nighttime sleep data of pan-neuronal and pan-glial dome knockdowns. It would strengthen their conclusion if the nighttime sleep of pan-glial dome knockdown phenocopied the upd2/3 knockdowns as well, provided the pan-neuronal dome knockdown did not alter nighttime sleep.

      We have now added nighttime sleep data for both pan-glial and pan-neuronal domeless knockdowns in the revised manuscript (Fig. 2a). Glial knockdown increased nighttime sleep, similar to EEC-specific upd2/3 knockdown, while neuronal knockdown had no effect. These results further support glial cells as the relevant target of gut-derived Unpaired signaling.

      (8) The authors only used one method to induce oxidative stress (hydrogen peroxide feeding). It would strengthen their argument to test multiple methods of inducing oxidative stress, such as lipopolysaccharide (LPS) feeding. In addition, it would be useful to use a direct bacterial infection to confirm that in flies, the infection promotes sleep. Additionally, flies deficient in Dome in the BBB and infected should not be affected in their sleep by the infection. These experiments would provide direct support for the mechanism proposed. Finally, the authors should add a primary reference for using ROS as a model of bacterial infection and justify their choice better.

      We agree that directly comparing different models of intestinal stress, such as bacterial infection or LPS feeding, would provide valuable insight into how gut-derived signals influence sleep in response to infection. As noted in our detailed responses above, we now include an expanded rationale for our use of H<sub>2</sub>O<sub>2</sub> feeding as a controlled and well-established method for inducing intestinal ROS – one of the key physiological responses to enteric infection and inflammation. In the revised Discussion, we explicitly acknowledge that pathogenic infections – which trigger both intestinal ROS and additional immune pathways – may engage distinct or complementary mechanisms compared to chemically induced oxidative stress. We emphasize the importance of future studies aimed at dissecting these differences. In fact, we are actively pursuing this direction in ongoing work examining sleep responses to enteric infection. For the purposes of the present study, however, we chose to focus on a tractable and specific model of ROS-induced stress to define the contribution of Unpaired cytokine signaling to gut-brain communication and sleep regulation. This approach allowed us to isolate the effect of oxidative stress from other confounding immune stimuli and identify a glia-mediated signaling mechanism linking gut epithelial stress to changes in sleep behavior.

      (9) To confirm that animals lacking EEC Unpaired signaling are not more susceptible to ROS-induced damage, the authors assessed the survival of upd2 and upd3 knockdowns on 1% H2O2 and concluded they display no additional sensitivity to oxidative stress compared to controls. It may be useful to include other tests of sensitivity to oxidative stress, in addition to survival.

      We appreciate the reviewer’s suggestion. In our view, survival is a highly informative and stringent readout, as it reflects the overall physiological capacity of the animal to withstand oxidative stress. Importantly, our data show that animals lacking EEC-derived Unpaired signaling do not exhibit reduced survival following H<sub>2</sub>O<sub>2</sub> exposure, indicating that their oxidative stress resistance is not compromised. Furthermore, we previously confirmed that feeding behavior is unaffected in these animals, suggesting that their ability to ingest food (and thus the stressor) is not impaired. As a molecular complement to these assays in response to this point and others, we have also performed an assessment of neuronal apoptosis (a TUNEL assay, Fig. S3f,g). This assay did not identify an increase in cell death in the brains of animals fed peroxide-containing medium. Thus, gross neurological health, behavior, and overall survival appear to be resilient to the environmental treatment regime we apply here, suggesting that the outcomes we observe arise from signaling per se.

      (10) The authors confirmed that animals lacking EEC-derived upd3 displayed sleep suppression similar to controls in response to starvation. These results led the authors to conclude that there is a specific requirement for EEC-derived Unpaired signaling in responding to intestinal oxidative stress. However, they previously showed that EEC-specific knockdown of upd3 and upd2 led to increased daytime sleep under normal feeding conditions. Their interpretations of their data are inconsistent.

      We appreciate the reviewer’s comment. While animals lacking EEC-derived Unpaired signaling show increased baseline sleep under normal feeding conditions, they still exhibit a robust reduction in sleep when subjected to starvation – comparable to that of control animals (Fig. S3h–j). This demonstrates that they retain the capacity to appropriately modulate sleep in response to metabolic stress. Thus, the sleep-promoting phenotype under normal conditions does not reflect a generalized inability to adjust sleep behavior. Rather, it highlights a specific role for Unpaired signaling in mediating sleep responses to intestinal oxidative stress, not in broadly regulating all sleep-modulating stimuli.

      (11) The authors report a significant increase in JAK-STAT activity in surface glial cells at ZT0 in animals fed 1% H2O2-containing food for 20 hours. This response was abolished in animals with EEC-specific knockdown of upd2 or upd3. The authors confirmed there were no unintended neuronal effects on upd2 or upd3 expression in the heads. They also observed an upregulation of dome transcript levels in the heads of animals with EEC-specific knockdown of upd3 fed 1% H2O2-containing food for 15 hours, which they interpret to be a compensatory mechanism in response to low levels of the ligand. This assay is inconsistent with previous experiments in which animals were fed hydrogen peroxide for 20 hours.

      We thank the reviewer for identifying this discrepancy. The inconsistency arose from a labeling error in the manuscript. Both the JAK-STAT reporter assays in glial cells and the dome expression measurements were performed following 15 hours of H<sub>2</sub>O<sub>2</sub> feeding, not 20 hours as previously stated. We have now corrected this in the revised manuscript.

      (12) The authors show that animals with glia-specific dome knockdown did not have decreased survival on H2O2-containing food, and displayed normal rebound sleep in the morning following sleep deprivation. These results potentially undermine the significance of the paper. If the normal sleep response to oxidative stress is an important protective mechanism, why would oxidative stress not decrease survival in dome knockdown flies (that don't have the normal sleep response to oxidative stress)? This suggests that the proposed mechanism is not important for survival. The authors conclude that Dome-mediated JAK-STAT signaling in the glial cells specifically regulates ROS-induced sleep responses, which their results support.

      We agree that our survival data show that glial dome knockdown does not reduce survival under continuous oxidative stress. However, we believe this does not undermine the importance of the sleep response as an adaptive mechanism. In our survival assay, animals were continuously exposed to 1% H<sub>2</sub>O<sub>2</sub> without the opportunity to recover. In contrast, under natural conditions, oxidative stress is likely to be intermittent, and the ability to mount a sleep response may be particularly important for promoting recovery and maintaining homeostasis during or after transient stress episodes. Thus, while the JAK-STAT-mediated sleep response may not directly enhance survival under constant oxidative challenge, it likely plays a critical role in adaptive recovery under natural conditions.

      (13) Altogether, the authors conclude that enteric oxidative stress induces the release of Unpaired cytokines which activate the JAK-STAT pathway in subperineurial glia of the BBB, which leads to the glial downregulation of receptors for AstA, which is a wake-promoting factor also released by EECs. This mechanism is supported by their results, however, this research raises some intriguing questions, such as the role of upd2 versus upd3, the role of AstA-R1 versus AstA-R2, the importance of this mechanism in terms of survival, the sex-specific nature of this mechanism, and the role that nutritional availability plays in the dual functionality of Unpaired cytokine signaling in regards to sleep.

      We thank the reviewer for highlighting these important questions. Our data suggest that Upd2 and Upd3, while often considered partially redundant, both contribute to sleep regulation, with stronger effects observed for Upd3. This is consistent with prior studies indicating overlapping but non-identical roles for these cytokines. Similarly, although AstA-R1 and AstA-R2 can both be activated by AstA, knockdown of AstA-R2 consistently produces more robust sleep phenotypes, suggesting a predominant role in mediating this effect. The possibility of sex-specific regulation is indeed compelling. While our study focused on females, many gut hormones show sex-dependent activity, and we recognize this as an important avenue for future research. Finally, we have included new data in the revised manuscript showing that gut-derived AstA is downregulated under oxidative stress, further supporting our model in which Unpaired signaling suppresses arousal pathways during intestinal stress.

      (14) Data Availability: It is indicated that: "Reasonable data requests will be fulfilled by the lead author". However, eLife's guidelines for data sharing require that all data associated with an article be made freely and widely available.

      We thank the reviewer for pointing this out. We have revised the Data Availability section of the manuscript to clarify that all data will be made freely available from the lead contact without restriction, in accordance with eLife’s open data policy.

      References

      (1) Li, Y., Zhou, X., Cheng, C., Ding, G., Zhao, P., Tan, K., Chen, L., Perrimon, N., Veenstra, J.A., Zhang, L., and Song, W. (2023). Gut AstA mediates sleep deprivation-induced energy wasting in Drosophila. Cell Discov 9, 49. 10.1038/s41421-023-00541-3.

      (2) Ahrentlov, N., Kubrak, O., Lassen, M., Malita, A., Koyama, T., Frederiksen, A.S., Sigvardsen, C.M., John, A., Madsen, P., Halberg, K.A., et al. (2025). Protein-responsive gut hormone Tachykinin directs food choice and impacts lifespan. Nature Metabolism. 10.1038/s42255-025-01267-0.

      (3) Li, H., Janssens, J., De Waegeneer, M., Kolluru, S.S., Davie, K., Gardeux, V., Saelens, W., David, F.P.A., Brbic, M., Spanier, K., et al. (2022). Fly Cell Atlas: A single-nucleus transcriptomic atlas of the adult fruit fly. Science 375, eabk2432. 10.1126/science.abk2432.

      (4) Kubrak, O., Koyama, T., Ahrentlov, N., Jensen, L., Malita, A., Naseem, M.T., Lassen, M., Nagy, S., Texada, M.J., Halberg, K.V., and Rewitz, K. (2022). The gut hormone Allatostatin C/Somatostatin regulates food intake and metabolic homeostasis under nutrient stress. Nature Communications 13, 692. 10.1038/s41467-022-28268-x.

      (5) Malita, A., Kubrak, O., Koyama, T., Ahrentlov, N., Texada, M.J., Nagy, S., Halberg, K.V., and Rewitz, K. (2022). A gut-derived hormone suppresses sugar appetite and regulates food choice in Drosophila. Nature Metabolism 4, 1532-1550. 10.1038/s42255-022-00672-z.

    1. Detailed Summary: Adapting Territorial Planning to Climate Change in France

      • This briefing document draws on the conclusions of the report by the parliamentary fact-finding mission (mission d'information) on adapting territorial planning to climate change.

      It highlights the delays France has accumulated on adaptation, the inadequacy of current policies, and the urgent need for a more ambitious, coordinated, and funded strategy to cope with the growing impacts of climate disruption.

      1. Context and the Urgency of Adaptation

      The report comes in a context marked by the publication of the PNACC 3 (3e Plan National d'Adaptation au Changement Climatique, the third National Climate Change Adaptation Plan) last March.

      While mitigation policies (reducing greenhouse gas emissions) remain "essential to limit the scale of climate change", they must now be "complemented by an ambitious adaptation policy" in response to impacts that are already visible and those still to come.

      Alarming finding: France has fallen "significantly behind" on adaptation.

      Current responses are "insufficient", the "costs of inaction are growing heavier", and public policies struggle to translate "the climate emergency into concrete action".

      Climate projections: "The current warming trajectory could cause temperatures in France to rise by 2100 by an additional 3 to 4°C relative to the pre-industrial era".

      The scale and speed of these changes are "unprecedented", affecting soils, water availability, and agricultural yields, and increasing the recurrence of extreme events (floods, droughts).

      Impacts on the daily lives of French people: "Loss of homes and infrastructure on the coast or in the mountains, lower economic productivity during heat waves, considerable damage to single-family houses from clay shrink-swell, recurrent road closures, the impossibility of insuring certain properties."

      2. Mitigation vs. Adaptation: Two Complementary and Complex Pillars

      The fight against climate change rests on two pillars:

      Mitigation: Acts on the "sources of global warming".

      Adaptation: Acts on the "consequences of that warming".

      The two approaches are "complementary" (e.g., greening cities both mitigates and adapts).

      However, they can also work against each other if adaptation leads to "maladaptation" (e.g., "responding to heat waves by installing millions of air conditioners").

      Adaptation is "more complex than mitigation" because it requires "reducing physical vulnerabilities and finding functional means of adaptation using methods that differ according to the territory, the sector of activity, and the choices made by elected officials and the population". There is no "single adaptation policy".

      Unlike mitigation, which depends on a global effort, "adaptation policies depend only on us".

      Action is urgent because "adaptation must be designed and implemented now to avoid maladaptation and significant future costs".

      3. The Cost of Inaction and the Benefits of Adaptation

      Although there is no "overall assessment of the cost of failing to adapt", that cost is "significant" and "exponential".

      • Climate change is responsible for nearly half of the increase in insurance costs, which rose from "1.5 to 3.5 billion euros per year" between the 1980s and the 2010s.
      • Heat waves caused between "22 and 37 billion euros in additional costs between 2014 and 2022", not counting thousands of deaths.
      • A striking example of maladaptation: the rollout of fiber-optic networks without burying the cables, which requires an additional investment of "7 to 17 billion euros" to correct.
      • Conversely, "the benefits of adaptation far exceed its costs". According to the World Bank, "every euro invested in adaptation yields between 2 and 10 €". Acting now brings "economic benefits" and avoids having to redo "investments that are already very heavy".

      4. Key Recommendations for an Effective Adaptation Policy

      The report makes around a hundred proposals. The most important include:

      4.1. Strengthening the Legal and Strategic Framework

      • Recognize adaptation as a "genuine national priority" by giving it a dedicated chapter in the Environmental Code (Code de l'Environnement), modeled on the national low-carbon strategy.
      • Give "legislative force" to the reference warming trajectory for adaptation (TRACC), set by the PNACC 3 at 4°C of warming in France by 2100. This would require local strategic planning documents (PLU, SCoT) to take the future climate into account.

      4.2. Funding and Resources

      • Address the blind spot of adaptation financing. The PNACC 3 is not accompanied by a costed financing plan.
      • Develop a "costing methodology" for local authorities.
      • Publish, as an annex to the finance bill, an "orange budgétaire" summarizing the State's adaptation financing measures.
      • Restore an "automatic indexation link" between the increase in the natural-disaster insurance surcharge (raised from 12% to 20% in 2025) and the increase in the revenue of the Fonds Barnier, while orienting the fund more toward prevention.
      • Restore the Fonds Vert (Green Fund) to its 2024 level, and increase both the share devoted to adaptation and the greening requirements for funded projects.
      • "Substantially" strengthen "the human and financial resources of State operators" (e.g., ADEME, CEREMA) involved in adaptation, whose staff numbers have been cut despite a growing workload. CEREMA, for example, has lost "20% of its staff in 10 years" while being involved in nearly "2/3 of the PNACC 3 actions".
      • Free up the resources of the Agences de l'Eau (water agencies) by removing their revenue cap entirely and raising their spending cap.
      • Introduce a "TRACC compliance test" for every major investment and make consideration of the future climate mandatory in public procurement.

      4.3. Local Engineering Capacity and Risk Culture

      • The "shortfall in local engineering capacity" is a major obstacle. Small local authorities lack resources and expertise.
      • Introduce "mandatory training for elected officials" on risk culture and climate issues at the start of their term.
      • Certify the consultancies that carry out vulnerability assessments.
      • Strengthen CEREMA so that it can structure support for local authorities and reduce territorial inequalities.
      • Strengthen the adaptation component of the SRADDET and consider merging the ALEC with the urban planning agencies to create local urban planning and climate agencies.
      • Merge the regional COPES and the CESER to broaden citizen consultation on adaptation choices.

      4.4. Adapting Land-Use Planning and the Law

      • Rethink urban planning law as a whole to avoid maladaptation in at-risk areas.
      • Better align the Montagne and Littoral laws and the ZAN objective (Zéro Artificialisation Nette, net-zero land take) with adaptation imperatives, notably the "strategic retreat of housing or infrastructure".
      • After natural disasters, "rebuilding identically should no longer be the norm". Urban planning and insurance law must evolve to end the practice.
      • Make both insurers (who remain in at-risk areas) and policyholders (who fail to carry out prevention work) more accountable.

      4.5. Specific Issues and Special Cases

      • Flood prevention: Extend the GEMAPI remit to cover runoff and foster "greater territorial solidarity" (between upstream and downstream areas).
      • Coastal risks (coastal erosion, marine submersion): Create a "dedicated fund for financing actions against coastal erosion", fed by taxes on offshore wind turbines and on short-term tourist rentals in the affected areas. It will sometimes be necessary to accept "retreating" rather than fighting in vain.
      • Extreme urban heat and housing: Cities are on the "front line". The number of tropical nights in Paris "will quadruple by 2050".
      • Include summer habitability in the works eligible for MaPrimeRénov' and the zero-interest eco-loan (éco-prêt à taux zéro).
      • Combat urban heat islands: bring more water into the city, add vegetation, develop high-albedo surfaces, and rethink spatial organization.
      • Overseas territories: Although oceanic influence dampens average warming, they face marine erosion and submersion, dwindling freshwater resources, more intense cyclones, ocean acidification, and sargassum (whose classification as a natural disaster is under consideration).
      • Land to fall back on for development is lacking, because coastal housing is highly vulnerable (in Polynesia, "nearly 8 in 10 inhabitants live less than 1 km from the sea").
      • "Significantly increase the aid allocated to housing improvement" by the Ministry of Overseas Territories, to rehabilitate and relocate threatened homes.

      5. Adaptation as a Lever for Resilience and Development

      • The report stresses that adaptation is not only a "constraint and a cost" but also a "lever for resilience, innovation and attractiveness for territories". The sums invested can "serve as leverage to revitalize activity" while "protecting against inevitable future costs". To that end, it is imperative to "get into battle order to integrate this issue into all of our public policies and thereby avoid maladaptation and the squandering of public funds."

      6. Debates and Points of Tension

      Several points of discussion emerged:

      • ZAN law: Disagreements persist over rigid application of the law and the need to adapt it to local specificities (coastal erosion, mountain areas). Some criticize a "dogmatic artificialization" that would be "counterproductive" when human protection is at stake. Others insist on staying the course on ZAN, calling it a "chance to rethink how we inhabit places and live together", given housing vacancy and the impact on ecosystems.
      • Role of the State: A call for a clear reaffirmation of the State's role as "guarantor of national cohesion, of equality and of everyone's safety", particularly for financial solidarity (Fonds Barnier, coastal erosion) and support for local authorities.
      • Regulatory simplification: A demand for a "real effort of simplification, of consistency and above all of arbitration between uncompromising environmental protection and human safety".
      • Mountain resorts: The "possible end of certain economic activities" (such as skiing) is a reality that is hard to accept for territories that have invested heavily. Viable economic diversification is a major challenge that requires strong national support.
      • Coastal paths: Questions were raised about the wisdom of continuing to build costly coastal paths that storms often destroy, given inevitable erosion; the suggestion is a more agile, locally managed approach to investment.
      • Processing times for applications: Administrative procedures for aid after natural disasters need to be sped up, drawing on the emergency procedures used in Pas-de-Calais.
    1. Summary of the parliamentary inquiry committee's hearings of TikTok's moderation officials

      1. Structure of content moderation at TikTok

      TikTok uses a hybrid approach to content moderation, combining automated systems (algorithms and AI) with human intervention.

      Nikessou, head of legal security, stated that "98% of content that violates the terms" is removed proactively by these automated systems.

      Human moderation, which involves 509 French-speaking moderators, focuses on the most sensitive or context-dependent content. Content that is reported or becomes popular undergoes additional review.

      However, the number of human moderators has fallen, from 634 in the first half of the year to 509 in the second.

      TikTok attributes this drop to improvements in its AI tools, which it says allow faster and more consistent removal of problematic content and minimize the exposure of users and employees to it.

      Moderators receive initial training, and their interpretation of the rules is checked weekly and monthly. Psychological support is also provided for moderators exposed to distressing content.

      2. Effectiveness of moderation and challenges posed by problematic content

      TikTok states that "less than 1% of content fails to comply with the guidelines", but the committee questioned the relevance of this percentage, because the recommendation algorithm can amplify even a small volume of problematic content.

      A major example of this challenge is the "SkinnyTok" trend, which encouraged anorexia and eating disorders. TikTok explained that it had initially detected a low volume of content and a low rate of non-compliance.

      Only after the volume grew and a community centered on unhealthy eating habits emerged were measures taken, including blocking the hashtag.

      Despite these efforts, problematic content on this theme persists, such as the "fearfood" hashtag, raising questions about the effectiveness of moderation and TikTok's ability to honor its "obligation of result".

      The TikTok representative admitted that "of course we make mistakes, that is inevitable".

      The committee also raised the problem of "workarounds" used by young people, such as symbols (e.g., a small zebra for self-harm scarring) that let them discuss sensitive topics without being detected.

      TikTok's officials acknowledge the problem and say they are working to anticipate such workarounds.

      3. Handling of reports and relations with external organizations

      TikTok works with organizations such as Stop Fisha, e-Enfance, Génération Numérique and Point de Contact, which act as "trusted flaggers". These organizations have a priority reporting channel that ensures a rapid response.

      However, the committee noted a discrepancy between reports filed by individuals and those filed by organizations, the latter leading to content removal more often. TikTok attributes the difference to the expertise of trusted flaggers, who help identify violations of the community guidelines more precisely.

      4. Account sanctions and tolerance policies

      TikTok applies a tolerance policy that varies with the severity of the violation.

      Minor violations may lead to warnings and opportunities to correct course, whereas serious violations such as hate speech or child sexual abuse material fall under "zero tolerance" and lead to an immediate ban.

      A public document detailing this scale of sanctions exists and can be shared.

      The committee expressed concern about slow reactions to the accounts of well-known influencers, followed by millions of people, who spread problematic content (e.g., sexist remarks, incitement to violence). TikTok pointed to the difficulty of reviewing the entirety of a user's content and to the context-dependent nature of violations.

      5. Moderation and features of TikTok Live

      TikTok Live is a live-streaming product in which creators interact with their community. The moderation team (TNS) is the same as for pre-recorded content and acts fully independently.

      Dedicated models are used to detect signals of violations during live streams, making it possible to interrupt the video or give feedback to creators. Threats to life are reported to local authorities.

      One notable feature is the "Live Match", in which two or four creators compete for 5 minutes, accumulating points through virtual gifts and "Likes" from the audience.

      The winner is the creator with the most points. Virtual gifts range from a "virtual rose" worth about 5 cents to gifts worth several hundred euros. TikTok takes 50% of the value of the gifts.

      External agencies specializing in live streaming are paid by TikTok and can incur financial penalties if their creators break the rules. Each agency is assigned a "health score" that starts at 100 points and decreases when violations occur.

      6. Concerns about addiction and the protection of minors on TikTok Live

      The committee expressed strong concerns about the addictive nature of TikTok, and of Lives in particular. TikTok's officials highlighted several protective measures:

      • Lives banned for under-18s: Creators must verify their identity (ID document and selfie) to start a Live.
      • Sending virtual gifts banned for minors: Only adults can buy and send gifts.
      • Screen-time limit: Users aged 13 to 17 have a limit of 60 minutes per day, enabled by default, with regular reminders.
      • Non-personalized content: Under the DSA, TikTok offers non-personalized feeds so that users can discover a variety of content.

      However, the committee confronted TikTok with first-hand accounts of minors taking part in Lives and spending money through their parents' Apple Pay account, or being encouraged to change their date of birth to access certain features. Age verification remains a major challenge. TikTok acknowledged that the problem persists, stating that "642,000 accounts" belonging to under-13s were removed in France last year, and 6 million per month worldwide.

      The committee also questioned the Live remuneration model, in which "top creators" can earn "several tens of thousands of euros per month", and the nature of "Live Matches", which according to some members resemble "mechanisms similar to those of gambling". TikTok rejected the comparison, arguing that donors have no expectation of winnings and that gambling is strictly prohibited.

      Streamers thanking donors, although TikTok regards it as "politeness", is seen by the committee as a "form of gratification" and an "encouragement to donate".

      7. Transparency and legal obligations

      TikTok publishes quarterly transparency reports as well as dedicated reports on government removal requests, information requests, intellectual property removals, the fight against influence operations, and child sexual abuse.

      The platform is also subject to the obligations of the DSA (Digital Services Act) and of the European Union's Code of Practice on Disinformation.

      Conclusion

      The hearing highlighted the complexity of content moderation on a platform of TikTok's scale, which faces both technological challenges (detecting workarounds, verifying age) and human ones (content volume, cultural context).

      While TikTok detailed its efforts on safety and regulatory compliance, the committee expressed strong reservations about how effective these measures really are, particularly regarding the protection of minors and the persistence of problematic content.

      Additional information was requested from TikTok in writing, with the possibility of a further summons if it is not provided or if the answers diverge.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      In the future, could you please include the exact changes made to the manuscript in the relevant section of the rebuttal, so it's clear which changes addressed the comment? That would make it easier to see what you refer to exactly - currently I have to guess which manuscript changes implement e.g. "We have tried to make these points more evident".

      Yes, we apologize for the inconvenience.

      On possible navigation solutions:

      I'm not sure if I follow this argument. If the network uses a shifted allocentric representation centred on its initial state, it couldn't consistently decode the position from different starting positions within the same environment (I don't think egocentric is the right term here - egocentric generally refers to representations relative to the animal's own direction like "to the left" rather than "to the west" but these would not work in the allocentric decoding scheme here). In other words: If I path integrate my location relative to my starting location s1 in environment 1 and learn how to decode that representation to an environment location, I cannot use the same representation when I start from s2 in environment 1, because everything will have shifted. I still believe using boundaries is the only solution to infer the absolute location for the agent here (because that's the only information that it gets), and that's the reason for finding boundary representations (and not grid cells). Imagine doing this task on a perfect torus where there are no boundaries: it would be impossible to ever find out at what 'absolute' location you are in the environment. I have therefore not updated this part of my review, but do let me know if I misunderstood.

      Thank you for addressing this point, which is a somewhat unusual feature of our network: We believe the point you raise applies if the decoding were fixed. However, in our case, the decoding is dynamic and depends on the firing pattern, as place unit centers are decoded on a per-trajectory basis. Thus, a new place-like basis may be formed for each trajectory (and in each environment). Hence, the model is not constrained to reuse its representation across trajectories or environments, as place centers are inferred based on unit firing. However, we do observe that the network learns to use a fixed place field placement in each geometry, which likely reflects some optimal solution to the decoding problem. This might also help to explain the hexagonal arrangement of learned field centers. Finally, we agree that egocentric may not be entirely accurate, but we found it to be the best word to distinguish from the allocentric-type navigation adopted by the network.
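
      To make the distinction between fixed and per-trajectory decoding concrete, here is a minimal sketch under stated assumptions. It is not the implementation used in the manuscript: the function names, the activity-weighted-mean estimator, and the Gaussian toy activity are all hypothetical stand-ins. It only illustrates that when centers are inferred from firing along each trajectory, the same firing pattern yields centers that follow whatever frame the trajectory is expressed in.

```python
import math
import random


def estimate_centers(positions, activity):
    """Estimate one place-field center per unit as the activity-weighted
    mean of the positions visited on a single trajectory (assumed estimator).

    positions: list of T (x, y) tuples
    activity:  list of T lists, each holding N non-negative unit rates
    """
    n_units = len(activity[0])
    centers = []
    for j in range(n_units):
        total = sum(row[j] for row in activity)
        cx = sum(row[j] * p[0] for row, p in zip(activity, positions)) / total
        cy = sum(row[j] * p[1] for row, p in zip(activity, positions)) / total
        centers.append((cx, cy))
    return centers


def decode_position(rates, centers):
    """Decode a position as the rate-weighted mean of the inferred centers."""
    total = sum(rates)
    x = sum(r * c[0] for r, c in zip(rates, centers)) / total
    y = sum(r * c[1] for r, c in zip(rates, centers)) / total
    return (x, y)


# Hypothetical toy data: Gaussian bumps stand in for trained unit activity.
random.seed(0)
T, N, WIDTH = 200, 16, 0.2
positions = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(T)]
bumps = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]
activity = [
    [math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * WIDTH ** 2)) for bx, by in bumps]
    for x, y in positions
]

centers = estimate_centers(positions, activity)

# Same firing pattern, but the trajectory is expressed in a shifted frame.
shift = (0.3, -0.2)
shifted_positions = [(x + shift[0], y + shift[1]) for x, y in positions]
shifted_centers = estimate_centers(shifted_positions, activity)
```

      In this toy setting the inferred centers move with the shift, so a decoder built from them is never tied to one fixed map; a fixed decoder, by contrast, would reuse `centers` for every trajectory.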

      Regarding noise injection:

      Beyond that noise level, the network might return to high correlations, but that must be due to the boundary interactions - very much like what happens at the very beginning of entering an environment: the network has learned to use the boundary to figure out where it is from an uninformative initial hidden state. But I don't think this is currently reflected well in the main text. That still reads "Thus, even though the network was trained without noise, it appears robust even to large perturbations. This suggests that the learned solutions form an approximate attractor." I think your new (very useful!) velocity ablations show that only small noise is compensated for by attractor dynamics, and larger noise injections are error corrected through boundary interactions. I've added this to the new review.

      Thank you for your kind feedback. We have changed the phrasing in the text to say “robust even to moderate perturbations.” We maintain that the injected noise, while numerically small, is rather large compared to the magnitude of activities in the network (see Fig. A5d): the largest maximal rate is around 0.1, which is similar to the noise level at which output representations fail to re-converge. However, we agree that some moderation is appropriate.

      On contexts being attractive:

      In the new bit of text, I'm not sure why "each environment appears to correspond to distinct attractive states (as evidenced by the global-type remapping behavior)", i.e. why global-type remapping is evidence for attractive states. Again, to me global-type remapping is evidence that contexts occupy different parts of activity space, but not that they are attractive. I like the new analysis in Appendix F, as it demonstrates that the context signal determines which region of activity space is selected (as opposed to the boundary information!). If I'm not mistaken, we know three things: 1. Different contexts exist in different parts of representation space, 2. Representations are attractive for small amounts of noise, 3. The context signal determines which point in representation space is selected (thanks to the new analysis in Appendix F). That seems to be in line with what the paper claims (I think "contexts are attractive" has been removed?) so I've updated the review.

      It seems to us that we are in agreement on this point. Our aim is simply to point out that a particular context signal appears to correspond to a particular (discrete) attractor state (i.e., occupying a distinct part of representation space, as you state). We appear to use slightly different language, so to avoid confusion we have changed this to say that “representations are attractive”.

      Thanks again for engaging with us, this discussion has been very helpful in improving the paper.

      Reviewer #2:

      However, I still struggle to understand the entire picture of the boundary-to-place-to-grid model. After all, what is the role of grid cells in the proposed view? Are they just redundant representations of the space? I encourage the authors to clarify these points in the last two paragraphs on pages 17-18 of the discussion.

      Thank you for your feedback. While we have discussed the possible role of a grid code to some extent, we agree that this point requires clarification. We have therefore added to the discussion on the role of grid cells, which now reads “While the lack of grid cells in this model is interesting, it does not disqualify grid cells from serving as a neural substrate for path integration. Rather, it suggests that path integration may also be performed by other, non-grid spatial cells, and/or that grid cells may serve additional computational purposes. If grid cells are involved during path integration, our findings indicate that additional tasks and constraints are necessary for learning such representations. This possibility has been explored in recent normative models, in which several constraints have been proposed for learning grid-like solutions. Examples include constraints concerning population vector magnitude, conformal isometry \cite{xu_conformal_2022, schaeffer_self-supervised_2023, schoyen_hexagons_2024}, capacity, spatial separation and path invariance \cite{schaeffer_self-supervised_2023}. Another possibility is that grid cells are geared more towards other cognitive tasks, such as providing a neural metric for space \cite{ginosar_are_2023, pettersen_self-supervised_2024}, or supporting memory and inference-making \cite{whittington_tolman-eichenbaum_2020}. That our model performs path integration without grid cells, and that a myriad of independent constraints are sufficient for grid-like units to emerge in other models, presents strong computational evidence that grid cells are not solely defined by path integration, and that path integration is not only reserved for grid cells.”

      Thank you again for your time and input.

    1. Detailed report: Justice and sexual violence, between punitive tradition and the restorative path

      This report covers the main themes and ideas discussed on the radio program "Les matins de France Culture" with Antoine Garapon, honorary magistrate and chair of the Commission Reconnaissance et Réparation (recognition and reparation commission), and Aude Douinge, advocacy and communications officer at the association "Face à l'Inceste".

      The discussion focuses on the limits of traditional punitive justice in dealing with crimes of sexual violence, incest in particular, and puts forward alternatives such as restorative justice and legislative change.

      1. The nature and scale of sexual crimes, particularly incest

      The speakers stress the alarming scale of sexual violence, especially against children.

      Antoine Garapon cites the figure of "160,000 children subjected to sexual violence each year" in France, a statistic he sets against the 1,600 annual homicides, stressing that sexual violence is "10,000 times more" frequent.

      These crimes are characterized by:

      • The identity of the perpetrator: Mostly men, often adults. Fathers (27%), brothers (19%) and uncles (13%) are frequently named as perpetrators.

      • Their "foundational" and paradoxical nature: Antoine Garapon describes them as crimes "reputed to be the most serious, the most foundational", yet paradoxically "the least convicted, and even the least reported".

      The example of sexual crimes committed by priests is particularly emphasized, because an institution whose mission is to proclaim salvation "sows death", which is a total contradiction.

      • The unimaginable and the "system of silence": For a long time, these crimes were considered "beyond the perimeter of what people were prepared to believe".

      A "system of silence" prevailed, often tied to a "conflict of loyalty" in which loyalty to the institution (such as the Church) or to the family "took precedence over the credence given to a child".

      The Abbé Pierre affair is cited as a glaring example in which "everyone knew" but the authorities did not act, addressing the crime only in terms of moral law, with "not a word for the victims".

      • The notion of "pharmakos": The word "victim" belongs to the vocabulary of sacrifice, and the victim was perceived as "the object of the sacrifice".

      Dorothée Dussy's bold thesis, shared by Garapon, suggests that child victims were in a sense "the price of the family order, of the ecclesial order", their silence contributing to the general social order.

      2. The evolution of the "collective conscience" and the role of the #MeToo movement

      The perception of these crimes has changed radically. Drawing on Durkheim, who defines crime as "that which shocks the collective conscience", Antoine Garapon asserts that today "these crimes are considered the most shocking in the general conscience. Perhaps even more so than homicides".

      This change is attributed to a period of "dreaming of a post-sacrificial society" and, significantly, to the "#MeToo" movement, which marked "a major turning point" by revealing a shift in sensibilities.

      Society no longer tolerates dominated groups (children, women) being subjected to violence that goes unpunished, all the more so since rape is punished almost as severely as the most serious crimes.

      3. The limits of traditional criminal justice and the suffering of victims

      Traditional criminal justice, though essential, shows its limits:

      • Centered on the offender and public order: It is "very much centered on the offender, on public order", rather than on the victim.
      • "Judicial therapy": Some magistrates used the expression "that's judicial therapy" to belittle attention paid to victims, implying that it was not the judge's role to look after people's recovery.

      Antoine Garapon, however, maintains that "taking an interest in people's recovery, starting with the victim's, is justice".

      • Difficulty in filing a complaint and traumatic amnesia: Victims suffer from being "prevented from existing" and from an "inability even to reach the point of filing a complaint, even to reach their own memory".

      "Traumatic amnesia" can last for years, preventing even awareness of the facts.

      • The burden of proof: It is "very difficult to know what happened in a school, in a school dormitory, in a confessional, in a family 30 or 40 years ago".

      The perpetrator's confession often remains the key piece of evidence.

      • Devastating impact on victims: A sexual assault can "destroy" a victim, and knowing that the aggressor is "covered in glory", "a holy man", makes it all the more outrageous.
      • Reproduction of violence: Perpetrators of incestuous or sexual violence have often been abused themselves (in at least half of cases), creating a "vicious cycle" and an "incestual climate" in some families.
      • Mental health and life expectancy: Aude Douinge stresses that incest is "deeply traumatizing" and is compounded on average by "three or four other childhood traumas".

      The greater the number of traumas, the more serious the consequences in adulthood.

      A person who has suffered two major childhood traumas has "20 years less life expectancy than the general population".

      More than half of incest victims attempt or have attempted suicide.

      4. Restorative justice: a victim-centered alternative

      Antoine Garapon promotes restorative justice as an "alternative" or complement to criminal justice:

      • Centered on the victim: Its aim is to "restore, to rehabilitate the victim" and to "give them back their voice, a voice of their own and not one that is always delegated or substituted, as in an ordinary trial".
      • Naming and recognition: It seeks to ensure that there is a "naming, meaning that things are named. Yes, it was a recognition. Yes, it is right. The first need of victims is for society to recognize". It amounts to a "social validation of what happened".
      • The goal of "giving a victim back the energy to live": Restorative justice is "much more dynamic" and aims to free the victim from paralyzing isolation.
      • Importance of the victim's word: Unlike the criminal process, it is not characterized by "systematic suspicion of the word" of the victim.
      • Not mandatory: Aude Douinge insists that restorative justice "cannot be mandatory", because "victims cannot be forced to forgive".

      5. Legislative developments and the challenges of statutes of limitations

      The speakers address current debates on the statute of limitations for sexual crimes:

      • Abolishing the limitation period: The association "Face à l'Inceste" campaigns for "no statute of limitations for crimes of incest and the immediate protection of children". Currently, the limitation period is 30 years from the victim's 18th birthday, that is, until age 48.
      • Criminal/civil distinction: The government is considering removing the limitation period for civil proceedings, which would allow financial compensation but leave the burden of proof on the victim. The speakers consider that this does not "tackle the problem head-on", given the difficulty of proof and the risk that a dismissal (non-lieu) would deepen the victim's suffering.
      • Criminal proceedings are fundamental: Aude Douinge stresses that the "criminal response remains extremely important and must be available to victims, since it should be remembered that the statute of limitations is also the aggressor's right to be forgotten".

      She adds that "the sense of unease that inhabits the victim stays with them for life" and that it should "come to haunt the aggressor".

      • Starting the limitation period at "consolidation": One proposed legal solution would be to start the limitation period at the date of "consolidation", meaning the point at which the trauma is judged to have stopped evolving, rather than at the date of the acts. Psychological injury, however, fluctuates.
      • Misuse of corporate assets as an example: The offence of misuse of corporate assets (abus de bien social), for which the limitation period starts only when the offence is discovered, is offered as a model for sexual crimes.

      6. The role of associations and the needs of victims

      The association "Face à l'Inceste", founded 25 years ago by a victim, Isabelle Aubry, plays a crucial role:

      • Making incest visible: Its surveys have shown that "three children per class have experienced incest" and that it affects "one in 10 French people, 7.4 million French people".
      • Legislative campaigns: It campaigned for the reintroduction of the crime of incest into the penal code in 2016 and for the notion of "solidarity".
      • Needs of victims: Beyond the criminal response, victims call for "psychological support and, undeniably, financial support". Psychological care often receives little support, and therapy is frequently stopped for financial reasons. A form entitling incest victims to 100% reimbursement of care by social security exists but is "too little known".
      • Recognition and reparation: Victims need first and foremost "this recognition, and for society to legitimize what they have lived through and come to tell them yes, what happened to you did exist and we are going to recognize it".

      7. Toward "another justice" and the "politicization of the intimate"

      Antoine Garapon argues for "another justice", a more "accomplished" one that brings together several dimensions:

      • Re-articulating forms of justice: He calls for a "re-articulation between civil justice, restorative justice and criminal justice".
      • "Politicization of the intimate": The challenge is to determine "how public authorities will be able to take hold of intimate relationships intelligently in order to put an end to this very, very large number, this excessive number, of acts of sexual violence".
      • Respecting the victim's wishes: It is crucial to "respect the victim's wishes", whether they seek punishment or protection that lets them break free and live anonymously.
      • The rights of the perpetrator: While the focus is on the victim, the speakers recall that "the perpetrator also has rights" and benefits from the presumption of innocence.

      In conclusion, the discussion highlights the need for a more comprehensive and empathetic approach to sexual violence, one that is not limited to punishing the aggressor but also includes deep recognition of victims' suffering, appropriate support, and reparation mechanisms that help them rebuild and regain their capacity to live.

    1. Reviewer #3 (Public review):

      Summary:

      The manuscript by Kroon et al. describes two algorithms which, when combined, achieve high-throughput automation of "martinizing" protein structures with selected protonation states and post-translational modifications.

      The authors have addressed all of my concerns as provided previously. Specifically, Figure S2 will be a very useful guideline for future improvement (e.g., parallelization) of the code.

    1. You can always familiarize yourself with a strange project, no matter how badly shaped it is. I have cleared a few haunted forests in my career—progressively understanding and recovering ownership of a project, rewriting it line-by-line without indulging in the lazy start-over—and my experience is: for all the mud that can accumulate over years of careless maintenance, if you know how to look, you can always find traces of intent, you can infer the presence, the needs and constraints of previous maintainers. And you can use that to put parts of the puzzle together, to gain understanding and confidence to support some of your decisions. This wouldn’t be the case, I believe, with LLM-generated code. With LLMs it’s all random text, plausible nonsense, mocked intent. Past a certain point, the surrender would become irreversible—there’s no resurrecting that kind of project death.

      what kind of prompt history might be useful for auditing g

    1. Briefing: Revolutionising the productivity of associations with no-code and AI

      Introduction

      This briefing recaps the key points covered in the webinar organised by Solidatech, in partnership with Contournement and Nocode Forgood.

      The main aim of the session was to show how "no-code" tools and artificial intelligence (AI) can enable associations to "save dozens of hours per month" and strengthen their digital impact.

      1. Solidatech: Strengthening the digital impact of associations

      • Solidatech is a French organisation founded in 2008, dedicated to supporting associations in their digital transformation.

      With a team of about a dozen people, it operates from Paris and from Deux-Sèvres, home to its work-integration cooperative, Les Ateliers du Bocage (Emmaüs movement), which specialises in the reuse of office equipment. Solidatech is also the French satellite of the international TechSoup network.

      Target audience:

      More than 42,000 associations registered free of charge. Various legal forms: associations (local or larger), foundations, endowment funds, public libraries. Organisations of all sizes, with or without employees (including those run entirely by volunteers).

      Pillars of support:

      • Access to tools and equipment at reduced prices: Software and refurbished hardware (computers, screens, smartphones, etc.) with discounts ranging from 30% to 90%, or even free of charge. Example partners: Cisco, Dell.
      • Developing digital practices:
        • Free resource centre.
        • Support team based in Deux-Sèvres for advice and licence selection.
        • Digital diagnostic tool for assessing digital maturity.
        • Annual national study on the place of digital technology in the association sector.
      • Support and training:
        • Regular newsletters.
        • Prestatek partnership (directory of service providers).
        • Varied thematic webinars.
        • Qualiopi-certified training on topics such as GDPR compliance, digital communication, fundraising, donation management, and the use of tools (Microsoft 365, Canva).
      • Impact: Solidatech helps associations achieve "savings in money [and] in time", gain digital maturity and become more professional.

      2. No-code and AI: Definitions and promises

      Erwan Kezzard, co-founder of Contournement, introduced no-code and generative AI as major levers for making better use of time. He stresses that "time... is an extremely important resource, especially when you work with constrained resources".

      Definition of no-code:

      • "No-code, as its name suggests, refers to tools that let you build digital projects visually and intuitively without necessarily having any coding skills."

      • It lets you create websites and small applications, automate tasks, build internal solutions, etc., in a visual and intuitive way.

      • Example: Excel is not no-code; no-code lets you "build your own tools".

      Definition of generative AI:

      These are the accessible AIs such as "ChatGPT, Mistral and others, which are technologies you can quite quickly ask things of: ask them to rework content, ask them to do research, and they answer".

      Potential and benefits:

      • The goal is to "save dozens of hours per month" by avoiding "repetitive manipulations, tasks that are not computerised, or computerised tasks that are poorly optimised".
      • Main use: optimising productivity, i.e. "working better to do as much, or to do less". This covers the optimisation of "ops" (day-to-day operations) in administration, HR, project management, etc.
      • Example of time saved: "if five times a day I spend 5 minutes on a task, at the end of the year I will have spent 6 full days, 6 days of work, doing nothing but that".
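The "6 days" figure in the quote above can be checked with a few lines of arithmetic. This is a minimal sketch, not the speaker's own calculation: the function name `yearly_time_cost` and the choice of 365 calendar days, 24-hour "full" days and 7-hour working days are all assumptions made for illustration. Under those assumptions the result lands close to the quoted six full days.

```python
def yearly_time_cost(times_per_day: int, minutes_each: float, days_per_year: int = 365) -> dict:
    """Return the yearly cost of a small repeated task in several units."""
    total_minutes = times_per_day * minutes_each * days_per_year
    hours = total_minutes / 60
    return {
        "hours": hours,
        "full_days_24h": hours / 24,   # round-the-clock days (assumption)
        "working_days_7h": hours / 7,  # 7-hour working days (assumption)
    }


# Five times a day, five minutes each time, every calendar day of the year.
cost = yearly_time_cost(times_per_day=5, minutes_each=5)
for unit, value in cost.items():
    print(f"{unit}: {value:.1f}")
```

Counting only working days (for example `days_per_year=220`) gives a smaller total, so the quoted figure is best read as an order of magnitude.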

      3. The three fundamental building blocks of no-code

      Erwan Kezzard illustrated what no-code can do through three main pillars, which are often combined:

      Visual databases:

      • Key tool: Airtable (French alternative: TimeTonic).

      • How it works: It looks like Excel but is a "database", in which each row is a record. Columns have specific data types (links, selectors, attachments, dates).
      • Advantages: Fast creation of filtered and segmented views ("views" for interns, sales staff, the managing director), access management, data-entry forms (created in "under 25 seconds").
      • The concept of "relations": Entries in different tables can be linked (e.g. linking prospects to companies), which solves many problems of re-entry and data consistency. It lets you navigate between data as you would in an application.
      • It lets you build "CRMs that I make myself" and "intranets".
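The idea of "relations" between tables can be sketched in plain Python. This is only an illustration of the concept, not how Airtable stores data: the `Company` and `Prospect` classes, the helper functions and the sample records are all invented for the example. The point is that a prospect stores a link to a company record instead of a retyped company name.

```python
from dataclasses import dataclass


@dataclass
class Company:
    id: int
    name: str


@dataclass
class Prospect:
    id: int
    name: str
    company_id: int  # a link to a Company record, not a retyped company name


# Two "tables", keyed by record id (sample data is invented for illustration).
companies = {1: Company(1, "Acme"), 2: Company(2, "Globex")}
prospects = {
    10: Prospect(10, "Alice", company_id=1),
    11: Prospect(11, "Bob", company_id=1),
    12: Prospect(12, "Chloé", company_id=2),
}


def company_of(prospect: Prospect) -> Company:
    """Follow the link from a prospect to its company."""
    return companies[prospect.company_id]


def prospects_of(company: Company) -> list[Prospect]:
    """Follow the link in the other direction."""
    return [p for p in prospects.values() if p.company_id == company.id]


# Renaming the company once is enough: every linked prospect sees the new name.
companies[1].name = "Acme SAS"
print(company_of(prospects[10]).name)
print([p.name for p in prospects_of(companies[1])])
```

Because the name lives in one place, there is nothing to re-enter and no risk of two spellings of the same company drifting apart.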

      Automation and interconnection:

      • Key tool: Zapier (alternatives: Make, n8n, which is open source but more technical).
      • How it works: It connects different tools to automate processes.
      • Example: "every time there is a new entry in the company table in Airtable, automatically go to ChatGPT and ask it 'You are a CSR policy expert...', then take that conversation, well, find the row in Airtable and update that row with the info directly here".
      • It can automate notifications (Teams, Slack), personalised emails, document creation (PDF), etc.
      • The AI "does just one job" (the question-asking step); no-code "does the job" of the pipework that connects everything.
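The trigger, action, update chain described in the example above can be sketched without any real service. Everything here is hypothetical: `company_table` is an in-memory stand-in for a table, `fake_model` replaces the call to a generative AI, and the wording of the prompt after its first sentence is invented, since the source only quotes its opening. It is not Zapier's, Make's or Airtable's API.

```python
from typing import Callable

# In-memory stand-in for the "company" table (sample rows are invented).
company_table = [
    {"id": 1, "name": "Acme", "csr_summary": None},
]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a generative AI call; returns canned text."""
    return f"(model answer to: {prompt})"


def on_new_company(table: list[dict], record_id: int, ask: Callable[[str], str]) -> dict:
    """Trigger handler: ask the model about the new row, then update that row."""
    # Step 1: find the row that triggered the automation.
    row = next(r for r in table if r["id"] == record_id)
    # Step 2: build the prompt. Only its opening comes from the source;
    # the rest of the wording is an assumption for this sketch.
    prompt = f"You are a CSR policy expert. Describe what is known about {row['name']}."
    # Step 3: write the answer back into the same row.
    row["csr_summary"] = ask(prompt)
    return row


updated = on_new_company(company_table, record_id=1, ask=fake_model)
print(updated["csr_summary"])
```

Passing the model call in as a function mirrors the division of labour in the text: the AI step does one thing, and the surrounding plumbing decides when it runs and where the result goes.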

      Interfaces (websites, mobile/web applications):

      • Key tools: Glide (for mobile applications), Softr (for web applications / intranets).
      • How it works: It lets you create mobile or web applications from an existing database (e.g. Airtable).
      • Advantages: No installation or hosting required, and the look and features can be changed intuitively. "I can modify this mobile application, change its appearance, change which information appears where, and so on".

      4. Philosophy and positioning of Contournement & Nocode Forgood

      Contournement:

      • Business: "training teams and individuals in no-code and AI so they can work more efficiently".
      • It does not aim to launch the "next trendy start-up" but to "save time", "digitalise" and "streamline processes".
      • Offers training in person, by live remote session, and through e-learning (a subscription platform at an accessible cost; discounts coming soon, at 50-100 €/month).

      It also supports people who are far from employment.

      • Vision of no-code: a "complementary skill" that has value on the job market ("I am a communications officer... I train in no-code for a few days, I know how to digitalise and automate within my job, and that brings me something").
      • Warns against the "lure" and the "charlatanism" surrounding no-code as a "new profession".

      Nocode France:

      • "The most active no-code community in the world, which is French".
      • Made up of 15,000 people who "help each other on a voluntary basis", offering advice and guidance.

      Nocode Forgood:

      • Mission: "to give access to the most democratic no-code tools in digital technology, to make life simpler for associations and allow them to multiply their impact".
      • It introduces people to no-code (masterclasses) and, above all, connects associations with "committed no-coders".
      • "MVP" (Minimum Viable Product) approach: "start small", "build a working piece as quickly as possible and then adapt it". The aim is to help associations "build their skateboard" (a first version), then support them from there.
      • Projects: no-coders may work as volunteers (in exchange for Contournement training) or at a "solidarity rate".

      5. Concrete success stories

      • Les Francofolies: "15 people, a single IT person". After two days of Airtable training, they saved "several dozen hours per week", notably on carbon reporting.

      They also called on an expert, Ania, for more complex projects, and decided not to "no-code" certain processes that take up little time.

      • La MedNum: A cooperative that manages its membership, its projects and its training organisation with Airtable (database) and Make (automation).

      It saves "an enormous amount of time". It also uses Notion for internal documentation and text resources.

      • Wildlife Impact Network (via Nocode Forgood): A website built with Softr and a gallery of fundable projects built with Airtable, in two days.
      • Naestan (via Nocode Forgood): An internal steering and reporting tool for an association helping young Afghans, built with Coda.
      • Nocode Forgood (internal): Automated generation of draft LinkedIn posts from associations' feedback, via Make and AI.

      6. Good practices and warnings

      • Map before you start: "A good practice is to map things out before getting started".
      • Do not no-code everything: "no need to no-code everything; the best tools may be specialised tools that you find". If a process works well, do not change it.
      • Modern, interconnectable tools: Favour tools that can connect to one another (check Zapier or Make compatibility). Example: Assoconnect can be integrated with Zapier and Make.
      • Collaboration with IT and legal: "always rely on IT, on legal, on the decision-makers; don't do shadow IT in your corner with no-code if there are people who need to decide with you; it can expose you to risks of poorly managed data, security and so on".
      • Training: Even though no-code is accessible, a minimum of training is needed. "After one day, or 2 to 5 days, of training, people can start doing things".
      • Support from external experts: Recommended to avoid mistakes (e.g. data made public by accident) and to structure more complex projects.
      • Cost: "A self-respecting no-code tool is paid, or rather, a self-respecting no-code tool is freemium". Prices often start between 15 and 30 €/month. You should plan for "between 50 and 100 € of monthly budget" to do a great deal. It is an investment that pays for itself quickly.
      • GDPR and data storage:
        • Hosting in the United States is not intrinsically non-compliant with GDPR. Many American tools are "GDPR compliant".
        • It is crucial to consult a lawyer for sensitive data.
        • "Remember that GDPR is a process in which you have to go through a whole approach; at Contournement, for example, we keep a full register stating where each piece of data is stored, and we regularly take care to delete data that is more than 3 years old".
        • Paid no-code tools generally do not "sell" your data, since their business model is based on subscriptions. The main risk relates to government demands (Cloud Act, Patriot Act).
      • Database migration: Simple via CSV import into Airtable (or TimeTonic, Notion). Existing bases (e.g. Excel) can be synchronised with Airtable via Zapier/Make.
      • Notion vs. Airtable: Notion is "more geared towards note-taking", managing "rich content", an "all-in-one collaborative space" (wiki, internal documentation). Airtable is centred on "data" and how it is structured.

      7. Complementary AI productivity tools

      • Wispr Flow: A voice dictation tool that lets you "dictate and hardly type on the keyboard any more". Accurate recognition of syntax and punctuation.
      • Dict AI: A French, sovereign mobile application for "taking meeting notes automatically" and generating minutes.

      Conclusion

      No-code and AI represent a significant opportunity for associations of all sizes to improve their operational efficiency and become more professional.

      Organisations such as Solidatech, Contournement and Nocode Forgood play an essential role in democratising these technologies by offering resources, training and tailored support, while stressing the importance of ethics, data security and a pragmatic approach to adoption.

    1. Briefing: Making your association's digital project a success

      This document summarises the key points of the webinar "[Webinaire] De l'idée à la réalisation : comment réussir votre projet numérique associatif ?" (From idea to delivery: how to make your association's digital project succeed), led by Gautier Jeanzac of Pastec and a representative of Solidatech.

      It aims to provide a clear methodology and practical advice for associations wishing to undertake a digital project.

      I. Solidatech: A partner for the digital transformation of associations

      Solidatech is presented as a digital solidarity programme created in 2008 and run by Les Ateliers du Bocage (Emmaüs movement). Its aim is to "strengthen the impact of associations through digital technology" and to "strengthen your impact through better use of digital technology".

      1. Beneficiaries and eligibility:

      • Mainly associations under the French law of 1901, but also foundations recognised as being of public utility (RUP), endowment funds and public libraries.
      • Registration is free and open "whatever your sector of activity, and whether you run entirely on volunteers or, on the contrary, have hundreds of employees".
      • More than 42,000 associations are already registered.

      2. Ways of supporting associations:

      • Facilitating access to digital technology:
        • Software at reduced prices (Microsoft, Adobe, Zoom, as well as French solutions such as Assoctoc, Spirit, Net Explorer, Insia).
        • Refurbished IT equipment (by Les Ateliers du Bocage) and new equipment (through partners such as Cisco and Dell).
      • Support for developing digital practices:
        • Resource centre and self-diagnosis tool.
        • Training (Solidatech is a Qualiopi-certified training organisation) and monthly thematic webinars.
        • Newsletters.
        • Tailored consulting services.
      • Co-production and dissemination of knowledge:
        • A national study on the place of digital technology in associations' projects, carried out every 3 years (5th edition launched in mid-April 2025, previous one in 2022).

      II. Pastec: Consulting and software development expertise for impact projects

      • Gautier Jeanzac represents Pastec, a cooperative company that is an "IT consulting and development agency" specialising in supporting "projects with social and/or environmental impact" and working "a lot with associations".

      1. Pastec's areas of expertise:

      • Consulting and project management / shared technical leadership: Help with design (architecture, planning) and with steering product development.
      • Development of digital products: Based on a requirements specification, in "traditional code" or "no-code".
      • Responsible digital technology: Designing services that are "eco-designed", "accessible" and "GDPR-compliant" (General Data Protection Regulation).

      2. Role of the product manager (Gautier Jeanzac):

      The point of convergence between the client's vision, users' expectations and technical possibilities. Helps steer the life of the product by integrating these different perspectives.

      III. Formalising the need: The keystone of a digital project

      Digital transformation is defined as "the idea of integrating digital technologies into all of the association's activities".

      1. Obstacles and opportunities of digital transformation:

      • Major obstacles in associations: "cost, time and skills".
      • Opportunities: "improving operational efficiency, modernising services for beneficiaries, strengthening the impact of associations".
      • Example: Digitising time-consuming administrative processes (e.g. tracking volunteers in Excel files) to save time, so that "that time can then be spent on the core work".

      2. From business need to digital product:

      • Identify a business need: It often stems from a "pain point" (e.g. "a very complex membership process", "a lot of manual data entry").
      • Key question: "where are we spending too much energy and time?"
      • Write a requirements specification (cahier des charges): A document that is "a synthesis, in fact, of all these needs", allowing you to "step back before launching into development".
      • It is "the document that lets you explain and spell out the objective or objectives your digital service or tool has to meet".
      • It serves as the reference for development, whether with a service provider or in-house.

      3. Recommended structure of a requirements specification (6 parts):

      • Context: Who the association is, its history, its objectives, the origin of the digital need (past, present and future of the project).
      • Glossary: Define the terms and acronyms specific to the association so that outsiders can understand them.
      • Technical: Specify the existing technical elements (language, hosting, previous developer, specific constraints, data migration, maintenance contract, GDPR).
      • Usage rules (user stories): Describe users' interactions with the service in the form of "little stories". E.g. "As a volunteer of the association, I want to be able to log in to a members' area that tells me how long I have been a volunteer". Include the summary, the details and the acceptance criteria.
      • UX (User Experience) and UI (User Interface):
        • UX: "the ability to design what is called a user journey" (how the user interacts with the software and its various steps).
        • UI: "the user interface" (what the user will see, the visual appearance, the mock-ups, the wireframes, i.e. screen sketches).
      • Appendices: Any relevant supporting document.

      4. Example of a CRM for an entrepreneurship association:

      The requirements specification made it possible to track interactions with stakeholders. The need for a CRM came from an "audit that brought the need to light". It gives a concrete illustration of user stories and technical constraints (integration into an overall information system, GDPR compliance).

      IV. Building the product: Involving users and iterating

      The build phase stresses the importance of "involving the teams, the volunteers, your users, in thinking about what your product is going to look like".

      1. Carrying out field work (user interviews):

      • Objective: Understand users' pain points and needs (what they want, what they think of the existing solution, how to improve it).
      • Five main steps:
        • Effective preparation: Define the information to collect and the "common thread" of the interview.
        • Start of the interview: Set the framework, and above all insist that there are "no right answers", to encourage honesty.
        • During the interview: Talk "20%" and listen "80% of the time". Favour "open questions".
        • After the interview: Translate the needs into features.
        • Prioritisation: Develop first what "costs the least time and brings the most to my users" (the "quick wins").
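The "quick wins" rule in the prioritisation step can be sketched as a simple sort. This is one possible reading, not a method given in the webinar: scoring each feature by value divided by effort is an assumption, and the `Feature` class, the 1-5 value scale and the sample backlog are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    user_value: int     # benefit to users on an assumed 1-5 scale
    effort_days: float  # estimated development time in days


def quick_wins_first(backlog: list[Feature]) -> list[Feature]:
    """Order features so that high value for low effort comes first."""
    return sorted(backlog, key=lambda f: f.user_value / f.effort_days, reverse=True)


# Invented backlog, loosely inspired by the membership example in the text.
backlog = [
    Feature("Volunteer dashboard", user_value=4, effort_days=8),
    Feature("Simple membership form", user_value=5, effort_days=1),
    Feature("Email reminder", user_value=3, effort_days=2),
]

for feature in quick_wins_first(backlog):
    print(f"{feature.name}: {feature.user_value / feature.effort_days:.2f}")
```

The ratio is deliberately crude; its only job is to make the cheapest, most useful item surface at the top of the list after each round of interviews.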

      2. Agile methodology and "small steps":

      • Put the user at the heart of the thinking.
      • Start with the "smallest possible batch of features to develop" (a "skateboard" before the "car").
      • Iterative process: New features are released, users use them and report new needs, and the cycle starts again.
      • Example: A simple membership form (name, email), then fields added gradually (address, region) based on feedback from the teams.
      • Advantages of the iterative approach: Only the features that are actually requested get developed, budget is not wasted, and the tools fit the needs.
      • Usage observation: Physically observe users to understand how they interact and adjust the product accordingly.

      V. Development: Traditional code vs. no-code

      Two main families of development are presented:

      1. Traditional code:

      • Advantages: Few or no limits; it is possible to "do everything" to meet very precise needs.
      • Drawbacks: Slower and therefore more expensive (cost is based on development time), with a "technical complexity" that makes it hard for non-specialists to follow (a good partner is needed to "explain the technical points in plain terms").

      2. No-code:

      • Definition: Tools for creating applications or websites without coding, like "Lego" (assembling blocks).
      • Advantages: "Quickly getting a solution in place", iterating fast, a cost that is "a little bit cheaper".
      • Drawbacks: "Less customisable"; requires close attention to "eco-design, accessibility and sometimes GDPR"; proprietary tools (the solution does not belong to you).

      3. Examples of no-code tools:

      • CRM: Monday, Airtable (watch out for GDPR, since hosting is in the USA and subject to the Patriot Act), Grist (GDPR-friendly alternative that can be hosted wherever you want).
      • Forms: Typeform, Tally (version hosted in Europe, more GDPR-friendly).
      • Showcase websites: Wix, Webflow, Bubble (Canva can be an alternative if the association is comfortable with the tool, since free premium versions exist for associations).
      • Web applications: Bubble, XAR.
      • Automation: Make, Zapier.
      • Wireframes: wireframe.cc (free, simple), Canva, Paint, PowerPoint. For more complex mock-ups: Figma, the Adobe suite (Adobe Express is free in its premium version for associations).

      VI. Working with an agency, and final advice

      Developing digital tools, whether in code or no-code, "takes time and technical skills".

      If an association lacks the time and prefers to go through an agency, several pieces of advice are given:

      • A dedicated in-house steering team: With "decision-making power", to make changes easier.
      • A collaborative, constructed approach: Involve all stakeholders, especially end users.
      • Do not hesitate to ask for, and insist on, plain-language explanations: Make sure you understand what is happening technically so that you can take back control if needed.
      • Adopt the small-steps strategy: Avoid developing useless features and make the best use of the budget.
      • Be adaptable: Find the right balance between the initial vision and user feedback.
      • Keep time aside for training: Understand the technical aspects of the product.

      On the use of AI for development:

      • Nuanced answer: "I have no idea".
      • Vigilance: It depends on the tools and models used.
      • Caution: Do not deploy things you do not control, to avoid errors or unmanageable problems.

      This briefing offers a complete roadmap, from identifying the need to delivering the project, stressing the importance of listening to users and of an iterative, adaptable approach.
    1. Summary and in-depth analysis of occupational cancers and their invisibility in France

      This summary document explores the many facets of the invisibility of occupational cancers in France, drawing on the work of the Giscope (Groupe d'Intérêt Scientifique de recherche sur les cancers professionnels) in Seine-Saint-Denis, in particular the research of Anne Marchand, sociologist and historian, and the comments of Nathalie Bajos.

      It highlights the institutional, scientific, social and cultural mechanisms that contribute to this invisibility, despite a significant prevalence and dramatic human and social consequences.

      1. The historical and institutional divide between occupational health and public health

      A central theme is the historical and persistent divide between the workplace and the living space in matters of health. This divide, analysed by the historian Thomas Le Roux, led to "the gradual erasure of the worker's body from health and political concerns" as early as the 18th and 19th centuries. It created an artificial separation between industrial hygiene and public hygiene, the latter becoming "the hygiene of only part of the public, ignoring what happens in the workplace".

      This dichotomy has major consequences:

      • A fragmented approach to health: It prevents "thinking about the health of individuals and populations as a whole" and "leaves many factors of social inequality in the shadows".
      • Ill-suited prevention campaigns: Cancer prevention campaigns are "exclusively centred (...) on changing so-called individual behaviours", ignoring the role of working conditions and the responsibility of the State and employers. This leads to an approach that "points to the responsibility of individuals alone" while leaving in the shadows the "carcinogens present in the world of work".
      • A blind spot in public health research: Work is often "a blind spot in approaches to health", as if workplaces were not also places of life where people spend a large part of their time.

      2. The hidden epidemic: Under-estimation and under-reporting of occupational cancers

      The sources reveal massive under-estimation and under-reporting of cancers of occupational origin, in contrast with the steady rise in cancer incidence in France (which has doubled since the 1990s).

      • The gap in numbers: In 2023, only 1,452 occupational cancers were recognised, most of them linked to asbestos. Yet consensus epidemiological estimates indicate that "4 to 8% of new cancer cases are thought to be of occupational origin", i.e. "up to 34,644 cases per year". This huge gap creates a "somewhat circular phenomenon: the fewer occupational cancers are recognised, the less people with cancer are able to think about the link between their work and their illness, and the less they declare it as an occupational disease".

      • Widespread exposure: The Sumer survey shows that "11% of employees on average in the public and private sectors (...) are exposed to at least one carcinogen in their usual work activity". These exposures are highly unequal, particularly affecting skilled workers in the automotive industry (90% exposed), temporary workers, and young people under 25.
      • Polyexposure: "Polyexposure", i.e. simultaneous or successive exposure to different carcinogens, "multiplies the risk of developing cancer". One striking example is a man exposed to 17 identified carcinogens over the course of his working life.
      • Long latency: The delayed effects of carcinogens (20 to 50 years after exposure) make the causal link hard to establish for victims and for the medical profession. Moreover, it is "scientifically and medically impossible to distinguish one factor from another in its onset" (e.g. asbestos vs. tobacco for lung cancer).
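The size of the gap between recognised and estimated occupational cancers can be made explicit with the figures quoted above. This is a back-of-the-envelope sketch: it assumes that the "up to 34,644 cases" figure corresponds to the 8% upper bound, so the implied yearly total of new cancer cases and the lower-bound estimate are derived values, not numbers stated in the source. The function and constant names are invented for illustration.

```python
RECOGNISED_2023 = 1452               # recognised occupational cancers (from the text)
UPPER_ESTIMATE = 34_644              # "up to 34,644 cases per year" (from the text)
LOW_SHARE, HIGH_SHARE = 0.04, 0.08   # "4 to 8%" of new cancer cases (from the text)


def implied_new_cases(upper_estimate: int, high_share: float) -> float:
    """Yearly total of new cancers implied if the upper estimate equals the high share."""
    return upper_estimate / high_share


def recognition_rate(recognised: int, estimated: float) -> float:
    """Share of estimated occupational cancers that were recognised, in percent."""
    return 100 * recognised / estimated


total_new_cases = implied_new_cases(UPPER_ESTIMATE, HIGH_SHARE)
lower_estimate = LOW_SHARE * total_new_cases

print(f"Implied new cancer cases per year: {total_new_cases:,.0f}")
print(f"Estimated occupational cases: {lower_estimate:,.0f} to {UPPER_ESTIMATE:,}")
print(f"Recognised vs upper estimate: {recognition_rate(RECOGNISED_2023, UPPER_ESTIMATE):.1f}%")
print(f"Recognised vs lower estimate: {recognition_rate(RECOGNISED_2023, lower_estimate):.1f}%")
```

Under these assumptions, recognised cases amount to only a few percent of the estimated total at either bound, which is the gap behind the "circular phenomenon" described above.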

      • Les Mécanismes d'Invisibilisation des Cancers Professionnels

      • Plusieurs facteurs, imbriqués et complexes, contribuent à cette invisibilité :

      3.1. Les Données Officielles et la Prévalence de l'Amiante

      • Loupe déformante : Les chiffres de reconnaissance de l'Assurance Maladie sont le "premier facteur de cette invisibilité sociale", donnant l'impression que les cancers professionnels sont rares et majoritairement liés à l'amiante. L'amiante est "l'arbre qui cache la forêt des autres cancérogènes".
      • Maladies "signatures" et droits spécifiques : L'existence de maladies "signature" (mésothéliome) et de droits spécifiques (retraite anticipée, FIVA) pour les victimes de l'amiante a paradoxalement renforcé cette perception limitée des cancers professionnels.

      3.2. The Legal and Administrative Framework: The Occupational Disease Tables

      • A matter of negotiation and power relations: The occupational disease tables, established by the Code de la Sécurité Sociale, are the "result of negotiation between representatives of employee unions and representatives of employer unions". Each term chosen is the product of "power relations", opening or closing the conditions for recognition.
      • Restrictions and obsolescence: These tables are "shifting objects of law", but their content often falls "far short of scientific knowledge". The example of bladder cancer linked to aromatic amines, whose "title" requires "twelve years of university-level chemistry to manage to connect one's work to this cancer", illustrates their complexity and inadequacy.
      • Breast cancer: an example of invisibility lifted: The absence of a table for breast cancer long masked its occupational origin, confining it to a "certain fatality". The efforts of unions and researchers have made it possible to "make the occupational factor visible in this epidemic", showing the potential impact of inclusion in a table.
      • The "negotiated disease": Occupational disease is not a medical category but "a legal-political category", a "negotiated disease", which makes it distinct from medical causality.

      3.3. Ignorance of Exposures and Employees' Sense of Being Protected

      • Lack of information: Most of the people affected "were unaware they had been exposed to carcinogenic substances". This ignorance may stem from not knowing the dangers of a substance (such as asbestos in the 1980s) or from its insidious and imperceptible presence (ionizing radiation, chemicals with no odor or immediate effect).
      • False sense of security: Employees have the "feeling of having been protected" because they imagine that "barring accidents, everything is under control in the company" or that dangerous substances would be banned.
      • Misleading safeguards:
        • Occupational Exposure Limit Values (VLEP): VLEPs are the "product of social compromises" and do not mean the absence of risk, since "most carcinogens have no threshold effect".
        • Enhanced Medical Surveillance (SMR): SMR, although reserved for exposed employees, "protects nothing at all" but can create an illusion of protection ("He thought he was being protected; in fact he was being lulled to sleep").
        • Personal Protective Equipment (EPI): Protective equipment is often ineffective or used for other reasons (protecting the product), blurring the perception of risk (e.g. gloves in a clean room).

      3.4. Lack of Incentive to File a Claim and Insufficient Compensation

      • The compensation horizon: The "amount offered at best (...) cannot exceed the monthly amount of the last wages", which is often "not enough to become a driver of engagement". Compensation is "flat-rate" and far below what a victim would obtain in court.
      • Competing schemes: The disability scheme is perceived as "easier and better paid", steering victims away from recognition as an occupational disease. This strategy "contributes greatly to making invisible the effects of work on health and therefore occupational cancers" and "socializes the cost of these diseases across the whole community" instead of having employers bear it.
      • The meaning of money: Compensation money carries different cultural meanings. Unawareness of the "polluter pays" principle means that some do not want to "cost the Sécu more", or feel "shame" at "equating this step with a request for social assistance". For widows, the money can "burn the fingers", generating social stigma.

      3.5. The Decisive Role and Shortcomings of the Medical Profession

      • Insufficient training: Doctors are "very little trained in this very specific area of social security law" (about "ten hours over their ten or so years of study").
      • Difficulty establishing the link: Trained in medical causality, they "approach this medico-administrative category with great circumspection", and many refuse to write a medical certificate for patients who smoke, unaware of or unwilling to accept the presumption of occupational origin.
      • Fear of conflict: The initial medical certificate (CMI) is a "space of conflict" and can lead to summonses before the Conseil de l'Ordre at the request of employers. The task of "certifying" an occupational origin takes them away from their core work, which is care.
      • Lack of questioning: Overall, doctors "ask very little about the activities performed and even less about the working conditions" of their patients.

      3.6. Unequal Access to Recognition and Transformations of Work

      • Burden of proof: The presumption of occupational origin under the tables is limited, and the employee must often "provide proof of the disease", "proof of employment" (work certificates, pay slips) and above all "proof of the usual work activity, of the exposing activity, up to 40 years before the onset of the disease".
      • Fragile career paths: This ability to prove exposing activities is "very unequally distributed". It is easier for employees with "job stability" or who can rely on a "dynamic union or retiree network" (miners, dockers).
      • Fragmented work and subcontracting: The situation is "much harder for isolated employees", those "who have had very fragmented careers" (up to 35-40 employers), and especially for "employees of subcontracting firms", who are both "among the most exposed and the least recognized". Subcontracting, legalized in the 1970s, has become a way for companies to "circumvent their obligations" and to "outsource the activities that were the most arduous and the most exposing", reinforcing invisibility.
      • Temporary workers and migrant workers: Temporary workers, whose documents say "absolutely nothing about the site where they worked", and "migrant seasonal agricultural workers", often "assigned to chemical treatment, where toxic risks are greatest, but whose disease, if it occurs, will be neither visible in France nor linked to work", are particularly vulnerable.

      3.7. Lack of Institutional Traceability

      • Regulatory volatility: The "waltz of regulations" prevents the establishment of a stable system guaranteeing "rigorous traceability over time of carcinogenic exposures" across long periods (20, 30, 40 years).

      • The Structural Nature of Invisibility and the Social Justice Issue

      • Analysis of the genesis of the category "occupational cancer" reveals a "certain recurrence in the obstacles to the construction of knowledge". From the early 20th century onward, despite early identification of cancers linked to specific industries (coal, dyes, X-rays), the same failures of reporting and recognition keep recurring. Posters from 1938 urging doctors to report occupational diseases attest to how old this problem is.

      • Lack of reliable data: Cancer data are "incomplete", with registries that cover "less than a quarter of the population in France" and that are biased (toward a more rural, older and better-off population, with fewer people of foreign origin). The most polluted areas (Seveso sites) are often not covered. The bill to create a national cancer registry is an "indispensable" step.

      • Manufacturing "non-problems": The invisibility of occupational cancers is part of a dynamic of "manufacturing non-problems, or how to keep politics from getting involved".
      • A question of social justice: Ultimately, this invisibility raises the "question of the differential value of lives" and constitutes a "question of social justice", as Nathalie Bajos points out.

      In conclusion, fighting occupational cancers requires far more than individual prevention campaigns.

      It calls for deep reform of the recognition mechanisms, more training for the medical profession, better traceability of exposures, fairer compensation and, above all, a paradigm shift that fully integrates occupational health into public health, recognizing the workplace as an essential place of life.

    1. I will use the Raspberry Pi to access the MSA and add further commands to the code. That way I can sort different colors. The MSA has various sensors that I can already use for this. I would only need to attach a camera and connect it as well.

      Can you please elaborate on this? How do you plan to achieve it? In the last session you already got considerably closer to this goal.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study aims to create a comprehensive repository about the changes in protein abundance and their modification during oocyte maturation in Xenopus laevis.

      Strengths:

      The results contribute meaningfully to the field.

      Weaknesses:

      The manuscript could have benefitted from more comprehensive analyses and clearer writing. Nonetheless, the key findings are robust and offer a valuable resource for the scientific community.

      We would like to thank the reviewer for his/her positive feedback on our article. The public review points out that "The manuscript could have benefitted from more comprehensive analyses and clearer writing." We have rewritten several sections and provided more detailed explanations of the analysis and interpretation of some data (see below for details). We have also followed all of the reviewer's recommendations, some of which specifically highlighted areas lacking clarity. We would also like to thank the reviewer for pointing out some errors, for which we apologize, and which have now been corrected. We sincerely appreciate the reviewer's thorough work, as it has greatly enhanced the clarity and precision of the manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors analyzed Xenopus oocytes at different stages of meiosis using quantitative phosphoproteomics. Their advanced methods and analyses revealed changes in protein abundances and phosphorylation states to an unprecedented depth and quantitative detail. In the manuscript they provide an excellent interpretation of these findings putting them in the context of past literature in Xenopus as well as in other model systems.

      Strengths:

      High quality data, careful and detailed analysis, outstanding interpretation in the context of the large body of the literature.

      Weaknesses:

      Merely a resource, none of the findings are tested in functional experiments.

      I am very impressed by the quality of the data and the careful and detailed interpretation of the findings. In this form the manuscript will be an excellent resource to the cell division community in general, and it presents a very large number of hypotheses that can be tested in future experiments. Xenopus has been and still is a popular and powerful model system that led to critical discoveries around countless cellular processes, including the spindle, nuclear envelope, translational regulation, just to name a few. This also includes a huge body of literature on the cell cycle describing its phosphoregulation. It is indeed somewhat frustrating to see that these earlier studies using phosphomutants and phospho-antibodies were just scratching the surface. The phosphoproteomics analysis presented here reveals much more extensive and much more dynamic changes in phosphorylation states. Thereby, in my opinion, this manuscript opens a completely new chapter in this line of research, setting the stage for more systematic future studies.

      We thank the reviewer for his/her extremely positive comments. The public review points out that "none of the findings are tested in functional experiments." This is entirely accurate. We focused our work on obtaining the highest quality proteomic and phosphoproteomic data possible, and then sought to highlight these data by connecting them with existing functional data from the literature. This approach has opened up research avenues with enormous, previously unforeseen potential, in a wide range of biological fields (cell cycle, meiosis, oogenesis, embryonic development, cell biology, cellular physiology, signaling, evolution, etc.). We chose not to delay publication by experimentally investigating the narrow area in which we are specialists (meiotic maturation), while our data offer a vast array of research opportunities across various fields. Our goal was, therefore, to present this extensive dataset as a resource for different scientific communities, who can explore their specific biological questions using our data. This is why we submitted our article to the "Repository" section of eLife. Nevertheless, in the context of the comparative analysis of the mouse and Xenopus phosphoproteomes performed at the reviewer’s request, we felt it was important to complement this new section with functional experiments that not only validate the proteomic data but also provide new insights into certain proteins and their regulation by Cdk1 (new paragraph lines 824-860 and new Figure 9).

      We are also grateful to the reviewer for the recommendation to improve the manuscript by including more comparisons between our Xenopus data and those from other systems. We have followed this suggestion (see below), which has significantly enriched the article (new paragraph lines 824-860 and new Figure 9).

      Reviewer #3 (Public review):

      Summary:

      The authors performed time-resolved proteomics and phospho-proteomics in Xenopus oocytes from prophase I through the MII arrest of the unfertilized egg. The data contains protein abundance and phosphorylation sites of a large set of proteins at different stages of oocyte maturation. The large sets of the data are of high quality. In addition, the authors discussed several key pathways critical for the maturation. The data is very useful not only for researchers working on Xenopus oocytes but also for those working on oocyte biology in other organisms.

      Strengths:

      The data of proteomics and phospho-proteomics in Xenopus oocyte maturation is very useful for future studies to understand molecular networks in oocyte maturation.

      Weaknesses:

      Although the authors offered molecular pathways of the phosphorylation in the translation, protein degradation, cell cycle regulation, and chromosome segregation. The author did not check the validity of the molecular pathways based on their proteomic data by the experimentation.

      We thank the reviewer for his/her positive comments. The public review points out that "The author did not check the validity of the molecular pathways based on their proteomic data by the experimentation." This is entirely accurate. We focused our work on obtaining the highest quality proteomic and phosphoproteomic data possible, and then sought to highlight these data by connecting them with existing functional data from the literature. This approach has opened up research avenues with enormous, previously unforeseen potential, in a wide range of biological fields (cell cycle, meiosis, oogenesis, embryonic development, cell biology, cellular physiology, signaling, evolution, etc.). We chose not to delay publication by experimentally investigating the very narrow area in which we are specialists (meiotic maturation), while our data offer a vast array of research opportunities across various fields. Our goal was, therefore, to present this extensive dataset as a resource for different scientific communities, who can explore their specific biological questions using our data. This is why we submitted our article to the "Repository" section of eLife. Nevertheless, in the context of the comparative analysis of the mouse and Xenopus phosphoproteomes performed at the reviewer’s request, we felt it was important to complement this new section with functional experiments that not only validate the proteomic data but also provide new insights into certain proteins and their regulation by Cdk1 (new paragraph lines 824-860 and new Figure 9).

      We have also followed all of the reviewer's recommendations and thank him/her, as the suggestions have significantly enhanced the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Fig. 1 -> In the Figure legend "mPRβ" is called "mPRb". In the Figure, it is indicated that PKA substrates are always activated by the phosphorylation. As the relevant substrates and the mode-of-action of the Arpp19 phosphorylation are not clear at the moment, this seems to be preliminary. It could for example also be conceivable that PKA phosphorylation inhibits a translation activator. In addition, the PG-dependent translation of RINGO/Speedy should be included in the model.

      We fully agree with the reviewer. PKA substrates can either be activators of the Cdk1 activation pathway, which are inhibited by phosphorylation by PKA, or repressors of the same pathway, which are activated by phosphorylation by PKA. This is now illustrated in the new Fig. 1. In addition, we have also included RINGO/Speedy in the model and in the text (lines 78-79) and corrected "mPRb" in the legend.

      (2) Lane 51-52 -> it is questionable if the meiotic divisions can be called "embryonic processes"

      We agree with the reviewer's comment, and we have removed the word “embryonic”.

      (3) Lane 53 and lane 106-107 -> recent data have indicated that transcription already starts during cell cycle 12 and 13 in most cells (e.g. Blitz and Cho: Control of zygotic genome activation in Xenopus (2021))

      We apologize for this mistake. The text has been corrected and the reference added (lines 53 and 107).

      (4) Lane 61-62 -> "MI" and "MII" are given as abbreviation for "first and second meiotic spindle"

      The text has been clarified to explain that MI refers to metaphase I and MII to metaphase II (lines 61-64).

      (5) Lane 131-132 -> "single-cell" is mentioned redundantly in this sentence.

      The sentence has been corrected (lines 131-132).

      (6) Fig. 2B -> it is not explained what is plotted as "Average levels" on the x-Axis. Is it the average of expression over all samples or at a given time point? Are the values given as a concentration or are the values normalized? If so, how were they normalized?

      We agree with the reviewer's comment that “Average levels” may have been unclear. In the new Fig. 2B, we have re-plotted the graph using the average protein concentration during meiosis, measured as described in the Methods section.

      (7) In Fig. 2-supplement 3E -> from the descriptions it is not entirely clear to me what the difference to the data in Fig. 2B is?

      We thank the reviewer for his/her question regarding the relationship between the data in Fig. 2B and Fig. 2-supplement 3E. We confirm that the raw data visualized in Fig. 2-supplement 3E are the same as those in Fig. 2B. However, in Fig. 2-supplement 3E, the data are color-coded differently to highlight the number of proteins whose concentrations change during meiotic divisions, based on the threshold adopted. The legend of Fig. 2-supplement 3E has been modified to clarify this point.

      (8) Lane 225-226 -> Kifc1 is a minus-end directed motor

      This mistake has been corrected (lines 232-233).

      (9) Lane 271 -> Serbp1, here mentioned to be involved in stabilization of mRNAs, has also been implicated in the regulation of ribosomes (e.g. Leesch et al. 2023). Regarding the overall topic of this manuscript, this could be mentioned as well.

      We agree with the referee that the important role of Serbp1 in the control of ribosome hibernation needs to be mentioned. We have included this point in the revised manuscript together with the reference (lines 277-279).

      (10) Lane 360-363 -> it is mentioned that APPL1 and Akt2 act "to induce meiosis". Furthermore, in the Nader et al. 2020 paper, Akt2 phosphorylation is reported to happen within 30min after PG treatment. In the present work, they only seem to get phosphorylated when Cdk1 is activated. Is there an explanation for this discrepancy?

      Indeed, Nader et al. (2020) indicate that Akt2 is phosphorylated on Ser473 (actually, they should have mentioned Ser474, which is the phosphorylated residue on Akt2; Ser473 corresponds to the numbering of Akt1) between 5 and 30 minutes post-Pg, which supports their hypothesis of an early role for this kinase. However, these conclusions should be taken with caution, considering that their functional experiment using antisense against Akt2 depletes only 25% of the protein, the antibody used to visualize Akt2 phosphorylation also recognizes phosphorylated Akt1 and Akt3, and they did not analyze phosphorylation of the protein after 30 minutes. Therefore, we cannot determine whether the level observed at 30 minutes represents a maximum or if it is just the onset of the phosphorylation that peaks later, possibly after activation of Cdk1, for example.

      Regarding our measurements: we clearly observe phosphorylation of Akt2 following Cdk1 activation on Ser131. We did not detect Akt2 phosphorylation on Ser474, but since our measurements started 1 hour post-Pg, this protein may have returned to a dephosphorylated state on Ser474.

      Therefore, the observations of Nader et al. and ours involve different residues and different phosphorylation kinetics, Nader et al. limiting their analysis to the first 30 minutes, whereas we started at 1 hour.

      We have revised the manuscript text to make these aspects clearer (lines 387-392).

      (11) Fig. 3B -> it could be made clearer in the Figure that all these sites belong to class I

      A title “Class I proteins” has been added in Fig. 3B to clarify it.

      (12) Lane 433-434 -> the authors write that the proteomic data of this study confirm that PATL1 is accumulating during meiotic maturation. However, in Fig. 2B PATL1 is not among the significantly enriched proteins.

      We apologize for this error. Indeed, PATL1 protein is not significantly enriched. The text has been corrected (lines 461-465).

      (13) Fig. 4B -> Zar2 is color-coded to increase in abundance. This is clearly different to published results and what is shown in Fig. 2B of this manuscript.

      Indeed, our dataset shows that the quantity of Zar2 decreases. This does not appear anymore in Figure 2B since Zar2 average concentration cannot be estimated. We made an error in the color coding, which has now been corrected in Figure 4B.

      (14) Lane 442-444 -> it might be worth mentioning that the interaction between CPEB1 and Maskin, and thus probably its role in regulation of translation, could not be reproduced in other studies (Minshall et al.: CPEB interacts with an ovary-specific eIF4E and 4E-T in early Xenopus oocytes (2007) or Duran-Arque et al.: Comparative analyses of vertebrate CPEB proteins define two subfamilies with coordinated yet distinct functions in post-transcriptional gene regulation (2022)).

      This clarification is now mentioned in the text, supported by the two references that have been added (lines 471-477).

      (15) Lane 483-485 -> The meaning of these sentences is not entirely clear to me. What exactly is the similarity with the function of Emi1? What does "...binding of Cyclin B1..." mean (binding to which other protein?). What is the similarity between Emi1 and CPEB1/BTG4, both of which are regulators of mRNA stability/polyadenylation?

      We apologize if these sentences were unclear. Our intention was to emphasize the central role of ubiquitin ligases in regulating multiple events during meiotic divisions. We used SCF<sup>βTrCP</sup>, a well-studied ubiquitin ligase in Xenopus and mouse oocytes during meiosis, as an example. SCF<sup>βTrCP</sup> regulates the degradation of several substrates, including Emi1, Emi2, CPEB1, and Btg4, whose degradation or stabilization is essential for the proper progression of meiosis. Lastly, we highlighted that these regulatory processes, mediated by protein degradation, may be conserved in mitosis, for example the destruction of Emi1. We have rewritten this paragraph for clarity (lines 513-518).

      (16) Lane 521-522 and 572-573 -> the authors write that Myt1 was not detected in their proteome. However, in Fig. 6A they list "pkmyt1" as a class II protein. On Xenbase, "pkmyt1" is the Cdk1 kinase, "Myt1" is a transcription factor, so the authors might have been looking for the wrong protein.

      We thank the reviewer for this accurate observation. We have modified the text to correct this error (lines 554 and 607).

      (17) Lane 564-565 -> The authors state that Cdk1 activity can be measured by analyzing Cdc27 S428 phosphorylation. However, in vivo the net phosphorylation of a site is always depending on the relevant kinase and phosphatase activities. As S428 is a Cdk1 site, it is not unlikely that it is dephosphorylated by PP2A-B55, which by itself is under the control of Cdk1. Do the authors have direct evidence that the change in phosphorylation of S428 can only be attributed to the changes in Cdk1 activity?

      There is evidence in the literature that Cdc27 is dephosphorylated by PP2A (Torres et al., 2010). In Xenopus oocytes, PP2A activity is high during prophase (Lemonnier et al., 2021) and decreases at the time of Cdk1 activation, mediated by the Greatwall-ENSA/Arpp19 system, remaining low until MII (Labbé et al., 2021). Therefore, the period where fluctuations in Cdk1 activity are difficult to assess, from NEBD to MII, corresponds to a phase of inhibited PP2A activity. As a result, the phosphorylation level of Cdc27 reflects primarily the activity of Cdk1. We have added this clarification in the text (lines 597-600).

      (18) Fig. 7C and 7D -> in 7C, for Nup35/Nup53 there is a phospho-peptide GIMEVRS(60)PPLHSGG. In Fig. 7D phosphorylation of GVMEMRS(59)PLFSGG is analyzed. Is this the same phosphosite/region of Nup35/Nup53? How can there be a slightly different version of the same peptide in one protein? Are these the L- and S-version of Nup35/Nup53? It is also very surprising that the two phosphosites belong to different classes, class III and class II, respectively.

      We thank the reviewer for this observation. The peptides GIMEVRS(60)PPLHSGG and GVMEMRS(59)PLFSGG correspond to the same phosphorylation site in the L and S versions of Xenopus laevis Nup35, respectively. The L version peptide was classified as Class III, while the S version was not assigned to any class due to its high phosphorylation level in prophase, which prevented it from meeting the log<sub>2</sub> fold-change threshold of 1 required by our analysis to detect significant differences.
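
      The threshold argument in this reply can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the occupancy values, the function names and the inclusive comparison (>= 1) are assumptions made for the example. It shows why a site that is already highly phosphorylated in prophase can rise further without reaching a log<sub>2</sub> fold-change of 1.

```python
import math

# Threshold quoted in the response; treating it as inclusive is an assumption.
LOG2_FC_THRESHOLD = 1.0


def log2_fold_change(reference: float, value: float) -> float:
    """Return log2(value / reference); both levels must be positive."""
    if reference <= 0 or value <= 0:
        raise ValueError("levels must be positive")
    return math.log2(value / reference)


def passes_threshold(reference: float, value: float,
                     threshold: float = LOG2_FC_THRESHOLD) -> bool:
    """True when the absolute log2 fold change reaches the threshold."""
    return abs(log2_fold_change(reference, value)) >= threshold


# Hypothetical phosphorylation levels (prophase -> later stage).
low_start = passes_threshold(reference=0.10, value=0.60)   # about 6-fold
high_start = passes_threshold(reference=0.70, value=0.95)  # about 1.4-fold

print(low_start, high_start)
```

      With these made-up numbers, the site that starts low clears the threshold while the site that starts high does not, even though both increase.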

      (19) Table 1 -> second last column is headed "Whur, 2014"

      The typo has been corrected.

      (20) Fig. 8 -> Why are all the traces starting at t=1h after PG?

      The labeling of the graphs in Fig. 8 has been corrected, and the traces now begin at t0.

      (21) Lane 754 -> Although a minority, there are also some minus-end directed kinesins, e.g. Kifc1

      We agree with the reviewer. We should have mentioned that, in addition to dyneins, some kinesins are minus-end directed motors, especially since one of them, Kifc1, is regulated at the level of its accumulation. We have rephrased the relevant sentences to incorporate this observation (lines 790-793).

      (22) Section "Assembly of microtubule spindles and microtubule dynamics" -> Although this section clearly has a strong focus on phosphorylation, it might be worth mentioning again that many regulators of the microtubule spindle, e.g. TPX2, are among the upregulated proteins in Fig. 2B/C

      We have already discussed that the protein levels of certain key regulators of the mitotic spindle (Tpx2, PRC1, SSX2IP, Kif11/Eg5 among others) are subject to control during meiotic maturation in a previous chapter “Protein accumulation: the machinery of cell division and DNA replication” (lines 230-239). We agree with the reviewer that this important observation can be mentioned again at the beginning of this chapter on phosphorylation control. We have added a sentence regarding this at the start of the paragraph (lines 774-775).

      Reviewer #2 (Recommendations for the authors):

      While I find the manuscript excellent and detailed already in its current form, I would appreciate including even more comparisons to other systems. In particular, a similar phosphoproteomics experiment has been performed in starfish oocytes undergoing meiosis (Swartz et al, eLife, 2021), and there are several studies on mitosis of diverse mammalian cells. It would be very exciting to see to what extent changes are conserved.

      We thank the reviewer for this recommendation, which we have attempted to follow. We matched our mass spectrometry dataset using the phosphor-occupancy_matlab package, which is available as part of our code repository (https://github.com/elizabeth-van-itallie) and was previously described (Van Itallie et al, 2025). Unfortunately, we were unable to match our dataset with the data from Swartz et al. (2021) on starfish oocytes due to the low sequence conservation. However, we have compared our dataset with the dataset from Sun et al. (2024) on mouse oocyte maturation. We identified a total of 408 conserved phosphorylation sites, which mapped to 320 proteins in Xenopus and 277 in mice (refer to a new paragraph: lines 824-860, new Figure 9, Methods: lines 1011-1032 and 1060-1065, and Appendix 7). The phosphorylation patterns during meiosis showed a significant cross-species correlation (Pearson r = 0.39, p < 0.0001; see new Figure 9A), demonstrating the evolutionary conservation of phosphoproteomic regulation. Important phosphorylation events, including Plk1 at T201, Gwl at S467, and Erk2 at T188, were upregulated in both species, in line with the activation of the Cdk1 and MAPK signaling cascades (Figure 6B, new Figure 9A-B). We validated several of these phosphorylation sites by western blotting and demonstrated their dependency on Cdk1 activation (new Figure 9C). Together, these findings reinforce the notion that fundamental phospho-regulatory pathways are conserved during oocyte maturation in vertebrates.
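
      For readers unfamiliar with the statistic quoted above, the following is a minimal sketch of how a Pearson correlation between matched phosphosite changes in two species could be computed. It is not the authors' MATLAB code; the function name and the example values are hypothetical and only show the shape of the calculation.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 changes for five matched sites (one value per site).
xenopus_changes = [2.1, 0.4, -1.0, 1.5, 0.0]
mouse_changes = [1.2, 0.9, -0.2, 0.3, 0.5]

r = pearson_r(xenopus_changes, mouse_changes)
print(f"Pearson r = {r:.2f}")
```

      Each position in the two lists must refer to the same conserved site, so the matching step described in the response has to happen before any correlation is computed.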

      Reviewer #3 (Recommendations for the authors):

      (1) Page 6, the first paragraph of Results section: Please describe the method on how the authors measured and quantified the proteomes in different stages of Xenopus oocyte maturation briefly. Without the experimental design, it is very hard to evaluate the results in the following paragraphs.

      As requested by the reviewer, we added a few sentences describing the method of proteomics and phosphoproteomics measurements in oocytes resuming meiosis (lines 151-158).

      (2) In the phospho-proteome, it is better to classify the amino acids for the phosphorylation such as Ser, Thr, and Tyr. Particularly how many tyrosine phosphorylations are in the list.

      Our phosphosite dataset contains 80% Ser, 19.9% Thr, and 0.01% Tyr. Phospho-Tyr sites are slightly less abundant than described in the literature (in most cells "roughly 85-90% of protein phosphorylation happens on Ser, ~10% on Thr, and less than 0.05% on Tyr"; Sharma et al., 2014). The same observation was made regarding the distribution of phosphorylated amino acids in mouse oocytes, where phospho-Tyr abundance is lower than in mouse organs (Sun et al., 2024). These observations are now reported in the manuscript (lines 309-313).
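The residue breakdown reported above can be sketched as a simple count over the site list. This is a minimal illustration, not the pipeline used in the manuscript: the `(protein, residue, position)` tuple layout, the function name `residue_composition`, and the entry `protA` are assumptions made for the example.

```python
from collections import Counter

def residue_composition(sites):
    """Return the percentage of phosphosites on Ser (S), Thr (T) and Tyr (Y).

    `sites` is an iterable of (protein, residue, position) tuples;
    this layout is a hypothetical choice for the sketch.
    """
    counts = Counter(residue for _, residue, _ in sites)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no phosphosites supplied")
    return {res: 100.0 * counts.get(res, 0) / total for res in "STY"}

# Toy input: three sites named in the response plus one hypothetical entry.
toy_sites = [
    ("plk1", "T", 201),
    ("gwl", "S", 467),
    ("erk2", "T", 188),
    ("protA", "S", 10),  # hypothetical placeholder
]
composition = residue_composition(toy_sites)
```

Applied to the full site table, the same function would return the Ser/Thr/Tyr percentages quoted in the response.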

      (3) In class II (Figure 3), when Cdk1 (line 326) is a major kinase, how many phosphorylation sites are a target of Cdk1 (with the Cdk1-motif)? Moreover, do the authors find any other consensus sequences for the phosphorylation? Those are either known or unknown. This information would be useful for the readers.

      We thank the reviewer for this valuable comment. To address it, we used the kinase prediction server (https://kinase-library.phosphosite.org/kinase-library/score-site) to analyze Class II phosphosites. These new results are mentioned in lines 340-349 and illustrated in a new Figure (Figure 3—figure supplement 1A). We identified 303 sites predicted to be phosphorylated by Cdk1. Of these, 166 were also predicted as Erk1/2 targets, reflecting the similarity between Cdk1 and Erk1/2 consensus motifs.

      Cdk1 substrate phosphorylation is governed by more than just the presence of a consensus sequence. In addition to its preference for the (S/T)P×(K/R) motif, Cdk1/cyclin complexes achieve specificity through docking interactions with short linear motifs (SLiMs) recognized by the cyclin subunit, such as LxF motifs (Loog & Morgan, 2005), and via the Cdk-binding subunits Cks1 or Cks2, which interact with phosphorylated threonine residues in primed substrates (Örd et al, 2019). These mechanisms promote processive multisite phosphorylation and allow Cdk1 to target substrates even at non-canonical sites. Our motif-based analysis captures only part of this complexity and may underestimate the number of true Cdk1 targets.
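To make the limits of a purely motif-based view concrete, the (S/T)P×(K/R) consensus can be written as a regular expression. The sketch below is not the Kinase Library scoring used for the analysis above; it is a plain pattern match, and the function name `classify_site`, the window strings, and the separate "minimal" (S/T)P category are assumptions made for illustration.

```python
import re

# Full consensus (S/T)-P-x-(K/R) and the shorter proline-directed (S/T)-P form.
FULL_MOTIF = re.compile(r"[ST]P.[KR]")
MINIMAL_MOTIF = re.compile(r"[ST]P")

def classify_site(window, center):
    """Classify the residue at index `center` of the peptide `window`.

    Returns 'full' for (S/T)P x (K/R), 'minimal' for (S/T)P only,
    and 'none' otherwise. Docking motifs (LxF) and Cks-dependent
    priming are not modelled here.
    """
    if FULL_MOTIF.match(window, center):
        return "full"
    if MINIMAL_MOTIF.match(window, center):
        return "minimal"
    return "none"

# Hypothetical sequence windows with the phospho-acceptor at index 3.
examples = ["AAASPAKAA", "AAATPAAAA", "AAASAAKAA"]
labels = [classify_site(w, 3) for w in examples]
```

A site labelled `none` by such a pattern can still be a genuine Cdk1 target through cyclin docking or Cks-mediated priming, which is why a motif count is best read as a lower bound.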

      To further explore kinase involvement across phosphosite classes, we extended the analysis to all clusters and identified the most enriched kinase predictions for each (lines 360-365, new Figure 3—figure supplement 1B). In Class II, the most enriched kinases included Cdk1, Erk2, and Plk1, supporting the conclusions drawn from the identity of the phosphosites in this class. Other kinases, such as Cdk2, Cdk3, Cdk5, Cdk16, KIS, JNK1, and JNK3, were also identified.

      (4) Figure 3B: Why do the authors show this kind of Table only for Class I, not Classes II-V? It would be informative to show candidate proteins in other classes.

      We chose to present the candidate proteins from Class I in a table format because the number of phosphosites (136) was too small to allow a meaningful Gene Ontology (GO) enrichment analysis. Therefore, we manually curated the data and highlighted proteins whose Class I phosphosites are associated with specific biological processes. For Classes II–V, the higher number of phosphosites allowed us to perform GO enrichment analyses. Since several of the enriched processes were shared across different classes, and some proteins have phosphosites in multiple classes, we opted to organize the results by biological processes rather than by class. We agree with the reviewer that it is indeed valuable to highlight interesting proteins with Class II–V phosphosites. We have done so in Figures 4 through 8, using graphical representations instead of tables, in order to make the data more accessible and avoid long tables. Additionally, the Supplementary Figures provide detailed phosphorylation trends for many of the proteins discussed in the main figures.

      (5) It would be nice if the authors compare this phospho-proteome in Xenopus oocyte maturation with that in mouse oocyte maturation (Sun et al. 2024) in terms of evolutional conservation of the phospho-proteomes.

      We thank the reviewer for this suggestion. As now detailed in the manuscript, we compared our Xenopus phosphoproteome with the dataset from Sun et al. (2024) on mouse oocyte maturation using the phospho_occupancy_matlab package, available as part of our code repository (https://github.com/elizabeth-van-itallie) and previously described (Van Itallie et al, 2025). We identified 408 conserved phosphorylation sites corresponding to 320 Xenopus and 277 mouse proteins (see new paragraph: lines 824-860, new Figure 9, Methods: lines 1011-1032 and 1060-1065, and Appendix 7). Phosphorylation dynamics across meiosis were significantly correlated between the species (Pearson r = 0.39, p < 0.0001; new Figure 9A), highlighting evolutionary conservation of the phosphoproteomes. Key phosphorylation events such as Plk1 at T201, Gwl at S467, and Erk2 at T188 increased in both species, consistent with activation of the Cdk1 and MAPK pathways (Figure 6B, new Figure 9A–B). We experimentally validated several of these phosphorylation sites by western blot (Erk2, Plk1, Fak1 and Akts1) and demonstrated their dependency on Cdk1 activation (new Figure 9C). Together, these new findings support the conservation of key phospho-regulatory mechanisms across vertebrate oocyte maturation.
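The cross-species comparison reduces to a Pearson correlation over matched sites. The sketch below shows that calculation in plain Python; it is not the phospho_occupancy_matlab implementation, and the variable names and the log2 fold-change values are hypothetical placeholders rather than data from either study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either sequence is constant.
    return sxy / math.sqrt(sxx * syy)

# Hypothetical log2 fold-changes for five conserved sites, one value per
# species, listed in the same site order.
xenopus_log2fc = [1.2, 0.4, -0.3, 2.1, 0.9]
mouse_log2fc = [0.8, 0.1, 0.2, 1.5, -0.4]
r = pearson_r(xenopus_log2fc, mouse_log2fc)
```

With the 408 matched sites in place of the toy lists, the same function would yield the reported coefficient; the associated p-value requires a separate significance test.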

      Minor points:

      (1) Reference lists: Please add Sun et al (2024) shown in line 115.

      This important reference has been added (lines 115, 134, 313 and 826).

      (2) Figure 1, red arrows for the inhibition: This should be "T" shape for a better understanding of these complicated pathways.

      We agree with the reviewer’s remark, and we have modified Figure 1.

      (3) Line 236-238: The authors referred to the absence of Cdc6 in oocyte maturation in Xenopus. However, Figure 2C shows that Cdc6 belongs to a list of accumulating proteins with Orc1 and Orc2 etc. and the authors did not discuss this discrepancy in the text. Please clarify the claim.

      We apologize for the unclear wording in our text. The section of the manuscript regarding the pre-RC components may have been misleading. The text has been revised to clarify that Cdc6 was not detected in prophase-arrested oocytes by western blot and that it accumulates during meiotic maturation after MI, enabling oocytes to replicate DNA (lines 243-250).

      (4) Line 306: Please add the link to phosphosite.org.

      The link has been added (line 319).

    1. Summary note: Psychotic control (emprise) and Folie à Deux

      This document explores the notion of psychotic control, using "Folie à Deux" as its main model, and distinguishes it from classic perverse control.

      The presentation highlights the mechanisms, typologies, medico-legal implications, and relational dynamics underlying this complex phenomenon.

      1. Introduction to Folie à Deux (Folie A2) and its link to control

      Folie à Deux is a clinical entity first described by Lasègue and Falret in 1877.

      It is characterized by the development of delusional ideas in a "secondary patient" under the influence of an already delusional "primary patient". The necessary conditions include a close relationship that is isolated from outside influences.

      • Clinical example: the story of Sarah and Chloé

      The introductory example illustrates the dynamics of Folie à Deux: Sarah, an anxious and controlling mother, and her daughter Chloé, who suffers from conditions that reinforce the mother's obsessional defenses.

      Their fused and isolated relationship leads to the development of a delusion of persecution and dispossession centered on an aunt.

      At first the mother tries to reason with her daughter, but faced with Chloé's violence and their isolation, she eventually gives in and adopts her daughter's delusional ideas. This adherence leads to a violent act against the aunt, which results in both of them being hospitalized.

      Once separated, the mother recovers quickly and becomes critical of the events, whereas the daughter takes longer to recover.

      This case highlights the influence of the primary patient (the daughter) on the secondary patient (the mother) in a context of isolation and pressure.

      2. Typologies of Folie à Deux

      The proposed analysis distinguishes two main types of Folie à Deux:

      • Imposed Folie à Deux (Lasègue and Falret):

      • The primary patient is active in the delusion, and the secondary patient is more passive, delusional "by reflection" or "under pressure".

      • Once separated, the secondary patient quickly returns to their previous state and regains their critical capacity.

      • This is considered incomplete psychotic control, because the effects of the pathology do not persist in the secondary subject once the relationship is broken.

      • Communicated Folie à Deux (Marandon de Montyel):

      • The secondary patient is a predisposed subject who develops a psychiatric illness through contact with the delusional subject.

      • The disorders persist even after separation.

      • This is considered complete psychotic control, because it produces a second primary subject capable of "contaminating" other individuals.

      3. Psychopathological mechanisms of Folie à Deux

      Two main mechanisms explain the dynamics and the persistence of the relationship in Folie à Deux:

      • Projection of hostility:

      • In the example of Sarah and Chloé, Chloé's hostility toward her mother (arising from her inability to become independent) cannot be expressed directly. It is therefore projected onto an object outside the relationship (the aunt), which preserves the mother-daughter dyad and eases internal tensions.

      • The secondary patient's acceptance of the projective mechanism is crucial; any resistance leads to increased aggressiveness and violence.

      • "Accepting the delusion is calming; in fact, it meant that in the end there were no more episodes of violence within the relationship, since all the violence was redirected outward."

      • Identification with the enemy (or with the aggressor):

      • Inspired by the work of Freud and Ferenczi on identification with the aggressor (notably in child victims of abuse).

      • It consists of adopting the aggressor's point of view, introjecting their guilt, or anticipating their needs, with the aim of appeasing the aggressor and protecting oneself physically and psychologically.

      • In Folie à Deux, the secondary patient's acceptance of the delusion calms the (physical) violence within the relationship, since the violence is redirected outside the couple.

      4. Psychotic control mirrored against classic (perverse) control

      Psychotic control is conceptualized as a specific form of control, distinct from but comparable to classic perverse control.

      Features common to control in general:

      • Asymmetric relationship: one subject is reduced to the status of an object, whose psychic space is occupied by the other.

      • Denial of otherness and of criticism: access to criticism is impossible; the goal is total fusion and adherence to the other's ideas.

      • Phases of capture and domination: the other is appropriated through seduction and fascination, followed by a phase of conditioning through manipulation (verbal, physical), with the alternation of seduction and aggression at the heart of the process.

      • Lasting after-effects: the victim may struggle to leave the dynamic and may carry long-term after-effects into future interactions.

      Distinction between Perverse Control (Dracula) and Psychotic Control (Don Quixote):

      | Characteristic | Perverse Control (Dracula) | Psychotic Control (Don Quixote) |
      | --- | --- | --- |
      | Objective / Motivation | Siphoning off the partner's narcissism ("feeds on the blood of its victims"); perverse enjoyment. | Detachment from reality, exit from reality. The subject is motivated not by enjoyment of the other but by the illness itself, which guides the pathological journey and the interactions. |
      | Underlying mechanism | Sets up proactive mechanisms to produce its effects; aims to fill narcissistic anxieties. | The illness guides the "pathological journey" and the interactions. The anxieties are more archaic: anxieties of "annihilation" arising very early in psychic development. |
      | Nature of the constraint | The "perverse" subject deliberately brings the other under control in order to feed narcissistically. | The psychotic primary patient brings the other under control "under the effect of the illness", enclosed in a delusional bubble. |
      | Dynamics | Deliberate alternation of seduction and aggression for conditioning. "Exposing the mechanisms" (in therapy) can help guard against it. | Alternation of fusion (the secondary's adherence to the delusion) and aggression (the secondary's resistance); aims at complete fusion. Mechanism of projection of hostility. What is at stake for the victim is the threat of tipping completely into the primary's madness. |
      | Medico-legal implications | The victim is under moral constraint but is not "delusional" themselves. May be found not criminally responsible on grounds of "irresistible constraint" (article 122 of the penal code). | If a violent act occurs, it can be attributed to the delusion if both subjects act within that framework. Discernment may be abolished (article 121 of the penal code), which can lead to both subjects being found not criminally responsible. |

      Definition of psychotic control:

      It is "an asymmetric relationship in which the psychotic primary subject brings a secondary subject under control in conditions of prolonged isolation, with a denial of otherness (as in perverse control) but also a more global denial of reality that takes hold, a drive toward fusion and adherence to the partner's delusion directed by the illness, a rise in aggressiveness in the face of resistance that is lowered by the projective mechanisms, and a more archaic anxiety of annihilation in the subjects."

      5. Medico-legal implications

      • Imposed Folie à Deux (incomplete psychotic control): The secondary patient is not ill in the psychiatric sense but acts under an irresistible moral constraint.

      Article 122 of the penal code, on the absence of responsibility in cases of constraint, may apply.

      A ruling of the Rennes Court of Appeal (2017) concerning the partner of a cult guru illustrates this situation: separation was enough to end the partner's delusion, and she was declared not criminally responsible on the grounds of constraint.

      • Communicated Folie à Deux (complete psychotic control): If a violent act occurs within the framework of the delusion, the act can be attributed to the delusion.

      The psychiatric expert may conclude that discernment was abolished, on the model of article 121 of the penal code, potentially leading to both subjects being found not criminally responsible.

      6. Broader reflection on control

      Control is not specific to perversion or to Folie à Deux; it can take various forms (paranoid, obsessional, perverse) and is often linked to early developmental difficulties and insecure attachments.

      The earlier and more severe the difficulties (for example, in the parental environment), the higher the risk of developing a psychiatric illness (such as schizophrenia) and of resorting to a controlling mode of relating.

      However, it is important to distinguish a lasting, problematic mode of interaction from an occasional, one-off recourse to a form of control (for example, during a period of acute stress), which does not indicate lasting pathological functioning.

    1. You tell an AI your application needs to be GDPR compliant- you aren’t really sure what that means, but the AI does- and so it dutifully constructs a workflow for data deletion. Fantastic! Now customers can click the Delete button and feel like their data is no longer retained. The next day you have another idea: “Let’s backup customer records to my home server, I don’t trust Amazon and I think we need to be in control of our own data!!” The AI complies, throwing in some effusive praise for your strategic thinking. These examples might feel too silly and contrived to be real, but I promise that developers have to talk clients down from this kind of bullshit every single day. Devs spend far more time discovering reality underlying the problem than writing code to solve it.

      Certain professional disdain for business problems vs. technical abstractions, but in the end...

    1. 6.6 Advocacy

      Add a Reflect before this section:

      Why are professional standards for teaching preparation AND the NAEYC Code of Ethical Conduct important for the field of Early Childhood Education?

      What are some ways to practice being an ethical staff member?

      What are some practices to keep in mind when it comes to setting limits and professional boundaries between families and direct-caregivers?

    1. Author response:

      eLife Assessment

      This useful study presents Altair-LSFM, a solid and well-documented implementation of a light-sheet fluorescence microscope (LSFM) designed for accessibility and cost reduction. While the approach offers strengths such as the use of custom-machined baseplates and detailed assembly instructions, its overall impact is limited by the lack of live-cell imaging capabilities and the absence of a clear, quantitative comparison to existing LSFM platforms. As such, although technically competent, the broader utility and uptake of this system by the community may be limited.

      We thank the reviewers and editors for their thoughtful evaluation of our work and for recognizing the technical strengths of the Altair-LSFM platform, including the custom-machined baseplates and detailed documentation provided to support accessibility and reproducibility. We respectfully disagree, however, with the assessment that the system lacks live-cell imaging capabilities. We are fully confident in the system’s suitability for live-cell applications and will demonstrate this by including representative live-cell imaging data in the revised manuscript, along with detailed instructions for implementing environment control. Moreover, we will expand our discussion to include a broader, more quantitative comparison to existing LSFM platforms—highlighting trade-offs in cost, performance, and accessibility—to better contextualize Altair’s utility and adaptability across diverse research settings.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The article presents the details of the high-resolution light-sheet microscopy system developed by the group. In addition to presenting the technical details of the system, its resolution has been characterized and its functionality demonstrated by visualizing subcellular structures in a biological sample.

      Strengths:

      (1) The article includes extensive supplementary material that complements the information in the main article.

      (2) However, in some sections, the information provided is somewhat superficial.

      Our goal was to make the supplemental content as comprehensive and useful as possible. In addition to the materials provided with the manuscript, our intention is for the online documentation (available at thedeanlab.github.io/altair) to serve as a living resource that evolves in response to user feedback. For this reason, we are especially interested in identifying and expanding any sections that are perceived as superficial, and we would greatly appreciate the reviewer’s guidance on which areas would benefit from further elaboration.

      Weaknesses:

      (1) Although a comparison is made with other light-sheet microscopy systems, the presented system does not represent a significant advance over existing systems. It uses high numerical aperture objectives and Gaussian beams, achieving resolution close to theoretical after deconvolution. The main advantage of the presented system is its ease of construction, thanks to the design of a perforated base plate.

      We appreciate the reviewer’s assessment and the opportunity to clarify our intent. Our primary goal was not to introduce new optical functionality beyond that of existing high-performance light-sheet systems, but rather to reduce the barrier to entry for non-specialist labs.

      (2) Using similar objectives (Nikon 25x and Thorlabs 20x), the results obtained are similar to those of the LLSM system (using a Gaussian beam without laser modulation). However, the article does not mention the difficulties of mounting the sample in the implemented configuration.

      We agree that there are practical challenges associated with handling 5 mm diameter coverslips. However, the Nikon 25x can readily be replaced by a Zeiss W Plan-Apochromat 20x/1.0 objective, which eliminates the need for the 5 mm coverslip[1]. In the revised manuscript, we will more explicitly detail the practical challenges in handling a 5 mm coverslip and mention the alternative detection objective.

      (3) The authors present a low-cost, open-source system. Although they provide open source code for the software (navigate), the use of proprietary electronics (ASI, NI, etc.) makes the system relatively expensive. Its low cost is not justified.

      We understand the reviewer’s concern regarding the use of proprietary control hardware such as the ASI Tiger Controller and NI data acquisition cards. While lower-cost alternatives for analog and digital control (e.g., microcontroller-based systems) do exist, our choice was intentional. By relying on a unified and professionally supported platform, we minimize the complexity of sourcing, configuring, and integrating components from disparate vendors—each of which would otherwise demand specialized technical expertise. Moreover, in future releases, we aim to further streamline the system by eliminating the need for the NI card, consolidating all optoelectronic control through the ASI Tiger Controller. This approach allows users to purchase a fully assembled and pre-configured system that can be operational with minimal effort.

      It is worth noting that the ASI components are not the primary cost driver. The full set—including XYZ and focusing stages, a filter wheel, a tube lens, the Tiger Controller, and basic optomechanical adapters—costs approximately $27,000, or ~18% of the total system cost. Additional cost reductions are possible. For example, replacing the motorized sample positioning and focusing stages with manual alternatives could reduce the cost by ~$12,000. However, this would eliminate key functionality such as autofocusing, 3D tiling, and multi-position acquisition. Open-source mechanical platforms such as OpenFlexure could in principle be adapted, but they would require custom assembly and would need to be integrated into our control software. Similarly, the filter wheel could be omitted in favor of a multi-band emission filter, reducing the cost by ~$5,000. However, this comes at the expense of increased spectral crosstalk, often necessitating spectral unmixing. An industrial CMOS camera—such as the Ximea MU196CR-ON, recently demonstrated in a Direct View Oblique Plane Microscopy configuration[2]—could substitute for the sCMOS cameras typically used in high-end imaging. However, these industrial sensors often exhibit higher noise floors and lower dynamic range, limiting sensitivity for low-signal imaging applications.

      While a $150,000 system represents a significant investment, we consider it relatively cost-effective in the context of advanced light-sheet microscopy. For comparison, commercially available systems with similar optical performance—such as LLSM systems from 3i or Zeiss—are several-fold more expensive.
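The cost figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the dollar amounts come from the response itself, while treating the two optional savings as additive against the $150,000 total is an assumption made for illustration.

```python
# Figures quoted in the response (USD).
TOTAL_COST = 150_000
ASI_COST = 27_000

# Optional reductions named in the response; combining them additively
# is an assumption of this sketch.
SAVINGS = {
    "manual stages instead of motorized": 12_000,
    "multi-band filter instead of filter wheel": 5_000,
}

asi_fraction = ASI_COST / TOTAL_COST                 # share of total cost
reduced_total = TOTAL_COST - sum(SAVINGS.values())   # total after both cuts
```

Here `asi_fraction` evaluates to 0.18, matching the ~18% stated above, and applying both reductions would bring the total to $133,000 at the cost of the functionality described.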

      (4) The fibroblast images provided are of exceptional quality. However, these are fixed samples. The system lacks the necessary elements for monitoring cells in vivo, such as temperature or pH control.

      We thank the reviewer for their positive comment regarding the quality of our fibroblast images. As noted, the current manuscript focuses on the optical design and performance characterization of the system, using fixed specimens to validate resolution and imaging stability. We acknowledge the importance of environmental control for live-cell imaging. Temperature regulation is routinely implemented in our lab using flexible adhesive heating elements paired with a power supply and PID controller. For pH stabilization in systems that lack a 5% CO₂ atmosphere, we typically supplement the imaging medium with 10–25 mM HEPES buffer. In the revised manuscript, we will introduce a modified sample chamber capable of maintaining user-specified temperatures, along with detailed assembly instructions. We will also include representative live-cell imaging data to demonstrate the feasibility of in vitro imaging using this system.

      Reviewer #2 (Public review):

      Summary:

      The authors present Altair-LSFM (Light Sheet Fluorescence Microscope), a high-resolution, open-source microscope, that is relatively easy to align and construct and achieves sub-cellular resolution. The authors developed this microscope to fill a perceived need that current open-source systems are primarily designed for large specimens and lack sub-cellular resolution or are difficult to construct and align, and are not stable. While commercial alternatives exist that offer sub-cellular resolution, they are expensive. The authors' manuscript centers around comparisons to the highly successful lattice light-sheet microscope, including the choice of detection and excitation objectives. The authors thus claim that there remains a critical need for high-resolution, economical, and easy-to-implement LSFM systems.

      Strengths:

      The authors succeed in their goals of implementing a relatively low-cost (~ USD 150K) open-source microscope that is easy to align. The ease of alignment rests on using custom-designed baseplates with dowel pins for precise positioning of optics based on computer analysis of opto-mechanical tolerances, as well as the optical path design. They simplify the excitation optics over Lattice light-sheet microscopes by using a Gaussian beam for illumination while maintaining lateral and axial resolutions of 235 and 350 nm across a 260-µm field of view after deconvolution. In doing so they rest on foundational principles of optical microscopy: what matters for lateral resolution is the numerical aperture of the detection objective and proper sampling of the image field onto the detector, and the axial resolution depends on the thickness of the light-sheet when it is thinner than the depth of field of the detection objective. This concept has unfortunately not been completely clear to users of high-resolution light-sheet microscopes and is thus a valuable demonstration. The microscope is controlled by open-source software, Navigate, developed by the authors, and it is thus foreseeable that different versions of this system could be implemented depending on experimental needs while maintaining easy alignment and low cost. They demonstrate system performance successfully by characterizing their sheet, point-spread function, and visualization of sub-cellular structures in mammalian cells, including microtubules, actin filaments, nuclei, and the Golgi apparatus.

      We thank the reviewer for their thoughtful summary of our work. We are pleased that the foundational optical principles, design rationale, and emphasis on accessibility came through clearly. We agree that the approach used to construct the microscope is highly modular, and we anticipate that these design principles will serve as the basis for additional system variants tailored to specific biological samples and experimental contexts. To support this, we provide all Zemax simulations and CAD files openly on our GitHub repository, enabling advanced users to build upon our design and create new functional variants of the Altair system.

      Weaknesses:

      There is a fixation on comparison to the first-generation lattice light-sheet microscope, which has evolved significantly since then:

      (1) The authors claim that commercial lattice light-sheet microscopes (LLSM) are "complex, expensive, and alignment intensive"; I believe this sentence applies to the open-source version of LLSM, which was made available for wide dissemination. Since then, a commercial solution has been provided by 3i, which is now being used in multiple cores and labs but does require routine alignments. However, Zeiss has also released a commercial turn-key system, which, while expensive, is stable, and the complexity does not interfere with the experience of the user. In general, though, statements on ease of use and stability might be considered anecdotal and may not belong in a scientific article when unreferenced or unsupported by data.

      The referee is correct that our comparisons reference the original LLSM design, which was simultaneously disseminated as an open-source platform and commercialized by 3i. While we acknowledge that newer variants of LLSM have been developed—including systems incorporating adaptive optics[3] and the MOSAIC platform (which remains unpublished)—the original implementation remains the most widely described and cited in the literature. It is therefore the most appropriate point of comparison for contextualizing Altair’s performance, complexity, and accessibility. Importantly, this version of LLSM is far from obsolete; it continues to be one of the most commonly used imaging systems at Janelia Research Campus’s Advanced Imaging Center.

      We acknowledge that the more recent commercial implementation by Zeiss has addressed several of the practical limitations associated with the original design. In particular, we agree that the Zeiss Lattice Lightsheet 7 system, which integrates a meniscus lens to facilitate oblique imaging through a coverslip, offers a user-friendly experience, albeit with a modest tradeoff in resolution (reported deskewed resolution: 330 nm × 330 nm × 500–1000 nm).

      While we recognize that statements on usability and stability can be subjective, one objective proxy for system complexity is the number of optical elements that require precise alignment during assembly. The original LLSM setup includes approximately 29 optical components that must each be carefully positioned laterally, angularly, and coaxially along the optical path. In contrast, the first-generation Altair system contains only 9 such elements. By this metric, Altair is considerably simpler to assemble and align, supporting our overarching goal of making high-resolution light-sheet imaging more accessible to non-specialist laboratories. In the revised manuscript, we will clarify the scope of our comparison and provide more precise language about what we mean by complexity (e.g., number of optical elements needed to align).

      (2) One of the major limitations of the first generation LLSM was the use of a 5 mm coverslip, which was a hindrance for many users. However, the Zeiss system elegantly solves this problem, and so does Oblique Plane Microscopy (OPM), while the Altair-LSFM retains this feature, which may dissuade widespread adoption. This limitation and how it may be overcome in future iterations is not discussed.

      We agree that the use of 5 mm diameter coverslips, while enabling high-NA imaging in the current Altair-LSFM configuration, may serve as an inconvenience for many users. We will discuss this more explicitly in the revised manuscript. Specifically, we note that changing the detection objective is sufficient to eliminate the need for a 5 mm coverslip. For example, as demonstrated in Moore et al., Lab Chip 2021, pairing the Zeiss W Plan-Apochromat 20x/1.0 objective with the Thorlabs TL20X-MPL allows imaging beyond the physical surfaces of both objectives, removing the constraint imposed by small-format coverslips[1]. In the revised manuscript, we will propose this modification as a straightforward path for increasing compatibility with more conventional sample mounting formats.

      (3) Further, on the point of sample flexibility, all generations of the LLSM, and by the nature of its design, the OPM, can accommodate live-cell imaging with temperature, gas, and humidity control. It is unclear how this would be implemented with the current sample chamber. This limitation would severely limit use cases for cell biologists, for which this microscope is designed. There is no discussion on this limitation or how it may be overcome in future iterations.

      We appreciate the reviewer’s emphasis on the importance of environmental control for live-cell imaging applications. It is worth noting that the original LLSM design, including the system commercialized by 3i, provided temperature control only, without integrated gas or humidity regulation. Despite this, it has been successfully used by a wide range of scientists to generate important biological insights.

      We agree that both OPM and the Zeiss implementation of LLSM offer clear advantages in terms of environmental control, as we previously discussed in detail in Sapoznik et al., eLife, 2020[4]. However, assembly of high numerical aperture OPM systems is highly technical, and no open-source variant of OPM delivers sub-cellular scale resolution yet.

      (4) The authors' comparison to LLSM is constrained to the "square" lattice, which, as they point out, is the most used optical lattice (though this also might be considered anecdotal). The LLSM original design, however, goes far beyond the square lattice, including hexagonal lattices, the ability to do structured illumination, and greater flexibility in general in terms of light-sheet tuning for different experimental needs, as well as not being limited to just sample scanning. Thus, the Altair-LSFM cannot compare to the original LLSM in terms of versatility, even if comparisons to the resolution provided by the square lattice are fair.

      We thank the reviewer for this comment. It is true that our discussion focused primarily on the square lattice implementation of LLSM. While this could be viewed as a subset of the system’s broader capabilities, we chose this focus intentionally, as the square lattice remains by far the most commonly used variant in practice. Even in the original LLSM publication, 16 out of 20 figure subpanels utilized the square lattice, with only one panel each representing the hexagonal lattice in SIM mode, a standard Bessel beam in incoherent SIM mode, a hex lattice in dithered mode, and a single Bessel in dithered mode. This usage pattern largely reflects the operational simplicity of the square lattice: it minimizes sidelobe growth and enables more straightforward alignment and data processing compared to hexagonal or structured illumination modes.

      In 2019, we performed an exhaustive accounting of published illumination modes in LLSM and found that the SIM mode had only been used in two additional peer-reviewed publications at that time. We will consider updating this table in the revised manuscript and will expand our discussion to acknowledge the broader flexibility of the LLSM platform—including its capacity for structured illumination and alternative light-sheet geometries. However, we will also emphasize that, despite these advanced capabilities, the square lattice remains the dominant mode used by the community and therefore serves as a fair and practical benchmark for comparison.

      (5) There is no demonstration of the system's live-imaging capabilities or temporal resolution, which is the main advantage of existing light-sheet systems.

      In the revised manuscript, we will include a demonstration of live-cell imaging to directly validate the system’s suitability for dynamic biological applications. We will also characterize the temporal resolution of the system. As a sample-scanning microscope, the imaging speed is primarily limited by the performance of the Z-piezo stage. For simplicity and reduced optoelectronic complexity, we currently power the piezo through the ASI Tiger Controller. We will expand the supplementary material to describe the design criteria behind this choice, including potential trade-offs, and provide data quantifying the achievable volume rates under typical operating conditions.
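
      As a rough illustration of how a stage-limited volume rate could be estimated for a sample-scanning acquisition, the sketch below adds up a per-plane exposure and settle time plus a flyback time. The function name and all numbers are hypothetical placeholders, not measured values for Altair-LSFM or the ASI hardware.

```python
def volume_rate_hz(n_planes, exposure_s, settle_s, flyback_s=0.0):
    """Estimate volumes per second for a step-and-expose sample scan.

    Assumes each plane costs one camera exposure plus one stage settle
    time, and that the stage needs a fixed flyback time per volume.
    """
    time_per_volume_s = n_planes * (exposure_s + settle_s) + flyback_s
    return 1.0 / time_per_volume_s


# Hypothetical acquisition: 100 planes, 10 ms exposure, 5 ms settle, 50 ms flyback.
rate = volume_rate_hz(n_planes=100, exposure_s=0.010, settle_s=0.005, flyback_s=0.050)
print(f"{rate:.2f} volumes per second")
```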

      While the microscope is well designed and completely open source, it will require experience with optics, electronics, and microscopy to implement and align properly. Experience with custom machining or soliciting a machine shop is also necessary. Thus, in my opinion, it is unlikely to be implemented by a lab that has zero prior experience with custom optics or can hire someone who does. Altair-LSFM may not be as easily adaptable or implementable as the authors describe or perceive in any lab that is interested, even if they can afford it. The authors indicate they will offer "workshops," but this does not necessarily remove the barrier to entry or lower it, perhaps as significantly as the authors describe.

      We appreciate the reviewer’s perspective and agree that building any high-performance custom microscope—Altair-LSFM included—requires a baseline familiarity with optics and instrumentation. Our goal is not to eliminate this requirement entirely, but to significantly reduce the technical and logistical barriers that typically accompany custom light-sheet microscope construction.

      Importantly, no machining experience or in-house fabrication capabilities are required—users can simply submit provided design files and specifications directly to the vendor. We will make this process as straightforward as possible by supplying detailed instructions, recommended materials, and vendor-ready files. Additionally, we draw encouragement from the success of related efforts such as mesoSPIM, which has seen over 30 successful implementations worldwide using a similar model of exhaustive online documentation, open-source control software, and community support through user meetings and workshops.

      We recognize that documentation alone is not always sufficient, and we are committed to further lowering barriers to adoption. To this end, we are actively working with commercial vendors to streamline procurement and reduce the logistical burden on end users. Additionally, Altair-LSFM is supported by a Biomedical Technology Development and Dissemination (BTDD) grant, which provides dedicated resources for hosting workshops, offering real-time community support, and generating supplementary materials such as narrated video tutorials. We will expand our discussion in the revised manuscript to better acknowledge these implementation challenges and outline our ongoing strategies for supporting a broad and diverse user base.

      There is a claim that this design is easily adaptable. However, the requirement of custom-machined baseplates and in silico optimization of the optical path basically means that each new instrument is a new design, even if the Navigate software can be used. It is unclear how Altair-LSFM demonstrates a modular design that reduces times from conception to optimization compared to previous implementations.

      We appreciate the reviewer’s comment and agree that our language regarding adaptability may have been too strong. It was not our intention to suggest that the system can be easily modified without prior experience. Meaningful adaptations of the optical or mechanical design would require users to have expertise in optical layout, optomechanical design, and alignment.

      That said, for labs with sufficient expertise, we aim to facilitate such modifications by providing comprehensive resources—including detailed Zemax simulations, CAD models, and alignment documentation. These materials are intended to reduce the development burden for those seeking to customize the platform for specific experimental needs.

      In the revised manuscript, we will clarify this point and explicitly state in the discussion what technical expertise is required to modify the system. We will also revise our language around adaptability to better reflect the intended audience and realistic scope of customization.

      Reviewer #3 (Public review):

      Summary:

      This manuscript introduces a high-resolution, open-source light-sheet fluorescence microscope optimized for sub-cellular imaging.

      The system is designed for ease of assembly and use, incorporating a custom-machined baseplate and in silico optimized optical paths to ensure robust alignment and performance. The authors demonstrate lateral and axial resolutions of ~235 nm and ~350 nm after deconvolution, enabling imaging of sub-diffraction structures in mammalian cells.

      The important feature of the microscope is the clever and elegant adaptation of simple gaussian beams, smart beam shaping, galvo pivoting and high NA objectives to ensure a uniform thin light-sheet of around 400 nm in thickness, over a 266 micron wide Field of view, pushing the axial resolution of the system beyond the regular diffraction limited-based tradeoffs of light-sheet fluorescence microscopy.

      Compelling validation using fluorescent beads and multicolor cellular imaging highlights the system's performance and accessibility. Moreover, a very extensive and comprehensive manual of operation is provided in the form of supplementary materials. This provides a DIY blueprint for researchers who want to implement such a system.

      Strengths:

      (1) Strong and accessible technical innovation: With an elegant combination of beam shaping and optical modelling, the authors provide a high-resolution light-sheet system that overcomes the classical light-sheet tradeoff limit of a thin light-sheet and a small field of view. In addition, the integration of in silico modelling with a custom-machined baseplate is very practical and allows for ease of alignment procedures. Combining these features with the solid and super-extensive guide provided in the supplementary information, this provides a protocol for replicating the microscope in any other lab.

      (2) Impeccable optical performance and ease of mounting of samples: The system takes advantage of the same sample-holding method seen already in other implementations, but reduces the optical complexity. At the same time, the authors claim to achieve similar lateral and axial resolution to lattice light-sheet microscopy (although without a direct comparison; see the "Weaknesses" section below). The optical characterization of the system is comprehensive and well-detailed. Additionally, the authors validate the system by imaging sub-cellular structures in mammalian cells.

      (3) Transparency and comprehensiveness of documentation and resources: A very detailed protocol provides detailed documentation about the setup, the optical modeling, and the total cost.

      Weaknesses:

      (1) Limited quantitative comparisons: Although some qualitative comparison with previously published systems (diSPIM, lattice light-sheet) is provided throughout the manuscript, some side-by-side comparison would be of great benefit for the manuscript, even in the form of a theoretical simulation. While having a direct imaging comparison would be ideal, it's understandable that this goes beyond the interest of the paper; however, a table referencing image quality parameters (taken from the literature), such as signal-to-noise ratio, light-sheet thickness, and resolutions, would really enhance the features of the setup presented. Moreover, based also on the necessity for optical simplification, an additional comment on the importance/difference of dual objective/single objective light-sheet systems could really benefit the discussion.

      In the revised manuscript, we will expand our discussion to include a broader range of light-sheet microscope designs and imaging modes, including both single- and dual-objective configurations. We agree that highlighting the trade-offs between these approaches—such as working distance, sample geometry constraints, and alignment complexity—will enhance the overall context and utility of the manuscript.

      To further aid comparison, we will include a summary table referencing key image quality parameters such as lateral and axial resolution, and illumination beam NA for Altair-LSFM. Where available, we will reference values from published work—such as the axial resolution reported in Valm et al. (Nature, 2017)—to provide a clearer benchmark. Because such comparisons can be technically nuanced, especially when comparing across systems with different geometries and sample mounting constraints, we will also include a supplementary note outlining the assumptions and limitations of these comparisons.

      (2) Limitation to a fixed sample: In the manuscript, there is no mention of incubation temperature, CO₂ regulation, Humidity control, or possible integration of commercial environmental control systems. This is a major limitation for an imaging technique that owes its popularity to fast, volumetric, live-cell imaging of biological samples.

      We thank the reviewer for highlighting this important consideration. In the revised manuscript, we will provide a detailed description of how temperature control can be implemented using flexible adhesive heating elements, a power supply, and a PID controller. Step-by-step assembly instructions and recommended components will be included to facilitate adoption by users interested in live-cell imaging. We also note that most light-sheet microscopy systems capable of sub-cellular resolution—including the original LLSM design, diSPIM, and ASLM—typically do not incorporate integrated CO<sub>2</sub> or humidity control. These systems often rely on HEPES-buffered media to maintain pH stability, which is generally sufficient for short- to intermediate-term imaging. While full environmental control may be necessary for extended time-lapse studies, it is not a prerequisite for high-resolution volumetric imaging in many applications. Nonetheless, we will include a discussion of the challenges associated with adding CO<sub>2</sub> and humidity control to open or semi-enclosed architectures like Altair-LSFM, and outline potential future paths for integration with commercial incubation systems.
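
      To make the role of the PID controller concrete, the following is a minimal sketch of a discrete PID loop driving a heater toward a 37 °C setpoint. The thermal model, gains, and rates are illustrative assumptions only (the derivative gain is set to zero, so the loop behaves as a PI controller); they do not describe the planned Altair-LSFM hardware.

```python
class PID:
    """Discrete PID controller with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, setpoint, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        output = min(self.out_max, max(self.out_min, raw))
        if raw == output:
            # Accumulate the integral only while the heater is not saturated.
            self.integral = candidate
        return output


def simulate_chamber(setpoint_c=37.0, ambient_c=22.0, seconds=2000, dt=1.0):
    """Toy first-order thermal model; every constant here is an assumption."""
    heating_rate = 0.5  # deg C per second at full heater power (assumed)
    loss_rate = 0.01    # fractional heat loss to ambient per second (assumed)
    pid = PID(kp=0.5, ki=0.01, kd=0.0, setpoint=setpoint_c)
    temp_c = ambient_c
    power = 0.0
    for _ in range(int(seconds / dt)):
        power = pid.update(temp_c, dt)  # heater duty cycle between 0 and 1
        temp_c += (heating_rate * power - loss_rate * (temp_c - ambient_c)) * dt
    return temp_c, power


final_temp, final_power = simulate_chamber()
print(f"final temperature: {final_temp:.2f} C, heater duty: {final_power:.2f}")
```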

      (3) System cost and data storage cost: While the system presented has the advantage of being open-source, it remains relatively expensive (considering the 150k without laser source and optical table, for example). The manuscript could benefit from a more direct comparison of the performance/cost ratio of existing systems, considering academic settings with budgets that most of the time would not allow for expensive architectures. Moreover, it would also be beneficial to discuss the adaptability of the system, in case a 30k objective could not be feasible. Will this system work with different optics (with the obvious limitations coming with the lower NA objective)? This could be an interesting point of discussion. Adaptability of the system in case of lower budgets or more cost-effective choices, depending on the needs.

      We thank the reviewer for raising this important point. First, we would like to clarify that the quoted $150k cost estimate includes the optical table and laser source. We apologize for any confusion and will communicate this more effectively in the revised manuscript.

      We agree that adaptability is a key concern, especially in academic settings with limited budgets. The detection path can be readily altered depending on experimental needs and cost constraints. For example, in our discussion of alternatives to the 5 mm coverslip geometry, we will describe how switching to a Zeiss W Plan-Apochromat 20x/1.0 in combination with a compatible excitation objective allows high-resolution imaging while accommodating more conventional sample formats. We will expand this to include cost-effective alternatives as well.

      We will also expand our discussion on cost-reduction strategies and the associated trade-offs. These include replacing motorized stages with manual ones, omitting the filter wheel in favor of a multi-band emission filter, or using industrial-grade cameras in place of scientific CMOS detectors. While each change entails some loss in functionality or sensitivity, such modifications allow users to tailor the system to their specific budget and application.

      Finally, we recognize the challenge in communicating exact costs of commercial systems due to variability in configuration and pricing. Nonetheless, we will include approximate figures where possible and note that comparable commercial systems—such as LLSM platforms from 3i and Zeiss—are several-fold more expensive than the system presented here.

      Last, not much is said about the need for data storage. Light-sheet microscopy's bottleneck is the creation of increasingly large datasets, and it could be beneficial to discuss more about the storage needs and the quantity of data generated.

      Data storage is indeed a critical consideration in light-sheet microscopy. In the revised manuscript, we will provide a note outlining typical volume dimensions for live-cell imaging experiments along with the associated data overhead. This will include estimates for voxel counts, bit depth, time-lapse acquisitions, and multi-channel datasets to help users anticipate storage needs. We will also briefly discuss strategies for managing large datasets, file types and compression formats.
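
      As a back-of-the-envelope sketch of such an estimate, the uncompressed size of a dataset is the product of voxel count, bytes per voxel, channels, and time points. The acquisition parameters below are hypothetical and chosen only to illustrate the arithmetic.

```python
def dataset_size_bytes(nx, ny, nz, bytes_per_voxel=2, channels=1, timepoints=1):
    """Uncompressed size in bytes of a multi-channel volumetric time-lapse."""
    return nx * ny * nz * bytes_per_voxel * channels * timepoints


# Hypothetical experiment: 2048 x 2048 pixel frames, 200 planes per volume,
# 16-bit voxels (2 bytes), 2 channels, 100 time points.
size = dataset_size_bytes(2048, 2048, 200, bytes_per_voxel=2, channels=2, timepoints=100)
print(f"{size / 1e9:.1f} GB uncompressed")  # prints "335.5 GB uncompressed"
```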

      Conclusion:

      Altair-LSFM represents a well-engineered and accessible light-sheet system that addresses a longstanding need for high-resolution, reproducible, and affordable sub-cellular light-sheet imaging. While some aspects (comparative benchmarking and validation, and the limitation to fixed samples) would benefit from further development, the manuscript makes a compelling case for Altair-LSFM as a valuable contribution to the open microscopy scientific community.

      References

      (1) Moore, R. P. et al. A multi-functional microfluidic device compatible with widefield and light sheet microscopy. Lab Chip 22, 136-147 (2021). https://doi.org/10.1039/d1lc00600b

      (2) Lamb, J. R., Mestre, M. C., Lancaster, M. & Manton, J. D. Direct-view oblique plane microscopy. Optica 12, 469-472 (2025). https://doi.org/10.1364/OPTICA.558420

      (3) Liu, T. L. et al. Observing the cell in its native state: Imaging subcellular dynamics in multicellular organisms. Science 360 (2018). https://doi.org/10.1126/science.aaq1392

      (4) Sapoznik, E. et al. A versatile oblique plane microscope for large-scale and high-resolution imaging of subcellular dynamics. eLife 9 (2020). https://doi.org/10.7554/eLife.57681

      (5) Huisken, J. & Stainier, D. Y. Even fluorescence excitation by multidirectional selective plane illumination microscopy (mSPIM). Opt Lett 32, 2608-2610 (2007). https://doi.org/10.1364/ol.32.002608

      (6) Ricci, P. et al. Removing striping artifacts in light-sheet fluorescence microscopy: a review. Prog Biophys Mol Biol 168, 52-65 (2022). https://doi.org/10.1016/j.pbiomolbio.2021.07.003

    2. Reviewer #1 (Public review):

      Summary:

      The article presents the details of the high-resolution light-sheet microscopy system developed by the group. In addition to presenting the technical details of the system, its resolution has been characterized and its functionality demonstrated by visualizing subcellular structures in a biological sample.

      Strengths:

      (1) The article includes extensive supplementary material that complements the information in the main article.

      (2) However, in some sections, the information provided is somewhat superficial.

      Weaknesses:

      (1) Although a comparison is made with other light-sheet microscopy systems, the presented system does not represent a significant advance over existing systems. It uses high numerical aperture objectives and Gaussian beams, achieving resolution close to theoretical after deconvolution. The main advantage of the presented system is its ease of construction, thanks to the design of a perforated base plate.

      (2) Using similar objectives (Nikon 25x and Thorlabs 20x), the results obtained are similar to those of the LLSM system (using a Gaussian beam without laser modulation). However, the article does not mention the difficulties of mounting the sample in the implemented configuration.

      (3) The authors present a low-cost, open-source system. Although they provide open source code for the software (navigate), the use of proprietary electronics (ASI, NI, etc.) makes the system relatively expensive. Its low cost is not justified.

      (4) The fibroblast images provided are of exceptional quality. However, these are fixed samples. The system lacks the necessary elements for monitoring cells in vivo, such as temperature or pH control.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Mollá-Albaladejo et al. investigate the neurons downstream of Gr64f and Gr66a, called G2Ns. They identify downstream neurons using trans-Tango labeling with RFP and then perform bulk RNA-seq on the RFP-sorted cells. Gene expression is up- or downregulated between the cell populations and between fed and starved states. They specifically identify Leucokinin as a neuropeptide that is upregulated in starved Gr66a cells. Leucokinin cells, identified by a GAL4 line, indeed show higher expression when starved, especially in the SEZ. Furthermore, Leucokinin cells colocalize with the trans-Tango signal from downstream neurons of both GRs. This connection is confirmed with GRASP. According to EM data, Leucokinin cells in the SEZ receive a lot of input and connect to many downstream neurons. In behavior experiments performed with flies lacking Leucokinin neurons, flies show reduced responsiveness to sugar and bitter mixtures when starved. The authors suggest that Leucokinin neurons integrate bitter and sugar tastes and that their output is modified by a hunger state.

      Strengths:

      The authors use a multitude of tools to identify SELK neurons downstream of taste sensory neurons and as starvation-sensitive cells. This study provides an example of how genetic labeling, RNA-seq, and EM analysis can be combined to investigate neural circuits.

      Weaknesses:

      The authors do not show a functional connection between sensory neurons and SELK neurons. Additionally, data from RNA seq, anatomical studies, and EM analysis are sometimes contradictory in terms of connectivity. GRASP signal is not foolproof that cells are synaptically connected.

      We appreciate the reviewer’s comments. Unfortunately, we have not successfully demonstrated a functional response of SELK neurons using in vivo calcium imaging with UAS-GCaMP7 (we tried the f, m, and s versions), primarily due to challenges in obtaining stable signals. We stimulated GRNs using sucrose, caffeine, or a mixture of both; even though the concentrations were high, they may not have been sufficient to induce a response.

      Regarding GRASP, we acknowledge its limitations as a standalone technique for establishing genuine synaptic connections between neurons, as some signals may reflect false positives resulting from the mere proximity of the candidate neurons. To strengthen our findings, we complemented these results by demonstrating the positive colocalization of the Leucokinin antibody signal over the Gr66a-Gal4>trans-TANGO and Gr64f-Gal4>trans-TANGO (Figure 4), confirming that Leucokinin neurons are indeed postsynaptic to both sweet and bitter GRNs. Moreover, we incorporated BacTrace data to highlight the direct connectivity between sweet and bitter GRNs (now Figure 5E).

      In the revised manuscript, we have introduced the active-GRASP technique (Macpherson et al., 2015). In this version of GRASP, the presynaptic half of GFP (GFP 1-10) is fused to synaptobrevin, which becomes accessible in the membrane of the presynaptic neuron within the synaptic cleft upon presynaptic stimulation (in our case, by stimulating the sweet Gr64f<sup>GRNs</sup> with sucrose and the bitter Gr66a<sup>GRNs</sup> with caffeine). Utilizing this technique, we successfully demonstrated (see new Figure 5B and 5D) that when presented with water, no signal was detected in the Gr66a-LexA, Lk-Gal4 > active-GRASP, or Gr64f-LexA, Lk-Gal4 > active-GRASP transgene flies. However, in the presence of caffeine, Gr66a-LexA, Lk-Gal4 > active-GRASP transgene flies exhibited a clear signal in the SEZ, and similarly, sucrose presentation to Gr64f-LexA, Lk-Gal4 > active-GRASP transgene flies yielded a detectable signal. The results obtained from active-GRASP provide additional evidence supporting the connectivity between SELK neurons and both Gr64f<sup>GRNs</sup> and Gr66a<sup>GRNs</sup>, further indicating the functional connectivity of the GRNs and SELK neurons.

      The authors describe a behavioral phenotype when flies are starved, however, they do not use a specific driver for the described cell type, thus they should also tone down their claims.

      We agree with the reviewer that the Lk-Gal4 driver line used labels SELK, LHLK, and ABLK neurons. The behavior examined in this paper, the Proboscis Extension Response (PER), measures the initiation of feeding. Although the neural circuit involved in this behavior is primarily confined to the SEZ where SELK neurons are located, we cannot rule out the possibility that other Lk neurons may also play a role in the process. To restrict expression of the Tetanus Toxin, we have utilized the tsh-Gal80 (Clyne et al., 2008) transgene in combination with the Lk-Gal4>UAS-TNT and Lk-Gal4>UAS-TNT<sup>imp</sup> constructs to prevent its expression in ABLK neurons, thereby restricting it to the SELK and LHLK neurons in the central brain. The new results (Sup Figure 7A) indicate that ABLK neurons do not play a role in integrating sweet and bitter information. However, we acknowledge the reviewer's point that we are still silencing LHLK neurons, so we have adjusted our claims to align more closely with our data.

      Generally, the authors do not provide a big advancement to the field and some of the results are contradictory with previous publications.

      We believe our work does not contradict previous findings, nor does it invalidate the role of ABLK neurons in water homeostasis or the role of LHLK neurons in regulating sleep via starvation. We provide additional information on the possible role of SELK neurons in integrating gustatory information. The location of SELK neurons in the SEZ suggests that they may play a role in feeding behavior, and we have demonstrated that these neurons are indeed involved in integrating gustatory information to influence feeding decisions. We consider that we have contributed by highlighting a new role for the Leucokinin neuropeptide in feeding behavior.

      Reviewer #2 (Public review):

      Summary:

      A core task of the brain is processing sensory cues from the environment. The neural mechanisms of how sensory information is transmitted from peripheral sense organs to subsequent processing in defined brain centers remain an important topic in neuroscience. The taste system hereby assesses the palatability of food by evaluating the chemical composition and nutrient content while integrating the current need for energy by assessing the satiation level of the organism. The current manuscript provides insights into the early circuits of gustatory coding using the fruit fly as a model. By combining trans-Tango and FACS-based bulk RNAseq to assess the target neurons of sweet sensing (using Gr64f-Gal4) and bitter sensing (using Gr66a-Gal4) in a first set of experiments, the authors investigate genes that are differentially expressed or co-expressed in normal and starved conditions. With a focus on neuropeptides and neurotransmitters, different expressions in the different conditions were assessed, resulting in the identification of Leucokinin as a potentially interesting gene. The notion is further supported by RNAseq of Lk-Gal4>mCD8:GFP sorted cells and immunostainings. GRASP and BacTrace experiments further support that the two Lk-expressing cells in the SEZ should indeed be postsynaptic to both types of sensory neurons. Using EM-based connectomics data (based on a previous publication by Engert et al.), the authors also look for downstream targets of the bitter versus sweet gustatory neurons to identify the Lk-neurons. Based on the morphology they identify candidates and further depict the potential downstream neurons in the connectome, which appears largely in agreement with GRASP experiments. Finally, silencing the Lk-neurons shows an increased PER response in starved flies (when combined with bitter compounds) as well as increased feeding in a FlyPad assay.

      Strengths:

      Overall this is an intriguing manuscript, which provides insight into the organization of 2nd order gustatory neurons. It specifically provides strong evidence for the Lk-neurons as a target of sweet and bitter GRNs and provides evidence for their role in regulating sweet vs bitter-based behavioral responses. Particularly the integration of different techniques and datasets in an elegant fashion is a strong side of the manuscript. Moreover to put the known LK-neurons into the context of 2nd order gustatory signalling is strengthening the knowledge about this pathway.

      Weaknesses:

      I do not see any major weakness in the current manuscript. Novelty is to some degree lessened by the fact, that the RNAseq approach did not identify new neurons but rather put the known LK-neurons as major findings. Similarly, the final behavioral section is not very deep and to some degree corroborates the previous publication by the Keene and Nässel labs - that said, the model they propose is indeed novel (but lacks depth in analyses; e.g. there is no physiology that would support the modulation of Lk neurons by either type of GRN). The connectomic section appears a bit out of place and after reading it it's not really clear what one should make of the potential downstream neurons (particularly since the Lk-receptor expression has been previously analyzed); here it might have been interesting to address if/how Lk-neurons may signal directly via a classical neurotransmitter (an information that might be found easily in the adult brain single-cell data).

      We thank the reviewer for the comment. Indeed, we attempted in vivo calcium imaging but were unsuccessful. We have rewritten the connectomic section to better integrate it with the rest of the text and have reanalyzed the data obtained. We considered gathering data from the single-cell adult dataset, but this dataset includes the entire adult fly brain, encompassing SELK and LHLK neurons, making it impossible to differentiate between the two types of Lk neurons. Any further analysis will require transcriptomic analysis of SELK via scRNAseq under the different metabolic conditions tested in this study.

      Reviewer #3 (Public review):

      Summary:

      To make feeding decisions, animals need to process three types of information: positive cues like sweetness, negative cues like bitterness, and internal states such as hunger or satiety. This study aims to identify where the information is integrated into the fruit fly brain. The authors applied RNA sequencing on second-order gustatory neurons responsible for sweet and bitter processing, under fed and starved conditions. The sequencing data reveal significant changes in gene expression across sweet vs. bitter pathways and fed vs. starved states. The authors focus on the neuropeptide Leucokinin (Lk), whose expression is dependent on the starvation state. They identify a pair of neurons, named SELK neurons, which express Lk and receive direct input from both sweet and bitter gustatory neurons. These SELK neurons are ideal candidates to integrate gustatory and internal state information. Behavioral experiments show that blocking these neurons in starved flies alters their tolerance to bitter substances during feeding.

      Strengths:

      (1) The study employs a well-designed approach, targeting specific neuronal populations, which is more efficient and precise compared to traditional large-scale genetic screening methods.

      (2) The RNAseq results provide valuable data that can be utilized in future studies to explore other molecules beyond Lk.

      (3) The identification of SELK neurons offers a promising avenue for future research into how these neurons integrate conflicting gustatory signals and internal state information.

      Weaknesses:

      (1) Unfortunately, due to technical challenges, the authors were unable to directly image the functional activity of SELK neurons.

      (2) In the behavioral experiments, tetanus toxin was used to block SELK neurons. Since these neurons may release multiple neurotransmitters or neuropeptides, the results do not specifically demonstrate that Leucokinin (Lk) is the critical factor, as suggested in Figure 8. To address this, I recommend using RNAi to inhibit Lk expression in SELK neurons and comparing the outcomes to wild-type controls via the PER assay.

      We appreciate the reviewer's comments and suggestions. As noted, tetanus toxin silences the neuron's activity, affecting the release of all neurotransmitters and neuropeptides from the targeted neuron. In response to the reviewer's recommendation, we employed an RNAi line specifically designed to silence Leucokinin production in Lk-expressing neurons.

      The results presented in Supplementary Figure 7B demonstrate that knocking down Leucokinin in Lk neurons significantly reduces the flies' tolerance to caffeine in sweet food.

      It is crucial to highlight that the sucrose concentration used in Figure 7C was 50mM, whereas in Supplementary Figure 7B, it was increased to 100mM. This adjustment was necessary because the Lk-Gal4, UAS-RNAi, and Lk-Gal4>UAS-RNAi transgenic lines exhibited reduced sensitivity to sucrose compared to the Lk-Gal4>UAS-TNT or Lk-Gal4>UAS-TNT<sup>imp</sup> lines. We aimed to establish a sucrose concentration that would elicit a 50% Proboscis Extension Response (PER) without adding any other compound, thereby allowing us to evaluate the additional effect of caffeine in the food.

      However, according to the data derived from the connectome, SELK neurons might be cholinergic, and this neurotransmitter might also be involved in controlling the flies' behavior.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      To get more evidence for connections between sensory cells and SELK neurons, could the authors also analyze a second available EM data set? Would setting a different threshold (>5 synapses) reveal connections to both sensories? Comparisons between SELK in- and outputs from EM data and Tango labeling also seem to differ quite a lot based on provided images - can the authors count cell bodies in the stainings? Further proof would be to provide functional imaging data that shows that SELK neurons respond to sugar and bitter compounds.

      In this study, we utilized the recently published EM dataset for the Drosophila central brain connectome (Dorkenwald et al., 2024; Flywire.ai). Changing the number of synapses affects the counts of pre- and postsynaptic neurons. We set a threshold of more than five synapses, as recommended by Flywire, to avoid false positives (Dorkenwald et al., 2024). This threshold has been widely used in recent papers (Engert et al., 2022; Shiu et al., 2022; Walker et al., 2025).

      The neuron counts in the connectomic data differ from those in the trans- and retro-TANGO experiments. In our initial trans-TANGO experiment, which labeled postsynaptic neurons in the Gr64f-Gal4 and Gr66a-Gal4 transgenic lines, we counted the labeled neurons (see Supplementary Figure 1C) and observed considerable variability between different brains. Because we anticipated similar variability, we did not count the neurons labeled by the trans-TANGO and retro-TANGO techniques in the Leucokinin neurons. Furthermore, neither technique labels all postsynaptic or presynaptic neurons, respectively. A recent study on the retro-TANGO technique (Sorkac et al., 2023) found a minimum threshold: the presynaptic neuron must form a certain number of synapses with the neuron of interest to be adequately labeled. According to this paper, the established threshold is 17 synapses. It is likely that the trans-TANGO technique has a similar synapse-count threshold that limits the number of labeled neurons. This would explain the discrepancy between the two results.

      Unfortunately, we have not been able to provide functional data pointing to the activation of SELK neurons by sucrose or caffeine. However, our active-GRASP data indicates that the connectivity between Gr64f<sup>GRNs</sup> and Gr66a<sup>GRNs</sup> with SELK neurons is present and functional.

      How many Leucokinin-positive cells are in the SEZ? Does the RNA-seq data provide further information about the SELK neurons? Potential receptor candidates for how they integrate hunger signals? AMPKa was described to be required in LHLK neurons.

      There are two SELK neurons in the SEZ. Because our RNA sequencing (RNAseq) was performed in bulk, we cannot attribute any additional genes detected in our transcriptomic analysis specifically to the SELK neurons. Furthermore, the single-cell RNA sequencing (scRNAseq) data available for the Drosophila brain (Li et al., 2022) do not allow accurate differentiation between SELK and LHLK neurons. To understand how these neurons integrate both metabolic and sensory information, a focused RNAseq study specifically on the SELK neurons will be needed. However, according to the data derived from the connectome, SELK neurons might be cholinergic, and this neurotransmitter might also be involved in controlling the flies' behavior.

      According to previous studies (Yurgel et al., 2019), the Lk-GAL4 line is also expressed in the VNC, thus the authors could make use of the tsh-GAL80 tool to clean up the line. This study also performed GCaMP imaging in fed and 24h starved animals in SELK and couldn't find a difference, can the authors explain this discrepancy?

      We thank the reviewer for this suggestion. We have now added new data using the tsh-Gal80 transgene in our PER experiments (Supplementary Figure 7A). Blocking the expression of TNT in the ABLK neurons does not affect the main conclusion of the behavioral results. As stated previously, we were unable to obtain in vivo Ca imaging responses in SELK neurons upon exposure to sucrose, caffeine, or mixtures of sucrose and caffeine. We do not believe this is a discrepancy with previous work such as Yurgel et al., 2019. It is likely that we faced technical issues regarding expression stability and that the stimulation was possibly too weak to detect changes in GFP levels.

      Reviewer #2 (Recommendations for the authors):

      As mentioned above I do not have any major comments on the manuscript, but there are a few points that I feel should be considered:

      (1) The identification of the Lk-candidate neurons in the connectome remains a bit mysterious. In the method sections, this reads as follows "manual and visual criteria were applied to identify the neurons of interest". a) What precisely was done to get to the candidates? b) Are there alternative candidates that may be Lk-neurons? c) How would another neuron affect the conclusion of the downstream analysis?

      We thank the reviewer for this comment. We have now modified and added new information in the connectomic section, reinforcing our conclusions and correcting the results obtained.

      Our GRASP, BacTRace, and immunohistochemistry experiments pointed to SELK neurons as postsynaptic to both Gr64f<sup>GRNs</sup> (sweet) and Gr66a<sup>GRNs</sup> (bitter). To identify which neurons in the connectome could be the SELK neurons, we utilized a previously described set of GRNs already identified in the connectome (Shiu et al., 2022). We extracted all postsynaptic neurons to the sweet and bitter GRNs identified and intersected both datasets, retaining only those candidate hits receiving simultaneous input from sweet and bitter GRNs. This process yielded a total of 333 hits. Through visual inspection, we discarded all hits that were merely neuronal fragments or neurons that clearly were not our candidates. We narrowed the list down to a final set of 17 candidate neurons whose arborization was located in the SEZ. We reduced the candidates to two final entries from this list: ID 720575940623529610 (GNG.276) and ID 720575940630808827 (GNG.685). The GNG.276 neuron had a counterpart in the SEZ identified as GNG.246. Both of these neurons were annotated as DNg70 in the Flywire database. GNG.685 had a counterpart identified as GNG.595, and these two neurons were classified as DNg68. In both cases, the neuronal candidates, DNg70 and DNg68, were classified as descending neurons, a characteristic of previously described SELK neurons (Nässel et al., 2021). In our initial analysis published in bioRxiv and sent for revision, we identified DNg70 as potentially the SELK neurons based solely on the morphology of the neurons via visual inspection. However, we employed a better method to determine which candidate is more likely to be the SELK neurons, concluding that DNg68, rather than DNg70, represents the SELK neurons. Briefly, we performed an immunohistochemistry for GFP in the Lk-Gal4>UAS-CD8:GFP flies. We aligned the resulting image in a Drosophila reference brain (JRC2018 U) using the CMTK Registration plugin in ImageJ. 
      The resulting image was skeletonized using the Simple Neurite Tracer plugin in ImageJ and later uploaded to the Flywire Gateway platform to compare the structure of the aligned and skeletonized SELK neurons to our candidates. This comparison clearly indicated that the DNg68 neurons, rather than DNg70, are the best candidates for representing the SELK neurons. We have updated the text, Figure 6, and Supplementary Figure 6 to reflect the new results. These new results do not alter the conclusions of the paper.
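
      The first filtering step described above (collect the postsynaptic partners of the sweet and bitter GRNs, then keep only neurons that receive input from both) can be sketched as follows. This is a toy illustration only: the edge list, the neuron names, and the function names are hypothetical, and applying the more-than-five-synapses threshold to each individual connection is an assumption, not a description of the code actually used or of the Flywire API.

```python
# Hypothetical edge list: (presynaptic_id, postsynaptic_id, synapse_count).
MIN_SYNAPSES = 5  # assumed rule: keep connections with MORE than 5 synapses


def postsynaptic_partners(edges, presynaptic_ids, min_synapses=MIN_SYNAPSES):
    """Neurons receiving > min_synapses synapses from any listed presynaptic neuron."""
    pre = set(presynaptic_ids)
    return {post for (p, post, n) in edges if p in pre and n > min_synapses}


def shared_targets(edges, sweet_grns, bitter_grns):
    """Candidates that receive input from both sweet and bitter GRNs."""
    sweet_targets = postsynaptic_partners(edges, sweet_grns)
    bitter_targets = postsynaptic_partners(edges, bitter_grns)
    return sweet_targets & bitter_targets


# Toy data (made-up names, not real connectome IDs)
edges = [
    ("sweet_1", "A", 12),
    ("sweet_2", "B", 7),
    ("bitter_1", "A", 9),
    ("bitter_1", "B", 3),   # below threshold, so B gets no bitter input
    ("bitter_2", "C", 20),  # bitter input only
]
candidates = shared_targets(edges, ["sweet_1", "sweet_2"], ["bitter_1", "bitter_2"])
print(sorted(candidates))
```

      In this toy example only neuron A passes, because B's bitter connection falls below the threshold and C has no sweet input. The subsequent visual inspection and image registration steps are manual and are not represented here.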

      (2) In the transcriptomic experiments it seems that the raw transcripts are reported, rather than normalised data. Why?

      All transcriptomic data are normalized. In Figure 1, differential expression was calculated using DESeq2-normalized counts. In Figure 2, Transcripts Per Million (TPM) were calculated using the Salmon package and are normalized for gene length.
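
      For readers unfamiliar with TPM, the length normalization can be illustrated with a minimal sketch. The gene names, counts, and lengths below are made up, and the function is a simplified textbook version (Salmon itself works with effective transcript lengths and estimated read counts), so this is not the pipeline used for Figure 2.

```python
def tpm(counts, lengths_bp):
    """Transcripts Per Million: length-normalize counts, then scale to 1e6."""
    # Reads per base: longer genes collect more reads at equal expression.
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical example: geneA and geneB have the same raw count,
# but geneB is twice as long, so its TPM is half that of geneA.
counts = {"geneA": 100, "geneB": 100, "geneC": 300}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 3000}
values = tpm(counts, lengths)
for gene, value in values.items():
    print(gene, round(value))
```

      By construction, TPM values within a sample always sum to one million, which is what makes them comparable across genes within that sample.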

      (3) The expression of nAChRbeta1 in the transcriptomic data is rather striking. However, this remains currently not addressed: is this expression real?

      We have not confirmed the upregulation or downregulation of any gene other than Leucokinin, which is our main interest. We found the presence of nAChRbeta1 interesting, as GRNs are cholinergic (Jaeger et al., 2018), so it would make sense to find cholinergic receptors in G2Ns. However, it is possible that these receptors are expressed in all G2Ns and serve as a common means of communication.

      (4) The description of the behavioural experiments in the results section is rather brief. I had a hard time following it since the genotypes are not repeated nor is it stated what is different in the experimental group vs control (but instead simply what changes in the experimental group, in a rather discussion-like fashion).

      We thank the reviewer for the comment. We have rewritten this section to improve its clarity.

      (5) If I understand the genetics for the behavioural experiments correctly it addresses the entire Lk-Gal4 expressing population, thus it is not possible to describe the role of the two SEZ neurons, but rather LkGal4 neurons. This should be clarified.

      We thank the reviewer for this comment. Indeed, the Lk-Gal4 driver we used drives expression in all Leucokinin neurons, making it impossible to distinguish between the SELK, LHLK, and ABLK neurons. We have added new behavioral data using the tsh-Gal80 transgene to prevent the expression of TNT in the ABLK neurons (Supplementary Figure 7A), but we still cannot distinguish between SELK and LHLK. We have rewritten the text to clarify this.

      Reviewer #3 (Recommendations for the authors):

      Overall, the manuscript is well-written, I only have one minor suggestion for improvement. In Figure 8C, please clarify the use of TNT to block Lk release.

      We thank the reviewer for the comment. We have clarified the use of TNT in the text.

      References

      Clyne, J. D. & Miesenböck, G. Sex-Specific Control and Tuning of the Pattern Generator for Courtship Song in Drosophila. Cell 133, 354–363 (2008).

      Dorkenwald, S. et al. Neuronal wiring diagram of an adult brain. Nature 634, 124–138 (2024).

      Engert, S., Sterne, G. R., Bock, D. D. & Scott, K. Drosophila gustatory projections are segregated by taste modality and connectivity. eLife 11, e78110 (2022).

      Jaeger, A. H. et al. A complex peripheral code for salt taste in Drosophila. eLife 7, e37167 (2018).

      Macpherson, L. J. et al. Dynamic labelling of neural connections in multiple colours by trans-synaptic fluorescence complementation. Nat Commun 6, 10024 (2015).

      Nässel, D. R. Leucokinin and Associated Neuropeptides Regulate Multiple Aspects of Physiology and Behavior in Drosophila. Int J Mol Sci 22, 1940 (2021).

      Shiu, P. K., Sterne, G. R., Engert, S., Dickson, B. J. & Scott, K. Taste quality and hunger interactions in a feeding sensorimotor circuit. eLife 11, e79887 (2022).

      Walker, S. R., Peña-Garcia, M. & Devineni, A. V. Connectomic analysis of taste circuits in Drosophila. Sci Rep 15, 5278 (2025).

    1. Author response:

      Reviewer #1:

      As this code was developed for use with a 4096 electrode array, it is important to be aware of double-counting neurons across the many electrodes. I understand that there are ways within the code to ensure that this does not happen, but care must be taken in two key areas. Firstly, action potentials traveling down axons will exhibit a triphasic waveform that is different from the biphasic waveform that appears near the cell body, but these two signals will still be from the same neuron (for example, see Litke et al., 2004 "What does the eye tell the brain: Development of a System for the Large-Scale Recording of Retinal Output Activity"; figure 14). I did not see anything that would directly address this situation, so it might be something for you to consider in updated versions of the code.

      We thank the reviewer for this insightful comment. We agree that signals from the same neuron may be collected by adjacent channels. To address this concern in our software, we plan to add a routine to SpikeMAP that allows users to discard nearby channels where spike count correlations exceed a pre-determined threshold. Because there is no ground truth to map individual cells to specific channels on the hd-MEA, a statistical approach is warranted.
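
      A minimal sketch of such a routine is given below, assuming binned spike counts per channel and known electrode coordinates. The function name, the distance and correlation thresholds, and the rule of keeping the channel with more spikes are all illustrative assumptions; they are not part of the current SpikeMAP code.

```python
import numpy as np


def redundant_channels(counts, positions_um, max_dist_um, corr_threshold):
    """Flag channels that likely duplicate a nearby channel.

    counts:       (n_channels, n_bins) array of binned spike counts
    positions_um: (n_channels, 2) array of electrode coordinates
    Returns the set of channel indices to discard.
    """
    corr = np.corrcoef(counts)      # pairwise Pearson correlation of counts
    totals = counts.sum(axis=1)
    discard = set()
    n_channels = counts.shape[0]
    for i in range(n_channels):
        for j in range(i + 1, n_channels):
            if i in discard or j in discard:
                continue
            dist = np.linalg.norm(positions_um[i] - positions_um[j])
            if dist <= max_dist_um and corr[i, j] > corr_threshold:
                # Assumed tie-break: keep the channel with more spikes.
                discard.add(j if totals[i] >= totals[j] else i)
    return discard


# Toy data: channel 1 nearly duplicates channel 0 and sits next to it;
# channel 2 has identical counts but is far away; channel 3 is nearby
# but uncorrelated with channel 0.
counts = np.array([
    [5, 0, 3, 0, 4, 1, 0, 6],
    [5, 0, 3, 0, 4, 0, 0, 6],
    [5, 0, 3, 0, 4, 1, 0, 6],
    [0, 4, 0, 5, 1, 3, 6, 0],
])
positions = np.array([[0.0, 0.0], [42.0, 0.0], [840.0, 0.0], [0.0, 42.0]])
discard = redundant_channels(counts, positions, max_dist_um=60.0, corr_threshold=0.9)
print(sorted(discard))
```

      Restricting the comparison to nearby channels keeps genuinely synchronous but distant units, which a purely correlation-based rule would wrongly merge.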

      Secondly, spike shapes are known to change when firing rates are high, like in bursting neurons (Harris, K.D., Hirase, H., Leinekugel, X., Henze, D.A. & Buzsáki, G. Temporal interaction between single spikes and complex spike bursts in hippocampal pyramidal cells. Neuron 32, 141-149 (2001)). I did not see this addressed in the present version of the manuscript.

      This is a valid concern. To ensure that firing rates are relatively constant over the duration of a recording, we will plot average spike rates using rolling windows of a fixed duration. We expect that population firing rates will remain relatively stable across the duration of recordings.
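
      The rolling-window check could look like the sketch below. The window length, step size, and spike times are placeholder values chosen for illustration, and the function name is hypothetical.

```python
def rolling_rate(spike_times_s, duration_s, window_s, step_s):
    """Mean firing rate (Hz) in windows of fixed duration, moved by step_s."""
    n_windows = int((duration_s - window_s) / step_s) + 1
    rates = []
    for k in range(n_windows):
        start = k * step_s
        stop = start + window_s
        n_spikes = sum(1 for t in spike_times_s if start <= t < stop)
        rates.append((start, n_spikes / window_s))
    return rates


# Toy spike train over a 10 s recording, 4 s windows moved in 2 s steps
spikes = [0.5, 1.0, 1.5, 2.5, 3.0, 5.0, 5.5, 7.0, 8.5, 9.0]
result = rolling_rate(spikes, duration_s=10.0, window_s=4.0, step_s=2.0)
for start, rate in result:
    print(f"{start:4.1f} s: {rate:.2f} Hz")
```

      A flat curve would support the assumption of stable firing rates, whereas sharp peaks would flag bursting periods in which spike shapes may change.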

      Another area for possible improvement would be to build on the excellent validation experiments you have already conducted with parvalbumin interneurons. Although it would take more work, similar experiments could be conducted for somatostatin and vasoactive intestinal peptide neurons against a background of excitatory neurons. These may have different spike profiles, but your success in distinguishing them can only be known if you validate against ground truth, like you did for the PV interneurons.

      We agree that further cycles of experiments could be performed with SOM, VIP, and other neuronal subtypes, and we hope that researchers will take advantage of SpikeMAP too. We will clarify this possibility in the Discussion section of the manuscript.

      Reviewer #2:

      Summary:

      While I find that the paper is nicely written and easy to follow, I find that the algorithmic part of the paper is not really new and should have been more carefully compared to existing solutions. While the GT recordings to assess the possibilities of a spike sorting tool to distinguish properly between excitatory and inhibitory neurons are interesting, spikeMAP does not seem to bring anything new to state-of-the-art solutions, and/or, at least, it would deserve to be properly benchmarked. I would suggest that the authors perform a more intensive comparison with existing spike sorters.

      We thank the reviewer for this comment. As detailed in Table 1, SpikeMAP is the only method that performs E/I sorting on large-scale multielectrode arrays, hence a comparison to competing methods is not currently possible. That being said, many of the pre-processing steps of SpikeMAP (Figure 1) involve methods that are already well-established in the literature and available in other packages. To highlight the contribution of our work more clearly and to facilitate the adoption of SpikeMAP, we plan to provide a "modular" portion of SpikeMAP that is specialized in performing E/I sorting and can be added to the pipeline of other packages such as KiloSort. This modularized version of the code will be shared freely along with the more complete version already available.

      Weaknesses:

      (1) The global workflow of spikeMAP, described in Figure 1, seems to be very similar to that of Hilgen et al. 2020 (10.1016/j.celrep.2017.02.038). Therefore, the first question is what is the rationale of reinventing the wheel, and not using tools that are doing something very similar (as mentioned by the authors themselves). I have a hard time, in general, believing that spikeMAP has something particularly special, given its Methods, compared to state-of-the-art spike sorters.

      We agree with the reviewer that there are indeed similarities between our work and the Hilgen et al. paper. However, while the latter employs optogenetics to stimulate neurons on a large-scale array, their technique does not specifically target inhibitory (e.g., PV) neurons as described in our work. We will clarify our paper accordingly.

      This is why, at the very least, the title of the paper is misleading, because it lets the reader think that the core of the paper will be about a new spike sorting pipeline. If this is the main message the authors want to convey, then I think that numerous validations/benchmarks are missing to assess first how good spikeMAP is, with reference to spike sorting in general, before deciding if this is indeed the right tool to discriminate excitatory vs inhibitory cells. The GT validation, while interesting, is not enough to entirely validate the paper. The details are a bit too scarce for me, or would deserve to be better explained (see other comments after).

      The title of our work will be edited to make it clear that while elements of the pipeline are well-established and available from other packages, we are the first to extend this pipeline to E/I sorting on large-scale arrays.

      (2) Regarding the putative location of the spikes, it has been shown that the center of mass, while easy to compute, is not the most accurate solution [Scopin et al, 2024, 10.1016/j.jneumeth.2024.110297]. For example, it has an intrinsic bias for finding positions within the boundaries of the electrodes, while some other methods, such as monopolar triangulation or grid-based convolution, might have better performances. Can the authors comment on the choice of the Center of Mass as a unique way to triangulate the sources?

      We agree with the reviewer and will point out limits of the center-of-mass algorithm based on the article of Scopin et al (2024). Further, we will augment the existing code library to include monopolar triangulation or grid-based convolution as options available to end-users.
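
      The bias in question follows directly from how a center of mass is computed, as the minimal sketch below illustrates. The electrode layout and amplitudes are made up, and the function is a generic amplitude-weighted average rather than the SpikeMAP implementation.

```python
import numpy as np


def center_of_mass(positions_um, amplitudes_uv):
    """Amplitude-weighted average of electrode positions."""
    weights = np.abs(np.asarray(amplitudes_uv, dtype=float))
    positions = np.asarray(positions_um, dtype=float)
    return (positions * weights[:, None]).sum(axis=0) / weights.sum()


# Four electrodes at the corners of a 60 um square (toy layout)
positions = [[0, 0], [60, 0], [0, 60], [60, 60]]

# Equal amplitudes: the estimate is the center of the square.
print(center_of_mass(positions, [100, 100, 100, 100]))

# Signal only on the two bottom electrodes, strongest on the left one:
# the estimate lies on the line between them and can never fall outside
# the electrodes, even if the true source sits beyond the left electrode.
estimate = center_of_mass(positions, [300, 100, 0, 0])
print(estimate)
```

      Because the weights are non-negative, the estimate is a convex combination of electrode positions and is therefore confined to the area they span. Methods such as monopolar triangulation fit a source model instead and are not restricted in this way.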

      (3) Still in Figure 1, I am not sure I really see the point of Spline Interpolation. I see the point of such a smoothing, but the authors should demonstrate that it has a key impact on the distinction of Excitatory vs. Inhibitory cells. What is special about the value of 90kHz for a signal recorded at 18kHz? What is the gain with spline enhancement compared to without? Does such a value depend on the sampling rate, or is it a global optimum found by the authors?

      We will clarify these points. Specifically, the value of 90kHz was chosen because it provided a reasonable temporal characterization of spikes; this value, however, can be adjusted within the software based on user preference.
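
      As an illustration of what this step does, the sketch below upsamples a synthetic waveform from 18 kHz to 90 kHz (a factor of five) using SciPy's `CubicSpline`. The waveform shape and the function name are assumptions made for the example; this is not the SpikeMAP routine itself.

```python
import numpy as np
from scipy.interpolate import CubicSpline

FS_IN = 18_000   # recording sampling rate (Hz)
FS_OUT = 90_000  # target rate after interpolation (Hz), adjustable


def upsample_waveform(waveform, fs_in=FS_IN, fs_out=FS_OUT):
    """Resample a spike waveform onto a finer time grid with a cubic spline."""
    n_in = len(waveform)
    t_in = np.arange(n_in) / fs_in
    n_out = int(round((n_in - 1) * fs_out / fs_in)) + 1
    t_out = np.arange(n_out) / fs_out
    return t_out, CubicSpline(t_in, waveform)(t_out)


# Synthetic 2 ms spike-like deflection sampled at 18 kHz (37 samples)
t = np.arange(37) / FS_IN
waveform = -100.0 * np.exp(-((t - 0.001) / 0.0002) ** 2)

t_up, upsampled = upsample_waveform(waveform)
print(len(waveform), "->", len(upsampled), "samples")
```

      The spline passes through every original sample and adds no new information. Its benefit is a finer time grid, so that features such as spike width or trough-to-peak time are not quantized to the roughly 56 µs spacing of 18 kHz samples.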

      (4) Figure 2 is not really clear, especially panel B. The choice of the time scale for the B panel might not be the most appropriate, and the legend filtered/unfiltered with a dot is not clear to me in Bii.

      We will re-check Figure 2B, which seems to have a rendering error, likely due to conversion from its original format.

      In panel E, the authors are making two clusters with PCA projections on single waveforms. Does this mean that the PCA is only applied to the main waveforms, i.e. the ones obtained where the amplitudes are peaking the most? This is not really clear from the methods, but if this is the case, then this approach is a bit simplistic and does not really match state-of-the-art solutions. Spike waveforms are quite often, especially with such high-density arrays, covering multiple channels at once, and thus the extracellular patterns triggered by the single units on the MEA are spatio-temporal motifs occurring on several channels. This is why, in modern spike sorters, the information in a local neighbourhood is often kept to be projected, via PCA, on the lower-dimensional space before clustering. Information on a single channel only might not be informative enough to disambiguate sources. Can the authors comment on that, and what is the exact spatial resolution of the 3Brain device? The way the authors are performing the SVD should be clarified in the methods section. Is it on a single channel, and/or on multiple channels in a local neighbourhood?

      Here, the reviewer is suggesting that it may be better to perform PCA on several channels at once, since a spike can appear on several channels at the same time. To address this concern, we will write a small routine allowing users to choose how many nearby channels are included in the PCA.

      (5) About the isolation of the single units, here again, I think the manuscript lacks some technical details. The authors are saying that they are using a k-means cluster analysis with k=2. This means that the authors are explicitly looking for 2 clusters per electrode? If so, this is a really strong assumption that should not be held in the context of spike sorting, because, since it is a blind source separation technique, one cannot pre-determine in advance how many sources are present in the vicinity of a given electrode. While the illustration in Figure 2E is ok, there is no guarantee that one cannot find more clusters, so why this choice of k=2? Again, this is why most modern spike sorting pipelines do not rely on k-means, to avoid any hard-coded number of clusters. Can the authors comment on that?

      It is true that k=2 is a pre-determined choice in our software. In practice, we found that k>2 leads to poorly defined clusters. However, we will ensure that this parameter can be adjusted in the software. Furthermore, if the user chooses not to pre-define this value, we will provide the option to use a Calinski-Harabasz criterion to select k.
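
      To illustrate how a Calinski-Harabasz criterion can select k, the sketch below runs a bare-bones k-means for several values of k and keeps the one with the highest score. Everything here is an assumption made for illustration: the toy 2-D points stand in for PCA scores, the deterministic farthest-point initialization is a simplification, and none of the function names come from SpikeMAP.

```python
import numpy as np


def init_centers(X, k):
    """Deterministic farthest-point initialization (illustrative choice)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    return np.array(centers, dtype=float)


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's algorithm; returns a cluster label per point."""
    centers = init_centers(X, k)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def calinski_harabasz(X, labels):
    """Ratio of between- to within-cluster dispersion (higher is better)."""
    n, clusters = len(X), np.unique(labels)
    k = len(clusters)
    overall = X.mean(axis=0)
    between = within = 0.0
    for j in clusters:
        pts = X[labels == j]
        center = pts.mean(axis=0)
        between += len(pts) * ((center - overall) ** 2).sum()
        within += ((pts - center) ** 2).sum()
    return (between / (k - 1)) / (within / (n - k))


# Toy data: three tight, well-separated groups of five points each
offsets = np.array([[0, 0], [1, 0], [-1, 0], [0, 1], [0, -1]], dtype=float)
X = np.vstack([offsets + c for c in ([0, 0], [20, 0], [10, 17])])

scores = {k: calinski_harabasz(X, kmeans(X, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k)
```

      On real recordings the clusters are rarely this well separated, so the selected k should be treated as a suggestion to be checked against waveform shapes, not as a definitive unit count.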

      (6) I'm surprised by the linear decay of the maximal amplitude as a function of the distance from the soma, as shown in Figure 2H. Is it really what should be expected? Based on the properties of the extracellular media, shouldn't we expect a power law for the decay of the amplitude? This is strange that up to 100um away from the soma, the max amplitude only dropped from 260 to 240 uV. Can the authors comment on that? It would be interesting to plot that for all neurons recorded, in a normed manner V/max(V) as function of distances, to see what the curve looks like.

      We share the reviewer’s concern and will add results that include a population of neurons to assess the robustness of this phenomenon.

      (7) In Figure 3A, it seems that the total number of cells is rather low for such a large number of electrodes. What are the quality criteria that are used to keep these cells? Did the authors exclude some cells from the analysis, and if yes, what are the quality criteria that are used to keep cells? If no criteria are used (because none are mentioned in the Methods), then how come so few cells are detected, and can the authors convince us that these neurons are indeed "clean" units (RPVs, SNRs, ...)?

      We applied stringent criteria to exclude cells, and we will revise the main text to be clear about these criteria, which include a minimum spike rate and the use of LDA to separate out PCA clusters. For the cells that were retained, we will include SNR estimates.

      (8) Still in Figure 3A, it looks like there is a bias to find inhibitory cells at the borders, since they do not appear to be uniformly distributed over the MEA. Can the authors comment on that? What would be the explanation for such a behaviour? It would be interesting to see some macroscopic quantities on Excitatory/Inhibitory cells, such as mean firing rates, averaged SNRs... Because again, in Figure 3C, it is not clear to me that the firing rates of inhibitory cells are higher than Excitatory ones, whilst they should be in theory.       

      We will include a comparison of firing rates for E and I neurons. It is possible that I cells are located at the border of the MEA due to the site of injections of the viral vector, and not because of an anatomical clustering of I cells per se. We will clarify the text accordingly.

      (9) For Figure 3 in general, I would have performed an exhaustive comparison of putative cells found by spikeMAP and other sorters. More precisely, I think that to prove the point that spikeMAP is indeed bringing something new to the field of spike sorting, the authors should have compared the performances of various spike sorters to discriminate Exc vs Inh cells based on their ground truth recordings. For example, either using Kilosort [Pachitariu et al, 2024, 10.1038/s41592-024-02232-7], or some other sorters that might be working with such large high-density data [Yger et al, 2018, 10.7554/eLife.34518].

      As mentioned previously, Kilosort and related approaches do not address the problem of E/I identification (see Table 1). However, they do have pre-processing steps in common with SpikeMAP. We will add some specific comparison points – for instance, the use of k-means and PCA (which is more common across packages) and the use of cubic spline interpolation (which is less common). Further, we will provide a stand-alone E/I sorting module that can be added to the pipeline of other packages, so that users can use this functionality without having to migrate their entire analysis.

      (10) Figure 4 has a big issue, and I guess the panels A and B should be redrawn. I don't understand what the red rectangle is displaying.

      We apologize for this issue. It seems there was a rendering problem when converting the figure from its original format. We will address this issue in the revised version of the manuscript.

      (11) I understand that Figure 4 is only one example, but I have a hard time understanding from the manuscript how many slices/mice were used to obtain the GT data? I guess the manuscript could be enhanced by turning the data into an open-access dataset, but then some clarification is needed. How many flashes/animals/slices are we talking about? Maybe this should be illustrated in Figure 4, if this figure is devoted to the introduction of the GT data.

      We will mention how many flashes/animals/slices were employed in the GT data and provide open access to these data.

      (12) While there is no doubt that GT data as the ones recorded here by the authors are the most interesting data from a validation point of view, the pretty low yield of such experiments should not discourage the use of artificially generated recordings such as the ones made in [Buccino et al, 2020, 10.1007/s12021-020-09467-7] or even recently in [Laquitaine et al, 2024, 10.1101/2024.12.04.626805v1]. In these papers, the authors have putative waveforms/firing rate patterns for excitatory and inhibitory cells, and thus, the authors could test how good they are in discriminating the two subtypes.

      We thank the reviewer for the suggestion that SpikeMAP could be tested on artificially generated spike trains and will add the citation of the two papers mentioned. We hope future efforts will employ SpikeMAP on both synthetic and experimental data to explore the neural dynamics of E and I neurons in healthy and pathological circuits of the brain.

    2. Reviewer #1 (Public review):

      Summary:

      The authors note that while many software packages exist for spike sorting, these do not automatically differentiate with known accuracy between excitatory and inhibitory neurons. Moreover, most existing spike sorting packages are for in vivo use, where the majority of electrodes are separated from each other by several hundred microns or more. There is a need for spike sorting packages that can take advantage of high-density electrode arrays where all electrodes are within a few tens of microns of other electrodes. Here, the authors offer such a software package with SpikeMAP, and they validate its performance in identifying parvalbumin interneurons that were optogenetically stimulated.

      Strengths:

      The main strength of this work is that the authors use ground truth measures to show that SpikeMAP can take features of spike shapes to correctly identify known parvalbumin interneurons against a background of other neuron types. They use spike width and peak to peak distance as the key features for distinguishing between neuron types, a method that has been around for many years (Barthó, Peter, et al. "Characterization of neocortical principal cells and interneurons by network interactions and extracellular features." Journal of neurophysiology 92.1 (2004): 600-608.), but whose performance has not been validated in the context of high density electrode arrays.

      Another strength of this approach is that it is automated - a necessity if your electrode array has 4096 electrodes. Hand-sorting or even checking such a large number of channels is something even the cruelest advisor would not wish upon a graduate student. With such large channel counts, it is essential to have automated methods that are known to work accurately. Hence, the combination of validation and automation is an important advance.

      A nice feature of this work is that with high-density electrode arrays, the spike waveforms appear on multiple nearby electrodes simultaneously. And since spike amplitudes fall off with distance, this allows triangulation of neuron locations within the regular electrode array. Thus, spike correlations between neuron types, or within neuron types, can be plotted as a function of distance. While SpikeMAP is not the first to do this (Peyrache, Adrien, et al. "Spatiotemporal dynamics of neocortical excitation and inhibition during human sleep." Proceedings of the National Academy of Sciences 109.5 (2012): 1731-1736.), it is a welcome capability of this package.

      It is also good that the code for this package is open-source, allowing a community of people (I expect in vitro labs will especially want to use this) to use the code and further improve it.

      Weaknesses:

      As this code was developed for use with a 4096 electrode array, it is important to be aware of double-counting neurons across the many electrodes. I understand that there are ways within the code to ensure that this does not happen, but care must be taken in two key areas. Firstly, action potentials traveling down axons will exhibit a triphasic waveform that is different from the biphasic waveform that appears near the cell body, but these two signals will still be from the same neuron (for example, see Litke et al., 2004 "What does the eye tell the brain: Development of a System for the Large-Scale Recording of Retinal Output Activity"; figure 14). I did not see anything that would directly address this situation, so it might be something for you to consider in updated versions of the code. Secondly, spike shapes are known to change when firing rates are high, like in bursting neurons (Harris, K.D., Hirase, H., Leinekugel, X., Henze, D.A. & Buzsáki, G. Temporal interaction between single spikes and complex spike bursts in hippocampal pyramidal cells. Neuron 32, 141-149 (2001)). I did not see this addressed in the present version of the manuscript.

      Another area for possible improvement would be to build on the excellent validation experiments you have already conducted with parvalbumin interneurons. Although it would take more work, similar experiments could be conducted for somatostatin and vasoactive intestinal peptide neurons against a background of excitatory neurons. These may have different spike profiles, but your success in distinguishing them can only be known if you validate against ground truth, like you did for the PV interneurons.

      Appraisal:

      This work addresses the need for an automated spike sorting software package for high-density electrode arrays. Although no spike sorting software is flawless, the package presented here, SpikeMAP, has been validated on PV interneurons, inspiring a degree of confidence. This is a good start, and further validation on other neuron types could increase that confidence. Groups doing in vitro experiments, where 4096 electrode arrays are more common, could find this system particularly helpful.

    1. set.seed(20)
       w <- rnorm(1000, 0, 1)  # generates 1000 observations from a standard normal distribution
       x <- data.frame(x1 = w, x2 = 0.8*w + sqrt(1 - 0.8^2)*rnorm(1000))  # generates dependent normal variable

      🔹 set.seed(20)

      • This makes the random number generation repeatable.
      • If you run the code again, you'll get the same random numbers.
      • Useful for reproducibility (e.g., in class, exams, or reports).

      🔹 w <- rnorm(1000, 0, 1)

      • This creates a variable w with 1000 random numbers from a normal distribution with mean = 0 and standard deviation = 1.
      • These are your "base" values, which will be used to create x1 and influence x2.

      🔹 x <- data.frame(...)

      This creates a new data frame x with two columns:

      ✅ x1 = w

      • Just copies the w values directly.
      • So x1 is standard normal.

      ✅ x2 = 0.8 * w + sqrt(1 - 0.8^2) * rnorm(1000)

      This is the key part! It creates a new variable x2 that is correlated with x1 and still normally distributed. Let’s break it down:

      • 0.8 * w: This makes x2 partly depend on w (with weight 0.8), which introduces the correlation.
      • sqrt(1 - 0.8^2): This is the square root of 1 − 0.64 = 0.36, i.e. 0.6. It ensures that the total variance of x2 stays 1, because 0.8² + 0.6² = 0.64 + 0.36 = 1.
      • rnorm(1000): Generates another 1000 random numbers from a standard normal distribution. This adds the random noise part, independent of w.

      So the full formula is:

      x2 = 0.8⋅x1 + 0.6⋅random noise

      This creates a variable x2 that looks normal, has correlation 0.8 with x1, and has variance 1.
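
The claim that x2 ends up with correlation 0.8 and variance 1 can be checked numerically. The sketch below redoes the construction in Python with only the standard library; the helper names are made up for this example, and Python's generator will not reproduce R's numbers for the same seed, so only the summary statistics are comparable.

```python
import math
import random


def simulate(n=1000, rho=0.8, seed=20):
    """Build x1 ~ N(0, 1) and x2 = rho*x1 + sqrt(1 - rho^2)*noise."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]  # independent of w
    x2 = [rho * a + math.sqrt(1 - rho**2) * e for a, e in zip(w, noise)]
    return w, x2


def variance(values):
    """Sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)
    return cov / math.sqrt(variance(a) * variance(b))


x1, x2 = simulate()
print(f"correlation = {pearson(x1, x2):.3f}, var(x2) = {variance(x2):.3f}")
```

With 1000 draws the printed correlation should land close to 0.8 and the variance close to 1, up to sampling noise.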

    1. Some might find this example hard to believe. This really occurred in some code I’ve seen:

           (defun make-matrix (n m)
             (let ((matrix ()))
               (dotimes (i n matrix)
                 (push (make-list m) matrix))))

           (defun add-matrix (m1 m2)
             (let ((l1 (length m1))
                   (l2 (length m2)))
               (let ((matrix (make-matrix l1 l2)))
                 (dotimes (i l1 matrix)
                   (dotimes (j l2)
                     (setf (nth i (nth j matrix))
                           (+ (nth i (nth j m1))
                              (nth i (nth j m2)))))))))

       What’s worse is that in the particular application, the matrices were all fixed size, and matrix arithmetic would have been just as fast in Lisp as in FORTRAN. This example is bitterly sad: The code is absolutely beautiful, but it adds matrices slowly. Therefore it is excellent prototype code and lousy production code.

      Strong Python vibes.
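
Why the quoted code "adds matrices slowly" can be made concrete by counting link traversals: `nth` on a Lisp list walks from the head every time. The sketch below mimics cons cells with Python tuples and counts the hops; the names and the 10×10 size are assumptions for illustration, and it indexes by row then column rather than copying the quoted code's index order.

```python
hops = 0  # number of "cdr" steps taken by nth


def from_list(items):
    """Build a cons-style linked list: each cell is (head, tail), ending in None."""
    cell = None
    for item in reversed(items):
        cell = (item, cell)
    return cell


def nth(k, cell):
    """Return element k by walking k links from the head, like Lisp's nth."""
    global hops
    for _ in range(k):
        hops += 1
        cell = cell[1]
    return cell[0]


def add_matrices(a, b, n_rows, n_cols):
    """Element-wise sum where every element access goes through nth."""
    return [
        [nth(c, nth(r, a)) + nth(c, nth(r, b)) for c in range(n_cols)]
        for r in range(n_rows)
    ]


n = 10
rows = [[r * n + c for c in range(n)] for r in range(n)]
matrix = from_list([from_list(row) for row in rows])
total = add_matrices(matrix, matrix, n, n)
```

For an n×m matrix this performs n·m·(n+m−2) hops just to read the operands, whereas an array-backed matrix reads each element in constant time.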

    1. images. Display the montagne_mini.jpg version on a page, and make it link to the montagne.jpg version.

      It's funny, because reading the code you can see that the image is placed inside the link and not the other way around. I would rather have expected the link to go inside the photo's tags and the photo inside the link's tags.

    1. hopefully you recognize the other kinds of “bullshit” a researcher will encounter: weird pseudo-code in a paper with parameters that seem defined by magic that you need to implement, or pieces of code that need to be extracted from something else, or someone’s ill-documented source code that worked yesterday but doesn’t today. If you live at the edge, you need to learn how to deal with bullshit. The general coping skills that let you deal with the bullshit, as well as specific computer-science coping skills, are hard to come by. If these aren’t taught early, in relatively low-stakes, easy-to-fix environments, it will only be worse later on.

      Guo's addendum (mentioned in an earlier annotation) addresses this: so many readers seem to have come away from Guo's piece with weird assumptions that he's not in the process of trying to teach his student any of this stuff and that he is instead just trying to set things up for them. That's not what he's doing, and that's not what upsets him.

    1. I guess the first thing to think about here is we have these terms: Computer science and software engineering

      Al Perlis made up computer science; we don't have one right now.

      Basically, in science we all want to be physics; everything turns into abstract social science. Physics, chemistry, and biology don't have "science" in their name.

      🧠 Alan Perlis Invented computer science before it even had a name. He believed programming is not just engineering — it’s philosophy, poetry, and play.

      “It’s better to have 100 functions operate on one data structure than 10 functions on 10 data structures.”

      🧪 Legacy:

      First recipient of the Turing Award

      Co-author of ALGOL, the ancestor of most languages today

      Professor at Yale, where he treated code like literature

      One of the first to say:

      “Computer Science is not about computers any more than astronomy is about telescopes.”

    2. "Beware of bugs in the above code; I've only proved it correct, not tried it."

      GREBBBB 🟢 all the way. This is post-academic epistemology: the moment where traditional math (the old math) collapses under the weight of new frames, and a new math rises. Academization is using math to deal with a new math, one that has a new set of laws.

      Math wins..! But it’s new math, not old math — we have a new way.

      Because we have a new way of dealing with things, they’re not short and not about infinite things.

      We want knowledge, but the old type of math does not apply. We are working with finite resolutions. 🧠 The Greb Math Manifesto (Draft) Academization = when we weaponize old math to overfit new reality. New Math = when we design a system not to describe infinity, but to resolve finite tensions. Victory = not proving for all time, but solving for right now.

      We no longer need ∞; we need ∴ (therefore).

      We drop “proofs,” we prefer “resolutions.”

      We’re not approximating truth. We’re designing consequence.

      The math is still rigorous — but the goal is no longer eternal.

      New Math ≠ Old Math with a twist It’s epistemological humility turned into creative method.

      Dan Ingalls was the guy who made Smalltalk work.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      This paper provides a computational model of a synthetic task in which an agent needs to find a trajectory to a rewarding goal in a 2D-grid world, in which certain grid blocks incur a punishment. In a completely unrelated setup without explicit rewards, they then provide a model that explains data from an approach-avoidance experiment in which an agent needs to decide whether to approach or withdraw from a jellyfish in order to avoid a pain stimulus. Both models include components that are labelled as Pavlovian; hence the authors argue that their data show that the brain uses a Pavlovian fear system in complex navigational and approach-avoid decisions.

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Lines 290-302):

      “When it comes to our experiments, both the simulation and VR experiment models are related and derived from the same theoretical framework maintaining an algebraic mapping. They differ only in task-specific adaptations, i.e. they differ in action sets and in temporal difference learning rules - multi-step decisions in the grid world vs. the Rescorla-Wagner rule for single-step decisions in the VR task. This is also true for Dayan et al. [2006], who bridge Pavlovian bias in a Go-No Go task (negative auto-maintenance pecking task) and a grid world task. A further minor difference between the simulation and VR experiment models is the use of a baseline bias in the human experiment's RL and the RLDDM model, where we also model reaction times with drift rates, which is not a behaviour often simulated in the grid world simulations. As mentioned previously, we use the grid world tasks for didactic purposes, similar to Dayan et al. [2006] and common to test-beds for algorithms in reinforcement learning [Sutton et al., 1998]. The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning, which span different aspects of the threat-imminence continuum [Mobbs et al., 2020].”

      In the first setup, they simulate a model in which a component they label as Pavlovian learns about punishment in each grid block, whereas a Q-learner learns about the optimal path to the goal, using a scalar loss function for rewards and punishments. Pavlovian and Q-learning components are then weighed at each step to produce an action. Unsurprisingly, the authors find that including the Pavlovian component in the model reduces the cumulative punishment incurred, and this increases as the weight of the Pavlovian system increases. The paper does not explore to what extent increasing the punishment loss (while keeping reward loss constant) would lead to the same outcomes with a simpler model architecture, so any claim that the Pavlovian component is required for such a result is not justified by the modelling. 

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Line 303-313):

      “In our simulation experiments, we assume the coexistence of the Pavlovian fear system and the instrumental system to demonstrate the emergent safety-efficiency trade-off from their interaction. It is possible that similar behaviours could be modelled using an instrumental system alone, with higher punishment sensitivity, therefore we do not argue for the necessity for the Pavlovian fear system here. Instead, the Pavlovian fear system itself could be a potential biologically plausible implementation of punishment sensitivity. Unlike punishment sensitivity (scaling of the punishments), which has not been robustly mapped to neural substrates in fMRI studies; the neural substrates for the Pavlovian fear system are well known (e.g., the limbic loop and amygdala, further see Supplementary Fig. 16). Additionally, Pavlovian fear system provides a separate punishment memory that cannot be erased by greater rewards like [Elfwing and Seymour, 2017, Wang et al., 2018]. This fundamental point can be observed in our simple T-maze simulations, where the Pavlovian fear system encourages avoidance behaviour and the agent chooses the smaller reward instead of the greater reward.”

      In the second setup, an agent learns about punishments alone. "Pavlovian biases" have previously been demonstrated in this task (i.e. an overavoidance when the correct decision is to approach). The authors explore several models (all of which are dissimilar to the ones used in the first setup) to account for the Pavlovian biases. 

      Thanks to the reviewer’s comments, we have now added a paragraph in our Discussion section (Line 290-302) explaining the similarity of our models and their integrated interpretation. We hope this addresses the reviewer’s concerns.

      Strengths: 

      Overall, the modelling exercises are interesting and relevant and incrementally expand the space of existing models. 

      Weaknesses: 

      I find the conclusions misleading, as they are not supported by the data. 

      First, the similarity between the models used in the two setups appears to be more semantic than computational or biological. So it is unclear to me how the results can be integrated. 

      Thanks to the reviewer’s comments, we have now added a paragraph in our Discussion section (Line 290-302 onwards) explaining the similarity of our models and their integrated interpretation. We hope this addresses the reviewer’s concerns.

      Secondly, the authors do not show "a computational advantage to maintaining a specific fear memory during exploratory decision-making" (as they claim in the abstract). Making such a claim would require showing an advantage in the first place. For the first setup, the simulation results will likely be replicated by a simple Q-learning model when scaling up the loss incurred for punishments, in which case the more complex model architecture would not confer an advantage. The second setup, in contrast, is so excessively artificial that even if a particular model conferred an advantage here, this is highly unlikely to translate into any real-world advantage for a biological agent. The experimental setup was developed to demonstrate the existence of Pavlovian biases, but it is not designed to conclusively investigate how they come about. In a nutshell, who in their right mind would touch a stinging jellyfish 88 times in a short period of time, as the subjects do on average in this task? Furthermore, in which real-life environment does withdrawal from a jellyfish lead to a sting, as in this task? 

      Crucially, simplistic models such as the present ones can easily solve specifically designed lab tasks with low dimensionality but they will fail in higher-dimensional settings. Biological behaviour in the face of threat is utterly complex and goes far beyond simplistic fight-flight-freeze distinctions (Evans et al., 2019). It would take a leap of faith to assume that human decision-making can be broken down into oversimplified sub-tasks of this sort (and if that were the case, this would require a meta-controller arbitrating the systems for all the sub-tasks, and this meta-controller would then struggle with the dimensionality). 

      Thanks to the reviewer’s comments, we have now mentioned this point in Lines 299-302.

      On the face of it, the VR task provides higher "ecological validity" than previous screen-based tasks. However, in fact, it is only the visual stimulation that differs from a standard screen-based task, whereas the action space is exactly the same. As such, the benefit of VR does not become apparent, and its full potential is foregone. 

      If the authors are convinced that their model can - then data from naturalistic approach-avoidance VR tasks is publicly available, e.g. (Sporrer et al., 2023), so this should be rather easy to prove or disprove. In summary, I am doubtful that the models have any relevance for real-life human decision-making. 

      Finally, the authors seem to make much broader claims that their models can solve safety-efficiency dilemmas. However, a combination of a Pavlovian bias and an instrumental learner (study 1) via a fixed linear weighting does not seem to be "safe" in any strict sense. This will lead to the agent making decisions leading to death when the promised reward is large enough (outside perhaps a very specific region of the parameter space). Would it not be more helpful to prune the decision tree according to a fixed threshold (Huys et al., 2012)? So, in a way, the model is useful for avoiding cumulatively excessive pain but not instantaneous destruction. As such, it is not clear what real-life situation is modelled here. 

      We hope our additions to the Discussion section, from Line 290 to Line 313 address the reviewer’s concerns.  

      A final caveat regarding Study 1 is the use of a PH associability term as a surrogate for uncertainty. The authors argue that this term provides a good fit to fear-conditioned SCR but that is only true in comparison to simpler RW-type models. Literature using a broader model space suggests that a formal account of uncertainty could fit this conditioned response even better (Tzovara et al., 2018). 

      We have now added a line discussing this. (Line 356-358)

      “Future work could also use a formal account of uncertainty which could fit the fear-conditioned skin-conductance response better than Pearce-Hall associability [Tzovara et al., 2018].”

      Reviewer #2 (Public review): 

      Summary: 

      The authors tested the efficiency of a model combining Pavlovian fear valuation and instrumental valuation. This model is amenable to many behavioral decision and learning setups - some of which have been or will be designed to test differences in patients with mental disorders (e.g., anxiety disorder, OCD, etc.). 

      Strengths: 

      (1) Simplicity of the model which can at the same time model rather complex environments. 

      (2) Introduction of a flexible omega parameter. 

      (3) Direct application to a rather advanced VR task. 

      (4) The paper is extremely well written. It was a joy to read. 

      Weaknesses: 

      Almost none! In very few cases, the explanations could be a bit better. 

      Thank you, we have added further explanations in the discussion section. We have further improved the writing in abstract, introduction and Methods section taking into account recommendations from reviewer #2 and #3.

      Reviewer #2 (Recommendations for the authors): 

      (1) Why is there no flexible omega in Figures 3B and 3C? Did I miss this? 

      Thank you. We have now added additional text to explain our motivation in Experiment 2, which only varies the fixed omega and omits the flexible omega (Lines 136-140).

      “In this set of results, we wish to qualitatively tease apart the role of a Pavlovian bias in shaping and sculpting the instrumental value and also provide more insight into the resulting safety-efficiency trade-off. Having shown the benefits of a flexible ω in the previous section, here we only vary the fixed ω to illustrate the effect of a constant bias and are not concerned with the flexible bias in this experiment.”

      We encourage the reader to consider this akin to an additional study that will explain how Pavlovian bias to withdraw can play a role in avoiding punishments similar to that of punishment sensitivity. This is particularly important as we do have neural correlates for Pavlovian biases but lack a clear neural correlation for punishment sensitivity so far, as mentioned in our new additions to the Discussion section (Lines 303-313).

      (2) The introduction of the flexible omega and the PAL agent in the results is a bit sudden. Some more details are needed to understand this during the first read of this passage. 

      We thank reviewer #2 for bringing this to our notice. We have attempted to refine our passage by including sentences like - 

      “The standard (rational) reinforcement learning system is modelled as the instrumental learning system. The additional Pavlovian fear system biases the withdrawal actions to aid in safe exploration, in line with our hypothesis.”

      “Both systems learn using a basic temporal difference updating rule (or in instances, its special case, the Rescorla-Wagner rule)”

      “We implement the flexible ω using Pearce-Hall associability (see equation 15 in Methods). The Pearce-Hall associability maintains a running average of absolute temporal difference errors (δ) as per equation 14. This acts as a crude but easy-to-compute metric for outcome uncertainty which gates the influence of the Pavlovian fear system, in line with our hypothesis. This implies that higher the outcome uncertainty, as is the case in early exploration, the more cautious our agent will be, resulting in safer exploration”
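
The mechanism described in these passages can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: equations 11-15 are not reproduced in the response, so the linear mixing of the two systems, the clipping of ω to [0, 1], the scaling factor `kappa`, and all parameter values are assumed here for illustration.

```python
def rescorla_wagner_update(value, outcome, learning_rate):
    """Single-step prediction-error update; returns (new_value, delta)."""
    delta = outcome - value
    return value + learning_rate * delta, delta


def update_associability(associability, delta, eta):
    """Pearce-Hall style running average of absolute prediction errors."""
    return (1.0 - eta) * associability + eta * abs(delta)


def flexible_omega(associability, kappa=1.0):
    """Assumed mapping from associability to a Pavlovian weight in [0, 1]."""
    return min(1.0, max(0.0, kappa * associability))


def action_propensities(instrumental_q, pavlovian_bias, omega):
    """Assumed linear mix: omega weights the Pavlovian bias, 1 - omega the Q-values."""
    return [(1.0 - omega) * q + omega * b for q, b in zip(instrumental_q, pavlovian_bias)]


def simulate_omegas(n_trials, learning_rate=0.5, eta=0.5):
    """Repeated identical punishment (-1): omega rises early, then fades as errors shrink."""
    value, associability, omegas = 0.0, 0.0, []
    for _ in range(n_trials):
        value, delta = rescorla_wagner_update(value, -1.0, learning_rate)
        associability = update_associability(associability, delta, eta)
        omegas.append(flexible_omega(associability))
    return omegas


omegas = simulate_omegas(20)
```

In this toy run the Pavlovian weight is largest while outcomes are still surprising and decays toward zero once the punishment is well predicted, which is the qualitative behaviour the quoted passage describes.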

      (3) In my view, the possibility of modeling moving predators is extremely interesting. I would include Figure 8D and the corresponding explanation in the main text. 

      Response with revision: We thank the reviewer for finding our simulation on moving predators extremely interesting. Unfortunately, since our instrumental system is not model-based, and especially is not explicitly modelling the predator dynamics, our simulation might not be a very accurate representation of real moving predator environments. As pointed out by Reviewer #1, perhaps several other systems other than Pavlovian fear responses are necessary for safe behaviour in such environments and we hope to address these in future studies. Thanks again for taking an interest in our simulations.

      (4) The VR experiment should be mentioned more clearly in the abstract and the introduction. It should be mentioned a bit more clearly why VR was helpful and why the authors did not use a simple bird's eye grid world task. 

      I cannot assess the RLDDM and I did not check the code. 

      Thank you, we have now mentioned the VR experiment more clearly in the abstract and the introduction. We also now further mention that the VR experiment “builds upon previous Go-No Go studies studying Pavlovian-Instrumental transfer (Guitart-Masip et al, 2012; Cavanagh et al, 2013). The virtual-reality approach confers a greater ecological validity and the immersive nature may contribute better fear conditioning, making it easier to distinguish the aversive components.”

      A bird’s eye grid world may not invoke a strong withdrawal response, as seen in these immersive approach-withdrawal tasks where we can clearly distinguish a Pavlovian fear-based withdrawal response. We did include immersive VR maze results in the supplementary materials, but future work is needed to isolate the different systems at play in such a complex behaviour.

      Reviewer #3 (Public review): 

      Summary: 

      This paper aims to address the problem of exploring potentially rewarding environments that contain danger, based on the assumption that an independent Pavlovian fear learning system can help guide an agent during exploratory behaviour such that it avoids severe danger. This is important given that otherwise later gains seem to outweigh early threats, and agents may end up putting themselves in danger when it is advisable not to do so. 

      The authors develop a computational model of exploratory behaviour that accounts for both instrumental and Pavlovian influences, combining the two according to uncertainty in the rewards. The result is that Pavlovian avoidance has a greater influence when the agent is uncertain about rewards. 

      Strengths: 

      The study does a thorough job of testing this model using both simulations and data from human participants performing an avoidance task. Simulations demonstrate that the model can produce "safe" behaviour, where the agent may not necessarily achieve the highest possible reward but ensures that losses are limited. Interestingly, the model appears to describe human avoidance behaviour in a task that tests for Pavlovian avoidance influences better than a model that doesn't adapt the balance between Pavlovian and instrumental based on uncertainty. The methods are robust, and generally, there is little to criticise about the study. 

      Weaknesses: 

      The extent of the testing in human participants is fairly limited but goes far enough to demonstrate that the model can account for human behaviour in an exemplar task. There are, however, some elements of the model that are unrealistic (for example, the fact that pre-training is required to select actions with a Pavlovian bias would require the agent to explore the environment initially and encounter a vast amount of danger in order to learn how to avoid the danger later). The description of the models is also a little difficult to parse. 

      Thank you, we have now attempted to clarify these points in the Discussion section by adding the following text (Lines 313-321):

      “We next discuss the plausibility of pre-training to select the hardwired actions. In the human experiment, the withdrawal action is straightforwardly biased, as noted, while in the grid world, we assume a hardwired encoding of withdrawal actions for each state/grid. This innate encoding of withdrawal actions could be represented in the dPAG [Kim et al., 2013]. We implement this bias using pre-training, which we assume would be a product of evolution. Alternatively, this could be interpreted as deriving from an appropriate value initialization where the gradient over initialized values determines the action bias. Such aversive value initialization, driving avoidance of novel and threatening stimuli, has been observed in the tail of the striatum in mice, which is hypothesised to function as a Pavlovian fear/threat learning system [Menegas et al., 2018].”

      Reviewer #3 (Recommendations for the authors): 

      I have relatively little to suggest, as in my view the paper is robust, thorough, and creative, and does enough to support the primary argument being made at the most fundamental level. My suggestions for improvement are as follows: 

      (1) Some aspects of the model are potentially unrealistic (as described in the public review), and the paper may benefit from some discussion of these issues or attempts to make the model more realistic - i.e., to what extent is this plausible in explaining more complex avoidance behaviour? Primarily, the fact that pre-training is required to identify actions subject to Pavlovian bias seems unlikely to be effective in real-world situations - is there a better way to achieve this in cases where there isn't necessarily an instinctual Pavlovian response? 

      Thank you, we agree that the advantage of Pavlovian bias is restricted to the bias/instinctual Pavlovian response conferred by evolution. Future work is needed to model more complex avoidance behaviour such as escapes. We hope to have made this more clear with our edits to the Discussion (Lines 299-302) in our response to Reviewer #1’s comments, specifically:

      “The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning, which span different aspects of the threat-imminence continuum [Mobbs et al., 2020].”

      (2) The description of the model in the method can be a little hard to follow and would benefit from further explanation of certain parameters. In general, it would be good to ensure that all terms mentioned in equations are described clearly in the text (for example, in Equation 1 it isn't clear what k refers to). 

      Thank you, we have now added further information on all of the parameters in Equation 1 and improved the writing of the Methods section overall, for instance by using a time subscript to reduce confusion when introducing the parameters. We use the standard notation from the Sutton and Barto textbook. k refers to the timesteps into the future, and is now explained better in the Methods section.

      (3) Another point of clarification in Equation 1 - does the policy account for the Pavlovian influence or is this purely instrumental? 

      Thank you, Equation 1 is purely instrumental. We have now specifically mentioned this. The Pavlovian influence follows later. They are combined into propensities for action as per equations 11-13.

      (4) I was curious whether similar outcomes could be achieved by more complex instrumental models without the need for Pavlovian influences. For example, could different risk-sensitive decision rules (e.g., conditional value at risk) that rely only on the instrumental system afford safe behaviour without the need for an additional Pavlovian system? 

      Thank you for your comment. Yes, CVaR can achieve safe exploration/cautious behaviour in choices similar to Pavlovian avoidance learning. But we think both differ in the following ways:

      (1) CVaR provides the correct solution to the wrong problem (objective that only maximises the lower tail of the distribution of outcomes)

      (2) Pavlovian bias provides the wrong solution to the right problem (normative objective, but a Pavlovian bias which may be a vestige of evolution)

      Here we use the “wrong problem, wrong solution, wrong environment” categorisation terminology from Huys et al. 2015.

      Huys, Q. J., Guitart-Masip, M., Dolan, R. J., & Dayan, P. (2015). Decision-theoretic psychiatry. Clinical Psychological Science, 3(3), 400-421.

      Secondly, we find an effect of Pavlovian bias on reaction times: slower approach responses and faster withdrawal responses. We do not think this is best explained by a CVaR-type model, and this is a direction for future work. We think such model-based methods are slower to compute, whereas the Pavlovian withdrawal bias is a quicker response.

      We have now included this in brief in Lines 280-288.
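
For readers unfamiliar with the risk measure discussed in this exchange, conditional value at risk (CVaR) at level α is the average of the worst α fraction of outcomes. The sketch below is a generic empirical estimator with made-up outcomes; it is not taken from the paper or the review.

```python
import math


def cvar(outcomes, alpha):
    """Mean of the worst alpha fraction of outcomes (lower tail), 0 < alpha <= 1."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(outcomes)
    # Number of tail samples; the small offset guards against float round-up.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical returns of one action: mostly good, occasionally very bad.
returns = [-10, -2, 0, 1, 3, 5, 5, 6, 8, 10]

worst_fifth = cvar(returns, 0.2)  # average of the two worst outcomes
plain_mean = cvar(returns, 1.0)   # alpha = 1 recovers the ordinary mean
```

An agent that ranks actions by `cvar(returns, 0.2)` instead of the mean optimizes only the lower tail, which is the sense in which the authors call it a different objective.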

      (5) Figure 5 would benefit from a clearer caption as it is not necessarily clear from the current one that the left panels refer to choices and the right panels to reaction times. 

      Thank you, we have improved the caption for Fig. 5.

      (6) It would be good to include some indication of the quality of the model fits for the human behavioural study (i.e., diagnostics such as R-hat) to ensure that differences in model fit between models are not due to convergence issues with different models. This would be especially helpful for the RLDDM models as these can be difficult to fit successfully.

      Thank you. We observed that all R-hat values were strictly less than 1.05 (most parameters were less than 1.01 and generally close to 1), indicating that the models converged. We have now added this line to the results (Lines 246-248).

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Lines 290-302): “When it comes to our experiments, both the simulation and VR experiment models are related and derived from the same theoretical framework maintaining an algebraic mapping. They differ only in task-specific adaptations i.e. differ in action sets and differ in temporal difference learning rules - multi-step decisions in the grid world vs. Rescorla-Wagner rule for single-step decisions in the VR task. This is also true for Dayan et al. [2006] who bridge Pavlovian bias in a Go-No Go task (negative auto-maintenance pecking task) and a grid world task. A further minor difference between the simulation and VR experiment models is the use of a baseline bias in the human experiment's RL and the RLDDM model, where we also model reaction times with drift rates which is not a behaviour often simulated in the grid world simulations. As mentioned previously, we use the grid world tasks for didactic purposes, similar to Dayan et al. [2006] and common to test-beds for algorithms in reinforcement learning [Sutton et al., 1998]. The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning, which span different aspects of the threat-imminence continuum [Mobbs et al., 2020].”

      In the first setup, they simulate a model in which a component they label as Pavlovian learns about punishment in each grid block, whereas a Q-learner learns about the optimal path to the goal, using a scalar loss function for rewards and punishments. Pavlovian and Q-learning components are then weighed at each step to produce an action. Unsurprisingly, the authors find that including the Pavlovian component in the model reduces the cumulative punishment incurred, and this increases as the weight of the Pavlovian system increases. The paper does not explore to what extent increasing the punishment loss (while keeping reward loss constant) would lead to the same outcomes with a simpler model architecture, so any claim that the Pavlovian component is required for such a result is not justified by the modelling.
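For readers unfamiliar with the R-hat convergence check discussed in the response above, the following is a minimal sketch of the classic (non-split) Gelman-Rubin statistic for a single parameter. It is not the authors' code: the function name, the simulated chains and the 1.05 cut-off applied to them are illustrative assumptions, and current libraries such as ArviZ default to a rank-normalized split variant that can give slightly different values.

```python
import math
import random
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for one parameter.

    chains: list of equal-length lists of posterior draws (hypothetical input).
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return math.sqrt(pooled / within)


# Simulated draws (assumption): four well-mixed chains vs. chains stuck in two modes.
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
stuck = [[rng.gauss(offset, 1.0) for _ in range(500)]
         for offset in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

In this toy setting the well-mixed chains give a value close to 1, while the chains stuck around different means give a value well above the 1.05 threshold quoted in the response.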

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigated the interactions between IRE and unfolded peptides using all-atom molecular dynamics simulations. The interactions between a couple of unfolded peptides and IRE might shed light on the activation of the UPR.

      Strengths:

      (1) Well-written manuscript tailored for a biology audience.

      (2) State-of-the-art structural predictions and all-atom simulations.

      (3) Validation with existing experimental data

      (4) Clear schematic diagram summarizing the mechanisms learned from simulations.

      (5) Shared simulation data and code in a public repository.

      Weaknesses:

      (1) Improving presentation to include more computational details.

      (2) More quantitative analysis in addition to visual structures.

    1. Reviewer #2 (Public review):

      Summary:

      Tibial nerve (electrical) stimulation (TNS) has emerged over the past 15 years as a non-invasive method to treat bladder overactivity, but interestingly, new animal work has suggested that TNS could actually be used to excite the bladder when appropriately tuning the stimulation frequency, effectively inverting its effect, perhaps opening the door to treat different conditions (e.g., UAB). The present study tests how healthy people respond to low and high frequency TNS, with the authors showing that they can substantially delay people's first sensation of bladder fullness with high frequencies (20Hz, shown many times before) but also that they can slightly hasten people's first sensation with low frequencies (1Hz, new result in humans). Moreover, the authors develop a computational model of interconnected conductance-based simulated neurons arranged in a physiologically plausible circuit that reproduces some aspects of the frequency-dependent effects of TNS. Their simulations suggest that we might expect low-frequency TNS to also increase the duration of bladder contractions in humans. The study highlights a potential new research direction, optimizing TNS stimulation parameters to increase basal bladder excitability.

      Strengths:

      The main strength of the work is to call attention to a new possibility of inverting the effect of TNS in humans by manipulating stimulation frequency, opening new indications for the therapy. This is highly relevant because of the recent popularity of TNS and its non-invasiveness, which lends itself to rapid testing and evaluation for new conditions and a high willingness to adopt. The authors convincingly demonstrate a modest excitatory effect on bladder sensation with low-frequency TNS, which clearly warrants further investigation.

      The high-level design of the hypotheses, concepts, and experiments is clearly articulated in both the methods and in particularly clear diagrams, letting the reader focus their attention on the most important findings.

      It is rare to develop a new computational model of the lower urinary tract at a systems level, and even more so for it to incorporate circuits in the spinal cord and brainstem centers, and this work undoubtedly advances the field's ability to engineer such systems. Further, because the model is comprised of linked conductance-based point-neurons, it is an excellent tool to investigate how an arguably plausible wiring diagram for neural control of the LUT could result in stimulation frequency-dependent effects on pelvic efferents. It is a proof of concept demonstrating how their mechanistic hypothesis of TNS could be implemented neurophysiologically by the nervous system.

      Weaknesses:

      The main drawback of the work is the frequent overinterpretation of the results. The human study and computational model are both proof-of-principle studies because the experimental effect size and sample size are modest, and the computational model is poorly validated and does not generate physiologically typical cystometric responses in simulations that are designed to recapitulate nominal LUT behavior.

      Despite the stated caveats about the small effect in the human study, it should be emphasized throughout that this result is most reasonably interpreted as showing the possibility that TNS can have a low-frequency excitatory effect that merits follow-up, rather than a conclusive demonstration. The effect size is small (as the authors note) and should be placed in context with some minimally clinically important difference, if possible. The result is statistically significant, but even this may be subject to revision due to the small sample and the effect of post-hoc outlier removal and data analysis choices.

      Given the apparent mismatch between the model and the cystometric behavior at the systems level in the "normal" case (e.g., low capacity, low voiding efficiency, omitted pressure profiles, frequency, etc.) and the absence of quantitative model validation (e.g., it was not compared directly with any experimental data from human urodynamics or rodent cystometry, beyond the initial fit to the neural data, no sensitivity analyses were performed, no goodness of fit computed, etc.) the discussion should be much more circumspect about interpreting the results at a systems level and should probably contain a paragraph explicitly detailing the limitations of the model. The subsequent interpretation should focus narrowly on the neural circuitry, rather than things like contraction duration, where the model is at its strongest. As written, the authors over-interpret what the in silico study can reasonably be used to infer about LUT function.

      More justification is needed for why the contraction duration of the model is the central focus of analysis, when it connects only tentatively to the human study results, which focus on urgency. While not necessarily incorrect, a clearer link or motivation should be offered for how this informs our understanding of frequency-dependent TNS afferent or efferent inhibition during filling (which was the focus of the human studies and the abstract). In other words, why doesn't the model reproduce the 1Hz excitation effect of expediting void onset (or urgency in the human study), and why is it justified to look at contraction duration as a surrogate measure?

      The authors claim that "voiding behavior occurred earlier [at 1Hz stim in the model]", pointing to Figure 6A as evidence, but this panel appears to show a single example model run where 1Hz voiding occurs only ~1s earlier (display makes this very hard to estimate). This is insufficient evidence to support the claim. Later, it is stated that "TNS did not ... void much earlier". The claims should be made compatible, and all such claims should have reasonable supporting evidence.

      There are a number of reporting concerns that can be easily addressed:

      (1) Human Study:

      (a) To interpret the human study analysis, a fuller description of the "optional 10-minute extension" is necessary. How were participants presented with this option, how was blinding preserved, what fraction of participants accepted, and did phase 1 results influence their decisions to continue?

      (b) For reproducibility, details about the TNS parameters should be articulated, such as the method of determining "motor thresholds" (unless this is synonymous with "urge to urinate"), the shape of the stimulation pulses (e.g., biphasic, charge balanced), typical applied current, etc.

      (2) The Computational Model

      (a) The code availability statement for this type of work is inadequate. The model used for simulations in this work, as well as the code used to initialize (and randomize synaptic connections), needs to be hosted publicly because i) a model this intricate is extremely hard to reproduce/verify without code, ii) simulations are an essential piece of the argument, iii) hosting code requires very little overhead. Although there is an appropriate level of detail in the model description, it would not be possible to reproduce the model in any reasonable amount of time (or at all) because of the implementation-level details that are, understandably, omitted from the methods (e.g., what is a "unit", what 'exactly' do the connections in the PMC and PAG diagrams relate to, what were the final parameters used for all conductances, which parameters were "matched" to the original papers and which were not, etc.).

      (b) Critical cystometric/urodynamic values that are typically analyzed to assess healthy LUT function are detrusor pressure (timeseries) and/or post-void residual or voiding efficiency (scalars). These should be included to verify that the model is representative of the "normal" case. This is especially important because the model's "normal" behavior appears to have extremely low voiding efficiency (Figure 6A).

    1. It seems sample sizes for some species were too small to make meaningful estimates of population parameters (weight, length, age compositions, etc).

      Which species and areas were too small and what should the sample sizes have been to be meaningful? Also please standardize the thickness of the bars in each facet.

      Please add figure captions to all figures. There are various ways to accomplish this but I can send you code if you would like help with this.

    1. Our focus on this specific task is spurred by not only the fact that reasoning about actions and change is a core aspect of human intelligence, but also that it is required for many of the tasks considered as potential applications of LLMs including automatic code generation, moral and even deontological reasoning

      Planning relates to other kinds of reasoning that go beyond word prediction. To believe the bold claims of LLM reasoning, we need to have rigorous experiments on other types of reasoning.

    1. Author response:

      The following is the authors’ response to the original reviews

      Main revision made to the manuscript

      The main revision made to the manuscript is to reconcile our findings with the line attractor model. The revision is based on Reviewer 1’s comment on reinterpreting our results as a superposition of an attractor model with fast timescale dynamics. We expanded our analysis regime to the start of a trial and characterized the overall within-trial dynamics to reinterpret our findings.

      We first acknowledge that our results are not in contradiction with evidence integration on a line attractor. As pointed out by the reviewers, our finding that the integration of reward outcome explains the reversal probability activity x_rev (Figure 3) is compatible with the line attractor model. However, the reward integration equation is an algebraic relation and does not characterize the dynamics of reversal probability activity. A closer analysis of the neural dynamics is therefore needed to assess the feasibility of the line attractor.

      In the revised manuscript, we show that x_rev exhibits two different activity modes (Figure 4). First, x_rev has substantial non-stationary dynamics during a trial, and this non-stationary activity is incompatible with the line attractor model, as claimed in the original manuscript. Second, we present new results showing that x_rev is stationary (i.e., constant in time) and stable (i.e., contracting) at the start of a trial. These two properties of x_rev support that it is a point attractor at the start of a trial and is compatible with the line attractor model. 
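The stationarity criterion described above (x_rev constant in time at the start of a trial, time-varying during the trial) can be illustrated with a finite-difference estimate of the time derivative. This is only a sketch under stated assumptions: the trace, the sampling interval and the analysis windows are invented for illustration, and the contraction (stability) analysis mentioned in the response is not covered here.

```python
def finite_difference(x, dt):
    """Forward-difference estimate of dx/dt for a uniformly sampled trace."""
    return [(x[i + 1] - x[i]) / dt for i in range(len(x) - 1)]


def mean_abs(values):
    """Mean absolute value, used here as a simple 'speed' summary."""
    return sum(abs(v) for v in values) / len(values)


# Hypothetical x_rev trace: flat at the start of the trial, then an excursion.
dt = 0.01  # seconds per sample (assumed)
x_rev = [0.5] * 50 + [0.5 + 0.02 * k for k in range(50)]

dxdt = finite_difference(x_rev, dt)
start_speed = mean_abs(dxdt[:40])    # window at the start of the trial
within_speed = mean_abs(dxdt[55:])   # window during the trial

# Stationary at the start if the derivative is (numerically) zero there.
is_stationary_at_start = start_speed < 1e-6
```

With real recordings the derivative would be noisy, so the comparison would more plausibly be made against a null distribution or across trials than against a fixed tolerance.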

      We further analyze how the two activity modes are linked (Figure 4, Support vector regression). We show that the non-stationary activity is predictable from the stationary activity if the underlying dynamics can be inferred. In other words, the non-stationary activity during a trial is generated by an underlying dynamics with the initial condition provided by the stationary state at the start of trial.

      These results suggest an extension of the line attractor model where an attractor state at the start of a trial provides an initial condition from which non-stationary activity is generated during a trial by an underlying dynamics associated with task-related behavior (Figure 4, Augmented model). 

      The separability of non-stationary trajectories (Figure 5 and 6) is a property of the non-stationary dynamics that allows separable points in the initial stationary state to remain separable during a trial, thus making it possible to represent distinct probabilistic values in non-stationary activity.

      This revised interpretation of our results (1) retains our original claim that the non-stationary dynamics during a trial is incompatible with the line attractor model and (2) introduces an attractor state at the start of a trial which is compatible with the line attractor model. Our analysis shows that the two activity modes are linked by an underlying dynamics, and the attractor state serves as the initial state to launch the non-stationary activity.

      Responses to the Public Reviews:

      Reviewer # 1:

      (1) To provide a better explanation of the reversal learning task and network training method, we added a detailed description of the RNN and monkey task structure (Result Section 1), included a schematic of target outputs (Figure 1B), explained the rationale behind using an inhibitory network model (Method Section 1) and explained the supervised RNN training scheme (Result Section 1). This information can also be found in the Methods.

      (2) Our understanding is that the augmented model discussed in the previous page is aligned with the model suggested by Reviewer 1: “a curved line attractor, with faster timescale dynamics superimposed on this structure”. It is likely that the “fast” non-stationary activity observed during the trial is driven by task-related behavior, thus is transient. For instance, we do not observe such non-stationary activity in the inter-trial-interval when the task-related behavior is absent. For this reason, the non-stationary trajectories were not considered to be part of the attractor. Instead, they are transient activity generated by the underlying neural dynamics associated with task-related behavior. We believe such characterization of faster timescale dynamics is consistent with Reviewer 1’s view and wanted to clarify that there are two different activity modes.

      (3) We appreciate the comment from Reviewers 1 and 2 that TDR may be limited in isolating the neural subspace of interest. Our study presents what could be learned from TDR but is by no means the only way to interpret the neural data. Applying other methods for isolating task-related neural activities is left for future work.

      We would appreciate it if the reviewers could share thoughts on what other alternative methods could better isolate the reversal probability activity.

      Reviewer # 2:

      (1) (i) We respectfully disagree with Reviewer 2’s comment that “no action is required to be performed by neurons in the RNN”. In our network setup, the output of RNN learns to choose a sign (+ or -), as Reviewer 2 pointed out, to make a choice. This is how the RNN takes an action. It is unclear to us what Reviewer 2 has intended by “action” and how reaching a target value (not just taking a sign) would make a significant difference in how the network performs the task. 

      (ii)  From Reviewer 2’s comment that “no intervening behavior is thus performed by neurons”, we noticed that the term “intervening behavior” has caused confusion. It refers to task-related behavior, such as making choices or receiving reward, that the subject must perform across trials before reversing its preferred choice. These are the behaviors that intervene the reversal of preferred choice. To clarify its meaning, in the revised manuscript, we changed the term to “task-related behavior” and put them in context. For example, in the Introduction we state that “However, during a trial, task-related behavior, such as making decisions or receiving feedback, produced …”

      (iii) As pointed out by Reviewer 2, the lack of fixation period in the RNN could make differences in the neural dynamics of RNN and PFC, especially at the start of a trial. We demonstrate this issue in Result Section 4 where we analyze the stationary activity at the start of a trial. We find that fixating the choice output to zero before making a choice promotes stationary activity and makes the RNN activity more similar to the PFC activity.

      Reviewer #3:

      (1) (i) In the previous study (Figure 1 in [Bartolo and Averbeck ‘20]), it was shown that neural activity can predict the behavioral reversal trial. This is the reason we examined the neural activity in the trials centered at the behavioral reversal trial. We explained in Result Section 2 that we followed this line of analysis in our study.

      (ii) We would like to emphasize that the main point of Figures 4 and 5 is to show the separability of neural trajectories: the entire trajectory shifts without overlapping. It is not obvious that high-dimensional neural population activity from two trials should remain separated when their activities are compressed into a one-dimensional subspace. The one-dimensional activities can easily collide since their activities are compressed into a low-dimensional space. We revised the manuscript to bring out these points. We added an opening paragraph that discusses separability of trajectories and revised the main text to bring out the findings on separability.
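The separability argument above can be made concrete with a small sketch: project each population state onto a fixed one-dimensional axis and ask whether two trials keep the same strict ordering at every time point. The ordering criterion, the three-unit toy trajectories and the axis are assumptions chosen for illustration; they are not the authors' analysis or data.

```python
def project(trajectory, axis):
    """Project each population state (a list of rates) onto a fixed axis."""
    return [sum(r * a for r, a in zip(state, axis)) for state in trajectory]


def are_separable(traj_a, traj_b, axis):
    """True if the 1-D projections keep one strict ordering at all time points."""
    pa, pb = project(traj_a, axis), project(traj_b, axis)
    diffs = [a - b for a, b in zip(pa, pb)]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


# Toy 3-unit trajectories (hypothetical): A and B are shifted copies along the
# axis, whereas C moves in the opposite direction and crosses A.
axis = [1.0, 0.0, 0.0]
traj_a = [[1.0 + 0.1 * t, 0.3, -0.2] for t in range(10)]
traj_b = [[0.5 + 0.1 * t, -0.7, 0.4] for t in range(10)]
traj_c = [[1.9 - 0.1 * t, 0.0, 0.0] for t in range(10)]

shifted_pair = are_separable(traj_a, traj_b, axis)
crossing_pair = are_separable(traj_a, traj_c, axis)
```

The second and third units differ between trials but do not affect the outcome, which reflects the point made in the response: after compression to one dimension, nothing guarantees that trajectories which are distinct in the full space stay apart.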

      (iii) We agree with Reviewer 3 that it would be interesting to look at what happens in other subspaces of neural activity that are not related to reversal probability and to characterize how different neural subspaces interact with each other. However, the focus of this paper was the reversal probability activity, and we’d consider these questions out of the scope of the current paper. We point out that, using the same dataset, neural activity related to other experimental variables was analyzed in other papers [Bartolo and Averbeck ’20; Tang, Bartolo and Averbeck ‘21]

      (2) (i) In the revised manuscript, we added an explanation of the rationale behind choosing the inhibitory network as a simplified model for the balanced state. In brief, a network with strong inhibitory recurrent connections and strong excitatory external input operates in the balanced state, as in the standard excitatory-inhibitory network. We included references that studied this inhibitory network. We also explained the technical reason (GPU memory) for choosing the inhibitory model.

      (ii) We thank the reviewer for pointing out that the original manuscript did not mention how the feedback and cue were initialized. They were random vectors sampled from a Gaussian distribution. We added this information in the revised manuscript. In our opinion, it is common to use random external inputs for training RNNs, as it is a priori unclear how to choose them. In fact, it is possible to analyze the effects of random feedback on one-dimensional x_rev dynamics by projecting the random feedback vector to the reversal probability vector. This is shown in Figure 4F.
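The projection mentioned above, a Gaussian random feedback vector projected onto the reversal probability vector, amounts to a scalar component along a unit direction. The sketch below assumes a network size of 200 units and uses a second random vector as a stand-in for the regression-derived reversal probability axis; both are placeholders, not values from the manuscript.

```python
import math
import random


def gaussian_vector(n, rng, sigma=1.0):
    """Draw an n-dimensional vector with i.i.d. zero-mean Gaussian entries."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def projection(v, direction):
    """Scalar component of v along the unit vector parallel to `direction`."""
    length = norm(direction)
    return sum(a * b for a, b in zip(v, direction)) / length


rng = random.Random(1)
n_units = 200                              # hypothetical network size
feedback = gaussian_vector(n_units, rng)   # random feedback input
rev_axis = gaussian_vector(n_units, rng)   # placeholder for the x_rev axis

# Effective one-dimensional drive of the feedback input along the x_rev axis.
drive = projection(feedback, rev_axis)
```

The sign and size of `drive` indicate how a feedback pulse would push the one-dimensional x_rev activity, which is the quantity the response says is analyzed in Figure 4F.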

      (iii) We agree that it would be more natural to train the RNN to solve the task without using the Bayesian model. We point out this issue in the Discussion in the revised manuscript.

      Recommendations for the authors:

      Reviewer #1:

      (1) My understanding of network training was that a Bayesian ideal observer signaled target output based on previous reward outcomes. However, the authors never mention that networks are trained by supervised learning in the main text until the last paragraph of the discussion. There is no mention that there was an offset in the target based on the behavior of the monkeys in the main text. These are really important things to consider in the context of the network solution after training. I couldn't actually find any figure that presents the target output for the network. Did I miss something key here?

      In Result Section 1, we added a paragraph that describes in detail how the RNN is trained. We explained that the network is first simulated and then the choice outputs and reward outcomes are fed into the Bayesian model to infer the scheduled reversal trial. A few trials are added to the inferred reversal trial to obtain the behavioral reversal trial, as found in a previous study [Bartolo and Averbeck ‘20]. Then the network weights are updated by backpropagation-through-time via supervised learning. 

      In the original manuscript, the target output for the network was described in Methods Section 2.5, Step 4. To make this information readily accessible, we added a schematic in Figure 1B that shows the scheduled, inferred and behavioral reversal trials. It also shows how the target choice outputs are defined. They switch abruptly at the behavioral reversal trial.
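The training targets described above (infer the reversal trial from choices and rewards, add a few trials, then switch the target abruptly) can be sketched with a simple ideal-observer model. This is an illustrative stand-in rather than the Bayesian model used in the manuscript: the flat prior, the reward probability of 0.7, the delay of two trials and the function names are all assumptions.

```python
import math


def log_posterior_over_reversal(choices, rewards, p_high=0.7):
    """Unnormalised log posterior over the reversal trial under a flat prior.

    Assumes option 0 is the better option before the reversal and option 1
    after it; p_high is the reward probability of the better option.
    """
    n = len(choices)
    scores = []
    for r in range(n + 1):  # r = first trial with the reversed schedule
        total = 0.0
        for t, (choice, rewarded) in enumerate(zip(choices, rewards)):
            best = 0 if t < r else 1
            p_reward = p_high if choice == best else 1.0 - p_high
            total += math.log(p_reward if rewarded else 1.0 - p_reward)
        scores.append(total)
    return scores


def target_choices(n_trials, inferred_reversal, delay):
    """Targets switch abruptly at the behavioural reversal trial."""
    switch = inferred_reversal + delay
    return [0 if t < switch else 1 for t in range(n_trials)]


# Toy block (hypothetical): option 0 is always chosen, rewarded on the first
# 10 trials and unrewarded afterwards.
choices = [0] * 20
rewards = [1] * 10 + [0] * 10
scores = log_posterior_over_reversal(choices, rewards)
inferred = max(range(len(scores)), key=lambda r: scores[r])  # MAP estimate
targets = target_choices(len(choices), inferred, delay=2)
```

In this toy block the MAP estimate falls on the trial where rewards stop, and the targets switch two trials later, mirroring the delay between the inferred and behavioral reversal trials described in the response.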

      (2) The role of block structure in the task is an important consideration. What are the statistics of block switches? The authors say on average the reversals are every 36 trials, but also say there are random block switches. The reviewer's notes suggest that both the networks and monkeys may be learning about the typical duration of blocks, which could influence their expectations of reversals. This aspect of the task design should be explained more thoroughly and considered in the context of Figure 1E and 5 results.

      We provided a more detailed description of the reversal learning task in Result Section 1. We clarified that (1) a task is completed by executing a block of a fixed number of trials and (2) reversal of the reward schedule occurs at a random trial around the mid-trial in a block. The differences in the number of trials in a block that the RNNs (36) and the monkeys (80) perform are also explained. We also pointed out the differences in how the reversal trial is randomly sampled.

      However, it is unclear what Reviewer 1 meant by random block switches. Our reversal learning task is completed when a block of a fixed number of trials is executed. Reversal of the reward schedule occurs only once on a randomly selected trial in the block, and the reversed reward schedule is maintained until the end of a block. It is different from other versions of reversal learning where the reward schedule switches multiple times across trials. We clarified this point in Result Section 1.
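The block structure described in this response (a fixed number of trials, a single reversal near the middle of the block, and the reversed schedule kept to the end) can be sketched as follows. The block lengths of 36 and 80 trials come from the response; the uniform sampling and the jitter of five trials are assumptions, since the response notes that the sampling differs between the RNNs and the monkeys without giving details here.

```python
import random


def make_block(n_trials, jitter, rng):
    """Return (reversal_trial, best_option_per_trial) for one block.

    A single reversal is drawn uniformly within +/- jitter trials of the block
    midpoint (assumed sampling rule); the reversed schedule is then kept until
    the end of the block.
    """
    mid = n_trials // 2
    reversal = rng.randint(mid - jitter, mid + jitter)  # inclusive bounds
    best = [0 if t < reversal else 1 for t in range(n_trials)]
    return reversal, best


rng = random.Random(2)
rnn_reversal, rnn_best = make_block(36, jitter=5, rng=rng)        # RNN block
monkey_reversal, monkey_best = make_block(80, jitter=5, rng=rng)  # monkey block

# Number of schedule switches within the block; exactly one by construction.
rnn_switches = sum(1 for a, b in zip(rnn_best, rnn_best[1:]) if a != b)
```

Because each block contains exactly one switch, there are no repeated within-block switches of the kind the response contrasts with other versions of reversal learning.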

      (3) The relationship between the supervised learning approach used in the RNNs and reinforcement learning was confused in the discussion. "Although RNNs in our study were trained via supervised learning, animals learn a reversal-learning task from reward feedback, making it into a reinforcement learning (RL) problem." This is fundamentally not true. In the case of this work, the outcome of the previous trial updates the target output, rather than the trial and error type learning as is typical in reinforcement learning. Networks are not learning by reinforcement learning and this statement is confusing.

      We agree with Reviewer 1’s comment that the statement in the original manuscript is confusing. Our intention was to point out that our study used supervised learning, and that this is different from how animals learn by reinforcement learning in real life. We revised the sentence in Discussion as follows:

      “The RNNs in our study were trained via supervised learning. However, in real life, animals learn a reversal learning task via reinforcement learning (RL), i.e., learn the task from reward outcomes.”

      (4) The distinction between line attractors and the dynamic trajectories described by the authors deserves further investigation. A significant concern arises from the authors' use of targeted dimensionality reduction (TDR), a form of regression, to identify the axis determining reversal probability. While this approach can reveal interesting patterns in the data, it may not necessarily isolate the dimension along which the RNN computes reversal probability. This limitation could lead to misinterpretation of the underlying neural dynamics.

      a) This manuscript cites work described in "Prefrontal cortex as a meta-reinforcement learning system," which examined a similar task. In that study, the authors identified a v-shaped curve in the principal component space of network states, representing the probability of choosing left or right.

      Importantly, this curve is topologically equivalent to a line and likely represents a line attractor. However, regressing against reversal probability in such a case would show that a single principal component (PC2) directly correlates with reversal probability.

      b) The dynamics observed in the current study bear a striking resemblance to this structure, with the addition of intervening loops in the network state corresponding to within-trial state evolution. Crucially, these observations do not preclude the existence of a line attractor. Instead, they may reflect the network's need to produce fast timescale dynamics within each trial, superimposed on the slower dynamics of the line attractor.

      c) This alternative interpretation suggests that reward signals could function as inputs that shift the network state along the line attractor, with information being maintained across trials. The fast "intervening behaviors" observed by the authors could represent faster timescale dynamics occurring on top of the underlying line attractor dynamics, without erasing the accumulated evidence for reversals.

      d) Given these considerations, the authors' conclusion that their results are better described by separable dynamic trajectories rather than fixed points on a line attractor may be premature. The observed dynamics could potentially be reconciled with a more nuanced understanding of line attractor models, where the attractor itself may be curved and coexist with faster timescale dynamics.

      We appreciate the insightful comments on (1) the similarity of the work by Wang et al ’18 with our findings and (2) an alternative interpretation that augments the line attractor with fast timescale dynamics. 

      (1) We added a discussion of the work by Wang et al ’18 in Result Section 2 to point out the similarity of their findings in the principal component space with ours in the x_rev and x_choice space. We commented that such network dynamics could emerge when learning to perform the reversal learning task, regardless of the training scheme.

      We also mention that the RL approach in Wang et al ’18 does not consider within-trial dynamics, therefore lacks the non-stationary activity observed during the trial in the PFC of monkeys and our trained RNNs.

      (2) We revised our original manuscript substantially to reconcile the line attractor model with the non-stationary activity observed during a trial.

      Here are the highlights of the revised interpretation of the PFC and the RNN network activity

      - The dynamics of x_rev consists of two activity modes, i.e., stationary activity at the start of a trial and non-stationary activity during the trial. A schematic of the augmented model that reconciles the two activity modes is shown in Figure 4A. Analyses of the time derivative (dx_rev / dt) and of the contractivity of the stationary state are shown in Figure 4B,C to demonstrate the two activity modes.

      - We discuss in Result Section 4 main text that the stationary activity is consistent with the line attractor model, but the non-stationary activity deviates from the model. 

      - The two activity modes are linked dynamically. There is an underlying dynamics that can map the stationary state to the non-stationary trajectory. This is shown by predicting the non-stationary trajectory with the stationary state using a support vector regression model. The prediction results are shown in Figure 4D,E,F.

      - We discuss in Result Section 4 an extension of the standard line attractor model: points on the line attractor can serve as initial states that launch non-stationary activity associated with task-related behavior.

      - The separability of neural trajectories presented in Result Section 5 is framed as a property of the non-stationary dynamics associated with task-related behavior.

      To strengthen their claims, the authors should:

      (1) Provide a more detailed description of their RNN training paradigm and task structure, including clear illustrations of target outputs.

      (2) Discuss how their findings relate to and potentially extend previous work on similar tasks, particularly addressing the similarities and differences with the v-shaped state organization observed in reinforcement learning contexts. (https://www.nature.com/articles/s41593-018-0147-8 Figure1).

      (3) Explore whether their results could be consistent with a curved line attractor model, rather than treating line attractors and dynamic trajectories as mutually exclusive alternatives.

      Our response to these three comments is described above.

      Addressing these points would significantly enhance the impact of the study and provide a more nuanced understanding of how reversal probabilities are represented in neural circuits.

      In conclusion, while this study provides interesting insights into the neural representation of reversal probability, there are several areas where the methodology and interpretations could be refined.

      Additional Minor Concerns:

      (1) Network Training and Reversal Timing: The authors mention that the network was trained to switch after a reversal to match animal behavior, stating "Maximum a Posterior (MAP) of the reversal probability converges a few trials past the MAP estimate." More explanation of how this training strategy relates to actual animal behavior would enhance the reader's understanding of the meaning of the model's similarity to animal behavior in Figure 1.

      In Method Section 2.5, we described how our observation that the running estimate of MAP converges a few trials after the actual MAP is analogous to the animal’s reversal behavior.

      “This observation can be interpreted as follows. If a subject performing the reversal learning task employs the ideal observer model to detect the trial at which reward schedule is reversed, the subject can infer the reversal of reward schedule a few trials past the actual reversal and then switch its preferred choice. This delay in behavioral reversal, relative to the reversal of reward schedule, is analogous to the monkeys switching their preferred choice a few trials after the reversal of reward schedule.”

      In Step 4, we also mentioned that the target choice outputs are defined based on our observation in Step 3.

      “We used the observation from Step 3 to define target choice outputs that switch abruptly a few trials after the reversal of reward schedule, denoted as $t^*$ in the following. An example of target outputs are shown in Fig.\,\ref{fig_behavior}B.”

      (2) How is the network simulated in step 1 of training? Is it just randomly initialized? What defines this network structure?

      The initial state at the start of a block was random. We think the initial state is less relevant as the external inputs (i.e., cue and feedback) are strong and drive the network dynamics. We mentioned this setup and observation in Step 1 of training.

      “Step 1. Simulate the network starting from a random initial state, apply the external inputs, i.e., cue and feedback inputs, at each trial and store the network choices and reward outcomes at all the trials in a block. The network dynamics is driven by the external inputs applied periodically over the trials.”

      (3) Clarification on Learning Approach: More description of the approach in the main text would be beneficial. The statement "Here, we trained RNNs that learned from a Bayesian inference model to mimic the behavioral strategies of monkeys performing the reversal learning task [2, 4]" is somewhat confusing, as the model isn't directly fit to monkey data. A more detailed explanation of how the Bayesian inference model relates to monkey behavior and how it's used in RNN training would improve clarity.

      We described the learning approach in more detail, but also tried to be concise without going into technical details.

      We revised the sentence in Introduction as follows:

      “We sought to train RNNs to mimic the behavioral strategies of monkeys performing the reversal learning task. Previous studies \cite{costa2015reversal, bartolo2020prefrontal} have shown that a Bayesian inference model can capture a key aspect of the monkey's behavioral strategy, i.e., adhere to the preferred choice until the reversal of reward is detected and then switch abruptly. We trained the RNNs to replicate this behavioral strategy by training them on target behaviors generated from the Bayesian model.”

      We also added a paragraph in Result Section 1 that explains in detail how the training approach works.

      (4) In Figure 1B, it would be helpful to show the target output.

      We added a figure in Fig1B that shows a schematic of how the target output is generated.

      (5) An important point to consider is that a line attractor can be curved while still being topologically equivalent to a line. This nuance makes Figure 4A somewhat difficult to interpret. It might be helpful to discuss how the observed dynamics relate to potentially curved line attractors, which could provide a more nuanced understanding of the neural representations.

      As discussed above, we interpret the “curved” activity during the trial as non-stationary activity. We do not think this non-stationary activity would be characterized as an attractor. An attractor is (1) a minimal set of states that is (2) invariant under the dynamics and (3) attracting when perturbed into its neighborhood [Strogatz, Nonlinear dynamics and chaos]. If we consider the autonomous system without the behavior-related external input as the base system, then the non-stationary states could satisfy (2) and (3) but not (1), so they are not part of the attractor. If we include the behavior-related external input in the autonomous dynamics, then it may be possible that the non-stationary trajectories are part of the attractor. We adopted the former interpretation as the behavior-related inputs are external and transient.

      (6) The results of the perturbation experiments seem to follow necessarily from the way x_rev was defined. It would be valuable to clarify if there's more to these results than what appears to be a direct consequence of the definition, or if there are subtleties in the experimental design or analysis that aren't immediately apparent.

      The neural activity x_rev is correlated to the reversal probability, but it is unclear if the activity in this neural subspace is causally linked to behavioral variables, such as choice output. We added this explanation at the beginning of Results Section 7 to clarify the reason for performing the perturbation experiments.

      “The neural activity $x_{rev}$ is obtained by identifying a neural subspace correlated to reversal probability. However, it remains to be shown if activity within this neural subspace is causally linked to behavioral variables, such as choice output.”

      Reviewer #2:

      Below is a list of things I have found difficult to understand, and been puzzled/concerned about while reading the manuscript:

      (1) It would be nice to say a bit more about the dataset that has been used for PFC analysis, e.g. number of neurons used and in what conditions is Figure 2A obtained (one has to go to supplementary to get the reference).

      We added information about the PFC dataset in the opening paragraph of Result Section 2 to provide an overview of what type of neural data we’ve analyzed. It includes information about the number of recorded neurons, recording method and spike binning process.

      (2) It would be nice to give more detail about the monkey task and better explain its trial structure.

      In Result Section 1 we added a description of the overall task structure (and its differences from other versions of the reversal learning task), the RNN / monkey trial structure and the differences between the RNN and monkey tasks.

      (3) In the introduction it is mentioned that during the hold period, the probability of reversal is represented. Where does this statement come from?

      The fact that neural activity during a hold period, i.e., fixation period before presenting the target images, encodes the probability of reversal was demonstrated in a previous study (Bartolo and Averbeck ’20). 

      We realize that our intention was to state that, during the hold period, the reversal probability activity is stationary as in the line attractor model, rather than to emphasize that the probability of reversal is represented during this period. We revised the sentence to convey this message. In addition, we revised the entire paragraph to reinterpret our findings: there are two activity modes, where the stationary activity is consistent with the line attractor model but the non-stationary activity deviates from it.

      (4) "Around the behavioral reversal trial, reversal probabilities were represented by a family of rank-ordered trajectories that shifted monotonically". This sentence is confusing and hard to understand.

      Thank you for pointing this out. We rewrote the paragraph to reflect our revised interpretation. This sentence was removed, as it can be considered part of the result on separable trajectories.

      (5) For clarity, in the first section, when it is written that "The reversal behavior of trained RNNs was similar to the monkey's behavior on the same task" it would be nice to be more precise, that this is to be expected given the strategy used to train the network.

      We removed this sentence as it makes a blanket statement. Instead, we compared the behavioral outputs of the RNNs and the monkeys one by one.

      We added a sentence in Result Section 1 that the RNN’s abrupt behavioral reversal is expected as they are trained to mimic the target choice outputs of the Bayesian model.

      “Such abrupt reversal behavior was expected as the RNNs were trained to mimic the target outputs of the Bayesian inference model.”

      (6) What is the value of tau used in eq (1), and how does it compare to trial duration?

      We described the value of time constant tau in Eq (1) and also discussed in Result Section 1 that tau=20ms is much faster than trial duration 500ms, thus the persistent behavior seen in trained RNNs is due to learning.
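
      The timescale argument in this response can be checked with a short calculation. It is a sketch under one assumption not stated here: that, without input, a unit in Eq (1) relaxes roughly as exp(-t / tau), as in a standard leaky rate model. The values of tau and trial duration are those quoted above.

```python
import math

tau_ms = 20.0     # time constant quoted in the response
trial_ms = 500.0  # trial duration quoted in the response

# Assumption: with no input, a leaky unit decays as exp(-t / tau).
residual = math.exp(-trial_ms / tau_ms)

print(f"trial duration / tau = {trial_ms / tau_ms:.0f} time constants")
print(f"fraction of activity left after one trial = {residual:.2e}")
```

      Under this assumption, passive decay alone leaves essentially nothing after a single trial, so any activity that persists across trials has to come from the learned recurrent dynamics, in line with the authors' argument.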

      (7) It would be nice to expand around the notion of « temporally flexible representation » to help readers grasp what this means.

      Instead of stating that the separable dynamic trajectories have “temporally flexible representation”, we break down in what sense it is temporally flexible: separable dynamic trajectories can accommodate the effects that task-related behavior has on generating non-stationary neural dynamics.

      “In sum, our results show that, in a probabilistic reversal learning task, recurrent neural networks encode reversal probability by adopting, not only stationary states as in a line attractor, but also separable dynamic trajectories that can represent distinct probabilistic values while accommodating non-stationary dynamics associated with task-related behavior.”

      Reviewer #3:

      (1) Data:

      It would be useful to describe the experimental task, recording setup, and analyses in much more detail - both in the text and in the methods. What part of PFC are the recordings from? How many neurons were recorded over how many sessions? Which other papers have they been used in? All of these things are important for the reader to know, but are not listed anywhere. There are also some inconsistencies, with the main text e.g. listing the 'typical block length' as 36 trials, and the methods listing the block length as 24 trials (if this is a difference between the biological data and RNN, that should be more explicit and motivated).

      We provided a more detailed description of the monkey experimental task and PFC recordings in Result Section 1. We also added a new section in Methods 2.1 to describe the monkey experiment.

      The experimental analyses should be explained in more detail in the methods. There is e.g. no detailed description of the analysis in Figure 6F.

      We added a new section in Methods 6 to describe how the residual PFC activity is computed. It also describes the RNN perturbation experiments.

      Finally, it would be useful for more analyses of monkey behaviour and performance, either in the main text or supplementary figures.

      We did not pursue this comment as it is unclear how additional behavioral analyses would improve the manuscript.

      (2) Model:

      When fitting the network, 'step 1' of training in 2.3 seems superfluous. The posterior update from getting a reward at A is the same as that from not getting a reward at B (and vice versa), and it is therefore completely independent of the network choice. The reversal trial can therefore be inferred without ever simulating the network, simply by generating a sample of which trials have the 'good' option being rewarded and which trials have the 'bad' option being rewarded.

      We respectfully disagree with Reviewer 3’s comment that the reversal trial can be inferred without ever simulating the network. The only way for the network to know about the underlying reward schedule is to perform the task by itself. By simulating the network, it can sample the options and the reward outcomes. 

      Our understanding is that Reviewer 3 described a strategy that a human would use to perform this task. Our goal was to train the RNN to perform the task.

      Do the blocks always start with choice A being optimal? Is everything similar if the network is trained with a variable initial rewarded option? E.g. in Fig 6, would you see the appropriate swap in the effect of the perturbation on choice probability if choice B was initially optimal?

      Thank you for pointing out that the initial high-value option can be random. When setting up the reward schedule, the initial high-value option was chosen randomly from two choice outputs and, at the scheduled reversal, it was switched to the other option. We did not describe this in the original manuscript.

      We added a description in Training Scheme Step 4 that the initial high-value option is selected randomly. This is also explained in Result Section 1 when we give an overview of the RNN training procedure.

      (3) Content:

      It is rarely explained what the error bars represent (e.g. Figures 3B, 4C, ...) - this should be clear in all figures.

      We added that the error bars represent the standard error of the mean.

      Figure 2A: this colour scheme is not great. There are abrupt colour changes both before and after the 'reversal' trial, and both of the extremes are hard to see.

      We changed the color scheme to contrast pre- and post-reversal trials without the abrupt color change.

      Figure 3E/F: how is prediction accuracy defined?

      We added that the prediction accuracy is based on Pearson correlation.
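
      For readers unfamiliar with the measure, a minimal sketch of a Pearson-correlation accuracy score follows. The response does not say which quantities are correlated; the sketch assumes the score compares predicted values with actual values, and the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

      A score of 1 means the predictions track the actual values perfectly up to a positive linear rescaling, 0 means no linear relationship, and negative values mean the predictions move in the opposite direction.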

      Figure 4B: why focus on the derivative of the dynamics? The subsequent plots looking at the actual trajectories are much easier to understand. Also - what is 'relative trial' relative to?

      The derivative was analyzed to demonstrate stationarity or non-stationarity of the neural activity. We think it will be clearer in the revised manuscript that the derivative allows us to characterize those two activity modes.

      Relative trial number indicates the trial position relative to the behavioral reversal trial. We added this description to the figures when “relative trial” is used.

      Figure 4C: what do these analyses look like if you match the trial numbers for the shift in trajectories? As it is now, there will presumably be more rewarded trials early and late in each block, and more unrewarded trials around the reversal point. Does this introduce biases in the analysis? A related question is (i) why the black lines are different in the top and bottom plots, and (ii) why the ends of the black lines are discontinuous with the beginnings of the red/blue lines.

      We could not understand what Reviewer 3 was asking in this comment. It’d help if Reviewer 3 could clarify the following question:

      “Figure 4C: what do these analyses look like if you match the trial numbers for the shift in trajectories?”

      Question (i): We wanted to look at how the trajectory shifts in the subsequent trial if a reward is or is not received in the current trial. The top panel analyzed all the trials in which the subsequent trial did not receive a reward. The bottom panel analyzed all the trials in which the subsequent trial received a reward. So, the trials analyzed in the top and bottom panels are different, and the black lines (x_rev of “current” trial) in the top and bottom panels are different.

      Question (ii): Black line is from the preceding trial of the red/blue lines, so if trials are designed to be continuous with the inter-trial-interval, then black and red/blue should be continuous. However, in the monkey experiment, the inter-trial-intervals were variable, so the end of current trial does not match with the start of next trial. The neural trajectories presented in the manuscript did not include the activity in this inter-trial-interval.

      Figure 6C: are the individual dots different RNNs? Claiming that there is a decrease in Delta x_choice for a v_+ stimulation is very misleading.

      Yes, individual dots are different RNN perturbations. We added an explanation about the dots in the Figure 7C caption.

      We agree with the comment that \Delta x_choice did not decrease. This sentence was removed. Instead, we revised the manuscript to state that x_choice for v_+ stimulation was smaller than the x_choice for v_- stimulation. We performed a KS-test to confirm statistical significance.
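
      As a rough illustration of this comparison, the sketch below computes a two-sample Kolmogorov-Smirnov statistic, i.e. the largest gap between the empirical distribution functions of two samples. That the authors used the two-sample form is an assumption, and the function and argument names are hypothetical. The sketch returns only the statistic; a p-value would normally come from a library routine such as `scipy.stats.ks_2samp`.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The gap between step functions can only change at observed values.
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

      Here the two samples would be the x_choice values under v_+ and v_- stimulation. A statistic near 1 indicates almost non-overlapping distributions, and a statistic near 0 indicates nearly identical ones.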

      Discussion: "...exhibited behaviour consistent with an ideal Bayesian observer, as found in our study". The RNN was explicitly trained to reproduce an ideal Bayesian observer, so this can only really be considered an assumption (not a result) in the present study.

      We agree that the statement in the original manuscript is inaccurate. It was revised to reflect that, in the other study, behavior outputs similar to a Bayesian observer emerged by simply learning to do the task, instead of directly mimicking the outputs of a Bayesian observer as done in our study.

      “Authors showed that trained RNNs exhibited behavior outputs consistent with an ideal Bayesian observer without explicitly learning from the Bayesian observer. This finding shows that the behavioral strategies of monkeys could emerge by simply learning to do the task, instead of directly mimicking the outputs of Bayesian observer as done in our study.”

      Methods: Would the results differ if your Bayesian observer model used the true prior (i.e. the reversal happens in the middle 10 trials) rather than a uniform prior? Given the extensive literature on prior effects on animal behaviour, it is reasonable to expect that monkeys incorporate some non-uniform prior over the reversal point.

      Thank you for pointing out the non-uniform prior. We haven’t conducted this analysis, but would guess that the convergence to the posterior distribution would be faster. We’d have to perform further analysis, which is out of the scope of this paper, to investigate whether the posterior distribution would be different from what we obtained from the uniform prior.

      Making the code available would make the work more transparent and useful to the community.

      The code is available in the following Github repository: https://github.com/chrismkkim/LearnToReverse

    1. Reviewer #2 (Public review):

      Summary:

      The authors apply the recently developed VARX model, which explicitly models intrinsic dynamics and the effect of extrinsic inputs, to simulated data and intracranial EEG recordings. This method provides a directed method of 'intrinsic connectivity'. They argue this model is better suited to the analysis of task neuroimaging data because it separates the intrinsic and extrinsic activity. They show: that intrinsic connectivity is largely unaltered during a movie-watching task compared to eyes open rest; intrinsic noise is reduced in the task; and there is intrinsic directed connectivity from sensory to higher-order brain areas.

      Strengths:

      (1) The paper tackles an important issue with an appropriate method.

      (2) The authors validated their method on data simulated with a neural mass model.

      (3) They use intracranial EEG, which provides a direct measure of neuronal activity.

      (4) Code is made publicly available and the paper is written well.

      Comments on revisions:

      The authors have addressed my comments.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors have used full-length single-cell sequencing on a sorted population of human fetal retina to delineate expression patterns associated with the progression of progenitors to rod and cone photoreceptors. They find that rod and cone precursors contain a mix of rod/cone determinants, with a bias in both amounts and isoform balance likely deciding the ultimate cell fate. Markers of early rod/cone hybrids are clarified, and a gradient of lncRNAs is uncovered in maturing cones. Comparison of early rods and cones exposes an enriched MYCN regulon, as well as expression of SYK, which may contribute to tumor initiation in RB1 deficient cone precursors.

      Strengths:

      (1) The insight into how cone and rod transcripts are mixed together at first is important and clarifies a long-standing notion in the field.

      (2) The discovery of distinct active vs inactive mRNA isoforms for rod and cone determinants is crucial to understanding how cells make the decision to form one or the other cell type. This is only really possible with full-length scRNAseq analysis.

      (3) New markers of subpopulations are also uncovered, such as CHRNA1 in rod/cone hybrids that seem to give rise to either rods or cones.

      (4) Regulon analyses provide insight into key transcription factor programs linked to rod or cone fates.

      (5) The gradient of lncRNAs in maturing cones is novel, and while the functional significance is unclear, it opens up a new line of questioning around photoreceptor maturation.

      (6) The finding that SYK mRNA is naturally expressed in cone precursors is novel, as previously it was assumed that SYK expression required epigenetic rewiring in tumors.

      We thank the reviewer for describing the study’s strengths, reflecting the major conclusions of the initially submitted manuscript. However, based on new analyses, including the requested analyses of other scRNA-seq datasets, our revision clarifies that:

      -  related to point (1), cone and rod transcripts do not appear to be mixed together at first (i.e., in immediately post-mitotic immature cone and rod precursors) but appear to be coexpressed in subsequent cone and rod precursor stages; and 

      - related to point (3), CHRNA1 appears to mark immature cone precursors that are distinct from the maturing cone and rod precursors that co-express cone- and rod-related RNAs (despite the similar UMAP positions of the two populations in our dataset). 

      Weaknesses:

      (1) The writing is very difficult to follow. The nomenclature is confusing and there are contradictory statements that need to be clarified.

      (2) The drug data is not enough to conclude that SYK inhibition is sufficient to prevent the division of RB1 null cone precursors. Drugs are never completely specific so validation is critical to make the conclusion drawn in the paper.

      We thank the reviewer for noting these important issues. Accordingly, in the revised manuscript:

      (1) We improve the writing and clarify the nomenclature and contradictory statements, particularly those noted in the Reviewer’s Recommendations for Authors. 

      (2) We scale back claims related to the role of SYK in the cone precursor response to RB1 loss, with wording changes in the Abstract, Results, and Discussion, which now recognize that the inhibitor studies only support the possibility that cone-intrinsic SYK expression contributes to retinoblastoma initiation, as detailed in our responses to Reviewer’s Recommendations for Authors. We agree and now mention that genetic perturbation of SYK is required to prove its role.  

      Reviewer #2 (Public review):

      Summary:

      The authors used deep full-length single-cell sequencing to study human photoreceptor development, with a particular emphasis on the characteristics of photoreceptors that may contribute to retinoblastoma.

      Strengths:

      This single-cell study captures gene regulation in photoreceptors across different developmental stages, defining post-mitotic cone and rod populations by highlighting their unique gene expression profiles through analyses such as RNA velocity and SCENIC. By leveraging full-length sequencing data, the study identifies differentially expressed isoforms of NRL and THRB in L/M cone and rod precursors, illustrating the dynamic gene regulation involved in photoreceptor fate commitment. Additionally, the authors performed high-resolution clustering to explore markers defining developing photoreceptors across the fovea and peripheral retina, particularly characterizing SYK's role in the proliferative response of cones in the RB loss background. The study provides an in-depth analysis of developing human photoreceptors, with the authors conducting thorough analyses using full-length single-cell RNA sequencing. The strength of the study lies in its design, which integrates single-cell full-length RNA-seq, long-read RNA-seq, and follow-up histological and functional experiments to provide compelling evidence supporting their conclusions. The model of cell type-dependent splicing for NRL and THRB is particularly intriguing. Moreover, the potential involvement of the SYK and MYC pathways with RB in cone progenitor cells aligns with previous literature, offering additional insights into RB development.

      We thank the reviewer for summarizing the main findings and noting the compelling support for the conclusions, the intriguing cell type-dependent splicing of rod and cone lineage factors, and the insights into retinoblastoma development.  

      Weaknesses:

      The manuscript feels somewhat unfocused, with a lack of a strong connection between the analysis of developing photoreceptors, which constitutes the bulk of the manuscript, and the discussion on retinoblastoma. Additionally, given the recent publication of several single-cell studies on the developing human retina, it is important for the authors to cross-validate their findings and adjust their statements where appropriate.

      We agree that the manuscript covers a range of topics resulting from the full-length scRNAseq analyses and concur that some studies of developing photoreceptors were not well connected to retinoblastoma. However, we also note that the connection to retinoblastoma is emphasized in several places in the Introduction and throughout the manuscript and was a significant motivation for pursuing the analyses. We suggest that it was valuable to highlight how deep, full-length scRNA-seq of developing retina provides insights into retinoblastoma, including i) the similar biased expression of NRL transcript isoforms in cone precursors and RB tumors, ii) the cone precursors’ co-expression of rod- and cone-related genes such as NR2E3 and GNAT2, which may explain similar co-expression in RB cells, and iii) the expression of SYK in early cones and RB cells. While the earlier version had mainly highlighted point (iii), the revised Discussion further refers to points (i) and (ii) as described further in the response to the Reviewer’s Recommendations for Authors.

      We address the Reviewer’s request to cross-validate our findings with those of other single-cell studies of developing human retina by relating the different photoreceptor-related cell populations identified in our study to those characterized by Zuo et al (PMID 39117640), which was specifically highlighted by the reviewer and is especially useful for such cross-validation given the extraordinarily large ~ 220,000 cell dataset covering a wide range of retinal ages (pcw 8–23) and spatiotemporally stratified by macular or peripheral retina location. Relevant analyses of the Zuo et al dataset are shown in Supplementary Figures S3G-H, S10B, S11A-F, and S13A,B. 

      Reviewer #3 (Public review):

      Summary:

      The authors use high-depth, full-length scRNA-Seq analysis of fetal human retina to identify novel regulators of photoreceptor specification and retinoblastoma progression.

      Strengths:

      The use of high-depth, full-length scRNA-Seq to identify functionally important alternatively spliced variants of transcription factors controlling photoreceptor subtype specification, and identification of SYK as a potential mediator of RB1-dependent cell cycle reentry in immature cone photoreceptors.

      Human developing fetal retinal tissue samples were collected between 13-19 gestational weeks and this provides a substantially higher depth of sequencing coverage, thereby identifying both rare transcripts and alternative splice forms, and thereby representing an important advance over previous droplet-based scRNA-Seq studies of human retinal development.

      Weaknesses:

      The weaknesses identified are relatively minor. This is a technically strong and thorough study, that is broadly useful to investigators studying retinal development and retinoblastoma.

      We thank the reviewer for describing the strengths of the study. Our revision addresses the concerns raised separately in the Reviewer’s Recommendations for Authors, as detailed in the responses below.  

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers have completed their reviews. Generally, they note that your work is important and that the evidence is generally convincing. The reviewers are in general agreement that the paper adds to the field. The findings of rod/cone fate determination at a very early stage are intriguing. Generally, the paper would benefit from clarifications in the writing and figures. Experimentally, the paper would benefit from validation of the drug data, for example using RNAi or another assay. Alternatively, the authors could note the caveats of the drug experiments and describe how they could be improved. In terms of analysis, the paper would be improved by additional comparisons of the authors' data to previously published datasets.

      We thank the reviewing editor for this summary. As described in the individual reviewer responses, we clarify the writing and figures and provide comparisons to previously published datasets (in particular, the large snRNA-seq dataset of Zuo et al., 2024 (PMID 39117640)). With regard to the drug (i.e., SYK inhibitor) studies, we opted to provide caveats and describe the need for genetic approaches to validate the role of SYK, owing to the infeasibility of completing genetic perturbation experiments in the appropriate timeframe. We are grateful for the opportunity to present our findings with appropriate caveats.

      Reviewer #1 (Recommendations for the authors):

      Shayler et al. cell-sort human progenitor/rod/cone populations and then perform full-length single-cell RNA-seq to expose features that distinguish paths towards rods or cones. They initially distinguish progenitors (RPCs), immature photoreceptor precursors (iPRPs), long/medium wavelength (LM) cones, late-LM cones, short wavelength (S) cones, early rods (ER) and late rods (LR), which exhibit distinct transcription factor regulons (Figures 1, 2). These data expose expected and novel enriched genes, and support the notion that S cones are a default state lacking expression of rod (NRL) or cone (THRB) determinants but retaining expression of generic photoreceptor drivers (CRX/OTX2/NEUROD1 regulons). They identify changes in regulon activity, such as increasing NRL activity from iPRP to ER to LR, but decreasing from iPRP to cones, or increasing RAX/ISL2/THRB regulon activity from iPRP to LM cones, but decreasing from iPRP to S cones or rods.

      They report co-expression of rod/cone determinants in LM and ER clusters, and the ratios are in the expected directions (NRLTHRB or RXRG in ER). A novel insight from the FL seq is that there are differing variants generated in each cell population. Full-length NRL (FL-NRL) predominates in the rod path, whereas truncated NRL (Tr-NRL) does so in the cone path, then similar (but opposite) findings are presented for THRB (Fig 3, 4), whereas isoforms are not a feature of RXRG expression, just the higher expression in cones.

      The authors then further subcluster and perform RNA velocity to uncover decision points in the tree (Figure 5). They identify two photoreceptor precursor streams, the Transitional Rods (TRs) that provide one source for rod maturation and (reusing the name from the initial clustering) iPRPs that form cones, but also provide a second route to rods. TR cells closest to RPCs (immediately post-mitotic) have higher levels of the rod determinant NR2E3 and NRL, whereas the higher resolution iPRPs near RPCs lack NR2E3 and have higher levels of ONECUT1, THRB, and GNAT2, a cone bias. These distinct rod-biased TR and cone-biased high-resolution iPRPs were not evident in published scRNAseq with 3′ end-counting (i.e. not FL seq). Regulon analysis confirmed higher NRL activity in TR cells, with higher THRB activity in high-resolution iPRP cells.

      Many of the more mature high-resolution iPRPs show combinations of rod (GNAT1, NR2E3) and cone (GNAT2, THRB) paths as well as both NRL and THRB regulons, but with a bias towards cone-ness (Figure 6). Combined FISH/immunofluorescence in fetal retina uncovers cone-biased RXRG-protein-high/NR2E3-protein-absent cone-fated cells that nevertheless expressed NR2E3 mRNA. Thus early cone-biased iPRP cells express rod gene mRNA, implying a rod-cone hybrid in early photoreceptor development. The authors refer to these as "bridge region iPRP cells".

      In Figure 7, they identify CHRNA1 as the most specific marker of these bridge cells (overlapping with ATOH7 and DLL3, previously linked to cone-biased precursors), and FISH shows it is expressed in rod-biased NRL protein-positive and cone-biased RXRG protein-positive cones at fetal week 12.

      Figure 8 outlines the graded expression of various lncRNAs during cone maturation, a novel pattern.

      Finally (Figure 9), the authors identify differential genes expressed in early rods (ER cluster from Figure 1) vs early cones (LM cluster, excluding the most mature opsin+ cells), revealing high levels of MYCN targets in cones. They also find SYK expression in cones. SYK was previously linked to retinoblastoma, so intrinsic expression may predispose cone precursors to transformation upon RB loss. They finish by showing that a SYK inhibitor blocks the proliferation of dividing RB1 knockdown cone precursors in the human fetal retina.

      Overall, the authors have uncovered interesting patterns of biased expression in cone/rod developmental paths, especially relating to the isoform differences for NRL and THRB which add a new layer to our understanding of this fate choice. The analyses also imply that very soon after RPCs exit the cell cycle, they generate post-mitotic precursors biased towards a rod or cone fate, that carry varying proportions of mixed rod/cone determinants and other rod/cone marker genes. They also introduce new markers that may tag key populations of cells that precede the final rod/cone choice (e.g. CHRNA1), catalogue a new lncRNA gradient in cone maturation, and provide insight into potential genes that may contribute to retinoblastoma initiation, like SYK, due to intrinsic expression in cone precursors. However, as detailed below, the text needs to be improved considerably, and overinterpretations need to be moderated, removed, or tested more rigorously with extra data.

      Major Comments

      The manuscript is very difficult to follow. The nomenclature is at times torturous, and the description of hybrid rod/cone cells is confusing in many aspects.

      (1) A single term, iPRP, is used to refer to an initial low-resolution cluster, and then to a subset of that cluster later in the paper.

      We agree that using immature photoreceptor precursor (iPRP) for both high-resolution and low-resolution clusters was confusing. We kept this name for the low-resolution cluster (which includes both immature cone and immature rod precursors), renamed the high-resolution iPRP cluster immature cone precursors (iCPs), and renamed their transitional rod (TR) counterparts immature rod precursors (iRPs). These designations are based on:

      - the biased expression of THRB, ONECUT1, and the THRB regulon in iCPs (Fig. 5D,E);

      - the biased expression of NRL, NR2E3, and the NRL regulon in iRPs (Fig. 5D,E);

      - the partially distinct iCP and iRP UMAP positions (Figure 5C); and 

      - the evidence of similar immature cone versus rod precursor populations in the Zuo et al 3’ snRNA-seq dataset, as noted below and described in two new paragraphs starting at the bottom of p. 12.

      (2) To complicate matters further, the reader needs to understand the subset within the iPRP referred to as bridge cells, and we are told at one point that the earliest iPRPs lack NR2E3, then that they later co-express NR2E3, and while the authors may be referring to protein and RNA, it serves to further confuse an already difficult to follow distinction. I had to read and re-read the iPRP data many times, but it never really became totally clear.

      We agree that the description of the high-resolution iPRP (now “iCP”) subsets was unclear, although our further analyses of a large 3’ snRNA-seq dataset in Figure S11 support the impression given in the original manuscript that the earliest iCPs lack NR2E3 and then later co-express NR2E3 while the earliest iRPs lack THRB and then later express THRB. As described in new text in the Two post-mitotic immature photoreceptor precursor populations section (starting on line 7 of p. 13):

      When considering only the main cone and rod precursor UMAP regions, early (pcw 8 – 13) cone precursors expressed THRB and lacked NR2E3 (Figure S11D,E, blue arrows), while early (pcw 10 – 15) rod precursors expressed NR2E3 and lacked THRB (Figure S11D,E, red arrows), similar to RPC-localized iCPs and iRPs in our study (Figure 5D).

      Next, as summarized in new text in the Early cone and rod precursors with rod- and cone-related RNA co-expression section (new paragraph at top of p. 16):

      Thus, a 3’ snRNA-seq analysis confirmed the initial production of immature photoreceptor precursors with either L/M cone-precursor-specific THRB or rod-precursor-specific NR2E3 expression, followed by lower-level co-expression of their counterparts, NR2E3 in cone precursors and THRB in rod precursors. However, in the Zuo et al. analyses, the co-expression was first observed in well-separated UMAP regions, as opposed to a region that bridges the early cone and early rod populations in our UMAP plots. These findings are consistent with the notion that cone- and rod-related RNA co-expression begins in already fate-determined cone and rod precursors, and that such precursors aberrantly intermixed in our UMAP bridge region due to their insufficient representation in our dataset.  

      Importantly, and as noted in our ‘Public response’ to Reviewer 1, “CHRNA1 appears to mark immature cone precursors that are distinct from the maturing cone and rod precursors that co-express cone- and rod-related RNAs (despite the similar UMAP positions of the two populations in our dataset).” In support of this notion, the immature cone precursors expressing CHRNA1 and other populations did not overlap in UMAP space in the Zuo et al. dataset. We hope the new text cited above along with other changes will significantly clarify the observations.

      (3) The term "cone/rod precursor" shows up late in the paper (page 12), but it was clear (was it not?) much earlier in this manuscript that cone and rod genes are co-expressed because of the co-expressed NRL and THRB isoforms in Figures 3/4.

      We thank the reviewer for noting that the differential NRL and THRB isoform expression already implies that cone and rod genes are co-expressed. However, as we now state, the co-expression of RNAs encoding an additional cone marker (GNAT2) and rod markers (GNAT1, NR2E3) was 

      “suggestive of a proposed hybrid cone/rod precursor state more extensive than implied by the co-expression of different THRB and NRL isoforms” (first paragraph of “Early cone and rod …” section on p. 14; new text underlined).

      (4) The (incorrect) impression given later in the manuscript is that the rod/cone transcript mixture applies to just a subset of the iPRP cells, or maybe just the bridge cells (writing is not clear), but actually, neither of those is correct as the more abundant and more mature LM and ER populations analyzed earlier co-express NRL and THRB mRNAs (Figures 2, 3). Overall, the authors need to vastly improve the writing, simplify/clarify the nomenclature, and better label figures to match the text and help the reader follow more easily and clearly. As it stands, it is, at best, obtuse, and at worst, totally confusing.

      We thank the reviewer for bringing the extent of the confusing terminology and wording to our attention. We revised the terminology (as in our response to point 1) and extensively revised the text. We also performed similar analyses of the Zuo et al. data (as described in more detail in our response to Reviewer 2), which clarify the distinct status of cells with the “rod/cone transcript mixture” and cells co-expressing early cone and rod precursor markers.

      To more clearly describe data related to cells with rod- and cone-related RNA co-expression, we divided the former Figure 6 into two figures, with Figure 6 now showing the cone- and rod-related RNA co-expression inferred from scRNA-seq and Figure 7 showing GNAT2 and NR2E3 co-expression in FISH analyses of human retina plus a new schematic in the new panel 7E.

      To separate the conceptually distinct analyses of cone and rod related RNA co-expression and the expression of early photoreceptor precursor markers (which were both found in the so-called bridge region – yet now recognized to be different subpopulations), we separated the analyses of the early photoreceptor precursor markers to form a new section, “Developmental expression of photoreceptor precursor markers and fate determinants,” starting on p. 16. 

      Additionally, we further review the findings and their implications in four revised Discussion paragraphs (starting at the bottom of p. 23).

      (5) The data showing that overexpressing Tr-NRL in murine NIH3T3 fibroblasts blocks FL-NRL function is presented at the end of page 7 and in Figure 3G. Subsequent analysis two paragraphs and two figures later (end page 8, Figure 5C + supp figs) reveals that Tr-NRL protein is not detectable in retinoblastoma cells, which derive from cone precursor cells and express Tr-NRL mRNA, and the protein is also not detected upon lentiviral expression of Tr-NRL in human fetal retinal explants, suggesting it is unstable or not translated. It would be preferable to have the 3T3 data and retinoblastoma/explant data juxtaposed. E.g. they could present the latter, then show with the 3T3 data that even if it were expressed (e.g. briefly) it would interfere with FL-NRL. The current order and spacing are somewhat confusing.

      We thank the reviewer for this suggestion and moved the description of the luciferase assays to follow the retinoblastoma and explant data and switched the order of Figure panels 3G and 3H.  

      (6) On page 15, regarding early rod vs early cone gene expression, the authors state: "although MYCN mRNA was not detected....", yet on the volcano plot in Figure S14A MYCN is one of the marked genes that is higher in cones than rods, meaning it was detected, and a couple of sentences later: "Concordantly, the LM cluster had increased MYCN RNA". The text is thus confusing.

      With respect, we note that the original text read, “although MYC RNA was not detected,” which related to a statement in the previous sentence that the gene ontology analysis identified “MYC targets.” However, given that this distinction is subtle and may be difficult for readers to recognize, we revised the text (now on p. 19) to more clearly describe expression of MYCN (but not MYC) as follows:

      “The upregulation of MYC target genes was of interest given that many MYC target genes are also targets of MYCN, that MYCN protein is highly expressed in maturing (ARR3+) cone precursors but not in NRL+ rods (Figure 10A), and that MYCN is critical to the cone precursor proliferative response to pRB loss[8–10]. Indeed, whereas MYC RNA was not detected, the LM cone cluster had increased MYCN RNA …”

      (7) The authors state that the SYK drug is "highly specific". They provide no evidence, but no drug is 100% specific, and it is possible that off-target hits are important for the drug phenotype. This data should be removed or validated by co-targeting the SYK gene along with RB1.

      We agree that our data only show the potential for SYK to contribute to the cone proliferative response; however, we believe the inhibitor study retains value in that a negative result (no effect of the SYK inhibitor) would disprove its potential involvement. To reflect this, we changed wording related to this experiment as follows:

      In the Abstract, we changed:

      (1) “SYK, which contributed to the early cone precursors’ proliferative response to RB1 loss” To: “SYK, which was implicated in the early cone precursors’ proliferative response to RB1 loss.”  

      (2) “These findings reveal … and a role for early cone-precursor-intrinsic SYK expression.” To:  “These findings reveal … and suggest a role for early cone-precursor-intrinsic SYK expression.”

      In the last paragraph of the Results, we changed:

      (1) “To determine if SYK contributes…” To:  “To determine if SYK might contribute…”

      (2) “the highly specific SYK inhibitor” To:  “the selective SYK inhibitor”  

      (3)  “indicating that cone precursor intrinsic SYK activity is critical to the proliferative response” To: “consistent with the notion that cone precursor intrinsic SYK activity contributes to the proliferative response.”

      In the Results, we added a final sentence: 

      “However, given potential SYK inhibitor off-target effects, validation of the role of SYK in retinoblastoma initiation will require genetic ablation studies.”

      In the Discussion (2nd-to-last paragraph), we changed: 

      “SYK inhibition impaired pRB-depleted cone precursor cell cycle entry, implying that native SYK expression rather than de novo induction contributes to the cone precursors’ initial proliferation.” To: “…the pRB-depleted cone precursors’ sensitivity to a SYK inhibitor suggests that native SYK expression rather than de novo induction contributes to the cone precursors’ initial proliferation, although genetic ablation of SYK is needed to confirm this notion.”

      In the last sentence of the Discussion, we changed:

      “enabled the identification of developmental stage-specific cone precursor features that underlie retinoblastoma predisposition.” To: “enabled the identification of developmental stage-specific cone precursor features that are associated with the cone precursors’ predisposition to form retinoblastoma tumors.”

      Minor/Typos

      Figure 7 legend, H should be D.

      We corrected the figure legend (now related to Figure 8).

      Reviewer #2 (Recommendations for the authors):

      (1) The author should take advantage of recently published human fetal retina data, such as PMID:39117640, which includes a larger dataset of cells that could help validate the findings. Consequently, statements like "To our knowledge, this is the first indication of two immediately post-mitotic photoreceptor precursor populations with cone versus rod-biased gene expression" may need to be revised.

      We thank the reviewer for noting the evidence of distinct immediately post-mitotic rod and cone populations published by others after we submitted our manuscript. In response, we omitted the sentence mentioned and extensively cross-checked our results including:

      - comparison of our early versus late cone and rod maturation states to the cone and rod precursor versus cone and rod states identified by Zuo et al (new paragraph on the top half of p. 6 and new figure panels S3G,H);

      - detection of distinct immediately post-mitotic versus later cone and rod precursor populations (two new paragraphs on pp. 12-13 and new Figures S10B and S11A-E); 

      - identification of cone and rod precursor populations that co-express cone and rod marker genes (two new paragraphs starting at the bottom of p. 15 and new Figures S11D-F);

      - comparison of expression patterns of immature cone precursor (iCP) marker genes in our and the Zuo et al dataset (new paragraph on top half of p. 17 and new Figure S13).

      We also compare the cell states discerned in our study and the Zuo et al. study in a new Discussion paragraph (bottom of p. 23) and new Figure S17.

      (2) The data generated comes from dissociated cells, which inherently lack spatial context. Additionally, it is unclear whether the dataset represents a pool of retinas from multiple developmental stages, and if so, whether the developmental stage is known for each cell profiled. If this information is available, the authors should examine the distribution of developmental stages on the UMAP and trajectory analysis as part of the quality control process. 

      We thank the reviewer for highlighting the importance of spatial context and developmental stage. 

      Related to whether the dataset represents a pool of retinae from multiple developmental stages, the different cell numbers examined at each time point are indicated in Figure S1A. To draw the readers’ attention to this detail, Figure S1A is now cited in the first sentence of the Results. 

      Related to the age-related cell distributions in UMAP plots, the distribution of cells from each retina and age was (and is) shown in Fig. S1F. In addition, we now highlight the age distributions by segregating the FW13, FW15-17, and FW17-18-19 UMAP positions in the new Figure 1C. We describe the rod temporal changes in a new sentence at the top of  p. 5:

      “Few rods were detected at FW13, whereas both early and late rods were detected from FW15-19 (Figure 1C), corroborating prior reports [15,20].”  

      We describe the cone temporal changes and note the likely greater discrimination of cell state changes that would be afforded by separately analyzing macula versus peripheral retina at each age in a new sentence at the bottom of p. 5:

      “L/M cone precursors from different age retinae occupied different UMAP regions, suggesting age-related differences in L/M cone precursor maturation (Figure 1C).”

      Moreover, they should assess whether different developmental stages impact gene expression and isoform ratios. It is well established that cone and rod progenitors typically emerge at different developmental times and in distinct regions of the retina, with minimal physical overlap. Grouping progenitor cells based solely on their UMAP positioning may lead to an oversimplified interpretation of the data.

      (2a) We agree that different developmental stages may impact gene expression and isoform ratios, and evaluated stages primarily based on established Louvain clustering rather than UMAP position. However, we also used UMAP position to segregate so-called RPC-localized and non-RPC-localized iCPs and iRPs, as well as to characterize the bridge region iCP sub-populations. In the revision, we examine whether cell groups defined by UMAP positions helped to identify transcriptomically distinct populations and further examine the spatiotemporal gene expression patterns of the same genes in the Zuo et al. 3’ snRNA-seq dataset.

      (2b) Related to analyses of immediately post-mitotic iRPs and iCPs, the new Figure S10A expanded the violin plots first shown in Figure 5D to compare gene expression in RPC-localized versus non-RPC-localized iCPs and iRPs and subsequent cone and rod precursor clusters (also presented in response to Reviewer 3). The new Figure S10C shows a similar analysis of UMAP region-specific regulon activities. These figures support the idea that there are only subtle UMAP region-related differences in the expression of the selected genes and regulons.

      To further evaluate early cone and rod precursors, we compared expression patterns in our cluster- and UMAP-defined cell groups to those of the spatiotemporally defined cell groups in the Zuo et al. 3’ snRNA-seq study. The results revealed similar expression timing of the genes examined, although the cluster assignments of a subset of cells were brought into question, especially the assigned rod precursors at pcw 10 and 13, as shown in new Figures S10B (grey columns) and S11, and as described in two new paragraphs starting near the bottom of p. 12.

      (2c) Related to analyses of iCPs in the so-called bridge region, our analyses of the Zuo et al. dataset helped distinguish early cone and rod precursor populations (expressing early markers such as ATOH7 and CHRNA1) from the later stages exhibiting rod- and cone-related gene co-expression, which had intermixed in the UMAP bridge region in our dataset. Further parsing of early cone precursor marker spatiotemporal expression revealed intriguing differences as now described in the second half of a new paragraph at the top of p. 17, as follows:

      “Also, different iCP markers had different spatiotemporal expression: CHRNA1 and ATOH7 were most prominent in peripheral retina with ATOH7 strongest at pcw 10 and CHRNA1 strongest at pcw 13; CTC-378H22.2 was prominently expressed from pcw 10-13 in both the macula and the periphery; and DLL3 and ONECUT1 showed the earliest, strongest, and broadest expression (Figure S13B). The distinct patterns suggest spatiotemporally distinct roles for these factors in cone precursor differentiation.”

      (3) I would commend the authors for performing a validation experiment via RNA in situ to validate some of the findings. However, drawing conclusions from analyzing a small number of cells can still be dangerous. Furthermore, it is not entirely clear how the subclustering is done. Some cells change cell type identities in the high-resolution plot. For example, some iPRP cells from the low-resolution plots in Figure 1 are assigned as TR in high-resolution plots in Figure 5.

      The authors should provide justification for the identities of RPC-localized iPRP and TR.

      Comparison of their data with other publicly available data should strengthen their annotation.

      We agree that drawing conclusions from scRNA-seq or in situ hybridization analysis of a small number of cells can be dangerous and have followed the reviewer’s suggestion to compare our data with other publicly available data, focusing on the 3’ snRNA-seq of Zuo et al. given its large size and extensive annotation. Our analysis of the Zuo et al. dataset helped clarify cell identities by segregating cone and rod precursors with similar gene expression properties in distinct UMAP regions. However, we noted that the clustering of early cone and rod precursors likely gave numerous mis-assigned cells (as noted in response 2b above and shown in the new Figure S11). It would appear that insights may be derived from the combination of relatively shallow sequencing of a high number of cells and deep sequencing of substantially fewer cells.

      Related to how subclustering was done, the Methods state, “A nearest-neighbors graph was constructed from the PCA embedding and clusters were identified using a Louvain algorithm at low and high resolutions (0.4 and 1.6)[70],” citing the Blondel et al. reference for the Louvain clustering algorithm used in the Seurat package. To clarify this, the Results text was revised to indicate the resolution settings used to cluster at low resolution (0.4, p. 4, 2nd paragraph) and at high resolution (1.6, top of p. 11).

      Related to the assignment of some iPRP cells from the low-resolution plots in Figure 1 to the TR cluster (now called the ‘iRP’ cluster) in the high-resolution plots in Figure 5, we suggest that this is consistent with Louvain clustering, which does not follow a single dendrogram hierarchy.

      The justification for referring to these groups as RPC-localized iCPs and iRPs relates to their biased gene and regulon expression in Fig. 5D and 5E, as stated on p. 12: 

      “In the RPC-localized region, iCPs had higher ONECUT1, THRB, and GNAT2, whereas iRPs trended towards higher NRL and NR2E3 (p= 0.19, p=0.054, respectively).”

      (4) The late-stage LM5 cluster in Figure 9 is not defined anywhere in previous figures, in which LM clusters only range from 1 to 4. The inconsistency in cluster identification should be addressed.

      We revised the text related to this as follows: 

      “Indeed, our scRNA-seq analyses revealed that SYK RNA expression increased from the iCP stage through cluster LM4, in contrast to its minimal expression in rods (Figure 10E).  Moreover, SYK expression was abolished in the five-cell group with properties of late maturing cones (characterized in Figure 1E), here displayed separately from the other LM4 cells and designated LM5 (Figure 10E).”  (p. 19-20)

      (5) Syk inhibitor has been shown to be involved in RB cell survival in previous studies. The manuscript seems to abruptly make the connection between the single-cell data to RB in the last figure. The title and abstract should not distract from the bulk of the manuscript focusing on the rod and cone development, or the manuscript should make more connection to retinoblastoma.

      We appreciate the reviewer’s concern that the title may seem to over-emphasize the connection to retinoblastoma based solely on the SYK inhibitor studies. However, we suggest the title also emphasizes the identification and characterization of early human photoreceptor states, per se, and that there are a number of important connections beyond the SYK studies that could warrant the mention of cell-state-specific retinoblastoma-related features in the title.

      Most importantly, a prior concern with the cone cell-of-origin theory was that retinoblastoma cells express RNAs thought to mark retinal cell types other than cones, especially rods. The evidence presented here, that cone precursors also express rod-related genes, helps resolve this issue. The issue is noted numerous times in the manuscript, as follows:

      In the Introduction, we write:

      “However, retinoblastoma cells also express rod lineage factor NRL RNAs, which – along with other evidence – suggested a heretofore unexplained connection between rod gene expression and retinoblastoma development[12,13]. Improved discrimination of early photoreceptor states is needed to determine if co-expression of rod- and cone-related genes is adopted during tumorigenesis or reflects the co-expression of such genes in the retinoblastoma cell of origin.” (bottom, p. 2)

      And:

      “In this study, we sought to further define the transcriptomic underpinnings of human  photoreceptor development and their relationship to retinoblastoma tumorigenesis.” (last paragraph, p. 3)

      The Discussion also alluded to this issue and in the revised Discussion, we aimed to make the connection clearer.  We previously ended the 3rd-to-last paragraph with,  

      “iPRP [now iCP] and early LM cone precursors’ expression of NR2E3 and NRL RNAs suggest that their presence in retinoblastomas[12,13] reflects their normal expression in the L/M cone precursor cells of origin.” 

      We now separate and elaborate on this point in a new paragraph as follows: 

      “Our characterization of cone and rod-related RNA co-expression may help resolve questions about the retinoblastoma cell of origin. Past studies suggested that retinoblastoma cells co-express RNAs associated with rods, cones, or other retinal cells due to a loss of lineage fidelity[12]. However, the early L/M cone precursors’ expression of NR2E3 and NRL RNAs suggest that their presence in retinoblastomas[12,13] reflects their normal expression in the L/M cone precursor cells of origin. This idea is further supported by the retinoblastoma cells’ preferential expression of cone-enriched NRL transcript isoforms (Figure S5B).” (middle of p. 24)

      Based on the above, we elected to retain the title.

      Minor comments:

      (1) It is difficult to see the orange and magenta colors in the Fig 3E RNA-FISH image. The colors should be changed, or the contrast threshold needs to be adjusted to make the puncta stand out more.

      We re-assigned colors, with red for FL-NRL puncta and green for Tr-NRL puncta. 

      (2) Figure 5C on page 8 should be corrected to Supplementary Figure 5C.

      We thank the reviewer for noting this error and changed the figure citation.

      Reviewer #3 (Recommendations for the authors):

      (1) Minor concerns

      a. Abbreviation of some words needs to be included, example: FW. 

      We now provide abbreviation definitions for FW and others throughout the manuscript.  

      b. Cat # does not match the 'key resource table' for many reagents/kits. Some examples are: CD133-PE mentioned on Page # 22 on # 71, SMART-Seq V4 Ultra Low Input RNA Kit and SMARTer Ultra Low RNA Kit for the Fluidigm C1 System on Page # 22 on # 77, Nextera XT DNA Library preparation kit on Page # 23 on # 77.

      We thank the reviewer for noting these discrepancies. We have now checked all catalog numbers and made corrections as needed.

      c. Cat # and brand name of a few reagents & kits are missing and not mentioned either in methods or in key resource table or both. Eg: FBS, Insulin, Glutamine, Penicillin, Streptomycin, HBSS, Quant-iT PicoGreen dsDNA assay, Nextera XT DNA Library Preparation Kit, 5' PCR Primer II A with CloneAmp HiFi PCR Premix.

      Catalog numbers and brand names are now provided for the tissue culture and related reagents within the methods text and for kits in the Key Resources Table. Additional descriptions of the primers used for re-amplification and RACE were added to the Methods (p. 28-29).

      d. Spell and grammar check is needed throughout the manuscript. Example: In Page # 46 RXRγlo is misspelled as RXRlo.

      Spelling and grammar were checked throughout the manuscript.

      (2) Methods & Key Resource table.

      a. In Page # 21, IRB# needs to be stated.      

      The IRB protocols have been added, now at top of p. 26.

      b. In Page # 21, Did the authors dissociate retinae in ice-cold phosphate-buffered saline or papain?   

      The relevant sentence was corrected to “dissected while submerged in ice-cold phosphate-buffered saline (PBS) and dissociated as described[10].” (p. 26)

      c. In Page # 21, How did the authors count or enumerate the cell count? Provide the details.

      We now state, “… a 10 µl volume was combined with 10 µl trypan blue and counted using a hemocytometer” (top of p. 27).
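
      As a rough illustration of how such a count is commonly converted to a concentration, the sketch below applies the conventional hemocytometer formula (mean count per large square × dilution factor × 10^4 cells/ml). The chamber geometry (1 mm² squares, 0.1 mm depth), the function name `cells_per_ml`, and the example counts are assumptions for illustration only; the quoted Methods text specifies just the 1:1 mix of sample and trypan blue.

```python
# Hypothetical sketch: the function name and example counts are illustrative,
# and a standard chamber (0.1 µl per large square) is assumed.
def cells_per_ml(square_counts, sample_ul=10.0, dye_ul=10.0):
    """Estimate cells/ml from counts in large hemocytometer squares."""
    # A 10 µl sample mixed with 10 µl trypan blue gives a dilution factor of 2.
    dilution = (sample_ul + dye_ul) / sample_ul
    mean_count = sum(square_counts) / len(square_counts)
    # Each large square holds 0.1 µl, so multiply by 10^4 to scale to 1 ml.
    return mean_count * dilution * 10_000


conc = cells_per_ml([52, 48, 50, 50])
print(f"{conc:.0f} cells/ml")
```

      With a different sample-to-dye ratio, only the `sample_ul` and `dye_ul` arguments would need to change.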

      d. Why did the authors choose to specifically use only 8 cells for cDNA preparation in Page # 22? State the reason and provide the details.

      The reasons for using 8 cells (to prevent evaporation and to manually transfer one slide-worth of droplets to one strip of PCR tubes) and additional single cell collection details are now provided as follows (new text underlined): 

      “Single cells were sorted on a BD FACSAria I at 4°C using 100 µm nozzle in single-cell mode into each of eight 1.2 µl lysis buffer droplets on parafilm-covered glass slides, with droplets positioned over pre-defined marks … . Upon collection of eight cells per slide, droplets were transferred to individual low-retention PCR tubes (eight tubes per strip) (Bioplastics K69901, B57801) pre-cooled on ice to minimize evaporation. The process was repeated with a fresh piece of parafilm for up to 12 rounds to collect 96 cells.” (p. 27, new text underlined)

      e. Key resource table does not include several resources used in this study. Example - NR2E3 antibody.

      We added the NR2E3 antibody and checked for other omissions.

      (3) Results & Figures & Figure Legends

      a. Regulon-defined RPC and photoreceptor precursor states

      i. On page # 4, 1st paragraph - Clarify the sentence 'Exclusion of all cells with <100,000 cells read and 18 cells.........Ensembl transcripts inferred'. Did the authors use 18 cells or 18FW retinae?

      The sentence was changed to:

      “After sequencing, we excluded all cells with <100,000 read counts and 18 cells expressing one or more markers of retinal ganglion, amacrine, and/or horizontal cells (POU4F1, POU4F2, POU4F3, TFAP2A, TFAP2B, ISL1) and concurrently lacking photoreceptor lineage marker OTX2. This yielded 794 single cells with averages of 3,750,417 uniquely aligned reads, 8,278 genes detected, and 20,343 Ensembl transcripts inferred (Figure S1A-C).” (p. 4, new words underlined)
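
      The exclusion rule in the revised sentence can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the per-cell record layout, the helper name `keep_cell`, the example cells, and the treatment of "expressing" as simple set membership are all assumptions.

```python
# Hypothetical sketch of the two exclusion criteria; names and example data
# are illustrative assumptions, not taken from the study's code.
MIN_READS = 100_000
NON_PHOTORECEPTOR_MARKERS = {"POU4F1", "POU4F2", "POU4F3", "TFAP2A", "TFAP2B", "ISL1"}
LINEAGE_MARKER = "OTX2"


def keep_cell(read_count, expressed):
    """Return True if a cell passes both exclusion criteria."""
    # Criterion 1: drop cells with fewer than 100,000 read counts.
    if read_count < MIN_READS:
        return False
    # Criterion 2: drop cells that express at least one ganglion/amacrine/
    # horizontal marker AND concurrently lack the lineage marker OTX2.
    has_other_marker = bool(NON_PHOTORECEPTOR_MARKERS & expressed)
    return not (has_other_marker and LINEAGE_MARKER not in expressed)


cells = {
    "cell_a": (3_500_000, {"OTX2", "CRX"}),     # kept
    "cell_b": (80_000, {"OTX2"}),               # too few reads
    "cell_c": (2_000_000, {"POU4F2", "ISL1"}),  # marker present, OTX2 absent
    "cell_d": (1_200_000, {"ISL1", "OTX2"}),    # marker present, OTX2 present
}
kept = sorted(name for name, (reads, genes) in cells.items() if keep_cell(reads, genes))
print(kept)
```

      Note that under this reading a cell expressing one of the listed markers is retained as long as it also expresses OTX2.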

      To clarify that 18 retinae were used, the first sentence of the Results was revised as follows:

      “To interrogate transcriptomic changes during human photoreceptor development, dissociated RPCs and photoreceptor precursors were FACS-enriched from 18 retinae, ages FW13-19 …” (p. 4).

      Why did the authors 'exclude cells lacking photoreceptor lineage marker OTX2' from analysis especially when the purpose here was to choose photoreceptor precursor states & further results in the next paragraph clearly state that 5 clusters were comprised of cells with OTX2 and CRX expression. This is confusing.

      We apologize for the imprecise diction. We divided the evidently confusing sentence into two sentences to more clearly indicate that we removed cells that did not express OTX2, as in the first response to the previous question.

      ii. On page 5, the authors reported the number of cell populations (363 large and 5 distal) identified in the THRB+ L/M-cone cluster. What were the numbers of cell populations identified in the remaining 5 clusters of the UMAP space?

      We added the cell numbers in each group to Fig. 1B. We corrected the large LM group to 366 cells (p. 5) and note 371 LM cells, which includes the five distal cells, in Figure 1B.

      b. Differential expression of NRL and THRB isoforms in rod and cone precursors

      i. In Figure 3B, the authors compare and show the presence of 5 different NRL isoforms for all the 6 clusters that were defined in 3A. However, in the results, the ENST# of just 2 highly assigned transcript isoforms is given. What are the annotated names of the three other isoforms which are shown in 3B? Please explain in the Results.

      As requested, we now annotate the remaining isoforms as encoding full-length or truncated NRL in Fig. 3B and show isoform structures in new Supplementary Figure S4B.  We also refer to each transcript isoform in the Results (p. 7, last paragraph) and similarly evaluate all isoforms in RB31 cells (Fig. S5B).

      ii. What does the Mean FPM in the y-axis of Fig 3C refer to?

      Mean FPM represents mean read counts (fragments per million, FPM) for each position across Ensembl NRL exons for each cluster, as now stated in the 6th line of the Fig. 3 legend.
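      As a rough illustration of the quantity on that y-axis, the sketch below scales per-position read counts to fragments per million and averages them over the cells of one cluster. The normalization by each cell's total fragment count and the toy numbers are assumptions; the response does not spell out the exact procedure.

```python
import numpy as np


def fragments_per_million(position_counts, total_fragments):
    # Scale raw per-position counts by the cell's total fragments (assumed normalizer).
    counts = np.asarray(position_counts, dtype=float)
    return counts / float(total_fragments) * 1e6


def mean_fpm(per_cell_counts, per_cell_totals):
    # Average the FPM profile position by position across all cells in a cluster.
    fpm = np.array([
        fragments_per_million(counts, total)
        for counts, total in zip(per_cell_counts, per_cell_totals)
    ])
    return fpm.mean(axis=0)


# Two hypothetical cells, three exon positions each.
counts = [[2, 4, 0], [1, 1, 3]]
totals = [2_000_000, 1_000_000]
profile = mean_fpm(counts, totals)
```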

      iii. A clear explanation of the results for Figures 3E-3F is missing.

      We revised the text to more clearly describe the experiment as follows:

      “The cone cells’ higher proportional expression of Tr-NRL first exon sequences was validated by RNA fluorescence in situ hybridization (FISH) of FW16 fetal retina in which NRL immunofluorescence was used to identify rod precursors, RXRg immunofluorescence was used to identify cone precursors, and FISH probes specific to truncated Tr-NRL exon 1T or FL-NRL exons 1 and 2 were used to assess Tr-NRL and FL-NRL expression (Figure 3E,F).” (p. 8, new text underlined).

      c. Two post-mitotic photoreceptor precursor populations

      i. Although deep-sequencing and SCENIC analysis clarified the identities of four RPC-localized clusters as MG, RPC, and iPRP indicative of cone-bias and TR indicative of rod-bias, it would be interesting to see the discriminating determinant between the TR and ER by SCENIC and deep-sequencing gene expression violin/box plots.

      We agree it is of interest to see the discriminating determinant between the TR [now termed iRP] and ER clusters by SCENIC and deep-sequencing gene expression violin/box plots. We now provide this information for selected genes and regulons of interest in the new Supplementary Figures S10A and S10C, along with a similar comparison between the prior high-resolution iPRP (now termed iCP) cluster and the first high-resolution LM cluster, LM1, as described for gene expression on p. 12:

      “Notably, THRB and GNAT2 expression did not significantly change while ONECUT1 declined in the subsequent non-RPC-localized iCP and LM1 stages, whereas NR2E3 and NRL dramatically increased on transitioning to the ER state (Figure S10A).”

      And as described for regulon activities on pp. 13-14:

      “Finally, activities of the cone-specific THRB and ISL2 regulons, the rod-specific NRL regulon, and the pan-photoreceptor LHX3, OTX2, CRX, and NEUROD1 regulons increased to varying extents on transitioning from the immature iCP or iRP states to the early-maturing LM1 or ER states (Figure S10C).”

      We also show expression of the same genes for spatiotemporally grouped cells from the Zuo et al. dataset in the new Figure S10B, which displays a similar pattern (apart from the possibly mixed pcw 10 and pcw 13 designated rod precursors).

      d. Early cone precursors with cone- and rod-related RNA expression

      i. On page 12, in the last paragraph, the authors explain the multiplex RNA FISH results of RXRγ and NR2E3 by citing Figure S8E. However, in Fig S8E, the authors used NRL to identify the rods. Please clarify which one of the rod markers was used to perform RNA FISH.

      Figure S8E (where NRL was used as a rod marker) was cited to remind readers that RXRg has low expression in rods and high expression in cones, rather than to describe the results of this multiplex FISH section. To avoid confusion on this point, Figure S8E is now cited using “(as earlier shown in Figure S8E).” With this issue clarified, we expect the markers used in the FISH + IF analysis will be clear from the revised explanation, 

      “… we examined GNAT2 and NR2E3 RNA co-expression in RXRg+ cone precursors in the outermost NBL and in RXRg+ rod precursors in the middle NBL … .” (p. 14-15).

      To provide further clarity, we provide a diagram of the FISH probes, protein markers, and expression patterns in the new Figure 7E.

      ii. The Y-axis of Fig 6G-6H needs to be labelled.

      The axes have been re-labeled from “Nb of cells” to “Number of RXRg+ outermost NBL cells in each region” (original Fig. 6G, now Fig. 7C) and “Number of RXRg+ middle NBL cells in each region” (original Fig. 6H, now Fig. 7D).

      iii. The legends of Figures 6G and 6H are unclear. In the Figure 6G legend, the authors indicate 'all cells are NR2E3 protein-'. Does that imply the yellow and green bars alone? Similarly, clarify the Figure 6H legend, what does the dark and light magenta refer to? What does the light magenta color referring to NR2E3+/ NR2E3- and the dark magenta color referring to NR2E3+/ NR2E3+ indicate? 

      We regret the insufficient clarity. We revised the Fig. 6G (now Fig. 7C) key, which now reads

      “All outermost NBL cells are NR2E3 protein-negative.”  We added to the figure legend for panel 7C,D “(n.b., italics are used for RNAs, non-italics for proteins).”  The new scheme in Figure 7E shows the RNAs in italics and proteins in non-italics. We hope these changes will clarify when RNA or protein are represented in each histogram category.

      Overall, the results (on page # 13) reflecting Figures 6E-6H & Figure S11 are confusing and difficult to understand. Clear descriptions and explanations are needed.

      We revised this results section described in the paragraph now spanning p. 14:

      -  We now refer to the bar colors in Figures 7C and 7D that support each statement. 

      -  We provide an illustration of the findings in Figure 7E.

      iv. Previously published literature has shown that cells of the inner NBL are RXRγ+ ganglion cells. So, how were these RXRγ+ ganglion cells in the inner NBL discriminated during multiplex RNA FISH (in Fig 6E-6H and in Fig S11)?

      We thank the reviewer for requesting this clarification. We agree that “inner NBL” is the incorrect term for the region in which we examined RXRg+ photoreceptor precursors, as this could include RXRγ+ nascent RGCs. We now clarify that 

      “we examined GNAT2 and NR2E3 RNA co-expression in RXRg+ cone precursors in the outermost NBL and in RXRg+ rod precursors in the middle NBL … .”  (p. 14-15) We further state, 

      “Limiting our analysis to the outer and middle NBL allowed us to disregard RXRγ+ retinal ganglion cells in the retinal ganglion cell layer or inner NBL” (top of p. 15).

      Figure 7E is provided to further aid the reader in understanding the positions examined, and the legend states “RXRg+ retinal ganglion cells in the inner NBL and ganglion cell layer not shown.” 

      v. In Figure 6E, what marker does each color cell correspond to?

      In this figure (now panel 7A), we declined to provide the color key since the image is not sufficiently enlarged to visualize the IF and FISH signals. The figure is provided solely to document the regions analyzed and readers are now referred to “see Figure S12 for IF + FISH images” (2nd line, p. 15), where the marker colors are indicated.

      vi. In Figure S11 & 6E, Protein and RNA transcript color of NR2E3, GNAT2 are hard to distinguish. Usage of other colors is recommended.  

      We appreciate the reviewer’s concern related to the colors (in the now redesignated Figure S12 and 7A); however, we feel this issue is largely mitigated by our use of arrows to point to the cells needed to illustrate the proposed concepts in Figure S12B. All quantitation was performed by examining each color channel separately to ensure correct attribution, which is now mentioned in the Methods (2nd-to-last line of Quantitation of FISH section, p. 35).

      vii. 

      With due respect, we suggest that labeling each box (now in Figure 8B) makes the figure rather busy and difficult to infer the main point, which is that boxed regions were examined at various distances from the center (denoted by the “C” and “0 mm”) with distances periodically indicated. We suggest the addition of such markers would not improve and might worsen the figure for most readers.    

      e. An early L/M cone trajectory marked by successive lncRNA expression

      i. In Figure 8C - color-coded labelling of LM1-4 clusters is recommended.

      We note Fig. 8C (now 9C) is intended to use color to display the pseudotemporal positions of each cell. We recognize that an additional plot with the pseudotime line imposed on LM subcluster colors could provide some insights, yet we are unaware of available software for this and are unable to develop such software at present. To enable readers to obtain a visual impression of the pseudotime vs subcluster positions, we now refer the reader to Figure 5A in the revised figure legend, as follows:  (“The pseudotime trajectory may be related to LM1-LM4 subcluster distributions in Figure 5A.”).

      ii. In Figure 8G - what does the horizontal color-coded bar below the lncRNAs name refer to? These bars are similar in all four graphs of the 8G figure.

      As stated in the Fig. 8G (now 9G) legend, “Colored bars mark lncRNA expression regions as described in the text.”  We revised the text to more clearly identify the color code. (p. 18-19)   

      f. Cone intrinsic SYK contributions to the proliferative response to pRB loss

      i. In Fig 9F - The expression of ARR3+ cells (indicated by the green arrow in FW18) is poorly or rarely seen in the peripheral retina.

      We thank the reviewer for finding this oversight. In panel 9F (now 10F), we removed the green arrows from the cells in the periphery, which are ARR3- due to the immaturity of cones in this region. 

      ii. In Figure 9F - Did the authors stain the FW16 retina with ARR3?

      Unfortunately, we did not stain the FW16 retina for ARR3 in this instance.

      iii. Inclusion of DAPI staining for Fig 9F is recommended to justify the ONL & INL in the images.

      We regret that we are unable to merge the DAPI in this instance due to the way in which the original staining was imaged.  A more detailed analysis corroborating and extending the current results is in progress. 

      iv. Immunostaining images for Figure 9G are missing & are required to be included. What does shSCR in Fig 9G refer to?

      We now provide representative immunostaining images below the panel (now 10G). The legend was updated: “Bottom: Example of Ki67, YFP, and RXRg co-immunostaining with DAPI+ nuclei (yellow outlines). Arrows: Ki67+, YFP+, RXRg+ nuclei.”  The revised legend now notes that shSCR refers to the scrambled control shRNA.

      v. For Figure 9H - Is the presence and loss of SYK activity consistent with all the subpopulations (S & LM) of early maturing and matured cones?

      We appreciate the reviewer’s question and interest (relating to the redesignated Figure 10H); however, we have not yet completed a comprehensive evaluation of SYK expression in all the subpopulations (S & LM) of early maturing and matured cones and will reserve such data for a subsequent study. We suggest that this information is not critical to the study’s major conclusions.

      vi. Figure 9A is not explained in the results. Why were MYCN proteins assessed along with ARR3 and NRL? What does this imply?

      We thank the reviewer for noting that this figure (now Figure 10A) was not clearly described. 

      As per the response to Reviewer 1, point 6, the text now states,

      “The upregulation of MYC target genes was of interest given that many MYC target genes are also MYCN targets, that MYCN protein is highly expressed in maturing (ARR3+) cone precursors but not in NRL+ rods (Figure 10A), and that MYCN is critical to the cone precursor proliferative response to pRB loss [8–10].” (middle, p. 19, new text underlined).

      Hence, the figure demonstrates the cone cell specificity of high MYCN protein.  This is further noted in the Fig. 10A legend: “A. Immunofluorescent staining shows high MYCN in ARR3+ cones but not in NRL+ rods in FW18 retina.”

    1. I draw your attention to the legal texts, which are 00:46:05 very... so I have given you the relevant articles of the Code de l'éducation each time, so you can go and look them up. The class council (conseil de classe) rules on the tracks, that is, general, technological, or vocational, but in 00:46:18 no case on the choice of specialties or streams within the technological or vocational track
    1. LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired

      Does this mean arguments against verbose "boilerplate" languages are going to be given less credence?

    1. Python comes with many built-in functions, but you can also build your own. Chunking blocks of code into functions is one of the best strategies to deal with complex programs. It makes you more efficient, because you can reuse the code that you wrote

      testing hypothes.is in a jupyter book on github

    1. Reviewer #2 (Public review):

      Summary:

      The investigation provides computational as well as biochemical insights into the (un)binding mechanisms of a pair of psychoactive substances into cannabinoid receptors. A combination of molecular dynamics simulation and a set of state-of-the-art statistical post-processing techniques were employed to exploit GPCR-ligand dynamics.

      Strengths:

      The strength of the manuscript lies in the usage and comparison of TRAM as well as Markov state modelling (MSM) for investigating ligand binding kinetics and thermodynamics. Usually, MSMs have been more commonly used for this purpose. But as the authors have pointed out, implicit in the usage of MSMs lies the assumption of detailed balance, which would not hold true for many cases, especially those with skewed binding affinities. In this regard, the authors' usage of TRAM, which harnesses both biased and unbiased simulations for extracting the same, provides a more appropriate way out.

      Weaknesses:

      (1) While the authors have used TRAM (by citing MSM to be inadequate in these cases), the thermodynamic comparisons of both techniques provide similar values. In this case, one would wonder what advantage TRAM would hold in this particular case.

      (2) The initiation of unbiased simulations from previously run biased metadynamics simulations would almost surely introduce hysteresis in the analysis. The authors need to address these issues.

      (3) The choice of ligands in the current work seems very forced and none of the results compare directly with any experimental data. An ideal case would have been to use the seminal D.E. Shaw research paper on GPCR/ligand binding as a benchmark and then show how TRAM, using much shorter biased simulation times, would fare against the experimental kinetics or even unbiased simulated kinetics of the previous report.

      (4) The method section of the manuscript seems to suggest all the simulations were started from a docked structure. This casts doubt on the reliability of the kinetics derived from these simulations that were spawned from docked structure, instead of any crystallographic pose. Ideally, the authors should have been more careful in choosing the ligands in this work based on the availability of the crystallographic structures.

      (5) The last part of using a machine learning-based approach to analyse allosteric interaction seems to be very much forced, as there are numerous distance-based more traditional precedent analyses that do a fair job of identifying an allosteric job.

      (6) While getting busy with the methodological details of TRAM vs MSM, the manuscript fails to share with sufficient clarity what the distinctive features of the two ligand binding mechanisms are.

      Comments on revisions:

      The authors have addressed most of the queries of the reviewer in an adequate manner. However, the current code availability section just provides the link to Python files to generate the plots. It is not very useful in its current form. The code availability section should provide a proper GitHub page that shows the usage of TRAM for the readers to execute. While PyEMMA has been cited for TRAM, a Python notebook to reproduce the TRAM analysis would be very instructive.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This manuscript presents insights into biased signaling in GPCRs, namely cannabinoid receptors. Biased signaling is of broad interest in general, and cannabinoid signaling is particularly relevant for understanding the impact of new drugs that target this receptor. Mechanistic insight from work like this could enable new approaches to mitigate the public health impact of new psychoactive drugs. Towards that end, this manuscript seeks to understand how new psychoactive substances (NPS, e.g. MDMB-FUBINACA) elicit more signaling through β-arrestin than classical cannabinoids (e.g. HU-210). The authors use an interesting combination of simulations and machine learning. 

      We thank the reviewer for the comments. We have provided a point-by-point response to the reviewer’s comments below and incorporated the suggestions in our revised manuscript. Modified parts of the manuscript are highlighted in yellow.   

      Comments:

      (1) The caption for Figure 3 doesn't explain the color scheme, so it's not obvious what the start and end states of the ligand are. 

      We thank the reviewer for pointing this out. We have added the color scheme to the figure caption. 

      (2) For the metadynamics simulations, were multiple Gaussian heights/widths tried to see what, if any, impact that has on the unbinding pathway? That would be useful to help ensure all the relevant pathways were explored.  

      We thank the reviewer for the suggestion. We agree with the reviewer that the Gaussian height/width may impact the unbinding pathway. However, we would like to point out that we used the well-tempered version of metadynamics. In well-tempered metadynamics, the effective Gaussian height decreases as bias deposition progresses. Therefore, we believe that the Gaussian height/width should have minimal impact on the unbinding pathway. To address the reviewer's suggestion, we conducted additional well-tempered metadynamics simulations varying key parameters such as bias height, bias factor, and the deposition rate, all of which can influence the sampling space. The parameter values for bias height, bias factor, and deposition rate that we originally used in the paper are 0.4 kcal/mol, 15, and 1/5 ps<sup>-1</sup>, respectively. We explored different values for these parameters and projected the sampled space on top of the previously sampled region (Figure S4). We observed that the new simulations sample a similar unbinding pathway in the extracellular direction and discover a similar space in the binding pocket as well. 

      Results and Discussion (Page 10)

      “We also performed unbinding simulations using different well-tempered metadynamics parameters (bias height, bias deposition rate and bias factor) to check for the existence of alternative pathways (Figure S4). However, the simulations show that ligands follow a similar pathway for all metadynamics runs.”
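      The claim that the effective Gaussian height shrinks as bias accumulates follows from the standard well-tempered rescaling, w = w0 · exp(−V(s, t) / (kB ΔT)) with ΔT = (γ − 1)T, where γ is the bias factor. The sketch below evaluates that rule for the initial height (0.4 kcal/mol) and bias factor (15) quoted in the response. The temperature of 300 K and the worst-case scenario of every Gaussian landing on the same collective-variable value are assumptions made for illustration, not details taken from the manuscript.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def wt_gaussian_height(initial_height, bias_at_cv, bias_factor, temperature):
    # Well-tempered rescaling: the height decays exponentially with the bias
    # already accumulated at the current collective-variable (CV) value.
    delta_t = (bias_factor - 1.0) * temperature
    return initial_height * math.exp(-bias_at_cv / (K_B * delta_t))


def heights_at_fixed_cv(n_deposits, initial_height=0.4, bias_factor=15.0,
                        temperature=300.0):
    # Assumed scenario: every Gaussian is deposited at the same CV value, so
    # each deposit adds its full height to the bias felt at that point.
    # temperature=300.0 K is an assumption, not a value from the source.
    bias = 0.0
    heights = []
    for _ in range(n_deposits):
        h = wt_gaussian_height(initial_height, bias, bias_factor, temperature)
        heights.append(h)
        bias += h
    return heights


heights = heights_at_fixed_cv(50)
```

      In a real simulation the system moves between deposits, so the decay at any one point is slower than in this fixed-CV scenario; the monotonic decrease is the feature the response relies on.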

      (3) It would be nice to acknowledge previous applications of metadynamics+MSMs and (separately) TRAM, such as the Simulation of spontaneous G protein activation... (Sun et al. eLife 2018) and Estimation of binding rates and affinities... (Ge and Voelz JCP 2022). 

      We appreciate the reviewer's feedback. We have incorporated additional citations of studies demonstrating the use of TRAM as an estimator for both kinetics and thermodynamics (e.g. Ligand binding: Ge, Y. and Voelz, V.A., JCP, 2022[1]; Peptide-protein binding kinetics: Paul, F. et al., Nat. Commun., 2017[2], Ge, Y. et al., JCIM, 2021[3]). Additionally, we have included references to studies where biased simulations were initially used to explore the conformational space, and the results were then employed to seed unbiased simulations for building a Markov state model (Metadynamics: Sun, X. et al., eLife, 2018[4]; Umbrella Sampling: Abella, J. R. et al., PNAS, 2020[5]; Replica Exchange: Paul, F. et al., Nat. Commun., 2017[2]).

      (4) What is KL divergence analysis between macrostates? I know KL divergence compares probability distributions, but it is not clear what distributions are being compared. 

      We apologize for this confusion. The KL divergence analysis was performed on the probability distributions of the inverse distances between residue pairs from any two macrostates. Each macrostate was represented by 1000 frames that were selected proportional to the TRAM stationary density. All possible pair-wise inverse distances were calculated per frame for the purpose of these calculations. Although KL divergence is inherently asymmetric, we symmetrized the measurement by averaging the divergences computed in the two directions. Per-residue K-L divergence, which is shown in the main figures as color and thickness gradient, was calculated by taking the sum over all pairs that include the residue. We have included a detailed discussion of K-L divergence in the Methods section. We have also modified the Results section to add a brief discussion of the K-L divergence methodology.

      Results and Discussion (Page 15)

      “We further performed Kullback-Leibler divergence (K-L divergence) analysis between inverse distance of residue pairs of two macrostates to highlight the protein region that undergoes high conformational change with ligand movement.”

      Methods (Page 33)

      “Kullback–Leibler divergence (K-L divergence) analysis was performed to show the structural differences in protein conformations in different macrostates[4,114]. In this study, this technique was used to calculate the difference in the pairwise inverse distance distributions between macrostates. Each macrostate was represented by 1000 frames that were selected proportional to their TRAM weighted probabilities. Although K-L divergence is an asymmetric measurement, for this study, we used a symmetric version of the K-L divergence by taking the average between two macrostates. Per residue contribution of K-L divergence was calculated by summing the K-L divergences of all the pairwise distances involving that residue. This analysis was performed with in-house Python code.”  
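      The analysis described above can be approximated in a few lines of NumPy. This is a sketch under stated assumptions, not the authors' in-house code: distributions are estimated with fixed-width histograms, a small pseudocount avoids division by zero, frame selection by TRAM weight is assumed to have happened upstream, and the three-residue toy data are synthetic.

```python
import numpy as np


def symmetric_kl(samples_a, samples_b, bins=50, pseudocount=1e-6):
    # Histogram both samples on a shared grid so the bins are comparable.
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    p, _ = np.histogram(samples_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(samples_b, bins=bins, range=(lo, hi))
    # The pseudocount (an assumption of this sketch) keeps empty bins finite.
    p = (p + pseudocount) / (p + pseudocount).sum()
    q = (q + pseudocount) / (q + pseudocount).sum()
    kl_pq = np.sum(p * np.log(p / q))
    kl_qp = np.sum(q * np.log(q / p))
    # Symmetrize by averaging the two directed divergences.
    return 0.5 * (kl_pq + kl_qp)


def per_residue_kl(inv_dist_a, inv_dist_b, pairs, n_residues, bins=50):
    # inv_dist_*: arrays of shape (n_frames, n_pairs) holding inverse distances
    # for the frames representing each macrostate.
    scores = np.zeros(n_residues)
    for k, (i, j) in enumerate(pairs):
        d = symmetric_kl(inv_dist_a[:, k], inv_dist_b[:, k], bins=bins)
        scores[i] += d  # each pair contributes to both of its residues
        scores[j] += d
    return scores


# Synthetic example: only the residue 0-1 distance differs between macrostates.
rng = np.random.default_rng(0)
pairs = [(0, 1), (0, 2), (1, 2)]
dist_a = rng.normal(10.0, 0.5, size=(1000, 3))
dist_b = rng.normal(10.0, 0.5, size=(1000, 3))
dist_b[:, 0] = rng.normal(6.0, 0.5, size=1000)
scores = per_residue_kl(1.0 / dist_a, 1.0 / dist_b, pairs, n_residues=3)
```

      In this toy case residues 0 and 1 receive much larger scores than residue 2, mirroring how the per-residue sum highlights regions that change between macrostates.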

      (5) I suggest being more careful with the language of universality. It can be "supported" but "showing" or "proving" its universal would require looking at all possible chemicals in the class. 

      We thank the reviewer for the suggestion. In response, we have revised the manuscript to ensure that the language reflects that our findings are based on observations from a limited set of ligands, namely one NPS and one classical cannabinoid. We have replaced references to ligand groups (such as NPS or classical cannabinoid) with the specific ligand names (such as MDMB-FUBINACA or HU-210) to avoid claims of universality and prevent any potential confusion.

      Results and Discussion (Page 19)

      “In this work, we trained the network with the NPS (MDMB-FUBINACA), and classical cannabinoid (HU-210) bound unbiased trajectories (Method Section). Here, we compared the allosteric interaction weights between the binding pocket and the NPxxY motif, which is involved in triad interaction formation. Results show that each binding pocket residue in the MDMB-FUBINACA bound ensemble shows higher allosteric weights with the NPxxY motif, indicating larger dynamic interactions between the NPxxY motif and binding pocket residues (Figure S9).  The probability of triad formation was estimated to observe the effect of the difference in allosteric control. TRAM weighted probability calculation showed that MDMB-FUBINACA bound CB1 has a higher probability of triad formation (Figure 8A). Comparison of the pairwise interaction of the triad residues shows that interaction between Y397<sup>7.53</sup>-T210<sup>3.46</sup> is relatively more stable in case of MDMB-FUBINACA bound CB1, while the other two interactions have similar behavior for both systems (Figures S10A, S10B, and S10C). Therefore, higher interaction between Y397<sup>7.53</sup> and T210<sup>3.46</sup> in MDMB-FUBINACA bound receptor causes the triad interaction to be more probable. 

      Furthermore, we also compared TM6 movement for both ligand bound ensembles, which is another activation metric involved in both G-protein and β-arrestin binding. Comparison of TM6 distance from the DRY motif of TM3 shows similar distribution for HU-210 and MDMB-FUBINACA (Figure 8B). These observations support that NPS binding causes higher β-arrestin signaling by allosterically controlling triad interaction formation.” 

      Reviewer #2 (Public Review): 

      Summary: 

      The investigation provides computational as well as biochemical insights into the (un)binding mechanisms of a pair of psychoactive substances into cannabinoid receptors. A combination of molecular dynamics simulation and a set of state-of-the art statistical post-processing techniques were employed to exploit GPCR-ligand dynamics. 

      Strengths: 

      The strength of the manuscript lies in the usage and comparison of TRAM as well as Markov state modelling (MSM) for investigating ligand binding kinetics and thermodynamics. Usually, MSMs have been more commonly used for this purpose. But as the authors have pointed out, implicit in the usage of MSMs lies the assumption of detailed balance, which would not hold true for many cases, especially those with skewed binding affinities. In this regard, the authors' usage of TRAM, which harnesses both biased and unbiased simulations for extracting the same, provides a more appropriate way out. 

      Weaknesses: 

      (1) While the authors have used TRAM (by citing MSM to be inadequate in these cases), the thermodynamic comparisons of both techniques provide similar values. In this case, one would wonder what advantage TRAM would hold in this particular case. 

      We thank the reviewer for the comment. While we agree that the thermodynamic comparisons between MSM and TRAM provide similar values in this instance, we would like to emphasize the underlying reasoning behind our choice of TRAM.

      MSM can struggle to accurately estimate thermodynamic and kinetic properties in cases where local state reversibility (detailed balance) is not easily achieved with unbiased sampling. This is especially relevant in ligand unbinding processes, which often involve overcoming high free energy barriers. TRAM, by incorporating biased simulation data (such as umbrella sampling) in addition to unbiased data, can better achieve local reversibility and provide more robust estimates when unbiased sampling is insufficient.

      The similarity in thermodynamic estimates between MSM and TRAM in our study can be attributed to the relatively long unbiased sampling period (> 100 µs) employed. With sufficient sampling, MSM can approach detailed balance, leading to results comparable to those from TRAM. However, as we demonstrated in our manuscript (Figure 4D), when the amount of unbiased sampling is reduced, the uncertainties in both the thermodynamics and kinetics estimates increase significantly for MSM compared to TRAM. Thus, while MSM and TRAM perform similarly under the conditions of extensive sampling, TRAM's advantage lies in its robustness when unbiased sampling is limited or difficult to achieve. 
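      To make the detailed-balance argument concrete, the toy sketch below estimates a transition matrix from a discrete trajectory and measures how far it is from satisfying π_i T_ij = π_j T_ji. It is not TRAM and not the authors' MSM pipeline: the three-state trajectory is invented, and symmetrizing the count matrix is used here only as the simplest way to show what a reversible estimate looks like.

```python
import numpy as np


def count_matrix(dtraj, n_states, lag=1):
    # Count observed transitions i -> j separated by `lag` steps.
    counts = np.zeros((n_states, n_states))
    for a, b in zip(dtraj[:-lag], dtraj[lag:]):
        counts[a, b] += 1
    return counts


def transition_matrix(counts):
    # Row-normalize; assumes every state has at least one outgoing transition.
    return counts / counts.sum(axis=1, keepdims=True)


def stationary_distribution(T):
    # Left eigenvector of T for the eigenvalue with the largest real part (1).
    evals, evecs = np.linalg.eig(T.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()


def detailed_balance_violation(T):
    # Largest imbalance between forward and backward probability flux.
    pi = stationary_distribution(T)
    flux = pi[:, None] * T
    return np.abs(flux - flux.T).max()


# Invented trajectory that only ever moves 0 -> 1 -> 2 -> 0, a stand-in for
# sampling that crosses a barrier in one direction but never returns.
dtraj = [0, 1, 2] * 20
C = count_matrix(dtraj, n_states=3)
one_way = detailed_balance_violation(transition_matrix(C))
reversible = detailed_balance_violation(transition_matrix(C + C.T))
```

      The one-directional counts give a large violation, while the symmetrized counts satisfy detailed balance by construction. In practice the remedy is more sampling or, as the authors argue, combining biased and unbiased data.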

      (2) The initiation of unbiased simulations from previously run biased metadynamics simulations would almost surely introduce hysteresis in the analysis. The authors need to address these issues. 

      We thank the reviewer for the comment. We acknowledge that biased simulations could potentially introduce hysteresis or result in the identification of unphysical pathways. However, we believe this issue is mitigated by using well-tempered metadynamics, which gradually deposits a decaying bias. This approach enables the simulation to explore orthogonal directions of collective variable (CV) space, reducing the likelihood of hysteresis effects (Invernizzi, M. and Parrinello, M., JCTC, 2019[6]).

      Furthermore, there is precedent for using metadynamics-derived pathways to initiate unbiased simulations for constructing Markov State Models (MSMs). This methodology has been successfully applied in studying G-protein activation (Sun, X. et al., eLife, 2018[4]).

      Additional support for our observation can be found in two independent binding/unbinding studies of ligands from cannabinoid receptors, which have discovered a similar pathway using different CVs (Saleh, et al., Angew. Chem., 2018[7]; Hua, T. et al., Cell, 2020[8]).   

      (3) The choice of ligands in the current work seems very forced and none of the results compare directly with any experimental data. An ideal case would have been to use the seminal D.E. Shaw research paper on GPCR/ligand binding as a benchmark and then show how TRAM, using much shorter biased simulation times, would fare against the experimental kinetics or even unbiased simulated kinetics of the previous report. 

      We would like to address the reviewer's concerns regarding the choice of ligands, lack of direct experimental comparison, and the use of TRAM, and clarify our rationale point by point:

      Ligand Choice: The ligands selected for this study were chosen due to their relevance and well-characterized binding properties. MDMB-FUBINACA is a well-known NPS ligand with documented binding properties. This ligand is still the only NPS ligand with an experimentally determined CB1-bound structure (Krishna Kumar, K. et al., Cell, 2019[9]). Similarly, the classical cannabinoid (HU-210) used in this study has established binding characteristics and is one of the earliest known synthetic classical cannabinoids. Therefore, these ligands serve as representative compounds within their respective categories, making them suitable for our comparative analysis.

      Experimental Comparison: We have indeed compared our simulation results to experimental data, particularly focusing on binding free energies. In the Results section, we have shown that the relative binding free energy estimated from our simulation aligns closely with the experimentally measured values. Additionally, absolute binding free energy estimates are also within ~3 kcal/mol of the experimentally predicted value.

      TRAM Performance: TRAM-estimated free energies and rates have been benchmarked against experimental predictions in various studies in addition to ours (Peptide-protein binding: Paul, F. et al., Nat. Commun., 2017[2]; Ligand unbinding: Wu, H. et al., PNAS, 2016[10]). As the primary goal of this study is to compare ligand unbinding mechanisms, we believe benchmarking against other datasets, such as the D.E. Shaw GPCR/ligand binding paper, is not essential for this work.
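      For readers who want to see how an experimental affinity maps onto the free energies being compared, the sketch below applies the textbook relation ΔG° = RT ln(Kd / c°) with a 1 M standard concentration. The dissociation constants are hypothetical placeholders chosen for round numbers; they are not the measured affinities of MDMB-FUBINACA or HU-210, and the response does not state which experimental quantity the authors converted.

```python
import math

R = 0.0019872041  # gas constant in kcal/(mol*K)
STANDARD_CONCENTRATION = 1.0  # mol/L


def binding_free_energy(kd_molar, temperature=298.15):
    # Standard binding free energy from a dissociation constant (in mol/L).
    return R * temperature * math.log(kd_molar / STANDARD_CONCENTRATION)


def relative_binding_free_energy(kd_a, kd_b, temperature=298.15):
    # Difference dG(A) - dG(B); positive means A binds more weakly than B.
    return (binding_free_energy(kd_a, temperature)
            - binding_free_energy(kd_b, temperature))


# Hypothetical affinities for illustration only: 1 nM versus 10 nM.
dg_tight = binding_free_energy(1e-9)
ddg = relative_binding_free_energy(1e-8, 1e-9)
```

      At room temperature a tenfold change in Kd corresponds to roughly 1.4 kcal/mol, which gives a sense of scale for the ~3 kcal/mol agreement quoted above.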

      (4) The method section of the manuscript seems to suggest all the simulations were started from a docked structure. This casts doubt on the reliability of the kinetics derived from these simulations that were spawned from docked structure, instead of any crystallographic pose. Ideally, the authors should have been more careful in choosing the ligands in this work based on the availability of the crystallographic structures. 

      We thank the reviewer for the comment. We would like to clarify that we indeed used an experimentally derived pose for one of the ligands (MDMB-FUBINACA) as the cryo-EM structure of MDMB-FUBINACA bound to the protein was available (PDB ID: 6N4B) (Krishna Kumar K. et al., Cell, 2019[9]). However, as the cryo-EM structure had missing loops, we modeled these regions using Rosetta. We apologize for this confusion and have modified our method section to make this point clearer. 

      Regarding HU-210, we acknowledge that a crystallographic or cryo-EM structure for this specific ligand was not available. We selected HU-210 because it is the most commonly used example of a classical cannabinoid in the literature, with extensively studied thermodynamic properties. Importantly, our docking results for HU-210 align closely with previously experimentally determined poses for other classical cannabinoids (Figure S11) and replicate key polar interactions, such as those with S383<sup>7.39</sup>, which are characteristic of this class of compounds.

      System Preparation (Page 22)

      “Modeling of this membrane proximal region was also performed using the Remodel protocol of Rosetta loop modeling. A distance constraint is added during this modeling step between C98<sup>N-term</sup> and C107<sup>N-term</sup> to create the disulfide bond between the residues.[74,76]

      As the cryo-EM structure of MDMB-FUBINACA was known, the ligand coordinates of MDMB-FUBINACA were added to the modeled PDB structure. The “Ligand Reader & Modeler” module of CHARMM-GUI was used for ligand (e.g., MDMB-FUBINACA) parameterization using the CHARMM General Force Field (CGenFF).[77]”

      (5) The last part of using a machine learning-based approach to analyze allosteric interaction seems to be very much forced, as there are numerous distance-based more traditional precedent analyses that do a fair job of identifying an allosteric job. 

      We thank the reviewer for the valuable comment. The neural relational inference (NRI) method, which leverages a variational autoencoder (VAE) architecture, attempts to reconstruct the conformation (X) at time t + τ from the conformation at time t. In doing so, it captures the non-linear dynamic correlations between residues in the VAE latent space. We chose this method because it does not rely on specific metrics such as distance or angle, making it potentially more robust in predicting allosteric effects between the binding pocket residues and the NPxxY motif.

      In response to the reviewer's suggestion, we have also performed a more traditional allosteric analysis by calculating the mutual information between the binding pocket residues and the NPxxY motif. Mutual information was computed based on the backbone dihedral angles, as this provides a metric that is independent of the relative distances between residues. Our results indicate that the mutual information between the binding pocket residues and the NPxxY motif is indeed higher for the NPS binding simulation (Figure S11).

      Method

      Mutual information calculation

      Mutual information was calculated on the same trajectory data as the NRI analysis. The Python package MDEntropy was used to estimate mutual information between the backbone dihedral angles of two residues.
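
      To make the quantity concrete, the following is a minimal sketch of a histogram (plug-in) estimate of mutual information between two dihedral-angle time series. It is an illustration only and is not the MDEntropy implementation used in the study; the function name, the bin count, and the choice of a plain histogram estimator are all assumptions.

```python
import math
from collections import Counter


def dihedral_mutual_information(angles_a, angles_b, n_bins=12):
    """Plug-in estimate of mutual information (in nats) between two equally
    long series of dihedral angles given in radians.

    Hypothetical helper for illustration; not part of MDEntropy.
    """
    if len(angles_a) != len(angles_b):
        raise ValueError("angle series must have the same length")
    n = len(angles_a)
    width = 2.0 * math.pi / n_bins

    def to_bin(angle):
        # Wrap into [0, 2*pi) so that the periodic boundary is handled.
        wrapped = angle % (2.0 * math.pi)
        return min(int(wrapped / width), n_bins - 1)

    bins_a = [to_bin(a) for a in angles_a]
    bins_b = [to_bin(b) for b in angles_b]
    count_a = Counter(bins_a)
    count_b = Counter(bins_b)
    count_ab = Counter(zip(bins_a, bins_b))

    mi = 0.0
    for (i, j), c in count_ab.items():
        # p_ij * log(p_ij / (p_i * p_j)), written in terms of raw counts
        mi += (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
    return mi
```

      With this estimator, two perfectly coupled dihedrals give a value equal to the entropy of the binned angle distribution, and a dihedral paired with a constant angle gives zero. Plug-in estimates are biased upward for short trajectories, so values are best compared between ensembles of similar length.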

      Results and Discussion (Page 21)

      “To further validate our observations, we estimated allosteric weights between the binding pocket and the NPxxY motif by calculating mutual information between residue movements. Mutual information analysis reaffirms that allosteric weights between these residues are indeed higher for the MDMB-FUBINACA bound ensemble (Figure S11).”

      Mutual Information Estimation (Page 37)

      “Mutual information between the dynamics of residue pairs was computed based on the backbone dihedral angles, as this provides a metric that is independent of the relative distances between residues. The calculations were done on the same trajectory data as the NRI analysis. The Python package MDEntropy was used to estimate mutual information between the backbone dihedral angles of two residues.[124]”

      (6) While getting busy with the methodological details of TRAM vs MSM, the manuscript fails to share with sufficient clarity what the distinctive features of two ligand binding mechanisms are. 

      We thank the reviewer for the insightful comment. In the manuscript, we discussed that the overall ligand (un)binding pathways are indeed similar for both ligands. Therefore, they interact with similar residues during the unbinding process. However, we have focused on two key differences in the unbinding mechanisms of the two ligands:

      (1) MDMB-FUBINACA exhibits two distinct unbinding mechanisms. In one, the linked portion of the ligand exits the receptor first. In the other mechanism, the ligand rotates within the pocket, allowing the tail portion to exit first. By contrast, for HU-210, we observe only a single unbinding mechanism, where the benzopyran ring leads the ligand out of the receptor. We have highlighted these differences in Figures 6 and 7 and discussed the intermediate states that appear along these different unbinding mechanisms. To further clarify these differences, we have added arrows to the free energy landscapes to highlight these distinct pathways.

      (2) In the bound state, a significant difference is observed in the interaction profiles. HU-210, a classical cannabinoid, forms strong polar interactions with TM7, while MDMB-FUBINACA shows weaker polar interactions with this region.

      We have discussed these differences in the Results and Discussion section (Pages 13-18) and the Conclusion section (Pages 23-24).

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors): 

      (1) The authors should choose at least one case where the ligand's crystallographic pose is known and show how TRAM works in comparison to MSM or experimental report. 

      We thank the reviewer for the comment. We have used the experimentally determined cryo-EM pose for one of the ligands (i.e., MDMB-FUBINACA). We have modified the manuscript to avoid confusion. (Please refer to the response to comment 4 of reviewer 2.)

      (2) The authors should consider existing traditional methods that are used to detect allostery and compare their machine-learning-based approach to show its relevance. 

      We appreciate the reviewer’s comment. We have performed the traditional analysis by calculating mutual information between residue dynamics, and we have shown that it agrees with the machine-learning-based NRI calculation. (Please refer to the response to comment 5 of reviewer 2.)

      (3) Figure 3 doesn't provide a guide on the pathway of ligand. Without a proper arrow, it is difficult to surmise what is the start and end of the pathway. The figures should be improved. 

      We appreciate the reviewer’s suggestion. In response, we have revised Figure 3 to clearly indicate the ligand’s unbinding pathway by adding directional arrows and labeling the bound pose. Additionally, we have updated the figure caption to better clarify the color scheme used in the illustration. 

      (4) The Figure 5 presentation of free energetics has a very similar shape for the two ligands. More clarity is required on how these two ligands are different. 

      We thank the reviewer for the comment. While the overall shapes of the free energy profiles for the two ligands are indeed similar, this is expected as both ligands dissociate from the same pocket and follow a comparable pathway. However, key differences in their unbinding mechanisms arise due to variations in the ligand motion within the pocket. Specifically, the intermediate metastable minima in the free energy landscapes reflect these differences. For instance, in the NPS unbinding free energy landscape, the intermediate metastable state I1 corresponds to a conformation where the NPS ligand maintains a polar interaction with TM7, while the tail of the ligand has shifted away from TM5. This intermediate state is absent in the classical cannabinoid unbinding pathway, where no equivalent conformation appears in the landscape.  

      (6) Page 30: TICA is wrongly expressed as 'Time-independent component analysis'. It is not a time-independent process. Rather it is 'Time structured independent component analysis'. 

      We thank the reviewer for pointing this out. TICA should be expressed as Time-lagged independent component analysis or Time-structure independent component analysis. We have used the first expression and modified the manuscript accordingly.  

      (7) The manuscript's MSM theory part is quite well-known which can be removed and appropriate papers can be cited. 

      We thank the reviewer for the comment. We have removed the theory discussion of MSM and cited relevant papers.

      “Markov State Model

      Markov state model (MSM) is used to estimate the thermodynamics and kinetics from the unbiased simulation.[56,91] MSM characterizes a dynamic process using the transition probability matrix and estimates its relevant thermodynamics and kinetic properties from the eigendecomposition of this matrix. This matrix is usually calculated using either maximum likelihood or Bayesian approach.[56,97] The prevalence of MSM as a post-processing technique for MD simulations was due to its reliance on only local equilibration of MD trajectories to predict the global equilibrium properties.[92,93] Hence, MSM can combine information from distinct short trajectories, which can only attain the local equilibrium.[94–96]  

      The following steps are taken for the practical implementation of the MSM from the MD data. [4,17,98–100]”
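
      As a concrete illustration of the passage quoted above, the sketch below estimates a transition probability matrix from a discretized trajectory and derives the stationary distribution and implied timescales from its eigendecomposition. It assumes a simple row-normalized count matrix (the non-reversible maximum-likelihood estimate) rather than the reversible or Bayesian estimators typically used in practice, and the function names are hypothetical.

```python
import numpy as np


def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from one discrete trajectory,
    counting transitions between frames separated by `lag`."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums


def stationary_and_timescales(T, lag):
    """Stationary distribution and implied timescales (in frames) of T."""
    eigvals, left_vecs = np.linalg.eig(T.T)  # columns are left eigenvectors of T
    order = np.argsort(-np.abs(eigvals))     # eigenvalue 1 comes first
    eigvals = eigvals[order]
    pi = left_vecs[:, order[0]].real
    pi = pi / pi.sum()
    # t_i = -lag / ln|lambda_i| for every eigenvalue below 1
    timescales = -lag / np.log(np.abs(eigvals[1:]))
    return pi, timescales
```

      The stationary distribution gives the thermodynamics (free energies follow from its negative logarithm, scaled by kT), while the implied timescales summarize the slow kinetics.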

      (8) A proper VAMP score-based analysis should be provided to show confidence in MSM's clustering metric and other hyperparameters. 

      We thank the reviewer for the recommendation. The VAMP-2 score-based analysis is discussed in the Methods section. We estimated the VAMP-2 scores of MSMs built with different numbers of clusters and input TIC dimensions (Figure S15). The model with the best VAMP-2 score was selected for comparison with the TRAM results.
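
      For orientation, a VAMP-2 score can be computed from time-lagged covariance matrices as sketched below. This is a generic illustration on continuous features and is not the authors' workflow: scoring an MSM would use one-hot state indicators as features and, usually, cross-validation. The function name and the eigenvalue cutoff are assumptions, and some implementations add 1 for the constant singular function, which is omitted here.

```python
import numpy as np


def vamp2_score(X, lag, eps=1e-10):
    """VAMP-2 score of a feature time series X (frames x features):
    squared Frobenius norm of C00^(-1/2) C0t Ctt^(-1/2)."""
    X = np.asarray(X, dtype=float)
    X0, Xt = X[:-lag], X[lag:]
    X0 = X0 - X0.mean(axis=0)  # mean-free instantaneous data
    Xt = Xt - Xt.mean(axis=0)  # mean-free time-lagged data
    n = len(X0)
    C00 = X0.T @ X0 / n
    Ctt = Xt.T @ Xt / n
    C0t = X0.T @ Xt / n

    def inv_sqrt(C):
        # Symmetric inverse square root, dropping near-zero eigenvalues.
        w, V = np.linalg.eigh(C)
        keep = w > eps
        return V[:, keep] @ np.diag(w[keep] ** -0.5) @ V[:, keep].T

    K = inv_sqrt(C00) @ C0t @ inv_sqrt(Ctt)
    return float(np.sum(K ** 2))
```

      Higher scores indicate that the chosen features or states capture more of the slow dynamics at the given lag time, which is why the score can be used to compare cluster numbers and TIC dimensions.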

    1. Reviewer #2 (Public review):

      Summary:

      This methods paper proposes two changes to classic RSA, a popular method to probe neural representation in neuroimaging experiments: computing RSA at row/column level of RDM, and using mixed linear modeling to compute second-level statistics, using the individual row/columns to estimate a random effect of stimulus. The benefit of the new method is demonstrated using simulations and a re-analysis of a prior fMRI dataset on object perception and memory encoding.
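
      To illustrate the distinction drawn in this summary, the sketch below contrasts a single matrix-level similarity between two RDMs with one similarity value per row (stimulus). It is a generic illustration, not the authors' tRSA code: the function names are hypothetical, Pearson correlation is used for simplicity where rank correlations are common in practice, and the mixed-model step is not shown.

```python
import numpy as np


def rdm(patterns):
    """Correlation-distance RDM (1 - Pearson r) from a stimuli x features array."""
    return 1.0 - np.corrcoef(patterns)


def classic_rsa(rdm_a, rdm_b):
    """One correlation computed over the upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    return float(np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1])


def rowwise_rsa(rdm_a, rdm_b):
    """One correlation per stimulus: row i of each RDM, diagonal excluded."""
    n = rdm_a.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    return np.array([
        np.corrcoef(rdm_a[i, off_diag[i]], rdm_b[i, off_diag[i]])[0, 1]
        for i in range(n)
    ])
```

      The row-wise values are what a second-level model with a random effect of stimulus could take as input. Note that each cell of a symmetric RDM contributes to two rows, so the row-wise values are not independent of one another.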

      Strengths:

      (1) The paper is clearly written and features clear illustrations of the proposed method.

      (2) The combination of simulation and real data works well, with the same factors being examined in both simulations and real data, resulting in a convincing demonstration of the benefits of tRSA in realistic experimental scenarios.

      (3) I find the authors' claim that tRSA is a promising approach not only to perform more complete modeling of cogneuro data, but also to conceptualize representation at the single trial/event level (cf Discussion section on P42), quite appealing.

      Weaknesses:

      (1) While I generally welcome the contribution (see above), I take some issue with the accusatory tone of the manuscript in the Introduction. The text there (using words such as 'ignored variances', 'erroneous inferences', 'one must', 'not well-suited', 'misleading') appears aimed at turning cRSA into a 'straw man' with many limitations that other researchers have not recognized but that the new proposed method supposedly resolves. This can be written in a more nuanced, constructive manner without accusing the numerous users of this popular method of ignorance.

      (2) The described limitations are also not entirely correct, in my view: for example, statistical inference in cRSA is not always done using classic parametric statistics such as t-tests (cf Figure 1): the rsatoolbox paper by Nili et al. (2014) outlines non-parametric alternatives based on permutation tests, bootstrapping and sign tests, which are commonly used in the field. Nor is it the case that RSA has never been conducted at the row/column level (here referred to by the authors as 'trial level'; cf King et al., 2019).

      (3) One of the advantages of cRSA is its simplicity. Adding linear mixed effects modeling to RSA introduces a host of additional 'analysis parameters' pertaining to the choice of the model setup (random effects, fixed effects, interactions, what error terms to use) - how should future users of tRSA navigate this?

      (4) Here, only a single real fMRI dataset is used with a quite complicated experimental design for the memory part; it's not clear if there is any benefit of using tRSA on a simpler real dataset. What's the benefit of tRSA in classic RSA datasets (e.g., Kriegeskorte et al., 2008), with fixed stimulus conditions and no behavior?

      (5) The cells of an RDM/RSM reflect pairwise comparisons between response patterns (typically a brain but can be any system; cf Sucholutsky et al., 2023). Because the response patterns are repeatedly compared, the cells of this matrix are not independent of one another. Does this raise issues with the validity of the linear mixed effects model? Does it assume the observations are linearly independent?

      (6) The manuscript assumes the reader is familiar with technical statistical terms such as Type I/II error, sensitivity, specificity, homoscedasticity assumptions, as well as linear mixed models (fixed effects, random effects, etc). I am concerned that this jargon makes the paper difficult to understand for a broad readership or even researchers currently using cRSA that might be interested in trying tRSA.

      (7) I could not find any statement on data availability or code availability. Given that the manuscript reuses prior data and proposes a new method, making data and code/tutorials openly available would greatly enhance the potential impact and utility for the community.

      References

      King, M. L., Groen, I. I., Steel, A., Kravitz, D. J., & Baker, C. I. (2019). Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. NeuroImage, 197, 368-382.

      Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., ... & Bandettini, P. A. (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126-1141.

      Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., & Kriegeskorte, N. (2014). A toolbox for representational similarity analysis. PLoS computational biology, 10(4), e1003553.

      Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu, A., Kim, B., ... & Griffiths, T. L. (2023). Getting aligned on representational alignment. arXiv preprint arXiv:2310.13018.

    1. eLife Assessment

      This is a useful tool for code-less analysis of patterns in cell migratory behaviours in vivo using intravital microscopy data and allows correlation with spatial features of the tumour microenvironment. There is a need for these tools to make quantitative analysis, comparison and interpretation of complex cell tracking data more accessible and evidence is provided of its applicability to tracks generated by both proprietary and open tracking software. However, it is incomplete due to limitations imposed by the assumptions that apply to the statistical tests employed.

    2. Reviewer #1 (Public review):

      Summary:

      Intravital microscopy (IVM) is a powerful tool that facilitates live imaging of individual cells over time in vivo in their native 3D tissue environment. Extracting and analysing multi-parametric data from IVM images, however, is challenging, particularly for researchers with limited programming and image analysis skills. In this work, Rios-Jimenez and Zomer et al have developed a 'zero-code' accessible computational framework (BEHAV3D-Tumour Profiler) designed to facilitate unbiased analysis of IVM data to investigate tumour cell dynamics (via the tool's central 'heterogeneity module') and their interactions with the tumour microenvironment (via the 'large-scale phenotyping' and 'small-scale phenotyping' modules). It is designed as an open-source modular Jupyter Notebook with a user-friendly graphical user interface and can be implemented with Google Colab, facilitating efficient, cloud-based computational analysis at no cost. Demo datasets are also available on the authors' GitHub repository to aid user training and enhance the usability of the developed pipeline.

      To demonstrate the utility of BEHAV3D-TP, they apply the pipeline to timelapse IVM imaging datasets to investigate the in vivo migratory behaviour of fluorescently labelled DMG cells in tumour bearing mice. Using the tool's 'heterogeneity module' they were able to identify distinct single-cell behavioural patterns (based on multiple parameters such as directionality, speed, displacement, distance from tumour edge), which were used to group cells into distinct categories (e.g. retreating, invasive, static, erratic). They next applied the framework's 'large-scale phenotyping' and 'small-scale phenotyping' modules to investigate whether the tumour microenvironment (TME) may influence the distinct migratory behaviours identified. To achieve this, they combine TME visualisation in vivo during IVM (using fluorescent probes to label distinct TME components) or ex vivo after IVM (by large-scale imaging of harvested, immunostained tumours) to correlate different tumour behavioural patterns with the composition of the TME. They conclude that this tool has helped reveal links between TME composition (e.g. degree of vascularisation, presence of tumour-associated macrophages) and the invasiveness and directionality of tumour cells, which would have been challenging to identify when analysing single kinetic parameters in isolation.

      The authors also evaluated the BEHAV3D TP heterogeneity module using available IVM datasets of distinct breast cancer cell lines transplanted in vivo, as well as healthy mammary epithelial cells to test its usability in non-tumour contexts where the migratory phenotypes of cells may be more subtle. This generated data is consistent with that produced during the original studies, as well as providing some additional (albeit preliminary) insights above that previously reported. Collectively, this provides some confidence in BEHAV3D TP's ability to uncover complex, multi-parametric cellular behaviours that may be missed using traditional approaches.

      Overall, this computational framework appears to represent a useful and comparatively user-friendly tool to analyse dynamic multi-parametric data to help identify patterns in cell migratory behaviours, and to assess whether these behaviours might be influenced by neighbouring cells and structures in their microenvironment. When combined with other methods, it therefore has the potential to be a valuable addition to a researcher's IVM analysis 'tool-box'.

      Strengths:

      - Figures are clearly presented, and the manuscript is easy to follow.
      - The pipeline appears to be intuitive and user-friendly for researchers with limited computational expertise. A detailed step-by-step video and demo datasets are also included to support its uptake.
      - The different computational modules have been tested using relevant datasets, including imaging data of normal and tumour cells in vivo.
      - All code is open source, and the pipeline can be implemented with Google Colab.
      - The tool combines multiple dynamic parameters extracted from timelapse IVM images to identify single-cell behavioural patterns and to cluster cells into distinct groups sharing similar behaviours, and provides avenues to map these onto in vivo or ex vivo imaging data of the tumour microenvironment.

      Weaknesses:

      - The tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence and displacement) from intravital images. To use the tool, researchers must first extract dynamic cellular parameters from their IVM datasets using other software, including Imaris, which is expensive and therefore not available to all. Nonetheless, the authors have developed their tool to facilitate the integration of other data formats generated by open-source Fiji plugins (e.g. TrackMate, MTrackJ, ManualTracking), which will help ensure its accessibility to a broader range of researchers.
      - The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. The authors acknowledge this, however, and conclusions are appropriately tempered in the absence of additional experiments and controls.

    3. Author response:

      The following is the authors’ response to the original reviews

      We thank the reviewers for their positive and constructive comments on the manuscript. In the revised manuscript we addressed these comments, which we believe have improved the quality of our work.

      In summary:

      (1) We acknowledge the reviewer's suggestion to incorporate open-source segmentation and tracking functionalities, increasing its accessibility to a wider user base; however, these additions fall outside the primary scope of our current work, which is to provide an analytical framework for IVM data after segmentation and tracking. Developing open-source segmentation and tracking tools represents a substantial undertaking in its own right, which has been comprehensively explored in other studies (e.g. https://doi.org/10.4049/jimmunol.2100811; https://doi.org/10.7554/eLife.60547; https://doi.org/10.1016/j.media.2022.102358; https://doi.org/10.1038/s41592-024-02295-6 - now cited in our revised manuscript).

      In our analyses, we used data processed with Imaris, a commercial software that, despite its limitations, is widely used by the intravital microscopy community due to its user-friendly platform for 3D image visualization and analysis. Nevertheless, recognizing the need for compatibility with tracking data from various pipelines, we have modified our tool to accept other data formats, such as those generated by open-source Fiji plugins like TrackMate, MTrackJ, ManualTracking (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input). These updates are available in our GitHub repository and are described in the revised manuscript. 

      (2) We appreciate reviewer #3's suggestion to incorporate additional features into our analytical pipeline. In response, we have already updated the GitHub repository to allow users to input and select which features (dynamic, morphological, or spatial) they wish to include in the analysis (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#feature-selection). In the revised manuscript, we highlighted this new functionality and provided examples using alternative datasets to demonstrate the application of these features.

      (3)  We appreciate the constructive feedback of reviewers #1 and #2 regarding the statistical analysis and interpretation of the data presented in Figures 3 and 4. We understand the importance of clarity and rigor in data analysis and presentation, and we addressed the concerns raised in the revised version of the manuscript.

      (4) We appreciate reviewer #1's suggestion regarding the inclusion of demo data, as we believe it would greatly enhance the usability of our pipeline. We acknowledge that this was an oversight on our part. To address this, we have now added demos to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0/demo_datasets). In the revised manuscript, we referenced this addition and present new figures with examples of these demos processing different IVM datasets (2D/3D, different tumors and healthy tissues). Additionally, we have provided processed DMG IVM movie samples in an imaging repository.

      (5) Finally, we made some small changes to the manuscript based on the reviewers’ feedback.

      Below we provide a point-by-point response to the reviewers’ comments.

      Reviewer #1 (Public review):

      Comment #1: A key limitation of the pipeline is that it does not overcome the main challenges and bottlenecks associated with processing and extracting quantitative cellular data from timelapse and longitudinal intravital images. This includes correcting breathing-induced movement artifacts, automated registration of longitudinal images taken over days/weeks, and accurate, automated segmentation and tracking of individual cells over time. Indeed, there are currently no standardised computational methods available for IVM data processing and analysis, with most laboratories relying on custom-built solutions or manual methods. This isn't made explicit in the manuscript early on (described below), and the researchers rely on expensive software packages such as IMARIS for image processing and data extraction to feed the required parameters into their pipeline. This limitation unfortunately reduces the likely impact of BEHAV3D-TP on the IVM field. 

      As highlighted above, the tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence, and displacement) from intravital images. Indeed, to use the tool researchers must first extract dynamic cellular parameters from their IVM datasets, requiring access to expensive software (e.g. IMARIS as used here) and/or above-average computational expertise to develop and use custom-made open-source solutions. This limitation is not made explicit or discussed in the text.

      We acknowledge the reviewer's suggestion to incorporate open-source segmentation and tracking functionalities, increasing its accessibility to a wider user base; however, these additions fall outside the primary scope of our current work and represent a substantial undertaking in their own right. Several studies (e.g., Diego Ulisse Pizzagalli et al., J Immunol (2022); Aby Joseph et al., eLife (2020); Molina-Moreno et al., Medical Image Analysis (2022); Hidalgo-Cenalmor et al., Nat Methods (2024); Ershov et al., Nat Methods (2022)) have comprehensively addressed these topics, and we now reference them in the revised manuscript to provide readers with relevant background.

      The objective of our manuscript is not to develop a complete segmentation or tracking pipeline but rather to introduce an analytical framework capable of extracting enhanced insights from the data generated by existing tools. This goal arises from our observations of the field: despite significant investment in image processing, researchers often rely on simplistic approaches, such as averaging single parameters across conditions, which can obscure tumor heterogeneity and spatial behavioral dynamics within the tumor microenvironment.

      Our current tool focuses on providing this much-needed analytical capability. For our analysis we used Imaris, a widely utilized software in the intravital microscopy (IVM) community, known for its intuitive 3D visualization and analysis platform despite certain limitations. 

      In our own literature search of recent IVM studies published by leading laboratories in high-impact journals, we found that close to half used Imaris, while the remainder primarily relied on manual workflows with Fiji plugins. Thus, we consider it valuable to offer a pipeline compatible with such commonly used software, given its prevalence in the field.

      However, following the suggestion of the reviewer, and to enhance the tool’s flexibility and compatibility, we have expanded the pipeline to accept data formats generated by open-source Fiji plugins, such as TrackMate, MTrackJ, and ManualTracking. These updates are detailed in the revised manuscript and are implemented in our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input), where we also provide several demos using TrackMate and Imaris processed data. This addition demonstrates our tool's capability to integrate with segmented and tracked datasets from diverse platforms, increasing its applicability to a broader range of researchers using both commercial and open-source pipelines.
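
      For readers unfamiliar with the kinetic parameters discussed in this exchange, the sketch below derives a few of them from a single tracked cell. It is a generic illustration and not the BEHAV3D-TP input schema or code: the function name and the returned keys are hypothetical, and "persistence" is taken here as net displacement divided by path length, which is only one of several definitions in use.

```python
import math


def track_features(positions, dt):
    """Per-track kinetic features from a list of (x, y, z) positions
    sampled at a constant time interval dt."""
    if len(positions) < 2:
        raise ValueError("need at least two positions")
    # Distance travelled between consecutive time points
    steps = [math.dist(p, q) for p, q in zip(positions[:-1], positions[1:])]
    path_length = sum(steps)
    displacement = math.dist(positions[0], positions[-1])  # start to end
    duration = dt * (len(positions) - 1)
    return {
        "mean_speed": path_length / duration,
        "displacement": displacement,
        "persistence": displacement / path_length if path_length > 0 else 0.0,
    }
```

      Tracking tools such as Imaris or TrackMate export per-track positions from which features like these are derived; a multi-parametric analysis then clusters cells on several such features jointly rather than averaging one of them per condition.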

      Comment #2: The number of cells (e.g. per behavioural cluster), and the number of independent mice, represented in each result figure, is not included in the figure legends and are difficult to ascertain from the methods.

      We appreciate the reviewer's constructive feedback regarding the clarity of the number and type of replicates used in our analyses. In the revised manuscript, we have included the number of independent mice represented in each figure in the corresponding figure legend to ensure transparency. Regarding the number of cells, we have indicated the total number of processed cells in the Figure 2b legend (953 cells). Additionally, we have now included figures (Sup Fig 4c, Sup Fig 5e-g, Fig 5c,e, Sup Fig 6c,d) for each cluster, where individual dots represent the individual cell tracks, with color indicating the position and shape indicating individual mice.

      Comment #3: The data used to test the pipeline in this manuscript is currently not available, making it difficult to assess its usability. It would be important to include this for researchers to use as a 'training dataset'.

      As stated above, we acknowledge that this was an oversight on our part and thank the reviewer for pointing it out. To address this, we have now added demo data to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/main/demo_datasets). In the revised manuscript, we have referenced this addition in the Data availability section. Since we now also support data processed with Fiji, we provide four demo datasets: one processed with Imaris in 3D, one with CellPose2.0 and TrackMate in 2D, one with µSAM and TrackMate in 3D, and one manually processed with MTrackJ in 2D. Moreover, we now provide Imaris-processed DMG IVM movie samples in an open-source repository.

      Comment #4: Precisely how the BEHAV3D-TP large-scale phenotyping module can map large-scale spatial phenotyping data generated using LSR-3D imaging data and Cytomap to 3D intravital imaging movies is unclear. Further details in the text and methods would be beneficial to aid understanding.

      We appreciate the reviewer’s comment and in the revised manuscript we have now provided details in the methods section “Tumor large-scale spatial phenotyping with Cytomap” to clarify how the BEHAV3D-TP module maps LSR-3D and Cytomap data to 3D intravital imaging movies:

      “To map the assigned regions onto IVM movies, a 3D image of the cluster distribution within the tumor was generated and exported for each sample (Figure Supplement 5a). Next, regions within the IVM movies were visually matched to the corresponding regions identified by the Large-Scale Phenotyping module of Cytomap (Figure 3c). For each mouse, at least one or two representative positions per matched region type were selected, cropped, and analyzed to assess tumor cell behavior, following the previously described cell tracking methodology (Imaris Cell tracking).”

      Moreover, we updated Figure 3c to further clarify these steps.

      Comment #5: The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. Conclusions should therefore be tempered in the absence of additional experiments and controls. 

      We appreciate the reviewer’s comment and acknowledge that our conclusions should be tempered due to the preliminary nature of our evidence. In the revised version of the manuscript we have revised our conclusions accordingly and emphasize the necessity for additional experiments and controls to further validate our findings on DMG cell migratory behaviors and their relationship with the tumor microenvironment.

      In discussion: “While our findings suggest that microenvironmental factors may influence tumor cell migration, further studies will be necessary to establish causal relationships. Additional experimental validation, such as macrophage ablation experiments, could help clarify the specific contributions of these factors.”

      Reviewer #1 (Recommendations for the authors): 

      (1) To test the ability of the pipeline to identify relevant patterns of migratory behaviours additional 'control' experiments would be helpful e.g. comparing non-invasive vs invasive tumour cell lines, artificially controlling migratory behaviours of cells such as implanting beads soaked in factors that would attract/repel cells? 

      (2) Does the pipeline work well for a variety of cell types/contexts? e.g. can it identify and cluster more subtle migratory behaviours such as non-tumour cells during tissue development or regeneration conditions? 

      We appreciate the reviewer’s valuable suggestions. In the revised manuscript, we have included additional examples demonstrating the capability of our pipeline to investigate heterogeneous cell behavior across two additional experimental setups:

      (1) We have now evaluated our BEHAV3D TP heterogeneity module using IVM data from breast cancer cell lines with varying migratory capacities (DOI: 10.1016/j.yexcr.2019.04.009). In these datasets, our pipeline extends beyond predefined characteristics based solely on speed, enabling the identification of distinct cell populations. Notably, our analysis reveals that the breast cancer lines exhibit different proportions of different migratory behaviors such as Fast, Intermediate, Very slow and Static (Supplementary Fig 1).

      (2) We have now evaluated our BEHAV3D TP heterogeneity module using IVM data from healthy breast epithelial cells (DOI: 10.1016/j.celrep.2024.115073), where we identify distinct morphodynamic epithelial cell populations in the terminal end bud of the mammary gland that have a distinct distribution among hormone receptor (HR)+ and HR- terminal end bud cells.

      (3) To support biological conclusions could the authors show that ablating tumour-associated macrophages or vasculature alters the migratory patterns of nearby tumour cells? 

      We appreciate the reviewer's suggestion regarding the potential effects of ablating tumor-associated macrophages or vasculature on the migratory patterns of nearby tumor cells. While these experiments would functionally validate the observations made by our method, we would like to clarify that the primary focus of our study was on the development and application of computational tools for behavioral analysis and thus we consider that delving deeper in understanding the biology behind our observation is out of the scope of the current study. However, as mentioned previously, we have carefully tempered our conclusions to acknowledge the limitations of our current study. In the revised manuscript, we explicitly highlight that experiments involving the ablation of tumor-associated macrophages or vasculature would be crucial for further understanding the biological relevance of our findings.

      Minor corrections to text: 

      (4) Line 63 - are references formatted correctly?

      Thank you for pointing out this error. We have corrected it in the revised manuscript.

      (5) Lines 161 -162 - 'intravitally imaged' used twice in a sentence.

      Thank you for pointing out the typo. We have corrected it in the revised manuscript.

      Reviewer #2 (Public review):

      Comment#1: The strength of democratizing this kind of analysis is undercut by the reliance upon Imaris for segmentation, so it would be nice if this was changed to an open-source option for track generation.

      As noted in our previous response to Reviewer #1, we would like to point out that although Imaris is a commercial software, it is widely used in the intravital microscopy community due to its user-friendly interface. We conducted a literature review to evaluate this aspect and below we include references from leading laboratories in the IVM field that utilize Imaris. One of its key advantages, which we also utilized, is semi-automated data tracking that allows for manual corrections in 3D—a process that can be more challenging in other open-source software with less effective data visualization.

      However, we recognize that enhancing our pipeline's compatibility with open-source options is important. To this end, we have updated our tool to support 2D and 3D data formats generated by open-source Fiji plugins like TrackMate, MTrackJ, and ManualTracking, improving compatibility with various segmentation and tracking pipelines (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). In the revised manuscript, we describe the new functionality and demonstrate the operation of the BEHAV3D-TP heterogeneity module across various IVM datasets, processed in both 2D and 3D with different processing pipelines (Supplementary Fig 1-3). This includes CellPose 2.0 and the novel 'Segment Anything' model, followed by TrackMate tracking, applied to both tumor and healthy IVM data. Moreover we have developed a new web application that integrates morphological and tracking information from Segment Anything segmentation and Trackmate tracking, depicted in Supplementary Fig 3 a (https://morphotrack-merger.streamlit.app/ ). Additionally, we have updated the introduction to better clarify the scope of our study and include references to existing image processing solutions.

      Comment#2: The main issue is with the interpretation of the biological data in Figure 3 where ANOVA was used to analyse the proportional distribution of different clusters. Firstly the n is not listed so it is unclear if this represents an n of 3 where each mouse is an individual or whether each track is being treated as a test unit. If the latter this is seriously flawed as these tracks can't be treated as independent. Also, a more appropriate test would be something like a Chi-squared test or Fisher's exact test. Also, no error bars are included on the stacked bar graphs making interpretation impossible. Ultimately this is severely flawed and also appears to show very small differences which may be statistically different but may not represent biologically important findings. This would need further study.

      We appreciate the reviewer’s insightful comments regarding the interpretation of the biological data in Figure 3. 

      To clarify, each imaged position is considered an independent biological replicate (n = 18 from a total of 6 mice). We acknowledge that the description of the statistical methods and the experimental units was not sufficiently clear in the previous version. In our original submission, we used an ANOVA to test whether the proportion of each behavioral cluster differed across the tumor microenvironment regions. Post hoc pairwise comparisons were performed using Tukey’s test, with the results shown in Supplementary Figure 2d (currently Fig 3d). However, we agree with the reviewer that this approach may be misleading when paired with stacked bar plots that lack error bars, as it can obscure individual variability and does not explicitly represent statistical uncertainty.

      In the revised manuscript, we present the data as boxplots with individual data points, where each dot represents an imaged position, and the shape corresponds to a specific mouse. In Figure 3 d the y-axis displays the normalized percentage of each cluster across TME regions, expressed as z-scores. This normalization corrects for inter-mouse variability and facilitates a comparison of the relative distribution of clusters across TME regions, independent of the overall abundance differences between mice. We performed an ANOVA with Tukey's post hoc test for each individual behavioral cluster to assess differences across TME regions. Additionally, for transparency, in Supplementary Figure 5 d we provide the raw percentage values. The legends provide the number of positions and mice included in the analysis. 
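The normalization and test described above can be illustrated with a minimal sketch. It assumes one record per imaged position for a single behavioral cluster, with hypothetical field names (`mouse`, `region`, `pct`) and made-up values; it is not the authors' code. Percentages are z-scored within each mouse, and a one-way ANOVA F statistic is then computed across regions. The Tukey post hoc step is left out because it needs the studentized range distribution from a statistics package.

```python
from statistics import mean, stdev

def zscore_within_mouse(records):
    """Standardize 'pct' within each mouse so that baseline differences
    between mice do not dominate the comparison across regions."""
    by_mouse = {}
    for r in records:
        by_mouse.setdefault(r["mouse"], []).append(r["pct"])
    out = []
    for r in records:
        vals = by_mouse[r["mouse"]]
        mu = mean(vals)
        sd = stdev(vals) if len(vals) > 1 else 0.0
        z = (r["pct"] - mu) / sd if sd > 0 else 0.0
        out.append({**r, "z": z})
    return out

def group_by_region(records, key):
    """Collect the values stored under `key`, grouped by TME region."""
    groups = {}
    for r in records:
        groups.setdefault(r["region"], []).append(r[key])
    return groups

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of region -> values."""
    all_vals = [v for g in groups.values() for v in g]
    grand = mean(all_vals)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_b = len(groups) - 1
    df_w = len(all_vals) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# Hypothetical data: two mice with different baselines, three regions each.
records = [
    {"mouse": "A", "region": "r1", "pct": 10.0},
    {"mouse": "A", "region": "r2", "pct": 20.0},
    {"mouse": "A", "region": "r3", "pct": 30.0},
    {"mouse": "B", "region": "r1", "pct": 40.0},
    {"mouse": "B", "region": "r2", "pct": 55.0},
    {"mouse": "B", "region": "r3", "pct": 60.0},
]

z_records = zscore_within_mouse(records)
f_raw, _, _ = one_way_anova_f(group_by_region(records, "pct"))
f_z, df_b, df_w = one_way_anova_f(group_by_region(z_records, "z"))
print(f"F on raw percentages: {f_raw:.2f}; F on per-mouse z-scores: {f_z:.2f}")
```

In this toy example the regional trend is the same in both mice but sits on different baselines, so the F statistic is much larger after per-mouse z-scoring than on the raw percentages.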

      Comment#3:  Figure 4 has similar statistical issues in that the n is not listed and, again, it is unclear whether they are treating each cell track as independent which, again, would be inappropriate. The best practice for this type of data would be the use of super plots as outlined in Lord et al. (2020) JCI - SuperPlots: Communicating reproducibility and variability in cell biology.

      We appreciate the reviewer’s comments and suggestions regarding Figure 4. In this case, because we are comparing the overall features of the behavioral clusters, each individual cell is treated as a unit. In the revised manuscript, we have clarified this point in the figure legend and incorporated plots in Figure 4c and 4e, indicating the mouse and imaging position each data point originates from. This enhances the visualization of reproducibility and variability in our data, demonstrating that the results are consistent across multiple mice and positions and are not driven by a single mouse or imaging position.

      Comment#4: The main issue that this raises is that the large-scale phenotyping module and the heterogeneity module appear designed to produce these statistical analyses that are used in these figures and, if they are based on the assumption that each track is independent, then this will produce inappropriate analyses as a default.

      We appreciate the reviewer’s comment, although we are unclear about the specific concern being raised. To clarify, in our large-scale phenotyping analysis, each position is assigned to a TME niche based on the CytoMAP analysis and the workflow outlined in Figure 3c. Multiple positions are imaged per mouse. For each position, we measure the proportion of tumor cells exhibiting a specific behavioral phenotype, and these proportions are subsequently used for statistical analysis (Figure 3 d). 
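As a rough illustration of using the imaged position, rather than the individual track, as the unit of analysis, the sketch below collapses per-track cluster labels into one proportion per position. The tuple layout `(mouse, position, cluster)` and the example labels are assumptions made for illustration, not the tool's actual data format.

```python
from collections import Counter, defaultdict

def cluster_proportions(tracks):
    """Collapse per-track cluster labels into per-position proportions.

    tracks: iterable of (mouse, position, cluster) tuples, one per cell track.
    Returns {(mouse, position): {cluster: fraction of tracks at that position}}.
    """
    counts = defaultdict(Counter)
    for mouse, position, cluster in tracks:
        counts[(mouse, position)][cluster] += 1
    proportions = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        proportions[key] = {cluster: n / total for cluster, n in ctr.items()}
    return proportions

# Hypothetical tracks: two positions in mouse m1, one position in mouse m2.
tracks = [
    ("m1", "pos1", "invading"),
    ("m1", "pos1", "static"),
    ("m1", "pos1", "static"),
    ("m1", "pos1", "retreating"),
    ("m1", "pos2", "invading"),
    ("m1", "pos2", "invading"),
    ("m2", "pos1", "static"),
    ("m2", "pos1", "erratic"),
]

props = cluster_proportions(tracks)
for key in sorted(props):
    print(key, props[key])
```

Each position then contributes a single value per cluster to the downstream test, regardless of how many tracks were recorded there.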

      In contrast, in Supplementary Fig. 5e-g, we treat each cell track as an individual unit, grouping them by their assigned large-scale region. Here, we assess whether differences between regions can be detected using a conventional single-feature analysis—a more traditional approach. However, we find that this method loses important behavioral patterns and distinctions that BEHAV3D-TP captures.

      We hope that this explanation, along with the modifications made to the figures and figure legends, provides greater clarity.  

      Reviewer #3 (Public review):

      Comment #1: The most challenging task of analyzing 3D time-lapse imaging data is to accurately segment and track the individual cells in 3D over a long time duration. BEHAV3D Tumor Profiler did not provide any new advancement in this regard, and instead relies on commercial software, Imaris, for this critical step. Imaris is known to have a very high error rate when used for analyzing 3D time-lapse data. In the Methods section, the authors themselves stated that "Tumor cell tracks were manually corrected to ensure accurate tracking". Based on our own experience of using Imaris, such manual correction is tedious and often required for every time step of the movie. Therefore, Imaris is not a satisfactory tool for analyzing 3D time-lapse data. Moreover, Imaris is expensive and many research labs probably can't afford to buy it. The fact that BEHAV3D Tumor Profiler critically depends on the faulty ImarisTrack module makes it unclear whether the BEHAV3D tool or the results are reliable.

      If the authors want to "democratize the analysis of heterogeneous cancer cell behaviors", they should perform image segmentation and tracking using open-source codes (e.g., Cellpose, Stardisk & 3DCellTracker) and not rely on the expensive and inaccurate ImarisTrack Module for the image analysis step of BEHAV3D.

      We appreciate the reviewer’s comments on the challenges of segmenting and tracking individual cells in 3D time-lapse imaging data. As mentioned previously (please refer to comment #1 to reviewer #1), our primary focus is to develop an analytical tool for comprehensive data analysis rather than tools for image processing. However, to enhance accessibility, we have updated our tool to support data formats from open-source Fiji plugins, such as TrackMate, which will benefit users without access to commercial software (https://github.com/imAIgeneDream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). In Supplementary Figures 1, 2, and 3, we present IVM data from different sources, processed using three distinct methods: MTrackJ (Supplementary Fig. 1), Cellpose + TrackMate (Supplementary Fig. 2), and µSAM + TrackMate (Supplementary Fig. 3). The latter two represent state-of-the-art deep learning approaches.

      On the other hand, while we recognize the limitations of Imaris, it remains widely used in the intravital microscopy community due to its user-friendly interface for 3D visualization and semi-automated segmentation capabilities. Since no perfect tracking method currently exists, we initially utilized Imaris for its ability to allow manual correction of faulty tracks, ensuring the reliability of our results. This approach was not only widely used (see above) but was also the best available option when we began our analysis, allowing us to obtain accurate results efficiently.

      In the revised manuscript, we clarify the scope of our study and provide information on both Imaris and alternative processing options to strengthen the reliability of our findings:

      In introduction: “While significant efforts have been made to develop open-source segmentation and tracking tools for live imaging data, including IVM22–27, fewer tools exist for the unbiased analysis of tumor dynamics. One major barrier is that implementing such analytical methods often requires substantial computational expertise, limiting accessibility for many biomedical researchers conducting IVM experiments. To bridge this gap, we present BEHAV3D Tumor Profiler (BEHAV3D-TP) by providing a robust, user-friendly tool that allows researchers to extract meaningful insights from dynamic cellular behaviors without requiring advanced programming skills.”

      In the Methods, we now describe not only the Imaris processing pipeline but also the µSAM segmentation pipeline, and we reference CellPose IVM processing; both are combined with TrackMate for tracking. Additionally, to integrate morphological information from µSAM with tracking data from TrackMate, we developed a web tool to merge the outputs from both processing steps: https://morphotrack-merger.streamlit.app/  

      Comment #2: The authors developed a "Heterogeneity module" to extract distinctive tumor migratory phenotypes from the cell tracks quantified by Imaris. The cell tracks of the individual tumor cells are all quite short, indicating relatively low motility of the tumor cells. It's unclear whether such short migratory tracks are sufficient to warrant the PCA analysis to identify the 7 distinctive migratory phenotypes shown in Figure 2d. It's also unclear whether these 7 migratory phenotypes correspond to unique functional phenotypes.  

      For the 7 distinctive motility clusters, the authors should provide a more detailed analysis of the differences between them. It's unclear whether the difference in retreating, slow retreating, erratic, static, slow, slow invading, and invading correspond to functional difference of the tumor cells.

      While some tumor cells exhibit limited motility, indicated by short tracks, others demonstrate significant migratory capabilities (Figure 2 Invading and Retreating cells). This variability in tumor cell behavior is a central focus of our analysis, and our tool is specifically designed to identify and distinguish these differences. Our PCA analysis effectively captures this variability, as illustrated in Figure 2 d-f. It differentiates between cells exhibiting varying degrees of migratory behavior, including both highly and less migratory phenotypes, as well as their directionality relative to the tumor core and the persistence of their movements. Thus, we believe that our approach provides valuable insights into the distinct migratory phenotypes within the tumor microenvironment. 

      While our current manuscript does not provide explicit evidence linking each motility cluster to functional differences among the tumor cells, it is important to note that the state of the field supports the idea that cell dynamics can predict cell states and phenotypes. Research conducted by ourselves (Dekkers, Alieva et al., Nat Biotech, 2023) and others, such as Crainiciuc et al. (Nature, 2022) and Freckmann et al. (Nat Comm, 2022), has shown that variations in cell motility patterns are indicative of underlying functional characteristics. For instance, cell morphodynamic features have been shown to reflect differences in cell types, T cell targeting states (Dekkers, Alieva et al., Nat Biotech, 2023), immune cell types (Crainiciuc et al., Nature, 2022), tumor metastatic potential, and drug resistance states (Freckmann et al., Nat Comm, 2022). In the revised manuscript, we have referenced relevant studies to underscore the biological significance of these behaviors. By doing so, we hope to clarify the potential implications of our findings and strengthen the overall narrative of our research:

      In discussion: “While our current study does not provide direct functional validation of the distinct motility clusters identified, existing literature strongly supports the notion that cell dynamics can serve as a proxy for functional states and phenotypic heterogeneity. Prior work, including studies by our group[19,66]  as well as Crainiciuc et al.[35] and Freckmann et al.[20], has demonstrated that variations in cell motility patterns can reflect underlying functional characteristics. Specifically, cell morpho-dynamic features have been shown to correlate with differences in cell type identity, T-cell engagement, metastatic potential, and drug resistance states. This growing body of evidence suggests that tumor cell behavior, as captured by BEHAV3D-TP, may serve as a predictive tool for deciphering functional tumor heterogeneity. Future studies integrating transcriptomic or proteomic profiling of motility-defined subpopulations could further elucidate the biological significance of these behavioral phenotypes.”

      Comment #3: Using only motility to classify tumor cell behaviours in the tumor microenvironment (TME) is probably not sufficient to capture the tumor cell difference. There are also other non-tumor cell types in the TME. If the authors aim to develop a computational tool that can elucidate tumor cell behaviors in the TME, they should consider other tumor cell features, e.g., morphology, proliferation state, and tumor cell interaction with other cell types, e.g., fibroblasts and distinct immune cells.

      The authors should expand the scale of tumor behavior features to classify the tumor phenotype clusters, e.g., to include tumor morphology, proliferation state, and tumor cell interaction with other TME cell types.

      We believe that using dynamic features alone is sufficient to capture differences in tumor behavior, as demonstrated by our results in Figure 2. However, we appreciate the reviewer’s suggestion to consider additional features, such as cell morphology, to fine-tune our analyses. To this end, we have adapted our pipeline to be compatible with any dynamic, morphologic or spatial features present in the data. In the revised manuscript we showcase this new addition with the analyses of two new datasets: 2D IVM data from healthy epithelial breast cells (Supplementary Fig 2) and 3D IVM data from adult gliomas (Supplementary Fig 3). These analyses identified cells with specific morphodynamic characteristics, which exhibited distinct kinetic behaviors or spatial distributions.

      However, we would like to point out that not all features may provide informative insights and that a wide range of features can instead introduce biologically irrelevant noise, making interpretation more challenging. For instance, in 3D microscopy, the z-axis resolution is typically lower, which can lead to artifacts like elongation in that direction. Adding morphological features that capture this may skew the analysis. Therefore, we believe that incorporating additional features should be approached with caution. We clarify these considerations in the revised manuscript to better guide users in utilizing our computational tool effectively:

      In discussion: “In addition to motility-based classification, features such as tumor cell morphology, proliferation state, and interactions with the tumor microenvironment can further refine tumor phenotyping. BEHAV3D-TP allows for the selection of diverse feature types, supporting datasets that include both dynamic, morphological and spatial parameters. However, we recognize that expanding the feature set may introduce biologically irrelevant noise, particularly in 3D microscopy data where limited z-axis resolution can lead to morphological artifacts. This highlights the potential need in the future to include unbiased feature selection strategies, such as bootstrapping methods67, to ensure the identification of meaningful and biologically relevant parameters. Careful consideration of these aspects is key to maximizing the interpretability and predictive value of analyses performed with BEHAV3D-TP.”

      Comment #4: The authors have already published two papers on BEHAV3D [Alieva M et al. Nat Protoc. 2024 Jul;19(7): 2052-2084; Dekkers JF, et al. Nat Biotechnol. 2023 Jan;41(1):60-69]. Although the previous two papers used BEHAV3D to analyze T cells, the basic pipeline and computational steps are similar, in particular regarding cell segmentation and tracking. The addition of a "Heterogeneity module" based on PCA analysis does not make a significant advancement in terms of image analysis and quantification.

      We want to emphasize that we have no intention of duplicating our previous publications. In this manuscript, we have consistently cited our foundational papers, where BEHAV3D was first developed for T cell migratory analysis in in vitro settings. In the introduction, we clearly state that our earlier work inspired us to adopt a similar approach for analyzing cell behavior in intravital microscopy (IVM) data, addressing the specific needs and complexities of analyzing tumor cell behaviors in the tumor microenvironment.

      Importantly, our new work provides several key advancements: 1) a pipeline specifically adapted for intravital microscopy (IVM) data; 2) integration of spatial characteristics from both large-scale and small-scale phenotyping; and 3) a zero-code approach designed to empower researchers without coding skills to effectively utilize the tool. We believe that these enhancements represent meaningful progress in the analysis of cell behaviors within the tumor microenvironment which will be valuable for the IVM community. We ensure that these points are clearly articulated in the revised manuscript:

      In introduction: “In line with this concept of characterizing cellular dynamic properties for cell classification, we have previously developed an analytical platform termed BEHAV3D 19,21 allowing to perform behavioral phenotyping of engineered T cells targeting cancer. While BEHAV3D was initially developed to analyze T cell migratory behavior under controlled in vitro conditions, we sought to expand its application to investigate tumor cell behaviors in IVM data, where the complexity of the TME presents distinct analytical challenges. This manuscript builds on our foundational work but represents a significant advancement by adapting the pipeline specifically for IVM datasets.”

      Reviewer #3 (Recommendations for the authors): 

      (1) If the authors want to "democratize the analysis of heterogeneous cancer cell behaviors", they should perform image segmentation and tracking using open-source codes (e.g., Cellpose, Stardisk & 3DCellTracker) and not rely on the expensive and inaccurate ImarisTrack Module for the image analysis step of BEHAV3D. 

      We thank the reviewer for this recommendation and as stated above we recognize that enhancing our pipeline's compatibility with open-source options is important. To this end, we have updated our tool to support data formats generated by open-source Fiji plugins like TrackMate, MTrackJ, and ManualTracking, improving compatibility with various segmentation and tracking pipelines (https://github.com/imAIgeneDream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). In the revised manuscript, we detail this new functionality and demonstrate the operation of the BEHAV3D-TP heterogeneity module using an example dataset of glioma tumors.

      Additionally, we have updated the introduction to better clarify the scope of our study (See comment #1 from Reviewer #3) and include references to existing image processing solutions.

      (2) For the 7 distinctive motility clusters, the authors should provide a more detailed analysis of the differences between them. It's unclear whether the difference in retreating, slow retreating, erratic, static, slow, slow invading, and invading correspond to functional difference of the tumor cells. 

      As noted in the comment above, the revised manuscript now incorporates references to relevant literature that support our understanding that behavioral differences among cells are driven by their underlying functional differences (See comment #2 from Reviewer #3). Additionally, we would like to point to Figure 2d and Supplementary Fig 4 c that provide evidence of the functional distinctions between the identified clusters.

      (3) The authors should expand the scale of tumor behavior features to classify the tumor phenotype clusters, e.g., to include tumor morphology, proliferation state, and tumor cell interaction with other TME cell types.

      We thank the reviewer for this valuable suggestion. In the revised manuscript, we have added the flexibility to incorporate a wide range of features, including morphological ones, and enabled users to select the specific features they wish to include in their analysis. To illustrate this functionality, we have included two example datasets analyzed using this approach (See comment #3 from Reviewer #3). Additionally, as indicated above, we emphasize the importance of careful selection and interpretation of features, as improper choices may lead to biologically irrelevant results. This clarification is intended to ensure that users apply the tool thoughtfully and derive meaningful insights.

    1. Custom pin selection (GPIO35 for SDA and GPIO34 for SCL) is unusual — those GPIOs are input-only on many ESP32 boards. If this works, it may be a board-specific configuration or a misunderstanding (we'll verify when reviewing the code).

      testing h. on chatgpt dialogue, and is this an issue

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Weaknesses:

      (1) The selection of inactivated conformations based on AlphaFold modeling seems a bit biased. The authors base their selection of the “most likely” inactivated conformation on the expected flipping of V625 and the constriction at G626 carbonyls. This follows a bit of the “Streetlight effect”. It would be better to have selection criteria that are independent of what they expect to find for the inactivated state conformations. Using cues that favour sampling/modeling of the inactivated conformation, such as the deactivated conformation of the VSD used in the modeling of the closed state, would be more convincing. There may be other conformations that are more accurately representing the inactivated state. I see no objective criteria that justify the non-consideration of conformations from cluster 3 of the inactivated state modeling. I am not sure whether pLDDT is a good selection criterion. It reports on structural confidence, but that may not relate to functional relevance.

      We sincerely thank the reviewer for their perceptive critique highlighting potential bias in selecting the inactivated conformation. We recognize that over-relying on preconceived traits could limit exploration of diverse inactivated states, and we appreciate the opportunity to address this concern.

      Although we selected the model with the flipped V625 in the selectivity filter (SF) from the first round of inactivated-state sampling as the template for the second round, the resulting models still exhibited substantial diversity in their SF conformations. This selection primarily served to steer predictions away from the open-state configuration observed in the PDB 5VA2 SF, and we have clarified this rationale in the Methodology section. To assess conformational variability, we examined backbone dihedral angles (phi φ and psi ψ) at key residues in the selectivity filter (S624 – G628) and drug-binding region on the pore-lining S6 segment (Y652, F656), of all 100 models sampled in the subsequent inactivated-state-sampling attempt. By overlaying the φ and ψ dihedral angles from different models, including the open state (PDB 5VA2-based), the closed state, and representative models from AlphaFold inactivated-state-sampling Cluster 2 and Cluster 3, we found that these conformations consistently fall within or near high-probability regions of the dihedral angle distributions. This indicates that these structural states are well represented within the ensemble of conformations sampled by AlphaFold within the scope of this study, particularly at functionally critical positions.

      Following the analysis above and consistent with the reviewer’s suggestion, we evaluated the top representative model from inactivated-state-sampling Cluster 3 (named “AF ic3”), which we had initially excluded. This model demonstrated SF residue G626 carbonyl oxygen flipped away from the conduction pathway, hinting at potential impact on ion conduction, yet its pore region structurally resembled the open state (Figure S9a, b). To test this objectively, we ran molecular dynamics (MD) simulations (2 runs, 1 μs long each, with applied 750 mV voltage) with varied initial ion/water configurations in the SF, finding it consistently open and conducting throughout (Figure S9c, d), consistent with our previous observations in Figure S11 that ion conduction can still occur when the upper SF is dilated. Drug docking (Figure S12) further revealed that the model exhibited binding affinities similar to those for the PDB 5VA2-based open-state structure. These findings combined led us to classify it as a possible alternative open-state conformation.
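For readers unfamiliar with how backbone dihedral angles are obtained, the following is a minimal sketch of the standard four-point torsion calculation, not the authors' analysis script. For residue i, φ is the torsion over C(i-1), N(i), CA(i), C(i), and ψ is the torsion over N(i), CA(i), C(i), N(i+1). The coordinates in the example are toy points chosen so the expected angles are obvious; reading atoms from a structure file is outside the scope of the sketch.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points (range -180 to 180)."""
    b0 = _sub(p0, p1)            # bond pointing from p1 back to p0
    b1 = _sub(p2, p1)            # central bond, the rotation axis
    b2 = _sub(p3, p2)            # bond pointing from p2 on to p3
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone (phi, psi) of one residue from its own and flanking atoms."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)

# Toy geometry: cis (0 degrees), gauche (90 degrees) and trans (180 degrees).
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
gauche = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, gauche, trans)
```

Computing `phi_psi` for each residue of interest across all sampled models yields the per-residue angle distributions onto which individual models can be overlaid.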

      Models from Cluster 4 were not tested due to extensive steric clashes, where residues in the SF overlapped with neighboring residues from adjacent subunits. The remaining models displayed SF conformations that combined features from earlier clusters. However, due to subunit-to-subunit variability, where individual subunits adopted differing conformations, they were classified as outliers. This combination of features may be valuable to investigate further in a follow-up study.

      We acknowledge that our approach is just one of many ways to sample different states, and alternative strategies, such as generating more models, varying multiple sequence alignment (MSA) subsampling, or testing different templates, might reveal improved models. Given that hERG channel inactivation likely spans a spectrum of conformations, our resource limitations may have restricted us to exploring and validating only part of this diversity. Nevertheless, the putative inactivated (AlphaFold Cluster 2) model’s non-conductivity and improved affinity for drugs targeting the inactivated state observed in our study suggest that this approach may be capturing relevant features of the inactivated-state conformation. We look forward to investigating other possibilities in greater depth in a future study and are grateful for the reviewer’s feedback.

      (2) The comparison of predicted and experimentally measured binding affinities lacks an appropriate control. Using binding data from open-state conformations only is not the best control. A much better control is the use of alternative structures predicted by AlphaFold for each state (e.g. from the outlier clusters or not considered clusters) in the docking and energy calculations. Using these docking results in the calculations would reveal whether the initially selected conformations (e.g. from cluster 2 for the inactivated state) are truly doing a better job in predicting binding affinities. Such a control would strengthen the overall findings significantly.

      We appreciate the reviewer’s insightful suggestion. To address this, we extended our analysis by incorporating an alternative AlphaFold2-predicted model from inactivated-state-sampling cluster 3 as a structural control. As established in our follow-up analysis for comment #1, this model is open and conducting, so we call it Open (AF ic3) to differentiate it from Open (PDB 5VA2). We evaluated this new model in single-state and multi-state contexts alongside our original open-state model based on the experimental PDB 5VA2 structure. Additionally, we expanded the drug docking procedure to explore a broader region around the putative drug binding site by increasing the sampling space, and we adopted an improved approach for selecting representative docking poses to better capture relevant binding modes.

      Shown in Figure 7 are comparisons of experimental drug potencies with the binding affinities from the molecular docking calculations under the following conditions:

      (a) Single-state docking using the experimentally derived open-state structure (PDB 5VA2)

      (b) Multi-state docking incorporating open (PDB 5VA2), inactivated, and closed-state conformations weighted by experimentally observed state distributions

      (c) Single-state docking using an alternative AlphaFold-predicted open-state (inactivated-state-sampling cluster 3, AF ic3)

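      (d) Multi-state docking combining the AlphaFold-predicted open-state (inactivated-state-sampling cluster 3, AF ic3) with the inactivated- and closed-state conformations, weighted by experimentally observed state distributions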
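
      The multi-state conditions above can be illustrated with a minimal sketch. It assumes the simplest reading of "weighted by experimentally observed state distributions", namely a linear occupancy-weighted average of per-state docking scores; the actual weighting scheme used in the study may differ, and the function name, scores, and state fractions below are hypothetical placeholders rather than values from the manuscript.

```python
def multi_state_score(state_scores, state_fractions):
    """Occupancy-weighted average of per-state docking scores.

    state_scores: docking score per state (e.g. in Rosetta Energy Units).
    state_fractions: fraction of channels in each state (need not sum to 1;
    they are normalized here).
    """
    total = sum(state_fractions.values())
    if total <= 0:
        raise ValueError("state fractions must sum to a positive value")
    return sum(
        state_scores[state] * fraction / total
        for state, fraction in state_fractions.items()
    )


# Hypothetical numbers for one drug (NOT from the study).
scores = {"open": -10.0, "inactivated": -14.0, "closed": -8.0}
fractions = {"open": 0.2, "inactivated": 0.7, "closed": 0.1}
combined = multi_state_score(scores, fractions)
print(round(combined, 2))
```

      Swapping the open-state entry between two alternative models while keeping the other states fixed mirrors the comparison between conditions (b) and (d).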

      Using only the open-state model (PDB 5VA2) yielded a moderate correlation with experimental data (R<sup>2</sup> = 0.43, r = 0.66, Figure 7a). Incorporating multi-state binding (weighted by their experimental distributions) improved the correlation substantially (R<sup>2</sup> = 0.63, r = 0.79, Figure 7b), boosting predictive power by 47% and underscoring the value of multi-state modeling. Importantly, this improvement was achieved without considering potential drug-induced allosteric effects on the hERG channel conformation and gating, which will be addressed in future work.

      Next, we substituted the PDB 5VA2-based open-state model with the AF ic3 open-state model. Docking to this alternative model alone produced similar performance (R<sup>2</sup> = 0.44, r = 0.66, Figure 7c), and incorporating it into the multi-state ensemble further improved the correlation with experiments (R<sup>2</sup> = 0.64, r = 0.80, Figure 7d), representing a 45% gain in R<sup>2</sup> and matching the performance of multi-state docking results based on the PDB 5VA2-derived model.
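
      The percentage gains quoted for Figure 7 follow directly from the reported R<sup>2</sup> values. The short check below reproduces that arithmetic; the helper name is ours and is used only for illustration.

```python
def relative_gain(baseline_r2, ensemble_r2):
    """Percent improvement of the multi-state R^2 over the single-state R^2."""
    return 100.0 * (ensemble_r2 - baseline_r2) / baseline_r2


# R^2 values quoted in the text for Figure 7a-d.
pdb_gain = relative_gain(0.43, 0.63)  # Open (PDB 5VA2): single- vs multi-state
af_gain = relative_gain(0.44, 0.64)   # Open (AF ic3): single- vs multi-state
print(f"PDB 5VA2: {pdb_gain:.1f}%  AF ic3: {af_gain:.1f}%")
```

      Both ratios round to the 47% and 45% figures stated in the response.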

      These findings suggest that the predictive power of computational drug docking is enhanced not merely by the accuracy of individual models, but by the structural diversity and complementarity provided by an ensemble of protein conformations. Rather than relying solely on a single experimentally determined protein structure, the ensemble benefits from incorporating AlphaFold-predicted models that capture alternative conformations identified through our state-specific sampling approach. These diverse protein models reflect different structural features, which together offer a more comprehensive representation of the ion channel’s binding landscape and enhance the predictive performance of computational drug docking. Overall, these results reinforce that multi-state modeling offers a more realistic and predictive framework for understanding drug – ion channel interactions than traditional single-state approaches, emphasizing the value of both individual model evaluation and their collective integration. We are grateful for the reviewer’s suggestion.

      (3) Figures where multiple datapoints are compared across states generally lack assessment of the statistical significance of observed trends (e.g. Figure 3d).

      We appreciate the reviewer’s comment on the statistical significance assessment in Figure 3d. To clarify, the comparisons shown in the subpanels are based on three selected representative models for each state, rather than a broader population sample (similarly for Figure 3b). In the closed-state predicted models, the strong convergence of the voltage-sensing domain (VSD), with an all-atom RMSD of 0.36 Å between cluster 1 and 2 closed-state sampling models and 0.95 Å to the outlier cluster, indicates minimal structural variation. Those RMSD values, reported in the manuscript text, demonstrate good convergence and by themselves represent a statistical significance assessment of those models. This trend extends to the open-state and inactivated-state AlphaFold models, which show similarly limited differences in their VSD regions. This convergence suggests that population-based statistical analysis may not reveal meaningful deviations, as the low variability among models limits the insights beyond those obtained from comparing representative structures.
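
      For readers unfamiliar with the metric, the following is a minimal sketch of how an RMSD between two matched sets of atomic coordinates is computed. It assumes the two structures have already been superimposed and that atoms are listed in the same order; it does not perform the alignment itself and is not the analysis code used in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input coordinates).

    coords_a, coords_b: equal-length sequences of (x, y, z) tuples for
    matched atoms in two pre-aligned structures.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom displaced by 1 Å along z.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_1, model_2))
```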

      Nonetheless, we acknowledge this limitation. In future studies, we plan to explore alternative modeling approaches to introduce greater variability, enabling a more robust statistical evaluation of state-specific trends in the predictions.

      (4) Figure 3 and Figures S1-S4 compare structural differences between states. However, these differences are inferred from the initial models. The collection of conformations generated via the MD runs allow for much more robust comparisons of structural differences.

      We have explored these conformational state dynamics through MD simulations for the Open (5VA2-based), Inactivated (AlphaFold Cluster 2), and Closed-state models, as presented in Figures S7, S8, S10, and S11. These figures provide detailed insights: Figures S7 and S8 analyze SF and pore conformation dynamics, including averaged pore radii with and without voltage and superimposed conformational ensembles; Figure S10 tracks cross-subunit distances between protein backbone carbonyl oxygens, revealing sequential SF dilation steps near residues F627 and G628; and Figure S11 illustrates this SF dilation process over time, highlighting residue F627 carbonyl flipping and SF expansion. We appreciate the opportunity to clarify our approach.

      Reviewer #2 (Recommendations for the authors):

      Major concerns:

      (1) Protein fragments are used to model the closed and inactivated states of hERG, but the choices of fragments are not well justified. For instance, in Figure 1a, helices from 8EP1 (deactivated voltage-sensing domain) and a helix+loop from 5VA2 (selectivity filter) are used. Why just the selectivity filter and not the cytosolic domain, for instance? Why not some parts of the helices attached to the selectivity filter, or the whole membrane inserted domain of 8EP1? Same for the inactivated conformation in Figure 1c: why the cytosolic domain only?

      We thank the reviewer for their thoughtful questions regarding our choice of protein fragments for modeling the closed and inactivated states of hERG in Figures 1a and 1c, and we appreciate the opportunity to justify these selections more clearly. Our approach to template selection was guided by our experience that providing AlphaFold2 with larger templates often leads it to overly constrain predictions to the input structure, reducing its flexibility to explore alternative conformations. In contrast, smaller, targeted fragments increase the likelihood that AlphaFold2 will incorporate the desired structural features while predicting the rest of the protein. We have provided a more detailed discussion of this in the methodology section.

      For the closed state (Figure 1a), we chose the deactivated voltage-sensing domain (VSD) from the rat EAG channel (PDB 8EP1) to inspire AlphaFold2 to predict a similarly deactivated VSD conformation characteristic of hERG channel closure, as this domain’s downward shift is a hallmark of potassium channel closure. We paired this with the selectivity filter (SF) and adjacent residues from the open-state hERG structure (PDB 5VA2) to maintain its conductive conformation, as it is generally understood that K<sup>+</sup> channel closure primarily involves the intracellular gate rather than significant SF distortion. Including additional helices (e.g., S5–S6) or the entire membrane domain from PDB 8EP1 risked biasing the model toward the EAG channel’s pore structure, which differs from hERG’s, while omitting the cytosolic domain ensured focus on the VSD-driven closure without over-constraining cytoplasmic domain interactions.

      For the inactivated state (Figure 1c), we initially used only the cytosolic domain from PDB 5VA2 to anchor the prediction while allowing AlphaFold2 to freely sample transmembrane domain conformations, particularly the SF, where inactivation occurs via its distortion. Excluding the SF or attached helices at this stage avoided locking the model into the open-state SF, and the cytosolic domain alone provided a minimal scaffold to maintain hERG’s intracellular architecture without dictating pore dynamics. Following the initial prediction, we initiated more extensive sampling by using one of the predicted SFs that differs from the open-state SF (PDB 5VA2) as a structural seed, aiming to guide predictions away from the open-state configuration. The VSD and cytosolic domain were also included in this state to discourage pore closure during prediction. Using larger fragments, such as the full membrane-spanning domains or additional cytosolic regions from the open-state structure, might reduce AlphaFold2’s ability to deviate from the open-state conformation, undermining our goal of capturing more diverse, state-specific features.

      It is worth noting that multiple strategies could potentially achieve the predicted models in our study, and here we only present examples of the paths we took and validated. It is likely that many of the steps may be unnecessary and could be skipped, and future work building on our approach can further explore and streamline this process. A consistent theme underlies our choices: for the closed state, we know the VSD should adopt a deactivated (“down”) conformation, so we provide AlphaFold2 with a specific fragment to guide this outcome; for the inactivated state, we recognize that the SF must change to a non-conductive conformation, so we grant AlphaFold2 flexibility to explore diverse conformations by minimizing initial constraints on the transmembrane region.

      With greater sampling and computational resources, it is possible we could identify additional plausible, non-conductive conformations that might better represent an inactivated state, as hERG inactivation may encompass a spectrum of states. In this study, due to resource limitations, we focused on generating and validating a subset of conformations. Still, we acknowledge that broader exploration could further refine these models, which could be pursued in future studies. We updated the Methods and Discussion sections to reflect this perspective, and we are grateful for the reviewer’s input, which encourages us to clarify our rationale and highlight the adaptability of our approach.

      To demonstrate the broader feasibility of this approach, we applied it to another ion channel system, voltage-gated sodium channel Na<sub>V</sub>1.5, as illustrated in Figure S14. In this example, a deactivated VSD II from the cryo-EM structure of a homologous ion channel Na<sub>V</sub>1.7 (PDB 6N4R) (DOI: 10.1016/j.cell.2018.12.018), which was trapped in a deactivated state by a bound toxin, was used as a structural template. This guided AlphaFold to generate a Na<sub>V</sub>1.5 model in which all four voltage sensor domains (VSD I–IV) exhibit S4 helices in varying degrees of deactivation. Compared to the cryo-EM open-state Na<sub>V</sub>1.5 structure (PDB 6LQA) (DOI: 10.1002/anie.202102196), the predicted model displays a visibly narrower pore, representing a plausible closed state. This example underscores the versatility of our strategy in modeling alternative conformational states across diverse ion channels.

      (2) While the authors rely on AF2 (ColabFold) for the closed and inactivated states, they use Rosetta to model loops of the open state. Why not just supply 5VA2 as a template to ColabFold and rebuild the loops that way? Without clear explanations, these sorts of choices give the impression that the authors were looking for specific answers that they knew from their extensive knowledge of the hERG system. While the modeling done in this paper is very nice, its generalizability is not obvious.

      We appreciate the reviewer’s question about our use of Rosetta to model loops in the open-state hERG channel (PDB 5VA2) rather than rebuilding it entirely with ColabFold. In the study, we conducted a control experiment supplying parts of PDB 5VA2 to ColabFold to rebuild the loops, generating 100 models (Figure 2a: predicted open state). The top-ranked model (by pLDDT) differed from our Rosetta-modelled structure by only 0.5 Å RMSD, primarily due to the flexible extracellular loops as expected, with the pore and selectivity filter (our areas of focus) remaining nearly identical. We chose the Rosetta-refined cryo-EM structure as this structure and approach have been widely used as an open-state reference in our other hERG channel studies, such as by Miranda et al. (DOI: 10.1073/pnas.1909196117) and Yang et al. (DOI: 10.1161/CIRCRESAHA.119.316404), to ensure that our results are more directly comparable to prior work in the field. Nonetheless, as both models (with loops modeled by Rosetta or AlphaFold) were virtually identical, we would expect no significant differences if either were used to represent the open state in our study. We have incorporated this clarification into the main text.

      (3) pLDDT scores were used as a measure of reliable and accurate predictions, but pLDDT is not always reliable for selecting new/alternative conformations (see https://doi.org/10.1038/s41467-024-51507-2 and https://www.nature.com/articles/s41467-024-51801-z).

      We acknowledge that while pLDDT is a valuable indicator of structural confidence in AlphaFold2 predictions, its limitations warrant consideration. In our revision, we mitigated this by not relying solely on pLDDT: we also performed protein backbone dihedral angle analysis of the regions of focus in all predicted models to ensure comprehensive coverage of conformational variations. From our AlphaFold modeling results, we tested a model from cluster 3 of the inactivated-state sampling process, which exhibited lower pLDDT scores, and included these results in our revised analysis. We included a note in the revised manuscript’s Discussion section: “As noted in recent studies, pLDDT scores are not reliable indicators for selecting alternative conformations (DOI: 10.1038/s41467-024-51507-2 and DOI: 10.1038/s41467-024-51801-z). To address this, we performed a protein backbone dihedral angle analysis in the regions of interest to ensure that our evaluation captured a representative range of sampled conformations.”
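
      As an illustration of the quantity underlying a backbone dihedral angle analysis, the sketch below computes a signed dihedral angle from four atomic positions (for example C, N, CA, C for a φ angle). It is a generic textbook formulation written for this note, not the authors' analysis script, and the coordinates in the example are arbitrary.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Arbitrary test geometry: a quarter turn about the x axis.
angle = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
print(angle)
```

      Collecting such angles per residue across all predicted models gives a confidence-independent way to group conformations, which is the role the dihedral analysis plays alongside pLDDT.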

      (4) Extensive work has been done using AF2 to model alternative protein conformations (https://www.biorxiv.org/content/10.1101/2024.05.28.596195v1.abstract, along with some references the authors cite, such as work by McHaourab); another group recently modeled the ion channel GLIC (https://www.biorxiv.org/content/10.1101/2024.09.05.611464v1.abstract). Therefore, this work, though generally solid and thorough, seems more like a variation on a theme than a groundbreaking new methodology, especially because of the generalizability issues mentioned above.

      We sincerely thank the reviewer for acknowledging the solidity of our study and for drawing our attention to the impressive recent efforts using AlphaFold2 to explore alternative protein conformations. These studies are valuable contributions that highlight the versatility of AlphaFold2, and we are grateful for their context in evaluating our work.

      Building on these efforts, our approach not only enhances the prediction of conformational diversity but also introduces a twist by incorporating structural templates to guide AlphaFold2 toward specific functional protein states. More significantly, our study goes beyond structural modeling by rigorously validating these conformations: multiple simulation results, tested against experimental data, show that AlphaFold-predicted conformations can align with distinct physiological ion channel states. A key finding is that drug binding predictions using AlphaFold-derived hERG channel states substantially improve correlation with experimental data, addressing a longstanding challenge in computational screening of multi-state proteins like the hERG channel, for which previous structural models have been mostly limited to the open state based on the cryo-EM structures. Our approach not only captures this critical state dependence but also reveals potential molecular determinants underlying enhanced drug binding during hERG channel inactivation, a phenomenon observed experimentally but poorly understood. These insights advance drug safety assessment by improving predictive screening for hERG-related cardiotoxicity, a major cause of drug attrition and withdrawal.

      We view our methodology as a natural evolution of the advancements cited by the reviewer, offering an approach that predicts diverse hERG channel conformational states and links them to meaningful functional and pharmacological outcomes. To address the reviewer’s concern about generalizability, we have expanded the methodology section to make it easier to follow and include additional details. As an example, we show how our approach can be applied to model another ion channel system, Na<sub>V</sub>1.5, in Figure S14.

      Furthermore, to enhance the applicability of our methodology, we have uploaded the scripts for analyzing AlphaFold-predicted models to GitHub (https://github.com/k-ngo/AlphaFold_Analysis), ensuring they are adaptable for a wide range of scenarios with extensive documentation. This enables users, even those not focused on ion channels, to effectively apply our tools to analyze AlphaFold predictions for their own projects and produce publication-ready figures.

      While it is likely that multiple modeling approaches could lead AlphaFold to model alternative protein conformations, the key challenge lies in validating the physiological relevance of those predicted states. This study is intended to support other researchers in applying our template-guided approach to different protein systems and, more importantly, in rigorous in silico testing and validation of the biological significance of the conformation-specific structural models they generate.

      Minor concerns:

      (1) The authors mention in the Introduction section that capturing conformational states, especially for membrane proteins that may be significant as drug targets, is crucial. It would be helpful to relate their work to the NMR studies of domains of the hERG channel, particularly the N-terminal “eag” domain, which is crucial for channel function and can provide insights into conformational changes associated with different channel states (https://doi.org/10.1016/j.bbrc.2010.10.132).

      We appreciate the reviewer’s insightful comment regarding the PAS domain and the potential influence of other regions, such as the N-linker and distal C-region, on drug binding and state transitions.

      The PAS domain did appear in the starting templates used for initial structural modeling (as shown in Figure 1a, b, c), but it was not included in the final models used for subsequent analyses. The omission was primarily due to hardware-imposed constraints, as including these additional regions would exceed the memory capacity of our current graphics processing unit (GPU) card, leading to failures during the prediction step.

      The PAS domain, even if not serving as a conventional direct drug-binding site, can influence the gating kinetics of hERG channels. By altering the probability and duration with which channels occupy specific states, it can indirectly affect how well drugs bind. For example, if the presence of the PAS domain shifts hERG channel gating so that more channels enter (and remain in) the inactivated state as was shown previously (e.g., DOI: 10.1085/jgp.201210870), drugs with a higher affinity for that state would appear to bind more potently, as observed in previous electrophysiological experiments (e.g., DOI: 10.1111/j.1476-5381.2011.01378.x). It is also plausible that the PAS domain could exert allosteric effects that alter the conformational landscape of the hERG channel during gating transitions, potentially impacting drug accessibility or binding stability. This is an intriguing hypothesis and an important avenue for future research.

      With access to more powerful computational resources, it would be valuable to explore the full-length hERG channel, including the PAS domain and associated regions, to assess their potential contributions to drug binding and gating dynamics. We incorporated a discussion of these points into the main text, acknowledging the limitations of our current models and highlighting the need for future studies to explore these regions in greater detail. The addition reads: “…Our models excluded the N-terminal PAS domain due to GPU memory limitations, despite its inclusion in initial templates. This omission may overlook its potential roles in gating kinetics and allosteric effects on drug binding (e.g., PMID: 21449979, PMID: 23319729, PMID: 29706893, PMID: 30826123, DOI:10.4103/jpp.JPP_158_17). Future research will explore the full-length hERG channel with enhanced computational resources to assess these regions’ contributions to conformational state transitions and pharmacology.”

      (2) In the second-to-last paragraph of the Introduction, the authors describe how AlphaFold2 works. They write, “AlphaFold2 primarily requires the amino acid sequence of a protein as its input, but the method utilizes other key elements: in addition to the amino acid sequence, AlphaFold2 can also utilize multiple sequence alignments (MSAs) of similar sequences from different species, templates of related protein structures when available, and/or homologous proteins (Jumper et al., 2021a). Evolutionarily conserved regions over multiple isoforms and species indicated that the sequence is crucial for structural integrity”. The last sentence is confusing; if the authors mean that all information required to fold the protein into its 3D structure is present in its primary sequence, that has been the paradigm. It is unclear from this paragraph what the authors wanted to convey.

      We apologize for any confusion caused by this phrasing. Our intent was not to restate the well-established paradigm that a protein’s primary sequence contains the information needed for its 3D structure, but rather to emphasize how AlphaFold2 leverages evolutionary conservation, via multiple sequence alignments (MSAs), to infer structural constraints beyond what a single sequence alone might reveal. Specifically, we aimed to highlight that conserved regions across species and isoforms provide additional context that AlphaFold2 uses to enhance the accuracy of its predictions, complementing the use of templates and homologous structures as described in Jumper et al. (2021). To clarify this, we revised the sentence in the manuscript to read: “AlphaFold2 primarily requires a protein's amino acid sequence as input, but it also leverages other critical data sources. In addition to the sequence, it incorporates multiple sequence alignments (MSAs) of related proteins from different species, available structural templates, and information on homologous proteins. While the primary sequence encodes the 3D structure, AlphaFold2 harnesses evolutionary conservation from MSAs to reveal structural insights that extend beyond what a single sequence can provide.” We thank the reviewer for pointing out this ambiguity.

      (3) In the Results section, the authors state that the predictions generated by their method are evaluated by standard accuracy metrics, please elaborate - what standard metrics were used to judge the predictions and why (some references would be a nice addition). Further, on Page 6, the sentence “There are fewer differences between the open- and closed-state models (Figure S2b, d)” is confusing, fewer differences than what? or there are a few differences between the two states/models? Please clarify.

      The original sentence referring to “standard accuracy metrics” is somewhat misplaced, as our intent was not to apply any conventional “benchmarking” to judge the predictions, but rather to evaluate functional and structural relevance in a physiologically meaningful context. Specifically, we assessed drug binding affinities from molecular docking simulations (in Rosetta Energy Units, R.E.U.) against experimental drug potency data (e.g., IC<sub>50</sub> values converted to free energies in kcal/mol, Figure 7), analyzed differences in interaction networks across states in relation to known mutations affecting hERG inactivation (Figure 4, Table 2), validated ion conduction properties through MD simulations with the applied voltage against expected state-dependent hERG channel behavior (Figure 5), and compared predicted structural models to available experimental cryo-EM structures (Figure 3). We clarified in the text that our assessment emphasized the physiological plausibility of the generated conformations, drawing on evidence from existing computational and experimental studies at each step of the analysis above.
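
      The conversion of IC<sub>50</sub> values to free energies mentioned above can be sketched as follows. This is a hedged illustration only: it assumes the common approximation that IC<sub>50</sub> stands in for the dissociation constant, a 1 M standard state, and a temperature of 298.15 K, none of which are specified in this response, and the function name is ours.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def ic50_to_free_energy(ic50_molar, temperature_k=298.15):
    """Approximate binding free energy (kcal/mol) as RT * ln(IC50 / 1 M).

    Assumes IC50 approximates the dissociation constant; the temperature
    default is an assumption, not a value taken from the study.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(ic50_molar)


# Illustrative input: a drug with IC50 = 1 micromolar.
dg = ic50_to_free_energy(1e-6)
print(f"{dg:.2f} kcal/mol")
```

      More potent drugs (lower IC<sub>50</sub>) map to more negative free energies, which is the direction in which docking scores are compared with experiment in Figure 7.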

      As for the sentence on page 6, “There are fewer differences between the open- and closed-state models,” we apologize for the ambiguity; we meant that the hydrogen bond networks in the selectivity filter region exhibit fewer differences between the open and closed states compared to the more pronounced variations seen between the open and inactivated states. We revised this sentence to read: “The open- and closed-state models show fewer differences in their selectivity filter hydrogen bond networks compared to those between the open and inactivated states,” to enhance readability.

      (4) In the Discussion, the authors reiterate that this methodology can be extended to sample multiple protein conformations, and their system of choice was hERG potassium channel. I think this methodology can be applied to a system when there is enough knowledge of static structures, and some information on dynamics (through simulations) and mutagenesis analysis available. A well-studied system can benefit from such a protocol to gauge other conformational states.

      We agree that this approach is well-suited to systems with sufficient static structures, dynamic insights from simulations, and mutagenesis data, as seen with the hERG channel. We appreciate the reviewer’s implicit concern about generalizability to less-characterized systems and addressed this in the Discussion as a limitation, noting that the method’s effectiveness may depend on prior knowledge. Future studies can explore whether the advent of AlphaFold3 and other deep learning approaches can enhance its applicability to systems with more limited data. We have added this comment to the Discussion: “…A limitation of our methodology is its reliance on well-characterized systems with ample static structures, molecular dynamics simulation data, and mutagenesis insights, as demonstrated with the hERG channel, which may limit its applicability to less-studied proteins.”

      (5) The Methods section must be broken down into steps to make it easier to follow for the reader (if they want to implement these steps for themselves on their system of choice).

      a. Is it possible to share example scripts and code used to piece templates together for AF2? Also, since the AF3 code is now available, the authors may comment on how their protocol can be applicable there or have plans to implement their protocol using AF3 (which is designed to work better for binding small molecules). Please see https://github.com/google-deepmind/alphafold3 for the recently released code for AF3.

      We appreciate the reviewer’s suggestion to improve the Methods section and their comments on scripts and AlphaFold3 (AF3). We revised the Methods to separate it into clear steps (e.g., template preparation, AF2 setup, clustering, and refinement) for better readability and reproducibility, and uploaded the sample scripts along with the instructions to GitHub (https://github.com/k-ngo/AlphaFold_Analysis).

      Regarding AF3’s recent code release, we plan to explore the applicability of our methodology to AF3 in a follow-up study, leveraging its advanced features to refine conformational predictions and state-specific drug docking, and added a brief comment to the Discussion to reflect this future direction: “…Following the recent release of AlphaFold3’s source code, we plan to explore the applicability of our template-guided methodology in a follow-up study, leveraging AF3’s advanced diffusion-based architecture to enhance protein conformational state predictions and state-specific drug docking, particularly given its improved capabilities for modeling small molecule – protein interactions…”

      b. The authors modified the hERG protein by removing a segment, the N-terminal PAS domain (residues M1 - R397) because of graphics card memory limitation. Would the removal of the PAS domain affect the structure and function of the channel protein? HERG and other members of the “eag K<sup>+</sup> channel” family contain a PAS domain on their cytoplasmic N terminus. Removal of this domain alters a physiologically important gating transition in HERG, and the addition of the isolated domain to the cytoplasm of cells expressing truncated HERG reconstitutes wild-type gating. (see https://doi.org/10.1371/journal.pone.0059265). Please elaborate on this.

      We thank the reviewer for raising an important point about the removal of the N-terminal PAS domain and for highlighting its physiological role in hERG channel gating transitions. In our study, unlike experimental settings where PAS removal alters gating, we believe this omission has minimal impact on our key analyses.

      The drug docking procedure focuses on optimizing drug binding poses with minor protein structural refinement around the putative drug binding site, which in our case is the hERG channel pore region, where hERG-blocking drugs predominantly bind. The cytoplasmic PAS domain, located distally from this site, remains outside the protein structure refinement zone during drug docking simulations. However, one aspect we have not yet considered is the potential effect of drug modulation of hERG channel gating and vice versa, particularly given the PAS domain’s role in gating. This interplay could be significant but requires investigation beyond our current drug docking framework. We plan to explore this in future studies using alternative simulation methodologies, such as extended MD simulations or enhanced sampling techniques, to comprehensively capture these dynamic protein - ligand interactions.

      Similarly, in our 1 μs long MD simulations assessing ion conductivity (Figure 4), the timescale is too short for PAS-mediated gating changes to propagate through the protein and meaningfully influence ion conduction and channel activation dynamics, which occur on a millisecond time scale (see e.g., DOI: 10.3389/fphys.2018.00207). To fully address this limitation, we plan to explore the inclusion of the PAS domain in a follow-up study with enhanced computational resources, allowing us to investigate its structural and functional contributions more comprehensively.

      (6) The first paragraph of the Methods reads as though AF2 has layers that recycle structures. We doubt that the authors meant it that way. Please update the language to clarify that recycling is an iterative process in which the pairwise representation, MSA, and predicted structures are passed (“recycled”) through the model multiple times to improve predictions.

      We agree that the phrasing might suggest physical layers recycling structures, which was not our intent. Instead, we meant to describe AlphaFold2’s iterative refinement process, where intermediate outputs, such as the pairwise residue representations, multiple sequence alignments (MSAs), and predicted structures, are iteratively passed (or “recycled”) through the model to enhance prediction accuracy. To clarify this, we revised the relevant sentence to read: “A critical feature of AlphaFold2 is its iterative refinement, where pairwise residue representations, MSAs, and initial structural predictions are recycled through the model multiple times, improving accuracy with each iteration.”
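      The recycling loop described above can be sketched in a few lines. This is only an illustrative toy, not AlphaFold2's code or API: `toy_model`, `predict_with_recycling`, and the scalar "representations" are hypothetical stand-ins chosen to show the control flow, in which the same network is applied repeatedly and each pass receives the previous pass's outputs alongside the fixed input features.

```python
from typing import Callable, List, Tuple

# Toy stand-ins (assumption): each representation is a single float.
# Order: (msa_repr, pair_repr, structure)
State = Tuple[float, float, float]


def toy_model(features: float, prev: State) -> State:
    """Hypothetical single forward pass; NOT the real AlphaFold2 network."""
    msa_prev, pair_prev, struct_prev = prev
    # Each output blends fresh information with the recycled previous output.
    msa = 0.5 * (features + msa_prev)
    pair = 0.5 * (msa + pair_prev)
    struct = 0.5 * (pair + struct_prev)
    return msa, pair, struct


def predict_with_recycling(
    features: float,
    model: Callable[[float, State], State],
    num_recycles: int = 3,
) -> Tuple[State, List[float]]:
    """Run one initial pass plus `num_recycles` recycling passes."""
    prev: State = (0.0, 0.0, 0.0)  # nothing to recycle before the first pass
    history: List[float] = []
    for _ in range(num_recycles + 1):
        prev = model(features, prev)  # same model, previous outputs fed back in
        history.append(prev[2])       # track the structure estimate per pass
    return prev, history


final_state, structure_history = predict_with_recycling(1.0, toy_model)
```

      In this toy the structure estimate moves closer to its fixed point with every pass, which mirrors the intent of recycling; how much a real prediction improves per recycle depends on the target.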

      Reviewer #3 (Recommendations for the authors):

      The authors should integrate the very recently published CryoEM experimental data of hERG inhibition by several drugs (Miyashita et al., Structure, 2024; DOI: 10.1016/j.str.2024.08.021).

      We thank the reviewer for the suggestion. Here, we compare drug binding in our open-states (PDB 5VA2-derived and an additional AlphaFold-predicted model from Cluster 3 of inactivated-state-sampling attempt named “AF ic3”) and inactivated-state models, using the cationic forms of astemizole and E-4031, with the corresponding experimental structures (Figure S13). Drug binding in the closed state is excluded as the pore architecture deviates too much from those in the cryo-EM structures. Experimental data (DOI: 10.1124/mol.108.049056) indicate that both astemizole and E-4031 bind more potently to the inactivated state of the hERG channel.

      Astemizole (Figure S13a):

      - In the PDB 5VA2-derived open-state model, astemizole binds centrally within the pore cavity, adopting a bent conformation that allows both aromatic ends of the molecule to engage in π–π stacking with the side chains of Y652 from two opposing subunits. Hydrophobic contacts are observed with S649 and F656 residues.

      - In the AF ic3 open-state model, the ligand is stabilized through multiple π–π stacking interactions with Y652 residues from three subunits, forming a tight aromatic cage around its triazine and benzimidazole rings. Hydrophobic interactions are observed with hERG residues T623, S624, Y652, F656, and S660.

      - In the inactivated-state model, astemizole adopts a compact, horizontally oriented pose deeper in the channel pore, forming the most extensive interaction network among all the states. The ligand is tightly stabilized by multiple π–π stacking interactions with Y652 residues across three subunits, and forms hydrogen bonds with residues S624 and Y652. Additional hydrophobic contacts are observed with residues F557, L622, S649, and Y652.

      - Consistent with our findings, an electrophysiology study by Saxena et al. identified hERG residues F557 and Y652 as crucial for astemizole binding, as determined through mutagenesis (DOI: 10.1038/srep24182).

      - In the cryo-EM structure (PDB 8ZYO) (DOI: 10.1016/j.str.2024.08.021), astemizole is stabilized by π–π stacking with Y652 residues. However, no hydrogen bonds are detected, which may reflect limitations in cryo-EM resolution rather than a true absence of contacts. Additional hydrophobic interactions are observed with L622 and G648 residues.

      E-4031 (Figure S13b):

      - In the PDB 5VA2-derived open-state model, E-4031 binds within the central cavity primarily through polar interactions. It forms a π–π stacking interaction with residue Y652, anchoring one end of the molecule. Polar interactions are observed with residues A653 and S660. Additional hydrophobic contacts are observed with residues A652 and Y652.

      - In the AF ic3 open-state model, E-4031 adopts a slightly deeper pose within the central cavity stabilized by dual π–π stacking interactions between its aromatic rings and hERG residue Y652. Additional hydrogen bonds are observed with residues S624 and Y652, and hydrophobic contacts are observed with residues T623 and S624.

      - In the inactivated-state model, E-4031 adopts its deepest and most stabilized binding pose, consistent with its experimentally observed preference for this state. The ligand is stabilized by multiple π–π stacking interactions between its aromatic rings and hERG residues Y652 from opposing subunits. The sulfonamide nitrogen engages in hydrogen bonding with residue S649, while the piperidine nitrogen hydrogen bonds with residue Y652. Hydrophobic contacts with residues S624, Y652, and F656 further reinforce the binding, enclosing the ligand in a densely packed aromatic and polar environment.

      - A previous mutagenesis study showed that mutations involving hERG residues F557, T623, S624, Y652, and F656 affect E-4031 binding (DOI: 10.3390/ph16091204).

      - In the cryo-EM structure (PDB 8ZYP) (DOI: 10.1016/j.str.2024.08.021), E-4031 engages in a single π–π stacking interaction with hERG residue Y652, anchoring one end of the molecule. The remainder of the ligand is stabilized predominantly through hydrophobic contacts involving residues S621, L622, T623, S624, M645, G648, S649, and additional Y652 side chains, forming a largely nonpolar environment around the binding pocket.

      In both cryo-EM structures, astemizole and E-4031 adopt binding poses that closely resemble those in the inactivated-state model in our docking study, consistent with experimental evidence that these drugs preferentially bind to the inactivated state (DOI: 10.1124/mol.108.049056). This raises the possibility that the cryo-EM structures may capture an inactivated-like channel state. However, closer examination of the SF reveals that the cryo-EM conformations more closely resemble the open-state PDB 5VA2 structure (DOI: 10.1016/j.cell.2017.03.048), which has been shown to be conductive here and in previous studies (DOI: 10.1073/pnas.1909196117, 10.1161/CIRCRESAHA.119.316404).

      The conformational differences between the cryo-EM and open-state docking results may reflect limitations of the docking protocol itself, as GALigandDock assumes a rigid protein backbone and cannot account for large ligand-induced conformational changes. In our open-state models, the hydrophobic pocket beneath the selectivity filter is too small to accommodate bulky ligands (Figure 3a, b), whereas the cryo-EM structures show a slight outward shift in the S6 helix that expands this space (Figure S13). These allosteric rearrangements, though small, fall outside the scope of the current docking protocol, which lacks the flexibility to capture these local, ligand-induced adjustments (DOI: 10.3389/fphar.2024.1411428).

      In contrast, docking to the AlphaFold-predicted inactivated-state model reveals a reorganization beneath the selectivity filter that creates a larger cavity, allowing deeper ligand insertion. Notably, neither our inactivated-state docking nor the available cryo-EM structures show strong interactions with F656 residues. However, in the AlphaFold-predicted inactivated model, the more extensive protrusion of F656 into the central cavity may further occlude the drug’s egress pathway, potentially trapping the ligand more effectively. This could explain why mutation of F656 significantly reduces the binding affinity of E-4031 (DOI: 10.3390/ph16091204). These findings suggest that inactivation may trigger a series of modular structural rearrangements that influence drug access and binding affinity, with different aspects potentially captured in various computational and experimental studies, rather than resulting from a single, uniform conformational change.

      Discussion of the original Wang and Mackinnon finding, DOI: 10.1016/j.cell.2017.03.048 regarding C-inactivation, pore mutation S631A and F627 rearrangement is likely warranted. Since hERG inactivation is present at 0 mV in WT channels (the likely voltage for the CryoEM study) please discuss how this might affect interpretations of starting with this structure as a template for models presented here, perhaps as part of Figure S1.

      We sincerely thank the reviewer for bringing up the insightful findings from Wang and MacKinnon regarding hERG C-type inactivation as well as the voltage context of their cryo-EM structure (PDB 5VA2). We recognize that WT hERG exhibits inactivation at 0 mV, likely the condition of the cryo-EM study, raising the possibility that PDB 5VA2, while classified as an open state, might subtly reflect features of inactivation. Notably, PDB 5VA2 has been widely adopted in numerous studies and consistently found to represent a conducting state, such as in Yang et al. (DOI: 10.1161/CIRCRESAHA.119.316404) and Miranda et al. (DOI: 10.1073/pnas.1909196117). Our MD simulations further support this, showing K<sup>+</sup> conduction in the 5VA2-based open-state model (Figure 4a, c), consistent with its selectivity filter conformation (Figure S1a). Although we used PDB 5VA2 as a starting template for predicting inactivated and closed states, our AlphaFold2 predictions did not rigidly adhere to this structure, as evidenced by distinct differences in hydrogen bond networks, drug binding affinities, pore radii, and ion conductivity between our state-specific hERG channel models (Figures S2, 5, 3b, 4). Nevertheless, this does not preclude the possibility that inactivated-like traits potentially present in PDB 5VA2 at 0 mV could subtly influence our predictions elsewhere in the model, which warrants further exploration in future studies. In our revised analysis, we also tested an alternative AlphaFold-predicted conformation, referred to as Open (AlphaFold cluster 3), which, while sharing some similarities with PDB 5VA2, exhibits subtle differences in the selectivity filter and pore conformations. This structure was also found to be conducting ions and showed a drug binding profile similar to that of the PDB 5VA2-based open-state model. We greatly appreciate this feedback, which helped us refine and strengthen our analysis.

      Page 8, the significance of 750 and 500 mV in terms of physiological role?

      We appreciate this opportunity to clarify the methodological rationale. Although these voltages significantly exceed typical physiological membrane potentials, their use in MD simulations is a well-established practice to accelerate ion conduction events. This approach helps overcome the inherent timescale limitations of conventional MD simulations, as demonstrated in previous studies of hERG and other ion channels. For instance, Miranda et al. (DOI: 10.1073/pnas.1909196117), Lau et al. (DOI: 10.1038/s41467-024-51208-w), Yang et al. (DOI: 10.1161/CIRCRESAHA.119.316404) applied similarly high voltages (500~750 mV) to study hERG K<sup>+</sup> conduction, which is notably small under physiological conditions at ~2 pS (DOI: 10.1161/01.CIR.94.10.2572), necessitating amplification to observe meaningful permeation within nanosecond-to-microsecond timescales. Likewise, studies of other K<sup>+</sup> ion channels, such as Woltz et al. (DOI: 10.1073/pnas.2318900121) on small-conductance calcium-activated K<sup>+</sup> channel SK2 and Wood et al. (DOI: 10.1021/acs.jpcb.6b12639) on Shaker K<sup>+</sup> channel, have used elevated voltages (250~750 mV) to probe ion conduction mechanisms via MD simulations. In addition, the typical timescale of these simulations (1 μs) is too short to capture major structural effects such as those leading to inactivation or deactivation which occur over milliseconds in physiological conditions.
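      A back-of-the-envelope calculation makes the timescale argument concrete. The sketch below assumes a purely ohmic channel (current = conductance × voltage) and one elementary charge per K<sup>+</sup> ion; both are simplifying assumptions, and the function name is hypothetical. The ~2 pS conductance and the 500 and 750 mV values come from the response above; the 50 mV comparison value is an arbitrary illustration of a smaller driving force.

```python
# Elementary charge in coulombs (exact SI value).
ELEMENTARY_CHARGE_C = 1.602176634e-19


def ions_per_microsecond(conductance_pS: float, voltage_mV: float) -> float:
    """Expected permeation events per microsecond for an ohmic channel.

    Assumptions: I = g * V, and each permeating ion carries one
    elementary charge (as for K+).
    """
    current_A = (conductance_pS * 1e-12) * (voltage_mV * 1e-3)
    ions_per_second = current_A / ELEMENTARY_CHARGE_C
    return ions_per_second * 1e-6


# Compare an illustrative small driving force with the simulation voltages.
for voltage_mV in (50.0, 500.0, 750.0):
    rate = ions_per_microsecond(2.0, voltage_mV)
    print(f"{voltage_mV:5.0f} mV -> {rate:.2f} ions per microsecond")
```

      Under these assumptions a 2 pS channel passes roughly 6 ions per microsecond at 500 mV but fewer than 1 at 50 mV, which is why a 1 μs trajectory needs an elevated voltage to sample more than a handful of permeation events.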

      The abstract could be edited a bit to more clearly state the novel findings in this study.

      We thank the reviewer for their suggestion. We have revised the abstract to read: “To design safe, selective, and effective new therapies, there must be a deep understanding of the structure and function of the drug target. One of the most difficult problems to solve has been resolution of discrete conformational states of transmembrane ion channel proteins. An example is K<sub>V</sub>11.1 (hERG), comprising the primary cardiac repolarizing current, I<sub>kr</sub>. hERG is a notorious drug antitarget against which all promising drugs are screened to determine potential for arrhythmia. Drug interactions with the hERG inactivated state are linked to elevated arrhythmia risk, and drugs may become trapped during channel closure. While prior studies have applied AlphaFold to predict alternative protein conformations, we show that the inclusion of carefully chosen structural templates can guide these predictions toward distinct functional states. This targeted modeling approach is validated through comparisons with experimental data, including proposed state-dependent structural features, drug interactions from molecular docking, and ion conduction properties from molecular dynamics simulations. Remarkably, AlphaFold not only predicts inactivation mechanisms of the hERG channel that prevent ion conduction but also uncovers novel molecular features explaining enhanced drug binding observed during inactivation, offering a deeper understanding of hERG channel function and pharmacology. Furthermore, leveraging AlphaFold-derived states enhances computational screening by significantly improving agreement with experimental drug affinities, an important advance for hERG as a key drug safety target where traditional single-state models miss critical state-dependent effects. By mapping protein residue interaction networks across closed, open, and inactivated states, we identified critical residues driving state transitions validated by prior mutagenesis studies. 
      This innovative methodology sets a new benchmark for integrating deep learning-based protein structure prediction with experimental validation. It also offers a broadly applicable approach using AlphaFold to predict discrete protein conformations, reconcile disparate data, and uncover novel structure-function relationships, ultimately advancing drug safety screening and enabling the design of safer therapeutics.”

      Many of the Supplemental figures would fit in better in the main text, if possible, in my opinion. For instance, the network analysis (Fig. S2) appears to be novel and is mentioned in the abstract so may fit better in the main text. The discussion section could be focused a bit more, perhaps with headers to highlight the key points.

      We agree with the reviewer and have made the suggested changes. We moved Figure S2 into the main text as a new figure.

      Additionally, we revised the Discussion section to improve focus and clarity.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      In this study, Hama et al. explored the molecular regulatory mechanisms underlying the formation of the ULK1 complex. By employing the AlphaFold structural prediction tool, they showed notable differences in the complex formation mechanisms between ULK1 in mammalian cells and Atg1 in yeast cells. Their findings revealed that in mammalian cells, ULK1, ATG13, and FIP200 form a complex with a stoichiometry of 1:1:2. These predicted interaction regions were validated through both in vivo and in vitro assays, enhancing our understanding of the molecular mechanisms governing ULK1 complex formation in mammalian cells. Importantly, they identified a direct interaction between ULK1 and FIP200, which is crucial for autophagy. However, some aspects of this manuscript require further clarification, validation, and correction by the authors.

      Thank you for your thorough evaluation of our manuscript. We have carefully revised the manuscript to address your concerns by performing extra experiments and providing additional clarifications, validations, and corrections as written below.

      Reviewer #2 (Public review):

      Summary:

      This is important work that helps to uncover how the process of autophagy is initiated - via structural analyses of the initiating ULK1 complex. High-resolution structural details and a mechanistic insight of this complex have been lacking and understanding how it assembles and functions is a major goal of a field that impacts many aspects of cell and disease biology. While we know components of the ULK1 complex are essential for autophagy, how they physically interact is far from clear. The work presented makes use of AlphaFold2 to structurally predict interaction sites between the different subunits of the ULK1 complex (namely ULK1, ATG13, and FIP200). Importantly, the authors go on to experimentally validate that these predicted sites are critical for complex formation by using site-directed mutagenesis and then go on to show that the three-way interaction between these components is necessary to induce autophagy in cells.

      Strengths:

      The data are very clear. Each binding interface of ATG13 (ATG13 with FIP200/ATG13 with ULK1) is confirmed biochemically with ITC and IP experiments from cells. Likewise, IP experiments with ULK1 and FIP200 also validate interaction domains. A real strength of the work is in their analyses of the consequences of disrupting ATG13's interactions in cells. The authors make CRISPR KI mutations of the binding interface point mutants. This is not a trivial task and is the best approach as everything is monitored under endogenous conditions. Using these cells the authors show that ATG13's ability to interact with both ULK1 and FIP200 is essential for a full autophagy response.

      Thank you for your thoughtful review and for highlighting the importance of our approach.

      Weaknesses:

      I think a main weakness here is the failure to acknowledge and compare results with an earlier preprint that shows essentially the same thing (https://doi.org/10.1101/2023.06.01.543278). Arguably this earlier work is much stronger from a structural point of view as it relies not only on AlphaFold2 but also actual experimental structural determinations (and takes the mechanisms of autophagy activation further by providing evidence for a super complex between the ULK1 and VPS34 complexes). That is not to say that this work is not important, as at the least it independently helps to build a consensus for ULK1 complex structure. Another weakness is that the downstream "functional" consequences of disrupting the ULK1 complex are only minimally addressed. The authors perform a Halotag-LC3 autophagy assay, which essentially monitors the endpoint of the process. There are a lot of steps in between, knowledge of which could help with mechanistic understanding. Not least is the kinase activity of ULK1 - how is this altered by disrupting its interactions with ATG13 and/or FIP200?

      Thank you for this valuable feedback. In response, we performed a detailed structural comparison between the cryo-EM structure reported in the referenced preprint and our AlphaFold-based model. We have summarized both the similarities and differences in newly included figures (revised Figure 2A, B, 3B, S1F) and provided an in-depth discussion in the main text. Furthermore, to address the downstream consequences of ULK1 complex disruption, we have investigated the impact on ULK1 kinase activity, specifically examining how mutations affecting ATG13 or FIP200 interaction alter ULK1’s phosphorylation of a key substrate ATG14. In addition, we analyzed the effect on ATG9 vesicle recruitment. We provide the corresponding data as Figure S3C-E and detailed discussions in the revised manuscript.

      Reviewer #3 (Public review):

      In this study, the authors employed the protein complex structure prediction tool AlphaFold-Multimer to obtain a predicted structure of the protein complex composed of ULK1-ATG13-FIP200 and validated the structure using mutational analysis. This complex plays a central role in the initiation of autophagy in mammals. Previous attempts at resolving its structure have failed to obtain high-resolution structures that can reveal atomic details of the interactions within the complex. The results obtained in this study reveal extensive binary interactions between ULK1 and ATG13, between ULK1 and FIP200, and between ATG13 and FIP200, and pinpoint the critical residues at each interaction interface. Mutating these critical residues led to the loss of binary interactions. Interestingly, the authors showed that the ATG13-ULK1 interaction and the ATG13-FIP200 interaction are partially redundant for maintaining the complex.

      We are grateful for your high evaluation of our work.

      The experimental data presented by the authors are of high quality and convincing. However, given the core importance of the AlphaFold-Multimer prediction for this study, I recommend the authors improve the presentation and documentation related to the prediction, including the following:

      (1) I suggest the authors consider depositing the predicted structure to a database (e.g. ModelArchive) so that it can be accessed by the readers.

      We have deposited the AlphaFold model to ModelArchive with the accession code ma-jz53c, which is indicated in the revised manuscript.

      (2) I suggest the authors provide more details on the prediction, including explaining why they chose to use the 1:1:2 stoichiometry for ULK1-ATG13-FIP200 and whether they have tried other stoichiometries, and explaining why they chose to use the specific fragments of the three proteins and whether they have used other fragments.

      We appreciate your suggestion. As we noted in the original manuscript, previous studies have shown that the C-terminal region of ULK1 and the C-terminal intrinsically disordered region of ATG13 bind to the N-terminal region of the FIP200 homodimer (Alers, Loffler et al., 2011; Ganley, Lam du et al., 2009; Hieke, Loffler et al., 2015; Hosokawa, Hara et al., 2009; Jung, Jun et al., 2009; Papinski and Kraft, 2016; Wallot-Hieke, Verma et al., 2018). We relied on these findings when determining the specific regions to include in our complex prediction and when selecting a 1:1:2 stoichiometry for ULK1–ATG13–FIP200 which was reported previously (Shi et al., 2020). We also used AlphaFold2 to predict the structures of the full-length ULK1–ATG13 complex and the complex of the FIP200N dimer with full-length ATG13, confirming that there were no issues with our choice of regions (revised Figure S1A-C). In the revised manuscript, we have provided a more detailed explanation of our rationale based on the previous reports and additional AlphaFold predictions.

      (3) I suggest the authors present the PAE plot generated by AlphaFold-Multimer in Figure S1. The PAE plot provides valuable information on the prediction.

      We provided the PAE plot in the revised Figure S1C.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) In Figure 1D, the labels for the input and IP of ATG13-FLAG should be corrected to ATG13-FLAG FIP3A.

      We thank the reviewer for pointing out these labeling mistakes. We revised the labels based on the suggestions.

      (2) In the discussion section, the authors should address why ATG13-FLAG ULK1 2A in Fig. 2D leads to a significantly lower expression of ULK1 and provide possible explanations for this observation.

      ATG13 and ATG101, both core components of the ULK1 complex, are known to stabilize each other through their mutual interaction. Loss or reduction of one protein typically leads to the destabilization of the other. In this context, ULK1 is similarly stabilized by binding to ATG13. Therefore, the ATG13-FLAG ULK2A mutant, which has reduced binding to ULK1, likely loses this stabilizing activity, so that ULK1 becomes destabilized and its expression level drops. We added this discussion to the revised manuscript.

      (3) In Figure 4B, the authors should explain why Atg13-FLAG KI significantly affects the expression of endogenous ULK1. Could Atg13-FLAG KI be interfering with its binding to ULK1? Experimental evidence should be provided to support this. Additionally, does Atg13-FLAG KI affect autophagy? Wild-type HeLa cells should be included as a control in Figure 4C and 4D to address this question.

      Thank you for your constructive suggestion. We found a technical error in the ULK1 blot of Figure 4B. Therefore, we repeated the experiment. The results show that ULK1 expression did not significantly change in the ATG13-FLAG KI. These findings are consistent with Figure S3A. We have replaced Figure 4B with this new data.

      We agree that including wild-type HeLa cells as a control is essential to determine whether ATG13-FLAG KI affects autophagy. We performed the same experiments in wild-type HeLa cells and found that ATG13-FLAG KI does not significantly impact autophagic flux. Accordingly, we have replaced Figures 4D and 4E with these new data.

      (4) In Figure 3C, the authors used an in vitro GST pulldown assay to detect a direct interaction between ULK1 and FIP200, which was also confirmed in Figure 3E. However, since FLAG-ULK1 FIP2A affects its binding with ATG13 (Fig. 3E), it is possible that ULK1 FIP2A inhibits autophagy by disrupting this interaction. The authors should therefore use an in vitro GST pulldown assay to determine whether GST-ULK1 FIP2A affects its binding with ATG13. Additionally, the authors should investigate whether the interaction between ULK1 and FIP200 in cells requires the involvement of ATG13 by using ATG13 knockout cells to confirm if the ULK1-FIP200 interaction is affected in the absence of ATG13.

      Thank you for the valuable suggestion. We examined the effect of the FIP2A mutation on the ULK1–ATG13 interaction using isothermal titration calorimetry (ITC) to obtain quantitative binding data. The results showed that the FIP2A mutation does not markedly alter the affinity between ULK1 and ATG13 (revised Figure S2B), suggesting that FIP2A mainly weakens the ULK1–FIP200 interaction. Regarding experiments in ATG13 knockout cells, ULK1 becomes destabilized in the absence of ATG13, making it technically difficult to assess how the ULK1–FIP200 interaction is affected under those conditions.

      Reviewer #2 (Recommendations for the authors):

      I feel the manuscript would benefit from a more detailed comparison with the Hurley lab paper - are the structural binding interfaces the same, or are there differences?

      We appreciate the suggestion to compare our results more closely with the work from the Hurley lab. We performed a detailed structural comparison between the cryo-EM structure reported in the referenced preprint and our AlphaFold-based model (revised Figure 2A, B, 3B, S1F) and provided an in-depth discussion in the main text.

      As mentioned, what happens downstream of disrupting the ULK1 complex? How is ULK1 activity changed, both in vitro and in cells? Does disruption of the ULK1 complex binding sites impair VPS34 activity in cells (for example by looking at PtdIns3P levels/staining)?

      Thank you for your insightful comments. We focused on elucidating how disrupting the ULK1 complex leads to impaired autophagy. To assess ULK1 activity, we measured ULK1-dependent phosphorylation of ATG14 at Ser29 (PMID: 27046250; PMID: 27938392). In FIP3A and FU5A knock-in cells, ATG14 phosphorylation was significantly reduced, indicating decreased ULK1 activity (revised Figure S3D, E). This observation is consistent with previous work showing that FIP200 recruits the PI3K complex. Notably, in ATG13 knockout cells, ATG14 phosphorylation became almost undetectable, though the underlying mechanism remains to be fully investigated. Altogether, these data point to reduced ULK1 activity as a key factor explaining the autophagy deficiency observed in FU5A knock-in cells.

      We also explored possible downstream mechanisms. One well-established function of ATG13 is to recruit ATG9 vesicles (PMID: 36791199). These vesicles serve as an upstream platform for the PI3K complex, providing the substrate for phosphoinositide generation (PMID: 38342428). To clarify how our mutations impact this step, we starved ATG13-FLAG knock-in cells and observed ATG9 localization. Unexpectedly, even in FU5A knock-in cells where ATG13 is almost completely dissociated from the ULK1 complex, ATG9A still colocalized with FIP200 (revised Figure S3C). These puncta also overlapped with p62, likely because p62 bodies recruit both FIP200 and ATG9 vesicles. Although we suspect that ATG9 recruitment is nonetheless impaired under these conditions, we were unable to definitively demonstrate this experimentally and consider it an important avenue for future study.

      Reviewer #3 (Recommendations for the authors):

      Here are some additional minor suggestions:

      (1) The UBL domains are only mentioned in the abstract but not anywhere else in the manuscript. I suggest the authors add descriptions related to the UBL domains in the Results section.

      We thank the reviewer for pointing out the lack of description of UBL domains, which we added in Results in the revised manuscript.

      (2) The authors may want to consider adding a diagram in Figure 1A to show the domain organization of the three full-length proteins and the ranges of the three fragments in the predicted structure.

      We have added a proposed diagram as Figure 1A.

      (3) I suggest the authors consider highlighting in Figure 1A the positions of the binding sites shown in Figure 1B, for example, by adding arrows in Figure 1A.

      We have added arrows in the revised Figure 1B (which was Figure 1A in the original submission).

      (4) In Figure 1D, "Atg13-FLAG" should be "Atg13-FLAG FIP3A".

      We have revised the labeling in Figure 1D.

      (5) "the binding of ATG13 and ULK1 to the FIP200 dimer one by one" may need to be re-phrased. "One by one" conveys a meaning of "sequential", which is probably not what the authors meant to say.

      We have revised the sentence as “the binding of one molecule each of ATG13 and ULK1 to the FIP200 dimer”.

      (6) In "Wide interactions were predicted between the four molecules", I suggest changing "wide" to "extensive".

      We have changed “wide” to “extensive” in the revised manuscript.

      (7) In "which revealed that the tandem two microtubule-interacting and transport (MIT) domains in Atg1 bind to the tandem two MIT interacting motifs (MIMs) of ATG13", I suggest changing the two occurrences of "tandem two" to "two tandem" or simply "tandem".

      We simply used "tandem" in the revised manuscript.

    1. The code from which this message has been taken is none other than that of the French language; the only knowledge required to decipher it is a knowledge of writing and of French.

      Barthes says that reading the linguistic message mainly just needs language skills, like knowing how to read and understand French. But I think the look of the text—especially the font—also matters a lot. Different typefaces give off different vibes. For example, a handwritten font might feel friendly or personal, while a clean, modern font might feel professional or serious. So even though it’s still text, the style of the typography adds another layer of meaning. It’s not just what the words say, but also how they visually come across.

    1. Reviewer #1 (Public review):

      Summary:

      Flowers et al describe an improved version of qFit-ligand, an extension of qFit. qFit and qFit-ligand seek to model conformational heterogeneity of proteins and ligands, respectively, in cryo-EM and X-ray (electron) density maps using multiconformer models, essentially extensions of the traditional alternate conformer approach in which substantial parts of the protein or ligand are kept in place. By contrast, ensemble approaches represent conformational heterogeneity through a superposition of independent molecular conformations.

      The authors provide a clear and systematic description of the improvements made to the code, most notably the implementation of a different conformer generator algorithm centered around RDKit. This approach yields modest improvements in the strain of the proposed conformers (meaning that more physically reasonable conformations are generated than with the "old" qFit-ligand) and real space correlation of the model with the experimental electron density maps, indicating that the generated conformers also better explain the experimental data than before. In addition, the authors expand the scope of ligands that can be treated, most notably allowing for multiconformer modeling of macrocyclic compounds.

      Strengths:

      The manuscript is well written, provides a thorough analysis, and represents a needed improvement of our collective ability to model small-molecule binding to macromolecules based on cryo-EM and X-ray crystallography, and can therefore have a positive impact on both drug discovery and general biological research.

      Weaknesses:

      Weaknesses were addressed during review. Overall, the demonstrated performance gains are modest.

      Specific comments:

      (1) The accuracy of initial placement may be critical. At the same time, in my experience ambiguous cases are quite common, for example with flat ligands with a few substituents sticking out or with ligands with highly mobile tails. There remain some questions regarding sensitivity to initial ligand placement, which individual users should check for.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Flowers et al describe an improved version of qFit-ligand, an extension of qFit. qFit and qFit-ligand seek to model conformational heterogeneity of proteins and ligands, respectively, in cryo-EM and X-ray (electron) density maps using multi-conformer models - essentially extensions of the traditional alternate conformer approach in which substantial parts of the protein or ligand are kept in place. By contrast, ensemble approaches represent conformational heterogeneity through a superposition of independent molecular conformations.

      The authors provide a clear and systematic description of the improvements made to the code, most notably the implementation of a different conformer generator algorithm centered around RDKit. This approach yields modest improvements in the strain of the proposed conformers (meaning that more physically reasonable conformations are generated than with the "old" qFit-ligand) and real space correlation of the model with the experimental electron density maps, indicating that the generated conformers also better explain the experimental data than before. In addition, the authors expand the scope of ligands that can be treated, most notably allowing for multi-conformer modeling of macrocyclic compounds.

      Strengths:

      The manuscript is well written, provides a thorough analysis, and represents a needed improvement of our collective ability to model small-molecule binding to macromolecules based on cryo-EM and X-ray crystallography, and can therefore have a positive impact on both drug discovery and general biological research.

      Weaknesses:

      There are several points where the manuscript needs clarification in order to better understand the merits of the described work. Overall the demonstrated performance gains are modest (although the theoretical ceiling on gains in model fit and strain energy is not clear!).

      We thank the reviewer for their thoughtful review. To address the comments, we have added clarifying statements and discussion points around the extent of performance gains, our choice of benchmarking metrics, and the “standards” in the field for significance. We expanded our analysis to highlight how to use qFit-ligand in “discovery” mode, which is aimed at supporting individual modeling efforts. As we now write in the discussion:

      “It is advisable to employ qFit-ligand selectively, focusing on cases with a moderate correlation between your input model and the experimental data, strong visual density in the binding pocket, high map resolution, or when your single-conformer ligand model is strained.”

      Additionally, we note in the discussion:

      “qFit-ligand primarily serves as a “thought partner” for manual modeling. Modelers still must resolve many ambiguities, including initial ligand placement, to fully take advantage of qFit capabilities. In active modeling workflows or large scale analyses, the workflow would only accept the output of qFit-ligand when it improves model quality. In cases where qFit-ligand degrades map-to-model fit and/or strain, we can simply revert to the input model. In practice, users can easily remove poorly fitting conformations using molecular modeling software such as COOT, while keeping the well modeled conformations, which is an advantage of the multiconformer approach over ensemble refinement methods.”

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Flowers et al. aimed to enhance the accuracy of automated ligand model building by refining the qFit-ligand algorithm. Recognizing that ligands can exhibit conformational flexibility even when bound to receptors, the authors developed a bioinformatic pipeline to model alternate ligand conformations while improving the fit to density and producing more energetically favorable conformations.

      Strengths:

      The authors present a computational pipeline designed to automatically model and fit ligands into electron density maps, identifying potential alternative conformations within the structures.

      Weaknesses:

      Ligand modeling, particularly in cases of poorly defined electron density, remains a challenging task. The procedure presented in this manuscript exhibits clear limitations in low-resolution electron density maps (resolution > 2.0 Å) and low-occupancy scenarios, significantly restricting its applicability. Considering that the maps used to establish the operational bounds of qFit-ligand were synthetically generated, it's likely that the resolution cutoff will be even stricter when applied to real-world data.

      We thank Reviewer #2 for their comments on the role of conformational flexibility and how our tool addresses the complexity involved in modeling alternative conformations. We agree that there are limitations at low resolution, limiting the application of our algorithm. That is the case with all structural biology tools. Automatically finding alternative conformations of ligands in high-resolution structures is an enhancement to the toolbox of ligand fitting. Expanding the algorithm to work with fragment screening data is important in this realm, as almost all of this data fits in the high-resolution range where qFit-ligand works best.

      The reported changes in real-space correlation coefficients (RSCC) are not substantial, especially considering a cutoff of 0.1. Furthermore, the significance of improvements in the strain metric remains unclear. A comprehensive analysis of the distribution of this metric across the Protein Data Bank (PDB) would provide valuable insights.

      We agree that the changes are small, partially because the baseline (manually modeled ligands) is very high. To provide additional evidence, we added evaluations using EDIAm, which is a more sensitive metric. In Figure 2 (page 10), representing the development dataset, we see more improvements above 0.1. With this being said, it is unclear what constitutes a ‘substantial’ improvement for either of these metrics, especially considering alternative conformations may only change the coordinates of a subset of ligands, just slightly improving the fit to density.

      We agree that looking across the PDB on strain would provide valuable insight. To explore this, we looked to see how qFit-ligand could improve the fitting of deposited ligands with high strain (see section: Evaluating qFit-ligand on a set of structures known to be highly strained, Page 15). While only a subset of these structures had alternative conformers placed (24.6%), we observed that in this subset, the ligands often improved the RSCC and strain. This figure also demonstrates that while RSCC may not change much numerically, the alternative conformers explain previously unexplained density with lower energy conformers than what is currently deposited.

      To mitigate the risk of introducing bias by avoiding real strained ligand conformations, the authors should demonstrate the effectiveness of the new procedure by testing it on known examples of strained ligand-substrate complexes.

      See above.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      A - Specific comments:

      (1) It appears necessary to provide qFit-ligand with an initial model with the ligand already placed. This is not clear from the start of the introduction on page 3. It appears that ligand position is only weakly adjusted fairly late in the process, in step F of Figure 1. It seems, therefore, that the accuracy of initial placement is rather critical (see the example discussed on page 21). At the same time, in my experience, ambiguous cases are quite common, for example with flat ligands with a few substituents sticking out or with ligands with highly mobile tails. It would be helpful for the authors to comment on the sensitivity to initial ligand placement, either in the discussion or, better yet, in the form of an analysis in which the starting model position is randomly perturbed.

      In our revised version, we have modified the introduction to clarify the necessity of including an initial ligand model (page 4).

      “The qFit-ligand algorithm takes as input a crystal or cryo-EM structure of an initial protein-ligand complex with a single conformer ligand in PDBx/mmCIF format, a density map or structure factors (encoded by a ccp4 formatted map or an MTZ), and a SMILES string for the ligand.”

      We also describe our sampling algorithm more clearly (see: Biasing Conformer Generation, page 6). Steps A-E generate many conformations (using RDKit), which are then selected/fit into experimental density (using quadratic programming). To help with additional shifting issues in the input ligand, after the first selection, we do additional rotation/translation of the generated conformers that are kept. We then do another round of fitting to the density (quadratic programming followed by mixed integer quadratic programming).

      Given this sampling, we have not elected to do an additional computational experiment to test the “radius of convergence” or dependence on initial conditions. However, we outline the fundamental procedure here so that someone can build on the work and test the idea:

      - Create single conformer models as we currently do

      - Randomly perturb the coordinates of the ligand by 0.1-0.3 Å

      - Refine to convergence, creating a series of “perturbed, modified true positives” for each dataset

      - Run qFit-ligand

      - Evaluate the variability in the resulting multi-conformer models
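
      The perturbation step in the procedure above could be sketched as follows. This is only an illustration under stated assumptions: each atom is shifted independently in a random direction by a distance drawn uniformly from 0.1 to 0.3 Å, and the function name `perturb_coordinates` and its parameters are hypothetical, not part of qFit-ligand. The authors do not say whether the perturbation is per atom or rigid-body.

```python
import math
import random


def _random_unit_vector(rng):
    """Draw an isotropic random direction by normalizing a Gaussian vector."""
    while True:
        vec = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in vec))
        if norm > 0.0:
            return [c / norm for c in vec]


def perturb_coordinates(coords, min_shift=0.1, max_shift=0.3, seed=None):
    """Shift each atom by a random vector of length min_shift..max_shift (in Å).

    Assumption: atoms move independently; a rigid-body shift would be an
    equally valid reading of the protocol.
    """
    rng = random.Random(seed)
    perturbed = []
    for x, y, z in coords:
        ux, uy, uz = _random_unit_vector(rng)
        shift = rng.uniform(min_shift, max_shift)
        perturbed.append((x + shift * ux, y + shift * uy, z + shift * uz))
    return perturbed


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.2)]
    for before, after in zip(ligand, perturb_coordinates(ligand, seed=0)):
        print(before, "->", after)
```

      Each perturbed model would then be refined and passed to qFit-ligand, and the spread of the resulting multiconformer models would indicate the sensitivity to initial placement.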

      (2) Top of page 6 ("Biasing Conformer Generation"): the authors say "as we only want to generate ligands that physically fit within the protein binding pocket, we bias conformation generation towards structures more likely to fit well within the receptor's binding site". Apart from the odd redundancy of this sentence, I am confused: at the stage that seems to be referred to here (A-C in Figure 1) is the fit to the electron density already taken into account, or does this only happen later (after step E)?

      Thank you for pointing this out. We have edited the statement to clarify it:

      “To guide the conformation generation from the Chem.rdDistGeom based on the ligand type and protein pocket, we developed a suite of specialized sampling functions to bias the conformational search towards structures more likely to fit well into the receptor’s binding site.”

      We do not consider the electron density during conformer generation (only selection from the generated conformers). The sampling is additionally biased by the type of ligand and the size of the binding pocket.

      (3) qFit-ligand appears to be quite slow. Are there prospects for speedup? Can the code take advantage of GPUs or multi-CPU environments?

      We agree with this. We have made some algorithmic improvements, most notably removing duplicate conformers based on root-mean-square deviation (RMSD). This, along with parallelization, decreased the average runtime from ~19 minutes to ~8 minutes (see additional details: qFit-ligand runtime, page 8). We do not currently take advantage of GPU-specific code.
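
      As an illustration of the duplicate-removal idea, a minimal sketch is given below. It assumes that all conformers share the same atom order and coordinate frame (no superposition is performed), and it uses a greedy filter with a hypothetical 0.2 Å threshold. The actual criterion and threshold used in qFit-ligand are not stated here.

```python
import math


def rmsd(conf_a, conf_b):
    """Root-mean-square deviation between two conformers with identical atom order."""
    if len(conf_a) != len(conf_b):
        raise ValueError("conformers must have the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(conf_a, conf_b))
    return math.sqrt(squared / len(conf_a))


def remove_duplicate_conformers(conformers, threshold=0.2):
    """Greedily keep a conformer only if it differs from every kept one.

    `threshold` (in Å) is an illustrative value, not a qFit-ligand default.
    """
    kept = []
    for conf in conformers:
        if all(rmsd(conf, other) > threshold for other in kept):
            kept.append(conf)
    return kept


if __name__ == "__main__":
    base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    near = [(0.0, 0.0, 0.05), (1.0, 0.0, 0.05)]
    far = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    print(len(remove_duplicate_conformers([base, near, far])))
```

      Fewer conformers going into the fitting stage means a smaller selection problem, which is consistent with the reported runtime reduction.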

      (4) Section: Detection of experimental true positive multi-conformer ligands:

      a) Why are carbohydrate ligands excluded? This seems like an important class of ligands that one would like qFit to be able to treat! Which brings me to a related question: can covalently attached groups (e.g., glycosylation sites!) be modeled using qFit-ligand, or is qFit-ligand restricted to non-covalently bound groups?

      Currently, qFit-ligand does not support covalently bound ligands, but this is an area of interest we are hoping to expand into. In the revised version, we added the non-covalently attached carbohydrates back into the true positive dataset. In Figure 4 (page 14), we show that qFit-ligand is able to improve fit to the experimental density in around 80% of structures, while also often reducing torsion strain (see additional details: qFit-ligand applied to unbiased dataset of experimental true positives, page 14).

      b) "as well as 758 cases where the ligand model's deposited alternate conformations (altlocs) were not bound in the same chain and residue number" - I do not understand what this means, or why it leads to the exclusion of so many structures. Likewise, a number of additional exclusions are described in Figure S3. Some more background on why these all happened would be helpful. Are you just left with the "easy" cases?

      Sometimes modelers will list the multiple conformations of a bound ligand as separate residues within the PDB file, rather than as a single multiconformer model. For example, rather than writing a multiconformer LIG bound at A, 201 with altlocs ‘A’ and ‘B’, a modeler might write this instead as LIG, A, 201 and LIG, A, 301. We initially excluded these kinds of structures. However, we agree that this choice resulted in the removal of many potentially valid true positives. We have since updated our data processing pipeline to include these cases, and they are examined in the updated manuscript.

      c) I do not follow the argument made at the end of this section (last two paragraphs on page 9): "when using a single average conformation to describe density from multiple conformations, the true low-energy states may be ignored". I get that, but the conformations in the "modified true positives" dataset derive directly from models in which two conformations were modeled, so this cannot be the explanation for why qFit-ligand models result in somewhat lower average strain. It would seem that the paper could be served by providing examples where single conformations were modeled in deposited structures, but qFit detects multiple conformations.

      We agree with this comment that the strain obtained from the modified true positives is likely higher than that of the deposited models. However, the modified structure is refined with a single conformation, and therefore differs from the deposited “A” conformation. Thus, the reduced strain observed in our qFit-ligand models relative to the modified true positives is not unexpected.

      To expand our dataset, we also looked at deposited structures with high strain, all of which were modeled as single conformers. Here, we saw a decrease in strain when alternative conformers were placed (see section: Evaluating qFit-ligand on a set of structures known to be highly strained, page 15). Further, we provide an example from the XGen macrocycle dataset where a ligand initially modeled as a single conformer exhibited relatively high strain. After qFit‐ligand modeled a second conformation, the overall strain was reduced (Figure 6C, page 19; Figure 6—figure supplement 1C, page 59).

      (5) Section: qFit-ligand applied to an unbiased dataset of experimental true positives Bottom of page 14: The paragraph starting with "qFit-ligand shows particular strength in scenarios with strong evidence..." is enigmatic: there's no illustration (unless it directly relates to the findings in Figure 4, in which case this should be more explicit). Since this points out when the reader will and will not benefit from using qFit-ligand, it should be clear what the authors are talking about.

      This claim considers all the evidence presented in the manuscript, not necessarily one particular aspect of it. We advise using qFit-ligand when there is a moderate correlation between the input model and the experimental data, strong visual density in the binding pocket, high map resolution, and/or when your single conformer ligand model is strained. We have made all of these points clearer in the updated manuscript.

      B  - Section: qFit-ligand can automatically detect and model multiple conformations of macrocycles:

      This is an exciting extension of qFit-ligand, but some aspects of the analysis strike me as worrisome. Of the initial dataset of 150 structures, fewer than half make it all the way through analysis. It's hard to believe that this is a fully representative subset. Why, for example, could 29 structures not be refined against the deposited structure factors? Why does strain calculation (in RDKit?) fail on 30 ligands? What about the other 18 cases: why did these fail (in PHENIX)?

      We agree that this is a striking number of failures; however, we note that they are not specific shortcomings of qFit-ligand (in fact, most occur because standard structural biology and/or cheminformatics software fails on many PDB depositions). Therefore, these failures reflect broader limitations in standard bioinformatics and refinement restraint files when handling macrocycles. The strain calculator we used was not built for macrocycles, and after consulting with many experts in the field, the consensus was that no method works well with macrocycles. We discuss these issues in additional detail in the discussion (page 27):

      “Additionally, our algorithm’s placement within the larger refinement and ligand modeling ecosystem highlighted other areas that need improvement. We note that macrocycles, due to their complicated and interconnected degrees of freedom, suffer acutely from the refinement issues, as demonstrated by the failure of approximately one-third of datasets in our standard preparation or post-refinement pipelines due to ligand parameterization issues. Many of these stemmed from problematic ligand restraint files, highlighting the difficulty of encoding the geometric constraints of macrocycles using standard restraint libraries. Improved force-field or restraints for macrocycles are desperately needed to improve their modeling.”

      C  - Minor issues:

      (1) "Fragment-soaked event maps" - this is a semantically strange section title!

      We have updated the section title in our revised manuscript. The new title is ‘qFit-ligand recovers heterogeneity in fragment-soaked event maps’.

      (2) Too many digits! All over the manuscript, percentages are displayed with 0.01% precision, while these mostly refer to datasets with ~150 structures. Shifting just one structure from one category to another changes these percentages by nearly 1%.

      We have reduced the number of significant figures in our revised manuscript.
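
      The reviewer's point about precision can be checked with simple arithmetic. The sketch below uses the counts quoted elsewhere in this exchange (116 of 135 structures, reported as 85.93%); the helper name `percent_with_step` is hypothetical.

```python
def percent_with_step(count, total):
    """Return the percentage and the change caused by moving one structure."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total, 100.0 / total


if __name__ == "__main__":
    pct, step = percent_with_step(116, 135)
    # Reporting to the nearest percent is already finer than the data allow,
    # because a single structure changes the value by `step` percentage points.
    print(f"{pct:.0f}% (n=116); one structure = {step:.2f} percentage points")
```

      With 135 structures, moving a single structure between categories changes the percentage by roughly three quarters of a percentage point, so digits beyond the first decimal place carry no information.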

      (3) The authors are keen to classify decreases in RSCC as significant only when these changes exceed 0.1, but do not apply the same standard for increases. For instance, in Figure 4B, if we were to classify improvements as significant if ΔRSCC > 0.1, there would be fewer significant improvements than decreases in performance (although it is visually clear that for most datasets things get better). Similarly, in Figure 5A, if we were to classify improvements as significant if ΔRSCC > 0.1, qFit-ligand would only yield significant improvements for two out of 73 cases, which is not a lot.

      We agree with the reviewer that there needs to be more consistency in our analysis of improvements/deteriorations. However, we note that operationally, when a decrease in model quality is observed, the modeler would simply reject the new model in favor of the input model. We have added to the discussion:

      “In active modeling workflows or large scale analyses, the workflow would only accept the output of qFit-ligand when it improves model quality. In cases where qFit-ligand degrades map-to-model fit and/or strain, we can simply revert to the input model. In practice, users can easily remove poorly fitting conformations using molecular modeling software such as COOT, while keeping the well modeled conformations, which is an advantage of the multiconformer approach over ensemble refinement methods.”

      There is generally no consensus in the field as to what might indicate a ‘significant’ change in RSCC, and any threshold we choose would be arbitrary. We note that in our manuscript, we had previously characterized a decrease in RSCC as ‘significant’ if it exceeded 0.1. However, as there is no real scientific justification for this cutoff, or any cutoff, we moved away from this framing in the revised manuscript. Therefore, we now simply report whether RSCC improves. For example, on page 9:

      “qFit-ligand modeled an alternative conformation in 72.5% (n=98) of structures. Compared with the modified true positive models, 83.7% (n=113) of qFit-ligand models have a better RSCC and 77.0% (n=104) structures saw an improvement in EDIAm, representing an improved fit to experimental data in the vast majority of structures.”

      In addition, we have conducted additional experiments using more sensitive metrics (EDIAm) to further illustrate qFit-ligand’s performance.

      (4) Small peptides are not discussed as a class of ligands, although these are quite common.

      Canonical peptides can be modeled with standard qFit. Non-canonical peptides present failure modes similar to the macrocycles discussed above, with a mix of ATOM and HETATM records and the need for custom cif definitions and link records. For these reasons we have not included an analysis outside of the macrocycle section. We have noted this caveat in the discussion:

      “We note that even linear non-canonical peptides present similar failure modes to macrocycles, with a mix of ATOM and HETATM records and the need for custom cif definitions and link records. For these reasons, we did not include analysis on small peptide ligands; however, canonical peptides can be modeled with standard qFit [8].”

      (5) Top of page 10: "while refinement improves": what kind of refinement does this refer to?

      This refers to refinement with Phenix. We have updated this language to reflect this (page 8). “We refer to these altered structures as our ‘modified true positives’, which we use as input to qFit-ligand, and subsequent refinement using Phenix.”

      (6) Bottom of page 11: "they often did" -> "it often did"

      We have made this change in the revised version.

      (7) Top of page 14: RMSDs and B factors do have units.

      We have added the units in our revision.

      (8) Top of page 24. In the generation of a composite omit map, why are new Rfree flags being generated? Did I misunderstand that?

      r_free_flags.generate=True only creates R-free flags if they are not present in the input file, as is the case for many (especially older) PDB depositions.

      (9) Bottom of page 27: how large is the mask? Presumably when alt confs of the ligand are possible, it would be helpful for the mask to cover those?

      We agree that this mask should be updated. In our revision, we define the mask around the coordinates of the full qFit-ligand ensemble. The same mask is used to calculate the RSCC of the input (single conformer) model versus the qFit-ligand model.
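
      To make the masking idea concrete, the sketch below builds one mask from the atoms of all conformers and evaluates RSCC, computed here as the Pearson correlation between observed and calculated density, over the same masked grid points for any model being compared. The flat lists of grid points and density values, the 1.5 Å radius, and the function names are assumptions for illustration only, not the authors' implementation.

```python
import math


def ensemble_mask(grid_points, conformers, radius=1.5):
    """Indices of grid points within `radius` Å of any atom in any conformer."""
    atoms = [atom for conf in conformers for atom in conf]
    return [
        i for i, point in enumerate(grid_points)
        if any(math.dist(point, atom) <= radius for atom in atoms)
    ]


def rscc(observed, calculated, mask):
    """Pearson correlation of observed vs. calculated density over masked points."""
    obs = [observed[i] for i in mask]
    calc = [calculated[i] for i in mask]
    n = len(mask)
    if n < 2:
        raise ValueError("mask must contain at least two grid points")
    mean_o, mean_c = sum(obs) / n, sum(calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(obs, calc))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_c = sum((c - mean_c) ** 2 for c in calc)
    if var_o == 0.0 or var_c == 0.0:
        raise ValueError("density is constant inside the mask")
    return cov / math.sqrt(var_o * var_c)
```

      Using one shared mask for the single-conformer and multiconformer models keeps the two RSCC values comparable, since both are evaluated over the same region of the map.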

      (10) Middle of page 29: "These structure factors are then used to compute synthetic electron density maps." - It is not clear whether the following three sentences are an explanation of the details of that statement or rather things that are done afterwards.

      We clarify this in the manuscript (page 36).

      “These structure factors are then used to compute synthetic electron density maps. To each of these maps, we generate and add random Gaussian noise values scaled proportionally to the resolution. This scaling reflects the escalation of experimental noise as resolution deteriorates, a common occurrence in real-life crystallographic data.”
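
      A minimal sketch of this kind of noise model follows. It assumes the map is a flat list of density values and that the noise standard deviation is a constant times the resolution in Å; the constant `noise_per_angstrom` and the function name are hypothetical, since the manuscript excerpt does not give the scaling factor.

```python
import random


def add_resolution_scaled_noise(density, resolution, noise_per_angstrom=0.05, seed=None):
    """Add zero-mean Gaussian noise whose standard deviation grows linearly
    with the resolution value (in Å), so worse resolution means noisier maps.

    `noise_per_angstrom` is an illustrative constant, not a published value.
    """
    rng = random.Random(seed)
    sigma = noise_per_angstrom * resolution
    return [value + rng.gauss(0.0, sigma) for value in density]


if __name__ == "__main__":
    clean = [0.0, 0.5, 1.0, 0.5, 0.0]
    for resolution in (1.0, 2.0, 3.0):
        noisy = add_resolution_scaled_noise(clean, resolution, seed=1)
        print(resolution, [round(v, 3) for v in noisy])
```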

      (11) Chemical synthesis: I am not qualified to assess this and am surprised to see so much detail here rather than in some other manuscript. Are the corresponding structures deposited anywhere?

      All of the structures we discuss in this manuscript are deposited in the PDB and listed in Supplementary Table 5.

      Reviewer #2 (Recommendations for the authors):

      The data should consistently present the number of structures that exhibit improvements or deterioration in particular metrics, like RSCC and strain, using a cutoff that should be significant. For instance, stating that "85.93% (n=116) of structures having a better RSCC in the qFit-ligand models compared to the modified true positive models" without clarifying the magnitude of improvement (e.g., a marginal increase of 0.01 in RSCC) lacks meaningful context. The figures should clearly indicate the specific cutoff values used for each metric. The accompanying text should provide a detailed explanation for the selection of these cutoff values, justifying their significance in the context of the study.

      Currently, there is no established consensus within the field on what constitutes a 'significant' improvement in RSCC or strain values. As such, we chose not to impose an arbitrary cutoff and simply report which structures improve in RSCC. We also removed all language stating significance, as there isn’t a good standard in the field to assess significance. This is especially important as only improvements would be considered in an active modeling project. In cases where qFit-ligand degrades the RSCC (or strain) to a large extent, the modeler would simply revert to the input model.
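
      The accept-or-revert workflow described here could be expressed as a simple rule. The sketch below is one possible reading, not the authors' code: it accepts the qFit-ligand model only when RSCC improves and strain does not get worse, and otherwise keeps the input model. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetrics:
    rscc: float    # real-space correlation coefficient, higher is better
    strain: float  # ligand strain energy, lower is better


def choose_model(input_metrics, qfit_metrics):
    """Return "qfit" only if RSCC improves without raising strain, else "input".

    Assumption: no minimum-improvement cutoff is applied, mirroring the
    authors' decision not to impose an arbitrary threshold.
    """
    improves_fit = qfit_metrics.rscc > input_metrics.rscc
    no_worse_strain = qfit_metrics.strain <= input_metrics.strain
    return "qfit" if improves_fit and no_worse_strain else "input"
```

      A modeler could of course relax the rule, for example by accepting a small strain increase when the density fit improves markedly; that trade-off is a judgment call the text leaves to the user.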

      In the first section of Results: "First, for all ligands, we perform an unconstrained search function allowing the generated conformers to only be constrained from the bounds matrix (Figure 1A). This is particularly advantageous for small ligands that benefit from less restriction to fully explore their conformational space. We then perform a fixed terminal atoms search function (Figure 1B)." It is unclear whether a fixed terminal atom search was conducted for each conformer generated in the initial step to further explore the conformational space. This aspect should be clarified to provide a more comprehensive understanding of the methodology.

      Each independent conformer generation function (A-E) is initialized with only the input ligand model and runs in parallel with the other functions. These functions do not build on each other, but rather perturb the input molecule independently of one another. In our updated manuscript, we have clarified the methodology (page 6).

      “First, in all cases, we perform an unconstrained search function (Figure 1A), a fixed terminal atoms search function (Figure 1B), and a blob search function (Figure 1C).”
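
      The independence of the sampling functions can be illustrated with a small sketch. The two samplers below are trivial hypothetical stand-ins (they are not the unconstrained, fixed-terminal-atom, or blob searches), and the use of a thread pool is an assumption about how such functions might be run side by side.

```python
from concurrent.futures import ThreadPoolExecutor


def shift_x(ligand):
    """Hypothetical sampler: returns one conformer shifted along x."""
    return [[(x + 0.5, y, z) for x, y, z in ligand]]


def shift_y(ligand):
    """Hypothetical sampler: returns one conformer shifted along y."""
    return [[(x, y + 0.5, z) for x, y, z in ligand]]


def sample_independently(ligand, samplers):
    """Run every sampler on the same unmodified input and pool the conformers."""
    frozen = tuple(ligand)  # immutable copy, so no sampler can build on another
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda sampler: sampler(frozen), samplers))
    return [conf for conformers in results for conf in conformers]


if __name__ == "__main__":
    print(sample_independently([(0.0, 0.0, 0.0)], [shift_x, shift_y]))
```

      Because each sampler sees only the input ligand, the pooled set is the union of the individual outputs, and the order in which the samplers finish does not affect the result.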

      Phrase: "We randomly sampled 150 structures and, after manual inspection of the fit of alternative conformations, chose 135 crystal structures as a development set for improving qFit-ligand." The authors should explain why they filtered 10% of the structures.

      To develop qFit-ligand, we wanted to use a very high-quality dataset. We needed to know with some degree of certainty that if qFit-ligand failed to produce an alternate conformation (or generated conformations low in RSCC or high in strain), the failure was due to an algorithmic limitation rather than poor-quality input data. Therefore, after selection based on numerical metrics, we manually examined each ligand in Coot to judge whether the alternative conformers fit well into the density.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors re-analyzed a public dataset (Rademaker et al, 2019, Nature Neuroscience) which includes fMRI and behavioral data recorded while participants held an oriented grating in visual working memory (WM) and performed a delayed recall task at the end of an extended delay period. In that experiment, participants were pre-cued on each trial as to whether there would be a distracting visual stimulus presented during the delay period (filtered noise or randomly-oriented grating). In this manuscript, the authors focused on identifying whether the neural code in retinotopic cortex for remembered orientation was 'stable' over the delay period, such that the format of the code remained the same, or whether the code was dynamic, such that information was present, but encoded in an alternative format. They identify some timepoints - especially towards the beginning/end of the delay - where the multivariate activation pattern fails to generalize to other timepoints, and interpret this as evidence for a dynamic code. Additionally, the authors compare the representational format of remembered orientation in the presence vs absence of a distracting stimulus, averaged over the delay period. This analysis suggested a 'rotation' of the representational subspace between distracting orientations and remembered orientations, which may help preserve simultaneous representations of both remembered and viewed stimuli. Intriguingly, this rotation was a bit smaller for Expt 2, in which the orientation distractor had a greater behavioral impact on the participants' behavioral working memory recall performance, suggesting that more separation between subspaces is critical for preserving intact working memory representations.

      Strengths:

      (1) Direct comparison of coding subspaces/manifolds between timepoints, task conditions, and experiments is an innovative and useful approach for understanding how neural representations are transformed to support cognition.

      (2) Re-use of an existing dataset goes substantially beyond the authors' previous findings by comparing the geometry of representational spaces between conditions and timepoints, and by looking explicitly for dynamic neural representations.

      (3) Simulations testing whether dynamic codes can be explained purely by changes in data SNR are an important contribution, as this rules out a category of explanations for the dynamic coding results observed.

      Weaknesses:

      (1) Primary evidence for 'dynamic coding', especially in early visual cortex, appears to be related to the transition between encoding/maintenance and maintenance/recall, but the delay period representations seem overall stable, consistent with some previous findings. However, given the simulation results, the general result that representations may change in their format appears solid, though the contribution of different trial phases remains important for considering the overall result.

      (2) Converting a continuous decoding metric (angular error) to "% decoding accuracy" serves to obfuscate the units of the actual results. Decoding precision (e.g., sd of decoding error histogram) would be more interpretable and better related to both the previous study and behavioral measures of WM performance.
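      To make the suggested precision metric concrete, here is a minimal sketch of a circular standard deviation computed over decoding errors. It is an illustration only, not the metric used in the manuscript or in Rademaker et al. (2019); the function name `circular_sd_deg` and the assumption that errors are expressed in degrees over a 180-degree orientation space are hypothetical.

```python
import math


def circular_sd_deg(errors_deg, period=180.0):
    """Circular standard deviation of decoding errors, in degrees.

    Hypothetical helper: assumes errors are angular differences in degrees
    and that orientation repeats every `period` degrees.
    """
    if not errors_deg:
        raise ValueError("need at least one error value")
    # Stretch the periodic orientation space onto a full circle (radians).
    scale = 2.0 * math.pi / period
    mean_cos = sum(math.cos(e * scale) for e in errors_deg) / len(errors_deg)
    mean_sin = sum(math.sin(e * scale) for e in errors_deg) / len(errors_deg)
    # Mean resultant length R: 1 for identical errors, near 0 for uniform errors.
    r = min(math.hypot(mean_cos, mean_sin), 1.0)
    if r == 0.0:
        return math.inf
    # Standard definition sqrt(-2 ln R), mapped back to degrees of orientation.
    return math.sqrt(max(0.0, -2.0 * math.log(r))) / scale


tight = circular_sd_deg([-2.0, 0.0, 2.0])
broad = circular_sd_deg([-30.0, 0.0, 30.0])
print(f"tight: {tight:.2f} deg, broad: {broad:.2f} deg")
```

      Unlike a percentage, the result stays in degrees, so it can be set side by side with the spread of behavioral recall errors.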

      Comments on revised version:

      The authors have addressed all my previous concerns.

    1. Two graduate students were trained by the first author to code the sessions. See the Appendix for coding taxonomy

      Coding is thorough and completed by multiple individuals. Do they use interrater reliability?

    Annotators

    1. Pigs are crucial sources of meat and protein, valuable animal models, and potential donors for xenotransplantation. However, the existing reference genome for pigs is incomplete, with thousands of segments and missing centromeres and telomeres, which limits our understanding of the important traits in these genomic regions. To address this issue, we present a near complete genome assembly for the Jinhua pig (JH-T2T), constructed using PacBio HiFi and ONT long reads. This assembly includes all 18 autosomes and the X and Y sex chromosomes, with only six gaps. It features annotations of 46.90% repetitive sequences, 35 telomeres, 17 centromeres, and 23,924 high-confident genes. Compared to the Sscrofa11.1, JH-T2T closes nearly all gaps, extends sequences by 177 Mb, predicts more intact telomeres and centromeres, and gains 799 more genes and loses 114 genes. Moreover, it enhances the mapping rate for both Western and Chinese local pigs, outperforming Sscrofa11.1 as a reference genome. Additionally, this comprehensive genome assembly will facilitate large-scale variant detection and enable the exploration of genes associated with pig domestication, such as GPAM, CYP2C18, LY9, ITLN2, and CHIA. Our findings represent a significant advancement in pig genomics, providing a robust resource that enhances genetic research, breeding programs, and biomedical applications.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaf048), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      Original version

      Reviewer 1: Martien Groenen

      The manuscript describes the T2T genome assembly for the Chinese pig breed Jinhua, which presents a vast improvement compared to the current reference genome of the Duroc pig TJTabasco (build11.1). The results and methodology used for the assembly are described clearly and the authors show the improvement of this assembly by a detailed comparison with the current reference 11.1. While clearly of interest to be published, several aspects of the manuscript should be improved. Most of these changes are minor modifications or inaccuracies in the presentation of the results.

      However, there are two major aspects that need further attention:

      1. The T2T assembly presented represents a combination of the two haplotypes of the pig sequenced. I am surprised that the authors did not also develop two haplotype resolved assemblies of this genome. Haplotype resolved assemblies will be the assemblies of choice for future developments of a reference pan-genome for pigs. The authors describe that they have sequenced the two parents of the sequenced F1 individual, so why did they not use the trio-binning approach to also develop haplotype resolved assemblies? I think adding these to the manuscript would be a vast improvement for this important resource.

      2. The results described for the identification of selective sweep regions are not very convincing. This analysis shows differences in the genomes of two breeds: Duroc and Jinhua. However, these breeds have a very different origin of domestication of wild boars that diverged 1 million years ago, followed by the development of a wide range of different breeds selected for different traits. Therefore, the comparison made by the authors cannot distinguish between differences in evolution of Chinese and European Wild Boar, more recent selection after breed formation and even drift. To be able to do so, these analyses would need the inclusion of additional breeds and wild boars from China and Europe. Alternatively, the authors can decide to tone down this part of the manuscript or even delete it altogether, as it does not add to the major message of the manuscript.

      Minor comments

      Line 34: Change the sentence to: "with thousands of segments and centromeres and telomeres missing"
      Line 37: Insert "and Hi-C" after "long reads"
      Line 46: Delete "such as GPAM, CYP2C18, LY9, ITLN2, and CHIA"
      Line 54: Insert "potential" before "xenotransplantation"
      Line 82: Delete "in response to the gap of a T2T-level pig genome" as this does not add anything and the use of "gap" in this context is confusing.
      Line 93: Change "The fresh blood" to "Fresh blood"
      Line 100: The authors need to provide a reference for the SDS method.
      Lines 152-153, line 444, and table S6: This is confusing. The authors mention Genotypes from 939 individuals, but in the table it is shown that they have used WGS data. You need to describe how the WGS data was used to call the genotypes for these individuals. Furthermore, in line 444 you mention 289 JH pigs and 616 DU pigs which together is 905. What about the other 34 individuals shown in table S6?
      Line 244: Replace "were" by "was" and delete "the" before "fastp"
      Lines 287-292: Here you use several times "length of xx Gb and yy contigs". This is not correct as the value for the contigs refers to a number and not a length. Rephrase e.g. like "length of xx Gb and consisting of yy contigs"
      Line 294: The use of "bone" seems strange. Either use "backbone" or "core"
      Line 306: Replace "chromosome" by "genome"
      Lines 308-309: For the comment "Second, 16 of the 20 chromosomes were each represented by a single contig" you refer to figure 1D however from this figure it cannot be seen if the different chromosomes consist of a single or multiple contigs.
      Line 346: Do you mean build 11.1 with "historical genome version"? If so, please use that instead.
      Line 349: "post-gap filled"
      Line 353: The largest gap is 35 kb not 36 kb. Figures 2F-I should be better explained in the legends and the main text (lines 353-358).
      Line 378: For the 23,924 genes you refer to supp table S13. However, that table shows a list of SV enriched QTL not these genes. Furthermore, I checked all tables but a table with all the protein coding genes is missing.
      Line 380: For the 799 newly anchored genes, refer to table S10. Now you refer to table S17 which shows genes enriched KEGG pathways.
      Lines 383-386: For the higher gene density in GC rich regions, you refer to figure 1D, but it is impossible to see this correlation from figure 1D. For the density of genes and telomeres, you refer to figure 1G. However, that figure does not show gene densities only repeat densities.
      Lines 406-407: This should be table S11.
      Lines 409-412: For this result you refer to table S11. However, that table only shows data for the gained genes, not the lost genes.
      Lines 419-420: You refer to table S12 and figure 3B, but the information is only shown in figure 3B and not in table S12.
      Line 420: Replace "were" by "is"
      Line 422: Better to use "repeats" instead of "they"
      Line 425: "Moreover, 12,129 genes located in these SVs". Unclear to what "these" refers to and I assume that you mean genes that (partially) overlap with SVs? Also, this is an incomplete sentence (verb missing). Likewise, this number is not very meaningful as many of these SVs are within introns. It is much more informative to mention for how many genes SVs affect the CDS.
      Line 433 and table S14: This validation is not clear at all. What exactly are these numbers that are shown? You also mention "greater than 1.00" but the table does not contain any number that is greater than 1.00.
      Line 435: "Table" not "Tables"
      Line 436: Change to "SVs with a length larger than 500 bp"
      The term "invalidate" in figure 3D is rather awkward. Better to use "not-validated" and "validated" in this figure.
      Line 449: This should be Table S16.
      Line 452: There is no Table S18.
      Lines 484-486: Change to "Similarly, in human, the use of the T2T-CHM13 genome assembly yields a more comprehensive view of SVs genome-wide, with a greatly improved balance of insertions and deletions [61]."
      Lines 500-501: Change to "For example, in human, the T2T-CHM13 assembly was shown to improve the analysis of global"
      Lines 517-528: This paragraph should be deleted as these genes have already been annotated and described in previous genome builds including 11.1. Why discuss these genes here? Following that line of thinking, almost every gene of the 20,000 can be discussed.
      Line 532: "%" instead of "%%" and insert "which" after "SVs"
      Lines 537-542: These sentences should be deleted. It is common knowledge that second generation sequencing is not very sensitive to identify SVs. The authors also do not provide any results about dPCR.
      Line 544: "affect" rather than "harbor"
      Lines 544-547: This is repetitive and has been stated multiple times so better to delete.
      Line 561: "which is serve to immune system's response and relevant to transplant rejection" This is an incorrect sentence and should be rephrased.
      Lines 562-568: I don't agree with this statement and suggest to remove it from the discussion.

      Reviewer 2: Benjamin D Rosen

      The first near-complete genome assembly of pig: enabling more accurate genetic research. The authors describe the telomere-to-telomere assembly of a Jinhua breed pig. They sequenced genomic DNA from whole blood with PacBio HiFi and Oxford Nanopore (ONT) long-read technologies as well as Illumina for short reads. They generated HiC data for scaffolding from blood and extracted RNA from 19 tissues for short read RNAseq for gene annotation. A hifiasm assembly was generated with the HiFi data and scaffolded with HiC to chromosome level with 63 gaps. The scaffolded assembly was gap filled with contigs from a NextDenovo assembly of the ONT data bringing the gaps down to 14. Finally, the assembly was manually curated with juicebox somehow closing a further 8 gaps. This needs to be clarified. Standard assembly assessments were performed as well as genome annotation. The authors compared their assembly to the current reference, Sscrofa11.1, and called SVs between the assemblies. The SVs were validated with additional Jinhua and Duroc animals. They then identified signatures of selection present in some of the largest SVs.

      General comments: The manuscript is mostly easy to read but would benefit from further editing for language throughout. The described assembly appears to be high quality and quite contiguous. Although the authors do mention obtaining parental samples and claim the assembly is fully phased, there is no mention of how this was done. There are many additional places where the methods could be described more fully including the addition of parameters used.

      Specific comments:

      Line 39 - Figure 1 only displays 34 telomeres, not 35. Additionally, I was only able to detect 33 telomeres using seqtk telo. Seqtk only reports telomeres at the beginning and end of sequences; digging further, the telomere on chr2 is ~59kb from the end of the chromosome, perhaps indicating a misassembly.
      Lines 79-81 - there are not hundreds of species with gap free genome assemblies and reference 19 does not claim that there are.
      Line 82 - the assembly is not gap-free, replace with "nearly gap-free"
      Line 95 - were these parental tissue samples ever used?
      Lines 151-156 - this section would be better located below the assembly methods. Please number supplementary tables in order of their appearance in the text.
      Line 171 - please provide parameters used here and for all analyses.
      Lines 187-188 - how did rearranging contigs decrease the gaps? Was the same gap filling procedure used after HiC manual adjustments?
      Line 188 - Figure S3 - I don't understand the relationship between the panels nor what the authors are attempting to show. If panels A-C display chromosomes 2, 8, and 13, why does D display chr3? Both panels C and E are labeled chr13 but they look nothing alike. Are D-E whole chromosomes or zoomed in views? Missing description of panel F.
      Lines 222-224 - why weren't pig proteins used? Ensembl rapid release has annotated protein datasets for 9 pig assemblies.
      Line 264 - although most will know this, make it clear that Sscrofa11.1 is an assembly of a Duroc pig.
      Line 292 - how was polishing performed? This is missing from the methods.
      Line 294 - should this read "selected it for the backbone of the genome assembly."?
      Lines 298-299 - methods?
      Line 314 - what is meant by "using mapped K-mers from trio Illumina PCR-free reads data"?
      Line 331 - accession numbers for assemblies would be useful.
      Line 333 - what is "properly mapped rate"? Do you mean properly paired mapping rate?
      Line 346 - what is the historical genome version?
      Line 349 - Supplemental Table S8 only has 55 entries including the 6 remaining gaps. Where are the other filled 8 gaps located?
      Lines 350-358 - read depth displays wouldn't show the presence of clipped reads which would indicate an improperly closed gap. It would be more convincing to display IGV windows containing these alignments showing that there are no clipped reads.
      Line 354 - Figure S5 needs a better legend. What is ref and what is own?
      Line 359 - the assembly is near-gapless.
      Line 359 - where is the data regarding assembly phasing? How was this determined to be fully phased?
      Line 363 - 16 of 20 chromosomes are gapless.
      Line 370 - only 33 telomeres were found at the expected location (end of the chromosome), if you count the telomere on chr2 59kb from the end, then 34 telomeres were identified.
      Line 372 - chr13 also only has a single telomere. It does not have a telomere at the beginning.
      Line 372 - chr19 is chrX correct?
      Line 374 - Figure 1G - It would be nice to have the centromeres marked on this plot (or in Figure 3A). Are the long blocks of telomeric repeats internal to the chromosomes expected?
      Line 423 - Figure 3A - there is no telomeric repeat at the beginning of chr4 or chrX
      Line 431 - why were only 5 pigs of each breed used to validate SVs when 100's of WGS datasets from the two breeds had been aligned? How were these 5 selected?
      Line 481 - Sscrofa11.1 only has 544 gaps.
      Line 492 - ONT data was used to fill more than 6 gaps. Gaps in the assembly were reduced from 63 to 14 using ONT contigs.
      Lines 588-589 - please make your code publicly available through zenodo, github, figshare, or something similar.
      Lines 815-824 - Figure 2 - legend description needs to be improved. Only A is mapping rates, B and C are PM rates and base error rates. The color switch from A-C having European pigs in blue to D having JH-T2T in blue might confuse readers.
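      The reviewer's point about a telomere sitting ~59kb inside chr2 can be illustrated with a small sketch that reports how far each telomeric repeat run lies from the nearest sequence end. This is not how seqtk telo works internally; the function `find_telomere_runs`, the exact-tandem-repeat matching, and the `min_repeats` threshold are assumptions made for illustration, and real telomeres contain imperfect repeats that this simple pattern would split or miss.

```python
import re


def find_telomere_runs(seq, motif="TTAGGG", min_repeats=50):
    """Return (start, end, distance_to_nearest_end) for each telomeric run.

    Hypothetical helper: matches only perfect tandem copies of the
    vertebrate telomere motif or its reverse complement.
    """
    complement = str.maketrans("ACGT", "TGCA")
    rev_comp = motif.translate(complement)[::-1]
    pattern = re.compile(
        "(?:%s){%d,}|(?:%s){%d,}" % (motif, min_repeats, rev_comp, min_repeats)
    )
    seq = seq.upper()
    runs = []
    for match in pattern.finditer(seq):
        # Distance 0 means the run touches one end of the sequence.
        distance = min(match.start(), len(seq) - match.end())
        runs.append((match.start(), match.end(), distance))
    return runs


# Toy chromosome: a terminal run at the start, and a second run that sits
# 59,000 bases away from the other end.
toy = "CCCTAA" * 60 + "ACGT" * 25000 + "TTAGGG" * 60 + "A" * 59000
for start, end, distance in find_telomere_runs(toy):
    label = "terminal" if distance == 0 else "internal"
    print(start, end, distance, label)
```

      A run with a large distance to the nearest end is the kind of signal that would prompt a closer look at the assembly around that position.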

    1. Multivariate predictive models play a crucial role in enhancing our understanding of complex biological systems and in developing innovative, replicable tools for translational medical research. However, the complexity of machine learning methods and extensive data pre-processing and feature engineering pipelines can lead to overfitting and poor generalizability. An unbiased evaluation of predictive models necessitates external validation, which involves testing the finalized model on independent data. Despite its importance, external validation is often neglected in practice due to the associated costs. Here we propose that, for maximal credibility, model discovery and external validation should be separated by the public disclosure (e.g. pre-registration) of feature processing steps and model weights. Furthermore, we introduce a novel approach to optimize the trade-off between efforts spent on training and external validation in such studies. We show on data involving more than 3000 participants from four different datasets that, for any “sample size budget”, the proposed adaptive splitting approach can successfully identify the optimal time to stop model discovery so that predictive performance is maximized without risking a low powered, and thus inconclusive, external validation. The proposed design and splitting approach (implemented in the Python package “AdaptiveSplit”) may contribute to addressing issues of replicability, effect size inflation and generalizability in predictive modeling studies.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaf036), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      Original version

      Reviewer 1: Qingyu Zhao

      The manuscript discusses an interesting approach that seeks optimal data split for the pre-registration framework. The approach adaptively optimizes the balance between predictive performance of discovery set and sample size of external validation set. The approach is showcased on 4 applications, demonstrating advantage over traditional fixed data split (e.g., 80/20). I generally enjoyed reading the manuscript. I believe pre-registration is one important tool for reproducible ML analysis and the ideology behind the proposed framework (investigating the balance between discovery power and validation power) is urgently needed. My main concerns are all around Fig. 3, which represents the core quantitative analysis but lacks many details.

      1. Fig. 3 is mostly about external validation. What about training? For each n_total, which stopping rule is activated? What is the training accuracy? What does l_act look like? What is \hat{s_total}?
      2. Results section states "the proposed adaptive splitting strategy always provided equally good or better predictive performance than the fixed splitting strategies (as shown by the 95% confidence intervals on Figure 3)". I'm confused by this because the blue curve is often below other methods in accuracy (e.g., comparing with 90/10 split in ABIDE and HCP).
      3. Why does the half split have the lowest accuracy but the highest statistical power?
      4. How was the range of x-axis (n_total) selected? E.g., HCP has 1000 subjects, why was 240-380 chosen for analysis?
      5. The lowest n_total for BCW and IXI is approximately 50. If n_act starts from 10% of n_total, how is it possible to train (nested) cross-validation on 5 samples or so?

      Two other general comments are: 1. How can this be applied to retrospective data or secondary data analysis where the collection is finished? 2. Is there guidance on the minimum sample size that is required to perform such an auto-split analysis? It is surprising that the authors think the two studies with n=35 and n=38 are good examples of training generalizable ML models. It is generally hard to believe any ML analysis can be done on such low sample sizes with thousands of rs-fMRI features. By the way, I believe n=25 in Kincses 2024 if I read it correctly.

      Reviewer 2: Lisa Crossman

      "External validation of machine learning models - registered models and adaptive sample splitting", Gallitto et al. The manuscript describes a methodology and algorithm aimed at better choosing a train-test validation split of data for scikit-learn models. A Python package, adaptivesplit, was built as part of this MS as a tool for others to use. The package is proposed to be used together with a suggested workflow to integrate an approach invoking registered models as a full design for better prospective modelling studies. Finally, the work is evaluated on four alternative publicly available datasets of health research data and comprehensive results are presented. There is a trade-off in the split between the amount of sample data to be used for training and the amount of data to use for validation. Ideally the content of each must be balanced in order for the trained model to be representative and equally for the validation set to be representative. This manuscript is therefore very timely due to the large increase in the use of AI models and provides important information and methodology.

      This reviewer does not have the specific expertise to provide detailed comments on the statistical rule methods.

      Main Suggested Revision: 1. The Python implementation of the "adaptivesplit" package is described as available on GitHub (Gallitto et al., n.d.). One of the major points of the paper is to provide the Python package "adaptivesplit"; however, this package does not have a clear hyperlink, is not found by simple Google searches, and appears not to be available yet. It is therefore not possible to evaluate it at present. After further Google searches, a website with a preprint of this MS was found, https://pnilab.github.io/adaptivesplit/; however, adaptivesplit is shown there as an interactive Jupyter-type notebook example and not as Python library code. Therefore, it is not clear how available the package is for others' use. Can the authors comment on the code availability?

      Minor comments: 1. Apart from the 80:20 Pareto split of train-test data, other splits are commonly used in ratios such as 75:25 (the scikit-learn default split if ratio is unspecified), and 70:30. Also the cross-validation strategy with train-test-validation split 60:20:20, yet these strategies have not been mentioned or included in the figures such as Fig 3. The splits provided in the figure and discussed are 50:50, 80:20 and 90:10 only. Could the authors discuss alternative split ratios?
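      The validation side of the trade-off behind these fixed split ratios can be sketched with a standard power approximation. The example below is not the AdaptiveSplit package or its stopping rule; it assumes the external validation tests a correlation between predicted and observed values, uses the Fisher z approximation, and the function name `validation_power`, the effect size `r = 0.3`, and the total sample size of 200 are hypothetical choices for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def validation_power(r, n_val, alpha=0.05):
    """Approximate power of a two-sided test of zero correlation.

    Hypothetical helper: uses the Fisher z approximation, where atanh(r)
    has a standard error of 1 / sqrt(n_val - 3).
    """
    if n_val <= 3:
        return 0.0
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(atanh(r)) * sqrt(n_val - 3)
    # Probability that the test statistic lands in either rejection region.
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


n_total = 200  # assumed sample size budget
for train_pct in (50, 70, 75, 80, 90):
    n_val = n_total * (100 - train_pct) // 100
    power = validation_power(0.3, n_val)
    print(f"{train_pct}:{100 - train_pct} split, n_val={n_val}, power={power:.2f}")
```

      Under these assumptions, a larger training share leaves fewer validation samples and therefore lower power to confirm the same effect, which is why the 50:50 split can show the highest power even when its model is the least accurate. The sketch holds the effect size fixed, so it does not model how accuracy improves with more training data.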

    1. To truly understand the cancer biology of heterogenous tumors in the context of precision medicine, it is crucial to use analytical methodology capable of capturing the complexities of multiple omics levels, as well as the spatial heterogeneity of cancer tissue. Different molecular imaging techniques, such as mass spectrometry imaging (MSI) and spatial transcriptomics (ST) achieve this goal by spatially detecting metabolites and mRNA, respectively. To take full analytical advantage of such multi-omics data, the individual measurements need to be integrated into one dataset. We present MIIT (Multi-Omics Imaging Integration Toolset), a Python framework for integrating spatially resolved multi-omics data. MIIT’s integration workflow consists of performing a grid projection of spatial omics data, registration of stained serial sections, and mapping of MSI-pixels to the spot resolution of Visium 10x ST data. For the registration of serial sections, we designed GreedyFHist, a registration algorithm based on the Greedy registration tool. We validated GreedyFHist on a dataset of 245 pairs of serial sections and reported an improved registration performance compared to a similar registration algorithm. As a proof of concept, we used MIIT to integrate ST and MSI data on cancer-free tissue from 7 prostate cancer patients and assessed the spot-wise correlation of a gene signature activity for citrate-spermine secretion derived from ST with citrate, spermine, and zinc levels obtained by MSI. We confirmed a significant correlation between gene signature activity and all three metabolites. To conclude, we developed a highly accurate, customizable, computational framework for integrating spatial omics technologies and for registration of serial tissue sections.

      A version of this preprint has been published in the Open Access journal GigaScience (see paper (https://doi.org/10.1093/gigascience/giaf035)), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

      Original Submission

      Reviewer 1: Hua Zhang

      Wess et al. report a Python framework, MIIT (Multi-Omics Imaging Integration Toolset), for integrating spatially resolved multi-omics data. Multi-omics imaging represents a pivotal approach for systems molecular biology and biomarker discovery, and this method introduces a timely and valuable tool to advance the field. However, in my opinion, this paper still has some issues that need to be addressed before consideration for publication. Cancer tissue exhibits significant heterogeneity. In this study, the different molecular information is obtained from different tissue sections, which means from different cells, as each tissue section is 10 μm thick, almost the diameter of a cell. Please highlight how meaningful the co-registered information is if it is obtained from different cell layers. In particular, for the spatial transcriptomics and MSI datasets, the experiments were conducted on serial sections with an axial sectioning distance of 40 to 100 μm. This means that the mRNA and metabolites originate from different cells, raising questions about how integrating these two datasets can provide meaningful insights. The multi-omics imaging integration toolset is based on GreedyFHist, a non-rigid registration algorithm; I suggest including more details about this algorithm and highlighting the differences compared to previously reported non-rigid image co-registration algorithms. The authors should demonstrate the accuracy of the background segmentation, as there is a concern that sample areas with low signal would be removed in the denoising step. What is the criterion used to define the background region and the sample region in the background segmentation?

      In the Methods section, more details need to be included in the spatial transcriptomics part: what spatial resolution of the 10x Genomics platform was used? As the MALDI resolution is 30 μm, how were the pixels of the ST and MSI data aligned if their spatial resolutions differ? In the MALDI-MSI of prostate tissue, on-tissue MS/MS data is missing to confirm the identification of the target analytes citrate, ZnCl3-, and spermine.

      Reviewer 2: Santhoshi Krishnan

      Overview: In this paper, the authors present the Multi-Omics Imaging Integration Toolset, which is a Python framework for integrating multiple spatial omics datatypes. To facilitate this, they also developed a registration method (GreedyFHist) for jointly analyzing sequential tissue layers that have undergone different types of staining/phenotyping regimens. The method validation was done on 244 fresh-frozen prostate tissue sections. The highly detailed methods and results section is well appreciated and helps fully contextualize the significance of the study. The definitions of study-specific terms mentioned throughout the paper at the beginning are also appreciated.

      Data and Code Availability: Detailed code, tutorials and associated instructions have been made available for use by the public, which is appreciated. All system requirements have also been explicitly laid out for ease of installation and use. The workflow examples provided are quite detailed; however, a more extensive codebase with stepwise explanations within the code would be appreciated. Data has not been made available publicly, except for the raw and processed spatial transcriptomics data; however, detailed and explicit instructions have been provided on data access, keeping in mind local regulations.

      Revisions:

      Major Revisions:

      1. In recent years, a lot of other platforms, both free and paid, tend to support registration across multiple slides. For example, HALO has a registration feature available as well, along with a host of other open-source datatypes. In that regard, how is your platform different?
      2. It is mentioned that downscaling occurs during the registration process in order to reduce runtime - how are nuances in features selected as registration landmarks preserved in such a case?
      3. How is the fixed image determined in this case? The assumption would be that a standard H&E image is selected for this purpose - is that assumption correct?
      4. The authors have stated and justified their rationale for using the mentioned evaluation metrics in the paper. However, in the general image registration space, metrics such as the Dice coefficient and Jaccard index are commonly used and accepted. Is there a particular reason why these were not used as well? It would offer a more complete picture for the general user if these metrics were provided as well.
      5. The validation of registering distant neighboring sections is quite a valuable contribution, as the authors rightly stated that in many multi-omics experiments, this might be a necessity. However, when looking at tissue sections that are 80-100 microns apart, it is quite likely that the set of cells that one may be looking at on the x-y coordinate system may not be the same at all; in fact, for a highly heterogeneous/flexible piece of tissue, they might be completely different. In such a circumstance, how much value is there in registering these two sections together instead of, say, separately analyzing them and using alternative methods to combine the results downstream?
      6. In the proof of concept presented in the paper, the authors mention using ST and MSI data for validating their framework. Have they also investigated ST integration with more commonly available datatypes such as IHC/mIF?
      7. The work that the authors have put in to validate the registration and MIIT framework using different approaches (selecting spatially distant slides, integration using augmented/artificial data) is thorough. However, different tissue types bring in their own challenges, and thus validation of this framework on an external dataset would lend more credence to this much needed framework, especially in the era of increased multiomics analyses.
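      For readers unfamiliar with the two overlap metrics the reviewer mentions, the following minimal sketch computes the Dice coefficient and Jaccard index between two binary tissue masks. It is not part of MIIT or GreedyFHist; the function name `dice_and_jaccard` and the toy masks are hypothetical, and masks are represented as nested lists to keep the example free of third-party dependencies.

```python
def dice_and_jaccard(mask_a, mask_b):
    """Return (dice, jaccard) overlap between two binary masks.

    Hypothetical helper: masks are equally sized nested lists where any
    truthy value marks a foreground (tissue) pixel.
    """
    pixels_a = {(i, j) for i, row in enumerate(mask_a) for j, v in enumerate(row) if v}
    pixels_b = {(i, j) for i, row in enumerate(mask_b) for j, v in enumerate(row) if v}
    intersection = len(pixels_a & pixels_b)
    union = len(pixels_a | pixels_b)
    if union == 0:
        # Two empty masks overlap perfectly by convention.
        return 1.0, 1.0
    dice = 2.0 * intersection / (len(pixels_a) + len(pixels_b))
    jaccard = intersection / union
    return dice, jaccard


fixed_mask = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
warped_mask = [[0, 0, 1], [0, 1, 1], [0, 1, 0]]
dice, jaccard = dice_and_jaccard(fixed_mask, warped_mask)
print(f"Dice={dice:.2f}, Jaccard={jaccard:.2f}")
```

      Both metrics score region overlap rather than landmark distance, so they complement rather than replace landmark-based registration error.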

      Minor Revisions:

      1. Please ensure all typos/grammatical mistakes are corrected.
      2. In the 'preprocessing of stained histology images', can more details be given on the thresholding process? It is also stated that the threshold is manually adjusted for each image if necessary - how is this determination done?
      3. The headings/subheadings within sections could be organized more clearly; in some parts it was challenging to determine the organization of sections/subsections.
      4. Can some more details be given on the landmarks that were identified per image? Could some examples be provided on what these landmarks are, and how they remain consistent across tissue layers?
      5. Currently, the way the samples used for validating the GreedyFHist and MIIT frameworks are listed in the paper is quite confusing. It would be appreciated if the authors could distinctly state the number of samples, out of the full set, and the associated stained slides used for each.
      6. How were the annotations from the 3 annotators cross validated?

    1. 'It turns out the company had no AI and instead was just a group of Indian developers pretending to write code as AI,

      'AI' softw dev company, is actually a pool of 700 India based coders. Exposed because they couldn't meet payroll....

    1. code to create Figure 7.13 tibbles x and y is given below.

      This should be changed so that "you" is the subject, for example "you can find the code to create Figure 7.13 tibbles x and y below"

    2. A few things to notice in the code above.

      This isn't a sentence. Perhaps you want "There are a few things to notice in the code above."?

    1. Food production and consumption, for instance, were utterly nationalized.

      Though food lines did increase through their rapid rates, I wonder how the people first reacted as I remember that early on in food manufacturing there were a lot of health code and safety issues? I wonder how many people may have fallen ill due to any malpractice from the beginnings on the lines.

    1. To be the default model in Cursor.

      during Kyle McDonald's talk earlier in April he'd mentioned how (paraphrasing heavily and possibly editorializing) he felt that the craft in coding is sort of lost/displaced when LLM-generated text/code takes over the programming workflow/process.

      i feel like as with other forms of writing (though purpose/end goals differ), tools like Cursor while helpful in fast prototyping sort of skip over the arduous process of honing one's programming/coding voice

      to be the default model : to be the one, canonical way to "think" or "reason" (heavy scare quotes here, esp with how they're used/referenced in academia/industry discussion/research/publications)

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      (…) In my view, the part about NF-YA1 is less strong - although I realize this is a compelling candidate to be a regulator of cell cycle progression, the experimental approaches used to address this question fall a bit short, in particular, compared to the very detailed approaches shown in the rest of the manuscript. The authors show that the transcription factor NF-YA1 regulates cell division in tobacco leaves; however, there is no experimental validation in the experimental system (nodules). All conclusions are based on a heterologous cell division system in tobacco leaves. The authors state that NF-YA1 has a nodule-specific role as a regulator of cell differentiation. I am concerned the tobacco system may not allow for adequate testing of this hypothesis.

      Reviewer #1 makes a valid point by asking to focus the manuscript more explicitly on the role of NF-YA1 as a differentiation factor in a symbiotic context. We have now addressed this formally and experimentally.

      The involvement of A-type NF-Y subunits in the transition to the early differentiation of nodule cells has been documented in model legumes through several publications that we refer to in the revised version of the discussion (lines 617/623). We fully agree that the CDEL system, because it is heterologous, does not allow us more than to propose a parallel explanation for these observations - i.e., that the Medicago NF-YA1 subunit presumably acts in post-replicative cell-cycle regulation at the G2/M transition. Considering your recommendations and those of reviewer #2, we sought to support this conclusion by testing the impact of localized over-expression of NF-YA1 on cortical cell division and infection competence at an early stage of root colonization. The results of these experiments are now presented in the new Figure 9 and Figure 9-figure supplement 1-5 and described from line 435 to 495.

      With the fluorescent tools the authors have at hand (in particular tools to detect G2/M transition, which the authors suggest is regulated by NF-YA1), it would be interesting to test what happens to cell division if NF-YA1 is over-expressed in Medicago roots?

      To limit pleiotropic effects of an ectopic over-expression, we used the symbiosis-induced, ENOD11 promoter to increase NF-YA1 expression levels more specifically along the trajectory of infected cells. We chose to remain in continuity with the experiments performed in the CDEL system by opting for a destabilized version of the KNOLLE transcriptional reporter to detect the G2/M transition. The results obtained are presented in Figure 9B (quantification of split infected cells), in Figure 9-figure supplement 1B (ENOD11 expression profile), in Figure 9-figure supplement 3B (representative confocal images) and Figure 9-figure supplement 4D (quantification of pKNOLLE reporter signal). There, we show that mitosis remains inhibited in cells accommodating infection threads, but is completed in a higher proportion of outer cortical cells positioned on the infection trajectory, where ENOD11 gene transcription is active before their physical colonization.

      Based on NF-YA1 expression data published previously and their results in tobacco epidermal cells, the authors hypothesize that NF-YA regulates the mitotic entry of nodule primordial cells. Given that much of the manuscript deals with earlier stages of the infection, I wonder if NF-YA1 could also have a role in regulating mitotic entry in cells adjacent to the infection thread?

      The expression profile of NF-YA1 at early stages of cortical infection (Laporte et al., 2014) is indeed similar to the one of ENOD11 (as shown in Figure 9-figure supplement 1C) in wild-type Medicago roots, with corresponding transcriptional reporters being both activated in cells adjacent to the infection thread. Under our experimental conditions, additional expression of NF-YA1 (driven by the ENOD11 promoter) in these neighbouring cells did not impact their propensity to enter mitosis and to complete cell division. These results are presented in Figure 9-figure supplement 4D (quantification of pKNOLLE reporter signal) and Figure 9-figure supplement 5 (quantification of split neighbouring cells).

      Reviewer #1 (Recommendations For The Authors):

      - In the first part, images show the qualitative presence/absence of H3.1 or H3.3 histones.

      Upon closer inspection, many cells seem to have both histones. In Fig1-S1 for example (root meristem), it is evident that there are many cells with low but clearly present H3.1 content in the green channel; however, in the overlay, the green is lost and H3.3 (pink) is mainly visible. What does this mean in terms of the cell cycle? 

      We fully agree with reviewer #1 on these points. Independent of whether they have low or high proliferation potential, most cells retain histone H3.1 particularly in silent regions of the genome, while H3.3 is constitutively produced and enriched at transcriptionally active regions. When channels are overlaid, cells in an active proliferation or endoreduplication state (in G1, S or G2, depending on the size of their nuclei) will appear mainly "green" (H3.1-eGFP positive). Cells with a low proliferation potential (e.g., in the QC), G2-arrested (e.g., IT-traversed) or terminally differentiating (e.g., containing symbiosomes or arbuscules) will appear mainly "magenta" (H3.1-low, medium to high H3.3-mCherry content).

      Furthermore, all nodule images only display the overlay image, and individual fluorescence channels are not shown. Does the same masking effect happen here? It may be helpful to quantify fluorescence intensity not only in green but also in red channels as done for other experiments.

      Quantifying fluorescence intensity in the mCherry channel may indeed help to highlight the likely replacement of H3.1-eGFP by H3.3-mCherry in infected cells, as described by Otero and colleagues (2016) at the onset of cellular differentiation. However, the quantification method as established (i.e., measuring the corrected total nuclear fluorescence at the equatorial plane) cannot be applied, most of the time, to infected cells' nuclei due to the overlapping presence of mCherry-producing S. meliloti in the same channel (e.g., in Figure 2B). Nevertheless, and to avoid this masking effect when the eGFP and mCherry channels are overlaid, we now present them as isolated channels in revised Figures 1-3 and associated figure supplements. As the cell-wall staining is regularly included and displayed in grayscale, we assigned to both of them the Green Fire Blue lookup table, which maps intensity values to a multiple-colour sequential scheme (with blue or yellow indicating low or high fluorescence levels, respectively). We hope that this will allow a better appreciation of the respective levels of H3.1- and H3.3-fusions in our confocal images.
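      The quantification named in this response can be illustrated with a minimal sketch. It assumes the commonly used ImageJ-style definition of corrected total fluorescence (integrated density minus area times mean background); the response does not spell out the authors' exact formula, and the function name and the numbers below are hypothetical.

```python
def corrected_total_fluorescence(integrated_density, area, mean_background):
    """Background-corrected fluorescence of one region of interest.

    Assumed definition (not confirmed by the source):
    integrated density - (area * mean background intensity).
    """
    if area < 0:
        raise ValueError("area must be non-negative")
    return integrated_density - area * mean_background


# Hypothetical nucleus measured at its equatorial plane.
example = corrected_total_fluorescence(
    integrated_density=12000.0,  # sum of pixel intensities inside the nucleus
    area=150.0,                  # nucleus area in pixels
    mean_background=20.0,        # mean intensity of a nearby empty region
)
print(example)
```

      A correction of this kind only removes a uniform background. It cannot separate two signals that share a channel, which is the limitation the authors describe for nuclei overlapping mCherry-producing bacteria.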

      - Fig 1 B - it is hard to differentiate between S. meliloti-mCherry and H3.3-mCherry. Is there a way to label the different structures?

      In the revised version of Figure 1B, we used filled or empty arrowheads to point to histone H3-containing nuclei. To label rhizobia-associated structures, we used dashed lines to delineate nodule cells hosting symbiosomes and included the annotation “IT” for infection threads. We also indicated proliferating, endoreduplicating and differentiating tissues and cells using the following annotations: “CD” for cell division, “En” for endoreduplication and “TD” for terminal differentiation. All annotations are explained in the figure legend.

      - Fig 1 - supplement E and F - no statistics are shown.

      We performed non-parametric tests using the latest version of the GraphPad Prism software (version 10.4.1). Stars (Figure 1-figure supplement 1F) or different letters (Figure 1-figure supplement 1G) now indicate statistically significant differences. Results of the normality and non-parametric tests were included in the corresponding Source Data Files (Figure 1 – figure supplement 1 – source data 1 and 2). We have also updated the compact display of letters in other figures as indicated by the new software version. The raw data and the results of the statistical analyses remain unchanged and can be viewed in the corresponding source files.

      - Fig 2 A - overview and close-up image do not seem to be in the same focal plane. This is confusing because the nuclei position is different (so is the infection thread position).

      We fully agree that our former Figure may have confused reviewers #1 and #2 as well as readers. Figure 2A was designed to highlight, from the same nodule primordium, actively dividing cells of the inner cortex (optical section z 6-14) and cells of the outer cortex traversed, penetrated by or neighbouring an infection thread (optical section z 11-19). We initially wanted to show different magnification views of the same confocal image (i.e., a full-view of the inner cortex and a zoomed-view of the outer layers) to ensure that audiences can identify these details. In the revised version of Figure 2A, we displayed these full- and zoomed-views in upper and lower panels, respectively and we removed the solid-line inset to avoid confusion. 

      - Fig 1A and Fig 2E could be combined and shown at the beginning of the manuscript. Also, consider making the cell size increase more extreme, as it is important to differentiate G2 cells after H3.1 eviction and cells in G1. You have to look very closely at the graph to see the size differences.

      We have taken each of your suggestions into account. A combined version of our schematic representation with more pronounced nuclei size differences is now presented in Figure 1A.

      - Fig. 3 C is difficult to interpret. Can this be split into different panels?

      We realized that our previous choice of representation may have been confusing. Each value corresponds only to the H3.1-eGFP content, measured in an infected cell and reported to that of the neighbouring cell (IC / NC) within individual root samples. Therefore, we removed the green-magenta colour code and changed the legend accordingly. We hope that these slight modifications will facilitate the interpretation of the results - namely, that the relative level of H3.1 increases significantly in infected cells in the selected mutants compared to the wild-type. This mode of representation also highlights that in the mutants, there are more individual cases where the H3.1 content in an infected cell exceeds that of the neighbouring cell by more than two times. These cases would be masked if the couples of infected cells and associated neighbours would be split into different panels as in Figure 3B.

      - Line 357/359. I assume you mean ...'through the G2 phase can commit to nuclear division'.

      We have edited this sentence according to your suggestion, which now appears in line 370. 

      Reviewer #2 (Recommendations For The Authors):

      Cell cycle control during the nitrogen-fixing symbiosis is an important question but only poorly understood. This manuscript uses largely cell biological methods - which are always of the highest quality - to investigate host cell cycle progression during the early stages of nodule formation, where cortical infection threads penetrate the nodule primordium. The experiments were carefully conducted, the observations were detail oriented, and the results were thought-provoking. The study should be supported by mechanistic insights. 

      (1) One thought provoked by the authors' work is that while the study was carried out at an unprecedented resolution, the relationship between control of the cell cycle and infection thread penetration remains correlative. Is this reduced replicative potential among cells in the infection thread trajectory a consequence of hosting an infection thread, or a prerequisite to do so?

      We understand and share the point of view of reviewer #2. At this stage, we believe that our data won’t enable us to fully answer the question, thus this relationship remains rather correlative. The reasons are that 1) the access to the status of cortical cells below C2 is restricted to fixed material and therefore only represents a snapshot of the situation, and 2) we are currently unable to significantly interfere with mechanisms as intertwined as cell cycle control and infection control. What we can reasonably suggest from our images is that the most favorable window of the cell cycle for cells about to be crossed by an infection thread is post-replicative, i.e., the G2 phase. Typical markers of the G2 phase were recurrently observed at the onset of physical colonization – enlarged nucleus, containing less histone H3.1 than neighbouring cells in S phase (e.g., in Figure 2A). Reaching the G2 phase could therefore be a prerequisite for infection (and associated cellular rearrangements), while prolonged arrest in this same phase is likely a consequence of transcellular passage towards a forming nodule primordium.

      More importantly, in either scenario, what is the functional significance of exiting the cell cycle or endocycle? By stating that "local control of mitotic activity could be especially important for rhizobia to timely cross the middle cortex, where sustained cellular proliferation gives rise to the nodule meristem" (Line 239), the authors seem to believe that cortical cells need to stop the cell cycle to prepare for rhizobia infection. This is certainly reasonable, but the current study provides no proof, yet. To test the functional importance of cell cycle exit, one would interfere with G2/M transition in nodule cells,  and examine the effect on infection.

      We fully agree with reviewer #2 that the functional importance of a cell-cycle arrest on the infection thread trajectory remains to be demonstrated. Interfering with cell-cycle progression in a system as complex and fine-tuned as infected legume roots certainly requires the right timing – at the level of the tissue and of individual cells; the right dose; and the right molecular player(s) (i.e., bona fide activators or repressors of the G2/M transition). Using the symbiosis-specific NPL promoter, activated in the direct vicinity of cortical infection threads (Figure 9-figure supplement 1B), we tried to force infectable cells to recruit the cell division program by ectopically over-expressing the Arabidopsis CYCD3.1, “mimicking” the CDEL system. So far, this strategy has not resulted in a significant increase in the number of uninfected nodules in transgenic hairy roots - though the effect on symbiosome release remains to be investigated. Provided that a suitable promoter-cell cycle regulator combination is identified, we hope to be able to answer this question in the future.

      Given that the authors have already identified a candidate, and showed it represses cell division in the CDEL system, not testing the same gene in a more relevant context seems a lost opportunity. If one ectopically expressed NF-YA1 in hairy roots, thus repressing mitosis in general, would more cells become competent to host infection threads? This seems a straightforward experiment and readily feasible with the constructs that the authors already have. If this view is too naive, the authors should explain why such a functional investigation does not belong in this manuscript.

      Reviewer #2's point is entirely valid, and we decided to address it through additional experiments. To avoid possible side effects on development by affecting cell division in general, we placed NF-YA1 under control of the symbiosis-induced ENOD11 promoter. Based on the results obtained in the CDEL system, the pENOD11::FLAG-NF-YA1 cassette was coupled to a destabilized version of the KNOLLE transcriptional reporter to detect the G2/M transition. Competence for transcellular infection was maintained upon local NF-YA1 overexpression, the latter leading to a slight (non-significant) increase in the number of infected cells per cortical layer. These results are presented in Figure 9-figure supplement 3A-B (representative confocal images) and in Figure 9-figure supplement 4A-G.

      (1b) A related comment: on Line 183, it was stated that "The H3.1-eGFP fusion protein was also visible in cells penetrated but not fully passed by an infection thread". Presumably, the authors were talking about the cell marked by the arrowhead. But its H3.1-GFP signal looks no different from the cell immediately to its left. It is hard to say which cells are ones "preparing for intracellular infection pass through S-phase", and which ones are just "regularly dividing cortical cells forming the nodule primordium". What can be concluded is that once a cell has been fully transversed by an infection thread, its H3.1 level is low. Whether this is the cause or consequence of infection cannot be resolved simply by timing the appearance or disappearance of H3.1-GFP.

      We basically agree with comment 1b. In an unsynchronized system such as infected hairy roots, it is challenging to detect the event where a cell is penetrated, but not yet completely crossed by an infection thread. What we wanted to emphasize in Figure 2A, is that host cells in the path of an infection thread re-enter the cell cycle and pass through S-phase just as their neighbours do (as pointed out by reviewer #2 in his summary). The larger nucleus with slightly lower H3.1-eGFP signal than the neighbouring cell (as indicated by the use of the Green Fire Blue lookup table) suggests that the infected cell marked by the arrowhead in Figure 2A is actually in the G2 phase. The main difference is indeed that cells allowing complete infection thread passage exit the cell cycle and largely evict H3.1 while their neighbours proceed to cell division (as exemplified by PlaCCI reporters in Figure 4CD and the new Figure 5-figure supplement 2). Whether cell-cycle exit in G2 is a cause, or a consequence of cortical infection is a question that cannot be easily answered from fixed samples, which is a limitation of our study.

      (2) The authors have convincingly demonstrated that cortical cells accommodating infection threads exit the cell cycle, inhibit cell division, and down-regulate KNOLLE expression. How do these observations reconcile with the feature called the pre-infection thread? The authors devoted one paragraph to this question in the Discussion, but this does not seem sufficient given that the pre-infection thread is a prominent concept. Is the resemblance to the cell division plane superficial, or does it reflect a co-option of the normal cytokinesis machinery for accommodating rhizobia?

      From our point of view, cortical cells forming pre-infection threads are likely in an intermediate state. PIT structures undoubtedly share many similarities with cells establishing a cell division plane. The recruitment of at least some of the players normally associated with cytokinesis has been demonstrated and is consistent with the maintenance of infectable cells in a pre-mitotic phase in Medicago, as discussed in lines 558 to 568. We nevertheless think that the arrest of the cell cycle in the G2 phase, presumably occurring in crossed cortical cells, constitutes an event of cellular differentiation and specialization in transcellular infection. 

      The following are mainly points of presentation and description: 

      (3) Line 158: I can't see "subnuclear foci" in Figure 1-figure supplement 1C-E. However, they are visible in Fig. 1C.

      We hope that presenting the eGFP and mCherry channels in separate panels and assigning them the Green Fire Blue colour scheme provides better visibility and contrast of these detailed structures. We now refer to Figure 1C in addition to Figure 1–figure supplement 1E in the main text (line 161). 

      (4) Line 160: The authors should outline a larger region containing multiple QC cells, rather than pointing to a single cell, as there are other areas in the image containing cells with the same pattern.

      We updated Figure 1-figure supplement 1E accordingly.

      (5) Fig. 1B should include single channels, since within a single plant cell, the nucleus, the infection thread, and sometimes symbiosomes all have the same color. This makes it hard to see whether the nuclei in these cells are less green, or are simply overwhelmed by the magenta color.

      To improve the readability of Figure 1B and to address suggestions from individual reviewers, we now include separate channels and have annotated the different structures labeled by mCherry.

      (6) Fig. 2A: the close-up does not match the boxed area in the left panel. Based on the labeling, it seems that the two panels are different optical sections. But why choose a different optical depth for the left panel? This can be disorienting to the author, because one expects the close-up to be the same image, just under higher magnification.

      We fully agree that our previous choice of representation may have been confusing. As we also specified to reviewer #1, we wanted to show a full-view of proliferating cells in the inner cortex and a zoomed-view of infected cells in the outer layers of the same nodule primordium. In the revised version of Figure 2A, we displayed these full- and zoomed-views in separate panels and removed the boxed area to avoid confusion. 

      (7) Figure 2-figure supplement 1B: the cell indicated by the empty arrowhead has a striking pattern of H3.1 and H3.3 distribution on condensed chromosomes. Can you comment on that?

      Reviewer #2 may be referring to the apparent enrichment of H3.3 at telomeres, previously described in Arabidopsis, while pericentromeric regions are enriched in H3.1. This distribution is indeed visible on most of the condensed chromosomes shown in Figure 2-figure supplement 1B. We included this comment in the corresponding caption.

      (8) Fig. 4: It is not very easy to distinguish M phase. Can the authors describe what each phase is supposed to look like with the reporters?

      We agree with reviewer #2 and attempted to improve Figure 4, which is now dedicated to the Arabidopsis PlaCCI reporter. ECFP, mCherry, and YFP channels were presented separately and the corresponding cell-cycle phases (in interphase and mitosis) were annotated. The Green Fire Blue lookup table was assigned to each reporter to provide the best visibility of, for example, chromosomes in early prophase. We included a schematic representation corresponding to the distribution of each reporter, using the colors of the overlaid image to facilitate its interpretation.

      (9) Line 298: what is endopolyploid? This term is used at least three times throughout the manuscript. How is it different from polyploid?

      In the manuscript, we aimed to differentiate the (poly)ploidy of an organism (reflecting the number of copies of the basic genome and inherited through the germline) from endopolyploidy produced by individual somatic cells. As reviewed by Scholes and Paige, polyploidy and endopolyploidy differ in important ways, including allelic diversity and chromosome structural differences. In the Medicago truncatula root cortex for example, a tetraploid cell generated via endoreduplication from the diploid state would contain at most two alleles at any locus. The effects of endopolyploidy on cell size, gene expression, cell metabolism and the duration of the mitotic cell cycle are not shared among individual cells or organs, contrasting to a polyploid individual (Scholes and Paige, 2015).

      See Scholes, D. R., & Paige, K. N. (2015). Plasticity in ploidy : A generalized response to stress. Trends in Plant Science, 20(3), 165‑175. https://doi.org/10.1016/j.tplants.2014.11.007

      (10) Line 332: "chromosomes on mitotic figures" - what does this mean?

      Reviewer #2 is right to point out this redundant wording. Mitotic “figures” are recognized, by definition, based on chromosome condensation. We now use the term "mitotic chromosomes" (line 344).

      (11) Fig. 6A: could the authors consider labeling the doublets, at least some of them? I understand that this nucleus contains many doublets. However, this is the first image where one is supposed to recognize these doublets, and pointing out these features can facilitate understanding. Otherwise, a reader might think the image is comparable to nuclei with no doublets in the rest of the figure.

      Following this suggestion, five of these doublets are now labeled in Figure 7A (formerly Figure 6A).

    1. Others can spend years wandering the insane corridors of Tzeentch’s maze without drinking, eating, or resting – their metabolism apparently slowed by chaotic influences.

      So that's the cheat code the Chaos gods use to feed their followers! It might make summoning impossible armies of marauder hordes a little easier if those hordes didn't need to eat every damn day.

  4. May 2025
    1. Abstract. Background: The lemon sole (Microstomus kitt) is a culinary fish from the family of righteye flounders (Pleuronectidae) inhabiting sandy and shallow offshore grounds of the North Sea, the western Baltic Sea, the English Channel, the shallow waters of Great Britain and Ireland as well as the Bay of Biscay and the coastal waters of Norway. Findings: Here, we present the chromosome-level genome assembly of the lemon sole. We applied PacBio HiFi sequencing on the PacBio Revio system to generate a highly complete and contiguous reference genome. The resulting assembly has a contig N50 of 17.2 Mbp and a scaffold N50 of 27.2 Mbp. The total assembly length is 628 Mbp, of which 616 Mbp were scaffolded into 24 chromosome-length scaffolds. The identification of 99.7% complete BUSCO genes indicates a high assembly completeness. Conclusions: The chromosome-level genome assembly of the lemon sole provides a high-quality reference genome for future population genomic analyses of a commercially valuable edible fish.
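      For readers unfamiliar with the contiguity metric quoted in the abstract, the following minimal sketch shows how an N50 is conventionally computed: the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths are hypothetical and are not taken from the lemon sole assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running total to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (arbitrary units), total length 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

      The same function applies to contigs and scaffolds alike; only the input list differs, which is why an assembly reports the two values separately.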

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.156), and has published the reviews under the same license.

      Reviewer 1. Alejandro Mechaly

      Are all data available and do they match the descriptions in the paper? No. The BioProject number is not included in the submitted manuscript.

      Are the data and metadata consistent with relevant minimum information or reporting standards? No. The BioProject number is not included in the submitted manuscript.

      Comments: The paper presents a valuable contribution to the genomics of Microstomus kitt (lemon sole), a commercially important species. The study introduces a chromosome-level genome assembly using PacBio HiFi sequencing, resulting in a highly contiguous assembly with 99.7% completeness in BUSCO genes. This high-quality genome will serve as a key resource for future population genomics and aquaculture studies. Overall, this assembly offers a solid foundation for advancing research on the biology and management of lemon sole. The main critique of this study is that, while it highlights the sexual dimorphism in lemon sole, where females are larger than males, it does not delve into this aspect in detail. Although the research presents valuable data through a high-quality chromosomal-level genome assembly, it focuses exclusively on male specimens. Comparing the genomes of both sexes would be highly insightful, potentially revealing the genetic mechanisms or pathways underlying this dimorphism through comparative genomics. Recent studies on flatfish (Villarreal et al., 2024. https://doi.org/10.1186/s12864-024-10081-z) have used comparative genomics to examine sex determination genes, and applying this approach to lemon sole would significantly enhance the study’s impact. Furthermore, there are numerous sequenced flatfish genomes that should be analyzed alongside these results to provide a more comprehensive context.

      Re-review: Thank you for addressing my comments. While I understand the study's limitations, including its focus as part of a university course and the use of a single specimen, I believe the manuscript lacks sufficient impact without exploring the genetic basis of sexual dimorphism or incorporating comparative analyses with other flatfish genomes. The genome assembly and annotation are well-executed, but the absence of biological context limits the broader relevance of the work. Sexual dimorphism in lemon sole, a commercially important species, is a key topic that could inform aquaculture and fisheries management. Without addressing this, the manuscript misses an opportunity to answer important scientific questions. For these reasons, I cannot recommend the manuscript for publication in its current form. While the technical work is solid, additional analyses or a broader scope are needed to enhance its contribution to the field.

      Reviewer 2. Yongshuang Xiao

      This MS presents the chromosome-level genome assembly of Microstomus kitt, a species belonging to the Pleuronectidae family and mainly distributed in the North European seas. The study utilized PacBio HiFi sequencing technology combined with Hi-C data for chromosome-level assembly, resulting in a high-quality reference genome of approximately 633 Mb, including 23 chromosomal length scaffolds, completing 99.7% of BUSCO genes, demonstrating high assembly completeness and gene annotation quality. Further analysis revealed abundant repetitive sequences and gene features in the lemon sole genome, providing important resources for future genetic studies of this species and its close relatives. The paper presents several issues as follows: 1. From the evaluation of the genome, the estimated size is around 542 Mb, while the manually curated Hi-C results yielded a genome size of 633 Mb. The authors are requested to explain why there is a difference of nearly 100 Mb between the second-generation sequencing evaluation and the third-generation results. 2. Utilizing PacBio HiFi sequencing technology, which generates long reads, and its associated assembly software, the authors were able to assemble the genome at the chromosome level. The authors explicitly state that the size of the 23 chromosomal level genomes assembled using YaHS and Chromap software is around 500 Mb, which is consistent with the genome survey results. How does the author know that the assembled genome is erroneous? 3. Based on the author's description, it is not clear what the size of the assembled genome from a single chain using PacBio sequencing is. The author needs to provide this data in the results. 4. The authors performed quality assessments of the assembled genome using various methods such as Merqury. However, the description of the evaluation results is lacking. The authors are requested to include the QV evaluation values and additional results of SNP alignment for the second-generation sequencing data. 5. For gene annotation, the authors used the genomes of five species of Pleuronectidae as references. We are eager to see the results of the alignment analysis between the genome obtained using PacBio Revio and the aforementioned five fish genomes. Although these results do not need to be included in the main text, they should be provided as part of the response to the reviewers, including the alignment results and alignment rates for both sets of assembled genomes (500 Mb and 633 Mb). 6. The authors are requested to include the length information of each chromosome in the supplementary files. From the assembly results, it appears that the PacBio Revio results are not as impressive as anticipated, particularly with a Scaffold N50 of 29.4 Mbp. Is this due to limitations in the length of the chromosomes themselves, affecting the quality metrics of this genome? 7. The data should be uploaded to NCBI and obtain the corresponding registration code.

      Re-review: This study aims to perform chromosome-level genome assembly of the lemon sole (Microstomus kitt) and conduct a comprehensive analysis of its genome using high-throughput sequencing technology. The researchers used PacBio HiFi sequencing to carry out whole-genome sequencing of this species, resulting in a high-quality and complete genome sequence. The genome sequence has a length of 633 Mbp, with 23 chromosome-level sequences successfully assembled. Additionally, BUSCO analysis indicated that this genome sequence possesses a high level of completeness. These results suggest that the lemon sole genome sequence can serve as an important reference for future population genetic studies of commercially valuable edible fish species. However, there are certain issues with the paper that need to be addressed: The authors emphasize that female lemon soles grow larger than males, yet they chose to sequence a male genome instead of focusing on the more unique female. The authors should clarify this choice. The Hi-C-assisted assembly results show that male lemon soles have 23 chromosome pairs. Are there any heteromorphic chromosomes? The authors need to elucidate the karyotype of the lemon sole, as this information is significant for both the genome assembly and subsequent research. The survey results indicate a high level of heterozygosity in lemon sole. How did the authors account for this high heterozygosity to obtain a relatively complete genome? Could this affect the accuracy of the genome? Although the authors achieved high-quality genome results through PacBio sequencing, they used only BUSCO for genome quality assessment. To further demonstrate the completeness and accuracy of the assembled genome, it is recommended that the authors also report QV values. To ensure high levels of data sharing and reproducibility, the authors are requested to provide the chromosome-level genome FASTA file and GFF annotation file.
In summary, the authors are encouraged to provide additional information and make necessary revisions.

    1. How do we know how accurate the observer is? In such studies, it is important to establish agreement between two or more people who independently observe and code a set of data. By showing that two or more judges independently come up with the same observations, researchers ensure that the observations are not the subjective, distorted impressions of one individual.

      I think this is an important step in establishing the quality of research. There will always be tension between subjectivity and accuracy, and removing bias when making observations is a challenge.
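One common way to quantify the agreement described in the highlighted passage is Cohen's kappa, which corrects the raw agreement between two observers for the agreement expected by chance. The passage itself does not name a statistic, so this is only an illustrative sketch; `cohen_kappa` is a hypothetical helper and the codes in the usage example are made up.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers who coded the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of items")
    n = len(codes_a)
    # Proportion of items on which the two observers agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        # Both observers used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    observer_1 = ["x", "x", "y", "y"]  # made-up codes for four observations
    observer_2 = ["x", "x", "y", "x"]
    print(cohen_kappa(observer_1, observer_2))
```

A kappa near 1 indicates that the observers agree far more often than chance would predict, while a value near 0 suggests their agreement is no better than chance.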

    1. code examples

      Since the first part refers to "you", this should probably read "you can run code examples offered in the book without ...".