    1. Dialog between Evolution and the Philosopher's brain

      Brain: Do I know everything there is to know about my own working, philosophically?

      Evolution: No you don't. That's why Descartes got things so wrong. He knew nothing about his own working but thought he knew everything.

      Brain: Can I get it please?

      Evolution: No you can't. It's not worth it. Philosophy doesn't pay the bills.

      Brain: I'm surprised that you said I don't know everything I need to know in order to do philosophy of myself. I thought I had everything I needed.

      Evolution: Because you never got a meta-metacognition. Without it, you don't know what metacognition is missing, so you think you have everything you need.

      Brain: Why can't I have it?

      Evolution: Suppose module A is useful; well, you'll get it. Suppose module B isn't useful; well, you won't get it. Lamenting or being aware that module B is missing is not worth it. Imagine that you have an eye that not only has the R, G, B receptor cells, but also a meta-blindness cell that does nothing except keep sending a signal meaning "By the way, I can't see in infrared or ultraviolet". Do you think it's useful, or not?

      Brain: No. If it were useful, I'd have the cells. If it were not useful, then complaining about the lack of it is even less useful.

      Evolution: You got it. Meta-metacognition really doesn't pay the bills!

      Brain: Last question. Why do I have metacognition, including the awareness of what I don't know, but not meta-metacognition?

      Evolution: You are aware of what you don't know when you can know it, and when knowing it is useful. Thus, you are aware of when you don't know what the weather is, or what your friends are doing -- both are things that matter for your survival, and both are things you can fix. But if you don't have the capacity to see in infrared, that is forever. You are born with it, and you will die with it, so why be aware of it? Similarly, if you don't know how many lobes you have, then that ignorance is forever, because short of growing a whole new circuit diagram, or trepanning, you can't know it, so why be aware of it?

      Brain: So that's why we keep hallucinating souls, free wills, desires, and other unnatural phenomena that not only are not science, but are not even written in the same grammar as science. Not knowing how we work, and not knowing that we don't know, we hallucinate all those structures that work magically, not causally, without gears, levers, or electrons. We are all buttons, and no wires; all GUI, and no code. Souls are superficial, and neurons are deep...

    2. The function of metacognitive systems is to engineer environmental solutions via the strategic uptake of limited amounts of information, not to reverse engineer the nature of the brain it belongs to.



    1. <html> </html>

      I don't understand what the <html> tag is for when we already have a <!DOCTYPE html> declaration that is already supposed to indicate that the document is HTML.

    1. However, it is important to keep in mind that there are some people with disabilities — like cognitive disorders — who might benefit from having this additional image information readily available on the screen instead of buried in the SVG code.

      Make supports visible whenever possible.

    1. The code changes and operations we have performed inside the codespace will still be inside the stopped codespace. Also, the codespace may have an inactivity time limit and close after 30 minutes. If your codespace is stopped then you can restart it as shown below.

      This should be a regular paragraph rather than a quote block.

      Delete the sentence about the time limit.

    1. Open the file by clicking on the filename.

      If using the code command to open an R script, then this sentence can be deleted.

    2. new file icon in VS Code.

      You can't see this in the screenshot and people might not know where this is.

      It may be better to instruct people to run

      code R/example.R

      from the terminal - this has the advantage of putting the script in a sensible place and opening it for editing straight away.

    1. Social workers should promote the general welfare of society, from local to global levels, and the development of people, their communities, and their environments. Social workers should advocate for living conditions conducive to the fulfillment of basic human needs and should promote social, economic, political, and cultural values and institutions that are compatible with the realization of social justice.

      I see questions of power and structural inequality raised in section 6.01. This section describes a social worker's responsibility to advocate for improved living conditions and promote social justice. This section could go into more depth about how power dynamics and inequalities impact these living conditions. For example, the code could provide more detailed guidance on recognizing and addressing systemic issues such as racism, socioeconomic differences, and other forms of oppression that are deeply embedded in our society. The code could also provide strategies for social workers to better confront the power imbalances they encounter. The NABSW Code of Ethics offers a valuable perspective to enrich the practicum learning experience by applying its commitment to addressing racism, oppression, and discrimination. Social workers can apply this principle in their practicum experience to identify inequalities within their practicum setting/organization. A social worker could assess the organization's policies, identify practices that may contribute to racial or social inequalities, and take action toward creating a more equal environment.

    1. Now we want to update the source code for that we will use svn command update

      Use the Subversion command update to update your local copy with the latest changes by the R Core Team.

    2. If you are currently inside the BUILDDIR directory or root directory(/workspaces/r-dev-env) make sure to change it to TOP_SRCDIR so that we can update the changes made inside our source code.

      "In a bash terminal, change to the source directory"

    3. After following through the Contribution workflow and making the following changes, we need to update it inside the source code directory.

      The R Core Team commits changes to the development version of R, sometimes multiple times a day. It's a good idea to update our local copy of the source code from time to time, especially before creating a patch.

    1. Then we will install recommended packages using cmd Command: "$TOP_SRCDIR/tools/rsync-recommended"

      "N. Download the source code for the recommended packages

      To build R with the recommended packages, we need to run the tools/rsync-recommended script from the source directory to download the source code for these packages:

      $TOP_SRCDIR/tools/rsync-recommended "

      Note that this just downloads the source code, which is not in the main svn repo; the packages are then built and installed when R is built.

    2. After we change directory to BUILDDIR we can configure and build R.

      -> " * After we change directory, we must run the configure script from the source directory.

      $TOP_SRCDIR/configure

      This step takes ~1 minute on the codespace. "

      I think we should just document the "with recommended packages" workflow as this is what we do in the R Dev Guide. So we can drop the --without-recommended-packages option here and add in the step to get the source code for the recommended packages.

      I tested the build without the --enable-R-shlib option and it works fine in VSCode. We will need to add this back in if we provide RStudio as an alternative IDE.

    3. Command: cd $BUILDDIR

      Drop the "* Command" bullet and put the code straight after the previous text.

    4. Command: mkdir -p $BUILDDIR

      Drop the "* Command" bullet and put the code straight after the previous text.

    5. The path ENV variable for R Build and R Source code are BUILDDIR and TOP_SRCDIR respectively.

      ->"* BUILDDIR defines the build directory: /workspaces/r-dev-env/build/r-devel * TOP_SRCDIR defines the source directory: /workspaces/r-dev-env/svn/r-devel "

      The screenshot needs to be updated.

    6. configure source code

      -> "Configure the build"

    7. We need to change our directory to R build directory(BUILDDIR) to build and configure our R source code.

      -> "To keep the source directory clean, we change to a build directory to configure and build R."

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Public Review):

      Weaknesses:

      The comparison of affinity predictions derived from AlphaFold2 and H3-opt models, based on molecular dynamics simulations, should have been discussed in depth. In some cases, there are huge differences between the estimations from H3-opt models and those from experimental structures. It seems that the authors obtained average differences of the real delta, instead of average differences of the absolute value of the delta. This can be misleading, because high negative differences might be compensated by high positive differences when computing the mean value. Moreover, it would have been good for the authors to disclose the trajectories from the MD simulations.

      Thanks for your careful checks. We fully understand your concerns about the large differences when calculating affinity. To understand the source of these huge differences, we carefully analyzed the trajectories of the input structures during MD simulations. We found that the antigen-antibody complex shifted as it transitioned from NVT to NPT during pre-equilibration, even when restraints were applied to the protein structure. To address this issue, we consulted the solution provided on Amber's mailing list (http://archive.ambermd.org/202102/0298.html) and modified the ATOMS_MOLECULE item of the simulation system's top file to merge the antigen-antibody complexes into one molecule. As a result, the number of SOLVENT_POINTERS was also adjusted. Finally, we re-ran all MD simulations and calculated the affinities of all complexes.

      We have corrected “Afterwards, a 25000-step NVT simulation with a time step of 1 fs was performed to gradually heat the system from 0 K to 100 K. A 250000-step NPT simulation with a time step of 2 fs was carried out to further heat the system from 100 K to 298 K.” to “Afterwards, a 400-ps NVT simulation with a time step of 2 fs was performed to gradually heat the system from 0 K to 298 K (0–100 K: 100 ps; 100–298 K: 200 ps; hold 298 K: 100 ps), and a 100-ps NPT simulation with a time step of 2 fs was performed to equilibrate the density of the system. During heating and density equilibration, we constrained the antigen-antibody structure with a restraint value of 10 kcal·mol⁻¹·Å⁻².” and added the following sentence in the Methods section of our revised manuscript: “The first 50 ns restrains the non-hydrogen atoms of the antigen-antibody complex, and the last 50 ns restrains the non-hydrogen atoms of the antigen, with a constraint value of 10 kcal·mol⁻¹·Å⁻².”

      In addition, we have corrected the calculation of mean deltas to use absolute values and have demonstrated that the average affinities of structures predicted by H3-OPT were closer to those of experimentally determined structures than the values obtained through AF2. These results have been updated in the revised manuscript. However, significant differences still exist between the estimations of H3-OPT models and those derived from experimental structures in a few cases. We found that antibodies moved away from antigens in both AF2- and H3-OPT-predicted complexes during simulations, resulting in RMSDbackbone (RMSD of the antibody backbone) exceeding 20 Å. These deviations led to significant structural changes in the complexes and consequently resulted in notable differences in affinity calculations. Thus, we removed three samples (PDB IDs: 4qhu, 6flc, 6plk) from the benchmark because these predicted structures moved away from the antigen structure during MD simulations, resulting in huge energy differences from the native structures.

      Author response table 1.

      We also appreciate your reminder, and we have calculated all RMSDbackbone during production runs (SI Fig. 5).

      Author response image 1.

      Reviewer #3 (Public Review):

      Weaknesses:

      The proposed method lacks a confidence score or a warning to help guide users in moderate to challenging cases.

      We apologize for our mistakes. We have updated our GitHub code and added the following sentences to the Methods section to clarify how we train this confidence score module: “Confidence score prediction module

      We apply an MSE loss for confidence prediction; the label error was calculated as the Cα deviation of each residue after alignment. The inputs of this module are the same as those used for H3-OPT, and it generates a confidence score ranging from 0 to 100. The dropout rates of H3-OPT were set to 0.25. The learning rate and weight decay of the Adam optimizer are set to 1 × 10−5 and 1 × 10−4, respectively.”

      Reviewer #2 (Recommendations For The Authors):

      I would strongly suggest that the authors deepen their discussion of the affinity prediction based on molecular dynamics. In particular, why do the authors think that some structures exhibit huge differences between the predictions from the experimental structure and those predicted by H3-opt? Also, please compute the mean deltas using the absolute value and not the signed value; the latter can be extremely misleading and hide very high differences in different directions that compensate when averaging.

      I would also advise including graphical results of the MD trajectories, at least as Supp. Material.

      We gratefully thank you for your feedback and fully understand your concerns. We found the source of these huge differences and solved the problem by changing the MD simulation method. Then, we calculated all affinities and corrected the mean delta calculation to use the absolute value. The RMSDbackbone values were also measured during production runs to enable accurate affinity predictions (SI Fig. 5). Big differences still exist between the estimations of H3-OPT models and those from experimental structures in some cases. We found that antibodies moved away from antigens in both AF2- and H3-OPT-predicted complexes during simulations, resulting in RMSDbackbone exceeding 20 Å. These deviations led to significant structural changes in the complexes and consequently resulted in notable differences in affinity calculations. Thus, we removed three samples (PDB IDs: 4qhu, 6flc, 6plk) from the benchmark.
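
      The signed-versus-absolute distinction the reviewer raises can be shown with a toy computation (hypothetical numbers, not the paper's data):

      ```python
      # Toy illustration: averaging signed deltas lets large errors in opposite
      # directions cancel, while averaging absolute deltas reports the typical
      # error magnitude. Numbers are hypothetical, not from the benchmark.
      deltas = [5.0, -4.8, 3.2, -3.4]  # per-complex affinity differences

      mean_signed = sum(deltas) / len(deltas)
      mean_absolute = sum(abs(d) for d in deltas) / len(deltas)

      print(mean_signed)    # ~0.0: the errors cancel out
      print(mean_absolute)  # ~4.1: the typical error magnitude
      ```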

      Thanks again for your professional advice.

      Reviewer #3 (Recommendations For The Authors):

      (1) I am pleased with most of the answers provided by the authors to the first review. In my humble opinion, the new manuscript has greatly improved. However, I think some answers to the reviewers are worth including in the main text or supporting information for the benefit of general readers. In particular, the requested statistics (i.e. p-values for Cα-RMSD values across the modeling approaches, p-values and error bars in Fig 5a and 5b, etc.) should be introduced in the manuscript.

      We sincerely appreciate your advice. We have added the statistical values to Fig. 4 and Fig. 5 of our manuscript.

      Author response image 2.

      Author response image 3.

      (2) Similarly, the authors state in their answers that "we have trained a separate module to predict the confidence score of the optimized CDR-H3 loops". That sounds like a great improvement to H3-OPT! However, I couldn't find any reference to that new module in the reviewed version of the manuscript, nor in the available GitHub code. That is the reason I maintain the weakness "The proposed method lacks a confidence score".

      We sincerely apologize for our careless mistakes, and thank you for the reminder. We have updated our GitHub code and added the following sentences to the Methods section to clarify how we train this confidence score module:

      “Confidence score prediction module

      We apply an MSE loss for confidence prediction; the label error was calculated as the Cα deviation of each residue after alignment. The inputs of this module are the same as those used for H3-OPT, and it generates a confidence score ranging from 0 to 100. The dropout rates of H3-OPT were set to 0.25. The learning rate and weight decay of the Adam optimizer are set to 1 × 10−5 and 1 × 10−4, respectively.”

      (3) I acknowledge all the efforts made in solving new mutant/designed nanobody structures. Judging from the solved structures, mutants Y95F and Q118N seem critical to either crystallographic or dimerization contacts stabilizing the CDR-H3 loop, hence preventing the formation of crystals. Clearly, solving a molecular structure is a challenge, hence including the following comment in the manuscript is relevant for readers to correctly assess the magnitude of the validation: "The sequence identities of the VH domain and H3 loop are 0.816 and 0.647, respectively, comparing with the best template. The CDR-H3 lengths of these nanobodies are both 17. According to our classification strategy, these nanobodies belong to Sub1. The confidence scores of these AlphaFold2 predicted loops were all higher than 0.8, and these loops were accepted as the outputs of H3-OPT by CBM."

      We appreciate your kind recommendations and have revised “Although Mut1 (E45A) and Mut2 (Q14N) shared the same CDR-H3 sequences as WT, only minor variations were observed in the CDR-H3. H3-OPT generated accurate predictions with Cα-RMSDs of 1.510 Å, 1.541 Å and 1.411 Å for the WT, Mut1, and Mut2, respectively.” into “Although Mut1 (E45A) and Mut2 (Q14N) shared the same CDR-H3 sequences as WT (LengthCDR-H3 = 17), only minor variations were observed in the CDR-H3. H3-OPT generated accurate predictions with Cα-RMSDs of 1.510 Å, 1.541 Å and 1.411 Å for the WT, Mut1, and Mut2, respectively (the confidence scores of these AlphaFold2-predicted loops were all higher than 0.8, and these loops were accepted as the outputs of H3-OPT by CBM).” In addition, we have added the following sentence to the legend of Figure 4 to ensure that readers can appropriately evaluate the significance and reliability of our validations: “The sequence identities of the VH domain and H3 loop are 0.816 and 0.647, respectively, comparing with the best template.”

      (4) As pointed out in the first review, I think the work https://doi.org/10.1021/acs.jctc.1c00341 is worth acknowledging in section "2.2 Molecular dynamics (MD) simulations could not provide accurate CDR-H3 loop conformations" of the supplementary material, as it constitutes a clear reference (and probably one of the few) to the MD simulations that the authors intend to perform. Similarly, the work https://doi.org/10.3390/molecules28103991 introduces a former benchmark of AI algorithms for predicting antibody and nanobody structures that readers may find interesting to contrast with the present work. Indeed, this latter reference is used by the authors to answer a reviewer comment.

      Thanks a lot for your valuable comments. We have added these references in the proper positions in our manuscript.

    2. Reviewer #2 (Public Review):

      This work provides a new tool (H3-Opt) for the prediction of antibody and nanobody structures, based on the combination of AlphaFold2 and a pre-trained protein language model, with a focus on predicting the challenging CDR-H3 loops with enhanced accuracy compared with previously developed approaches. This task is of high value for the development of new therapeutic antibodies. The paper provides an external validation consisting of 131 sequences, with further analysis of the results by segregating the test sets into three subsets of varying difficulty and comparison with other available methods. Furthermore, the approach was validated by comparing three experimentally solved 3D structures of anti-VEGF nanobodies with the H3-Opt predictions.

      Strengths:

      The experimental design to train and validate the new approach has been clearly described, including the dataset compilation and its representative sampling into training, validation and test sets, and structure preparation. The results of the in silico validation are quite convincing and support the authors' conclusions.

      The datasets used to train and validate the tool and the code are made available by the authors, which ensures transparency and reproducibility, and allows future benchmarking exercises with incoming new tools.

      Compared to AlphaFold2, the authors' optimization seems to produce better results for the most challenging subsets of the test set.

      Weaknesses:

      None

    1. if (IS_ENABLED(CONFIG_BALLOON_COMPACTION) && PageIsolated(page)) {
               /* raced with isolation */
               unlock_page(page);
               continue;
       }

      Config enabled code

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This important study explores infants' attention patterns in real-world settings using advanced protocols and cutting-edge methods. The presented evidence for the role of EEG theta power in infants' attention is currently incomplete. The study will be of interest to researchers working on the development and control of attention.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper investigates the physiological and neural processes that relate to infants' attention allocation in a naturalistic setting. Contrary to experimental paradigms that are usually employed in developmental research, this study investigates attention processes while letting the infants be free to play with three toys in the vicinity of their caregiver, which is closer to a common, everyday life context. The paper focuses on infants at 5 and 10 months of age and finds differences in what predicts attention allocation. At 5 months, attention episodes are shorter and their duration is predicted by autonomic arousal. At 10 months, attention episodes are longer, and their duration can be predicted by theta power. Moreover, theta power predicted the proportion of looking at the toys, as well as a decrease in arousal (heart rate). Overall, the authors conclude that attentional systems change across development, becoming more driven by cortical processes.

      Strengths:

      I enjoyed reading the paper, I am impressed with the level of detail of the analyses, and I am strongly in favour of the overall approach, which tries to move beyond in-lab settings. The collection of multiple sources of data (EEG, heart rate, looking behaviour) at two different ages (5 and 10 months) is a key strength of this paper. The original analyses, which build onto robust EEG preprocessing, are an additional feat that improves the overall value of the paper. The careful consideration of how theta power might change before, during, and in the prediction of attention episodes is especially remarkable. However, I have a few major concerns that I would like the authors to address, especially on the methodological side.

      Points of improvement

      (1) Noise

      The first concern is the level of noise across age groups, periods of attention allocation, and metrics. Starting with EEG, I appreciate the analysis of noise reported in supplementary materials. The analysis focuses on a broad level (average noise in 5-month-olds vs 10-month-olds) but variations might be more fine-grained (for example, noise in 5mos might be due to fussiness and crying, while at 10 months it might be due to increased movements). More importantly, noise might even be the same across age groups, but correlated to other aspects of their behaviour (head or eye movements) that are directly related to the measures of interest. Is it possible that noise might co-vary with some of the behaviours of interest, thus leading to either spurious effects or false negatives? One way to address this issue would be for example to check if noise in the signal can predict attention episodes. If this is the case, noise should be added as a covariate in many of the analyses of this paper. 

      We thank the reviewer for this comment. We certainly have evidence that even the most state-of-the-art cleaning procedures (such as machine-learning trained ICA decompositions, as we applied here) are unable to remove eye movement artifact entirely from EEG data (Haresign et al., 2021; Phillips et al., 2023). (This applies to our data but also to others’ where confounding effects of eye movements are generally not considered.) Importantly, however, our analyses have been designed very carefully with this explicit challenge in mind. All of our analyses compare changes in the relationship between brain activity and attention as a function of age, and there is no evidence to suggest that different sources of noise (e.g. crying vs. movement) would associate differently with attention durations nor change their interactions with attention over developmental time. And figures 5 and 7, for example, both look at the relationship of EEG data at one moment in time to a child’s attention patterns hundreds or thousands of milliseconds before and after that moment, for which there is no possibility that head or eye movement artifact can have systematically influenced the results.

      Moving onto the video coding, I see that inter-rater reliability was not very high. Is this due to the fine-grained nature of the coding (20ms)? Is it driven by differences in expertise among the two coders? Or because coding this fine-grained behaviour from video data is simply too difficult? The main dependent variable (looking duration) is extracted from the video coding, and I think the authors should be confident they are maximising measurement accuracy.

      We appreciate the concern. To calculate IRR we used this function (Cardillo G. (2007) Cohen's kappa: compute the Cohen's kappa ratio on a square matrix. http://www.mathworks.com/matlabcentral/fileexchange/15365). Our “observed agreement” was 0.7 (std = 0.15). However, we decided to report Cohen's kappa coefficient, which is generally thought to be a more robust measure as it takes into account agreement occurring by chance. We conducted the training meticulously (refer to the response to Q6, R3), and we are confident that our coders performed to the best of their abilities.
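
      The gap between the two statistics can be sketched in a few lines (hypothetical coder counts, not the study's data):

      ```python
      # Sketch: observed agreement vs. Cohen's kappa for two coders on a
      # 2x2 confusion matrix (rows: coder A, columns: coder B).
      # Counts are hypothetical, not the study's data.
      def agreement_and_kappa(matrix):
          n = sum(sum(row) for row in matrix)
          observed = sum(matrix[i][i] for i in range(len(matrix))) / n
          # chance agreement from each coder's marginal proportions
          expected = sum(
              (sum(matrix[i]) / n) * (sum(row[i] for row in matrix) / n)
              for i in range(len(matrix))
          )
          return observed, (observed - expected) / (1 - expected)

      obs, kappa = agreement_and_kappa([[70, 10], [10, 10]])
      print(obs)    # 0.8 observed agreement
      print(kappa)  # ~0.375: lower, because much agreement occurs by chance
      ```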

      (2) Cross-correlation analyses

      I would like to raise two issues here. The first is the potential problem of using auto-correlated variables as input for cross-correlations. I am not sure whether theta power was significantly autocorrelated. If it is, could it explain the cross-correlation result? The fact that the cross-correlation plots in Figure 6 peak at zero, and are significant (but lower) around zero, makes me think that it could be a consequence of periods around zero being autocorrelated. Relatedly: how does the fact that the significant lag includes zero, and a bit before, affect the interpretation of this effect? 

      Just to clarify this analysis, we did include a plot showing autocorrelation of theta activity in the original submission (Figs 7A and 7B in the revised paper). These indicate that theta shows little to no autocorrelation. And we can see no way in which this might have influenced our results. From their comments, the reviewer seems rather to be thinking of phasic changes in the autocorrelation, and whether the possibility that greater stability in theta during the time period around looks might have caused the cross-correlation result shown in 7E. Again though we can see no way in which this might be true, as the cross-correlation indicates that greater theta power is associated with a greater likelihood of looking, and this would not have been affected by changes in the autocorrelation.
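
      For readers unfamiliar with the analysis, a lagged cross-correlation of the kind discussed here can be sketched as follows (toy series, not the study's data):

      ```python
      # Minimal sketch of a lagged cross-correlation between two time series,
      # e.g. theta power and a looking indicator. Toy data, not the study's.
      def crosscorr(x, y, lag):
          # correlate x[t] with y[t + lag] over the overlapping samples
          if lag >= 0:
              xs, ys = x[:len(x) - lag], y[lag:]
          else:
              xs, ys = x[-lag:], y[:len(y) + lag]
          n = len(xs)
          mx, my = sum(xs) / n, sum(ys) / n
          cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
          sx = (sum((a - mx) ** 2 for a in xs) / n) ** 0.5
          sy = (sum((b - my) ** 2 for b in ys) / n) ** 0.5
          return cov / (sx * sy)

      x = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 4.0]
      y = [0.0] + x[:-1]          # y is x delayed by one sample
      print(crosscorr(x, y, 1))   # ~1.0: x at time t predicts y at t + 1
      ```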

      A second issue with the cross-correlation analyses is the coding of the looking behaviour. If I understand correctly, if an infant looked for a full second at the same object, they would get a maximum score (e.g., 1) while if they looked at 500ms at the object and 500ms away from the object, they would receive a score of e.g., 0.5. However, if they looked at one object for 500ms and another object for 500ms, they would receive a maximum score (e.g., 1). The reason seems unclear to me because these are different attention episodes, but they would be treated as one. In addition, the authors also show that within an attentional episode theta power changes (for 10mos). What is the reason behind this scoring system? Wouldn't it be better to adjust by the number of attention switches, e.g., with the formula: looking-time/(1+N_switches), so that if infants looked for a full second, but made 1 switch from one object to the other, the score would be .5, thus reflecting that attention was terminated within that episode? 

      We appreciate this suggestion. This is something we did not consider, and we thank the reviewer for raising it. In response to their comment, we have now rerun the analyses using the new measure (looking-time/(1+N_switches)), and we are reassured to find that the results remain highly consistent. Please see Author response image 1 below, where the original results are shown in orange and the new measure in blue at 5 and 10 months.

      Author response image 1.
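
      The reviewer's proposed adjustment is straightforward to express in code (a sketch with hypothetical epoch values):

      ```python
      # Sketch of the switch-adjusted looking score proposed by the reviewer:
      # looking time within a 1-second epoch, penalized by the number of
      # attention switches in that epoch. Values are hypothetical.
      def attention_score(looking_time_s, n_switches):
          return looking_time_s / (1 + n_switches)

      print(attention_score(1.0, 0))  # 1.0: a full second on one object
      print(attention_score(1.0, 1))  # 0.5: 1 s of looking, but one switch
      print(attention_score(0.5, 0))  # 0.5: half the epoch, no switch
      ```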

      (3) Clearer definitions of variables, constructs, and visualisations

      The second issue is the overall clarity and systematicity of the paper. The concept of attention appears with many different names. Only in the abstract, it is described as attention control, attentional behaviours, attentiveness, attention durations, attention shifts and attention episode. More names are used elsewhere in the paper. Although some of them are indeed meant to describe different aspects, others are overlapping. As a consequence, the main results also become more difficult to grasp. For example, it is stated that autonomic arousal predicts attention, but it's harder to understand what specific aspect (duration of looking, disengagement, etc.) it is predictive of. Relatedly, the cognitive process under investigation (e.g., attention) and its operationalization (e.g., duration of consecutive looking toward a toy) are used interchangeably. I would want to see more demarcation between different concepts and between concepts and measurements.

      We appreciate the comment and we have clarified the concepts and their operationalisation throughout the revised manuscript.

      General Remarks

      In general, the authors achieved their aim in that they successfully showed the relationship between looking behaviour (as a proxy of attention), autonomic arousal, and electrophysiology. Two aspects are especially interesting. First, the fact that at 5 months, autonomic arousal predicts the duration of subsequent attention episodes, but at 10 months this effect is not present. Conversely, at 10 months, theta power predicts the duration of looking episodes, but this effect is not present in 5-month-old infants. This pattern of results suggests that younger infants have less control over their attention, which mostly depends on their current state of arousal, but older infants have gained cortical control of their attention, which in turn impacts their looking behaviour and arousal.

      We thank the reviewer for the close attention that they have paid to our manuscript, and for their insightful comments.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores infants' attention patterns in real-world settings and their relationship with autonomic arousal and EEG oscillations in the theta frequency band. The study included 5- and 10-month-old infants during free play. The results showed that in the 5-month-old group, declines in HR forward-predicted attentional behaviors, while the 10-month-old group exhibited increased theta power following shifts in gaze, indicating the start of a new attention episode. Additionally, this increase in theta power predicted the duration of infants' looking behavior.

      Strengths:

      The study's strengths lie in its utilization of advanced protocols and cutting-edge techniques to assess infants' neural activity and autonomic arousal associated with their attention patterns, as well as the extensive data coding and processing. Overall, the findings have important theoretical implications for the development of infant attention.

      Weaknesses:

      Certain methodological procedures require further clarification, e.g., details on EEG data processing. Additionally, it would be beneficial to eliminate possible confounding factors and consider alternative interpretations, e.g., whether the differences observed between the two age groups were partly due to varying levels of general arousal and engagement during the free play.

      We thank the reviewer for their suggestions and have addressed them in our point-by-point responses below.

      Reviewer #3 (Public Review):

      Summary:

      Much of the literature on attention has focused on static, non-contingent stimuli that can be easily controlled and replicated--a mismatch with the actual day-to-day deployment of attention. The same limitation is evident in the developmental literature, which is further hampered by infants' limited behavioral repertoires and the general difficulty in collecting robust and reliable data in the first year of life. The current study engages young infants as they play with age-appropriate toys, capturing visual attention, cardiac measures of arousal, and EEG-based metrics of cognitive processing. The authors find that the temporal relations between measures are different at age 5 months vs. age 10 months. In particular, at 5 months of age, cardiac arousal appears to precede attention, while at 10 months of age attention processes lead to shifts in neural markers of engagement, as captured in theta activity.

      Strengths:

      The study brings to the forefront sophisticated analytical and methodological techniques to bring greater validity to the work typically done in the research lab. By using measures in the moment, they can more closely link biological measures to actual behaviors and cognitive stages. Often, we are forced to capture these measures in separate contexts and then infer in-the-moment relations. The data and techniques provide insights for future research work.

      Weaknesses:

      The sample is relatively modest, although this is somewhat balanced by the sheer number of data points generated by the moment-to-moment analyses. In addition, the study is cross-sectional, so the data cannot capture true change over time. Larger samples, followed over time, will provide a stronger test for the robustness and reliability of the preliminary data noted here. Finally, while the method certainly provides for a more active and interactive infant in testing, we are a few steps removed from the complexity of daily life and social interactions.

      We thank the reviewer for their suggestions and have addressed them in our point-by-point responses below.

      Reviewer #1 (Recommendations For The Authors):

      Here are some specific ways in which clarity can be improved:

      A. Regarding the distinction between constructs, or measures and constructs:

      i. In the results section, I would prefer to mention looking duration and heart rate as metrics that have been measured, while in the introduction and discussion, a clear 1-to-1 link between construct/cognitive process and behavioural or (neuro)psychophysical measure can be made (e.g., sustained attention is measured via looking durations; autonomic arousal is measured via heart rate).

      The way attention and arousal were operationalised are now clarified throughout the text, especially in the results.

      ii. Relatedly, the "attention" variable is not really measuring attention directly. It is rather measuring looking time (proportion of looking time to the toys?), which is the operationalisation, which is hypothesised to be related to attention (the construct/cognitive process). I would make the distinction between the two stronger.

      This distinction between looking and paying attention is clearer now in the revised manuscript as per R1 and R3’s suggestions. We have also added a paragraph in the Introduction to clarify it and pointed out its limitations (see pg. 5).

      B. Each analysis should be set out to address a specific hypothesis. I would rather see hypotheses in the introduction (without direct reference to the details of the models that were used), and how a specific relation between variables should follow from such hypotheses. This would also solve the issue that some analyses did not seem directly necessary to the main goal of the paper. For example:

      i. Are ACF and survival probability analyses aimed at proving different points, or are they different analyses to prove the same point? Consider either making clearer how they differ or moving one to supplementary materials.

      We clarified this in pg. 4 of the revised manuscript.

      ii. The autocorrelation results are not mentioned in the introduction. Are they aiming to show that the variables can be used for cross-correlation? Please clarify their role or remove them.

      We clarified this in pg. 4 of the revised manuscript.

      C. Clarity of cross-correlation figures. To ensure clarity when presenting a cross-correlation plot, it's important to provide information on the lead-lag relationships and which variable is considered X and which is Y. This could be done by labelling the axes more clearly (e.g., the left-hand side of the - axis specifies x leads y, right hand specifies y leads x) or adding a legend (e.g., dashed line indicates x leading y, solid line indicates y leading x). Finally, the limits of the x-axis are consistent across plots, but the limits of the y-axis differ, which makes it harder to visually compare the different plots. More broadly, the plots could have clearer labels, and their resolution could also be improved. 

      The information on which variable precedes/follows was in the captions of the figures. However, we have edited the figures as per the reviewer’s suggestion and added this information to the figures themselves. We have also uploaded all the figures in higher resolution.

      D. Figure 7 was extremely helpful for understanding the paper, and I would rather have it as Figure 1 in the introduction. 

      We have moved figure 7 to figure 1 as per this request.

      E. Statistics should always be reported, and effects should always be described. For example, results of autocorrelation are not reported, and from the plot, it is also not clear if the effects are significant (the caption states that red dots indicate significance, but there are no red dots. Does this mean there is no autocorrelation?).

      We apologise – this was hard to read in the original. We have clarified that there is no autocorrelation present in Fig 7A and 7D.

      And if so, given that theta is a wave, how is it possible that there is no autocorrelation (connected to point 1)? 

      We thank the reviewer for raising this point. In fact, theta power measures oscillatory activity in the EEG within the 3-6Hz window (i.e. 3 to 6 oscillations per second), whereas we analysed the autocorrelation in the EEG data by looking at changes in theta power between consecutive 1-second-long windows. To say that there is no autocorrelation in the data means that, if there is more 3-6Hz activity within one particular 1-second window, there tends not to be significantly more 3-6Hz activity within the 1-second windows immediately before and after.
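      The distinction can be illustrated with a short sketch: theta power is summarised once per 1-second window, and the autocorrelation question is whether the summary in one window predicts the summary in the next. This is an illustrative sketch only (our own function names, not the analysis pipeline), which assumes the per-window theta power values have already been computed:

```python
def lag1_autocorrelation(power):
    """Pearson correlation between theta power in window t and in window t+1.

    `power` is a list of per-window band-power values (one per 1-second
    window). A value near zero means a high-theta window does not predict
    elevated theta in the windows immediately before or after it.
    """
    x, y = power[:-1], power[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```

      A steadily rising power series gives a correlation of 1 and an alternating series gives -1; the reported absence of autocorrelation corresponds to values near 0, even though the underlying signal oscillates at 3-6Hz within each window.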

      F. Alpha power is introduced later on, and in the discussion, it is mentioned that the effects that were found go against the authors' expectations. However, alpha power and the authors' expectations about it are not mentioned in the introduction. 

      We thank the reviewer for this comment. We have added a paragraph on alpha in the introduction (pg.4).

      Minor points:

      1. At the end of 1st page of introduction, the authors state that: 

      “How children allocate their attention in experimenter-controlled, screen-based lab tasks differs, however, from actual real-world attention in several ways (32-34). For example, the real-world is interactive and manipulable, and so how we interact with the world determines what information we, in turn, receive from it: experiences generate behaviours (35).”

      I think there's more to this though - Lab-based studies can be made interactive too (e.g., Meyer et al., 2023, Stahl & Feigenson, 2015). What remains unexplored is how infants actively and freely initiate and self-structure their attention, rather than how they respond to experimental manipulations.

      Meyer, M., van Schaik, J. E., Poli, F., & Hunnius, S. (2023). How infant‐directed actions enhance infants' attention, learning, and exploration: Evidence from EEG and computational modeling. Developmental Science, 26(1), e13259.

      Stahl, A. E., & Feigenson, L. (2015). Observing the unexpected enhances infants' learning and exploration. Science, 348(6230), 91-94.

      We thank the reviewer for this suggestion and added their point in pg. 4.

      (2) Regarding analysis 4:

      a. In analysis 1 you showed that the duration of attentional episodes changes with age. Is it fair to keep the same start, middle, and termination ranges across age groups? Is 3-4 seconds "middle" for 5-month-olds? 

      We appreciate the comment. There are many ways we could have run these analyses and, in fact, in other papers we have done it differently, for example by splitting each look into 3, irrespective of its duration (Phillips et al., 2023).

      However, one aspect we took into account was the observation that 5-month-old infants exhibited more short looks compared to older infants. We recognised that dividing each look into 3 parts, regardless of its duration, might have impacted the results. Presumably, the activity during the middle and termination phases of a 1.5-second look differs from that of a look lasting over 7 seconds.

      Two additional factors that provided us with confidence in our approach were: (1) while the definition of "middle" was somewhat arbitrary, it allowed us to maintain consistency in our analyses across different age points; and (2) we obtained a comparable number of observations across the two time points (e.g. for "middle" we had 172 events at 5 months and 194 events at 10 months).

      b. It is recommended not to interpret lower-level interactions if more complex interactions are not significant. How are the interaction effects in a simpler model in which the 3-way interaction is removed? 

      We appreciate the comment. We tried to follow the same steps as in Xie et al. (2018). However, we have re-analysed the data removing the 3-way interaction and the significance of the results stayed the same. Please see Author response image 2 below (first: new analyses without the 3-way interactions; second: original analyses that included the 3-way interaction).

      Author response image 2.

      (3) Figure S1: there seems to be an outlier in the bottom-right panel. Do results hold excluding it? 

      We re-ran these analyses as per this suggestion and the results stayed the same (refer to SM pg. 2).

      (4) Figure S2 should refer to 10 months instead of 12.

      We thank the reviewer for noticing this typo, we have changed it in the reviewed manuscript (see SM pg. 3). 

      (5) In the 2nd paragraph of the discussion, I found this sentence unclear: "From Analysis 1 we found that infants at both ages showed a preferred modal reorientation rate". 

      We clarified this in the revised manuscript on pg. 10.

      (6) Discussion: many (infant) studies have used theta in anticipation of receiving information (Begus et al., 2016) surprising events (Meyer et al., 2023), and especially exploration (Begus et al., 2015). Can you make a broader point on how these findings inform our interpretation of theta in the infant population (go more from description to underlying mechanisms)? 

      We have expanded on this point about interpreting frequency bands on pg. 13 of the revised manuscript and thank the reviewer for bringing it up.

      Begus, K., Gliga, T., & Southgate, V. (2016). Infants' preferences for native speakers are associated with an expectation of information. Proceedings of the National Academy of Sciences, 113(44), 12397-12402.

      Meyer, M., van Schaik, J. E., Poli, F., & Hunnius, S. (2023). How infant‐directed actions enhance infants' attention, learning, and exploration: Evidence from EEG and computational modeling. Developmental Science, 26(1), e13259.

      Begus, K., Southgate, V., & Gliga, T. (2015). Neural mechanisms of infant learning: differences in frontal theta activity during object exploration modulate subsequent object recognition. Biology letters, 11(5), 20150041.

      (7) 2nd page of discussion, last paragraph: "preferred modal reorientation timer" is not a neural/cognitive mechanism, just a resulting behaviour. 

      We agree with this comment and thank the reviewer for bringing it to our attention. We clarified this on pg. 12 and pg. 13 of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      I have a few comments and questions that I think the authors should consider addressing in a revised version. Please see below:

      (1) During preprocessing (steps 5 and 6), it seems like the "noisy channels" were rejected using the pop_rejchan.m function and then interpolated. This procedure is common in infant EEG analysis, but a concern arises: was there no upper limit for channel interpolation? Did the authors still perform bad channel interpolation even when more than 30% or 40% of the channels were identified as "bad" at the beginning with the continuous data? 

      We did state in the original manuscript that “participants with fewer than 30% channels interpolated at 5 months and 25% at 10 months made it to the final step (ICA) and final analyses”. In the revised version we have re-written this section in order to make this clearer (pg. 17).

      (2) I am also perplexed about the sequencing of the ICA pruning step. If the intention of ICA pruning is to eliminate artificial components, would it be more logical to perform this procedure before the conventional artifacts' rejection (i.e., step 7), rather than after? In addition, what was the methodology employed by the authors to identify the artificial ICA components? Was it done through manual visual inspection or utilizing specific toolboxes? 

      We agree that the ICA is often run before, however, the decision to reject continuous data prior to ICA was to remove the very worst sections of data (where almost all channels were affected), which can arise during times when infants fuss or pull the caps. Thus, this step was applied at this point in the pipeline so that these sections of really bad data were not inputted into the ICA. This is fairly widespread practice in cleaning infant data.

      Concerning the reviewer’s second question, of how ICA components were removed – the answer to this is described in considerable detail in the paper that we refer to in that section of the manuscript. This was done by training a classifier specially designed to clean naturalistic infant EEG data (Haresign et al., 2021), which has since been employed in similar studies (e.g. Georgieva et al., 2020; Phillips et al., 2023).

      (3) Please clarify how the relative power was calculated for the theta (3-6Hz) and alpha (6-9Hz) bands. Were they calculated by dividing the ratio of theta or alpha power to the power between 3 and 9Hz, or the total power between 1 (or 3) and 20 Hz? In other words, what does the term "all frequency bands" refer to in section 4.3.7? 

      We thank the reviewer for this comment; we have now clarified this on pg. 22.
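      For context, relative power is conventionally computed as band power divided by total power over a reference range, and the reviewer's question is precisely which reference range ("all frequency bands") was used. A generic sketch (the reference ranges shown are illustrative assumptions, not the manuscript's stated choice):

```python
def relative_band_power(psd, freqs, band=(3.0, 6.0), reference=(1.0, 20.0)):
    """Relative power: power in `band` divided by power over `reference`.

    `psd` and `freqs` are matched lists of spectral power values and their
    frequencies (Hz). The `reference` range is the ambiguity in question:
    (3, 9) vs (1, 20) yield different relative values for the same band.
    """
    band_p = sum(p for f, p in zip(freqs, psd) if band[0] <= f <= band[1])
    ref_p = sum(p for f, p in zip(freqs, psd) if reference[0] <= f <= reference[1])
    return band_p / ref_p
```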

      (4) One of the key discoveries presented in this paper is the observation that attention shifts are accompanied by a subsequent enhancement in theta band power shortly after the shifts occur. Is it possible that this effect or alteration might be linked to infants' saccades, which are used as indicators of attention shifts? Would it be feasible to analyze the disparities in amplitude between the left and right frontal electrodes (e.g., Fp1 and Fp2, which could be viewed as virtual horizontal EOG channels) in relation to theta band power, in order to eliminate the possibility that the augmentation of theta power was attributable to the intensity of the saccades? 

      We appreciate the concern. Average saccade duration in infants is about 40ms (Garbutt et al., 2007). Our finding that the positive cross-correlation between theta and look duration is present not only when we examine zero-lag data but also when we examine how theta forwards-predicts attention 1-2 seconds afterwards seems therefore unlikely to be directly attributable to saccade-related artifact. Concerning the reviewer’s suggestion – this is something that we have tried in the past. Unfortunately, however, our experience is that identifying saccades based on the disparity between Fp1 and Fp2 is much too unreliable to be of any use in analysing data. Even if specially positioned HEOG electrodes are used, we still find the saccade detection to be insufficiently reliable. In ongoing work we are tracking eye movements separately, in order to be able to address this point more satisfactorily.

      (5) The following question is related to my previous comment. Why is the duration of the relationship between theta power and moment-to-moment changes in attention so short? If theta is indeed associated with attention and information processing, shouldn't the relationship between the two variables strengthen as the attention episode progresses? Given that the authors themselves suggest that "One possible interpretation of this is that neural activity associates with the maintenance more than the initiation of attentional behaviors," it raises the question of (is in contradiction to) why the duration of the relationship is not longer but declines drastically (Figure 6). 

      We thank the reviewer for raising this excellent point. Certainly we argue that this, together with the low autocorrelation values for theta documented in Fig 7A and 7D, challenges many conventional ways of interpreting theta. We are continuing to investigate this question in ongoing work.

      (6) Have the authors conducted a comparison of alpha relative power and HR deceleration durations between 5 and 10-month-old infants? This analysis could provide insights into whether the differences observed between the two age groups were partly due to varying levels of general arousal and engagement during free play.

      We thank the reviewer for this suggestion. Indeed, this is an aspect we investigated but ultimately, given that our primary emphasis was on the theta frequency, and considering the length of the manuscript, we decided not to incorporate it. However, we attach Author response image 3 below showing there was no significant interaction between HR and the alpha band.

      Author response image 3.

      Reviewer #3 (Recommendations For The Authors):

      (1) In reading the manuscript, the language used seems to imply longitudinal data or at the very least the ability to detect change or maturation. Given the cross-sectional nature of the data, the language should be tempered throughout. The data are illustrative but not definitive. 

      We thank the reviewer for this comment. We have now clarified that “Data was analysed in a cross-sectional manner” on pg. 15.

      (2) The sample size is quite modest, particularly in the specific age groups. This is likely tempered by the sheer number of data points available. This latter argument is implied in the text, but not as explicitly noted. (However, I may have missed this as the text is quite dense). I think more notice is needed on the reliability and stability of the findings given the sample. 

      We have clarified this on pg. 16.

      (3) On a related note, how was the sample size determined? Was there a power analysis to help guide decision-making for both recruitment and choosing which analyses to proceed with? Again, the analytic approach is quite sophisticated and the questions are of central interest to researchers, but I was left feeling maybe these two aspects of the study were out-sprinting the available data. The general impression is that the sample is small, but it is not until looking at table s7, that it is in full relief. I think this should be more prominent in the main body of the study.

      We have clarified this on pg. 16.

      The manuscript devotes a few sentences to the relation between looking and attention. However, this distinction is central to the design of the study, and to any philosophical differences regarding what take-away points can be generated. In my reading, I think this point needs to be more heavily interrogated.

      This distinction between looking and paying attention is clearer now in the revised manuscript as per R1 and R3’s suggestions. We have also added a paragraph in the Introduction to clarify it and pointed out its limitations (see pg. 5).

      (5) I would temper the real-world attention language. This study is certainly a great step forward, relative to static faces on a computer screen. However, there are still a great number of artificial constraints that have been added. That is not to say that the constraints are bad--they are necessary to carry out the work. However, it should be acknowledged that it constrains the external validity. 

      We have added a paragraph to acknowledged limitations of the setup in pg. 14.

      (6) The kappa on the coding is not strong. The authors chose to proceed nonetheless. Given that, I think more information is needed on how coders were trained, how they were standardized, and what parameters were used to decide they were ready to code independently. Again, with the sample size and the kappa presented, I think more discussion is needed regarding the robustness of the findings. 

      We appreciate the concern. As per our answer to R1, we chose to report the most stringent measure of inter-rater reliability, but other calculation methods (i.e., percent agreement) return higher scores (see response to R1).

      As per the training, we wrote an extensively detailed coding scheme describing exactly how to code each look, which was handed to our coders. Throughout the initial months of training, we met with the coders on a weekly basis to discuss questions and individual frames that looked ambiguous. After each session, we would revise the coding scheme to incorporate additional details, aiming to make the coding process progressively less subjective. During this period, every coder analysed the same interactions, and inter-rater reliability (IRR) was assessed weekly, comparing their evaluations with mine (Marta). With time, the coders had fewer questions and IRR increased. At that point, we deemed them sufficiently trained, and began assigning them different interactions from each other. Periodically, though, we all assessed the same interaction and met to review and discuss our coding outputs.

    1. Figure 1: line-by-line coding in EPPI-Reviewer.

      Figure 1 should be viewed full-size, in order to understand exactly how codes can be arranged into different categories, as the basis for developing descriptive themes.

      The code in this case is "bad food=nice, good food=awful"

    1. We hope that by the end of this course, you have a familiarity of what programming is and some of what you can do with it. We particularly hope you have a familiarity with basic Python programming concepts, and an ability to interact with Reddit using computer programs.

      Yeah, this course was really good at introducing me to coding concepts, as a person who has never coded before. I am able to now understand most basic code, and edit accordingly. It also has piqued my interest and I may try to learn to code more in the summer!

    1. After the changing directory to r-dev-env, we can open this code inside VSCode editor using cmd

      -> "Restart VSCode in the r-dev-env directory with the command: "

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this potentially useful study, the authors attempt to use comparative meta-analysis to advance our understanding of life history evolution. Unfortunately, both the meta-analysis and the theoretical model are inadequate, and proper statistical and mechanistic descriptions of the simulations are lacking. Specifically, the interpretation overlooks the effect of well-characterised complexities in the relationship between clutch size and fitness in birds.

      Public Reviews:

      We would like to thank the reviewers for their helpful comments, which have been considered carefully and have been valuable in progressing our manuscript. The following bullet points summarise the key points and our responses, though our detailed responses to specific comments can be found below:

      - Two reviewers commented that our data was not made available. Our data was provided upon submission and during the review process, however was not made accessible to the reviewers. Our data and code are available at https://doi.org/10.5061/dryad.q83bk3jnk.

      - The reviewers have highlighted that some of our methodology was unclear and we have added all the requested detail to ensure our methods can be easily understood.

      - The reviewers highlight the importance of our conclusions, but also suggest some interpretations might be missing and/or are incomplete. To make clear how we objectively interpreted our data and the wider consequences for life-history theory we provide a decision tree (Figure 5). This figure makes clear where we think the boundaries are in our interpretation and how multiple lines of evidence converge to the same conclusions.

      Reviewer #1 (Public Review):

      This paper falls in a long tradition of studies on the costs of reproduction in birds and its contribution to understanding individual variation in life histories. Unfortunately, the meta-analyses only confirm what we know already, and the simulations based on the outcome of the meta-analysis have shortcomings that prevent the inferences on optimal clutch size, in contrast to the claims made in the paper.

      There was no information that I could find on the effect sizes used in the meta-analyses other than a figure listing the species included. In fact, there is more information on studies that were not included. This made it impossible to evaluate the data-set. This is a serious omission, because it is not uncommon for there to be serious errors in meta-analysis data sets. Moreover, in the long run the main contribution of a meta-analysis is to build a data set that can be included in further studies.

      It is disappointing that two referees comment on data availability, as we supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      The main finding of the meta-analysis of the brood size manipulation studies is that the survival costs of enlarging brood size are modest, as previously reported by Santos & Nakagawa on what I suspect to be mostly the same data set.

      We disagree that the main finding of our paper is the small survival cost of manipulated brood size. The major finding of the paper, in our opinion, is that the effect sizes for experimental and observational studies are in opposite directions, therefore providing the first quantitative evidence to support the influential theoretical framework put forward by van Noordwijk and de Jong (1986), that individuals differ in their optimal clutch size and are constrained to reproducing at this level due to a trade-off with survival. We further show that while the manipulation experiments have been widely accepted to be informative, they are not in fact an effective test of whether within-species variation in clutch size is the result of a trade-off between reproduction and survival.

      The comment that we are reporting the same finding as Santos & Nakagawa (2012) is a misrepresentation of both that study and our own. Santos & Nakagawa found an effect of parental effort on survival only in males who had their clutch size increased – but no effect for males who had their clutch size reduced and no survival effect on females for either increasing or reducing parental effort. However, we found an overall reduction in survival for birds who had brood sizes manipulated to be larger than their original brood (for both sexes and mixed sex studies combined). In our supplementary information, we demonstrate that the overall survival effect of a change in reproductive effort is close to zero for males, negative (though non-significant) for females and significantly negative for mixed sexes (which are not included in the Santos & Nakagawa study). Please also note that the Santos & Nakagawa study was conducted over 10 years ago, which means we were able to add additional data published since (L364-365). Furthermore, meta-analyses are an evolving practice and we also corrected and improved on the overall analysis approach (e.g. L358-359 and L393-397, and see detailed SI).

      The paper does a very poor job of critically discussing whether we should take this at face value or whether instead there may be shortcomings in the general experimental approach. A major reason why survival cost estimates are barely significantly different from zero may well be that parents do not fully adjust their parental effort to the manipulated brood size, either because of time/energy constraints, because it is too costly and therefore not optimal, or because parents do not register increased offspring needs. Whatever the reason, as a consequence, there is usually a strong effect of brood size manipulation on offspring growth and thereby presumably their fitness prospects. In the simulations (Fig. 4), the consequences of the survival costs of reproduction for optimal clutch size were investigated without considering brood size manipulation effects on the offspring. Effects on offspring are briefly acknowledged in the discussion, but otherwise ignored. Assuming that the survival costs of reproduction are indeed difficult to discern because the offspring bear the brunt of the increase in brood size, a simulation that ignores the latter effect is unlikely to yield any insight into optimal clutch size. It is therefore not clear what we learn from these calculations.

      The reviewer’s comment is somewhat paradoxical. We take the best-studied example of the trade-off between reproductive effort and parental survival – a key theme in life history and the biology of ageing – and subject it to a meta-analysis. The reviewer suggests we should interpret our finding as if there must be something wrong with the method or the studies we included, rather than considering that the original hypothesis could be false or inflated in importance. We do not consider questioning the premise of the data before questioning a favoured hypothesis to necessarily be the best scientific approach here. In many places in our manuscript, we question and address, at length, the underlying data and their interpretation (L116-117, L165-167, L202-204 and L277-282). Moreover, we make it clear that we focus on the trade-off between current reproductive effort and subsequent parental survival, while being aware that other trade-offs could counterbalance or explain our findings (discussed on L208-210 & L301-316). Note that it is also problematic, when the expected response is not found, to search for an alternative that has not been measured: in the case of potential trade-offs, there are endless possible pairs of traits between which a trade-off might operate. We purposefully focus on the one well-studied and most commonly invoked trade-off. We clearly acknowledge, though, that when all possible trade-offs are taken into account a trade-off at the fitness level can occur, and we cite two famous studies (Daan et al., 1990 and Verhulst & Tinbergen, 1991) that have shown just that (L314-316).

      So whilst we agree with the reviewer that the offspring may incur costs themselves, rather than costs being incurred by the parents, the aim of our study was to test for a general trend across species in the survival costs of reproductive effort. We do not think incorporating offspring growth into our simulations would add insight, because a change in offspring number rarely affects all offspring in a nest equally, and the differences can be stark; this is most evident in species that produce sacrificial offspring. The effect is further confounded by catch-up growth, so increased sibling competition from added chicks is likely to alter offspring growth trajectories, rather than absolute growth as the reviewer suggests. Results in the literature on the effect of altering clutch size on offspring survival are mixed, with experimentally increased clutch sizes often increasing the number of recruits from a nest.

      What we do appreciate from the reviewer’s comment is that the interpretation of our findings is complex. Even though our in-text explanation includes the caveats the reviewer refers to, and discusses them at length, their inter-relationships are hard to appreciate in text alone. To improve the presentation and for ease of the reader, we have added a decision tree (Figure 5) representing the logical flow from the hypothesis being tested through to the overall conclusions that can be drawn from our results. We believe this clarifies what conclusions can be drawn. We emphasise again that the hypothesis that a trade-off between reproductive effort and parental survival is the major driver of variation in offspring production was not supported, even though it is the one that practitioners in the field would be most likely to invoke; our result is important for this reason.

      There are other reasons why brood size manipulations may not reveal the costs of reproduction animals would incur when opting for a larger brood size than they produced spontaneously themselves. Firstly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Secondly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      First, our results did show a survival cost of reproduction for brood manipulations (L107-123, Figure 1, Table 1). Note, however, that much theory is built on the immediate costs of reproduction and, as such, these costs are likely overinterpreted, meaning that our overall interpretation still holds, i.e. “parental survival trade-off is not the major determinative trade-off in life history within-species” (Figure 5).

      We agree with the reviewer that lifetime manipulations could be even more informative than single-year manipulations. Unfortunately, there are currently too few lifetime-manipulation studies available to draw generalisable conclusions across species. This is, however, the reason we used a lifetime change in clutch size in our fitness projections, which the reviewer seems to have missed – please see methods L466-468, where we explicitly state that the enlargement is lifelong. Of course, such projections do not include an accumulation of costs greater than the annual cost, but currently there is no clear evidence that such an assumption is valid. Nor can such a conclusion be drawn from the study on jackdaws by Boonekamp et al. (2014), as the treatments were life-long and therefore cannot separate annual costs from accrued (multiplicative) costs that exceed the sum of the annual costs incurred. Note that we have now included specific discussion of this study in response to the reviewer (L265-269).
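
      The logic of a lifetime-enlargement projection can be sketched as follows. This is a toy illustration with invented survival and cost values, not the published simulation code (which is archived on Dryad): a lifelong +1-egg enlargement is paid for in every year of life through a per-egg reduction in annual survival.

```python
# Toy lifetime fitness projection under a lifelong +1-egg enlargement.
# All parameter values are invented for illustration only.

def lifetime_offspring(clutch, annual_survival, max_years=50):
    """Expected lifetime offspring: the clutch produced each year,
    weighted by the probability of surviving to the start of that year."""
    alive, total = 1.0, 0.0
    for _ in range(max_years):
        total += alive * clutch
        alive *= annual_survival
    return total

baseline_survival = 0.5   # hypothetical annual survival at the natural clutch
per_egg_cost = 0.02       # hypothetical per-egg reduction in annual survival

natural = lifetime_offspring(5, baseline_survival)
enlarged = lifetime_offspring(6, baseline_survival - per_egg_cost)
# A small per-egg survival cost, paid in every year of life, still leaves
# the enlarged clutch with the higher lifetime output -- a cost of this
# magnitude alone cannot explain why clutch size is constrained.
```

Because the geometric series of survival probabilities converges quickly, 50 years is effectively a full lifetime for these parameter values.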

      Details of how the analyses were carried out were opaque in places, but as I understood the analysis of the brood size manipulation studies, manipulation was coded as a covariate, with negative values for brood size reductions and positive values for brood size enlargements (and then variably scaled or not to control brood or clutch size). This approach implicitly assumes that the trade-off between current brood size (manipulation) and parental survival is linear, which contrasts with the general expectation that this trade-off is not linear. This assumption reduces the value of the analysis, and contrasts with the approach of Santos & Nakagawa.

      We thank the reviewer for highlighting a lack of clarity in places in our methods. We have added additional detail to the methodology section (see “Study sourcing & inclusion criteria” and “Extracting effect sizes”) in our revised manuscript. Note that our data and code were not shared with the reviewers, despite us supplying them upon submission and again during the review process; they would have explained much of the detail required.

      For clarity in our response: each effect size was extracted by performing a logistic regression with survival as a binary response variable and clutch size as the predictor, where clutch size is the absolute number of offspring in the nest (i.e., for a bird that laid a clutch of 5 but was manipulated to have one egg removed, we used a clutch size of 4). The clutch size was also standardised and, separately, expressed as a proportion of the species’ mean.
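
      For illustration, this per-study extraction can be sketched in a few lines (hypothetical survival counts, and a hand-rolled Newton-Raphson fit rather than our actual extraction code):

```python
import math

# Hypothetical per-clutch-size survival counts for one study.
# "Clutch size" is the absolute number of offspring raised, i.e.
# the laid clutch plus/minus any experimental manipulation.
data = [  # (clutch_size, n_survived, n_died)
    (3, 45, 15),
    (4, 58, 22),
    (5, 70, 30),
    (6, 60, 40),
    (7, 40, 35),
]

def fit_logistic(data, iters=50):
    """Newton-Raphson fit of survival ~ intercept + slope * clutch size
    for an aggregated binomial response."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s, d in data:
            n = s + d
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += s - n * p            # gradient of the binomial log-likelihood
            g1 += (s - n * p) * x
            w = n * p * (1.0 - p)      # Fisher information weights
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # Newton step: info^-1 * gradient
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

intercept, slope = fit_logistic(data)
# `slope` is the per-egg change in the log-odds of annual survival:
# the kind of effect size that enters the meta-analysis.
```

In these made-up counts survival declines with the number of offspring raised, so the fitted slope is negative.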

      We disagree that our approach reduces the value of our analysis. First, our approach allows a direct comparison between experimental and observational studies, which is the novelty of our study. Our approach does differ from that of Santos & Nakagawa, but we disagree that it contrasts: it allows us to take into consideration the severity of the change in clutch size, which Santos & Nakagawa do not. Therefore, we do not agree that our approach is worse at accounting for non-linearity of trade-offs than theirs. Arguably, the approach of Santos & Nakagawa is worse, as they dichotomise effort as increased or decreased, factorise their output and thereby inflate their number of outcomes, of which only 1 of the 4 cells (males and females × increased and decreased brood size) is significant. The proof is in the pudding as well: our results clearly demonstrate that the magnitude of the manipulation is a key factor driving the results, i.e. one offspring for a seabird is a larger proportion of care (and fitness) than one offspring for a passerine. Such insights were not achieved by Santos & Nakagawa’s method, which, again, did not allow a direct quantitative comparison between quality (correlational) and experimental (brood size manipulation, i.e. “trade-off”) effects, which forms a central part of our argument (Figure 5).

      Our analysis, alongside a plethora of other ecological studies, does assume that the response to our predictor variable is linear. However, it is common knowledge that there are very few (if any) truly linear relationships. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than fitting non-linear models would. For many datasets, the range of added chicks required to estimate a non-linear relationship was not available. The question also remains of what shape such a non-linear relationship should take, which is hard to determine a priori. There is also a real risk that fitted non-linear terms are spurious and overinterpreted, as they often present a better fit (and charging them a single degree of freedom is not a sufficient penalty, especially when slopes vary). We have added this detail to our discussion.
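
      The risk of overinterpreting non-linear terms can be illustrated with a toy example on invented data: even when the true relationship is exactly linear, a quadratic fit never has a worse residual sum of squares, so an "improved" fit by itself is weak evidence of real non-linearity.

```python
import random

random.seed(1)
# Truly linear relationship plus Gaussian noise (invented data)
xs = [i / 10 for i in range(40)]
ys = [2.0 * x + random.gauss(0, 1) for x in xs]

def polyfit_rss(xs, ys, degree):
    """Residual sum of squares of a least-squares polynomial fit,
    solved via the normal equations with Gaussian elimination."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    c = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):                      # forward elimination with pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= f * A[col][k]
            c[r] -= f * c[col]
    b = [0.0] * n                             # back substitution
    for i in reversed(range(n)):
        b[i] = (c[i] - sum(A[i][k] * b[k] for k in range(i + 1, n))) / A[i][i]
    return sum((y - sum(b[i] * x ** i for i in range(n))) ** 2
               for x, y in zip(xs, ys))

rss_linear = polyfit_rss(xs, ys, 1)
rss_quadratic = polyfit_rss(xs, ys, 2)
# The quadratic never fits worse, even though the truth is linear.
```

The extra parameter is guaranteed not to increase the residual sum of squares, which is exactly why a naive goodness-of-fit comparison favours the more flexible model.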

      The observational study selection is not complete and apparently no attempt was made to make it complete. This is a missed opportunity - it would be interesting to learn more about interspecific variation in the association between natural variation in clutch size and parental survival.

      We clearly state in our manuscript that we deliberately tailored the selection of studies to match the manipulation studies (L367-369). We paired species extracted for observational studies with those extracted in experimental studies to facilitate a direct comparison between observational and experimental studies, and to ensure that the respective datasets were comparable. The reviewer’s focus in this review seems to be solely on the experimental dataset. This comment dismisses the equally important observational component of our analysis and thereby fails to acknowledge one of the key questions being addressed in this study. Note that in our revised version we have edited the phylogenetic tree to indicate for which species we have both types of information, which highlights our approach to selecting observational data (Figure 3).

      Reviewer #2 (Public Review):

      I have read with great interest the manuscript entitled "The optimal clutch size revisited: separating individual quality from the costs of reproduction" by LA Winder and colleagues. The paper consists in a meta-analysis comparing survival rates from studies providing clutch sizes of species that are unmanipulated and from studies where the clutch sizes are manipulated, in order to better understand the effects of differences in individual quality and of the costs of reproduction. I find the idea of the manuscript very interesting. However, I am not sure the methodology used allows to reach the conclusions provided by the authors (mainly that there is no cost of reproduction, and that the entire variation in clutch size among individuals of a population is driven by "individual quality").

      We would like to highlight that we do not conclude that there is no cost of reproduction. Please see L336-339, where we state that our lack of evidence for trade-offs driving within-species variation in clutch size does not necessarily mean the costs of reproduction are non-existent. We conclude that individuals are constrained to their optima by the survival cost of reproduction. It is also an overstatement of our conclusion to say that we believe variation in clutch size is driven only by quality. Our results show that unmanipulated birds with larger clutch sizes also lived longer, and we suggest that this is evidence that some individuals are “better” than others, but we do not say, nor imply, that no other factors affect variation in clutch size. We have added Figure 5 to our manuscript to help the reader better understand what questions we can answer with our study and what conclusions we can draw from our results.

      I write that I am not sure, because in its current form, the manuscript does not contain a single equation, making it impossible to assess. It would need at least a set of mathematical descriptions for the statistical analysis and for the mechanistic model that the authors infer from it.

      We appreciate this comment, and have explained our methods in terms that are accessible to a wider audience. Note, however, that our meta-analysis is standard and based on logistic regression and standard meta-analytic practices. We have added the model formula to the model output tables.

      For the simulation, we simply simulated the resulting effects. We of course supplied our code for this along with our manuscript (https://doi.org/10.5061/dryad.q83bk3jnk), though, as we mentioned above, we believe this was not shared with the reviewers despite our making it available for the review process. We therefore understand why the reviewer feels the simulations were not explained thoroughly. We have revised our methods section and added details which we believe make our methodology clearer without the need to consult the supplemental material. However, we have also added the equations used in calculating our simulated data to the Supplementary Information for readers who wish to have this information in equation form.

      The texts mixes concepts of individual vs population statistics, of within individual vs among-individuals measures, of allocation trade-offs and fitness trade-offs, etc ....which means it would also require a glossary of the definitions the authors use for these various terms, in order to be evaluated.

      We would like to thank the reviewer for highlighting this lack of clarity in our text. Throughout the manuscript we have refined our terminology and indicated where we are referring to the individual level or the population level. The inclusion of our new Figure 5 (decision tree) should also help in this context, as it is clear on which level we base our interpretation and conclusions.

      This problem is emphasised by the following sentence to be found in the discussion "The effect of birds having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation". The "effect" is defined as the survival rate (see Fig 1). While it is relatively easy to intuitively understand what the "effect" is for the unmanipulated studies: the sensitivity of survival to clutch size at the population level, this should be mentioned and detailed in a formula. Moreover, the concept of effect size is not at all obvious for the manipulated ones (effect of the manipulation? or survival rate whatever the manipulation (then how could it measure a trade-off ?)? at the population level? at the individual level ?) despite a whole appendix dedicated to it. This absolutely needs to be described properly in the manuscript.

      Thank you for identifying this sentence, for which the writing was ambiguous; our apologies. We have now rewritten this and included additional explanation. L282-290: ‘The effect on parental annual survival of having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation, and quantitatively similar. Parents with naturally larger clutches are thus expected to live longer and this counterbalances the “cost of reproduction” when their brood size is experimentally manipulated. It is, therefore, possible that quality effects mask trade-offs. Furthermore, it could be possible that individuals that lay larger clutches have smaller costs of reproduction, i.e. would respond less in terms of annual survival to a brood size manipulation, but with our current dataset we cannot address this hypothesis (Figure 5).’

      We would also like to thank the reviewer for bringing to our attention the lack of clarity about the details of our methodology. We have added details to our methodology (see “Extracting effect sizes” section) to address this (see highlighted sections). For clarity, the effect size for both manipulated and unmanipulated nests was survival given the brood size raised. We performed a logistic regression with survival as a binary response variable (i.e., the number of individuals that survived and the number that died after each breeding season) and clutch size as the predictor, taken as the absolute number of offspring in the nest (i.e., for a bird that laid a clutch of 5 but was manipulated to have one egg removed, we used a clutch size of 4). This allows a direct comparison of the effect size (survival given the clutch size raised) between manipulated and unmanipulated birds.

      Despite the lack of information about the underlying mechanistic model tested and the statistical model used, my impression is still that the interpretation in the introduction and discussion is not granted by the outputs of the figures and tables. Let's use a model similar to that of (van Noordwijk and de Jong, 1986): imagine that the mechanism at the population level is

      a.c_(i,q)+b.s_(i,q)=E_q

      Where c_(i,q) and s_(i,q) are respectively the clutch size for individual i which is of quality q, and E_q is the level of "energy" that an individual of quality q has available during the given time-step (and a and b are constants turning the clutch size and survival rate into energy cost of reproduction and energy cost of survival, and they are both quite "high" so that an extra egg (c_(i,q) is increased by 1) at the current time-step, decreases s_(i,q) markedly (E_q is independent of the number of eggs produced), that is, we have strong individual costs of reproduction). Imagine now that the variance of c_(i,q) (when the population is not manipulated) among individuals of the same quality group, is very small (and therefore the variance of s_(i,q) is very small also) and that the expectation of both are proportional to E_q. Then, in the unmanipulated population, the variance in clutch size is mainly due to the variance in quality. And therefore, the larger the clutch size c_(i,q) the higher E_q, and the higher the survival s_(i,q).

      In the manipulated populations however, because of the large a and b, an artificial increase in clutch size, for a given E_q, will lead to a lower survival s_(i,q). And the "effect size" at the population level may vary according to a,b and the variances mentioned above. In other words, the costs of reproduction may be strong, but be hidden by the data, when there is variance in quality; however there are actually strong costs of reproduction (so strong actually that they are deterministic and that the probability to survive is a direct function of the number of eggs produced)

      We would like to thank the reviewer for these comments. We have added detail to our methodology section so that our models and rationale are clearer. Please note that our simulations only take the experimental effect of brood size on parental survival into account; our model does not incorporate quality effects. The reviewer is right that the relationship between quality and the effects exposed by manipulating brood size can take many forms, and this is a very interesting topic, but not one we aimed to tackle in our manuscript. In terms of quality we make two points: (1) overall quality effects connecting reproduction and parental survival are present; (2) these effects are opposite in direction to the effects when reproduction is manipulated, and similar in magnitude. We do not go further than that in interpreting our results. The reviewer is correct, however, that we do suggest, and repeat suggestions by others, that quality can also mask the trade-off in some individuals or circumstances (L74-76, L95-98 & L286-289), but we do not quantify this, as it depends on the unknown relationship between quality and the response to the manipulation. A focussed set of experiments in that context would be interesting, and there are some data that could get at this, i.e. the relationship between the produced clutch size and the relative effect of the manipulation (now included L287-290). Such information is, however, not available for all studies and, although we explored the possibility of analysing this, it is currently not possible with adequate confidence, with the added complexity of possible non-linear effects. We have added this rationale in our revision (L259-265).
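
      For concreteness, the masking scenario the reviewer outlines can be sketched numerically: with invented values for the reviewer's a, b and E_q, a strong within-individual cost of reproduction coexists with a positive clutch-survival correlation across unmanipulated individuals.

```python
# Toy version of the reviewer's energy-budget model (invented values):
#   a * clutch + b * survival = E_q   =>   survival = (E_q - a * clutch) / b
a, b = 0.05, 1.0

def survival(clutch, E):
    return (E - a * clutch) / b

# Unmanipulated: high-quality individuals (large E_q) lay larger clutches,
# so clutch size and survival are POSITIVELY correlated across individuals.
qualities = [0.6, 0.7, 0.8, 0.9, 1.0]              # E_q values
natural_clutch = [round(E * 8) for E in qualities]  # quality sets clutch size
natural_surv = [survival(c, E) for c, E in zip(natural_clutch, qualities)]

# Manipulated: for any single individual, one extra egg at fixed E_q
# lowers survival by a / b -- the within-individual trade-off is strong
# even though the unmanipulated pattern points the other way.
E = 0.8
delta = survival(7, E) - survival(6, E)   # per-egg change = -a / b
```

The same mechanism thus produces opposite signs in observational and experimental comparisons, which is why the two study types must be analysed side by side.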

      Moreover, it seems to me that the costs of reproduction are a concept closely related to generation time. Looking beyond the individual allocative (and other individual components of the trade-off) cost of reproduction and towards a populational negative relationship between survival and reproduction, we have to consider the intra-population slow fast continuum (some types of individuals survive more and reproduce less (are slower) than other (which are faster)). This continuum is associated with a metric: the generation time. Some individuals will produce more eggs and survive less in a given time-period because this time-period corresponds to a higher ratio of their generation time (Gaillard and Yoccoz, 2003; Gaillard et al., 2005). It seems therefore important to me, to control for generation time and in general to account for the time-step used for each population studied when analysing costs of reproduction. The data used in this manuscript is not just clutch size and survival rates, but clutch size per year (or another time step) and annual (or other) survival rates.

      The reviewer is right that this is interesting. There is a longstanding, unexplained difference between temperate (seasonal) and tropical reproductive strategies. Most of our data come from seasonal breeders, however. Although there is some variation in second brooding and the like, these species mostly produce only one brood. We do agree that a wider consideration here is relevant, but we are not trying to explain all of life history in our paper. It is clearly the case that other factors will operate and that the opportunity for trade-offs will vary among species according to their respective life histories. However, our study focuses on the two most fundamental components of fitness – longevity and reproduction – to test a major hypothesis in the field, and we uncover new relationships that contrast with previous influential studies and cast doubt on previous conclusions. We question the assumed trade-off between reproduction and annual survival. We show that quality is important and that the effect we find in experimental studies is so small that it can only explain between-species patterns but is unlikely to be the selective force that constrains reproduction within species. We do agree that there is a lot more work that can be done in this area, and we hope we are contributing to the field by questioning this central trade-off. We have incorporated some of the reviewer’s suggestions in the revision (L309-315). We have added Figure 5 to set out, in an easily accessible format, where we are able to reach solid conclusions and the evidence on which these are based.

      Finally, it is important to relate any study of the costs of reproduction in a context of individual heterogeneity (in quality for instance), to the general problem of the detection of effects of individual differences on survival (see, e.g., Fay et al., 2021). Without an understanding of the very particular statistical behaviour of survival, associated to an event that by definition occurs only once per life history trajectory (by contrast to many other traits, even demographic, where the corresponding event (production of eggs for reproduction, for example) can be measured several times for a given individual during its life history trajectory).

      Thank you for raising this point. The reviewer is right that heterogeneity can dampen or augment selection; indeed, by estimating the effect of quality here, we give an example of how heterogeneity can do exactly this. We thank the reviewer for suggesting that we link this to wider effects of heterogeneity, and we have expanded our discussion of how our results bear on the importance of accounting for among-individual heterogeneity (L252-256).

      References:

      Fay, R. et al. (2021) 'Quantifying fixed individual heterogeneity in demographic parameters: Performance of correlated random effects for Bernoulli variables', Methods in Ecology and Evolution, 2021(August), pp. 1-14. doi: 10.1111/2041-210x.13728.

      Gaillard, J.-M. et al. (2005) 'Generation time: a reliable metric to measure life-history variation among mammalian populations.', The American naturalist, 166(1), pp. 119-123; discussion 124-128. doi: 10.1086/430330.

      Gaillard, J.-M. and Yoccoz, N. G. (2003) 'Temporal Variation in Survival of Mammals: a Case of Environmental Canalization?', Ecology, 84(12), pp. 3294-3306. doi: 10.1890/02-0409.

      van Noordwijk, A. J. and de Jong, G. (1986) 'Acquisition and Allocation of Resources: Their Influence on Variation in Life History Tactics', American Naturalist, p. 137. doi: 10.1086/284547.

      Reviewer #3 (Public Review):

      The authors present here a comparative meta-analysis designed to detect evidence for a reproduction/survival trade-off, central to expectations from life history theory. They present variation in clutch size within species as an observation in conflict with expectations of optimisation of clutch size and suggest that this may be accounted for from weak selection on clutch size. The results of their analyses support this explanation - they found little evidence of a reproduction - survival trade-off across birds. They extrapolated from this result to show in a mathematical model that the fitness consequences of enlarged clutch sizes would only be expected to have a significant effect on fitness in extreme cases, outside of normal species' clutch size ranges. Given the centrality of the reproduction-survival trade-off, the authors suggest that this result should encourage us to take a more cautious approach to applying concepts of the trade-off in life history theory and optimisation in behavioural ecology more generally. While many of the findings are interesting, I don't think the argument for a major re-think of life history theory and the role of trade-offs in fitness maximisation is justified.

      The interest of the paper, for me, comes from highlighting the complexities of the link between clutch size and fitness, and the challenges facing biologists who want to detect evidence for life history trade-offs. Their results highlight apparently contradictory results from observational and experimental studies on the reproduction-survival trade-off and show that species with smaller clutch sizes are under stronger selection to limit clutch size.

      Unfortunately, the authors interpret the failure to detect a life history trade-off as evidence that there isn't one. The construction of a mathematical model based on this interpretation serves to give this possible conclusion perhaps more weight than is merited on the basis of the results, of this necessarily quite simple, meta-analysis. There are several potential complicating factors that could explain the lack of detection of a trade-off in these studies, which are mentioned and dismissed as unimportant (lines 248-250) without any helpful, rigorous discussion. I list below just a selection of complexities which perhaps deserve more careful consideration by the authors to help readers understand the implications of their results:

      We would like to thank the reviewer for their thoughtful response and summary of the findings that we also agree are central to our study. The reviewer also highlights areas where our manuscript could benefit from a deeper consideration and we have added detail accordingly to our revised discussion.

      We would like to highlight that we do not interpret the failure to detect a trade-off as evidence that there is not one. First, and importantly, we do find a trade-off but show this is only incurred when individuals produce a clutch beyond their optimal level. Second, we also state on lines 322-326 that the lack of evidence to support trade-offs being strong enough to drive variation in clutch size does not necessarily mean there are no costs of reproduction.

      The statement that we constructed a mathematical model based on the interpretation that we have not found a trade-off is, again, factually incorrect. We ran these simulations because the opposite is true – we did find a trade-off: there is a significant effect of manipulated clutch size on annual parental survival. We benefit from our unique analysis allowing a quantitative fitness estimate from the effect size on annual survival (as this is expressed on a per-egg basis). This allowed us to ask whether this quantitative effect size can alone explain why reproduction is constrained, which we evaluated using simulations. From these simulations we find that the effect size is too small to explain the constraint, so something else must be going on, and we spend a considerable amount of text discussing the possible explanations (L202-215). Note that possibly the most parsimonious conclusion here is that costs of reproduction are absent, or simply small, so we also give that explanation some thought (L221-224 and L315-331).

      We are disappointed by the suggestion that we have dismissed complicating factors that could prevent detection of a trade-off, as this was not our intention. We were aiming to highlight that what we have demonstrated to be an apparent trade-off can be explained through other mechanisms, and that the trade-off between clutch size and survival is not as strong in driving within-species variation in clutch size as previously assumed. We have added further discussion to our revised manuscript to make this clear and give readers a better understanding of the complexity of factors associated with life-history theory, including the addition of a decision tree (Figure 5).

      • Reproductive output is optimised for lifetime reproductive success and so the consequences of being pushed off the optimum for one breeding attempt are not necessarily detectable in survival but in future reproductive success (and, therefore, lifetime reproductive success).

      We agree this is a valid point, which is mentioned in our manuscript in terms of alternative stages at which the costs of reproduction might be manifested (L316-320). We would also like to highlight that, in our simulations, the change in clutch size (and subsequent survival cost) was assumed for the lifetime of the individual, for this very reason.

      • The analyses include some species that hatch broods simultaneously and some that hatch sequentially (although this information is not explicitly provided (see below)). This is potentially relevant because species which have been favoured by selection to set up a size asymmetry among their broods often don't even try to raise their whole broods but only feed the biggest chicks until they are sated; any added chicks face a high probability of starvation. The first point this observation raises is that the expectation of more chicks= more cost, doesn't hold for all species. The second more general point is that the very existence of the sequential hatching strategy to produce size asymmetry in a brood is very difficult to explain if you reject the notion of a trade-off.

      We agree with the reviewer that the costs of reproduction can be absorbed by the offspring themselves, and may not be equal across offspring (we also highlight this at L317-318 in the manuscript). However, we disagree that for some species the addition of more chicks does not equate to an increase in cost, though we do accept this might be less for some species. This is, however, difficult to incorporate into a sensible model as the impacts will vary among species and some species do also exhibit catch-up growth. So, without a priori knowledge on this, we kept our model simple to test whether the effect on parental survival (often assumed to be a strong cost) can explain the constraint on reproductive effort, and we conclude that it does not.

      We would also like to make clear that we are not rejecting the notion of a trade-off. Our study shows evidence that a trade-off between survival and reproductive effort probably does not drive within-species variation in clutch size. We do explicitly say this throughout our manuscript, and also provide suggestions of other areas where a trade-off may exist (L317-320). The point of our study is not whether trade-offs exist or not; it is whether there is a generalisable across-species trend for a trade-off between reproductive effort and survival – the most fundamental trade-off in our field but for which there is a lack of conclusive evidence within species. We believe the addition of Figure 5 to our revised manuscript also makes this more evident.

      • For your standard, pair-breeding passerine, there is an expectation that costs of raising chicks will increase linearly with clutch size. Each chick requires X feeding visits to reach the required fledge weight. But this is not the case for species which lay precocious chicks which are relatively independent and able to feed themselves straight after hatching - so again the relationship of care and survival is unlikely to be detectable by looking at the effect of clutch size but again, it doesn't mean there isn't a trade-off between breeding and survival.

      Precocial birds still provide a level of parental care, such as protection from predators. Though we agree that the level of parental care in provisioning food (and in some cases in all parental care given) is lower in precocial than altricial birds, this would only make our reported effect size for manipulated birds an underestimate. Again, we would like to draw the reviewer’s attention to the fact that we did detect a trade-off in manipulated birds and we do not suggest that trade-offs do not exist. The argument the reviewer suggests here does not hold for unmanipulated birds, as we found that birds that naturally lay larger clutch sizes have higher survival.

      • The costs of raising a brood to adulthood for your standard pair-breeding passerine is bound to be extreme, simply by dint of the energy expenditure required. In fact, it was shown that the basal metabolic rate of breeding passerines was at the very edge of what is physiologically possible, the human equivalent being cycling the Tour de France (Nagy et al. 1990). If birds are at the very edge of what is physiologically possible, is it likely that clutch size is under weak selection?

      If birds are at the very edge of what is physiologically possible, then indeed it would necessarily follow that if they increase the resources allocated to one area then expenditure in another area must be reduced. In many studies, however, the overall brood mass is increased when chicks are added and cared for in an experimental setting, suggesting that birds are not operating at their limit all the time. Our simulations show that if individuals increase their clutch size, the survival cost of reproduction counterbalances the fitness gained by increasing clutch size and so there is no overall fitness gain to producing more offspring. Therefore, selection on clutch size is constrained to the within-species level. We do not say in our manuscript that clutch size is under weak selection – we only ask why variation in clutch size is maintained if selection always favours high-producing birds.
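      This counterbalancing logic can be shown with a toy calculation (invented parameter values, not those of our actual simulations): if annual parental survival declines linearly with clutch size at a sufficient rate, the expected number of breeding attempts shrinks exactly as per-attempt output grows, leaving lifetime fitness flat across clutch sizes.

```python
# Toy illustration (invented parameters, not the manuscript's simulation):
# lifetime fitness = offspring per attempt x expected number of attempts,
# where annual parental survival declines linearly with clutch size.
def lifetime_fitness(clutch_size, base_survival=0.5, cost_per_egg=0.1):
    # Assumed survival cost: each egg above the mean clutch of 5 lowers
    # annual survival by cost_per_egg (and each egg below raises it).
    survival = base_survival - cost_per_egg * (clutch_size - 5)
    # Expected breeding attempts under geometric survival: 1 / (1 - s).
    expected_attempts = 1.0 / (1.0 - survival)
    return clutch_size * expected_attempts

# With this cost, s = 1 - 0.1 * clutch, so fitness = clutch / (0.1 * clutch),
# i.e. 10 for every clutch size: the survival cost exactly cancels the gain.
for c in (4, 5, 6):
    print(c, round(lifetime_fitness(c), 6))
```

      Under any weaker assumed cost the largest clutch would always win, which is the pattern our simulations rule out.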

      • Variation in clutch size is presented by the authors as inconsistent with the assumption that birds are under selection to lay the Lack clutch. Of course, this is absurd and makes me think that I have misunderstood the authors' intended point here. At any rate, the paper would benefit from more clarity about how variable clutch size has to be before it becomes a problem for optimality in the authors' view (lines 84-85; line 246). See Perrins (1965) for an exquisite example of how beautifully great tits optimise clutch size on average, despite laying between 5-12 eggs.

      We thank the reviewer for highlighting that our manuscript may be misleading in places; however, we are unsure which part of our conclusions the reviewer is referring to here. The question we pose, “Why don’t all birds produce a clutch size at the population optimum?”, is central to the decades-long field of life-history theory. Why is variation maintained? As the reviewer outlines, there is extensive variability, with some birds laying half of what other birds lay.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Title: while the costs of reproduction are possibly important in shaping optimal clutch size, it is not clear what you can say about it given that you do not consider clutch / brood size effects on fitness prospects of the offspring.

      We have expanded on our discussion of how some costs may be absorbed by the offspring themselves. However, a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. We have focussed on the relationship between reproductive effort and survival because it is given the most weight in the field in terms of driving intra-specific variation in clutch size. We have altered our title to show we focus on the survival costs specifically: “The optimal clutch size revisited: separating individual quality from the parental survival costs of reproduction”.

      (2) L.11-12: I agree that this is true for birds, but this is phrased more generally here. Are you sure that that is justified?

      The trade-off between survival and reproductive effort has largely been tested experimentally through brood manipulations in birds, as these provide a good system in which to test the costs and benefits of increasing parental effort. The work in this area has provided theory beyond passerine birds, the most commonly manipulated group, extending to across-taxa theories. We are unaware of any study or studies providing evidence that the reproduction/survival trade-off is generalisable across multiple species in any taxon. As such, we do believe this sentence is justified. One example is the lack of a consistent negative genetic correlation in populations of fruit flies, which has also been hailed as a lack-of-cost paradigm. Furthermore, some mutants that live longer do so without a cost to reproduction.

      (3) L.13-14: Not sure what you mean with this sentence - too much info lacking.

      We have added some detail to this sentence.

      (4) L.14: it is slightly awkward to say 'parental investment and survival' because it is the survival effect that is usually referred to as the 'investment'. Perhaps what you want to say is 'parental effort and survival'?

      We have replaced “parental investment” with “reproductive effort”.

      (5) L.15: you can omit 'caused'. Compared to control treatment or to reduced broods? Why not mention effects or lack thereof of brood reduction? And it would be good to also mention here whether effects were similar in the sexes.

      Please see our methodology, where we state that we use clutch size as a continuous variable (we do not compare to control or reduced broods but include the absolute value of offspring in a logistic regression). The effects of a brood reduction are drawn from the same regression and so are opposite in sign. Though we appreciate that the detail here is lacking to fully comprehend our study, we would like to highlight that this is the abstract, and details are provided in the main text.

      (6) L. 15: I am not sure why you write 'however', as the finding that experimental and natural variation have opposite effects is in complete agreement with what is generally reported in the literature and will therefore surprise no one that is aware of the literature.

      We use “however” to highlight the change in direction of the effect size from the results in the previous sentence. We also believe that ours is the first study to provide a quantitative estimate of this effect; previous work is largely theoretical. The reviewer states that this is what is generally reported, but it is not reported in all cases, as some relationships between reproductive effort and survival are negative (for the quality measurement, in correlational space, see Figure 1).

      (7) L.16: saying 'opposite to the effect of phenotypic quality' seems difficult to justify, as clutch size cannot be equated with phenotypic quality. Perhaps simply say 'natural variation in clutch size'? If that is what you are referring to.

      Please note we are referring to effect sizes here, that is, the survival effect of a change in clutch size. By phenotypic quality we are referring to the fact that we find higher parental survival when natural clutch sizes are higher. It is not the case that we refer to quality only as having a higher clutch size; this is explicitly stated in the sentence you refer to. We have changed “effect” to “effect size” to highlight this further.

      (8) L.18: why do you refer to 'parental care' here? Brood size is not equivalent to parental care.

      Brood size manipulations are used to manipulate parental care. The effect on parental survival is expected to be incurred because of the increase in parental care. We have changed “parental care” to “reproductive effort” to reduce the number of terms we use in our manuscript.

      (9) L.18-19: suggest to tone down this claim, as this is no more than a meta-analytic confirmation of a view that is (in my view) generally accepted in the field. That does not mean it is not useful, just that it does not constitute any new insight.

      We are unaware of any other study that provides generalisable across-species evidence for opposite effects of quality and the costs of reproduction. The work in this area is also largely theoretical and is yet to be supported experimentally, especially in a quantitative fashion. It surprises us that the reviewer considers there to be general acceptance in the field, rather than acceptance grounded in rigorous testing of hypotheses, made possible by meta-analysis, the current gold standard in our field.

      (10) L.21: what does 'parental effort' mean here? You seem to use brood size, parental care, parental effort, and parental investment interchangeably but these are different concepts. Daan et al (1990, Behaviour), which you already cite, provide a useful graph separating these concepts. Please adjust this throughout the manuscript, i.e. replace 'reproductive effort' with wording that reflects the actual variable you use.

      We have not used the phrase “parental effort” in this sentence. We agree these are different concepts but in this context are intertwined. For example, brood size is used to manipulate parental care as a result of increased parental effort. We do agree the manuscript would benefit from keeping terminology consistent throughout the manuscript and have adjusted this throughout.

      (11) L.23: perhaps add 'in birds' somewhere in this sentence? Some reference to the assumptions underlying this inference would also be useful. Two major assumptions being that birds adjusted their effort to the manipulation as they would have done had they opted for a larger brood size themselves, and that the costs of laying and incubating extra eggs can be ignored. And then there is the effect that laying extra eggs will usually delay the hatch date, which in many species reduces reproductive success.

      Though our study does exclusively use birds, birds have been used to test the survival/reproduction trade-off because they present a convenient system in which to experimentally test this. The conclusions from these studies have a broader application than in birds alone. We believe that although these details are important, they are not appropriate in the abstract of our paper.

      (12) L.26: how is this an explanation? It just repeats the finding.

      We intend to refer to all interpretations from all results presented in our manuscript. We have made this more clear by adjusting our writing.

      (13) L.27: I do not see this point. And 'reproductive output' is yet another concept, that can be linked to the other concepts in the abstract in different ways, making it rather opaque.

      We have changed “reproductive output” to “reproductive effort”.

      (14) L.33: here you are jumping from 'resources' to 'energetically' - it is not clear that energy is the only or main limiting resource, so why narrow this down to energy?

      We do not say energy is the only or main limiting resource. We simply highlight that reproduction is energetically demanding and so, intuitively, a trade-off with a highly energetically demanding process would be the focal place to observe a trade-off. We have, though, replaced “energetically” with “resource”.

      (15) L.35-36: this is new to me - I am not aware of any such claims, and effects on the residual reproductive value could also arise through effects on future reproduction. The authors you cite did not work on birds, or (in their own study systems) presented results that as far as I remember warrant such a general statement.

      The trade-off between reproduction and survival is seminal to the disposable soma theory, proposed by Kirkwood. Though Kirkwood’s work was largely not focussed on birds, it had fundamental implications for the field of evolutionary ecology because of the generalisable nature of his proposed framework. In particular, it has had wide-reaching influence on how the biology of aging is interpreted. The readership of the journal here is broad, and our results have implications for that field too. The work of Kirkwood (many of the papers on this topic have over 2000 citations each) has been perhaps overly influential in many areas, so a link to how that work should be interpreted is highly relevant. If the reviewer is interested in this topic the following papers by one of the co-authors and others could be of interest, some of which we could not cite in the main manuscript due to space considerations:

      https://www.science.org/doi/pdf/10.1126/sciadv.aay3047

      https://agingcelljournal.org/Archive/Volume3/stochasticity_explains_non_genetic_inheritance_of_lifespan/

      https://pubmed.ncbi.nlm.nih.gov/21558242/

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.13444

      https://www.nature.com/articles/362305a0

      https://www.cell.com/trends/ecology-evolution/fulltext/S0169-5347(12)00147-4

      https://www.cell.com/cell/pdf/S0092-8674(15)01488-9.pdf

      https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-018-0562-z

      (16) L.42: this could be preceded with mentioning the limitations of observational data.

      We have added detail as to why brood manipulations are a good test for trade-offs and so this is now inherently implied.

      (17) L.42-43: why?

      We have added detail to this sentence.

      (18) L.45: do any of the references cited here really support this statement? I am certain that several do not - in these this statement is an assumption rather than something that is demonstrated. It may be useful to look at Kate Lessell's review on this that appeared in Etologia, I think in the 1990's. Mind however that 'reproductive effort' is operationally poorly defined for reproducing birds - provisioning rate is not necessarily a good measure of effort in so far as there are fitness costs.

      We have updated the references to support the sentence.

      (19) L.47: Given that you make this statement with respect to brood size manipulations in birds, it seems to me that the paper by Santos & Nakagawa is the only paper you should cite here. Given that you go on to analyze the same data it deserves to be discussed in more detail, for example to clarify what you aim to add to their analysis. What warrants repeating their analysis?

      Please first note that our dataset includes Santos & Nakagawa and additional studies, so it is not accurate to say we analyse the same data. Furthermore, we believe our study has implications beyond birds alone and so believe it is appropriate to cite the papers that do support our statement. We have added details to the methods to explicitly state what data were gathered from Santos & Nakagawa (it was only used to find the appropriate literature; data were re-extracted and re-analysed in a more appropriate way) and, separately, how we gathered the observational studies (see L352-381).

      (20) L.48: There are more possible explanations to this, which deserve to be discussed. For example, brood size manipulations may not have been that effective in manipulating reproductive effort - for example, effects on energy expenditure tend to be not terribly convincing. Secondly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Thirdly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      Please see our response to this comment in the public reviews.

      Out of interest and because the reviewer mentioned “energy expenditure” specifically: There are studies that show convincing effects of brood size manipulation on parental energy expenditure. We do agree that there are also studies that show ceilings in expenditure. We therefore disagree that they “tend to be not terribly convincing”. Just a few examples:

      https://academic.oup.com/beheco/article/10/5/598/222025 (Figure 2)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.12321 (Figure 1)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1046/j.1365-2656.2000.00395.x (but ceiling at enlarged brood).

      (21) L.48, "or, alternatively, that individuals may differ in quality": how do you see that happening when brood size is manipulated, and hence 'quality' of different experimental categories can be assumed to be approximately equal? This point does apply to observational studies, so I assume that that is what you had in mind, but that distinction should be clear (also on line 54).

      We have made it more clear that we determine if there are quality effects separate to the costs of reproduction found using brood manipulation studies.

      (22) L.50: Drent & Daan, in their seminal paper on "The prudent parent" (1980, Ardea) were among the earliest to make this point and deserve to be cited here.

      We have added this citation.

      (23) L.51, "relative importance": relative to what? Please be more specific.

      We have adjusted this sentence.

      (24) L.54: Vedder & Bouwhuis (2018, Oikos) go some way towards this point and should be explicitly mentioned with reference to the role of 'quality' effects on the association between reproductive output and survival.

      We have added this reference.

      (25) L.55: can you be more specific on what you want to do exactly? What you write here could be interpreted differently.

      We have added an explicit aim after this sentence to be more clear.

      (26) L.57: Here also a more specific wording would be useful. What does it mean exactly when you say you will distinguish between 'quality' and 'costs'?

      We have added detail to this sentence.

      (27) L.62: it should be clearer from the introduction that this is already well known, which will indirectly emphasize what you are adding to what we know already.

      We would argue this is not well known and has only been theorised but not shown empirically, as we do here.

      (28) L.62: you equate clutch size with 'quality' here - that needs to be spelled out.

      We refer to quality as the positive effect size of survival for a given clutch size, not clutch size alone. We appreciate this is not clear in this sentence and have reworded.

      (29) L.64: this looks like a serious misunderstanding to me, but in any case, these inferences should perhaps be left to the discussion (this also applies to later parts of this paragraph), when you have hopefully convinced readers of the claims you make on lines 62-63.

      We are unsure of what the reviewer is referring to as a misunderstanding. We have chosen this format for the introduction to highlight our results. If this is a problem for the editors we will change as required.

      (30) L.66: quantitative comparison of what?

      Comparison of species. We have changed the wording of this sentence.

      (31) L.67-69: this should be in the methods.

      We have used a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (32) L.74-88: suggest to (re)move this entire paragraph, presenting inferences in such an uncritical manner before presenting the evidence is inappropriate in my view. I have therefore refrained from commenting on this paragraph.

      We have chosen a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (33) L.271, "must detail variation in the number of raised young": it is not sufficiently clear what this means - what does 'detail' mean in this context? And what does 'number of raised young' mean? The number hatched or raised to fledging?

      We have now made this clear.

      (34) L271, "must detail variation in the number of raised young": looking at table S4, it seems that on the basis of this criterion also brood size manipulation studies where details on the number of young manipulated were missing are excluded. I see little justification for this - surely these manipulations can, for example, be coded as having the average manipulation size in the meta-analysis data set, thereby contributing to tests of manipulation effects, but not to variation within the manipulation groups?

      We have done in part what the reviewer describes. We are specifically interested in the manipulation size, so we required this to compare effect sizes across species and categories, a key advance of our study and outlined in many places in our manuscript. Note, however, that we only need comparative differences, and have used clutch size metrics more generally to obtain a mean clutch size for a species, as well as SD where required. Please also note that our supplement details exactly why studies were excluded from our analysis, as is the preferred practice in a meta-analysis.

      (35) L.271, "referred to as clutch size": the point of this simplification is not clear to me, while it is clearly confusing - why not refer to 'brood size' instead?

      Brood size and clutch size can be used interchangeably here because, in the observational studies, individuals vary in the number of eggs produced, whereas for brood manipulations the change obviously happens after hatching, where “brood size” is perhaps the more appropriate term; we wanted, however, to simplify the terminology used. We use clutch size throughout because the aim of our study is to determine why individuals differ in the number of offspring they produce, and clutch size is the most appropriate term for that.

      (36) L.280: according to the specified inclusion criteria (lines 271/272) these studies should already be in the data set, so what does this mean exactly?

      The selection criteria refer to whether a given study should be kept for analysis or not; they do not describe how studies were found. Please see lines 361-378 for details on how we found studies (additional details are also in the Supplementary Methods).

      (37) L.281: the use of 'quality' here is misleading - natural variation in clutch or brood size will have multiple causes, variation in phenotypic quality of the individuals and their environment (territories) is only one of the causes. Why not simply refer to what you are actually investigating: natural and experimental variation in brood size.

      We disagree: our study aims to separate quality effects from the costs of reproduction, and we use observational studies to test for quality differences, though we make no inference about the mechanisms. We do not imply that the environment causes differences in quality, but rather that, to directly compare observational and experimental groups, they should contain similar species. So, to be clear again, quality refers to the positive covariation of clutch size with survival. We feel that we explain this clearly in our study’s rationale and have also improved our writing in several sections to avoid any confusion (see responses to earlier comments by the three reviewers).

      (38) L.283, "in most cases": please be exact and say in xx out of xx cases.

      We have added the number of studies for each category here.

      (39) L.283-285: presumably readers can see this directly in a table with the extracted data?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We believe the data are too large to include as a table in the main text and are not essential to understanding the paper. We do, however, believe all readers should have access to this information if they wish, and so it is publicly available.

      (40) L.293: there does not seem to be a table that lists the included studies and effect sizes. It is not uncommon to find major errors in such tables when one is familiar with the literature, and absence of this information impedes a complete assessment of the manuscript.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data are too large to include as a table in the main text and are not essential in understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      (41) L.293: from how many species?

      We have added this detail.

      (42) L.296, "longevity": this is a tricky concept, not usually reported in the studies you used, so please describe in detail what data you used.

      We have removed longevity as we did not use these data in the current version of the manuscript.

      (43) L. 298: again: where can I see this information?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process, but we believe it was not passed on to the reviewers.

      (44) L. 304, "we used raw data": I assume that for the majority of papers the raw data were not available, so please explain how you dealt with this. Or perhaps this applies to a selection of the studies only? Perhaps the experimental studies?

      By raw data, we mean the absolute value of offspring in the nest. We have changed the wording of this sentence and added detail on how we proceeded when the absolute value of offspring was not reported for brood manipulation studies (L393-397).

      (45) L.304: When I remember correctly, Santos and Nakagawa examined effects of reducing and enlarging brood size separately, which is of importance because trade-off curves are unlikely to be linear and whether they are or not has major effects on the optimization process. But perhaps you tackled this in another way? I will read on.....

      You are correct that Santos & Nakagawa compared brood increases and reductions to controls separately. Note that this only partially accounts for non-linearity, and it does not take into account the severity of the change in brood size. By using a logistic regression of absolute clutch size, as we have done, we are able to directly compare brood manipulation studies with observational studies. Please see Supplementary Methods lines 11-12, where we have added additional detail as to why our approach is beneficial in this analysis.
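      The comparability argument can be sketched minimally (the survival counts below are invented for illustration, not drawn from our dataset): regressing survival on the absolute number of offspring yields a per-egg log-odds slope that is comparable across studies whatever the severity of a manipulation, which an enlarged-versus-control contrast cannot provide.

```python
import math

# Hypothetical parental-survival counts by absolute clutch size (invented
# numbers, not from our dataset). A logistic regression on the absolute
# number of offspring gives one per-egg log-odds slope that is directly
# comparable across studies, regardless of manipulation severity.
data = {4: (70, 30), 6: (55, 45), 8: (40, 60)}  # clutch: (survived, died)

def log_odds(survived, died):
    return math.log(survived / died)

# Crude two-point slope estimate: change in survival log-odds per extra
# egg (a full logistic fit would use maximum likelihood over all points).
slope = (log_odds(*data[8]) - log_odds(*data[4])) / (8 - 4)
print(round(slope, 3))  # negative slope: a survival cost per added egg
```

      With these made-up counts the slope is negative; in our observational category the analogous slope comes out positive, which is the quality signature discussed above.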

      (46) L.319: what are you referring to exactly with "for each clutch size transformation"?

      We refer to the raw, standardised and proportional clutch size transformations. We have added detail here to be more clear.

      (47) L.319: is there a cost of survival? Perhaps you mean 'survival cost'? This would be appropriate for the experimental data, but not for the observational data, where the survival variation may be causally unrelated to the brood size variation, even if there is a correlation.

      We have changed “cost of survival” to “effect of parental survival”. We only intend to imply causality for the experimental studies. For observational studies we do not suggest that increasing clutch size is causal for increasing survival, only correlative (and hence we use the phrase “quality”).

      (48) L.320: please replace "parental effort" with something like 'experimental change in brood size'.

      We have changed “parental effort” to “reproductive effort”.

      (49) L.321: due to failure of one or more eggs to hatch, and mortality very early in life, before brood sizes are manipulated, it is not likely that say an enlargement of brood size by 1 chick can be equated to the mean clutch size +1 egg / chick. For example, in the Wytham great tit study, as re-analysed by Richard Pettifor, a 'brood size manipulation' of unmanipulated birds is approximately -1, being the number of eggs / chicks lost between laying and the time of brood size manipulation. Would this affect your comparisons?

      We agree these are important factors in determining what a clutch/brood size actually is for a given individual/pair, as this can vary from egg laying to fledging. However, we do not believe that accounting for this (if it were possible to do so) would significantly affect our conclusions, as observational studies are comparable in that these birds would also likely see early-life mortality of their offspring. It is also possibly the case that parents already factor in this loss, and so a brood manipulation still changes the parental care effort an individual has to incur.

      (50) L.332: instead of "adjusted" perhaps say 'mean centred'?

      We have implemented this suggestion.

      (51) L.345: this statement surprised me, but is difficult to verify because I could not locate a list of the included studies. However, to my best knowledge, most studies reporting brood size manipulation effects on parental survival had this as their main focus, in contrast to your statement.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We supplied this information when we submitted our manuscript, and again during the review process, but we believe it was not passed on to the reviewers by the journal. We regret that the reviewer was impeded by this unfortunate communication failure; we did our best to make the data available to the reviewers during the initial review process.

      (52) L.361-362: this seems a realistic approach from an evolutionary perspective, but we know from the jackdaw study by Boonekamp that the survival effect of brood size manipulation in a single year is very different from the survival effect of manipulating as in your model, i.e. every year of an individual's life the same manipulation. For very short-lived species this possibly does not make much difference, but for somewhat longer-lived species this could perhaps strongly affect your results. This should be discussed, and perhaps also explored in your simulations?

      Note that the Boonekamp study does not separate whether the survival effects are additive or multiplicative. As such, we do not know whether the survival effects for a single-year manipulation are just small and hard to detect, or whether the survival effects are multiplicative. Our simulations assumed that the brood enlargement occurred every year throughout their lives. We have added some text to the discussion on the point you raise.

      (53) L.360: what is "lifetime reproductive fitness"? Is this different from just "fitness"?

      We have changed “lifetime reproductive fitness” to “lifetime reproductive output”.

      (54) L.363: when you are interested in optimal clutch size, why not also explore effects of reducing clutch size?

      As we find that a reduction in clutch size leads to a reduction in survival (for experimental studies), we already know that these individuals would have a reduced fitness return compared to reproducing at their normal level, and so we would not learn anything from adding this to our simulations. The interest in using clutch size enlargements is to find out why an individual does not produce more offspring than it does, and the answer is that it would gain no fitness benefit (unless its combination of clutch size and survival rate lies outside the bounds of what is observed in the wild).
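      The intuition behind this point – that, at the effect sizes observed, extra offspring bring little or no net fitness return once the survival cost is included – can be sketched with a toy lifetime-reproductive-output calculation. This is a simplified illustration with hypothetical numbers (the function name and the assumed survival decrement per extra egg are ours for illustration), not the simulation code used in the manuscript:

```python
# Toy sketch: expected lifetime reproductive output (LRO) for an adult with a
# constant annual clutch size and a constant annual survival probability s.
# The expected number of breeding seasons is the geometric sum 1 / (1 - s),
# so LRO = clutch size / (1 - s). All numbers below are hypothetical.

def lifetime_reproductive_output(clutch_size, annual_survival):
    """Expected lifetime offspring = clutch size x expected breeding seasons."""
    return clutch_size / (1.0 - annual_survival)

baseline = lifetime_reproductive_output(4, 0.5)    # 4 eggs, 50% survival -> 8.0
# Hypothetical survival cost of raising one extra egg: survival 0.5 -> 0.375.
enlarged = lifetime_reproductive_output(5, 0.375)  # 5 eggs, 37.5% survival -> 8.0
print(baseline, enlarged)
```

      With these illustrative values the survival cost exactly cancels the benefit of the extra egg, i.e. the fitness landscape is flat around the natural clutch size.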

      (55) Fig.1 - using 'parental effort' in the y-axis label is misleading, suggest to replace with e.g. "clutch or brood size". Using "clutch size" in the title is another issue, as the experimental studies typically changed the number of young rather than the number of eggs.

      We have updated the figure axes to say “clutch size” rather than “parental effort”. Please see response to comment 35 where we explain our use of the term “clutch size” throughout this manuscript.

      (56) L.93 - 108: I appreciate the analysis in Table 1, in particular the fact that you present different ways of expressing the manipulation. However, in addition, I would like to see the results of an analysis treating the manipulations as factor, i.e. without considering the scale of the manipulation. This serves two purposes. Firstly, I believe it is in the interest of the field that you include a detailed comparison with the results of Santos & Nakagawa's analysis of what I expect to be largely the same data (manipulation studies only - for this purpose I would also like to see a comparison of effect size between the sexes). Secondly, there are (at least) two levels of meta-analysis, namely quantifying an overall effect size, and testing variables that potentially explain variation in effect size. You are here sort of combining the two levels of analysis, but including the first level also would give much more insight in the data set.

      Our main intention here was to improve on how the same hypothesis was approached by Santos & Nakagawa. We did this by improving the analysis (on a per-egg basis) and by adding additional studies (i.e. more data). In this process mistakes are corrected (we re-extracted all data and did not copy anything across from their dataset, which was used simply to ensure we found the same papers), and more recent data were added, including studies missed by Santos & Nakagawa. This means that a comparison with Santos & Nakagawa becomes somewhat irrelevant, apart perhaps from technical reasons, i.e. pointing out mistakes or limitations in certain approaches. We would not be able to pinpoint these problems clearly without considering the whole dataset, yet Santos & Nakagawa only had a small subset of the data that were available to us. In short, meta-analysis is an iterative process: similar questions are inevitably analysed multiple times and updated, following basic meta-analytic concepts and Cochrane principles. Except where there is a major flaw in a prior dataset or approach (as we sometimes found and highlighted in our own work, e.g. Simons, Koch & Verhulst 2013, Aging Cell), a comparison of the kind the reviewer suggests distracts from the biology. With the dataset made available, others can make these comparisons if required. On the sex difference, we provide a comparison of effect sizes separated between both sexes and mixed sex in Table S2 and Figure S1.

      (57) L.93 - 108: a thing that does not become clear from this section is whether experimentally reducing brood size affects parental survival similarly (in absolute terms) as enlarging brood size. Whether these effects are symmetric is biologically important, for example because of its effect on clutch size optimization. In the text you are specific about the effects of increasing brood size, but the effect you find could in theory be due entirely to brood size reduction.

      We have added detail to make it clear that a brood reduction is simply the opposite trend. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than fitting nonlinear models would. For many datasets there is not a range of chicks added over which a non-linear relationship could be estimated. The question also remains of what shape this non-linear relationship should take, which is hard to determine a priori.

      We have added some discussion on this to our manuscript (L278-282), in response to an earlier comment.

      (58) L.103-107: this is perhaps better deferred to the discussion, because other potential explanations should also be considered. For example, there have been studies suggesting that small birds were provisioning their brood full time already, and hence had no scope to increase provisioning effort when brood size was experimentally increased.

      We agree this is a discussion point but we believe it also provides an important context for why we ran our simulations, and so we believe this is best kept brief but in place. We agree the example you give is relevant but believe this argument is already contained in this section. See line 121-123 “...suggesting that costs to survival were only observed when a species was pushed beyond its natural limits”.

      (59) L.103-107: this discussion sort of assumes that the results in Table 1 differ between the different ways that the clutch/brood size variation is expressed. Is there any statistical support for this assumption?

      We are unsure of what the reviewer means here exactly. Note that for each of the clutch size transformations, the experimental and observational effect sizes are significantly different and opposite in direction. For the proportional clutch size transformation, experimental and observational studies are each separately significantly different from 0.

      (60) L.104: at this point, I would like to have better insight into the data set. Specifically, a scatter plot showing the manipulation magnitude (raw) plotted against control brood size would be useful.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We supplied this information when we submitted our manuscript, and again during the review process, but we believe it was not passed on to the reviewers by the journal.

      Thank you for this suggestion: such a plot also usefully illustrates how manipulations are relatively stronger for species with smaller clutches, in line with our interpretation of the result presented in Figure 2. We have added Figure S1, which shows the strength of manipulation compared to the species average.

      (61) L. 107: this seems a bold statement - surely you can test directly whether effect size becomes disproportionally stronger when manipulations are outside the natural range, for example by including this characterization as a factor in the models in Table 1.

      It is hard to define exactly what the natural range is here, so it is not easy to factorise objectively, which is why we chose not to do this. However, it is clear that for species with small clutches the manipulation itself is often outside the natural range. Thank you for your suggestion to include a figure for this, as it makes clear that manipulations are stronger in species with smaller clutches. We attribute this to species being forced outside their natural range. We consider our wording makes it clear that this is our interpretation of our findings, and we therefore do not think this is a bold statement, especially as it fits with how we interpret our later simulations.

      (62) Fig.3, legend: the term 'node support' does not mean much to me, please explain.

      Node support is a value given in phylogenetic trees to indicate the confidence in a branch. In this case, values are given as a percentage and so translate to how many times out of 100 the estimation of the phylogeny recovers the same branching. Our values are low, as we have relatively few species in our meta-analysis.

      (63) Fig.3: it would be informative when you indicate in this figure whether the species contributed to the experimental or the observational data set or both.

      We have added into Fig 3 whether the species was observational, experimental or both.

      (64) L.139: the p-value refers to the interaction between species clutch size and treatment (observational vs. experimental), but it appears that no evidence is presented for the correlation being significant in either observational or experimental studies.

      We agree that our reporting of the effect size could be misinterpreted and have added detail here. The statistic provided shows that the slopes are significantly different between observational and experimental studies, implying there are differences between the slopes of small and large clutch-laying species.

      (65) L.140: I am wondering to what extent these correlations, which are potentially interesting, are driven by the fact that species average clutch size was also used when expressing the manipulation effect. In other words, to what extent is the estimate on the Y-axis independent from the clutch size on the X-axis? Showing that the result is the same when using survival effect sizes per manipulation category would considerably improve confidence in this finding.

      We are unsure what the reviewer means by “per manipulation category”. Please also note that we used a logistic regression to calculate our effect sizes of survival, given a unit increase in reproductive effort. So, for example, if a population contained birds that lay 2, 3 or 4 eggs, then, provided that the number of birds which survived and died in each category did not change, changing the number of eggs raised to 10, 11 or 12, respectively, would leave our effect size the same. In this way, our effect sizes are independent of the species’ average clutch size.
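      The shift-invariance claimed here can be checked with a minimal sketch. Below we fit the least-squares slope of the log-odds of survival against clutch size for grouped data – a linearised stand-in for the full logistic regression, with hypothetical survival proportions (the numbers and helper names are ours for illustration):

```python
import math

def logit(p):
    """Log-odds of a survival proportion p."""
    return math.log(p / (1.0 - p))

def slope(clutch_sizes, survival_props):
    """Least-squares slope of logit(survival) on clutch size."""
    ys = [logit(p) for p in survival_props]
    mx = sum(clutch_sizes) / len(clutch_sizes)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(clutch_sizes, ys))
    den = sum((x - mx) ** 2 for x in clutch_sizes)
    return num / den

# Hypothetical survival proportions for three clutch-size categories.
survival = [0.6, 0.5, 0.4]

s_small = slope([2, 3, 4], survival)     # birds laying 2, 3 or 4 eggs
s_large = slope([10, 11, 12], survival)  # same survival, 8 more eggs each
print(s_small, s_large)                  # identical slopes
```

      The same invariance holds for a maximum-likelihood logistic fit: adding a constant to the predictor changes only the intercept, never the slope.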

      (66) L.145: when I remember correctly, Santos & Nakagawa considered brood size reduction and enlargement separately. Can this explain the contrasting result? Please discuss.

      You are correct, in that Santos & Nakagawa compared reductions and enlargements to controls separately. However, we found some mistakes in the data extracted by Santos & Nakagawa that we believe explain the differences in our results for sex-specific effect sizes. We do not feel that highlighting these mistakes in the main text is fair, useful or scientifically relevant, as our approach is to improve the test of the hypothesis.

      (67) L.158-159: looking at table S2 it seems to me you have a whole range of estimates. In any case, there is something to be said for taking the estimates for females because it is my impression (and experience) that clutch size variation in most species is a sex-linked trait, in that clutch size tends to be repeatable among females but not among males.

      We agree that, in many cases, the female is the one that ultimately decides on the number of chicks produced. We did also consider using female effect sizes only, however, we decided against this for the following reasons: (1) many of the species used in our meta-analysis exhibit biparental care, as is the case for many seabirds, and so using females only would bias our results towards species with lower male investment; in our case this would bias the results towards passerine species. (2) it has also been shown that, as females in some species are operating at their maximum of parental care investment, it is the males who are able to adjust their workload to care for extra offspring. (3) we are ultimately looking at how many offspring the breeding adults should produce, given the effort it costs to raise them, and so even if the female chooses a clutch size completely independently of the male, it is still the effort of both parents combined that determines whether the parents gain an overall fitness benefit from laying extra eggs. (4) some studies did not clearly specify male or female parental survival and we would not want to reduce our dataset further.

      (68) L.158-168: please explain how you incorporated brood size effects on the fitness prospects of offspring, given that it is a very robust finding of brood size manipulation studies that this affects offspring growth and survival.

      We would argue this is near-impossible to incorporate into our simulations. It is unrealistic to suggest that incorporating offspring growth into our simulations would add insight, as a change in offspring number rarely affects all offspring in the nest equally, and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth, for example, and so it is likely that increased sibling competition from added chicks alters offspring growth trajectories, rather than absolute growth as the reviewer suggests. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. It would be interesting, however, to explore this further using estimates from the literature, but this is beyond our current scope and, in our view, would not be very accurate. It would also be interesting to explore how large the effect on offspring would need to be to constrain effect size strongly; such work would be more theoretical. The point of our simple fitness projections here is to aid interpretation of the quantitative effect size we estimated.

      (69) L.163: while I can understand that you select the estimate of -0.05 for computational reasons, it has enormous confidence intervals that also include zero. This seems problematic to me. However, in the simulations, you also examined the results of selecting -0.15, which is close to the lower end of the 95% C.I., which seems worth mentioning here already.

      Thank you for this suggestion. Yes, indeed, our range was chosen based on the CI, and we have now made this explicit in the manuscript.

      (70) L.210: defined in this way, in my world this is not what is generally taken to be a selection differential. Is what you show not simply scaled lifetime reproductive success?

      As far as we are aware, a selection differential is the relative change between a given group and the population mean, which is what we have done here. We appreciate this is a slightly unusual context in which to place this, but it is logical to consider the individuals who produce more offspring as carrying a potential mutation for higher productivity. We believe that “selection differential” is the best terminology for the statistic we present, and we detail in our methodology how we calculate it. We have adjusted this sentence to be more explicit about what we mean by selection differential.
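      As a minimal numeric sketch of the statistic we mean (all values are hypothetical, and the function name is ours for illustration):

```python
# Selection differential as we use the term: the relative change in lifetime
# reproductive output (LRO) of a focal group (e.g., carriers of a hypothetical
# mutation for higher productivity) compared with the population mean.

def selection_differential(group_fitness, population_mean_fitness):
    return (group_fitness - population_mean_fitness) / population_mean_fitness

population_mean_lro = 8.0   # hypothetical population mean LRO
mutant_lro = 8.2            # hypothetical LRO of higher-productivity individuals

print(selection_differential(mutant_lro, population_mean_lro))  # ~0.025, i.e. 2.5%
```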

      (71) L.177-180: is this not so because these parameter values are closest to the data you based your estimates on, which yielded a low estimate and hence you see that here also?

      We are unsure of what exactly the reviewer means here. The effect sizes for our exemplar species were predicted from each combination of clutch size and survival rate. Note that we used a range of effect sizes, higher than that estimated in our meta-analysis, to explore a large parameter space and that these same conclusions still hold.

      (72) L.191-194: these statements are problematic, because based on the assumption that an increase in brood size does not impact the fitness prospects of the offspring, and we know this assumption to be false.

      Though we appreciate that some cost is often absorbed by the offspring themselves, we are unaware of any evidence that these costs are substantial and large enough to drive within-species variation in reproductive effort, though for some specific species this may be the case. However, in terms of explaining a generalisable, across-species trend, the fitness costs incurred by a reduction in offspring quality are unlikely to be significantly larger than the survival costs to reproduce. We also find it highly unlikely the cost to fitness incurred by a reduction in offspring quality is large enough to counter-balance the effect of parental quality that we find in our observational studies. We do also discuss other costs in our discussion.

      (73) L.205: here and in other places it would be useful to be more explicit on whether in your discussion you are referring to observational or experimental variation.

      We have added this detail to our manuscript. Do note that many of our conclusions are drawn from the combination of results of experimental and observational studies. We believe the addition of Figure 5 makes this clearer to the reader.

      (74) L.225: this may be true (at least, when we overlook the misuse of the word 'quality' here), but I would expect some nuance here to reflect that there is no surprise at all in this result as this pattern is generally recognized in the literature and has been the (empirical) basis for the often-repeated explanation of why experiments are required to demonstrate trade-offs. On a more quantitative level, it is worth mentioning the paper of Vedder & Bouwhuis (2017, Oikos) that essentially shows the same thing, i.e. a positive association between reproductive output and parental survival.

      We have added some discussion on this point, including the citation mentioned. However, we would like to highlight that our results demonstrate that brood manipulations are not necessarily a good test of trade-offs, as they fail to recognise that individuals differ in their underlying quality. Though we agree that this result should not necessarily be a surprising one, we have not found it to be the case that differences in individual quality are accepted as the reason that intra-specific clutch size variation is maintained – in fact, we find it is most commonly argued that when costs of reproduction are not identified, the costs must be elsewhere – yet we cannot find conclusive evidence that the costs of reproduction (wherever they lie) are driving intra-specific variation in reproductive effort. Furthermore, some studies in our dataset have reported negative correlations between reproductive effort and survival (see observational studies, Figure 1).

      (75) L.225-226: perhaps present this definition when you first use the term.

      We have added more detail to where we first use and define this term to improve clarity (L57-58).

      (76) L.227-228, "currently unknown": this statement surprised me, given that there is a plethora of studies showing within-population variation in clutch size to depend on environmental conditions, in particular the rate at which food can be gathered.

      We mean to question that if an individual is “high quality”, why is it not selected for? We have rephrased, to improve clarity.

      (77) L.231: this seems no more than a special case of the environmental effect you mention above.

      We think this is a relevant special case, as it constitutes within-individual variation in reproduction that is mistaken for between-individual variation. This is a common problem in our field that we feel needs addressing. Our study of quality captures only between-individual variation, and by highlighting this we show that there might not be any variation between individuals, but that this could come about fully (doubtful) or partly (perhaps likely) due to terminal effects.

      (78) L235-236: but apparently depending on how experimental and natural variation was expressed? Please specify here.

      We are not sure what results the reviewer is referring to here, as we found the same effect (smaller clutch laying species are more severely affected by a change in clutch size) for both clutch size expressed as raw clutch size and standardised clutch size.

      (79) L.237: the concept of 'limits' is not very productive here, and it conflicts with the optimality approach you apply elsewhere. What you are saying here can also be interpreted as there being a non-linear relationship between brood size manipulation and parental survival, but you do not actually test for that. A way to do this would be to treat brood size reduction and enlargement separately. Trade-off curves are not generally expected to be linear, so this would also make more sense biologically than your current approach.

      We have replaced “limits” with “optima”. We believe our current approach of treating clutch size as a continuous variable, regardless of manipulation direction, is the best approach, as it allows us to compare directly with observational studies and between species that use different manipulations (now nicely illustrated by the reviewer’s suggested Figure S1). Also note that transforming clutch size to a proportion of the mean allows us to account for the severity of the change in clutch size. We also do not believe that treating reductions and enlargements separately accounts for non-linearity: either we separate this into two linear relationships (one for enlargements and one for reductions), or we compare all enlargements/reductions to the control, as in Santos & Nakagawa 2012, which does not take into account the severity of the increase and which we would argue is worse at accounting for non-linearity. Furthermore, in the cases where the manipulation involved one offspring only, we also cannot account for non-linearity.

      (80) L.239: assuming birds are on average able to optimize their clutch size, one could argue that any manipulation, large or small, on average forces birds to raise a number of offspring that deviates from their natural optimum. At this point, it would be interesting to discuss in some detail studies with manipulation designs that included different levels of brood size reduction/enlargement.

      We agree with the reviewer that any manipulation changes an individual’s clutch size away from its own individual optimum, which we have argued also means brood manipulations are not necessarily a good test of whether a trade-off occurs naturally in the wild, as there could be interactions with quality – we have now edited the text to state this explicitly (L299-300).

      (81) L.242-244: when you choose to maintain this statement, please add something along the lines of "assuming there is no trade-off between number and quality of offspring".

      As explained above, though we agree that the offspring may incur some of the cost themselves, we are not aware of any evidence suggesting this trade-off is large enough to also drive intra-specific variation in clutch size across species. Furthermore, in the context here, a trade-off between number and quality of offspring would not change our conclusion – that the fitness benefit of raising more offspring is offset by the cost to survival. We have added detail on the costs incurred by offspring earlier in our discussion (L309-315). The addition of Figure 5 should help interpret these data.

      (82) L.253: instead of reference 30 the paper by Tinbergen et al in Behaviour (1990) seems more appropriate.

      We believe our current citation is relevant here but we have also added the Tinbergen et al (1990) citation.

      (83) L.253-254: such trade-offs may perfectly explain variation in reproductive effort within species if we were able to estimate cost-benefit relations for individuals. In fact, reference 29 goes some way to achieve this, by explaining seasonal variation in reproductive effort.

      We are unaware of any quantitative evidence that any combination of trade-offs explains intra-specific variation in reproductive effort, especially as a general across-species trend.

      (84) L.255: how does one demonstrate "between species life-history trade-offs"? The 'trade-off' between reproductive rate and survival we observe between species is not necessarily causal, and hence may not really be a trade-off but due to other factors - demonstrating causality requires some form of experimental manipulation.

      Between-species trade-offs are well established in the field, stemming from G.C. Williams’ seminal 1966 paper, and for example in r/K selection theory. It is possible to move from these correlations to testing for causation, and this is currently happening by introducing transgenes (genes from other species) that promote longevity into shorter-lived species (e.g., naked mole-rat genes into mice). As yet it is unclear what the effects on reproduction are.

      (85) L.256: it is quite a big claim that this is a novel suggestion. In fact, it is a general finding in evolutionary theory that fitness landscapes tend to be rather flat at equilibrium.

      It is important to note here that we simulate the effect size we found, and hence the novel suggestion is that, because the resulting fitness landscape is relatively flat, no directional selection is observed. We did not intend to suggest that our interpretation of flat fitness landscapes is itself novel. We have changed the phrasing of this sentence to avoid misinterpretation.

      (86) L.259: why bring up physiological 'costs' here, given that you focus on fitness costs? Do you perhaps mean fitness costs instead of physiological costs? Furthermore, here and in the remainder of this paragraph it would be useful to be more specific on whether you are considering natural or experimental variation.

      The cost of survival is a physiological cost incurred through reduced self-maintenance as a result of lower resource allocation. Survival is one arm of fitness; we feel it would be confusing here to talk about fitness costs, as we do not assess costs to future reproduction (which formed a large part of the critique offered by the reviewer). We would like to highlight that the aim of this manuscript was to separate costs of reproduction from the effects of quality, and this is why we have observational and experimental studies in one analysis, rather than analysed separately. Our conclusion that we have found no evidence that the survival cost of reproduction drives within-species variation in clutch size comes both from the positive correlation found in the observational studies and from the negligible fitness return estimates in our simulations. We therefore do not believe it is helpful to separate observational and experimental conclusions throughout our manuscript, as the point is that they are inherently linked. We hope that the addition of Figure 5 makes this clearer.

      (87) L.262: The finding that naturally more productive individuals tend to also survive better one could say is by definition explained by variation in 'quality', how else would you define quality?

      We agree, and hence we believe quality is a good term to describe individuals who perform highly in two different traits. Note that we also say the lack of evidence that trade-offs drive intra-specific variation in clutch size also potentially suggests an alternative theory, including intra-specific variation driven by differences in individual quality.

      Supplementary information

      (88) Table S1: please provide details on how the treatment was coded - this information is needed to derive the estimates of the clutch size effect for the treatments separately.

      We have added this detail.

      (89) Table S2: please report the number of effect sizes included in each of these models.

      We have added this detail.

      (90) Table S4: references are not given. Mentioning species here would be useful. For example, Ashcroft (1979) studied puffins, which lay a single egg, making me wonder what is meant when mentioning "No clutch or brood size given" as the reason for exclusion. A few more words to explain why specific studies were excluded would be useful. For example, what does "Clutch size groups too large" mean? It surprises me that studies are excluded because "No standard deviation reported for survival" - as the exact distribution is known when sample size and proportion of survivors is known.

      We have updated this table for more clarity.

      (91) Fig.S1: please plot different panels with the same scale (separately for observational and experimental studies). You could add the individual data points to these plots - or at least indicate the sample size for the different categories (female, male, mixed).

      We have scaled all panels to have the same y axis and added sample sizes to the figure legend.

      (92) Fig.S3: please provide separate plots for experimental and observational studies, as it seems entirely plausible that the risk of publication bias is larger for observational studies - in particular those that did not also include a brood size manipulation. At the same time, one can wonder what a potential publication bias among observational studies would represent, given that apparently you did not attempt to collect all studies that reported the relevant information.

      We have coloured the points for experimental and observational studies. Note that each study is an independent effect size and, therefore, the plot does not indicate whether multiple effect sizes (i.e., both experimental and observational) came from the same paper. As we detail in the paper and above in our reviewer responses, we searched for observational studies from the species used in the experimental studies to allow direct comparison between the observational and experimental datasets.

      Reviewer #2 (Recommendations For The Authors):

      I strongly recommend improving the theoretical component of the analysis by providing a solid theoretical framework before, from it, drawing conclusions.

      This, at a minimum, requires a statistical model and most importantly a mechanistic model describing the assumed relationships.

      We thank the reviewer for highlighting that our aims and methodology are unclear in places. We have added detail to our model and simulation descriptions and have improved the description of our rationale. We also feel the failure of the journal to provide code and data to the reviewers has not helped their appreciation of our methodology and use of data.

      Because the field uses the same wording for different concepts and different wording for the same concept, a glossary is also necessary.

      We thank the reviewer for raising this issue. During the revision of this manuscript, we have simplified our terminology or given a definition, and we believe this is sufficient for readers to understand our terminology.

      Reviewer #3 (Recommendations For The Authors):

      • The files containing information of data extracted from each study were not available so it has not been possible to check how any of the points raised above apply to the species included in the study. The ms should include this file on the Supp. Info as is standard good practice for a comparative analysis.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data is too large to include as a table in the main text and is not essential to understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      • For clarity, refer to 'the effect size of clutch size on survival" rather than simply "effect size". Figures 1 and 2 require cross-referencing with the main text to understand the y-axis.

      We have added detail to the figure legend to increase the interpretability of the figures.

      • Silhouettes in Figure 3 (or photos) would help readers without ornithological expertise to understand the taxonomic range of the species included in the analyses.

      We have added silhouettes into Figure 3.

      • Throughout the discussion: superscripts shouldn't be treated as words in a sentence so please add authors' names where appropriate.

      We have added author names and dates where required.

    1. https://web.archive.org/web/20240528070547/https://shkspr.mobi/blog/2023/05/the-limits-of-general-purpose-computation/

      Terence Eden (posted #2023/05/28) on the question of whether an app provider has a say in whether their code runs on your device, as opposed to me being in control of a device and deciding which code runs there. In this case, a bank disallows its app on rooted phones because of the risk profile attached to them. Interesting tension: my risk assessment and control over a general-purpose computation device versus the risk assessment of a service provider whose software is merely a conduit. I suspect the underlying issue is that such tensions need a conversation or negotiation to resolve, but in practice one party dictates the outcome based on a power differential (the bank controls your money, so it can set demands for your device, because you need to keep access to your account).

    1. User Interface: Also known as the presentation layer, it is responsible for all user interaction, handling the display of data, and processing inputs and interface events such as button clicks and text highlighting. Usually, this layer is implemented as a desktop application. For example, an academic system should provide a graphical interface for instructors to enter grades for their classes. The main element of this interface can be a form with two columns: student name and grade. The code implementing this form resides in the interface layer.

      User-facing features supporting interaction, display, and so on.
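      A minimal sketch of the separation described above (all names here are hypothetical, not from the book): the presentation layer only handles form events and delegates the grade logic to a lower layer.

```python
# Sketch of the layering described above (hypothetical names): the
# presentation layer handles form events; grade logic lives below it.

def record_grade(roster, student, grade):
    """Business layer: validate and store a grade (no UI concerns)."""
    if not 0 <= grade <= 100:
        raise ValueError("grade out of range")
    roster[student] = grade

def grade_form_submit(roster, rows):
    """Presentation layer: handle the two-column form's submit event.

    `rows` holds the (student name, grade) text the instructor typed in.
    """
    for student, grade_text in rows:
        record_grade(roster, student, int(grade_text))
    return roster

roster = grade_form_submit({}, [("Ana", "92"), ("Ben", "85")])
print(roster)  # -> {'Ana': 92, 'Ben': 85}
```

      Because the form code never touches the roster directly, the interface layer can be swapped (desktop form, web page) without changing the grade logic.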

    1. Summary of "Revised Report on the Propagator Model" by Alexey Radul and Gerald Jay Sussman

      Introduction

      • Main Problem: Traditional programming models hinder extending existing programs for new situations due to rigid commitments in the code.
      • Quote: "The most important problem facing a programmer is the revision of an existing program to extend it for some new situation."
      • Solution: The Propagator Programming Model supports multiple viewpoints and integration of redundant solutions to aid program extensibility.
      • Quote: "The Propagator Programming Model is an attempt to mitigate this problem."

      Propagator Programming Model

      • Core Concept: Autonomous machines (propagators) communicate via shared cells, continuously adding information based on computations.
      • Quote: "The basic computational elements are autonomous machines interconnected by shared cells through which they communicate."
      • Additivity: New contributions are seamlessly integrated by adding new propagators without disrupting existing computations.
      • Quote: "New ways to make contributions can be added just by adding new propagators."

      Propagator System

      • Language Independence: The model can be implemented in any programming language as long as a communication protocol is maintained.
      • Quote: "You should be able to write propagators in any language you choose."
      • Cell Operations: Cells support adding content, collecting content, and registering propagators for notifications on content changes.
      • Quote: "Cells must support three operations: add some content, collect the content currently accumulated, register a propagator to be notified when the accumulated content changes."

      Implementing Propagator Networks

      • Creating Cells and Propagators: Cells store data, while propagators compute based on cell data. Propagators are attached using d@ (diagram style) or e@ (expression style) for simpler cases.
      • Quote: "The cells' job is to remember things; the propagators' job is to compute."
      • Example: Adding two and three using propagators.
      • Quote: "(define-cell a) (define-cell b) (add-content a 3) (add-content b 2) (define-cell answer (e:+ a b)) (run) (content answer) ==> 5"
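      The cell-and-propagator protocol can be mirrored in a few lines of Python. This is a minimal sketch with hypothetical names (`Cell`, `propagator`), not the authors' Scheme-Propagators API; it implements the three cell operations and reproduces the paper's 3 + 2 example:

```python
# Minimal Python sketch of the propagator protocol (hypothetical names,
# not the Scheme-Propagators API). Cells remember; propagators compute.

class Cell:
    def __init__(self):
        self.content = None      # "nothing" until something is added
        self.neighbors = []      # propagators registered for notification

    def add_content(self, value):
        if value == self.content:
            return               # no new information, nothing to do
        if self.content is not None:
            raise ValueError("contradiction")
        self.content = value
        for alert in self.neighbors:
            alert()              # notify registered propagators

def propagator(inputs, output, fn):
    """Attach an autonomous machine that writes fn(inputs) into `output`."""
    def alert():
        vals = [c.content for c in inputs]
        if all(v is not None for v in vals):
            output.add_content(fn(*vals))
    for c in inputs:
        c.neighbors.append(alert)
    alert()

# Mirrors the paper's example: the answer cell ends up holding 5.
a, b, answer = Cell(), Cell(), Cell()
propagator([a, b], answer, lambda x, y: x + y)
a.add_content(3)
b.add_content(2)
print(answer.content)  # -> 5
```

      Because propagators only react to newly added content, another propagator contributing to `answer` could be attached later without disturbing this one, which is the additivity the model aims for.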

      Advanced Features

      • Conditional Network Construction: Delayed construction using conditional propagators like p:when and p:if to control network growth.
      • Quote: "The switch propagator does conditional propagation -- it only forwards its input to its output if its control is 'true'."
      • Partial Information: Cells accumulate partial information, which can be incrementally refined.
      • Quote: "Each 'memory location' of Scheme-Propagators, that is each cell, maintains not 'a value', but 'all the information it has about a value'."

      Built-in Partial Information Structures

      • Types: Nothing, Just a Value, Numerical Intervals, Propagator Cells, Compound Data, Closures, Truth Maintenance Systems, Contradiction.
      • Quote: "The following partial information structures are provided with Scheme-Propagators: nothing, just a value, intervals, propagator cells, compound data, closures, supported values, truth maintenance systems, contradiction."
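      As an illustration of one of these structures, numerical intervals merge by intersection, so a cell's knowledge can only narrow as propagators contribute. A minimal sketch, assuming intersection semantics (the `Interval` and `merge` names are illustrative, not the library's):

```python
# Sketch of interval merging as partial information (illustrative names,
# assuming intersection semantics): merging can only narrow an interval,
# so information in a cell accumulates monotonically.

from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float

def merge(a, b):
    lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
    if lo > hi:
        raise ValueError("contradiction: intervals do not overlap")
    return Interval(lo, hi)

# Two redundant estimates of the same quantity refine each other:
print(merge(Interval(3.0, 7.0), Interval(5.0, 10.0)))  # -> Interval(lo=5.0, hi=7.0)
```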

      Debugging and Metadata

      • Debugging: Scheme's built-in debugger aids in troubleshooting propagator networks. Metadata tracking for cells and propagators enhances debugging.
      • Quote: "The underlying Scheme debugger is your friend."
      • Metadata: Tracking names and connections of cells and propagators helps navigate and debug networks.
      • Quote: "Inspection procedures using the metadata are provided: name, cell?, content, propagator?, propagator-inputs, propagator-outputs, neighbors, cell-non-readers, cell-connections."

      Benefits of the Propagator Model

      • Additivity and Redundancy: Supports incremental additions and multiple redundant computations, enhancing flexibility and resilience.
      • Quote: "It is easy to add new propagators that implement additional ways to compute any part of the information about a value in a cell."
      • Intrinsic Parallelism: Each component operates independently, making the model naturally parallel and race condition-resistant.
      • Quote: "The paradigm of monotonically accumulating information makes [race conditions] irrelevant to the final results of a computation."
      • Dependency Tracking: Facilitates easier integration and conflict resolution via premises and truth maintenance.
      • Quote: "If the addition turns out to conflict with what was already there, it (or the offending old thing) can be ignored, locally and dynamically, by retracting a premise."

      Conclusion

      • Goal Achievement: The Propagator Model approaches goals of extensibility and additivity by allowing flexible integration and redundancy in computations.
      • Quote: "Systems built on the Propagator Model of computation can approach some of these goals."
      • The speaker initially describes the excitement and complexity of rendering a simple red triangle in Clojure using a Java library.

        • "I found a Java library that gave bindings I started playing around with it and after quite a bit of struggle I was able to get a red triangle to show up on the screen."
      • The creation of more complex shapes like the Sierpinski pyramid, which involves recursive geometric transformations, serves as an example to explore different programming abstractions.

        • "It's actually fairly straightforward and I'll switch to two dimensions for your convenience, you start with a triangle and you shrink it down to half its size and then you copy it twice."
      • The speaker introduces macros as a way to handle repetitive code by creating a reusable "Sierpinski macro" that can be nested to achieve different levels of detail.

        • "We can write ourselves a Sierpinski macro which takes a body which is really just anything and first we wrap it in a function and then we create a scoped transform which shrinks everything that we draw by half."
      • Despite their usefulness, macros have limitations in terms of composition and downstream flexibility, prompting the exploration of functions as a better abstraction.

        • "Macros are not necessarily our most composable sort of abstraction and this has two issues... there is no potential for downstream composition."
      • Functions, though still limited, allow for more composable and testable abstractions by introducing indirection and treating rendering operations as data.

        • "We take our draw triangle function and we call it now it's a renderer and now we have a render function which simply invokes it."
      • The speaker demonstrates how higher-order functions and mapping over data can create more flexible and testable code, moving from functions to data-centric approaches.

        • "We defined three of these that will offset and scale appropriately and now we Define a Sierpinski method which takes a list of shapes and returns a list of shapes which is three times larger with all of them sort of offset and scaled appropriately."
      • The comparison between data, functions, and macros emphasizes the generality and composability of data-centric approaches, albeit with the necessity of grounding them in executable code.

        • "Data doesn't do anything by itself we have to sort of eventually ground out in code that does something we have to wrap our data in functions and execute our functions somewhere within our code."
      • The speaker critiques taxonomies while acknowledging their utility in providing a framework for discussion and suggesting approaches.

        • "Taxonomies are not a way to perfectly model the world... but they give us both a vocabulary to talk about it and they give us sort of a predictiveness."
      • Applying these abstractions to more practical examples, like sequence transformations and transducers, illustrates the balance between abstraction and practical utility.

        • "We can use the double threaded arrow to structure it so that the flow of data is left to right rather than inside out."
      • The exploration of automaton theory and finite state machines (like Automat) showcases the power of data-driven approaches in providing flexibility and compositional capabilities beyond traditional regular expressions.

        • "Automaton Theory... there's a much richer set of tools there than I'd originally realized and I tried to encompass this in a library called Automat."
      • The discussion on backpressure and causality in asynchronous programming highlights the importance of managing side effects and execution order in concurrent systems.

        • "Backpressure here is this emergent property, because we have this structural relationship: one happens before the other, and one can cause the other not to happen."
      • The introduction of streams and deferreds in the Manifold library exemplifies a less opinionated approach to handling eventual values and asynchronous computation, allowing for greater flexibility in execution models.

        • "A stream is lossy, right? You take a message, it's gone. A deferred represents a single unrealized value, and you can consume it as many times as you like."
      • The concept of "let-flow" in Manifold, which optimizes concurrency by analyzing data dependencies, demonstrates advanced techniques for achieving optimal parallel execution.

        • "Let-flow will basically walk the entire let binding figure out what the data dependencies are and execute them in the correct order."
      • The speaker concludes by emphasizing the importance of understanding the different forces at play in software composition and encourages ongoing discussion to improve the ecosystem.

        • "This is what causes our ecosystem to flourish or dwindle right the degree to which all the disparate pieces can work together."
      • In the Q&A, the speaker briefly touches on the position of monads in the spectrum and compares Automat to other finite state machine libraries like Regal.

        • "Where do you put monads in your spectrum... somewhere between functions and data right."
        • "Automat... is meant to be sort of a combinator version of Regal effectively."
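      The data-centric Sierpinski step described earlier ("takes a list of shapes and returns a list of shapes which is three times larger") can be sketched in Python rather than the speaker's Clojure; the shape representation and corner offsets below are illustrative assumptions, not the talk's code:

```python
# Sketch of the talk's data-centric Sierpinski step (hypothetical shape
# representation, Python rather than the speaker's Clojure): one iteration
# maps each shape to three half-scale copies at the corner offsets.

def transform(shape, dx, dy):
    """Scale a shape's points by half, then offset them by (dx, dy)."""
    return {"points": [(x / 2 + dx, y / 2 + dy) for x, y in shape["points"]]}

def sierpinski(shapes):
    """One step: a list of shapes -> a list three times larger."""
    offsets = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5)]
    return [transform(s, dx, dy) for s in shapes for (dx, dy) in offsets]

triangle = {"points": [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]}
level2 = sierpinski(sierpinski([triangle]))   # nest for more detail
print(len(level2))  # -> 9
```

      Since the step is plain data in, data out, it composes by ordinary function nesting and is trivially testable, which is the generality the talk attributes to data-centric approaches.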
      • Computational Challenges:
      • We lack effective computational methods: "I don't think we actually have the foggiest idea how to compute very well."
      • Importance of fast, efficient processes: "it took only 100 milliseconds... we don't understand how to do."

      • Genomic Complexity:

      • Human genome's complexity and flexibility: "with a small change to it you make a cow instead of a person."
      • High-level language for biological processes is unknown: "what I'm interested in is the high-level language that's necessary to do things like that."

      • Programming Evolution:

      • Legacy of programming assumptions based on scarcity: "all of our sort of intuitions from being programmers have come from a time of assuming a kind of scarcity."
      • Current abundance of resources shifts the focus: "memory is free, computing is free."

      • Security and Correctness:

      • Traditional concerns of correctness and security are secondary: "people worry about correctness... is it the real problem? maybe... most things don't have to work."
      • Evolution and adaptability of code are crucial: "we spend all our time modifying existing code."

      • Programming Constraints:

      • Early decisions in programming constrain future changes: "we make decisions early in some process that spread all over our system."
      • Need for flexibility in modifying systems: "organize systems so that the consequences of decisions we make are not expensive to change."

      • Generic Operators and Extensions:

      • Dynamically extensible operations: "dynamically extend things while my program is running."
      • Symbolic algebra as an extension of arithmetic: "expand this arithmetic on functions... it's a classical thing people can do in algebra."
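      The "arithmetic on functions" extension can be sketched with Python's `functools.singledispatch` standing in for the talk's generic-operator system; this is an illustrative analogue, not Sussman's implementation:

```python
# Illustrative analogue of extensible generic operations, using Python's
# functools.singledispatch (not Sussman's system): `add` starts as ordinary
# arithmetic and is extended to functions while the program is running.

from functools import singledispatch
from types import FunctionType

@singledispatch
def add(a, b):
    return a + b                     # default case: ordinary arithmetic

@add.register(FunctionType)          # extension: (f + g)(x) = f(x) + g(x)
def _(a, b):
    return lambda x: add(a(x), b(x))

print(add(2, 3))                       # -> 5
h = add(lambda x: x * x, lambda x: 1)  # arithmetic on functions
print(h(3))                            # -> 10
```

      The key property is that the default case is never edited: new behavior is registered alongside it, so existing callers are undisturbed.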

      • Propagators and Parallelism:

      • Concept of propagators for parallel computation: "propagators are independent little stateful machines."
      • Parallelism and monotonic information merging: "we don't actually put values in these cells we put information about a value in a cell."

      • Truth Maintenance Systems (TMS):

      • Maintaining and improving data consistency: "truth maintenance systems... maintain the best estimates of what's going on."
      • Dependency-directed backtracking for efficient problem-solving: "automatically find for me the sub-world views that are consistent."

      • Historical and Educational Insights:

      • Historical evolution of computation: "when I started computing in 1961... the total amount of memory is probably about 10 kilobytes."
      • Educational gaps between theory and practical engineering: "what we taught the students wasn't at all what the students actually were expected to learn."

      • Vision for the Future:

      • Future computing systems must be inherently parallel, redundant, and flexible: "future... computers are so cheap and so easy to make... they can talk to each other and do useful things."
      • Importance of evolving current computational thinking: "we have to throw away our current ways of thinking if we ever expect to solve these problems."

      • Summary and Call to Action:

      • Main challenge is evolvability, not correctness: "problem facing us as computer engineers is not correctness it's evolvability."
      • Proposals include extensible operations and new architectural paradigms: "extensible generic operations... a more radical proposal is maybe there are freedoms that we can unlock by throwing away our idea of architecture."

      This outline captures the essential points and arguments presented, while providing specific quotes for reference.

      • Introduction

        • Jerry Sussman discusses the importance of flexible programming systems.
        • "I want to show the ways of making systems that have the property that you write this big pile of code, and all of a sudden it's: gee, I have a different problem I have to solve; why can't I use the same piece of code?"
      • Personal Background

        • Sussman's extensive experience as a programmer since the 1960s.
        • "I'm the oldest guy here probably and way back when I first started programming computers that's what they looked like."
      • Goals for Robust Systems

        • Desires systems that are generalizable, evolvable, and tough.
        • "What I want is robust systems that have the following property: they're generalizable in the sense that... they're evolvable in the sense that they can be adapted to new jobs without modification."
      • Critique of Programming 'Religions'

        • Emphasizes the need for diverse tools and approaches in programming.
        • "We have a lot of people who have religions about how to program... Each of which is good for some particular problems but not good for a lot of problems."
      • Body Plans in Engineering and Programming

        • Discusses the concept of body plans from biology and its application in engineering.
        • "Superheterodyne radio receiver... a body plan... that separates band selectivity from inter-channel selectivity."
      • Multiple Approaches in Problem Solving

        • Highlights the value of having multiple methods to solve the same problem, inspired by biology.
        • "There are two ways to make a frog... why don't we do that in programming?"
      • Biological Inspiration in Programming

        • Advocates learning from nature's solutions, such as the flexible human genome.
        • "One of the things nature does is it's very expensive... generates and tests... we don't do that very much in programming because it's expensive."
      • Generating and Testing in Programming

        • Discusses McCarthy's amb operator for non-deterministic search.
        • "McCarthy's amb operator... models a non-deterministic automaton."
      • Generic Operations and Extensibility

        • Explains the power of generic operations and their application in programming.
        • "Most people think generic operations... in fact, automatic differentiation is just a generic extension of arithmetic."
      • Risks and Benefits of Generic Extensions

        • Acknowledges the dangers of generic programming while emphasizing its power.
        • "Generic arithmetic... very dangerous... the only thing that scares me in programming is floating point."
      • Teaching and Practical Applications

        • Emphasizes the importance of teaching robust, flexible programming techniques.
        • "I've only given you three of them... I have hundreds of them, ways I understand to avoid programming yourself into a corner."
      • Conclusion

        • Discusses his books and teaching philosophy, integrating traditional and programming languages for clarity.
        • "I've written several books about this... using programming as the way of expressing the ideas, in addition to the traditional mathematics; this makes them unambiguous, clear, and therefore easy to read."
    1. Video summary [00:00:00] - [00:29:11]:

      This video presents a discussion on harassment and violence at school, led by Myriam Ilouz and Catherine Perelmutter. It covers the psychological impact of harassment, how it has evolved over time, and the challenges of identifying harassers in a school setting.

      Highlights:
      + [00:00:00] Introduction and context * Presentation of Myriam Ilouz, clinical psychologist * Importance of the topic of harassment at school
      + [00:01:00] Harassment at school * Impact of harassment on a child's introduction to social life * The suffering of victims and of harassers
      + [00:03:02] Changes in the nature of harassment * Increase in the intensity and violence of harassment * Use of modern media to harass
      + [00:10:02] Characteristics of harassers * Aggressiveness, intentional and repeated harm * Establishment of a socially asymmetric relationship
      + [00:20:01] Perversion and harassment * Harassment as a manifestation of narcissistic perversion * Difficulty of unmasking harassers and protecting victims
      + [00:27:02] Societal consequences of harassment * Impact on trust in schools and in society * Need for a solid education to prevent perversion

      Video summary [00:29:13] - [00:56:27]:

      The video addresses the problem of harassment and violence at school, focusing on changes in how children are raised, the impact of modern society on students' behaviour, and the challenges faced by teachers and parents. It stresses the importance of understanding law, frustration, and symbolic castration in education in order to prevent harassment.

      Highlights:
      + [00:29:13] The evolution of child-rearing * French psychoanalysis and its impact * The creation of "child kings" and the lack of clear rules * The importance of law and frustration in education
      + [00:32:35] The role of parents and teachers * The disappearance of authority and hierarchy * The need for an educational alliance between school and family * The importance of respecting teachers' authority
      + [00:39:01] School harassment and the justice system * Legal definition and forms of harassment * The impact of social media on harassment * The vulnerability of victims and the importance of prevention
      + [00:47:02] Measures and laws against harassment * Statistics and government plans * The right to education and to protection from violence * Initiatives to improve the school climate and prevent harassment

      Video summary [00:56:29] - [01:14:14]:

      This video addresses harassment and violence at school, focusing on the legal aspects of harassment in France, notably the changes introduced by the law of 2 March 2022. It explains the new provisions of the French penal code on school harassment, the applicable penalties, and the steps to take to prove harassment and obtain justice.

      Highlights:
      + [00:56:29] Legal framework for harassment * Discussion of the French penal code and the law of 2 March 2022 * Explanation of the articles on harassment * Importance of evidence and of legal procedures
      + [01:01:14] A specific court case * Review of a ruling by the juvenile court of Épinal * Analysis of the causal link between harassment and suicide * Mention of a possible appeal before the court of appeal
      + [01:07:58] A mother's testimony * A mother's account of the harassment suffered by her daughter * Difficulties encountered with support services and the justice system * Call for concrete action to support victims

  2. May 2024
    1. Résumé de la vidéo [00:00:00][^1^][1] - [00:26:57][^2^][2]:

      Cette vidéo présente le Conseil Économique, Social et Environnemental (CESE) en France, ses missions, sa composition et son impact sur la société. Le CESE est décrit comme un pont entre les citoyens et les pouvoirs publics, offrant une plateforme pour la démocratie participative et l'élaboration de politiques publiques.

      Points forts: + [00:00:00][^3^][3] Rôle et missions du CESE * Conseille le gouvernement et le Parlement * Favorise la démocratie participative * Évalue l'efficacité des politiques publiques + [00:06:29][^4^][4] Composition du CESE * 175 conseillers issus de divers secteurs * Représentation de la société civile organisée * Groupes d'intérêt et affinités variés + [00:14:15][^5^][5] Débats et propositions * Discussions sur des sujets d'actualité et de société * Interventions des membres sur des thématiques variées * Propositions pour améliorer la vie quotidienne + [00:21:05][^6^][6] Inégalités de genre, crise climatique et transition écologique * Analyse de l'impact du genre sur les questions écologiques * Vulnérabilité des femmes face aux crises * Rôle des femmes dans la promotion de la durabilité Résumé de la vidéo [00:26:59][^1^][1] - [00:53:06][^2^][2]:

      La vidéo présente une discussion sur les solutions pour construire une société durable et respectueuse de l'égalité de genre. Elle aborde l'écoféminisme, la mixité des métiers, et l'impact du changement climatique sur les femmes.

      Points forts: + [00:27:02][^3^][3] L'écoféminisme * Parallèle entre la domination de la nature et celle des femmes * Vision d'une société sans patriarcat ni domination * Importance de renouer avec le vivant + [00:29:17][^4^][4] Les stéréotypes de genre * Impact des stéréotypes dès l'enfance * Influence sur la vie et le rapport à la nature * Nécessité de promouvoir la mixité des métiers + [00:31:18][^5^][5] L'égalité de genre dans les politiques publiques * Lien entre égalité de genre et action pour le vivant * Intégration des réalités de genre dans les solutions climatiques * Importance de la diplomatie féministe et du financement des associations féministes + [00:38:04][^6^][6] La participation des femmes à la lutte environnementale * Femmes comme actrices majeures de la lutte pour l'environnement * Changement de paradigme pour valoriser leurs compétences * Connexion entre les questions sociales et environnementales Résumé de la vidéo [00:53:08][^1^][1] - [01:17:09][^2^][2]:

      La vidéo présente les solutions pour construire une société durable et respectueuse de l'égalité de genre, en se concentrant sur les impacts différenciés du changement climatique sur les femmes et les hommes. Elle souligne l'importance de l'intégration de l'égalité de genre dans les politiques environnementales et la nécessité d'une action concrète pour protéger les droits des femmes.

      Points forts: + [00:53:08][^3^][3] Introduction et quiz * Présentation des recommandations principales * Quiz interactif pour évaluer les connaissances sur l'égalité de genre * Importance de l'égalité de genre dans la gestion des catastrophes + [01:00:04][^4^][4] Impact différencié du changement climatique * Les femmes sont affectées de manière disproportionnée par les catastrophes climatiques * Les crises climatiques augmentent les violences envers les femmes * Nécessité de soutenir les projets portés par les femmes + [01:05:03][^5^][5] Intégration de l'égalité de genre dans les politiques * La diplomatie féministe de la France et ses implications * L'importance de l'évaluation des engagements internationaux * La sécurité des femmes déplacées par les changements climatiques + [01:14:00][^6^][6] Conséquences des activités industrialisées * Les pays riches sont responsables des crises climatiques * Les pays en développement sont les plus touchés * Appel à la protection juridique des migrants environnementaux Résumé de la vidéo [01:17:11][^1^][1] - [01:39:37][^2^][2] :

      Cette partie de la vidéo aborde les solutions pour construire une société durable et respectueuse de l'égalité de genre. Elle met en lumière l'intégration des questions de genre dans les politiques environnementales, l'importance de l'investissement public dans la transition écologique, et le rôle des collectivités territoriales et des entreprises dans la promotion de l'égalité de genre.

      Points forts : + [01:17:11][^3^][3] Intégration du genre dans la fiscalité environnementale * Éviter de renforcer les inégalités existantes * Corriger les inégalités à travers les investissements publics * Stratégie française pour l'énergie et le climat + [01:18:00][^4^][4] Objectifs transversaux d'écologie et d'égalité * Intégrer les objectifs d'écologie et de réduction des inégalités * Documenter avec des données spécifiques au genre * Chaque euro dépensé doit également bénéficier à l'égalité de genre + [01:20:04][^5^][5] Politique de mobilité et impact sur les femmes * Exemple de la promotion du vélo et ses conséquences sur l'espace public * Nécessité pour les collectivités de croiser les thématiques d'environnement et de genre * Politiques inclusives comme celles de la Ville de Genève + [01:24:04][^6^][6] Inégalités professionnelles dans les métiers verdissants * Sous-représentation des femmes dans les secteurs émetteurs de gaz à effet de serre * Importance de l'inclusion des femmes dans la transition écologique * Lever les obstacles à la participation des femmes dans ces métiers Résumé de la vidéo [01:39:39][^1^][1] - [02:03:26][^2^][2] : La vidéo aborde les solutions pour construire une société durable et respectueuse de l'égalité de genre. Elle met en lumière les défis et les préconisations du Conseil Économique, Social et Environnemental (CESE) en France, notamment en matière d'accueil des réfugiés, de politiques publiques, de biodiversité, de pollution et de participation démocratique.

      Points forts : + [01:40:00][^3^][3] Défis et réactions face à l'égalité de genre * Discussion sur les réactions négatives aux travaux sur le genre * Confusion entre genre masculin et masculinité * Importance de prévenir les réactions négatives + [01:41:06][^4^][4] Intégration des questions de genre dans les politiques environnementales * Lien entre biodiversité, pollution et inégalités de genre * Nécessité d'une approche intégrée et détaillée * Choix difficiles dans les axes de préconisation + [01:45:52][^5^][5] Accueil des réfugiés et prise en charge spécifique des femmes et des filles * Préconisation d'intégrer une jurisprudence dans le Code de l'entrée et du séjour des étrangers * Besoins spécifiques des femmes et des filles réfugiées * Importance de projets spécifiques et de soutien financier + [01:57:43][^6^][6] Importance des données ventilées par sexe pour les politiques publiques * Collecte de données pour mieux connaître et agir * Évaluation continue des outils existants * Nécessité d'améliorer l'index d'égalité professionnelle Résumé de la vidéo [02:03:29][^1^][1] - [02:29:31][^2^][2]:

      The video addresses solutions for building a sustainable society that respects gender equality. It underlines the importance of integrating women into green jobs, the need for feminist diplomacy, and the impact of climate change on women. It calls for better representation of women in political and environmental decision-making, and for the adoption of gender-sensitive public policies.

      Highlights:
      + [02:03:29] Impact of climate change on women
        * Importance of research into how impacts differ between the sexes
        * Need for better representation of women in green jobs
        * Call for feminist diplomacy and suitable public policies
      + [02:06:00] Gender justice and climate justice
        * Link between preserving the planet and the evolution of society
        * Gender justice as a central element of climate justice
        * Public policies must incorporate gender equality
      + [02:10:21] Women's role in fighting the climate crisis
        * Women are more vulnerable and more exposed to natural disasters
        * Need to recognise and promote women's innovations
        * Importance of gender equality for advancing public policies
      + [02:15:16] Integrating gender into the ecological transition
        * Gender inequalities exacerbate the impact of the climate crisis
        * Proposals for integrating gender into climate adaptation strategies
        * Valuing women's action and integrating them into decision-making
      + [02:22:14] Women's engagement in the ecological transition
        * Women must be major players in the fight against climate change
        * Need for a cross-cutting approach to climate and equality policies
        * Importance of collecting sex-specific data to inform policies
      + [02:27:00] Women's role in agriculture and organic production
        * The farming sector is evolving, with a growing number of women running farms
        * Challenges faced by women in adapting to climate change
        * Women's economic empowerment as a goal for social and climate justice

      Video summary [02:29:33] - [02:42:34]:

      This part of the video addresses solutions for building a sustainable society that respects gender equality. It stresses the importance of integrating gender specificities into national and international policies, particularly regarding the consequences of climate disruption for women. The video highlights the need to raise awareness among economic actors and support them on these issues, as well as to combat gender stereotypes in green jobs.

      Highlights:
      + [02:29:33] Integrating gender specificities
        * Importance in climate policies
        * Disproportionate impact on women
        * Solutions driven by women
      + [02:31:38] Awareness-raising and education
        * Importance of awareness-raising from school onwards
        * Combating gender stereotypes
        * Gender balance in green jobs
      + [02:34:15] Gendered data and public policies
        * Need for data to reduce inequalities
        * Commitment to new policies
        * Women's participation in decision-making
      + [02:36:28] Feminist diplomacy and sustainable development
        * Specificity of women's demands
        * Importance of education and training
        * Women's access to all professions

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02393

      Corresponding author(s): Katja Petzold

      1. General Statements [optional]

      We thank the reviewers for recognising the impact of our manuscript. The reviewers noted the novelty of the miRNA bulge structure, the importance of the three observed binding modes and their potential for use in future structure-based drug design, and the possible importance of the duplex release phenomenon. We are also thankful for the relevant and constructive feedback provided.

      Our responses to the comments are written point by point in blue, and any changes in the manuscript are shown in red.

      2. Description of the planned revisions

      In response to Reviewer 1 - major comment 2

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that, for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds.

      Since the effect appears to be specific to the miRNA, we would like to test whether it can be observed for miR-34a in a larger dataset. Therefore, we plan to transfect HEK293T cells with miR-34a and analyse the mRNA response via RNAseq. We will repeat the analysis shown above, using the predicted number of supplementary pairs to categorise the dataset into groups with or without the effect of supplementary pairing. We will then compare the three seed types within these groups.
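      As a purely illustrative sketch of the planned grouping analysis (the field names, the supplementary-pair cutoff, and the toy values below are hypothetical, not the actual pipeline or data):

```python
# Hypothetical sketch: group targets by predicted supplementary pairing and
# summarise mean repression (log2 fold change) per seed type. The cutoff of
# >= 4 supplementary pairs and all values below are illustrative only.
from collections import defaultdict
from statistics import mean

def compare_seed_types(targets, supp_cutoff=4):
    """targets: iterable of (seed_type, n_supp_pairs, log2fc) tuples.
    Returns {(supp_group, seed_type): mean log2 fold change}."""
    groups = defaultdict(list)
    for seed_type, n_supp, log2fc in targets:
        supp_group = "with_supp" if n_supp >= supp_cutoff else "no_supp"
        groups[(supp_group, seed_type)].append(log2fc)
    return {key: mean(vals) for key, vals in groups.items()}

# Toy data (not measurements): one value per target site.
toy = [
    ("8mer", 5, -1.2), ("7mer-m8", 6, -0.9), ("7mer-A1", 0, -0.3),
    ("8mer", 1, -1.0), ("7mer-m8", 0, -0.4), ("7mer-A1", 5, -0.5),
]
summary = compare_seed_types(toy)
```

      Comparing the three seed types within the "with_supp" and "no_supp" groups then mirrors the planned RNAseq analysis at a conceptual level.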

      In response to Reviewer 2 - minor comment 1, "why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?".

      We plan to investigate the upper band, which we hypothesise is a result of duplex release, using EMSA to ascertain whether the band height agrees with the size of the duplex.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      Evidence, reproducibility and clarity

      Sweetapple et al. Biophysics of microRNA-34a targeting and its influence on down-regulation

      In this study, the authors have investigated binding of miR-34a to a panel of natural target sequences using EMSA, luciferase reporter systems and structural probing. The authors compared binding within a binary and a ternary complex that included Ago2 and find that Ago2 affects affinity, strengthening weak binders and weakening strong binders. The affinity is, however, generally determined by binary RNA-RNA interactions also in the ternary complex. Luciferase reporter assays containing 12 different target sites that belong to one of three seed-match types were tested. Generally, affinity is a strong contributor to repression efficiency. Duplex release, a phenomenon observed for specific miRNA-target complementarities, seems to be more pronounced when high affinity within the binary complex is observed. Furthermore, the authors use RABS for structural probing either in a construct in CIS or binding by the individual miRNA in TRANS or in a complex with Ago2. They find pronounced asymmetric target binding and Ago2 does not generally change the binding pattern. The authors observe one specific structural group that was unexpected, which was mRNA binding with bulged miRNAs, which was expected to be sterically problematic based on the known structures. MD simulations, however, revealed that such structures could indeed form.

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allows for a broad correlation of target binding and repression strength, which is a well-thought-out and highly conclusive approach. However, there are a number of shortcomings that are summarized below.

      The manuscript is not easy to read and to follow for several reasons. First, many of the sub-Figures are not referenced in the text of the results section (1C, 1D, 2C, 4D), which is somewhat annoying. Figure 4A seems to be mis-labeled. Second, a lot of data is presented in suppl. Figures. It should be considered to move more data into the main text in order to make it easier for readers to evaluate and follow.

      Thank you for bringing this to our attention. We have now revised the figure references accordingly.

      We have relocated gel images of BCL2, WNT1, MTA2 and the control samples from Figures S3 and S4 to the main results (Figure 2A-B) to improve readability and provide controls and details that aid in clear understanding. Additionally, we have relocated panel C from Figure S6 to Figure 2C to enhance the clarity of our rationale for using polyuridine (pU) in our AGO2 binding assays.

      The updated figure is shown below, with changes to the legend marked in red.

      Figure 2. Binary and ternary complex binding affinities measured by EMSA. (A) Binary (mRNA:miR-34a) binding assays showing examples of BCL2, WNT1 and MTA2. (B) Ternary (mRNA:miR-34a-AGO2) binding assays showing examples of BCL2, WNT1, MTA2, and the three control targets PERFECT, SCRseed, and SCRall. The Cy5-labelled species is indicated with an asterisk (*). F indicates the free labelled species (miR-34a or mRNA), B indicates the binary complex, and T indicates the ternary complex. Adjacent titration points differ two-fold in concentration, with maximum concentrations stated at the top right. Adjacent titration points for MTA2 differed three-fold to assess a wider concentration range. In the ternary assay, miRNA duplex release from AGO2 was observed for, amongst others, BCL2, WNT1, PERFECT, and SCRseed (band indicated with B), while it was not observed for SCRall and MTA2. See Figures S3 and S4 for representative gel images for all targets. See Supplementary files 2 and 3 for all images and replicates. (C) Titrations with increasing miR-34a-AGO2 concentration against Cy5-labelled SCRall (left) or PNUTS (right) comparing the absence and presence of 20 μM polyuridine (pU) during equilibration. pU acted as a blocking agent, reducing nonspecific binding, as seen by the different KD,app values for SCRall and PNUTS after addition of 20 μM pU. Therefore, all final mRNA:miR-34a-AGO2 EMSAs were carried out in the presence of 20 μM pU. Labels are as stated above. (D) Individual binding profiles for each of the 12 mRNA targets assessed by electrophoretic mobility shift assay (EMSA). Each datapoint represents an individual experiment (n=3). Blue represents results for the binary complex, and green represents results for the ternary complex. Dotted horizontal lines represent the KD,app values, which are also stated in blue and green with standard deviations (units = nM).
Note that the x-axis spans from 0.1 to 100,000 in CCND1, MTA2 and NOTCH2, whereas the remaining targets span 0.1 to 10,000.
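      For context, KD,app values like those in panel D can be obtained by fitting the bound fraction against titrant concentration. Below is a minimal sketch assuming a simple single-site model with the titrant in excess; it is not necessarily the exact fitting procedure used in the manuscript, and the data are synthetic:

```python
# Minimal sketch of estimating an apparent KD from EMSA titration data,
# assuming fraction_bound = c / (KD + c) (single site, titrant in excess).
# The concentrations and "measured" fractions below are synthetic.
def fraction_bound(conc_nM, kd_nM):
    return conc_nM / (kd_nM + conc_nM)

def fit_kd(concs, fractions, kd_grid):
    """Least-squares grid search for the apparent KD (nM)."""
    def sse(kd):
        return sum((fraction_bound(c, kd) - f) ** 2
                   for c, f in zip(concs, fractions))
    return min(kd_grid, key=sse)

concs = [1, 3, 10, 30, 100, 300, 1000]            # nM, roughly 3-fold steps
fracs = [fraction_bound(c, 50.0) for c in concs]  # synthetic data, KD = 50 nM
kd_grid = [k / 10 for k in range(1, 10001)]       # 0.1 to 1000 nM
kd_app = fit_kd(concs, fracs, kd_grid)
```

      In practice a nonlinear least-squares fit (and, at high probe concentration, a quadratic depletion-corrected model) would replace the grid search, but the idea is the same.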

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We have revised our wording to recognise that more 8-mer sites would be required to draw a stronger conclusion based on this hypothesis. This hypothesis would be interesting to confirm in a larger dataset but is unfortunately outside of the scope of this paper.

      Our hypothesis also aligns with recent data from Kosek et al. (NAR 2023; Figure 2D), where SIRT1 with an 8mer and a 7mer-A1 seed was compared. Only the 7mer-A1 was sensitive to mutations in the central region or to switching all mismatches to WC pairs.

      Page 21 now states:

      "This result indicates that the impact of supplementary binding may be greater for targets with weaker seeds, as has been observed earlier in a mutation study of miR-34a binding to SIRT1 (Kosek et al., 2023), although a larger sample size would be needed to confirm this observation."

      Furthermore, we found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that, for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds. We therefore plan to test whether the effect can be observed for miR-34a in a larger dataset. We have outlined our preliminary data and planned experiments in Section 2 - description of the planned revisions.

      I did not understand why the CIS system shown in 4A is a good test case for miR-34a-target binding. It appears very unnatural and artificial. This needs to be rationalized better. Otherwise it remains questionable, whether these data are meaningful at all.

      Thank you for pointing out the need for clearer rationalisation.

      The TRANS construct, where the scaffold carries the mRNA targeting sequence, provides reactivity information for the mRNA side only, while the microRNA is bound within RISC, with the backbone protected by AGO2. Therefore, to gain information on the miR-34a side of each complex we used the CIS construct, which provides reactivity information from both the miRNA and mRNA. We used the miRNA and mRNA reactivities to calculate all possible secondary structures for the binary complex, and then compared these structures to the mRNA reactivity in TRANS to find which structure fitted the reactivity patterns observed in the ternary complex.

      We have included an additional statement in the manuscript to clarify this point on pages 12-13:

      "Two RNA scaffolds were used for each mRNA target; i) a CIS-scaffold: RNA scaffold containing both mRNA target and miRNA sequence separated by a 10 nucleotide non-interacting closing loop, and ii) a TRANS-scaffold: RNA scaffold containing only the mRNA target sequence, to which free miR-34a or the miR-34a-AGO2 complex was bound (Figure 4A). The CIS constructs therefore provided reactivity information on the miRNA side, which is lacking in the TRANS construct, and was used to complement the TRANS data."

      It may be worth noting that a non-interacting 10 nucleotide loop was inserted between the miRNA and mRNA of the CIS constructs, allowing the miRNA and mRNA strands to bind and release freely. The reactivity patterns of each mRNA:miRNA duplex were compared between CIS and TRANS, and showed similar base pairing (Figure 4D). Furthermore, we have previously compared the two scaffolds in our RABS methodology paper (Banijamali et al. 2022), where no differences were observed besides reduced end fraying in the CIS construct.

      For the TRANS experiments, only one specific scaffold structure is used. This structure might impact binding as well and thus at least one additional and independent scaffold should be selected for a generalized statement.

      For each construct, the potential for interaction with the scaffold was tested using the RNAstructure (Reuter & Mathews, 2010) package. Based on the results of this assessment, two different scaffolds were used for our TRANS experiments. The testing and use of scaffolds has now been clarified further on page 13:

      "The overall conformation of each scaffold with the inserted RNA was assessed using the RNAstructure (Reuter & Mathews, 2010) package to ensure that the sequence of interest did not interact with the scaffold. If any interaction was observed between the RNA of interest and the scaffold, then the scaffold was modified until no predicted interaction occurred. The different scaffolds and their sequence details are shown in supplementary information (Table S1)."

      We have previously examined the scaffold's effect on binding and structure during the development of the RABS method. We tested the same mRNA (SIRT1) in separate, independent scaffolds to verify the consistency of the results. An example of this can be found in the supplementary information (Figure S1a) of Banijamali et al. (2022).

      Generally, it would be nice to have some more information about the experiments also in the result section. Recombinant Ago2 is expressed in insect cells and re-loaded with miR-34a, luciferase reporters are transfected into tissue culture cells, I guess.

      We have now stated the cell types used for AGO2 expression and luciferase reporter assays in the results.

      On page 17 we have included:

      "Samples of each of the 12 mRNA targets, as well as miR-34a and AGO2, were synthesised in-house for biophysical and biological characterisation. Target mRNA constructs were produced via solid-phase synthesis while miR-34a was transcribed in vitro and cleaved from a tandem transcript (Feyrer et al., 2020), ensuring a 5' monophosphate group. AGO2 was produced in Sf9 insect cells."

      "To measure the affinity of each mRNA target binding to miR-34a, both within the binary complex (mRNA:miR-34a) and theternary complex (mRNA:miR-34a-AGO2), we optimised an RNA:RNA binding EMSA protocol to suit small RNA interactions. The protocol is loosely based on Bak et al. (2014)36, with major differences being use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and fluorescently labelled probes."

      Page 19:

      " We successfully tested various RNA backgrounds, including polyuridine (pU) and total RNA extract (Figure S6B) to block any unspecific binding. Ultimately, we supplemented our binding buffer with pU at a fixed concentration of 20 µM for the ternary assays to achieve the greatest consistency."

      Page 20:

      "Repression efficacy for the 12 mRNA targets by miR-34a was assessed through a dual luciferase reporter assay6. Target mRNAs were cloned into reporter constructs and transfected into HEK293T cells."

      Page 22:

      "To infer base pairing patterns and secondary structure for each of the 12 mRNA:miR-34a pairs, we used the RABS technique (Banijamali et al., 2023) with 1M7 as a chemical probe. All individual reactivity traces are shown in Figure S9. Reactivity of each of the 22 miR-34a nucleotides was assessed upon binding to each of the 12 mRNA targets within a CIS construct, containing both miR-34a and the mRNA target site separated by a non-interacting 10-nucleotide loop. The two RNAs can therefore bind and release freely within the CIS construct and reactivity information is collected from both RNA strands."

      In the first sentence of the abstract, Argonaute 2 should be replaced by Argonaute only since other members bind to miRNAs as well.

      Thank you for recognising this. It has now been corrected.

      Significance

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allows for a broad correlation of target binding and repression strength, which is a well-thought-out and highly conclusive approach. However, there are a number of shortcomings.

      We thank the reviewer for recognising the approach and impact of our work. In addition we thank the reviewer for identifying the need for further data to support our conclusions from the luciferase assays, which is something that we plan to address, as described in section 2.



      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: Sweetapple et al. took the approaches of EMSA, SHAPE, and MD simulations to investigate target recognition by miR-34a in the presence and absence of AGO2. Surprisingly, their EMSA showed that guide unloading occurred even with seed-unpaired targets. Although previous studies reported guide unloading, they used perfectly complementary guide and target sets. The authors of this study concluded that the base-pairing pattern of miR-34a with target RNAs, even without AGO2, can be applicable to understanding target recognition by miR-34a-bound AGO2.

      Major comments:

      (Page 11 and Figure S4) The authors pre-loaded miR-34a into AGO2 and subsequently equilibrated the RISC with a 5' modified Cy5 target mRNA. Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a) in the EMSA (guide unloading has been a long-standing controversy). However, they observed bands of the binary complex in Figure S4. The authors did not use ion-exchange chromatography. AGOs are known to bind RNAs nonspecifically on their positively charged surface. Is it possible that most miR-34a was actually bound to the surface of AGO2 instead of being loaded into the central cleft? This could explain why they observed the bands of the binary complex in EMSA.

      Thank you for mentioning this crucial point, which has been a focus of our controls. We have addressed this point in four ways:

      1. A salt wash during reverse IMAC purification.
      2. Separation of unbound RNA and proteins via SEC.
      3. Blocking non-specific interactions using polyuridine.
      4. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a specific ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient with buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM Imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM Imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendations by BioRad for their Profinity IMAC resins, where it is stated that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of the identity or chemical nature of the ions involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry, 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have a comparable ionic strength to the Flores-Jasso protocol.

      Our total ionic contributions were: 500 mM Na+, 550 mM Cl-, 50 mM Tris and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionised according to the pH of the buffer (pH 8) and their respective pKa values, but even considering only the sodium and chloride, the ionic strength should be comparable to that of the Flores-Jasso protocol.
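      The comparison can be made explicit with the standard formula I = ½ Σ cᵢzᵢ². The sketch below uses only the fully dissociated ions stated above, and treats the 500 mM potassium acetate elution of the Flores-Jasso protocol as fully dissociated; it illustrates the reasoning rather than forming part of either protocol:

```python
# Ionic strength I = 0.5 * sum(c_i * z_i**2), which depends only on ion
# concentration and charge, not on chemical identity. Partially ionised Tris
# and imidazole are conservatively omitted, as in the text above.
def ionic_strength(ions):
    """ions: iterable of (concentration in M, charge z). Returns I in M."""
    return 0.5 * sum(c * z ** 2 for c, z in ions)

imac_wash = [(0.500, +1), (0.550, -1)]      # 500 mM Na+, 550 mM Cl-
flores_jasso = [(0.500, +1), (0.500, -1)]   # 500 mM K+ and acetate-
I_imac = ionic_strength(imac_wash)          # 0.525 M
I_fj = ionic_strength(flores_jasso)         # 0.500 M
```

      Even on this conservative count, the IMAC wash matches or slightly exceeds the ionic strength of the Flores-Jasso elution step.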

      We have restated the buffer compositions below and written the methods section more explicitly to describe this:

      "Following dialysis, any precipitate was removed by centrifugation, and the resulting supernatant was loaded onto a IMAC buffer A-equilibrated HisTrap-Ni2+ column to remove TEV protease, other proteins, and non-specifically bound RNA. A linear gradient was employed using IMAC buffers A and B."

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown in Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acids nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogeneous RNA extract or homogeneous polyuridine (pU) to our EMSA buffer during equilibration. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved panel C of Figure S6 into the main results as Figure 2C, to highlight the novel conditions used and the important controls that needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that, using pU, we can reduce non-specific interactions between AGO2 and the RNAs present. Even so, duplex release still occurs despite this blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)", we would like to refer to three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. It is known that AGO recycling must occur; therefore, there must be some mechanism that enables release of miRNA from AGO2. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of the enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

      (Page 18 and Figure S5) Previous studies (De et al., Jo MH et al., Park JH et al.) reported guide unloading when they incubated a RISC with a fully complementary target. However, neither MTA2, CCND1, CD44, nor NOTCH2 can be perfectly paired with miR-34a (Figure 1A). Therefore, the unloading reported in this study is quite different from the previously reported works and thus cannot be explained by the previously reported logic. The authors need to explain the guide unloading mechanism that they observed. Otherwise, they might misinterpret the results of their EMSA and RABS of the ternary complex.

      The three aforementioned studies have reported unloading/duplex release. However, they did not report only fully complementary targets in this process.

      De et al. (2013) reported that "highly complementary target RNAs promote release of guide RNAs from human Argonaute2".

      Subsequently, Park et al. (2017) reported: "Strikingly, we showed that miRNA destabilization is dramatically enhanced by an interaction with seedless, non-canonical targets."

      A figure extracted from Figure 5 of Park et al. is shown below, illustrating the occurrence of unloading in the presence of seed mismatches at positions 2 and 3 (mm 2-3). Jo et al. (2015) also reported that binding lifetime was not affected by the number of base pairs in the RNA duplex.

      In addition to these three reports, a methodology paper focusing on miRNA duplex release was published recently titled "Detection of MicroRNAs Released from Argonautes" (Min et al., 2020).

      Therefore, we do believe that the previously observed microRNA release is similar to our observation. Here we also correlate it to structure and stability of the complex.

      (Page 20) The authors reported, "it is notable that the seed region binding does not appear to be necessary for duplex release." The crystal structures of AGO2 visualize that the seed of the guide RNA is recognized, whereas the rest is not, except for the 3' end captured by the PAZ domain. How do the authors explain the discrepancy?

      In this manuscript, we intend to present our observations of duplex release. There are many potential relationships between duplex release and AGO2 activity, which we do not have data to speculate upon. Previous studies, such as Park et al. (2017), have also observed non-canonical and seedless targets leading to duplex release, supporting our findings. Additionally, other publications including McGeary et al. (2019) report 3'-only miRNA targets, Lal et al. (2009) have documented seedless binding by miRNAs and its downstream biological effects, and Duan et al. (2022) show that a large number of let-7a targets are regulated through 3′ non-seed pairing.

      It is also possible that duplex release is not coupled to classical repression outcomes, and does not need to proceed by the seed, but instead regulates AGO2 recycling before AGO2 enters the quality control mode of recognising the formed seed.

      (Pages 22) The authors mentioned, "It follows that the structure imparted via direct RNA:RNA interaction remains intact within AGO2, highlighting the role of RNA as the structural determinant." A free guide and a target can start their annealing from any nucleotide position. In contrast, a guide loaded into AGO needs to start annealing with targets through the seed region. Additionally, the Zamore group reported that the loaded guide RNA behaves quite differently from its free state (Wee et al., Cell 2012). How do the authors explain the discrepancy?

      The key point we would like to emphasise is that AGO does not seem to alter the underlying RNA:RNA interactions. The bound state in the ternary complex reflects the structure established in the binary complex. We do not aim to claim a specific sequence of events, as this interpretation is not possible from our equilibrium data. Our data indicates that the protein is flexible enough to accommodate the RNA structure that is favoured in the binary complex. This hypothesis is further supported by our MD simulation, which demonstrates the accommodation of a miRNA-bulge structure within AGO2.

      Targets lacking seeds have been identified previously (McGeary et al. 2019, Park et al. 2017, Lal et al. 2009) and can bind to miRNA within AGO. Therefore, there must be a mechanism by which these targets can anneal within AGO, such as via sequence-independent interactions (as discussed in question 3).

      With respect to Wee et al. (2012): this study examined fly and mouse AGO2 and found considerable differences between the thermodynamic and kinetic properties of the two AGO2 species. Furthermore, they found different average affinities between the two species, with the fly AGO binding tighter than the mouse. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse.

      Below is an extract from Wee et al., (2012):

      "Our KM data and published Argonaute structures (Wang et al., 2009) suggest that 16-17 base pairs form between the guide and the target RNAs, yet the binding affinity of fly Ago2-RISC (KD = 3.7 {plus minus} 0.9 pM, mean {plus minus} S.D.) and mouse AGO2-RISC (KD = 20 {plus minus} 10 pM, mean {plus minus} S.D.) for a fully complementary target was comparable to that of a 10 bp RNA:RNA helix. Thus, Argonaute functions to weaken the binding of the 21 nt siRNA to its fully complementary target: without the protein, the siRNA, base paired from positions g2 to g17, is predicted to have a KD ∼3.0 × 10−11 pM (ΔG25{degree sign}C = −30.7 kcal mol−1). Argonaute raises the KD of the 16 bp RNA:RNA hybrid by a factor of > 1011."

      In the Wee et al. (2012) paper, affinity data on mouse and fly AGO2 was collected via filter binding assays, using a phosphorothioate linkage flanked by 2′-O-methyl ribose at positions 10 and 11 of the target to prevent cleavage. They then compared the experimentally determined mean KD and ΔG values for each species to predicted values of an RNA:RNA helix of 16-17 base-pairs. No comparison was made between individual targets, and no experimental data was collected for the RNA:RNA binding. The calculated energy values were made based on a simple helix without taking into account any possible secondary structure features. Considering the different AGO species, alternative experimental setup, modified nucleotides in the tested RNA, and the computationally predicted RNA values compared to the averaged experimental values, we believe there is considerable reason to observe differences compared to our findings.

      We have expanded our discussion on page 27 to the following:

      "An earlier examination of mRNA:miRNA binding thermodynamics by Wee and colleagues (2012) found that mouse and fly AGO2 reduce the affinity of a guide RNA for its target61. Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. The 2012 study reported different average affinities between the two AGO2 species, with the fly protein binding tighter the mouse. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse."

      The authors concluded that the range of binary complex affinities is constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. This may hold true for miR-34a, but it cannot be generalized. Other miRNAs need to be tested.

      That is true; we have now adjusted the wording to state this more clearly, as shown below. Testing of further miRNAs is likely content for future work by us and others.

      "Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders."

      Minor comments:

      (Figure S2) Why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?

      We believe this observation is also indicative of duplex release. At the time these activity assays were collected, we were not yet aware of the prevalence of duplex release and so did not test it further, assuming it might be due to transient interactions. We plan to investigate this via EMSA and have included it in the planned revisions (section 2).

      2.(Page 17) Does the Cy3 affect the interaction of the 3' end of miR-34 with AGO2?

      miR-34a-3'Cy5 was used for binary experiments only and the reverse experiment was conducted as a control (where Cy5 was located on the mRNA) (Figure S3b), showing no change in affinity/interaction when the probe was switched to the target. For ternary experiments the mRNA target was labelled on the 5' terminus, to make sure there was no interference with loading miR-34a into AGO2.

      A Cy3 labelled RNA probe (fully complementary to miR-34a) was used to detect miR-34a in northern blots, but AGO2 interaction is not relevant here under denaturing conditions.

      Otherwise, the 34-nt slicing probe had Cy3 on the 5 nt 3' overhang and should therefore not interact with AGO.

      1. Several groups reported that overproduced AGOs loaded endogenous small RNAs. The authors should mention that their purified AGO2 was not as pure as a RISC with miR-34a. Otherwise, readers might think that the authors used a specific RISC.

      We have now improved our explanation of the loading efficiency to make it more clear to the reader that our AGO2 sample was not fully bound by miR-34a, and that all concentrations refer to the miR-34a-loaded portion of AGO2. The following text can be found in the results on page 18:

      "The mRNA:miR-34a-AGO2 assay had a limited titration range, reaching a maximum miR-34a-AGO2 concentration of 268 nM due to a 5% loading efficiency (see Figure S2D for loading efficiency quantification). The total AGO2 concentration was thus 20-fold higher than the miR-34a-loaded portion. Further increase in protein concentration was prevented by precipitation. Weaker mRNA targets (CD44, CCND1, and NOTCH2) did not reach a saturated binding plateau within this range, leading to larger errors in their estimated KD,app values. However, reasonable estimation of the KD,app was possible by monitoring the disappearance of the free mRNA probe. Note that we refer to the miR-34a-loaded portion of AGO2 when discussing concentration values for all titration ranges. To ensure AGO2 binding specificity despite low loading efficiency, a scrambled control was used (SCRall; lacking stable base pairing with miR-34a or other human miRNAs according to the miRBase database57). SCRall showed no interaction with miR-34a-AGO2 (Figure 2B)."

      (Figure legend of Figure S5) Binding was assessed "by."

      Thank you for pointing this out; it is now fixed.

      (Page 17) It would be great if the authors could even briefly describe the mechanism by which the sodium phosphate buffer with magnesium does not disturb weaker interactions by citing reference papers.

      We have now added a supplementary methods section to our manuscript and included the description below on page 10:

      "We found that a more traditional Tris-borate-EDTA (TBE) buffer disrupted weaker RNA:RNA binding interactions (Supplementary Methods Figure M1). Borate anions form stable adducts with carbohydrate hydroxyl groups (James et al., 1996) and can form complexes with nucleic acids, likely through amino groups in nucleic bases or oxygen in phosphate groups (Stellwagen et al., 2000). This makes TBE unsuitable for assessment of RNA binding, particularly involving small RNA molecules, which typically have weaker affinities. We therefore adapted our buffer system to a sodium phosphate buffer supplemented with magnesium. Magnesium acts as a counterion to reduce electrostatic repulsion between the two negatively charged backbones by neutralisation (Misra et al., 1998)."

      We have also clarified the buffer adaptations in our results section on page 17:

      The protocol is loosely based on Bak et al. (2014)36, with the major differences being the use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and the use of fluorescently labelled probes. Original gel images and quantification are shown in Supplementary Figures S3 and S4. All KD,app values are shown in Supplementary Table 1 and represent the mean of three independent replicates.

      Figure M1. Comparison of Tris-borate EDTA (TBE) and sodium phosphate with magnesium (NaP-Mg2+) buffer systems for EMSA. Cy5-labelled miR-34a and unlabelled CD44 were equilibrated in the two different buffer systems, using the same titration range. No mobility shifts were observed in the TBE system, while clear binding shifts were observed in the NaP-Mg2+ system.

      6.(Page 22) The authors cited Figure 4C in the sentence, "Comparison between CIS and TRANS ..." Is this supposed to be Figure 4D?

      The reviewer was correct in their assumption, and this has now been corrected.

      7.(Figure 6) Readers would appreciate it if the guide and target were colored in red and blue. The color codes have been used in most papers reporting AGO structures. The current color codes are opposite.

      We have now adjusted the colour schemes throughout the manuscript, and Figure 6 has been modified to the following:

      __"Figure 6. The miRNA-bulge structure is readily accommodated by AGO2 as shown by molecular dynamics simulation. __Panel (A) displays a snapshot of the all-atom MD simulation of miR-34a (red) and NOTCH1 (blue) in AGO2. The NOTCH1:miR-34a duplex is shown with AGO2 removed for clarity and is rotated 90{degree sign} to show the miRNA bulge and bend in the duplex. This NOTCH1:miR-34a-AGO2 structure is compared with (B), which shows the crystal structure of miR-122 (orange) paired with its target (purple) via the seed and four nucleotides in the supplementary region (PDB-ID 6N4O17), and (C), which shows the crystal structure of miR-122 (orange) and its target (green) with extended 3' pairing, necessary for the TDMD-competent state (PDB-ID 6NIT19). AGO2 is depicted in grey, with the PAZ domain in green, and the N-terminal domain marked with N. The miRNA duplexes in (B) and (C) feature symmetrical 4-nucleotide internal loops, whereas the NOTCH1 structure in (A) has an asymmetrical miRNA bulge with five unpaired nucleotides on the miRNA side and a 3-nucleotide asymmetry."

      Significance

      This paper will have a significant impact on the field if seed-unpaired targets can indeed unload guide RNAs. The authors may want to validate their results very carefully.

      We thank the reviewer for recognising the significance of duplex release (or guide unloading) from AGO2. We agree that the observations should be tested rigorously and have outlined the actions we took to ensure validity in our AGO2 preparation.

      **Reviewer #3**

      Evidence, reproducibility and clarity (Required):

      In this manuscript, the authors use a combination of biochemical, biophysical, and computational approaches to investigate the structure-function relationship of miRNA binding sites. Interestingly, they find that AGO2 weakens tight RNA:RNA binding interactions, and strengthens weaker interactions.

      Given this antagonistic role, I wonder: shouldn't there be an 'average' final binding affinity? Furthermore, if I understand correctly, not many trends were observed to correlate binding affinity with repression, etc.

      Overall, there was no 'average' final binding affinity observed, as the binary assays had a much higher maximum (the NOTCH2 binary affinity was within the micromolar range), skewing the mean of the binary affinities to 657 nM, versus 111 nM for the ternary affinities. We also compared the variances of the binary and ternary affinity datasets using the F-test and found that F > F(critical one-tail); thus the variance of the two populations is unequal (the binary variance is significantly larger than the ternary).

      F-Test Two-Sample for Variances

      |                     | binary affinity | ternary affinity |
      |---------------------|-----------------|------------------|
      | Mean                | 657.3           | 110.971667       |
      | Variance            | 2971596.1       | 24406.4012       |
      | Observations        | 12              | 12               |
      | df                  | 11              | 11               |
      | F                   | 121.754784      |                  |
      | P(F<=f) one-tail    | 7.559E-10       |                  |
      | F critical one-tail | 2.81793047      |                  |
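      For transparency, this variance comparison can be reproduced from the summary statistics in the table (a minimal sketch using SciPy's F distribution; only the tabulated variances and sample sizes are used):

      ```python
      # Minimal reproduction of the two-sample F-test from the summary
      # statistics above (variances in nM^2; 12 observations per group).
      from scipy.stats import f

      var_binary, var_ternary = 2971596.1, 24406.4012
      df1 = df2 = 12 - 1

      F = var_binary / var_ternary       # ratio of variances, ~121.75
      p_one_tail = f.sf(F, df1, df2)     # one-tailed p-value, ~7.6e-10
      F_crit = f.ppf(0.95, df1, df2)     # critical value at alpha = 0.05, ~2.82

      assert F > F_crit  # binary variance significantly larger than ternary
      ```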

      We agree that the overall correlation between affinity and repression was not strong, although we found a stronger correlation within the miRNA-bulge group (Figure 5C and S7C). A larger sample size of miRNA bulge-forming duplexes would be needed to test the generalizability of this observation.

      Given the context of the study - whereby structure is being investigated as a contributing factor to the interaction between the miRNA and mRNA, I find it interesting that the authors chose to use MC-fold to predict the structures of the mRNA, rather than using an experimental approach to assess / validate the structures. Thirty-seven RNAs were assessed; I think even for a subset (the 12 that were focused on in the study), the secondary structure should be validated experimentally (e.g., by chemical probing experiments, which the research group has demonstrated expertise in over the last several years). The validation should follow the in silico folding approach used to narrow down the region of interest. It is necessary to know whether an energy barrier (associated with the mRNA unfolding) has to occur prior to miRNA binding; this could help explain some of the unexplained results in the study. Indeed, the authors mention that there are many variables that influence miRNA regulation.

      Indeed, experimentally validated structures offer valuable insights that cannot be obtained solely through sequence-based predictions. This is why we opted to employ our RABS method to experimentally evaluate the binary and ternary complex binding of our 12 selected targets (as depicted in Figures 4 and S9 and discussed in the text on pages 23-24). While we (in silico) assessed all 37 RNA targets that were experimentally confirmed at the time, selecting 12 to represent both biological and predicted structural diversity, it would have been impractical to experimentally pre-assess all the targets not included in the final selection. Our in-silico assessment was designed to narrow down the regions of interest and evaluate predicted secondary structures present. The pipeline is shown in Figure 1. Details of the code used in the in-silico analysis are provided in Supplementary File 1.

      Regarding the energy of unfolding of the mRNA, our constructs comprised the isolated binding sites; thus, the effects of surrounding mRNA interactions were removed. We compared our affinities to ΔG as well as MFE and have now included this analysis in Figure S8A. Additionally, we have included the following text on pages 27-28 of the discussion:

      "Gibbs free energy (G), which is often included in targeting prediction models as a measure of stability of the miRNA:mRNA pair12,62, correlated with the log of our binary KD,app values, using ΔG values predicted by RNAcofold (R2 = 0.61). There was a weaker correlation with the free energy values derived from the minimum free energy (MFE) structures predicted by RNAcofold (R2 = 0.41) (Figure S8A). This result highlights the contribution of unfolding (in ΔG) as being an important in predicting KD. The differences between ΔG and KD,app are likely primarily due to inaccurately predicted structures used for energy calculations."

      Additionally, we assessed the free form of all mRNA targets via RABS (Figure S9) and observed that the seed of each free mRNA was available for miRNA binding (seeds of the free mRNA were not stably bound).

      Finally, when designing our luciferase plasmids we used RNAstructure (Reuter & Mathews, 2010) to check for self-folding effects which could interfere with target site binding, and ensured that all plasmids were devoid of such effects.

      In the methods, T7 is italicized by accident in the T7 in vitro transcription section. Bacmid is sometimes written with a capital B and other times with a lower-cased b. The authors should be consistent. The concentration of TEV protease that was added (as opposed to the volume) should be described for reproducibility.

      Thank you for pointing out these overlooked points. They have now been corrected.

      In figure S2D, what is the second species in the gel on the right-hand side of the gel in the miR-34a:AGO lanes? The authors should mention this.

      We believe that the faint upper band corresponds to other longer RNA species loaded into AGO2. As AGO2 is loaded with a diversity of RNA species, it is likely that some of them may have a weak affinity for the miR-34a-complementary probe, and therefore show up on the northern blot.

      Figure S3B and S3A are referenced out of order in the text. In regard to S3A, what are the anticipated or hypothesized alternative conformations for NOTCH1, DLL1, and MTA2? There are really interesting things going on in the gels, also for HNF4a and NOTCH2. Can the authors offer some explanation for why the free RNA bands don't seem to disappear, but rather migrate slowly? Is this a new species?

      The order of the figure references have now been updated, thank you for alerting us to this.

      Figure S3A: For MTA2, the two alternative conformations are shown in Figures S9 and S10 (and shown below here, with the miR-34a seed marked in pink). It appears that a single conformation is favoured at high concentration (> 1 µM), while both conformations are present at ≤ 1 µM. The RABS data for MTA2 also indicated multiple binding conformations, as the reactivity traces were inconsistent. We expect that the conformation shown on the left was most dominant within AGO2, based on the reactivity of the TRANS + AGO assays. However, we cannot exclude a possible G-quadruplex formation due to the high G content of MTA2 (shown below right).

      Regarding NOTCH1 and DLL1, a faint fluorescent shadow was observed beneath the miR-34a bound band. The RABS reactivity traces indicated a single dominant conformation for these targets, so it is possible that the lower shadow observed was due to more subtle differences in conformation, such as the opening/closing of one or a few base pairs at the terminus or bulge (i.e. end fraying). HNF4α and NOTCH2 appear to never fully saturate the miR-34a, so a small unbound population remains visible on the gel. For NOTCH2 this free miR-34a band appears to migrate upwards, possibly due to overloading the gel lane with excess NOTCH2 (which is not observed in the Cy5 fluorescence image).

      In the EMSA for Perfect, why does the band intensity for the bound complex increase then decrease? How many replicates were run for this? This needs to be reconciled.

      As for all EMSAs, three replicates were carried out for each mRNA target and all gels are shown in Supplementary Files 2 and 3, for the binary and ternary assays respectively.

      Uneven heat distribution across the gel can lead to bleaching of the Cy5 fluorophore. To address this, we used a circulating cooler in our electrophoresis tank, as outlined in our methods (page 10). However, the aforementioned gel for one of the PERFECT sample replicates appears to have been unevenly cooled. As the binding ratio (rather than total band volume) was used for quantification, the binding curve was unaffected, and this did not influence KD,app.

      We have now replaced the exemplary gel for PERFECT in Figure S3 with a more representative and evenly labelled gel from our replicates (Cy5 fluorescence image shown below). The binding curve for PERFECT is also shown here:

      The authors list that the RNA concentration was held constant at 10 nM; in EMSAs, the RNA concentration should be less than the binding affinity; what is the lowest concentration of protein used in the assays shown in S3A? Is this a serial dilution? It seems to me like the binding assays for MTA2, Perfect, and SRCseed might have too high of an RNA concentration. (Actually, now I see in the supplement the concentrations of proteins, and the RNA concentration is too high). Also, why is the intensity of bands for bound complex for SRCseed more intense than the free RNA?

      Why are the binding affinity error bars so large (e.g., for NOTCH2 with mir-34a) - 6 uM +/- 3 uM?

      No protein was used in the binary assays shown in Figure S3A. For the ternary assays in Figure S4, the maximum concentration of miR-34a-loaded AGO2 (miR-34a-AGO2) was 268 nM, with a serial dilution down to a minimum of 0.06 nM.

      We agree that optimal EMSA conditions require a constant probe RNA concentration lower than the binding affinity in order to accurately estimate high-affinity interactions.

      For our tightest binders, such as SIRT1, we can confidently state that the KD,app is less than 10 nM, estimated at 0.4 ± 1.1 nM. Because this falls below the probe concentration, the accuracy of the estimation is reduced, and the standard deviation is larger than the estimated KD,app. As NOTCH2 bound miR-34a very weakly and did not reach a fully bound plateau, the resulting high error was expected. Consequently, we do not have the same level of certainty for extremely tight or weak binders. In this study, the relative affinities were of primary importance.

      We have included on page 18:

      As the Cy5-miR-34a concentration was fixed to 10 nM to give sufficient signal during detection, KD,app values below 10 nM have a lower confidence.
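      To illustrate why KD,app values below the 10 nM probe concentration carry lower confidence, the ligand-depletion (quadratic) binding isotherm that applies in this regime can be sketched as follows (hypothetical concentrations, for illustration only):

      ```python
      # Quadratic (ligand-depletion) isotherm: fraction of labelled probe
      # bound at a given total target concentration. When KD falls below
      # the fixed 10 nM probe concentration, the curve approaches the
      # stoichiometric limit and the fit becomes poorly constrained.
      import math

      def fraction_bound(target_nM, probe_nM, kd_nM):
          """Fraction of probe in complex, accounting for probe depletion."""
          s = target_nM + probe_nM + kd_nM
          return (s - math.sqrt(s * s - 4.0 * target_nM * probe_nM)) / (2.0 * probe_nM)

      # Illustrative points at 10 nM probe and 10 nM target:
      f_kd5 = fraction_bound(10.0, 10.0, 5.0)    # KD = 5 nM  -> 0.50 bound
      f_kd04 = fraction_bound(10.0, 10.0, 0.4)   # KD = 0.4 nM -> ~0.82 bound
      ```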

      Regarding the control samples PERFECT and SCRseed, our focus was not on determining the exact KD,app of these artificial constructs. Instead, we were primarily interested in whether they exhibited binding and under which conditions. For SCRseed, we neither adjusted the titration range nor calculated KD,app. For PERFECT, the concentration was adjusted to a lower range of 30 nM - 0.001 nM to give a relative comparison with the other tight binder SIRT1. However, further reduction in RNA concentration was not pursued, as it already fell well below the 10 nM sensitivity threshold.

      Regarding the intensity of the bound SCRseed band, we observed that the bound fluorophore often resulted in stronger intensity than the free probe. This was observed for a number of the samples (PERFECT, BCL2, SCRseed). A previous publication reported that Cy5 fluorescence in DNA is sequence-dependent, that the effect is stronger for double-stranded DNA, and that the fluorophore is sensitive to the surrounding 5 base pairs (Kretschy, Sack and Somoza, 2016). It is likely that the same phenomenon exists in RNA.

      For MTA2, the two alternative conformations (shown in Figures S9 and S10) make assessment of KD,app more difficult. As the higher affinity conformation did not reach a fully-bound plateau before the weaker affinity conformation appeared, the binding curve plateau (where all miR-34a was bound) reflected the weaker conformation's KD,app. We increased the range of titration tested by using a three-fold serial dilution, but further reduction in RNA concentration would not have been fruitful as it already fell well below the 10 nM sensitivity range. Therefore the MTA2 binary complex had a higher error (944 ± 274 nM) and lower confidence.

      We then decided to run a competition assay to detect the weaker KD,app of MTA2. The assay was set up using the known binding affinity of CD44, which was labelled with Cy5 to track the reaction. MTA2 was titrated against a constant concentration of Cy5-CD44:miR-34a, and disruption of the CD44 and miR-34a binding was monitored. We fitted the data to a quadratic equation for competitive binding (Cheng & Prusoff, 1973) to calculate the competitive binding constant, KC,app.

      We validated our competition assay by comparing it with our direct binding assays, specifically assessing CD44 in a self-competition assay. The CD44 KC,app (168 {plus minus} 24 nM; mean and SD of three replicates) was found to be consistent with the KD,app obtained from the direct assay (165 {plus minus} 21 nM).

      As we wanted all affinity data to be directly comparable (using the same methodology), we compared the KD,app values obtained via direct assay in the manuscript. It appears that the competitive EMSA assay for MTA2 reflects the weaker affinity conformation observed in the direct assay.
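      For reference, the Cheng-Prusoff relation underlying such competition analyses can be sketched as follows. This is a simplified IC50 correction rather than our full quadratic fit; the 10 nM probe and 165 nM CD44 KD,app come from the text above, while the IC50 value is purely illustrative:

      ```python
      # Cheng-Prusoff correction: converts a measured IC50 from a
      # competition assay into an inhibitor (competitor) binding constant.
      def cheng_prusoff_ki(ic50_nM, probe_nM, kd_probe_nM):
          """Ki = IC50 / (1 + [probe]/KD_probe) (Cheng & Prusoff, 1973)."""
          return ic50_nM / (1.0 + probe_nM / kd_probe_nM)

      # Hypothetical IC50 of 1200 nM; 10 nM Cy5-CD44 probe; CD44 KD,app = 165 nM
      kc = cheng_prusoff_ki(1200.0, 10.0, 165.0)   # ~1131 nM
      ```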

      It would be very helpful if the authors wrote in the Kds in Figure 2A in green and blue (in the extra space in the plots). This would help the reader to better understand what's going on, and for me, as a reviewer, to better consider the analysis/conclusions presented by the authors.

      KD,app values are written in green and blue in what is now Figure 2D (originally Figure 2A).

      The authors state on page 18 that 'Interestingly, however, we did not observe a correlation between binary or ternary complex affinity and seed type.' They should elaborate on why this is interesting.

      The prevailing view is that the miRNA seed type significantly influences affinity within AGO2. The largest biochemical studies of miRNA-target interactions to date, conducted by McGeary et al. (2019, 2022), used AGO-RBNS (RNA Bind-n-Seq) to reveal relative binding affinities. These studies demonstrated strong correlations between the canonical seed types and binding affinity. Therefore, we find it interesting that no such correlation was observed in our dataset (despite its small size).

      We have now added to the manuscript (page 20):

      "The largest biochemical studies of miRNA-target interactions to date (McGeary et al., 2019, 2022) used AGO-RBNS (RNA Bind-n-Seq) to extract relative binding affinities, demonstrating strong correlations between the canonical seed types and binding affinity. Therefore, it is intriguing that our dataset, despite its small size, showed no such correlation."

      Figure 2C is not referenced in the text (the authors should go back through the text to make sure everything is referenced and in order). The Kds should be listed alongside the gels in Figure 2C.

      Figure 2 has now been rearranged and updated, with KD,app values listed in what is now Figure 2D.

      Figure 3B is rather confusing to understand.

      We have now adapted Figure 3 to simplify readability. Panel B has now been moved to C, and we have introduced panel A (moved from Figure 2B). In Figure 3C (originally 3B) we have added arrows to indicate the direction of affinity change from binary to ternary complex, and moved the duplex release information to panel A. We thank the reviewer and think that the data is now much clearer.

      Figure 3. AGO2 moderates affinity by strengthening weak binders and weakening strong binders. (A) Correlation of relative mRNA:miR-34a with mRNA:miR-34a-AGO2 binding affinities. No seed-type correlation is observed; seeds are coloured, with 8mer in pink, 7mer-m8 in turquoise, and 7mer-A1 in mauve. The slope of the linear fit is 0.48, and the intercept on the (log y)-axis is 7.11. The occurrence of miRNA duplex release from AGO2 is marked with diamonds. (B) miR-34a-mediated repression of dual luciferase reporters fused to the 12 mRNA targeting sites. Luciferase activity from HEK293T cells co-transfected with each reporter construct and miR-34a was measured 24 hours following transfection and normalised to the miR-34a-negative transfection control. Each datapoint represents the R/F ratio for an independent experiment (n=3), with standard deviations indicated. SCRseed is a scrambled-seed control, SCRall is a fully scrambled control, and PERFECT is the perfect complement of miR-34a. Dotted horizontal lines represent the repression values for the 22-nucleotide seed-only controls6 for the respective seed types, in the absence of any other WC base pairing. (C) Comparison of relative target repression with relative affinity assessed by EMSA. Blue represents mRNA:miR-34a affinity (binary complex), while green represents mRNA:miR-34a-AGO2 affinity (ternary complex). Arrows indicate the direction of change in affinity upon binding within AGO2 compared to the binary complex. AGO2 is seen to moderate affinity bi-directionally by strengthening weak binders and weakening strong binders.

      Page 20: Perfect should be italicized.

      Thank you for bringing this to our attention; this has now been adjusted.

      Have the authors considered using NMR to assess the base pair pattern formed between the miRNA:mRNA complexes (with / without AGO)? As a validation for results obtained by RABS? This could be helpful for the Asymmetric target binding section, the Ago increases flexibility section, and the three distinct structural groups section in the results. It is widely accepted that while chemical probing is insightful, results should be validated using alternative approaches. Distinguishing structural changes and protected reactivity in the presence of protein is challenging.

      NMR provides high-resolution information on RNA base-pairing patterns, allowing us to compare our RABS results for SIRT1 with those obtained via NMR (Banijamali et al., 2022) for the binary complex. For SIRT1, the RNA:RNA structures identified were consistent between both methods. However, using NMR to measure RNA:RNA binding within AGO2 is challenging due to the protein's large size. Currently, there are no published complete NMR structures of RNA within AGO2. The largest solution-state NMR structures published that include AGO consist solely of the PAZ domain. Our group has been working on method development using DNP-enhanced solid-state NMR to obtain structural information within the complete AGO2 protein, but the current resolution does not allow us to fully reconstruct a complete NMR structure. We hope that in the coming years, this will become a method to evaluate RNA within AGO. This limitation highlights the advantage of RABS in providing RNA base-pairing information within the ternary complex in solution.

      Reviewer #3 (Significance (Required)):

      The work is helpful for understanding how microRNAs recognize and bind their mRNA targets, and the impact Ago has on this interaction. I think for therapeutic studies, this will be helpful for structure-based design. Especially given the three types of structures identified to be a part of the interaction.

      We thank the reviewer for their detailed remarks, especially concerning the importance of the technical details of the binding assays. We further thank the reviewer for recognising the potential impact of our work for rational design.

      4. Description of analyses that authors prefer not to carry out


      In response to Reviewer 2 - major comment 1, we prefer to not run an additional ion exchange purification on the AGO2 protein due to the reasoning discussed above, which is repeated here:

      Thank you for mentioning this crucial point which has been a focus of our controls. We have addressed this point in four ways:

      1. Salt wash during reverse IMAC purification.
      2. Separation of unbound RNA and proteins via SEC.
      3. Blocking non-specific interactions using polyuridine.
      4. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a dedicated ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient between buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendations by BioRad for their Profinity IMAC resins, where it is stated that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of the identity or chemical nature of the ions involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry, 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have a comparable ionic strength to the Flores-Jasso protocol.

      Our total ionic contributions were 500 mM Na+, 550 mM Cl−, 50 mM Tris, and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionised according to the pH of the buffer (pH 8) and their respective pKa values, but even considering only the sodium and chloride, the ionic strength should be comparable to that of the Flores-Jasso protocol.
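      This reasoning can be made explicit with a back-of-the-envelope ionic strength calculation, I = ½ Σ ci·zi², counting only the fully ionised Na+ and Cl− species (Tris and imidazole are omitted as only partially protonated at pH 8):

      ```python
      # Ionic strength estimate for the IMAC wash buffer, counting only
      # the fully dissociated monovalent ions (concentrations in mM).
      def ionic_strength(species):
          """species: iterable of (concentration_mM, charge) pairs."""
          return 0.5 * sum(c * z ** 2 for c, z in species)

      # 500 mM Na+ and 550 mM Cl- from the buffer B wash conditions
      I_mM = ionic_strength([(500.0, +1), (550.0, -1)])   # = 525 mM
      ```

      The resulting ~525 mM is in the same range as the 500 mM potassium acetate elution step of the Flores-Jasso protocol.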

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acids nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogeneous RNA extract or homogeneous polyuridine (pU) to our EMSA buffer during equilibration in background experiments. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved panel C of Figure S6 into the main results as Figure 2C, to highlight the novel conditions used and the important controls that needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that, using polyU, we can reduce non-specific interactions between AGO2 and the RNAs present, and yet duplex release still occurs despite the blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)", we would like to refer to three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. It is known that AGO recycling must occur; therefore, there must be some mechanism enabling release of miRNA from AGO2. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of the enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

    1. To view the code of the exercise solution, you can click on the file index.html

      I did everything right, but in the solution, the void tag for the language is placed just after the Doctype. I put it inside the paired head tags. Is that a problem? And could I have taken my course notes down wrong?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Lines 40-42: The sentence "The coupling of structural connectome (SC) and functional connectome (FC) varies greatly across different cortical regions reflecting anatomical and functional hierarchies as well as individual differences in cognitive function, and is regulated by genes" is a misstatement. Regional variations of structure-function coupling do not really reflect differences in cognitive function among individuals, but inter-subject variations do.

      Thank you for your comment. We have made revisions to the sentence to correct its misstatement. Please see lines 40-43: “The coupling of structural connectome (SC) and functional connectome (FC) varies greatly across different cortical regions, reflecting anatomical and functional hierarchies[1, 6-9], and is regulated by genes[6, 8], while its individual differences relate to cognitive function[8, 9].”

      (2) In Figure 1, the graph showing the relation between intensity and cortical depth needs explanation.

      Thank you for your comment. We have added the necessary explanation; please see lines 133-134: “The MPC was used to map similarity networks of intracortical microstructure (voxel intensity sampled at different cortical depths) for each cortical node.”

      (3) Line 167: Change "increased" to "increase".

      We have corrected it, please see lines 173-174: “…networks significantly increased with age and exhibited greater increases.”

      (4) Line 195: Remove "were".

      We have corrected it, please see line 204: “…default mode networks significantly contributed to the prediction…”

      (5) Lines 233-240, Reproducibility analyses: Comparisons of parcellation templates were not made with respect to gene weights. Is there any particular reason?

      Thank you for your comment. We have quantified the gene weights based on HCPMMP using the same procedures. We identified a correlation (r = 0.25, p < 0.001) between the gene weights in HCPMMP and BNA. Given that this is a relatively weak correlation, we need to clarify the following points.

      Based on HCPMMP, we produced an averaged gene expression profile for 10,027 genes covering 176 left cortical regions[1]. The exclusion of 4 cortical regions that had an insufficient number of assigned samples may lead to different templates having a relatively weak correlation of gene associations. Moreover, the effect of different template resolutions on the results of human connectome-transcriptome association is still unclear.

      In brain connectome analysis, the choice of parcellation template can indeed influence subsequent findings to some extent. A methodological study[2] reported reference correlations of about 0.4-0.6 for white matter connectivity and 0.2-0.4 for white matter nodal properties between two templates (refer to Figures 4 and 5 in [2]). The age-related coupling changes, as a downstream analysis calculated from the multimodal connectome and correlated with gene expression profiles, may therefore be influenced by the choice of template.

      We have further supplemented the gene weight results obtained from HCPMMP to explicitly clarify the dependency on parcellation templates.

      Please see lines 251-252: “The gene weights of HCPMMP were consistent with those of BNA (r = 0.25, p < 0.001).”

      Author response image 1.

      The consistency of gene weights between HCPMMP and BNA.

      Please see lines 601-604: “Finally, we produced an averaged gene expression profile for 10,027 genes covering 176 left cortical regions based on HCPMMP and obtained the gene weights by PLS analysis. We performed Pearson's correlation analyses to assess the consistency of gene weights between HCPMMP and BNA.”

      Reviewer #2 (Recommendations For The Authors):

      Your paper is interesting to read and I found your efforts to evaluate the robustness of the results of different parcellation strategies and tractography methods very valuable. The work is globally easy to navigate and well written with informative good-quality figures, although I think some additional clarifications will be useful to improve readability. My suggestions and questions are detailed below (I aimed to group them by topic which did not always succeed so apologies if the comments are difficult to navigate, but I hope they will be useful for reflection and to incorporate in your work).

      * L34: 'developmental disorder'

      ** As far as I understand, the subjects in HCP-D are mostly healthy (L87). Thus, while your study provides interesting insights into typical brain development, I wonder if references to 'disorder' might be premature. In the future, it would be interesting to extend your approach to the atypical populations. In any case, it would be extremely helpful and appreciated if you included a figure visualising the distribution of behavioural scores within your population and in relationship to age at scan for your subjects (and to include a more detailed description of the assessment in the methods section) given that large part of your paper focuses on their prediction using coupling inputs (especially given a large drop of predictive performance after age correction). Such figures would allow the reader to better understand the cognitive variability within your data, but also potential age relationships, and generally give a better overview of your cohort.

      We agree with your comment that references to 'disorder' are premature. We have made revisions in the abstract and conclusion.

      Please see lines 33-34: “This study offers insight into the maturational principles of SC-FC coupling in typical development.”

      Please see lines 395-396: “Further investigations are needed to fully explore the clinical implications of SC-FC coupling for a range of developmental disorders.”

      In addition, we have included a more detailed description of the cognitive scores in the methods section and provided a figure to visualize the distributions of cognitive scores and in relationship to age for subjects. Please see lines 407-413: “Cognitive scores. We included 11 cognitive scores which were assessed with the National Institutes of Health (NIH) Toolbox Cognition Battery (https://www.healthmeasures.net/exploremeasurement-systems/nih-toolbox), including episodic memory, executive function/cognitive flexibility, executive function/inhibition, language/reading decoding, processing speed, language/vocabulary comprehension, working memory, fluid intelligence composite score, crystal intelligence composite score, early child intelligence composite score and total intelligence composite score. Distributions of these cognitive scores and their relationship with age are illustrated in Figure S12.”

      Author response image 2.

      Cognitive scores and age distributions of scans.

      * SC-FC coupling

      ** L162: 'Regarding functional subnetworks, SC-FC coupling increased disproportionately with age (Figure 3C)'.

      *** As far as I understand, in Figure 3C, the points are the correlation with age for a given ROI within the subnetwork. Is this correct? If yes, I am not sure how this shows a disproportionate increase in coupling. It seems that there is great variability of SC-FC correlation with age across regions within subnetworks, more so than the differences between networks. This would suggest that the coupling with age is regionally dependent rather than network-dependent? Maybe you could clarify?

      Yes, the points in Figure 3C are the correlations with age for each ROI within the subnetwork. We have revised the description, please see lines 168-174: “Age correlation coefficients distributed within functional subnetworks are shown in Figure 3C. Regarding mean SC-FC coupling within functional subnetworks, the somatomotor (β_age = 2.39E-03, F = 4.73, p = 3.10E-06; r = 0.25, p = 1.67E-07, Figure 3E), dorsal attention (β_age = 1.40E-03, F = 4.63, p = 4.86E-06; r = 0.24, p = 2.91E-07, Figure 3F), frontoparietal (β_age = 2.11E-03, F = 6.46, p = 2.80E-10; r = 0.33, p = 1.64E-12, Figure 3I) and default mode (β_age = 9.71E-04, F = 2.90, p = 3.94E-03; r = 0.15, p = 1.19E-03, Figure 3J) networks significantly increased with age and exhibited greater increases.” In addition, we agree with your comment that the coupling with age is more likely region-dependent than network-dependent. We have added the description, please see lines 329-332: “We also found that the correlation of SC-FC coupling with age varies more across regions within subnetworks than between networks, suggesting that the coupling with age is more likely region-dependent than network-dependent.” This is why our subsequent analysis focused on regional coupling.
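For readers less familiar with the statistics quoted above, the β_age slopes and Pearson r values come from fitting each network's coupling values against age; a minimal numpy sketch on synthetic data (variable names and effect sizes are illustrative only, and the actual models in the paper likely include additional covariates):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: per-subject coupling that increases weakly with age
# (illustrative only; the effect size loosely mimics the quoted betas).
age = rng.uniform(8, 22, size=300)                  # years
coupling = 0.002 * age + rng.normal(0, 0.01, 300)   # per-subject R values

# Least-squares fit: coupling ~ intercept + age  ->  beta_age slope
X = np.column_stack([np.ones_like(age), age])
beta, *_ = np.linalg.lstsq(X, coupling, rcond=None)
beta_age = beta[1]

# Pearson correlation between age and coupling
r = np.corrcoef(age, coupling)[0, 1]
```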

      *** Additionally, we see from Figure 3C that regions within networks have very different changes with age. Given this variability (especially in the subnetworks where you show both positive and negative correlations with age for specific ROIs (i.e. all of them)), does it make sense then to show mean coupling over regions within the subnetworks which erases the differences in coupling with age relationships across regions (Figures 3D-J)?

      Considering the interest in and interpretation of SC-FC coupling, showing the mean coupling at the subnetwork scale with its age correlation is still needed, although this averages out variability at the regional scale. These results at different scales confirm that coupling changes with age in this age group are mainly increases.

      *** Also, I think it would be interesting to show correlation coefficients across all regions, not only the significant ones (3B). Is there a spatially related tendency of increases/decreases (rather than a 'network' relationship)? Would it be interesting to show a similar figure to Figure S7 instead of only the significant regions?

      As suggested, we have added a graph showing correlation coefficients across all regions to Figure 3B. Similarly, we have updated the other figures (Figures S3-S6).

      Author response image 3.

      Age-related changes in SC-FC coupling. (A) Increases in whole-brain coupling with age. (B) Correlation of age with SC-FC coupling across all regions and significant regions (p<0.05, FDR corrected). (C) Comparisons of age-related changes in SC-FC coupling among functional networks. The boxes show the median and interquartile range (IQR; 25–75%), and the whiskers depict 1.5× IQR from the first or third quartile. (D-J) Correlation of age with SC-FC coupling across the VIS, SM, DA, VA, LIM, FP and DM. VIS, visual network; SM, somatomotor network; DA, dorsal attention network; VA, ventral attention network; LIM, limbic network; FP, frontoparietal network; DM, default mode network.

      *** For the quantification of MPC.

      **** L421: you reconstructed 14 cortical surfaces from the wm to pial surface. If we take the max thickness of the cortex to be 4.5mm (Fischl & Dale, 2000), the sampling is above the resolution of your anatomical images (0.8mm). Could you expand on what the interest is in sampling such a higher number of surfaces given that the resolution is not enough to provide additional information?

      The surface reconstruction was based on a state-of-the-art equivolumetric surface construction technique[3], which provides a simplified recapitulation of cellular changes across the putative laminar structure of the cortex. By referencing a 100-μm resolution Merker-stained 3D histological reconstruction of an entire post mortem human brain (BigBrain: https://bigbrain.loris.ca/main.php), a methodological study[4] systematically evaluated MPC stability with four to 30 intracortical surfaces when the resolution of the anatomical image was 0.7 mm, and selected 14 surfaces as the most stable solution. Importantly, it has been shown that the in vivo approach can serve as a lower-resolution yet biologically meaningful extension of the histological work[4].

      **** L424: did you aggregate intensities over regions using mean/median or other statistics?

      It might be useful to specify.

      Thank you for your careful comment. We have revised the description in lines 446-447: “We averaged the intensity profiles of vertices over 210 cortical regions according to the BNA”.

      **** L426: personal curiosity, why did you decide to remove the negative correlation of the intensity profiles from the MPC? Although this is a common practice in functional analyses (where the interpretation of negatives is debated), within the context of cortical correlations, the negative values might be interesting and informative on the level of microstructural relationships across regions (if you want to remove negative signs it might be worth taking their absolute values instead).

      We agree with your comment that the interpretation of negative correlations in the MPC is debated. Considering that MPC is a nascent approach to network modeling, we adopted the more conservative strategy of removing negative correlations, following the study [4] that proposed the approach. As you note, the negative correlations might be informative; we will continue to explore the intrinsic information in negative correlations reflecting microstructural relationships.
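The MPC construction as described in this exchange (region-averaged depth-wise intensity profiles, correlated between all pairs of regions, with negative correlations then set to zero) can be sketched as follows; the random profiles and the zeroed diagonal are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: intensity sampled at 14 intracortical depths for
# 210 BNA regions (profiles already averaged over vertices per region).
n_depths, n_regions = 14, 210
profiles = rng.normal(size=(n_depths, n_regions))

# Microstructure profile covariance (MPC): correlate the depth-wise
# profiles between every pair of regions...
mpc = np.corrcoef(profiles.T)

# ...then remove negative correlations, as described in the response.
mpc[mpc < 0] = 0.0
np.fill_diagonal(mpc, 0.0)  # drop self-connections (assumption)
```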

      **** L465: could you please expand on the notion of self-connections, it is not completely evident what this refers to.

      We have revised the description in lines 493-494: “N_c is the number of connections (N_c = 245 for BNA)”.

      **** Paragraph starting on L467: did you evaluate the multicollinearities between communication models? It is possibly rather high (especially for the same models with similar parameters (listed on L440-444)). Such dependence between variables might affect the estimates of feature importance (given the predictive models only care to minimize error, highly correlated features can be selected as a strong predictor while the impact of other features with similarly strong relationships with the target is minimized thus impacting the identification of reliable 'predictors').

      We agree with your comment. The covariance structure (multicollinearity) among the communication models is indeed likely to lead to unreliable predictor weights. In our study, we applied Haufe's inversion transform[5], which resolves this issue by computing the covariance between the predicted FC and each communication model in the training set. For more details on Haufe's inversion transform, please see [5]. We further clarified this in the manuscript; please see lines 497-499: “The covariance structure among the predictors may lead to unreliable predictor weights. Thus, we applied Haufe's inversion transform[38] to address these issues and identify reliable communication mechanisms.”
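A minimal sketch of what Haufe's inversion does in this setting, assuming one linear model predicting a node's FC from a few collinear communication-model features (the data and dimensions are synthetic): the raw regression weights w can split arbitrarily across collinear predictors, whereas the transformed pattern a = Cov(X, ŷ) credits every predictor that covaries with the prediction.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic edges: p structural predictors, one FC target.
n, p = 500, 3
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)  # induce multicollinearity
y = X[:, 0] + 0.1 * rng.normal(size=n)

# Ordinary least-squares weights (can be unstable under collinearity).
Xc = X - X.mean(axis=0)
yc = y - y.mean()
w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)

# Haufe's inversion transform: activation pattern = Cov(X, y_hat).
y_hat = Xc @ w
a = Xc.T @ (y_hat - y_hat.mean()) / (n - 1)
```

Both collinear predictors receive a high pattern value in `a`, while the uninformative third predictor stays near zero, however the weights `w` happen to split between the first two.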

      **** L474: I am not completely familiar with spin tests but to my understanding, this is a spatial permutation test. I am not sure how this applies to the evaluation of the robustness of feature weight estimates per region (if this was performed per region), it would be useful to provide a bit more detail to make it clearer.

      As suggested, we have supplemented the detail; please see lines 503-507: “Next, we generated 1,000 FC permutations through a spin test[86] for each nodal prediction in each subject and obtained random distributions of model weights. These weights were averaged over the group, and the enrichment of the highest weights per region was investigated to assess whether the number of highest weights across communication models was significantly larger than that in a random discovery.”

      **** L477: 'significant communication models were used to represent WMC...', but in L103 you mention you select 3 models: communicability, mean first passage, and flow graphs. Do you want to say that only 3 models were 'significant' and these were exactly the same across all regions (and data splits/ parcellation strategies/ tractography methods)? In the methods, you describe a lot of analysis and testing but it is not completely clear how you come to the selection of the final 3, it would be beneficial to clarify. Also, the final 3 were selected on the whole dataset first and then the pipeline of SC-FC coupling/age assessment/behaviour predictions was run for every (WD, S1, S2) for both parcellations schemes and tractography methods or did you end up with different sets each time? It would be good to make the pipeline and design choices, including the validation bit clearer (a figure detailing all the steps which extend Figure 1 would be very useful to understand the design/choices and how they relate to different runs of the validation).

      Thank you for your comment. In all reproducibility analyses, we used the same 3 models, which were selected in the main pipeline (probabilistic tractography and BNA parcellation). Following your comment, we produced a figure showing the pipeline of model selection as an extension of Figure 1. For the description, please see lines 106-108: “We used these three models to represent the extracortical connectivity properties in subsequent discovery and reproducibility analyses (Figure S1).”

      Author response image 4.

      Pipeline of model selection and reproducibility analyses.

      **** Might the imbalance of features between structural connectivity and MPC affect the revealed SC-FC relationships (3 vs 1)? Why did you decide on this ratio rather than for example best WM structural descriptor + MPC?

      We understand your concern. The WMC communication models represent diverse geometric, topological, or dynamic factors. To describe the properties of the WMC as well as possible, we selected, from the 27 models, the three communication models that significantly predicted FC after controlling for covariance structure. Compared to MPC, this does present a potential feature imbalance problem. However, it still supports the conclusion that coupling models incorporating microarchitectural properties yield more accurate predictions of FC from SC[6, 7]. The relevant experiments are shown in Figure S2 below. If only the best WM structural descriptor were used, some communication properties of the WMC might be lost.

      **** L515: were intracranial volume and in-scanner head motion related to behavioural measures? These variables likely impact the inputs, do you expect them to influence the outcome assessments? Or is there a mistake on L518 and you actually corrected the input features rather than the behaviour measures?

      The in-scanner head motion and intracranial volume are related to some age-adjusted behavioural measures, as shown in the table below. Our procedure for regressing covariates out of the cognitive measures was based on two previous cognitive prediction studies [8, 9]. Please see lines 549-554: “Prior to applying the nested fivefold cross-validation framework to each behaviour measure, we regressed out covariates including sex, intracranial volume, and in-scanner head motion from the behaviour measure[59, 69]. Specifically, we estimated the regression coefficients of the covariates using the training set and applied them to the testing set. This regression procedure was repeated for each fold.”

      Author response table 1.
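The fold-wise covariate regression described above (coefficients estimated on the training set only, then applied to the test set) can be sketched as follows; the covariates and the helper name are illustrative:

```python
import numpy as np

def regress_out(train_cov, train_y, test_cov, test_y):
    """Remove covariate effects from a behavioural measure.

    Coefficients are estimated on the training fold only and then
    applied to the test fold, as in the described procedure.
    """
    # Add an intercept column to the covariates (e.g. sex, ICV, motion).
    Xtr = np.column_stack([np.ones(len(train_cov)), train_cov])
    Xte = np.column_stack([np.ones(len(test_cov)), test_cov])
    beta, *_ = np.linalg.lstsq(Xtr, train_y, rcond=None)
    return train_y - Xtr @ beta, test_y - Xte @ beta

# Illustrative data: 3 covariates driving part of a behavioural score.
rng = np.random.default_rng(3)
cov = rng.normal(size=(100, 3))
y = cov @ np.array([0.5, -0.2, 0.1]) + rng.normal(0, 0.1, 100)
res_tr, res_te = regress_out(cov[:80], y[:80], cov[80:], y[80:])
```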

      ** Additionally, in the paper, you propose that the incorporation of cortical microstructural (myelin-related) descriptors with white-matter connectivity to explain FC provides for 'a more comprehensive perspective for characterizing the development of SC-FC coupling' (L60). This combination of cortical and white-matter structure is indeed interesting, however the benefits of incorporating different descriptors could be studied further. For example, comparing results of using only the white matter connectivity (assessed through selected communication models) ~ FC vs (white matter + MPC) ~ FC vs MPC ~ FC. Which descriptors better explain FC? Are the 'coupling trends' similar (or the same)? If yes, what is the additional benefit of using the more complex combination? This would also add strength to your statement at L317: 'These discrepancies likely arise from differences in coupling methods, highlighting the complementarity of our methods with existing findings'. Yes, discrepancies might be explained by the use of different SC inputs. However, it is difficult to see how discrepancies highlight complementarity - does MPC (and its combination with wm) provide additional information over using wm structure alone?

      Following your comment, we have added analyses based on different models using only the myelin-related predictor or only WM connectivity to predict FC, and further compared the results among the models. Please see lines 519-521: “In addition, we have constructed the models using only MPC or SCs to predict FC, respectively. Spearman’s correlation was used to assess the consistency between spatial patterns based on different models.”

      Please see lines 128-130: “In addition, the coupling patterns based on other models (using only MPC or only SCs to predict FC) and the comparison between the models are shown in Figure S2A-C.” Please see lines 178-179: “The age-related patterns of SC-FC coupling based on other coupling models are shown in Figure S2D-F.”

      Although we found spatial consistency in the coupling patterns between the different models, incorporating MPC with SC connectivity improves the prediction of FC relative to models based on only MPC or only SC. For age-related changes in coupling, the differences between the models were further amplified. We agree with you that the complementarity cannot be explicitly quantified, and we have revised the description; please see line 329: “These discrepancies likely arise from differences in coupling methods.”

      Author response image 5.

      Comparison results between different models. Spatial pattern of mean SC-FC coupling based on MPC ~ FC (A), SCs ~ FC (B), and MPC + SCs ~ FC (C). Correlation of age with SC-FC coupling across cortex based on MPC ~ FC (D), SCs ~ FC (E), and MPC + SCs ~ FC (F).

      ** For the interpretation of results: L31 'SC-FC coupling is positively associated with genes in oligodendrocyte-related pathways and negatively associated with astrocyte-related gene'; L124: positive myelin content with SC-FC coupling...and similarly on L81, L219, L299, L342, and L490:

      ***You use a T1/T2 ratio which is (in large part) a measure of myelin to estimate the coupling between SC and FC. Evaluation of SC-FC coupling with myelin described in Figure 2E is possibly biased by the choice of this feature. Similarly, it is possible that reported positive associations with oligodendrocyte-related pathways and SC-FC coupling in your work could in part result from a bias introduced by the 'myelin descriptor' (conversely, picking up the oligodendrocyte-related genes is a nice corroboration for the T1/T2 ratio being a myelin descriptor, so that's nice). However, it is possible that if you used a different descriptor of the cortical microstructure, you might find different expression patterns associated with the SC-FC coupling (for example using neurite density index might pick up neuronal-related genes?). As mentioned in my previous suggestions, I think it would be of interest to first use only the white matter structural connectivity feature to assess coupling to FC and assess the gene expression in the cortical regions to see if the same genes are related, and subsequently incorporate MPC to dissociate potential bias of using a myelin measure from genetic findings.

      Thank you for your insightful comments. In this paper, however, the core method of measuring coupling is to predict functional connectivity using multimodal structural connectivity, which may yield more information than a single modality. We agree with your comment that separating SCs and MPC, and examining the genes involved in each separately, could lead to interesting discoveries. We will continue to explore this in the future.

      ** Generally, I find it difficult to understand the interpretation of SC-FC coupling measures and would be interested to hear your thinking about this. As you mention on L290-294, how well SC predicts FC depends on which input features are used for the coupling assessment (more complex communication models, incorporating additional microstructural information etc 'yield more accurate predictions of FC' L291) - thus, calculated coupling can be interpreted as a measure of how well a particular set of input features explain FC (different sets will explain FC more or less well) ~ coupling is related to a measure of 'missing' information on the SC-FC relationship which is not contained within the particular set of structural descriptors - with this approach, the goal might be to determine the set that best, i.e. completely, explains FC to understand the link between structure and function. When you use the coupling measures for comparisons with age, cognition prediction etc, the 'status' of the SC-FC changes, it is no longer the amount of FC explained by the given SC descriptor set, but it's considered a descriptor in itself (rather than an effect of feature selection / SC-FC information overlap) - how do you interpret/argue for this shift of use?

      Thank you for your comment. In this paper, we obtain a reasonable SC-FC coupling measure by determining the optimal set of structural features to explain function. The coupling essentially measures the direct correspondence between structure and function. Studying the relationship of coupling with age and cognition is therefore studying how this direct structure-function correspondence correlates with age and cognition.
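Concretely, the "direct correspondence" described here can be read as a per-region multilinear model whose goodness of fit is the coupling value; a hedged numpy sketch (the function name, feature set and data are illustrative, with 245 connections per node as for the BNA):

```python
import numpy as np

def node_coupling(struct_feats, fc_profile):
    """SC-FC coupling for one region: Pearson r between the node's FC
    profile and its best linear prediction from structural features
    (e.g. communicability, mean first passage time, flow graph, MPC)."""
    X = np.column_stack([np.ones(len(fc_profile)), struct_feats])
    beta, *_ = np.linalg.lstsq(X, fc_profile, rcond=None)
    return np.corrcoef(X @ beta, fc_profile)[0, 1]

# Illustrative node: 245 connections (BNA), 4 structural predictors.
rng = np.random.default_rng(4)
feats = rng.normal(size=(245, 4))
fc = feats @ np.array([0.4, 0.2, 0.1, 0.3]) + rng.normal(0, 0.5, 245)
r = node_coupling(feats, fc)  # higher r = stronger SC-FC coupling
```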

      ** In a similar vein to the above comment, I am interested to hear what you think: on L305 you mention that 'perfect SC-FC coupling may be unlikely'. Would this reasoning suggest that functional activity takes place through other means than (and is therefore somehow independent of) biological (structural) substrates? For now, I think one can only say that we have imperfect descriptors of the structure so there is always information missing to explain function, this however does not mean the SC and FC are not perfectly coupled (only that we look at insufficient structural descriptors - limitations of what imaging can assess, what we measure etc). This is in line with L305 where you mention that 'Moreover, our results suggested that regional preferential contributions across different SCs lead to variations in the underlying communication process'. This suggests that locally different areas might use different communication models which are not reflected in the measures of SC-FC coupling that was employed, not that the 'coupling' is lower or higher (or coupling is not perfect). This is also a change in approach to L293: 'This configuration effectively releases the association cortex from strong structural constraints' - the 'release' might only be in light of the particular structural descriptors you use - is it conceivable that a different communication model would be more appropriate (and show high coupling) in these areas.

      Thank you for your insightful comments. We have changed the description; please see lines 315-317: “SC-FC coupling is dynamic and changes throughout the lifespan[7], particularly during adolescence[6, 9], suggesting that perfect SC-FC coupling may require sufficient structural descriptors.”

      *Cognitive predictions:

      ** From a practical stand-point, do you think SC-FC coupling is a better (more accurate) indicator of cognitive outcomes (for example for future prediction studies) than each modality alone (which is practically easier to obtain and process)? It would be useful to check the behavioural outcome predictions for each modality separately (as suggested above for coupling estimates). In case SC-FC coupling does not outperform each modality separately, what is the benefit of using their coupling? Similarly, it would be useful to compare to using only cortical myelin for the prediction (which you showed to increase in importance for the coupling). In the case of myelin->coupling-> intelligence, if you are able to predict outcomes with the same performance from myelin without the need for coupling measures, what is the benefit of coupling?

      From a predictive performance point of view, we do not believe that SC-FC coupling is a better indicator than a single modality (voxel, network or other measures). Our starting point was to assess whether SC-FC coupling is related to individual differences in cognitive performance rather than to prove its predictive power over other measures. As you suggest, separating the various modalities and comparing their predictive power for cognition is a very interesting perspective. We will continue to explore this issue in future studies.

      ** The statement on L187 'suggesting that increased SC-FC coupling during development is associated with higher intelligence' might not be completely appropriate before age corrections (especially given the large drop in performance that suggests confounding effects of age).

      According to your comment, we have removed the statement.

      ** L188: it might be useful to report the range of R across the outer cross-validation folds as from Figure 4A it is not completely clear that the predictive performance is above the random (0) threshold. (For the sake of clarity, on L180 it might be useful for the reader if you directly report that other outcomes were not above the random threshold).

      According to your comment, we have added the range of R and revised the description, please see lines 195-198: “Furthermore, even after controlling for age, SC-FC coupling remained a significant predictor of general intelligence better than at chance (Pearson’s r=0.11±0.04, p=0.01, FDR corrected, Figure 4A). For fluid intelligence and crystal intelligence, the predictive performances of SC-FC coupling were not better than at chance (Figure 4A).”

      In a similar vein, in the text, you report Pearson's R for the predictive results but Figure 4A shows predictive accuracy - accuracy is a different (categorical) metric. It would be good to homogenise to clarify predictive results.

      We have made the corresponding changes in Figure 4.

      Author response image 6.

      Encoding individual differences in intelligence using regional SC-FC coupling. (A) Predictive accuracy of fluid, crystallized, and general intelligence composite scores. (B) Regional distribution of predictive weight. (C) Predictive contribution of functional networks. The boxes show the median and interquartile range (IQR; 25–75%), and the whiskers depict the 1.5× IQR from the first or third quartile.

      *Methods and QC:

      -Parcellations

      ** It would be useful to mention briefly how the BNA was applied to the data and if any quality checks were performed for the resulting parcellations, especially for the youngest subjects which might be most dissimilar to the population used to derive the atlas (healthy adults HCP subjects) ~ question of parcellation quality.

      We have added the description, please see lines 434-436: “The BNA[31] was projected on native space according to the official scripts (http://www.brainnetome.org/resource/) and the native BNA was checked by visual inspection.” 

      ** Additionally, the appropriateness of structurally defined regions for the functional analysis is also a topic of important debate. It might be useful to mention the above as limitations (which apply to most studies with similar focus).

      We have added your comment to the methodological issues, please see lines 378-379: “Third, the appropriateness of structurally defined regions for the functional analysis is also a topic of important debate.”

      - Tractography

      ** L432: it might be useful to name the method you used (probtrackx).

      We have added this name to the description, please see lines 455-456: “probabilistic tractography (probtrackx)[78, 79] was implemented in the FDT toolbox …”

      ** L434: 'dividing the total fibres number in source region' - dividing by what?

      We have revised the description, please see line 458: “dividing by the total number of fibres in the source region.”
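To make the normalization concrete, here is a minimal illustrative sketch (not the authors' actual pipeline, which uses probtrackx outputs): each source-to-target streamline count is divided by the total number of streamlines traced from that source region, yielding connectivity densities.

```python
# Illustrative sketch: normalize probabilistic streamline counts into
# connectivity densities by dividing each source-to-target count by the
# total number of fibres traced from the source region.

def connectivity_density(counts):
    """counts[i][j]: streamlines from source region i reaching target j."""
    densities = []
    for row in counts:
        total = sum(row)  # total fibres traced from this source region
        densities.append([c / total if total else 0.0 for c in row])
    return densities

matrix = [[0, 80, 20],
          [40, 0, 60]]
print(connectivity_density(matrix))  # each row sums to 1
```

With this convention the resulting matrix is generally asymmetric, since each row is scaled by its own source region's total.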

      ** L436: 'connections in subcortical areas were removed' - why did you trace connections to subcortical areas in the first place if you then removed them (to match with cortical MPC areas I suspect)? Or do you mean there were spurious streamlines through subcortical regions that you filtered?

      On the one hand, we needed to match the cortical MPC regions; on the other hand, as we stated in the methodological issues, accurately resolving the connections of small structures within subcortical regions using whole-brain diffusion imaging and tractography techniques remains a challenge[10, 11]. 

      ** Following on the above, did you use any exclusion masks during the tracing? In general, more information about quality checks for the tractography would be useful. For example, L437: did you do any quality evaluations based on the removed spurious streamlines? For example, were there any trends between spurious streamlines and the age of the subject? Distance between regions/size of the regions?

      We did not use any exclusion masks. We performed visual inspection of the tractography quality and did not assess the relationship between spurious streamlines and age, or between spurious streamlines and the distance between regions or the size of the regions.

      ** L439: 'weighted probabilistic network' - this was weighted by the filtered connectivity densities or something else?

      The probabilistic network is weighted by the filtered connectivity densities.

      ** I appreciate the short description of the communication models in Text S1, it is very useful.

      Thank you for your comment.

      ** In addition to limitations mentioned in L368 - during reconstruction, have you noticed problems resolving short inter-hemispheric connections?

      We have not considered this issue, we have added it to the limitation, please see lines 383-384: “In addition, the reconstruction of short connections between hemispheres is a notable challenge.”

      - Functional analysis:

      ** There is a difference in acquisition times between participants below and above 8 years (21 vs 26 min), does the different length of acquisition affect the quality of the processed data?

      We applied relatively strict quality control to ensure the quality of the processed data.  

      ** L446 'regressed out nuisance variables' - it would be informative to describe in more detail what you used to perform this.

      We have provided more detail about the regression of nuisance variables, please see lines 476-477: “The nuisance variables were removed from the time series using a general linear model.”

      ** L450-452: it would be useful to add the number of excluded participants to get an intuition for the overall quality of the functional data. Have you checked if the quality is associated with the age of the participant (which might be related to motion etc). Adding a distribution of remaining frames across participants (vs age) would be useful to see in the supplementary methods to better understand the data you are using.

      We have supplemented the exclusion information for participants during data processing, as well as the distributions of motion and remaining frames and their correlations with age. Please see lines 481-485: “Quality control. The exclusion of participants in the whole multimodal data processing pipeline was depicted in Figure S13. In the context of fMRI data, we computed Pearson’s correlation between motion and age, as well as between the number of remaining frames and age, for the included participants aged 5 to 22 years and 8 to 22 years, respectively. These correlations were presented in Figure S14.”

      Author response image 7.

      Exclusion of participants in the whole multimodal data processing pipeline.  

      Author response image 8.

      Figure S14. Correlations between motion and age and number of remaining frames and age.

      ** L454: 'Pearson's correlation's... ' In contrast to MPC you did not remove negative correlations in the functional matrices. Why this choice?

      Whether negatively correlated functional connections should be removed has long been a controversial issue. Referring to previous studies of SC-FC coupling[12-14], we found that retaining negative connections is a widely used practice. In order to retain more information, we chose this strategy. Considering that MPC is a nascent approach to network modeling, we adopted a more conservative strategy of removing negative correlations, following the study that proposed the approach[4].
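The two strategies discussed above can be illustrated with a small sketch (hypothetical code, not the authors' pipeline): a Pearson-correlation FC matrix is built, and negative edges are either kept (as for the SC-FC coupling analysis) or zeroed out (the more conservative MPC-style choice).

```python
# Illustrative sketch: Pearson-correlation FC matrix with an option to
# drop negative edges, contrasting the two strategies discussed.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def fc_matrix(timeseries, drop_negative=False):
    n = len(timeseries)
    fc = [[pearson(timeseries[i], timeseries[j]) for j in range(n)] for i in range(n)]
    if drop_negative:
        fc = [[max(v, 0.0) for v in row] for row in fc]
    return fc

ts = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4]]  # toy regional time series
print(fc_matrix(ts, drop_negative=True)[0][1])   # anticorrelated pair -> 0.0
```

The trade-off is exactly as stated in the response: keeping negative edges preserves information, while zeroing them avoids the interpretive ambiguity of anticorrelations.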

      - Gene expression:

      ** L635, you focus on the left cortex, is this common? Do you expect the gene expression to be fully symmetric (given reported functional hemispheric asymmetries)? It might be good to expand on the reasoning.

      An important consideration regarding sample assignment arises from the fact that only two out of six brains were sampled from both hemispheres, while four brains have samples collected only in the left hemisphere. This sparse sampling should be carefully considered when combining data across donors[1]. We have supplemented the description, please see lines 569-571: “Restricting analyses to the left hemisphere will minimize variability across regions (and hemispheres) in terms of the number of samples available[40].”

      ** Paragraph of L537: you use evolution of coupling with age (correlation) and compare to gene expression with adults (cohort of Allen Human Brain Atlas - no temporal evolution to the gene expressions) and on L369 you mention that 'relative spatial patterns of gene expressions remain stable after birth'. Of course this is not a place to question previous studies, but would you really expect the gene expression associated with the temporary processes to remain stable throughout the development? For example, myelination would follow different spatiotemporal gradient across brain regions, is it reasonable to expect that the expression patterns remain the same? How do you then interpret a changing measure of coupling (correlation with age) with a gene expression assessed statically?

      We agree with your comment that the spatial expression patterns are expected to vary across developmental periods. We have revised the previous description, please see lines 383-386: “Fifth, it is important to acknowledge that changes in gene expression levels during development may introduce bias in the results.”

      - Reproducibility analyses:

      ** Paragraph L576: are we to understand that you performed the entire pipeline 3 times (WD, S1, S2) for both parcellations schemes and tractography methods (~12 times) including the selection of communication models and you always got the same best three communication models and gene expression etc? Or did you make some design choices (i.e. selection of communication models) only on a specific set-up and transfer to other settings?

      The choice of communication model is established at the beginning, which we have clarified in the article, please see lines 106-108: “We used these three models to represent the extracortical connectivity properties in subsequent discovery and reproducibility analyses (Figure S1).” For reproducibility analyses (parcellation, tractography, and split-half validation), we fixed other settings and only assessed the impact of a single factor.

      ** Paragraph of L241: I really appreciate you evaluated the robustness of your results to different tractography strategies. It is reassuring to see the similarity in results for the two approaches. Did you notice any age-related effects on tractography quality for the two methods given the wide age range (did you check?)

      In our study, the tractography quality was checked by visual inspection. Using quantitative tools to assess tractography quality in future studies could answer this question objectively.

      ** Additionally, I wonder how much of that overlap is driven by the changes in MPC which is the same between the two methods... especially given its high weight in the SC-FC coupling you reported earlier in the paper. It might be informative to directly compare the connectivity matrices derived from the two tracto methods directly. Generally, as mentioned in the previous comments, I think it would be interesting to assess coupling using different input settings (with WM structural and MPC separate and then combined).

      As suggested in your previous comment, we have examined the coupling patterns, coupling differences, coupling-age correlations, and spatial correlations between the patterns based on different models, as shown in Figure S2. Please see our response to the previous comment for details.

      ** L251 - I also wonder if the random splitting is best adapted to validation in your case given you study relationships with age. Would it make more sense to make stratified splits to ensure a 'similar age coverage' across splits?

      In our study, we adopted a random splitting process, repeated 1,000 times, to minimize bias due to data partitioning. The stratification you mention is a reasonable method, and keeping the age distribution even would likely yield higher similarity between splits than our validation method. However, the similarity obtained with our method is already sufficient to support the generalizability of our findings.
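The reviewer's stratification idea can be sketched as follows (a hypothetical illustration, not the validation code used in the study): participants are sorted by age and assigned to the two halves alternately, so both splits cover the age range evenly.

```python
# Illustrative sketch of an age-stratified split: sort by age and
# alternate assignment, giving both halves similar age coverage
# (in contrast to a fully random split).

def stratified_split(ages):
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    half_a = [i for k, i in enumerate(order) if k % 2 == 0]
    half_b = [i for k, i in enumerate(order) if k % 2 == 1]
    return half_a, half_b

ages = [5, 21, 8, 17, 12, 9, 20, 6]  # toy participant ages
a, b = stratified_split(ages)
mean = lambda idx: sum(ages[i] for i in idx) / len(idx)
print(abs(mean(a) - mean(b)))  # halves have similar mean age
```

With many random repetitions, as in the study, the average random split approaches this balance; stratification simply enforces it per split.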

      Minor comments

      L42: 'is regulated by genes'

      ** Coupling (if having a functional role and being regulated at all) is possibly resulting from a complex interplay of different factors in addition to genes, for example, learning/environment, it might be more cautious to use 'regulated in part by genes' or similar.

      We have corrected it, please see line 42.

      L43 (and also L377): 'development of SC-FC coupling'

      ** I know this is very nitpicky and depends on your opinion about the nature of SC-FC coupling, but 'development of SC-FC coupling' gives an impression of something maturing that has a role 'in itself' (for example development of eye from neuroepithelium to mature organ etc.). For now, I am not sure it is fully certain that SC-FC coupling is more than a byproduct of the comparison between SC and FC, using 'changes in SC-FC coupling with development' might be more apt.

      We have corrected it, please see lines 43-44.

      L261 'SC-FC coupling was stronger ... [] ... and followed fundamental properties of cortical organization.' vs L168 'No significant correlations were found between developmental changes in SC-FC coupling and the fundamental properties of cortical organization'.

      **Which one is it? I think in the first you refer to mean coupling over all infants and in the second about correlation with age. How do you interpret the difference?

      Between the ages of 5 and 22 years, we found that the mean SC-FC coupling pattern becomes similar to that of adults, consistent with the fundamental properties of cortical organization. However, the developmental changes in SC-FC coupling are heterogeneous and sequential, and do not change with the same magnitude everywhere as the mean coupling pattern.

      L277: 'temporal and spatial complexity'

      ** Additionally, communication models have different assumptions about the flow within the structural network and will have different biological plausibility (they will be more or less

      'realistic').

      Here, temporal and spatial complexity are meant in the computational sense, i.e., the time and space complexity of the algorithms implementing each communication model.

      L283: 'We excluded a centralized model (shortest paths), which was not biologically plausible' ** But in Text S1 and Table S1 you specify the shortest paths models. Does this mean you computed them but did not incorporate them in the final coupling computations even if they were predictive?

      ** Generally, I find the selection of the final 3 communication models confusing. It would be very useful if you could clarify this further, for example in the methods section.

      We used all twenty-seven communication models (including shortest paths) to predict FC at the node level for each participant. We then identified the three communication models that significantly predicted FC. The shortest path model was excluded because it did not meet the significance criteria. We have added further methodological details to this section, please see lines 503-507.

      L332 'As we observed increasing coupling in these [frontoparietal network and default mode network] networks, this may have contributed to the improvements in general intelligence, highlighting the flexible and integrated role of these networks' vs L293 'SC-FC coupling in association areas, which have lower structural connectivity, was lower than that in sensory areas. This configuration effectively releases the association cortex from strong structural constraints imposed by early activity cascades, promoting higher cognitive functions that transcend simple sensori-motor exchanges'

      ** I am not sure I follow the reasoning. Could you expand on why it would be the decoupling promoting the cognitive function in one case (association areas generally), but on the reverse the increased coupling in frontoparietal promoting the cognition in the other (specifically frontoparietal)?

      To clarify: for general intelligence, increased coupling in the frontoparietal network could allow more effective information integration, enabling efficient collaboration between different cognitive processes.

      * Formatting errors etc.

      L52: maybe rephrase?

      We have rephrased, please see lines 51-53: “The T1- to T2-weighted (T1w/T2w) ratio of MRI has been proposed as a means of quantifying microstructure profile covariance (MPC), which reflects a simplified recapitulation in cellular changes across intracortical laminar structure[6, 12-15].”

      L68: specialization1,[20].

      We have corrected it.

      L167: 'networks significantly increased with age and exhibited greater increased' - needs rephrasing.

      We have corrected it.

      L194: 'networks were significantly predicted the general intelligence' - needs rephrasing.

      We have corrected it, please see lines 204-205: “we found that the weights of frontoparietal and default mode networks significantly contributed to the prediction of the general intelligence.”

      L447: 'and temporal bandpass filtering' - there is a verb missing.

      We have corrected it, please see line 471: “executed temporal bandpass filtering.”

      L448: 'greater than 0.15' - unit missing.

      We have corrected it, please see line 472: “greater than 0.15 mm”.

      L452: 'After censoring, regression of nuisance variables, and temporal bandpass filtering,' - no need to repeat the steps as you mentioned them 3 sentences earlier.

      We have removed it.

      L458-459: sorry I find this description slightly confusing. What do you mean by 'modal'? Connectional -> connectivity profile. The whole thing could be simplified, if I understand correctly your vector of independent variables is a set of wm and microstructural 'connectivity' of the given node... if this is not the case, please make it clearer.

      We have corrected it, please see line 488: “where 𝒔𝑖 is the 𝑖th SC profile and 𝑛 is the number of SC profiles”.

      L479: 'values and system-specific of 480 coupling'.

      We have corrected it.

      L500: 'regular' - regularisation.

      We have changed it to “regularization”.

      L567: Do you mean that in contrast to probabilistic with FSL you use deterministic methods within Camino? For L570, you introduce communication models through 'such as': did you fit all models like before? If not, it might be clearer to just list the ones you estimated rather than introduce through 'such as'.

      We have changed the description to avoid ambiguity, please see lines 608-609: “We then calculated the communication properties of the WMC including communicability, mean first passage times of random walkers, and flow graphs (timescales=1).”

      Citation [12], it is unusual to include competing interests in the citation, moreover, Dr. Bullmore mentioned is not in the authors' list - this is most likely an error with citation import, it would be good to double-check.

      We have corrected it.

      L590: 'Python scripts used to perform PLS regression can be found at https://scikit-learn.org/.' The link leads to general documentation for sklearn.

      We have corrected it, please see lines 627-630: “Python scripts used to perform PLS regression can be found at https://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html#sklearn.cross_decomposition.PLSRegression.”

      P26 and 27 - there are two related sections: Data and code availability and Code availability - it might be worth merging into one section if possible.

      We have corrected it, please see lines 623-633.

      References

      (1) Arnatkeviciute A, Fulcher BD, Fornito A. A practical guide to linking brain-wide gene expression and neuroimaging data. Neuroimage. 2019;189:353-67. Epub 2019/01/17. doi: 10.1016/j.neuroimage.2019.01.011. PubMed PMID: 30648605.

      (2) Zhong S, He Y, Gong G. Convergence and divergence across construction methods for human brain white matter networks: an assessment based on individual differences. Hum Brain Mapp. 2015;36(5):1995-2013. Epub 2015/02/03. doi: 10.1002/hbm.22751. PubMed PMID: 25641208; PubMed Central PMCID: PMCPMC6869604.

      (3) Waehnert MD, Dinse J, Weiss M, Streicher MN, Waehnert P, Geyer S, et al. Anatomically motivated modeling of cortical laminae. Neuroimage. 2014;93 Pt 2:210-20. Epub 2013/04/23. doi: 10.1016/j.neuroimage.2013.03.078. PubMed PMID: 23603284.

      (4) Paquola C, Vos De Wael R, Wagstyl K, Bethlehem RAI, Hong SJ, Seidlitz J, et al. Microstructural and functional gradients are increasingly dissociated in transmodal cortices. PLoS Biol. 2019;17(5):e3000284. Epub 2019/05/21. doi: 10.1371/journal.pbio.3000284. PubMed PMID: 31107870.

      (5) Haufe S, Meinecke F, Gorgen K, Dahne S, Haynes JD, Blankertz B, et al. On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage. 2014;87:96-110. Epub 2013/11/19. doi: 10.1016/j.neuroimage.2013.10.067. PubMed PMID: 24239590.

      (6) Demirtas M, Burt JB, Helmer M, Ji JL, Adkinson BD, Glasser MF, et al. Hierarchical Heterogeneity across Human Cortex Shapes Large-Scale Neural Dynamics. Neuron. 2019;101(6):1181-94 e13. Epub 2019/02/13. doi: 10.1016/j.neuron.2019.01.017. PubMed PMID: 30744986; PubMed Central PMCID: PMCPMC6447428.

      (7) Deco G, Kringelbach ML, Arnatkeviciute A, Oldham S, Sabaroedin K, Rogasch NC, et al. Dynamical consequences of regional heterogeneity in the brain's transcriptional landscape. Sci Adv. 2021;7(29). Epub 2021/07/16. doi: 10.1126/sciadv.abf4752. PubMed PMID: 34261652; PubMed Central PMCID: PMCPMC8279501.

      (8) Chen J, Tam A, Kebets V, Orban C, Ooi LQR, Asplund CL, et al. Shared and unique brain network features predict cognitive, personality, and mental health scores in the ABCD study. Nat Commun. 2022;13(1):2217. Epub 2022/04/27. doi: 10.1038/s41467-022-29766-8. PubMed PMID: 35468875; PubMed Central PMCID: PMCPMC9038754.

      (9) Li J, Bzdok D, Chen J, Tam A, Ooi LQR, Holmes AJ, et al. Cross-ethnicity/race generalization failure of behavioral prediction from resting-state functional connectivity. Sci Adv. 2022;8(11):eabj1812. Epub 2022/03/17. doi: 10.1126/sciadv.abj1812. PubMed PMID: 35294251; PubMed Central PMCID: PMCPMC8926333.

      (10) Thomas C, Ye FQ, Irfanoglu MO, Modi P, Saleem KS, Leopold DA, et al. Anatomical accuracy of brain connections derived from diffusion MRI tractography is inherently limited. Proc Natl Acad Sci U S A. 2014;111(46):16574-9. Epub 2014/11/05. doi: 10.1073/pnas.1405672111. PubMed PMID: 25368179; PubMed Central PMCID: PMCPMC4246325.

      (11) Reveley C, Seth AK, Pierpaoli C, Silva AC, Yu D, Saunders RC, et al. Superficial white matter fiber systems impede detection of long-range cortical connections in diffusion MR tractography. Proc Natl Acad Sci U S A. 2015;112(21):E2820-8. Epub 2015/05/13. doi: 10.1073/pnas.1418198112. PubMed PMID: 25964365; PubMed Central PMCID: PMCPMC4450402.

      (12) Gu Z, Jamison KW, Sabuncu MR, Kuceyeski A. Heritability and interindividual variability of regional structure-function coupling. Nat Commun. 2021;12(1):4894. Epub 2021/08/14. doi: 10.1038/s41467-021-25184-4. PubMed PMID: 34385454; PubMed Central PMCID: PMCPMC8361191.

      (13) Liu ZQ, Vazquez-Rodriguez B, Spreng RN, Bernhardt BC, Betzel RF, Misic B. Time-resolved structure-function coupling in brain networks. Commun Biol. 2022;5(1):532. Epub 2022/06/03. doi: 10.1038/s42003-022-03466-x. PubMed PMID: 35654886; PubMed Central PMCID: PMCPMC9163085.

      (14) Zamani Esfahlani F, Faskowitz J, Slack J, Misic B, Betzel RF. Local structure-function relationships in human brain networks across the lifespan. Nat Commun. 2022;13(1):2053. Epub 2022/04/21. doi: 10.1038/s41467-022-29770-y. PubMed PMID: 35440659; PubMed Central PMCID: PMCPMC9018911.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors provide a new computational platform called Vermouth to automate topology generation, a crucial step that any biomolecular simulation starts with. Given the wide array of chemical structures that need to be simulated, the varying quality of structural models used as inputs from various sources, and the diverse force fields and molecular dynamics engines employed for simulations, automation of this fundamental step is challenging, especially for complex systems and in cases where high-throughput simulations are needed for computer-aided drug design (CADD). To overcome this challenge, the authors develop a programming library composed of components that carry out the fundamental functionalities commonly encountered in topology generation. These components are intended to be general for any type of molecule and not to depend on any specific force field or MD engine. To demonstrate the applicability of this library, the authors employ those components to re-assemble a pipeline called Martinize2 used in topology generation for simulations with the widely used coarse-grained (CG) model MARTINI. This pipeline fully recapitulates the functionality of its original version Martinize but exhibits greatly enhanced generality, as confirmed by its ability to faithfully generate topologies for two high-complexity benchmarking sets of proteins.

      Strengths:

      The main strength of this work is the use of concepts and algorithms associated with induced subgraph in graph theory to automate several key but non-trivial steps of topology generation such as the identification of monomer residue units (MRU), the repair of input structures with missing atoms, the mapping of topologies between different resolutions, and the generation of parameters needed for describing interactions between MRUs.

      Weaknesses:

      Although the Vermouth library appears promising as a general tool for topology generation, there is insufficient information in the current manuscript and a lack of documentation that may allow users to easily apply this library. More detailed explanation of various classes such as Processor, Molecule, Mapping, ForceField etc. that are mentioned is still needed, including inputs, output and associated operations of these classes. Some simple demonstration of application of these classes would be of great help to users. The formats of internal databases used to describe reference structures and force fields may also need to be clarified. This is particularly important when the Vermouth needs to be adapted for other AA/CG force fields and other MD engines.

      We thank the reviewer for pointing out the strengths of the presented work and agree that one of the current limitations is the lack of documentation about the library. In the revision, we point more clearly to the documentation page of the Vermouth library, which contains more detailed information on the various processors. The format of the internal databases has also been added to the documentation page. Providing a simple demonstration of applications of these classes is a great suggestion, however, we believe that it is more convenient to provide those in the form of code examples in the documentation or for instance jupyter notebooks rather than in the paper itself.  

      The successful automation of Vermouth relies on reference structures that need to be pre-determined. In the case of the study of 43 small ligands, the reference structures and corresponding mappings to MARTINI-compatible representations for all these ligands have already been defined in the M3 force field and added to the Vermouth library. However, the authors need to comment on the scenario where significantly more ligands need to be considered, and where other force fields need to be used as CG representations for which reference structures and mapping schemes are lacking.

      We acknowledge that vermouth/martinize2 is not capable of automatically generating Martini mappings or parameters on the fly for unknown structures that are not part of the database. However, this capability is not the purpose of the program, which is rather to distribute and manage existing parameters. Unlike atomistic force fields, which frequently have automated topology builders, Martini parameters are usually obtained for a set of specific molecules at a time and benchmarked accordingly. As more parameters are obtained by researchers, they can be added to the vermouth library via the GitHub interface in a controlled manner. This process allows the database to grow, and in our opinion it will quickly grow beyond the currently implemented parameters. Furthermore, the API of Vermouth is set up in a way that it can easily interface with automated topology builders which are currently being developed. Hence this limitation in our view does not diminish the applicability of vermouth to high-throughput applications with many ligands. The framework exists and works; now only more parameters have to be added.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript by Kroon, Grunewald, Marrink and coworkers present the development of Vermouth library for coarse grain assignment and parameterization and an updated version of python script, the Martinize2 program, to build Martini coarse grained (CG) models, primarily for protein systems.

      Strengths:

      In contrast to many mature and widely used tools to build all-atom (AA) models, there are few well-accepted programs for CG model constructions and parameterization. The research reported in this manuscript is among the ongoing efforts to build such tools for Martini CG modeling, with a clear goal of high-throughput simulations of complex biomolecular systems and, ultimately, whole-cell simulations. Thus, this manuscript targets a practical problem in computational biophysics. The authors see such an effort to unify operations like CG mapping, parameterization, etc. as a vital step from the software engineering perspective.

      Weaknesses:

      However, the manuscript in its current shape is unclear about its scientific novelty and appears incremental upon existing methods and tools. The only "validation" (more like an example application) is to create Martini models with two protein structure sets (I-TASSER and AlphaFold). The success rate in building the models was only 73%, with the most significant failure mode being incomplete AA coordinates. This suggests a dependence on the input AA models, which makes the results less attractive for high-throughput applications (for example, preparation/creation of the AA models can become the bottleneck). There seems to be an improvement in considering the protonation state and chemical modification, but convincing validation is still needed. Besides, limitations in the existing Martini models remain (like the restricted dynamics due to the elastic network, the electrostatic interactions, or polarizability).

      We thank the reviewer for pointing out the strengths of the presented work, but respectfully disagree with the criticism that the presented work is only incremental upon existing methods and tools. All MD simulations of structured proteins regardless of the force field or resolution rely on a decent initial structure to produce valid results. Therefore, failure upon detection of malformed protein input structures is an essential feature for any high-throughput pipeline working with proteins, especially considering the computational cost of MD simulations. We note that programs such as the first version of Martinize generate reasonable-looking input parameters that lead to unphysical simulations and wasted CPU hours.

      The AlphaFold database, from which we surveyed 200,000 structures, contained only 7 problematic structures, which means that the success rate exceeded 99.99% for this database. This example simply shows that users may have to add a step that fixes atomistic protein input structures if they seek to run a high-throughput pipeline.

      At the very least, they can be assured that martinize2 will check that no such issues persist.

      Furthermore, we note that the manuscript does not aim to validate or improve the existing Martini (protein) models. All example cases presented in the paper are subject to the limitations of the protein models, because martinize2 is only the program that generates those parameters. Future improvements to the protein model, which are currently underway, will immediately be available to the broader community through the program.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Kroon et al. describes two algorithms which, when combined, achieve high-throughput automation of "martinizing" protein structures with selected protonation states and post-translational modifications.

      Strengths:

      A large-scale protein simulation was attempted, providing strong evidence that the authors' algorithms work smoothly.

      The authors described the algorithms in detail and shared the open-source code under the Apache 2.0 license on GitHub. This allows both reproducibility and extended usefulness within the field. These algorithms are potentially impactful if the authors can address some of the issues listed below.

      We thank the reviewer for pointing out the strengths.  

      Weaknesses:

      One major caveat of the manuscript is that the authors claim their algorithms aim to "process any type of molecule or polymer, be it linear, cyclic, branched, or dendrimeric, and mixtures thereof" and "enable researchers to prepare simulation input files for arbitrary (bio)polymers". However, the examples provided by the manuscript only support one type of biopolymer, i.e. proteins. Despite the authors' recommendation of using polyply along with martinize2/vermouth, no concrete evidence has been provided to support the authors' claim. Therefore, the manuscript must be modified to either remove these claims or include new evidence.

      We acknowledge that the current manuscript is largely protein-centric. To some extent this results from the legacy of martinize version 1, which was also only used for proteins. However, to show that martinize2 also works for cyclic as well as branched molecules, we implemented two additional test cases and updated the former Figure 6 (now Figure 7). A crown ether serves as an example of a cyclic molecule, whereas a small branched polyethylene molecule is the test case for branching. Needless to say, neither molecule is a protein or even a biomolecule.

      The method descriptions of Martinize2 and the graph algorithms in the SI should be core content of the manuscript. I would argue that Figures S1 and S2 are more important than Figure 3 (protonation state). I recommend the authors make a workflow chart combining Figures S1 and S2 to explain Martinize2 and the graph algorithms in the main text.

      The reviewer's critique is fair. Given the already rather large manuscript, we tried to strike a balance between describing benchmark test cases, practical usage information (e.g., the histidine modification), and the algorithmic library side of the program. In particular, we chose to add the figure on protonation states because how to deal with protonation states, in particular for histidines, was among the top three issues raised by users on our GitHub page. Due to this large community interest, we consider the figure equally important. However, we moved Figure S1 from the Supporting Information into the manuscript and annotated the corresponding text with references to the panels to more clearly illustrate the underlying procedure.

      In Figure 3 (protonation state), the figure itself and the captions are ambiguous about whether at the end the residue is simply renamed from HIS to HIP, or if hydrogen is removed from HIP to recover HIS.

      Using either of the two routes yields the same parameters in the end, namely those for the protonated histidine. In the second route, the extra hydrogen on histidine is detected as an additional atom, and therefore a different logic flow is triggered. Atoms are never removed; instead, a base block is combined with modification atoms. We adjusted the figure caption to point this out more clearly.

      In "Incorporating a Ligand small-molecule Database", the authors are calling for a community effort to build a small-molecule database. Some guidance on when the current database/algorithm combination does or does not work will help the community in contributing.

      Any small molecule that is not part of the database will not work. However, martinize2 will quickly identify whether any components of the system are missing and alert the users. At that point, users can decide to create their own files, guided by the new documentation pages.

      A speed comparison is needed to compare Martinize2 and Martinize.

      We respectfully disagree that a speed comparison is needed. We already note in the manuscript's discussion that martinize2 is slower, since it performs more checks, is more general, and does not implement only a single protein model.

    1. As of January 2023, GitHub reported having over 100 million developers and more than 420 million repositories, including at least 28 million public repositories. It is the world's largest source code host as of June 2023.

      Github 100 million developers

    1. I'm sick of this recurring advice. Instead I should keep a wiki and comment my code better. Don't assume that everything applies to everyone.

      Absolutely decimated.

    1. your code only works in CodePens and JSFiddles because those execute the JavaScript after the DOM is parsed.

      execute JavaScript after the DOM is parsed

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

      Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts the in situ force of cat soleus (Kirsch et al., 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al., 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations, and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      (1) I think the authors should state how many parameters require fitting to the data vs the total number of model parameters.

      The total number of model parameters is listed in Table 1. Each parameter additionally has references listed for the source of data (if one exists), along with how the data were used (’C’ calculated, ’F’ fitted, ’E’ estimated, or ’S’ scaled) for the specific simulations that appear in this paper. While this is a daunting number of parameters, only a few of them must be updated when modeling a new musculotendon.

      Similar to a Hill-type muscle model, at least 5 parameters are needed to fit the VEXAT model to a specific musculotendon: the maximum isometric force (fiso), the optimal contractile element (CE) length, the pennation angle, the maximum shortening velocity, and the tendon slack length. However, as with a Hill model, it is only possible to use this minimal set of parameters by relying on default values for the remaining parameters. The defaults we have used were extracted from mammalian muscle (see Table 1) and may not be appropriate for modeling muscle tissue that differs widely in the ratio of fast/slow-twitch fibers, titin isoform, temperature, or scale.

      Even when these defaults are appropriate, variation is the rule for biological data rather than the exception. It will always be the case that the best fit can only be obtained by fitting more of the model’s parameters to additional data. Standard measurements of the active force-length, passive force-length, and force-velocity relations are quite helpful for improving the accuracy of the model for a specific muscle. It is challenging to improve the fit of the model’s cross-bridge (XE) and titin components because the required data are so rare: the experiments of Kirsch et al., Prado et al., and Trombitás et al. are, to our knowledge, unique. However, if more data become available, it is relatively straightforward to update the model’s parameters using the methods described in Appendix B or the code that appears online (https://github.com/mjhmilla/Millard2023VexatMuscle).

      We have modified the manuscript to make it clear that, in some circumstances, the burden of parameter identification for the VEXAT model can be as low as a Hill model:

      - Section 3: last two sentences of the 2nd paragraph, found at: Page 10, column 2, lines 1-12 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      - Table 1: last two sentences of the caption, found at: Page 11 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      (2) It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      All of the experiments simulated in this work are in-situ or ex-vivo. So far, the main challenges of simulating any experiment have been quite consistent across both in-situ and ex-vivo datasets: there are insufficient data to fit most model parameters to a specific specimen, and instead defaults from the literature must be used. In an ideal case, a specimen would have roughly ten extra trials collected so that the maximum isometric force, optimal fiber length, active force-length relation, passive force-length relation (up to ≈ 0.6 f_o^M), and force-velocity relation could be identified from measurements rather than from literature values. Since most lab specimens are viable for only a small number of trials (with the exception of cat soleus), we do not expect this situation to change in the future.

      However, if data are available, the fitting process is fairly straightforward for either in-situ or ex-vivo data: use a standard numerical method (for example, non-linear least squares or the bisection method) to adjust the model parameters until the errors between simulation and experiment are reduced. The main difficulty, as described in the previous paragraph, is the availability of data to fit as many parameters as possible for a specific specimen. As such, the fitting process varies from experiment to experiment and depends mainly on the richness of the measurements taken from a specific specimen, and from the literature in general.
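      As an illustrative sketch of the fitting step described above (this is not the authors' actual pipeline; the Gaussian curve shape, parameter names, and synthetic data are assumptions made purely for the example), non-linear least squares can adjust a hypothetical active force-length curve until the simulation-experiment error is minimized:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical active force-length curve: a Gaussian bump around the
# optimal CE length. The model form is assumed for illustration only;
# it is not the VEXAT model's actual curve.
def active_force_length(params, l_ce):
    f_iso, l_opt, width = params
    return f_iso * np.exp(-((l_ce - l_opt) / width) ** 2)

# Synthetic "measurements" standing in for experimental force data.
l_meas = np.linspace(0.6, 1.4, 9)
f_meas = active_force_length([1.0, 1.0, 0.45], l_meas)

# Residuals between simulated and measured force; least_squares adjusts
# the parameters to reduce the simulation-experiment error.
def residuals(params):
    return active_force_length(params, l_meas) - f_meas

fit = least_squares(residuals, x0=[0.8, 0.9, 0.5])
f_iso_fit, l_opt_fit, width_fit = fit.x
```

      With richer measurements, the same residual function can simply be extended with additional terms (e.g., passive force-length or force-velocity errors), which is how more parameters become identifiable as more trials are available.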

      Working from in-vivo data presents an entirely different set of challenges. When working with human data, for example, it is just not possible to directly measure muscle force with tendon buckles, and so it is never completely clear how force is distributed across the many muscles that typically actuate a joint. Further, there is also uncertainty in the boundary conditions of the muscle, because optical motion-capture markers move with respect to the skeleton. Video fluoroscopy offers a way to improve the accuracy of the measured boundary conditions, though only for a few labs, due to its great expense. A final boundary condition remains impossible to measure in any case: the geometry and forces that act at the boundaries as muscle wraps over other muscles and bones. Fitting to in-vivo data is very difficult.

      While this is an interesting topic, it is tangential to our already lengthy manuscript. Since these reviews are public, we leave it to the motivated reader to find this discussion here.

      Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model’s ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although it is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of the force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      (1) It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch.

      While many muscle physiologists are aware of the limitations of the Hill model, these limitations are not as well known among computational biomechanists. There are at least two reasons for this gap: there are few comprehensive evaluations of Hill models against multiple experiments, and some of the differences are quite nuanced. For example, active-lengthening experiments can be replicated reasonably well using a Hill model if the lengthening is done on the ascending limb of the force-length curve; the story is quite different on the descending limb, as shown in Figure 9. Similarly, as Figure 8 shows, by choosing the right combination of tendon model and perturbation bandwidth, it is possible to get a reasonably accurate response from the Hill model to stochastic length changes. Yet when a wide variety of perturbation bandwidths, magnitudes, and tendon models are tested, it is clear that the Hill model cannot, in general, replicate the response of muscle to stochastic perturbations. For these reasons, we think many of the Hill model’s drawbacks have remained under-appreciated by computational biomechanists for many years.

      (2) Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      We agree that it would be valuable to benchmark other models in the literature using the same set of experiments. Hopefully we, or others, will have the good fortune to secure research funding to continue this benchmarking work. It will, however, be quite challenging: few muscle models are accompanied by a professional-quality open-source implementation, and without one it is often impossible to reproduce published results, let alone provide a fair and objective evaluation of a model.

      (3) For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening.

      The titin model described in the paper will provide an enhancement of force during a stretch-shortening cycle. This certainly would be an interesting next experiment to simulate in a future paper.

      (4) In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      We can only speak to what drives the frequency-dependent stiffness in the model, though we are quite interested in what happens physiologically; hopefully new experiments will examine this phenomenon in the future. In the case of the model, the reason is straightforward: the formulation of Eqn. 16 is responsible for this shift.

      Equation 16 is formulated so that the acceleration of the attachment point of the XE is driven by the force difference between the XE and a reference Hill model (the numerator of the first term in Eqn. 16), which is then low-pass filtered (the denominator of the first term in Eqn. 16). Due to this formulation, the attachment point moves less when the numerator is small, or when the differences in the numerator change rapidly and are effectively filtered out. When the attachment point moves less, more of the CE’s force output is determined by variations in the length of the XE and its stiffness.

      On the other hand, the attachment point will move when the numerator of the first term in Eqn. 16 is large, or when those differences are not short-lived. When the attachment point moves to reduce the strain in the XE, the force produced by the XE’s spring-damper is reduced. As a result, the CE’s force output is less influenced by variations in the length of the XE and its stiffness.

      Reviewer #2 (Recommendations for the Authors):

      I find the clarity of the manuscript to be much improved following revision. While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction, the revised description of small length changes makes the interpretation much less confusing.

      Similarly, while I agree that Hill-type models are widely used, their limitations have been addressed extensively and are very well established. Hence, moving forward, I think it would be much more valuable to start to compare these newer models to one another, rather than just showing an improvement over a Hill model under (very biologically important) conditions in which that model has no capacity to predict forces.

      (1) While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction ...

      We have had to abstract away some of the details of reality to obtain a model that can be used to simulate hundreds of muscles. In contrast, FiberSim, produced by Kenneth Campbell’s group, uses much less abstraction and might be of greater interest to you. FiberSim’s models include individual cross-bridges, titin molecules, and an explicit representation of the spatial geometry of a sarcomere. While it is a great tool for testing muscle-physiology questions through simulation, it is computationally expensive to use for simulating hundreds of muscles simultaneously.

      Kosta S, Colli D, Ye Q, Campbell KS. FiberSim: A flexible open-source model of myofilament-level contraction. Biophysical Journal. 2022 Jan 18;121(2):175-82. https://campbell-muscle-lab.github.io/FiberSim/

      (2) Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established.

      Please see our response 1 to Reviewer #1.

      (3) Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      Please see our response 2 to Reviewer #1.

    1. Michael Kan. FBI: Hackers Are Compromising Legit QR Codes to Send You to Phishing Sites. PCMAG, January 2022. URL: https://www.pcmag.com/news/fbi-hackers-are-compromising-legit-qr-codes-to-send-you-to-phishing-sites (visited on 2023-12-06).

      What kind of scenarios are there where people are scanning QR codes unnecessarily or haphazardly? The only time I use QR codes is when I need to scan the menu at a restaurant, or when an event or a business wants me to reach a specific website via QR code to access a survey or something of that nature. While this scam is important to note and keep in mind for the future, it seems like the plausibility of people falling for this scam is lower than for other scams.

    1. Coding style is more important than I expected in the beginning. My start to software engineering started from being on the product-minded end of the spectrum and moved towards the “technical-minded” side of the spectrum.

      1000000% agree, code is basically useless if you can't give it to someone else and they can work with your code as well or figure it out.

    2. It’s the reason why when ChatGPT outputs some hogwash, it’s easier just to re-prompt it or write it from scratch yourself instead of trying to figure out the errors in its buggy code.

      I actually do use ChatGPT to debug small errors; it's great for when you've been staring at code for hours and miss a small mistake

    3. There’s a popular saying that debugging code is twice as hard as writing it.

      This is SO true! It's better to take a while meticulously writing the code in layers, making sure each respective step works before adding another layer. Otherwise debugging the whole thing is a nightmare! [ I'm in AME :') ]

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript investigates the dynamics of GC-content patterns at the 5' end of the transcription start sites (TSSs) of protein-coding genes (pc-genes). The manuscript presents a quite careful and comprehensive analysis of GC content in pc-genes in humans and other vertebrates, especially around the TSS. The result of this investigation states that "GC-content surrounding the TSS is largely influenced by patterns of recombination." (from the end of the Introduction)

      My main concern with this manuscript is one of causal reasoning, whether intended or not. I hope the authors can follow my reasoning below on how the logic sometimes seems to fail, and that they introduce changes to clarify their suggested mechanisms of action.

      The above-quoted sentence from the end of the Intro is in conflict with this other sentence, which appears at the end of the Abstract: "the dynamics of GC-content in mammals are largely shaped by patterns of recombination". The sentence in the Intro seems to indicate that the effect is specific to TSSs, but the one in the abstract seems to indicate the opposite, that is, that the effect is ubiquitous.

      We are sorry about the lack of clarity. We have now rewritten the abstract and intro to emphasize that our results are restricted to the 5' end of genes, and that by "patterns of recombination" we mean "historic patterns of recombination".

      The observations as stated in the abstract are: "We observe that in primates and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at protein-coding gene TSSs is currently undergoing mutational decay."

      If I understand the measurements described in the manuscript correctly, and the arguments around them, you seem to show that the mutational decay of GC-content in humans is independent of location (TSS or not), as noted here (also from the abstract): "These patterns extend into the open reading frame affecting protein-coding regions, and we show that changes in GC-content due to recombination affect synonymous codon position choices at the start of the open reading frame."

      Again, we have rewritten this section to clarify these points.

      There is one more result described in the manuscript that, in my mind, is very important, but it is not given the relevance that it appears to me to have. It is presented in Figure S3G: "we concluded that GC-content at the TSS of protein-coding genes is not at equilibrium, but in decay in primates and rodents. This decay rate is similar to the decay seen in intergenic regions that have the same GC-content (Figure S3G)"

      Thus, if the decaying effect happens everywhere, how can it be related to "recombination being directed away from TSSs by PRDM9" as it is stated in the abstract and in the model described in Figure 7?

      We argue that the GC-peak was likely caused by past recombination events. This is based on:

      1) The change in GC-content at the TSS in dogs and foxes, coupled with the fact that they perform recombination at the TSS

      2) That the TSS can act as a default recombination site in mice when PRDM9 is knocked out

      3) That some forms of PRDM9 allow recombination at the TSS (see Schield et al., 2020; Hoge et al., 2023; and Joseph et al., 2023), and that this is expected to cause an increase in GC-content

      We thus speculate that the GC-peak in humans and rodents was caused by past recombination at TSSs, permitted by ancient variants of PRDM9. We further point out that PRDM9 is undergoing rapid evolution, and some past versions of the protein may have had this property.

      We have tried to clarify these points in the latest version of the text.

      The fact that the decay rate is similar to any other region with similar GC-content should be an indication that the effect is not related to anything having to do with TSS or recombination being directed away from TSSs by PRDM9.

      We are sorry about the lack of clarity. TSSs in humans, chimpanzees, mice and rats are experiencing GC-decay at the same rate as non-functional DNA regions with high GC-content. Thus the GC-peak is not being maintained by selection. This is surprising, given the role that GC-content plays in gene expression. This is a critical point, and we have added it to the "conclusion" section of the abstract.

      I hope these paragraphs show my confusion about the relationship between the results presented, which I think are very comprehensive, and their interpretation and suggested model for GC-content dynamics around TSSs in humans.

      On another note, can you provide a bit more background on recombination and its mechanisms?

      We have done our best to clarify these issues.

      You seem to have confident sets of genes under high/low/medium recombination. How are those determined?

      We used the per-gene recombination rates provided in Pouyet et al. (2017) to identify the sets of genes under low/medium/high recombination. Those rates were estimated from the HapMap genetic map (Frazer et al., 2007). This is now all specified in the Methods section.
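      As a sketch of the grouping step described above (the gene names and rate values are made up purely for illustration; the real rates come from the HapMap genetic map via Pouyet et al. 2017), genes can be split into equal-sized low/medium/high classes by recombination-rate tertiles:

```python
import statistics

# Hypothetical per-gene recombination rates (cM/Mb); illustrative only.
rates = {"geneA": 0.2, "geneB": 1.5, "geneC": 0.9,
         "geneD": 3.2, "geneE": 0.1, "geneF": 2.4}

# Tertile boundaries split the genes into three equal-sized classes.
lo_cut, hi_cut = statistics.quantiles(rates.values(), n=3)

def recombination_class(rate):
    if rate <= lo_cut:
        return "low"
    return "medium" if rate <= hi_cut else "high"

classes = {gene: recombination_class(r) for gene, r in rates.items()}
```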

      You also seem to concentrate the cause of recombination on PRDM9, please explain. Is PRDM9 the unique indicator of recombination?

      PRDM9 has been shown to be the primary determinant of where recombination occurs in the genome (Grey et al., 2011; Brick et al., 2012). This is very well established. We have now reworded some of the introduction to make this clear.

      specific comments


      Figure 1, it is very hard to understand the differences between the three rows. Please explain more clearly in the legend, and add more information to the figure itself.

      We altered the axis titles to make this clearer. We also label "Upstream", "Exon 1" and "Part of Intron 1" in Figures 1C, F and I, and in Figure 2C. We now spell this out in the figure legend.

      Figure 7, express somewhere in the figure that the y axis measures GC content.

      We now added "GC Content" to the left of the first "graph" in Figure 7.

      The figure seems to introduce a 'causal' model of GC-content diminishing based on recombination being directed away from TSSs. What about the diminishing of GC-content in any other genomic regions, as you have shown in Figure S3G?

      Our focus in this model, and in the manuscript, is on TSSs. We think that adding the dynamics of other GC-rich regions would be distracting. We do not know what caused these intergenic regions to be high in GC-content prior to decay. After excluding known recombination sites and TSSs, such regions are very rare in the human genome. They may be ancient recombination sites that are decaying in GC-content. However, unlike TSSs, which have some connection to recombination (i.e., data from PRDM9-knockout mice and from dogs and foxes), we have no direct or indirect evidence that these other sites were used for recombination in the past. Alternatively, there could have been some other pressure on these sites in the past to increase GC-content that we are not aware of.

      -- The title is too selective, as to the results, and it has the implication that the decay is exclusive to the surrounding of the TSSs.

      Decay of GC-content towards equilibrium is the default state for non-functional DNA. That it is occurring at the TSS is surprising, as it indicates that the GC-peak is not maintained by selection. We now state this in the paper and include this in the "conclusion" portion of the abstract.
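The "decay towards equilibrium" invoked here follows the standard two-state (AT↔GC) substitution picture. A minimal sketch, in our own notation and with illustrative rates (not the manuscript's actual calculation):

```python
import math

def gc_trajectory(gc0, r_at_to_gc, r_gc_to_at, t):
    """GC-content at time t under a two-state (AT <-> GC) substitution model.

    With constant rates, GC-content relaxes exponentially toward the
    equilibrium value gc_eq = u / (u + v), where u is the AT->GC rate and
    v the GC->AT rate, with relaxation rate (u + v).
    """
    u, v = r_at_to_gc, r_gc_to_at
    gc_eq = u / (u + v)
    return gc_eq + (gc0 - gc_eq) * math.exp(-(u + v) * t)
```

For example, a GC-rich peak starting at 70% GC with rates implying a 40% equilibrium decays monotonically toward 0.4, which is the "default state for non-functional DNA" referred to in the response.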

      Reviewer #1 (Significance (Required)):

      The statistical analysis is comprehensive and robust.

      We thank the reviewer for this.

Their model interpretation, as described, induces confusion and needs to be clarified.

      We are sorry about this. Hopefully our revised text will clear up the confusion.

      I am an expert computational biologist, I do not have a deep knowledge of sequence implications of recombination, and it would be good if the manuscript could add some more background on that.

      We thank the reviewer for their perspective, and we hope that our text changes better explain to the non-expert why our findings are so surprising. We further clarify how recombination affects DNA sequence by gBGC and some of these changes are detailed in our response to the other reviewers.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

In this work, the authors present various analyses suggesting that GC-content at the TSS of coding genes is affected by recombination. The article's findings are interesting and novel, and are important to our understanding of how various non-adaptive evolutionary forces shape vertebrate genome evolutionary history.

      We thank the reviewer for these kind words.

      The Methods section includes most needed details (see comments below for missing information), and the scripts and data provided online help in transparency and usability of these analyses.

      I have several comments, mostly regarding clarifications in the text and several suggestions:

      1. In introduction: CpG islands, have been shown to activate transcription (Fenouil et al., 2012) - what is known about CpG Islands is somewhat inaccurately described. It should be rephrased more accurately, e.g. - CpG Islands found near TSS are associated with robust and high expression level of genes, including genes expressed in many tissues, such as housekeeping genes.

We thank the reviewer for that. We have rewritten this part of the introduction.

1. The following claim (in Introduction), regarding retrogenes and their GC content, is not in agreement with recent analyses: "Indeed, it has been observed that these genes have elevated GC-content at their 5' ends in comparison to their intron-containing counterparts, suggesting that elevation of GC-content can be driven by positive selection to drive their efficient export (Mordstein et al., 2020). Moreover, retrogenes tend to arise from parental genes that have high GC-content at their 5' ends (Kaessmann et al., 2009)." Recent work showed that retrogenes in mouse and human are significantly depleted of CpG islands in their promoters (PMID: 37055747). This follows the notion that young genes, such as these retrogenes, have simple promoters (PMID: 30395322) with few TF binding sites and without CpGs. The two reported trends should both be mentioned, with some suggestions as to why they seem to contradict each other and how they can be reconciled.

We thank the reviewer for this information. The previous report (Mordstein et al., 2020) indicated that the increase in GC-content occurs downstream of the TSS in retrogenes. Since sequences upstream of the TSS are not part of the retro-insertion, it is not surprising that GC-content may differ between the retrogene and the parental gene. That retrogenes have lower numbers of CpGs upstream of the TSS bolsters the idea that GC-content is not required for transcription and that the GC-peak is not being maintained in most genes by purging selection.

      1. In "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." I think you forgot the reference...

      We thank the reviewer for catching this.

      1. In Results, regarding average GC content (Fig 2X): "Interestingly, this pattern is different in the nonamniotes examined, including anole lizard, coelacanth, shark and lamprey." - in lizard, it seems that the genomic average is lower (and lizards are amniotes)

      You are absolutely right. We now fix this.

      1. In Discussion, the statement: "This model is supported by findings in a recent preprint, which documents the equilibrium state of GC-content in TSS regions from numerous organisms" seems to contrast with the findings of the mentioned preprint. If "most mammals have a high GC-content equilibrium state" but still have a functional PRDM9, in the lack of evidence for functional differences between ortholog PRDM9 proteins (such as signatures for positive selection or functional assays), the authors' findings regarding the relationship between a lack of PRDM9 in canids and the trends observed in their TSS, are weakened.

We are sorry about the confusion. We were not sure which of two points was being commented on: 1) whether GC-content is at equilibrium for most mammals, or 2) whether the equilibrium state is high for most mammals despite their containing PRDM9. We rewrote this sentence to clarify both issues (especially given that these concepts may not be clear to non-experts, such as the first reviewer). To address the first potential concern, the paper in question (Joseph et al., 2023) does not show that GC-content at the TSS in mammals is at equilibrium; rather, it calculates what the equilibrium state would be given the nucleotide substitution rates. In most organisms, the TSS is not at equilibrium. To address both points, Joseph et al. show that the equilibrium GC-content at the TSS for canids is much higher than for other mammals. They and others infer that the diversity among other mammals (where the equilibrium state is higher than in humans and rodents but lower than in canids) has to do with variation between PRDM9 orthologues; however, this has yet to be tested. Although the action of PRDM9 has not been evaluated in most mammals, we do point out that in snakes PRDM9 allows for some recombination at the TSS.
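For the non-expert, the equilibrium inference referred to here can be illustrated with a toy calculation (our own simplification with hypothetical counts, not the actual method of Joseph et al.): the equilibrium GC-content implied by the substitution rates is the point where AT→GC gains balance GC→AT losses.

```python
def equilibrium_gc(n_at_to_gc, n_at_sites, n_gc_to_at, n_gc_sites):
    """Equilibrium GC-content implied by per-site substitution rates.

    u = AT->GC substitutions per AT site, v = GC->AT per GC site.
    At equilibrium, gains balance losses: GC* * v = (1 - GC*) * u,
    hence GC* = u / (u + v).
    """
    u = n_at_to_gc / n_at_sites
    v = n_gc_to_at / n_gc_sites
    return u / (u + v)
```

Note that this equilibrium value says nothing about whether the current GC-content has reached it, which is exactly the distinction the response draws.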

      1. In Methods, the ENSEMBL version (in addition of the per-species genome version) should be mentioned.

      This has been fixed.

      1. In Fig 1, it is worth clarifying in the legend that the differences between the first and second rows of panels is in the length of the plotted region.

      We have now indicated this in the figure legend.

      Reviewer #2 (Significance (Required)):

      The manuscript provides a rigorous analysis of the possible processes that have impacted the TSS GC-content during evolution. It should be of interest to a diverse set of investigators in the genomics community, since it touches on different topics including genome evolution, transcription and gene structures.

      Thank you.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study analyzes the distribution of GC-content along genes in humans and vertebrates, and particularly the higher GC-content in the 5'-end than in the 3'-end of genes. The results suggest that this pattern is ancient in vertebrates, currently decaying in mouse and humans, and probably driven by recombination and GC-biased gene conversion. It is proposed that the 5'-3' gradient was generated during evolution when PRDM9 was less active (in which case recombination occurs mostly near transcription start sites), and decays when PRDM9 is very active, as it is currently in humans and mouse. This is a very interesting hypothesis, also corroborated by a recent, similar analysis in mammals (Joseph et al. 2023). These two preprints, which appeared around the same time, are, I think, quite novel and important. The analyses performed here are thorough and convincing. Source code and raw data sets are openly distributed. I only have a couple of minor comments and suggestions, which I hope might help improve the manuscript.

      Thank you very much for the kind words.

      A1. There has been quite some work on the 5'-3' GC-content gradient in plants (e.g. Clément et al. 2014 GBE, Ressayre et al. 2015 GBE, Brazier & Glemin 2023 biorxiv), which you might like to cite.

      Thank you for pointing out these very interesting papers, we have incorporated them into the latest version.

      A2. CpG-content and GC-content are related in various ways (e.g. see Galtier & Duret 2000 MBE, Fryxell & Moon 2005 MBE) that you might like to discuss; currently the manuscript discusses the CpG hypermutation rate as a driver of GC-content but the picture might be a bit more complex.

      Thank you for this, we have incorporated these citations.

      A3. The model introduced by this manuscript (figure 7) is dependent on the evolution of recombination determination in vertebrates and the role of PRDM9. A recent preprint by Raynaud et al (biorxiv) seems relevant to this issue.

      Thank you for pointing out this pre-print. We have added a paragraph to the discussion that mentions this work. This also initiated a conversation with the authors, and we include some "personal communications" that illuminate what is going on in teleost fish.

      Line-by-line comments

      B1. "First, highly spliced mRNAs tend to have high GC-content at their 5' ends despite the fact that it is not required for export and does not affect expression levels (Mordstein et al., 2020)" -> I do not totally understand this sentence, which seems to imply some link between splicing and export/expression, could you please clarify?

      We rewrote that sentence to make it clearer.

      B2. "mismatches will form in the heteroduplex which are typically corrected in favor of Gs and Cs over As and Ts by about 70%" -> This 70% figure is human-specific, and varies a lot among species; I know in this introduction you're mainly reviewing the human literature but since this part of the text introduces gBGC as a process maybe clarify by adding "in humans" or refrain from giving this figure?

      Thank you. This is a good point. We fixed this.
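For context on what the (human-specific) ~70% figure implies: within a gene-conversion tract, a heterozygous AT/GC mismatch is repaired toward G/C about 70% of the time, so transmission of the G/C allele is distorted above the Mendelian 0.5. A minimal sketch (the tract probability `p_conversion` is a made-up illustrative value, not from the manuscript):

```python
def gc_transmission(repair_bias=0.7, p_conversion=0.05):
    """Expected transmission probability of the G/C allele at a
    heterozygous AT/GC site, given the chance the site falls inside a
    gene-conversion tract (p_conversion, illustrative only) and the
    repair bias toward G/C within tracts (repair_bias).
    Outside a tract, transmission is Mendelian (0.5)."""
    return (1 - p_conversion) * 0.5 + p_conversion * repair_bias
```

Even a small per-meiosis distortion above 0.5, applied recurrently at high-recombining sites, accumulates into the elevated GC-content attributed to gBGC.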

B3. "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." -> reference missing here; actually I'm not sure you will find a good reference for this because PRDM9-dependent hotspots are so short-lived that GC-content would only respond weakly; maybe rather refer to the equilibrium GC-content (and cite, for instance, Pratto et al 2014 Science), or to high-recombining regions instead of hotspots (and you have plenty of papers to cite)?

      Thanks for this.

      B4. Paragraph starting: "PRDM9 and recombination hotspots also experience accelerated rates of evolution..." -> I would suggest removing the word "also" and moving this paragraph up, just before the sentence I'm commenting above (the one starting "Thus GC-content..."). This will justify my suggestion in comment B3 of mentioning high-recombining regions instead of hotspots, while also avoiding to have the important paragraph on recombination at TSS (the one starting "There are interesting connections...") being sandwiched between two sections on PRDM9.

      We did not move this paragraph, although we did adjust the wording slightly.

      B5. Paragraph starting "There are interesting connections..." is crucial to your discussion and might be emphasized a bit more in introduction, in my opinion. For instance, what about adding a sentence like "Also not directly relevant to humans, these observations suggest that gBGC might have played a role in shaping the observed 5'-3' GC-content gradient."

      We did not alter the structure of this paragraph but we did reword sections of it.

      1. "Interestingly, this pattern is different in the non-amniotes examined, including anole lizard, coelacanth, shark and lamprey. These organisms had clear differences in GC-content between their first exon and surrounding sequences (upstream and intronic sequences), which came close to the overall genomic GC-content." -> I'm not sure I got the point the authors are intending to make here. Also please note that lizards are amniotes.

      We thank the reviewer for catching this error, we have fixed this.

      Reviewer #3 (Significance (Required)):

      This is one of two preprints having appeared ~at the same time (the other one being the cited Joseph et al 2023), which I think are quite important and convincing regarding the role of PRDM9-dependent and PRDM9-independent recombination on GC-content evolution in vertebrates. I support publication of this preprint in a molecular evolutionary journal.

      We thank the reviewer for their kind assessment!

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.




    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

The manuscript "Self-inhibiting percolation and viral spreading in epithelial tissue" describes a model, based on 5-state cellular automata, of the development of an infection. The model is motivated and qualitatively justified by time-resolved measurements of expression levels of viral, interferon-producing, and antiviral genes. The model is set up in such a way that the crucial difference in outcomes (infection spreading vs. confinement) depends on the initial fraction of special virus-sensing cells. Those cells (denoted as 'type a') cannot be infected and do not support the propagation of infection, but rather inhibit it in a somewhat autocatalytic way. Presumably, such feedback makes the transition between two outcomes very sharp: a minor variation in the concentration of 'a' cells results in a qualitative change from one outcome to another. As in any percolation-like system, the transition between propagation and inhibition of infection goes through a critical state with all its attributes: a power-law distribution of cluster sizes (corresponding to the fraction of infected cells) with a fairly universal exponent and a cutoff at the upper limit of this distribution.

      Strengths:

      The proposed model suggests an explanation for the apparent diversity of outcomes of viral infections such as COVID.

      Author response: We thank the referee for the concise and accurate summary of our work.

      Weaknesses:

      Those are not real points of weakness, though I think addressing them would substantially improve the manuscript.

      Author response: Below we will address these point by point.

      The key point in the manuscript is the reduction of actual biochemical processes to the NOVAa rules. I think more could be said about it, be it referring to a set of well-known connections between expression states of cells and their reaction to infection or justifying it as an educated guess.

      Author response: We have now improved this part in the model section. We have added a few sentences explaining how the cell state transitions are motivated by the UMAP results:

“The cell state transitions triggered by IFN signaling or viral replication are known in viral infection, but how exactly the transitions are orchestrated for specific infections is poorly understood. The UMAP cell state distribution hints at possible preferred transitions between states. The closer two cell states are on the UMAP, the more likely transitions between them are, all else being equal. For instance, the antiviral state (𝐴) is easily established from a susceptible cell (𝑂), but not from the fully virus-hijacked cell (𝑉). The IFN-secreting cell state (𝑁) requires the co-presence of the viral and antiviral genes and thus the cell cluster is located between the antiviral state (𝐴) and virus-infected state (𝑉) but distant from the susceptible cells (𝑂).

      Inspired by the UMAP data visualization (Fig. 1a), we propose the following transitions between five main discrete cell states”
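For readers who want a concrete picture of these transitions on the lattice, here is a minimal sketch consistent with the five states described above. The exact update rules are our own reading of the text (virus converts susceptible O cells to V and virus-sensing a cells to N; N cells convert a cells within radius R to antiviral A), not the authors' actual code:

```python
import random

def init(L, p_a, rng):
    """L x L lattice: each cell is pre-antiviral 'a' with probability p_a,
    otherwise susceptible 'O'; one infected 'V' cell at the center."""
    grid = {(x, y): ('a' if rng.random() < p_a else 'O')
            for x in range(L) for y in range(L)}
    grid[(L // 2, L // 2)] = 'V'
    return grid

def step(grid, L, R=1):
    """One synchronous update on a torus. States: 'O' susceptible,
    'a' pre-antiviral, 'V' infected, 'N' IFN-secreting, 'A' antiviral."""
    new = dict(grid)
    for (x, y), state in grid.items():
        if state == 'V':
            # Virus attacks the four nearest neighbors: O is infected,
            # a senses the virus and starts secreting IFN.
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                p = ((x + dx) % L, (y + dy) % L)
                if grid[p] == 'O':
                    new[p] = 'V'
                elif grid[p] == 'a':
                    new[p] = 'N'
        elif state == 'N':
            # IFN reaches every cell within (Chebyshev) radius R and
            # converts pre-antiviral cells to the protected A state.
            for dx in range(-R, R + 1):
                for dy in range(-R, R + 1):
                    p = ((x + dx) % L, (y + dy) % L)
                    if grid[p] == 'a':
                        new[p] = 'A'
    return new
```

With p_a = 0 the infection grows unchecked, while at high p_a the a-cells surrounding the focus convert to N and wall it off; the threshold behavior discussed below interpolates between these extremes.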

      Another aspect where the manuscript could be improved would be to look a little beyond the strange and 'not-so-relevant for a biomedical audience' focus on the percolation critical state. While the presented calculation of the precise percolation threshold and the critical exponent confirm the numerical skills of the authors, the probability that an actual infected tissue is right at the threshold is negligible. So in addition to the critical properties, it would be interesting to learn about the system not exactly at the threshold: For example, how the speed of propagation of infection depends on subcritical p_a and what is the cluster size distribution for supercritical p_a.

Author response: We agree that further exploring the model away from the critical threshold is worthwhile. While our main focus has been on explaining the large degree of heterogeneity in outcomes – readily explained as a consequence of the sharp threshold-like behavior – we now include plots of the time-evolution of the infection (as well as of the remaining states) for subcritical values of p_a. The plots can be found in Figure S4 of the supplement.

      Reviewer #2 (Public Review):

Xu et al. introduce a cellular automaton model to investigate the spatiotemporal spreading of viral infection. In this study, the authors first analyze the single-cell RNA sequencing data from experiments and identify four clusters of cells at 48 hours post-viral infection, including susceptible cells (O), infected cells (V), IFN-secreting cells (N), and antiviral cells (A). Next, a cellular automaton model (the NOVAa model) is introduced by assuming the existence of a transient pre-antiviral state (a). The model consists of an LxL lattice; each site represents one cell. The cells change their state following rules that depend on the interaction of neighboring cells. The model introduces a key parameter, p_a, representing the fraction of pre-antiviral state cells. Cell apoptosis is omitted in the model. Model simulations show a threshold-like behavior of the final attack rate of the virus when p_a changes continuously. There is a critical value p_c, such that when p_a < p_c, infections typically spread to the entire system, while at a higher p_a > p_c, the propagation of the infected state is inhibited. Moreover, the radius R that quantifies the diffusion range of N cells may affect the critical value p_c; a larger R yields a smaller value of p_c. The structure of clusters is different for different values of R; greater R leads to a different microscopic structure with fewer A and N cells in the final state. Compared with the single-cell RNA-seq data, which imply a low fraction of IFN-positive cells - around 1.7% - the model simulation suggests R=5. The authors also explored a simplified version of the model, the OVA model, with only three states. The OVA model also exhibits a threshold-like outbreak size and shows dynamics similar to the NOVAa model. However, the change in microstructure as a function of the IFN range R observed in the NOVAa model is not observed in the OVA model.

      Author response: We thank the referee for the comprehensive summary of our work.

      Data and model simulation mainly support the conclusions of this paper, but some weaknesses should be considered or clarified.

      Author response: Thank you - we will address these point by point below.

(1) In the automaton model, the authors introduce a parameter p_a, representing the fraction of pre-antiviral state cells. The authors wrote: "The parameter p_a can also be understood as the probability that an O cell will switch to the N or A state when exposed to the virus or IFNs, respectively." Nevertheless, biologically, the fraction of pre-antiviral state cells does not mean the same thing as the probability that an O cell switches to the N or A state. Moreover, in the numerical scheme, the cell state changes according to the deterministic rules N(O)=a and N(a)=A. Hence, the probability p_a did not apply to the model simulation. It may be necessary to clarify the exact meaning of the parameter p_a.

      Author response: We acknowledge that this was an imprecise formulation, and have now changed it.

What we tried to convey with that comment was that, alternatively to having a certain fraction of cells be in the a state initially, one could instead have devised a model in which each O-state cell simply had a probability to act as an a-state cell upon exposure to the virus or to interferons, i.e. to switch to an N state (if exposed to virus) or to the A state (if exposed to interferons). In this simplified model, there would be no functional difference, since it would simply amount to whether each cell had a probability to be designated an a-cell initially (as in our model), or upon exposure. So our remark mainly served to explain that the role of the p_a parameter is simply to encode that a certain fraction of virus-naive cells behave this way (whether predetermined or not).

      (2) The current model is deterministic. However, biologically, considering the probabilistic model may be more realistic. Are the results valid when the probability update strategy is considered? By the probability model, the cells change their state randomly to the state of the neighbor cells. The probability of cell state changes may be relevant for the threshold of p_a. It is interesting to know how the random response of cells may affect the main results and the critical value of p_a.

      Author response: This is a good point - we are firm believers in the importance of stochasticity. We should note that even the current model has a level of stochasticity, since we choose the cells to be updated with a constant probability rate - we choose N cells to update in each timestep, with replacement.

      However, based on your suggestion, we simulated a version of the dynamics which included stochastic conversion, i.e. each action of a cell on a nearby cell happens only with a probability p_conv (and the original model is recovered as the p_conv=1 scenario). Of course, this slows down the dynamics (or effectively rescales time by a factor p_conv), but crucially we find that it does not appreciably affect the location of the threshold p_c. Below we include a parameter scan across p_a values for R=1 and p_conv=0.5, which shows that the threshold continues to appear at around p_a=27%.

      We now discuss these findings in the supplement and include the figure below as Fig. S5.

      Author response image 1.

      (3) Figure 2 shows a critical value p_c = 27.8% following a simulation on a lattice with dimension L = 1000. However, it is unclear if dimension changes may affect the critical value.

      Author response: Re-running the simulations on a lattice 4x as large (i.e. L=2000) yields a similar critical value of 27-28% for R=1, so we are confident that finite size effects do not play a major role at L=1000 and beyond. For R=5, however, we find that a minimum lattice size greater than L=1000 is necessary to determine the critical threshold. Concretely, we find that the threshold value pc for R=5 changes somewhat when the lattice size is increased from 1000 to 2000, but is invariant under a change from 2000 to 3000, so we conclude that L=2000 is sufficient for R=5. The pc value for R=5 cited in the manuscript (~0.4%) was determined from simulations at L=2000.
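The finite-size check described in this response can be illustrated generically. Below is a toy stand-in (plain site percolation with a BFS spread, not the NOVAa model; all names and sizes are our own illustrative choices): one estimates the attack rate at a fixed blocking probability for increasing lattice sizes and checks that the estimate no longer changes.

```python
import random
from collections import deque

def attack_fraction(L, p_block, rng):
    """Fraction of an L x L lattice reached by a BFS 'infection' from the
    center, where each site is independently blocked with probability
    p_block. A stand-in for the attack rate of a spreading model; the
    center site is always kept open so the spread can start."""
    blocked = {(x, y) for x in range(L) for y in range(L)
               if rng.random() < p_block}
    start = (L // 2, L // 2)
    blocked.discard(start)
    seen, queue = {start}, deque([start])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < L and 0 <= ny < L \
                    and (nx, ny) not in seen and (nx, ny) not in blocked:
                seen.add((nx, ny))
                queue.append((nx, ny))
    return len(seen) / (L * L)

def mean_attack(L, p_block, reps, seed=0):
    """Average attack fraction over independent random lattices."""
    rng = random.Random(seed)
    return sum(attack_fraction(L, p_block, rng) for _ in range(reps)) / reps
```

Comparing, say, mean_attack(40, p, reps) with mean_attack(80, p, reps) near a candidate threshold, and checking that the estimate stabilizes as L doubles, is the kind of consistency check the authors describe for L=1000 vs. 2000 vs. 3000.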

      Reviewer #3 (Public Review):

      Summary:

      This study considers how to model distinct host cell states that correspond to different stages of a viral infection: from naïve and susceptible cells to infected cells and a minority of important interferon-secreting cells that are the first line of defense against viral spread. The study first considers the distinct host cell states by analyzing previously published single-cell RNAseq data. Then an agent-based model on a square lattice is used to probe the dependence of the system on various parameters. Finally, a simplified version of the model is explored, and shown to have some similarity with the more complex model, yet lacks the dependence on the interferon range. By exploring these models one gains an intuitive understanding of the system, and the model may be used to generate hypotheses that could be tested experimentally, telling us "when to be surprised" if the biological system deviates from the model predictions.

      Author response: Thank you for the summary! We agree with the role that you describe for a model such as this one.

      Strengths:

      -  Clear presentation of the experimental findings and a clear logical progression from these experimental findings to the modeling.

      -  The modeling results are easy to understand, revealing interesting behavior and percolation-like features.

      -  The scaling results presented span several decades and are therefore compelling. - The results presented suggest several interesting directions for theoretical follow-up work, as well as possible experiments to probe the system (e.g. by stimulating or blocking IFN secretion).

      Weaknesses:

      -  Since the "range" of IFN is an important parameter, it makes sense to consider lattice geometries other than the square lattice, which is somewhat pathological. Perhaps a hexagonal lattice would generalize better.

      -  Tissues are typically three-dimensional, not two-dimensional. (Epithelium is an exception). It would be interesting to see how the modeling translates to the three-dimensional case. Percolation transitions are known to be very sensitive to the dimensionality of the system.

Author response: We agree that probing different lattice geometries (2- and 3-dimensional alike) would be interesting and worthwhile. For this manuscript, however, we prefer to confine the analysis to the current, simple case; an extensive exploration of the role of geometry is an interesting future possibility.

      -  The fixed time-step of the agent-based modeling may introduce biases. I would consider simulating the system with Gillespie dynamics where the reaction rates depend on the ambient system parameters.
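      For context, Gillespie dynamics draws an exponentially distributed waiting time from the current total reaction propensity, so event timing adapts to the system state instead of advancing in fixed steps. Below is a minimal sketch on a toy well-mixed infection/recovery process, not the authors' lattice model; the rates, states, and function name are hypothetical:

```python
import random

def gillespie_sir(beta=0.3, gamma=0.1, S=99, I=1, t_end=100.0, seed=1):
    """Minimal Gillespie SSA for a toy susceptible-infected-recovered process."""
    random.seed(seed)
    t, R = 0.0, 0
    trajectory = [(t, S, I, R)]
    while t < t_end and I > 0:
        n = S + I + R
        a_inf = beta * S * I / n        # propensity of one infection event
        a_rec = gamma * I               # propensity of one recovery event
        a_tot = a_inf + a_rec
        t += random.expovariate(a_tot)  # state-dependent waiting time
        if random.random() * a_tot < a_inf:
            S, I = S - 1, I + 1         # infection: S -> I
        else:
            I, R = I - 1, R + 1         # recovery: I -> R
        trajectory.append((t, S, I, R))
    return trajectory

traj = gillespie_sir()
```

      In a lattice version, the propensities would instead be sums over eligible neighbor pairs, recomputed after every event.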

      -  Single-cell RNAseq data typically involves data imputation due to the high sparsity of the measured gene expression. More information could be provided on this crucial data processing step since it may significantly alter the experimental findings.

      Justification of claims and conclusions:

      The claims and conclusions are well justified.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is necessary to explain what UMAP does. Is clustering done in the space of the twenty-something original dimensions or in 2D? How are UMAP1 and UMAP2 selected, and are those the same in all plots?

      Author response: We have now added a few sentences to clarify the point raised above - the second snippet explains how clustering is performed:

      “As a dimension reduction algorithm, UMAP is a manifold learning technique that favors the preservation of local distances over global distances (McInnes et al., 2018; Becht et al., 2019). It constructs a weighted graph from the data points and optimizes the graph layout in the low-dimensional space.”

      “We cluster the cells with the principal components analysis (PCA) results from their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells then optimize the modularity function to determine clusters. We present the cluster information on the UMAP plane and use the same UMAP coordinates for all the plots in this paper hereafter.”
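      For readers unfamiliar with this style of pipeline, the quoted steps can be sketched on toy data in plain NumPy. This is an illustrative sketch only, not the authors' code (they presumably used a standard single-cell toolkit); the matrix sizes, k, and number of components are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2000))           # toy matrix: 300 cells x 2000 genes

# PCA via SVD of the centered matrix; keep the first 16 components
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = U[:, :16] * S[:16]                   # 300 cells x 16 PC coordinates

# k-nearest neighbors of each cell in PC space
k = 15
d2 = ((pcs[:, None, :] - pcs[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(d2, np.inf)               # exclude self-neighbors
knn = np.argsort(d2, axis=1)[:, :k]        # each cell's k nearest neighbors

# Shared-nearest-neighbor (SNN) edge weight: Jaccard overlap of neighbor sets.
# Clustering then optimizes modularity on this weighted graph (e.g. Louvain),
# a step omitted here.
neighbor_sets = [set(row) for row in knn]
def snn_weight(i, j):
    shared = len(neighbor_sets[i] & neighbor_sets[j])
    return shared / (2 * k - shared)       # Jaccard index of the two sets
```

      The final step, optimizing modularity on the weighted SNN graph, is what assigns cells to clusters; UMAP is then used only to display those cluster labels in 2D.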

      Figure 1, what do bars in the upper right corners of panels d, e, f, and g indicate? “Averaged” refers to time average? Something is missing in “Cell proportions are labeled with corresponding colors in a)”.

      Author response: Thank you - we have now modified the figure caption. The bars in the upper right corners of panels d, e, f are color keys for gene expression: the brighter the color, the higher the gene expression.

      “Averaged” gene expression refers to the mean expression of that particular gene across the cells within each indicated cluster.

      The lines in c) correspond to cell proportions in different states at different time points. The same state in a) and c) is shown in the same color.

      Line 46, “However” does not sound right in this context. Would “Also” be better?

      Author response: We agree and have corrected it in the revised manuscript.

      Line 96: “The viral genes are also partially expressed in these cells, but different from the 𝑁 cluster, the antiviral genes are fully expressed (Fig. S1 and S2).” The sentence needs to be rephrased.

      Author response: We have rephrased the sentence: “As in the N cluster, the viral gene E is barely detected in these cells, indicating incomplete viral replication. However, in contrast to the N cluster, the antiviral genes are expressed to their full extent (Fig. S1 and S2).”

      Line 126, missing “be”; “large” -> “larger”.

      Author response: Thank you, we have now corrected these typos.

      Line 139-140 The logical link between ignoring apoptosis and the diffusion of IFN is unclear.

      Author response: We modified the sentence as “Here, we assume that the secretion of IFNs by the 𝑁 cells is a faster process than possible apoptosis (Wen et al., 1997; Tesfaigzi, 2006) of these cells and that the diffusion of IFNs to the neighborhood is not significantly affected by apoptosis.”

      Fig. 2a Do the yellow arrows show the effect of IFN and the purple arrows the propagation of viral infection?

      Author response: That is correct. We have added this information to the figure caption: “The straight black arrows indicate transitions between cell states. The curved yellow arrows indicate the effects of IFNs on activating antiviral states. The curved purple arrows indicate viral spread to cells with 𝑂 and 𝑎 states.”

      Fig. 3, n(s) as the axis label vs P(s) in the text? How do the curves in panel a) look when p_a is well above or below p_c?

      Author response: Thank you. We have edited the labels in the figure to reflect the symbols used in the text.

      Boundary conditions? From Fig. 4, apparently periodic?

      Author response: Yes, we use periodic boundary conditions in the model. We clarify it in the model section now (last sentence).

      It will be good to see a plot with time dependences of all cell types for a couple of values of p_a, illustrating propagation and cessation of the infection.

      Author response: We agree, and have added a Figure S4 in the supplement which explores exactly that. Thank you for the suggestion.

      A verbal qualitative description of why p_a has such importance and how the infection is terminated for large p_a would help.

      Reviewer #2 (Recommendations For The Authors):

      Below are two minor comments:

      (1) In the single-cell RNA sequencing data analysis, the authors describe the cell clusters O, V, A, and N. However, showing how the clusters are identified from the data might be more straightforward.

      Author response: Technically, we cluster the cells using principal components analysis (PCA) results of their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells and then optimize the modularity function to determine clusters. We manually annotate the clusters with O, V, A, and N based on the detected abundance of viral genes, antiviral genes, and IFNs.

      (2) In Figure 3, what does n(s) mean in Figure 3a? And what is the meaning of the distribution P(s) of infection clusters? It may be stated clearly.

      Author response: The use of n(s) was inconsistent, and we have now edited the figure to instead say P(s), to harmonize it with the text. P(s) is the distribution of cluster sizes, s, expressed as a fraction of the whole system. In other words, once a cluster has reached its final size, we record s=(N+V)/L^2 where N and V are the number of N and V state cells in the cluster (note that, by design, each simulation leads to a single cluster, since we seed the infection in one lattice point). We now indicate more clearly in the caption and the main text what exactly P(s) and s refer to.
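      As a sketch of that bookkeeping (a toy final lattice with a hypothetical state encoding; not the authors' C++ code):

```python
import numpy as np

L = 64
rng = np.random.default_rng(2)
# Toy "final" lattice state; encoding assumed here: 0 = other states,
# 1 = N-state cell, 2 = V-state cell.
lattice = rng.choice([0, 1, 2], size=(L, L), p=[0.8, 0.1, 0.1])

n_N = int((lattice == 1).sum())            # number of N-state cells
n_V = int((lattice == 2).sum())            # number of V-state cells
s = (n_N + n_V) / L**2                     # cluster size as a system fraction
```

      Repeating this over many simulations (each seeded with a single infected lattice point) and histogramming the recorded s values yields P(s).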

      Reviewer #3 (Recommendations For The Authors):

      - Would the authors kindly share the simulation code with the community? Also, the data analysis code should be shared to follow current best practices. This needs to be standard practice in all publications. I would go as far as to say that in 2024 publishing a data analysis / simulation study without sharing the relevant code should be ostracized by the community.

      Author response: We absolutely agree and have created a GitHub repository in which we share the C++ source code for the simulations and a Python notebook for plotting. The public repository can be found at https://github.com/BjarkeFN/ViralPercolation. We have added this information to the supplement under the section “Code availability”.

      - I would avoid the use of the wording "critical" threshold since this is almost guaranteed to infuriate a certain type of reader.

      - Line 265 has a curious use of " ... " which should be replaced with something more appropriate.

      Author response: Thank you for pointing these out! We have corrected them.

  3. Local file
    1. RUN grant through JeffCo = “Tri (Jefferson) County Workforce Board”?

      Yes and Yes -- This indicates a client that received RUN services through the Tri (Jefferson) County Workforce Board. Several of the regional workforce boards had both a RUN grant and a WIG grant (slightly different). Given these frequencies, I think the best we can do is explore the data by Coaching Collaborative and Trade Association Training. However, before I ask you to do that, could you please run the frequencies by zip code? The final question on the survey is the participants' zip code. I'd like to see if there is value in looking at regional differences, but before we do that, let's see if there's enough variability to make that worth it. Thank you!

    1. In the educational context, students from different parts of the world can interact with one another through oral, graphic, and gestural language, since avatars make it possible to express gestures and emotions

      I enjoy interacting on Engage, a professional platform where the avatars have a dress code between casual and executive. In this environment I take part in interactive training sessions, with one-on-one exercises and workshops, so immersive that I forget I am at home. We have the possibility of being with people from several countries, at no cost.

      For work, I use the Virtual Speech platform to train for ambush interviews and for on-stage delivery to audiences, in short, in the areas of public speaking. The results are impressive: after a training session outside the metaverse, our clients can practice in the metaverse in an immersive role play, as if they were in the real situation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements

      We thank all three reviewers for their time and care in reviewing our manuscript, in particular Reviewer 3 for providing a detailed critique that was very useful for planning revisions. We are grateful that all three reviewers indicate that the new genome resources presented in this work are of high quality and address an existing knowledge gap. We are also grateful for the general assessments that the manuscript is 'well-written', and the analyses 'well performed' and 'thorough'.

      We acknowledge Reviewer 3’s legitimate criticism that the assembly and annotation data is not already publicly available and would like to assure the reviewing team that we have been pressing NCBI to progress the submission status since before the preprint was submitted. We regret the delay but hope that we can resolve this issue promptly. Furthermore, as some additional fields in the REAT genome annotation are lost during the NCBI submission process, we will ensure that comprehensive annotation files are also added to Zenodo.

      Reviewer 3 also made the general comment that 'the manuscript could greatly benefit from merging the result and discussion sections' and we would naturally be happy to make this adjustment if the journal in question uses that format.

      Description of the planned revisions

      • We will follow suggestions by Reviewer 3 to improve clarity of two figures:

      Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

      Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

      • We will address the choice of LOESS versus linear regression for investigating the relationship between candidate secreted effector protein (CSEP) density and transposable element (TE) density, as queried by Reviewer 3:

      Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

      We agree with the reviewer, that as we did not find an association with the more sensitive LOESS, we expect that linear regression would also not find an association, supporting our current conclusions. We will add this negative result into the text.

      • We will check for other features associated with the distribution of CSEPs, as queried by Reviewer 3:

      Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

      • We will integrate TE variation into the PERMANOVA lifestyle testing, as suggested by Reviewer 3:

      Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

      In reviewing this suggestion, we also spotted an error in our data plotting code, and the PERMANOVA lifestyle result for all genes will be corrected from 17% to 15% in Fig. 4a. Correcting this error does not impact our ultimate results or interpretation.

      • To complement the current graphical-based assessment of approximate data normality, we will include additional tests (Shapiro-Wilk for sample sizes

      Line 743: Q-Q plots are not a formal statistical test for normality.

      • One of the main critiques from Reviewer 3 was that, although we already acknowledged low sample sizes being a limitation of this work, the manuscript could benefit from reframing with greater consideration of this factor. They also highlighted a few specific places in the text that could be rephrased in consideration of this:

      Line 267: "Multiple strains" can be misleading about the magnitude.

      Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

      Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

      Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

      We intend to edit the text to address this, including removal of both text and figure references to ‘soft-core’ genes, as we agree the term is likely not meaningful in this case, and removing it has no bearing on the results or interpretation.

      Description of the revisions that have already been incorporated in the transferred manuscript

      • We have amended the text in a number of places for clarity/fluency as suggested by Reviewer 3:

      ii) There need to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

      Please see new lines 477-478: ‘Regarding differences between pathogenic Gt and non-pathogenic Gh, we found that Gh has a larger overall genome size and greater number of genes.’

      Lines 309-314: The message seems a bit out of context in the paragraph.

      This is valid, these lines have now been removed.

      Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

      Our intention here was the exact reverse, that crop pathogens are under pressure to favour homothallism (as Reviewer 3 points out, anecdotally this often seems to play out in nature). We have rephrased lines 386-390 to hopefully make our stance more explicit: 'Together, this could suggest a selective pressure towards homothallism for crop fungal pathogens, and a switch from heterothallism in Gh to homothallism in Gt and Ga may, therefore, have been a key innovation underlying lifestyle divergence between non-pathogenic Gh and pathogenic Gt and Ga.'

      Lines 463-464: Please refer to the analyses when discussing the genetic divergence.

      We have rephrased this sentence to make our intended point clearer, please see new lines 459-461: ‘If we compare Ga and Gt in terms of synteny, genome size and gene content, the magnitude of differences does not appear to be more pronounced than those between GtA and GtB.’

      • We have also fixed the following typographic errors highlighted by Reviewer 3:

      Line 399: You mean, Fig 4C?

      Line 722: You missed "trimAI"

      Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

      • We have added genome-wide RIP estimates to Supplementary Table S1 as requested by Reviewer 3:

      Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

      • We have added a note clarifying that differences in overall genome size between lineages are not fully explained by differences in gene copy-number (lines 406-408: 'We should note that the total length of HCN genes was not sufficiently large to account for the overall greater genome size of GtB compared to GtA (Supplemental Table S1).') in response to a comment from Reviewer 3:

      Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

      • We have made an alteration to the author order and added equal second-author contributions.

      Description of analyses that authors prefer not to carry out

      • In response to our analysis regarding the absence of TE-effector compartmentalisation in this system, Reviewer 1 requested additional analyses:

      While TE enrichment is typically associated with accessory compartments, it is not a defining feature. To bolster the authors' claim, it is essential to demonstrate that there is no bias in the ratio of conserved and non-conserved genes across the genomes.

      We believe that there are two slightly different compartmentalisation concepts being somewhat conflated here – (1) the idea of compartments where TEs and virulence proteins such as effectors are significantly colocalised in comparison with the rest of the genome, and (2) the idea of compartments containing gene content that is not shared in all strains (i.e. accessory). The two may overlap – as Reviewer 2 states, accessory compartments may also be enriched with TEs – but not necessarily. We specifically address the first concept in our text, and we appreciate Reviewer 3’s response on this subject:

      There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

      We believe that the second concept of accessory compartments is shown to be irrelevant in this case from our GENESPACE results (see Fig. 2), which demonstrate that gene content is conserved, broadly syntenic even, across strains, with no clear evidence of accessory compartments or chromosomes regarding gene content. We have already acknowledged that other mechanisms of compartmentalisation beyond TE-effector colocalisation may be at play (as seen from our exploration of effector distributions biased towards telomeres, see section from line 156: ‘Although CSEPs were not broadly colocalised with TEs, we did observe that they appeared to be non-randomly distributed in some pseudochromosomes (Fig. 3a)…’).

      • Reviewer 1 questioned the statement that higher level of genome-wide RIP is consistent with lower levels of gene duplication:

      L422: Is the highest RIP rate in GtA consistent with its low levels of gene duplication? Does this suggest that duplicated sequences in GtA are no longer recognizable due to RIP mutations? This seems counterintuitive, as RIP is primarily triggered by gene duplication.

      Our understanding is that, while RIP can directly mutate coding regions, it predominantly acts on duplicated sequences within repetitive regions such as TEs (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02060-w), which has a knock-on effect of reducing TE-mediated gene duplication. In Neurospora crassa, where RIP was first discovered and thus the model species for much of our understanding of the process, a low number of gene duplicates has been linked to the activity of RIP (https://www.nature.com/articles/nature01554). We therefore believe the current text is reasonable.

      • Reviewer 2 stated that experimental validation of gene function is required to make clear links to lifestyle or pathogenicity:

      In my eyes, the study has two main limitations. First of all, the research only concerns genomics analyses, and therefore is rather descriptive and observational, and as such does not provide further mechanistic details into the pathogen biology and/or into pathogenesis. This is further enhanced by the lack of clear observations that discriminate particular species/lineages or life styles from others in the study. Some observations are made with respect to variations in candidate secreted effector proteins and biosynthetic gene clusters, but clear links to life style or pathogenicity are missing. To further substantiate such links, lab-based experimental work would be required.

      We agree that in an ideal world, supporting wet-lab experimental evidence of gene function would be included. Unfortunately, transformation has not yet been successfully developed in this system (see lines 33-35: ‘There have also been considerable difficulties in producing a reliable transformation system for Gt, preventing gene disruption experiments to elucidate function (Freeman and Ward 2004).’), and not for lack of trying: after 18 months of effort using all available transformation techniques and selectable markers, neither Gt nor Gh was transformable. Undertaking that challenge has proven to be far beyond the scope of this paper, the purpose of which was to generate and analyse high-quality genomic data, a major task in itself. We again appreciate Reviewer 3’s response to this point, agreeing that it is out of scope for this work:

      I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

      As per the suggestion by Reviewer 3, we will add text to address the absence of in-depth experimental work within the scope of this study.

      • Reviewer 3 suggested we might 'consider including formal population differentiation estimators', however, as they previously highlighted above, our sample sizes are too small to produce reliable population-level statistics.

      • Reviewer 3 raised the disparity in the appearance of branches at the root of phylogenetic trees in various figures:

      Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

      All the trees were rooted with M. poae as the outgroup, and although it may seem counterintuitive, a trifurcation at the root is the correct outcome when rerooting a bifurcating tree; please see this discussion involving the developers of both leading phylogeny visualisation tools, ggtree and phytools (https://www.biostars.org/p/332030/). Although it is possible to force a bifurcating tree after rooting by positioning the root along an edge, the resulting branch lengths can be misleading, and so in cases where we wanted to include meaningful branch lengths in the figure (i.e. estimated from DNA substitution rates, in Figures 4a, S5 and S13) we have not circumvented the trifurcation. In Fig. S17, meaningful branch lengths are not included and the tree only represents the topology, resulting in the appearance of a bifurcation at the root.

      • Reviewer 3 suggested that the discussion on giant Starship TEs resembled more of a review:

      Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

      The reviewer has a valid point. However, Starships are a recently discovered and thus underexplored genetic feature that readers from the wider mycology/plant pathology community may not yet be aware of. We believe it is warranted to include some additional exposition to give context for why their discovery here is novel, interesting and unexpected. We are naturally keen to investigate the make-up of the elements we have found in this lineage; however, that will require a substantial amount of further work. Analysis of Starships is not trivial; for example, the starfish tool is still under development and a limited number of species have been used to train it. How best to compare elements is also an active area of investigation – they are dynamic in their structure and may include genes originating from the host genome or a previous host – and for this reason we believe it is out of scope to interrogate them alongside the other foundational genomic data presented in this paper.

    1. g

      Here, g is the Gini index, not to be confused with the label g that tells the code which measures to use in the bootstrap.

    1. not only pause your code, it pauses the full machine, full-stop down to the kernel.

      pause the machine

    1. you've used a metaphor in the past of thinking of genes not as a as a code as you said but as a kind of musical score

      for - metaphor - genes - musical scores - Denis Noble

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      We thank the referees for their careful reading of the manuscript and their valuable suggestions for improvements.

      General Statements:

      Existing SMC-based loop extrusion models successfully predict and characterize mesoscale genome spatial organization in vertebrate organisms, providing a valuable computational tool to the genome organization and chromatin biology fields. However, to date, this approach has been of limited applicability beyond vertebrate organisms. This limitation arises because existing models require knowledge of CTCF binding sites, which act as effective boundary elements, blocking loop-extruding SMC complexes and thus defining TAD boundaries. However, CTCF is the predominant boundary element only in vertebrates. On the other hand, vertebrates comprise only a small proportion of the species in the tree of life, while TADs are nearly universal and SMC complexes are largely conserved. Thus, there is a pressing need for loop extrusion models capable of predicting Hi-C maps in organisms beyond vertebrates.

      The conserved-current loop extrusion (CCLE) model, introduced in this manuscript, extends the quantitative application of loop extrusion models in principle to any organism by liberating the model from the lack of knowledge regarding the identities and functions of specific boundary elements. By converting the genomic distribution of loop extruding cohesin into an ensemble of dynamic loop configurations via a physics-based approach, CCLE outputs three-dimensional (3D) chromatin spatial configurations that can be manifested in simulated Hi-C maps. We demonstrate that CCLE-generated maps well describe experimental Hi-C data at the TAD-scale. Importantly, CCLE achieves high accuracy by considering cohesin-dependent loop extrusion alone, consequently both validating the loop extrusion model in general (as opposed to diffusion-capture-like models proposed as alternatives to loop extrusion) and providing evidence that cohesin-dependent loop extrusion plays a dominant role in shaping chromatin organization beyond vertebrates.

      The success of CCLE unambiguously demonstrates that knowledge of the cohesin distribution is sufficient to reconstruct TAD-scale 3D chromatin organization. Further, CCLE signifies a shifted paradigm from the concept of localized, well-defined boundary elements, manifested in the existing CTCF-based loop extrusion models, to a concept also encompassing a continuous distribution of position-dependent loop extrusion rates. This new paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript presents a mathematical model for loop extrusion called the conserved-current loop extrusion model (CCLE). The model uses cohesin ChIP-Seq data to predict the Hi-C map and shows broad agreement between experimental Hi-C maps and simulated Hi-C maps. They test the model on Hi-C data from interphase fission yeast and meiotic budding yeast. The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, which they also propose extends to other organism/situations where Ctcf is absent.

      __Response:__

      We would like to point out that the referee's interpretation of our results, namely that "The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, ...", is an oversimplification that we do not subscribe to. The referee's interpretation of our model is correct when there are strong, localized barriers to loop extrusion; however, the CCLE model allows for loop extrusion rates that are position-dependent and take on a range of values. The CCLE model also allows the loop extrusion model to be applied to organisms without known boundary elements. Thus, the strict interpretation of the positions of cohesin peaks as loop boundaries overlooks a key idea to emerge from the CCLE model.

      __Major comments:__

      1. More recent micro-C/Hi-C maps, particularly for budding yeast mitotic cells and meiotic cells, show clear puncta, representative of anchored loops, which are not well recapitulated in the simulated data from this study. However, such puncta are cohesin-dependent, as they disappear in the absence of cohesin and are enhanced in the absence of the cohesin release factor, Wapl. For example, see the two studies below. The model is therefore missing some key elements of the loop organisation. How do the authors explain this discrepancy? It would also be very useful to test whether the model can predict the increased strength of loop anchors when Wapl is removed and cohesin levels increase.

      Costantino L, Hsieh TS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Elife. 2020 Nov 10;9:e59889. doi: 10.7554/eLife.59889. PMID: 33170773; PMCID: PMC7655110. Barton RE, Massari LF, Robertson D, Marston AL. Eco1-dependent cohesin acetylation anchors chromatin loops and cohesion to define functional meiotic chromosome domains. Elife. 2022 Feb 1;11:e74447. doi: 10.7554/eLife.74447. Epub ahead of print. PMID: 35103590; PMCID: PMC8856730.

      __Response:__

      We are perplexed by this referee comment. While we agree that puncta representing loop anchors are a feature of Hi-C maps, as noted by the referee, we would reinforce that our CCLE simulations of meiotic budding yeast (Figs. 5A and 5B of the original manuscript) demonstrate an overall excellent description of the experimental meiotic budding yeast Hi-C map, including puncta arising from loop anchors. This CCLE model-experiment agreement for meiotic budding yeast is described and discussed in detail in the original manuscript and the revised manuscript (lines 336-401).

      To further emphasize and extend this point, we now also address the Hi-C map of mitotic budding yeast, which was not included in the original manuscript. We have added an entire new section to the revised manuscript, entitled "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae", including the new Figure 6, which presents a comparison between a portion of the mitotic budding yeast Hi-C map from Costantino et al. and the corresponding CCLE simulation at 500 bp resolution. In this case too, the CCLE model describes the data well, including the puncta, further addressing the referee's concern that the CCLE model is missing key elements of loop organization.

      Concerning the referee's specific comment about the role of Wapl, we note that applying CCLE in the absence of Wapl would require the corresponding cohesin ChIP-seq data from Wapl-depleted cells. To our knowledge, such data are not currently available, and we therefore have not pursued this explicitly. However, we would reinforce that, because Wapl is a factor that promotes cohesin unloading, its role is already effectively represented in the optimized value of the LEF processivity, which encompasses the LEF lifetime. In other words, if Wapl has a substantial effect, it is already captured in this model parameter.

      2. Related to the point above, the simulated data has much higher resolution than the experimental data (1 kb vs 10 kb in the fission yeast dataset). Given that loop size is in the 20-30 kb range, good resolution is important to see the structural features of the chromosomes. Can the model capture details that are averaged out at coarser resolution?

      __Response: __

      We agree with the referee that higher resolution is preferable to lower resolution. In practice, however, there is a trade-off between resolution and noise. The first experimental interphase fission yeast Hi-C data, from Mizuguchi et al. (2014), correspond to 10 kb resolution. To compare our CCLE simulations to these published experimental data, as described in the original manuscript, we bin our 1-kb-resolution simulations to match the 10 kb experimental measurements. Nevertheless, CCLE can readily predict the interphase fission yeast Hi-C map at higher resolution by reducing the bin size (or, if necessary, reducing the lattice site size of the simulations themselves). In the revised manuscript, we have added comparisons between CCLE's predicted Hi-C maps and newer Micro-C data for S. pombe from Hsieh et al. (Ref. [50]) in the new Supplementary Figures 5-9. We have chosen to present these comparisons at 2 kb resolution, which is the same resolution as our meiotic budding yeast comparisons. Also included in Supplementary Figures 5-9 are comparisons between the original Hi-C maps of Mizuguchi et al. and the newer maps of Hsieh et al., binned to 10 kb resolution. Inspection of these figures shows that CCLE provides a good description of Hsieh et al.'s experimental Hi-C maps, and does not reveal any major new features in the interphase fission yeast Hi-C map on the 10-100 kb scale that were not already apparent from the Hi-C maps of Mizuguchi et al. (2014). Thus, the CCLE model performs well across this range of effective resolutions.
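
      For concreteness, binning a fine-resolution simulated contact map onto the coarser experimental grid amounts to block-summing the matrix. A minimal sketch of this procedure (our own illustration, using a toy matrix rather than actual Hi-C data):

```python
import numpy as np

def bin_contact_map(m, factor):
    """Coarse-grain a square contact matrix by summing counts in
    factor x factor blocks (e.g. 1 kb -> 10 kb with factor=10).
    Trailing bins that do not fill a complete block are discarded."""
    n = (m.shape[0] // factor) * factor
    m = m[:n, :n]
    return m.reshape(n // factor, factor, n // factor, factor).sum(axis=(1, 3))

# Toy example: a 6x6 map binned 3-fold gives a 2x2 map whose total
# counts equal those of the original map.
toy = np.arange(36, dtype=float).reshape(6, 6)
coarse = bin_contact_map(toy, 3)
print(coarse.shape)  # (2, 2)
```

      Because binning only sums counts, it preserves the total contact number while trading genomic resolution for reduced counting noise per bin.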

      3. Transcription, particularly convergent transcription, has been proposed to confer boundaries to loop extrusion. Can the authors recapitulate this in their model?

      __Response: __

      In response to the suggestion of the reviewer we have now calculated the correlation between cohesin ChIP-seq and the locations of convergent gene pairs, which is now presented in Supplementary Figures 17 and 18. Accordingly, in the revised manuscript, we have added the following text to the Discussion (lines 482-498):

      "In vertebrates, CTCF defines the locations of most TAD boundaries. It is interesting to ask what might play that role in interphase S. pombe as well as in meiotic and mitotic S. cerevisiae. A number of papers have suggested that convergent gene pairs are correlated with cohesin ChIP-seq in both S. pombe [65, 66] and S. cerevisiae [66-71]. Because CCLE ties TADs to cohesin ChIP-seq, a strong correlation between cohesin ChIP-seq and convergent gene pairs would be an important clue to the mechanism of TAD formation in yeasts. To investigate this correlation, we introduce a convergent-gene variable that has a nonzero value between convergent genes and an integrated weight of unity for each convergent gene pair. Supplementary Figure 17A shows the convergent gene variable, so-defined, alongside the corresponding cohesin ChIP-seq for meiotic and mitotic S. cerevisiae. It is apparent from this figure that a peak in the ChIP-seq data is accompanied by a non-zero value of the convergent-gene variable in about 80% of cases, suggesting that chromatin looping in meiotic and mitotic S. cerevisiae may indeed be tied to convergent genes. Conversely, about 50% of convergent genes match peaks in cohesin ChIP-seq. The cross-correlation between the convergent-gene variable and the ChIP-seq of meiotic and mitotic S. cerevisiae is quantified in Supplementary Figures 17B and C. By contrast, in interphase S. pombe, cross-correlation between convergent genes and cohesin ChIP-seq in each of five considered regions is unobservably small (Supplementary Figure 18A), suggesting that convergent genes per se do not have a role in defining TAD boundaries in interphase S. pombe."
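
      The construction described in the quoted passage, namely a convergent-gene variable with an integrated weight of unity per convergent pair, cross-correlated with cohesin ChIP-seq, can be sketched as follows. The gene coordinates, bin size, and function names here are hypothetical illustrations, not the data or code of the manuscript:

```python
import numpy as np

def convergent_gene_variable(genes, n_bins):
    """genes: position-sorted list of (start_bin, end_bin, strand).
    Mark the gap between each convergent ('+' then '-') pair so that
    the pair's integrated weight is unity."""
    v = np.zeros(n_bins)
    for (s1, e1, st1), (s2, e2, st2) in zip(genes, genes[1:]):
        if st1 == '+' and st2 == '-':          # convergent pair
            lo, hi = e1, max(s2, e1 + 1)       # intergenic gap (>= 1 bin)
            v[lo:hi] += 1.0 / (hi - lo)        # total weight 1 per pair
    return v

def cross_corr(a, b):
    """Pearson cross-correlation at zero genomic lag."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

# Toy genome of 50 bins with one convergent pair and one tandem pair.
genes = [(0, 10, '+'), (15, 25, '-'), (30, 40, '-')]
v = convergent_gene_variable(genes, 50)
print(v.sum(), cross_corr(v, v))  # both approximately 1.0
```

      A real analysis would replace the toy gene list with annotated gene coordinates and `b` with the binned cohesin ChIP-seq track, and would evaluate the cross-correlation over a range of genomic lags rather than only at zero lag.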

      __Minor comments:__

      1. In the discussion, the authors cite the fact that Mis4 binding sites do not give a good prediction of the Hi-C maps as evidence that Mis4 is not important for loop extrusion. This can only be true if the position of Mis4 measured by ChIP is a true reflection of Mis4's position. However, Mis4 binding to cohesin/chromatin is very dynamic, and this is likely too short a time scale to be efficiently cross-linked for ChIP. Conversely, extensive experimental data in vivo and in vitro suggest that stimulation of cohesin's ATPase by Mis4-Ssl3 is important for loop extrusion activity.

      __Response: __

      We apologize for the confusion on this point. We intended to convey that the absence of Mis4-Psc3 correlations in S. pombe suggests, from the point of view of CCLE, that Mis4 is not an integral component of loop-extruding cohesin during the loop extrusion process itself. We agree completely that Mis4/Ssl3 is surely important for cohesin loading, and, given that cohesin is required for loop extrusion, Mis4/Ssl3 is therefore important for loop extrusion. Evidently, this part of our Discussion lacked sufficient clarity. In response to both referees' comments, we have re-written the discussion of Mis4 and Pds5 to explain our reasoning more carefully and to be more circumspect in our inferences. The re-written discussion is described below in response to Referee #2's comments.

      Nevertheless, on the topic of whether Nipbl-cohesin binding is too transient to be detected in ChIP-seq, the FRAP analysis presented by Rhodes et al. eLife 6:e30000 "Scc2/Nipbl hops between chromosomal cohesin rings after loading" indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds. As shown in the bottom panel of Supplementary Fig. 7 in the original manuscript (and the bottom panel of Supplementary Fig. 20 in the revised manuscript), there is a significant cross-correlation (~0.2) between the Nipbl ChIP-seq and Smc1 ChIP-seq in humans, indicating that a transient association between Nipbl and cohesin can be (and in fact is) detected by ChIP-seq.

      2. *Inclusion of a comparison of this model to previous models (for example, bottom-up models) would be extremely useful. What is the improvement of this model over existing models?*

      __Response: __

      As stated in the original manuscript, as far as we are aware, "bottom-up" models that quantitatively describe the Hi-C maps of interphase fission yeast or meiotic budding yeast, or indeed of eukaryotes other than vertebrates, do not exist. Bottom-up models would require knowledge of the relevant boundary elements (e.g. CTCF sites), which, as stated in the submitted manuscript, are generally unknown for fission yeast, budding yeast, and other non-vertebrate eukaryotes. The absence of such models is the reason that CCLE fills an important need. Since bottom-up models of cohesin loop extrusion in yeast do not exist, we cannot compare CCLE to the results of such models.

      In the revised manuscript, we now explicitly compare the CCLE model to the only bottom-up-type model describing the Hi-C maps of non-vertebrate eukaryotes, that of Schalbetter et al. (Nat. Commun. 10:4795, 2019), which we cited extensively in our original manuscript. Schalbetter et al. use cohesin ChIP-seq peaks to define the positions of loop extrusion barriers in meiotic S. cerevisiae, for which the relevant boundary elements are unknown. In their model, specifically, when a loop-extruding cohesin anchor encounters such a barrier, it either passes through with a certain probability, as if no barrier were present, or stops extruding completely until the cohesin unbinds and rebinds.
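
      As we understand Schalbetter et al.'s rule, a barrier either lets an extruding anchor pass with a fixed probability or stalls it until the cohesin unloads. A minimal sketch of that rule (our own illustration with hypothetical function names, not Ref. [49]'s code):

```python
import random

def step_anchor(pos, direction, barriers, stalled, p_pass=0.05, rng=random):
    """Advance one LEF anchor by one lattice site under the explicit
    barrier rule: stepping onto a barrier site succeeds with probability
    p_pass; otherwise the anchor stalls there until the LEF unloads.
    Returns (new_position, stalled_flag)."""
    if stalled:
        return pos, True                  # a stalled anchor stays put
    nxt = pos + direction
    if nxt in barriers and rng.random() >= p_pass:
        return pos, True                  # blocked at the barrier
    return nxt, False                     # free step (or pass-through)

# Deterministic demo with a stubbed RNG instead of random draws.
class Always:
    def __init__(self, x): self.x = x
    def random(self): return self.x

print(step_anchor(4, +1, {5}, False, rng=Always(0.99)))  # (4, True): stalls
print(step_anchor(4, +1, {5}, False, rng=Always(0.0)))   # (5, False): passes
```

      In a full Gillespie simulation this step rule would be applied to both anchors of every bound LEF, with unbinding events clearing the stalled flag.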

      In the revised manuscript, we refer to this model as the "explicit barrier" model and have applied it to interphase S. pombe, using cohesin ChIP-seq peaks to define the positions of loop extrusion barriers. The corresponding simulated Hi-C map is presented in Supplementary Fig. 19, in comparison with the experimental Hi-C map. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe than does the CCLE model, as indicated by the MPR and Pearson correlation scores. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in several cases, including interphase S. pombe, where the Hi-C data lack punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features of Hi-C data than strictly localized loop extrusion barriers.

      We have also added the following paragraph in the Discussion section of the manuscript to elaborate this point (lines 499-521):

      "Although 'bottom-up' models which incorporate explicit boundary elements do not exist for non-vertebrate eukaryotes, one may wonder how well such LEF models, if properly modified and applied, would perform in describing Hi-C maps with diverse features. To this end, we examined the performance of the model described in Ref. [49] in describing the Hi-C map of interphase S. pombe. Reference [49] uses cohesin ChIP-seq peaks in meiotic S. cerevisiae to define the positions of loop extrusion barriers, which either completely stall an encountering LEF anchor with a certain probability or let it pass. We apply this 'explicit barrier' model to interphase S. pombe, using its cohesin ChIP-seq peaks to define the positions of loop extrusion barriers, and using Ref. [49]'s best-fit value of 0.05 for the pass-through probability. Supplementary Figure 19A presents the corresponding simulated Hi-C map for the 0.3-1.3 Mb region of Chr 2 of interphase S. pombe in comparison with the corresponding Hi-C data. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and Pearson correlation scores of 1.6489 and 0.2267, respectively. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in cases such as interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion.
By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers."

      Reviewer #1 (Significance (Required)):

      This simple model is useful to confirm that cohesin positions dictate the position of loops, which was predicted already and proposed in many studies. However, it should be considered a starting point as it does not faithfully predict all the features of chromatin organisation, particularly at better resolution.

      Response:

      As described in more detail above, we do not agree with the assertion of the referee that the CCLE model "does not faithfully predict all the features of chromatin organization, particularly at better resolution" and provide additional new data to support the conclusion that the CCLE model provides a much needed approach to model non-vertebrate contact maps and outperforms the single prior attempt to predict budding yeast Hi-C data using information from cohesin ChIP-seq.

      *It will mostly be of interest to those in the chromosome organisation field, working in organisms or systems that do not have CTCF.*

      __Response: __

      We agree that this work will be of special interest to researchers working on the chromatin organization of non-vertebrate organisms. We would reinforce that yeasts are frequently used models for the study of cohesin, condensin, and chromatin folding more generally. Indeed, in the last two months alone, two Molecular Cell papers, one Nature Genetics paper, and one Cell Reports paper have appeared in which loop extrusion in yeast models is directly relevant. We also believe, however, that the model will be of interest to the field in general, as it simultaneously encompasses various scenarios that may lead to slowing or stalling of LEFs.

      This reviewer is a cell biologist working in the chromosome organisation field, but does not have modelling experience and therefore does not have the expertise to determine if the modelling part is mathematically sound and has assumed that it is.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      __Response: __

      We agree with the referee's statement that loop extrusion "is widely accepted, even if not universally so". We disagree with the referee that this state of affairs means that "the need to demonstrate this (i.e. loop extrusion) is questionable". On the contrary, studies such as ours, which provide further compelling evidence that cohesin-based loop extrusion is the primary organizer of chromatin, must surely be welcomed: first, to persuade those who remain unconvinced by the loop extrusion mechanism in general; and second, because, until the present work, quantitative models of loop extrusion capable of reproducing Hi-C maps in yeasts and other non-vertebrate eukaryotes have been lacking, leaving open the question of whether loop extrusion can describe Hi-C maps beyond vertebrates. CCLE has now answered that question in the affirmative. Moreover, the existence of a robust model to predict contact maps in non-vertebrate models, which are extensively used in the pursuit of research questions in chromatin biology, will be broadly enabling to the field.

      It is a fundamental principle that, if a simple, physically plausible model or hypothesis describes experimental data quantitatively, it is appropriate to ascribe considerable weight to that model (until additional data become available to refute it).

      How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      Response:

      As noted above and in the original manuscript, we are unaware of previous quantitative modeling of cohesin-based loop extrusion and the resultant Hi-C maps in organisms that lack CTCF, namely non-vertebrate eukaryotic models such as fission yeast or budding yeast, as we study here. As noted in the original manuscript, previous quantitative modeling of Hi-C maps based on cohesin loop extrusion with CTCF boundary elements has convincingly shown that loop extrusion is indeed relevant in vertebrates, but the restriction to vertebrates excludes most of the tree of life.

      Below, the referee cites two examples of loop extrusion outside of vertebrates. The one that is suggested to correspond to yeast cells (Dequeker et al. Nature 606:197 2022) actually corresponds to mouse cells, which are vertebrate cells. The other one models the Hi-C map of the prokaryote, Bacillus subtilis, based on loop extrusion of the bacterial SMC complex thought to most resemble condensin (not cohesin), subject to barriers to loop extrusion that are related to genes or involving prokaryote-specific Par proteins (Brandao et al. PNAS 116:20489 2019). We have referenced this work in the revised manuscript but would reinforce that it lacks utility in predicting the contact maps for non-vertebrate eukaryotes.

      Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species.

      Response:

      In the revised manuscript, we have replaced "suggesting that the underlying mechanism that governs loop extrusion by cohesin is identical in both species" with "suggesting loop-extruding cohesins possess similar properties in both species" (lines 367-368).

      As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?

      Response:

      A hypothetical mechanism that generates the same instantaneous loop distributions and correlations as loop extrusion would lead to the same Hi-C map as does loop extrusion. This circumstance is not confined to CCLE, but is equally applicable to previous CTCF-based loop extrusion models. It holds because Hi-C and ChIP-seq, and therefore models that seek to describe these measurements, provide a snapshot of the chromatin configuration at one instant of time.

      We would reinforce that there is no physical basis for a diffusion capture model with an approximately-exponential loop size distribution. Nevertheless, one can reasonably ask whether a physically-sensible diffusion capture model can simultaneously match cohesin ChIP-seq and Hi-C. Motivated by the referee's comment, we have addressed this question and, accordingly, in the revised manuscript, we have added (1) an entire subsection entitled "Diffusion capture does not reproduce experimental interphase S. pombe Hi-C maps" (lines 303-335) and (2) Supplementary Figure 15. As we now demonstrate, the CCLE model vastly outperforms an equilibrium binding model in reproducing the experimental Hi-C maps and measured P(s).

      2. *I do not understand how the loop extrusion residence time drops out. As I understand it, Eq. 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose \rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify?*

      __Response: __

      As noted above, Hi-C and ChIP-seq both capture chromatin configuration at one instant in time. Therefore, such measurements cannot and do not provide any time-scale information, such as the loop extrusion residence time (LEF lifetime) or the mean loop extrusion rate. For this reason, neither our CCLE simulations, nor other researchers' previous simulations of loop extrusion in vertebrates with CTCF boundary elements, provide any time-scale information, because the experiments they seek to describe do not contain time-scale information. The Hi-C map simulations can and do provide information concerning the loop size, which is the product of the loop lifetime and the loop extrusion rate. Lines 304-305 of the revised manuscript include the text: "Because Hi-C and ChIP-seq both characterize chromatin configuration at a single instant of time, and do not provide any direct time-scale information, ..."

      In practice, we set the LEF lifetime to some explicit value in arbitrary time units. We have added a sentence in the Methods that reads, "In practice, however, we set the LEF dissociation rate to 5e-4 time-unit^-1 (equivalent to a lifetime of 2000 time-units), and the nominal LEF extrusion rate (i.e., \rho L/\tau; see Supplementary Methods) can be determined from the given processivity" (lines 599-602), to clarify this point. We have also changed the terminology from "timesteps" to "LEF events" in the manuscript, as the latter is more accurate for our purpose.
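
      To make the arbitrariness of the time unit concrete, the following back-of-envelope sketch shows how fixing the dissociation rate determines the nominal extrusion rate from a given processivity. The processivity value here is hypothetical, and the manuscript's exact definition of processivity may carry a model-dependent prefactor (e.g. a factor of 2 for two-sided extrusion):

```python
# Our own illustration, not the manuscript's code: once the LEF
# dissociation rate sets the time unit, the fitted processivity
# (a length) fixes the nominal extrusion rate (length per time-unit).

k_off = 5e-4                   # LEF dissociation rate, per time-unit
lifetime = 1.0 / k_off         # mean bound lifetime: 2000 time-units
processivity_kb = 100.0        # hypothetical fitted processivity, in kb
rate_kb_per_unit = processivity_kb / lifetime  # nominal extrusion rate

print(lifetime, rate_kb_per_unit)  # 2000.0 0.05
```

      Rescaling the time unit rescales `k_off` and the rate together while leaving the processivity, and hence the simulated Hi-C map, unchanged, which is why no absolute time scale can be extracted from Hi-C or ChIP-seq data alone.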

      3. The assumptions in the solution and application of the CCLE model potentially constrain it to a limited number of scenarios. In particular, the authors specify that the current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.), where the current might be dominated by A_n and V_n; unloading sites (D_n and V_{n-1}); or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.

      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.

      __Response: __

      Our response to Referee 2's Comments 3 and 4 is that, both in the original manuscript and in the revised manuscript, we clearly delineate the assumptions underlying CCLE, and we carefully assess the extent to which these assumptions are violated (lines 123-126 and 263-279 in the revised manuscript). For example, Supplementary Figure 12 shows that, across the S. pombe genome as a whole, violations of the CCLE assumptions are small. Supplementary Figure 13 shows that violations are similarly small for meiotic S. cerevisiae. However, to explicitly address the referee's concern, we have added the following sentences to the revised manuscript:

      Lines 277-279:

      "While loop extrusion in interphase S. pombe seems to well satisfy the assumptions underlying CCLE, this may not always be the case in other organisms."

      Lines 359-361:

      "In addition, the three quantities, given by Eqs. 6, 7, and 8, are distributed around zero with relatively small fluctuations (Supplementary Fig. 13), indicating that the CCLE model is self-consistent in this case also."

      In the case of mitotic S. cerevisiae, Supplementary Figure 14 shows that these quantities are small at most genomic locations, except near the cohesin ChIP-seq peaks. We ascribe these greater violations of CCLE's assumptions at the locations of cohesin peaks in part to the low processivity of mitotic cohesin in S. cerevisiae, compared to that of meiotic S. cerevisiae and interphase S. pombe, and in part to the low CCLE loop extrusion rate at the cohesin peaks. We have added a paragraph at the end of the section "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae" to reflect these observations (lines 447-461).

      5. *The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1), and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data.*

      *Furthermore, I believe that the conclusions drawn on this point are flawed, or at least stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe and is not essential in humans. Aside from the molecular evidence on human Nipbl/cohesin (acknowledged by the authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than of its binding partner Mau2, as in ref. 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish it from background, and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.*

      *Similar types of critiques could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary to loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; the cross-correlation peaks at about 0.6, but note that, by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).*

      *Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model.*

      __Response: __

      As stated above, our study demonstrates that the CCLE approach is able to take as input cohesin (Psc3) ChIP-seq data and produce as output simulated Hi-C maps that well reproduce the experimental Hi-C maps of interphase S. pombe and meiotic S. cerevisiae. This result is evident from the multiple Hi-C comparison figures in both the original and the revised manuscripts. In light of this, the referee's statement that it is "questionable" that CCLE links the cohesin distribution (as quantified by cohesin ChIP-seq) to cohesin looping (as quantified by Hi-C) is demonstrably incorrect.

      However, contrary to the reviewer's reading, we did not intend to suggest that Nipbl and Pds5 are unimportant for cohesin loading. Rather, our inquiries relate to the more nuanced question of whether these factors reside only at loading sites or instead remain longer-lived constituent components of the loop extrusion complex. We regret any confusion and have endeavored to clarify this point in the revised manuscript, in response to Referee 2's Comment 5 as well as Referee 1's Minor Comment 1. We have now better explained how the CCLE model may offer new insight from existing ChIP-seq data in general, and from Mis4/Nipbl and Pds5 ChIP-seq in particular. Accordingly, we have followed Referee 2's advice to heavily revise the relevant section of the Discussion.

      To this end, we have removed the following text from the original manuscript:

      "The fact that the cohesin distribution along the chromatin is strongly linked to chromatin looping, as evident by the success of the CCLE model, allows for new insights into in vivo LEF composition and function. For example, recently, two single-molecule studies [37, 38] independently found that Nipbl, which is the mammalian analogue of Mis4, is an obligate component of the loop-extruding human cohesin complex. Ref. [37] also found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops. On this basis, Ref. [32] proposed that, while Nipbl-containing cohesin is responsible for loop extrusion, Pds5-containing cohesin is responsible for sister chromatid cohesion, neatly separating cohesin's two functions according to composition. However, the success of CCLE in interphase S. pombe, together with the observation that the Mis4 ChIP-seq signal is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 7) allows us to infer that Mis4 cannot be a component of loop-extruding cohesin in S. pombe. On the other hand, Pds5 is correlated with Psc3 in S. pombe (Supplementary Fig. 7) suggesting that both proteins are involved in loop-extruding cohesin, contradicting a hypothesis that Pds5 is a marker for cohesive cohesin in S. pombe. In contrast to the absence of Mis4-Psc3 correlation in S. pombe, in humans, Nipbl ChIP-seq and Smc1 ChIP-seq are correlated (Supplementary Fig. 7), consistent with Ref. [32]'s hypothesis that Nipbl can be involved in loop-extruding cohesin in humans. However, Ref. [55] showed that human Hi-C contact maps in the absence of Nipbl's binding partner, Mau2 (Ssl3 in S. pombe [56]) show clear TADs, consistent with loop extrusion, albeit with reduced long-range contacts in comparison to wild-type maps, indicating that significant loop extrusion continues in live human cells in the absence of Nipbl-Mau2 complexes. 
These collected observations suggest the existence of two populations of loop-extruding cohesin complexes in vivo, one that involves Nipbl-Mau2 and one that does not. Both types are present in mammals, but only Mis4-Ssl3-independent loop-extruding cohesin is present in S. pombe."

      And we have replaced it by the following text in the revised manuscript (lines 533-568):

      "As noted above, the input for our CCLE simulations of chromatin organization in S. pombe, was the ChIP-seq of Psc3, which is a component of the cohesin core complex [75]. Accordingly, Psc3 ChIP-seq represents how the cohesin core complex is distributed along the genome. In S. pombe, the other components of the cohesin core complex are Psm1, Psm3, and Rad21. Because these proteins are components of the cohesin core complex, we expect that the ChIP-seq of any of these proteins would closely match the ChIP-seq of Psc3, and would equally well serve as input for CCLE simulations of S. pombe genome organization. Supplementary Figure 20C confirms significant correlations between Psc3 and Rad21. In light of this observation, we then reason that the CCLE approach offers the opportunity to investigate whether other proteins beyond the cohesin core are constitutive components of the loop extrusion complex during the extrusion process (as opposed to cohesin loading or unloading). To elaborate, if the ChIP-seq of a non-cohesin-core protein is highly correlated with the ChIP-seq of a cohesin core protein, we can infer that the protein in question is associated with the cohesin core and therefore is a likely participant in loop-extruding cohesin, alongside the cohesin core. Conversely, if the ChIP-seq of a putative component of the loop-extruding cohesin complex is uncorrelated with the ChIP-seq of a cohesin core protein, then we can infer that the protein in question is unlikely to be a component of loop-extruding cohesin, or at most is transiently associated with it.

      For example, in S. pombe, the ChIP-seq of the cohesin regulatory protein, Pds5 [74], is correlated with the ChIP-seq of Psc3 (Supplementary Fig. 20B) and with that of Rad21 (Supplementary Fig. 20D), suggesting that Pds5 can be involved in loop-extruding cohesin in S. pombe, alongside the cohesin core proteins. Interestingly, this inference concerning fission yeast cohesin subunit, Pds5, stands in contrast to the conclusion from a recent single-molecule study [38] concerning cohesin in vertebrates. Specifically, Reference [38] found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops.

      Additionally, as noted above, in S. pombe the ChIP-seq signal of the cohesin loader, Mis4, is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 20A), suggesting that Mis4 is, at most, a very transient component of loop-extruding cohesin in S. pombe, consistent with its designation as a "cohesin loader". However, both References [38] and [39] found that Nipbl (counterpart of S. pombe's Mis4) is an obligate component of the loop-extruding human cohesin complex, more than just a mere cohesin loader. Although CCLE has not yet been applied to vertebrates, from a CCLE perspective, the possibility that Nipbl may be required for the loop extrusion process in humans is bolstered by the observation that in humans Nipbl ChIP-seq and Smc1 ChIP-seq show significant correlations (Supplementary Fig. 20G), consistent with Ref. [32]'s hypothesis that Nipbl is involved in loop-extruding cohesin in vertebrates. A recent theoretical model of the molecular mechanism of loop extrusion by cohesin hypothesizes that transient binding by Mis4/Nipbl is essential for permitting directional reversals and therefore for two-sided loop extrusion [41]. Surprisingly, there are significant correlations between Mis4 and Pds5 in S. pombe (Supplementary Fig. 20E), indicating Pds5-Mis4 association, outside of the cohesin core complex."

      In response to Referee 2's specific comment that "at least two studies have raised concerns about Nipbl ChIP-seq results", we note: (1) while Hu et al. Nucleic Acids Res 43:e132 2015 present a general method for calibrating ChIP-seq results, they do not measure Mis4/Nipbl ChIP-seq, nor do they raise any specific concerns about Mis4/Nipbl ChIP-seq; and (2) as noted above in response to Referee 1's comment, while the FRAP analysis presented by Rhodes et al. eLife 6:e30000 indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds, Supplementary Fig. 20G in the revised manuscript nevertheless shows a significant cross-correlation between Nipbl ChIP-seq and Smc1 ChIP-seq in humans. Thus, a transient association between Nipbl and cohesin is indeed detected by ChIP-seq, the referee's concerns notwithstanding.

      We thank the referee for pointing out Schwarzer et al. Nature 551:51 2017. However, our interpretation of these data differs from the referee's. As noted in our original manuscript, Nipbl has traditionally been considered a cohesin loading factor. If the role of Nipbl were solely to load cohesin, then we would expect depletion of Nipbl to have a major effect on the Hi-C map, because fewer cohesins would be loaded onto the chromatin. Figure 2 of Schwarzer et al. Nature 551:51 2017 shows the effect of depleting Nipbl on a vertebrate Hi-C map: even when Nipbl is absent, TADs persist, albeit considerably attenuated. According to the authors' own analysis associated with their Fig. 2, these attenuated TADs correspond to a smaller number of loop-extruding cohesin complexes than in the presence of Nipbl. Since Nipbl is depleted, these loop-extruding cohesins necessarily cannot contain Nipbl. Thus, the data and analysis of Schwarzer et al. actually seem consistent with the existence of a population of loop-extruding cohesin complexes that do not contain Nipbl.

      Concerning the referee's comment that we cannot be sure whether Pds5 ChIP is associated with extrusive or cohesive cohesin, we note that, as explained in the manuscript, we assume that the cohesive cohesins are uniformly distributed across the genome, and therefore that peaks in the cohesin ChIP-seq are associated with loop-extruding cohesins. The success of CCLE in describing Hi-C maps justifies this assumption a posteriori. Supplementary Figure 20B shows that the ChIP-seq of Pds5 is correlated with the ChIP-seq of Psc3 in S. pombe, that is, that peaks in the ChIP-seq of Psc3, assumed to derive from loop-extruding cohesin, are accompanied by peaks in the ChIP-seq of Pds5. This is the reasoning allowing us to associate Pds5 with loop-extruding cohesin in S. pombe.
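      To make the ChIP-seq comparisons discussed above concrete, the kind of cross-correlation computed between two binned tracks (e.g., Pds5 vs. Psc3) can be sketched as below. This is a minimal illustration with hypothetical function and argument names, not the authors' actual analysis code:

```python
import numpy as np

def chipseq_cross_correlation(track_a, track_b, max_shift=10):
    """Pearson correlation between two equal-length, binned ChIP-seq
    tracks as a function of genomic shift (in bins).

    Returns (shifts, correlations); the zero-shift value sits at
    index max_shift. Hypothetical helper for illustration only.
    """
    track_a = np.asarray(track_a, dtype=float)
    track_b = np.asarray(track_b, dtype=float)
    shifts = np.arange(-max_shift, max_shift + 1)
    corrs = []
    for s in shifts:
        if s >= 0:
            # Shift track_a forward by s bins relative to track_b.
            a, b = track_a[s:], track_b[:len(track_b) - s]
        else:
            a, b = track_a[:s], track_b[-s:]
        corrs.append(np.corrcoef(a, b)[0, 1])
    return shifts, np.array(corrs)
```

A correlated pair of tracks produces a peak near zero shift, whereas uncorrelated tracks (as reported for Mis4 vs. Psc3) yield values near zero at all shifts.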

      1. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).

      Response:

      We thank the referee for this advice. Following this advice, throughout the revised manuscript, we have replaced our original calculation of the Pearson correlation coefficient of unscaled Hi-C maps with a calculation of the Pearson correlation coefficient of rescaled Hi-C maps. Since the MPR is formed from ratios of simulated to experimental Hi-C maps, this metric is unchanged by the proposed rescaling.
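      The rescaling that the referee suggests, and that we now use, can be sketched as follows. This is an illustrative sketch only (function names are ours, not taken from the analysis code): each diagonal of the Hi-C matrix is divided by its mean contact frequency at that separation, giving an observed-over-expected map, and the Pearson correlation is then computed over the upper triangles of two such maps.

```python
import numpy as np

def rescale_by_ps(hic):
    """Divide each diagonal of a symmetric Hi-C matrix by its mean,
    i.e., by P(s), yielding an observed/expected map in which the
    characteristic decay with genomic separation is removed."""
    n = hic.shape[0]
    oe = np.zeros_like(hic, dtype=float)
    for s in range(n):
        diag = np.diagonal(hic, offset=s)
        ps = diag.mean()
        if ps > 0:
            idx = np.arange(n - s)
            oe[idx, idx + s] = diag / ps
            oe[idx + s, idx] = diag / ps
    return oe

def pearson_maps(a, b):
    """Pearson correlation between two maps over their upper triangles."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]
```

After rescaling, any remaining correlation between two maps reflects locus-specific features (TADs, boundaries) rather than the shared decay of P(s).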

      As explained in the original manuscript, we attribute the lower experiment-simulation correlation in the meiotic budding yeast Hi-C maps to the larger statistical errors of the meiotic budding yeast dataset, which arise from its finer genomic resolution: all else being equal, we can expect 25 times as many counts in a 10 kb x 10 kb bin as in a 2 kb x 2 kb bin. For the same reason, we expect larger statistical errors in the mitotic budding yeast dataset as well. Lower correlations for noisier data are to be expected in general.

      *7. Please explain why the difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes. *

      __Response: __

      For simplicity, the present version of CCLE sets the site-dependent loop extrusion rates by assuming that the cohesin ChIP-seq signal has equal contributions from left and right anchors. We then carry out our simulations, which allow us to examine the simulated left and right currents and their difference at every site. The distributions of the normalized left-right difference current, (R_n - L_n)/(R_n + L_n), are shown in Supplementary Figures 12B, 13B, and 14D for interphase S. pombe, meiotic S. cerevisiae, and mitotic S. cerevisiae, respectively. They are all centered at zero, with standard deviations of 0.12, 0.16, and 0.33. Thus, it emerges from our simulations that the difference current is indeed generally small.
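      The normalized difference current quoted above can be computed from simulated per-site anchor currents as follows (an illustrative sketch; the array names are hypothetical):

```python
import numpy as np

def normalized_difference_current(R, L):
    """Per-site normalized difference between right- and left-moving
    LEF anchor currents, (R_n - L_n) / (R_n + L_n).

    Sites with zero total current are returned as NaN so they can be
    excluded from distribution statistics (mean, standard deviation).
    """
    R = np.asarray(R, dtype=float)
    L = np.asarray(L, dtype=float)
    total = R + L
    out = np.full_like(total, np.nan)
    # Only divide where the total current is nonzero.
    np.divide(R - L, total, out=out, where=total > 0)
    return out
```

A histogram of the resulting values, computed over all lattice sites, gives the distributions whose standard deviations (0.12, 0.16, 0.33) are quoted above.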

      8. Optional, but I think it would greatly improve the manuscript: can the authors a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity

      __Response: __

      In response to Referee 1's similar comment, we have calculated the correlation between the locations of convergent genes and cohesin ChIP-seq. Supplementary Figure 18A in the revised manuscript shows that for interphase S. pombe no correlations are evident, whereas for both meiotic and mitotic S. cerevisiae there are significant correlations between these two quantities (Supplementary Fig. 17).

      *b) apply this methodology to vertebrate cell data *

      __Response: __

      The application of CCLE to vertebrate data is outside the scope of this paper, which, as we have emphasized, has the goal of developing a model that can be robustly applied to non-vertebrate eukaryotic genomes. Nevertheless, CCLE is, in principle, applicable to all organisms in which loop extrusion by SMC complexes is the primary mechanism of chromatin spatial organization.

      1. *A Github link is provided but the code is not currently available. *

      __Response: __

      The code is now available.

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.

      __Response: __

      The term "timesteps" in the original manuscript in fact denotes "the number of LEF events performed" in the simulation. We have therefore changed the terminology from "timesteps" to "LEF events".

      The choice of 15000 LEF events was determined empirically to ensure that the loop extrusion steady state is achieved for the range of parameters considered. To address the referee's concern about whether steady state is truly reached after 15000 LEF events, we compared two loop-size distributions, each comprising 1000 data points equally separated in time: one sampled between LEF events 15000 and 35000, and the other between LEF events 80000 and 100000. The two distributions are identical within errors, indicating that the loop extrusion steady state is indeed reached within 15000 LEF events.
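      One simple way to quantify the agreement between two sampled loop-size distributions is the two-sample Kolmogorov-Smirnov statistic. The sketch below is illustrative only and is not necessarily the exact comparison used in the analysis:

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    distance between the empirical CDFs of samples x and y.
    Values near zero indicate the two samples are drawn from
    indistinguishable distributions."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.abs(cdf_x - cdf_y).max())
```

Applied to loop sizes sampled in an early window and a late window of the simulation, a small statistic (relative to the sampling noise expected for 1000 points) supports the conclusion that the loop distribution has stopped evolving.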

      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?

      __Response: __

      As stated in the original manuscript, the errors on \rho_c are on the order of 10%-20% (for S. pombe). Thus, fits with \rho_c = 0 are significantly poorer than fits with the best-fit values of \rho_c.

      *3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP. *

      __Response: __

      We have chosen not to do this because we judge that the manuscript is already long enough. Figures 3A, 5D, and 6C already compare the experimental and simulated ChIP-seq, and these figures contain more information than the scatter plots proposed by the referee.

      1. *A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited. *

      __Response: __

      We thank the referee for pointing out this citation. We have added it to the revised manuscript.

      1. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.

      __Response: __

      As noted above, throughout the revised manuscript, we now give the Pearson correlation coefficients of scaled-by-P(s) Hi-C maps.

      1. *In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made. *

      __Response: __

      It is the former. We have modified the manuscript to clarify that LEFs "initially bind to empty, adjacent chromatin lattice sites with a binding probability, that is uniformly distributed across the genome." (lines 587-588).

      *7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"? *

      __Response: __

      Here, the "processivity of isolated LEFs" is defined as the processivity of a single LEF in the absence of interference (blocking) from other LEFs. We have changed "processivity of loops extruded by isolated LEFs" to "processivity of isolated LEFs" for clarity.

      1. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.

      __Response: __

      In the revised manuscript, we have added an additional sentence, and have removed the offending parentheses.

      1. *Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript. *

      __Response: __

      In the revised manuscript, we have removed mention of the "barrier parameter" from the discussion.

      1. *Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022). *

      __Response: __

      In fact, Dequeker et al. Nature 606:197 2022 concerns the role of MCM complexes in blocking cohesin loop extrusion in mouse zygotes; the mouse is a vertebrate. The sole aspect of this paper associated with yeast is the observation of cohesin blocking by yeast MCM bound to the ARS1 replication origin, inserted on a piece of lambda phage DNA. No yeast genome is used in the experiment. Therefore, the referee is mistaken to suggest that this paper models yeast genome organization.

      We thank the referee for pointing out Brandao et al. PNAS 116:20489 2019, which includes the development of a tour-de-force model of condensin-based loop extrusion in the prokaryote, Bacillus subtilis, in the presence of gene barriers to loop extrusion. To acknowledge this paper, we have changed the objectionable sentence to now read (lines 571-575):

      "... prior LEF models have been overwhelmingly limited to vertebrates, which express CTCF and where CTCF is the principal boundary element. Two exceptions, in which the LEF model was applied to non-vertebrates, are Ref. [49], discussed above, and Ref. [76] (Brandao et al.), which models the Hi-C map of the prokaryote, Bacillus subtilis, on the basis of condensin loop extrusion with gene-dependent barriers."

      *Referees cross-commenting *

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      *Reviewer #2 (Significance (Required)):

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.*

      __Response: __

      We agree with the referee that analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. We also agree that it works well as a descriptive model (of Hi-C maps in S. pombe and S. cerevisiae). Obviously, we disagree with the referee's other comments. For us, being able to describe the different-appearing Hi-C maps of interphase S. pombe (Fig. 1 and Supplementary Figures 1-9), meiotic S. cerevisiae (Fig. 5), and mitotic S. cerevisiae (Fig. 6), all with a common model with just a few fitting parameters that differ between these examples, is significant and novel. The referee's assessment also overlooks the ongoing debate about whether a "diffusion-capture"-like model is the dominant mechanism shaping chromatin spatial organization at the TAD scale. Many works have argued that such models could describe TAD-scale chromatin organization, as cited in the revised manuscript (Refs. [11, 14, 15, 17, 20, 22-24, 55]). However, in contrast to the poor description of the Hi-C map by the diffusion-capture model (as demonstrated in the revised manuscript and Supplementary Fig. 15), the excellent experiment-simulation agreement achieved by CCLE provides compelling evidence that cohesin-based loop extrusion is indeed the primary organizer of TAD-scale chromatin.

      Importantly, CCLE provides a theoretical basis for how loop extrusion models can be generalized and applied to organisms without known loop extrusion barriers. Our model also highlights, and provides the means to account for, the possibility that distributed barriers, which impede but do not strictly block LEFs, can shape chromatin configurations. This case may be important for organisms whose CTCF motifs only infrequently coincide with TAD boundaries, for instance Drosophila melanogaster. Moreover, CCLE promises theoretical descriptions of the Hi-C maps of other non-vertebrates in the future, extending the quantitative application of the LEF model across the tree of life. This too would be highly significant if successful.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling? Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species. As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?
      2. I do not understand how the loop extrusion residence time drops out. As I understand it, Eq 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify?
      3. The assumptions in the solution and application of the CCLE model are potentially constraining to a limited number of scenarios. In particular the authors specify that current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.) (where current might be dominated by A_n and V_n), unloading sites (D_n and V_{n-1}), or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.
      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.
      5. The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1) and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data.

      Furthermore, I believe that the conclusions drawn on this point are flawed, or at least stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe, and that it is not essential in humans. Aside from the molecular evidence on human Nipbl/cohesin (acknowledged by the authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than its binding partner Mau2, as in ref 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish it from background, and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.

      Similar types of critique could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary for loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; the cross-correlation peaks at about 0.6, but note that by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).

      Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model.

      6. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).
      7. Please explain why the difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes.
      8. Optional, but I think it would greatly improve the manuscript: can the authors a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity

      b) apply this methodology to vertebrate cell data

      9. A Github link is provided but the code is not currently available.

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.
      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?
      3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP.
      4. A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited.
      5. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.
      6. In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made.
      7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"?
      8. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.
      9. Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript.
      10. Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022).

      Referees cross-commenting

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      Significance

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.

    1. Latest News Click to read more latest news

      This section is easy for a screen reader to navigate: each article has a title and a brief summary of its content, so readers who rely on a screen reader can immediately understand what it is about. Examining the HTML code also shows alt descriptions.

    1. A compiler is a software tool that translates human-readable source code into machine-executable code

      Hmm! Sounds like a plan
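      That one-line definition can be made concrete with a toy example: a tiny, purely illustrative "compiler" that translates a human-readable arithmetic expression into instructions for a small stack machine, together with the machine that executes them:

```python
import ast

# Toy compiler (illustrative only): human-readable source in,
# machine-executable stack instructions out.
OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Compile e.g. '2 + 3 * 4' into a list of stack-machine instructions."""
    def emit(node):
        if isinstance(node, ast.BinOp):
            # Post-order traversal: operands first, then the operator.
            return emit(node.left) + emit(node.right) + [(OPS[type(node.op)],)]
        if isinstance(node, ast.Constant):
            return [("PUSH", node.value)]
        raise ValueError("unsupported syntax")
    return emit(ast.parse(source, mode="eval").body)

def run(program):
    """Execute the compiled instructions on a stack machine."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b, "MUL": a * b}[instr[0]])
    return stack.pop()

assert run(compile_expr("2 + 3 * 4")) == 14   # precedence handled by the parser
```

      Real compilers add lexing, type checking, optimisation, and code generation for a real instruction set, but the source-to-instructions translation above is the essence of the definition.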

    1. Author response:

      eLife assessment 

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the-art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies. 

      We thank the editor and reviewers for the accurate summary and positive assessment. We address the comment about increasing the number of reference genomes in the response to reviewer 2.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, freebayes, and longshot, and three deep learning based variant callers: clair3, deep variant, and nano caller), base calling model (fast, hac and sup) and read depth (using both simplex and duplex reads). 

      Strengths: 

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate). 

      Weaknesses: 

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis). 

      We agree that this would be an informative addition to the study and will add it to the benchmarking.

      Appraisal: 

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future. 

      Thank you for the positive appraisal.

      Reviewer #2 (Public Review): 

      Summary: 

      Hall et al describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and Indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling. 

      Strengths: 

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing. 

      Weaknesses: 

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors). 

      Our strategic selection of 14 genomes—spanning a variety of bacterial genera and species, diverse GC content, and both gram-negative and gram-positive species (including M. tuberculosis, which is neither)—was designed to robustly address potential variability in our results. Moreover, all our genome assemblies underwent rigorous manual inspection as the quality of the true genome sequences is the foundation this research is built upon. Given this, the fundamental conclusions regarding the accuracy of variant calls would likely remain unchanged with the addition of more genomes.  However, we do acknowledge that a substantially larger sample size, which is beyond the scope of this study, would enable more fine-grained analysis of species differences in error rates.

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      Thank you for highlighting this observation. The precision, recall, and F1 scores for each sample and condition can be found in Supplementary Table S4. We will investigate the samples that consistently perform below expectation to determine if this is associated with specific species, which may necessitate tailored recommendations for those species. Additionally, we will produce a species-segregated version of Figure 2 for a clearer interpretation and will place it in the supplementary materials.
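      For reference, the precision, recall, and F1 metrics discussed here derive directly from true-positive, false-positive, and false-negative call counts; a minimal sketch with illustrative counts (not values from Table S4):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard variant-calling metrics from call counts."""
    precision = tp / (tp + fp)   # fraction of calls that are correct
    recall = tp / (tp + fn)      # fraction of true variants recovered
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=98, fp=2, fn=2)
assert round(p, 3) == 0.98 and round(r, 3) == 0.98 and round(f1, 3) == 0.98
```

      Because F1 is a harmonic mean, a sample with poor recall drags its F1 well below the cohort median, which is why outlier samples stand out so clearly below the box plots.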

      (3) The authors support that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable and as such an F1 score, Recall and Precision that is as close to 100% as possible should be maintained (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      We agree that the highest discriminatory power is always desirable for clinical or public health applications. In which case, 25x is probably a better minimum recommendation. However, we are also aware that there are resource-limited settings where parity with Illumina is sufficient. In these cases, 10x depth from ONT would provide sufficient data.

      The manuscript currently emphasises the latter scenario, but we will revise the text to clearly recommend 25x depth as a conservative aim in settings where resources are not a constraint, ensuring the highest possible discriminatory power for applications like outbreak analysis.
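      As a back-of-the-envelope check on these recommendations, mean read depth is simply total sequenced bases divided by genome length; the genome size below is illustrative:

```python
def mean_depth(total_bases, genome_length):
    """Mean read depth = total sequenced bases / genome length."""
    return total_bases / genome_length

genome = 4_400_000  # illustrative ~4.4 Mb bacterial genome
assert mean_depth(44_000_000, genome) == 10.0    # the Illumina-parity threshold
assert mean_depth(110_000_000, genome) == 25.0   # the conservative recommendation
```

      In other words, moving from the 10x to the 25x recommendation requires 2.5 times the sequencing yield per isolate, which is the trade-off resource-limited settings must weigh.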

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612

      To our knowledge, there is no evidence that sequencing on different ONT machines or barcoding kits leads to a difference in read characteristics or accuracy. To ensure consistency and minimise potential variability, we used the same ONT flowcells for all samples and performed basecalling on the same Nvidia A100 GPU. We will update the methods to emphasise this.

      For Illumina and ONT, the exact machines used for which samples will be added as a supplementary table. We will also add a comment about possible Illumina error rate differences in the ‘Limitations’ section of the Discussion.

      In summary, while there may be specific equipment or preparation artifacts to consider, we took steps to minimise these effects and maintain consistency across our sequencing methods.

      Reviewer #3 (Public Review): 

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation. 

      We fully agree that investigating structural variations (SVs) would be a very interesting and important follow-up. Identifying and generating ground truth SVs is a nontrivial task and we feel it deserves its own space and study. We hope to explore this in the future.
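      The projection step the reviewer summarises, applying a truth set of variants to a reference to obtain a mutated reference, can be sketched for the simplest case (illustrative, not the authors' pipeline; indels are not handled here):

```python
def apply_snps(reference, snps):
    """Project a truth set of SNPs onto a reference sequence.
    snps maps 0-based positions to alternate bases; indels/SVs are out of scope."""
    seq = list(reference)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)

assert apply_snps("ACGTACGT", {1: "T", 6: "A"}) == "ATGTACAT"
```

      Indels make this harder because every insertion or deletion shifts all downstream coordinates, which is part of why a ground-truth SV set is nontrivial to construct.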

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing. 
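      Homopolymer regions of the kind discussed here are easy to enumerate with a run-length scan; a small sketch, with an illustrative minimum run length:

```python
from itertools import groupby

def homopolymers(seq, min_len=4):
    """Yield (start, base, length) for runs of identical bases of at least min_len."""
    pos = 0
    for base, run in groupby(seq):
        n = len(list(run))
        if n >= min_len:
            yield pos, base, n
        pos += n

assert list(homopolymers("ACGTTTTTACGAAAAAAC")) == [(3, "T", 5), (11, "A", 6)]
```

      Stratifying call errors by runs found this way is the usual approach to quantifying homopolymer-specific error rates.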

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and on the computational resources needed for variant calling with deep learning callers. With that, the authors established a new state of the art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will transfer to other organisms as well, or at least provide a proof of concept that can be built upon. 

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance. 

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required. 

      We have provided runtime benchmarks for basecalling in Supplementary Figure S16 and detailed these times in Supplementary Table S7. In addition, we state in the Results section (P10 L228-230) “Though we do note that if the person performing the variant calling has received the raw (pod5) ONT data, basecalling also needs to be accounted for, as depending on how much sequencing was done, this step can also be resource-intensive.”

      Even with super-accuracy basecalling considered, our analysis shows that variant calling remains the most resource-intensive step for Clair3, DeepVariant, FreeBayes, and NanoCaller. Therefore, the statement “the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required”, is incorrect. However, we will endeavour to make the basecalling component and considerations more prominent in the Results and Discussion.


    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      I have trialled the package on my lab's data and it works as advertised. It was straightforward to use and did not require any special training. I am confident this is a tool that will be approachable even to users with limited computational experience. The use of artificial data to validate the approach - and to provide clear limits on applicability - is particularly helpful.

      The main limitation of the tool is that it requires the user to manually select regions. This somewhat limits the generalisability and is also more subjective - users can easily choose "nice" regions that better match with their hypothesis, rather than quantifying the data in an unbiased manner. However, given the inherent challenges in quantifying biological data, such problems are not easily circumventable.


      * I have some comments to clarify the manuscript:

      1. A "straightforward installation" is mentioned. Given this is a Method paper, the means of installation should be clearly laid out.*

      This sentence is now modified. In the revised manuscript we now describe how to install the toolset and we give the link to the toolset website if further information is needed. On this website, we provide a full video tutorial and a user manual. The user manual is provided as a supplementary material of the manuscript.

      * 2. It would be helpful if there was an option to generate an output with the regions analysed (i.e., a JPG image with the data and the drawn line(s) on top). There are two reasons for this: i) A major problem with user-driven quantification is accidental double counting of regions (e.g., a user quantifies a part of an image and then later quantifies the same region). ii) It allows other users to independently verify measurements at a later time.*

      We agree that it is helpful to save the analyzed regions. To answer this comment and the other two reviewers' comments pointing at a similar feature, we have now included an automatic saving of the regions of interest. The user will be able to reopen saved regions of interest using a new function we included in the new version of PatternJ.

      * 3. Related to the above point, it is highlighted that each time point would need to be analysed separately (line 361-362). It seems like it should be relatively straightforward to allow a function where the analysis line can be mapped onto the next time point. The user could then adjust slightly for changes in position, but still be starting from near the previous timepoint. Given how prevalent timelapse imaging is, this seems like (or something similar) a clear benefit to add to the software.*

      We agree that the analysis of time series images can be a useful addition. We have added the analysis of time-lapse series in the new version of PatternJ. The principles behind the analysis of time-lapse series and an example of such analysis are provided in Figure 1 - figure supplement 3 and Figure 5, with accompanying text lines 140-153 and 360-372. The analysis includes a semi-automated selection of regions of interest, which will make the analysis of such sequences more straightforward than having to draw a selection on each image of the series. The user is required to draw at least two regions of interest in two different frames, and the algorithm will automatically generate regions of interest in frames in which selections were not drawn. The algorithm generates the analysis immediately after selections are drawn by the user, which includes the tracking of the reference channel.
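      The semi-automated selection described above, where the user draws selections in two frames and the tool fills in the frames between, is essentially endpoint interpolation; a minimal sketch of the idea for straight-line selections (not PatternJ's actual code):

```python
def interpolate_rois(frame_a, roi_a, frame_b, roi_b):
    """Linearly interpolate line-selection endpoints (x0, y0, x1, y1)
    for every frame between two user-drawn selections."""
    rois = {}
    for f in range(frame_a, frame_b + 1):
        t = (f - frame_a) / (frame_b - frame_a)   # 0 at frame_a, 1 at frame_b
        rois[f] = tuple(a + t * (b - a) for a, b in zip(roi_a, roi_b))
    return rois

rois = interpolate_rois(0, (0, 0, 100, 0), 4, (8, 4, 108, 4))
assert rois[2] == (4.0, 2.0, 104.0, 2.0)   # midway frame sits midway
```

      A real implementation would also let the user nudge each generated selection, as the response describes, and could track a reference channel instead of interpolating blindly.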

      * Line 134-135. The level of accuracy of the searching should be clarified here. This is discussed later in the manuscript, but it would be helpful to give readers an idea at this point what level of tolerance the software has to noise and aperiodicity.


      We agree with the reviewer that a clarification of this part of the algorithm will help the user better understand the manuscript. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181). Regarding the tolerance to noise, it is difficult to estimate it a priori from the choice made at the algorithm stage, so we prefer to leave it to the validation part of the manuscript. We hope this solution satisfies the reviewer and future users.


      **Referees cross-commenting**

      I think the other reviewer comments are very pertinent. The authors have a fair bit to do, but they are reasonable requests. So, they should be encouraged to do the revisions fully so that the final software tool is as useful as possible.

      Reviewer #1 (Significance (Required)):

      Developing software tools for quantifying biological data that are approachable for a wide range of users remains a longstanding challenge. This challenge is due to: (1) the inherent problem of variability in biological systems; (2) the complexity of defining clearly quantifiable measurables; and (3) the broad spread of computational skills amongst likely users of such software.

      In this work, Blin et al., develop a simple plugin for ImageJ designed to quickly and easily quantify regular repeating units within biological systems - e.g., muscle fibre structure. They clearly and fairly discuss existing tools, with their pros and cons. The motivation for PatternJ is properly justified (which is sadly not always the case with such software tools).

      Overall, the paper is well written and accessible. The tool has limitations but it is clearly useful and easy to use. Therefore, this work is publishable with only minor corrections.

      We thank the reviewer for the positive evaluation of PatternJ and for pointing out its accessibility to the users.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      # Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      # Major comments

      In our view the article and also software could be improved in terms of defining the scope of its applicability and user-ship. In many parts the article and software suggest that general biological patterns can be analysed, but then in other parts very specific muscle actin wordings are used. We are pointing this out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles. In this case we would recommend to also rename the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend to remove all actin specific wordings from the macro tool set and also the article should be in parts slightly re-written.


      We agree with the reviewer that our initial manuscript used a mix of general and muscle-oriented vocabulary, which could make the use of PatternJ confusing especially outside of the muscle field. To make PatternJ useful for the largest community, we corrected the manuscript and the PatternJ toolset to provide the general vocabulary needed to make it understandable for every biologist. We modified the manuscript accordingly.

      * # Minor/detailed comments

      # Software

      We recommend considering the following suggestions for improving the software.

      ## File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For novel users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.*

      With the current version of macOS, we found that the file-browser dialog does not display any message; we suspect this is the issue the reviewer raised. This has been a known issue of Fiji on Mac, and of all Mac applications, since 2016. We provide guidelines in the user manual and in the tutorial video to correct it by changing a parameter in Fiji. Given the issues the reviewer had accessing the material on the PatternJ website, which we apologize for, we understand the concern raised. We added an extra warning on the PatternJ website to point at this problem and its solution. Additionally, we have limited the appearance of the file-browser dialog to what we thought was strictly necessary. The user will therefore experience fewer prompts, speeding up the analysis.


      ## Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations. *

      We agree that this muscle-oriented vocabulary can make the use of PatternJ confusing. We have now corrected the user interface to provide both general and muscle-specific vocabulary ("center-to-center or edge-to-edge (M-line-to-M-line or Z-disc-to-Z-disc)").

      ## Manual selection accuracy

      The 1st step of the analysis always starts from a user hand-drawn profile across intensity patterns in the image. However, this step can introduce inaccuracy that varies with the shape and curvature of the drawn line profile. If the line is not strictly perpendicular to, for example, the M-line patterns, the distance between intensity peaks will differ. This is more problematic when dealing with non-straight, parallel features in the image. If the structure is bent along a curve, the line drawn over it also needs to reproduce this curve to precisely capture the intensity pattern. I found this limits the reproducibility and ease of use of the software.*

      We understand the concern of the reviewer. On curved selections this will be an issue that is difficult to solve, especially on "S" curved or more complex selections. The user will have to be very careful in these situations. On non-curved samples, the issue may be concerning at first sight, but the errors go with the inverse of cosine and are therefore rather low. For example, if the user creates a selection off by 5 degrees, which is visually obvious, lengths will be affected by an increase of only 0.38%. The point raised by the reviewer is important to discuss, and we therefore added a paragraph to comment on the choice of selection (lines 94-98) and a supplementary figure to help make it clear (Figure 1 - figure supplement 1).
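      The quoted figure is easy to verify: a selection tilted by an angle θ away from perpendicular stretches every measured spacing by a factor of 1/cos θ:

```python
import math

def length_error_percent(angle_deg):
    """Relative overestimate (%) of spacings measured along a selection
    tilted by angle_deg away from the perpendicular: 1/cos(theta) - 1."""
    return (1 / math.cos(math.radians(angle_deg)) - 1) * 100

assert round(length_error_percent(5), 2) == 0.38   # the 0.38% quoted above
```

      Even a clearly visible 15-degree tilt inflates spacings by only about 3.5%, which supports the authors' point that straight-selection errors stay small; curved structures remain the genuinely hard case.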

      ### Reproducibility

      Since the line profile drawn on the image is the first step and is essential to the entire process, it should be saved together with the analysis result, for example as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are saved (and because the save button just asks for a folder containing images, I cannot verify its functionality).

      We agree that this is a very useful and important feature. We have added ROI automatic saving. Additionally, we now provide a simplified import function of all ROIs generated with PatternJ and the automated extraction and analysis of the list of ROIs. This can be done from ROIs generated previously in PatternJ or with ROIs generated from other ImageJ/Fiji algorithms. These new features are described in the manuscript in lines 120-121 and 130-132.


      ## ? button

      It would be great if that button would open up some usage instructions.


      We agree with the reviewer that the "?" button can be used in a better way. We have replaced this button with a Help menu, including a simple tutorial showing a series of images detailing the steps to follow by the user, a link to the user website, and a link to our video tutorial.

      * ## Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow, by fitting and displaying 2D lines to the band or line structure in the image, that form the "patterns" the author aims to address. Thus, it extracts geometry models from the image, and the inter-line distance, and even the curve formed by these sets of lines can be further analyzed and studied. These fitted 2D lines can be also well integrated into ImageJ as Line ROI, and thus be saved, imported back, and checked or being further modified. I think this can largely increase the usefulness and reproducibility of the software.


      We hope that we understood this comment correctly. We sent a clarification request to the editor but unfortunately did not receive an answer within the 4 weeks allotted for this revision. Our understanding is the following: instead of using our 1D approach, in which we extract positions from a profile, the reviewer suggests extracting the positions of features not as single points but as series of coordinates defining their shapes. If so, this is a major modification of the tool that is beyond the scope of PatternJ. We believe that keeping our tool simple makes it robust; this is the major strength of PatternJ. Local fitting would not use line averaging, for instance, which would make the tool less reliable.

      * # Manuscript

      We recommend considering the following suggestions for improving the manuscript. Abstract: The abstract suggests that general patterns can be quantified, however the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.


      We modified the abstract to make this point clearer.

      * Line 58: The gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is neither mentioned nor compared. There is at least one relatively recent study on sarcomere structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool, with publication https://doi.org/10.1002/cpz1.462*


      We thank the reviewer for making us aware of this publication. We cite it now and have added it to our comparison of available approaches.
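      For readers unfamiliar with the approach mentioned above, a gray-level co-occurrence matrix simply counts how often pairs of gray levels occur at a fixed pixel offset; a minimal pure-Python sketch with an illustrative 4x4 image:

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Count co-occurrences of gray levels at offset (dx, dy).
    image is a list of rows of integer gray levels in [0, levels)."""
    m = [[0] * levels for _ in range(levels)]
    for y in range(len(image) - dy):
        for x in range(len(image[0]) - dx):
            m[image[y][x]][image[y + dy][x + dx]] += 1
    return m

img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
m = glcm(img)                     # horizontal neighbour pairs
assert m[0][0] == 2 and m[2][2] == 3
```

      Texture statistics (contrast, homogeneity, correlation) computed from such a matrix are what GLCM-based sarcomere tools use in place of explicit feature positions.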

      * Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!*

      We have modified this sentence to avoid potential confusion (lines 76-77).


      * Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place in the manuscript.*

      This sentence is now modified. We now mention how to install the toolset and we provide the link to the toolset website, if further information is needed (lines 86-88). On the website, we provide a full video tutorial and a user manual.

      * Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ. *

      We agree with the reviewer that this could create some confusion. We modified "multicolor" to "multi-channel".

      * Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"? *

      We agree with the reviewer that "sarcomeric actin" alone will not be clear to all readers. We modified the text to "block with a central band, as often observed in the muscle field for sarcomeric actin" (lines 103-104). The toolset was modified accordingly.

      * Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.*

      We agree with the reviewer that this was not clear. We rewrote this paragraph (lines 101-114) and provided a supplementary figure to illustrate these definitions (Figure 1 - figure supplement 2).

      * Line 124-147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed known by the reader. For example, how sub-pixel resolution is achieved is not clear. One can only assume that by fitting a Gaussian to the band, the center position (peak) can be calculated from a continuous curve rather than from pixels. *

      Note that the two sentences introducing this description are: "Automated feature extraction is the core of the tool. The algorithm takes multiple steps to achieve this (Fig. S2):". We had hoped this statement was clear, but the reviewer may be referring to something else. We agree that some of the steps were described too briefly, and we have now expanded the description where needed.
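      Sub-pixel peak localisation of the kind the reviewer alludes to is commonly done by fitting a parabola to the log-intensities around the brightest pixel, which is exact for a Gaussian band on zero background; a sketch of that standard trick (not necessarily PatternJ's exact implementation):

```python
import math

def subpixel_peak(profile):
    """Locate a peak with sub-pixel precision: fit a parabola to the
    log-intensities of the maximum pixel and its two neighbours
    (exact if the band is Gaussian on zero background)."""
    i = max(range(1, len(profile) - 1), key=lambda k: profile[k])
    la, lb, lc = (math.log(profile[k]) for k in (i - 1, i, i + 1))
    return i + 0.5 * (la - lc) / (la - 2 * lb + lc)

# Synthetic Gaussian band centred at 5.3 pixels:
profile = [math.exp(-((x - 5.3) ** 2) / 4.0) for x in range(11)]
assert abs(subpixel_peak(profile) - 5.3) < 1e-6
```

      On real data the fit is typically done over more points and with a background term, but the three-point version already shows how a continuous model turns pixel-limited maxima into sub-pixel positions.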

      * Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo. *

      We are sorry for the issues encountered when downloading the tool and additional material, and we thank the reviewer for pointing out these problems that limited the accessibility of our tool. We simplified the downloading procedure on the website, which no longer goes through the Google Drive interface nor requires a Google account. Additionally, for the developer community, the code, user manual, and examples are now available on GitHub at github.com/PierreMangeol/PatternJ, and are provided as supplementary material with the manuscript. To our knowledge, update sites work for plugins but not for macro toolsets. In our experience sharing our code with non-specialists, a classical website with a tutorial video is more accessible than more developer-oriented platforms, which deter many users.

      * Reviewer #2 (Significance (Required)):

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap in terms of providing an easy-to-use software for users without computational skills that enables the analysis of muscle sarcomere patterns. We feel that if the below mentioned limitations could be addressed the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble to access. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting some "wizard" approach, where the user is guided through the steps.

      As answered above, the links on the PatternJ website are now corrected. Regarding the workflow, we now provide a Help menu with:

      1. a basic set of instructions to use the tool,
      2. a direct link to the tutorial video in the PatternJ toolset,
      3. a direct link to the website on which both the tutorial video and a detailed user manual can be found.

      We hope this addresses the issues raised by this reviewer.

      *Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review. *

      We agree that saving ROIs is very useful. It is now implemented in PatternJ.

      We are not sure what this reviewer means by "enabling IJ Macro recording". The ImageJ Macro Recorder is indeed very useful, but to our knowledge, it is limited to built-in functions. Our code is open, and we hope this will be sufficient for advanced users to modify the code and make it fit their needs.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this manuscript, the authors present a new toolset for the analysis of repetitive patterns in biological images, named PatternJ. One of the main advantages of this new tool over existing ones is that it is simple to install and run and does not require any coding skills whatsoever, since it runs on the ImageJ GUI. Another advantage is that it provides not only the mean length of the pattern unit but also the subpixel localization of each unit and the distribution of lengths, and that it does not require GPU processing to run, unlike other existing tools. The major disadvantage of PatternJ is that it requires heavy, although very simple, user input in both the selection of the region to be analyzed and in the analysis steps. Another limitation is that, at least in its current version, PatternJ is not suitable for time-lapse imaging. The authors clearly explain the algorithm used by the tool to find the localization of pattern features, and they thoroughly test the limits of their tool in conditions of varying SNR, periodicity and band intensity. Finally, they also show the performance of PatternJ across several biological models, such as different kinds of muscle cells, neurons and fish embryonic somites, as well as different imaging modalities, such as brightfield, fluorescence confocal microscopy, STORM and even electron microscopy.

      This manuscript is clearly written, and both the sections and the figures are well organized and tell a cohesive story. By testing PatternJ, I can attest to its ease of installation and use. Overall, I consider that PatternJ is a useful tool for the analysis of patterned microscopy images and that this article is fit for publication. However, I do have some minor suggestions and questions that I would like the authors to address, as I consider they could improve this manuscript and the tool:

      We are grateful to this reviewer for this very positive assessment of PatternJ and of our manuscript.

      * Minor suggestions: The methodology section is missing a more detailed description of how the plotted metric was obtained: as normalized intensity or as precision in pixels. *

      We agree with the reviewer that a more detailed description of the metric plotted was missing. We added this information in the method part and added information in the Figure captions where more details could help to clarify the value displayed.

      * The validation is based mostly on the SNR and patterns. They should include a dataset of real data to validate the algorithm in three of the standard patterns tested. *

      We validated our tool using computer-generated images, in which the localization of patterns is known with certainty. This allowed us to automatically analyze 30,000 images, and with varying settings, we sometimes analyzed the same image 10 times, leading to about 150,000 analyzed selections. From these analyses, we can provide with confidence an unbiased assessment of the tool's precision and of its capacity to extract patterns. We already provided examples of various biological images in Figures 4-6, showing all possible features that can be extracted with PatternJ. In these examples, we can judge by eye that PatternJ extracts patterns efficiently, but we cannot know how precise these extractions are, because the "real" positions of features in biological data are unknown. Such a validation would therefore be limited to assessing whether a pattern was found or not, which we believe we already provide with the examples in Figures 4-6.

      * The video tutorial available in the PatternJ website is very useful, maybe it would be worth it to include it as supplemental material for this manuscript, if the journal allows it. *

      As the video tutorial may have been missed by other reviewers, we agree it is important to make it more prominent to users. We have now added a Help menu in the toolset that opens the tutorial video. Having the video as supplementary material could indeed be a useful addition if the size of the video is compatible with the journal limits.

      * An example image is provided to test the macro. However, it would be useful to provide further example images for each of the three possible standard patterns suggested: Block, actin sarcomere or individual band.*

      We agree this can help users. We now provide another multi-channel example image on the PatternJ website, including blocks and a pattern made of a linear intensity gradient that can be extracted with our simpler "single pattern" algorithm, both of which were missing in the first example. Additionally, we provide an example to be used with our new time-lapse analysis.

      * Access to both the manual and the sample images in the PatternJ website should be made publicly available. Right now they both sit in a private Drive account. *

      As mentioned above, we apologize for access issues that occurred during the review process. These files can now be downloaded directly on the website without any sort of authentication. Additionally, these files are now also available on GitHub.

      * Some common errors are not properly handled by the macro and could be confusing for the user: When there is no selection and one tries to run a Check or Extraction: "Selection required in line 307 (called from line 14). profile=getProfile();". A simple "a line selection is required" message would be useful there. When "band" or "block" is selected for a channel in the "Set parameters" window, yet a 0 value is entered into the corresponding "Number of bands or blocks" section, one gets this error when trying to Extract: "Empty array in line 842 (called from line 113). if ((subloc.length == 1) & (subloc[0] == 0)) {". This error is not too rare, since the "Number of bands or blocks" section is populated with a 0 after choosing "sarcomeric actin" (after accepting the settings) and stays that way when one changes back to "blocks" or "bands".*

      We thank the reviewer for pointing out these bugs. These bugs are now corrected in the revised version.

      * The fact that every time one clicks on the most used buttons, the getDirectory window appears is not only quite annoying but also, ultimately a waste of time. Isn't it possible to choose the directory in which to store the files only once, from the "Set parameters" window?*

      We have now found a solution to avoid this step. The user is only prompted to provide the image folder when pressing the "Set parameter" button. We kept the prompt for directory only when the user selects the time-lapse analysis or the analysis of multiple ROIs. The main reason is that it is very easy for the analysis to end up in the wrong folder otherwise.

      * The authors state that the outputs of the workflow are "user friendly text files". However, some of them lack descriptive headers (like the localisations and profiles) or even file names (like colors.txt). If there is something lacking in the manuscript, it is a brief description of all the output files generated during the workflow.*

      PatternJ generates multiple files, several of which are internal to the toolset: they are needed to keep track of which analyses were done and which colors were used in the images, amongst others. From the user's perspective, only the files obtained after the analysis, All_localizations.channel_X.txt and sarcomere_lengths.txt, are useful. To improve the user experience, we have moved all internal files to a folder named "internal", which we think clarifies which outputs are useful for further analysis and which are not. We thank the reviewer for raising this point, and we now mention it in our tutorial.

      * I don't really see the point in saving the localizations from the "Extraction" step; they are even named "temp". *

      We thank the reviewer for this comment, this was indeed not necessary. We modified PatternJ to delete these files after they are used.

      * In the same line, I DO see the point of saving the profiles and localizations from the "Extract & Save" step, but I think they should be deleted during the "Analysis" step, since all their information is then grouped in a single file, with descriptive headers. This deleting could be optional and set in the "Set parameters" window.*

      We understand the point raised by the reviewer. However, the analysis depends on the reference channel picked, which is asked for when starting an analysis, and can be augmented with additional selections. If a user chooses to modify the reference channel or to add a new profile to the analysis, deleting all these files would mean that the user will have to start over again, which we believe will create frustration. An optional deletion at the analysis step is simple to implement, but it could create problems for users who do not understand what it means practically.

      * Moreover, I think it would be useful to also save the linear roi used for the "Extract & Save" step, and eventually combine them during the "Analysis step" into a single roi set file so that future re-analysis could be made on the same regions. This could be an optional feature set from the "Set parameters" window. *

      We agree with the reviewer that saving ROIs is very useful. ROIs are now saved into a single file each time the user extracts and saves positions from a selection. Additionally, the user can re-use previous ROIs and analyze an image or image series in a single step.

      * In the "PatternJ workflow" section of the manuscript, the authors state that after the "Extract & Save" step "(...) steps 1, 2, 4, and 5 can be repeated on other selections (...)". However, technically, only steps 1 and 5 are really necessary (alternatively 1, 4 and 5 if the user is unsure of the quality of the patterning). If a user follows this to the letter, I think it can lead to wasted time. *

      We agree with the reviewer and have corrected the manuscript accordingly (line 119-120).


      *I believe that the "Version Information" button, although important, has potential to be more useful if used as a "Help" button for the toolset. There could be links to useful sources like the manuscript or the PatternJ website but also some tips like "whenever possible, use a higher linewidth for your line selection" *

      We agree with the reviewer as pointed out in our previous answers to the other reviewers. This button is now replaced by a Help menu, including a simple tutorial in a series of images detailing the steps to follow, a link to the user website, and a link to our video tutorial.

      * It would be interesting to mention to what extent does the orientation of the line selection in relation to the patterned structure (i.e. perfectly parallel vs more diagonal) affect pattern length variability?*

      As in our answer to reviewer 1, we understand this concern, which needs to be clarified for readers. The issue may seem concerning at first sight, but the error grows only with the inverse of the cosine of the angle and is therefore rather low. For example, if the user creates a selection off by 3 degrees, which is visually obvious, lengths will be affected by an increase of only 0.14%. The point raised by the reviewer is important to discuss, and we have therefore added a comment on the choice of selection (lines 94-98) as well as a supplementary figure (Figure 1 - figure supplement 1).
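      The size of this effect is simple arithmetic: a selection tilted by an angle theta relative to the pattern axis overestimates lengths by a factor 1/cos(theta). A quick check (our own illustration):

```python
import math

# Relative length overestimate for a line selection tilted by theta
# with respect to the pattern axis: measured = true / cos(theta)
theta = math.radians(3)
relative_error = 1 / math.cos(theta) - 1
print(f"{relative_error:.2%}")  # 0.14%
```

      Even a 5-degree tilt, which would be hard to miss by eye, changes lengths by under 0.4%, so the measurement is forgiving of small selection errors.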

      * When "the algorithm uses the peak of highest intensity as a starting point and then searches for peak intensity values one spatial period away on each side of this starting point" (line 133-135), does that search have a range? If so, what is the range? *

      We agree that this information is useful to share with the reader. The range is one pattern size. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181).

      * Line 144 states that the parameters of the fit are saved and given to the user, yet I could not find such information in the outputs. *

      The parameters of the fits are saved for blocks. We have now clarified this point by modifying the manuscript (lines 186-198) and modifying Figure 1 - figure supplement 5. We realized we made an error in the description of how edges of "block with middle band" are extracted. This is now corrected.

      * In line 286, authors finish by saying "More complex patterns from electron microscopy images may also be used with PatternJ.". Since this statement is not backed by evidence in the manuscript, I suggest deleting it (or at the very least, providing some examples of what more complex patterns the authors refer to). *

      This sentence is now deleted.

      * In the TEM image of the fly wing muscle in fig. 4 there is a subtle but clearly visible white stripe pattern in the original image. Since that pattern consists of 'dips', rather than 'peaks' in the profile of the inverted image, they do not get analyzed. I think it is worth mentioning that if the image of interest contains both "bright" and "dark" patterns, then the analysis should be performed in both the original and the inverted images because the nature of the algorithm does not allow it to detect "dark" patterns. *

      We agree with the reviewer's comment. We now mention this point in lines 337-339.

      * In line 283, the authors mention using background correction. They should explicit what method of background correction they used. If they used ImageJ's "subtract background' tool, then specify the radius.*

      We now describe this step in the method section.


      Reviewer #3 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. Being a software paper, the advance proposed by the authors is technical in nature. The novelty and significance of this tool is that it offers quick and simple pattern analysis at the single-unit level to a broad audience, since it runs on the ImageJ GUI and does not require any programming knowledge. Moreover, all the modules and steps are well described in the paper, which makes it easy to go through the analysis.
      • Place the work in the context of the existing literature (provide references, where appropriate). The authors themselves provide a good and thorough comparison of their tool with other existing ones, both in terms of ease of use and on the type of information extracted by each method. While PatternJ is not necessarily superior in all aspects, it succeeds at providing precise single pattern unit measurements in a user-friendly manner.
      • State what audience might be interested in and influenced by the reported findings. Most researchers working with microscopy images of muscle cells or fibers, or any other patterned sample, who are interested in analyzing changes in that pattern in response to perturbations, time, development, etc., could use this tool to obtain useful, and otherwise laborious, information.

      We thank the reviewer for these enthusiastic comments about how straightforward PatternJ is for biologists to use and about its broad applicability in the bio community.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      Major comments

      In our view, the article and the software could be improved in terms of defining the scope of their applicability and user-ship. In many parts, the article and software suggest that general biological patterns can be analysed, but in other parts very specific muscle-actin wordings are used. We point this out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles; in this case we would recommend also renaming the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend removing all actin-specific wordings from the macro tool set, and parts of the article should also be slightly rewritten.

      Minor/detailed comments

      Software

      We recommend considering the following suggestions for improving the software.

      File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For new users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that clear instructions are always displayed about which file or folder one should select for the different buttons.

      Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations.

      Manual selection accuracy

      The first step of the analysis is always a user-drawn profile across intensity patterns in the image. However, this step can introduce inaccuracy that varies with the shape and curvature of the line profile drawn. If the line is not strictly perpendicular to, for example, the M-line patterns, the distance between intensity peaks will differ. This is more problematic when dealing with features that are not straight and parallel in the image. If the structure is bent in a curve, the line drawn over it also needs to reproduce this curve to precisely capture the intensity pattern. I found this limits the reproducibility and ease of use of the software.

      Reproducibility

      Since the line profile drawn on the image is the first step and essential to the entire process, it should be saved together with the analysis result, for example as ImageJ ROI or ROI-set files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are saved (the save button also just asks for a folder containing images, so I could not verify its functionality).

      ? button

      It would be great if that button would open up some usage instructions.

      Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow: fitting and displaying 2D lines to the band or line structures in the image that form the "patterns" the authors aim to address. The tool would thus extract geometric models from the image, and the inter-line distances, and even the curves formed by these sets of lines, could be further analyzed and studied. These fitted 2D lines could also be integrated into ImageJ as Line ROIs, and thus be saved, imported back, and checked or further modified. I think this could greatly increase the usefulness and reproducibility of the software.

      Manuscript

      We recommend considering the following suggestions for improving the manuscript.

      Abstract: The abstract suggests that general patterns can be quantified; however, the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.

      Line 58: The gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is neither mentioned nor compared. There is at least one relatively recent study on sarcomere structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool with publication: https://doi.org/10.1002/cpz1.462

      Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!

      Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript.

      Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ.

      Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"?

      Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.

      Line 124 - 147: This part is the only description of the algorithm behind the feature extraction and analysis, but this is not clearly stated. Many details are missing or assumed to be known by the reader. For example, how sub-pixel resolution is achieved is not clear. One can only assume that by fitting a Gaussian to the band, the centre position (peak) can be calculated from continuous curves rather than pixels.

      Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.

      Significance

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap in terms of providing an easy-to-use software for users without computational skills that enables the analysis of muscle sarcomere patterns. We feel that if the below mentioned limitations could be addressed the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble to access. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting some "wizard" approach, where the user is guided through the steps. Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review.

    1. If you select the ⋮ (Options) button in the template editor and select the Code editor option, you will see the block markup of the template:

      You need to tap the pencil icon next to "Single Posts" to enter editing mode, and then find the Options button at the top right.

    1. You can use inline R code (see Section 3.1) anywhere in an Rmd document, including the YAML metadata section. This means some YAML metadata can be dynamically generated with inline R code, such as the document title

      Does this apply to .qmd documents?
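      For reference, the dynamic title looks like the fragment below in R Markdown; Quarto (.qmd) documents rendered with the knitr engine generally support the same inline R syntax in YAML fields, though this is worth verifying against the Quarto documentation for your version.

```yaml
---
title: "Status report, `r format(Sys.Date(), '%d %B %Y')`"
output: html_document
---
```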

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors discuss an effect, "diffusive lensing", by which particles would accumulate in high-viscosity regions, for instance in the intracellular medium. To obtain these results, the authors rely on agent-based simulations using custom rules performed with the Ito stochastic calculus convention. The "lensing effect" discussed is a direct consequence of the choice of the Ito convention without spurious drift which has been discussed before and is likely to be inadequate for the intracellular medium, causing the presented results to likely have little relevance for biology.

      We thank the editors and the reviewers for their consideration of our manuscript. We argue in this rebuttal and revision that our results and conclusions are in fact likely to have relevance for biology. While we use the Itô convention for ease of modeling considering its non-anticipatory nature upon discretization (see (Volpe and Wehr 2016) for the discretization schemes), we refer to Figure S1B to emphasize that diffusive lensing occurs not only under the Itô convention but across a wide parameter space. Indeed, it is absent only in the normative isothermal convention; note that even a stochastic differential equation conforming to the isothermal convention may be reformulated into the Itô convention by adding suitable drift terms, allowing for diffusive lensing to be seen even in the case of the isothermal convention. We note in particular that the choice of the convention is a highly context-dependent one (Sokolov 2010); there is not a universally correct choice, and one can obtain stochastic differential equations consistent with Ito or Stratonovich interpretations in different regimes. Lastly, space-dependent diffusivity is now an experimentally well-recognized feature of the cellular interior, as noted in our references and as discussed further later in this response. This fact points towards the potential relevance of our model for subcellular diffusion.

      In our revised preprint, we have made changes to the text and minor changes to figures to address reviewer concerns.

      Responses to the Reviewers

      We thank the reviewers for their feedback and address the issues they raised in this rebuttal and in the revised manuscript. The central point that the reviewers raise concerns the validity of the drift-less Itô interpretation in modeling potential nonequilibrium types of subcellular transport arising from space-dependent diffusivity. If the drift term were included, the resulting stochastic differential equation (SDE) is equivalent to one arising from the isothermal interpretation of heterogeneous diffusivity (Volpe and Wehr 2016), wherein no diffusive lensing is seen (as shown in Fig. S1B). That is, the isothermal interpretation and the drift-comprising Itô SDE produce the same uniform steady-state particle densities.

      While we agree with the reviewers that for a given interpretation, equivalent stochastic differential equations (SDEs) arising from other interpretations may be drawn, we disagree with the generalization that all types of subcellular diffusion conform to the isothermal interpretation. That is, there is no reason why any and all instances of nonequilibrium subcellular particle diffusion must be modeled using isothermal-conforming SDEs (such as the drift-comprising Itô SDE, for instance). We refer to (Sokolov 2010), which prescribes choosing a convention in a context-dependent manner. In this regard, we disagree with the second reviewer's characterization of such a choice as merely a "choice of writing", considering that it is entirely dependent on the choice of microscopic parameters, as detailed in the discussion section of the manuscript. The following references have also been added to the manuscript: the reference from the first reviewer (Kupferman et al. 2004) proposes a prescription for choosing an appropriate convention based upon comparing the noise correlation time and the particle relaxation time. The reference notes that the Itô convention is appropriate when the particle relaxation time is large compared to the noise correlation time, and the Stratonovich convention is appropriate in the converse scenario. In (Rupprecht et al. 2018), active noise is considered, and the resulting Fokker-Planck equation conforms to the Stratonovich convention when thermal noise is negligible. The related reference, (Vishen et al. 2019), compares three timescales, those of particle relaxation, noise correlation and viscoelastic relaxation, to make the choice. Indeed, as noted in the manuscript, lensing is seen in all but one interpretation (without drift additions); only its magnitude is altered by the interpretation/choice of the drift term. The appendix has been modified to include a subsection on the interchangeability of the conventions.
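      To make the distinction concrete, the following is a minimal agent-based sketch (our own illustration with arbitrary parameters, not the manuscript's simulation code): an Itô update without drift accumulates particles where diffusivity is low, whereas adding the drift term dD/dx recovers the uniform, isothermal-like steady state.

```python
import numpy as np

rng = np.random.default_rng(0)

def steady_state_positions(add_drift, n_particles=4000, n_steps=6000, dt=1e-3):
    """1D overdamped walk on [0, 1] with diffusivity D(x) = 0.1 + 0.9*x.
    Ito update x -> x + sqrt(2 D(x) dt) * xi; with add_drift=True the
    drift dD/dx * dt is added, which reproduces the isothermal
    (uniform steady-state) behaviour."""
    x = rng.uniform(0.0, 1.0, n_particles)
    for _ in range(n_steps):
        D = 0.1 + 0.9 * x
        drift = 0.9 * dt if add_drift else 0.0
        x = x + drift + np.sqrt(2.0 * D * dt) * rng.standard_normal(n_particles)
        x = np.abs(x)                       # reflect at x = 0
        x = np.where(x > 1.0, 2.0 - x, x)   # reflect at x = 1
    return x

# Fraction of particles in the low-diffusivity half of the domain:
# ~0.74 predicted for drift-less Ito (p ∝ 1/D), ~0.5 for uniform
frac_low_ito = np.mean(steady_state_positions(add_drift=False) < 0.5)
frac_low_iso = np.mean(steady_state_positions(add_drift=True) < 0.5)
print(frac_low_ito, frac_low_iso)
```

      The drift-less Itô steady state satisfies D(x) p(x) = const, i.e. p ∝ 1/D, which is the accumulation ("lensing") effect; with the added drift the flux balance gives p' = 0, i.e. a uniform density.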

      Separately, with regards to the discussion on anomalous diffusion, the section on mean squared displacement calculation has been amended to avoid confusing our model with canonical anomalous diffusion which considers the anomalous exponent; how the anomalous exponent varies with space-dependent diffusivity offers an interesting future area of study.

      Responses to specific reviewer comments appear below.

      Reviewer #1 (Public Review):

      The manuscript "Diffusive lensing as a mechanism of intracellular transport and compartmentalization" explores the implications of heterogeneous viscosity on the diffusive dynamics of particles. The authors analyze three different scenarios:

      (i)   diffusion under a gradient of viscosity,

      (ii)  clustering of interacting particles in a viscosity gradient, and

      (iii) diffusive dynamics of non-interacting particles with circular patches of heterogeneous viscous medium.

      The implications of a heterogeneous environment on phase separation and reaction kinetics in cells are under-explored. This makes the general theme of this manuscript very relevant and interesting. However, the analysis in the manuscript is not rigorous, and the claims in the abstract are not supported by the analysis in the main text.

      Following are my main comments on the work presented in this manuscript:

      (a) The central theme of this work is that spatially varying viscosity leads to position-dependent diffusion constant. This, for an overdamped Langevin dynamics with Gaussian white noise, leads to the well-known issue of the interpretation of the noise term.

      The authors use the Ito interpretation of the noise term because their system is non-equilibrium.

      One of the main criticisms I have is on this central point. The issue of interpretation arises only when there are ill-posed stochastic dynamics that do not have the relevant timescales required to analyze the noise term properly. Hence, if the authors want to start with an ill-posed equation it should be mentioned at the start. At least the Langevin dynamics considered should be explicitly mentioned in the main text. Since this work claims to be relevant to biological systems, it is also of significance to highlight the motivation for using the ill-posed equation rather than a well-posed equation. The authors refer to the non-equilibrium nature of the dynamics but it is not mentioned what non-equilibrium dynamics the authors have in mind. To properly analyze an overdamped Langevin dynamics a clear source of integrated timescales must be provided. As an example, one can write the dynamics as Eq. (1) \dot x = f(x) + g(x) \eta, which is ill-defined if the noise \eta is delta-correlated in time but well-defined when \eta is exponentially correlated in time. One can of course look at the limit in which the exponential correlation goes to a delta correlation, which leads to Eq. (1) interpreted in the Stratonovich convention. The choice to use the Ito convention for Eq. (1) in this case is not justified.

      We thank the reviewer for detailing their concerns with our model’s assumptions. We have addressed them in the common rebuttal.

      (b) Generally, the manuscript talks of viscosity gradient but the equations deal with diffusion which is a combination of viscosity, temperature, particle size, and particle-medium interaction. There is no clear motivation provided for focus on viscosity (cytoplasm as such is a complex fluid) instead of just saying position-dependent diffusion constant. Maybe authors should use viscosity only when talking of a context where the existence of a viscosity gradient is established either in a real experiment or in a thought experiment.

      The manuscript has been amended to use only “diffusivity” to avoid confusion.

      (c) The section "Viscophoresis drives particle accumulation" seems to not have new results. Fig. 1 verifies the numerical code used to obtain the results in the later sections. If that is the case maybe this section can be moved to supplementary or at least it should be clearly stated that this is to establish the correctness of the simulation method. It would also be nice to comment a bit more on the choice of simulation methods with changing hopping sizes instead of, for example, numerically solving stochastic ODE.

      The main point of this section and of Fig. 1 is the diffusive lensing effect itself: the accumulation of particles in lower-diffusivity areas. To the best of our knowledge, diffusive lensing has not been reported elsewhere as a specific outcome of non-isothermal interpretations of diffusion, with potential relevance to nonequilibrium subcellular motilities. The simulation method has been fully described in the Methods section, and the code has also been shared (see Code Availability).

      A minor comment: the statement "the physically appropriate convention to use depends upon microscopic parameters and timescale hierarchies not captured in a coarse-grained model of diffusion." is not true; as noted in the references that the authors mention, a correct coarse-grained model provides a suitable convention (see also Phys. Rev. E, 70(3), 036120; Phys. Rev. E, 100(6), 062602).

      This has been addressed in the common rebuttal.

      (d) The section "Interaction-mediated clustering is affected by viscophoresis" makes an interesting statement about the positioning of clusters by a viscous gradient. As a theoretical calculation, the interplay between position-dependent diffusivity and phase separation is indeed interesting, but the problem needs more analysis than that offered in this manuscript. Just a plot showing clustering with and without a gradient of diffusion does not give enough insight into the interplay between density-dependent diffusion and position-dependent diffusion. A phase plot that somehow shows the relative contribution of the two effects would have been nice. Also, it should be emphasized in the main text that the inter-particle interaction is through a density-dependent diffusion constant and not a conservative coupling by an interaction potential.

      The density-dependence has been added from the Methods to the main text. The goal of the work is to present lensing as a natural outcome of the parameter choices we make and present its effects as they relate to clustering and commonly used biophysical methods to probe dynamics within cells. A dense sampling of the phase space and how it is altered as a function of diffusivity, and the subsequent interpretation, lie beyond the scope of the present work but offer exciting future directions of study.

      (e) The section "In silico microrheology shows that viscophoresis manifests as anomalous diffusion" the authors show that the MSD with and without spatial heterogeneity is different. This is not a surprise - as the underlying equations are different the MSD should be different.

      The goal here is to compare and contrast the ways in which homogeneous and heterogeneous diffusion manifest in simulated microrheology measurements. We hope that an altered saturation MSD, as is observed in our simulations, provokes interest in considering lensing while modeling experimental data.
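The MSD curves themselves can be recomputed from stored trajectories along the following lines (an illustrative sketch; the array layout and function name are ours, not the manuscript's code):

```python
import numpy as np

def time_averaged_msd(traj, max_lag=None):
    """Mean squared displacement averaged over time origins and particles.
    traj: array of shape (T, n_particles) holding 1-D positions."""
    T = traj.shape[0]
    max_lag = max_lag or T // 2
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((traj[lag:] - traj[:-lag]) ** 2) for lag in lags])
    return lags, msd
```

For homogeneous free diffusion this curve grows linearly in lag time; confinement in a low-diffusivity zone shows up as saturation, which is the signature compared in the in silico microrheology section.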

      There are various analogies drawn in this section without any justification:

      (i) "the saturation MSD was higher than what was seen in the homogeneous diffusion scenario possibly due to particles robustly populating the bulk milieu followed by directed motion into the viscous zone (similar to that of a Brownian ratchet, (Peskin et al., 1993))."

      In case (i), the Brownian ratchet is invoked as a model to explain directed accumulation. We have removed this analogy to avoid confusion, as it is not delved into further over the course of our work.

      (ii) "Note that lensing may cause particle displacements to deviate from a Gaussian distribution, which could explain anomalous behaviors observed both in our simulations and in experiments in cells (Parry et al., 2014)." Since the full trajectory of the particles is available, it can be analyzed to check if this is indeed the case.

      This has been addressed in the common rebuttal.

      (f) The final section "In silico FRAP in a heterogeneously viscous environment ... " studies the MSD of the particles in a medium with heterogeneous viscous patches, which I find the most novel section of the work. As with the section on inter-particle interaction, this needs further analysis.

      We thank the reviewer for their appreciation. In presenting these three sections discussing the effects of diffusive lensing, we intend to broadly outline the scope of this phenomenon in influencing a range of behaviors. Exploring these directions further comprises a promising avenue of study that lies beyond the scope of this manuscript.

      To summarise, as this is a theory paper, just showing MSD or in silico FRAP data is not sufficient. Unlike experiments where one is trying to understand the systems, here one has full access to the dynamics either analytically or in simulation. So just stating that the MSD in heterogeneous and homogeneous environments are not the same is not sufficient. With further analysis, this work can be of theoretical interest. Finally, just as a matter of personal taste, I am not in favor of the analogy with optical lensing. I don't see the connection.

      We value the reviewer’s interest in investigating the causes underlying the differences in the MSDs and agree that it represents a promising future area of study. The main point of this section of the manuscript was to make a connection to experimentally measurable quantities.

      Reviewer #2 (Public Review):

      Summary:

      The authors study through theory and simulations the diffusion of microscopic particles and aim to account for the effects of inhomogeneous viscosity and diffusion - in particular regarding the intracellular environment. They propose a mechanism, termed "Diffusive lensing", by which particles are attracted towards high-viscosity regions where they remain trapped. To obtain these results, the authors rely on agent-based simulations using custom rules performed with the Ito stochastic calculus convention, without spurious drift. They acknowledge the fact that this convention does not describe equilibrium systems, and that their results would not hold at equilibrium - and discard these facts by invoking the fact that cells are out-of-equilibrium. Finally, they show some applications of their findings, in particular enhanced clustering in the high-viscosity regions. The authors conclude that as inhomogeneous diffusion is ubiquitous in life, so must their mechanism be, and hence it must be important.

      Strengths:

      The article is well-written, and clearly intelligible, its hypotheses are stated relatively clearly and the models and mathematical derivations are compatible with these hypotheses.

      We thank the reviewer for their appreciation.

      Weaknesses:

      The main problem of the paper is these hypotheses. Indeed, it all relies on the Ito interpretation of the stochastic integrals. Stochastic conventions are a notoriously tricky business, but they are both mathematically and physically well-understood and do not result in any "dilemma" [some citations in the article, such as (Lau and Lubensky) and (Volpe and Wehr), make an unambiguous resolution of these]. Conventions are not an intrinsic, fixed property of a system, but a choice of writing; however, whenever going from one to another, one must include a "spurious drift" that compensates for the effect of this change - a mathematical subtlety that is entirely omitted in the article: if the drift is zero in one convention, it will thus be non-zero in another in the presence of diffusive gradients. It is well established that for equilibrium systems obeying fluctuation-dissipation, the spurious drift vanishes in the anti-Ito stochastic convention (which is not "anticipatory", contrary to claims in the article, as the "steps" are local and infinitesimal). This ensures that the diffusion gradients do not induce currents and probability gradients, and thus that the steady-state PDF is the Gibbs measure. This equilibrium case should be seen as the default: a thermal system NOT obeying this law should warrant a strong justification (for instance in the Volpe and Wehr review this can occur through memory effects in robotic dynamics, or through strong fluctuation-dissipation breakdown). In near-equilibrium thermal systems such as the intracellular medium (where, although out-of-equilibrium, temperature remains a relevant and mostly homogeneous quantity), deviations from this behavior must be physically justified and go to zero when going towards equilibrium.

      Considering that the physical phenomena underlying diffusion span a range of timescales (particle relaxation, noise, environmental correlation, et cetera), we disagree with the assertion that all types of subcellular diffusion processes can be modeled as occurring at thermal equilibrium: for example, one can easily imagine memory effects arising in the presence of an appropriate hierarchy of timescales. We have added references that describe in more detail the way in which the comparison of timescales can dictate the applicability of different conventions. We also refer the referee to the common rebuttal section of our response in which we discuss factors that govern the choice of the interpretation. The adiabatic elimination arguments highlighted in (Kupferman et al. 2004) provide a clear description of how relevant particle and environment-related timescales can inform the choice of stochastic calculus to use.

      With regards to the use of the term “anticipatory” to refer to the isothermal interpretation, we refer to the comment in (Volpe and Wehr 2016) of the Itô interpretation “not looking into the future”. In any case, whether anticipatory or otherwise, the interpretation’s effect on our model remains unchanged, as highlighted in the section in the Appendix on the conversion between different conventions; this section has been added to minimize confusion about the effects of the choice of convention on lensing.

      Here, drifts are arbitrarily set to zero in the Ito convention (the exact opposite of the equilibrium anti-Ito), which is the equilibrium equivalent to adding a force (with drift $- grad D$) exactly compensating the spurious drift. If we were to interpret this as a breakdown of detailed balance with inhomogeneous temperature, the "hot" region would be effectively at 4x higher temperature than the cold region (i.e. 1200K) in Fig 1A.

      Our work is based on existing observations of space-dependent diffusivity in cells (Garner et al., 2023; Huang et al., 2021; Parry et al., 2014; Śmigiel et al., 2022; Xiang et al., 2020). These papers support a definitive model for the existence of space-dependent diffusivity without invoking space-dependent temperature.

      It is the effects of this arbitrary force (exactly compensating the Ito spurious drift) that are studied in the article. The fact that it results in probability gradients is trivial once formulated this way (and in no way is this new - many of the references, for instance, Volpe and Wehr, mention this).

      Addressed in the common rebuttal.

      Enhanced clustering is also a trivial effect of this probability gradient (the local concentration is increased by this force field, so phase separation can occur). As a side note the "neighbor sensing" scheme to describe interactions is very peculiar and not physically motivated - it violates stochastic thermodynamics laws too, as the detailed balance is apparently not respected.

      The neighbor-sensing scheme used here is just one possible model of an effective attractive potential between particles. Other models that lead to density-dependent attraction between particles should also provide qualitatively similar results as ours; this offers an interesting prospect for future research.

      Finally, the "anomalous diffusion" discussion is at odds with what the literature on this subject considers anomalous (the exponent does not appear anomalous).

      This has been addressed in the common rebuttal, and the relevant part of the manuscript has been modified to avoid confusion.

      The authors make no further justification of their choice of convention than the fact that cells are out-of-equilibrium, leaving the feeling that this is a detail. They make mentions of systems (eg glycogen, prebiotic environment) for which (near-)equilibrium physics should mostly prevail, and of fluctuation-dissipation ("Diffusivity varies inversely with viscosity", in the introduction). Yet the "phenomenon" they discuss is entirely reliant on an undiscussed mechanism by which these assumptions would be completely violated (the citations they make for this - Gnesotto '18 and Phillips '12 - are simply discussions of the fact that cells are out-of-equilibrium, not on any consequences on the convention).

      Finally, while inhomogeneous diffusion is ubiquitous, the strength of this effect in realistic conditions is not discussed (this would be a significant problem if the effect were real, which it isn't). Gravitational attraction is also an ubiquitous effect, but it is not important for intracellular compartmentalization.

      The manuscript text has been supplemented with additional references that detail the ways in which the comparison of timescales can dictate how one can apply different conventions. We refer the reviewer to the common rebuttal section of our response where we detail factors that dictate the choice of the convention to use. As previously noted, the adiabatic elimination arguments highlighted in (Kupferman et al., 2004) provide a prescription for how different timescales are to be considered in deciding the choice of stochastic calculus to use.

      With regards to the strength of space-dependent diffusivity in subcellular milieu, various measurements of heterogeneous diffusivity have been made both across different model systems and via different modalities, as cited in our manuscript. (Garner et al. 2023) used single-particle tracking to determine over 100-fold variability in diffusivity within individual S. pombe cells. Single-molecule measurements in (Xiang et al. 2020) and (Śmigiel et al. 2022) reveal an order-of-magnitude variation in tracer diffusion in mammalian cells and multi-fold variation in E. coli cytoplasm respectively. Fluorescence correlation spectroscopy measurements in (Huang et al. 2022) have found a two-fold increase in short-range diffusion of protein-sized tracers in X. laevis extracts. We have also added a reference to a study that uses 3D single particle tracking in the cytosol of a multinucleate fungus, A. gossypii, to identify regions of low-diffusivity near nuclei and hyphal tips (McLaughlin et al. 2020). Many of these references deploy particle tracking and investigate how mesoscale-sized particles (i.e. tracers spanning biologically relevant size scales) are directly impacted by space-dependent diffusivity. Therefore, we base our model on not only space-dependent diffusivity being a well-recognized feature of the cellular interior, but also on these observations pertaining to mesoscale-sized particles’ motion along relevant timescales.

      These measurements are also relevant to the reviewer’s question about the strength of the effect, which depends directly on the variability in diffusivity: for ten- or hundred-fold diffusivity variations, the effect would be expected to be significant. When the Itô convention is used directly, the contrast in the steady-state concentration in fact matches that of the diffusivity gradient.

      To conclude, the "diffusive lensing" effect presented here is not a deep physical discovery, but a well-known effect of sticking to the wrong stochastic convention.

      As detailed in the various responses above, we respectfully disagree with the notion that there exists a singular correct stochastic convention applicable to all cases of subcellular heterogeneous diffusion. Further, as detailed in (Volpe and Wehr 2016) and in the Appendix, it is possible to convert between conventions: an isothermal-abiding stochastic differential equation may be suitably altered, by means of adding a drift term, into an Itô-abiding stochastic differential equation. Therefore, one can observe diffusive lensing without discarding the isothermal convention, provided the latter is so modified. Indeed, it is only the driftless (or canonical) isothermal convention that does not allow for diffusive lensing.
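In one dimension, this interchangeability can be stated compactly (these are standard results, e.g. in Volpe and Wehr 2016, written here in our notation):

```latex
% Drift-less Ito convention:
\mathrm{d}x = \sqrt{2D(x)}\,\mathrm{d}W_t
\;\Longrightarrow\;
\partial_t P = \partial_x^2\!\left[D(x)\,P\right],
\qquad P_{\mathrm{ss}}(x) \propto \frac{1}{D(x)} \quad\text{(lensing)}
% Drift-less isothermal (anti-Ito) convention:
\mathrm{d}x = \sqrt{2D(x)} \circ_{\mathrm{A}} \mathrm{d}W_t
\;\Longrightarrow\;
\partial_t P = \partial_x\!\left[D(x)\,\partial_x P\right],
\qquad P_{\mathrm{ss}}(x) = \mathrm{const} \quad\text{(no lensing)}
% Interconversion: adding the drift -D'(x) to the isothermal SDE reproduces
% the drift-less Ito dynamics (and hence lensing), while adding +D'(x) to
% the Ito SDE reproduces the drift-less isothermal dynamics.
```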

    Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1: 

      This is my first review of the article entitled "The canonical stopping network: Revisiting the role of the subcortex in response inhibition" by Isherwood and colleagues. This study is one in a series of excellent papers by the Forstmann group focusing on the ability of fMRI to reliably detect activity in small subcortical nuclei - in this case, specifically those purportedly involved in the hyper- and indirect inhibitory basal ganglia pathways. I have been very fond of this work for a long time, beginning with the demonstration of De Hollander, Forstmann et al. (HBM 2017) of the fact that 3T fMRI imaging (as well as many 7T imaging sequences) do not afford sufficient signal to noise ratio to reliably image these small subcortical nuclei. This work has done a lot to reshape my view of seminal past studies of subcortical activity during inhibitory control, including some that have several thousand citations.

      In the current study, the authors compiled five datasets that aimed to investigate neural activity associated with stopping an already initiated action, as operationalized in the classic stop-signal paradigm. Three of these datasets are taken from their own 7T investigations, and two are datasets from the Poldrack group, which used 3T fMRI.

      The authors make six chief points: 

      (1) There does not seem to be a measurable BOLD response in the purportedly critical subcortical areas in contrasts of successful stopping (SS) vs. going (GO), neither across datasets nor within each individual dataset. This includes the STN but also any other areas of the indirect and hyperdirect pathways.

      (2) The failed-stop (FS) vs. GO contrast is the only contrast showing substantial differences in those nodes.

      (3) The positive findings of STN (and other subcortical) activation during the SS vs. GO contrast could be due to the usage of inappropriate smoothing kernels.

      (4) The study demonstrates the utility of aggregating publicly available fMRI data from similar cognitive tasks. 

      (5) From the abstract: "The findings challenge previous functional magnetic resonance (fMRI) of the stop-signal task" 

      (6) and further: "suggest the need to ascribe a separate function to these networks." 

      I strongly and emphatically agree with points 1-5. However, I vehemently disagree with point 6, which appears to be the main thrust of the current paper, based on the discussion, abstract, and - not least - the title.

      To me, this paper essentially shows that fMRI is ill-suited to study the subcortex in the specific context of the stop-signal task. That is not just because of the issues of subcortical small-volume SNR (the main topic of this and related works by this outstanding group), but also because of its limited temporal resolution (which is unacknowledged, but especially impactful in the context of the stop-signal task). I'll expand on what I mean in the following.

      First, the authors are underrepresenting the non-fMRI evidence in favor of the involvement of the subthalamic nucleus (STN) and the basal ganglia more generally in stopping actions. 

      - There are many more intracranial local field potential recording studies that show increased STN LFP (or even single-unit) activity in the SS vs. FS and SS vs. GO contrast than listed, which come from at least seven different labs. Here's a (likely non-exhaustive) list of studies that come to mind:

      Ray et al., NeuroImage 2012
      Alegre et al., Experimental Brain Research 2013
      Benis et al., NeuroImage 2014
      Wessel et al., Movement Disorders 2016
      Benis et al., Cortex 2016
      Fischer et al., eLife 2017
      Ghahremani et al., Brain and Language 2018
      Chen et al., Neuron 2020
      Mosher et al., Neuron 2021
      Diesburg et al., eLife 2021

      - Similarly, there is much more evidence than cited that causally influencing STN via deep-brain stimulation also influences action-stopping. Again, the following list is probably incomplete: 

      Van den Wildenberg et al., JoCN 2006
      Ray et al., Neuropsychologia 2009
      Hershey et al., Brain 2010
      Swann et al., JNeuro 2011
      Mirabella et al., Cerebral Cortex 2012
      Obeso et al., Exp. Brain Res. 2013
      Georgiev et al., Exp Br Res 2016
      Lofredi et al., Brain 2021
      van den Wildenberg et al., Behav Brain Res 2021
      Wessel et al., Current Biology 2022

      - Moreover, evidence from non-human animals similarly suggests critical STN involvement in action stopping, e.g.: 

      Eagle et al., Cerebral Cortex 2008
      Schmidt et al., Nature Neuroscience 2013
      Fife et al., eLife 2017
      Anderson et al., Brain Res 2020

      Together, studies like these provide either causal evidence for STN involvement via direct electrical stimulation of the nucleus or provide direct recordings of its local field potential activity during stopping. This is not to mention the extensive evidence for the involvement of the STN - and the indirect and hyperdirect pathways in general - in motor inhibition more broadly, perhaps best illustrated by their damage leading to (hemi)ballism. 

      Hence, I cannot agree with the idea that the current set of findings "suggest the need to ascribe a separate function to these networks", as suggested in the abstract and further explicated in the discussion of the current paper. For this to be the case, we would need to disregard more than a decade's worth of direct recording studies of the STN in favor of a remote measurement of the BOLD response using (provably) sub ideal imaging parameters. There are myriads of explanations of why fMRI may not be able to reveal a potential ground-truth difference in STN activity between the SS and FS/GO conditions, beginning with the simple proposition that it may not afford sufficient SNR, or that perhaps subcortical BOLD is not tightly related to the type of neurophysiological activity that distinguishes these conditions (in the purported case of the stop-signal task, specifically the beta band). But essentially, this paper shows that a specific lens into subcortical activity is likely broken, but then also suggests dismissing existing evidence from superior lenses in favor of the findings from the 'broken' lens. That doesn't make much sense to me.

      Second, there is actually another substantial reason why fMRI may indeed be unsuitable to study STN activity, specifically in the stop-signal paradigm: its limited time resolution. The sequence of subcortical processes on each specific trial type in the stop-signal task is purportedly as follows: at baseline, the basal ganglia exert inhibition on the motor system. During motor initiation, this inhibition is lifted via direct pathway innervation. This is when the three trial types start diverging. When actions then have to be rapidly cancelled (SS and FS), cortical regions signal to STN via the hyperdirect pathway that inhibition has to be rapidly reinstated (see Chen, Starr et al., Neuron 2020 for direct evidence for such a monosynaptic hyperdirect pathway, the speed of which directly predicts SSRT). Hence, inhibition is reinstated (too late in the case of FS trials, but early enough in SS trials, see recordings from the BG in Schmidt, Berke et al., Nature Neuroscience 2013; and Diesburg, Wessel et al., eLife 2021). 

      Hence, according to this prevailing model, all three trial types involve a sequence of STN activation (initial inhibition), STN deactivation (disinhibition during GO), and STN reactivation (reinstantiation of inhibition during the response via the hyperdirect pathway on SS/FS trials, reinstantiation of inhibition via the indirect pathway after the response on GO trials). What distinguishes the trial types during this period is chiefly the relative timing of the inhibitory process (earliest on SS trials, slightly later on FS trials, latest on GO trials). However, these temporal differences play out on a level of hundreds of milliseconds, and in all three cases, processing concludes well under a second overall. To fMRI, given its limited time resolution, these activations are bound to look quite similar. 

      Lastly, further building on this logic, it's not surprising that FS trials yield increased activity compared to SS and GO trials. That's because FS trials are errors, which are known to activate the STN (Cavanagh et al., JoCN 2014; Siegert et al. Cortex 2014) and afford additional inhibition of the motor system after their occurrence (Guan et al., JNeuro 2022). Again, fMRI will likely conflate this activity with the abovementioned sequence, resulting in a summation of activity and the highest level of BOLD for FS trials. 

      In sum, I believe this study has a lot of merit in demonstrating that fMRI is ill-suited to study the subcortex during the SST, but I cannot agree that it warrants any reappreciation of the subcortex's role in stopping, which are not chiefly based on fMRI evidence. 

      We would like to thank reviewer 1 for their insightful and helpful comments. We have responded point-by-point below and will give an overview of how we reframed the paper here.  

      We agree that there is good evidence from other sources for the presence of the canonical stopping network (indirect and hyperdirect) during action cancellation, and that this should be reflected more in the paper. However, we do not believe that a lack of evidence for this network during the SST makes fMRI ill-suited for studying this task, or other tasks that have neural processes occurring in quick succession. What we believe the fMRI activation patterns reflect during this task is the large amount of activation caused by failed stops. That is, the role of the STN in error processing may be more pronounced than its role in action cancellation. Given the replicability of fMRI results, especially at higher field strengths, we believe the activation profile of failed stop trials reflects a paramount role for the STN in error processing. Therefore, while we agree that we do not provide evidence against the role of the STN in action cancellation, we do provide evidence that our outlook on subcortical activation during the different trial types of this task should be revisited. We have reframed the article to reflect this, and discuss points such as fMRI reliability, validity and the complex overlapping of cognitive processes in the SST in the discussion. Please see all changes to the article indicated by red text.

      A few other points: 

      - As I said before, this team's previous work has done a lot to convince me that 3T fMRI is unsuitable to study the STN. As such, it would have been nice to see a combination of the subsamples of the study that DID use imaging protocols and field strengths suitable to actually study this node. This is especially true since the second 3T sample (and arguably, the Isherwood_7T sample) does not afford a lot of trials per subject, to begin with.

      Unfortunately, this study already comprises the only 7T open-access datasets available for the SST. Therefore, unless we combined only the deHollander_7T and Miletic_7T subsamples, there is no additional analysis we can do for this right now. While looking at just the subsamples that were 7T and had >300 trials would be interesting, based on the new framing of the paper we do not believe it adds to the study, as the subsamples still lack the temporal resolution seemingly required for examining the processes in the SST.

      - What was the GLM analysis time-locked to on SS and FS trials? The stop-signal or the GO-signal? 

      SS and FS trials were time-locked to the GO signal as this is standard practice. The main reason for this is that we use contrasts to interpret differences in activation patterns between conditions. By time-locking the FS and SS trials to the stop signal, we are contrasting events at different time points, and therefore different stages of processing, which introduces its own sources of error. We agree with the reviewer, however, that a separate analysis with time-locking on the stop-signal has its own merit, and now include results in the supplementary material where the FS and SS trials are time-locked to the stop signal as well.

      - Why was SSRT calculated using the outdated mean method? 

      We originally calculated SSRT using the mean method as this was how it was reported in the oldest of the aggregated studies. We have now re-calculated the SSRTs using the integration method with go omission replacement and thank the reviewer for pointing this out. Please see response to comment 3.

      - The authors chose 3.1 as a z-score to "ensure conservatism", but since they are essentially trying to prove the null hypothesis that there is no increased STN activity on SS trials, I would suggest erring on the side of a more lenient threshold to avoid type-2 error. 

      We have used minimum FDR-corrected thresholds for each contrast now, instead of using a blanket conservative threshold of 3.1 over all contrasts. The new thresholds for each contrast are shown in text. Please see below (page 12):

      “The thresholds for each contrast are as follows: 3.01 for FS > GO, 2.26 for FS > SS and 3.1 for SS > GO.”

      - The authors state that "The results presented here add to a growing literature exposing inconsistencies in our understanding of the networks underlying successful response inhibition". It would be helpful if the authors cited these studies and what those inconsistencies are. 

We thank reviewer 1 for their detailed and thorough evaluation of our paper. Overall, we agree that there is substantial direct and indirect evidence for the involvement of the cortico-basal-ganglia pathways in response inhibition. We have taken this constructive criticism on board and agree with the reviewer that the paper should be reframed; the thoroughness of these helpful comments greatly aided the revision of the paper.

      (1) I would suggest reframing the study, abstract, discussion, and title to reflect the fact that the study shows that fMRI is unsuitable to study subcortical activity in the SST, rather than the fact that we need to question the subcortical model of inhibition, given the reasons in my public review.

      We agree with the reviewer that the article should be reframed and not taken as direct evidence against the large sum of literature pointing towards the involvement of the cortico-basal-ganglia pathway in response inhibition. We have significantly rewritten the article in light of this.

      (2) I suggest combining the datasets that provide the best imaging parameters and then analyzing the subcortical ROIs with a more lenient threshold and with regressors time-locked to the stop-signals (if that's not already the case). This would make the claim of a null finding much more impactful. Some sort of power analysis and/or Bayes factor analysis of evidence for the null would also be appreciated. 

      Instead of using a blanket conservative threshold of 3.1, we instead used only FDR-corrected thresholds. The threshold level is therefore different for each contrast and noted in the figures. We have also added supplementary figures including the group-level SPMs and ROI analyses when the FS and SS trials were time-locked to the stop signal instead of the GO signal (Supplementary Figs 4 & 5). But as mentioned above, due to the difference in time points when contrasting, we believe that time-locking to the GO signal for all trial types makes more sense for the main analysis.

      We have now also computed BFs on the first level ROI beta estimates for all contrasts using the BayesFactor package as implemented in R. We add the following section to the methods and updated the results section accordingly (page 8):

“In addition to the frequentist analysis we also opted to compute Bayes Factors (BFs) for each contrast per ROI per hemisphere. To do this, we extracted the beta weights for each individual trial type from our first level model. We then compared the beta weights from each trial type to one another using the ‘BayesFactor’ package as implemented in R (Morey & Rouder, 2015). We compared the full model, comprising trial type, dataset and subject as predictors, to the null model, comprising only dataset and subject as predictors. The datasets and subjects were modeled as random factors. We divided the BF of the full model by that of the null model to quantify evidence for or against a difference in beta weights between trial types. To interpret the BFs, we used a modified version of Jeffreys’ scale (Jeffreys, 1939; Lee & Wagenmakers, 2014).”
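As an aside, the logic of comparing a full model (trial type + dataset + subject) against a null model (dataset + subject) can be illustrated with the common BIC approximation to the Bayes factor, BF10 ≈ exp((BIC_null − BIC_full)/2) (Wagenmakers, 2007). The sketch below is a simplified fixed-effects version in Python with an illustrative function name; the paper's actual analysis uses the BayesFactor R package with dataset and subject as random factors.

```python
import numpy as np

def bf_full_vs_null(y, trial_type, dataset):
    """BIC approximation to BF10 for adding trial type on top of dataset:
    BF10 ~ exp((BIC_null - BIC_full) / 2).
    Fixed-effects OLS only; subject-level random effects are omitted here,
    unlike the BayesFactor analysis reported in the paper."""
    y = np.asarray(y, dtype=float)
    n = len(y)

    def dummies(labels):
        levels = sorted(set(labels))
        if len(levels) < 2:
            return np.empty((n, 0))
        # drop the first level to avoid collinearity with the intercept
        return np.column_stack([[1.0 if l == lv else 0.0 for l in labels]
                                for lv in levels[1:]])

    def bic(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / n  # Gaussian MLE of the error variance
        loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
        return -2 * loglik + X.shape[1] * np.log(n)

    X_null = np.column_stack([np.ones((n, 1)), dummies(dataset)])
    X_full = np.column_stack([X_null, dummies(trial_type)])
    return float(np.exp((bic(X_null) - bic(X_full)) / 2))
```

A BF well above 1 favors the full model (a trial-type effect on the beta weights); a BF below 1 favors the null, to be interpreted against a scale such as Jeffreys'.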

      (3) I suggest calculating SSRT using the integration method with the replacement of Go omissions, as per the most recent recommendation (Verbruggen et al., eLife 2019).

      We agree we should have used a more optimal method for SSRT estimation. We have replaced our original estimations with that of the integration method with go omissions replacement, as suggested and adapted the results in table 3.

      We have also replaced text in the methods sections to reflect this (page 5):

      “For each participant, the SSRT was calculated using the mean method, estimated by subtracting the mean SSD from median go RT (Aron & Poldrack, 2006; Logan & Cowan, 1984).”

      Now reads:

      “For each participant, the SSRT was calculated using the integration method with replacement of go omissions (Verbruggen et al., 2019), estimated by integrating the RT distribution and calculating the point at which the integral equals p(respond|signal). The completion time of the stop process aligns with the nth RT, where n equals the number of RTs in the RT distribution of go trials multiplied by the probability of responding to a signal.”
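The integration-with-replacement estimator described in the revised text can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function name and inputs are ours, and omissions are replaced with the maximum observed RT per the Verbruggen et al. (2019) recommendation.

```python
import numpy as np

def ssrt_integration(go_rts, p_respond_signal, mean_ssd):
    """Estimate SSRT via the integration method with go-omission replacement.

    go_rts: go-trial RTs, with None marking omitted responses.
    p_respond_signal: observed probability of responding on stop trials.
    mean_ssd: mean stop-signal delay.
    """
    observed = [rt for rt in go_rts if rt is not None]
    # replace go omissions with the maximum observed RT
    rts = sorted(observed + [max(observed)] * (len(go_rts) - len(observed)))
    # the stop process finishes at the nth RT, where
    # n = number of go RTs * p(respond | signal)
    n = int(np.ceil(len(rts) * p_respond_signal))
    nth_rt = rts[max(n - 1, 0)]
    return nth_rt - mean_ssd
```

For example, with go RTs [300, 400, 500, 600, None] ms, p(respond|signal) = 0.5, and a mean SSD of 200 ms, the omission is replaced with 600 ms, the nth RT is the 3rd of 5 (500 ms), and the SSRT estimate is 300 ms.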

      Reviewer #2:

      This work aggregates data across 5 openly available stopping studies (3 at 7 tesla and 2 at 3 tesla) to evaluate activity patterns across the common contrasts of Failed Stop (FS) > Go, FS > stop success (SS), and SS > Go. Previous work has implicated a set of regions that tend to be positively active in one or more of these contrasts, including the bilateral inferior frontal gyrus, preSMA, and multiple basal ganglia structures. However, the authors argue that upon closer examination, many previous papers have not found subcortical structures to be more active on SS than FS trials, bringing into question whether they play an essential role in (successful) inhibition. In order to evaluate this with more data and power, the authors aggregate across five datasets and find many areas that are *more* active for FS than SS, specifically bilateral preSMA, caudate, GPE, thalamus, and VTA, and unilateral M1, GPi, putamen, SN, and STN. They argue that this brings into question the role of these areas in inhibition, based upon the assumption that areas involved in inhibition should be more active on successful stop than failed stop trials, not the opposite as they observed. 

      As an empirical result, I believe that the results are robust, but this work does not attempt a new theoretical synthesis of the neuro-cognitive mechanisms of stopping. Specifically, if these many areas are more active on failed stop than successful stop trials, and (at least some of) these areas are situated in pathways that are traditionally assumed to instantiate response inhibition like the hyperdirect pathway, then what function are these areas/pathways involved in? I believe that this work would make a larger impact if the author endeavored to synthesize these results into some kind of theoretical framework for how stopping is instantiated in the brain, even if that framework may be preliminary. 

      I also have one main concern about the analysis. The authors use the mean method for computing SSRT, but this has been shown to be more susceptible to distortion from RT slowing (Verbruggen, Chambers & Logan, 2013 Psych Sci), and goes against the consensus recommendation of using the integration with replacement method (Verbruggen et al., 2019). Therefore, I would strongly recommend replacing all mean SSRT estimates with estimates using the integration with replacement method. 

      I found the paper clearly written and empirically strong. As I mentioned in the public review, I believe that the main shortcoming is the lack of theoretical synthesis. I would encourage the authors to attempt to synthesize these results into some form of theoretical explanation. I would also encourage replacing the mean method with the integration with replacement method for computing SSRT. I also have the following specific comments and suggestions (in the approximate order in which they appear in the manuscript) that I hope can improve the manuscript: 

      We would like to thank reviewer 2 for their insightful and interesting comments. We have adapted our paper to reflect these comments. Please see direct responses to your comments below. We agree with the reviewer that some type of theoretical synthesis would help with the interpretability of the article. We have substantially reworked the discussion and included theoretical considerations behind the newer narrative. Please see all changes to the article indicated by red text.

      (1) The authors say "performance on successful stop trials is quantified by the stop signal reaction time". I don't think this is technically accurate. SSRT is a measure of the average latency of the stop process for all trials, not just for the trials in which subjects successfully stop. 

Thank you for pointing out this technically incorrect statement. We have replaced the above sentence with the following (page 1):

      “Inhibition performance in the SST as a whole is quantified by the stop signal reaction time (SSRT), which estimates the speed of the latent stopping process (Verbruggen et al., 2019).”

      (2) The authors say "few studies have detected differences in the BOLD response between FS and SS trials", but then do not cite any papers that detected differences until several sentences later (de Hollander et al., 2017; Isherwood et al., 2023; Miletic et al., 2020). If these are the only ones, and they only show greater FS than SS, then I think this point could be made more clearly and directly. 

      We have moved the citations to the correct place in the text to be clearer. We have also rephrased this part of the introduction to make the points more direct (page 2).

      “In the subcortex, functional evidence is relatively inconsistent. Some studies have found an increase in BOLD response in the STN in SS > GO contrasts (Aron & Poldrack, 2006; Coxon et al., 2016; Gaillard et al., 2020; Yoon et al., 2019), but others have failed to replicate this (Bloemendaal et al., 2016; Boehler et al., 2010; Chang et al., 2020; B. Xu et al., 2015). Moreover, some studies have actually found higher STN, SN and thalamic activation in failed stop trials, not successful ones (de Hollander et al., 2017; Isherwood et al., 2023; Miletić et al., 2020).”

      (3) Unless I overlooked it, I don't believe that the author specified the criterion that any given subject is excluded based upon. Given some studies have significant exclusions (e.g., Poldrack_3T), I think being clear about how many subjects violated each criterion would be useful. 

      This is indeed interesting and important information to include. We have added the number of participants who were excluded for each criterion. Please see added text below (page 4):

      “Based on these criteria, no subjects were excluded from the Aron_3T dataset. 24 subjects were excluded from the Poldrack_3T dataset (3 based on criterion 1, 9 on criterion 2, 11 on criterion 3, and 8 on criterion 4). Three subjects were excluded from the deHollander_7T dataset (2 based on criterion 1 and 1 on criterion 2). Five subjects were excluded from the Isherwood_7T dataset (2 based on criterion 1, 1 on criterion 2, and 2 on criterion 4). Two subjects were excluded from the Miletic_7T dataset (1 based on criterion 2 and 1 on criterion 4). Note that some participants in the Poldrack_3T study failed to meet multiple inclusion criteria.”

      (4) The Method section included very exhaustive descriptions of the neuroimaging processing pipeline, which was appreciated. However, it seems that much of what is presented is not actually used in any of the analyses. For example, it seems that "functional data preprocessing" section may be fMRIPrep boilerplate, which again is fine, but I think it would help to clarify that much of the preprocessing was not used in any part of the analysis pipeline for any results. For example, at first blush, I thought the authors were using global signal regression, but after a more careful examination, I believe that they are only computing global signals but never using them. Similarly with tCompCor seemingly being computed but not used. If possible, I would recommend that the authors share code that instantiates their behavioral and neuroimaging analysis pipeline so that any confusion about what was actually done could be programmatically verified. At a minimum, I would recommend more clearly distinguishing the pipeline steps that actually went into any presented analyses.

We thank the reviewer for finding this inconsistency. The methods section indeed uses the fMRIPrep boilerplate text, which we included so as to be as accurate as possible when describing the preprocessing steps taken. While we believe leaving the exact boilerplate text that fMRIPrep provides is the most accurate way to report our preprocessing, we have adapted some of the text to clarify which computations were not used in the subsequent analysis. As a side note, for future reference, we would like to add that the fMRIPrep authors expressly recommend that users report the boilerplate completely and unaltered, and as such, we believe this may become a recurring issue (page 7).

      “While many regressors were computed in the preprocessing of the fMRI data, not all were used in the subsequent analysis. The exact regressors used for the analysis can be found above. For example, tCompCor and global signals were calculated in our generic preprocessing pipeline but not part of the analysis. The code used for preprocessing and analysis can be found in the data and code availability statement.”

      (5) What does it mean for the Poldrack_3T to have N/A for SSD range? Please clarify. 

      Thank you for pointing out this omission. We had not yet found the possible SSD range for this study. We have replaced this value with the correct value (0 – 1000 ms).

      (6) The SSD range of 0-2000ms for deHollander_7T and Miletic_7T seems very high. Was this limit ever reached or even approached? SSD distributions could be a useful addition to the supplement. 

Thank you for also bringing this mistake to light. We had accidentally placed the max trial duration in these fields instead of the max allowable SSD value. We have replaced it with the correct value (0 – 900 ms).

      (7) The author says "In addition, median go RTs did not correlate with mean SSRTs within datasets (Aron_3T: r = .411, p = .10, BF = 1.41; Poldrack_3T: r = .011, p = .91, BF = .23; deHollander_7T: r = -.30, p = .09, BF = 1.30; Isherwood_7T: r = .13, p = .65, BF = .57; Miletic_7T: r = .37, p = .19, BF = 1.02), indicating independence between the stop and go processes, an important assumption of the horse-race model (Logan & Cowan, 1984)." However, the independent race model assumes context independence (the finishing time of the go process is not affected by the presence of the stop process) and stochastic independence (the duration of the go and stop processes are independent on a given trial). This analysis does not seem to evaluate either of these forms of independence, as it correlates RT and SSRT across subjects, so it was unclear how this analysis evaluated either of the types of independence that are assumed by the independent race model. Please clarify or remove. 

      Thank you for this comment. We realize that this analysis indeed does not evaluate either context or stochastic independence and therefore we have removed this from the manuscript.

      (8) The RTs in Isherwood_7T are considerably slower than the other studies, even though the go stimulus+response is the same (very simple) stimulus-response mapping from arrows to button presses. Is there any difference in procedure or stimuli that might explain this difference? It is the only study with a visual stop signal, but to my knowledge, there is no work suggesting visual stop signals encourage more proactive slowing. If possible, I think a brief discussion of the unusually slow RTs in Isherwood_7T would be useful. 

      We have included the following text in the manuscript to reflect this observed difference in RT between the Isherwood_7T dataset and the other datasets (page 9).

      “Longer RTs were found in the Isherwood_7T dataset in comparison to the four other datasets. The only difference in procedure in the Isherwood_7T dataset is the use of a visual stop signal as opposed to an auditory stop signal. This RT difference is consistent with previous research, where auditory stop signals and visual go stimuli have been associated with faster RTs compared to unimodal visual presentation (Carrillo-de-la-Peña et al., 2019; Weber et al., 2024). The mean SSRTs and probability of stopping are within normal range, indicating that participants understood the task and responded in the expected manner.”

      (9) When the authors included both 3T and 7T data, I thought they were preparing to evaluate the effect of magnet strength on stop networks, but they didn't do this analysis. Is this because the authors believe there is insufficient power? It seems that this could be an interesting exploratory analysis that could improve the paper.

      We thank the reviewer for this interesting comment. As our dataset sample contains only two 3T and three 7T datasets we indeed believe there is insufficient power to warrant such an analysis. In addition, we wanted the focus of this paper to be how fMRI examines the SST in general, and not differences between acquisition methods. With a greater number of datasets with different imaging parameters (especially TE or resolution) in addition to field strength, we agree such an analysis would be interesting, although beyond the scope of this article.

      (10) The authors evaluate smoothing and it seems that the conclusion that they want to come to is that with a larger smoothing kernel, the results in the stop networks bleed into surrounding areas, producing false positive activity. However, in the absence of a ground truth of the true contributions of these areas, it seems that an alternative interpretation of the results is that the denser maps when using a larger smoothing kernel could be closer to "true" activation, with the maps using a smaller smoothing kernel missing some true activity. It seems worth entertaining these two possible interpretations for the smoothing results unless there is clear reason to conclude that the smoothed results are producing false positive activity. 

      We agree with the view of the reviewer on the interpretation of the smoothing results. We indeed cannot rule this out as a possible interpretation of the results, due to a lack of ground truth. We have added text to the article to reflect this view and discuss the types of errors we can expect for both smaller and larger smoothing kernels (page 15).

“In the absence of a ground truth, we are not able to fully justify the use of either larger or smaller kernels to analyse such data. On the one hand, overly large smoothing kernels could lead to false positives in activation profiles, due to bleeding of observed activation into surrounding tissues. On the other hand, too little smoothing could lead to false negatives, missing some true activity in surrounding regions. While we cannot concretely validate either choice, it should be noted that there is lower spatial uncertainty in the subcortex compared to the cortex, due to the lower anatomical variability. False positives from smoothing spatially unmatched signal are therefore more likely than false negatives. It may be more prudent for studies to use a range of smoothing kernels, to assess the robustness of their fMRI activation profiles.”

    1. automatically calls

Wow. So if the grantType is the authorization code flow type, this parseFromUrl() function actually automatically does an entire extra step: the POST to get the token.

    2. which malicious browser extensions would not have access to

According to the next paragraph, at least in this example case of Okta, we don't need manual access to the random secret generated by the frontend code.
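To make the flow concrete, here is a rough Python sketch of what an SDK helper like parseFromUrl() automates in the authorization-code + PKCE flow. The callback URL, client id, and code value are made up for illustration; a real client would also verify the state parameter and perform step 3 with an actual HTTP request.

```python
import base64
import hashlib
import secrets
import urllib.parse

# 1. Before redirecting to the provider, the frontend generates the PKCE pair.
code_verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
code_challenge = base64.urlsafe_b64encode(
    hashlib.sha256(code_verifier.encode()).digest()).rstrip(b"=").decode()
# The challenge goes in the authorize URL; the verifier stays with the client.

# 2. After the redirect back, parse the one-time code from the callback URL.
callback = "https://app.example.com/callback?code=abc123&state=xyz"
params = urllib.parse.parse_qs(urllib.parse.urlparse(callback).query)
auth_code = params["code"][0]

# 3. The "extra step": POST the code plus the verifier to the token endpoint.
token_request = urllib.parse.urlencode({
    "grant_type": "authorization_code",
    "code": auth_code,
    "code_verifier": code_verifier,  # proves this client initiated the flow
    "redirect_uri": "https://app.example.com/callback",
    "client_id": "example-client-id",
})
# e.g. requests.post(token_url, data=token_request) would return the tokens.
```

Because the server recomputes SHA-256(code_verifier) and checks it against the earlier challenge, an extension that only sees the callback URL cannot redeem the code without the verifier.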

    1. I wish I would have written less code

code is not an asset

      but a liability

    1. Reviewer #2 (Public Review):

      Summary:

This study takes advantage of multiple methodological advances to perform layer-specific staining of cortical neurons and tracking of their axons to identify the pattern of their projections. This publication offers a mesoscale view of the projection patterns of neurons in the whisker primary and secondary somatosensory cortex. The authors report that, consistent with the literature, the pattern of projection differs strongly across cortical layers and cell subtypes, with targets located throughout the brain. This was tested across 6 different mouse lines that expressed a marker in layer 2/3, layer 4, layer 5 (3 sub-types) and layer 6.<br /> Looking more closely at the projections from primary somatosensory cortex into the primary motor cortex, they found significant spatial clustering of projections from topographically separated neurons across the primary somatosensory cortex. This was true for neurons with cell bodies located across all tested layers/types.

      Strengths:

This study successfully looks at the relevant scale for studying projection patterns, which is the whole brain. This is achieved thanks to an ambitious combination of mouse lines, immunohistochemistry, imaging and image processing, which results in a standardized histological pipeline that processes the whole-brain projection patterns of layer-selected neurons of the primary and secondary somatosensory cortex.<br /> This standardization means that comparisons between cell-type projection patterns are possible, and that both the large-scale structure of the pattern and the minute details of the intra-area patterns are available.<br /> This reference dataset and the corresponding analysis code are made available to the research community.

      Weaknesses:

One major question raised by this dataset is the risk of missing axons during the post-processing step. Indeed, it appears that the control and training efforts have focused on the risk of false positives (see Figure 1 supplementary panels), and the risk of overlooking existing axons in the raw fluorescence data is discussed in the article.

Based on the data reported in the article, this is more than a risk. In particular, Figure 2 shows an example Rbp4-L5 mouse in which axonal spread appears massive in the hippocampus, while there is no mention of this area in the processed projection data for this mouse line.

Similarly, the Ntsr1-L6CT example shows a striking level of fluorescence in the striatum that is not reflected in the amount of axons detected by the algorithms in the subsequent figures.<br /> These apparent discrepancies may be due to non-axon-specific fluorescence in the samples. In any case, further analysis of such anatomical areas would be useful to consolidate the valuable dataset provided by the article.

    1. https://www.youtube.com/watch?v=KbzqB09ZyJQ

Video summary [00:00:00] - [00:25:15]:

This video presents a lecture by Didier Fassin on the faculty of punishing, exploring punitive practices beyond their legal form and the impact of the prison sentence. Fassin discusses theories of retributive and utilitarian justice, and analyzes the extension of punishment, notably in terms of the depth of the affliction and the widening of the circle of the afflicted.

Highlights:

+ [00:00:29] Critique of the traditional definition of punishment
  * Questioning of the definition given by philosophers and jurists
  * Focus on chastisement to understand punitive practices
  * Importance of the reality of practices beyond the law
+ [00:01:54] The extension of punishment
  * Punishment afflicts more deeply and beyond its initial intention
  * Analysis of imprisonment and its impact on prisoners' relatives
  * Reference to research in sociology and anthropology
+ [00:03:38] Cesare Beccaria's influence on the prison and the prison sentence
  * Discussion of Beccaria's work and its impact on penal reform
  * Rejection of retributive justice and advocacy of utilitarianism
  * The importance of crime prevention and of the proportionality of punishments
+ [00:21:19] The history of the prison sentence in France
  * Introduction of imprisonment in the penal code of 1791
  * Evolution of punishments and principles of penal reform
  * Critique of the softening of punishments and of generalized punishment

Video summary [00:25:17] - [00:48:46]:

This part of the video explores the evolution of the prison sentence in French history, from its introduction into the penal code in 1781 to its contemporary use. It examines changes in prison management, conditions of detention, and the impact on prisoners and prison staff.

Highlights:

+ [00:25:17] Introduction of punitive imprisonment
  * Replacement of corporal punishments
  * Imprisonment as the ordinary sanction
  * Progressive extension of the carceral empire
+ [00:29:00] Privatization and prison conditions
  * Development of prison manufactories
  * A captive workforce for industrialists
  * Denunciation of abuses by intellectuals
+ [00:35:00] How staff perceive the prison
  * Defense strategies in the face of prisoners' distress
  * Cognitive indifference and justification of aggressiveness
  * Sensitivity of some staff members
+ [00:41:00] Confinement and deprivation of liberty
  * Confinement within confinement
  * Impact of prison overcrowding
  * Deprivation of emotional and sexual life
+ [00:45:00] Frustrations and deprivation of dignity
  * The staff's omnipotence and the trial of self-esteem
  * Body searches and attacks on dignity
  * Difficulties linked to forced cohabitation

Video summary [00:48:48] - [01:03:27]:

This part of the video addresses the faculty of punishing and the various deprivations suffered by prisoners, going beyond the mere loss of liberty. It highlights the difficult living conditions in prison, the impact of incarceration on prisoners' relatives, and the broader social consequences of prison policy.

Highlights:

+ [00:48:48] Conditions of detention
  * The administrative court's decisions and their impact
  * Restrictions on daily activities such as showers
  * The pressure exerted by guards
+ [00:50:07] The deprivation of the sentence's meaning
  * The absence of any reckoning with the offense committed
  * The lack of opportunities for social reintegration and moral reform
  * The description of the prison as a place that devalues the person
+ [00:52:17] Suicide in prison
  * The high suicide rate in French prisons
  * Factors contributing to suicides, such as the shock of incarceration
  * The impact of prison conditions on prisoners' mental health
+ [00:55:23] Repercussions on prisoners' relatives
  * The effect of incarceration on families and communities
  * The difficulties visitors face in accessing the prison
  * The notion of "vicarious punishment" afflicting the relatives of the convicted

    1. Fig. 3.5 Discharge patterns of the Rhine and Meuse rivers.

Underlying data? Python code for the plot? It might also be worthwhile to give the 10th and 90th percentiles.

    1. AbstractThe edible jelly fungus Dacryopinax spathularia (Dacrymycetaceae) is wood-decaying and can be commonly found worldwide. It has also been used in food additives given its ability to synthesize long-chain glycolipids. In this study, we present the genome assembly of D. spathularia using a combination of PacBio HiFi reads and Omni-C data. The genome size of D. spathularia is 29.2 Mb and in high sequence contiguity and completeness, including scaffold N50 of 1.925 Mb and 92.0% BUSCO score, respectively. A total of 11,510 protein-coding genes, and 474.7 kb repeats accounting for 1.62% of the genome, were also predicted. The D. spathularia genome assembly generated in this study provides a valuable resource for understanding their ecology such as wood decaying capability, evolutionary relationships with other fungus, as well as their unique biology and applications in the food industry.

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.120), and has published the reviews under the same license. This is part of a thematic series presenting Data Releases from the Hong Kong Biodiversity Genomics consortium (https://doi.org/10.46471/GIGABYTE_SERIES_0006). These are as follows.

      Reviewer 1. Anton Sonnenberg

      Is the language of sufficient quality? Yes.

      Are all data available and do they match the descriptions in the paper? Yes.

      Is the data acquisition clear, complete and methodologically sound? Yes.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction? Yes.

      Figure 1E could be improved by eliminating in the pie-chart the non-repeat sequences or bar-plot the repeats. That will visualize better the frequencies of each type of repeats.

      Reviewer 2. Riccardo Iacovelli

      Is the language of sufficient quality? No.

      There are several typos spread across the text, and some sentences are written in an unclear manner. I provide some suggestions in the attachment.

      Are all data available and do they match the descriptions in the paper?

      Yes, but some of the data shown is rather unclear and/or not supported by sufficient explanation. For example, what is actually Fig. 1C showing? Because the reference in the text (which contains a typo, line 197) refers to something else. What is the second set of stats in Fig. 1B? This other organism is not mentioned at all anywhere in the manuscript.

      Are the data and metadata consistent with relevant minimum information or reporting standards?

      No. NCBI TaxID of the sequenced species object of this work is missing.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      No. In my opinion, some of the procedures described for the processing of the sample and library prep for sequencing are reported in an unclear way. For example, lines 100-103: no details on RNAse A treatment; how do you define chloroform:IAA (24:1) washes? how much supernatant is added to how much H1 buffer to have the final volume of 6 ml? Another example, lines 180-175: what parameters did you use for EvidenceModeler to generate the final consensus genes model? The weight given to each particular prediction set is important.

      Is there sufficient data validation and statistical analyses of data quality?

      No. While sufficient data validation and statistical analyses have been carried out with respect to DNA sequencing and genome assembly, nothing is reported about DNA extraction and quality. The authors mention several times throughout the text that DNA preps are checked via NanoDrop, Qubit, gel electrophoresis, etc. But none of this is shown in the main body or in the supplementary information. Without this information, it is difficult to directly assess the efficacy of the DNA extraction and preparation methods. I recommend including this type of data.

      Additional Comments:

      In this article, the authors report the first whole-genome assembly of Dacryopinax spathularia, an edible mushroom-forming fungus used in the food industry to produce natural preservatives. In general, I find the data of sufficiently high quality for release, and I agree with the authors that they will prove useful for gaining further insights into the ecology of the fungus and for better understanding the genetic basis of its ability to decay wood and produce valuable compounds. This can ultimately lead to discoveries with applications in biotech and other industries.

      Nevertheless, during the review process I noticed several shortcomings with respect to unclear language, insufficient description of the experimental procedures and/or results presented, and missing data altogether. These are all discussed within the checklist available in the ReView portal. For minor comments line-by-line, see below:

      1: Dacrymycetaceae should be italicized (throughout the whole manuscript). This follows the convention established by The International Code of Nomenclature for algae, fungi, and plants (https://www.iaptglobal.org/icn). Although not binding, this allows easy recognition of taxonomic ranks when reading an article.
      49: other fungus -> other fungi
      56: photodynamic injury -> UV damage/radiation (photodynamic is used with respect to light-activated therapies etc.)
      60: in food industry as natural preservatives in soft drinks -> in food industry to produce natural preservatives for soft drinks
      68: cultivated in industry as food additives -> cultivated in industry to produce food additives
      69: isolated fungal extract -> the isolated fungal extract
      71: What do you mean by Pacific? It’s unclear
      71-72: the genomic resource -> genomic data / genome sequence
      72: I would remove “with translational values”; it is very vague and does not add anything to the statement
      78: genomic resource -> genomic data / genome sequence
      78-81: this could be rephrased in a smoother manner, e.g. something like “the genomic data will be useful to gain a better understanding of the fungus’ ecology as well as the genetic basis of its wood-decaying ability and…”
      85: fruit bodies -> fruiting bodies
      88-89: Grown hyphae from >2 week-old was transferred -> Fungal hyphae from 2-week-old colonies were transferred
      90-91: validated with the DNA barcode of Translation -> assigned by DNA barcoding using the sequence of Translation…
      95: ~ -> Approximately (sentences are not usually started with symbols or numbers)
      101-3: Procedure is not clear enough (see other comments through ReView portal)
      124: for further cleanup the library -> to further clean up the library / for further cleanup of the library
      132: as line 95
      152: as lines 95, 132
      181-5: Insufficient description of methods, see comments through ReView portal
      197: Figure and 1C; Table 2 -> Figure 1C and Table 2
      200: average protein length of 451 bp -> average protein-coding gene length / average protein length of ~150 amino acids
      211: via the fermentation process with applications in the food industry -> via the fermentation process with potential applications in the food industry

      As a fungal biologist myself interested in fungal genomics and biotechnology, I would like to thank the authors for carrying out this work and the editor for the opportunity to review it. I am looking forward to reading the revised version of the manuscript.

      Riccardo Iacovelli, PhD GRIP, Chemical and Pharmaceutical Biology department University of Groningen, Groningen - The Netherlands

    1. AI-powered code generation tools like GitHub Copilot make it easier to write boilerplate code, but they don’t eliminate the need to consult with your organization’s domain experts to work through logic, debugging, and other complex problems. Stack Overflow for Teams is a knowledge-sharing platform that transfers contextual knowledge validated by your domain experts to other employees. It can even foster a code generation community of practice that champions early adopters and scales their learnings. OverflowAI makes this trusted internal knowledge—along with knowledge validated by the global Stack Overflow community—instantly accessible in places like your IDE so it can be used alongside code generation tools. As a result, your teams learn more about your codebase, rework code less often, and speed up your time-to-production.
    1. Reviewer #1 (Public Review):

      I thank the authors for addressing almost all my comments on the previous version of this manuscript, which studies the representation by gender and name origin of authors from Nature and Springer Nature articles in Nature News.

      The representation of author identities is an important step towards equality in science, and the authors found that women are underrepresented in news quotes and mentions with respect to the proportion of women authors.

      The research is rigorously conducted. It presents relevant questions and compelling answers. The documentation of the data and methods is thoroughly done, and the authors provide the code and data for reproduction.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We would like to first thank the Editor as well as the three reviewers for their enthusiasm and for conducting another careful evaluation of our manuscript. We appreciate their thoughtful and constructive comments and suggestions. Some concerns regarding experimental design, data analysis, and over-interpretation of our findings remained unresolved after the initial revision. Here we have endeavored to address these remaining concerns through further refinement of our writing and by including them in the discussion section. We hope our response better explains the rationale of our experimental design and data interpretation. In addition, we acknowledge the limitations of the present study so that they can benefit future investigations into this topic. Our detailed responses are provided below.

      Reviewer #1 (Public Review):

      This study examines whether the human brain uses a hexagonal grid-like representation to navigate in a non-spatial space constructed by competence and trustworthiness. To test this, the authors asked human participants to learn the levels of competence and trustworthiness for six faces by associating them with specific lengths of bar graphs that indicate their levels in each trait. After learning, participants were asked to extrapolate the location from the partially observed morphing bar graphs. Using fMRI, the authors identified brain areas where activity is modulated by the angles of morphing trajectories in six-fold symmetry. The strength of this paper lies in the question it attempts to address. Specifically, the question of whether and how the human brain uses grid-like representations not only for spatial navigation but also for navigating abstract concepts, such as social space, and guiding everyday decision-making. This question is of emerging importance.

      I acknowledge the authors' efforts to address the comments received. However, my concerns persist:

      Thanks very much again for the re-evaluation and comments. Please find our revision plans for each comment below.

      (1) The authors contend that shorter reaction times correlated with increased distances between individuals in social space imply that participants construct and utilize two-dimensional representations. This method is adapted from a previous study by Park et al. Yet, there is a fundamental distinction between the two studies. In the prior work, participants learned relationships between adjacent individuals, receiving feedback on their decisions, akin to learning spatial locations during navigation. This setup leads to two different predictions: If participants rely on memory to infer relationships, recalling more pairs would be necessary for distant individuals than for closer ones. Conversely, if participants can directly gauge distances using a cognitive map, they would estimate distances between far individuals as quickly as for closer ones. Consequently, as the authors suggest, reaction times ought to decrease with increasing decision value, which, in this context, corresponds to distances. However, the current study allowed participants to compare all possible pairs without restricting learning experiences, rendering the application of the same methodology for testing two-dimensional representations inappropriate. In this study, the results could be interpreted as participants not forming and utilizing two-dimensional representations.

      We apologize for not being clear enough about our task design; we have made relevant changes in the methodology section of the manuscript to make it clearer. The reviewer’s concern is that participants learned about all the pairs in the comparison task, which would make the distance effect invalid. We would like to clarify that during all the memory test tasks (the comparison task, the collect task, and the recall task outside and inside the scanner), participants never received feedback on whether their responses were correct. Therefore, the comparison task in our study is similar to that of the previous study by Park et al. (2021). Participants did not have access to the correct responses for all possible pairs prior to or during this task, so they needed to make inferences based on memory retrieval.

      (2) The confounding of visual features with the value of social decision-making complicates the interpretation of this study's results. It remains unclear whether the observed grid-like effects are due to visual features or are genuinely indicative of value-based decision-making, as argued by the authors. Contrary to the authors' argument, this issue was not present in the previous study (Constantinescu et al.). In that study, participants associated specific stimuli with the identities of hidden items, but these stimuli were not linked to decision-making values (i.e., no image was considered superior to another). The current study's paradigm is more akin to that of Bao et al., which the authors mention in the context of RSA analysis. Indeed, Bao et al. controlled the length of the bars specifically to address the problem highlighted here. Regrettably, in the current paradigm, this conflation remains inseparable.

      We’d like to thank the reviewer for facilitating the discussion on the question of ‘social space’ vs. ‘sensory space’. The task in the scanner did not require value-based decision-making. It is akin to both the Bao et al. (2019) study and the Constantinescu et al. (2016) study in the sense that all three tasks ask participants to imagine moving along a trajectory in an abstract, non-physical space, with the trajectory grounded in a sensory cue. Participants were trained to associate the sensory cue with abstract (social/nonsocial) concepts. We think that the paradigm is a relatively faithful replication of the study by Constantinescu et al. Nonetheless, we agree that a design similar to Bao et al. (2019), which controls for sensory confounds, would be more ideal for addressing this concern, as would adopting a value-based decision-making task in the scanner similar to that of Park et al. (2021); we have included this limitation in the discussion section.

      (3) While the authors have responded to comments in the public review, my concerns noted in the Recommendation section remain unaddressed. As indicated in my recommendations, there are aspects of the authors' methodology and results that I find difficult to comprehend. Resolving these issues is imperative to facilitate an appropriate review in subsequent stages.

      Considering that the issues raised in the previous comments remain unresolved, I have retained my earlier comments below for review.

      We apologize for not addressing the recommendations properly; please find below our detailed responses and plans for revision.

      I have some comments. I hope that these can help.

      (1) While the explanation of Fig.4A-C is lacking in both the main text and figure legend, I am not sure if I understand this finding correctly. Did the authors find the effects of hexagonal modulation in the medial temporal gyrus and lingual gyrus correlate with the individual differences in the extent to which their reaction times were associated with the distances between faces when choosing a better collaborator? If so, I am not sure what argument the authors try to draw from these findings. Do the authors argue that these brain areas show hexagonal modulation, which was not supported in the previous analysis (Fig.3)? What is the level of correlation between these behavioral measures and the grid consistency effects in the vmPFC and EC, where the authors found actual grid-like activity? How do the authors interpret this finding? More importantly, how does this finding associate with other findings and the argument of the study?

      We apologize for not being clear enough in the manuscript; we will improve the clarity in our revision. The exploratory analysis reported in Figure 4 uses whole-brain analysis to examine: 1) whether there is any correlation between the strength of the grid-like representation of the social value map and behavioral indicators of map-like representation; and 2) whether there is any correlation between the strength of the grid-like representation of this social value map and participants’ social traits.

      To be more specific, for the behavioral indicator we used the distance effect in the reaction times of the comparison task outside the scanner. We interpreted a stronger distance effect as a behavioral index of a better internal map-like representation, and a stronger grid consistency effect as a neural index of a better representation of the 2D social space. We therefore tested whether the behavioral and neural indices of map-like representation are correlated.

      To achieve this goal, the behavioral indicators were entered as covariates in the second-level analysis of the GLM testing the grid consistency effect (GLM2). Figure 3 shows results from GLM2 without the covariates. Figure 4 shows the clusters whose neural indices of map-like representation covaried with the behavioral index and survived multiple-comparison correction. Indeed, in these regions the grid consistency effect was not significant at the group level (so they are not shown in Figure 3). We interpret this finding in our discussion (lines 374-289 for the temporal lobe correlation, lines 395-404 for the precuneus correlation).

      Finally, we would like to point out that including the covariates in GLM2 did not change the results in Figure 3; the clusters in Figure 3 still survive correction. Meanwhile, these clusters did not show a correlation with the behavioral indicators of map-like representation.
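      As an aside for readers, the "distance effect" used here as a behavioral covariate is essentially a per-participant regression slope. A minimal sketch (with made-up numbers; not the authors' actual pipeline):

```python
def distance_effect(rts, distances):
    """Per-participant distance effect: the slope from regressing
    reaction time on pairwise distance in the 2D social space.
    A negative slope (faster responses for larger distances) is read
    as a behavioral index of a map-like internal representation."""
    n = len(rts)
    mean_d = sum(distances) / n
    mean_rt = sum(rts) / n
    cov = sum((d - mean_d) * (r - mean_rt) for d, r in zip(distances, rts))
    var = sum((d - mean_d) ** 2 for d in distances)
    return cov / var

# Hypothetical trials: reaction times shrink as pair distance grows
slope = distance_effect([1.8, 1.5, 1.3, 1.0], [1.0, 2.0, 3.0, 4.0])
```

      Each participant's slope would then be entered as a covariate in the second-level analysis.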

      Author response image 1.

      (2) There are no behavioral results provided. How accurately did participants perform each of the tasks? How are the effects of grid consistency associated with the level of accuracy in the map test?

      Why did participants perform the recall task again outside the scanner?

      We will endeavor to improve the signposting of the corresponding figures in the main text. The behavioral results are reported in the section “Participants construct social value map after associative learning of avatars and corresponding characteristics” in the main text, and the plots are shown in Figure 1. In particular, Figure 1F shows the accuracy of the training tasks as well as the recall task in the scanner. As for the correlation, we did not find a significant correlation between behavioral accuracy and the grid consistency effect. We will make this clearer in the results section.

      (3) The methods did not explain how the grid orientation was estimated and what the regressors were in GLM2. I don't think equations 2 and 3 are quite right.

      For the grid orientation estimation method, we provide a detailed description in Supplementary Methods 2.2.2. We will add links to this section in the main text.

      Equations 2 and 3 describe how the parametric regressors entered into GLM2 were formed, and provide the prerequisites for the calculation of grid orientations. Equation 2 is a direct application of the angle addition and subtraction theorems, so it should be correct. We will try to make the rationale clearer in the supplementary text.

      (4) With the increase in navigation distances, more grid cells would activate. Therefore, in theory, the activity in the entorhinal cortex should increase with the Euclidean distances, which has not been found here. I wonder if there was enough variability in the Euclidean distances that can be captured by neural correlates. This would require including the distributions of Euclidean distances according to their trajectory angles. Regarding how Fig.1E is generated, I don't understand what this heat map indicates. Additionally, it needs to be confirmed if the grid effects remain while controlling for the Euclidean distances of navigation trajectories.

      We did not specifically control for trajectory length; we only controlled the distribution of trajectory directions to be uniform. We have included the distribution of Euclidean distances in Figure S9 and the distribution of trajectory directions in Figure S8.

      Author response image 2.

      As for Figure 1E, we aim to reproduce the findings from Figure 1F in Constantinescu et al. (2016), where the authors showed that participants progressively refined the locations of the outcomes through training. We divided the space into 15×15 subregions, computed the amount of time spent in each subregion, and plotted the result as Figure 1E. Brighter colors in Figure 1E indicate a greater amount of time spent in the corresponding subregion. Note that all these timing indices were computed as a percentage of the total time spent in the explore task in a given session. If participants were well-acquainted with the space and avatars, they would spend more time at the avatars (brighter colors at avatar locations) in the review session compared to the learning session.
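      The occupancy computation described above can be sketched as follows (a simplified illustration with hypothetical trajectory samples; the 15×15 binning follows the text, everything else is assumed):

```python
def occupancy_map(xs, ys, n_bins=15):
    """Divide the unit square into n_bins x n_bins subregions and
    return the fraction of trajectory samples (i.e., time) spent in
    each, as a percentage of total exploration time."""
    grid = [[0.0] * n_bins for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        i = min(int(x * n_bins), n_bins - 1)  # clamp x == 1.0 into the last bin
        j = min(int(y * n_bins), n_bins - 1)
        grid[i][j] += 1.0
    total = len(xs)
    return [[c / total for c in row] for row in grid]

# Three hypothetical samples: two at one location, one at another
occ = occupancy_map([0.1, 0.1, 0.9], [0.1, 0.1, 0.9])
```

      Plotting such a grid as a heatmap, with brighter colors for larger fractions, yields the kind of figure described.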

      As for the effect of distance on the grid-like representation, we did not include distance as a parametric modulator in the grid consistency effect GLM (GLM2) due to insufficient trials in each bin (6-8 trials). However, there is indirect evidence that could potentially rule out this confound: in the distance representation analysis, we did not find distance representation in any of the clusters that showed significant grid-like representation (regions in Figure 2).

      Reviewer #2 (Public Review):

      Summary:

      In this work, Liang et al. investigate whether an abstract social space is neurally represented by a grid-like code. They trained participants to 'navigate' around a two-dimensional space of social agents characterized by the traits warmth and competence, then measured neural activity as participants imagined navigating through this space. The primary neural analysis consisted of three procedures: 1) identifying brain regions exhibiting the hexagonal modulation characteristic of a grid-like code, 2) estimating the orientation of each region's grid, and 3) testing whether the strength of the univariate neural signal increases when a participant is navigating in a direction aligned with the grid, compared to a direction that is misaligned with the grid. From these analyses, the authors find the clearest evidence of a grid-like code in the prefrontal cortex and weaker evidence in the entorhinal cortex.
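      For readers unfamiliar with this style of analysis, the three steps can be sketched as follows (a schematic using the standard cos/sin six-fold regressor approach; this is not the authors' exact pipeline, and all variable names are illustrative):

```python
import math

def grid_orientation(thetas, bold):
    """Estimate the grid orientation phi from the six-fold sine and
    cosine components of the signal (a Fourier projection, equivalent
    to a cos(6*theta)/sin(6*theta) GLM when trajectory directions are
    sampled uniformly)."""
    b_cos = sum(b * math.cos(6 * t) for t, b in zip(thetas, bold))
    b_sin = sum(b * math.sin(6 * t) for t, b in zip(thetas, bold))
    return math.atan2(b_sin, b_cos) / 6.0

def is_aligned(theta, phi, tol=math.pi / 12):
    """True if a trajectory direction falls within +/-15 degrees of
    any of the six grid axes (60-degree periodicity)."""
    d = (6 * (theta - phi) + math.pi) % (2 * math.pi) - math.pi
    return abs(d / 6.0) < tol

# Simulate a BOLD-like signal whose true grid orientation is 10 degrees
phi_true = math.radians(10)
thetas = [math.radians(a) for a in range(0, 360, 5)]
bold = [math.cos(6 * (t - phi_true)) for t in thetas]
phi_hat = grid_orientation(thetas, bold)
```

      Step 3 would then contrast the univariate signal on aligned versus misaligned trials, typically with the orientation estimated on held-out data.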

      Strengths:

      The work demonstrates the existence of a grid-like neural code for a socially-relevant task, providing evidence that such coding schemes may be relevant for a variety of two-dimensional task spaces.

      Weaknesses:

      In the revised manuscript, the authors soften their claims about finding a grid code in the entorhinal cortex and provide additional caveats about limitations in their findings. It seems that the authors and reviewers are in agreement about the following weaknesses, which were part of my original review: Claims about a grid code in the entorhinal cortex are not well-supported by the analyses presented. The whole-brain analysis does not suggest that the entorhinal cortex exhibits hexagonal modulation; the strength of the entorhinal BOLD signal does not track the putative alignment of the grid code there; multivariate analyses do not reveal any evidence of a grid-like representational geometry.

      In the authors' response to reviews, they provide additional clarification about their exploratory analyses examining whether behavior (i.e., reaction times) and individual difference measures (i.e., social anxiety and avoidance) can be predicted by the hexagonal modulation strength in some region X, conditional on region X having a similar estimated grid alignment with some other region Y. My guess is that readers would find it useful if some of this language were included in the main text, especially with regard to an explanation regarding the rationale for these exploratory studies.

      Thank you very much again for your careful re-evaluation and suggestions. We have tried to improve our writing and incorporate the suggestions in the new revision.

      Reviewer #3 (Public Review):

      Liang and colleagues set out to test whether the human brain uses distance and grid-like codes in social knowledge using a design where participants had to navigate in a two-dimensional social space based on competence and warmth during an fMRI scan. They showed that participants were able to navigate the social space and found distance-based codes as well as grid-like codes in various brain regions, and the grid-like code correlated with behavior (reaction times).

      On the whole, the experiment is designed appropriately for testing for distance-based and grid-like codes, and is relatively well powered for this type of study, with a large amount of behavioral training per participant. They revealed that a number of brain regions correlated positively or negatively with distance in the social space, and found grid-like codes in the frontal polar cortex and posterior medial entorhinal cortex, the latter in line with prior findings on grid-like activity in entorhinal cortex. The current paper seems quite similar conceptually and in design to previous work, most notably Park et al., 2021, Nature Neuroscience.

      (1) The authors claim that this study provides evidence that humans use a spatial / grid code for abstract knowledge like social knowledge.

      These data specifically do not add anything new to this argument. As with almost all studies that test for a grid code in a similar "conceptual" space (not only the current study), the problem is that, when the space is not a uniform, square/circular, two-dimensional space, there is no reason the code will be perfectly grid-like, i.e., show six-fold symmetry. In real-world scenarios of social space (as well as navigation and semantic concepts), it must be higher-dimensional, or at least more than two-dimensional. It is unclear if this generalizes to larger spaces where not all of the space is relevant. Modelling work from Tim Behrens' lab (e.g., Whittington et al., 2020) and Bradley Love's lab (e.g., Mok & Love, 2019) has shown/argued this to be the case. In experimental work, such as the mazes from the Mosers' labs (e.g., Derdikman et al., 2009) or the trapezoid environments from the O'Keefe lab (Krupic et al., 2015), there are distortions in mEC cells that would not pass the six-fold symmetry criterion for grid cells.

      The authors briefly discuss the limitations of this at the very end but do not really say how this speaks to the goal of their study and the claim that social space or knowledge is organized as a grid code and if it is in fact used in the brain in their study and beyond. This issue deserves to be discussed in more depth, possibly referring to prior work that addressed this, and raise the issue for future work to address the problem - or if the authors think it is a problem at all.

      Thanks very much again for your careful re-evaluation and comments. We have tried to incorporate some of the suggested papers into our discussion. In summary, we agree that six-fold symmetry is not the only code that can be used to represent "conceptual space". We think that the next step towards a stronger claim would be to find representations of more spontaneous non-spatial maps.

      References

      Bao, X., Gjorgieva, E., Shanahan, L. K., Howard, J. D., Kahnt, T., & Gottfried, J. A. (2019). Grid-like Neural Representations Support Olfactory Navigation of a Two-Dimensional Odor Space. Neuron, 102(5), 1066-1075 e1065. https://doi.org/10.1016/j.neuron.2019.03.034

      Constantinescu, A. O., O'Reilly, J. X., & Behrens, T. E. J. (2016). Organizing conceptual knowledge in humans with a gridlike code. Science, 352(6292), 1464-1468. https://doi.org/10.1126/science.aaf0941

      Park, S. A., Miller, D. S., & Boorman, E. D. (2021). Inferences on a multidimensional social hierarchy use a grid-like code. Nat Neurosci, 24(9), 1292-1301. https://doi.org/10.1038/s41593-021-00916-3

    2. Reviewer #3 (Public Review):

      Liang and colleagues set out to test whether the human brain uses distance and grid-like codes in social knowledge using a design where participants had to navigate in a two-dimensional social space based on competence and warmth during an fMRI scan. They showed that participants were able to navigate the social space and found distance-based codes as well as grid-like codes in various brain regions, and the grid-like code correlated with behavior (reaction times).

      On the whole, the experiment is designed appropriately for testing for distance-based and grid-like codes, and is relatively well powered for this type of study, with a large amount of behavioral training per participant. They revealed that a number of brain regions correlated positively or negatively with distance in the social space, and found grid-like codes in the frontal polar cortex and posterior medial entorhinal cortex, the latter in line with prior findings on grid-like activity in entorhinal cortex. The current paper seems quite similar conceptually and in design to previous work, most notably Park et al., 2021, Nature Neuroscience.

      (1) The authors claim that this study provides evidence that humans use a spatial / grid code for abstract knowledge like social knowledge.

      These data specifically do not add anything new to this argument. As with almost all studies that test for a grid code in a similar "conceptual" space (not only the current study), the problem is that, when the space is not a uniform, square/circular, two-dimensional space, there is no reason the code will be perfectly grid-like, i.e., show six-fold symmetry. In real-world scenarios of social space (as well as navigation and semantic concepts), it must be higher-dimensional, or at least more than two-dimensional. It is unclear if this generalizes to larger spaces where not all of the space is relevant. Modelling work from Tim Behrens' lab (e.g., Whittington et al., 2020) and Bradley Love's lab (e.g., Mok & Love, 2019) has shown/argued this to be the case. In experimental work, such as the mazes from the Mosers' labs (e.g., Derdikman et al., 2009) or the trapezoid environments from the O'Keefe lab (Krupic et al., 2015), there are distortions in mEC cells that would not pass the six-fold symmetry criterion for grid cells.

      After revision, the authors now discuss some of this and the limitations and notes that future work is required to address the problem.

    1. On April 28, TotalEnergies took Greenpeace France to court over the publication of a report questioning the multinational's calculations of its CO2 emissions. The proceedings, which are unusual in resting on provisions of the monetary and financial code, aim in particular to halt any current or future distribution of the report.

      TotalEnergies is attempting, through the courts, to prevent further publication of a Greenpeace report showing that the company's statements about its own CO2 emissions amount to greenwashing. This case is an example of silencing journalists by legal means, against which an initiative is now fighting back.

      https://www.liberation.fr/idees-et-debats/tribunes/bollore-total-face-aux-procedures-baillons-on-ne-se-taira-pas-20230625_UKD2PHN6QFB5ZAK7YFYEQOR6BA/

    1. match
      • types of these fields must also match, e.g., it will be removed if you provide a number to the code field.
      • The types of these properties must also match; if the code you provide is a number type, it will also be filtered out.

      @github.com/mj2068

    1. Research on sexual harassment points to ways that girls especially feel pressure to conform to gendered norms or feel the hostility of gender dynamics particularly keenly

      This sort of reminds me of the dress code issues mentioned in last week's lectures, when we discussed that female students are always blamed for what they are wearing rather than for what the male students are thinking. Society has always had scenarios like these come up, causing innocent individuals to become victims.

    1. Unlike other default asset permissions, SMS code permissions can only be set up for security groups, not individual users. SMS codes are not created by users and are not an asset type you can manage with the security group's Asset Creation controls. You can set up default SMS code permissions for only one security group at a time, and you cannot grant them to individual users; the permissions apply to the security group as a whole.
    1. The SMS Message Designer provides an easy-to-use interface to build your message content and easily add features like field merges, hyperlinks, and keywords. An SMS keyword is a word or phrase that your subscribers can text to an SMS code to interact with your SMS campaign. A code and keyword combination cannot be activated simultaneously on multiple campaigns.
    1. For SMS messages, an invalid keyword response message can be sent to originating mobile numbers if the incoming message contains an invalid keyword or a keyword that is inactive or is not defined for the code.
    2. Each code can only have one invalid keyword response message.
    1. An SMS keyword is a word or phrase that your subscribers can text to an SMS code to interact with your SMS campaign. They are used to define actions for mobile originated (MO) messages.
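      The keyword/response behavior described in these passages can be sketched as a simple dispatcher (illustrative only; this is not the platform's actual implementation, and all names are hypothetical):

```python
def handle_incoming(message, keywords, invalid_response):
    """Dispatch a mobile-originated (MO) message: if its first word is
    an active keyword for this code, run that keyword's action;
    otherwise return the code's single invalid-keyword response."""
    text = message.strip()
    word = text.split()[0].upper() if text else ""
    action = keywords.get(word)
    return action() if action else invalid_response

# Hypothetical keyword table for one SMS code
keywords = {
    "JOIN": lambda: "You are subscribed.",
    "STOP": lambda: "You are unsubscribed.",
}
invalid = "Sorry, we didn't recognize that keyword."
reply_ok = handle_incoming("join now", keywords, invalid)
reply_bad = handle_incoming("hello", keywords, invalid)
```

      Note how the single fallback message mirrors the rule that each code can have only one invalid-keyword response.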
    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Pg. 3 - lines 51-53: "Once established, the canonical RdDM pathway takes over, whereby small RNAs are generated by the plant-specific polymerase IV (Pol IV). In both cases, a second plant-specific polymerase, Pol V, is an essential downstream component." The authors' intro omits an important aspect of Pol V's function in RdDM, which is quite relevant to their study. Pol V transcribes DNA to synthesize noncoding RNA scaffolds, to which AGO4-bound 24 nt siRNAs are thought to base pair, leading to DRM2 recruitment for cytosine methylation near to these nascent Pol V transcripts (Wierzbicki et al 2008 Cell; Wierzbicki et al. 2009 Nat Genet). I recommend that the authors cite these key studies.

      These citations have now been added (see line 57).

      The authors provide compelling evidence that Pol V redistributes to ectopic heterochromatin regions in h1 mutants (e.g., Fig1a browser shot). Presumably, this would allow Pol V to transcribe these regions in h1 mutants, whereas it could not transcribe them in WT plants. Have the authors detected and/or quantified Pol V transcripts in the h1 mutant compared to WT plants at the sites of Pol V redistribution (detected via NRPE1 ChIP)?

      Robust detection of Pol V transcripts can be experimentally challenging, and instead we quantify and detect NRPE1 dependent methylation at these regions (Fig 5), which occurs downstream of Pol V transcript production. However, we note detecting Pol V transcripts as a potential future direction in the discussion (see line 263).

      Pg. 5 - lines 101-102: Figure 1e - "The preferential enrichment of NRPE1 in h1 was more pronounced at TEs that overlapped with heterochromatin associated mark, H3K9me2 (Fig. 1e)." Was a statistical test performed to determine that the overall differences are significant only at TE sites with H3K9me2? Can the sites without H3K9me2 also be differentiated statistically?

      Yes, there is a statistically significant difference between WT and h1 at both the H3K9me2-marked and unmarked TEs (Wilcoxon rank sum tests, see updated Fig 1e). The size of the effect is larger for the H3K9me2-marked TEs (median difference of 0.41 vs 0.16). Median values have now been added to the boxplots so that this is directly viewable to the reader (Fig 1e). This reflects the general increase in NRPE1 occupancy in h1 mutants throughout the genome, with the effect consistently stronger in heterochromatin. In our initial version of the manuscript, we summarised the effect as follows: “We found that h1 antagonizes NRPE1 occupancy throughout the genome, particularly at heterochromatic regions” (previous version line 83, current version line 95). Although important exceptions exist (see Fig 5, NRPE1 and DNA methylation loss in h1), we now make this point even more explicit, and have updated the manuscript at several locations (abstract line 26, results line 245, discussion line 265).
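      For readers who wish to reproduce this style of comparison, a Wilcoxon rank sum test plus a median difference can be computed with scipy. This is a minimal sketch on simulated values; the group sizes, means, and spreads are illustrative assumptions, not the manuscript's data:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
# Hypothetical per-TE NRPE1 enrichment scores (illustrative, not real data)
wt = rng.normal(loc=0.5, scale=0.3, size=500)
h1 = rng.normal(loc=0.9, scale=0.3, size=500)  # shifted upward, mimicking h1

stat, pvalue = ranksums(h1, wt)               # two-sided Wilcoxon rank sum test
median_diff = np.median(h1) - np.median(wt)   # effect size as a median difference
print(f"p = {pvalue:.2e}, median difference = {median_diff:.2f}")
```

      Reporting the median difference alongside the p-value, as done in the revised figure, separates statistical significance from effect size.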

      Pg. 5 - lines 108-110: The authors state, "Importantly, we found no evidence for increased NRPE1 expression at the mRNA or protein level in the h1 mutant (Suppl. Fig. 2)." But the authors did observe reduced NRPE1 transcript levels in h1 mutants in their re-analysis of RNA-seq data, and reduced NRPE1 protein signals via western blot (Suppl. Fig. 2), which should be reported here in the results.

      As described further below, we reanalysed h1 RNA-seq from scratch, and see no evidence for significant differential gene expression of NRPE1. This table and analysis are now provided in Supplementary Table 1.

      More importantly, the above logic about NRPE1 expression in h1 mutants assumes that NRPE1 is the stoichiometrically limiting subunit for Pol V assembly and function in vivo, but this is not known to be the case:

      (1) While NRPE1's expression is somewhat reduced (and not increased) in h1 mutant plants, we cannot be certain that other genes influencing Pol V stability or recruitment are unaffected by h1 mutants. I thus recommend that the authors perform RT-qPCR directly on the WT and h1 mutant materials used in their current study, quantifying NRPE1, NRPE2, NRPE5, DRD1, DMS3, RDM1, SUVH2 and SUVH9 transcript levels.

      (2) Normalizations used to compare samples should be included with RT-qPCR and western assays. An appropriate house-keeping gene like Actin2 or Ubiquitin could be used to normalize the RT-qPCR. Protein sample loading in Suppl. Fig. 2 could be checked by Coomassie staining and/or an antibody detection of a house-keeping protein.

      We have now included a full re-analysis of h1 RNA-seq (data from Choi et al 2020) focusing on transcriptional changes of DNA methylation machinery genes in the h1 mutant. Of the 61 genes analysed, only AGO6 and AGO9 were found to be differentially expressed (2-3 fold upregulation). This analysis is now included as a table (Supplementary Table 1). The western blot has been moved to Supplementary Fig 3 to now illustrate antibody specificity and H1 loss in the h1 mutant lines, so NRPE1 itself serves as a loading control (Supplementary Fig 3a).

      Pg. 6 - lines 129-131: The authors state that "over NRPE1 defined peaks (where NRPE1 occupancy is strongest in WT) we observed no change in H1 occupancy in nrpe1 (Fig 2b). The results indicate that H1 does not invade RdDM regions in the nrpe1 mutant background." This conclusion assumes that the authors' H1 ChIP is successfully detecting H1 occupancy. However, in Fig 2d there does not appear to be H1 enrichment or peaks as visualized across the 10766 ZF-DMS3 off-target loci, or even at the selected 451 ZF-DMS3 off-target hyper DMRs, where the putative signal for H1 enrichment on the metaplot center is extremely weak/non-existent.

      As a reference for H1 enrichment in chromatin (e.g., looking where H2A.W antagonizes H1 occupancy) one can compare analyses in Bourguet et al (2021) Nat Commun, involving co-authors of the current study. Bourguet et al (2021) Fig 5b show a metaplot of H1 levels centered on H2A.W peaks, with the H1 ChIP signal clearly tapering away from the metaplot center point peak. To my eye, the H1 ChIP metaplots for ZF-DMS3 off-target loci in the current manuscript (Fig 2d) resemble "shuffled peaks" controls like those in Fig 5b of Bourguet et al (2021).

      Can one definitively interpret Fig 2d as showing RdDM "not reciprocally affecting H1 localization" without first showing the specificity of the ChIP-seq results in a genotype where H1 occupancy changes? Alternatively, could this dataset be displayed with Deeptools heatmaps to strengthen the evidence that the authors are detecting H1 occupancy/enrichment genome-wide, before diving into WT/nrpe1 mutant analysis at ZF-DMS3 off-target loci?

      This is an excellent suggestion from the reviewer. We have now included several analyses that assess and demonstrate the quality of our H1 ChIP-seq profiles. First, as suggested by the reviewer, we show that our H1 profiles peak over H2A.W-enriched euchromatic TEs as defined by Bourguet et al., mirroring these published findings. Next, we investigated whether our H1 profiles match the pattern over genes recently described by Teano et al., confirming a similar 3' enrichment of H1 over H3K27me3-unmarked genes. Furthermore, we show that the H1 peaks defined here are similarly enriched with GFP-tagged H1.2 from the Teano et al. 2023 study. These analyses validate the quality of our H1 ChIP-seq datasets and bolster the conclusion that NRPE1 redistribution does not affect H1 occupancy. The new analyses are presented in Supplementary Figure 3 (see line 153).

      Pg. 8 - lines 228-230: The authors state that, "As with NRPE1, SUVH1 increased in the h1 background significantly more in heterochromatin, with preferential enrichment over long TEs, cmt2 dependent hypo CHH DMRs, and heterochromatic TEs (Fig. 6b)."

      Contrary to the above statement, the violin plots in Fig. 6c show SUVH1 occupancy increasing at euchromatic TEs in the h1 mutant. What statistical test allowed the authors to determine that the increase in h1 occurs "significantly more in heterochromatin"? The authors should critically interpret Fig. 6c and 6d, which are not currently referenced in the results section. More support is needed for the claim that SUVH1 specifically encroaches into heterochromatin in the h1 mutant, rather than just TEs generally (euchromatic and heterochromatic alike).

      Similar to what we see for NRPE1, statistical tests that we have now performed show that SUVH1 is significantly enriched in h1 in all classes. Importantly however, the effect size is larger in all of the heterochromatin associated classes. We display these statistical tests and the median values on the plots so that effects are immediately viewable (see updated Fig 6).

      In addition, the authors should verify that SUVH1-3xFLAG transgenes (in the WT and h1 mutant backgrounds, respectively) and endogenous Arabidopsis genes encoding the transcriptional activator complex (SUVH1-SUVH3-DNAJ1-DNAJ2) are not overexpressed in the h1 mutant vs. WT. Higher expression of SUVH1 or limiting factors in the larger complex could explain the observation of increased SUVH1 occupancy in the h1 background.

      We do not see a difference in SUVH1/3/DNAJ1/2 complex gene expression in the h1 background (see Supplementary Table 1). However, we cannot rule out that our SUVH1-FLAG line in h1 is more highly expressed than the corresponding SUVH1-FLAG line in WT. We now note this point in line 248.

      Pg. 8 - lines 231-232: Here the authors make a sweeping conclusion about H1 demarcating, "the boundary between euchromatic and heterochromatic methylation pathways, likely through promoting nucleosome compaction and restricting heterochromatin access." I do not see how a H1 boundary between euchromatic and heterochromatic methylation pathways is revealed based on the SUVH1-3xFLAG occupancy data, which shows increased enrichment at every category interrogated in the h1 mutant (Fig 6b,c,d) and all along the baseline too in the h1 mutant browser tracks (Fig 6a). Can the authors provide more examples of this phenomenon (similar to Fig 6a) and better explain why their SUVH1-3xFLAG ChIP supports this demarcation model?

      The general conclusion from SUVH1, that H1's role in preventing heterochromatin access is pathway-agnostic, is now further supported by our findings with H3K27me3 (see Figure 6e and the description from line 250). However, we agree that the demarcation model as initially presented was overly simplistic. This point was also raised by reviewer 2. We have removed the line highlighted by the reviewer in the revised version of the manuscript. In the revised version we clarify that H1 impedes RdDM and associated machinery throughout the genome (consistent with H1's established broad occupancy across the genome), but that this effect is most pronounced in heterochromatin, corresponding to maximal H1 occupancy (abstract line 26, results line 245, discussion line 265).

      Corrections:

      Pg. 8 - lines 226-227: "We therefore wondered whether complex's occupancy might also be affected by H1." The sentence contains a typo, where I assume the authors mean to refer to occupancy by the SUVH1-SUVH3-DNAJ1-DNAJ2 transcriptional activator complex. This needs to be specified more clearly.

      The paragraph has been updated (see from line 237).

      Pg. 13 - lines 393-405: There are minor errors in the capitalization of titles and author initials in the References. I recommend that the authors proofread all the references to eliminate these issues:

      Thank you, these have been corrected.

      Choi J, Lyons DB, Zilberman D. 2021. Histone H1 prevents non-cg methylation-mediated small RNA biogenesis in arabidopsis heterochromatin. Elife 10:1-24. doi:10.7554/eLife.72676 (...)

      Du J, Johnson LM, Groth M, Feng S, Hale CJ, Li S, Vashisht A a., Gallego-Bartolome J, Wohlschlegel J a., Patel DJ, Jacobsen SE. 2014. Mechanism of DNA methylation-directed histone methylation by KRYPTONITE. Mol Cell 55:495-504. doi:10.1016/j.molcel.2014.06.009 (...)

      Du J, Zhong X, Bernatavichute Y V, Stroud H, Feng S, Caro E, Vashisht A a, Terragni J, Chin HG, Tu A, Hetzel J, Wohlschlegel J a, Pradhan S, Patel DJ, Jacobsen SE. 2012. Dual binding of chromomethylase domains to H3K9me2-containing nucleosomes directs DNA methylation in plants. Cell 151:167-80. doi:10.1016/j.cell.2012.07.034

      Reviewer #2 (Recommendations For The Authors):

      As for a normal review, here are our major and minor points.

      Major:

      (1) Lines 38 to 45 of the introduction are important for the subsequent definition of heterochromatic and non-heterochromatic transposons, but the definition is ambiguous. Is heterochromatin defined by surrounding context such as pericentromeric position, or is this an autonomous definition? Can a TE within the chromosomal arms be considered heterochromatic provided that it is long enough and recruits the right machinery? These cases should be more explicitly introduced. Ideally, a supplemental dataset should provide a key to the categories, genomic locations and overlapping TEs as they were used in this analysis, even if some of the categories were taken from another study.

      We have now added all the regions used for analysis in this study to Supplementary Table 3.

      (2) Line 80: This would be the first chance to cite Teano et al. and the "encroachment" of PcG complexes to TEs in H1 mutants.

      Done - “H1 also plays a key role in shaping nuclear architecture and preventing ectopic polycomb-mediated H3K27me3 deposition in telomeres (Teano et al., 2023).” See line 83

      (3) It is "only" a supplemental figure, but S2 should still follow the rules: indicate the number of biological replicates for the RNA-seq data, and perform a statistical test. In the case of WB data, provide a loading control.

      We are now using the western blot to illustrate antibody specificity and H1 loss in the h1 mutant lines, so NRPE1 itself serves as a loading control (Supplementary Fig 3a). For NRPE1 mRNA expression, we have now replaced this with a more comprehensive transcriptome analysis of methylation machinery in h1 (see Supplementary Table 1). 

      (4) Lines 115 to 124 and corresponding data: Here, the goal is to exclude changes to heterochromatin structure other than "increased access" in H1 mutants; however, only one feature, H3K9me2, is tested. Testing this one mark does not necessarily prove that the nature of the chromatin does not change; e.g., H2A.W could be differently redistributed, DDM1 may change, VIM proteins, and others. Either more comprehensive testing for heterochromatin markers should be performed, or the conclusions moderated.

      We have moderated the text accordingly (see line 135).

      (5) Lines 166ff and Figure 1, a bit out of order also Figure 5: The general hypothesis is that NRPE1 redistributes to heterochromatic regions in h1 mutants (as do other chromatin modifiers), but the data seem to only support a higher occurrence at target sites.

      a. The way the NRPE1 data is displayed makes it seem like there is much more NRPE1 in the h1 samples, even at peaks that should not be recruiting more as they do not represent "long" TEs. It would be good to present more gbrowse shots of all peak classes.

      We now clarify that h1 does result in a general increase of NRPE1 throughout the genome, but the effect is strongest at heterochromatin. In our initial version of the manuscript, we summarise the effect as follows “We found that h1 antagonizes NRPE1 occupancy throughout the genome, particularly at heterochromatic regions” (previous version line 83, current version line 95). We have modified the language at several locations throughout the manuscript to make this point more clearly (abstract line 26, results line 245, discussion line 265). We include several browser shots in Supp Fig. 8.

      b. The data are "normalized" how exactly?

      c. One argument of observing "gaining" and "losing" peaks is that there is redistribution of NRPE1 from euchromatic to heterochromatic sites. There should be an analysis and figure to corroborate the point (e.g. by comparing FRIP values). Figure 1b shows lower NRPE1 signals at the TE flanking regions. This could reflect a redistribution or a flawed normalization procedure.

      The data are normalised using a standardised pipeline: log2 fold change over input, after scaling each sample by mapped read depth, using the bamCompare function in deepTools. This is now described in detail in the Materials and Methods (line 365), with full code and pipelines available from GitHub (https://github.com/Zhenhuiz/H1-restricts-euchromatin-associated-methylation-pathways-from-heterochromatic-encroachment).
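      The normalisation logic can be sketched in a few lines. This is a simplified stand-in for deepTools bamCompare; the bin counts, library depths, and pseudocount handling are illustrative assumptions, not the study's pipeline:

```python
import numpy as np

def log2_fc_over_input(chip_counts, input_counts, chip_depth, input_depth,
                       pseudocount=1.0):
    """Scale each sample by its mapped read depth, then take the per-bin
    log2 ratio of ChIP over input (sketch of a bamCompare-style log2 track)."""
    chip = (np.asarray(chip_counts, float) + pseudocount) / chip_depth
    inp = (np.asarray(input_counts, float) + pseudocount) / input_depth
    return np.log2(chip / inp)

# Toy example: 4 genomic bins, ChIP library half the depth of the input
track = log2_fc_over_input([40, 10, 5, 80], [20, 20, 20, 20],
                           chip_depth=1e6, input_depth=2e6)
```

      Depth scaling before taking the ratio is what prevents differences in sequencing depth between ChIP and input libraries from masquerading as enrichment.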

      d. Figure 1d and f show similar profiles comparing "long" and "short" TEs or "CMT2 dependent hypo-CHH" and "DRM2 dependent CHH". How do these categories relate to each other, how many fragments are redundant?

      The short vs long TEs were defined in Liu et al 2018 (doi: 10.1038/s41477-017-0100-y) and the DMRs were defined in Zhang et al. 2018 (DOI: 10.1073/pnas.1716300115). There is likely to be some degree of overlap between the categories, but the numbers are very different (short TEs (n=820), long TEs (n=155), drm2 DMRs (n=5534), cmt2 DMRs (n=21784)), indicating that the different categories are informative. We have now listed all the regions used for analysis in this study in Supplementary Table 3.

      e. The purpose of the data presented in Figure 1b is to compare changes of NRPE1 association at H3K9me2 non-overlapping and overlapping TEs between the wild-type and h1 backgrounds, yet the figure splits the categories into two subpanels and provides neither a fold-change number nor a statistical test of the comparison. As before, the figure does not really support the idea that NRPE1 somehow redistributes from its "normal" sites towards heterochromatin, as both TE classes seem to show higher NRPE1 binding in h1 mutants.

      There is a statistically significant difference between WT and h1 at both the H3K9me2-marked and unmarked TEs; however, the size of the effect is larger for the H3K9me2-marked TEs (median difference of 0.41 vs 0.16). Median values have now been added to the boxplots so that this is directly viewable to the reader (Fig 1e). Although important exceptions exist (see Fig 5 – regions that lose NRPE1 and DNA methylation), this reflects the general increase in NRPE1 occupancy in h1 mutants throughout the genome, with a consistently stronger effect in heterochromatin. As noted above, we have updated the manuscript to make this point more clearly (abstract line 26, results line 245, discussion line 265).

      f. Panel g is the only attempt to corroborate the redistribution towards heterochromatic regions, but at this scale, the apparent reduction of binding in the chromosome arms may be driven by off-peak differences and normalization problems between different ChIP samples with different signal-to-noise-ratio.

      We describe our normalisation and informatic pipeline in more detail in the Materials and Methods (line 365). It is also important to note that the reduction is not only observed at the chromosomal level, but also at specific sites. We called differential peaks between WT and the h1 mutant: the "Regions that gain NRPE1 in h1" peaks are more enriched in heterochromatic regions, while the "Regions that lose NRPE1 in h1" peaks are more enriched outside heterochromatic regions.

      g. Figure 5: how many regions gain vs lose NRPE1 in h1 mutants? If the "redistribution causes loss" scenario applies, the numbers should overall be balanced but that does not seem the case. The loss case appears to be rather exceptional judging from the zigzagging meta-plot. Are these sites related to the sites taken over by PcG-mediated repression in h1 mutants?

      As described in line 222 (previous version of the manuscript line 206), there are 15,075 sites that gain and 1,859 sites that lose NRPE1 in h1. Comparing these sites to H3K27me3 in the Teano et al. study was an excellent suggestion. We compared sites that gain NRPE1 to sites that gain H3K27me3 in h1, finding a statistically significant overlap (2.4-fold enrichment over expected, hypergeometric test p-value 2.1e-71). Reciprocally, sites that lose NRPE1 were significantly enriched for overlap with H3K27me3 loss regions (1.6-fold over expected, hypergeometric test p-value 1.4e-4). This indicates that RdDM and H3K27me3 patterning are similarly modulated by H1. To directly test this, we reanalysed the H3K27me3 ChIP-seq data from Teano et al., finding coincident gain and loss of H3K27me3 at sites that gain and lose NRPE1 in h1. These results are described from line 250 and in Fig 6e, which supports a general role for H1 in preventing heterochromatin encroachment.
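      This style of overlap statistic can be reproduced with a standard hypergeometric calculation. The sketch below uses made-up region counts; the totals and overlap are assumptions for illustration, not the numbers from this study:

```python
from scipy.stats import hypergeom

def overlap_enrichment(total, set_a, set_b, overlap):
    """Fold enrichment over the expected overlap, plus the hypergeometric
    p-value P(X >= overlap) for drawing set_b regions out of `total`
    candidates, of which set_a are 'successes'."""
    expected = set_a * set_b / total
    fold = overlap / expected
    pvalue = hypergeom.sf(overlap - 1, total, set_a, set_b)
    return fold, pvalue

# Toy numbers: 30,000 candidate regions, two sets of 5,000 and 3,000,
# observed overlap of 1,200 (expected 500)
fold, p = overlap_enrichment(total=30000, set_a=5000, set_b=3000, overlap=1200)
```

      Note that the choice of the background universe (`total`) strongly affects both the fold enrichment and the p-value, so it should match the set of regions that could in principle have been called in either comparison.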

      (6) Lines 166ff and Figure 3: The data walk towards the scenario of pathway redistribution but actually find that RdDM plays a minor role overall as a substantial increase in heterochromatin regions occurs in all contexts and is largely independent of RdDM.

      a. How exactly are DNA-methylation data converted across regions to reach a fraction score from 0 to 1? There is no explanation in the legend or the methods that would allow readers to recapitulate this.

      We now explain our methods in full in the Materials and Methods, and all the code for generating these has now been deposited on GitHub (https://github.com/Zhenhuiz/H1-restricts-euchromatin-associated-methylation-pathways-from-heterochromatic-encroachment). Briefly, BSMAP is used to calculate the number of reads that are methylated vs unmethylated on a per-cytosine basis across the genome. Next, the DNA methylation fraction in each region is calculated by summing the methylation fractions of all cytosines in a given window (i.e. mC/(unmC+mC) per cytosine) and dividing by the total number of cytosines in that window, yielding a fraction ranging from 0 to 1: “0” indicates the region is not methylated, and “1” indicates the region is fully methylated (every cytosine is 100% methylated).
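      The calculation described above reduces to a few lines. This is a minimal sketch; the helper name and the toy per-cytosine values are ours, not from the pipeline:

```python
def window_methylation_fraction(per_cytosine_fractions):
    """Sum of per-cytosine methylation fractions (mC/(mC+unmC)) in a window,
    divided by the number of cytosines in that window. Returns 0..1:
    0 = unmethylated window, 1 = every cytosine 100% methylated."""
    if not per_cytosine_fractions:
        return 0.0
    return sum(per_cytosine_fractions) / len(per_cytosine_fractions)

# Toy window with four cytosines
frac = window_methylation_fraction([1.0, 0.5, 0.0, 0.5])  # -> 0.5
```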

      b. Kernel plots? This is jargon for experts and should be better described. In addition, nothing is really concluded from these plots in the text, although they may be quite informative.

      Kernel density plots show the proportion of TEs that gain or lose methylation in a particular mutant, rather than the overall average as depicted in the methylation metaplots above. We now describe the kernel density plots in more detail in the Figure 3 legend. 
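      As an illustration of what such a plot summarises, a kernel density estimate over per-TE methylation changes can be built with scipy. All values below are simulated for illustration, not the study's data:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# Simulated per-TE CHH methylation change (mutant minus WT): a subset of TEs
# loses methylation while most remain largely unchanged
delta = np.concatenate([rng.normal(-0.15, 0.05, 300),
                        rng.normal(0.0, 0.02, 700)])

kde = gaussian_kde(delta)        # smooth density over per-TE changes
xs = np.linspace(-0.4, 0.2, 200)
density = kde(xs)                # high values = large proportion of TEs there
```

      Unlike a metaplot average, the density curve reveals whether a shift is driven by a small subpopulation of strongly affected TEs or a uniform change across all TEs.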

      (7) Figure 4: This could be a very interesting analysis if the reader could actually understand it.

      a. The legend is minimal. What is the meaning of hypo and hyper regions indicated to the right of Figure 4c?

      b. The color scale represents observed/expected values. What exactly does this mean? Mutant vs WT?

      c. Some comparisons in 4a are cryptic, e.g. h1 nrpe1 nrpe1 vs CHH?

      d. Figure 4d focuses on a correlation square of relevance, but why? Interestingly the square does not correspond to any "hypo" or "hyper" label?

      Thank you, we have revised Figure 4 and legend based on these suggestions to clarify all of the above.

      (8) Lines 226 and Figure 6B. De novo (or increased) targeting of SUVH1 to heterochromatic sites in h1 mutants, similar to NRPE1, is used to support the argument that more access allows other chromatin modifiers to encroach. SUVH1 strongly depends on RdDM for its in vivo binding and may be the least conclusive factor to argue for a "general" encroachment mechanism.

      We appreciate the reviewer's point here. A factor entirely independent of RdDM following the same pattern would be stronger evidence in favour of general encroachment. Excitingly, this is exactly what we provide evidence for when investigating the interrelationship with H3K27me3, and we appreciate the reviewer's suggestion to check this! These data are now described in Figure 6e and from line 250.

      Minor:

      (1) Line 23: "Loss of H1 resulted in heterochromatic TE enrichment by NRPE1." This does not seem right. NRPE1 enrichment at TEs?

      Modified (line 26), thank you.

      (2) Lines 73-74: The idea that DDM1 displaces H1 in heterochromatic TEs is somewhat counterintuitive to the model that heterochromatic TEs are unavailable for RdDM because of the presence of H1. Is this displacement non-permanent and directly linked to interaction with CMT2/3 or MET1?

      This is a very good question, and we agree with the reviewer that the effect of DDM1 may be only transient or insufficient to allow for full RdDM assembly, or indeed there may be a direct interaction between DDM1 and CMTs/MET1. During preparation of these revisions, a structure of DDM1 bound to the Arabidopsis nucleosome was published, which provides some insight by showing that DDM1 promotes DNA sliding. This is at least consistent with the idea of DDM1 causing transient, non-permanent displacement of H1 that would be insufficient for RdDM establishment. We incorporate discussion of these ideas at line 80.

      (3) Line 85: A bit more background on the Reader activator complex should be given. In fact, the reader may not really care that it was more recently discovered (not really recent btw) but what does it actually do?

      We have quite extensively reconfigured this paragraph to take into account our new finding with H3K27me3, such that there is less emphasis on the reader activator complex. The sentence now reads as follows:

      “We found that h1 antagonizes NRPE1 occupancy throughout the genome, particularly at heterochromatic regions. This effect was not limited to RdDM, similarly impacting both the methylation reader complex component SUVH1 (Harris et al., 2018) and polycomb-mediated H3K27me3 (Teano et al., 2023).” (line 95)

      Also, when describing the experiment in the results section (line 241), we now provide more background on SUVH1's function.

      (4) Lines 80-81: Since it is already shown that RdDM-associated small RNAs are more enriched in h1 at heterochromatin, help us understand what precisely is the added value of studying the enrichment of NRPE1 at these sites.

      Good point. We have the following line: ‘...small RNAs are not a direct readout of functional RdDM activity and Pol IV dependent small RNAs are abundant in regions of the genome that do not require RdDM for methylation maintenance and that do not contain Pol V (Stroud et al., 2014).’ (line 90)

      (5) Line 99: This seems to be the only time where the connection between long TEs and heterochromatic regions is mentioned but no source is cited.

      We have added the following appropriate citations: (Bourguet et al., 2021; Zemach et al., 2013). (line 110).

      (6) Line 100: DMRs is used for the first time here without explanation and full text. The abbreviation is introduced later in the text (Line 187).

      Thank you, we now describe DMRs upon first use, line 112.

      (7) Figure 2: Panels 2c and 2d should show metaplots for WT and transgenes in one panel. There is something seriously wrong with the normalization in d, or the scale for the left and right panels is not the same. Neither the legend nor the methods describes how normalization was performed.

      Thank you for pointing this out; the figure has been corrected. We have updated the Materials and Methods (line 365) and have added code and pipelines to GitHub to explain the normalisation procedure in more detail (https://github.com/Zhenhuiz/H1-restricts-euchromatin-associated-methylation-pathways-from-heterochromatic-encroachment).

    1. Author response:

      The following is the authors’ response to the original reviews.

      We are very grateful to the reviewers for their constructive comments. Here is a summary of the main changes we made from the previous manuscript version, based on the reviewers’ comments:

      (1) Introduction of a new model, based on a Markov chain, capturing within-trial evolution in search strategy.

      (2) Addition of a new figure investigating inter-animal variations in search strategy.

      (3) Measurement of model fit consistency across 10 simulation repetitions, to guard against model overfitting.

      (4) Several clarifications have been made in the main text (Results, Discussion, Methods) and figure legends.

      (5) We now provide processed data and code for analyses and models in a GitHub repository.

      (6) Simplification of the previous modeling. We realized that the first two models in the previous manuscript version were simply special cases of the third model. Therefore, we retained only the third model, which has been renamed the ‘mixture model’.

      (7) Modification (or creation) of Figures 4-6 and Supplementary Figures 7-8 to reflect the aforementioned changes.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors design an automated 24-well Barnes maze with 2 orienting cues inside the maze, then model what strategies the mice use to reach the goal location across multiple days of learning. They consider a set of models and conclude that one of these models, a combined strategy model, best explains the experimental data.

      This study is written, and its results presented, concisely. The best-fit model is reasonably simple and fits the experimental data well (at least the summary measures of the data that were presented).

      Major points:

      (1) One combined strategy (once the goal location is learned) that might seem to be reasonable would be that the animal knows roughly where the goal is, but not exactly where, so it first uses a spatial strategy just to get to the first vestibule, then switches to a serial strategy until it reaches the correct vestibule. How well would such a strategy explain the data for the later sessions? The best combined model presented in the manuscript is one in which the animal starts with a roughly 50-50 chance of a serial (or spatial) strategy from the start vestibule (i.e. by the last session before the reversal the serial and spatial strategies are at ~50-50 in Fig. 5d). Is it the case that even after 15 days of training the animal starts with a serial strategy from its starting point approximately half of the time? The broader point is whether additional examination of the choices made by the animal, combined with consideration of a larger range of possible models, would be able to provide additional insight into the learning and strategies the animal uses.

      Our analysis focused on the evolution of navigation strategies across days and trials. The reviewer raises the interesting possibility that navigation strategy might evolve in a specific manner within each trial, especially on the later days once the environment is learned. To address this possibility, we first examined how some of the statistical distributions, previously analyzed across days, evolved within trials. Consistent with the reviewer’s intuition, the statistical distributions changed within trials, suggesting a specific strategy evolution within trials. Second, we developed a new model, where strategies are represented as nodes of a Markov chain. This model allows potential strategy changes after each vestibule visit, according to a specific set of transition probabilities. Vestibules are chosen based on the same stochastic processes as in the previous model. This new model could be fitted to the experimental distributions and captured both the within-trial evolution and the global distributions. Interestingly, the trials were mostly initiated in the random strategy (~67% chance) and to a lesser extent in the spatial strategy (~25% chance), but rarely in the serial strategy (~8% chance). This new model is presented in Figure 6.
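      The structure of such a Markov-chain strategy model can be sketched as follows. Only the approximate initial probabilities (~67% random, ~25% spatial, ~8% serial) come from the text above; the transition matrix values are illustrative assumptions, not fitted parameters:

```python
import random

STRATEGIES = ["random", "spatial", "serial"]
# Approximate initial-strategy probabilities reported above (~67/25/8%)
INIT_P = [0.67, 0.25, 0.08]
# Hypothetical transition probabilities applied after each vestibule visit
# (rows sum to 1; these values are illustrative, not fitted)
TRANS_P = {
    "random":  [0.70, 0.20, 0.10],
    "spatial": [0.10, 0.80, 0.10],
    "serial":  [0.05, 0.15, 0.80],
}

def simulate_trial(n_visits, rng):
    """Draw an initial strategy, then let it evolve after each vestibule visit."""
    strategy = rng.choices(STRATEGIES, weights=INIT_P)[0]
    path = [strategy]
    for _ in range(n_visits - 1):
        strategy = rng.choices(STRATEGIES, weights=TRANS_P[strategy])[0]
        path.append(strategy)
    return path

rng = random.Random(0)
trials = [simulate_trial(10, rng) for _ in range(5000)]
start_random = sum(t[0] == "random" for t in trials) / len(trials)
```

      In the fitted model, each strategy node additionally dictates how the next vestibule is chosen; the sketch only captures the strategy-switching layer.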

      (2) To clarify, in the Fig. 4 simulations, is the "last" vestibule visit of each trial, which is by definition 0, not counted in the plots of Fig. 4b? Otherwise, I would expect that vestibule 0 is overrepresented because a trial always ends with Vi = 0.

      The last vestibule visit (vestibule 0 by definition) is counted in the plots of Fig. 4b. We initially shared the same concern as the reviewer. However, upon further consideration, we arrived at the following explanation: a factor that might lead to an overrepresentation of vestibule 0 is the fact that, unlike other vestibules, it has to be contained in each trial, as trials terminated upon the selection of vestibule 0. Conversely, a factor that might contribute to an underrepresentation of vestibule 0 is that, unlike other vestibules, it cannot be counted more than once per trial. These two factors counterbalance each other, resulting in no discernible overrepresentation or underrepresentation of vestibule 0 in the random process.
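      The counterbalancing argument can be checked with a quick simulation of the random process, a sketch under the stated rules: each visit picks one of 24 vestibules uniformly (matching the 24-well maze), and a trial ends as soon as vestibule 0 is chosen:

```python
import random
from collections import Counter

def simulate_counts(n_trials, n_vestibules=24, seed=0):
    """Visit counts when each visit picks a vestibule uniformly at random and
    a trial terminates as soon as vestibule 0 (the goal) is selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        while True:
            v = rng.randrange(n_vestibules)
            counts[v] += 1
            if v == 0:  # goal vestibule: always present, never counted twice
                break
    return counts

counts = simulate_counts(20000)
mean_nonzero = sum(counts[v] for v in range(1, 24)) / 23
# counts[0] equals n_trials exactly, yet it matches the per-vestibule mean:
# the "must appear once" and "at most once" factors cancel out.
```

      Analytically, the number of non-zero picks per trial is geometric with mean 23, spread evenly over 23 vestibules, so every vestibule (including 0) is expected once per trial.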

      Reviewer #2 (Public Review):

      This paper uses a novel maze design to explore mouse navigation behaviour in an automated analogue of the Barnes maze. Overall, I find the work to be solid, with the cleverly designed maze/protocol being its major strength; however, there are some issues that I believe should be addressed and clarified.

      (1) Whilst I'm generally a fan of the experimental protocol, the design means that internal odor cues on the maze change from trial to trial, along with cues external to the maze such as the sounds and visual features of the recording room, ultimately making it hard for the mice to use a completely allocentric spatial 'place' strategy to navigate. I do not think there is a way to control for these conflicts between reference frames in the statistical modelling, but I do think these issues should be addressed in the discussion.

      It should be pointed out that all cues on the maze (visual, tactile, odorant) remained unchanged across trials, since the maze was rotated together with the goal and guiding cues. Furthermore, the maze was equipped with an opaque cover to prevent mice from seeing the surrounding room (the imaging of mouse trajectories was achieved using infrared light and an infrared camera). It is however possible that some other cues, such as room sounds and odors, could be perceived and somewhat interfere with the sensory cues provided inside the maze. We have now mentioned this possibility in the discussion.

      (2) Somewhat related - I could not find how the internal maze cues are moved for each trial to demarcate the new goal (i.e. the luminous cues) ? This should be clarified in the methods.

      The luminous cues were fixed to the floor of the arena. Consequently, they rotated along with the arena as a single unit, as depicted in Figure 1. We have added clarifications to the Figure 1 legend and the methods.

      (3) It appears some data is being withheld from Figures 2&3? E.g. Days 3/4 from Fig 2b-f and Days 1-5 on for Fig 3. Similarly, Trials 2-7 are excluded from Fig 3. If this is the case, why? It should be clarified in the main text and Figure captions, preferably with equivalent plots presenting all the data in the supplement.

      The statistical distributions for all single days/trials are shown in the color-coded panels of Figures 2 and 3. In the line plots of Figures 2 and 3, we show only an overlay of 2-3 lines for the sake of clarity. The days/trials represented were chosen to capture the dynamic range of variability within the distributions. We have added this information to the figure legends.

      (4) I strongly believe the data and code should be made freely available rather than "upon reasonable request".

      Matrices of processed data and the code for simulations and analyses are now available at https://github.com/sebiroyerlab/Vestibule_sequences.

      Reviewer #3 (Public Review):

      Royer et al. present a fully automated variant of the Barnes maze to reduce experimenter interference and ensure consistency across trials and subjects. They train mice in this maze over several days and analyze the progression of mouse search strategies during the course of the training. By fitting models involving stochastic processes, they demonstrate that a model combining the random, spatial, and serial processes can best account for the observed changes in mice's search patterns. Their findings suggest that across training days the spatial strategy (using local landmarks) was progressively employed, mostly at the expense of the random strategy, while the serial strategy (consecutive checks of nearby vestibules) is reinforced from the early stages of training. Finally, they discuss potential mechanistic underpinnings within brain systems that could explain such behavioral adaptation and flexibility.

      Strength:

      The development of an automated Barnes maze allows for more naturalistic and uninterrupted behavior, facilitating the study of spatial learning and memory, as well as the analysis of the brain's neural networks during behavior when combined with neurophysiological techniques. The system's design has been thoughtfully considered, encompassing numerous intricate details. These include flexible options for selecting start, goal, and proximal landmark positions, the inclusion of a rotating platform to prevent the accumulation of olfactory cues, and careful attention to automation, taking into account specific considerations such as rotation of the maze without causing wire shorting or breakage. When combined with neurophysiological manipulations or recordings, the system provides a powerful tool for studying the spatial navigation system.

      The behavioral experiment protocols, along with the analysis of animal behavior, are conducted with care, and the development of behavioral modeling to capture the animal's search strategy is thoughtfully executed. It is intriguing to observe how the integration of these innovative stochastic models can elucidate the evolution of mice's search strategy within a variant of the Barnes maze.

      Weakness:

      (1) The development of the well-thought-out automated Barnes maze may attract the interest of researchers exploring spatial learning and memory. However, this aspect of the paper lacks significance due to insufficient coverage of the materials and methods required for readers to replicate the behavioral methodology for their own research inquiries.

      Moreover, as discussed by the authors, the methodology favors specialists who utilize wired recordings or manipulations (e.g. optogenetics) in awake, behaving rodents. However, it remains unclear how the current maze design, which involves trapping mice in start and goal positions and incorporating angled vestibules resulting in the addition of numerous corners, can be effectively adapted for animals with wired implants.

      The reviewer is correct in pointing out that the current maze design is not suitable for performing experiments with wired implants, particularly due to the maze’s enclosed structure and the access to the start/goal boxes through side holes. Instead, pharmacogenetic approaches and wireless methods for optogenetics and electrophysiology would need to be used. We have now mentioned this limitation in the discussion.

      (2) Novelty: In its current format, the main axis of the paper falls on the analysis of animal behavior and the development of behavioral modeling. In this respect, while it is interesting to see how thoughtfully designed models can explain the evolution of mice search strategy in a maze, the conclusions offer limited novel findings that align with the existing body of research and prior predictions.

      We agree with the reviewer that our study is weakly connected to previous research on the hippocampus and spatial navigation, as it consists mainly of animal behavior analysis and modeling and addresses a relatively unexplored topic. We hope that the combination of our behavioral approach with optogenetics and electrophysiology will in the future yield new insights that are in line with the existing body of research.

      (3) Scalability and accessibility: While the approach may be intriguing to experts who have an interest in or are familiar with the Barnes maze, its presentation seems to primarily target this specific audience. Therefore, there is a lack of clarity and discussion regarding the scalability of behavioral modeling to experiments involving other search strategies (such as sequence or episodic learning), other animal models, or the potential for translational applications. The scalability of the method would greatly benefit a broader scientific community. In line with this view, the paper's conclusions heavily rely on the development of new models using custom-made codes. Therefore, it would be advantageous to make these codes readily available, and if possible, provide access to the processed data as well. This could enhance comprehension and enable a larger audience to benefit from the methodology.

      The current approach might indeed extend to other species in equivalent environments and might also constitute a general proof of principle regarding the characterization of animal behaviors by the mixing of stochastic processes. We have now mentioned these points in the discussion.

      As suggested by the reviewer, we have now provided model/simulation code and processed data to replicate the figures, at https://github.com/sebiroyerlab/Vestibule_sequences.

      (4) Cross-validation of models: The authors have not implemented any measures to mitigate the risk of overfitting in their modeling. It would have been beneficial to include at least some form of cross-validation with stochastic models to address this concern. Additionally, the paper lacks the presence of analytics or measures that assess and compare the performance of the models.

      To avoid the risk of model overfitting, the most appropriate solution appeared to be repeating the simulations several times and examining the consistency of the obtained parameters across repetitions. For the mixture model, we now show in Supplementary figure 7 the probabilities obtained from 10 repetitions of the simulation. Similarly, for the Markov chain model, the probabilities obtained from 10 repetitions of the simulation are shown in Figure 6.

      Regarding model comparison, we have simplified our mixture model into a single model, as we realized that the two other models in the previous manuscript version were simply special cases of the third model. Nevertheless, comparison was still needed for the estimation of the best value of N (the number of consecutive segments that a strategy lasts) in the mixture model. We now show the comparison of mean square errors obtained for different values of N, using t-tests across 10 repetitions of the simulations (Figure 5c).

      (5) Quantification of inter-animal variations in strategy development: It is important to investigate, and address the argument concerning the possibility that not all animals recruit and develop the three processes (random, spatial, and serial) in a similar manner over days of training. It would be valuable to quantify the transition in strategy across days for each individual mouse and analyze how the population average, reflecting data from individual mice, corresponds to these findings. Currently, there is a lack of such quantification and analysis in the paper.

      We have added a figure (Supplementary figure 8) showing the mixture model matching analyses for individual animals. A lot of variability is indeed observed across animals, with some animals displaying strong preferences for certain strategies compared to others. The average across the mouse population showed a trend similar to the result obtained with the pooled data.

      Recommendations for the authors:

      Summary of Reviewer Comments:

      (1) In its present form, the manuscript lacks sufficient coverage of the materials and methods necessary for readers to replicate the behavioral methodology in their own research inquiries. For instance, it would be beneficial to clarify how the cues are rotated relative to the goal.

      (2) The models may be over-fitted, leading to spurious conclusions, and cross-validation is necessary to rule out this possibility.

      (3) The specific choice of the three strategies used to fit behavior in this model should be better justified, as other strategies may account for the observed behavior.

      (4) The study would benefit from an analysis of behavior on an animal-by-animal basis, potentially revealing individual differences in strategies.

      (5) Spatial behavior is not necessarily fully allocentric in this task, as only the two cues in the arena can be used for spatial orientation, unlike odor cues on the floor and sound cues in the room. This should be discussed.

      (6) Making the data and code fully open source would greatly strengthen the impact of this study.

      In addition, each reviewer has raised both major and minor concerns which should be addressed if possible.

      Reviewer #1 (Recommendations For The Authors):

      Minor points:

      (1) Change "tainted" to "tinted" in Fig. 1a

      (2) Should note explicitly in Fig. 2d that the goal is at vestibule 0, and also in the legend

      (3) Fig. 3 legend should say "c-e)", not "c-f)"

      (4) Supplementary Fig. 8 legend repeats "d)" twice

      Reviewer #2 (Recommendations For The Authors):

      Packard & McGaugh 1996 is cited twice as refs 5 and 14

      Reviewer #3 (Recommendations For The Authors):

      - Figure 3: Please correct the labels referenced as "c-f)" in the figure's legend.

      - Rounding numbers issue on page 4: 82.62% + 17.37% equals 99.99%, not 100%.

      We fixed all minor points. We are very thankful to the reviewers for their constructive comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We are thankful to the reviewers and the editor for their detailed feedback, insightful suggestions, and thoughtful assessment of our work. Our point-by-point responses to the comments and suggestions are below.

      The revised manuscript has taken into account all the comments of the three reviewers. Modifications include corrections to errors in spelling and unit notation, additional quantification, improvements to the clarity of the language in some places, as well as additional detail in the descriptions of the methods, and revisions to the figures and figure legends.

      We have also undertaken additional analyses and added materials in response to reviewer suggestions. In brief:

      In response to a suggestion from Reviewer #1, we added Figure 6-1 to show examples of the calcium traces of individual fish and individual ROIs from the condensed data in Figure 6. We revised Figure 7 as follows:

      • We added an analysis of the duration of the response to shock to address comments from Reviewers #2 and #3.

      • In response to Reviewer #3, we added histograms showing the distribution of the amplitudes of the calcium signals in the gsc2 and rln3a neurons to show, without relying on the detection of peaks in the calcium trace, that the rln3a neurons have more oscillations in activity.

      We added Figure 8-2 in response to the suggestion from Reviewer #3 to analyze turning behavior in larvae with ablated rln3a neurons.

      To address Reviewer #2’s suggestion to show how the ablated transgenic animals compare to the non-ablated transgenic animals of the same genotype, we have added this analysis as Figure 8-3.

      A detailed point-by-point is as follows:

      The reviewers agree that the study of Spikol et al is important, with novel findings and exciting genetic tools for targeting cell types in the nucleus incertus. The conclusions are overall solid. Results could nonetheless be strengthened by performing few additional optogenetic experiments and by consolidating the analysis of calcium imaging and behavioral recordings as summarized below.

      (1) Light pulses used for optogenetic-mediated connectivity mapping were very long (5 s), which could lead to non-specific activation of numerous populations of neurons beyond the targeted ones. To confirm their results, the authors should repeat their experiments with brief 5-50 ms (500 ms maximum) light pulses for stimulation.

      As the activity of the gsc2 neurons is already increased by 1.8 fold (± 0.28) within the first frame that the laser is activated (duration ~200 ms), it is unlikely that the observed response is due to non-specific activation induced by the long light pulse.

      (2) In terms of analysis, the authors should improve :

      a) The detection of calcium events in the "calcium trace" showing the change in fluorescence over time by detecting the sharp increase in the signal when intracellular calcium rises;

      We have added an additional analysis to Figure 7 that does not rely on detection of calcium peaks. See response to Reviewer #3.

      b) The detection of bouts in the behavioral recordings by measuring when the tail beat starts and ends, thereby distinguishing the active swimming during bouts from the immobility observed between bouts.

      Our recordings capture the entire arena that the larva can explore in the experiment and therefore lack the spatial resolution to capture and analyze the tail beat. Rather, we measured the frequency and length of phases of movement in which the larva shows no more than 1 second of immobility. To avoid confusion with studies that measure bouts from the onset of tail movement, we removed this term from the manuscript and refer to activity as phases of movement.
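      The segmentation rule described above can be sketched as follows. The frame rate and the boolean movement trace are illustrative; this is not the study's analysis code.

```python
# Sketch of the "phases of movement" definition above: given a per-frame
# moving/immobile boolean trace, merge movement separated by no more than
# 1 s of immobility into a single phase.
def movement_phases(moving, fps, max_gap_s=1.0):
    """Return (start, end) frame indices of movement phases (end exclusive)."""
    max_gap = int(max_gap_s * fps)
    phases = []
    start = None   # start frame of the phase currently being built
    end = 0        # one past the last moving frame seen in that phase
    gap = 0        # length of the current run of immobile frames
    for i, frame_is_moving in enumerate(moving):
        if frame_is_moving:
            if start is None:
                start = i
            gap = 0
            end = i + 1
        elif start is not None:
            gap += 1
            if gap > max_gap:  # immobility exceeds 1 s: close the phase
                phases.append((start, end))
                start = None
    if start is not None:
        phases.append((start, end))
    return phases
```

      From the resulting phases, the frequency and length of movement, and the fraction of time spent moving, follow directly.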

      (3) The reviewers also ask for more precisions in the characterization of the newly-generated knock-in lines and the corresponding anatomy as explained in their detailed reports.

      Please refer to the point-by-point request for additional details that have now been added to the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      The conclusions of this paper are mostly well supported by data, but some technical aspects, especially about calcium imaging and data analysis, need to be clarified.

      (1) Both the endogenous gsc2 mRNA expression and Tg(gsc2:QF2) transgenic expression are observed in a neuronal population in the NI, but also in a more sparsely distributed population of neurons located more anteriorly (for example, Fig. 2B, Fig. 5A). The latter population is not mentioned in the text. It would be necessary to clarify whether or not this anterior population is also considered as the NI, and whether this population was included for the analysis of the projection patterns and ablation experiments.

      The sparsely distributed neurons had been mentioned in the Results, line 134, but we have now added more detail. In line 328, we have clarified that: “As the sparsely distributed anterior group of gsc2 neurons (Fig. 2B, C) are anatomically distinct from the main cluster and not within the nucleus incertus proper, they were excluded from subsequent analyses.”

      (2) Both Tg(gsc2:QF2) and Tg(rln3a:QF2) transgenic lines have the QF genes inserted in the coding region of the targeted genes. This probably leads to knock out of the gene in the targeted allele. Can the authors mention whether or not the endogenous expression of gsc2 and rln3a was affected in the transgenic larvae? Is it possible that the results they obtained using these transgenic lines are affected by the (heterozygous or homozygous) mutation of the targeted genes?

      Figure 8-1 includes in situ hybridization for gsc2 and rln3a in heterozygous Tg(gsc2:QF2)c721; Tg(QUAS:GFP)c578 and Tg(rln3a:QF2; he1.1:YFP)c836; Tg(QUAS:GFP)c578 transgenic larvae.

      The expression of gsc2 is unaffected in Tg(gsc2:QF2)c721; Tg(QUAS:GFP)c578 heterozygotes

      (Fig. 8-1A), whereas the expression of rln3a is reduced in Tg(rln3a:QF2; he1.1:YFP)c836; Tg(QUAS:GFP)c578 heterozygous larvae (Fig. 8-1D), as mentioned in the legend for Figure 8-1. We confirmed these findings by comparing endogenous gene expression between transgenic and non-transgenic siblings that were processed for RNA in situ hybridization in the same tube.

      The behavioral results we obtained are not due to rln3a heterozygosity because comparisons were made with sibling larvae that are also heterozygous for Tg(rln3a:QF2; he1.1:YFP)c836; Tg(QUAS:GFP)c578, as stated in the Figure 8 legend.

      (3) Optogenetic activation and simultaneous calcium imaging is elegantly designed using the combination of the orthogonal Gal4/UAS and QF2/QUAS systems (Fig. 6). However, I have some concerns about the analysis of calcium responses from a technical point of view. Their definition of ΔF/F in this manuscript is described as (F-Fmin)/(Fmax-Fmin) (see line 1406). This is confusing because it is different from the conventional definition of ΔF/F, which is (F-F0)/F0, where F0 is a baseline GCaMP fluorescence. Their way of calculating the ΔF/F is inappropriate for measuring the change in fluorescence relative to the baseline signal because it rather normalizes the amplitude of the responses across different ROIs. The same argument applies to the analyses done for Fig. 7.

      We have taken a careful look at our analyses and replotted the data using (F-F0)/F0. However, this only changes Y-axis values and does not change the shape of the calcium trace or the change in signal upon stimulation. Both metrics ((F-F0)/F0 and (F-Fmin)/(Fmax-Fmin)) adjust the fluorescence values of each ROI to its own baseline.
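      The point can be illustrated on a synthetic trace: both normalizations are affine rescalings of the raw fluorescence F, so they differ only in Y-axis values, not in trace shape. The trace below is purely synthetic, for illustration only.

```python
import numpy as np

t = np.linspace(0, 10, 500)
F = 100 + 20 * np.exp(-((t - 5) ** 2))  # baseline ~100 with one transient

F0 = F[:50].mean()                                # baseline from pre-stimulus frames
dff_conventional = (F - F0) / F0                  # conventional dF/F
dff_minmax = (F - F.min()) / (F.max() - F.min())  # min-max normalization

# Both are affine transforms of F, so the traces are perfectly correlated:
# identical shape, different Y-axis scale.
r = np.corrcoef(dff_conventional, dff_minmax)[0, 1]
```

      The substantive difference is that min-max normalization equalizes response amplitudes across ROIs, which is the reviewer's point, while leaving within-ROI trace shape unchanged.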

      (4) The %ΔF/F plots shown in Fig.6 are highly condensed showing the average of different ROIs (cells) within one fish and then the average of multiple fish. It would be helpful to see example calcium traces of individual ROIs and individual fish to know the variability across ROIs and fish. Also, It would be helpful to know how much laser power (561 nm laser) was used to photostimulate ReaChR.

      Laser power (5%) was added to the section titled Calcium Signaling in Methods.

      In Figure 6, shading in the %ΔF/F plots (D, D’, E, E’, F, F’, G, G’, H, H’) represents the variability across ROIs, and the dot plots (D’’, E’’, F’’, G’’, H’’) show the variability across fish (where each data point represents an individual fish). We have now also added Figure 6-1 with examples of calcium traces from individual fish and individual ROIs.

      (5) Some calcium traces presented in Fig. 6 (Fig. 6D, D', F, H, H') show discontinuous fluctuations at the onset and offset of the photostimulation period. Is this caused by some artifacts introduced by switching the settings for the photostimulation? The authors should mention if there are some alternative explanations for this discontinuity.

      As noted by the reviewer, this artifact does result from switching the settings for photostimulation, which we mention in the legend for Figure 6.

      (6) In the introduction, they mention that the griseum centrale is a presumed analogue of the NI (lines 74-75). It would be helpful for the readers to better understand the brain anatomy if the authors could discuss whether or not their findings on the gsc2 and rln3a NI neurons support this idea.

      Our findings on the gsc2 and rln3a neurons support the idea that the griseum centrale of fish is the analogue of the mammalian NI. We have now edited the text in the third paragraph of the discussion, line 1271, to make this point more clearly: “By labeling with QUAS-driven fluorescent reporters, we determined that the anatomical location, neurotransmitter phenotype, and hodological properties of gsc2 and rln3a neurons are consistent with NI identity, supporting the assertion that the griseum centrale of fish is analogous to the mammalian NI. Both groups of neurons are GABAergic, reside on the floor of the fourth ventricle and project to the interpeduncular nucleus.”

      Reviewer #2 (Recommendations For The Authors):

      Major comments:

      (1) Throughout the figures a need for more precision and reference in the anatomical evidence:

      • Specify how many planes over which height were projected for each Z-projection in Figure 1,2,3, ....

      We added this information to the last paragraph of the section titled Confocal Imaging within the Materials and Methods.

      • Provide the rhombomere numbers, delineate the ventricles, and always indicate on the panel the orientation (rostral-caudal, left-right, or ventral-dorsal) for Figure 1 panels D-F, Figure 2-1B-G, Figure 2-2A-C in the adult brain, and Figure 3.

      We annotated Figures 2-1 and 2-2 as suggested. We also indicated the orientation (anterior to the top or anterior to the left) in all figure legends. For additional context on the position of gsc2 and rln3a neurons within the larval brain, refer to Fig. 1A-C’, Fig. 1-2A, Fig. 2, Fig. 4 and Fig. 5.

      • Add close-ups where necessary: for Figure 2-2A-C, specify in the text and in the figure where the axon bundles from the gsc2+ neurons are in the adult brain; this seems interesting and is not commented on.

      We added a note to the legend of Figure 2-2: Arrowheads in B and B’ indicate mApple labeling of gsc2 neuronal projections to the hypothalamus. We also refer to Fig 2-2B, B’ in the Results section titled Distinct Projection Patterns of gsc2 and rln3a neurons.

      • Keep the same color for a given transgene within one figure: for example, glutamatergic neurons should always be the same color in A, B, and C; it is confusing as it is.

      We have followed the reviewer’s suggestion and made the color scheme consistent in Figure 3.

      • Movies: add the labels (which transgenic lines in which color, orientation, and anatomical boundaries for the NI, PAG, any other critical region that receives their projections, and the brain ventricle boundaries) on the anatomical movies in the supplement (e.g. Movie 4-1 for gsc2 neurons and 4-2 for rln3 neurons: add cerebellum, IPN, raphe, diencephalon, rostral and caudal hypothalamus, and medulla for 4-1, as well as lateral hypothalamus and optic tectum for 4-2); add the ablated region when necessary.

      We added more detail to the movie legends. Please refer to Figure 4 for additional anatomical details.

      • For highlighting projections from NI neurons and distinguishing them from the PAG neurons, the authors elegantly used 2-photon ablation of one versus the other cluster: this method is valid, but we need more resolution than the Z stacks added in the supplement, by performing subtraction of before and after maps.

      We are not sure what the reviewer meant by subtraction, as there are no before and after images in this experiment. Larvae underwent ablation of cell bodies and were imaged one day later in comparison to unablated larvae.

      In particular, it is not clear to me if both PAG and NI rln3a neurons project to medulla - can the authors specify this point & the comparison between intact & PAG vs NI ablation maps? The authors should resolve better the projections to all targeted regions of NI gsc2 neurons and differentiate them from other PAG gsc2 neurons, same for rln3a neurons.

      We have clarified this point on line 549.

      Make sure to mention in the Results section the duration between ablation and observation, which is key for the axons to degrade.

      We always assessed degeneration of neuronal processes at 1-day post-ablation.

      (2) Calcium imaging experiments:

      a) with optogenetic connectivity mapping:

      The authors combine an impressively diverse set of optogenetic actuators and sensors by taking advantage of the QUAS/QF2 and UAS/GAL4 systems to test connectivity from the Hb-IPN pathway onto gsc2 and rln3 neurons.

      The experiments are convincing but the choice of the duration of the stimulation (5s) is not adequate to test for direct connectivity: the authors should make sure that response in gsc2 neurons is observed with short duration (50ms-1s max).

      As noted above:

      “As the activity of the gsc2 neurons is already increased by 1.8 fold (± 0.28) within the first frame that the laser is activated (duration ~200 ms), it is unlikely that the observed response is due to non-specific activation induced by the long light pulse.”

      note: Specify that the gsc2 neurons tested are in NI.

      We have edited the text accordingly in the Results section titled Afferent input to the NI from the dHb-IPN pathway.

      b) for the response to shock: in the example shown for rln3 neurons, the activity differs before and after the shock with long phases of inhibition that were not seen before. Is it representative? the authors should carefully stare at their data & make sure there is no difference in activity patterns after shock versus before.

      We reexamined the responses for each of the rln3a neurons individually and confirmed that, although oscillations in activity are frequent, the apparent inhibition (excursions below baseline) is an idiosyncratic feature of the particular example shown.

      (3) motor activity assay:

      a) There seems to be a misconception in the use of the word "bout" for estimating bout distance and duration in panels H and I, and the analysis should be performed with the criteria used throughout the motor field:

      As we know now well based on the work of many labs on larval zebrafish (Orger, Baier, Engert, Wyart, Burgess, Portugues, Bianco, Scott, ...), a bout is defined as a discrete locomotor event corresponding to a distance swam of typically 1-6mm, bout duration is typically 200ms and larvae exhibit a bout every s or so during exploration (see Mirat et al Frontiers 2013; Marques et al Current Biology 2018; Rajan et al. Cell Reports 2022).

      Since the larval zebrafish has a low Reynolds number, it does not show much glide and its movement corresponds widely to the active phase of the tail beats.

      Instead of detecting the active (moving) frames as bouts, however, the authors' estimates of these values are quite far off, indicating a calibration error in movement detection: a bout cannot last 5-10 s, nor can the fish swim more than 1 cm per bout (by the authors' definition, bouts last 5-10 s and correspond to 10 cm, as 50 cm is covered in 5 bouts).

      The authors should therefore distinguish the active (moving) from inactive (immobile) phase of the behavior to define bouts & analyze the corresponding distance travelled and duration of active swimming. They would also benefit from calculating the % of time spent swimming in order to test whether the fish with ablated rln3 neurons change the fraction of the time spent swimming.

      As noted above:

      Our recordings capture the entire arena that the larva can explore in the experiment and therefore lack the spatial resolution to capture and analyze the tail beat. Rather, we measured the frequency and length of phases of movement in which the larva shows no more than 1 second of immobility. To avoid confusion with studies that measure bouts from the onset of tail movement, we removed this term from the manuscript and refer to activity as phases of movement.

      Note that a duration in seconds is not a length and that the corresponding symbol for seconds in a scientific publication is "s" and not "sec".

      We have corrected this.

      b) Controls in these experiments are key, as many clutches differ in their spontaneous exploration and there is a lot of variation in 2-min-long recordings (baseline is 115 s). The authors specify that the unablated controls are a mix of siblings; they should show us how the ablated transgenic animals compare to the non-ablated transgenic animals of the same clutch.

      The unablated Tg(gsc2:QF2)c721; Tg(QUAS:GFP)c578 and Tg(rln3a:QF2, he1.1:YFP)c836; Tg(QUAS:GFP)c578 larvae in the control group are siblings of ablated larvae. We repeated the analyses using either the Tg(gsc2:QF2)c721; Tg(QUAS:GFP)c578 or Tg(rln3a:QF2, he1.1:YFP)c836; Tg(QUAS:GFP)c578 larvae only as controls and added the results in Figure 8-3. Although the statistical power is slightly reduced due to a smaller number of samples in the control group, the conclusions are the same, as the behavior of Tg(gsc2:QF2)c721; Tg(QUAS:GFP)c578 and Tg(rln3a:QF2, he1.1:YFP)c836; Tg(QUAS:GFP)c578 unablated larvae is indistinguishable.

      Minor comments:

      (1) Anatomy :

      • Add precision in the anatomy in Figure 1:

      • Improve contrast for cckb.

      The contrast is determined by the signal to background ratio from the fluorescence in situ hybridization. Increasing the brightness would increase both the signal and the background, as any modification must be applied to the whole image.

      • since the number of neurons seems low in each category, could you quantify the number of rln3+, nmbb+, gsc2+, cckb+ neurons in NI?

      Quantification of neuronal numbers has been added to the first Results section titled Identification of gsc2 neurons in the Nucleus Incertus, lines 219-224.

      note: indicate duration for the integral of the DF/F in s and not in frames.

      We have added this in the legends for Figures 6 and 7 and in Materials and Methods.

      (2) Genetic tools:

      To generate a driver line for the rln3+ neurons using the Q system, the authors used the promoter for the hatching gland in order to drive expression in a structure outside of the nervous system that turns on early and transiently during development: this is a very elegant approach that should be used by many more researchers.

      If the he1.1 construct was integrated together with the QF2 in the first exon of the rln3a locus as shown in Figure 2, the construct should be listed with a "," rather than a ";" after rln3a:QF2 in the transgene name. Please edit the transgene name accordingly.

      We have edited the text accordingly.

      (3) Typos:

      GABAergic neurons is misspelled twice in Figure 3.

      Thank you for catching this. We have corrected the misspellings.

      Reviewer #3 (Recommendations For The Authors):

      • More analysis should be done to better characterize the calcium activity of gsc2 and rln3a populations. Specifically:

      Spontaneous activity is estimated by finding peaks in the time-series data, but the example in Fig7 raises concerns about this process: Two peaks for the gsc2 cell are identified while numerous other peaks of apparently similar SNR are not detected. Moreover, the inset images suggest GCaMP7a expression might be weaker in the gsc2 transgenic and as such, differences in peak count might be related to the SNR of the recordings rather than underlying activity. Overall, the process for estimating spontaneous activity should be more rigorous.

      To not solely rely on the identification of peaks in the calcium traces, we also plotted histograms of the amplitudes of the calcium signals for the rln3a and gsc2 neurons. The histograms show that the amplitudes of the rln3a calcium signals frequently occur at small and large values (suggesting large fluctuations in activity), whereas the amplitudes of the gsc2 calcium signals occur most frequently at median values. We added this analysis to a revised Figure 7.
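      For illustration, the comparison described above can be sketched as a simple binning of amplitude values. The bin width, function name, and toy dF/F traces below are our own assumptions, not the actual analysis code used for Figure 7.

```python
# Minimal sketch of the amplitude-histogram comparison described above.
# Bin width and the toy dF/F traces are illustrative assumptions only.

def amplitude_histogram(trace, bin_width=0.5):
    """Count how many samples of a dF/F trace fall into each amplitude bin."""
    counts = {}
    for v in trace:
        b = int(v // bin_width)  # index of the bin containing v
        counts[b] = counts.get(b, 0) + 1
    return counts

# A strongly oscillating toy trace spends most samples near its extremes,
# whereas a steadier toy trace concentrates around intermediate amplitudes.
oscillating = [0.0, 2.0, 0.0, 2.0, 0.0, 2.0]
steady = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9]
h_osc = amplitude_histogram(oscillating)   # {0: 3, 4: 3}
h_steady = amplitude_histogram(steady)     # {2: 4, 1: 2}
```

      A bimodal histogram (counts concentrated in the lowest and highest bins) is the signature of large activity fluctuations; a unimodal one centered on middle bins is the signature of steadier activity.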

      Interestingly, there are a number of large negative excursions in the calcium data for the rln3a cell - what is the authors' interpretation of these? Could it be that presynaptic inhibition via GABA-B receptors in dIPN might influence dIPN-innervating rln3a neurons?

      As noted above:

      We reexamined the responses for each of the rln3a neurons individually and confirmed that, although oscillations in activity are frequent, the apparent inhibition (excursions below baseline) is an idiosyncratic feature of the particular example shown.

      Regarding shock-evoked activity, the authors state "rln3a neurons showed ... little response to shock", yet the immediate response after shock appears very similar in gsc2 vs rln3a cells (approx 30 units on the dF/F scale). The subsequent time-course of the response is what appears to distinguish gsc2 versus rln3a; it might thus be useful to separately quantify the amplitude and decay time constant of the shock evoked response for the two populations.

      The reviewer is correct that the difference between the gsc2 and rln3a neurons in the response to shock is dependent on the duration of time post-shock that is analyzed. Thus, the more relevant feature is the duration of the response rather than its amplitude. To reflect this, we compared the average duration of responses for the gsc2 and rln3a neurons. We have now added this analysis to Figure 7 and updated the text accordingly.

      • The difference in spontaneous locomotor behavior is interesting and the example tracking data suggests there might also be differences in turn angle distribution and/or turn chain length following rln3 NI ablations. I would recommend the authors consider exploring this.

      Thank you for this suggestion. We wrote additional code to quantify turning behavior and found that larvae with rln3a NI neurons ablated do indeed have a statistically significant increase in turning compared to other groups. We now show this analysis as Figure 8-2 and we added an explanation of the quantification of turning behavior to the Methods section titled Locomotor assay.
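      As a sketch of one way turning can be quantified from centroid tracks, the heading change between consecutive displacement vectors gives a per-step turn angle. The function name and toy trajectory below are invented for illustration; the actual code added to the Methods may differ.

```python
# Minimal sketch of turn-angle quantification from centroid tracking data:
# the absolute heading change between consecutive movement steps.
import math

def turn_angles(xs, ys):
    """Absolute heading change (degrees) between consecutive movement steps."""
    headings = [math.atan2(y1 - y0, x1 - x0)
                for (x0, x1), (y0, y1) in zip(zip(xs, xs[1:]), zip(ys, ys[1:]))]
    angles = []
    for h0, h1 in zip(headings, headings[1:]):
        d = math.degrees(h1 - h0)
        d = (d + 180.0) % 360.0 - 180.0  # wrap into (-180, 180]
        angles.append(abs(d))
    return angles

# Toy track (cm): two straight steps, then a 90-degree turn.
xs = [0.0, 1.0, 2.0, 2.0]
ys = [0.0, 0.0, 0.0, 1.0]
angles = turn_angles(xs, ys)  # [0.0, 90.0]
```

      Summing or averaging these angles per larva then allows a statistical comparison of turning between ablated and control groups.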

      • I didn't follow the reasoning in the discussion that activity of rln3a cells may control transitions between phases of behavioral activity and inactivity. The events (at least those that are detected) in Fig7 occur with an average interval exceeding 30 s, yet swim bouts occur at a frequency around 1 Hz. The authors should clarify their hypothesis about how these disparate timescales might be connected.

      As noted above:

      Our recordings capture the entire arena that the larva can explore in the experiment and therefore lack the spatial resolution to capture and analyze the tail beat. Rather, we measure the frequency and duration of phases of movement in which the larva shows no more than 1 second of immobility. To avoid confusion with studies that measure bouts from the onset of tail movement, we removed this term from the manuscript and refer to activity as phases of movement.

      • Fig2-2: Images are ordered from (A, B, C) anterior to (A', B', C') posterior. It's not clear what this means and the images appear to be in sequence A, A', B, B'.... Please clarify and consider including a cartoon of the brain in sagittal view showing the location of the sections indicated.

      We clarified the text in the Figure 2-2 legend and added a drawing of the brain showing the location of the sections.

      • In Fig7, why are 300 frames analyzed pre/post shock? Even for gsc2, the response appears complete in ~100 frames.

      Reviewer #2 also pointed out that the difference between the gsc2 and rln3a neurons in the response to shock is dependent on the duration of time post-shock that is analyzed. Thus, the more relevant feature is the duration of the response rather than its amplitude. To reflect this, we compared the average duration of the response for the gsc2 and rln3a neurons and modified the text and Figure as described above.

      • What are the large negative excursions in the calcium signal in the rln3a data (Fig7E)?

      See response to Reviewer # 2, repeated below:

      We looked through each of the responses of the individual rln3a neurons and confirmed that, although oscillations in activity are frequent among the rln3a neurons, the apparent inhibition (excursions below baseline) is an idiosyncratic feature of the particular example shown.

      • There are several large and apparently perfectly straight lines in the fish tracking examples (Fig8) suggestive of tracking errors (ie. where the tracked centroid instantaneously jumps across the camera frame). Please investigate these and include analysis of the distribution of swim velocities to support the validity of the tracking data.

      The reason for this is indeed imperfect tracking resulting in frames in which the tracker does not detect the larva. The result is that the larva appears to move 1 cm or more in a single frame. However, analysis of the distribution of distances across all frames shows that these events (movement of 1 cm or more in a single frame) are rare (less than 0.04%), and there are no systematic differences that would explain the differences in locomotor behavior presented in Fig. 8. A summary of the data is as follows:

      Controls: 0.0249% of distances 1 cm or greater
      gsc2 neurons ablated: 0.0302% of distances 1 cm or greater
      rln3a NI neurons ablated: 0.0287% of distances 1 cm or greater
      rln3a PAG neurons ablated: 0.0241% of distances 1 cm or greater
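      The kind of check reported above can be sketched as follows; the coordinates and the 1 cm threshold are toy values standing in for the actual tracking data.

```python
# Minimal sketch of flagging tracking artifacts: compute per-frame centroid
# displacements and the fraction at or above 1 cm (a tracking-loss jump).
import math

def jump_fraction(xs, ys, threshold_cm=1.0):
    """Fraction of frame-to-frame displacements at or above threshold_cm."""
    dists = [math.hypot(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))]
    if not dists:
        return 0.0
    return sum(d >= threshold_cm for d in dists) / len(dists)

# Toy trajectory (cm): small swims, with one spurious jump across the arena.
xs = [0.0, 0.1, 0.2, 5.0, 5.1]
ys = [0.0, 0.0, 0.1, 5.0, 5.0]
frac = jump_fraction(xs, ys)  # 1 of 4 displacements is >= 1 cm, so 0.25
```

      Comparing this fraction across experimental groups, as in the summary above, tests whether tracking failures could account for apparent behavioral differences.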

      • Insufficient detail is provided in the methods about how swim bouts are detected (and their durations extracted) from the centroids tracking data. Please expand detail in this section.

      We added an explanation to the Methods section titled Locomotor assay.
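      To illustrate the rule stated earlier in this response (phases of movement separated by no more than 1 second of immobility), a minimal sketch follows. The frame rate, movement threshold, and function name are our own assumptions, not the actual code in the Methods.

```python
# Minimal sketch of movement-phase detection from per-frame displacements:
# frames with displacement above a threshold count as moving, and immobile
# gaps of at most 1 second are merged into the surrounding phase.

def movement_phases(displacements, fps=30, move_thresh=0.01, max_gap_s=1.0):
    """Return (start, end) frame indices of detected movement phases."""
    max_gap = int(max_gap_s * fps)
    phases = []
    start = None
    gap = 0
    for i, d in enumerate(displacements):
        if d > move_thresh:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:  # immobility exceeded 1 s: close the phase
                phases.append((start, i - gap))
                start, gap = None, 0
    if start is not None:
        phases.append((start, len(displacements) - 1 - gap))
    return phases

# Toy example at 1 frame/s: a 1-frame pause is merged, a 2-frame pause splits.
phases = movement_phases([0.1, 0, 0.1, 0, 0, 0.1], fps=1)  # [(0, 2), (5, 5)]
```

      Phase frequency and duration then follow directly from the number of phases and their start/end indices.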

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as the careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what the significance of each of these improvements was, if any.

      Indeed, it is (embarrassingly) difficult to benchmark against our past work due to the dependencies on different Python packages and the lack of software engineering. With the infrastructure we've laid down with this paper, made possible by an EOSS grant from CZI, that will not be a problem going forward. Not only is the code more reliable and standardized, but we have developed several scientific test sets that can be used as a basis for broad comparisons to judge whether improvements are substantial. We've also changed "substantial improvement" to "several modifications" to indicate the lack of comparison to past versions.

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      qFit can indeed handle both. We left out multiple chains for simplicity in constructing a dataset enriched for small proteins while still covering diversity, to speed our ability to rapidly iterate and test our approaches. Improvements to qFit ligand handling will be discussed in a forthcoming work, as we face similar technical debt to what we saw for proteins and are introducing "several modifications" that we hope will lead to "substantial improvement", but at the very least will accelerate further development.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      We added to the abstract:

      “Importantly, unlike ensemble models, the multiconformer models produced by qFit can be manually modified in most major model building software (e.g. Coot) and fit can be further improved by refinement using standard pipelines (e.g. Phenix, Refmac, Buster).”

      and introduction:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      and results:

      “This model can then be examined and edited in Coot12 or other visualization software, and further refined using software such as phenix.refine, refmac, or buster as the modeler sees fit.”

      and discussion

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms, an area where there are still few good solutions. That is, I expect qFit to find widespread use.

      Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Different from structure prediction methods, this approach is strongly based on high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - An automation of construction of multiple conformations when they can be identified visually;

      - A determination of multiple conformations when their visual identification is difficult or impossible.

      We often think about this problem similarly to the reviewer. However, in building qFit, we do not want to separate these problems, but rather to use the first category (obvious visual identification) to build an approach that can accomplish part of the second category (difficult to visualize) without building "impossible"/nonexistent conformations, with a consistent approach/bias.

      The first one is a known problem, when missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time being quite efficient, as the test results show.

      We agree with the reviewers' assessment here. The “floor” in terms of impact is automating a tedious part of high resolution model building and improving model quality.

      The second problem is important from the physical point of view and was first addressed by Burling & Brunger (1994; https://doi.org/10.1002/ijch.199400022). The new procedure deals with a second-order variation in the R-factors, of about 1% or less, like placing riding hydrogen atoms, modeling density deformation, or variation of the bulk solvent. In such situations, it is hard to justify model improvement. Keeping Rfree values, or their marginal decrease, can be considered a sign that the model has not overfit the data, but hardly a strong argument in favor of the model.

      We agree with the overall sentiment of this comment. What is a significant variation in R-free is an important question that we have looked at previously (http://dx.doi.org/10.1101/448795) and others have suggested an R-sleep for further cross validation (https://pubmed.ncbi.nlm.nih.gov/17704561/). For these reasons it is important to get at the significance of the changes to model types from large and diverse test sets, as we have here and in other works, and from careful examination of the biological significance of alternative conformations with experiments designed to test their importance in mechanism.

      In general, overall targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, yet Cruickshank (1956; https://doi.org/10.1107/S0365110X56002059) showed that averaged density images may lead to a shortening of covalent bonds when interpreting such maps by a single model. However, a total absence of geometric outliers is not necessarily required for the structures solved at a high resolution where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      Again, we agree—geometric outliers should not be completely absent, but it is comforting when they and model/experiment agreement both improve.

      The key local characteristic for multiconformer models is the closeness of the model map to the experimental one. Actually, the procedure uses a kind of such measure, the Bayesian information criterion (BIC). Unfortunately, there is no information about how sharply it identifies the best model or how much it changes between the initial and final models; overall, there is no sense of its values. The Q-score (page 17) can be a tool for the first problem, where the multiple conformations are clearly separated, and not for the second problem, where the contributions from neighboring conformations are merged. In addition to BIC or to even more conventional target functions such as LS or local map correlation, the extreme and mean values of the local difference maps may help to validate the models.

      We agree with the reviewer that the problem of "best" model determination is poorly posed here. We have been thinking a lot about this in the context of Bayesian methods (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278553/); however, a major stumbling block is how variable representations of alternative conformations (and compositions) are handled. The answers are more straightforward (though by no means simple) for ensemble representations, where the entire system is constantly represented but with multiple copies.

      This method with its results is a strong argument for a need in experimental data and information they contain, differently from a pure structure prediction. At the same time, absence of strong density-based proofs may limit its impact.

      We agree - indeed we think it will be difficult to further improve structure prediction methods without much more interaction with the experimental data.

      Strengths:

      Addressing an important problem and automatization of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible and essentially missing local real-space validation indicators.

      While not perfect real space indicators, local real-space validation is implicit in the MIQP selection step and explicit when we do employ Q-score metrics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A point of clarification: I don't understand why waters seem to be handled differently in for cryo-EM and crystallography datasets. I am interested about the statement on page 19 that the Molprobity Clashscore gets worse for cryo-EM datasets, primarily due to clashes with waters. But the qFit algorithm includes a round of refinement to optimize placement of ordered waters, and the clashscore improves for the qFit refinement in crystallography test cases. Why/how is this different for cryo-EM?

      We agree that this was not an appropriate point. We believe that the high clash score is coming from side chains being incorrectly modeled. We have updated this in the manuscript and it will be a focus of future improvements.

      Reviewer #2 (Recommendations For The Authors):

      - It would be instructive to the reader to explain how qFit handles the chromophore in the PYP (1OTA) example. To this end, it would be helpful to include deposition of the multiconformer model of PYP. This might also be a suitable occasion for discussion of potential hurdles in the deposition of multiconformer models in the PDB (if any!). Such concerns may be real concerns causing hesitation among potential users.

      Thank you for this comment. qFit does not alter the position or connectivity of any HETATM records (like the chromophore in this structure). Handling covalent modifications like this is an area of future development.

      Regarding deposition, we have noted above that the discussion now includes:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Finally, we have placed all PDBs in a Zenodo deposition (XXX) and have included that language in the manuscript. It is currently under a separate data availability section (page XXX). We will defer to the editor as to the best header it should go under.

      - It may be advisable to take the description of true/false pos/negatives out of the caption of Figure 4, and include it in a box or so, since these terms are important in the main text too, and the caption becomes very cluttered.

      We think adding the description of true/false pos/negatives to the Figure panel would make it very cluttered and wordy. We would like to retain this description within the caption. We have also briefly described each in the main text.

      - page 21, line 4: some issue with citation formatting.

      We have updated these citations.

      - page 25, second paragraph: cardinality is the number of members of a set. Perhaps "minimal occupancy" is more appropriate.

      Thank you for pointing this out. This was a mistake and should have been called the occupancy threshold.

      - page 26: it's - its

      Thank you, we have made this change. 

      - Font sizes in Supplementary Figures 5-7 are too small to be readable.

      We agree and will make this change. 

      Reviewer #3 (Recommendations For The Authors):

      General remarks

      (1) As I understand, the procedure starts from shifting residues one by one (page 4; A.1). Then, geometry reconstruction (e.g., B1) may be difficult in some cases joining back the shifted residues. It seems that such backbone perturbation can be done more efficiently by shifting groups of residues ("potential coupled motions") as mentioned at the bottom of page 9. Did I miss its description?

      We would describe the algorithm as sampling (which includes minimal shifts) in the backbone residues to ensure we can link neighboring residues. We agree that future iterations of qFit should include more effective backbone sampling by exploring motion along the Cβ-Cα, C-N, and (Cβ-Cα × C-N) bonds and exploring correlated backbone movements.

      (2) While the paper is well split in clear parts, some of them seem to be not at their right/optimal place and better can be moved to "Methods" (detailed "Overview of the qFit protein algorithm" as a whole) or to "Data" missed now (Two first paragraphs of "qFit improves overall fit...", page 8, and "Generating the qFit test set", page 22, and "Generating synthetic data ..." at page 26; description of the test data set), At my personal taste, description of tests with simulated data (page 15) would be better before that of tests with real data.

      Thank you for this comment, but we stand by our original decision to keep the general flow of the paper as it was submitted.

      (3) I wonder if the term "quadratic programming" (e.g., A3, page 5) is appropriate. It supposes optimization of a quadratic function of the independent parameters and not of "some" parameters. This is like the crystallographic LS which is not a quadratic function of atomic coordinates, and I think this is a similar case here. Whatever the answer on this remark is, an example of the function and its parameters is certainly missed.

      We think that the term quadratic programming is appropriate: we fit the occupancy coefficients by minimizing a quadratic loss (the squared difference between the observed and calculated density) subject to constraints on the independent parameters. We agree that the quadratic function was missing from the paper, and we have now included it in the Methods section.
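      As an illustration of this kind of constrained quadratic fit (not the actual qFit implementation, which uses a dedicated QP/MIQP solver), a toy projected-gradient sketch with invented density vectors:

```python
# Minimal sketch of the occupancy fit described above: choose conformer
# weights w >= 0 with sum(w) <= 1 that minimize the squared difference
# between observed density and the weighted sum of per-conformer densities.
# All densities here are hypothetical toy vectors, not real qFit data, and
# the rescaling step is a simple feasibility heuristic, not an exact
# Euclidean projection.

def fit_occupancies(conf_densities, observed, steps=5000, lr=0.01):
    n = len(conf_densities)
    m = len(observed)
    w = [1.0 / n] * n
    for _ in range(steps):
        # residual r = sum_k w_k * rho_k - rho_obs
        r = [sum(w[k] * conf_densities[k][i] for k in range(n)) - observed[i]
             for i in range(m)]
        # gradient of 0.5 * ||r||^2 w.r.t. w_k is <rho_k, r>
        grad = [sum(conf_densities[k][i] * r[i] for i in range(m))
                for k in range(n)]
        w = [w[k] - lr * grad[k] for k in range(n)]
        # keep the weights feasible: w_k >= 0 and sum(w) <= 1
        w = [max(0.0, x) for x in w]
        s = sum(w)
        if s > 1.0:
            w = [x / s for x in w]
    return w

# Toy example: observed density is 70% conformer A + 30% conformer B.
rho_a = [1.0, 0.0, 2.0]
rho_b = [0.0, 1.0, 1.0]
obs = [0.7 * a + 0.3 * b for a, b in zip(rho_a, rho_b)]
weights = fit_occupancies([rho_a, rho_b], obs)  # close to [0.7, 0.3]
```

      The recovered weights play the role of conformer occupancies; the constraints (non-negativity, occupancies summing to at most one) are what make the problem a constrained quadratic program rather than plain least squares.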

      Technical remarks to be answered by the authors :

      (1) Page 1, Abstract, line 3. The ensemble modeling is not the only existing frontier, and saying "one of the frontiers" may be better. Also, this phrase gives a confusing impression that the authors aim to predict the ensemble models while they do it with experimental data.

      We agree with this statement and have re-worded the abstract to reflect this.

      (2) Page 2. Burling & Brunger (1994) should be cited as predecessors. On the contrary, an excellent paper by Pearce & Gros (2021) is not relevant here.

      While we agree that we should mention the Burling & Brunger paper, the Pearce & Gros (2021) reference should not be removed, as it is not discussing the method of ensemble refinement.

      (3) Page 2, bottom. "Further, when compared to ..." The preference to such approach sounds too much affirmative.

      We have amended this sentence to state:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot(Emsley et al. 2010) unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      The point we were trying to make in this sentence was that ensemble-based models are much harder to manually manipulate in Coot or other similar software compared to multiconformer models. We think that the new version of this sentence states this point more clearly.

      (4) Page 2, last paragraph. I do not see an obvious relation of references 15-17 to the phrase they are associated with.

      We disagree with this statement, and think that these references are appropriate.

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot12 unlike ensemble methods that generate multiple complete protein copies(Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      (5) Page 3, paragraph 2. Cryo-EM maps should be also "high-resolution"; it does not read like this from the phrase.

      We agree that high-resolution should be added, and the sentence now states:

      “However, many factors make manually creating multiconformer models difficult and time-consuming. Interpreting weak density is complicated by noise arising from many sources, including crystal imperfections, radiation damage, and poor modeling in X-ray crystallography, and errors in particle alignment and classification, poor modeling of beam induced motion, and imperfect Detector Quantum Efficiency (DQE) in high-resolution cryo-EM.”

      (6) Page 3, last paragraph before "results". The words "... in both individual cases and large structural bioinformatic projects" do not have much meaning, except introducing a self-reference. Also, repeating "better than 2 A" looks not necessary.

      We agree that this was unnecessary and have simplified the last sentence to state:

      “With the improvements in model quality outlined here, qFit can now be increasingly used for finalizing high-resolution models to derive ensemble-function insights.”

      (7) Page 3. "Results". Could "experimental" be replaced by a synonym, like "trial", to avoid confusing with the meaning "using experimental data"?

      We have replaced experimental with exploratory to describe the use of qFit on cryo-EM data. The statement now reads:

      “For cryo-EM modeling applications, equivalent metrics of map and model quality are still developing, rendering the use of qFit for cryo-EM more exploratory.”

      (8) Page 4, A.1. Should it be "steps +/- 0.1" and "coordinate" be "coordinate axis"? One can modify coordinates and not shift them. I do not understand how, with the given steps, the authors calculated the number of combinations ("from 9 to 81"). Could a long "Alternatively, ...absent" be reduced simply to "Otherwise"?

      We have simplified and clarified the sentence on the sampling of backbone coordinates to state:

“If anisotropic B-factors are absent, the translation of coordinates occurs in the X, Y, and Z directions. Each translation takes place in steps of 0.1 Å along each coordinate axis, extending to 0.3 Å, resulting in 9 (if isotropic) or 81 (if anisotropic) distinct backbone conformations for further analysis.”

      (9) Page 6, B.1, line 2. Word "linearly" is meaningless here.

      We have modified this to read:

      “Moving from N- to C- terminus along the protein,”

      (10) Page 9, line 2. It should be explained which data set is considered as the test set to calculate Rfree.

We think this is already clear and that duplicating the information here would be repetitive.

      (11) Page 9, line 7. It should be "a valuable metric" and not "an"

      We agree and have updated the sentence to read:

      “Rfree is a valuable metric for monitoring overfitting, which is an important concern when increasing model parameters as is done in multiconformer modeling.”

(12) Page 10, paragraph 3. "... as a string (Methods)". I did not find any other mention of the term "string", including in "Methods", where it is supposed to be explained. Either this should be explained (and an example given?), or avoided.

We agree that “string” (referring to the programmatic datatype) is not necessary. We have removed this term from the sentence. It now reads:

      “To quantify how often qFit models new rotameric states, we analyzed the qFit models with phenix.rotalyze, which outputs the rotamer state for each conformer (Methods).”
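To illustrate the kind of analysis described in the quoted sentence, here is a minimal sketch of tallying rotamer states per residue from rotalyze-style output. The colon-delimited sample lines, their column order, and the residue/altloc labels are invented for illustration and do not reproduce phenix.rotalyze's actual output format:

```python
# Sketch: count distinct rotamer states per residue across conformers.
# The sample lines below are hypothetical; real phenix.rotalyze output
# has its own column layout, so adapt the field indices accordingly.
from collections import defaultdict

def rotamers_per_residue(lines):
    """Map residue identifier -> set of rotamer names across conformers."""
    states = defaultdict(set)
    for line in lines:
        fields = line.strip().split(":")
        if len(fields) < 2 or fields[-1] == "OUTLIER":
            continue  # skip malformed lines and rotamer outliers
        residue, rotamer = fields[0], fields[-1]
        states[residue].add(rotamer)
    return states

# Hypothetical residue with two conformers in different rotameric states.
sample = [
    "A 45 LEU:A:0.6:mt",   # altloc A, occupancy 0.6, rotamer "mt"
    "A 45 LEU:B:0.4:tp",   # altloc B, occupancy 0.4, rotamer "tp"
    "A 46 SER: :1.0:p",    # single conformer
]
states = rotamers_per_residue(sample)
new_rotameric = {res for res, s in states.items() if len(s) > 1}
```

Counting residues in `new_rotameric` across a model gives the frequency with which qFit introduced additional rotameric states.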

      (13) Page10, lines 3-4 from bottom. Are these two alternative conformations justified?

      We are unsure what this is referring to.

      (14) Page 12, Fig. 2A. In comparison with Supplement Fig 2C, the direction of axes is changed. Could they be similar in both Figures?

      We have updated Supplementary Figure 2C to have the same direction of axes as Figure 2A.

      (15) Page 15, section's title. Choose a single verb in "demonstrate indicate".

      We have amended the title of this section to be:

      “Simulated data demonstrate qFit is appropriate for high-resolution data.”

      (16) Page 15, paragraph 2. "Structure factors from 0.8 to 3.0 A resolution" does not mean what the author wanted apparently to tell: "(complete?) data sets with the high-resolution limit which varied from 0.8 to 3.0 A ...". Also, a phrase of "random noise increasing" is not illustrated by Figs.5 as it is referred to.

      We have edited this sentence to now read:

“To create the dataset for resolution dependence, we used the ground truth 7KR0 model, including all alternative conformations, and generated artificial structure factors with a high-resolution limit ranging from 0.8 to 3.0 Å resolution (in increments of 0.1 Å).”
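The resolution series in the revised sentence is straightforward to script. The sketch below enumerates the 23 high-resolution limits and builds phenix.fmodel-style command strings; the `high_resolution` keyword spelling is our assumption for illustration, not necessarily the authors' exact invocation:

```python
# Enumerate high-resolution cutoffs from 0.8 to 3.0 Å in 0.1 Å increments.
# round() avoids floating-point drift (0.8 + 0.1*k is not exact in binary).
resolutions = [round(0.8 + 0.1 * k, 1) for k in range(23)]

# Hypothetical command lines for generating artificial structure factors;
# the flag spelling is an assumption, check the phenix.fmodel documentation.
commands = [
    f"phenix.fmodel 7KR0.pdb high_resolution={d}" for d in resolutions
]
```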

      (17) Page 15, last paragraph is written in a rather formal and confusing way while a clearer description is given in the figure legend and repeated once more in Methods. I would suggest to remove this paragraph.

We agree that this is confusing. Instead of creating a true positive/false positive/true negative/false negative matrix, we have simply called things what they are: multiconformer or single conformer, and match or no match. We have edited the language in the manuscript and figure legends to reflect these changes.

      (18) Page 16. Last two paragraphs start talking about a new story and it would help to separate them somehow from the previous ones (sub-title?).

      We agree that this could use a subtitle. We have included the following subtitle above this section:

      “Simulated multiconformer data illustrate the convergence of qFit.”

      (19) Page 20. "or static" and "we determined that" seem to be not necessary.

We have removed “static” and now refer only to single conformer models. However, as one of the main conclusions of this paper is determining that qFit can pick up on alternative conformers that were modeled manually, we have decided to keep the “we determined that”.

      (20) Page 21, first paragraph. "Data" are plural; it should be "show" and "require"

      We have made these edits. The sentence now reads:

“However, our data here show that not only does qFit need a high-resolution map to be able to detect signal from noise, it also requires a very well-modeled structure as input.”

      (21) Page 21, References should be indicated as [41-45], [35,46-48], [55-57]. A similar remark to [58-63] at page 22.

      We have fixed the reference layout to reflect this change.

      (22) Page 21, last paragraph. "Further reduce R-factors" (moreover repeated twice) is not correct neither by "further", since here it is rather marginal, nor as a goal; the variations of R-factors are not much significant. A more general statement like "improving fit to experimental data" (keeping in mind density maps) may be safer.

      We agree with the duplicative nature of these statements. We have amended the sentence to now read:

“Automated detection and refinement of partial-occupancy waters should help improve the fit to experimental data15 and provide additional insights into hydrogen-bond patterns and the influence of solvent on alternative conformations.”

      (23) Page 22. Sub-sections of "Methods" are given in a little bit random order; "Parallelization of large maps" in the middle of the text is an example. Put them in a better order may help.

We have moved some sections of the Methods around and made better headings by using an underscore to highlight the subsections (Generating and running the qFit test set, qFit improved features, Analysis metrics, Generating synthetic data for resolution dependence).

      (24) Page 24. Non-convex solution is a strange term. There exist non-convex problems and functions and not solutions.

We agree, and we have changed the language to state that we present the algorithm with non-convex problems, which it cannot solve.

      (25) Page 26, "Metrics". It is worthy to describe explicitly the metrics and not (only) the references to the scripts.

For each metric, we include a sentence or two describing what it captures. As these metrics are well known in the structural biology field, we do not feel the need to elaborate on them further.

      (26) Page 26. Multiplying B by occupancy does not have much sense. A better option would be to refer to the density value in the atomic center as occ*(4*pi/B)^1.5 which gives a relation between these two entities.

We agree and have updated the B-factor figures and metrics to reflect this.
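To make the reviewer's relation concrete: for a Gaussian atom, the model density at the atomic center is proportional to occ·(4π/B)^1.5, so occupancy and B-factor combine as in this sketch (the proportionality constant is omitted; this is a simplified illustration, not necessarily the exact metric used in the revised figures):

```python
import math

def center_density(occupancy, b_factor):
    """Relative model density at an atomic center for a Gaussian atom,
    rho(0) proportional to occ * (4*pi/B)**1.5 (constant omitted)."""
    return occupancy * (4.0 * math.pi / b_factor) ** 1.5

full = center_density(1.0, 20.0)    # full occupancy
half = center_density(0.5, 20.0)    # halved occupancy -> halved peak density
broad = center_density(1.0, 40.0)   # larger B-factor -> flatter, lower peak
```

This makes explicit why simply multiplying B by occupancy is not meaningful: the two quantities enter the peak density with different functional forms.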

      (27) Page 40, suppl. Fig. 5. Due to the color choice, it is difficult to distinguish the green and blue curves in the diagram.

We have amended this; the colors of the curves have been switched.

      (28) Page 42, Suppl. Fig. 7. (A) How the width of shaded regions is defined? (B) What the blue regions stand for? Input Rfree range goes up to 0.26 and not to 0.25; there is a point at the right bound. (C) Bounds for the "orange" occupancy are inversed in the legend.

(A) The width of the shaded region denotes the standard deviation of the values at each resolution. We have made this clearer in the caption.

(B) The blue region denotes the confidence interval for the regression estimate. The size of the confidence interval was set to 95%. We have made this clearer in the caption.

(C) This has been fixed now.

Regarding the axis range in (B): the maximum Rfree value is 0.2543, which we rounded down to 0.25.
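As a concrete illustration of the shading convention described in (A), the sketch below computes a mean ± one standard deviation band across replicate values at each resolution; the replicate numbers are synthetic and purely illustrative:

```python
import numpy as np

# Synthetic replicate values: rows = replicates, columns = resolution bins.
rng = np.random.default_rng(0)
resolutions = np.round(np.arange(0.8, 3.05, 0.1), 1)
values = 0.05 * resolutions + rng.normal(0.0, 0.005, (5, resolutions.size))

# Shaded band as in panel (A): mean +/- one standard deviation per bin.
mean = values.mean(axis=0)
sd = values.std(axis=0, ddof=1)
band_low, band_high = mean - sd, mean + sd
```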

      (29) Page 43. Letters E-H in the legend are erroneously substituted by B-E.

      We apologize for this mistake. It is now corrected.

1. As adoption of nanopore sequencing technology continues to advance, the need to maintain large volumes of raw current signal data for reanalysis with updated algorithms is a growing challenge. Here we introduce slow5curl, a software package designed to streamline nanopore data sharing, accessibility and reanalysis. Slow5curl allows a user to fetch a specified read or group of reads from a raw nanopore dataset stored on a remote server, such as a public data repository, without downloading the entire file. Slow5curl uses an index to quickly fetch specific reads from a large dataset in SLOW5/BLOW5 format and highly parallelised data access requests to maximise download speeds. Using all public nanopore data from the Human Pangenome Reference Consortium (>22 TB), we demonstrate how slow5curl can be used to quickly fetch and reanalyse signal reads corresponding to a set of target genes from each individual in a large cohort dataset (n = 91), minimising the time, egress costs, and local storage requirements for their reanalysis. We provide slow5curl as a free, open-source package that will reduce frictions in data sharing for the nanopore community: https://github.com/BonsonW/slow5curl

      Reviewer 2. Yunfan Fan

      Comments to Author: In this manuscript, the authors demonstrate a highly streamlined method for downloading targeted subsets of raw ONT electrical signals, for re-analysis. In my view, this will be a highly useful tool for researchers working with public nanopore data, and I hope to see its widespread adoption. The benchmarks are well-described in the manuscript, and the code is publicly available and well-documented. I have no other notes or suggestions for the authors.

1. Dynamic functional connectivity (dFC) has become an important measure for understanding brain function and as a potential biomarker. However, various methodologies have been developed for assessing dFC, and it is unclear how the choice of method affects the results. In this work, we aimed to study the results variability of commonly-used dFC methods. We implemented seven dFC assessment methods in Python and used them to analyze fMRI data of 395 subjects from the Human Connectome Project. We measured the pairwise similarity of dFC results using several similarity metrics in terms of overall, temporal, spatial, and inter-subject similarity. Our results showed a range of weak to strong similarity between the results of different methods, indicating considerable overall variability. Surprisingly, the observed variability in dFC estimates was comparable to the expected natural variation over time, emphasizing the impact of methodological choices on the results. Our findings revealed three distinct groups of methods with significant inter-group variability, each exhibiting distinct assumptions and advantages. These findings highlight the need for multi-analysis approaches to capture the full range of dFC variation. They also emphasize the importance of distinguishing neural-driven dFC variations from physiological confounds, and developing validation frameworks under a known ground truth. To facilitate such investigations, we provide an open-source Python toolbox that enables multi-analysis dFC assessment. This study sheds light on the impact of dFC assessment analytical flexibility, emphasizing the need for careful method selection and validation, and promoting the use of multi-analysis approaches to enhance reliability and interpretability of dFC studies.

Competing Interest Statement: The authors have declared no competing interest.

      Reviewer 2. Nicolas Farrugia

Comments to Author: Summary of review: This paper fills a very important gap in the literature investigating time-varying functional connectivity (or dynamic functional connectivity, dFC) by measuring the analytical flexibility of seven different dFC methods. An impressive amount of work has been put in to generate a set of convincing results, which essentially show that the main object of interest of dFC, the temporal variability of connectivity, cannot be measured with high consistency, as this variability is of the same order of magnitude as, or even higher than, the changes observed across different methods on the same data. In this very controversial field, it is very remarkable that the authors have managed to put together a set of analyses demonstrating this in a very clear and transparent way. The paper is very well written, the overall approach is based on a few assumptions that make it possible to compare methods (e.g. temporal subsampling of some methods, spatial subsampling), and the provided analysis is very complete. The most important results are condensed in a few figures in the main manuscript, which is enough to convey the main messages. The supplementary materials provide an exhaustive set of additional results, which are briefly discussed one by one. Most importantly, the authors have provided an open-source implementation of the seven main dFC methods. This is very welcome for the community and for reproducibility, and is of course particularly well suited to this kind of contribution. A few suggestions follow.

Clarification questions and suggestions:

1. How was the uniform downsampling of 286 ROIs to 96 done? Uniform in which sense? According to the RSNs? Were ROIs regrouped with spatial contiguity? I understand this was done to reduce computational complexity and to harmonize across methods, but the manuscript would benefit from an added sentence explaining what was done.

2. Table A in Figure 1 shows the important hyperparameters (HP) for each method, but the motivation for the choice of HP for each method is only explained in the discussion (end of page 11: "we adopted the hyperparameter values recommended by the original paper or consensus among the community for each method"). It would be better to explain this in the methods, and then only discuss why it can be a limitation in the discussion.

3. The GitHub repository https://github.com/neurodatascience/dFC/tree/main does not reference the paper.

4. The GitHub repository https://github.com/neurodatascience/dFC/tree/main is not documented enough. There are two very large added values in this repo: an open implementation of the methods, and the analytical-flexibility tools. The demo notebook shows how to use the analytical-flexibility tools, but the method implementations are not documented. I expect that many people will want to run analyses using the methods as well as comparison analyses, so the documentation of individual methods should not be minimized.

5. For the reader, it would be better to mention the availability of the code for reproducibility early in the manuscript (in the introduction). Currently, the toolbox is only introduced in the final paragraph of the discussion. It comes as a very nice surprise when reading the manuscript in full, but I think the manuscript would gain a lot of value if this paragraph were included earlier, and if the development of the toolbox were mentioned much earlier (i.e. in the abstract).

6. We have published two papers on dFC that the authors may want to include, although these papers investigated cerebello-cerebral dFC using whole-brain + cerebellum parcellations. The first paper used a continuous HMM on healthy subjects and found correlations with impulsivity scores, while the second used network measures on sliding-window dFC matrices in a clinical cohort (patients with alcohol use disorder). I am not sure why the authors did not find our papers in their literature search, but it would be good to include them. The authors should update the final table in the supplementary materials as well as the citations in the main paper.

Abdallah, M., Farrugia, N., Chirokoff, V., & Chanraud, S. (2020). Static and dynamic aspects of cerebro-cerebellar functional connectivity are associated with self-reported measures of impulsivity: A resting-state fMRI study. Network Neuroscience, 4(3), 891-909.

Abdallah, M., Zahr, N. M., Saranathan, M., Honnorat, N., Farrugia, N., Pfefferbaum, A., Sullivan, E. & Chanraud, S. (2021). Altered cerebro-cerebellar dynamic functional connectivity in alcohol use disorder: a resting-state fMRI study. The Cerebellum, 20, 823-835.

Note that in Abdallah et al. (2020), while we did not compare HMM results with other dFC methods, we did investigate the influence of HMM hyperparameters, and performed internal cross-validation on our sample plus null models of dFC.

      Minor comments 6 - "[..] what lies behind the of methods. Instead, they reveal three groups of methods, 720 variations in dynamic functional connectivity?. " -> an extra "." was added (end of page 10).
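The pairwise-similarity analysis described in the paper's abstract can be sketched as follows. Each method's dFC result is flattened and compared with Spearman rank correlation, one of several plausible similarity metrics; the three toy "methods" are synthetic stand-ins, not the toolbox's actual implementations:

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation of two flattened dFC results (assumes no ties)."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]

def pairwise_similarity(results):
    """Symmetric method-by-method similarity matrix from flattened dFC arrays."""
    names = list(results)
    sim = np.ones((len(names), len(names)))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            sim[i, j] = sim[j, i] = spearman(results[names[i]], results[names[j]])
    return names, sim

# Toy example: two similar "methods" and one unrelated one.
rng = np.random.default_rng(1)
base = rng.normal(size=100)
results = {
    "sliding_window": base,
    "clustering": base + rng.normal(0.0, 0.1, 100),  # close to sliding_window
    "hmm": rng.normal(size=100),                     # unrelated
}
names, sim = pairwise_similarity(results)
```

Hierarchical clustering of `1 - sim` would then recover groups of similar methods, analogous to the three groups of methods reported in the paper.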

2. Abstract: Dynamic functional connectivity (dFC) has become an important measure for understanding brain function and as a potential biomarker. However, various methodologies have been developed for assessing dFC, and it is unclear how the choice of method affects the results. In this work, we aimed to study the results variability of commonly-used dFC methods. We implemented seven dFC assessment methods in Python and used them to analyze fMRI data of 395 subjects from the Human Connectome Project. We measured the pairwise similarity of dFC results using several similarity metrics in terms of overall, temporal, spatial, and inter-subject similarity. Our results showed a range of weak to strong similarity between the results of different methods, indicating considerable overall variability. Surprisingly, the observed variability in dFC estimates was comparable to the expected natural variation over time, emphasizing the impact of methodological choices on the results. Our findings revealed three distinct groups of methods with significant inter-group variability, each exhibiting distinct assumptions and advantages. These findings highlight the need for multi-analysis approaches to capture the full range of dFC variation. They also emphasize the importance of distinguishing neural-driven dFC variations from physiological confounds, and developing validation frameworks under a known ground truth. To facilitate such investigations, we provide an open-source Python toolbox that enables multi-analysis dFC assessment. This study sheds light on the impact of dFC assessment analytical flexibility, emphasizing the need for careful method selection and validation, and promoting the use of multi-analysis approaches to enhance reliability and interpretability of dFC studies.

This work has been published in GigaScience under a CC-BY 4.0 license (https://doi.org/10.1093/gigascience/giae009), and the reviews have been published under the same license. They are as follows.

      Reviewer 1: Yara Jo Toenders

Comments to Author: The authors performed an in-depth comparison of 7 dynamic functional connectivity methods. The paper includes many figures that are greatly appreciated, as they clearly demonstrate the findings. Moreover, the authors developed a Python toolbox implementing these 7 methods. The results showed high variability across methods, although three clusters of similar methods could be detected. However, after reading the manuscript, some questions remain.

- The TR and number of timepoints of the fMRI images are shown, but other acquisition parameters such as the voxel size are missing. Could all acquisition parameters please be provided?
- Could more information be provided on the downsampling of the 286 to 96 ROIs? How was this done, and what were the 96 ROIs that were created?
- In the results it is explained that the definition of groups depended on the cutoff value of the clustering, but it is unclear how the cutoff value was determined. Could the authors elucidate how this was done?
- The difference between the subplots in Figure 3 is a bit difficult to understand because the labels of the different methods switch places. Perhaps the same colour could be used for the cluster of the continuous HMM, Clustering, and discrete HMM methods to increase readability?
- Figure 4b shows that the default mode network is more variable over methods than time, while the auditory and visual networks are not. Could the authors explain what may underlie this discrepancy?
- From the introduction it became clear that many studies have used dFC to study clinical populations. While I understand that no single recommendation can be given, not every clinical study might have the capacity to use all 7 methods. What would the authors recommend for these clinical studies? Would there, for example, be a recommended method within each of the three clusters?
- It could be helpful if the authors created DOIs for their toolbox code bases that could be cited in a manuscript, rather than linking to bare GitHub URLs. One potentially useful guide is: https://guides.github.com/activities/citable-code/

1. Background: Culture-free real-time sequencing of clinical metagenomic samples promises both rapid pathogen detection and antimicrobial resistance profiling. However, this approach introduces the risk of patient DNA leakage. To mitigate this risk, we need near-comprehensive removal of human DNA sequence at the point of sequencing, typically involving use of resource-constrained devices. Existing benchmarks have largely focused on use of standardised databases and largely ignored the computational requirements of depletion pipelines as well as the impact of human genome diversity.

Results: We benchmarked host removal pipelines on simulated Illumina and Nanopore metagenomic samples. We found that construction of a custom kraken database containing diverse human genomes results in the best balance of accuracy and computational resource usage. In addition, we benchmarked pipelines using kraken and minimap2 for taxonomic classification of Mycobacterium reads using standard and custom databases. With a database representative of the Mycobacterium genus, both tools obtained near-perfect precision and recall for classification of Mycobacterium tuberculosis. Computational efficiency of these custom databases was again superior to most standard approaches, allowing them to be executed on a laptop device.

Conclusions: Nanopore sequencing and a custom kraken human database with a diversity of genomes leads to superior host read removal from simulated metagenomic samples while being executable on a laptop. In addition, constructing a taxon-specific database provides excellent taxonomic read assignment while keeping runtime and memory low. We make all customised databases and pipelines freely available.

Competing Interest Statement: The authors have declared no competing interest.

      Reviewer 2. Darrin Lemmer, M.S.

Comments to Author: This paper describes a method for improving the accuracy and efficiency of extracting a pathogen of interest (M. tuberculosis in this instance, though the methods should work equally well for other pathogens) from a "clinical" metagenomic sample. The paper is well written and provides links to all source code and datasets used, which were well organized and easy to understand. The premise -- that using a pangenome database improves classification -- seems pretty intuitive, but it is nice to see some benchmarking to prove it. For clarity I will arrange my comments by the three major steps of your methods: dataset generation, human read removal, and Mycobacterium read classification.

1. Dataset generation -- I appreciate that you used a real-world study (reference #8) to approximate the proportions of organisms in your sample; however, I am disappointed that you generated exactly one dataset for benchmarking. Even if you use the exact same community composition, there is a level of randomness involved in generating sequencing reads, and therefore some variance. I would expect to see multiple generations and an averaging of the results in the tables. With a sufficiently high read depth, the variance won't likely change your results much, so it would be nice, and more true to real sequencing data, to vary the number of reads generated (I didn't see where you specified the read depth to which each species was generated), as it is rare in the real world to always get this deep of coverage. Ideally it would also be nice to see datasets varying the proportions of MTBC in the sample to test the limits of detection, but that may be beyond the scope of this particular paper.

2. Human read removal -- The data provided do not really support the conclusion, as all methods benchmarked performed quite well and, particularly when using the long reads from the Nanopore simulated dataset, were fairly indistinguishable with the exception of HRRT. The short Illumina reads show a little more separation between the methods, probably due to the shorter sequences being able to align to multiple sequences in the reference databases; however, comparing kraken human to kraken HPRC still shows very little difference, thus not supporting the conclusion that the pangenome reference provides "superior" host removal. The run times and memory used do much more to separate the performance of the various methods, particularly with the goal of being able to run the analysis on a personal computer, where peak memory usage is important. The only methods that perform well within the memory constraints of a personal computer for both long reads and short reads are HRRT and the two kraken methods, with kraken being superior at recall; but again, kraken human and kraken HPRC are virtually indistinguishable, making it hard to justify the claim that the pangenome is superior. Also, it appears your run time and peak memory usage are again based on one single data point; these should be measured multiple times and averaged. Finally, as an aside, I did find it interesting and disturbing that HRRT had such a high false negative rate compared to the other methods, given that this is the primary method used by NCBI for publishing in the SRA database, implying there are quite a few human reads remaining in SRA.

3. Mycobacterium read classification -- Here we do have some pretty good support for using a pangenome reference database, particularly compared to the kraken standard databases, though as mentioned previously, a single data point isn't really adequate, and I'd like to see both multiple datasets and multiple runs of each method. Additionally, given that the purpose here is to improve the amount of MTB extracted from a metagenomic sample, these data should be taken one extra step to show the coverage breadth and depth of the MTB genome provided by the reads classified as MTB, as a high number of reads doesn't mean much if they are all stacked at the same region of the genome. Given that these are simulated reads, which tend to have pretty even genome coverage, this may not show much; however, it is still an important piece to show the value of your recommended method.

One final comment is that it should be fairly easy to take this beyond a theoretical exercise by running some actual real-world datasets through the methods you are recommending to see how well they perform in actuality. For instance, reference #8, which you used as a basis for the composition of your simulated metagenomic sample, published their actual sequenced sputum samples. It would be easy to show whether you can improve the amount of Mycobacterium extracted from their samples over the methods they used, thus showing value to those lower-income/high-TB-burden regions where whole metagenome sequencing may be the best option they have.

      Re-review.

This is a significantly stronger paper than originally submitted. I especially appreciate that multiple runs have now been done with more than one dataset, including a "real" dataset, and the analysis showing the breadth and depth of coverage of the retained Mtb reads, proving that you can still generally get a complete genome of a metagenomic sample with these methods. However, kraken's low sensitivity when using the standard database definitely impacts the results, making a stronger argument for using a pangenome database (Kraken-Standard can identify the presence of Mtb, but if you want to do anything more with it, like AMR detection, you would need to use a pangenome database). I really think that this should be emphasized more, and perhaps some or all of the data in tables S9-S12 should be brought into the main paper. It is perhaps worth noting that the significant drop in breadth is, I would imagine, a result of dividing the total size of the aligned reads by the size of the genome, implying shallow coverage; the reality is still high coverage in the areas that are covered, but with complete gaps elsewhere. I did also like the switch to the somewhat more standard sensitivity/specificity metrics, though I do lament the actual FN/FP counts being relegated to the supplemental tables, as I thought these numbers valuable (or at least interesting) when comparing the results of the various pipelines, particularly with human read removal, where the various pipelines perform quite similarly.

    1. Stack Overflow - Where Developers Learn, Share, & Build Careers. URL: https://stackoverflow.com/ (visited on 2023-12-08).

As someone who has to code frequently, Stack Overflow is my best helper because people have already posted problems and solutions for most of the coding errors I have faced. I really like Stack Overflow as it has a great community filled with coders who are willing to help you. A quip about Stack Overflow that I saw online and found funny: "All coders have used Stack Overflow, but none of them has seen the homepage."

    1. Black Duck® software composition analysis (SCA) helps teams manage the security, quality, and license compliance risks that come from the use of open source and third-party code in applications and containers.
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

The authors set up a pipeline for automated high-throughput single-molecule fluorescence imaging (htSMT) in living cells and analysis of molecular dynamics.

      Strengths:

      htSMT reveals information on the diffusion and bound fraction of molecules, dose-response curves, relative estimates of binding rates, and temporal changes of parameters. It enables the screening of thousands of compounds in a reasonable time and proves to be more sensitive and faster than classical cell-growth assays. If the function of a compound is coupled to the mobility of the protein of interest, or affects an interaction partner, which modulates the mobility of the protein of interest, htSMT allows identifying the modulator and getting the first indication of the mechanism of action or interaction networks, which can be a starting point for more in-depth analysis.

      Weaknesses:

      While elegantly showcasing the power of high-throughput measurements, the authors disclose little information on their microscope setup and analysis procedures. Thus, reproduction by other scientists is limited. Moreover, a critical discussion about the limits of the approach in determining dynamic parameters, the mechanism of action of compounds, and network reconstruction for the protein of interest is missing. In addition, automated imaging and analysis procedures require implementing sensitive measures to assure data and analysis quality, but a description of such measures is missing.

      The reviewer rightly highlights both the power and complexity in high throughput assay systems, and as such the authors have spent significant effort in first developing quality control checks to support screening. We discuss some of these as part of the description and characterization of the platform. We added additional details into the manuscript to help clarify. The implementation of our workflow for image acquisition, processing and analysis relies heavily on the specifics of our lab hardware and software infrastructure. We have added additional details to the text, particularly in the Methods section, and believe we have added enough information that our results can be reproduced using the suite of tools that already exist for single molecule tracking.

The reviewer also points out that all assays have limitations and that these have not been clearly identified as part of our discussion of the htSMT platform. We have added some comments on the limitations of the current system and our approach.

      Reviewer #2 (Public Review):

      Summary:

McSwiggen et al present a high throughput platform for SPT that allows them to identify pharmaceutical interactions with the diffusional behavior of receptors and, in turn, to identify potent new ligands and cellular mechanisms. The manuscript is well written; it provides a solid new method and a proper experimental foundation.

      Strengths:

The method capitalizes on and extends existing high-throughput toolboxes and is directly applied to multiple receptors and ligands. The outcomes are important and relevant for society. Screening 10^6 cells and >400 ligands is a significant achievement.

      The method can detect functionally relevant changes in transcription factor dynamics and accurately differentiate the ligand/target specificity directly within the cellular environment. This will be instrumental in screening libraries of compounds to identify starting points for the development of new therapeutics. Identifying hitherto unknown networks of biochemical signaling pathways will propel the field of single-particle live cell and quantitative microscopy in the area of diagnostics. The manuscript is well-written and clearly conveys its message.

      Weaknesses:

      There are a few elements, that if rectified would improve the claims of the manuscript.

The authors claim that they measure receptor dynamics. In essence, their readout is a variation in diffusional behavior that correlates with ligand binding. While ligand binding can result in altered dynamics and/or a shift in conformational equilibrium, SPT does not directly record protein structural dynamics, but rather their effect on diffusion. They should correct and elaborate on this.

      This is an excellent clarifying question, and we have tried to make it more explicit in the text. The reviewer is absolutely correct; we’re not using SPT to directly measure protein structural dynamics, but rather the interactions a given protein makes with other macromolecules within the cell. So when an SHR binds to ligand it adopts conformations that promote association with DNA and other protein-protein interactions relevant to transcription. This is distinct from assays that directly measure conformational changes of the protein.

      L 148 What do the authors mean 'No correlation between diffusion and monomeric protein size was observed, highlighting the differences between cellular protein dynamics versus purified systems'. This is not justified by data here or literature reference. How do the authors know these are individual molecules? Intensity distributions or single bleaching steps should be presented.

The point we were trying to make is that the relative molecular weights of the monomeric proteins (138 kDa for Halo-AR, 102 kDa for ER-Halo, 122 kDa for Halo-GR, and 135 kDa for Halo-PR) are uncorrelated with their apparent free diffusion coefficients. Were we to make this measurement on purified protein in buffer, where diffusion is well described by the Stokes-Einstein equation, one would expect monomer size and diffusion to be related. We’ve clarified this point in the manuscript.
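As a back-of-the-envelope illustration of the in-buffer expectation, the sketch below applies the Stokes-Einstein relation to the quoted monomer masses. The hydrodynamic-radius prefactor, temperature, and viscosity are assumed textbook values for a compact globular protein in water, not measured parameters from this study:

```python
import math

def stokes_einstein_d(mw_da, temp_k=293.0, eta_pa_s=1.0e-3):
    """Expected free diffusion coefficient (um^2/s) in buffer for a
    globular protein of the given molecular weight (Da)."""
    k_b = 1.380649e-23  # Boltzmann constant, J/K
    # Empirical radius for a compact globular protein: r(nm) ~ 0.066 * MW^(1/3)
    r_m = 0.066e-9 * mw_da ** (1.0 / 3.0)
    d_m2_s = k_b * temp_k / (6.0 * math.pi * eta_pa_s * r_m)
    return d_m2_s * 1e12  # convert m^2/s to um^2/s

# Monomer molecular weights quoted in the response (Da)
constructs = {"Halo-AR": 138e3, "ER-Halo": 102e3, "Halo-GR": 122e3, "Halo-PR": 135e3}
for name, mw in sorted(constructs.items(), key=lambda kv: kv[1]):
    print(f"{name}: ~{stokes_einstein_d(mw):.0f} um^2/s expected in buffer")
```

Under these assumptions the smaller ER-Halo monomer should diffuse measurably faster than Halo-AR; the absence of any such size ordering in the cellular measurements is the point the authors make.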

      Along the same lines, the data in Figs 2 and 4 show that not only the immobile fraction is increased but also that the diffusion coefficient of the fast-moving (attributed to free) is reduced. The authors mention this and show an extended Fig 5 but do not provide an explanation.

This is an area where there is still more work to do in understanding the estrogen receptor and other SHRs. As the reviewer says, we see not only an increase in chromatin binding but also a decrease in the diffusion coefficient of the “free” population. A potential explanation is a greater prevalence of freely diffusing homodimers of the receptor, or of other protein-protein interactions (14-3-3, P300, CBP, etc.) that can occur after ligand binding. Nothing in our bioactive compound screen shed light on this in particular, so we can only speculate and have refrained from drawing further conclusions in the text.

      How do potential transient ligand binding and the time-dependent heterogeneity in motion (see comment above) contribute to this? Also, in line 216 the authors write "with no evidence" of transient diffusive states. How do they define transient diffusive states? While there are toolboxes to directly extract the existence and abundance of these either by HMM analysis or temporal segmentation, the authors do not discuss or use them.

Throughout the analysis in this work, we consider all of the tracks within a 2-second FOV as representative of a single underlying population and have not looked at changes in dynamics within a single movie. As we show in the supplemental figures we added (see Figure 3, figure supplement 1), this appears to be a reasonable assumption, at least in the cases we’ve encountered in this manuscript. For experiments involving changes in dynamics over time, we added compound simultaneously with imaging and collected many 2-second FOVs in sequence to monitor changes in ER dynamics. In this case, when we refer to “transient states,” we are pointing out that we don’t observe any new states in the State Array diagram that exist at early time points but disappear at later time points.

The reviewer suggests track-level analysis methods like hidden Markov models or variational Bayesian approaches, which have been used previously in the single-molecule community. These are very powerful techniques, provided the trajectories are long (typically 100s of frames). In the case of molecules that diffuse quickly and can diffuse out of the focal plane, we don’t have the luxury of such long trajectories. This was demonstrated previously (Hansen et al 2017, Heckert et al 2022), and so we’ve adopted the State Array approach to inferring state occupations from short trajectories. As the reviewer rightly points out, this approach potentially loses information about state transitions or changes over time, but as of now we are not aware of any robust methods that work on short trajectories.

The authors discuss the methods for extracting kinetic information of ligand binding by diffusion. They should consider the temporal segmentation of heterogeneous diffusion. There are numerous methods published in journals or on BioRxiv, based on analytical or deep-learning tools, for performing temporal segmentation. This could elevate their analysis of Kon and Koff.

      We’re aware of a number of approaches for analyzing both high framerate SMT as well as long exposure residence time imaging. As we say above, we’re not aware of any methods that have been demonstrated to work robustly on short trajectories aside from the approaches we’ve taken. Similarly, for residence time imaging there are published approaches, but we’re not aware of any that would offer new insight into the experiments in this study. If the reviewer has specific suggestions for analytical approaches that we’re not aware of we would happily consider them.

      Reviewer #3 (Public Review):

      Summary:

      The authors aim to demonstrate the effectiveness of their developed methodology, which utilizes super-resolution microscopy and single-molecule tracking in live cells on a high-throughput scale. Their study focuses on measuring the diffusion state of a molecule target, the estrogen receptor, in both ligand-bound and unbound forms in live cells. By showcasing the ability to screen 5067 compounds and measure the diffusive state of the estrogen receptor for each compound in live cells, they illustrate the capability and power of their methodology.

      Strengths:

      Readers are well introduced to the principles in the initial stages of the manuscript with highly convincing video examples. The methods and metrics used (fbound) are robust. The authors demonstrate high reproducibility of their screening method (R2=0.92). They also showcase the great sensitivity of their method in predicting the proliferation/viability state of cells (R2=0.84). The outcome of the screen is sound, with multiple compounds clustering identified in line with known estrogen receptor biology.

      Weaknesses:

• Potential overstatement of the relationship between the low-diffusion state of compound-bound ER and the chromatin state, without any work at the chromatin level.

We appreciate the reviewer’s caution about over-interpreting the relationship between an increase in the slowest diffusing states that we observe by SMT and bona fide engagement with chromatin. In the case of the estrogen receptor, there is strong precedent in the literature showing increases in chromatin binding and chromatin accessibility (as measured by ChIP-seq and ATAC-seq) upon treatment with either estradiol or SERM/Ds. Taken together with the RNA-seq, we felt it reasonable to assume all the trajectories with a diffusion coefficient less than 0.1 µm²/sec were chromatin bound.
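As described, the bound-fraction metric reduces to a simple threshold on per-trajectory diffusion coefficients. A minimal sketch (the function name and example coefficients are ours, purely illustrative; the authors' actual pipeline is not public):

```python
def fraction_bound(diff_coeffs, threshold=0.1):
    """Fraction of trajectories classified as chromatin-bound, i.e. with an
    apparent diffusion coefficient (um^2/s) below the threshold."""
    return sum(1 for d in diff_coeffs if d < threshold) / len(diff_coeffs)

# Hypothetical per-trajectory diffusion coefficients from one well
example = [0.02, 0.05, 0.8, 1.5, 0.01, 3.2, 0.07, 2.1]
print(f"fbound = {fraction_bound(example):.2f}")  # 4 of 8 below 0.1 -> 0.50
```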

      • Could the authors clarify if the identified lead compound effects are novel at any level?

Most of the compounds we characterize in the manuscript have not previously been tested in an SMT assay, but many are known to functionally impact ER or other SHRs based on other biochemical and functional assays. We have not described any completely novel ER-interacting compounds here, but to our knowledge this is the first systematic investigation of a protein showing that both direct and indirect perturbations can be inferred by observing the protein’s motion. Especially for the HSP90 inhibitors, the observation that inhibiting this complex would so dramatically increase ER chromatin binding, as opposed to increasing the speed of the free population, is counterintuitive and novel.

      • More video example cases on the final lead compounds identified would be a good addition to the current data package.

      Reviewer #1 (Recommendations For The Authors):

      General:

      • More information on the microscope setup and analysis procedures should be given. Since custom code is used for automated image registration, spot detection, tracking, and analysis of dynamics, this code should be made publicly available.

      Results:

      • line 97: more details about the robotic system and automatic imaging, imaging modalities, and data analysis procedures should be given directly in the text.

      Additional information added to text and methods

      • line 100: we generated three U2OS cell lines --> how?

      Additional information added to text and methods

      • line 101: ectopically expressing HaloTag fused proteins --> how much overexpression did cells show?

The L30 promoter tends to produce fairly low expression levels. The same approach was used for all ectopic expression plasmids, and for the SHRs the expression levels were all comparable to endogenous levels. We have not checked this for H2B, CaaX, and free Halo, but given that the necessary dye concentration to achieve similar spot densities is within a 10-fold range for all constructs, it’s reasonable to say that those clonal cell lines will also have modest HaloTag expression.

      • line 107: Single-molecule trajectories measured in these cell lines yielded the expected diffusion coefficients --> how was data analysis performed?

      Additional information added to text and methods

      • line 109: how was the localization error determined?

      Additional information added to text and methods

      • line 155: define occupation-weighted average diffusion coefficient.

      Additional information added to text and methods

• line 157: “with 34% bound in basal conditions and 87% bound after estradiol treatment” contradicts figure 2b, where the bound fraction is up to 50% after estradiol treatment.

Line 157 reports the absolute fraction bound; figure 2b shows the change in fbound.

      • line 205: Figure 2c is missing.

      Fixed

      • line 215: within minutes --> how was this data set obtained? which time bins were taken?

      Additional information added to text and methods

• line 216: “with no evidence of transient diffusive states”: What is meant by a transient diffusive state? It seems all time points have a diffusive component, which decreases over time.

      Additional information added to text and methods

The diffusive peak decreases and the bound peak increases, but no other peaks emerge during that time (e.g. neither super-fast nor super-slow).

      • line 225: it seems that fbound of GDC-0810 and GDC-0927 are rather similar in FRAP experiments, please comment, how was FRAP done?

FRAP is described in the Methods section. The curves and recovery times are quite distinct; is the reviewer looking at

      • line 285: reproducibly: how often was this repeated?

      Information added to the manuscript

      • line 285: it would be necessary to name all of the compounds that were tested, e.g. with an ID number in the graph and a table. This also refers to extended data 7 and 8.

      Additional supplemental file with the list of bioactive compounds tested will be included.

      • line 290/1: what is meant by vendor-provided annotation was poorly defined?

Additional information added to text and methods. Specifically, the “other” category is the most common category, and it includes both compounds with unknown targets/functions and compounds whose target and pathway are reasonably well documented. Hence, we applied our own analysis to better understand the list of active compounds.

      Figures:

      • fig. 2-6: detailed statistics are missing (number of measured cells, repetitions, etc.).

      We have added clarifying information, including an “experiment design and sample size” section in the Methods.

      • fig. 3: the authors need to give a list with details about the 5067 compounds tested,

      Additional supplemental file with the list of bioactive compounds tested will be included.

      • extended data 1c: time axis does not correspond to the 1.5s of imaging in the text, results line 127.

      Axes fixed

      • extended data 3: panel c and d are mislabeled.

      Panel labels fixed

      Methods:

      • line 746: HILO microscope: the authors need to explain how they can get such large fields of view using HILO

Additional details added to the materials and methods. The combination of the laser power, the size of the incident beam exiting the fiber-optic coupling device, and the sCMOS camera are the biggest components enabling detection over a larger field of view.

      • line 761: it is common practice to publish the analysis code. Since the authors wrote their own code, they should publish it

Our software contains proprietary information that we cannot yet release publicly. Comparable results can be achieved with HILO data using publicly available tools like utrack. The State Arrays code is distributed, and the parameters used are listed in the M&M.

      Reviewer #2 (Recommendations For The Authors):

      The writing and presentation are coherent, concise, and easy to follow.

      The authors should consider justifying the following:

      Why is 1.5s imaging time selected? Topological and ligand variations may last significantly longer than this. The authors should present at least for one condition the same effect images for longer.

Related to a similar comment above, we added a figure examining the jump-length distribution as a function of frame. Over the 6 seconds of data collection the jump-length distribution is unchanged, suggesting it is reasonable to consider all the trajectories within an FOV as representative of the same underlying dynamical states.

      The authors miss the k test or T test in their graphs.

We chose to apply the Kruskal-Wallis test in the context of the bioactive screen to assess whether a grouping of compounds based on their presumed cellular target was significantly different from the control, even when individual compounds might not by themselves rise to significance. In this case many of the pathway inhibitor effects are subtle and not necessarily obvious in their difference. In the other cases throughout the manuscript, whether two conditions are statistically distinguishable is rarely in question and is of far less importance to the conclusions than the magnitude of the difference. We’ve added statistical tests where appropriate.
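To illustrate why a group-level rank test can detect a subtle but consistent shift that individual compounds would not, here is a pure-Python sketch of the Kruskal-Wallis H statistic. All data values are invented, and a real analysis would use `scipy.stats.kruskal` with its tie correction; this toy version uses midranks and the chi-square approximation for two groups only:

```python
import math
from itertools import chain

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for k independent samples
    (midranks for ties, no tie-correction factor; illustration only)."""
    pooled = sorted(chain.from_iterable(groups))
    # Assign each distinct value the average (mid) rank of its tie block
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n_total = len(pooled)
    h = 0.0
    for g in groups:
        rank_sum = sum(ranks[v] for v in g)
        h += rank_sum ** 2 / len(g)
    return 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)

# Hypothetical delta-fbound values: DMSO controls vs one target class
dmso = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, 0.01, -0.03]
target_class = [0.12, 0.09, 0.15, 0.08, 0.11, 0.13]
h = kruskal_wallis_h(dmso, target_class)
p = math.erfc(math.sqrt(h / 2.0))  # chi-square survival, df = 1 (two groups)
print(f"H = {h:.2f}, p ~ {p:.4f}")
```

The grouped comparison is significant even though any one compound's shift might sit within the control spread.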

      The overall integrated area of Fig 4a appears to reduce upon ligand addition. Data appear normalized but the authors should also add N (number of molecules) on top of the graphs.

While the integrated area may appear to decrease, all State Array analysis is performed by first randomly sampling 10,000 trajectories from the assay well and inferring the state distribution on those 10,000. This has been clarified in the figure legend and in the Methods.
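The fixed-count normalization described here can be sketched as follows (the function and parameter names are ours, not the authors' pipeline; the point is that a constant subsample size makes state-occupation estimates comparable across wells with very different detection counts):

```python
import random

def subsample_trajectories(trajectories, n=10_000, seed=0):
    """Draw a fixed-size random subsample so that state-occupation
    estimates are comparable across wells regardless of detection count."""
    if len(trajectories) <= n:
        return list(trajectories)
    return random.Random(seed).sample(trajectories, n)

# Example: a well with 35,000 detected trajectories (placeholder objects)
well = list(range(35_000))
sample = subsample_trajectories(well)
print(len(sample))  # 10000
```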

      Minor

      Extended Figure 3 legend c, d appear swapped and incorrectly named in the text.

      Panel labels fixed

      L 197 but this appears not to BE a general feature of SHRs (maybe missing Be).

      Error fixed

      L205 authors refer to Figure 2c, which does not exist.

      Panel reference fixed

      Reviewer #3 (Recommendations For The Authors):

      Among minor issues:

      In Figure 1B, if the authors could specify how they discriminate the specific cell lines from the mixed context, it would enhance clarity. Could they perform additional immunofluorescence to understand how the assignment is determined? Alternatively, could they also show the case with isolated cell lines in an unmixed context?

Immunofluorescence would be a challenge given that there is not a good epitope to distinguish the three ectopically-expressed genes from each other, or from the endogenous proteins in the case of H2B and CaaX. We are really reliant on the single-cell dynamics to determine the likely cell identity. That said, we’ve added graphs of a number of individual-cell State Arrays from the same data graphed in 1A, which support the notion that it’s reasonable to assume a cell’s identity given the observed dynamics.

      In Extended Figure 2F: possibly a CHip-Seq experiment would be more directly qualified to state the effect of ER ligand on ER ability to bind chromatin.

This is true. Presumably, ER that is competent at activating transcription of ER-responsive genes is also capable of binding DNA. ChIP would be the more direct measure, but would not address whether the protein is functional. We chose to balance these two aspects of ER biology by pairing dynamics with the end-point transcription readout.

      In Figure 3: A representation with plate-by-plate orientation along the x-axis, with controls included in each plate, would be more appropriate to reflect the consistency of the controls used in the assay across different plates. Currently, all controls are pooled in one location, and we cannot appreciate how the controls vary from plate to plate.

      Figure added to the supplement

      Also in this figure, a general workflow of the screen down to segmentation/analysis would be a great add-on.

      New figure added to the supplement and reflected in the textual description of the platform

      In Extended Figures 3B and C an add-on of the positive and negative control would make the figure more convincing.

      Addressed as part of figure added to the supplement

      Is there any description of compound leads identified that is novel in nature in relation to impact on ER, and if so could it be stated more clearly in the text as novel finding?

To our knowledge, neither the impact of HSP90 inhibition in increasing ER-chromatin association nor the link between inhibition of post-translational modifying enzymes like the CDKs or mTOR and ER dynamics has ever been described. We added clarifying text to the manuscript.


    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this important paper, Blin and colleagues develop a high-throughput behavioral assay to test spontaneous swimming and olfactory preference in individual Mexican cavefish larvae. The authors present compelling evidence that the surface and cave morphs of the fish show different olfactory preferences and odor sensitivities and that individual fish show substantial variability in their spontaneous activity that is relevant for olfactory behaviour. The paper will be of interest to neurobiologists working on the evolution of behaviour, olfaction, and the individuality of behaviour.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

The authors posed a research question about how an animal integrates sensory information to optimize its behavioral outputs and how this process evolved. Their data (behavioral output analysis with detailed categories in response to different odors at different concentrations, comparing surface and cave populations and their hybrids) partially answer this tough question. They built a new low-disturbance system to answer the question. They also found that the personality of individual fish is a good predictor of behavioral outputs in the odor response. They concluded that cavefish evolved to specialize their response to alanine and histidine while surface fish are more general responders, which was supported by their data.

      Strengths:

With their new system, the authors could generate clearer results without mechanical disturbances. The authors characterize multiple measurements to score the odor-response behaviors, and also introduce a new personality analysis. Their conclusion that cavefish evolved as specialists for sensing alanine and histidine among the 6 tested amino acids was well supported by their data.

      Weaknesses:

The authors posed a big research question: How do animals evolve the processes of sensory integration to optimize their behavioral outputs? I personally feel that, to answer questions about how sensory integration generates proper (evolved) behavior, the authors at least need to show the ecological relevance of the response. For the alanine/histidine preference in cavefish, they need data on the alanine and other amino acid concentrations in the local cave water, compared with those of surface water.

      We agree with the reviewer. This is why, in the Discussion section, we had written: “…Such significant variations in odor preferences or value may be adaptive and relate to the differences in the environmental and ecological conditions in which these different animals live. However, the reason why Pachón cavefish have become “alanine specialists” remains a mystery and prompts analysis of the chemical ecology of their natural habitat. Of note, we have not found an odor that would be repulsive for Astyanax so far, and this may relate to their opportunist, omnivorous and detritivore regime (Espinasa et al., 2017; Marandel et al., 2020).” This is also why we currently develop field work projects aimed at clarifying this question. However, such experiments and analyses are challenging, practically and technically. We hope we can reach some conclusions in the future.

To complete the discussion we have also added an important hypothesis: “Alternatively, specialization for alanine may not need to be specific for an olfactory cue present only, or frequently, or in high amounts in caves. Bat guano for example, which is probably the main source of food in the Pachón cave, must contain many amino acids. Enhanced recognition of one of them (in the present case alanine, but evolution may have randomly acted for enhanced recognition of another amino acid) should suffice to confer cavefish with augmented sensitivity to their main source of nutriment.”

Also, as for "personality matters", I read that personality explains a large part of the variation in surface fish, and that thigmotactic or wall-following cavefish individuals respond better to odorants than circling and random-swimming cavefish individuals. However, I failed to understand the authors' point about what percentage of the odorant-response variation is explained (PVE) by personality. The association (= correlation) was good to show, as the authors presented, but showing the proper PVE or the effect size of personality in predicting the behavioral outputs is important to conclude that "personality matters"; otherwise, the conclusion is not well supported.

From the above, I recommend the authors reconsider the title and also their research questions. At this moment, I feel that the authors' conclusions and their research questions are somewhat exaggerated, with insufficient supporting evidence.

Thank you for this interesting suggestion, which we have fully taken into consideration. We have now calculated and plotted the PVE (the percentage of variation explained in the olfactory score) as a function of swimming speed or of swimming pattern. The results are shown in the modified Figure 8 of our revised ms, and they suggest that personality (here, swimming pattern or swimming speed) indeed predicts olfactory response skills. Therefore, we would like to keep our title, as we provide support for the fact that “personality matters”.
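For readers unfamiliar with the metric, the PVE discussed here is the R² of a regression of olfactory score on the behavioral covariate, expressed as a percentage. A minimal sketch with invented per-fish data (a simple least-squares line; the actual analysis in the revised manuscript may use a different model):

```python
def pve_linear(x, y):
    """Percentage of variance in y explained by a least-squares line on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 100.0 * (1.0 - ss_res / ss_tot)

# Invented example: swimming speed (mm/s) vs olfactory score, one pair per fish
speed = [1.2, 0.8, 2.5, 3.1, 0.5, 1.9, 2.8, 0.9]
score = [0.4, 0.2, 0.9, 1.1, 0.1, 0.7, 1.0, 0.3]
print(f"PVE = {pve_linear(speed, score):.1f}%")
```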

      Also, for the statistical method, Fisher's exact test is not appropriate for the compositional data (such as Figure 2B). The authors may quickly check it at https://en.wikipedia.org/wiki/Compositional_data or https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-042720-124436.

The authors may want to use a centered log-ratio transformation or other appropriate transformations (an R package could be: https://doi.org/10.1016/j.cageo.2006.11.017). After changing the statistical tests, the authors' conclusions may no longer be supported.

Actually, in most cases the distributions are so different (as seen by the completely different colors in the distribution graphs) that there is little doubt that swimming behaviors are indeed different between surface fish and cavefish, or between ‘before’ and ‘after’ odor stimulation. However, it is true that Fisher’s exact test is not fully appropriate because the data can be considered compositional. For this kind of data, centered log-ratio transformations have been suggested. However, our dataset contains many zeros, and this is a case that log transformations have difficulty handling.

To help us deal with our data, the reviewer proposed the paper by Greenacre (2021) (https://www.annualreviews.org/doi/pdf/10.1146/annurev-statistics-042720-124436). In his paper, Greenacre clearly wrote: "Zeros in compositional data are the Achilles heel of the logratio approach (LRA)."

Therefore, we have now tested our data using CA (Correspondence Analysis), which can deal with tables containing many zeros and is a reliable alternative to LRA (Cook-Thibeau, 2021; Greenacre, 2011).

The results of the CA analysis are shown in Supplemental Figure 8, and they fully confirm the difference in baseline swimming patterns between morphs, as well as the changes (or absence of changes) in behavioral patterns after odor stimulation suggested by the colored bar plots in the main figures, with confidence ellipses overlapping or not, depending on the case. Therefore, the CA method fully confirms and even strengthens our initial interpretations.

Finally, we have kept our initial graphical representation in the ms (color-coded bar plots; the complete color code is now given in Suppl. Fig. 7), and the CA results are shown in Suppl. Figure 8 and referenced in the text.

      Reviewer #2 (Public Review):

      In their submitted manuscript, Blin et al. describe differences in the olfactory-driven behaviors of river-dwelling surface forms and cave-dwelling blind forms of the Mexican tetra, Astyanax mexicanus. They provide a dataset of unprecedented detail, that compares not only the behaviors of the two morphs but also that of a significant number of F2 hybrids, therefore also demonstrating that many of the differences observed between the two populations have a clear (and probably relatively simple) genetic underpinning.

      To complete the monumental task of behaviorally testing 425 six-week-old Astyanax larvae, the authors created a setup that allows for the simultaneous behavioral monitoring of multiple larvae and the infusion of different odorants without introducing physical perturbations into the system, thus biasing the responses of cavefish that are particularly fine-tuned for this sensory modality. During the optimization of their protocol, the authors also found that for cave-dwelling forms one hour of habituation was insufficient and a full 24 hours were necessary to allow them to revert to their natural behavior. It is also noteworthy that this extremely large dataset can help us see that population averages of different morphs can mask quite significant variations in individual behaviors.

Testing with different amino acids (applied as relevant food-related odorant cues) shows that cavefish are alanine and histidine specialists, while surface fish show the strongest behavioral responses to cysteine. It is interesting that the two forms also react differently after odor detection: while cave-dwelling fish decrease their locomotor activity, surface fish increase it. These differences are probably related to the different foraging strategies used by the two populations, although, as the observations were made in the dark, it would also be interesting to see whether surface fish show the same changes in the light.

      Thank you for these nice comments.

      Further work will be needed to pinpoint the exact nature of the genetic changes that underlie the differences between the two forms. Such experimental work will also reveal how natural selection acted on existing behavioral variations already present in the SF population.

      Yes. Searching for genetic underpinnings of the sensory-driven behavioral differences is our current endeavor through a QTL study and we should be able to report it in the near future.

It will be equally interesting, however, to understand what lies behind the large individual variation of behaviors observed in both the surface and cave populations. Are these differences purely genetic, or do environmental cues also contribute to their development? Does stochasticity provided by the developmental process also have a role in this? Answering these questions will reveal whether the evolvability of Astyanax behavior was an important factor in the repeated successful colonization of underground caves.

      Yes. We will also access (at least partially) responses to most of these questions in our current QTL study.

      Reviewer #3 (Public Review):

      Summary:

The paper explores chemosensory behaviour in surface and cave morphs and F2 hybrids of the Mexican cavefish Astyanax mexicanus. The authors develop a new behavioural assay for the long-term imaging of individual fish in a parallel high-throughput setup. The authors first demonstrate that the different morphs show different basal exploratory swimming patterns and that these patterns are stable for individual fish. Next, the authors test the attraction of fish to various concentrations of alanine and other amino acids. They find that the cave morph is a lot more sensitive to chemicals and shows directional chemotaxis along a diffusion gradient of amino acids. Surface fish, although they can detect the chemicals, do not show marked chemotaxis behaviour and have an overall lower sensitivity. These differences have been reported previously, but the authors report longer-term observations on many individual fish of both morphs and their F2 hybrids. The data also indicate that the observed behavior is a quantitative genetic trait. The approach presented will allow the mapping of the genes contributing to these traits. The work will be of general interest to behavioural neuroscientists and those interested in olfactory behaviours and the individual variability of behavioural patterns.

      Strengths:

      A particular strength of this paper is the development of a new and improved setup for the behavioural imaging of individual fish for extended periods and under chemosensory stimulation. The authors show that cavefish need up to 24 h of habituation to display a behavioural pattern that is consistent and unlikely to be due to the stressed state of the animals. The setup also uses relatively large tanks that allow the build-up of chemical gradients that are apparently present for at least 30 min.

      The paper is well written, and the presentation of the data and the analyses are clear and to a high standard.

      Thank you for these nice comments.

      Weaknesses:

      One point that would benefit from some clarification or additional experiments is the diffusion of chemicals within the behavioural chamber. The behavioural data suggest that the chemical gradient is stable for up to 30 min, which is quite surprising. It would be great if the authors could quantify e.g. by the use of a dye the diffusion and stability of chemical gradients.

      OK. We had tested the diffusion of dyes in our previous setup and we also did in the present one (not shown). We think that, due to differences of molecular weight and hydrophobicity between the tested dyes and the amino acid molecules we are using, their diffusion does not constitute a proper read-out of actual amino acid diffusion. We anticipate that amino acid diffusion is extremely complex in the test box, possibly with odor plumes diffusing and evolving in non-gradient patterns, in the 3 dimensions of the box, and potentially further modified by the fish swimming through it, the flow coming from the opposite water injection side and the borders of the box. This is the reason why we have designed the assay with contrasting “odor side” and “water control side”. Moreover, our question here is not to determine the exact concentration of amino acid to which the fish respond, but to compare the responses in cavefish, surface fish and F2 hybrids. Finally and importantly, we have performed dose/response experiments whereby varying concentrations have been presented for 3 of the 6 amino acids tested, and these experiments clearly show a difference in the threshold of response of the different morphs.
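      As a back-of-the-envelope illustration (ours, not a calculation from the manuscript), molecular diffusion alone is far too slow to homogenize a box of this size within 30 min, which is consistent with a gradient persisting over the assay. The diffusivity and length below are assumed typical values, not measurements from the study:

```python
# Characteristic 1-D diffusion timescale t ~ L^2 / (2D).
# D and L are illustrative assumptions, not values from the study.
D = 1e-9                      # m^2/s, typical small-molecule diffusivity in water
L = 0.05                      # m, assumed distance from injection side to box center
t_seconds = L**2 / (2 * D)
print(t_seconds / 3600)       # ~347 hours: diffusive mixing is far slower than 30 min
```

On this estimate, any gradient erasure within the 30 min assay would be dominated by advection (injection flow, fish movement) rather than diffusion, in line with the authors' point that the plume dynamics are complex.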

      The paper starts with a statement that reflects a simplified input-output (sensory-motor) view of the organisation of nervous systems. "Their brains perceive the external world via their sensory systems, compute information and generate appropriate behavioral outputs." The authors' data also clearly show that this is a biased perspective. There is a lot of spontaneous organised activity even in fish that are not exposed to sensory stimulation. This sentence should be reworded, e.g. "The nervous system generates autonomous activity that is modified by sensory systems to adapt the behavioural pattern to the external world." or something along these lines.

      Done

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In addition to my comments in the "weakness" section above, here are my other comments.

      How many times fish were repeatedly assayed, and in what order (alanine followed by cysteine, etc.), is not clear (Pg 24, Materials and Methods). I am concerned that fish may memorize the prior experience and respond better or worse to the higher concentration of alanine, etc. Please clarify this point.

      Many fish were indeed tested in different conditions on consecutive days. Most often, control experiments (e.g., water/nothing; water/water; nothing/nothing) were followed by odor testing. In such cases, there is no risk that fish memorize prior experience and that such previous experience interferes with the response to odor. In other instances, fish were tested with a low concentration of one amino acid, followed by a high concentration of another amino acid, which is also on the safe side. Of note, on consecutive days, the odors were always perfused on alternate sides of the test box, to avoid the possibility of spatial memory. Finally, in the few cases where increasing concentrations of the same amino acid were perfused consecutively, 1) they were perfused on alternate sides, 2) if a fish does not detect a low concentration below threshold / does not respond, then prior experience should not interfere with responding to higher concentrations, and 3) we have evidence (unpublished, current studies) that when a fish is given increasing concentrations of the same amino acid above detection threshold, the behavioral response is stable and reproducible (e.g., it does not decrease or increase).

      Minor points:

      Thigmotaxis and wall following.

      Classically, thigmotaxis and wall following are treated as the same (Sharma et al., 2009; https://pubmed.ncbi.nlm.nih.gov/19093125/), but the authors distinguish thigmotaxis along the X-axis and Y-axis because fish repeatedly swam back and forth along the x-axis or y-axis wall. I understand the authors' point in discriminating WF and T, but please present them with more explanation (what the differences between them are) in the introduction and results sections.

      Done

      Pg5 "genetic architecture" in the introduction.

      "Genetic architecture" analysis requires a more genomic survey, such as GWAS, QTL mapping, or Hi-C. Phenotype differences in the F2 generation can be stated as "genetic factor(s)", "genetic component(s)", etc. Please revise.

      Done

      Pg10 Regarding the serine treatment, the authors conclude that "...suggesting that their detection threshold for serine is lower than for alanine." I believe that the 'threshold for serine is higher' according to the authors' data. Their threshold-related statement is correct on Pg21: "as SF olfactory concentration detection thresholds are higher than CF,..." So the statement on page 10 is just a mistake, I think. Please revise.

      Done (mistake indeed)

      Pg11 After explaining Fig5, the statement "In sum, the responses of the different fish types to different concentrations of different amino acids were diverse and may reflect complex, case-by-case, behavioral outputs" does not convey any information. Please revise.

      OK. Done: “In sum, the different fish types show diverse responses to different concentrations of different amino acids.”

      For the personality analysis (Fig 7)

      The index value needs more explanation. I read the Materials and Methods three times but am still confused. From the equation, the index does not seem to exceed 1.0, unless the "before score" was negative and the "after score" was positive. I could not understand why the authors set a score of 1.5 as the threshold for the cumulative score of these different behavior index values (= individual score). Please provide more description. Currently, I am skeptical about this index value in Fig 7.

      Done, in results and methods.

      Pg15 the discussion section

      Please discuss the difference between the authors' findings (cavefish respond to 10^-4M for position and surface fish respond to 10^-4M for thig-Y; Fig 4AB) and those in Hinaux et al. 2016 (cavefish responded to 10^-10M alanine but surface fish responded to 10^-5M or higher). It seems that surface fish can respond to low concentrations of alanine as cavefish do, which contradicts the finding in Hinaux 2016.

      The increase in NbrtY at the population level for surface fish with 10^-4M alanine (~10^-6M in box) was most probably due to only a few individuals. Contrary to cavefish, all other parameters were unchanged in surface fish at this concentration. Moreover, at the individual level, only 3.2% of surface fish had significant olfactory scores (to be compared to 81.3% for cavefish). Thus, we think that globally this result does not contradict our previous findings in Hinaux et al (2016), and solely represents the natural, unexplained variation inherent to the analysis of complex animal behaviors – even when we attempt to use the highest standards of controlled conditions.

      Of note, in the revised version, we have now included a full dose/response analysis for alanine concentrations ranging from 10^-2M to 10^-10M, on cavefish. Alanine 10^-5M has significant effects (now shown in Suppl Fig2 and indicated in text; a column has been added for 10^-5M in Summary Table 1). Lower concentrations have milder effects (described in text) but confirm the very low detection threshold of cavefish for this amino acid.

      Pg19, "In sum, CF foraging strategy has evolved in response to the serious challenge of finding food in the dark"

      My point is the same as explained in the 'weakness' section above: how this behavior is effective in the cave life, if they conclude so? Please explain or revise this statement.

      The present manuscript reports on experiments performed in “artificial” and controlled laboratory conditions. We are fully aware that these conditions are probably only distantly related to conditions encountered in the wild. Note that we had written in the original version (page 20) “…for 6-week old juveniles in a rectangular box - but the link may be more elusive when considering a fish swimming in a natural, complex environment.” As the reviewer may know, we also perform field studies in a more ethological approach to animal behaviors, thus we may be able to discuss this point more accurately in the future.

      Pg20 "To our knowledge, this is the first time individual variations are taken into consideration in Astyanax behavioral studies."

      This is wrong. Please see Fernandes et al., 2022. (https://pubmed.ncbi.nlm.nih.gov/36575431/).

      OK. The sentence is wrong if taken in its absolute sense, i.e., considering inter-individual variations of a given parameter (e.g., number of neuromasts per individual or number of approaches to a vibrating rod in Fernandes et al., 2022). In this same sense, past Astyanax QTL studies on behaviors also took into account variations among F2 individuals. Here, we wanted to stress that personality was taken into consideration. The sentence has been changed: “To our knowledge, this is the first time individual temperament is taken into consideration in Astyanax behavioral studies.”

      Figure 2B and others.

      The order of categories (R, R-TX, etc.) should match across all columns (SF, F2, and CF). Currently, the category orders seem random, or the larger-ratio categories are placed at the bottom, which makes it quite difficult to compare SF, F2, and CF. Also, the lettering in Fig 2A (times, Y-axis labels, etc.) and the bar-graph lettering in Fig 2B, Fig 3B, 4H, 5GN, 6EFG are quite difficult to read. Also, there is no need to show fish IDs in Fig 2C in the current way; instead, if the authors want to show fish ID numbers in the boxplots, identify in Fig 2C the data points of the fish shown in Fig 2D (SF#40, CF#65, and F2#26). Fish ID numbers in the other boxplot figures are recommended to be removed too.

      We have thought a lot about how to best represent the distributions of swimming patterns in graphs such as Fig 2B and others. The difficulty is due to the existence of many combinations (33 possibilities in total, see new Suppl Fig7), which are never the same in different plots/conditions because the individual fish tested are different. We decided that the best way was to represent, from bottom to top, the most used to the least used swimming patterns, and to use a color code that best matches the different combinations. It was impossible to give the full color code on each figure, therefore it was simplified, and we believe that the results are well conveyed on the graphs. We would like to keep it as it is. To respond (partially) to the reviewer’s concern, we have now added a full color code description in a new Supplemental Figure 7 (associated to Methods).

      Size of lettering has been modified in all pattern graphs like Fig2A. Thanks for the suggestion, it reads better now.

      Finally, we would like to keep the fish ID numbers because this contributes to conveying the message of the paper, that individuality matters.

      Raw data files were not easy to read in Excel or LibreOffice. Please convert them into the csv format to support the rigor of the authors' conclusions.

      We do not fully understand this request. Our very large dataset must be analysed with R, not Excel, for stats, plotting, and pattern analysis. However, the raw data files can be opened in Excel with format conversion.

      Reviewer #2 (Recommendations For The Authors):

      I think most of the experimental procedures (with few exceptions, see below) are well-defined and nicely described, so the majority of my suggestions will be related to the visualization of the data. I think the authors have done a great job in presenting this complex dataset, but there are still some smaller tweaks that could be used to increase the legibility of the presented data.

      First and perhaps foremost, a better definition of the swimming pattern subsets is needed. I have no problem understanding the main behavioral types, but whereas the color codes for these suggest that there is continuous variance within each pattern, it is not clear (at least to me) what particular aspect(s) of the behaviors vary. Also, whereas the sidebars/legends suggest a continuum within these behaviors, the bar charts themselves clearly present binned data. I did not find a detailed description of how the binning was done. As this has been, according to the Methods section, a manual process, more clarity about the details of the binning would be welcome. I would suggest using binned color codes for the legends as well.

      Done, in Results and Methods. We hope it is now clear that there is no “continuum”, but rather multiple combinations of discrete swimming patterns. The gradient aspect of the color code in figures has been removed to avoid suggesting a continuum. According to the chosen color code, WF is in red, R in blue, T in yellow and C in green. Combinations are then represented by colors in between; for example, R+WF is purple. We have now added a full color code description for the swimming patterns and their combinations in a new Supplemental Figure 7 (associated to Methods).

      Also, to better explain the definition of the swimming patterns and the graphical representation, it now reads (in Methods):

      “The determination of baseline swimming patterns and swimming patterns after odor injection was performed manually based on graphical representations such as in Figure 2A or Figure 3A. Four distinctive baseline behaviors clearly emerged: random swim (R; defined as haphazard swimming with no clear pattern, covering entirely or partly the surface of the arena), wall following (WF; defined as the fish continuously following along the 4 sides of the box and turning around it, in a clockwise or counterclockwise fashion), large or small circles (C; self-explanatory), and thigmotactism (T, along the X- or the Y-axis of the box; defined as the fish swimming back and forth along one of the 4 sides of the box). On graphical representations of swimming pattern distributions, we used the following color code: R in blue, WF in red, C in green, T in yellow. Of note, many fish swam according to combination(s) of these four elementary swimming patterns (see descriptions in the legends of Supplemental figures, showing many examples). To fully represent the diversity and the combinations of swimming patterns used by individual fish, we used an additional color code derived from the “basic” color code described above, where, for example, R+WF is purple. The complete combinatorial color code is shown in Suppl. Fig7.”

      It would be also easier to comprehend the stacked bar charts, presenting the particular swimming patterns in each population, if the order of different swimming patterns was the same for all the plots (e.g. the frequency of WF always presented at the bottom, R on the top, and C and T in the middle). This would bring consistency and would highlight existing differences between SF, CF, and F2s. Furthermore, such a change would also make it much easier to see (and compare) shifts in behaviors.

      We have thought a lot about how to best represent the distributions of swimming patterns in graphs such as Fig 2B and others. The difficulty is due to the existence of many combinations, which are never the same in different plots/conditions because the individual fish tested are different. We decided to keep it as it currently stands, because we think re-doing all the graphs and figures would not significantly improve the representation. In fact, we think that the differences between morphs (dominant blue in SF, dominant red in CF) and between conditions (bar charts next to each other) are easy to interpret at first glance in the vast majority of cases. Moreover, they are now complemented by CA analyses (Suppl Figure 8).

      While the color coding of the timeline in the "3D" plots presented for individual animals is a nice feature, at the moment it is slightly confusing, as the authors use the same color palette as for the stacked bar charts, representing the proportionality of the particular swimming patterns. As the y-axis is already representing "time" here, the color coding is not even really necessary. If the authors would like to use a color scheme for aesthetic reasons, I would suggest using another palette, such as "grey" or "viridis".

      We would like to keep the graphical aspect of our figures as they are, for aesthetic reasons. To avoid confusion with stacked bar chart color code, we have added a sentence in Methods and in the legend of Figure 2, where the colors first appear:

      “The complete combinatorial color code is shown in Suppl. Figure 7. Of note, in all figures, the swimming pattern color code does not relate whatsoever with the time color code used in the 2D plus time representation of swimming tracks such as in Figure 2A”.

      I would also suggest changing the boxplots to violin-plots. Figure 7 clearly shows bimodality for F2 scores (something, as the authors themselves note, not entirely surprising given the probably polygenic nature of the trait), but looking at SF and CF scores I think there are also clear hints of non-normal distributions. If non-normal distribution of traits is the norm, violin-plots would capture the variance in the data in a more digestible way. (The existence of differently behaving cohorts within the population of both SF and CF forms would also help to highlight the large pre-existing variance, something that was probably exploited by natural selection as well, as mentioned briefly in the Discussion by the authors, too.)

      The bimodal distribution of scores shown by F2s in Figure 7B is indeed probably due to the polygenic nature of the trait. However, such a distribution is rather the exception than the norm. Moreover, the boxplot representations we have used throughout the figures include all the individual points, and outliers can be identified as they have the fish ID number next to them. This allows the reader to grasp the variance of the data. Again, redoing all graphs and figures would constitute a lot of work for little gain in terms of conveying the results. Therefore, we chose not to change the boxplots to violin plots.

      The summary data of individual scores in Table 1B show some intriguing patterns that warrant a bit of further discussion, in my opinion. For example, we can see opposite trends in the scores of SF and CF forms with increasing alanine concentration. Is there an easy explanation for this? Also, in the case of serine, the CF scores do not seem to respond in a dose-dependent manner, and puzzlingly, at 10^(-3)M serine concentration F2 scores are above those of both grandparental populations.

      That is true. However, we have no simple explanation for this. To begin responding to this question, we have now performed full dose/response experiments for alanine (concentrations from 10^-2M to 10^-10M tested on cavefish; these confirm that CF are bona fide “alanine specialists”) and for serine (10^-2M to 10^-4M tested on both morphs; these confirm that both morphs respond well to this amino acid). These complementary results are now included in the text and figures (partially) and in Summary Table 1.

      If anything is known about this, I would also welcome some discussion on how thigmotactic behavior, a marker of stress in SF, could have evolved to become the normal behavior of CF forms, with lower cortisol levels and, therefore lower anxiety.

      We actually think thigmotactism is a marker of stress in both morphs. See Pierre et al, JEB 2020, Figure S3A: in both SF and CF, thigmotaxis behavior decreases after long habituation times. In our hands, the only difference between the two morphs is that surface fish (at 5 months of age) express stress by thigmotactism but also by freezing and rapid erratic movements, while cavefish have a more restricted stress repertoire.

      This is why in the present paper we have carefully made the distinction between thigmotactism (= possible stress readout) and wall following (= exploratory behavior). Our finding that WF and large circles confer better olfactory response scores on cavefish strongly supports the different nature of these two swimming patterns. Then, why is swimming along the 4 walls of a tank fundamentally different from swimming along one wall? The question remains open, although the number of changes of direction is probably an important parameter: in WF the fish always swims forward in the same direction, while in T the fish constantly changes direction when reaching the corner of the tank – which is similar to the erratic swim of stressed surface fish.

      Finally two smaller suggestions:

      • When referring to multiple panels on the same figure it would be better to format the reference as "Figure 4D-G" instead of "Figure 4DEFG";

      Done

      • On page 4, where the introduction reads as "although adults have a similar olfactory rosette with 2025 lamellae", in my opinion, it would be better to state that "while adults of the two forms have a similar olfactory rosette with 20-25 lamellae".

      Done

      Reviewer #3 (Recommendations For The Authors):

      Consider moving Figure 3 to be a supplement of Figure 4. This figure shows a water control and therefore best supplements the alanine experiment.

      We would like to keep this figure as a main figure: we consider it very important to establish the validity of our behavioral setup at the beginning of the manuscript, and to show that in all the following figures we are recording bona fide olfactory responses.

      "sensory changes in mecano-sensory and gustatory systems " - mechano-sensory.

      Done

      Figure 2 legend: "(3) the right track is the 3D plus time (color-coded)" - shouldn't it be 2D plus time or 3D (x,y, time).

      True! Thanks for noting this, corrected.

      Figure 4 legend "E, Change in swimming patterns" should be H.

      Done

      "suggesting that their detection threshold for serine is lower than for alanine" - higher?

      Done

      In the behavioural plots, I assume that the "mean position" value represents the mean position along the X-axis of the chamber - this should be clarified and the axis label updated accordingly.

      That is correct and has been updated in Methods and Figures and legends.

      "speed, back and forth trips in X and Y, position and pattern changes (see Methods; Figure 7A)." - here it would be helpful to add an explanation like "to define an olfactory score for individual fish."

      This has been changed in Results and more detailed explanations on score calculations are now given in Methods.

      "possess enhanced mecanosensory lateral line" - mechanosensory.

      Done

    1. Author response:

      We would like to thank the eLife Editors and Reviewers for their positive assessment and constructive comments, and for the opportunity to revise our manuscript. We greatly appreciate the Reviewers’ recommendations and believe that they will further improve our manuscript.

      In revising the manuscript, our primary focus will be enhancing the clarity surrounding testing procedures and addressing corrections for multiple comparisons. Additionally, we intend to offer more explicit information about the statistical tests employed, along with the details about the number of models/comparisons for each test. We will also include an extended discussion on potential limitations of the dopaminergic receptor mapping methods used, addressing the Reviewers’ comments relating to the quality of PET imaging with different dopaminergic tracers in mesiotemporal regions such as the hippocampus. While the code used for connectopic mapping is publicly available through the ConGrads toolbox, we will provide the additional code we have used for data processing and analysis, visualization of hippocampal gradients, and the cortical projections. The data used in the current study is not publicly available due to ethical considerations concerning data sharing, but can be shared upon reasonable request from the senior author. Additional plans include clarifying and discussing which findings were successfully replicated, and addressing Reviewers’ suggestions for using other openly available cohorts for replication, and implementing alternative coordinate systems to quantify connectivity change along gradients.

    2. eLife assessment

      This fundamental work demonstrates the importance of considering overlapping modes of functional organization (i.e. gradients) in the hippocampus, showing associations with aging, dopaminergic receptor distribution, and episodic memory. The evidence supporting the conclusions is solid, although some clarifications about testing procedures and a discussion of the limitations of the dopaminergic receptor mapping techniques employed should be provided, along with analysis code. The work will be of broad interest to basic and clinical neuroscientists.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper derives the first three functional gradients in the left and right hippocampus across two datasets. These gradient maps are then compared to dopamine receptor maps obtained with PET, associated with age, and linked to memory. Results reveal links between dopamine maps and gradient 2, age with gradients 1 and 2, and memory performance.

      Strengths:

      This paper investigates how hippocampal gradients relate to aging, memory, and dopamine receptors, which are interesting and important questions. A strength of the paper is that some of the findings were replicated in a separate sample.

      Weaknesses

      The paper would benefit from added clarification on the number of models/comparisons for each test. Furthermore, it would be helpful to clarify whether or not multiple comparison correction was performed and, if so, what type, or, if not, to provide a justification. The manuscript would furthermore benefit from code sharing and from clarifying which results did and did not replicate.

    1. Author response:

      Reviewer #1 (Public Review):

      This study makes a substantial contribution to our understanding of the molecular evolutionary dynamics of microbial genomes by proposing a model that incorporates relatively frequent adaptive reversion mutations. In many ways, this makes sense from my own experience with evolutionary genomic data of microbes, where reversions are surprisingly familiar as evidence of the immense power of selection in large populations.

      One criticism is the reliance on one major data set of B. fragilis to test fits of these models, but this is relatively minor in my opinion and can be caveated by discussion of other relevant datasets for parallel investigation.

      We analyze data from 10 species of the Bacteroidales family and compare them to a dataset of Bacteroides fragilis. We have now added a reference to a recent manuscript from our group, showing phenotypic alteration by reversion of a stop codon and further breaking of the same pathway through stop codons in other genes in Burkholderia dolosa, on page 9, and have added a new analysis of codon usage in support of the reversion model on page 14.

      We have chosen not to analyze other species as there are no large data sets with rigorous and evenly-applied quality control across scales. We anticipate the reversion model would be able to fit the data in these cases. We now note that this work remains to be done in the discussion.

      Another point is that this problem isn't as new as the manuscript indicates, see for example https://journals.asm.org/doi/10.1128/aem.02002-20 .

      Loo et al. put forward an explanation similar to the purifying model proposed by Rocha et al, which we refute here. Quoting from Loo et al.: “Our results confirm the observation that nonsynonymous SNPs are relatively elevated under shorter time periods and that purifying selection is more apparent over longer periods or during transmission.” While there is some linguistic similarity between the weak purifying model and our model of strong local adaptation and strong adaptive reversion, we believe that the dynamical and predictive implications of the reversion model represent an important conceptual leap and correction to the literature. We now cite Loo et al. and additional works cited therein. We have updated the abstract, introduction, and discussion to further emphasize the distinction of the reversion model from previous models: namely, the implication of the reversion model that long-timescale dN/dS hides dynamics.

      Nonetheless, the paper succeeds by both developing theory and offering concrete parameters to illustrate the magnitudes of the problems that distinguish competing ideas, for example, the risk of mutational load posed in the absence of frequent back mutation.

      Reviewer #2 (Public Review):

      This manuscript asks how different forms of selection affect the patterns of genetic diversity in microbial populations. One popular metric used to infer signatures of selection is dN/dS, the ratio of nonsynonymous to synonymous distances between two genomes. Previous observations across many bacterial species have found dN/dS decreases with dS, which is a proxy for the divergence time. The most common interpretation of this pattern was proposed by Rocha et al. (2006), who suggested the excess in nonsynonymous mutations on short divergence times represent transient deleterious mutations that have not yet been purged by selection.

      In this study, the authors propose an alternative model based on the population structure of human gut bacteria, in which dN is dominated by selective sweeps of SNPs that revert previous mutations within local populations. The authors argue that contrary to standard population genetics models, which are based on the population dynamics of large eukaryotes, the large populations in the human gut mean that reversions may be quite common and may have a large impact on evolutionary dynamics. They show that such a model can fit the decrease of dN/dS in time at least as well as the purifying selection model.
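      To make the qualitative claim concrete, here is a minimal numerical sketch (our toy parameterization, not the authors' fitted model; all parameter values are illustrative) in which nonsynonymous divergence from transient adaptive sweeps saturates once sweeps are reverted, while synonymous divergence keeps accumulating, so dN/dS falls with divergence time:

```python
import math

def dnds_reversion(t, mu_s=1.0, mu_n_perm=0.05, n_loci=10.0, t_adapt=5.0):
    """Toy reversion model of dN/dS decay with divergence time t.

    Synonymous divergence grows linearly (mu_s * t). Nonsynonymous
    divergence has a small permanent component plus a transient one
    from adaptive sweeps that are later reverted, saturating near n_loci.
    Parameter values are illustrative, not fitted to any dataset.
    """
    ds = mu_s * t
    dn = mu_n_perm * t + n_loci * (1.0 - math.exp(-t / t_adapt))
    return dn / ds

# dN/dS is inflated at short timescales and decays toward mu_n_perm / mu_s
short, long_ = dnds_reversion(1.0), dnds_reversion(1000.0)
```

The point of the sketch is only that a purely adaptive process with reversions reproduces the observed monotone decay of dN/dS with dS, without invoking transient deleterious mutations.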

      Strengths

      The main strength of the manuscript is to show that adaptive sweeps in gut microbial populations can lead to small dN/dS. While previous work has shown that using dN/dS to infer the strength of selection within a population is problematic (see Kryazhimskiy and Plotkin, 2008, cited in the paper) the particular mechanism proposed by the authors is new to my knowledge. In addition, despite the known caveats, dN/dS values are still routinely reported in studies of microbial evolution, and so their interpretation should be of considerable interest to the community.

      The authors provide compelling justification for the importance of adaptive reversions and make a good case that these need to be carefully considered by future studies of microbial evolution. The authors show that their model can fit the data as well as the standard model based on purifying selection and the parameters they infer appear to be plausible given known data. More generally, I found the discussion on the implications of traditional population genetics models in the context of human gut bacteria to be a valuable contribution of the paper.

      Thank you for the kind words and appreciation of the manuscript.

      Weaknesses

      The authors argue that the purifying selection model would predict a gradual loss in fitness via Muller's ratchet. This is true if recombination is ignored, but this assumption is inconsistent with the data from Garud, et al. (2019) cited in the manuscript, who showed a significant decrease in linkage in the bacteria also used in this study.

      We now investigate the effect of recombination on the purifying selection model on page 8 and in Supplementary Figure S6. In short, we show that reasonable levels of recombination (obtained from literature r/m values) cannot rescue the purifying selection model from Muller’s ratchet when s is so low and the influx of new deleterious mutations is so high. We thank the reviewers for prompting this improvement.

      I also found that the data analysis part of the paper added little new to what was previously known. Most of the data comes directly from the Garud et al. study and the analysis is very similar as well. Even if other appropriate data may not currently be available, I feel that more could be done to test specific predictions of the model with more careful analysis.

      In addition to new analyses regarding recombination and compensatory mutations using the Garud et al data set, we have now added two new analyses, both using Bacteroides fragilis. First, we show that de novo mutations in the Zhao & Lieberman et al. dataset include an enrichment of premature stop codons (page 9). Second, we show that genes expected to be under fluctuating selection in B. fragilis display a significant proximity to stop codons, consistent with recent stop codons and reversions. We thank the reviewer for prompting this improvement.

      Finally, I found the description of the underlying assumptions of the model and the theoretical results difficult to understand. I could not, for example, relate the fitting parameters nloci and Tadapt to the simulations after reading the main text and the supplement. In addition, it was not clear to me if simulations involved actual hosts or how the changes in selection coefficients for different sites was implemented. Note that these are not simply issues of exposition since the specific implementation of the model could conceivably lead to different results. For example, if the environmental change is due to the colonization of a different host, it would presumably affect the selection coefficients at many sites at once and lead to clonal interference. Related to this point, it was also not clear that the weak mutation strong selection assumption is consistent with the microscopic parameters of the model. The authors also mention that "superspreading" may somehow make a difference to the probability of maintaining the least loaded class in the purifying selection model, but what they mean by this was not adequately explained.

      We apologize that the specifics of the implementation were left out of the paper and were only accessible through the GitHub page; we have corrected this. We have added a new section in the methods further detailing the reversion model and the specifics of how nloci and Tadapt (now tau_switch as of the edits) are implemented in the code.

      The possibility for clonal interference is indeed included in the simulation. Switching is not correlated with transmissions in our main figure simulations (Figure 4a). When we run simulations in which transmission and selection are correlated, the results remain essentially the same, barring higher variance at lower divergences (new Figure S10). We have now clarified these points in the results, and have also better clarified the selection only at transmission model in the main results.

      Reviewer #3 (Public Review):

      The diversity of bacterial species in the human gut microbiome is widely known, but the extensive diversity within each species is far less appreciated. Strains found in individuals on opposite sides of the globe can differ by as little as handfuls of mutations, while strains found in an individual's gut, or in the same household, might have a common ancestor tens of thousands of years ago. What are the evolutionary, ecological, and transmission dynamics that established and maintain this diversity?

      The time, T, since the common ancestor of two strains can be directly inferred by comparing their core genomes and finding the fraction of synonymous (non-amino acid changing) sites at which they differ: dS. With the per-site per-generation mutation rate, μ, and the mean generation times roughly known, this directly yields T (albeit with substantial uncertainty in the generation time). A traditional way to probe the extent to which selection plays a role is to study pairs of strains and compare the fraction of non-synonymous (amino acid or stop-codon changing) sites, dN, at which the strains differ with their dS. Small dN/dS, as found between distantly related strains, is attributed to purifying selection against deleterious mutations dominating over mutations that have driven adaptive evolution. Large dN/dS, as found in laboratory evolution experiments, is caused by beneficial mutations that quickly arise in large bacterial populations and, with substantial selective advantages per generation, can rise to high abundance fast enough that very few synonymous mutations arise in the lineages that take over the population.
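      As a rough illustration of this arithmetic (the numbers here are illustrative assumptions, not values from the manuscript): synonymous mutations accumulate along both lineages since the common ancestor, so dS ≈ 2μT, and for example

      ```latex
      % Illustrative divergence-time estimate from synonymous divergence.
      % Both lineages accumulate mutations, hence the factor of 2.
      d_S \approx 2 \mu T
      \quad\Longrightarrow\quad
      T = \frac{d_S}{2\mu} = \frac{10^{-4}}{2 \times 10^{-9}} = 5 \times 10^{4}\ \text{generations}
      ```

      With gut bacterial generation times anywhere from hours to a day, 5 × 10^4 generations could mean roughly a decade to a century or more, which is exactly the substantial uncertainty in T noted above.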

      A number of studies (including by Lieberman's group) have analyzed large numbers of strains of various dominant human gut species and studied how dN/dS varies. Although between closely related strains the variations are large -- often much larger than attributable to just statistical variations -- a systematic trend from dN/dS around unity or larger for close relatives to dN/dS ~ 0.1 for more distant relatives has been found in enough species that it is natural to conjecture a general explanation.

      The conventional explanation is that, for close relatives, the effects of selection over the time since they diverged has not yet purged weakly deleterious mutations that arose by chance -- roughly mutations with sT<1 -- while since the common ancestor of more distantly related strains, there is plenty of time for most of those that arose to have been purged.

      Torrillo and Lieberman have carried out an in-depth -- sophisticated and quantitative -- analysis of models of some of the evolutionary processes that shape the dependence of dN/dS on dS -- and hence on their divergence time, T. They first review the purifying selection model and show that -- even ignoring its inability to explain dN/dS > 1 for many closely related pairs -- the model has major problems explaining the crossover from dN/dS somewhat less than unity to much smaller values as dS goes through -- on a logarithmic scale -- the 10^-4 range. The first problem, already seen in the infinite-population-size deterministic model, is that a very large fraction of non-synonymous mutations would have to have deleterious s's in the 10^-5 per generation range to fit the data (and a small fraction effectively neutral). As the s's are naturally expected (at least in the absence of quantitative analysis to the contrary) to be spread out over a wide range on a logarithmic scale of s, this seems implausible. But the authors go further and analyze the effects of fluctuations that occur even in the very large populations: ~ >10^12 bacteria per species in one gut, and 10^10 human guts globally. They show that Muller's ratchet -- the gradual accumulation of weakly deleterious mutations that are not purged by selection -- leads to a mutational meltdown with the parameters needed to fit the purifying selection model. In particular, with N_e the "effective population size" that roughly parametrizes the magnitude of stochastic birth-death and transition fluctuations, and U the total mutation rate to such deleterious mutations, this occurs for U/s > log(sN_e), which they show would obtain with the fitted parameters.

      Torrillo and Lieberman propose an alternative model: that there are a modest number of "loci" at which conditionally beneficial mutations can occur that are beneficial in some individual guts (or other environmental conditions) at some times, but deleterious in other (or the same) guts at other times. With the ancestors of a pair of strains having passed through one too many individuals and transmissions, it is possible for a beneficial mutation to occur and rise in the population, only later to be reverted by the beneficial inverse mutation. With tens of loci at which this can occur, they show that this process could explain the drop of dN/dS from short times -- in which very few such mutations have occurred -- to very long times by which most have flipped back and forth so that a random pair of strains will have the same nucleotide at such sites with 50% probability. Their qualitative analysis of a minimally simple model of this process shows that the bacterial populations are plenty big enough for such specific mutations to occur many times in each individual's gut and, with modest selective benefits, to take over. With a few of these conditionally beneficial mutations or reversions occurring during an individual's lifetime, they get a reasonably quantitative agreement with the dN/dS vs dS data with very few parameters. A key assumption of their model is that genetically exact reversion mutations are far more likely to take over a gut population -- and spread -- than compensatory mutations which have a similar phenotypic-reversion effect: a mutation that is reverted does not show up in dN, while one that is compensated by another shows up as a two-mutation difference after the environment has changed twice.

      Strengths:

      The quantitative arguments made against the conventional purifying selection model are highly compelling, especially the consideration of multiple aspects that are usually ignored, including -- crucially -- how Muller's ratchet arises and depends on the realistic and needed-to-fit parameters; the effects of bottlenecks in transmission and the possibility that purifying selection mainly occurs then; and complications of the model of a single deleterious s, to include a distribution of selective disadvantages. Generally, the authors' approach of focusing on the simplest models with as few parameters as possible (some roughly known), and then adding in various effects one-by-one, is outstanding and, in being used to analyze environmental microbial data, exceptional.

      The reversion model the authors propose and study is a simple general one and they again explore carefully various aspects of it -- including dynamics within and between hosts -- and the consequent qualitative and quantitative effects. Again, the quantitative analysis of almost all aspects is exemplary. Although it is hard to make a compelling guess of the number of loci that are subject to alternating selection on the needed time-scales (years to centuries), they make a reasonable argument for a lower bound in terms of the number of known invertible promoters (that can genetically switch gene expression on and off).

      We are very grateful for the reviewer’s kind words and careful reading.

      Weaknesses:

      The primary weakness of this paper is one that the authors are completely open about: the assumption that, collectively, any of possibly-many compensatory mutations that could phenotypically revert an earlier mutation are less likely to arise and take over local populations than the exact specific reversion mutation. While detailed analysis of this is, reasonably enough, beyond the scope of the present paper, more discussion of this issue would add substantially to this work. Quantitatively, the problem is that even a modest number of compensatory mutations occurring as the environmental pressures change could lead to enough accumulation of non-synonymous mutations that they could cause dN/dS to stay large -- easily >1 -- to much larger dS than is observed. If, say, the appropriate locus is a gene, the number of combinations of mutations that are better in each environment would play a role in how large dN would saturate to in the steady state (1/2 of n_loci in the authors' model). It is possible that clonal interference between compensatory and reversion mutations would result in the mutations with the largest s -- e.g., as mentioned, reversion of a stop codon -- being much more likely to take over, and this could limit the typical number of differences between quite well-diverged strains. However, the reversion and subsequent re-reversion would have to both beat out other possible compensatory mutations -- naively less likely. I recommend that a few sentences be added in the Discussion on this important issue along with comments on the more general puzzle -- at least to this reader! -- as to why there appear to be so few adaptive genetic changes in core genomes on time scales of human lifetimes and civilization.

      We now directly consider compensatory mutations (page 14, SI text 3.2, and Supplementary Figure 12). We show that as long as true reversions are more likely than compensatory mutations overall, (adaptive) nonsynonymous mutations will still tend to revert towards their initial state and not contribute to asymptotic dN/dS, and show that true reversions are expected in a large swath of parameter space. Thank you for motivating this improvement!

      We note in the discussion that directional selection could be incorporated into the parameter alpha (assuming even more of the genome is deleterious) on page 16.

      An important feature of gut bacterial evolution that is now being intensely studied is only mentioned in passing at the end of this paper: horizontal transfer and recombination of core genetic material. As this tends to bring in many more mutations overall than occur in regions of a pair of genomes with asexual ancestry, the effects cannot be neglected. To what extent can this give rise to a similar dependence of dN/dS on dS as seen in the data? Of course, such a picture begs the question as to what sets the low dN/dS of segments that are recombined --- often from genetic distances comparable to the diameter of the species.

      We now discuss the effect of recombination on the purifying selection model on page 8 and in Supplementary Figure S6. In short, we now show that reasonable levels of recombination cannot rescue the purifying selection model from Muller’s ratchet when s is so low and the influx of new deleterious mutations is so high. We thank the reviewers for prompting this improvement.

    1. We use a cost-effectiveness analysis to quantify our reasoning. Here is a summary of our analysis, using one state, Bauchi, as an example.

      The linked Google sheet is hard to parse and hard to read. This makes it less than fully transparent. E.g., the columns are frozen in a way that makes the by-region columns barely navigable.

      Linking something people can't use doesn't add transparency; it just wastes people's attention. If you feel the need, put these links at the bottom, in a 'data section' or something. Anyone who wants to dig into it will need to do so as part of a separate and intensive exercise -- not just a glance while reading this. At least that's my impression.

      But also note that a code-notebook-based platform can be far more manageable for the reader.

    1. initializer list

      A list of initial values that the programmer attaches to an array manually in the code.

      In this method of creating an array, the programmer doesn't need to specify the size of the array, as it is determined automatically (you still need to specify the data type of the elements).
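      A minimal sketch in Java (class and variable names are my own, for illustration):

      ```java
      public class ArrayInitDemo {
          public static void main(String[] args) {
              // Initializer list: the size (4) is inferred from the number of
              // values, but the element type (int) must still be declared.
              int[] scores = {90, 85, 72, 100};
              System.out.println(scores.length); // prints 4
          }
      }
      ```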

    1. if I move this function into the same directory as the other functions it's coupled to I've reduced the cost of making changes to it
      • Why do we split code into files?
      • Is it a good way to restrict context?
      • We can afford coupling in small contexts, but as the context gets large, coupling becomes expensive.
      • Therefore we might optimize the trade-off between context size and coupling?
      • "Context" needs to be defined more precisely.
    1. some of the main reasons to use multiple methods in your programs:
       • Organization and Reducing Complexity: organize your program into small sections of code by function to reduce its complexity. Divide a problem into subproblems to solve it a piece at a time.
       • Reusing Code: avoid repetition of code. Reuse code by putting it in a method and calling it whenever needed.
       • Maintainability and Debugging: smaller methods are easier to debug and understand than searching through a large main method.

      Some of the main reasons to use multiple methods in your programs
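      A small Java example of these ideas (the names are illustrative):

      ```java
      public class MethodsDemo {
          // Reusing code: the loop lives in one place and can be called
          // wherever a sum is needed, instead of being repeated.
          static int sum(int[] values) {
              int total = 0;
              for (int v : values) {
                  total += v;
              }
              return total;
          }

          // Organization: main stays short and readable by delegating
          // the work to a smaller, separately debuggable method.
          public static void main(String[] args) {
              int[] scores = {90, 85, 72, 100};
              System.out.println(sum(scores)); // prints 347
          }
      }
      ```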

    2. Procedural Abstraction

      Hiding the details of how something works behind the scenes.

      It's basically a term for applying the concept of functions to code.

    1. Reviewer #3 (Public Review):

      Summary:

      The authors consider several known aspects of PV and SOM interneurons and tie them together into a coherent single-cell model that demonstrates how the aspects interact. These aspects are:<br /> (1) While SOM interneurons target distal parts of pyramidal cell dendrites, PV interneurons target perisomatic regions.<br /> (2) SOM interneurons are associated with beta rhythms, PV interneurons with gamma rhythms.<br /> (3) Clustered excitation on dendrites can trigger various forms of dendritic spikes independent of somatic spikes. The main finding is that SOM and PV interneurons are not simply associated with beta and gamma frequencies respectively, but that their ability to modulate the activity of a pyramidal cell "works best" at their assigned frequencies. For example, distally targeting SOM interneurons are ideally placed to precisely modulate dendritic Ca-spikes when their firing is modulated at beta frequencies or timed relative to excitatory inputs. Outside those activity regimes, not only is modulation weakened, but overall firing reduced.

      Strengths:

      I think the greatest strength is the model itself. While the various individual findings were largely known or strongly expected, the model provides a coherent and quantitative picture of how they come together and interact.

      The paper also powerfully demonstrates that an established view of "subtractive" vs. "divisive" inhibition may be too soma-focused and provide an incomplete picture in cells with dendritic nonlinearities giving rise to a separate, non-somatic all-or-nothing mechanism (Ca-spike).

      Weaknesses:

      While the authors overall did an admirable job of simulating the neuron in an in-vivo-like activity regime, I think it still provides an idealized picture that is optimized for the generation of the types of events the authors were interested in. That is not a problem per se - studying a mechanism under idealized conditions is a great advantage of simulation techniques - but this should be more clearly characterized. Specifics on this are very detailed and will follow in the comments to authors.

      What disappointed me a bit was the lack of a concise summary of what we learned beyond the fact that beta and gamma act differently on dendritic integration. The individual paragraphs of the discussion often are 80% summary of existing theories and only a single vague statement about how the results in this study relate. I think a summarizing schematic or similar would help immensely.

      Orthogonal to that, there were some points where the authors could have offered more depth on specific features. For example, the authors summarized that their "results suggest that the timescales of these rhythms align with the specialized impacts of SOM and PV interneurons on neuronal integration". Here they could go deeper and try to explain why SOM impact is specialized at slower time scales. (I think their results provide enough for a speculative outlook.)

      Beyond that, the authors invite the community to reappraise the role of gamma and beta in coding. This idea seems to be hindered by the fact that I cannot find a mention of a release of the model used in this work. The base pyramidal cell model is of course available from the original study, but it would be helpful for follow-up work to release the complete setup including excitatory and inhibitory synapses and their activation in the different simulation paradigms used. As well as code related to that.

      Impact:

      Individually, most results were at least qualitatively known or at least expected. However, demonstrating that beta-modulation of dendritic events and gamma-modulation of soma spiking can work together, at the same time and in the same model, can lead to highly valuable follow-up work. For example, by studying how top-down excitation onto apical compartments and bottom-up excitation onto basal compartments interact with the various rhythms; or what the impact of silencing of SOM neurons by VIP interneuron activation entails. But this requires - again - public release of the model and the code controlling the simulation setups.

      Beyond that, the authors clearly demonstrated that a single compartment, i.e., only a soma-focused view is too simple, at least when beta is considered. Conversely, the authors were able to describe the impact of most things related to the apical dendrite on somatic spiking as "going through" the Ca-spike mechanism. Therefore, the setup may serve as the basis of constraining simplified two-compartment models in the future.

    1. You don’t need to write a getter for every instance variable in a class but if you want code outside the class to be able to get the value of one of your instance variables, you’ll need to write a getter that looks like the following. class ExampleTemplate { // Instance variable declaration private typeOfVar varName; // Accessor (getter) method template public typeOfVar getVarName() { return varName; } } Notice that the getter’s return type is the same as the type of the instance variable and all the body of the getter does is return the value of the variable using a return statement.

      Making private instance variables accessible to code outside the class by creating a public getter method that returns the value of the private instance variable.
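      For example (a sketch with hypothetical names, following the template above):

      ```java
      public class Player {
          // Private: not directly readable from outside the class.
          private int score;

          public Player(int score) {
              this.score = score;
          }

          // Getter: the return type matches the instance variable's type,
          // and the body just returns the variable.
          public int getScore() {
              return score;
          }

          public static void main(String[] args) {
              Player p = new Player(42);
              System.out.println(p.getScore()); // prints 42
          }
      }
      ```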

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reply to reviewer comments


      We extend our gratitude to the reviewers for their time and valuable feedback on our manuscript. We especially appreciate the insightful suggestions that have significantly contributed to refining our work and elucidating our findings. With the revisions made to the text and the inclusion of new experimental data, we believe our manuscript now effectively addresses all reviewer comments. We eagerly await your evaluation of our revised submission.

      Small ARF-like GTPases play fundamental roles in dynamic signaling processes linked with vesicular trafficking in eukaryotes. Despite their evolutionary conservation, little is known about the functions of ARF-like GTPases in plants. Our manuscript reports the biochemical and cell biological characterization of the small ARF-like GTPase TTN5 from the model plant Arabidopsis thaliana. Fundamental investigations like ours are mostly lacking for ARF and ARL GTPases in Arabidopsis.

      We employed fluorescence-based enzymatic assays suited to uncover different types of the very rapid GTPase activities for TTN5. The experimental findings are now illustrated in a more comprehensive modified Figure 2 and in the form of a summary of the GTPase activities for TTN5 and its mutant variants in the NEW Figure 7A in the Discussion part. Taken together, we found that TTN5 is a non-classical GTPase based on its enzymatic kinetics. The reviewers appreciated these findings and highlighted them as being „impressive in vitro biochemical characterization" and "major conceptual advance". Since such experiments are "uncommon" for being conducted with plant GTPases, reviewers regarded this analysis as "useful addition to the plant community in general". The significance of these findings is given by the circumstance that „the ARF-like proteins are poorly addressed in Arabidopsis while they could reveal completely different function than the canonical known ARF proteins". Reviewers saw here clearly a "strength" of the manuscript.

      With regard to the cell biological investigation and initial assessment of cell physiological roles of TTN5, we now provide requested additional evidence. First of all, we provide NEW data on the localization of TTN5 by immunolocalization using a complementing HA3-TTN5 construct, supporting our initial suggestions that TTN5 may be associated with vesicles and processes of the endomembrane system. The previous preprint version had left the reviewers „less convinced" of cell biological data due to the lack of complementation of our YFP-TTN5 construct, lack of Western blot data and the low resolution of microscopic images. We fully agree that these points were of concern and needed to be addressed. We have therefore intensively worked on these „weaknesses" and present now a more detailed whole-mount immunostaining series with the complementing HA3-TTN5 transgenic line (NEW Figure 4, NEW Figure 3P), Western blot data (NEW Supplementary Figures S7C and D), and we will provide all original images upon publication of our manuscript at BioImage Archives which will provide the high quality for re-analysis. BioImage Archives is an online storage for biological image data associated with a peer-reviewed publication. This way, readers will be able to inspect each image in detail. The immunolocalization data are of particular importance as they indicate that HA3-TTN5 can be associated with punctate vesicle structures and BFA bodies as seen with YFP studies of YFP-TTN5 seedlings. We have re-phrased very carefully and emphasized those localization patterns which are backed up by immunostaining and YFP fluorescence detection of YFP-TTN5 signals. To improve the comprehension, the findings are summarized in a schematic overview in NEW Figure 7B of the Discussion. We have also addressed all other comments related to the cell biological experiments to "provide the substantial improvement" that had been requested. 
We emphasize that we found two cell physiological phenotypes for the TTN5T30N mutant: YFP-TTN5T30N confers differing mobility of the fluorescent vesicles in the epidermis of hypocotyls (see Video material and NEW Supplementary Video Material S1M-O), and transgenic HA3-TTN5T30N seedlings show a root growth phenotype (NEW Figure 3O). We explain these cell physiological phenotypes in relation to the enzymatic GTPase data. These findings convince us of the validity of the YFP-TTN5 analysis indicative of TTN5 localization.

      We are deeply thankful to the reviewers for judging our manuscript as "generally well written", "important" and "of interest to a wide range of plant scientists" and "for scientists working in the trafficking field" as it "holds significance" and will form the basis for future functional studies of TTN5.

      We prepared our revised manuscript very carefully and address all reviewer comments one by one. Please find our revision and our detailed rebuttal to all reviewer comments below. Changes are highlighted in yellow and green in the "revised version with highlighted changes".

      With these adjustments, we hope that our peer-reviewed study will receive a positive response.

      We are looking forward to your evaluation of our revised manuscript and thank you in advance,

      Sincerely

      Petra Bauer and Inga Mohr on behalf of all authors


      __Reviewer #1 (Evidence, reproducibility and clarity (Required)):__

      The manuscript from Mohr and collaborators reports the characterization of an ARF-like GTPase of Arabidopsis. Small GTPases of the ARF family play crucial roles in intracellular trafficking and plant physiology. The ARF-like proteins are poorly addressed in Arabidopsis, while they could reveal completely different functions than the canonical known ARF proteins. Thus, the aim of the study is important and could be of interest to a wide range of plant scientists. I am impressed by the biochemical characterization of the TTN5 protein and its mutated versions; this is clearly a very nice point of the paper and allows for proper interpretation of the other results. However, I was much less convinced by the cell biology part of this manuscript, and aside from the subcellular localization of TTN5, I think the paper would benefit from a more functional angle. Below are my comments to improve the manuscript:

      1- In the different pictures and movies, TTN5 is quite clearly appearing as a typical ER-like pattern. The pattern of localization further extends to dotty-like structures and structures labeled only at the periphery of the structure, with a depletion of fluorescence inside the structure. These observations raise several points. First, the ER pattern is never mentioned in the manuscript while I think it can be clearly observed. Given that the YFP-TTN5 construct is not functional (the mutant phenotype is not rescued), the ER-localization could be due to retention at the ER due to quality control. The HA-TTN5 construct is functional, but to me its localization shows a quite different pattern from the YFP version; I do not see the ER, for example, or the periphery-labeled structures. In this case, it will be a crucial point to perform co-localization experiments between HA-TTN5 and organelle markers to confirm that the functional TTN5 construct is labeling the Golgi and MVBs, as does the non-functional one. I am also quite sure that a co-localization between YFP-TTN5 and HA-TTN5 will not completely match... The ER is contacting so many organelles that the localization of YFP-TTN5 might not reflect the real location of the protein.

      __Our response:__

      First, we would like to state that specific detection of the intracellular localization of plant proteins in plant cells is generally technically very difficult when the protein abundance is not overly high. In this revised version, we extended immunostaining analysis to different membrane compartments, now including immunostaining of complementing HA3-TTN5 in the absence and presence of BFA, along with immunodetection of ARF1 and FM4-64 labeling in roots (NEW Figure 3P, NEW Figure 4A, B). We focus the analysis and conclusions on the fluorescence patterns that overlap between YFP-TTN5 detection and HA3-TTN5 immunodetection. With this, we can be most confident about subcellular TTN5 localization. Please find this NEW text in the Result section (starting Line 323):

      „For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures representing presumably Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). But surprisingly, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing and thus distinct from the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for preventing dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For a better identification of BFA bodies, we additionally used the dye FM4-64, which can emit fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell, then may be endocytosed and in the presence of BFA become accumulated in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both, HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."

We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful about stating that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N), reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Results section (starting Line 295):

„Additionally, YFP signals were also detected in a net-like pattern typical of ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging from 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or the ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), in contrast to the fluorescence signals observed in YFP-TTN5-expressing cells. This may indicate either that the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused our interpretation on localization patterns overlapping between the YFP-TTN5 fluorescence and the HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

And we discuss in the Discussion section (starting Line 552):

„We based the TTN5 localization data on tagging approaches with two different detection methods to enhance the reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss-of-function mutant, we made several observations suggesting that YFP-TTN5 signals at various membrane sites are meaningful. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which specifically hindered YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, YFP-TTN5 expression may be delayed or insufficient at early embryonic stages, resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of the large YFP tag, in comparison with the small HA3 tag, which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite these limitations, we made relevant observations in our data that lead us to believe that YFP signals at membrane sites in YFP-TTN5-expressing cells can be meaningful."

      2- What are the structures with TTN5 fluorescence depleted at the center that appear in control conditions? They look different from the Golgi labeled by Man1 but similar to MVBs upon wortmannin treatment, except that in control conditions MVBs never appear like this. Are they related to any kind of vacuolar structures that would be involved in quality control-induced degradation of non-functional proteins?

      Our response:

The reviewer certainly refers to fluorescence images from N. benthamiana leaf epidermal cells where different circularly shaped structures are visible. In these structures, fluorescence is depleted in the center, e.g. the YFP fluorescence signals in TTN5T30N-transformed leaf discs in Figure 5C. We suspect that these structures can be of vacuolar origin, as described for similar fluorescent rings for ANN1-GFP in Tichá et al., 2020 (reference in manuscript). The reviewer certainly does not refer to the swollen MVBs seen following wortmannin treatment, as in Figure 5N-P, which are similar in shape but larger in size. Please note that we always included the control conditions, namely the images recorded before the wortmannin treatment, so that we were able to investigate the changes induced by wortmannin. Hence, we can clearly say that the structures with depleted fluorescence in the center, as in Figure 5C, are not wortmannin-induced swollen MVBs. To make these points clear to the reader, we added an explanation into the text (Line 385-388):

„We also observed YFP fluorescence signals in the form of circularly shaped ring structures with a fluorescence-depleted center. These structures can be of vacuolar origin, as described for similar fluorescent rings for ANN1-GFP in Tichá et al. (2020)."

      3- The fluorescence at nucleus could be due to a proportion of YFP-TTN5 that is degraded and released free-GFP, a western-blot of the membrane fraction vs the cytosolic fraction could help solving this issue.

      Our response:

In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected, besides the expected strong 48 kDa YFP-TTN5 band, three additional weak bands ranging from 26 to 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization, and we rephrased the text carefully (starting Line 300):

„We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging from 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or the ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), in contrast to the fluorescence signals observed in YFP-TTN5-expressing cells. This may indicate either that the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused our interpretation on localization patterns overlapping between the YFP-TTN5 fluorescence and the HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      4- It is not so easy to conclude from the co-localization experiments. The confocal pictures are not always of high quality, some of them appear blurry. The Golgi localization looks convincing, but the BFA experiments are not that clear. The MVB localization is pretty convincing but the images are blurry. An issue is the quantification of the co-localizations. Several methods were employed but they do not provide consistent results. As for the object-based co-localization method, the authors employ in the text co-localization result either base on the % of YFP-labeled structures or the % of mCherry/mRFP-labeled structures, but the results are not going always in the same direction. For example, the proportion of YFP-TTN5 that co-localize with MVBs is not so different between WT and mutated version but the proportion of MVBs that co-localize with TTN5 is largely increased in the Q70L mutant. Thus it is quite difficult to interpret homogenously and in an unbiased way these results. Moreover, the results coming from the centroid-based method were presented in a table rather than a graph, I think here the authors wanted to hide the huge standard deviation of these results, what is the statistical meaning of these results?

      Our response:

First of all, we would like to point out that, as explained above, the BFA experiments are now clearer. We performed additional BFA treatment coupled with immunostaining using HA3-TTN5-expressing Arabidopsis seedlings and coupled with fluorescence analysis using YFP-TTN5-expressing Arabidopsis plants. In both experiments, we observed the typical BFA bodies very clearly (NEW Figure 4B, C).

Second, we would like to stress that we performed colocalization analysis very carefully and quantified the data in three different ways. We note that there is no generally standardized procedure that best captures the idea of a colocalization pattern. Colocalization results are presented in bar diagrams and in table format, including statistical analysis. Colocalization was carried out with the ImageJ plugin JACoP for Pearson's and Overlap coefficients, and additionally based on the centroid method. The Pearson's and Overlap coefficients are plotted in bar diagrams in Supplementary Figure S8A and C, including statistics. The values obtained by the centroid method are presented in table format in Supplementary Figure S8B and D, which can be considered a standard method (see Ivanov et al., 2014).

Colocalization of two fluorescence signals was quantified for the two channels in a chosen region of interest (indicating, in %, the overlapping signal versus the total signal for each channel). The differences between the YFP/mRFP and mRFP/YFP ratios indicate that a higher percentage of the ARA7-RFP signal colocalizes with the YFP-TTN5Q70L signal than with the TTN5WT or TTN5T30N signals, while the YFP signals have a similar overlap with ARA7-positive structures. This is not a contradiction. We trust this addresses the questions on colocalization.
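The pixel-based part of this quantification, Pearson's coefficient between the two channel intensities within a region of interest, can be sketched as follows. This is an illustrative NumPy re-implementation under hypothetical naming (`pearson_colocalization` is our own helper for this sketch), not the JACoP plugin used for the actual analysis.

```python
import numpy as np

def pearson_colocalization(ch1, ch2, roi=None):
    """Pearson's correlation between two fluorescence channels.

    ch1, ch2: 2D intensity arrays of equal shape (e.g. YFP and mRFP).
    roi: optional boolean mask restricting the region of interest.
    Returns a value in [-1, 1]; values near 1 indicate strong colocalization.
    """
    a = np.asarray(ch1, dtype=float)
    b = np.asarray(ch2, dtype=float)
    if roi is not None:
        a, b = a[roi], b[roi]
    # Mean-center each channel, then normalize the cross-product
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))

# Two perfectly correlated toy channels give a coefficient of 1.0
ch_yfp = np.array([[0, 10], [20, 30]])
ch_rfp = np.array([[0, 5], [10, 15]])
print(pearson_colocalization(ch_yfp, ch_rfp))  # → 1.0
```

The object-based percentages discussed above (e.g. % of ARA7-positive structures overlapping YFP-TTN5 structures) are a separate measure and are direction-dependent, which is why the YFP/mRFP and mRFP/YFP ratios need not agree.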

      Please note that upon acceptance for publication, we will upload all original colocalization data to BioImage Archive. Hence, the high-quality data can be reanalyzed by readers.

5- The use of FM4-64 to address the vacuolar trafficking is a hazardous, FM4-64 allows the tracking of endocytosis but does not say anything on vacuolar degradation targeting and even less on the potential function of TTN5 in endosomal vacuolar targeting. Similarly, TTN5, even if localized at the Golgi, is not necessarily function in Golgi-trafficking.

Our response:

Perhaps our previous description was misleading. Thank you for pointing this out. We reformulated the text and modified the schematic representation of FM4-64 in NEW Figure 6A:

"(A), Schematic representation of progressive stages of FM4-64 localization and internalization in a cell. FM4-64 is a lipophilic substance. After infiltration, it first localizes to the plasma membrane; at later stages, it localizes to intracellular vesicles and membrane compartments. This localization pattern reflects the endocytosis process (Bolte et al. 2004)."

      6- The manuscript lacks in its present shape of functional evidences for a role of TTN5 in any trafficking steps. I understand that the KO mutant is lethal but what are the phenotypes of the Q70L and T30N mutant plants? What is the seedling phenotype, how are the Golgi and MVBs looking like in these mutants? Do the Q70L or T30N mutants perturbed the trafficking of any cargos?

Our response:

We fully agree that functional evidence is interesting for assigning roles to TTN5 in trafficking steps. A phenotype associated with TTN5T30N and TTN5Q70L would clearly be meaningful.

First of all, we would like to emphasize that it is not correct that the manuscript lacks functional evidence for a role of TTN5 and the two mutants. In fact, the manuscript highlights several functional activities that are meaningful in a cellular context. These include different types of kinetic GTPase enzyme activities, subcellular localization in planta, and association with different endomembrane compartments and subcellular processes such as endocytosis. We certainly agree that future research can focus even more on cell-physiological aspects and on physiological functions in plants to examine the proposed roles of TTN5 in intracellular trafficking steps. For such studies, our findings provide the fundamental basis.

Concerning the colocalization of the mutants with the markers, we show in Figure 5C, D and G, H that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals colocalize with the Golgi marker GmMan1-mCherry. Figure 5K, L and O, P show that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals can colocalize with the MVB marker, and this may affect relevant vesicle trafficking processes and plasma membrane protein regulation involved in root cell elongation.

At present, we have not yet investigated perturbed cargo trafficking. These aspects are certainly interesting but require extensive work and testing of appropriate physiological conditions and appropriate cargo targets. We discuss future perspectives in the Discussion. We agree that such functional information is of great importance, but it needs to be clarified in future studies.

Reviewer #1 (Significance (Required)):

      In conclusion, I think this manuscript is a good biochemical description of an ARF-like protein but it would need to be strengthen on the cell biology and functional sides. Nonetheless, provided these limitations fixed, this manuscript would advance our knowledge of small GTPases in plants. The major conceptual advance of that study is to provide a non-canonical behavior of the active/inactive cycle dynamics for a small-GTPase. Of course this dynamic probably has an impact on TTN5 function and involvement in trafficking, although this remains to be fully demonstrated. Provided a substantial amount of additional experiments to support the claims of that study, this study could be of general interest for scientist working in the trafficking field.

Our response:

      We thank reviewer 1 for the very fruitful comments. We hope that with the additional experiments, NEW Figures and NEW Supplementary Figures as well as our changes in the text, all comments by the reviewer have been addressed.

Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Mohr and colleagues characterizes the Arabidopsis predicted small GTPase TITAN5 in both biochemical and cell biology contexts using in vitro and in planta techniques. In the first half of the manuscript, the authors use in vitro nucleotide exchange assays to characterise the GTPase activity and nucleotide binding properties of TITAN5 and two mutant variants of it. The in vitro data they produce indicates that TITAN5 does indeed have general GTPase and nucleotide binding capability that would be expected for a protein predicted to be a small GTPase. Interestingly, the authors show that TITAN5 favors a GTP-bound form, which is different to many other characterized GTPases that favor GDP-binding. The authors follow their biochemical characterisation of TITAN with in planta experiments characterizing TITAN5 and its mutant variants association with the plant endomembrane system, both by stable expression in Arabidopsis and transient expression in N.benthamiana.

      The strength of this manuscript is in its in vitro biochemical characterisation of TITAN5 and variants. I am not an expert on in vitro GTPase characterisation and so cannot comment specifically on the assays they have used, but generally speaking this appears to have been well done, and the authors are to be commended for it. In vitro characterisation of plant small GTPases is uncommon, and much of our knowledge is inferred for work on animal or yeast GTPases, so this will be a useful addition to the plant community in general, especially as TITAN5 is an essential gene. The in planta data that follows is sadly not as compelling as the biochemical data, and suffers from several weaknesses. I would encourage the authors to consider trying to improve the quality of the in planta data in general. If improved and then combined with the biochemical aspects of the paper, this has the potential to make a nice addition to plant small GTPase and endomembrane literature.

      The manuscript is generally well written and includes the relevant literature.

      Major issues:

1. The authors make use of a p35s: YFP-TTN5 construct (and its mutant variants) both stably in Arabidopsis and transiently in N.benthamiana. I know from personal experience that expressing small GTPases from non-endogenous promoters and in transient expression systems can give very different results to when working from endogenous promoters/using immunolocalization in stable expression systems. Strong over-expression could for example explain why the authors see high 'cytosolic' levels of YFP-TTN5. It is therefore questionable how much of the in planta localisation data presented using p35S and expression in tobacco is of true relevance to the biological function of TITAN5. The authors do present some immunolocalization data of HA3-TTN5 in Arabidopsis, but this is fairly limited and it is very difficult in its current form to use this to identify whether the data from YFP-TTN5 in Arabidopsis and tobacco can be corroborated. I would encourage the authors to consider expanding the immunolocalization data they present to validate their findings in tobacco.

Our response:

We are aware that endogenous promoters may be preferred over the 35S promoter. However, neither of the two types of lines we generated with the endogenous promoter showed fluorescent signals, so we unfortunately could not use them (not shown). Besides 35S promoter-mediated expression, we also investigated inducible expression vectors for fluorescence imaging in N. benthamiana (not shown). Inducible and constitutive expression showed very similar expression patterns, so we chose to characterize in detail the 35S::YFP-TTN5 fluorescence in both N. benthamiana and Arabidopsis.

We have expanded the immunolocalization using the HA3-TTN5 line and now compare it with the YFP fluorescence signal in YFP-TTN5 seedlings (NEW Figure 3P; NEW Figure 4).

„For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures presumably representing Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). Surprisingly, however, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing with, and thus distinct from, the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for blocking dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and an accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For better identification of BFA bodies, we additionally used the dye FM4-64, which emits fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell; it may then be endocytosed and, in the presence of BFA, accumulate in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."


      Many of the confocal images presented are of poor quality, particularly those from N.benthamiana.

      Our response:

      All confocal images are of high quality in their original format. To make them accessible, we will upload all raw data to BioImage Archive upon acceptance of the manuscript.

      The authors in some places see YFP-TTN5 in cell nuclei. This could be a result of YFP-cleavage rather than genuine nuclear localisation of YFP-TTN5, but the authors do not present western blots to check for this.

Our response:

As described in our response to reviewer 1, comment 3, fluorescence signals were detected within the nuclei of root cells of YFP-TTN5 plants, while immunostaining signals of HA3-TTN5 were not detected in the nucleus. In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected, besides the expected strong 48 kDa YFP-TTN5 band, three additional weak bands ranging from 26 to 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization, and we rephrased the text carefully (starting Line 300):


„We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging from 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or the ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), in contrast to the fluorescence signals observed in YFP-TTN5-expressing cells. This may indicate either that the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused our interpretation on localization patterns overlapping between the YFP-TTN5 fluorescence and the HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

      That YFP-TTN5 fails to rescue the ttn5 mutant indicates that YFP-tagged TTN5 may not be functional. If the authors cannot corroborate the YFP-TTN5 localisation pattern with that of HA3-TTN5 via immunolocalization, then the fact that YFP-TTN5 may not be functional calls into question the biological relevance of YFP-TTN5's localisation pattern.

Our response:

This relates to comment 1; please see that response for details. Please also see our answer to reviewer 1, comment 1.

First, we would like to state that specific detection of the intracellular localization of plant proteins in plant cells is generally technically very difficult when protein abundance is not very high. In this revised version, we extended the immunostaining analysis to different membrane compartments, now including immunostaining of complementing HA3-TTN5 in the absence and presence of BFA, along with immunodetection of ARF1 and FM4-64 labeling in roots (NEW Figure 3P, NEW Figure 4A, B). In the revised version, we focus the analysis and conclusions on the fluorescence patterns that overlap between YFP-TTN5 detection and HA3-TTN5 immunodetection. With this, we can be most confident about subcellular TTN5 localization. Please find this NEW text in the Results section (starting Line 323):

„For a more detailed investigation of HA3-TTN5 subcellular localization, we then performed co-immunofluorescence staining with an Alexa 488-labeled antibody recognizing the Golgi and TGN marker ARF1, while detecting HA3-TTN5 with an Alexa 555-labeled antibody (Robinson et al. 2011, Singh et al. 2018) (Figure 4A). ARF1-Alexa 488 staining was clearly visible in punctate structures presumably representing Golgi stacks (Figure 4A, Alexa 488), as previously reported (Singh et al. 2018). Similar structures were obtained for HA3-TTN5-Alexa 555 staining (Figure 4A, Alexa 555). Surprisingly, however, colocalization analysis demonstrated that the HA3-TTN5-labeled structures were mostly not colocalizing with, and thus distinct from, the ARF1-labeled ones (Figure 4A). Yet the HA3-TTN5- and ARF1-labeled structures were in close proximity to each other (Figure 4A). We hypothesized that the HA3-TTN5 structures can be connected to intracellular trafficking steps. To test this, we performed brefeldin A (BFA) treatment, a commonly used tool in cell biology for blocking dynamic membrane trafficking events and vesicle transport involving the Golgi. BFA is a fungal macrocyclic lactone that leads to a loss of cis-cisternae and an accumulation of Golgi stacks, known as BFA-induced compartments, up to the fusion of the Golgi with the ER (Ritzenthaler et al. 2002, Wang et al. 2016). For better identification of BFA bodies, we additionally used the dye FM4-64, which emits fluorescence in a lipophilic membrane environment. FM4-64 marks the plasma membrane in the first minutes following application to the cell; it may then be endocytosed and, in the presence of BFA, accumulate in BFA bodies (Bolte et al. 2004). We observed BFA bodies positive for both HA3-TTN5-Alexa 488 and FM4-64 signals (Figure 4B). Similar patterns were observed for YFP-TTN5-derived signals in YFP-TTN5-expressing roots (Figure 4C). Hence, HA3-TTN5 and YFP-TTN5 can be present in similar subcellular membrane compartments."

We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful about stating that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N), reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Results section (starting Line 295):

„Additionally, YFP signals were also detected in a net-like pattern typical of ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging from 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). An α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or the ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), in contrast to the fluorescence signals observed in YFP-TTN5-expressing cells. This may indicate either that the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused our interpretation on localization patterns overlapping between the YFP-TTN5 fluorescence and the HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

And we discuss in the Discussion section (starting Line 552):

„We based the TTN5 localization data on tagging approaches with two different detection methods to enhance the reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss-of-function mutant, we made several observations suggesting that YFP-TTN5 signals at various membrane sites are meaningful. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which specifically hindered YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, YFP-TTN5 expression may be delayed or insufficient at early embryonic stages, resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of the large YFP tag, in comparison with the small HA3 tag, which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite these limitations, we made relevant observations in our data that lead us to believe that YFP signals at membrane sites in YFP-TTN5-expressing cells can be meaningful."


      Without a cell wall label/dye, the plasmolysis data presented in Figure 5 is hard to visualize.

Our response:

Figure 6E-G (previously Fig. 5) show the results of plasmolysis experiments with YFP-TTN5 and the two mutant variant constructs. Plasmolysis is clearly observable when focusing on the Hechtian strands. Hechtian strands form due to the retraction of the protoplast as a result of the osmotic pressure of the added mannitol solution. They consist of plasma membrane that remains in contact with the cell wall, visible as thin filamentous structures. We stained the PM and the Hechtian strands with the PM dye FM4-64, as was similarly done in Yoneda et al., 2020. In YFP-TTN5-transformed cells, we detected colocalization of the YFP signal and the PM dye in filamentous structures between two neighbouring FM4-64-labelled PMs. Although additional labeling of the cell wall might further indicate plasmolysis, it is not needed here.

      Please consider that we will upload all original image data to BioImage Archive so that a detailed re-investigation of the images can be done.


Minor issues:

      In some of the presented N.benthamiana images, it looks like YFP-TTN5 may be partially ER-localised. However, co-localisation with an ER marker is not presented.

      Our response:

Referring to our response to comments 1 and 3 of reviewer 2 and to comment 1 of reviewer 1:

We did not find evidence that HA3-TTN5 can localize at the ER using whole-mount immunostaining (NEW Figure 3P; NEW Figure 4A, B). Hence, we are careful about stating that fluorescence at the ER, as seen in the YFP-TTN5 line (Figure 3M, N), reflects TTN5 localization. We therefore do not focus the text on the ER pattern in the Results section (starting Line 295):

„Additionally, YFP signals were also detected in a net-like pattern typical for ER localization (Figure 3M, N). (...) We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

*And we discuss in the Discussion section (starting Line 552):*

„We based the TTN5 localization data on tagging approaches with two different detection methods to enhance reliability of specific protein detection. Even though YFP-TTN5 did not complement the embryo-lethality of a ttn5 loss-of-function mutant, we made several observations that suggest YFP-TTN5 signals are meaningful at various membrane sites. We do not know why YFP-TTN5 does not complement. There could be differences in TTN5 levels and interactions in some cell types, which hindered specifically YFP-TTN5 but not HA3-TTN5. (...) Though constitutively driven, YFP-TTN5 expression may be delayed or insufficient at the early embryonic stages, resulting in the lack of embryo-lethal complementation. On the other hand, the very fast nucleotide exchange activity may be hindered by the presence of the large YFP tag in comparison with the small HA3 tag, which is able to rescue the embryo-lethality. The lack of complementation represents a challenge for the localization of small GTPases with rapid nucleotide exchange in plants. Despite these limitations, we made relevant observations in our data that led us to believe that YFP signals in YFP-TTN5-expressing cells at membrane sites can be meaningful."


There is some inconsistency within the N. benthamiana images. For example, compare Figure 4C of YFP-TTN5T30N to Figure 4O of YFP-TTN5T30N. Figure 4O is presented as being significant because wortmannin-induced swollen ARA7 compartments are labelled by YFP-TTN5T30N. However, structures very similar to these can already be seen in Figure 4C, which is apparently an unrelated experiment. This, to my mind, is likely a result of the very different expression levels between different cells that can be produced by transient expression in N. benthamiana.

__Our response:__

      Former Figure 4 is now Figure 5. As detailed in our response to comment 2 of reviewer 1:

The reviewer certainly refers to fluorescence images from N. benthamiana leaf epidermal cells where different circularly shaped structures are visible. In these structures, the fluorescent circles are depleted of fluorescence in the center, e.g. in Figure 5C, YFP fluorescent signals in TTN5T30N-transformed leaf discs. We suspect that these structures can be of vacuolar origin, as described for similar fluorescent rings in Tichá et al., 2020 for ANN1-GFP (reference in manuscript). The reviewer certainly does not refer to swollen MVBs that are seen following wortmannin treatment, as in Figure 5N-P, which look similar in their shape but are larger in size. Please note that we always included the control conditions, namely the images recorded before the wortmannin treatment, so that we were able to investigate the changes induced by wortmannin. Hence, we can clearly say that the structures with depleted fluorescence in the center, as in Figure 5C, are not wortmannin-induced swollen MVBs. To make these points clear to the reader, we added an explanation into the text (Line 385-388):

„We also observed YFP fluorescence signals in the form of circularly shaped ring structures with a fluorescence-depleted center. These structures can be of vacuolar origin as described for similar fluorescent rings in Tichá et al. (2020) for ANN1-GFP."

      **Referees cross-commenting**

It seems that all of the reviewers have converged on the conclusion that the in planta characterisation of TTN5 is insufficient to be of substantial interest to the field, highlighting the fact that major improvements are required to strengthen this part of the manuscript and increase its relevance.

__Reviewer #2 (Significance (Required)):__

General assessment: the strengths of this work lie in its in vitro characterisation of TITAN5; however, the in planta characterisation lacks depth.

Significance: the in vitro characterisation of TITAN5 is commendable, as such work is lacking for plant GTPases. However, the significance of the work would be boosted substantially by better in planta characterisation, which is where the broadest interest will lie.

      My expertise: my expertise is in in planta characterisation of small GTPases and their interactors.

__Our response:__

We thank the reviewer for the kind evaluation of our manuscript. We are confident that the changes in the text, together with the NEW Figures and NEW Supplementary Figures, make our work convincing.

__Reviewer #3 (Evidence, reproducibility and clarity (Required)):__

Summary: Cellular traffic is an important and well-studied biological process in animal and plant systems. While the components involved in transport are known, the mechanism by which these components control activity or destination remains to be studied. A critical step in regulating traffic is proper budding and tethering of vesicles. A critical component in determining this step is a family of proteins with GTPase activity, which act as switches facilitating vesicle interaction with proteins or the cytoskeleton. The current manuscript by Mohr and colleagues has characterized the small GTPase TITAN5 (TTN5) and identified two residues, Gln70 and Thr30, in the protein which they propose to have functional roles. The authors catalogue the localization and GTP hydrolytic activity, and discuss putative functions of TTN5 and the mutants.

__Major comments:__

The core of the manuscript, which is a descriptive characterization of TTN5, lies in reliably demonstrating putative roles. While the GTP hydrolysis rates are well-quantified (though the claims need to be toned down), the microscopy data, especially the association of TTN5 with different endomembrane compartments, is not convincing due to the quality (low resolution) of the figures submitted. The manuscript text is difficult to navigate due to repetition and inconsistency in the order in which the mutants are referred to. I am requesting additional experiments, which should be feasible considering the authors have all the materials required to perform the experiments and obtain high-quality images which support their claims.

In general the figure quality needs to be improved for all microscopy images. I would suggest that the authors highlight 1-2 individual cells to make their point and use the current images as supplementary to establish a broader spread.

__Our response:__

*We have worked substantially on the text and figures to make the content comprehensible. The mutants are referred to in a consistent manner in the text and figures. We have addressed the requested experiments.*

As we pointed out in the cover letter and our responses to reviewers 1 and 2, we will upload all raw image data to BioImage Archive upon acceptance of the manuscript so that they can be re-examined without any reduction of resolution. Furthermore, we have conducted new experiments on immunolocalization of HA3-TTN5 (NEW Figure 3P, NEW Figure 4A, B). The text has been improved in several places (see highlighted changes in the manuscript and as detailed in the responses to reviewer 1). We think this addresses the reviewers' concerns well.

Fig. S1 lacks clarity.

__Our response:__

*Supplementary Figure S1 shows TTN5 gene expression in different organs and growth stages as revealed by transcriptomic data, made available through the AtGenExpress eFP tool of the Bio-Analytic Resource for Plant Biology (BAR). The figure visualizes that TTN5 is ubiquitously expressed in different plant organs and tissues, e.g. the epidermis layers that we investigated here, and throughout development including embryo development. In accordance with the embryo-lethal phenotype, this highlights well that TTN5 is needed throughout plant growth, and it emphasizes that our investigation of TTN5 localization in epidermis cells is valid.*

We have added a better description to the figure legend. We now also mention the respective publications from which the transcriptome datasets are derived. The modified figure legend is:

      "Supplementary Figure S1. Visualization of TTN5 gene expression levels during plant development based on transcriptome data. Expression levels in (A), different types of aerial organs at different developmental stages; from left to right and bottom to top are represented different seed and plant growth stages, flower development stages, different leaves, vegetative to inflorescence shoot apex, embryo and silique development stages; (B), seedling root tissues based on single cell analysis represented in form of a uniform manifold approximation and projection plot; (C), successive stages of embryo development. As shown in (A) to (C), TTN5 is ubiquitously expressed in these different plant organs and tissues. In particular, it should be noted that TTN5 transcripts were detectable in the epidermis cell layer of roots that we used for localization of tagged TTN5 protein in this study. In accordance with the embryo-lethal phenotype, the ubiquitous expression of TTN5 highlights its importance for plant growth. Original data were derived from (Nakabayashi et al. 2005, Schmid et al. 2005) (A); (Ryu et al. 2019) (B); (Waese et al. 2017) (C). Gene expression levels are indicated by local maximum color code, ranging from the minimum (no expression) in yellow to the maximum (highest expression) in red."

For the supplementary videos, it is difficult to determine if the punctate structures are moving or if it is cytoplasmic streaming. Could this be done with a co-localized marker, considering that such markers have been used later in Fig. 4?

__Our response:__

We had detected movement of YFP fluorescent structures in all analyzed YFP-TTN5 plant parts except the root tip. Movement of fluorescence signals in YFP-TTN5T30N seedlings was slowed in hypocotyl epidermis cells. To answer the reviewer's comment, we added three NEW supplementary videos (NEW Supplementary Video Material S1M-O) generated with all three YFP-TTN5 constructs imaged over time in N. benthamiana leaf epidermal cells upon colocalization with the cis-Golgi marker GmMan1-mCherry, as requested by the reviewer. In these NEW videos, some of the YFP fluorescent spots seem to move together with the Golgi stacks. GmMan1 is described as showing stop-and-go directed movement mediated by the actomyosin system (Nebenführ 1999), and based on the colocalization, the same might be the case for YFP-TTN5 signals.


It would be good if the speed of movement is quantified, if the authors want to retain the current claims in the results and the discussion.

__Our response:__

*We describe a difference in the movement of the YFP fluorescent signal for the YFP-TTN5T30N variant in the hypocotyl compared to YFP-TTN5 and YFP-TTN5Q70L. In hypocotyl cells, we could observe slowed or arrested movement specifically of YFP-TTN5T30N fluorescent structures, and we describe this in the Results section (Line 278-291).*

      "Interestingly, the mobility of these punctate structures differed within the cells when the mutant YFP-TTN5T30N was observed in hypocotyl epidermis cells, but not in the leaf epidermis cells (Supplementary Video Material S1E, compare with S1B) nor was it the case for the YFP-TTN5Q70L mutant (Supplementary Video Material S1F, compare with S1E)."

*The slowed movement in the YFP-TTN5T30N mutant is well visible even without quantification. We checked that the manuscript text does not contain overstatements in this regard.*


Fig. 2: I am not sure what the unit/scale is in Fig. 2D/E if each parameter (kon, koff, and Kd) is individually plotted. Could the authors please clarify/simplify this panel?

__Our response:__

We presented kinetics for nucleotide association (kon) and dissociation (koff) and the dissociation constant (Kd) in a bar diagram for each nucleotide, mdGDP (Figure 2D) and mGppNHp (Figure 2E). We modified and relabeled the bar diagram representation. It should now be very clear which are the parameters and units. Please see also the other modified figures (NEW modified Figure 2A-H). We also modified the legend of Figure 2D and E:

      "(D-E), Kinetics of association and dissociation of fluorescent nucleotides mdGDP (D) or mGppNHp (E) with TTN5 proteins (WT, TTN5T30N, TTN5Q70L) are illustrated as bar charts. The association of mdGDP (0.1 µM) or mGppNHp (0.1 µM) with increasing concentration of TTN5WT, TTN5T30N and TTN5Q70L was measured using a stopped-flow device (see A, B; data see Supplementary Figure S3A-F, S4A-E). Association rate constants (kon in µM-1s-1) were determined from the plot of increasing observed rate constants (kobs in s-1) against the corresponding concentrations of the TTN5 proteins. Intrinsic dissociation rates (koff in s-1) were determined by rapidly mixing 0.1 µM mdGDP-bound or mGppNHp-bound TTN5 proteins with the excess amount of unlabeled GDP (see A, C, data see Supplementary Figure S3G-I, S4F-H). The nucleotide affinity (dissociation constant or Kd in µM) of the corresponding TTN5 proteins was calculated by dividing koff by kon. When mixing mGppNHp with nucleotide-free TTN5T30N, no binding was observed (n.b.o.) under these experimental conditions."


Are panels D and E representing values for mdGDP and GppNHp? This is not very clear from the figure legend.

__Our response:__

Yes, Figure 2D and E represent the kon, koff and Kd values for mdGDP (Figure 2D) and mGppNHp (Figure 2E). As detailed in our previous response to comment 2a, we modified the figure and figure legend to make the representation clearer.
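To make the relationship between the plotted parameters concrete, the fold-change comparisons stated in the revised manuscript can be reproduced directly from the reported rate constants. A minimal sketch (the numeric values are those quoted in our responses; Kd = koff/kon as defined in the figure legend, so an implied kon can be recovered as koff/Kd):

```python
# Sketch: fold-change arithmetic behind the kinetic comparisons.
# Values are the rate constants reported in the text.

koff = {  # intrinsic dissociation rates, s^-1
    ("WT", "mGDP"): 0.012, ("WT", "mGppNHp"): 0.001,
    ("T30N", "mGDP"): 0.149, ("T30N", "mGppNHp"): 0.004,
}
Kd = {  # dissociation constants, µM
    ("WT", "mGDP"): 0.267, ("WT", "mGppNHp"): 0.029,
    ("T30N", "mGDP"): 3.091,
}
kcat_WT = 0.0015  # intrinsic GTP hydrolysis rate of TTN5WT, s^-1

print(koff["WT", "mGDP"] / koff["WT", "mGppNHp"])      # mGDP vs mGppNHp release from TTN5WT
print(koff["T30N", "mGDP"] / koff["WT", "mGDP"])       # T30N vs WT mGDP release
print(koff["T30N", "mGDP"] / koff["T30N", "mGppNHp"])  # mGDP vs mGppNHp release from T30N
print(Kd["WT", "mGDP"] / Kd["WT", "mGppNHp"])          # tighter mGppNHp binding of TTN5WT
print(Kd["T30N", "mGDP"] / Kd["WT", "mGDP"])           # lower nucleotide affinity of T30N
print(koff["WT", "mGDP"] / kcat_WT)                    # GDP exchange ~8-fold faster than hydrolysis
print(koff["WT", "mGDP"] / Kd["WT", "mGDP"])           # implied kon of TTN5WT/mGDP, µM^-1 s^-1
```

The last ratio illustrates the fast-exchange, slow-hydrolysis regime argued for in the Discussion: the GDP off-rate exceeds the hydrolysis rate, so GTP-loading is favored at steady state.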


Fig. 3: Same comments as in the paragraph above - improve the resolution of images, concentrate on a few selected cells, and if required use an inset figure to zoom in to specific compartments.

__Our response:__

      As detailed in our responses to reviewers 1 and 2, we will upload all original image data to BioImage Archive upon acceptance of the manuscript, so that a detailed investigation of all our images is possible without any reduction of resolution.

Please provide the non-fluorescent channel images to understand cell topography.

__Our response:__

*We presented our microscopic images with the respective fluorescent channel and, for colocalization, with an additional merge. We did not present brightfield images, as the cell topography was already well visible by fluorescent signal close to the PM. Therefore, brightfield images would not provide any benefit. Since we will upload all original data to BioImage Archive for a detailed investigation of all our images, the data can be obtained if needed.*

Is the nuclear localization seen in transient expression (panels L-N) an artefact? If so, this needs to be mentioned in the text.

__Our response:__

      As explained in our responses to reviewers 1 and 2, fluorescence signals were detected within the nuclei of root cells of YFP-TTN5 plants, while immunostaining signals of HA3-TTN5 were not detected in the nucleus.

In an α-GFP Western blot using YFP-TTN5 Arabidopsis seedlings, we detected, besides the expected and strong 48 kDa YFP-TTN5 band, three additional weak bands ranging between 26 to 35 kDa (NEW Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins expressed from aberrant transcripts. α-HA Western blot controls performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size (Supplementary Figure S7D). We must therefore be cautious about nuclear TTN5 localization, and we rephrased the text carefully (starting Line 300):

„We also found multiple YFP bands in α-GFP Western blot analysis using YFP-TTN5 Arabidopsis seedlings. Besides the expected and strong 48 kDa YFP-TTN5 band, we observed three weak bands ranging between 26 to 35 kDa (Supplementary Figure S7C). We cannot explain the presence of these small protein bands. They might correspond to free YFP, to proteolytic products or potentially to proteins produced from aberrant transcripts with perhaps alternative translation start or stop sites. On the other hand, a triple hemagglutinin-tagged HA3-TTN5 driven by the 35S promoter did complement the embryo-lethal phenotype of ttn5-1 (Supplementary Figure S7D, E). α-HA Western blot control performed with plant material from HA3-TTN5 seedlings showed a single band at the correct size, but no band that was 13 to 18 kDa smaller (Supplementary Figure S7D). (...) We did not observe any staining in nuclei or ER when performing HA3-TTN5 immunostaining (Figure 3P; Figure 4A, B), as was the case for fluorescence signals in YFP-TTN5-expressing cells. Presumably, this can indicate that either the nuclear and ER signals seen with YFP-TTN5 correspond to the smaller proteins detected, as described above, or that immunostaining was not suited to detect them. Hence, we focused interpretation on patterns of localization overlapping between the fluorescence staining with YFP-labeled TTN5 and with HA3-TTN5 immunostaining, such as the particular signal patterns in the specific punctate membrane structures."

Fig. 4: In addition to the points made for Fig. 3, the authors should consider reducing gain/exposure to improve image clarity, especially for the punctate structures, which are difficult to observe in TTN5, likely because of the cytoplasmic localization as well.

__Our response:__

Thank you for this comment. We recorded image z-stacks and present single z-planes. Reducing the gain to decrease the cytoplasmic signal does not increase the clarity of the punctate structures, as the signal strength would become too weak. As mentioned above, we will upload all original image data to BioImage Archive for a detailed investigation of all our images without any reduction of resolution.


Reducing the Agrobacterial load could be considered. An OD of 0.4 is a bit much; 0.1 or even 0.05 could be tried. If available, try expression in N. tabacum, which is more amenable to microscopy. However, this is OPTIONAL; benthamiana should suffice.

__Our response:__

      Thank you for the suggestion. We are routinely using N. benthamiana leaf infiltration. When setting up this method at first, we did not observe different localization results by using different ODs of bacterial cultures. Hence, an OD600 of 0.4 is routinely used in our institute. This value is comparable with the literature although some literature reports even higher OD values for infiltration (Norkunas et al., 2018; Drapal et al., 2021; Zhang et al., 2020, Davis et al., 2020; Stephenson et al., 2018).
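As a side note on reproducibility of the infiltration protocol, reaching a given target OD600 from an overnight culture is simple proportionality (C1·V1 = C2·V2). A minimal sketch; the helper function and the culture OD values below are hypothetical illustrations, not taken from the manuscript:

```python
# Hypothetical helper: volume of Agrobacterium culture to dilute into
# infiltration buffer so the final suspension reaches a target OD600.
# Based on C1 * V1 = C2 * V2 (simple dilution; no growth assumed).
def culture_volume_ml(od_culture, od_target, final_volume_ml):
    """Return the culture volume (ml) to mix into final_volume_ml total."""
    if od_target > od_culture:
        raise ValueError("cannot reach a higher OD by dilution")
    return od_target / od_culture * final_volume_ml

# e.g. an overnight culture at OD600 = 2.0, diluted to OD600 = 0.4 in 10 ml:
print(culture_volume_ml(2.0, 0.4, 10.0))  # -> 2.0 (ml culture, topped up to 10 ml)
```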

A standard norm now is to establish the level of colocalization by quantifying a Pearson's or Mander's correlation, which I believe has been done in the text, but I didn't find a plot representing the same. Could the data (which the authors already have) be plotted along with "n" as a table or graph?

__Our response:__

*Please check our response to reviewer 1, comment 4.*

We would like to emphasize that we performed colocalization very carefully and quantified the data in three different manners. We would also like to state that there is no generally standardized procedure that best suits the idea of a colocalization pattern. Results of colocalization are represented in stem diagrams and table format, including statistical analysis. Colocalization was carried out with the ImageJ plugin JACoP for Pearson's and Overlap coefficients and based on the centroid method. The plotted Pearson's and Overlap coefficients are presented in bar diagrams in Supplementary Figure S8A and C, including statistics. The values obtained by the centroid method are represented in table format in Supplementary Figure S8B and D, which can be considered a standard method (see Ivanov et al., 2014).

Colocalization of two different fluorescence signals was performed for the two channels in a specifically chosen region of interest (indicating in % the overlapping signal versus the sum of signal for each channel). The differences between the YFP/mRFP and mRFP/YFP ratios indicate that a higher percentage of ARA7-RFP signal colocalizes with the YFP-TTN5Q70L signal than with the TTN5WT or TTN5T30N signals, while the YFP signals have a similar overlap with ARA7-positive structures. This is not a contradiction. We believe this answers the questions on colocalization well.

      Please note that upon acceptance for publication, we will upload all original colocalization data to BioImage Archive. Hence, the high-quality data can be reanalyzed by readers.
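For readers re-analyzing the archived images, the Pearson's coefficient reported via JACoP is, at its core, the standard correlation between per-pixel intensities of the two channels. A minimal pure-Python sketch for intuition (this is not the JACoP implementation, and the pixel intensity values below are hypothetical):

```python
# Illustrative only: Pearson's correlation between two fluorescence channels,
# each given as a flat list of pixel intensities from the same ROI.
from math import sqrt

def pearson(ch1, ch2):
    n = len(ch1)
    m1, m2 = sum(ch1) / n, sum(ch2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(ch1, ch2))
    var1 = sum((a - m1) ** 2 for a in ch1)
    var2 = sum((b - m2) ** 2 for b in ch2)
    return cov / sqrt(var1 * var2)

# Perfectly colocalizing channels give r = 1.0; uncorrelated ones approach 0.
yfp = [10, 52, 8, 90, 33, 71]  # hypothetical YFP pixel intensities
rfp = [12, 55, 9, 88, 30, 70]  # hypothetical mRFP pixel intensities
print(round(pearson(yfp, rfp), 3))  # -> 0.998 (strongly overlapping signals)
```

Unlike the channel-overlap percentages discussed above, Pearson's r is symmetric in the two channels, which is why the YFP/mRFP and mRFP/YFP ratios carry complementary information.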

The cartoons for the action of chemicals are useful, but need a bit more clarity.

__Our response:__

The schematic explanations of pharmacological treatments and expected outcomes are useful to readers. For better understanding, we added additional explanatory sentences to the figure legends (Figure 5E, M; Figure 6A). We also modified Figure 6A and the corresponding legend.

"(E), Schematic representation of GmMan1 localization at the ER upon brefeldin A (BFA) treatment. BFA blocks ARF-GEF proteins, which leads to a loss of Golgi cis-cisternae and the formation of BFA-induced compartments due to an accumulation of Golgi stacks, up to a redistribution of the Golgi to the ER by fusion of the Golgi with the ER (Renna and Brandizzi 2020)."

      "(M), Schematic representation of ARA7 localization in swollen MVBs upon wortmannin treatment. Wortmannin inhibits phosphatidylinositol-3-kinase (PI3K) function leading to the fusion of TGN/EE to swollen MVBs (Renna and Brandizzi 2020)."

      "(A), Schematic representation of progressive stages of FM4-64 localization and internalization in a cell. FM4-64 is a lipophilic substance. After infiltration, it first localizes in the plasma membrane, at later stages it localizes to intracellular vesicles and membrane compartments. This localization pattern reflects the endocytosis process (Bolte et al. 2004)."


Fig. 5: does the Q70L mutant show reduced endocytosis?

__Our response:__

We have not investigated this question. As detailed in our response to reviewer 1, we would like to emphasize that we fully agree that functional evidence would be interesting to assign roles for TTN5 in trafficking steps. A phenotype associated with TTN5T30N and TTN5Q70L would be clearly meaningful.

Concerning the aspect of colocalization of the mutants with the markers, we show in Figure 5C, D and G, H that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals colocalize with the Golgi marker GmMan1-mCherry. Figure 5K, L and O, P show that YFP-TTN5T30N- and YFP-TTN5Q70L-related signals can colocalize with the MVB marker, and this may affect relevant vesicle trafficking processes and plasma membrane protein regulation involved in root cell elongation.

*At present, we have not yet investigated perturbed cargo trafficking. These aspects are certainly interesting but require extensive work and testing of appropriate physiological conditions and appropriate cargo targets. We discuss future perspectives in the Discussion. We agree that such functional information is of great importance, but it needs to be clarified in future studies.*


The main text needs to be organized in a way that a reader can separate hypotheses/assumptions from actual results and conclusions (see lines #143-149).

__Our response:__

*Thank you for this comment. We reformulated the text throughout the manuscript.*

The text is repeated in multiple places; while I understand that this is not plagiarism, the repetitiveness makes it difficult to read and understand the text. I highlight a couple of examples here, but please check the whole text thoroughly and edit/delete as necessary. a. Lines #124-125 with Lines #149-151 and Lines #140-143

__Our response:__

*We checked the text and removed unnecessary repetitions.*


Could the authors elaborate on whether there are plant homologs of TTN5? Also, have other ARF/ARLs been compared to TTN5 beyond HsARF1?

__Our response:__

Phylogenetic trees of the ARF family in Arabidopsis in comparison to the human ARF family were already published by Vernoud et al. (2003). In this phylogenetic tree, ARF, ARL and SAR proteins of Arabidopsis are compared with the members in humans and S. cerevisiae. It is difficult to deduce whether the proteins are homologs or orthologs. In this setting, an ortholog of TTN5 may be HsARL2, followed by HsARL3. In Figure 1A, we represented some GTPases closely related in sequence to TTN5, namely HsARL2, HsARF1 and AtARF1, since they are among the best-studied ARF GTPases. HRAS is a well-known member of the RAS superfamily, which we used for kinetic comparison in Figure 2. We additionally compared published kinetics of RAC1, HsARF3, CDC42, RHOA, ARF6, RAD, GEM, and RAS GTPases.


On a related note, a major problem I have with these kinetic values is the assumption of significance or not. For e.g. Line #180: the values represent 2- and 6-fold increases; if these numbers do not matter, can a significance threshold be applied so as to understand how much fold-change is appreciable?

__Our response:__

      The kinetics of TTN5 and its two mutant variants can be compared with those of other studied GTPases. To provide a basis for the statements about differences in GTPase activities, we modified the text and added respective references in the text for comparisons of fold changes.

The new text is now as follows (Line 175-231):

„We next measured the dissociation (koff) of mdGDP and mGppNHp from the TTN5 proteins in the presence of excess amounts of GDP and GppNHp, respectively (Figure 2C) and found interesting differences (Figure 2D, E; Supplementary Figures S3G-I, S4F-H). First, TTN5WT showed a koff value (0.012 s-1 for mGDP) (Figure 2D; Supplementary Figure S3G), which was 100-fold faster than those obtained for classical small GTPases, including RAC1 (Haeusler et al. 2006) and HRAS (Gremer et al. 2011), but very similar to the koff value of HsARF3 (Fasano et al. 2022). Second, the koff values for mGDP and mGppNHp were in a similar range between TTN5WT (0.012 s-1 mGDP and 0.001 s-1 mGppNHp) and TTN5Q70L (0.025 s-1 mGDP and 0.006 s-1 mGppNHp), respectively, but the koff values differed 10-fold between the two nucleotides mGDP and mGppNHp in TTN5WT (koff = 0.012 s-1 versus koff = 0.001 s-1; Figure 2D, E; Supplementary Figure S3G, I, S4F, H). Thus, mGDP dissociated from proteins 10-fold faster than mGppNHp. Third, the mGDP dissociation from TTN5T30N (koff = 0.149 s-1) was 12.5-fold faster than that of TTN5WT and 37-fold faster than the mGppNHp dissociation of TTN5T30N (koff = 0.004 s-1) (Figure 2D, E; Supplementary Figure S3H, S4G). Mutants of CDC42, RAC1, RHOA, ARF6, RAD, GEM and RAS GTPases, equivalent to TTN5T30N, display decreased nucleotide binding affinity and therefore tend to remain in a nucleotide-free state in a complex with their cognate GEFs (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). Since TTN5T30N exhibits fast guanine nucleotide dissociation, these results suggest that TTN5T30N may also act in either a dominant-negative or fast-cycling manner as reported for other GTPase mutants (Fiegen et al. 2004, Wang et al. 2005, Fidyk et al. 2006, Klein et al. 2006, Soh and Low 2008, Sugawara et al. 2019, Aspenström 2020).

The dissociation constant (Kd) is calculated from the ratio koff/kon, which inversely indicates the affinity of the interaction between proteins and nucleotides (the higher the Kd, the lower the affinity). Interestingly, TTN5WT binds mGppNHp (Kd = 0.029 µM) 10-fold tighter than mGDP (Kd = 0.267 µM), a difference which was not observed for TTN5Q70L (Kd for mGppNHp = 0.026 µM, Kd for mGDP = 0.061 µM) (Figure 2D, E). The lower affinity of TTN5WT for mdGDP compared to mGppNHp brings us one step closer to the hypothesis that classifies TTN5 as a non-classical GTPase with a tendency to accumulate in the active (GTP-bound) state (Jaiswal et al. 2013). The Kd value for the mGDP interaction with TTN5T30N was 11.5-fold higher (3.091 µM) than for TTN5WT, suggesting that this mutant exhibited faster nucleotide exchange and lower affinity for nucleotides than TTN5WT. Similar to other GTPases with a T30N exchange, TTN5T30N may behave in a dominant-negative manner in signal transduction (Vanoni et al. 1999).

To get hints on the functionalities of TTN5 during the complete GTPase cycle, it was crucial to determine its ability to hydrolyze GTP. Accordingly, the catalytic rate of the intrinsic GTP hydrolysis reaction, defined as kcat, was determined by incubating 100 µM GTP-bound TTN5 proteins at 25°C and analyzing the samples at various time points using a reversed-phase HPLC column (Figure 2F; Supplementary Figure S5). The determined kcat values were quite remarkable in two respects (Figure 2G). First, all three TTN5 proteins, TTN5WT, TTN5T30N and TTN5Q70L, showed quite similar kcat values (0.0015 s-1, 0.0012 s-1, 0.0007 s-1; Figure 2G; Supplementary Figure S5). The GTP hydrolysis activity of TTN5Q70L was quite high (0.0007 s-1). This was unexpected because, as with most other GTPases, glutamine mutations at the corresponding position drastically impair hydrolysis, resulting in a constitutively active GTPase in cells (Hodge et al. 2020, Matsumoto et al. 2021). Second, the kcat value of TTN5WT (0.0015 s-1), although quite low as compared to other GTPases (Jian et al. 2012, Esposito et al. 2019), was 8-fold lower than the determined koff value for mGDP dissociation (0.012 s-1) (Figure 2E). This means that a fast intrinsic GDP/GTP exchange versus a slow GTP hydrolysis can have drastic effects on TTN5 activity in resting cells, since TTN5 can accumulate in its GTP-bound form, unlike classical GTPases (Jaiswal et al. 2013). To investigate this scenario, we pulled down GST-TTN5 protein from bacterial lysates in the presence of an excess amount of GppNHp in the buffer using glutathione beads and measured the nucleotide-bound form of GST-TTN5 using HPLC. As shown in Figure 2H, isolated GST-TTN5 increasingly binds GppNHp, indicating that the bound nucleotide is rapidly exchanged for free nucleotide (in this case GppNHp). This is not the case for classical GTPases, which remain in their inactive GDP-bound forms under the same experimental conditions (Walsh et al. 2019, Hodge et al. 2020)."

      Another issue with the kinetic measurements is the significance levels. Line #198-201: the three proteins are claimed to have similar values, and in the next line the Q70L mutant's value is claimed to be high.

      Our response:

      Please see our response and the corresponding text changes in our response to the previous comment 9. We have provided extra explanations and references to clarify why the kinetic behavior of TTN5 is unusual in several respects (Line 215-220).

      „First, all three TTN5 proteins, TTN5WT, TTN5T30N and TTN5Q70L, showed quite similar kcat values (0.0015 s-1, 0.0012 s-1, 0.0007 s-1; Figure 2G; Supplementary Figure S5). The GTP hydrolysis activity of TTN5Q70L was quite high (0.0007 s-1). This was unexpected because, as with most other GTPases, glutamine mutations at the corresponding position drastically impair hydrolysis, resulting in a constitutively active GTPase in cells (Hodge et al. 2020, Matsumoto et al. 2021)."

      Provide data for conclusion in line#214-215

      Our response:

      We agree that a reference should be added after this sentence to make it clearer (Line 228-231).

      "As shown in Figure 2H, isolated GST-TTN5 increasingly binds GppNHp, indicating that the bound nucleotide is rapidly exchanged for free nucleotide (in this case GppNHp). This is not the case for classical GTPases, which remain in their inactive GDP-bound forms under the same experimental conditions (Walsh et al. 2019, Hodge et al. 2020)."


      How were the mutants studied here identified? random mutation or was it directed based on qualified assumptions?

      Our response:

      We used the T30N and the Q70L point mutations as such types of mutants had been reported to confer specific phenotypes at these well-conserved amino acid positions in multiple other small GTPases (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). In particular, these positions affect the interaction between small GTPases and their respective guanine nucleotide exchange factor (GEF; T30N) or GTP hydrolysis (Q70L). We introduced the mutants and described their potential effect on the GTPase cycle in the introduction and cited exemplary literature. Please see also our response to comment 6 and the proposed text changes (Line 142-151).

      Could more simplification be provided for the definition of kon/koff values? And can these values be compared between mutants directly?

      Our response:

      We introduce kon and koff in the modified Figure 2D, E, and they are described in the figure legends. Moreover, we present the data for the calculations in Supplementary Figures S3 and S4, where again we define the values in the respective figure legends.


      Data provided are not convincing to claim that both the mutant forms have lower association with the Golgi.

      Our response:

      Our conclusion is that both YFP-TTN5 and YFP-TTN5Q70L fluorescence signals tend to colocalize more with the Golgi-marker signals compared to YFP-TTN5T30N signals as deduced from the centroid-based colocalization method (Line 404-405).

      "Hence, the GTPase-active TTN5 forms are likely more present at cis-Golgi stacks compared to TTN5T30N."

      The Pearson coefficients of all three YFP-TTN5 constructs were nearly identical, but we could identify differences in overlapping centers between the YFP and mCherry channels. 48 % of the GmMan1-mCherry fluorescent cis-Golgi stacks overlapped with the signal of YFP-TTN5Q70L, while for YFP-TTN5T30N an overlap of only 31 % was detected. This means that fewer cis-Golgi stacks colocalized with signals in the YFP-TTN5T30N mutant than in YFP-TTN5Q70L, which is the statement in our manuscript.


      In general, the authors should strongly reconsider the claims made in the manuscript. For example, "This study lays the foundation for studying the functional relationships of this small GTPase" (line 125) is unqualified, as this is true for every protein ever studied and published. Considering that TTN5 was not isolated/identified for the first time in this study, this claim doesn't stand.

      Our response:

      We reformulated the sentence (Line 123-124).

      "This study paves the way towards future investigation of the cellular and physiological contexts in which this small GTPase is functional."


      Line #185 - "characteristics of a dominant-negative...." What is this based on? From the text it is not clear what the parameters are. Considering that no complementation phenotypes have been presented, this is a far-fetched claim.

      Our response:

      Small GTPases are a well-studied protein family, and the T30N and Q70L mutations used here affect conserved amino acid positions that are commonly used for the characterization of Ras superfamily members. We added explanatory sentences with references to the text. The characteristics referred to in the above paragraph are based on the kinetic study.

      We modified the text as follows (Line 186-197):

      „Third, the mGDP dissociation from TTN5T30N (koff = 0.149 s-1) was 12.5-fold faster than that of TTN5WT and 37-fold faster than the mGppNHp dissociation of TTN5T30N (koff = 0.004 s-1) (Figure 2D, E; Supplementary Figure S3H, S4G). Mutants of CDC42, RAC1, RHOA, ARF6, RAD, GEM and RAS GTPases, equivalent to TTN5T30N, display decreased nucleotide binding affinity and therefore tend to remain in a nucleotide-free state in a complex with their cognate GEFs (Erickson et al. 1997, Ghosh et al. 1999, Radhakrishna et al. 1999, Jung and Rösner 2002, Kuemmerle and Zhou 2002, Wittmann et al. 2003, Nassar et al. 2010, Huang et al. 2013, Chang and Colecraft 2015, Fisher et al. 2020, Shirazi et al. 2020). Since TTN5T30N exhibits fast guanine nucleotide dissociation, these results suggest that TTN5T30N may also act in either a dominant-negative or fast-cycling manner as reported for other GTPase mutants (Fiegen et al. 2004, Wang et al. 2005, Fidyk et al. 2006, Klein et al. 2006, Soh and Low 2008, Sugawara et al. 2019, Aspenström 2020)."

      The claims in Line #224-227 are exaggerated. Please tone down or delete.

      Our response:

      We rephrased the sentence (Line 240-243).

      "Therefore, we propose that TTN5 exhibits the typical functions of a small GTPase based on in vitro biochemical activity studies, including guanine nucleotide association and dissociation, but emphasizes its divergence among the ARF GTPases by its kinetics."

      Line #488-489 - This conclusion is not really supported. At best the authors can claim that TTN5 is associated with trafficking components, but the functional relevance of this association is not determined.

      Our response:

      We toned down our statement (Line 604-608).

      „The colocalization of FM4-64-labeled endocytosed vesicles with fluorescence in YFP-TTN5-expressing cells may indicate that TTN5 is involved in endocytosis and the possible degradation pathway into the vacuole. Our data on colocalization with the different markers support the hypothesis that TTN5 may have functions in vesicle trafficking."

      Minor comments:

      Line #95 - "This rolein vesicle....." - please clarify which role?

      Our response:

      We rephrased the sentence (Line 96-99).

      „These roles of ARF1 and SAR1 in COPI and II vesicle formation within the endomembrane system are well conserved in eukaryotes which raises the question of whether other plant ARF members are also involved in functioning of the endomembrane system."

      Line #168 - "we did not observed" please change to "not able to measure/quantify".

      Our response:

      We changed the text accordingly (Line 169-171).

      „A remarkable observation was that we were not able to monitor the kinetics of mGppNHp association with TTN5T30N but observed its dissociation (koff = 0.026 s-1; Figure 2E)."

      Line #179 - ARF3: is it human or Arabidopsis?

      Our response:

      The study of Fasano et al., 2022 is based on human ARF3 and we added the information to the text (Line 180-181).

      "(...) very similar to the koff value of HsARF3 (Fasano et al. 2022)."

      • *

      Line #181 - compared to what is the 10-fold difference?

      Our response:

      The 10-fold difference is between the nucleotides mGDP and mGppNHp, for both TTN5WT and TTN5Q70L. We added the information on specific nucleotides to this sentence for a better understanding (Line 181-185).

      „Second, the koff values for mGDP and mGppNHp, respectively, were in a similar range between TTN5WT (0.012 s-1 mGDP and 0.001 s-1 mGppNHp) and TTN5Q70L (0.025 s-1 mGDP and 0.006 s-1 mGppNHp), respectively, but the koff values differed 10-fold between the two nucleotides mGDP and mGppNHp in TTN5WT (koff = 0.012 s-1 versus koff = 0.001 s-1; Figure 2D, E; Supplementary Figure S3G, I, S4F, H)."

      Lines #314-323 are difficult to understand; consider reframing. The same goes for the conclusion following these lines.

      Our response:

      We added an explanation to these sentences for a better understanding (Line 392-405).

      „We performed an additional object-based analysis to compare overlapping YFP fluorescence signals in YFP-TTN5-expressing leaves with GmMan1-mCherry signals (YFP/mCherry ratio) and vice versa (mCherry/YFP ratio). We detected 24 % overlapping YFP fluorescence signals for TTN5 with Golgi stacks, while in YFP-TTN5T30N and YFP-TTN5Q70L-expressing leaves, signals only shared 16 and 15 % overlap with GmMan1-mCherry-positive Golgi stacks (Supplementary Figure S8B). Some YFP signals did not colocalize with the GmMan1 marker. This effect appeared more prominent in leaves expressing YFP-TTN5T30N and less for YFP-TTN5Q70L, compared to YFP-TTN5 (Figure 5B-D). Indeed, we identified 48 % GmMan1-mCherry signal overlapping with YFP-positive structures in YFP-TTN5Q70L leaves, whereas 43 and only 31 % were present with YFP fluorescence signals in YFP-TTN5 and YFP-TTN5T30N-expressing leaves, respectively (Supplementary Figure S8B), indicating a smaller amount of GmMan1-positive Golgi stacks colocalizing with YFP signals for YFP-TTN5T30N. Hence, the GTPase-active TTN5 forms are likely more present at cis-Golgi stacks compared to TTN5T30N."

      Authors might consider a longer BFA treatment (3-4 h) to see clearer ER-Golgi fusion (BFA bodies).

      Our response:

      We performed additional BFA treatments for HA3-TTN5-expressing Arabidopsis seedlings followed by whole-mount immunostaining, and for YFP-TTN5-expressing Arabidopsis lines. In both experiments we observed the typical BFA bodies. We included the NEW data in NEW Figure 4B, C.

      **Referees cross-commenting**

      I agree with both my co-reviewers that the manuscript needs substantial improvement in its cell-biology-based experiments and the conclusions thereof. I think the consensus of all reviewers points to weaknesses in the in-planta experiments, which need to be addressed to understand and characterize TTN5, which is the main goal of the manuscript.

      Reviewer #3 (Significance (Required)):

      Significance: The manuscript has general significance in understanding the role of small GTPases which are understudied. Although the manuscript does not advance the field of either intracellular trafficking or organization it holds significance in attempting to characterize proteins involved, which is a prerequisite for further functional studies.

      Our response:

      Thank you for your detailed analysis of our manuscript and positive assessment. Our study is an advance in the plant vesicle trafficking field.

    1. Figure 2: Comparing Snap! or Scratch Repeat Until Loop to Java while loop

      Difference between a Repeat Until loop and a While Loop:

      Repeat Until Loop: Executes the code as long as the given condition is False; once the condition becomes True, it stops.

      While Loop: Executes the code as long as the given condition is True; once the condition becomes False, it stops.
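      The equivalence can be shown directly in Java: a Snap!/Scratch "repeat until (C)" behaves like a Java while loop with the negated condition, "while (!C)". A minimal sketch (the loop bounds are arbitrary examples):

```java
public class LoopEquivalence {
    public static void main(String[] args) {
        int n = 0;

        // Java while loop: the body runs as long as the condition is TRUE.
        while (n < 5) {
            n++;
        }
        System.out.println(n); // 5

        // A Snap!/Scratch "repeat until (n >= 10)" runs while the condition
        // is FALSE, so it translates to "while (!(n >= 10))" in Java.
        while (!(n >= 10)) {
            n++;
        }
        System.out.println(n); // 10
    }
}
```

      Both loops stop at the same point; only the sense of the condition is flipped.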

    1. If statements can be nested inside other if statements. Sometimes with nested ifs we find a dangling else that could potentially belong to either if statement. The rule is that the else clause will always be a part of the closest unmatched if statement in the same block of code, regardless of indentation.

      Good to know. No. Really important to know.

      Better to always use curly braces. As shown in the next highlight.
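      A minimal Java sketch of the rule (the classify helper and its thresholds are made up for illustration): the else binds to the nearest unmatched if, and braces remove the ambiguity.

```java
public class DanglingElse {
    static String classify(int x) {
        // Without braces, the else binds to the NEAREST unmatched if,
        // regardless of how the code is indented:
        if (x >= 0)
            if (x > 100)
                return "big";
            else               // belongs to "if (x > 100)", not "if (x >= 0)"
                return "small non-negative";
        return "negative";
    }

    static String classifyBraced(int x) {
        // With braces the intent is explicit and cannot be misread:
        if (x >= 0) {
            if (x > 100) {
                return "big";
            } else {
                return "small non-negative";
            }
        }
        return "negative";
    }

    public static void main(String[] args) {
        System.out.println(classify(-3));  // negative
        System.out.println(classify(7));   // small non-negative
    }
}
```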

    1. Students’ testimonials reveal how exclusionary beliefs interact with school practices such as dress codes, curriculum tracking, and narrow course cur-riculum to maintain raced-gendered inequality and sexualized policing that typecasts, limits, and recreates hierarchies.

      One experience similar to the sexualized policing of girls in high school: both Latinas and Black girls were directly impacted by this at my school. During the hottest days of the year, temperatures where our school was located would reach the 90s to 110s, and many girls would wear tank tops, shorts, skirts, and other pieces. But many of us would be dress coded for "showing too much skin". Our school had bad air conditioning, old classrooms, and little ventilation, so being in class was uncomfortable. However, the same was not said to our male counterparts who wore shorts. This demonstrated that the dress coding happened because we were being sexualized and therefore targeted. The only time we were supported was when a few teachers posted on their doors that they would not take part in dress coding because it was targeting and sexualizing girls.

    1. backslash escape sequence

      An escape sequence overrides the compiler's default handling of special characters, such as quotation marks and backslashes, that carry language-specific meaning in code. It allows the programmer to include these special characters inside string literals as part of the text itself.
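      A short Java illustration of common escape sequences inside string literals (\" for a quote, \\ for a backslash, \n for a newline, \t for a tab):

```java
public class Escapes {
    public static void main(String[] args) {
        // The backslash tells the compiler to treat the next character
        // literally (or as a control character) instead of giving it
        // its usual Java meaning:
        System.out.println("She said \"hello\"");   // embedded double quotes
        System.out.println("C:\\Users\\demo");      // a literal backslash
        System.out.println("line one\nline two");   // \n starts a new line
        System.out.println("col1\tcol2");           // \t inserts a tab
    }
}
```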

      The code Turtle t1 = null; creates a variable t1 that refers to a Turtle object, but the null means that it doesn't refer to an object yet. You could later create the object and set the object variable to refer to that new object (t1 = new Turtle(world1)). Or more commonly, you can declare an object variable and initialize it in the same line of code (Turtle t2 = new Turtle(world1);).

      World world1 = new World();
      Turtle t1 = null;
      t1 = new Turtle(world1);
      // declare and initialize t2
      Turtle t2 = new Turtle(world1);

      Summary:

      - You can declare an object variable without assigning a new object to it; the assignment can be done at a later stage in the code.
      - null here is just null, with no special contextual meaning. It literally just means nothing is there yet.

      from the mid-1990s onward, 00:45:24 the opposite trend, with a hardening of the legislation each time a right-wing majority returns to power and only a partial correction when the left 00:45:35 governs. Thus, in 1994, judicial detention was instituted, in other words police custody for minors under 13; in 1996, immediate trial and appearance before the juvenile 00:45:49 judge without prior investigation were permitted; in 2002, closed educational centers and penitentiary establishments for minors were created, and the age of criminal responsibility was lowered from 13 to 10, 00:46:01 authorizing sanctions much earlier in life (the juvenile criminal justice code would in fact restore the limit of 13 in 2021); in 2007, the exceptions 00:46:13 allowing the excuse of minority not to be applied to minors over 16 were broadened (these provisions would, however, be repealed in 2014, the full excuse of minority being 00:46:25 restored); also in 2007, the attenuation of the sentence for minors over 16 was abolished in the case of a second reoffense involving violence or sexual assault; in 00:46:38 2011, juvenile criminal courts were created to judge offenses punishable by more than 3 years of imprisonment committed by repeat offenders over 16 (they would, 00:46:50 however, be abolished in 2016); in 2019, home detention under electronic surveillance was allowed for minors over 13. Progressively, through 00:47:02 these back-and-forth movements I have described, the legislator erodes the protective principle of the 1945 ordinance, restricts the effects of the 00:47:14 presumption of non-discernment and of the excuse of minority, multiplies the places of confinement and the corresponding sentencing options, and brings juvenile criminal justice closer to adult criminal 00:47:27 justice. And you will certainly have noticed that this is a 
debate that is once again on the table today
    2. Schools therefore do not entirely escape the punitive moment. In reality, rather than asking whether discipline there is harsher or milder than in the past, we should ask how 00:35:28 discipline is constantly being reconfigured. The prohibition of corporal punishment, once prevalent, may thus be concomitant with the appearance of new grounds for sanction, notably under the principle of secularism as defined in 00:35:41 the law of March 15, 2004. Thus, for the 2022-2023 school year, 3,881 reports were transmitted to the Ministry of National Education, of which about half concern, I quote, "outfits that do 00:35:54 not by their nature manifest a religious affiliation," such as skirts or long dresses, according to the terms of the assessments carried out by the so-called academic "values of the 00:36:05 Republic" teams, or EAVR, and their 1,200 trainers. The number of corresponding sanctions, which can be disciplinary within the school and even penal under the Education Code, is unknown; the code provides for a fine of 00:36:19 €150, raised to €200 in the case of a repeat offense.
    1. Likewise, the telegraph was invented in 1830 and used initially to warn train stations when multiple trains were on the track. Telegraphs allowed almost instant communication over huge distances - they sent a series of electrical impulses over a wire as "long" and "short" signals. The inventor of the telegraph, Samuel Morse, invented a code based on those signals that could be translated into letters and, as a result, be used to send messages. Morse Code thus enabled the first modern mass communications device. This was the first time a message could travel faster than a messenger on horseback, vastly increasing the speed by which information could be shared and disseminated. Simultaneously, steamships were transforming long-distance commerce. The first sailed in 1816, going about twice as fast as the fastest sailing ship. This had obvious repercussions for trade, because it became cheaper to transport basic goods via steamship than to use locally-produced ones; this had huge impacts on agriculture and forestry, among other industries. Soon, it became economically viable to ship grain from the United States or Russia across oceans to reach European markets. The first transatlantic crossing was a race between two steamships going from England to New York in 1838; soon, sailing vessels became what they are today: archaic novelties.

      Morse code allowed for a quick and easy way of communicating, and steamships were cheaper and faster, which made imported goods a viable alternative to locally produced ones.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their constructive comments and suggestions. We have prepared a revised manuscript with updated quantification of theta cycle skipping, new statistical comparisons of the difference between the two behavioral tasks, and general improvements to the text and figures.

      Reviewer #1 (Public Review):

      Summary

      The authors provide very compelling evidence that the lateral septum (LS) engages in theta cycle skipping.

      Strengths

      The data and analysis are highly compelling regarding the existence of cycle skipping.

      Weaknesses

      The manuscript falls short in describing the behavioral or physiological importance of the witnessed theta cycle skipping, and there is a lack of attention to detail with some of the findings and figures:

      More/any description is needed in the article text to explain the switching task and the behavioral paradigm generally. This should be moved from only being in methods as it is essential for understanding the study.

      Following this suggestion, we have expanded the description of the behavioral tasks in the Results section.

      An explanation is needed as to how a cell can be theta skipping if it is not theta rhythmic.

      A cell that is purely theta skipping (i.e., always fires on alternating theta cycles and never on adjacent theta cycles) will only have enhanced power at half theta frequency and not at theta frequency. Such a cell will therefore not be considered theta rhythmic in our analysis. Note, however, that there is a large overlap between theta rhythmic and theta skipping cell populations in our data (Figure 3 - figure supplement 2), indicating that most cells are not purely theta skipping.

      The most interesting result, in my opinion, is the last paragraph of the entire results section, where there is more switching in the alternation task, but the reader is kind of left hanging as to how this relates to other findings. How does this relate to differences in decoding of relative arms (the correct or incorrect arm) during those theta cycles or to the animal's actual choice? Similarly, how does it relate to the animal's actual choice? Is this phenomenon actually behaviorally or physiologically meaningful at all? Does it contribute at all to any sort of planning or decision-making?

      We agree that the difference between the two behavioral tasks is very interesting. It may provide clues about the mechanisms that control the cycle-by-cycle expression of possible future paths and the potential impact of goal-directed planning and (recent) experience. In the revised manuscript, we have expanded the analysis of the differences in theta-cycle dynamics between the two behavioral tasks. First, we confirm the difference through a new quantification and statistical comparison. Second, we performed additional analyses to explore the idea that the alternation of non-local representations reflects the number of relevant paths available to the animal (Figure 11 – figure supplements 2 and 3), but this did not appear to be the case. However, these results provide a starting point for future studies to clarify the task dependence of the theta-cycle dynamics of spatial representations and to address the important question of behavioral/physiological relevance.

      The authors state that there is more cycle skipping in the alternation task than in the switching task, and that this switching occurs in the lead-up to the choice point. Then they say there is a higher peak at ~125 in the alternation task, which is consistent. However, in the final sentence, the authors note that "This result indicates that the representations of the goal arms alternate more strongly ahead of the choice point when animals performed a task in which either goal arm potentially leads to reward." Doesn't either arm potentially lead to a reward (but different amounts) in the switching task, not the alternation task? Yet switching is stronger in the alternation task, which is not consistent and contradicts this last sentence.

      The reviewer is correct that both choices lead to (different amounts of) reward in the switching task. As written, the sentence that the reviewer refers to is indeed not accurate and we have rephrased it to: “This result indicates that the representations of the goal arms alternate more strongly ahead of the choice point when animals performed a task in which either goal arm potentially leads to a desirable high-value reward.”.

      Additionally, regarding the same sentence - "representations of the goal arms alternate more strongly ahead of the choice point when the animals performed a task in which either goal arm potentially leads to reward." - is this actually what is going on? Is there any reason at all to think this has anything to do with reward versus just a navigational choice?

      We appreciate the reviewer’s feedback and acknowledge that our statement needs clarification. At the choice point in the Y-maze there are two physical future paths available to the animal (disregarding the path that the animal took to reach the choice point) – we assume this is what the reviewer refers to as “a navigational choice”. One hypothesis could be that alternation of goal arm representations is present whenever there are multiple future paths available, irrespective of the animal’s (learned) preference to visit one or the other goal arm. However, the reduced alternation of goal arm representations in the switching task that we report, suggests that the animal’s recent history of goal arm visits and reward expectations likely do influence the theta-cycle representations ahead of the choice point. We have expanded our analysis to test if theta cycle dynamics differ for trials before and after a switch in reward contingency in the switching task, but there was no statistical difference in our data. We have rewritten and expanded this part of the results to make our point more clearly.

      Similarly, the authors mention several times that the LS links the HPC to 'reward' regions in the brain, and it has been found that the LS represents rewarded locations comparatively more than the hippocampus. How does this relate to their finding?

      Indeed, Wirtshafter and Wilson (2020) reported that lateral septum cells are more likely to have a place field close to a reward site than elsewhere in their double-sided T-maze. It is possible that this indicates a shift towards reward or value representations in the lateral septum. In our study we did not look at reward-biased cells and whether they are more or less likely to engage in theta cycle skipping. This could be a topic for future analyses. It should be noted that the study by Wirtshafter and Wilson (2020) reports that a reward bias was predominantly present for place fields in the direction of travel away from the reward site. These reward-proximate LS cells may thus contribute to theta-cycle skipping in the inbound direction, but it is not clear if these cells would be active during theta sweeps when approaching the choice point in the outbound direction.

      Reviewer #2 (Public Review)

      Summary

      Recent evidence indicates that cells of the navigation system representing different directions and whole spatial routes fire in a rhythmic alternation during 5-10 Hz (theta) network oscillation (Brandon et al., 2013, Kay et al., 2020). This phenomenon of theta cycle skipping was also reported in broader circuitry connecting the navigation system with the cognitive control regions (Jankowski et al., 2014, Tang et al., 2021). Yet nothing was known about the translation of these temporally separate representations to midbrain regions involved in reward processing as well as the hypothalamic regions, which integrate metabolic, visceral, and sensory signals with the descending signals from the forebrain to ensure adaptive control of innate behaviors (Carus-Cadavieco et al., 2017). The present work aimed to investigate theta cycle skipping and alternating representations of trajectories in the lateral septum, neurons of which receive inputs from a large number of CA1 and nearly all CA3 pyramidal cells (Risold and Swanson, 1995). While spatial firing has been reported in the lateral septum before (Leutgeb and Mizumori, 2002, Wirtshafter and Wilson, 2019), its dynamic aspects have remained elusive. The present study replicates the previous findings of theta-rhythmic neuronal activity in the lateral septum and reports a temporal alternation of spatial representations in this region, thus filling an important knowledge gap and significantly extending the understanding of the processing of spatial information in the brain. The lateral septum thus propagates the representations of alternative spatial behaviors to its efferent regions. The results can instruct further research of neural mechanisms supporting learning during goal-oriented navigation and decision-making in the behaviourally crucial circuits entailing the lateral septum.

      Strengths

      To this end, cutting-edge approaches for high-density monitoring of neuronal activity in freely behaving rodents and neural decoding were applied. Strengths of this work include comparisons of different anatomically and probably functionally distinct compartments of the lateral septum, innervated by different hippocampal domains and projecting to different parts of the hypothalamus; large neuronal datasets including many sessions with simultaneously recorded neurons; consequently, the rhythmic aspects of the spatial code could be directly revealed from the analysis of multiple spike trains, which were also used for decoding of spatial trajectories; and comparisons of the spatial coding between the two differently reinforced tasks.

      Weaknesses

      Possible in principle, with the present data across sessions, longitudinal analysis of the spatial coding during learning the task was not performed. Without using perturbation techniques, the present approach could not identify the aspects of the spatial code actually influencing the generation of behaviors by downstream regions.

      Reviewer #3 (Public Review)

      Summary

      Bzymek and Kloosterman carried out a complex experiment to determine the temporal spike dynamics of cells in the dorsal and intermediate lateral septum during the performance of a Y-maze spatial task. In this descriptive study, the authors aim to determine if inputting spatial and temporal dynamics of hippocampal cells carry over to the lateral septum, thereby presenting the possibility that this information could then be conveyed to other interconnected subcortical circuits. The authors are successful in these aims, demonstrating that the phenomenon of theta cycle skipping is present in cells of the lateral septum. This finding is a significant contribution to the field as it indicates the phenomenon is present in neocortex, hippocampus, and the subcortical hub of the lateral septal circuit. In effect, this discovery closes the circuit loop on theta cycle skipping between the interconnected regions of the entorhinal cortex, hippocampus, and lateral septum. Moreover, the authors make 2 additional findings: 1) There are differences in the degree of theta modulation and theta cycle skipping as a function of depth, between the dorsal and intermediate lateral septum; and 2) The significant proportion of lateral septum cells that exhibit theta cycle skipping, predominantly do so during 'non-local' spatial processing.

      Strengths

      The major strength of the study lies in its design, with 2 behavioral tasks within the Y-maze and a battery of established analyses drawn from prior studies that have established spatial and temporal firing patterns of entorhinal and hippocampal cells during these tasks. Primary among these analyses, is the ability to decode the animal's position relative to locations of increased spatial cognitive demand, such as the choice point before the goal arms. The presence of theta cycle skipping cells in the lateral septum is robust and has significant implications for the ability to dissect the generation and transfer of spatial routes to goals within and between the neocortex and subcortical neural circuits.

      Weaknesses

      There are no major discernible weaknesses in the study, yet the scope and mechanism of the theta cycle phenomenon remain to be placed in the context of other phenomena indicative of spatial processing independent of the animal's current position. An example of this would be the ensemble-level 'scan ahead' activity of hippocampal place cells (Gupta et al., 2012; Johnson & Redish, 2007). Given the extensive analytical demands of the study, it is understandable that the authors chose to limit the analyses to the spatial and burst firing dynamics of the septal cells rather than the phasic firing of septal action potentials relative to local theta oscillations or CA1 theta oscillations. Yet, one would ideally be able to link, rather than parse, the phenomena of temporal dynamics. For example, Tingley et al. recently showed that there was significant phase coding of action potentials in lateral septum cells relative to spatial location (Tingley & Buzsaki, 2018). This raises the question of whether the non-uniform distribution of septal cell activity within the Y-maze may have a phasic firing component, as well as a theta cycle skipping component. If so, these phenomena could represent another means of information transfer within the spatial circuit during cognitive demands. Alternatively, these phenomena could be part of the same process, ultimately representing the coherent input of information from one region to another. Future experiments will therefore have to sort out whether theta cycle skipping is a feature of either rate or phase coding, or perhaps both, depending on circuit and cognitive demands.

      The authors have achieved their aims of describing the temporal dynamics of the lateral septum, at both the dorsal extreme and the intermediate region. All conclusions are warranted.

      Reviewer #1 (Recommendations For The Authors)

      The text states: "We found that 39.7% of cells in the LSD and 32.4% of cells in LSI had significantly higher CSI values than expected by chance on at least one of the trajectories." The text in the supplemental figure indicates a p-value of 0.05 was used to determine significance. However, four trajectory categories are being examined so a Bonferroni correction should be used (significance at p<0.0125).

      Indeed, a p-value correction for multiple tests should be performed when determining theta cycle skipping behavior for each of the four trajectories. We thank the reviewer for pointing out this oversight. We have implemented a Holm-Sidak p-value correction for the number of tested trajectories per cell (excluding trajectories with insufficient spikes). As a consequence, the number of cells with significant cycle-skipping activity decreased, but overall the results have not changed.
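      The step-down procedure the authors adopted can be sketched as follows (a minimal pure-Python illustration of the Holm-Šidák correction; the function name and example p-values are ours, not the authors' analysis code):

```python
def holm_sidak(p_values, alpha=0.05):
    """Holm-Sidak step-down correction for multiple tests.

    Returns a list of booleans marking which hypotheses are rejected
    while controlling the family-wise error rate at `alpha`.
    """
    m = len(p_values)
    # Test p-values from smallest to largest, remembering positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        # Sidak-adjusted threshold for the remaining m - step tests.
        threshold = 1.0 - (1.0 - alpha) ** (1.0 / (m - step))
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-significant test
    return reject

# Example: four trajectories tested for one cell (hypothetical p-values).
flags = holm_sidak([0.001, 0.2, 0.03, 0.04])
```

      For four tests at alpha = 0.05, the smallest p-value is compared against 1 - 0.95^(1/4) ≈ 0.0127, marginally less conservative than the Bonferroni threshold of 0.0125 suggested by the reviewer.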

      Figure 4 is very confusing as raster plots are displayed for multiple animals, but it is unclear which animal the LFP refers to. The bottom of the plot is also referenced twice in the figure caption.

      We apologize for the confusion. We have removed this figure in the revised manuscript, as it was not necessary to make the point about the spatial distribution of theta cycle skipping. Instead, we show examples of spatially-resolved cycle skipping in Figure 4 (formerly Figure 5 - supplementary figures 1 and 2) and we have added a plot with the spatially-resolved cycle skipping index for all analyzed cells in Figure 5A.

      Figure 6 has, I think, an incorrect caption or figure. Only A and B are marked in the figure, but A-G are mentioned in the caption and do not appear to correspond to anything in the figure.

      Indeed, the caption was outdated. This has now been corrected.

      Figure 8 is also confusing for several reasons: how is the probability scale on the right related to multiple semi-separate (top and middle) figures? In the top and bottom figures, it is not clear what the right and left sides refer to. It is also unclear why a probability of 0.25 is used for position (seems potentially low). The caption also mentions Figure A but there are no lettered "sub" figures in Figure 8.

      The color bar on the right applies to both the top plot (directional decoding) and the middle plot (positional decoding). However, the maximum probability that is represented by black differs between the top and middle plots. We acknowledge that a shared color bar may lead to confusion and we have given each of the plots a separate color bar.

      As for the maximum probability of 0.25 for position: this was a typo in the legend. The correct maximum value is 0.5. In general, the posterior probability will be distributed over multiple (often neighboring) spatial bins, and the distribution of maximum probabilities will depend on the number of spatial bins, the level of spatial smoothing in the decoding algorithm, and the amount of decodable information in the data. It would be more appropriate to consider the integrated probability over a small section of the maze, rather than the peak probability that is assigned to a single 5 cm bin. Also, note that a posterior probability of 0.5 is many times higher than the probability associated with a uniform distribution over the spatial bins.
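      The distinction between peak and integrated probability can be illustrated with a toy posterior (the bin probabilities below are hypothetical, not values from the paper):

```python
import numpy as np

# Hypothetical decoder posterior over ten 5 cm spatial bins (sums to 1).
posterior = np.array([0.02, 0.05, 0.18, 0.30, 0.25, 0.10, 0.05, 0.03, 0.01, 0.01])

# Peak probability assigned to a single bin.
peak_bin = int(np.argmax(posterior))
peak_prob = posterior[peak_bin]

# Integrated probability over a small section of the maze:
# the peak bin plus one neighbor on each side (15 cm in total).
lo, hi = max(peak_bin - 1, 0), min(peak_bin + 2, len(posterior))
integrated = posterior[lo:hi].sum()
```

      Here the single peak bin carries only 0.30 of the probability mass, while the 15 cm section around it carries 0.73, which is why the integrated probability is the more informative summary of decoding confidence.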

      The left and right sides of the plots represent two different journeys that the animal ran. On the left an outbound journey is shown, and on the right an inbound journey. We have improved the figure and the description in the legend to make this clearer.

      The reviewer is correct that there are no panels in Figure 8 and we have corrected the legend.

      Some minor concerns

      The introduction states that "a few studies have reported place cell-like activity in the lateral septum (Tingley and Buzsaki, 2018; Wirtshafter and Wilson, 2020, 2019)." However, notably and controversially, the Tingley study is one of the few studies to find NO place cell activity in the lateral septum. This is sort of mentioned later but the citation in this location should be removed.

      The reviewer is correct: Tingley and Buzsaki reported a spatial phase code but no spatial rate code. We have removed the citation.

      Stronger position/direction coding in the dLS is consistent with prior studies, and they should be cited in the text (not a novel finding).

      Thank you for pointing out this omission. Indeed, a stronger spatial coding in the dorsal lateral septum has been reported before, for example by Van der Veldt et al. (2021). We now cite this paper when discussing these findings.

      Why is the alternation task administered for 30 min but the switching task for 45 min?

      The reason is that rats received a larger reward in the switching task (in the high-reward goal arm) and took longer to complete trials on average. To obtain a more-or-less similar number of trials per session in both tasks, we extended the duration of switching task sessions to 45 minutes. We have added this explanation to the text.

      Regarding the percentage of spatially modulated cells in the discussion, it is also worth pointing out that the spatial information in bits/sec is consistent with previous studies.

      Thank you for the suggestion. We now point out that the spatial information in our data is consistent with previous studies.

      Reviewer #2 (Recommendations For The Authors)

      While the results of the study are robust and timely, further details of behavioural training, additional quantitative comparisons, and improvements in the data presentation would make the study more comprehensible and complete.

      Major comments

      (1) I could not fully comprehend the behavioural protocols. They require a clearer explanation of both the specific rationale of the two tasks as well as a more detailed presentation of the protocols. Specifically:

      (1.1) In the alternation task, were the arms baited in a random succession? How many trials were applied per session? Fig 1D: how could animals reach high choice accuracy if the baiting was random?

      We used a continuous version of the alternation task, in which the animals were rewarded for left→home→right and right→home→left visit sequences. In addition, animals were always rewarded on inbound journeys. There was no random baiting of goal arms. Perhaps the confusion stems from our use of the word “trial” to refer to a completed lap (i.e., a pair of outbound/inbound journeys). On average, animals performed 54 such trials per 30-minute session in the alternation task. We have expanded the description of the behavioral tasks in the Results and further clarified these points in the Methods section.

      (1.2) Were they rewarded for correct inbound trials? If there was no reward, why were they considered correct?

      Yes, rats received a reward at the home platform for correct inbound trials. We have now explicitly stated this in the text.

      (1.3) In the switch alternation protocol, for how many trials was one arm kept more rewarding than the other, and how many trials followed after the rewarding value switch?

      A switch was triggered when rats (of their own volition) visited the high-reward goal arm eight times in a row. Following a switch, the animals could complete as many trials as necessary until they visited the new high-reward goal arm in eight consecutive trials, which triggered another switch. As can be seen in Figure 1D, at the population level, animals needed ~13 trials to fully commit to the high-reward goal arm following a switch. We have further clarified the switching task protocol in the Results and Methods sections.

      (1.4) What does the phrase "the opposite arm (as 8 consecutive visits)" exactly mean? Sounds like 8 consecutive visits signalled that the arm was rewarded (as if were not predefined in the protocol).

      The task is self-paced and the animals initially visit both goal arms, before developing a bias for the high-reward goal arm. A switch of reward size was triggered as soon as the animal visited the high-reward goal arm for eight consecutive trials. We have rewritten the description of the switching task protocol, including this sentence, which hopefully clarifies the procedure.

      (1.5) P. 15, 1st paragraph, Theta cycle skipping and alternation of spatial representations is more prominent in the alternation task. Why in the switching task, did rats visit the left and right arms approximately equally often if one was more rewarding than the other? How many switches were applied per recording session, and how many trials were there in total?

      Both the left and right goal arms were sampled more or less equally by the animals because each goal arm was, at various times, associated with a large reward following switches in reward values during sessions. The number of switches per session varied from 1 to 3. Sampling of both goal arms was also evident at the beginning of each session and following each reward value switch, before animals switched their behavior to the (new) highly rewarded goal arm. In Table 1, we have now listed the number of trials and the number of reward-value switches for all sessions.

      (1.6) Is the goal arm in figures the rewarded/highly rewarded arm only or are non-baited arms also considered here?

      Both left and right arms are considered goal arms and were included in the analyses, irrespective of the reward that was received (or not received).

      (2) The spatial navigation-centred behavioural study design and the interpretation of results highlight the importance of the dorsal hippocampal input to the LS. Yet, the recorded LSI cells are innervated by intermediate and ventral aspects of the hippocampus, and the LS receives inputs from the amygdala and the prefrontal cortex, which together may bring about reward- and reward-prediction-related aspects in the firing of LS cells during spatial navigation, crucial for the adaptive behaviours regulated by the LS. Does success or failure to acquire reward in a trial modify spatial coding and cycle skipping of LSD vs. LSI cells in ensuing inbound and outbound trials?

      This is an excellent question and given the length of the current manuscript, we think that exploration of this question is best left for a future extension of our study.

      A related question: in Figure 10, it is interesting that cycle skipping is prominent in the goal arm for outbound switching trials and inbound trials of both tasks. Could it be analytically explained by task contingencies and behaviour (e.g. correct/incorrect trial, learning dynamics, running speed, or acceleration)?

      Our observation of cycle skipping at the single-cell level in the goal arms is somewhat surprising and, we agree with the reviewer, potentially interesting. However, it was not accompanied by alternation of representations at the population level. Given the current focus and length of the manuscript, we think further investigation of cycle skipping in the goal arm is better left for future analyses.

      (3) Regarding possible cellular and circuit mechanisms of cycle skipping and their relation to the alternating representations in the LS. Recent history of spiking influences the discharge probability; e.g. complex spike bursts in the hippocampus are associated with a post-burst delay of spiking. In LS, cycle skipping was characteristic of LS cells with high firing rates and was not uniformly present in all trajectories and arms. The authors propose that cycle skipping can be more pronounced in epochs of reduced firing, yet the opposite also seems possible: this phenomenon can be due to an intermittently increased drive onto some LS cells. Was there a systematic relationship between cycle skipping in a given cell and the concurrent firing rate or a recent discharge with short interspike intervals?

      In our discussion, we tried to explain the presence of theta cycle skipping in the goal arms at the single-cell level without corresponding alternation dynamics at the population level. We mentioned the possibility of a decrease in excitatory drive. As the reviewer suggests, an increase in excitatory drive combined with post-burst suppression or delay of spiking is an alternative explanation. We analyzed the spatial tuning of cells with theta cycle skipping and found that, on average, these cells have a higher firing rate in the goal arm than the stem of the maze in both outbound and inbound run directions (Figure 5 – figure supplement 1). In contrast, cells that do not display theta cycle skipping do not show increased firing in the goal arm. These results are more consistent with the reviewer’s suggested mechanism and we have updated the discussion accordingly.

      (4) Were the differences between the theta modulation (cycle skipping) of local vs. non-local representations (P.14, lines 10-12, "In contrast...", Figure 9A) and between the alternation vs. switching tasks (Figure 10 C,D) statistically significant?

      We have added quantification and statistical comparisons for the auto- and cross-correlations of the local/non-local representations. The results indeed show significantly stronger theta cycle skipping of the non-local representations as compared to the local representations (Figure 10 - figure supplement 1A), a stronger alternation of non-local representations in the outbound direction (Figure 10 - figure supplement 1B), and significant differences between the two tasks (Figure 11E,F).

      (5) Regarding the possibility of prospective coding in LS, is the accurate coding of run direction not consistent with prospective coding? Can the direction be decoded from the neural activity in the start arm? Are the cycling representations of the upcoming arms near the choice point equally likely or preferential for the then-selected arm?

      The coding of run direction (outbound or inbound) is distinct from the prospective/retrospective coding of the goal arm. As implemented, the directional decoding model does not differentiate between the two goal arms and accurate decoding of direction with this model cannot inform us whether or not there is prospective (or retrospective) coding. To address the reviewer’s comments, we performed two additional analyses. First, we analyzed the directional (outbound/inbound) decoding performance as a function of location in the maze (Figure 6 - figure supplement 3E). The results show that directional decoding performance is high in both stem and goal arms. Second, we analyzed how well we can predict the trajectory type (i.e., to/from the left or right goal arm) as a function of location in the maze, and separately for outbound and inbound trajectories (Figure 6 - figure supplement 3C,D). The results show that on outbound journeys, decoding the future goal arm is close to chance when the animals are running along the stem. The decoding performance goes up around the choice point and reaches the highest level when animals are in the goal arm.

      (6) Figure 10 seems to show the same or similar data as Figures 5 (A,B) and 9 (C,D).

      Figure 10 (figure 11 in revised manuscript) re-analyzes the same data as presented in Figures 5 and 9, but separates the experimental sessions according to the behavioral task. We now explicitly state this.

      Minor comments

      (1) If cycle skipping in the periodicity of non-local representations was more prominent in alternation than in the switching task, one might expect them to be also prominent in early trials of the switching task, when the preference of a more rewarding arm is not yet established. Was this the case?

      The reviewer makes an interesting suggestion. Indeed, if theta cycle skipping and the alternation of non-local representations reflect that there are multiple paths that the animal is considering, one may predict that the theta skipping dynamics are similar between the two tasks in early trials (as the reviewer suggests). Similarly, one may predict that in the switching task, the alternation of non-local representations is weaker immediately before a reward contingency switch (when the animal has developed a bias towards the goal arm with a large reward) as compared to after the switch.

      We have now quantified the theta cycle dynamics of spatial representations in the early trials in each session of both tasks (Figure 11 - figure supplement 2) and in the trials before and after each switch in the switching task (Figure 11 - figure supplement 3).

      The results of the early trial analysis indicate stronger alternation of non-local representations in the alternation task than in the switching task (consistent with the whole session analysis), which is contrary to the prediction.

      The pre-/post-switch analysis did not reveal a significant difference between the trials before and after a reward contingency switch. If anything, there was a trend towards stronger theta cycle skipping/alternation in the trials before a switch, which would be opposite to the prediction.

      These results do not appear to support the idea that the alternation of non-local representations reflects the number of relevant paths available to the animal. We have updated the text to incorporate these new data and discuss the implications.

      (2) Summary: sounds like the encoding of spatial information and its readout in the efferent regions are equally well established.

      Thank you for pointing this out.

      (3) Summary: "motivation and reward processing centers such as the ventral tegmental area." How about also mentioning here the hypothalamus, which is a more prominent output of the lateral septum than the VTA?

      We have now also mentioned the hypothalamus.

      (4) "lateral septum may contribute to the hippocampal theta" - readers not familiar with details of the medial vs. lateral septum research may misinterpret the modest role of LS in theta compared to MS.

      We have added “in addition to the strong theta drive originating from the medial septum” to make clear that the lateral septum has a modest role in hippocampal theta generation.

      (5) "(Tingley and Buzsáki, 2018) found a lack of spatial rate coding in the lateral septum and instead reported a place coding by specific phases of the hippocampal theta rhythm (Rizzi-Wise and Wang, 2021) " needs rephrasing.

      Thank you, we have rephrased the sentence.

      (6) Figure 4 is a bit hard to generalize. The authors may additionally consider a sorted raster presentation of the dataset in this main figure.

      We have removed this figure in the revised manuscript, as it was not necessary to make the point about the location of theta cycle skipping. Instead, we show examples of spatially-resolved cycle skipping in Figure 4 (formerly Figure 5 - supplementary figures 1 and 2), and, following the reviewer’s suggestion, we have added a plot with the spatially-resolved cycle skipping index for all analyzed cells (Figure 5A).

      (7) It would help if legends of Figure 5 (and related supplementary figures) state in which of the two tasks the data was acquired, as it is done for Figure 10.

      Thank you for the suggestion. The legends of Figure 4A,B (formerly Figure 5 – supplemental figures 1 and 2) and Figure 5 now include in which behavioral task the data was acquired.

      (8) Page 10, "Spatial coding...", 1st paragraph. Citing the initial report by Leutgeb and Mizumori would be appropriate here too.

      The reviewer is correct. We have added the citation.

      (9) The legend in Figure 6 (panels A-G) does not match the figure (only panels A,B). What is shown in Fig. 6B? The legend does not seem to fully match.

      Indeed, the legend was outdated. This has now been corrected.

      (10) Figure 7 supplement, if extended to enable comparisons, could be a main figure. Presently, Figure 7C does not account for the confounding effect of population size and is therefore difficult to interpret without complex comparisons with the Supplementary Figure, which is revealing per se.

      We thank the reviewer for their suggestion. We have changed Figure 7 such that it only shows the analysis of decoding performed with all LSD and LSI cells. Figure 7 – supplemental figure 1 has been transformed into main Figure 8, with the addition of a panel to show a statistical comparison between decoding performance in LSD and LSI with a fixed number of cells.

      (11) Page 14, line 10: there is no Figure 8A.

      This has been corrected.

      (12) Page 15, paragraph 1: is the model discussed here the one from Kay et al.?

      From Kay et al. (2020) and also Wang et al. (2020). We have added the citations.

      (13) Figure 5 - Figure Supplement 1 presents a nice analysis that, in my view, can merit a main figure. I could not find a description of the colour code in the CSI panels; does grey/red refer to non-significant/significant points?

      Indeed, grey and red refer to non-significant and significant points, respectively. We have clarified the color code in the figure legend. Following the reviewer’s suggestion, we have made Figure 5 Supplements 1 and 2 a main figure (Figure 4).

      (14) Figure 5 - Figure Supplement 2. Half of the cells (255 and 549) seem not to be representative of the typically high CSI in the goal arm in left and right inbound trials combined (Figure 5A). Were the changes in CSI in the right and left inbound trials similar enough to be combined in Fig 5A? Otherwise, considering left and right inbound runs separately and trying to explain where the differences come from would seem to make sense.

      Figure 5 – figure supplement 2 is now part of the new main Figure 4. Originally, the examples were from a single session and the same cells as shown in the old Figure 4. However, since the old Figure 4 has been removed, we have selected examples from different sessions and both left/right trajectories that are more representative of the overall distribution. We have further added a plot with the spatially-resolved cycle skipping for all analyzed cells in Figure 5A.

      (15) In the second paragraph of the Discussion, dorso-ventral topography of hippocampal projections to the LS (Risold and Swanson, Science, 90s) could be more explicitly stated here.

      Thank you for the suggestion. We have now explicitly mentioned the dorsal-ventral topography of hippocampal-lateral septum projections and cite Risold & Swanson (1997).

      (16) Discussion point: why do the differences in spatial information of cells in the ventral/intermediate vs. dorsal hippocampus not translate into similarly prominent differences in LSI vs. LSD?

      In our data, we do observe clear differences in spatial coding between LSD and LSI. Specifically, cell activity in the LSD is more directional, has higher goal arm selectivity, and higher spatial information (we have now added statistical comparisons to Figure 6 – figure supplement 1). As a result, spatial decoding performance is much better for LSD cell populations than LSI cell populations (see updated Figure 8, with statistical comparison of decoding performance). Spatial coding in the LS is not as strong as in the hippocampus, likely because of the convergence of hippocampal inputs, which may give the impression of a less prominent difference between the two subregions.

      (17) Discussion, last paragraph: citation of the few original anatomical and neurophysiological studies would be fitting here, in addition to the recent review article.

      Thank you for the suggestion. We have added selected citations of the original literature.

      (18) Methods, what was the reference electrode?

      We used an external reference electrode that was soldered to a skull screw, which was positioned above the cerebellum. We have added this to the Methods section.

      (19) Methods, Theta cycle skipping: bandwidth = Gaussian kernel parameter?

      The bandwidth is indeed a parameter of the Gaussian smoothing kernel and is equal to the standard deviation.
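      For readers unfamiliar with this convention, a minimal sketch of such a kernel (our own illustration; the parameter values are arbitrary, not those used in the paper):

```python
import numpy as np

def gaussian_kernel(bandwidth, dt, n_sigmas=4):
    """Discrete Gaussian smoothing kernel.

    `bandwidth` is the standard deviation of the Gaussian, in the same
    units as the sample interval `dt`. The kernel is truncated at
    +/- `n_sigmas` standard deviations and normalized to sum to 1.
    """
    half_width = int(np.ceil(n_sigmas * bandwidth / dt))
    t = np.arange(-half_width, half_width + 1) * dt
    kernel = np.exp(-0.5 * (t / bandwidth) ** 2)
    return kernel / kernel.sum()

# Smoothing a spike-count histogram (1 ms bins, 10 ms bandwidth; units: ms).
counts = np.zeros(100)
counts[50] = 1.0  # a single spike
smoothed = np.convolve(counts, gaussian_kernel(10, 1), mode="same")
```

      Because the kernel is normalized to unit sum, smoothing spreads each spike over neighboring bins while preserving the total spike count.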

      Reviewer #3 (Recommendations For The Authors)

      Below I offer a short list of minor comments and suggestions that may benefit the manuscript.

      (A) I was not able to access the Open Science Framework Repository. Can this be rectified?

      Thank you for checking the OSF repository. The data and analysis code are now publicly available.

      (B) In the discussion the authors should attempt to flesh out whether they can place theta cycle skipping into context with left/right sweeps or scan ahead phenomena, as shown by the Redish lab.

      Thank you for the excellent suggestion. We have now added a discussion of the possible link between theta cycle skipping and the previously reported scan-ahead theta sweeps.

      (C) What is the mechanism of cycle skipping? This could be relevant to intrinsic vs network oscillator models. Reference should also be made to the Deshmukh model of interference between theta and delta (Deshmukh, Yoganarasimha, Voicu, & Knierim, 2010).

      We had discussed a potential mechanism in the discussion (2nd to last paragraph in the revised manuscript), which now includes a citation of a recent computational study (Chu et al., 2023). We have now also added a reference to the interference model in Deshmukh et al, 2010.

      (D) Little background was given for the motivation and expectation for potential differences between the comparison of the dorsal and intermediate lateral septum. I don't believe that this is the same as the dorsal/ventral axis of the hippocampus, but if there's a physiological justification, the authors need to make it.

      We have added a paragraph to the introduction to explain the anatomical and physiological differences across the lateral septum subregions that provide our rationale for comparing dorsal and intermediate lateral septum (we excluded the ventral lateral septum because the number of cells recorded in this region was too low).

      (E) It would help to label "outbound" and "inbound" on several of the figures. All axes need to be labeled, with appropriate units indicated.

      We have carefully checked the figures and added inbound/outbound labels and axes labels where appropriate.

      (F) In Figure 6, the legend doesn't match the figure.

      Indeed, the legend was outdated. This has now been corrected.

      (G) The firing rate was non-uniform across the Y-maze. Does this mean that the cells tended to fire more in specific positions of the maze? If so, how would this affect the result? Would increased theta cycle skipping at the choice point translate to a lower firing rate at the choice point? Perhaps less overdispersion of the firing rate (Fenton et al., 2010)?

      Individual cells indeed show a non-uniform firing rate across the maze. To address the reviewer’s comment and test if theta cycle skipping cells were active preferentially near the choice point or other locations, we computed the mean-corrected spatial tuning curves for cell-trajectory pairs with and without significant theta cycle skipping. This additional analysis indicates that, on average, the population of theta cycle skipping cells showed a higher firing rate in the goal arms than in the stem of the maze as compared to non-skipping cells for outbound and inbound directions (shown in Figure 5 - figure supplement 1).

      (H) As mentioned above, it could be helpful to look at phase preference. Was there an increased phase preference at the choice point? Would half-cycle firing correlate with an increased or decreased phase preference? Based on prior work, one would expect increased phase preference, at least in CA1, at the choice point (Schomburg et al., 2014). In contrast, other work might predict phasic preference according to spatial location (Tingley & Buzsaki, 2018). Including phase analyses is a suggestion, of course. The manuscript is already sufficiently novel and informative. Yet, the authors should state why phase was not analyzed and that these questions remain for follow-up analyses. If the authors did analyze this and found negative results, it should be included in this manuscript.

      We thank the reviewer for their suggestion. We have not yet analyzed the theta phase preference of lateral septum cells or other relations to the theta phase. We agree that this would be a valuable extension of our work, but prefer to leave it for future analyses.

      (I) One of the most important aspects of the manuscript is that there is now evidence of theta cycle skipping in the circuit loop between the EC, CA1, and LS. This now creates a foundation for circuit-based studies that could dissect the origin of route planning. Perhaps the authors should state this? In the same line of thinking, how would one determine whether theta cycle skipping is necessary for route planning as opposed to a byproduct of route planning? While this question is extremely complex, other studies have shown that spatial navigation and memory are still possible during the optogenetic manipulation of septal oscillations (Mouchati, Kloc, Holmes, White, & Barry, 2020; Quirk et al., 2021). However, pharmacological perturbation or lesioning of septal activity can have a more profound effect on spatial navigation (Bolding, Ferbinteanu, Fox, & Muller, 2019; Winson, 1978). As a descriptive study, I think it would be helpful to remind the readers of these basic concepts.

      We thank the reviewer for their comment and for pointing out possible future directions for linking theta cycle skipping to route planning. Experimental manipulations to directly test this link would be very challenging, but worthwhile to pursue. We now mention how circuit-based studies may help to test if theta cycle skipping in the broader subcortical-cortical network is necessary for route planning. Given that the discussion is already quite long, we decided to omit a more detailed discussion of the possible role of the medial septum (which is the focus of the papers cited by the reviewer).

      Very minor points

      (A) In the introduction, "one study" begins the sentence but there is a second reference.

      Thank you, we have rephrased the sentence.

      (B) Also in the introduction, it could be helpful to have an operational definition of theta cycle skipping (i.e., 'enhanced rhythmicity at half theta frequency').

      We followed the reviewer’s suggestion.

(C) The authors should be more explicit in the introduction about their main question: theta cycle skipping exists in CA1, and some of the explanations mentioned in the discussion (i.e., attractor states of multiple routes) could be imported into the introduction. The main question is then whether this phenomenon, and others from CA1, translate to the output in LS.

      We have edited the introduction to more clearly state the main question of our study, following the suggestion from the reviewer.

      (D) There are a few instances of extra closing parentheses.

      We checked the text but did not find instances of erroneous extra closing parentheses. There are instances of nested parentheses, which may have given the impression that closing parentheses were duplicated.

      (E) The first paragraph of the Discussion lacks sufficient references.

      We have now added references to the first paragraph of the discussion.

      (F) At the end of the 2nd paragraph in the Discussion, the comparison is missing. More than what? It's not until the next reference that one can assume that the authors are referring to a dorsal/ventral axis. However, the physiological motivation for this comparison is lacking. Why would one expect a dorsal/intermediate continuum for theta modulation as there is along the dorsal/ventral axis of the hippocampus?

      Thank you for spotting this omission. We have rewritten the paragraph to more clearly make the parallel between dorsal-ventral gradients in the lateral septum and hippocampus and how this relates to the topographical connections between the two structures.

    2. Reviewer #2 (Public Review):

      Summary

Recent evidence indicates that cells of the navigation system representing different directions and whole spatial routes fire in a rhythmic alternation during 5-10 Hz (theta) network oscillation (Brandon et al., 2013, Kay et al., 2020). This phenomenon of theta cycle skipping was also reported in broader circuitry connecting the navigation system with the cognitive control regions (Jankowski et al., 2014, Tang et al., 2021). Yet nothing was known about the translation of these temporally separate representations to midbrain regions involved in reward processing as well as the hypothalamic regions, which integrate metabolic, visceral, and sensory signals with the descending signals from the forebrain to ensure adaptive control of innate behaviors (Carus-Cadavieco et al., 2017). The present work aimed to investigate theta cycle skipping and alternating representations of trajectories in the lateral septum, neurons of which receive inputs from a large number of CA1 and nearly all CA3 pyramidal cells (Risold and Swanson, 1995). While spatial firing has been reported in the lateral septum before (Leutgeb and Mizumori, 2002, Wirtshafter and Wilson, 2019), its dynamic aspects have remained elusive. The present study replicates the previous findings of theta-rhythmic neuronal activity in the lateral septum and reports a temporal alternation of spatial representations in this region, thus filling an important knowledge gap and significantly extending the understanding of the processing of spatial information in the brain. The lateral septum thus propagates the representations of alternative spatial behaviors to its efferent regions. The results can instruct further research of neural mechanisms supporting learning during goal-oriented navigation and decision-making in the behaviourally crucial circuits entailing the lateral septum.

      Strengths

      To this end, cutting-edge approaches for high-density monitoring of neuronal activity in freely behaving rodents and neural decoding were applied. Strengths of this work include comparisons of different anatomically and probably functionally distinct compartments of the lateral septum, innervated by different hippocampal domains and projecting to different parts of the hypothalamus; large neuronal datasets including many sessions with simultaneously recorded neurons; consequently, the rhythmic aspects of the spatial code could be directly revealed from the analysis of multiple spike trains, which were also used for decoding of spatial trajectories; and comparisons of the spatial coding between the two differently reinforced tasks.

      Weaknesses

      Without using perturbation techniques, the present approach could not identify the aspects of the spatial code actually influencing the generation of behaviors by downstream regions.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Recommendations For The Authors):

      In this revision the authors address some of the key concerns, including clarification of the balanced nature of the RL driven pitch changes and conducting analyses to control for the possible effects of singing quantity on their results. The paper is much improved but still has some sources of confusion, especially around Fig. 4, that should be fixed. The authors also start the paper with a statistically underpowered minor claim that seems unnecessary in the context of the major finding. I recommend the authors may want to restructure their results section to focus on the major points backed by sufficient n and stats.

      Major issues.

(1) The results section begins very weakly - a negative result based on n=2 birds and then a technical mistake of tube clogging re-spun as an opportunity to peek at intermittent song in the otherwise muted birds. The logic may be sound but these issues detract from the main experiment, result, analysis, and interpretation. I recommend re-writing this section to home in on, from the outset, the well-powered results. How much is really gained from the n=2 birds that were muted before ANY experience? These negative results may not provide enough data to make a claim. Nor is this claim necessary to motivate what was done in the next 6 birds. I recommend dropping the claim?

      We thank the reviewer for the recommendation. We moved the information to the Methods.

      (2) Fig. 4 is very important yet remains very confusing, as detailed below.

Fig. 4a. Can the authors clarify if the cohort of WNd birds that give rise to the positive result in Fig. 4 ever experienced the mismatch in the absence of ongoing DAF reinforcement pre-deafening? Neither Fig. 4a nor the text clearly specifies this. This is important because we know that there are day-timescale delays in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway (Andalman and Fee, 2009). Thus, if birds experienced mismatch pre-deafening in the absence of DAF, then an early learning phase in Area X could be set in place. Then deafening occurs, but these weight changes in X could result in LMAN bias that expresses only days later, independent of auditory feedback. Such a process would not require an internal model as the authors are arguing for here. It would simply arise from delays in implementing reinforcement-driven feedback. If the birds in Fig. 4 always had DAF on before deafening, then this is not an issue. But if the birds had hours of singing with DAF off before deafening, and therefore had the opportunity to associate DA error signals with the targeted time in the song (e.g., pauses on the far-from-target renditions; Duffy et al., 2022), then the return-to-baseline would be expected to be set in place independent of auditory feedback. Please clarify exactly whether the pitch-contingent DAF was on or off in the WNd cohort in the hours before deafening. In Fig. 3b it looks like the answer is yes but I cannot find this clearly stated in the text.

We did not provide DAF-free singing experience to the birds in Fig. 4 before deafening. Thus, by the reviewer's own reasoning, the concern does not apply.

      Note that we disagree with the reviewer’s premise that there is ‘day timescale delay in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway’. More recent data reveals immediate consolidation of the anterior forebrain bias without a night-time effect (Kollmorgen, Hahnloser, Mante 2020; Tachibana, Lee, Kai, Kojima 2022). Thus, the single bird in (Andalman and Fee 2009) seems to be somewhat of an outlier.

Hearing birds can experience the mismatch regardless of whether they experience DAF-free singing (provided their song was sufficiently shifted): even the renditions followed by white noise can be assessed with regard to their pitch mismatch, so that DAF imposes no limitation on mismatch assessment.

We disagree with the reviewer's claim that no internal model would be needed in case consolidation was delayed in Area X. If, indeed, Area X stores the needed change and it takes time to implement this change in LMAN, then we would interpret the change in Area X as the plan that birds would be able to implement without auditory feedback. Because pitch can either revert (after DAF stops) or shift further away (when DAF is still present), there is no rigid delay involved in recovering the target, but rather a flexible decision-making process of implementing the plan, which in our view amounts to using a model.

      Fig 4b. Early and Late colored dots in legend are both red; late should be yellow? Perhaps use colors that are more distinct - this may be an issue of my screen but the two colors are difficult to discern.

We used colors from yellow to red to distinguish different birds, not early and late. We modified the markers to improve visual clarity: early is indicated with round markers and late with crosses.

      Fig 4b. R, E, and L phases are only plotted for 4c; not in 4b. But the figure legend says that R, E and L are on both panels.

      In Fig. 4b E and L are marked with markers because they are different for different birds. In Fig. 4c the phases are the same for all birds and thus we labeled them on top. We additionally marked R in Fig. 4b as in Fig. 4c.

Fig 4e. Did the color code switch? In the rest of Fig. 4, dLO is red and WNd is blue. Then in 4e it swaps. Is this a typo in the caption? Or are the colors switched? Please fix this; it's very confusing.

      Thank you for pointing out the typo in the caption. We corrected it.

The y axes in Fig. 4d-e are both in std of pitch change - yet they have different ylim, which makes it visually difficult to compare by eye. Is there a reason for this? Can the authors make the ylim the same for Fig. 4d-e?

      We added dashed lines to clarify the difference in ylim.

Fig. 4d-e is really the main positive finding of the paper. Can the authors show an example bird that showcases this positive result, plotted as in Fig. 3b? This will help the audience clearly visualize the raw data that go into the d' analyses and get a more intuitive sense of the magnitude of the positive result.

      We added example birds to figure 4, one for WNd and one for dLO.

      Please define 'late' in Fig.4 legend.

      Done

      Minor

Define NRP in the text with an example. Is an NRP of 100 where the bird was before the withdrawal of reinforcement?

We added the following sentence to the results:

"We quantified recovery in terms of NRP to discount for differences in the amount of initial pitch shift, where NRP = 0% corresponds to complete recovery and NRP = 100% corresponds to pitch values before withdrawal of reinforcement (R), and thus no recovery."
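
      For concreteness, this definition can be written as a small function. The linear interpolation below is our illustrative reading of the two endpoints given above (0% = baseline, 100% = pitch at R), not necessarily the exact implementation used in the paper:

```python
def nrp(pitch, baseline, pitch_at_R):
    """Normalized residual pitch (NRP) in percent.

    100% = pitch unchanged from its value at withdrawal of
    reinforcement (R); 0% = complete recovery back to baseline.
    """
    return 100.0 * (pitch - baseline) / (pitch_at_R - baseline)

# Hypothetical bird shifted from 600 Hz (baseline) to 650 Hz at R:
print(nrp(650, 600, 650))  # 100.0 -> no recovery yet
print(nrp(610, 600, 650))  # 20.0  -> mostly recovered
print(nrp(600, 600, 650))  # 0.0   -> complete recovery
```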

      Reviewer #3 (Recommendations For The Authors):

      The use of "hierarchically lower" to refer to the flexible process is confusing to me, and possibly to many readers. Some people think of flexible, top-down processes as being _higher_ in a hierarchy. Regardless, it doesn't seem important, in this paper, to label the processes in a hierarchy, so perhaps avoid using that terminology.

      We reformulated the paragraph using ‘nested processes’ instead of hierarchical processes.

      In the statement "a seeming analogous task to re-pitching of zebra finch song, in humans, is to modify developmentally learned speech patterns", a few suggestions: it is not clear whether "re-pitching" refers to planning or feedback-dependent learning (I didn't see it introduced anywhere else). And if this means planning, then it is not clear why this would be analogous to "humans modifying developmentally learned speech patterns". As you mentioned, humans are more flexible at planning, so it seems re-pitching would _not_ be analogous (or is this referring to the less flexible modification of accents?).

      We changed the sentence to:

      "Thus, a seeming analogous task to feedback-dependent learning of zebra finch song, in humans, is to modify developmentally learned speech patterns."

    1. Video summary [00:00:20] - [00:22:58]:

      This video presents a lecture by Luc PELLISSIER on the complexity of legal texts and their computational treatment. He explores notions of complexity from theoretical computer science, the simplification and codification of law, and how these processes affect the accessibility and intelligibility of the law. He also discusses the different eras of codification in France and the impact of computerization on legislation.

      Highlights:
      + [00:00:20] Introduction and context
        * Presentation of the topic and of the importance of complexity in computer science
        * Link between legal education and complexity
      + [00:01:15] Complexity and law
        * Exploration of complexity in theoretical computer science
        * Absence of the notion of complexity from the legal literature
      + [00:04:27] Simplification and codification
        * Discussion of the simplification of law as a constitutional objective
        * Relation between simplification and codification in French legal history
      + [00:07:04] The three ages of codification in France
        * Analysis of the different codification methods from the Consulate to the present day
        * Impact of computerization on codification and legislation
      + [00:17:19] Law as a computational system
        * Proposal of a new perspective on law as a computational object
        * Example of the payment of supplementary teaching hours in French universities

      Video summary [00:23:00] - [00:44:29]:

      The video presents a detailed analysis of legal texts as computational objects, exploring the complexity of legislative and regulatory amendments and their impact on the consolidation of statutes.

      Highlights:
      + [00:23:00] Classification of legislative provisions
        * Distinction between substantive provisions and those that call for action by other authorities.
        * Concrete examples of legislative amendments and their direct or indirect effect on the real world.
        * Discussion of the complexity of texts that modify other texts.
      + [00:25:02] Precise vs. general amendments
        * Comparison between amendments that specify exactly which text to change and those that are broader and less specific.
        * Impact of general amendments on the clarity and interpretation of texts.
        * Example of the reform of the Conseil national des universités and its implications.
      + [00:31:02] Citations and references in legislative texts
        * Use of verbatim citations to link different codes, such as the Code de l'éducation and the Code de la santé publique.
        * Problems posed by amendments that are not consolidated and leave the end user to interpret the text.
      + [00:37:00] Consequences of unconsolidated amendments
        * Difficulties encountered by users of the law because of automatic amendments not reflected in the consolidated texts.
        * Questions raised about the simplification of law and the accessibility of current legal information.

      Video summary [00:44:31] - [01:03:14]:

      Part 3 of the video addresses the complexity of the legal text as a computational object, focusing on the processes of amendment, consolidation, and codification. Luc Pellissier explores the theoretical and practical challenges of managing the versions of a statute and the need for a formal approach to understanding amendments and their impact on the structure of the text.

      Highlights:
      + [00:44:31] Theory of versioning
        * Discussion of managing the versions of a file or of a law
        * Importance of explicit modifications for a correct theory
      + [00:46:11] Analogy with free software
        * Comparison between the legal text and the source code of a program
        * The role of compilation in understanding software
      + [00:49:00] Epistemological questions
        * Debate on the neutrality of law and the impact of research hypotheses
        * Link between the simplification of law and democratic quality
      + [00:55:01] Development of versioning software
        * Specification of software for managing versions of the law
        * Challenges of building a proper theory for legal "spaghetti code"
      + [01:01:03] Formal structure of the legal text
        * Visualization of a statute as a tree with branches and modifications
        * Impact of computing tools on the precision of legislative amendments

    2. I'll 00:46:13 close with an analogy, before finishing, which is that of the beginnings of the free software movement. Because so far I have spoken a little as if there were an absolutely incredible gap in the case of law, namely that the text 00:46:27 we write is not the text we apply. But in fact that is true of all software. Fundamentally, software is written in a certain form, called the source code, by human experts, 00:46:40 and then what I execute on my machine is not that; it is another version, the binary, and there is a transformation process between the two that is generally called compilation.
    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides an important cell atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The work, which offers solid insights into cellular responses to starvation stress and molecular mechanisms behind deep-sea chemosymbiosis, is of relevance to scientists interested in host-symbiont relationships across ecosystems.

      Public Reviews:

      Reviewer #1 (Public Review):

      Wang et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang et al sample mussels from 3 different environments: animals from their native methane-rich environment, animals transplanted to a methane-poor environment to induce starvation, and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the upregulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them.

      Strengths:

      This paper makes available a high-quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and the collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors do an excellent job of making all their data and analysis available, making this not only an important dataset but a readily accessible and understandable one.

      The authors also use a diverse array of tools to explore their data. For example, the quality of the data is augmented by the use of in situ hybridizations to validate cluster identity and KEGG analysis provides key insights into how the transcriptomes of bacteriocytes change.

      The authors also do a great job of providing diagrams and schematics to help orient non-mussel experts, thereby widening the audience of the paper.

We thank the reviewer for the valuable feedback on our study. We are grateful that the reviewer found our work to be interesting, and we appreciate their thorough evaluation of our research. Their constructive comments will be considered as we continue to develop and improve our study.

      Weaknesses:

      One of the main weaknesses of this paper is the lack of coherence between the images and the text, with some parts of the figures never being referenced in the body of the text. This makes it difficult for the reader to interpret how they fit in with the author's discussion and assess confidence in their analysis and interpretation of data. This is especially apparent in the cluster annotation section of the paper.

      We appreciate the feedback and suggestions provided by the reviewer, and we have revised our manuscript to make it more accessible to general audiences.

      Another concern is the linking of the transcriptomic shifts associated with starvation with changes in interactions with the symbiotes. Without examining and comparing the symbiote population between the different samples, it cannot be concluded that the transcriptomic shifts correlate with a shift to the 'milking' pathway and not other environmental factors. Without comparing the symbiote abundance between samples, it is difficult to disentangle changes in cell state that are due to their changing interactions with the symbiotes from other environmental factors.

      We are grateful for the valuable feedback and suggestions provided by the reviewer. Our keen interest lies in understanding symbiont responses, particularly at the single-cell level. However, it's worth noting that existing commercial single-cell RNA-seq technologies rely on oligo dT priming for reverse transcription and barcoding, thus omitting bacterial gene expression information from our dataset. We hope that advancements in technology will soon enable us to perform an integrated analysis encompassing both host and symbiont gene expression.

      Additionally, conclusions in this area are further complicated by using only snRNA-seq to study intracellular processes. This is limiting since cytoplasmic mRNA is excluded and only nuclear reads are sequenced after the organisms have had several days to acclimate to their environment and major transcriptomic shifts have occurred.

We appreciate the comments shared by the reviewer and agree that scRNA-seq provides more comprehensive transcriptional information by targeting the entire mRNA of the cell. However, we would like to highlight that snRNA-seq has some unique advantages over scRNA-seq. Notably, snRNA-seq allows for simple snap-freezing of collected samples, facilitating easier storage, particularly for samples obtained during field trips involving deep-sea animals and other ecologically significant non-model animal samples. Additionally, unlike scRNA-seq, snRNA-seq eliminates the need for tissue dissociation, which often involves prolonged enzymatic treatment of deep-sea animal tissue/cells under atmospheric pressure. This process can potentially lead to the loss of sensitive cells or alterations in gene expression. Moreover, snRNA-seq procedures disregard the size and shape of animal cells, rendering it a superior technology for constructing the cell atlas of animal tissues. Consequently, we assert that snRNA-seq offers flexibility and represents a suitable choice for the objectives of our current research.

      Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single-nucleus techniques to a non-model, deep-sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep-sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design. In this area, I would appreciate more in-depth discussion of these impacts when interpreting the data.

We thank the reviewer for their valuable feedback on our study. We are grateful that the reviewer found our work interesting, and we appreciate their thorough evaluation of our research. We will consider their constructive comments as we continue to develop and improve our study.

Because cells from multiple individuals were combined before sequencing, the in situ transplantation experiment lacks clear biological replicates. This may potentially result in technical variation (i.e., batch effects) confounding biological variation, directly impacting the interpretation of observed changes between the Fanmao, Reconstitution, and Starvation conditions. It is notable that Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. It is not clear whether this is due to a technical factor impacting sequencing or whether these numbers are the result of the unique biology of Fanmao cells. Furthermore, from Table S19 it appears that while 98% of Fanmao cells survived doublet filtering, only ~40% and ~70% survived for the Starvation and Reconstitution conditions respectively, suggesting some kind of distinction in quality or approach.

      There is a pronounced divergence in the relative proportions of cells per cell type cluster in Fanmao compared to Reconstitution and Starvation (Fig. S11). This is potentially a very interesting finding, but it is difficult to know if these differences are the expected biological outcome of the experiment or the fact that Fanmao cells are much more sparsely sampled. The study also finds notable differences in gene expression between Fanmao and the other two conditions- a key finding is that bacteriocytes had the largest Fanmao-vs-starvation distance (Fig. 6B). But it is also notable that for every cell type, one or both comparisons against Fanmao produced greater distances than comparisons between Starvation and Reconstitution (Fig. 6B). Again, it is difficult to interpret whether Fanmao's distinctiveness from the other two conditions is underlain by fascinating biology or technical batch effects. Without biological replicates, it remains challenging to disentangle the two.

      As highlighted by the reviewer, our experimental design involves pooling multiple biological samples within a single treatment state before sequencing. We acknowledge the concern regarding the absence of distinct biological replicates and the potential impact of batch effects on result interpretation. While we recognize the merit of conducting multiple sequencing runs for a single treatment to provide genuine biological replicates, we contend that batch effects may not exert a strong influence on the observed patterns.

In addition, we applied a bootstrap sampling algorithm to assess whether the gene expression patterns within a cluster are more similar than those between clusters. This algorithm involves selecting a portion of cells per cluster and examining whether this subset remains distinguishable from other clusters. Our assumption was that if different samples exhibited distinct expression patterns due to batch effects, the co-assignment probabilities of a cluster would be very low. This was not observed in our data, as illustrated in Fig. S2. The lack of significantly low co-assignment probabilities within clusters suggests that batch effects may not exert a strong influence on our results.
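
      In outline, a bootstrap co-assignment check of this kind can be sketched as follows (a simplified nearest-centroid version on synthetic data; not the exact code used in the study):

```python
import numpy as np

def coassignment(X, labels, n_boot=50, frac=0.8, seed=0):
    """Bootstrap co-assignment probability per cluster.

    Repeatedly subsample `frac` of the cells, fit cluster centroids on
    the subsample, and record how often the held-out cells of each
    cluster are still nearest to their own centroid. Consistently low
    values would suggest a cluster is not reproducible (e.g., driven
    by batch effects).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    hits = {c: [] for c in clusters}
    for _ in range(n_boot):
        mask = rng.random(len(labels)) < frac
        cents = np.stack([X[mask & (labels == c)].mean(axis=0) for c in clusters])
        d = np.linalg.norm(X[~mask, None, :] - cents[None, :, :], axis=2)
        assigned = clusters[d.argmin(axis=1)]
        for c in clusters:
            sel = labels[~mask] == c
            if sel.any():
                hits[c].append((assigned[sel] == c).mean())
    return {c: float(np.mean(v)) for c, v in hits.items()}

# Two well-separated synthetic "cell clusters":
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 5)), rng.normal(6, 1, (200, 5))])
labels = np.array([0] * 200 + [1] * 200)
probs = coassignment(X, labels)
```

      Well-separated, reproducible clusters yield co-assignment probabilities near 1; batch-driven clusters would not.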

      Indeed, we acknowledge a noticeable shift in the expression patterns of certain cell types, such as the bacteriocyte. However, this is not universally applicable across all cell types. For instance, the UMAP figure in Fig. 6A illustrates a substantial overlap among basal membrane cell 2 from Fanmao, Starvation, and Reconstitution treatments, and the centroid distances between the three treatments are subtle, as depicted in Fig. 6B. This consistent pattern is also observed in DEPC, smooth muscle cells, and the food groove ciliary cells.
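
      The centroid-distance comparison in Fig. 6B can be illustrated on synthetic data (hypothetical clusters; the actual analysis operates on the per-cell-type expression profiles of the three treatments):

```python
import numpy as np

def centroid_distance(X_a, X_b):
    """Euclidean distance between the centroids of two groups of cells
    (e.g., one cell type under two treatments) in expression space."""
    return float(np.linalg.norm(X_a.mean(axis=0) - X_b.mean(axis=0)))

rng = np.random.default_rng(2)
fanmao = rng.normal(0.0, 1.0, (100, 20))   # reference condition
starved = rng.normal(0.5, 1.0, (100, 20))  # shifted expression profile
overlap = rng.normal(0.0, 1.0, (100, 20))  # overlapping profile
d_shift = centroid_distance(fanmao, starved)
d_same = centroid_distance(fanmao, overlap)
```

      A large distance (as for bacteriocytes between Fanmao and Starvation) indicates a shifted transcriptional state, while overlapping clusters such as basal membrane cell 2 yield small distances.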

      The reviewer also noted variations in the number of cells per treatment. Specifically, Fanmao sequencing yielded fewer than 10 thousand cells, whereas the other two treatments produced 2-3 times more cells after quality control (QC). It is highly probable that the technician loaded different quantities of cells into the machine for single-nucleus sequencing—a not uncommon occurrence in this methodology. While loading more cells may increase the likelihood of doublets, it is crucial to emphasize that this should not significantly impact the expression patterns post-QC. It's worth noting that overloading samples has been employed as a strategic approach to capture rare cell types, as discussed in a previous study (reference: 10.1126/science.aay0267).

      The reviewer highlighted the discrepancy in cell survival rates during the 'doublet filtering' process, with 98% of Fanmao cells surviving compared to approximately 40% and 70% for the Starvation and Reconstitution conditions, respectively. It's important to clarify that the reported percentages reflect the survival of cells through a multi-step QC process employing various filtering strategies.

      Post-doublet removal, we filtered out cells with <100 or >2500 genes and <100 or >6000 unique molecular identifiers (UMIs). Additionally, genes with <10 UMIs in each data matrix were excluded. The observed differences in survival rates for Starvation and Reconstitution cells can be attributed to the total volume of data generated in Illumina sequencing. Specifically, we sequenced approximately 91 GB of data for Fanmao, ~196 GB for Starvation, and ~249 GB for Reconstitution. As a result, the qualified data obtained for Starvation and Reconstitution conditions was only about twice that of Fanmao due to the limited data volume.
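The thresholds listed above can be expressed as a small count-matrix filter (a hedged NumPy sketch for illustration; our actual analysis used the BD pipeline, and the function name, matrix orientation (cells x genes), and boundary handling here are our assumptions):

```python
import numpy as np

def qc_filter(counts, min_genes=100, max_genes=2500,
              min_umis=100, max_umis=6000, min_gene_umis=10):
    """Count-matrix QC as described in the text (counts: cells x genes).
    Cells are kept with min_genes-max_genes detected genes and
    min_umis-max_umis total UMIs; genes are kept with at least
    min_gene_umis UMIs summed over all cells."""
    genes_per_cell = (counts > 0).sum(axis=1)   # detected genes per cell
    umis_per_cell = counts.sum(axis=1)          # total UMIs per cell
    cell_mask = ((genes_per_cell >= min_genes) & (genes_per_cell <= max_genes)
                 & (umis_per_cell >= min_umis) & (umis_per_cell <= max_umis))
    gene_mask = counts.sum(axis=0) >= min_gene_umis
    return counts[cell_mask][:, gene_mask], cell_mask, gene_mask
```

The survival rate is then simply `cell_mask.mean()` per sample, which is how the percentages quoted by the reviewer arise downstream of doublet removal.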

      The reviewer also observed a divergence in the relative proportions of cells per cell type cluster in Fanmao compared to Reconstitution and Starvation, as depicted in Fig. S1. This discrepancy may reflect genuine biological differences, presenting a potentially intriguing finding. However, our discussion of this pattern was rather brief, as we acknowledge that the observed differences could be influenced by the sample preparation process for dissection and digestion. It is crucial to consider that cutting a slightly different area during dissection may result in variations in the proportion of cells obtained. While we recognize the potential impact of this factor, we do not think that the sparsity of sampling alone could significantly affect the relative proportions of cells per cell type.

      In conclusion, we acknowledge the reviewer's suggestion that sequencing multiple individual samples per treatment condition would have been ideal, rather than pooling them together. However, the homogenous distribution observed in UMAP and the consistent results obtained from bootstrap sampling suggest that the impact of batch effects on our analyses is likely not substantial. Additionally, based on our understanding, the smaller number of cells in the Fanmao sample should not significantly affect the relative proportions of cells or the expression patterns in each cluster.

      Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand the fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing and validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate a diversity of cell types that support the structure and function of the gill, including bacteriocytes (specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts) as well as a suite of other cell types, including supportive, ciliary, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume their endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      We appreciate the valuable feedback provided by the reviewer on our study. It is encouraging to know that our work was found to be interesting and that they conducted a thorough evaluation of our research. We will take their constructive comments into account as we strive to develop and enhance our study. We thank the reviewer for all the input.

      The one particular area for clarification and improvement surrounds the concept of a proliferative progenitor population within the gill. The authors imply that three types of proliferative cells within gills have long been known, but their study may be the first to recover molecular markers for these putative populations. The markers the authors present for gill posterior end budding zone cells (PEBZCs) and dorsal end proliferation cells (DEPCs) are not intuitively associated with cell proliferation and some additional exploration of the data could be performed to strengthen the argument that these are indeed proliferative cells. The authors do utilize a trajectory analysis tool called Slingshot which they claim may suggest that PEBZCs could be the origin of all gill epithelial cells, however, one of the assumptions of this analysis is that differentiated cells are developed from the same precursor PEBZC population.

      However, these conclusions do not detract from the overall significance of the work of identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work of symbiosis in cnidarians may converge on similar principles or there may be independent ways in which organisms have been able to solve these problems.

      We are grateful for the valuable comments and suggestions provided by the reviewer. All suggestions have been carefully considered, and the manuscript has been revised accordingly. We particularly value the reviewer's insights regarding the characterization of the G. platifrons gill proliferative cell populations. In a separate research endeavor, we have conducted experiments utilizing both cell division and cell proliferation markers on these proliferative cell populations. While these results are not incorporated into the current manuscript, we would be delighted to share our preliminary findings with the reviewer. Our preliminary results indicate that the proliferative cell populations exhibit positivity for cell proliferation markers and contain a significant number of mitotic cells.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Further experiments are needed to link the changes in transcriptomes of Bathymodioline mussels in the different environmental conditions to changes in their interactions with symbiotes. For example, quantifying the abundance and comparing the morphology of symbiotes between the environmental conditions would lend much support for shifting between milking and farming strategies. Without analyzing the symbiotes and comparing them across populations, it is difficult to comment on the mechanisms of interactions between symbiotes and the hosts. Without this analysis, this data is better suited towards comments about the general effect of environmental perturbation and stress on gene expression in these mussels.

      We appreciate the reviewer’s comments. We are also very curious about the symbiont responses, especially at the single-cell level. However, all current commercial single-cell RNA-seq technologies are based on oligo-dT priming for reverse transcription and barcoding. Therefore, bacterial gene expression information is omitted from our dataset. Hopefully, as the technology develops, we will soon be able to conduct an integrated analysis of both host and symbiont gene expression.

      Additionally, clarification is needed on which types of symbiotes are being looked at. Are they MOX or SOX populations? Are they homogenous? What are the concentrations of sulfur at the sampled sites?

      We thank you for your valuable comments and suggestions. Gigantidas platifrons harbors a MOX endosymbiont population characterized by a single 16S rRNA phylotype. We apologize for any confusion resulting from our previous wording. To clarify, we have revised lines 57-59 of our introduction.

      In the text and images, consider using standardized gene names and leaving out the genome coordinates. This would greatly help with readability. Also, be careful to properly follow gene naming and formatting conventions (ie italicizing gene names and symbols).

      We appreciate the reviewer’s insightful comments. In model animals, gene nomenclature often stems from forward genetic approaches, such as the identification of loss-of-function mutants. These gene names, along with their protein products, typically correspond to unique genome coordinates. Conversely, in non-model invertebrates (e.g., Gigantidas platifrons in the present study), gene prediction relies on a combination of bioinformatics methods, including de novo prediction, homolog-based prediction, and transcriptomics mapping. Subsequently, the genes are annotated by identifying their best homologs in well-characterized databases. Given that different genes may encode proteins with similar annotated functions, we chose to include both the gene ID (genome coordinates) and the gene name in our manuscript. This dual labeling approach ensures that our audience receives accurate and comprehensive information regarding gene identification and annotation.

      Additionally, extending KEGG analysis to the atlas annotation section could help strengthen the confidence of annotations. For example, when identifying bacteriocyte populations, the functional categories of individual marker genes (lysosomal proteases, lysosomal traffic regulators, etc) are used to justify the annotation. Presenting KEGG support that these functional categories are upregulated in this population relative to others would help further support how you characterize this cluster by showing it's not just a few specific genes that are enriched in this cell group, but rather an overall functionality.

      We appreciate the valuable suggestion provided by the reviewer. Indeed, incorporating KEGG analysis into the atlas annotation section could further enhance the confidence in our annotations. However, in our study, we encountered some limitations that impeded us from conducting a comprehensive KEGG enrichment analysis.

      Firstly, the number of differentially expressed genes (DEGs) that we identified for certain cell populations was relatively small, making it challenging to meet the threshold required for meaningful KEGG enrichment analysis. For instance, among the 97 marker genes identified for the Bacteriocyte cluster, only two genes, Bpl_scaf_59648-4.5 (lysosomal alpha-glucosidase-like) and Bpl_scaf_52809-1.6 (lysosomal-trafficking regulator-like isoform X1), were identified as lysosomal genes. To generate reliable KEGG enrichments, a larger number of genes is typically required.

      Secondly, single-nucleus sequencing, as employed in our study, tends to yield a relatively smaller number of genes per cell compared to bulk RNA sequencing. This limited gene yield can make it challenging to achieve sufficient gene representation for rigorous KEGG enrichment analysis.

      Furthermore, many genes in the genome still lack comprehensive annotation, both in terms of KEGG and GO annotations. In our dataset, out of the 33,584 genes obtained through single-nuclei sequencing, 26,514 genes have NO KEGG annotation, and 25,087 genes have NO GO annotation. This lack of annotations further restricts the comprehensive application of KEGG analysis in our study.

      The claim that VEPCs are symbiote free is not demonstrated. Additional double in situs are needed to show that markers of this cell type localize in regions free of symbiotes.

      We appreciate your comments and suggestions. In Figure 5B, our results demonstrate that the bacteriocytes (green fluorescent signal) are distant from the VEPCs, which are located around the tip of the gill filaments (close to the food groove). We have revised our Figure 5B to make it clear.

      Additionally, it does not seem like trajectory analysis is appropriate for these sampling conditions. Generally, to create trajectories confidently, more closely sampled time points are needed to sufficiently parse out the changes in expression. More justification is needed for the use of this type of analysis here and a discussion of the limitations should be mentioned, especially when discussing the hypotheses relating to PEBZCs, VEPCs, and DEPCs.

      We greatly appreciate your thoughtful commentary. It is important to acknowledge that in the context of a developmental study, incorporating more closely spaced time points indeed holds great value. In our ongoing project investigating mouse development, for instance, we have implemented time points at 24-hour intervals. However, in the case of deep-sea adult animals, we hypothesized a slower transcriptional shift in such extreme environment, which led us to opt for a time interval of 3-7 days. Examining the differential expression profiles among the three treatments, we observed that most cell types exhibited minimal changes in their expression profiles. For the cell types strongly impacted by in situ transplantation, their expression profiles per cell type still exhibited highly overlap in the UMAP analysis (Figure 6a), thus enabling meaningful comparisons. Nevertheless, we recognize that our sampling strategy may not be flawless. Additionally, the challenging nature of conducting in situ transplantation in 1000-meter depths limited the number of sampling occasions available to us. We sincerely appreciate your input and understanding.

      Finally, more detail should be added on the computational methods used in this paper. For example, the single-cell genomics analysis protocol should be expanded on so that readers unfamiliar with BD single-cell genomics handbooks could replicate the analysis. More detail is also needed on what criteria and cutoffs were used to calculate marker genes. Also, please be careful to cite the algorithms and software packages mentioned in the text.

      Acknowledged, thank you for highlighting this. In essence, the workflow closely resembles that of the 10x Genomics workflow (despite the use of a different software, i.e., Cell Ranger). We better explain the workflow below, and also noting that this information may no longer be relevant for newer users of BD or individuals who are not acquainted with BD, given that the workflow underwent a complete overhaul in the summer of 2023.

      References to lines

      Line 32: typo "..uncovered unknown tissue heterogeny" (should read "uncovering" or "and uncovered")

      Overall abstract could include more detail of findings (ex: what are the "shifts in cell state" in line 36 that were observed)

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 60: missing comma "...gill filament structure, but also"

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 62-63: further discussion here, or in the relevant sections of the specific genes identified in the referenced bulk RNA-seq project could help strengthen confidence in annotation

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 112: what bootstrapping strategy? Applied to what?

      This is a bootstrap sampling algorithm to assess the robustness of each cell cluster, developed in a recent bioRxiv preprint (Singh, P. & Zhai, Y. Deciphering Hematopoiesis at single cell level through the lens of reduced dimensions. bioRxiv, 2022.2006.2007.495099 (2022). https://doi.org:10.1101/2022.06.07.495099).

      Lines 127-129: What figures demonstrate the location of the inter lamina cells? Are there in situs that show this?

      We apologize for any errors; the referencing of figures in the manuscript has been revised for clarity.

      Lines 185-190: does literature support these as markers of SMCs? Are they known smooth muscle markers in other systems?

      We characterized the SMCs by the expression of LDL-associated protein, angiotensin-converting enzyme-like protein, and the "molecular spring" titin-like protein, all of which are commonly found in human vascular smooth muscle cells. Based on this analysis, we hypothesize that these cells belong to the smooth muscle cell category.

      Line 201: What is meant by "regulatory roles"?

      In this context, we are discussing the expression of genes encoding regulatory proteins, such as SOX transcription factors and secreted-frizzled proteins.

      Line 211: which markers disappeared? What in situs show this?

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 211: typo, "role" → "roll"

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 214: what are these "hallmark genes"

      We apologize for the confusion; here we are referring to the genes listed in Figure 4B. We have revised the manuscript accordingly.

      Line 220: are there meristem-like cells in metazoans? If so, this would be preferable to a comparison with plants.

      In this context, we are discussing the morphological characteristics of gill proliferative cell populations found in filibranch bivalves. These populations, namely PEPC, VEPC, and DEPC, consist of cells exhibiting morphological traits akin to those of plant cambial-zone meristem cells. These cells typically display small, round shapes with a high nucleus-to-plasma ratio. We acknowledge that while these terms are utilized in bivalve studies (citations below), they lack the robust support seen in model systems backed by molecular evidence. The present snRNA-seq data, however, may offer valuable cell markers for future comprehensive investigations.

      Leibson, N. L. & Movchan, O. T. Cambial zones in gills of Bivalvia. Mar. Biol. 31, 175-180 (1975). https://doi.org:10.1007/BF00391629

      Wentrup, C., Wendeberg, A., Schimak, M., Borowski, C. & Dubilier, N. Forever competent: deep-sea bivalves are colonized by their chemosynthetic symbionts throughout their lifetime. Environ. Microbiol. 16, 3699-3713 (2014). https://doi.org:10.1111/1462-2920.12597

      Cannuel, R., Beninger, P. G., McCombie, H. & Boudry, P. Gill Development and its functional and evolutionary implications in the blue mussel Mytilus edulis (Bivalvia: Mytilidae). Biol. Bull. 217, 173-188 (2009). https://doi.org:10.1086/BBLv217n2p173

      Line 335: what is slingshot trajectory analysis? Does this differ from the pseudotime analysis?

      Slingshot is an algorithm that infers trajectories from the principal graph of the cells, modelling them as curves that capture the progression and transitions between cellular states.

      Both Slingshot and pseudotime analysis aim to infer cellular trajectories. Slingshot focuses on capturing branching patterns and is fully compatible with embeddings generated by dimensionality-reduction methods such as UMAP and PHATE, while pseudotime analysis aims to order cells along a continuous trajectory and does not rely on such embeddings. We used both in the manuscript for different purposes.
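To illustrate the distinction for readers, pseudotime in its simplest form can be computed as geodesic distance from a root cell over a k-nearest-neighbour graph (a toy Python sketch under our own simplifying assumptions; Slingshot itself is an R package that additionally fits cluster-level minimum spanning trees and principal curves, which this illustration does not reproduce):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

def knn_pseudotime(X, root=0, k=5):
    """Toy pseudotime: shortest-path (geodesic) distance from a chosen
    root cell over a k-nearest-neighbour graph of the cells, scaled to
    [0, 1]. This only orders cells; it does not model branching."""
    d = cdist(X, X)                       # pairwise Euclidean distances
    n = X.shape[0]
    graph = np.zeros((n, n))              # zeros = non-edges for csgraph
    for i in range(n):
        nbrs = np.argsort(d[i])[1:k + 1]  # k nearest neighbours, skip self
        graph[i, nbrs] = d[i, nbrs]
    graph = np.maximum(graph, graph.T)    # symmetrise the kNN graph
    pt = shortest_path(graph, directed=False)[root]
    return pt / pt.max()
```

On cells lying along a single differentiation path, this recovers a monotone ordering away from the root, which is the quantity pseudotime analysis provides independently of any 2-D embedding.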

      Line 241: introduce FISH methodology earlier in the paper, when in situ images are first referenced

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 246-249: can you quantify the decrease in signal or calculate the concentration of symbiotes in the cells? Was 5C imaged whole? This can impact the fluorescent intensity in tissues of different thicknesses.

      We appreciate your comment. In Figure 5C, most of the typical gill filament region is visible (the ventral tip of the gill filament and the mid part of the gill filament) except for the dorsal end. The gill filament of bathymodioline mussels exhibits a simple structure: a single layer of bacteriocytes grows on the basal membrane. Consequently, the gill slices have a fairly uniform thickness (with two layers of bacteriocytes and one layer of interlamina cells in between), minimizing any potential impact on fluorescent intensity. At present, detailed quantification of intracellular symbionts would require serial TEM or super-resolution confocal sections to reconstruct the bacteriocytes in 3D, which exceeds the scope of the current study. Therefore, fluorescent intensity remains the only method available to us for estimating bacterial density and distribution across the gill filament.

      Line 249: What is meant by 'environmental gradient?'

      Here we are referring to the gases needed for the symbionts’ chemosynthesis. We have revised the manuscript to make this clear.

      Lines 255-256: Were the results shown in the TEM images previously known? Not clear what novel information is conveyed in images Fig 5 C and D

      In Fig. 5C and D, we deliver high-quality SEM and TEM images of a typical bacteriocyte, showcasing its morphology and subcellular machinery with clarity. These electron microscopy images offer the audience a comprehensive introduction to the cellular function of bacteriocytes, and they also serve as supporting evidence for the bacteriocytes' snRNA-seq data.

      Line 295-296: Can you elaborate on what types of solute carrier genes have been shown to be involved with symbioses?

      We appreciate the comment, and have revised the manuscript accordingly. The putative functions of the solute carriers could be found in Figure 5I.

      Line 297-301: Which genes from the bulk RNA-seq study? Adding more detail and references in cluster annotation would help readers better understand the justifications.

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 316 -322: Can you provide the values of the distances?

      We now provide the values in the main text, in addition to Fig. 6B, as well as in a supplementary table (Supplementary Table S19).

      Line 328: What are the gene expression patterns?

      We observed genes that are up- and down-regulated in Starvation and Reconstitution.

      LIne 334-337: A visualization of the different expression levels of the specific genes in clusters between sites might be helpful to demonstrate the degree of difference between sites.

      We have prepared a new supplementary file showing the different expression levels.

      Line 337: Citation needed

      We appreciate the comment. Here, we hypothesize the cellular responses based on the genes’ functions and their expression patterns.

      Line 402-403: Cannot determine lineages from data presented. Need lineage tracing over time to determine this

      We acknowledge that lineage tracing over time would be necessary to validate this hypothesis. In practice, however, it is very difficult to obtain deep-sea samples for such experiments. Their shallow-water relatives might offer a more tractable system for testing this hypothesis, though even that remains challenging.

      413-414: What are the "cell-type specific responses to environmental change"? It could be interesting to present these results in the "results and discussion" section

      These results are shown in Supplementary Figure S8.

      Line 419-424: Sampling details might go better earlier on in the paper, when the sampling scheme is introduced.

      We appreciate the comments. Here, we are discussing the limitations of our current study, not sampling details.

      Line 552: What type of sequencing? Paired end? How long?

      We conducted 150bp paired-end sequencing.

      556-563: More detail here would be useful to readers not familiar with the BD guide. Also be careful to cite the software used in analysis!

      The provided guide and handbook elucidate the intricacies of gene name preparation, data alignment to the genome, and the generation of an expression matrix. It is worth mentioning that we relied upon outdated versions of the aforementioned resources during our data analysis phase, as they were the only ones accessible to us at the time. However, we have since become aware of a newer pipeline available this year, rendering the information presented here of limited significance to other researchers utilizing BD.

      Many thanks for your kind reminder. We have now included a reference for STAR, and all other software is cited accordingly. There are no scholarly papers or publications for the BD pipeline that we can cite.

      Line 577-578: How was the number of clusters determined? What is meant by "manually combine the clusters?" If cells were clustered by hand, more detail on the method is needed, as well as direct discussion and justification in the body of the paper.

      It would be more appropriate to emphasize the determination of cell types rather than clusters. The clusters were identified using a clustering function, as mentioned in the manuscript. It's important to note that the clustering function (in our case, the FindClusters function of Seurat) provides a general overview based on diffuse gene expression. Technically speaking, there is no guarantee that one cluster corresponds to a single cell type. Therefore, it is crucial to manually inspect the clustering results to assign clusters to the appropriate cell types. In some cases, multiple clusters may be assigned to the same cell type, while in other cases, a single cluster may need to be further subdivided into two or more cell types or sub-cell types, depending on the specific circumstances.

      For studies conducted on model species such as humans or mice, highly and specifically expressed genes within each cluster can be compared to known marker genes of cell types mentioned in previous publications, which generally suffices for annotation purposes. However, in the case of non-model species like Bathymodioline mussels, there is often limited information available about marker genes, making it challenging to confidently assign clusters to specific cell types. In such situations, in situ hybridisation proves to be incredibly valuable. In our study, WISH was employed to visualise the expression and morphology of marker genes within clusters. When WISH revealed the expression of marker genes from a cluster in a specific type of cell, we classified that cluster as a genuine cell type. Moreover, if WISH demonstrated uniform expression of marker genes from different clusters in the same cell, we assigned both clusters to the same cell type.

      We expanded the description of the strategy in the Method section.
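The curation step described above amounts to a manual mapping from cluster ids to cell-type labels, informed by WISH. A schematic sketch (the cluster numbers, cell-type names, and marker-based split below are entirely hypothetical, for illustration only) might look like:

```python
# Hypothetical cluster -> cell-type map (ids and names invented for
# illustration): suppose WISH showed that clusters 3 and 7 express
# their markers in the same cells, so both collapse to one cell type,
# while cluster 5 must be subdivided by a marker gene.
cluster_to_type = {
    0: "basal membrane cell 1",
    3: "bacteriocyte",
    7: "bacteriocyte",
    5: None,  # None = needs marker-based subdivision
}

def assign_cell_types(cluster_labels, mapping, sub_marker_positive=None):
    """Translate clustering-function ids (e.g. from Seurat FindClusters)
    into curated cell-type labels; clusters mapped to None are split by
    per-cell marker expression."""
    out = []
    for i, c in enumerate(cluster_labels):
        cell_type = mapping.get(c, f"cluster_{c}")  # unannotated clusters keep their id
        if cell_type is None:
            # subdivide by marker expression (labels hypothetical)
            cell_type = "DEPC" if sub_marker_positive[i] else "PEBZC"
        out.append(cell_type)
    return out
```

The point is that the merge/split decisions live in an explicit, reviewable table rather than being hidden inside the clustering resolution parameter.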

      LIne 690-692: When slices were used, what part of the gill were they taken from?

      We sectioned the gill around the mid part, which represents the mature bacteriocytes.

      References to figures:

      General

      Please split the fluorescent images into different channels with an additional composite. It is difficult to see some of the expression patterns. It would also make it accessible to colorblind readers.

      We appreciate the comments and suggestions from the reviewer. We have converted our figures to CMYK colour, which will help colour-blind readers.

      Please provide the number of replicates for each in situ and what proportion of those displayed the presented pattern.

      We appreciate the reviewer’s comments. This is explained in the Materials and Methods section of the manuscript.

      Figure 2.C' is a fantastic summary and really helps the non-mussel audience understand the results. Adding schematics like this to Figures 3-5 would be helpful as well.

      We value the reviewer's comments. We propose that Figures 3K, 4C, and 5A-D could offer similar schematic explanations to assist the audience.

      Figure 2:

      Figures 2.C-F, 2.C', 2.H-J are not referenced in the text. Adding in discussions of them would help strengthen your discussions on the cluster annotation

      We appreciate the reviewer's comments and have revised the manuscript accordingly.

      In 2.B, 6 genes are highlighted in red and said to be shown in in situs, but only 5 are shown.

      We apologize for the mistake. We did not include the WISH result for 20639-0.0 in the present study. We have changed the label to black.

      Figure 3:

      Fig 2C-E not mentioned.

      We appreciate the reviewer's comments and have revised the manuscript accordingly.

      In 3.B 8 genes are highlighted in red and said to be shown in in situs. Only 6 are.

      The result of the WISH were provided in Supplementary Figures S4 and S5.

      Figure 3.K is not referenced in the legend.

      We appreciate the comment, and have revised the manuscript accordingly.

      Figure 4:

      In Figure D, it might be helpful to indicate the growth direction.

      We appreciate the comment, and have revised the manuscript accordingly by adding an arrow in panel D to indicate growth direction.

      4F: A double in situ with the symbiote marker is needed to demonstrate the nucleolin-like positive cells are symbiote free.

      We appreciate the comment. The symbiont free region could be found in Figure 5A.

      Figure 5:

      In 5.A, quantification of symbiote concentration would help support your conclusion that they are denser around the edges.

      We appreciate the comment. As we mentioned above, detailed quantification of intracellular symbionts would require serial TEM or super-resolution confocal sections to reconstruct the bacteriocytes in 3D, which exceeds the scope of the current study. Therefore, fluorescent intensity remains the only method available to us for estimating bacterial density and distribution across the gill filament.

      In 5.D, the annotation is not clear. Adding arrows like in 5.C would be helpful.

      We appreciate the comment, and have revised the manuscript accordingly.

      A few genes in 5.F are not mentioned in the paper body when listing other genes. Mentioning them would help provide more support for your clustering.

      We appreciate the comment, and have revised the manuscript accordingly.

      Is 5.I meant to be color coded with the gene groups from 5.F? Color Coding the gene names, rather than organelles or cellular structures might portray this better and help visually strengthen the link between the diagram and your dot plot.

      We appreciate the suggestions. We've experimented with color-coding the gene names, but some colors are less discernible against a white background.

      Figure 6:

      6.B Is there a better way to visualize this data? The color coding is confusing given the pairwise distances. Maybe heatmaps?

      We attempted a heatmap, as shown in the figure below. However, all co-authors agree that a bar plot provides clearer visualization than the heatmap. We agree that the colour scheme may have been confusing because it used the same colours as the individual treatments, so we have changed the colours.

      Author response image 1.

      Figure 6.D: Why is the fanmao sample divided in the middle?

      Fig. 6C shows that single-cell trajectories include branches. The branches occur because cells execute alternative gene expression programs. Thus, in Fig. 6D, we show changes for genes that are significantly branch-dependent in both lineages at the same time. Specifically, genes in cluster 2 are upregulated during starvation but downregulated during reconstitution. Conversely, genes in cluster 1 are downregulated during starvation but upregulated during reconstitution. Note that Fig. 6D displays only a small subset of the significantly branch-dependent genes.

      Figure 6.D: Can you visualize the expression in the same format as in figures 2-5?

      We appreciate the comments from the reviewer. As far as we know, this heatmap is the best format to demonstrate this type of gene expression profile.

      Supplementary Figure S2:

      Please provide a key for the cell type abbreviations

      We appreciate the comment, and have added the abbreviations of cell types accordingly.

      Supplementary Figures S4 and S5:

      What part of the larger images are the subsetted images taken from?

      We appreciate the comment. These images were taken from the ventral tip and the middle of the gill slices, respectively. We have revised the figure legends to make this clear.

      Supplemental Figure S7:

      If clusters 1 and 2 show genes up and downregulated during starvation, what do clusters 4 and 3 represent?

      Cluster 1: genes that are clearly upregulated during starvation and downregulated during reconstitution; Cluster 4: genes that are downregulated during reconstitution but not clearly upregulated during starvation.

      Cluster 2 shows genes upregulated during reconstitution, and Cluster 3 genes that are clearly downregulated during starvation.

      Author response table 1.

      Supplemental Figure S8:

      This is a really interesting figure that I think shows some of the results really well! Maybe consider moving it to the main figures of the paper?

      We appreciate the comments and suggestions. We concur with the reviewer on the significance of the results presented. However, considering the length of this manuscript, we have prioritized the inclusion of the most pertinent information in the main figures. Supplementary materials containing additional figures and details on the genes involved in these pathways are provided for interested readers.

      Supplemental Figure S11:

      Switching the axes might make this image easier for the reader to interpret. Additionally, calculating the normalized contribution of each sample to each cluster could help quantify the extent to which bacteriocytes are reduced when starving.

      Thank you for the insightful suggestion, which we have implemented as detailed below. We acknowledge the importance of understanding the changes in bacteriocyte proportions across different treatments. However, it's crucial to note that the percentage of cells per treatment is highly influenced by factors such as the location of digestion and sequencing, as previously mentioned.

      Author response image 2.

      Reviewer #2 (Recommendations For The Authors):

      The following are minor recommendations for the text and figures that may help with clarity:

      Fig. 3K: This figure describes water flow induced by different ciliary cells. It is not clear what the color of the arrows corresponds to, as they do not match the UMAP (i.e. the red arrow) and this is not indicated in the legend. Are these colours meant to indicate the different ciliary cell types? If so it would be helpful to include this in the legend.

      We appreciate the reviewer's comments and suggestions. The arrows indicate the water flow that might be agitated by certain types of cilia. We have revised our figure and figure legends to make this clear.

      Line 369: The incorrect gene identifier is given for the mitochondrial trifunctional enzyme. This gene identifier is identical to the one given in line 366, which describes long-chain-fatty-acid-ligase ACSBG2-like (Bpl_scaf_28862-1.5).

      We appreciate the reviewer's comments and suggestions. We have revised our manuscript accordingly.

      Line 554: The Bioproject accession number (PRJNA779258) does not appear to lead to an existing page in any database.

      We appreciate the reviewer's comments and suggestions. We have released this Bioproject to the public.

      Line 597-598: it would be helpful to know the specific number of cells that the three sample types were downsampled to, and the number of cells remaining in each cluster, as this can affect the statistical interpretation of differential expression analyses.

      The number of cells per cluster in our analysis ranged from 766 to 14633. To mitigate potential bias introduced by varying cell numbers, we implemented downsampling, restricting the number of cells per cluster to no more than 3500. This was done to ensure that the differences between clusters remained less than fivefold. We experimented with several downsampling strategies, exploring cell limits of 4500 and 2500, and consistently observed similar patterns across these variations.
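      The capping strategy described in this response can be sketched as follows. This is an illustrative sketch, not the authors' actual pipeline: the cluster names, sizes (chosen to span the reported 766-14633 range), and random seed are assumptions.

      ```python
      import random

      # Sketch: cap every cluster at 3500 cells so the largest-to-smallest
      # size ratio stays below the 5x target mentioned in the response.
      cluster_sizes = {"A": 14633, "B": 3800, "C": 766}  # illustrative sizes
      CAP = 3500

      def downsample(clusters, cap, seed=0):
          """Randomly keep at most `cap` cell indices per cluster."""
          rng = random.Random(seed)
          return {name: rng.sample(range(n), min(n, cap))
                  for name, n in clusters.items()}

      sampled = downsample(cluster_sizes, CAP)
      sizes = {name: len(cells) for name, cells in sampled.items()}
      ratio = max(sizes.values()) / min(sizes.values())
      print(sizes, f"ratio = {ratio:.2f}")
      ```

      With these illustrative numbers, the uncapped ratio of about 19x drops below 5x after capping, while clusters already under the cap are left untouched.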

      Data and code availability:

      The supplementary tables and supplementary data S1 appear to be the final output of the differential expression analyses. Including the raw data (e.g. reads) and/or intermediate data objects (e.g. count matrices, R objects), in addition to the code used to perform the analyses, may be very helpful for replication and downstream use of this dataset. As mentioned above, the Bioproject accession number appears to be incorrect.

      We appreciate the reviewer's comments and suggestions. Regarding our sequencing data, we have deposited all relevant information with the National Center for Biotechnology Information (NCBI) under Bioproject PRJNA779258. Additionally, we have requested the release of the Bioproject. Furthermore, as part of this round of revision, we have included the count matrices for reference.

      Reviewer #3 (Recommendations For The Authors):

      As noted in the public review, my only major concerns are around the treatment of progenitor cell populations. I am sympathetic to the challenges of these experiments but suggest a few possible avenues to the authors.

      First, there could be some demonstration that these cells in G. platifrons are indeed proliferative, using EdU incorporation labeling or a conserved epitope such as the phosphorylation of serine 10 in histone 3. It appears in Mytilus galloprovincialis that proliferating cell nuclear antigen (PCNA) and phospho-histone H3 have previously been used as good markers for proliferative cells (Maiorova and Odintsova 2016). The use of any of these markers along with the cell type markers the authors recover for PEBZCs for example would greatly strengthen the argument that these are proliferative cells.

      If performing these experiments is not currently possible, the authors could use some computational approaches to strengthen their arguments. Based on conserved cell-cycle markers and the use of Cell-Cycle feature analysis in Seurat, could the authors provide evidence that these progenitors occupy the G2/M phase at a greater percentage than other cells? Other than the physical position of the cells, is there much that suggests that these are proliferative? While I am more convinced by the markers in VEPCs, the markers for PEBZCs and DEPCs are not particularly compelling.

      While I do not think the major findings of the paper hinge on this, comments such as "the PBEZCs gave rise to new bacteriocytes that allowed symbiont colonization" should be taken with care. It is not clear that the PBEZCs are proliferative and there does not seem to be any direct evidence that PBEZCs (or DEPCs or VEPCS for that manner) are the progenitor cells through any sort of labeling or co-expression studies.

      We appreciate the comments and suggestions from the reviewer. We have considered all the suggestions and have revised the manuscript accordingly. We especially appreciate the reviewer’s suggestions about the characterisation of the G. platifrons gill proliferative cell populations. In a separate research project, we have tested both cell-division and cell-proliferation markers on the proliferative cell populations. Though we are not able to include these results in the current manuscript, we are happy to share our preliminary results with the reviewer. Our results demonstrate that the proliferative cell populations, particularly the VEPCs, are positive for cell-proliferation markers and contain a high proportion of mitotic cells.

      Author response image 3.

      Finally, there is a body of literature that has examined cell proliferation and zones of proliferation in mussels (such as Piquet, B., Lallier, F.H., André, C. et al. Regionalized cell proliferation in the symbiont-bearing gill of the hydrothermal vent mussel Bathymodiolus azoricus. Symbiosis 2020) or other organisms (such as Bird, A. M., von Dassow, G., & Maslakova, S. A. How the pilidium larva grows. EvoDevo. 2014) that could be discussed.

      We appreciate the comments and suggestions from the reviewer. We have considered all the suggestions and have revised the manuscript accordingly (line 226-229).

      Minor comments also include:

      Consider changing the orientation of diagrams in Figure 2C' in relationship to Figure 2C and 2D-K.

      We appreciate the comments and suggestions from the reviewer. Figure 2 has been reorganized.

      For the diagram in Figure 3K, please clarify if the arrows drawn for the direction of inter-lamina water flow are based on gene expression, SEM, or some previous study.

      We are grateful for the reviewer's valuable feedback and suggestions. The arrows in the figure indicate the direction of water flow that could be affected by specific types of cilia. Our prediction is based on both gene expression and SEM results. To further clarify this point, we have revised the figure legend of Fig. 3.

      Please include a label for the clusters in Figure 5E for consistency.

      We have revised our Figure 5E to keep our figures consistent.

      Please include a note in the Materials and Methods for Monocle analysis in Figure 6.

      We conducted Monocle analyses using Monocle2 and Monocle3 in the R environment. We have revised our Materials and Methods to include further information on Figure 6.

      In Supplement 2, the first column is labeled PEBC while the first row is labeled PEBZ versus all other rows and columns have corresponding names. I am guessing this is a typo and not different clusters?

      We appreciate the great effort of the reviewer in reviewing our manuscript. We have corrected the typo in the revised version.

    1. Transformers give Clojurists some of the benefits of "Object Orientation" without many of the downsides Clojurists dislike about objects.

      1. Objects couple behaviors required from multiple callers into a single class, while transformers do not change existing behaviors for existing callers by default
      2. Objects push inheritance first design, whereas transformer inheritance is a function of shared structure between Clojure data structures derived from one another and design is driven by concrete implementation needs, like regular Clojure
      3. Objects couple state and methods in spaghetti ways and transformers are just immutable maps. And just like how Clojure lets you stash stateful things like atoms in functions, transformers allow you to build stateful transformers, but like Clojure the default convention is to do everything immutably
      4. Objects try to provide data hiding as a function of encapsulation whereas transformers are doing the opposite, exposing data otherwise hidden by a closure

      There are many strategies for reusing code in the software industry. In Clojure, we use what some call a "lego" method: building small, single-purpose functions that can be used in a million different contexts, thanks to a tasteful use of simplicity in the right places. This works tremendously well for 95% of use cases. In certain use cases, like building hierarchies of functions that are highly self-similar, as with UI toolkits, transformers provide a better alternative. Transformers allow you to build a UI toolkit with 25% of the code of normal function composition, and 25% of the code required to evolve the widgets in that hierarchy over time. The lego method is great for vertically composing things together, but when you want to make lateral changes for only certain callers in the tree, you have to defensively copy code between duplicative implementation trees, call them "grandpa-function-1" and "grandpa-function-2", and then make versions 1 and 2 of every function that wrapped the grandpa functions. Transformers provide a solution for that situation, in the rare cases we end up in it in Clojure, without the downsides of a traditional object system.

  4. Apr 2024
    1. On the P3C5-solution branch.

      I don't quite understand this bit of code in the solution: .lien-conteneur-photo:hover.photo-hover { display: flex; } Why does the class .photo-hover come after the :hover pseudo-class?

    1. Reviewer #2 (Public Review):

      In the presented manuscript, the authors first use structured microfluidic devices with gliding filamentous cyanobacteria inside in combination with micropipette force measurements to measure the bending rigidity of the filaments. The distribution of bending rigidities is very broad.

      Next, they use triangular structures to trap the bacteria with the front against an obstacle. Depending on the length and rigidity, the filaments buckle under the propulsive force of the cells. The authors use theoretical expressions for the buckling threshold to infer propulsive force, given the measured length and (mean-) stiffnesses. They find nearly identical values for both species, 𝑓 ∼ (1.0 ± 0.6) nN∕µm, nearly independent of the velocity. These measurements have to be taken with additional care, as the inferred forces depend strongly on the bending rigidity, which already shows a broad distribution.

      Finally, they measure the shape of the filament dynamically to infer friction coefficients via Kirchhoff theory. In this section they report a strong correlation with velocity and report propulsive forces that vary over two orders of magnitude.

      From a theoretical perspective, not many new results are presented. The authors repeat the well-known calculation for filaments buckling under propulsive load and arrive at the literature result of buckling when the dimensionless number (f L^3/B) is larger than 30.6, as previously derived by Sekimoto et al. in 1995. In my humble opinion, the "buckling theory" section belongs in the methods. Finally, the authors use molecular dynamics-type simulations similar to other models to reproduce the buckling dynamics from the experiments.

      Data and source code are available via trusted institutional or third-party repositories that adhere to policies that make data discoverable, accessible and usable.

    2. Author response:

      Reviewer 1:

      The paper “Quantifying gliding forces of filamentous cyanobacteria by self-buckling” combines experiments on freely gliding cyanobacteria, buckling experiments using two-dimensional V-shaped corners, and micropipette force measurements with theoretical models to study gliding forces in these organisms. The aim is to quantify these forces and use the results to perhaps discriminate between competing mechanisms by which these cells move. A large data set of possible collision events is analyzed, buckling events evaluated, and critical buckling lengths estimated. A line elasticity model is used to analyze the onset of buckling and estimate the effective (viscous-type) friction/drag that controls the dynamics of the rotation that ensues post-buckling. This value of the friction/drag is compared to a second estimate obtained by consideration of the active forces and speeds in freely gliding filaments. The authors find that these two independent estimates of friction/drag correlate with each other and are comparable in magnitude. The experiments are conducted carefully, the device fabrication is novel, the data set is interesting, and the analysis is solid. The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion. While consistent with the data, this conclusion is inferred.

      We thank the reviewer for the positive evaluation of our work.

      Summary:

      The paper addresses important questions on the mechanisms driving the gliding motility of filamentous cyanobacteria. The authors aim to understand these by estimating the elastic properties of the filaments, and by comparing the resistance to gliding under a) freely gliding conditions, and b) in post-buckled rotational states. Experiments are used to estimate the propulsion force density on freely gliding filaments (assuming over-damped conditions). Experiments are combined with a theoretical model based on Euler beam theory to extract friction (viscous) coefficients for filaments that buckle and begin to rotate about the pinned end. The main results are estimates for the bending stiffness of the bacteria, the propulsive tangential force density, the buckling threshold in terms of the length, and estimates of the resistive friction (viscous drag) providing the dissipation in the system and balancing the active force. It is found that experiments on the two bacterial species yield nearly identical values of f (albeit with rather large variations). The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion.

      We appreciate this comprehensive summary of our work.

      Strengths of the paper:

      The strengths of the paper lie in the novel experimental setup and measurements that allow for the estimation of the propulsive force density, critical buckling length, and effective viscous drag forces for movement of the filament along its contour – the axial (parallel) drag coefficient, and the normal (perpendicular) drag coefficient (I assume this is the case, since the post-buckling analysis assumes the bent filament rotates at a constant frequency). These direct measurements are important for serious analysis and discrimination between motility mechanisms.

      We thank the reviewer for this positive assessment of our work.

      Weaknesses:

      There are aspects of the analysis and discussion that may be improved. I suggest that the authors take the following comments into consideration while revising their manuscript.

      The conclusion that adhesion via focal adhesions is the cause for propulsion rather than slime protrusion is consistent with the experimental results that the frictional drag correlates with propulsion force. At the same time, it is hard to rule out other factors that may result in this (friction) viscous drag - (active) force relationship while still being consistent with slime production. More detailed analysis aiming to discriminate between adhesion vs slime protrusion may be outside the scope of the study, but the authors may still want to elaborate on their inference. It would help if there was a detailed discussion on the differences in terms of the active force term for the focal adhesion-based motility vs the slime motility.

      We appreciate this critical assessment of our conclusions. Of course we are aware that many different mechanisms may lead to similar force/friction characteristics, and that a definitive conclusion on the mechanism would require the combination of various techniques, which is beyond the scope of this work. Therefore, we were very careful in formulating the discussion of our findings, refraining, in particular, from a singular conclusion on the mechanism but instead indicating “support” for one hypothesis over another, and emphasizing “that many other possibilities exist”.

      The most common concurrent hypotheses for bacterial gliding suggest that either slime extrusion at the junctional pore complex [A1], rhythmic contraction of fibrillar arrays at the cell wall [A2], focal adhesion sites connected to intracellular motor-microtubule complexes [A3], or modified type-IV pilus apparatuses [A4] provide the propulsion forces. For the slime extrusion hypothesis, which is still abundant today, one would rather expect an anticorrelation of force and friction: more slime extrusion would generate more force, but also enhance lubrication. The other hypotheses are more consistent with the trend we observed in our experiments, because both pili and focal adhesion require direct contact with a substrate. How contraction of fibrillar arrays would micromechanically couple to the environment is not clear to us, but direct contact might still facilitate force transduction. Please note that these hypotheses were all postulated without any mechanical measurements, solely based on ultra-structural electron microscopy and/or genetic or proteomic experiments. We see our work as complementary to that, providing a mechanical basis for evaluating these hypotheses.

      We agree with the referee that narrowing down this discussion to focal adhesion should have been avoided. We rewrote the concluding paragraph (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is proportional to the thickness of the lubricating layer (Snoeijer et al., 2013), and we conjecture active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007) or a modified type-IV pilus (Khayatan et al., 2015), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998), in contrast, would lead to the opposite behavior: More slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013) and ii. motility in slime-secretion deficient mutants is restored upon exogenous addition of polysaccharide slime. Still we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Can the authors comment on possible mechanisms (perhaps from the literature) that indicate how isotropic friction may be generated in settings where focal adhesions drive motility? A key aspect here would probably be estimating the extent of this adhesion patch and comparing it to a characteristic contact area. Can lubrication theory be used to estimate characteristic areas of contact (knowing the radius of the filament, and assuming a height above the substrate)? If the focal adhesions typically cover areas smaller than this lubrication area, it may suggest the possibility that bacteria essentially present a flat surface insofar as adhesion is concerned, leading to a transversely isotropic response in terms of the drag. Of course, we will still require the effective propulsive force to act along the tangent.

      We thank the referee for suggesting to estimate the dimensions of the contact region. Both pili and focal adhesion sites would be of sizes below one micron [A3, A4], much smaller than the typical contact region in the lubricated contact, which is on the order of the filament radius (few microns). So indeed, isotropic friction may be expected in this situation [A5] and is assumed frequently in theoretical work [A6–A8]. Anisotropy may then indeed be induced by active forces [A9], but we are not aware of measurements of the anisotropy of friction in bacterial gliding.

      For a more precise estimate using lubrication theory, rheology and extrusion rate of the secreted polysaccharides would have to be known, but we are not aware of detailed experimental characterizations.

      We extended the paragraph in the buckling theory on page 5 regarding the assumption of isotropic friction:

      “We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density ⃗b = −f ⃗t − η ⃗v, with an effective active force density f along the tangent ⃗t, and an effective friction proportional to the local velocity ⃗v, analog to existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Presumably, this friction is dominated by the lubrication drag from the contact with the substrate, filled by a thin layer of secreted polysaccharide slime which is much more viscous than the surrounding bulk fluid. Speculatively, the motility mechanism might also comprise adhering elements like pili (Khayatan et al., 2015) or foci (Mignot et al., 2007) that increase the overall friction (Pompe et al., 2015). Thus, the drag due to the surrounding bulk fluid can be neglected (Man and Kanso, 2019), and friction is assumed to be isotropic, a common assumption in motility models (Fei et al., 2020; Tchoufag et al., 2019; Wada et al., 2013). We assume…”

      We also extended the discussion regarding the outcome of isotropic friction (page 7):

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      I am not sure why the authors mention that the power of the gliding apparatus is not rate-limiting. The only way to verify this would be to put these in highly viscous fluids where the drag of the external fluid comes into the picture as well (if focal adhesions are on the substrate-facing side, and the upper side is subject to ambient fluid drag). Also, the friction referred to here has the form of a viscous drag (no memory effect, and thus not viscoelastic or gel-like), and it is not clear if forces generated by adhesion involve other forms of drag such as chemical friction via temporary bonds forming and breaking. In quasi-static settings and under certain conditions such as the separation of chemical and elastic time scales, bond friction may yield overall force proportional to local sliding velocities.

      We agree with the referee that the origin of the friction is not easily resolved. Lubrication yields an isotropic force density that is proportional to the velocity, and the same could be generated by bond friction. Importantly, both types of friction would be assumed to be predominantly isotropic. We explicitly referred to lubrication drag because it has been shown that mutations deficient of slime extrusion do not glide [A4].

      Assuming, in contrast, that in free gliding, friction with the environment is not rate limiting, but rather the internal friction of the gliding apparatus, i.e., the available power, we would expect a rather different behavior during early-buckling evolution. During early buckling, the tangential motion is stalled, and the dynamics is dominated by the growing buckling amplitude of filament regions near the front end, which move mainly transversely. For geometric reasons, in this stage the (transverse) buckling amplitude grows much faster than the rear part of the filament advances longitudinally. Thus that motion should not be impeded much by the internal friction of the gliding apparatus, but by external friction between the buckling parts of the filament and the ambient. The rate at which the buckling amplitude initially grows should be limited by the accumulated compressive stress in the filament and the transverse friction with the substrate. If the latter were much smaller than the (logitudinal) internal friction of the gliding apparatus, we would expect a snapping-like transition into the buckled state, which we did not observe.

      In our paper, we do not intend to evaluate the exact origin of the friction; quantifying the gliding force is the main objective. A linear force-velocity relation agrees with our observations. A detailed analysis of friction in cyanobacterial gliding would be an interesting direction for future work.

      To make these considerations more clear, we rephrased the corresponding paragraph on page 7 & 8:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      For readers from a non-fluids background, some additional discussion of the drag forces, and the forms of friction would help. For a freely gliding filament if f is the force density (per unit length), then steady gliding with a viscous frictional drag would suggest (as mentioned in the paper) f ∼ v! L η||. The critical buckling length is then dependent on f and on B the bending modulus. Here the effective drag is defined per length. I can see from this that if the active force is fixed, and the viscous component resulting from the frictional mechanism is fixed, the critical buckling length will not depend on the velocity (unless I am missing something in their argument), since the velocity is not a primitive variable, and is itself an emergent quantity.

      We are not sure what “f ∼ v! L η||” means, possibly the spelling was corrupted in the forwarding of the comments.

      We assumed an overdamped motion in which the friction force density f_f (per unit length of the filament) is proportional to the velocity v0, i.e. f_f ∼ η v0, with a friction coefficient η. Overdamped means that the friction force density is equal and opposite to the propulsion force density, so the propulsion force density is f ∼ f_f ∼ η v0. The total friction and propulsion forces can be obtained by multiplication with the filament length L, which is not required here. In this picture, v0 is an emergent quantity and f and η are assumed as given and constant. Thus, by observing v0, f can be inferred up to the friction coefficient η. Therefore, by using two descriptive variables, L and v0, with known B, the primitive variable η can be inferred by logistic regression, and f then follows from the overdamped equation of motion.

      To clarify this, we revised the corresponding section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)−1/3 and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))−1/3, to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”

      Reviewer 2:

      In the presented manuscript, the authors first use structured microfluidic devices containing gliding filamentous cyanobacteria, in combination with micropipette force measurements, to measure the bending rigidity of the filaments.

      Next, they use triangular structures to trap the bacteria with the front against an obstacle. Depending on the length and rigidity, the filaments buckle under the propulsive force of the cells. The authors use theoretical expressions for the buckling threshold to infer propulsive force, given the measured length and stiffnesses. They find nearly identical values for both species, f ∼ (1.0 ± 0.6) nN/µm, nearly independent of the velocity.

      Finally, they measure the shape of the filament dynamically to infer friction coefficients via Kirchhoff theory. This last part seems a bit inconsistent with the previous inference of propulsive force. Before, they assumed the same propulsive force for all bacteria and showed only a very weak correlation between buckling and propulsive velocity. In this section, they report a strong correlation with velocity, and report propulsive forces that vary over two orders of magnitude. I might be misunderstanding something, but I think this discrepancy should have been discussed or explained.

      We regret the reviewer's misunderstanding regarding the velocity dependence, which indicates that the manuscript should be improved to convey these relations more clearly.

      First, in the Buckling Measurements section, we did not assume the same propulsion force for all bacteria. The logistic regression yields an ensemble median for Lc (and thus an ensemble median for f), along with the width ∆Lc of the distribution (and thus also the width of the distribution of f). Our result f ∼ (1.0 ± 0.6) nN/µm indicates the median and the width of the distribution of the propulsion force densities across the ensemble of several hundred filaments used in the buckling measurements. The large variability of the forces found in the second part is consistently reflected by this very wide distribution of active forces detected in the logistic regression in the first part.

      We made small modifications to the buckling theory paragraph to clarify that in the first part, a distribution of forces rather than a constant value is inferred (page 6):

      “Inserting the population median and quartiles of the distributions of bending modulus and critical length, we can now quantify the distribution of the active force density for the filaments in the ensemble from the buckling measurements. We obtain nearly identical values for both species, f ∼ (1.0±0.6) nN/µm, where the uncertainty represents a wide distribution of f across the ensemble rather than a measurement error.”

      The same holds, of course, when inferring the distribution of the friction coefficients (page 5):

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)−1/3 and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))−1/3, to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”

      The (naturally) wide distribution of force (and friction) leads to a distribution of Lc as well. However, due to the small exponent of −1/3 in the buckling threshold Lc ∼ f −1/3, the distribution of Lc is not as wide as the distributions of the individually inferred f or η. This is visualized in panel G of Figure 3, plotting Lc as a function of v0 (v0 is equivalent to f, up to the proportionality coefficient η). The natural length distribution, in contrast, is very wide. Therefore, the buckling propensity of a filament is most strongly characterized by its length, while force variability, which alters the Lc of an individual filament, plays a secondary role.
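The compression of the Lc distribution by the cube-root dependence on f can be illustrated numerically. In this sketch (the log-normal force distribution spanning roughly two decades and the value of B are hypothetical, chosen only to mimic the reported spread), the 90th/10th-percentile width of Lc is exactly the cube root of the width of f, because Lc is a monotone power of f:

```python
import numpy as np

rng = np.random.default_rng(1)
B = 1.0  # bending modulus, arbitrary units (hypothetical value)

# Hypothetical log-normal force distribution spanning roughly two decades
f = 10 ** rng.normal(0.0, 0.5, 100_000)

# Buckling threshold: Lc scales as f**(-1/3)
Lc = (30.5722 * B / f) ** (1.0 / 3.0)

# Width of each distribution as the 90th/10th percentile ratio
w_f = np.percentile(f, 90) / np.percentile(f, 10)
w_Lc = np.percentile(Lc, 90) / np.percentile(Lc, 10)
# Since Lc is a monotone power of f, w_Lc = w_f ** (1/3):
# a spread of nearly two decades in f collapses to only a few-fold spread in Lc.
```

This is why a broad force distribution is compatible with a comparatively narrow band of critical lengths in Figure 3 G.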

      In order to clarify this, we edited the last paragraph of the Buckling Measurements section on page 5 of the manuscript:

      “…Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1 ), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934 ).”

      Second, in the Profile analysis section, we did not report a correlation between force and velocity. As can be seen in Figure 4—figure Supplement 1, neither the active force nor the friction coefficient, as determined from the analysis of individual filaments, shows any significant correlation with the velocity. This is also stated in the discussion (page 7):

      “We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B, C and Figure 4—figure Supplement 1).”

      Note that this is indeed consistent with the logistic regression: Using v0 as a second regressor did not significantly reduce the width of the distribution of Lc as compared to the simple logistic regression, indicating that force and velocity are not strongly correlated.

      In order to clarify this in the manuscript, we modified that part (page 7):

      “…We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B,C and Figure 4—figure Supplement 1). This is consistent with the logistic regression, where using v0 as a second regressor did not significantly reduce the width of the distribution of critical lengths or active forces. The two estimates of the friction coefficient, from logistic regression and individual profile fits, are measured in (predominantly) orthogonal directions: tangentially for the logistic regression, where the free gliding velocity was used, and transversely for the evolution of the buckling profiles. Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic…”

      From a theoretical perspective, not many new results are presented. The authors repeat the well-known calculation for filaments buckling under propulsive load and arrive at the literature result of buckling when the dimensionless number (f L3/B) is larger than 30.6, as previously derived by Sekimoto et al. in 1995 [1] (see [2] for a clamped boundary condition and simulations). Other theoretical predictions for pushed semi-flexible filaments [1–4] are not discussed or compared with the experiments. Finally, the authors use molecular-dynamics-type simulations similar to [2–4] to reproduce the buckling dynamics from the experiments. Unfortunately, no systematic comparison is performed.

      [1] Ken Sekimoto, Naoki Mori, Katsuhisa Tawada, and Yoko Y Toyoshima. Symmetry breaking instabilities of an in vitro biological system. Physical Review Letters, 75(1):172, 1995.

      [2] Raghunath Chelakkot, Arvind Gopinath, Lakshminarayanan Mahadevan, and Michael F Hagan. Flagellar dynamics of a connected chain of active, polar, Brownian particles. Journal of The Royal Society Interface, 11(92):20130884, 2014.

      [3] Rolf E Isele-Holder, Jens Elgeti, and Gerhard Gompper. Self-propelled worm-like filaments: spontaneous spiral formation, structure, and dynamics. Soft Matter, 11(36):7181–7190, 2015.

      [4] Rolf E Isele-Holder, Julia Jäger, Guglielmo Saggiorato, Jens Elgeti, and Gerhard Gompper. Dynamics of self-propelled filaments pushing a load. Soft Matter, 12(41):8495–8505, 2016.

      We thank the reviewer for pointing us to these publications, in particular the work by Sekimoto, of which we were not aware. We agree with the referee that the calculation is straightforward (basically known since Euler, up to modified boundary conditions). Our paper focuses on experimental work; the molecular dynamics simulations were included mainly as a consistency check and were not intended to generate the beautiful post-buckling patterns observed in references [2-4]. However, such shapes do emerge in filamentous cyanobacteria, and with the data provided in our manuscript, simulations can be quantitatively matched to our experiments, which will be covered by future work.

      We included the references in the revision of our manuscript, and a statement that we do not claim priority on these classical theoretical results.

      Introduction, page 2:

      “…Self-Buckling is an important instability for self-propelling rod-like micro-organisms to change the orientation of their motion, enabling aggregation or the escape from traps (Fily et al., 2020; Man and Kanso, 2019; Isele-Holder et al., 2015; Isele-Holder et al., 2016). The notion of self-buckling goes back to the work of Leonhard Euler in 1780, who described elastic columns subject to gravity (Elishakoff, 2000). Here, the principle is adapted to self-propelling, flexible filaments (Fily et al., 2020; Man and Kanso, 2019; Sekimoto et al., 1995) that glide onto an obstacle. Filaments buckle if they exceed a certain critical length Lc ∼ (B/f)1/3, where B is the bending modulus and f the propulsion force density…”

      Buckling theory, page 5:

      “…The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Buckling under concentrated tangential end-loads has also been investigated in the literature (de Canio et al., 2017; Wolgemuth et al., 2005), but leads to substantially different shapes of buckled filaments. We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density ⃗b = −f ⃗t − η ⃗v, with an effective active force density f along the tangent ⃗t, and an effective friction proportional to the local velocity ⃗v, analogous to the existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995)…”

      Further on page 6:

      “To derive the critical self-buckling length, Equation 5 can be linearized for two scenarios that lead to the same Lc: early-time small-amplitude buckling and late-time stationary rotation at small and constant curvature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). […] Thus, in physical units, the critical length is given by Lc = (30.5722 B/f)1/3, which is reproduced in particle-based simulations (Appendix Figure 2) analogous to those in Isele-Holder et al. (2015, 2016).”

      Discussion, page 7 & 8:

      “…This, in turn, has dramatic consequences on the exploration behavior and the emerging patterns (Isele-Holder et al., 2015, 2016; Abbaspour et al., 2021; Duman et al., 2018; Prathyusha et al., 2018; Jung et al., 2020 ): (L/Lc)3 is, up to a numerical prefactor, identical to the flexure number (Isele-Holder et al., 2015, 2016; Duman et al., 2018; Winkler et al., 2017 ), the ratio of the Peclet number and the persistence length of active polymer melts. Thus, the ample variety of non-equilibrium phases in such materials (Isele-Holder et al., 2015, 2016; Prathyusha et al., 2018; Abbaspour et al., 2021 ) may well have contributed to the evolutionary success of filamentous cyanobacteria.”

      Reviewer 3:

      Summary:

      This paper presents novel and innovative force measurements of the biophysics of gliding cyanobacteria filaments. These measurements allow for estimates of the resistive force between the cell and substrate and provide potential insight into the motility mechanism of these cells, which remains unknown.

      We thank the reviewer for the positive evaluation of our work. We have revised the manuscript according to their comments and detail our replies and modifications next to the individual points below.

      Strengths:

      The authors used well-designed microfabricated devices to measure the bending modulus of these cells and to determine the critical length at which the cells buckle. I especially appreciated the way the authors constructed an array of pillars and used it to do 3-point bending measurements, and the arrangement the authors used to direct cells into a V-shaped corner in order to examine at what length the cells buckled. By examining the gliding speed of the cells before buckling events, the authors were able to determine how strongly the buckling length depends on the gliding speed, which could be an indicator of how the force exerted by the cells depends on cell length; however, the authors did not comment on this directly.

      We thank the referee for the positive assessment of our work. Importantly, we do not see a significant correlation between buckling length and gliding speeds, and we also do not see a correlation with filament length, consistent with the assumption of a propulsion force density that is more or less homogeneously distributed along the filament. Note that each filament consists of many metabolically independent cells, which renders cyanobacterial gliding a collective effort of many cells, in contrast to gliding of, e.g., myxobacteria.

      In response also to the other referees’ comments, we modified the manuscript to reflect more on the absence of a strong correlation between velocity and force/critical length. We modified the Buckling measurements section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an overdamped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)−1/3 and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))−1/3, to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E, F show the buckling behavior…”

      Further, we edited the last paragraph of the Buckling measurements section on page 5 of the manuscript:

      “Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1 ), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934 ).”

      We also rephrased the corresponding discussion paragraph on page 7:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutants deficient in slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus, would be expected to generate strongly anisotropic friction. If the latter were dominant, a snapping-like transition into the buckled state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      Weaknesses:

      There were two minor weaknesses in the paper.

      First, the authors investigate the buckling of these gliding cells using an Euler beam model. A similar mathematical analysis was used to estimate the bending modulus and gliding force for Myxobacteria (C.W. Wolgemuth, Biophys. J. 89: 945-950 (2005)). A similar mathematical model was also examined in G. De Canio, E. Lauga, and R.E Goldstein, J. Roy. Soc. Interface, 14: 20170491 (2017). The authors should have cited these previous works and pointed out any differences between what they did and what was done before.

      We thank the reviewer for pointing us to these references. The paper by Wolgemuth is theoretical work, describing A-motility in myxobacteria by a concentrated propulsion force at the rear end of the bacterium, possibly stemming from slime extrusion. This model was later refuted by [A3], who demonstrated that focal adhesion along the bacterial body, and thus a distributed force, powers A-motility, a mechanism that has by now been investigated in great detail (see [A10]). The paper by de Canio et al. contains a thorough theoretical analysis of a filament that is clamped at one end and subject to a concentrated tangential load at the other. Since both models comprise a concentrated end-load rather than a distributed propulsion force density, they describe a substantially different motility mechanism, leading also to substantially different buckling profiles. Consequently, these models cannot be applied to cyanobacterial gliding.

      We included both citations in the revision and pointed out the differences to our work in the introduction (page 2):

      “…A few species appear to employ a type-IV-pilus related mechanism (Khayatan et al., 2015; Wilde and Mullineaux, 2015), similar to the better-studied myxobacteria (Godwin et al., 1989; Mignot et al., 2007; Nan et al., 2014; Copenhagen et al., 2021), which are short, rod-shaped single cells that exhibit two types of motility: S (social) motility based on pilus extension and retraction, and A (adventurous) motility based on focal adhesion (Chen and Nan, 2022), for which slime extrusion at the trailing cell pole was also earlier postulated as a mechanism (Wolgemuth et al., 2005). Yet, most gliding filamentous cyanobacteria do not exhibit pili and their gliding mechanism appears to be distinct from myxobacteria (Khayatan et al., 2015).”

      And in Buckling theory, page 5:

      “…The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Buckling under concentrated tangential end-loads has also been investigated in the literature (de Canio et al., 2017; Wolgemuth et al., 2005), but leads to substantially different shapes of buckled filaments.”

      The second weakness is that the authors claim that their results favor a focal adhesion-based mechanism for cyanobacterial gliding motility. This is based on their result that friction and adhesion forces correlate strongly. They then conjecture that this is due to more intimate contact with the surface, with more contacts producing more force and pulling the filaments closer to the substrate, which produces more friction. They then claim that a slime-extrusion mechanism would necessarily involve more force and lower friction. Is it necessarily true that this latter statement is correct? (I admit that it could be, but is it a requirement?)

      We thank the referee for raising this interesting question. Our claim regarding slime extrusion is based on three facts: i. mutants deficient in slime extrusion do not glide, but start gliding as soon as slime is provided externally [A4]. ii. A positive correlation between speed and slime layer thickness was observed in Nostoc [A11]. iii. The fluid mechanics of lubricated sliding contacts is very well understood and predicts a decreasing resistance with increasing layer thickness.

      We included these considerations in the revision of our manuscript (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is proportional to the thickness of the lubricating layer (Snoeijer et al., 2013), and we conjecture that active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007) or a modified type-IV pilus (Khayatan et al., 2015), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998), in contrast, would lead to the opposite behavior: more slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013) and ii. motility in slime-secretion-deficient mutants is restored upon exogenous addition of polysaccharide slime. Still, we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Related to this, the authors use a model with isotropic friction. They claim that this is justified because they are able to fit the cell shapes well with this assumption. How would assuming a non-isotropic drag coefficient affect the shapes? It may be that it does equally well, in which case, the quality of the fits would not be informative about whether or not the drag was isotropic or not.

      The referee raises another very interesting point. Given the typical variability and uncertainty in experimental measurements (cf. the errors in Figure 4 A), a model with slightly anisotropic friction could be fitted to the observed buckling profiles as well, without a significant increase in the mismatch. Yet, strongly anisotropic friction would not be consistent with our observations.

      Importantly, however, we did not conclude that friction is isotropic based on the fit quality, but based on a comparison between free gliding and early buckling (Figure 4 D). In early buckling, the dominant motion is in the transverse direction, while longitudinal motion is insignificant for geometric reasons. Thus, independent of the underlying model, mostly the transverse friction coefficient is inferred. In contrast, free gliding is a purely longitudinal motion, and thus only the friction coefficient for longitudinal motion can be inferred. These two friction coefficients are compared in Figure 4 D. Still, the scatter of that data would allow fitting a certain anisotropy within the error margins. What we can exclude based on our observations is the case of strongly anisotropic friction. If there is no ab-initio reason for anisotropy, nor a measurement that indicates it, we prefer to stick with the simplest assumption. We carefully chose our wording in the Discussion as “mainly isotropic” rather than “isotropic” or “fully isotropic”.

      We added a small statement to the Discussion on page 7 & 8:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutants deficient in slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus, would be expected to generate strongly anisotropic friction. If the latter were dominant, a snapping-like transition into the buckled state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      Recommendations for the authors

      The discussion regarding how the findings of this paper imply that cyanobacteria filaments are propelled by adhesion forces rather than slime extrusion should be improved, as this conclusion seems questionable. There appears to be an inconsistency with a buckling force said to be only weakly dependent on the gliding velocity, while its ratio with the velocity correlates with a friction coefficient. Finally, data and source code should be made publicly available.

      In the revised version, we have modified the discussion of the force generating mechanism according to the reviewer suggestions. The perception of inconsistency in the velocity dependence of the buckling force was based on a misunderstanding, as we detailed in our reply to the referee. We revised the corresponding section to make it more clear. Data and source code have been uploaded to a public data repository.

      Reviewer #2 (recommendations for the authors)

      Despite eLife policy, the authors do not provide a Data Availability Statement. For the presented manuscript, data and source code should be provided “via trusted institutional or third-party repositories that adhere to policies that make data discoverable, accessible and usable.” https://elifesciences.org/inside-elife/51839f0a/for-authors-updates-to-elife-s-data-sharing-policies

      Most of the issues in this reviewer’s public review should be easy to correct, so I would strongly support the authors to provide an amended manuscript.

      We added the Data Availability Statement in the amended manuscript.

      References

      [A1] E. Hoiczyk and W. Baumeister. “The junctional pore complex, a prokaryotic secretion organelle, is the molecular motor underlying gliding motility in cyanobacteria”. In: Curr. Biol. 8.21 (1998), pp. 1161–1168. doi: 10.1016/s0960-9822(07)00487-3.

      [A2] N. Read, S. Connell, and D. G. Adams. “Nanoscale Visualization of a Fibrillar Array in the Cell Wall of Filamentous Cyanobacteria and Its Implications for Gliding Motility”. In: J. Bacteriol. 189.20 (2007), pp. 7361–7366. doi: 10.1128/jb.00706-07.

      [A3] T. Mignot, J. W. Shaevitz, P. L. Hartzell, and D. R. Zusman. “Evidence That Focal Adhesion Complexes Power Bacterial Gliding Motility”. In: Science 315.5813 (2007), pp. 853–856. doi: 10.1126/science.1137223.

      [A4] Behzad Khayatan, John C. Meeks, and Douglas D. Risser. “Evidence that a modified type IV pilus-like system powers gliding motility and polysaccharide secretion in filamentous cyanobacteria”. In: Mol. Microbiol. 98.6 (2015), pp. 1021–1036. doi: 10.1111/mmi.13205.

      [A5] Tilo Pompe, Martin Kaufmann, Maria Kasimir, Stephanie Johne, Stefan Glorius, Lars Renner, Manfred Bobeth, Wolfgang Pompe, and Carsten Werner. “Friction- controlled traction force in cell adhesion”. In: Biophysical journal 101.8 (2011), pp. 1863–1870.

      [A6] Hirofumi Wada, Daisuke Nakane, and Hsuan-Yi Chen. “Bidirectional bacterial gliding motility powered by the collective transport of cell surface proteins”. In: Physical Review Letters 111.24 (2013), p. 248102.

      [A7] Joël Tchoufag, Pushpita Ghosh, Connor B Pogue, Beiyan Nan, and Kranthi K Mandadapu. “Mechanisms for bacterial gliding motility on soft substrates”. In: Proceedings of the National Academy of Sciences 116.50 (2019), pp. 25087–25096.

      [A8] Chenyi Fei, Sheng Mao, Jing Yan, Ricard Alert, Howard A Stone, Bonnie L Bassler, Ned S Wingreen, and Andrej Kosmrlj. “Nonuniform growth and surface friction determine bacterial biofilm morphology on soft substrates”. In: Proceedings of the National Academy of Sciences 117.14 (2020), pp. 7622–7632.

      [A9] Arja Ray, Oscar Lee, Zaw Win, Rachel M Edwards, Patrick W Alford, Deok-Ho Kim, and Paolo P Provenzano. “Anisotropic forces from spatially constrained focal adhesions mediate contact guidance directed cell migration”. In: Nature communications 8.1 (2017), p. 14923.

      [A10] Jing Chen and Beiyan Nan. “Flagellar motor transformed: biophysical perspectives of the Myxococcus xanthus gliding mechanism”. In: Frontiers in Microbiology 13 (2022), p. 891694.

      [A11] Samia Dhahri, Michel Ramonda, and Christian Marliere. “In-situ determination of the mechanical properties of gliding or non-motile bacteria by atomic force microscopy under physiological conditions without immobilization”. In: PLoS One 8.4 (2013), e61663.


    1. Reviewer #2 (Public Review):

      Pyoverdines, siderophores produced by many Pseudomonads, are one of the most diverse groups of specialized metabolites and are frequently used as model systems. Thousands of Pseudomonas genomes are available, but large-scale analyses of pyoverdines are hampered by the biosynthetic gene clusters (BGCs) being spread across multiple genomic loci and existing tools' inability to accurately predict amino acid substrates of the biosynthetic adenylation (A) domains. The authors present a bioinformatics pipeline that identifies pyoverdine BGCs and predicts the A domain substrates with high accuracy. They tackled a second challenging problem by developing an algorithm to differentiate between outer membrane receptor selectivity for pyoverdines versus other siderophores and substrates. The authors applied their dataset to thousands of Pseudomonas strains, producing the first comprehensive overview of pyoverdines and their receptors and predicting many new structural variants.

      The A domain substrate prediction is impressive, including the correction of entries in the MIBiG database. Their high accuracy came from a relatively small training dataset of A domains from 13 pyoverdine BGCs. The authors acknowledge that this small dataset does not include all substrates, and correctly point out that new sequence/structure pairs can be added to the training set to refine the prediction algorithm. The authors could have been more comprehensive in finding their training set data. For instance, the authors claim that histidine "had not been previously documented in pyoverdines", but the sequenced strain P. entomophila L48 incorporates His (10.1007/s10534-009-9247-y). The workflow cannot differentiate between different variants of Asp and OHOrn, and it's not clear if this is a limitation of the workflow, the training data, or both. The prediction workflow holds up well on Burkholderiales A domains; however, the authors fail to mention in the main text that they achieved these numbers by adding more A domains to their training set.

      To validate their predictions, they elucidated structures of several new pyoverdines, and their predictions performed well. However, the authors did not include their MS/MS data, making it impossible to validate their structures. In general, the biggest limitation of the submitted manuscript is the near-empty methods section, which does not include any experimental details for the 20 strains or details of the annotation pipeline (such as "Phydist" and "Syndist"). The source code also does not contain the requisite information to replicate the results or re-use the pipeline, such as the antiSMASH version and required flags. That said, skimming through the source code and data (kindly provided upon request) suggests that the workflow itself is sound and a clear improvement over existing tools for pyoverdine BGC annotation.

      Predicting outer membrane receptor specificity is likewise a challenging problem, and the authors have made promising progress by identifying specific gene regions that differentiate the pyoverdine receptor FpvA from FpvB and other receptor families. Their predictions were not tested experimentally, but the finding that only predicted FpvA receptors were proximate to the biosynthesis genes lends credence to the predictive power of the workflow. The authors find predicted pyoverdine receptors across an impressive 468 genera, an exciting finding for expanding the role of pyoverdines as public goods beyond Pseudomonas. However, whether or not these receptors can recognize pyoverdines (and if so, which structures!) remains to be investigated.

      In all, the authors have assembled a rich dataset that will enable large-scale comparative genomic analyses. This dataset could be used by a variety of researchers, including those studying natural product evolution, public good eco/evo dynamics, and NRPS engineering.

    1. Summary of the Talk on the Future of Web Frameworks by Ryan Carniato

      • Introduction and Background:

        • Ryan Carniato, creator of SolidJS, has extensive experience in web development spanning 25 years, having worked with various technologies including ASP.NET, Rails, and jQuery.
        • SolidJS was started in 2016 and reflects a shift towards new paradigms in web frameworks, particularly in the front-end JavaScript ecosystem.
        • Quote: "I've been doing web development now for like 25 years... it wasn't really until the 2010s that my passion reignited for front-end JavaScript."
      • Core Themes and Concepts:

        • Modern front-end development heavily relies on components (e.g., class components, function components, web components) which serve as fundamental building blocks for creating modular and composable applications.
        • Components have runtime implications due to their update models and life cycles, influencing the performance and design of web applications.
        • Traditional component models use either a top-down diffing approach (like virtual DOM) or rely on compilation optimizations to enhance performance.
        • Quote: "Modern front-end development for years has been about components... however, in almost every JavaScript framework components have runtime implications."
      • Reactive Programming and Fine-Grained Reactivity:

        • Ryan advocates for a shift towards reactive programming to manage state changes more efficiently. This approach is likened to how spreadsheets work, where changes in input immediately affect outputs without re-execution of all logic.
        • Fine-grained reactivity involves three primitives: signals (atomic units of reactive state), derived state (computeds or memos), and side effects (effects). These primitives help manage state and side effects without heavy reliance on the component architecture or compilation.
        • Quote: "What if the relationship held instead? What if whenever we changed B and C, A also immediately updated? That's basically what reactive programming is."
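        These three primitives can be sketched in a few lines of plain JavaScript. The sketch below is an illustrative toy, not SolidJS's actual implementation; the names createSignal and createEffect mirror Solid's API, but the internals here are simplified assumptions.

        ```javascript
        // Toy fine-grained reactivity (illustrative only, not SolidJS internals).
        let currentObserver = null; // the effect currently being evaluated

        function createSignal(value) {
          const subscribers = new Set();
          const read = () => {
            // Dependency tracking: whichever effect reads this signal subscribes to it.
            if (currentObserver) subscribers.add(currentObserver);
            return value;
          };
          const write = (next) => {
            value = next;
            // Notify only the effects that actually read this signal.
            for (const fn of [...subscribers]) fn();
          };
          return [read, write];
        }

        function createEffect(fn) {
          const run = () => {
            currentObserver = run;
            try { fn(); } finally { currentObserver = null; }
          };
          run();
        }

        // "A = B + C" held as a live relationship, as in the spreadsheet analogy:
        const [b, setB] = createSignal(1);
        const [c, setC] = createSignal(2);
        let a;
        createEffect(() => { a = b() + c(); }); // a is 3
        setB(10); // only this one effect re-runs; a is now 12
        ```

        Nothing resembling a component re-render happens on setB; the unit of update is the effect, which is the point of the "fine-grained" label.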
      • Practical Demonstration and Code Examples:

        • Ryan demonstrated the implementation of fine-grained reactivity using SolidJS, showing how state management and updates can be handled more efficiently compared to traditional methods that rely heavily on component re-renders and hooks.
        • The examples provided emphasized how reactive programming can simplify state management and improve performance by only updating components that need to change, reducing unnecessary re-renders.
        • Quote: "The problem is that if any state in this component changes, the whole thing reruns again... what if we didn't? What if components didn't dictate the boundary of our performance?"
      • Performance Implications and Advantages:

        • The "reactive advantage" in SolidJS and similar frameworks lies in their ability to run components minimally, avoiding stale closures and excessive dependencies that can degrade performance.
        • Ryan highlighted that in reactive frameworks, component boundaries do not dictate performance; instead, performance optimization is achieved through smarter state management and reactive updates.
        • Quote: "Components run once... state is independent of components. Component boundaries are for your sake, how you want to organize your code, not for performance."
      • Future Directions and Framework Evolution:

        • The talk touched on the broader impact of reactive programming and fine-grained reactivity on the evolution of web frameworks. This includes the potential integration with AI and compilers to further optimize performance and developer experience.
        • Ryan suggested that the future of web development might see more frameworks adopting similar reactive principles, possibly leading to a "reactive renaissance" in the industry.
        • Quote: "A revolution is not in the cards, maybe just a reactive Renaissance."
      • Q&A and Additional Insights:

        • During the Q&A, Ryan discussed the potential application of SolidJS principles in environments like React Native and native code development, indicating the flexibility and broad applicability of reactive programming principles across different platforms and technologies.
        • Quote: "The custom renderer and stuff is not something you need a virtual DOM to... the reactive tree as it turns out is completely independent."
    1. Summary of Raph Levien's Blog: "Towards principled reactive UI"

      Introduction

      • Diversity of Reactive UI Systems: The blog notes the diversity in reactive UI systems primarily sourced from open-source projects. Levien highlights a lack of comprehensive literature but acknowledges existing sources offer insights into better practices. His previous post aimed to organize these diverse patterns.
        • "There is an astonishing diversity of 'literature' on reactive UI systems."

      Goals of the Inquiry

      • Clarifying Inquiry Goals: Levien sets goals not to review but to guide inquiry into promising avenues of reactive UI in Rust, likening it to mining for rich veins of ore rather than stamp collecting.
        • "I want to do mining, not stamp collecting."

      Main Principles Explored

      • Observable Objects vs. Future-like Polling: Discusses the importance of how systems manage observable objects or utilize future-like polling for efficient UI updates.
      • Tree Mutations: How to express mutation in the render object tree is crucial, focusing on maintaining stable node identities within the tree.
        • "Then I will go deeper into three principles, which I feel are critically important in any reactive UI framework."

      Crochet: A Research Prototype

      • Introduction of Crochet: Introduces 'Crochet', a prototype exploring these principles, acknowledging its current limitations and potential for development.
        • "Finally, I will introduce Crochet, a research prototype built for the purpose of exploring these ideas."

      Goals for Reactive UI

      • Concise Application Logic: Emphasizes the need for concise, clear application logic that drives UI efficiently, with reactive UI allowing declarative state expressions of the view tree.
        • "The main point of a reactive UI architecture is so that the app can express its logic clearly and concisely."
      • Incremental Updates: Advocates for incremental updates in UI rendering to avoid performance issues related to full re-renders, highlighting the limitations of systems like imgui and the potential of systems like Conrod, despite its shortcomings.
        • "While imgui can express UI concisely, it cheats somewhat by not being incremental."

      Evaluation of Existing Systems

      • Comparison with Other Systems: Mentions SwiftUI, imgui, React, and Svelte, discussing their approaches to handling reactive UI and their adaptability to Rust.
        • "SwiftUI has gained considerable attention due to its excellent ergonomics in this regard."

      Technical Challenges and Proposals

      • Challenges in Tree Mutation and Stable Identity: Discusses the challenges in tree mutation techniques and the importance of stable identity in UI components to preserve user interaction states.
        • "Mutation of the DOM is expressed through a well-specified and reasonably ergonomic, if inefficient, interface."
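      The stable-identity principle can be illustrated with a small keyed-reconciliation sketch (hypothetical code, not Crochet's or the DOM's actual algorithm): matching new children to old nodes by key rather than by position lets a surviving node, and any state attached to it such as focus, persist across reordering.

      ```javascript
      // Illustrative sketch of keyed reconciliation: reuse old nodes by key so
      // their identity (and attached state, e.g. focus) survives reordering.
      function reconcile(oldNodes, newKeys) {
        const byKey = new Map(oldNodes.map((n) => [n.key, n]));
        return newKeys.map((key) =>
          byKey.has(key)
            ? byKey.get(key)      // stable identity: reuse the existing node
            : { key, state: {} }  // genuinely new node gets fresh state
        );
      }

      // Reordering and inserting items keeps each surviving node's state intact.
      const oldNodes = [
        { key: "a", state: { focused: true } },
        { key: "b", state: {} },
      ];
      const next = reconcile(oldNodes, ["b", "c", "a"]);
      ```

      A purely positional match would instead hand node "a"'s state to whatever now occupies index 0, which is exactly the interaction-state loss the post warns about.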

      Conclusion and Future Work

      • Future Directions and Experiments: Encourages experimentation with the Crochet prototype and discusses the ongoing development and research in making reactive UIs more efficient and user-friendly.
        • "I encourage people to experiment with the Crochet code."

      This blog post encapsulates Levien's ongoing exploration into developing a principled approach to reactive UI in Rust, highlighting the complexity of the task and his experimental prototype, Crochet, as a step towards solving these challenges.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths:

      This work (almost didactically) demonstrates how to develop, calibrate, validate and analyze a comprehensive, spatially resolved, dynamical, multicellular model. Testable model predictions of (also non-monotonic) emergent behaviors are derived and discussed. The computational model is based on a widely-used simulation platform and shared openly such that it can be further analyzed and refined by the community.

      Weaknesses:

      While the parameter estimation approach is sophisticated, this work does not address issues of structural and practical non-identifiability (Wieland et al., 2021, DOI:10.1016/j.coisb.2021.03.005) of parameter values, given just tissue-scale summary statistics, and does not address how model predictions might change if alternative parameter combinations were used. Here, the calibrated model represents one point estimate (column "Value" in Suppl. Table 1) but there is specific uncertainty of each individual parameter value and such uncertainties need to be propagated (which is computationally expensive) to the model predictions for treatment scenarios.

      We thank the reviewer for the excellent suggestions and observations. The CaliPro parameterization technique we applied emphasizes finding a robust parameter space rather than a global optimum. To address structural non-identifiability, we utilized partial rank correlation coefficients with each iteration of the calibration process to ensure that the sensitivity of each parameter was relevant to model outputs. We also found ranges of parameter values that achieved the passing criteria but, when tested in replicate, produced inconsistent outcomes. This led us to narrow the parameters to a single parameter set that still had stochastic variability but did not have such large variability between replicate runs that it would be unreliable. Additional discussion on this point has been added to lines 623-628. We acknowledge that there are likely other parameter sets or model rules that would produce similar outcomes, but the main purpose of the model was to use it to better understand the system and make new predictions, which our calibration scheme allowed us to accomplish.

      Regarding practical non-identifiability, we acknowledge that there are some behaviors that are not captured in the model because those behaviors were not specifically captured in the calibration data. To ensure that the behaviors necessary to answer the aims of our paper were included, we used multiple different datasets and calibrated with multiple different output metrics. We believe we have identified the appropriate parameters to recapitulate the dominating mechanisms underlying muscle regeneration. We have added additional discussion on practical non-identifiability to lines 621-623.

      Suggested treatments (e.g. lines 484-486) are modeled as parameter changes of the endogenous cytokines (corresponding to genetic mutations!) whereas the administration of modified cytokines with changed parameter values would require a duplication of model components and interactions in the model such that cells interact with the superposition of endogenous and administered cytokine fields. Specifically, as the authors also aim at 'injections of exogenously delivered cytokines' (lines 578, 579) and propose altering decay rates or diffusion coefficients (Fig. 7), there needs to be a duplication of variables in the model to account for the coexistence of cytokine subtypes. One set of equations would have unaltered (endogenous) and another one have altered (exogenous or drugged) parameter values. Cells would interact with both of them.

      Our perturbations did not include delivery of exogenous cytokines; instead, they focused on microenvironmental changes in cytokine diffusion and decay rates or on specific cytokine concentration levels. For example, the purpose of the VEGF delivery perturbation was to test how an increase in VEGF concentration would alter regeneration outcome metrics, under the assumption that the delivered VEGF would act in the same manner as the endogenous VEGF. We have clarified the purpose of the simulations on line 410. We agree that exploring whether model predictions would change if endogenous and exogenous cytokines were represented separately would be worthwhile; however, we did not explore this type of scenario.

      This work shows interesting emergent behavior from nonlinear cytokine interactions but the analysis does not provide insights into the underlying causes, e.g. which of the feedback loops dominates early versus late during a time course.

      Indeed, analyzing the model to fully understand the time-varying interactions between the multiple feedback loops is a challenge in and of itself, and we appreciate the opportunity to elaborate on our approach to addressing this challenge. First: the crosstalk/feedback between cytokines and the temporal nature was analyzed in the heatmap (Fig. 6) and lines 474-482. Second: the sensitivity of cytokine parameters to specific outputs was included in Table 9 and full-time course sensitivity is included in Supplemental Figure 2. Further correlation analysis was also included to demonstrate how cytokine concentrations influenced specific output metrics at various timepoints (Supplemental Fig. 3). We agree that further elaboration of these findings is required; therefore, we added lines 504-509 to discuss the specific mechanisms at play with the combined cytokine interactions. We also added more discussion (lines 637-638) regarding future work that could develop more analysis methods to further investigate the complex behaviors in the model.

      Reviewer #2 (Public Review):

      Strengths:

      The manuscript identified relevant model parameters from a long list of biological studies. This collation of a large amount of literature into one framework has the potential to be very useful to other authors. The mathematical methods used for parameterization and validation are transparent.

      Weaknesses:

      I have a few concerns which I believe need to be addressed fully.

      My main concerns are the following:

      (1) The model is compared to experimental data in multiple results figures. However, the actual experiments used in these figures are not described. To me as a reviewer, that makes it impossible to judge whether appropriate data was chosen, or whether the model is a suitable descriptor of the chosen experiments. Enough detail needs to be provided so that these judgements can be made.

      Thank you for raising this point. We created a new table (Supplemental table 6) that describes the techniques used for each experimental measurement.

      (2) Do I understand it correctly that all simulations are done using the same initial simulation geometry? Would it be possible to test the sensitivity of the paper results to this geometry? Perhaps another histological image could be chosen as the initial condition, or alternative initial conditions could be generated in silico? If changing initial conditions is an unreasonably large request, could the authors discuss this issue in the manuscript?

      We appreciate your insightful question regarding the initial simulation geometry in our model. The initial configuration of the fibers/ECM/microvascular structures was kept consistent but the location of the necrosis was randomly placed for each simulation. Future work will include an in-depth analysis of altered histology configuration on model predictions which has been added to lines 618-621. We did a preliminary example analysis by inputting a different initial simulation geometry, which predicted similar regeneration outcomes. We have added Supplemental Figure 5 that provides the results of that example analysis.

      (3) Cytokine knockdowns are simulated by 'adjusting the diffusion and decay parameters' (line 372). Is that the correct simulation of a knockdown? How are these knockdowns achieved experimentally? Wouldn't the correct implementation of a knockdown be that the production or secretion of the cytokine is reduced? I am not sure whether it's possible to design an experimental perturbation which affects both parameters.

      We appreciate that this important question has been posed. Yes, in order to simulate the knockout conditions, the cytokine secretion was reduced/eliminated. The diffusion and decay parameters were also adjusted to ensure that the concentration within the system was reduced. Lines 391-394 were added to clarify this assumption.

      (4) The premise of the model is to identify optimal treatment strategies for muscle injury (as per the first sentence of the abstract). I am a bit surprised that the implemented experimental perturbations don't seem to address this aim. In Figure 7 of the manuscript, cytokine alterations are explored which affect muscle recovery after injury. This is great, but I don't believe the chosen alterations can be done in experimental or clinical settings. Are there drugs that affect cytokine diffusion? If not, wouldn't it be better to select perturbations that are clinically or experimentally feasible for this analysis? A strength of the model is its versatility, so it seems counterintuitive to me to not use that versatility in a way that has practical relevance. - I may well misunderstand this though, maybe the investigated parameters are indeed possible drug targets.

      Thank you for your thoughtful feedback. The first sentence (lines 32-34) of the abstract was revised to focus on beneficial microenvironmental conditions to best reflect the purpose of the model. The clinical relevance of the cytokine modifications is included in the discussion (lines 547-558) with additional information added to lines 524-526. For example, two methods to alter diffusion experimentally are: antibodies that bind directly to the cytokine to prevent it from binding to its receptor on the cell surface and plasmins that induce the release of bound cytokines.

      (5) A similar comment applies to Figure 5 and 6: Should I think of these results as experimentally testable predictions? Are any of the results surprising or new, for example in the sense that one would not have expected other cytokines to be affected as described in Figure 6?

      We appreciate the opportunity to clarify the basis for these perturbations. The perturbations included in Figure 5 were designed to mimic the conditions of a published experiment that delivered VEGF in vivo (Arsic et al. 2004, DOI:10.1016/J.YMTHE.2004.08.007). The perturbation input conditions and experimental results are included in Table 8, and Supplemental Table 6 has been added to include experimental data and a description of the methods for the perturbation. The results of this analysis provide both validation and new predictions, because some of the outputs were measured in the experiments while others were not. The additional output metrics and timepoints that were not collected in the experiment allow for a deeper understanding of the dynamics and mechanisms leading to the changes in muscle recovery (lines 437-454). These model outputs can provide the basis for future experiments; for example, they highlight which time points would be more important to measure and even provide predicted effect sizes that could be the basis for a power analysis (lines 639-640).

      Regarding Figure 6, the published experimental outcomes of cytokine KOs are included in Table 8. The model allowed comparison of different cytokine concentrations at various timepoints when other cytokines were removed from the system due to the KO condition. The experimental results did not provide data on the impact on other cytokine concentrations but by using the model we were able to predict temporally based feedback between cytokines (lines 474-482). These cytokine values could be collected experimentally but would be time consuming and expensive. The results of these perturbations revealed the complex nature of the relationship between cytokines and how removal of one cytokine from the system has a cascading temporal impact. Lines 533-534 have been added to incorporate this into the discussion.

      (6) In figure 4, there were differences between the experiments and the model in two of the rows. Are these differences discussed anywhere in the manuscript?

      We appreciate your keen observation and the opportunity to address these differences. The model did not match experimental results for CSA output in the TNF KO and anti-inflammatory nanoparticle perturbations or TGF levels with the macrophage depletion. While it did align with the other experimental metrics from those studies, it is likely that there are other mechanisms at play in the experimental conditions that were not captured by simulating the downstream effects of the experimental perturbations. We have added discussion of the differences to lines 445-454.

      (7) The variation between experimental results is much higher than the variation of results in the model. For example, in Figure 3 the error bars around experimental results are an order of magnitude larger than the simulated confidence interval. Do the authors have any insights into why the model is less variable than the experimental data? Does this have to do with the chosen initial condition, i.e. do you think that the experimental variability is due to variation in the geometries of the measured samples?

      Thank you for your insightful observations and questions. The lower model variability is attributed to the larger sample size of model simulations compared to experimental subjects. Running 100 simulations narrows the confidence interval (average 2.4, maximum 3.3) compared to the experiments, which typically had a sample size of less than 15. If the number of simulations had been reduced to 15, the stochasticity within the model would have resulted in a larger confidence interval (average 7.1, maximum 10). There are also several possible confounding variables in the experimental protocols (e.g., variations in injury, different animal subjects for each timepoint) that are kept constant in the model simulation. We have added discussion of this point to the manuscript (lines 517-519). Future work with the model will examine how variations in conditions, such as initial muscle geometry and injury, alter regeneration outcomes and overall variability. This discussion has been incorporated into lines 640-643.
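      The scaling behind this point can be sketched numerically: the 95% confidence-interval halfwidth shrinks roughly as 1/sqrt(n), so 100 replicates give an interval roughly three times tighter than 15, consistent with the halfwidths quoted above. The seeded pseudo-random "model output" below is purely illustrative, not the paper's data.

      ```javascript
      // Sketch: 95% CI halfwidth shrinks roughly as 1/sqrt(n). The "model
      // output" is a seeded linear congruential generator, purely illustrative.
      function makeLCG(seed) {
        let s = seed >>> 0;
        return () => {
          s = (1664525 * s + 1013904223) >>> 0;
          return s / 2 ** 32; // uniform in [0, 1)
        };
      }

      function ciHalfwidth(samples) {
        const n = samples.length;
        const mean = samples.reduce((acc, x) => acc + x, 0) / n;
        const variance =
          samples.reduce((acc, x) => acc + (x - mean) ** 2, 0) / (n - 1);
        return 1.96 * Math.sqrt(variance / n); // normal-approximation 95% CI
      }

      const rand = makeLCG(42);
      const outputs = Array.from({ length: 100 }, () => 50 + 20 * rand());

      const ciSmall = ciHalfwidth(outputs.slice(0, 15)); // like ~15 experimental subjects
      const ciLarge = ciHalfwidth(outputs);              // like 100 simulation replicates
      ```

      With the same underlying spread, the interval with n = 100 comes out several-fold narrower than with n = 15, mirroring the gap between model and experimental error bars.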

      (8) Is figure 2B described anywhere in the text? I could not find its description.

      Thank you for pointing that out. We have added a reference for Fig. 2B on line 190.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The model code seems to be available from https://simtk.org/projects/muscle_regen but that website requests member status ("This is a private project. You must be a member to view its contents.") and applying for membership could violate eLife's blind review process. So, this reviewer liked to but couldn't run the model her/himself. To eLife: Can the authors upload their model to a neutral server that reviewers and editors can access anonymously?

      The code has been made publicly available on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      Line 121 has been updated with the new link and the additional resources were added to lines 654-657.

      (2) The muscle regeneration field typically studies 2D cross-sections and the present model can be well compared to these other 2D models but cells as stochastic and localized sources of diffusible cytokines may yield different cytokine fields in 3D vs. 2D. I would expect more broadened and smoothened cytokine fields (from sources in neighboring cross-sections) than what the 2D model predicts based on sources just within the focus cross-section. Such relations of 2D to 3D should be discussed.

      We thank the reviewer for the excellent suggestions and observations. It has been reported in other Compucell3D models (Sego et al. 2017, DOI:10.1088/1758-5090/aa6ed4) that the convergence of diffusion solutions between 2D and 3D model configurations had similar outcomes, with the 3D simulations presenting excessive computational cost without contributing any noticeable additional accuracy. Similarly, other cell-based ABMs that incorporate diffusion mechanisms (Marino et al. 2018, DOI:10.3390/computation6040058) have found that 2D and 3D versions of the model both predict the same mechanisms and that the 2D resolution was sufficient for determining outcomes. Lines 615-618 were added to elaborate on this topic.

      (3) Since the model (and title) focuses on "nonlinear" cytokine interactions, what would change if cytokine decay would not be linear (as modeled here) but saturated (with nonlinear Michaelis-Menten kinetics as ligand binding and endocytosis mechanisms would call for)?

      Thank you for raising an intriguing point. The model includes a combination of cytokine decay as well as ligand binding and endocytosis mechanisms that can be saturated. For a cytokine-dependent model behavior to occur, the cytokines necessary to induce that action had to reach a minimum threshold. Once that threshold was reached, that amount of the cytokine would be removed at that location to simulate ligand-receptor binding and endocytosis. These ligand binding and endocytosis mechanisms behave in a saturated way, removing a set amount when above a certain threshold or a defined ratio when under the threshold. Lines 313-315 were revised to clarify this point. There were certain concentrations of cytokines where we saw a plateau in outputs, likely as a result of reaching a saturation threshold (Supplemental Fig. 3). In future work, more robust mathematical simulation of binding kinetics of cytokines (e.g., using ODEs) could be included.
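      The kinetic distinction the reviewer raises can be sketched numerically (with arbitrary placeholder parameters, not the model's calibrated values): linear decay dC/dt = -k*C slows as concentration falls, whereas Michaelis-Menten removal dC/dt = -Vmax*C/(Km + C) stays near Vmax until C approaches Km, so a large pool drains at a roughly constant rate.

      ```javascript
      // Sketch of linear vs. saturated (Michaelis-Menten) cytokine removal,
      // integrated with forward Euler. Parameter values are illustrative only.
      function simulate(rate, c0, dt, steps) {
        let c = c0;
        for (let i = 0; i < steps; i++) {
          c += dt * rate(c); // forward-Euler step
          if (c < 0) c = 0;
        }
        return c;
      }

      const k = 0.1;                              // linear decay constant
      const linear = (c) => -k * c;               // dC/dt = -k*C
      const Vmax = 1.0, Km = 0.5;                 // saturation parameters
      const saturated = (c) => (-Vmax * c) / (Km + c); // dC/dt = -Vmax*C/(Km+C)

      // From C0 = 10 over t = 10: the saturated pathway removes at a rate
      // capped near Vmax (zeroth-order at high C) instead of exponentially.
      const cLinear = simulate(linear, 10, 0.01, 1000);
      const cSaturated = simulate(saturated, 10, 0.01, 1000);
      ```

      The linear case ends near 10·e⁻¹ ≈ 3.7, while the saturated case drains much further before slowing as C nears Km, which is the qualitative difference such a refinement would introduce.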

      (4) Limitations of the model should be discussed together with an outlook for model refinement. For example, fiber alignment and ECM ultrastructure may require anisotropic diffusion. Many of the rate equations could be considered with saturation parameters etc. There are so many model assumptions. Please discuss which would be the most urgent model refinements and, to achieve these, which would be the most informative next experiments to perform.

      We appreciate your thoughtful consideration of the model's limitations and the need for a comprehensive discussion on model refinements and potential future experiments. The future direction section was expanded to discuss additional possible model refinements (lines 635-643) and additional possible experiments for model validation (lines 630-634).

      (5) It is not clear how the single spatial arrangement that is used affects the model predictions. E.g. now the damaged area surrounds the lymphatic vessel but what if the opposite corner was damaged and the lymphatic vessel is deep inside the healthy area?

      Thank you for highlighting the importance of considering different spatial arrangements in the model and its potential impact on predictions. We previously tested model perturbations that included specifying the injury surrounding the lymphatic vessel versus on the side opposite the vessel. Since this paper focuses more on cytokine dynamics, we plan to include this perturbation, along with other injury alterations, in a follow-on paper. We added more context about this in the future efforts section lines 640-643.

      (6) It seems that not only parameter values but also the initial values of most of the model components are unknown. The parameter estimation strategy does not seem to include the initial (spatial) distributions of collagen and cytokines and other model components. Please discuss how other (reasonable) initial values or spatial arrangements will affect model predictions.

      We appreciate your thoughtful consideration of unknown initial values/spatial arrangements and their potential influence on predictions. Initial cytokine levels prior to injury had a low relative concentration compared to levels post injury and were assumed to be negligible. Initial spatial distributions of cytokines were not defined as inputs (except in knockout simulations); rather, cytokines are secreted from cells (with baseline resident cell counts defined from the literature), so the distribution of cytokines is an emergent behavior that results from the cell behaviors within the model. The collagen distribution is altered in response to clearance of necrosis by the immune cells (decreased collagen with necrosis removal) and subsequent secretion of collagen by fibroblasts. The secretion of collagen from fibroblasts was included in the parameter estimation sweep (Supplemental Table 1).

      We are working on further exploring the model sensitivity to altered spatial arrangements and have added this to the future directions section (lines 618-621), as well as provided Supplemental Figure 5 to demonstrate that model outcomes are similar with altered initial spatial arrangements.

      (7) Many details of the CC3D implementation are missing: overall lattice size, interaction neighborhood order, and "temperature" of the Metropolis algorithm. Are the typical adhesion energy terms used in the CPM Hamiltonian and if so, then how are these parameter values estimated?

      Thank you for bringing attention to the missing details regarding the CC3D implementation in our manuscript. We have included supplemental information providing greater detail for CPM implementation (Lines 808-854). We also added two additional supplemental tables for describing the requested CC3D implementation details (Supplemental Table 4) and adhesion energy terms (Supplemental Table 5).

      (8) Extending the model analysis of combinations of altered cytokine properties, which temporal schedules of administration would be of interest, and how could the timing of multiple interventions improve outcomes? Such a discussion or even analysis would further underscore the usefulness of the model.

      In response to your valuable suggestion, lines 558-562 were added to discuss the potential of using the model as a tool to perturb different cytokine combinations at varying timepoints throughout regeneration. In addition, this is also included in future work in lines 636-637.

      (9) The CPM is only weakly motivated, just one sentence on lines 142-145 which mentions diffusion in a misleading way as the CPM just provides cells with a shape and mechanical interactions. The diffusion part is a feature of the hybrid CompuCell3D framework, not the CPM.

      Thank you for bringing up this distinction. We removed the statement regarding diffusion and updated lines 143-146 to focus on CPM representation of cellular behavior and interactions. We also added a reference to supplemental text that includes additional details on CPM.

      (10) On lines 258-261 it does not become clear how the described springs can direct fibroblasts towards areas of low-density collagen ECM. Are the lambdas dependent on collagen density?

      Thank you for highlighting this area for clarification. Fibroblasts form links with low-collagen-density ECM and are then pulled toward those areas based on a constant lambda value; the links between a fibroblast and the ECM are only made if the collagen is below a certain threshold. We added additional clarification to lines 260-264.

      (11) On line 281, what does the last part in "Fibers...were regenerating but not fully apoptotic cells" mean? Maybe rephrase this.

      The last part of that line indicates that some fibers surrounding the main injury site were damaged but still had healthy portions, indicating that they were impacted by the injury and were regenerating but did not become fully apoptotic like the fiber cells at the main site of injury. We rephrased this line to indicate that the nearby fibers were damaged but not fully apoptotic.

      (12) Lines 290-293 describe interactions of cells and fields with localized structures (capillaries and lymphatic vessel). Please explain in more detail how "capillary agents...transport neutrophiles and monocytes" in the CPM model formalism. Are new cells added following rules? How is spatial crowding of the lattice around capillaries affecting these rules? Moreover, how can "lymphatic vessel...drain the nearby cytokines and cells"? How is this implemented in the CPM and how is "nearby" calculated?

      We appreciate your detailed inquiry into the interactions of cells and fields with localized structures. The neutrophils and monocytes are added to the simulation at the lattice sites above capillaries (within the cell layer, Fig. 2B) and undergo chemotaxis up their respective gradients. Recruited neutrophils and monocytes are randomly distributed among the healthy capillaries that do not already have an immune cell at the capillary location (a modeling constraint that is a byproduct of allowing only one cell per lattice site). This approach helped prevent excessive crowding at particular capillaries. Because the immune cells in the simulation are sufficiently small, the chemotactic gradients sufficiently steep, and the simulation space sufficiently large, we do not see aggregation of recruited immune cells in the CPM.

      The lymphatic vessel uptakes cytokines at lattice locations corresponding to the lymphatic vessel and will remove cells located in lattice sites neighboring the lymphatic vessel. In addition, we have included a rule in our ABM to encourage cells to migrate towards the lymphatic vessel utilizing CompuCell3D External Potential Plugin. The influence of this rule is inversely proportional to the distance of the cells to the lymphatic vessel.

      We have updated lines 294-298 and 305-309 to include the above explanation.
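      The recruitment rule described above can be sketched as follows (illustrative Python with made-up names, not the actual CC3D plugin code): new immune cells are placed only at healthy capillary sites that do not already hold an immune cell.

```python
import random

def recruit_immune_cells(capillaries, occupied, n_cells):
    """Place newly recruited immune cells at healthy capillary lattice
    sites that do not already hold an immune cell, chosen at random.
    (Illustrative sketch of the rule described above; not the CC3D code.)"""
    free = [site for site in capillaries if site not in occupied]
    return random.sample(free, min(n_cells, len(free)))

# Two cells recruited; the occupied capillary at (4, 5) is skipped
sites = recruit_immune_cells(capillaries=[(1, 2), (4, 5), (7, 8)],
                             occupied={(4, 5)}, n_cells=2)
```

      Capping recruitment at the number of free capillary sites is what prevents crowding: a capillary cannot spawn a second cell until its site is vacated.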

      (13) Tables 1-4 define migration speeds as agent rules but in the typical CPM, migration speed emerges from random displacements biased by chemotaxis and other effects (like the slope of the cytokine field). How was the speed implemented as a rule while it is typically observable in the model?

      We appreciate your inquiry regarding the implementation of migration speeds. To determine the lambda parameters (Table 7) for each cell type, we tested each in a simplified control simulation with a concentration gradient for the cell to move towards. We tuned the lambda parameters within this simulation until the cell velocity output by the model aligned with the literature-reported cell velocity for each cell type (Tables 1-4). We have incorporated clarification on this in lines 177-180.

      (14) Line 312 shows the first equation with number (5), either add eqn. (1-4) or renumber.

      We have revised the equation number.

      (15) Typos: Line 456, "expect M1 cell" should read "except M1 cell".

      Line 452, "thresholds above that diminish fibroblast response (Supplemental Fig 3)." remains unclear, please rephrase.

      Line 473, "at 28." should read "at 28 days.".

      Line 474, is "additive" correct? Was the sum of the individual effects calculated and did that match?

      Line 534, "complexity our model" should read "complexity in our model".

      We have corrected the typos and clarified line 452 (updated line 594) to indicate that the TNF-α concentration threshold results in diminished fibroblast response. We also updated the terminology in line 474 (updated line 512) to indicate that there was a synergistic effect with the combined perturbation.

      (16) Table 7 defines cell target volumes with the same value as their diameter. This enforces a strange cell shape. Should there be brackets to square the value of the cell diameter, e.g. Value=(12µm)^2 ?

      The target volume parameter values were selected to reflect the relative differences in average cell diameter as reported in the literature; however, no parameter in the CPM formalism directly enforces a cell diameter separate from the volume. We have observed that these relative cell sizes allow the ABM to effectively reproduce cell behaviors described in the literature. Cells that are too large in the ABM would be unable to migrate far enough per time step to carry out cell behaviors, and cells that are too small would be unstable and fail to persist in the simulation when they should. We removed the units for the cell shape values in Table 7 since the target volume is a relative parameter and does not directly represent µm.

      (17) Table 7 gives estimated diffusion constants but they appear to be too high. Please compare them to measured values in the literature, especially for MCP-1, TNF-alpha and IL-10, or relate these to their molecular mass and compare to other molecules like FGF8 (Yu et al. 2009, DOI:10.1038/nature08391).

      We utilized a previously published estimation method (Filion et al. 2004, DOI:10.1152/ajpheart.00205.2004) to estimate cytokine diffusivity within the ECM. This method incorporates the molecular masses of the cytokines and accounts for the combined effects of collagen fibers and glycosaminoglycans. That paper acknowledged that the estimated values are faster than experimentally determined ones, but attributed this to the less-dense matrix composition, which is more reflective of the tissue environment we are simulating than the environments used for other reported measurements. Using this estimation method also allowed us to define diffusion constants more consistently than taking values from the literature (which were often not reported), since those were obtained under varied experimental conditions and techniques (for example, in zebrafish embryos (Yu et al. 2009, DOI:10.1038/nature08391) rather than muscle tissue). It also allowed for recalculation of the diffusivity throughout the simulation as the collagen density changed within the model. Lines 318-326 were updated to help clarify the estimation method.

      (18) Many DOIs in the bibliography (Refs. 7,17,20,31,40,47...153) are wrong and do not resolve because the appended directory names are not allowed in the DOI, just with a journal's URL after resolution.

      Thank you for bringing this to our attention. The incorrect DOIs have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      (9) On line 174, the authors say "We used the CC3D feature Flip2DimRatio to control the number of times the Cellular-Potts algorithm runs per mcs." What does this mean? Isn't one monte carlo timestep one iteration of the Cellular Potts model? How does this relate to physical timescales?

      We appreciate your attention to detail and thoughtful question regarding the use of the CC3D feature Flip2DimRatio. Lines 175-177 were revised to simplify the meaning of Flip2DimRatio. That parameter alters the number of times the Cellular-Potts algorithm is run per mcs, which is the limiting factor for cell movement. The physical timescale is kept at a 15-minute timestep, but a high Flip2DimRatio provides the flexibility and stability for cells to move farther within a single timestep.

      (10) Has the custom MATLAB script to process histology images into initial conditions been made available?

      The MATLAB script along with CC3D code for histology initialization with documentation has been made available with the source code on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      (11) Equation 5 is provided without a reference or derivation. Where does it come from and what does it mean?

      Thank you for highlighting the diffusion equation and seeking clarification on its origin and significance. Lines 318-326 were revised to clarify where the equation comes from. This is a previously published estimation method that we applied to calculate the diffusivity of the cytokines considering both collagen and glycosaminoglycans.

      (12) Line 326: "For CSA, experimental fold-change from pre-injury was compared with fold-change in model-simulated CSA". Does this step rely on the assumption that the fold change will not depend on the CSA? If so, is this something that is experimentally known, or otherwise, can it be confirmed by simulations?

      We appreciate the opportunity to clarify our rationale. The fold change was used to normalize the model and experiment so that they could be compared on the same scale. Yes, this step relies on the assumption that fold change does not depend on pre-injury CSA. Experimentally, it is difficult to determine the impact of initial fiber morphology on an altered regeneration time course. The fold change allows us to compare percent recovery, which is a common metric used to assess muscle regeneration outcomes experimentally. Lines 340-343 were revised to clarify.

      (13) Line 355: "The final passing criteria were set to be within 1 SD for CSA recovery and 2.5 SD for SSC and fibroblast count" Does this refer to the experimental or the simulated SD?

      The model had to fit within the experimental SDs. Lines 371-372 were edited to specify that we are referring to the experimental SD.

      (14) "Following 8 iterations of narrowing the parameter space with CaliPro, we reached a set that had fewer passing runs than the previous iteration". Wouldn't one expect fewer passing runs with any narrowing of the parameter space? Why was this chosen as the stopping criterion for further narrowing?

      We appreciate your observation regarding the statement about narrowing the parameter space with CaliPro. We started with a wide parameter space, expecting that certain parameters would give outputs that fall outside of the comparable data. So, when the parameter space was narrowed to enrich parts that give passing output, initially the number of passing simulations increased.

      Once we had narrowed the set of possible parameters into an ideal parameter space, further narrowing would cut out viable parameters, resulting in fewer passing runs. Therefore, we stopped narrowing once fewer simulations passed the criteria than had passed with the wider parameter set. Lines 375-379 have been updated to clarify this point.
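      The stopping criterion described above can be sketched as a simple loop (a hypothetical stand-in for the CaliPro workflow, with placeholder function names, not its actual API):

```python
def calibrate(space, run_passes, narrow):
    """Iteratively narrow a parameter space, stopping once the number of
    passing runs drops below the previous iteration's count.
    `run_passes(space)` counts passing simulations; `narrow(space)`
    returns a tighter space. Both are placeholder callables."""
    best_passes = run_passes(space)
    while True:
        candidate = narrow(space)
        passes = run_passes(candidate)
        if passes < best_passes:
            return space  # further narrowing cuts out viable parameters
        space, best_passes = candidate, passes

# Toy example: passing runs rise as the space is enriched (widths 4 -> 3 -> 2),
# then fall once narrowing overshoots (width 1), so width 2 is kept
passes_table = {4: 10, 3: 12, 2: 15, 1: 9}
best_space = calibrate(4, lambda w: passes_table[w], lambda w: w - 1)
```

      The loop captures why fewer passing runs is a natural stopping point: early narrowing discards only failing regions, so passes rise; once the space is near-optimal, any further cut removes viable parameters.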

      (15) Line 516: 'Our model could test and optimize combinations of cytokines, guiding future experiments and treatments." It is my understanding that this is communicated as a main strength of the model. Would it be possible to demonstrate that the sentence is true by using the model to make actual predictions for experiments or treatments?

      This is demonstrated by the combined cytokine alterations in Figure 7 and discussed in lines 509-513. We have also added a suggested experiment to test the model prediction in lines 691-695.

      (16) Line 456, typo: I think 'expect' should be 'except'.

      Thank you for pointing that out. The typo has been corrected.

    1. Video summary [00:14:07] - [00:37:04]:

      This video presents a conference on the role of the school as a living territory at the heart of the values of the Republic. The speakers discuss the importance of the school in transmitting republican values, especially in the context of the recent events that have affected the academy.

      Highlights:
      + [00:14:07] Introduction to the conference
        * Welcome by Sébastien Jaibovski, director of the INSP of the Lille academy
        * Presentation of the program and the speakers
        * Placing the conference in the context of current events
      + [00:17:01] Address by Alain Frugère
        * The importance of sharing and transmitting the values of liberty, equality, and fraternity
        * A historical reminder of republican values since the French Revolution
        * The need to defend these values against contemporary challenges
      + [00:23:02] Remarks by Madame Lower
        * The school as a vector of equal opportunity and of the fight against inequality
        * The school's role in recognizing every child's capacity to learn
        * The importance of freedom of education for individual emancipation
      + [00:30:03] Context provided by Sébastien Jaibovski
        * The school under the weight of social and symbolic expectations
        * Reflections on the role of public institutions and the resources allocated to them
        * The school as a place of sharing and of building citizenship

      Video summary [00:37:07] - [01:02:26]:

      This part of the video addresses the role of the school in transmitting republican values, with an emphasis on a citizenship-based approach. It underlines the importance of values education as a foundation of the Republic and the need for a pedagogy that fosters critical thinking and the active participation of students.

      Highlights:
      + [00:37:07] The importance of values education
        * The school as a place for discovering and understanding republican values
        * The transmission of values as a central mission of the school as an institution
        * The values of liberty, equality, fraternity, laïcité, and the rejection of discrimination
      + [00:47:00] A citizenship-based approach to values
        * A rejection of indoctrination; educating for liberty
        * A call for critical thinking and for questioning values
        * The importance of the lived experience of values within the school
      + [00:57:00] Commitment to the Republic
        * Values as a refusal of the status quo and a space for engagement
        * The gap between values and reality as an opportunity for action
        * The importance of each person's civic engagement in values education

      Video summary [01:02:28] - [01:25:00]:

      The video explores the evolution of the notion of the values of the Republic in public discourse, the media, and French law since the 1980s. It examines the marked increase in the use of this notion in the 1980s and 1990s, its stabilization in the 2010s, and its lack of a constitutional definition. The video also underlines the importance of teaching laïcité in French schools and how it is addressed in school curricula.

      Key points:
      + [01:02:28] The evolution of the notion of the values of the Republic
        * An increase in publications and media coverage since the 1980s
        * Stabilization in the 2010s
        * The absence of a constitutional definition
      + [01:06:02] The impact on education law and immigration law
        * A significant contribution from the education code and immigration law
        * The importance of learning and internalizing the values
      + [01:10:01] The definition and teaching of laïcité
        * An increased presence in school curricula since the 1980s
        * The need to explain the rules to students
        * A pedagogical approach to deconstructing oppositions
      + [01:18:34] Students' perception and understanding of laïcité
        * Students' good command of the notion
        * The importance of teaching laïcité for an inclusive society

      Video summary [01:25:02] - [01:38:50]:

      This part of the video addresses the importance of teaching laïcité and the values of the Republic in French schools, including those abroad. It highlights the cultural challenges and the differences in approach when teaching universal values.

      Highlights:
      + [01:25:02] Laïcité in education
        * Laïcité is not hostility toward religious beliefs
        * The importance of sharing a positive vision of laïcité
        * The need to adapt teaching to varied cultural contexts
      + [01:30:10] The citizenship-based approach at school
        * Drawing on critical thinking and liberty
        * Turning values into a concrete reality for students
        * Students' engagement with the values of the Republic
      + [01:35:00] The school, a living territory
        * The school is a place of exchange and of active learning of republican values
        * Current events influence the teaching and perception of values
        * The importance of balancing ideal and reality in education

    1. Five months later, a little over a year after the Code Yellow debacle, Google would make Prabhakar Raghavan the head of Google Search

      The author marks this as the moment that locked in the rot of Google Search.

      The March 2019 core update to search, which happened about a week before the end of the code yellow, was expected to be “one of the largest updates to search in a very long time.” Yet when it launched, many found that the update mostly rolled back changes, and traffic was increasing to sites that had previously been suppressed by Google Search’s “Penguin” update from 2012 that specifically targeted spammy search results, as well as those hit by an update from August 1, 2018, a few months after Gomes became Head of Search.

      The start of Google Search's declining effectiveness.

    1. k12 Daisuke Wakabayashi and Sapna Maheshwari. Advertisers Boycott YouTube After Pedophiles Swarm Comments on Videos of Children. The New York Times, February 2019. URL: https://www.nytimes.com/2019/02/20/technology/youtube-pedophiles.html (visited on 2023-12-07)

      After reading this article I was reminded of assignment 3, where we did bot trolling, and I thought about how difficult it would be to write code that catches people not obeying user policies. For instance, the comments weren't blatantly explicit, sexually inappropriate comments; instead they would contain a string of sexually suggestive emoji or insinuate some form of sexual abuse. At this rate, making sure your platform is safe for children would be extremely difficult. The way we coded the automatic response in assignment 3, we had it recognize a specific sentence, but harassment is a spectrum and is a matter of interpretation a lot of the time (making it difficult to detect).

    1. We often think of software development as a ticket-in-code-out business but this is really only a very small portion of the entire thing. Completely independently of the work done as a programmer, there exists users with different jobs they are trying to perform, and they may or may not find it convenient to slot our software into that job. A manager is not necessarily the right person to evaluate how good a job we are doing because they also exist independently of the user–software–programmer network, and have their own sets of priorities which may or may not align with the rest of the system.

      Software development as a conversation

    1. Quantification is ultimately linguistic: it is a form of translation. Most of our descriptions start as ‘ordinary language’, and in some cases, we ‘code’ those descriptions using numbers rather than words

      So you do not think physical quantities exist outside of our minds?

    1. With Code FIRSTAID

      And remove the code because it should link to the discounted checkout cart, right?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Editor’s summary:

      This paper by Castello-Serrano et al. addresses the role of lipid rafts in trafficking in the secretory pathway. By performing carefully controlled experiments with synthetic membrane proteins derived from the transmembrane region of LAT, the authors describe, model and quantify the importance of transmembrane domains in the kinetics of trafficking of a protein through the cell. Their data suggest affinity for ordered domains influences the kinetics of exit from the Golgi. Additional microscopy data suggest that lipid-driven partitioning might segregate Golgi membranes into domains. However, the relationship between the partitioning of the synthetic membrane proteins into ordered domains visualised ex vivo in GPMVs, and the domains in the TGN, remain at best correlative. Additional experiments that relate to the existence and nature of domains at the TGN are necessary to provide a direct connection between the phase partitioning capability of the transmembrane regions of membrane proteins and the sorting potential of this phenomenon.

      The authors have used the RUSH system to study the traffic of model secretory proteins containing single-pass transmembrane domains that confer defined affinities for liquid ordered (lo) phases in Giant Plasma Membrane derived Vesicles (GPMVs), out of the ER and Golgi. A native protein termed LAT partitioned into these lo-domains, unlike a synthetic model protein termed LAT-allL, which had a substituted transmembrane domain. The authors experiments provide support for the idea that ER exit relies on motifs in the cytosolic tails, but that accelerated Golgi exit is correlated with lo domain partitioning.

      Additional experiments provided evidence for segregation of Golgi membranes into coexisting lipid-driven domains that potentially concentrate different proteins. Their inference is that lipid rafts play an important role in Golgi exit. While this is an attractive idea, the experiments described in this manuscript do not provide a convincing argument one way or the other. It does however revive the discussion about the relationship between the potential for phase partitioning and its influence on membrane traffic.

      We thank the editors and scientific reviewers for thorough evaluation of our manuscript and for positive feedback. While we agree that our experimental findings present a correlation between trafficking rates and raft affinity, in our view, the synthetic, minimal nature of the transmembrane protein constructs in question makes a strong argument for involvement of membrane domains in their trafficking. These constructs have no known sorting determinants and are unlikely to interact directly with trafficking proteins in cells, since they contain almost no extramembrane amino acids. Yet, the LAT-TMD traffics through the Golgi similarly to the full-length LAT protein, but quite differently from mutants with lower raft phase affinity. We suggest that these observations can be best rationalized by involvement of raft domains in the trafficking fates and rates of these constructs, providing strong evidence (beyond a simple correlation) for the existence and relevance of such domains.

      We have substantially revised the manuscript to address all reviewer comments, including several new experiments and analyses. These revisions have substantially improved the manuscript without changing any of the core conclusions and we are pleased to have this version considered as the “version of record” in eLife.

      Below is our point-by-point response to all reviewer comments.

      ER exit:

      The experiments conducted to identify an ER exit motif in the C-terminal domain of LAT are straightforward and convincing. This is also consistent with available literature. The authors should comment on whether the conservation of the putative COPII association motif (detailed in Fig. 2A) is significantly higher than that of other parts of the C-terminal domain.

      Thank you for this suggestion; this information has now been included as Supp Fig 2B. While there are other well-conserved residues of the LAT C-terminus, many regions have relatively low conservation. In contrast, the essential residues of the COPII association motif (P148 and A150) are completely conserved in LAT across all species analyzed.

      One cause of concern is that addition of a short cytoplasmic domain from LAT is sufficient to drive ER exit, and in its absence the synthetic constructs are all very slow. However, the argument presented that specific lo phase partitioning behaviour of the TMDs do not have a significant effect on exit from the ER is a little confusing. This is related to the choice of the allL-TMD as the 'non-lo domain' partitioning comparator. Previous data has shown that longer TMDs (23+) promote ER export (eg. Munro 91, Munro 95, Sharpe 2005). The mechanism for this is not, to my knowledge, known. One could postulate that it has something to do with the very subject of this manuscript- lipid phase partitioning. If this is the case, then a TMD length of 22 might be a poor choice of comparison. A TMD 17 Ls' long would be a more appropriate 'non-raft' cargo. It would be interesting to see a couple of experiments with a cargo like this.

      The basis for the claim that raft affinity has a relatively minor influence on ER exit kinetics, especially in comparison to the effect of the putative COPII interaction motif, is Fig 1G. We do observe some differences between constructs that may be related to raft affinity; however, we consider these relatively minor compared to the nearly 4-fold increase in ER efflux induced by COPII motifs.

      We have modified the wording in the manuscript to avoid the impression that we have ruled out an effect of raft affinity on ER exit.

      We believe that our observations are broadly consistent with those of Munro and colleagues. In both their work and ours, long TMDs were able to exit the ER. In our experiments, this was true for several proteins with long TMDs, either as full-length or as TMD-only versions (see Fig 1G). We intentionally did not measure shorter synthetic TMDs because these would not have been comparable with the raft-preferring variants, which all require relatively long TMDs, as demonstrated in our previous work [1,2]. Thus, because our manuscript does not make any claims about the influence of TMD length on trafficking, we did not feel that experiments with shorter non-raft constructs would substantively influence our conclusions.

      However, to address reviewer interest, we did complete one set of experiments to test the effect of shortening the TMD on ER exit. We truncated the native LAT TMD by removing 6 residues from the C-terminal end of the TMD (LAT-TMDd6aa). This construct exited the ER similarly to all others we measured, revealing that, for this set of constructs, short TMDs did not accumulate in the ER. ER exit of the truncated variant was slightly slower than the full-length LAT-TMD, but somewhat faster than the allL-TMD. These effects are consistent with our previous measurements, which showed that this shortened construct has slightly lower raft phase partitioning than the LAT-TMD but higher than allL [2]. While these are interesting observations, a more thorough exploration of the effect of TMD length would be required to make any strong conclusion, so we did not include these data in the final manuscript.

      Author response image 1.

      Golgi exit:

      For the LAT constructs, the kinetics of Golgi exit as shown in Fig. 3B are surprisingly slow. About half of the protein remains in the Golgi at 1 h after biotin addition. Most secretory cargo proteins would have almost completely exited the Golgi by that time, as illustrated by VSVG in Fig. S3. There is a concern that LAT may have some tendency to linger in the Golgi, presumably due to a factor independent of the transmembrane domain, and therefore cannot be viewed as a good model protein. For kinetic modeling in particular, the existence of such an additional factor would be far from ideal. A valuable control would be to examine the Golgi exit kinetics of at least one additional secretory cargo.

      We disagree that LAT is an unusual protein with respect to Golgi efflux kinetics. In our experiments, Golgi efflux of VSVG was similar to full-length LAT (t1/2 ~ 45 min), and both of these were similar to previously reported values [3]. Especially for the truncated (i.e. TMD) constructs, it is very unlikely that some factor independent of their TMDs affects Golgi exit, as they contain almost no amino acids outside the membrane-embedded TMD.

      Practically, it has proven somewhat challenging to produce functional RUSH-Golgi constructs. We attempted the experiment suggested by the reviewer by constructing SBP-tagged versions of several model cargo proteins, but all failed to trap in the Golgi. We speculate that the Golgin84 hook, being an integral membrane protein rather than a lumenal hook like KDEL-streptavidin, is much more sensitive to the location of the SBP on the cargo. This limitation can likely be overcome by engineering the cargo, but we did not feel that another control cargo protein was essential for the conclusions we presented, thus we did not pursue this direction further.

      Comments about the trafficking model

      (1) In Figure 1E, the export of LAT-TMD from the ER is fitted to a single-exponential fit that the authors say is "well described". This is unclear and there is perhaps something more complex going on. It appears that there is an initial lag phase and then similar kinetics after that - perhaps the authors can comment on this?

      This is a good observation. This effect is explainable by the mechanics of the measurement: in Figs 1 and 2, we measure not ‘fraction of protein in ER’ but ‘fraction of cells positive for ER fluorescence’. This is because the very slow ER exit of the TMD-only constructs presents a major challenge for live-cell imaging, so ER exit was quantified on a population level, by fixing cells at various time points after biotin addition and quantifying the fraction of cells with observable ER localization (rather than tracking a single cell over time).

      For fitting to the kinetic model (which attempts to describe ‘fraction in ER/Golgi’) we re-measured all constructs by live-cell imaging (see Supp Fig 5) to directly quantify relative construct abundance in the ER or Golgi. These data did not show the initial plateau seen in Fig 1E, suggesting that it is an artifact of counting “ER-positive cells”, which would be expected to show a longer lag than “fraction of protein in ER”. Notably however, t1/2 measured by both methods was similar, suggesting that the population measurement agrees well with single-cell live imaging.
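      To make concrete how a t1/2 is extracted from such a single-exponential description, here is a minimal sketch using synthetic data (the values below are hypothetical stand-ins; the real measurements are those shown in the manuscript's figures):

```python
import numpy as np
from scipy.optimize import curve_fit

# Single-exponential model for the fraction of protein remaining in the ER:
# f(t) = exp(-k * t), with half-time t1/2 = ln(2) / k.
def er_fraction(t, k):
    return np.exp(-k * t)

# Synthetic example data (hypothetical; stands in for the imaging measurements).
t = np.array([0, 15, 30, 45, 60, 90, 120], dtype=float)  # min after biotin
true_k = np.log(2) / 45.0                                # simulate t1/2 = 45 min
rng = np.random.default_rng(1)
frac = er_fraction(t, true_k) + rng.normal(0, 0.02, t.size)

# Fit the decay constant and convert to a half-time.
(k_fit,), _ = curve_fit(er_fraction, t, frac, p0=[0.01])
t_half = np.log(2) / k_fit
print(f"fitted t1/2 = {t_half:.1f} min")
```

With this phenomenological approach, the single fitted parameter (t1/2) can be compared directly between the population-level and single-cell measurements.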

      We have included all these explanations and caveats in the manuscript. We have also changed the wording from “well described” to “reasonably approximated”.

      (2) The model for Golgi sorting is also complicated and controversial, and while the authors' intention not to overinterpret their data in this regard must be respected, these data are in support of the two-phase Golgi export model (Patterson et al PMID:18555781).

      The reviewers are correct: our observations and model are consistent with Patterson et al., and it was a major oversight that a reference to this foundational work was not included. We have now added a discussion regarding the “two phase model” of Patterson and Lippincott-Schwartz.

      Furthermore, contrary to the statement in lines 200-202, the kinetics of VSVG exit from the Golgi (Fig. S3) are roughly linear and so are NOT consistent with the previous report by Hirschberg et al.

      Regarding kinetics of VSVG, our intention was to claim that the timescale of VSVG efflux from the Golgi was similar to that previously reported by Hirschberg et al., i.e. t1/2 roughly between 30-60 minutes. We have clarified this in the text. Minor differences in the details between our observations and Hirschberg are likely attributable to temperature, as those measurements were done at 32°C for the tsVSVG mutant.

      Moreover, the kinetics of LAT export from the Golgi (Fig. 3B) appear quite different, more closely approximating exponential decay of the signal. These points should be described accurately and discussed.

      Regarding linear versus exponential fits, we agree that the reality of Golgi sorting and efflux is far more complicated than accounted for by either the phenomenological curve fitting in Figs 1-3 or the modeling in Fig 4. In addition to the possibility of lateral domains within Golgi stacks, there is transport between stacks, retrograde traffic, etc. The fits in Figs 1-3 are not intended to model specifics of transport, but rather to be phenomenological descriptors that allowed us to describe efflux kinetics with one parameter (i.e. t1/2). In contrast, the more refined kinetic modeling presented in Figure 4 is designed to test a mechanistic hypothesis (i.e. coexisting membrane domains in Golgi) and describes well the key features of the trafficking data.

      Relationship between membrane traffic and domain partitioning:

      (1) Phase segregation in the GPMV is dictated by thermodynamics given its composition and the measurement temperature (at low temperatures, 4°C). However at physiological temperatures (32-37°C) at which membrane trafficking is taking place these GPMVs are not phase separated. Hence it is difficult to argue that a sorting mechanism based solely on the partitioning of the synthetic LAT-TMD constructs into lo domains detected at low temperatures in GPMVs provides a basis (or its lack) for the differential kinetics of traffic out of the Golgi (or ER). The mechanism in a living cell to form any lipid based sorting platforms naturally requires further elaboration, and by definition cannot resemble the lo domains generated in GPMVs at low temperatures.

      We thank the reviewers for bringing up this important point. GPMVs are a useful tool because they allow direct, quantitative measurements of protein partitioning between coexisting ordered and disordered phases in complex, cell-derived membranes. However, we entirely agree that GPMVs do not fully represent the native organization of the living cell plasma membrane and we have previously discussed some of the relevant differences4,5. Despite these caveats, many studies have supported the cellular relevance of phase separation in GPMVs and the partitioning of proteins to raft domains therein 6-9. Most notably, elegant experiments from several independent labs have shown that fluorescent lipid analogs that partition to Lo domains in GPMVs also show distinct diffusive behaviors in live cells 6,7, strongly suggesting the presence of nanoscopic Lo domains in live cells. Similarly, our recent collaborative work with the lab of Sarah Veatch showed excellent agreement between raft preference in GPMVs and protein organization in living immune cells imaged by super-resolution microscopy10. Further, several labs6,7, including ours11, have reported strong correlations between raft partitioning in GPMVs and detergent resistance, which is a classical (though controversial) assay for raft association.

      Based on these points, we feel that GPMVs are a useful tool for quantifying protein preference for ordered (raft) membrane domains and that this preference is a useful proxy for the raft-associated behavior of these probes in living cells. We propose that this approach allows us to overcome a major reason for the historical controversy surrounding the raft field: nonquantitative and unreliable methodologies that prevented consistent definition of which proteins are supposed to be present in lipid rafts and why. Our work directly addresses this limitation by relating quantitative raft affinity measurements in a biological membrane with a relevant and measurable cellular outcome, specifically inter-organelle trafficking rates.

      Addressing the point about phase transition temperatures in GPMVs: this is the temperature at which macroscopic domains are observed. Based on physical models of phase separation, it has been proposed that macroscopic phase separation at lower temperatures is consistent with sub-microscopic, nanoscale domains at higher temperatures8,12. These smaller domains can potentially be stabilized / functionalized by protein-protein interactions in cells13 that may not be present in GPMVs (e.g. because of lack of ATP).

      (2) The lipid compositions of each of these membranes - PM, ER and Golgi are drastically different. Each is likely to phase separate at different phase transition temperatures (if at all). The transition temperature is probably even lower for Golgi and the ER membranes compared to the PM. Hence, if the reported compositions of these compartments are to be taken at face value, the propensity to form phase separated domains at a physiological temperature will be very low. Are ordered domains even formed at the Golgi at physiological temperatures?

      It is a good point that the membrane compositions and the resulting physical properties (including any potential phase behavior) will be very different in the PM, ER, and Golgi. Whether ordered domains are present in any of these membranes in living cells remains difficult to directly visualize, especially for non-PM membranes which are not easily accessible by probes, are nanoscopic, and have complex morphologies. However, the fact that raft-preferring probes / proteins share some trafficking characteristics, while very similar non-raft mutants behave differently argues that raft affinity plays a role in subcellular traffic.

      (3) The hypothesis of 'lipid rafts' is a very specific idea, related to functional segregation, and the underlying basis for domain formation has also been hotly debated. In this article the authors conflate thermodynamic phase separation mechanisms with the potential formation of functional sorting domains, further adding to the confusion in the literature. To conclude that this segregation is indeed based on lipid environments of varying degrees of lipid order, it would probably be best to look at the heterogeneity of the various membranes directly using probes designed to measure lipid packing, and then look for colocalization of domains of different cargo with these domains.

      This is a very good suggestion, and a direction we are currently following. Unfortunately, due to the dynamic nature and small size of putative lateral membrane domains, combined with the interior of a cell being filled with lipophilic environments that overlay each other, directly imaging domains in organellar membranes with lipid packing probes remains extremely difficult with current technology (or at least that available to us). We argue that the TMD probes used in this manuscript are a reasonable alternative, as they are fluorescent probes with validated selectivity for membrane compartments with different physical properties.

      Ultimately, the features of membrane domains suggested by a variety of techniques – i.e. nanometric, dynamic, relatively similar in composition to the surrounding membrane, potentially diverse/heterogeneous – make them inherently difficult to microscopically visualize. This is one reason why we believe studies like ours, which use a natural model system to directly quantify raft-associated behaviors and relate them to cellular effects (in our case, protein sorting), are a useful direction for this field.

      We believe we have been careful in our manuscript to avoid confusing language surrounding lipid rafts, phase separation, etc. Our experiments clearly show that mammalian membranes have the capacity to phase separate, that some proteins preferentially interact with more ordered domains, and that this preference is related to the subcellular trafficking fates and rates of these proteins. We have edited the manuscript to emphasize these claims and avoid the historical controversies and confusions.

      (4) In the super-resolution experiments (by SIM- where the enhancement of resolution is around two fold or less compared to optical), the authors are able to discern a segregation of the two types of Golgi-resident cargo that have different preferences for the lo-domains in GPMVs. It should be noted that TMD-allL and LAT-allL end up in the late endosome after exit from the Golgi. Previous work from the Bonifacino laboratory (PMID: 28978644) has shown that proteins (such as M6PR) destined to go to the late endosome bud from a different part of the Golgi in vesicular carriers, while those that are destined for the cell surface first (including TfR) bud with tubular vesicular carriers. Thus at the resolution depicted in Fig 5, the segregation seen by the authors could be due to an alternative explanation, that these molecules are present in different areas of the Golgi for reasons different from phase partitioning. The relatively high colocalization of TfR with the GPI probe in Fig 5E is consistent with this explanation. TfR and GPI prefer different domains in the GPMV assays yet they show a high degree of colocalization and also traffic to the cell surface.

      This is a good point. Even at microscopic resolutions beyond the optical diffraction limit, we cannot make any strong claims that the segregation we observe is due to lateral lipid domains and not several reasonable alternatives, including separation between cisternae (rather than within), cargo vesicles moving between cisternae, or lateral domains that are mediated by protein assemblies rather than lipids. We have explicitly included this point in the Discussion: “Our SIM imaging suggests segregation of raft from nonraft cargo in the Golgi shortly (5 min) after RUSH release (Fig 5B), but at this level of resolution, we can only report reduced colocalization, not intra-Golgi protein distributions. Moreover, segregation within a Golgi cisterna would be very difficult to distinguish from cargo moving between cisternae at different rates or exiting via Golgi-proximal vesicles.”

      We have also added a similar caveat in the Results section of the manuscript: “These observations support the hypothesis that proteins can segregate in Golgi based on their affinity for distinct membrane domains; however, it is important to emphasize that this segregation does not necessarily imply lateral lipid-driven domains within a Golgi cisterna. Reasonable alternative possibilities include separation between cisternae (rather than within), cargo vesicles moving between cisternae, or lateral domains that are mediated by protein assemblies rather than lipids.”

      Finally, while probes with allL TMD do eventually end up in late endosomes (consistent with the Bonifacino lab’s findings which we include), they do so while initially transiting the PM2,11.

      Minor concerns:

      (1) Generally, the quantitation is high quality from difficult experimental data. Although a lot appears to be manual, it appears appropriately performed and interpreted. There are some claims that are made based on this quantitation, however, where there are no statistics performed. For example, figure 1B. Any quantitation with an accompanying conclusion should be subject to a statistical test; I think this is particularly important for assessing the quality of the model fits.

      We appreciate the thoughtful feedback; the quantifications and fits were not trivial, but we believe they are important. We have added statistical significance to Figure 1B and others where it was missing.

      (2) Modulation of lipid levels in Fig 4E shows a significant change for the trafficking rate for the LAT-TMD construct and a not so significant change for all-TMD construct. However, these data are not convincing and appear to depend on a singular data point that seems to lower the mean value. In general, the experiment with the MZA inhibitor (Fig. 4D-F) is hard to interpret because cells will likely be sick after inhibition of sphingolipid and cholesterol synthesis. Moreover, the difference in effects for LAT-TMD and allL-TMD is marginal.

      We disagree with this interpretation. Fig 4E shows the average of three experiments and demonstrates clearly that the inhibitors change the Golgi efflux rate of LAT-TMD but not allL-TMD. This is summarized in the t1/2 quantifications of Fig 4F, which show a statistically significant change for LAT-TMD but not allL-TMD. This is not an effect of a singular data point, but rather the trend across the dataset.

      Further, the inhibitor conditions were tuned carefully to avoid cells becoming “sick”: at higher concentrations, cells did adopt unusual morphologies and began to detach from the plates. We pursued only lower concentrations, which cells survived for at least 48 hrs and without major morphological changes.

      (3) Line 173: 146-AAPSA-152 should read either 146-AAPSA-150 or 146-AAPSAPA-152, depending on what the authors intended.

      Thanks for the careful reading; we intended the former, and it has been fixed.

      (4) What is the actual statistical significance in Fig. 3C and Fig. 3E? There is a single asterisk in each panel of the figure but two asterisks in the legend.

      Apologies, a single asterisk representing p<0.05 was intended. It has been fixed.

      (5) The code used to calculate the model is not accessible. It is standard practice to host well-annotated code on Github or similar, and it would be good to have this publicly available.

      We have deposited the code on a public repository (doi: 10.5281/zenodo.10478607) and added a note to the Methods.

      (1) Lorent, J. H. et al. Structural determinants and functional consequences of protein affinity for membrane rafts. Nature Communications 8, 1219 (2017). PMC5663905

      (2) Diaz-Rohrer, B. B., Levental, K. R., Simons, K. & Levental, I. Membrane raft association is a determinant of plasma membrane localization. Proc Natl Acad Sci U S A 111, 8500-8505 (2014). PMC4060687

      (3) Hirschberg, K. et al. Kinetic analysis of secretory protein traffic and characterization of Golgi to plasma membrane transport intermediates in living cells. J Cell Biol 143, 1485-1503 (1998). PMC2132993

      (4) Levental, K. R. & Levental, I. Giant plasma membrane vesicles: models for understanding membrane organization. Current Topics in Membranes 75, 25-57 (2015)

      (5) Sezgin, E. et al. Elucidating membrane structure and protein behavior using giant plasma membrane vesicles. Nat Protoc 7, 1042-1051 (2012)

      (6) Komura, N. et al. Raft-based interactions of gangliosides with a GPI-anchored receptor. Nat Chem Biol 12, 402-410 (2016)

      (7) Kinoshita, M. et al. Raft-based sphingomyelin interactions revealed by new fluorescent sphingomyelin analogs. J Cell Biol 216, 1183-1204 (2017). PMC5379944

      (8) Stone, M. B., Shelby, S. A., Nunez, M. F., Wisser, K. & Veatch, S. L. Protein sorting by lipid phase-like domains supports emergent signaling function in B lymphocyte plasma membranes. eLife 6 (2017). PMC5373823

      (9) Machta, B. B. et al. Conditions that Stabilize Membrane Domains Also Antagonize n-Alcohol Anesthesia. Biophys J 111, 537-545 (2016)

      (10) Shelby, S. A., Castello-Serrano, I., Wisser, I., Levental, I. & Veatch, S. L. Membrane phase separation drives protein organization at BCR clusters. Nat Chem Biol, in press (2023)

      (11) Diaz-Rohrer, B. et al. Rab3 mediates a pathway for endocytic sorting and plasma membrane recycling of ordered microdomains. Proc Natl Acad Sci U S A 120, e2207461120 (2023)

      (12) Veatch, S. L. et al. Critical fluctuations in plasma membrane vesicles. ACS Chem Biol 3, 287-293 (2008)

      (13) Wang, H. Y. et al. Coupling of protein condensates to ordered lipid domains determines functional membrane organization. Science Advances 9, eadf6205 (2023). PMC10132753

    1. a) What is the return period corresponding to an exceedance probability of 99%? b) Determine the annual maxima and rank them from highest to lowest.

      The answers to exercise 7.4a and b should also be provided as Python code.
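      A minimal Python sketch for both parts follows. Since the exercise's dataset is not reproduced here, part (b) simulates a hypothetical daily discharge series; substitute the actual record when solving the exercise.

```python
import numpy as np
import pandas as pd

# (a) Return period T corresponding to an exceedance probability P: T = 1 / P.
# For P = 99%, the event is exceeded almost every year, so T is ~1.01 years.
P = 0.99
T = 1 / P
print(f"Return period: {T:.2f} years")

# (b) Annual maxima, ranked from highest to lowest.
# Hypothetical daily discharge series (the exercise's data are not given here).
dates = pd.date_range("2000-01-01", "2004-12-31", freq="D")
rng = np.random.default_rng(0)
flow = pd.Series(rng.gamma(2.0, 50.0, size=len(dates)), index=dates)

annual_max = flow.groupby(flow.index.year).max()    # one maximum per year
ranked = annual_max.sort_values(ascending=False)    # highest to lowest
print(ranked)
```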

    1. Reviewer #1 (Public Review):

      Summary:

      Li and colleagues describe an experiment whereby sequences of dots in different locations were presented to participants while electroencephalography (EEG) was recorded. By presenting fixed sequences of dots in different locations repeatedly to participants, the authors assumed that participants had learned the sequences during the experiment. The authors also trained classifiers using event-related potential (ERP) data recorded from separate experimental blocks of dots presented in a random (i.e., unpredictable) order. Using these trained classifiers, the authors then assessed whether patterns of brain activity could be detected that resembled the neural response to a dot location that was expected, but not presented. They did this by presenting an additional set of sequences whereby only one of the dots in the learned sequence appeared, but not the other dots. They report that, in these sequences with omitted stimuli, patterns of EEG data resembled the visual response evoked by a dot location for stimuli that could be expected, but were not presented. Importantly, this only occurred for an omitted dot stimulus that would be expected to appear immediately after the dot that was presented in these partial sequences.

      This exciting finding complements previous demonstrations of the ability to decode expected (but not presented) stimuli in Blom et al. (2020) and Robinson et al. (2020) that are cited in this manuscript. It suggests that the visual system is able to generate patterns of activity that resemble expected sensory events, approximately at times at which an observer would expect them.

      Strengths:

      The experiment was carefully designed and care was taken to rule out some confounding factors. For example, gaze location was tracked over time, and deviations from fixation were marked, in order to minimise the contributions of saccades to above-chance decoding of dot position. The use of a separate block of dots (with unpredictable locations) to train the classifiers was also useful in isolating visual responses evoked by each dot location independently of any expectations that might be formed during the experiment. A large amount of data was also collected from each participant, which is important when using classifiers to decode stimulus features from EEG data. This careful approach is commendable and draws on best practices from existing work.

      Weaknesses:

      While there was clear evidence of careful experiment design, there are some aspects of the data analysis and results that significantly limit the inferences that can be drawn from the data. Both issues raised here relate to the use of pre-stimulus baselines and associated problems. As these issues are somewhat technical and may not be familiar to many readers, I will try to unpack each line of reasoning below. Here, it should be noted that these problems are complex, and similar issues often go undetected even by highly experienced EEG researchers.

      Relevant to both issues, the authors derived segments of EEG data relative to the time at which each dot was presented in the sequences (or would have appeared when the stimuli were omitted in the partial sequences). Segments were derived that spanned -100ms to 300ms relative to the actual or expected onset of the dot stimulus. The 300ms post-stimulus time period corresponds to the duration of each dot in the sequence (100ms) plus the inter-stimulus interval (ISI) that was 200ms in duration before the next dot appeared (or would be expected to appear in the partial sequences). Importantly, a pre-stimulus baseline was applied to each of these segments of data, meaning that the average amplitude at each electrode between -100ms and 0ms relative to (actual or expected) stimulus onset was subtracted from each segment of data (i.e., each epoch in common EEG terminology). While this type of baseline subtraction procedure is commonplace in EEG studies, in this study design it is likely to cause problematic effects that could plausibly lead to the patterns of results reported in this manuscript.

      First of all, the authors compare event-related potentials (ERPs) evoked by dots in the full as compared to partial sequences, to test a hypothesis relating to attentional tuning. They reported ERP amplitude differences across these conditions, for epochs corresponding to when a dot was presented to a participant (i.e., excluding epochs time-locked to omitted dots). However, these ERP comparisons are complicated by the fact that, in the full sequences, dot presentations are preceded by the presentation of other dots in the sequence. This means that ERPs evoked by the preceding dots in the full sequences will overlap in time with the ERPs corresponding to the dots presented at the zero point in the derived epochs. Importantly, this overlap would not occur in the partial sequence conditions, where only one dot was presented in the sequence. This essentially makes any ERP comparisons between full and partial sequences very difficult to interpret, because it is unclear if ERP differences are simply a product of overlapping ERPs from previously presented dots in the full sequence conditions. For example, there are statistically significant differences observed even in the pre-stimulus baseline period for this ERP analysis, which likely reflects the contributions of ERPs evoked by the preceding dots in the full sequences, which are absent in the partial sequences.

      The problems with interpreting this data are also compounded by the use of pre-stimulus baselines as described above. Importantly, the use of pre-stimulus baselines relies on the assumption that the ERPs in the baseline period (here, the pre-stimulus period) do not systematically differ across the conditions that are compared (here, the full vs. partial sequences). This assumption is violated due to the overlapping ERPs issue described just above. Accordingly, the use of the pre-stimulus baseline subtraction can produce spurious effects in the time period after stimulus onset (for examples see Feuerriegel & Bode, 2022, Neuroimage). This also makes it very difficult to meaningfully compare the ERPs following dot stimulus onset in these analyses.

      The second issue relates to the use of pre-stimulus baselines and concerns the key finding reported in the paper: that EEG patterns corresponding to expected but omitted events can be decoded in the partial sequences. In the partial sequences, there are two critical epochs that were derived: One time-locked to the presentation of the dot, and another that was time-locked to 300ms after the dot was presented (i.e. when the next dot would be expected to appear). The latter epoch was used to test for representations of expected, but omitted, stimulus locations.

      For the epochs in which the dots were presented, above-chance decoding can be observed spanning a training time range from around 100-300ms and a testing time range of a similar duration (see the plot in Figure 4b). This plot indicates that, during the time window of around 200-300ms following dot stimulus onset, the position of the dot can be decoded not only from trained classifiers using the same time windows spanning 200-300ms, but also using classifiers trained using earlier time windows of around 100-200ms.

      This is important because the 200-300ms time period after dot onset in the partial sequences is the window used for pre-stimulus baseline subtraction when deriving epochs corresponding to the first successor representation (i.e., the first stimulus that might be expected to follow from the presented dot, but did not actually appear). In other words, the 200-300ms time window from dot onset corresponds to the -100 to 0 ms time window in the first successor epochs. Accordingly, the pattern that is indicative of the preceding, actually presented dot position would be subtracted from the EEG data used to test for the successor representation. Notably, the first successor condition would always be in another visual field quadrant (90-degree rotated or the opposite quadrant) as stated in the methods. In other words, the omitted stimulus would be expected to appear in the opposite vertical and/or horizontal visual hemifield as compared to the previously presented dot in these partial sequences.

      This is relevant because ERPs tend to show reversed polarity across hemifields. For example, a stimulus presented in the right hemifield will have reversed polarity patterns at the same electrode as compared to an equivalent stimulus presented in the left hemifield (e.g., Supplementary Figure 3 in the comparable study of Blom et al., 2020). By subtracting the ERP patterns evoked by the presented dot in the partial sequences during the time period of 200-300ms (corresponding to the -100 to 0ms baseline window), this would be expected to bias patterns of EEG data in the first successor epochs to resemble stimulus positions in opposite hemifields. This could plausibly produce above-chance decoding accuracy in the time windows identified in Figure 5a, where the training time windows broadly correspond to the periods of above-chance decoding during 200-300ms from dot stimulus onset in Figure 4b.
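      The mechanism described above can be illustrated with a toy numerical sketch (hypothetical amplitudes, not the study's data): if the baseline window itself carries a condition-specific signal, subtracting its mean injects a polarity-reversed copy of that signal into every post-stimulus sample.

```python
import numpy as np

# Epoch spanning -100..300 ms at 1 kHz sampling: samples 0-99 form the
# pre-stimulus baseline window (-100..0 ms).
epoch = np.zeros(400)
baseline = slice(0, 100)

# Suppose the previously presented dot's ERP appears only in the baseline
# window (a hypothetical 2 uV offset at one electrode):
epoch[baseline] = 2.0

# Standard baseline correction: subtract the mean of the baseline window.
corrected = epoch - epoch[baseline].mean()

# The post-stimulus period, which originally contained no signal at all,
# now holds a spurious -2 uV "response" created purely by the subtraction.
print(corrected[200])  # -2.0
```

Because the omitted stimulus was always expected in the opposite hemifield, such an injected, polarity-reversed pattern could mimic a genuine successor representation for a classifier.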

      In other words, the above-chance decoding of the first successor representation may plausibly be an artefact of the pre-stimulus baseline subtraction procedure used when deriving the epochs. This casts some doubt as to whether genuine successor representations were actually detected in the study. Additional tests for successor representations using ERP baselines prior to the presented dot in the partial sequences may be able to get around this, but such analyses were not presented, and the code and data were not accessible at the time of this review.

      Although the study is designed well and a great amount of care was taken during the analysis stage, these issues with ERP overlap and baseline subtraction raise some doubts regarding the interpretability of the findings in relation to the analyses currently presented.

    1. Reviewer #2 (Public Review):

      Summary:

      The authors investigate axonal and synapse development in two distinct visual feature-encoding neurons (VPN), LC4 and LPLC2. They first show that they occupy distinct regions on the GF dendrites, and likely arrive sequentially. Analysis of the VPNs' morphology throughout development, and synaptic gene and protein expression data reveals the temporal order of maturation. Functional analysis then shows that LPLC2 occupancy of the GF dendrites is constrained by LC4 presence.

      Strengths:

      The authors investigate an interesting and very timely topic, which will help to understand how neurons coordinate their development. The manuscript is very well written, and the data are of high quality and generally support the conclusions drawn (but see some comments for Fig. 2 below). A thorough descriptive analysis of the LC4/LPLC2 to GF connectivity is followed by some functional assessment showing that one neuron's occupancy of the GF dendrite depends on another.<br /> The manuscript uses versatile methods to look at membrane contact, gene and protein expression (using scRNAseq data and state-of-the-art genetic tools), and functional neuronal properties. I find it especially interesting and elegant how the authors combine their findings to highlight the temporal trajectory of development in this system.

      Weaknesses:

      After reading the summary, I was expecting a more comprehensive analysis of many VPNs, and their developmental relationships. For a better reflection of the data, the summary could state that the authors investigate *two* visual projection neurons (VPNs) and that ablation *of one cell type of VPNs* results in the expansion of the remaining VPN territory.

      The manuscript is falling a bit short of putting the results into the context of what is known about synaptic partner choice/competition between different neurons during neuronal or even visual system development. Lots of work has been done in the peripheral visual system, from the Hiesinger lab and others. Both the introduction and the discussion section should elaborate on this.

      The one thing that the manuscript does not unambiguously show is when the connections between LC4 and LPLC2 become functional.

      Figure 2:

      Figure 2A-C: I found the text related to this figure hard to follow, especially when talking about filopodia. Overall, live imaging would probably clarify at which time points there really are dynamic filopodia. For this study, high-magnification images of what the authors define as filopodia would certainly help.

      L137ff: This section talks about filopodia between 24-48h APF, but only 36h APF is shown in A, where one can see filopodia. The other time points are shown in B and C, but the number of filopodia is not quantified.

      L143: "filopodia were still present, but visibly shorter": This is hard to see and, again, not quantified.

      L144f: "from 72h APF to eclosion, the volume of GF dendrites significantly decreased": this is not actually quantified; comparisons are only made to 24, 36 and 48h APF. Furthermore, 72h APF is not shown here but in Figure 2D, so either show it here or reference that figure panel already.

      Figure 2D/E: to strengthen the point that LC4 and LPLC2 arrive sequentially, it would help to show all time points analyzed in Figure D/E.

      L208: "significant increase ... from 60h APF to 72h APF": according to the figure caption, this comparison is marked by "+" but there is no + in the figure itself.

      Figure 3:<br /> A key point of the manuscript is the sequential arrival of different VPN classes. So then why is the scRNAseq analysis in Figure 3 shown pooled across VPNs? Certainly, the reader at this point is interested in temporal differences in gene expression. The class-specific data are somewhat hidden in Supp. Fig. 9, and actually do not show temporal differences. This finding should be presented in the main data.

      L438: "silencing LC4 by expressing Kir2.1... reduced the GF response": Is this claim backed by some quantification?

      Figure 4K: Do the control data have error bars, which are just too small to see? And what is tested against what? Is blue vs. black quantified as well? What do red, blue, and black asterisks indicate? Please clarify in figure caption.

      Optogenetics is mentioned in the methods (in "fly rearing", in the genotypes, and there is an extra "Optogenetics" section), but no such data are shown in the manuscript. (If the authors have those data, it would be great to know when the VPN>GF connections become functional!)

      Methods:

      Antibody concentrations are not given anywhere; this would be useful information for the reader.

      Could the authors please give more details on the re-analysis of the scRNAseq dataset? How did you identify cell type clusters in there, for example?

      L785 and L794: I am curious. Why is it informative to mention what was *not* done?

      Custom-written analysis code is mentioned in a few places. Is this code publicly available?

    1. Reviewer #1 (Public Review):

      Summary:

      The manuscript gives a broad overview of how to write NeuroML, and a brief description of how to use it with different simulators and for different purposes - cells to networks, simulation, optimization, and analysis. From this perspective, it can be an extremely useful document to introduce new users to NeuroML.

      However, the manuscript itself seems to lose sight of this goal in many places, and instead, the description at times seems to target software developers. For example, there is a long paragraph on the board and user community. The discussion on simulator tools seems more for developers than users. All the information presented at the level of a developer is likely to be distracting to readers.

      Strengths:

      The modularity of NeuroML is indeed a great advantage. For example, the ability to specify the channel file allows different channels to be used with different morphologies without redundancy. The hierarchical nature of NeuroML also is commendable, and well illustrated in Figures 2a through c.

      The number of tools available to work with NeuroML is impressive.

      The abstract, beginning, and end of the manuscript present and discuss incorporating NeuroML into research workflows to support FAIR principles.

      Having a Python API and providing examples using this API is fantastic. Exporting to NeuroML from Python is also a great feature.

      Weaknesses:

      Though modularity is a strength, it is unclear to me why the cell morphology isn't also treated similarly, i.e., why one cannot specify the morphology of a multi-compartmental model in a separate file, allow the cell file to specify not only the files containing channels but also the file containing the multi-compartmental morphology, and then specify the conductances for different segment groups. Also, after pynml_write_neuroml2_file, you would not have a super long NeuroML file for each variation of conductances, since there would be no need to rewrite the multi-compartmental morphology for each conductance variation.

      This would be especially important for optimizations, if each trial optimization wrote out the neuroML file, then including the full morphology of a realistic cell would take up excessive disk space, as opposed to just writing out the conductance densities. As long as cell morphology must be included in every cell file, then NeuroML is not sufficiently modular, and the authors should moderate their claim of modularity (line 419) and building blocks (551). In addition, this is very important for downloading NeuroML-compliant reconstructions from NeuroMorpho.org. If the cell morphology cannot be imported, then the user has to edit the file downloaded from NeuroMorpho.org, and provenance can be lost. Also, Figure 2d loses the hierarchical nature by showing ion channels, synapses, and networks as separate main branches of NeuroML.

      In Figure 5, the difference between the core and native simulators is unclear. What is involved in the helper scripts? I thought NEURON could read NeuroML? If so, why do you need to export simulator-specific scripts? In addition, it seems strange to call something the "core" simulation engine when it cannot support multi-compartmental models. It is unclear why "other simulators" that natively support NeuroML cannot be called the core. It might be more helpful to replace this sort of classification with a user-targeted description. The authors already state which simulators support NeuroML and which ones need code to be exported. In contrast, lines 369-370 mention that not all NeuroML models are supported by each simulator. I recommend expanding this to explain which features are supported in each simulator. Then, the unhelpful separation between core and native could be eliminated.

      The body of the manuscript has so much other detail that I lose sight of how NeuroML supports FAIR. It is also unclear who the intended audience is. When I get to lines 336-344, it seems that this description is too detailed for the audience. The paragraph beginning on line 691 is a great example of being unclear about who the audience is. Does someone wanting to develop NeuroML models need to understand the XSD schema? If so, the explanation is not clear: the XSD schema is never defined; instead, the text explains NeuroML-specific aspects of XSD. Lines 734-735 are another example of explaining to code developers (not model developers).

    2. Reviewer #2 (Public Review):

      Summary:

      Developing neuronal models that are shareable, reproducible, and interoperable allows the neuroscience community to make better use of published models and to collaborate more effectively. In this manuscript, the authors present a consolidated overview of the NeuroML model description system along with its associated tools and workflows. They describe where different components of this ecosystem lay along the model development pathway and highlight resources, including documentation and tutorials, to help users employ this system.

      Strengths:

      The manuscript is well-organized and clearly written. It effectively uses the delineated model development life cycle steps, presented in Figure 1, to organize its descriptions of the different components and tools relating to NeuroML. It uses this framework to cover the breadth of the software ecosystem and categorize its various elements. The NeuroML format is clearly described, and the authors outline the different benefits of its particular construction. As primarily a means of describing models, NeuroML also depends on many other software components to be of high utility to computational neuroscientists; these include simulators (ones that both pre-date NeuroML and those developed afterwards), visualization tools, and model databases.

      Overall, the rationale for the approach NeuroML has taken is convincing and well-described. The pointers to existing documentation, guides, and the example usages presented within the manuscript are useful starting points for potential new users. This manuscript can also serve to inform potential users of features or aspects of the ecosystem that they may have been unaware of, which could lower obstacles to adoption. While much of what is presented is not new to this manuscript, it still serves as a useful resource for the community looking for information about an established, but perhaps daunting, set of computational tools.

      Weaknesses:

      The manuscript in large part catalogs the different tools and functionalities that have been produced through the long development cycle of NeuroML. As discussed above, this is quite useful, but it can still be somewhat overwhelming for a potential new user of these tools. There are new user guides (e.g., Table 1) and example code (e.g. Box 1), but it is not clear if those resources employ elements of the ecosystem chosen primarily for their didactic advantages, rather than general-purpose utility. I feel like the manuscript would be strengthened by the addition of clearer recommendations for users (or a range of recommendations for users in different scenarios).

      For example, is the intention that most users should primarily use the core NeuroML tools and expand into the wider ecosystem only under particular circumstances? What are the criteria to keep in mind when making that decision to use alternative tools (scale/complexity of model, prior familiarity with other tools, etc.)? The place where it seems most ambiguous is in the choice of simulator (in part because there seem to be the most options there) - are there particular scenarios where the authors may recommend using simulators other than the core jNeuroML software?

      The interoperability of NeuroML is a major strength, but it does increase the complexity of choices facing users entering into the ecosystem. Some clearer guidance in this manuscript could enable computational neuroscientists with particular goals in mind to make better strategic decisions about which tools to employ at the outset of their work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Connelly and colleagues provide convincing genetic evidence that importation from mainland Tanzania is a major source of Plasmodium falciparum lineages currently circulating in Zanzibar. This study also reveals ongoing local malaria transmission and occasional near-clonal outbreaks in Zanzibar. Overall, this research highlights the role of human movements in maintaining residual malaria transmission in an area targeted for intensive control interventions over the past decades and provides valuable information for epidemiologists and public health professionals.

      Reviewer #1 (Public Review):

      Zanzibar archipelago is close to achieving malaria elimination, but despite the implementation of effective control measures, there is still low-level seasonal malaria transmission. This could be due to the frequent importation of malaria from mainland Tanzania and Kenya, reservoirs of asymptomatic infections, and competent vectors. To investigate population structure and gene flow of P. falciparum in Zanzibar and mainland Tanzania, the authors used 178 samples from mainland Tanzania and 213 from Zanzibar that were previously sequenced using molecular inversion probe (MIP) panels targeting single nucleotide polymorphisms (SNPs). They performed Principal Component Analysis (PCA) and identity-by-descent (IBD) analysis to assess genetic relatedness between isolates. Parasites from coastal mainland Tanzania contribute to the genetic diversity of the parasite population in Zanzibar. Despite this, there is a pattern of isolation by distance and microstructure within the archipelago, and evidence of local sharing of highly related strains sustaining malaria transmission in Zanzibar, which are important targets for interventions such as mass drug administration and vector control, in addition to measures against imported malaria.

      Strengths:

      This study presents important samples to understand population structure and gene flow between mainland Tanzania and Zanzibar, especially from the rural Bagamoyo District, where malaria transmission persists and there is a major port of entry to Zanzibar. In addition, this study includes a larger set of SNPs, providing more robustness for analyses such as PCA and IBD. Therefore, the conclusions of this paper are well supported by data.

      Weaknesses:

      Some points need to be clarified:

      (1) SNPs in linkage disequilibrium (LD) can introduce bias in PCA and IBD analysis. Were SNPs in LD filtered out prior to these analyses?

      Thank you for this point. We did not filter SNPs in LD prior to this analysis. In the PCA analysis in Figure 1, we did restrict to a single isolate among those that were clonal (high IBD values) to prevent bias in the PCA. In general, in the absence of selective forces, disequilibrium is minimal beyond small distances (<5-10 kb), which is much less than the average spacing of the markers in the panel. Given minimal LD, the conclusions drawn on relative levels and connections at high IBD are unlikely to be confounded by any effects of disequilibrium.
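      To make the pruning step concrete, here is a minimal greedy sketch (in Python, for illustration only; the study's actual pipeline used R/MIPanalyzer, and the function name and toy IBD matrix below are invented) of restricting near-clonal isolates to a single representative before PCA:

```python
import numpy as np

def prune_clonal(ibd, threshold=0.9):
    """Greedily keep one representative from each group of
    near-clonal isolates (pairwise IBD above `threshold`).
    `ibd` is a symmetric (n x n) pairwise IBD matrix."""
    n = ibd.shape[0]
    keep, dropped = [], set()
    for i in range(n):
        if i in dropped:
            continue
        keep.append(i)
        # drop every later isolate near-clonal with the kept one
        for j in range(i + 1, n):
            if ibd[i, j] > threshold:
                dropped.add(j)
    return keep

# toy example: isolates 0 and 1 are near-clonal (IBD = 0.95)
ibd = np.array([[1.0, 0.95, 0.1],
                [0.95, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
print(prune_clonal(ibd))  # -> [0, 2]
```

      The greedy order is arbitrary; any single representative per clonal group prevents the cluster of near-identical genotypes from dominating the principal components.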

      (2) Many IBD algorithms do not handle polyclonal infections well, although an increasing number of algorithms are able to handle polyclonal infections and multiallelic SNPs. How were polyclonal samples handled for the IBD analysis?

      Thank you for this point. We added lines 157-161 to clarify. This section now reads:

      “To investigate genetic relatedness of parasites across regions, identity by descent (IBD) estimates were assessed using the within sample major alleles (coercing samples to monoclonal by calling the dominant allele at each locus) and estimated utilizing a maximum likelihood approach using the inbreeding_mle function from the MIPanalyzer package (Verity et al., 2020). This approach has previously been validated as a conservative estimate of IBD (Verity et al., 2020).”

      Please see the supplement in (Verity et al., 2020) for an extensive simulation study that validates this approach.
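      The major-allele coercion described above can be sketched as follows (Python, illustrative only; the study used the R MIPanalyzer package, and `major_allele_calls` and the toy frequency matrix are hypothetical names and data):

```python
import numpy as np

def major_allele_calls(wsaf):
    """Coerce polyclonal samples to 'monoclonal' by calling the
    dominant (major) allele at each locus. `wsaf` is a
    (samples x loci) matrix of within-sample alternate-allele
    frequencies; returns 0/1 calls, leaving missing loci as NaN."""
    calls = np.where(wsaf >= 0.5, 1.0, 0.0)
    return np.where(np.isnan(wsaf), np.nan, calls)

# two samples, three loci; NaN marks a locus with no coverage
wsaf = np.array([[0.9, 0.3, np.nan],
                 [0.5, 0.1, 0.8]])
print(major_allele_calls(wsaf))
```

      The resulting binary calls can then be fed into a monoclonal IBD estimator such as the maximum-likelihood approach cited above.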

      Reviewer #1 (Recommendations For The Authors):

      (3) I think Supplementary Figures 8 and 9 are more visually informative than Figure 2.

      Thank you for your response. We performed the analysis in Figure 2 to show how IBD varies between different regions and is higher within a region than between.

      Reviewer #2 (Public Review):

      This manuscript describes P. falciparum population structure in Zanzibar and mainland Tanzania. 282 samples were typed using molecular inversion probes. The manuscript is overall well-written and shows a clear population structure. It follows a similar manuscript published earlier this year, which typed a similar number of samples collected mostly in the same sites around the same time. The current manuscript extends this work by including a large number of samples from coastal Tanzania, and by including clinical samples, allowing for a comparison with asymptomatic samples.

      The two studies made overall very similar findings, including strong small-scale population structure, related infections on Zanzibar and the mainland, near-clonal expansion on Pemba, and frequency of markers of drug resistance. Despite these similarities, the previous study is mentioned a single time in the discussion (in contrast, the previous research from the authors of the current study is more thoroughly discussed). The authors missed an opportunity here to highlight the similar findings of the two studies.

      Thank you for your insights. We appreciated the level of detail of your review and it strengthened our work. We have input additional sentences on lines 292-295, which now reads:

      “A recent study investigating population structure in Zanzibar also found local population microstructure in Pemba (Holzschuh et al., 2023). Further, both studies found near-clonal parasites within the same district, Micheweni, and found population microstructure over Zanzibar.”

      Strengths:

      The overall results show a clear pattern of population structure. The finding of highly related infections detected in close proximity shows local transmission and can possibly be leveraged for targeted control.

      Weaknesses:

      A number of points need clarification:

      (1) It is overall quite challenging to keep track of the number of samples analyzed. I believe the number of samples used to study population structure was 282 (line 141), thus this number should be included in the abstract rather than 391. It is unclear where the number 232 on line 205 comes from; I failed to deduce this number from Supplementary Table 1.

      Thank you for this point. We have included 282 instead of 391 in the abstract. We added a statement in the results at lines 203-205 to clarify this point, which now reads:

      “PCA analysis of 232 coastal Tanzanian and Zanzibari isolates, after pruning 51 samples with an IBD of greater than 0.9 to one representative sample, demonstrates little population differentiation (Figure 1A).”

      (2) Also, Table 1 and Supplementary Table 1 should be swapped. It is more important for the reader to know the number of samples included in the analysis (as given in Supplementary Table 1) than the number collected. Possibly, the two tables could be combined in a clever way.

      Thank you for this advice. Rather than switch to another table altogether, we appended two columns to the original table to better portray the information (see Table 1).

      Methods

      (3) The authors took the somewhat unusual decision to apply K-means clustering to GPS coordinates to determine how to combine their data into a cluster. There is an obvious cluster on Pemba islands and three clusters on Unguja. Based on the map, I assume that one of these three clusters is mostly urban, while the other two are more rural. It would be helpful to have a bit more information about that in the methods. See also comments on maps in Figures 1 and 2 below.

      Cluster 3 is a mix of rural/urban, while clusters 2, 4 and 5 are mostly rural. This analysis was performed to see how IBD changes in relation to local context within different regions in Zanzibar, showing that there is higher IBD within a locale than between locales.

      (4) Following this point, in Supplemental Figure 5 I fail to see an inflection point at K=4. If there is one, it is so weak that it is hardly informative. I think selecting 4 clusters in Zanzibar is fine, but the justification based on this figure is unclear.

      The K-means clustering experiment was used to cluster a continuous space of geographic coordinates in order to compare genetic relatedness in different regions. We selected K = 4 as the inflection point based on the elbow plot and chose this number to obtain sufficient subsections of Zanzibar for comparing genetic relatedness. This point is added to the methods at lines 174-178, which now read:

      “The K-means clustering experiment was used to cluster a continuous space of geographic coordinates in order to compare genetic relatedness in different regions. We selected K = 4 as the inflection point based on the elbow plot (Supplemental Figure 5) and chose this number to obtain sufficient subsections of Zanzibar for comparing genetic relatedness.”
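      A minimal, self-contained sketch of the clustering-plus-elbow procedure (Python with a hand-rolled Lloyd's algorithm so the example stays dependency-free; the coordinates and function names are invented, not the authors' actual shehia data):

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means (Lloyd's algorithm) on 2-D coordinates.
    Returns (centroids, labels, within-cluster sum of squares)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        d = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    wss = sum(((points[labels == c] - centroids[c]) ** 2).sum()
              for c in range(k))
    return centroids, labels, wss

# two well-separated groups of (lat, lon)-like coordinates
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
# elbow plot data: WSS for k = 1..4; the drop flattens once k
# matches the true number of spatial groups
for k in range(1, 5):
    print(k, round(kmeans(pts, k)[2], 3))
```

      Plotting WSS against k and picking the value where the curve flattens is the "elbow" heuristic referenced in the revised methods text.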

      (5) For the drug resistance loci, it is stated that "we further removed SNPs with less than 0.005 population frequency." Was the denominator for this analysis the entire population, or were Zanzibar and mainland samples assessed separately? If the latter, as for all markers <200 samples were typed per site, there could not be a meaningful way of applying this threshold. Given data were available for 200-300 samples for each marker, does this simply mean that each SNP needed to be present twice?

      Population frequency is calculated as the average within-sample allele frequency across individuals in the population, which is an unbiased estimator. Within-sample allele frequency can range from 0 to 1. Thus, in a population of 100 samples, if only one sample carries an allele at a within-sample frequency of 0.1, the population allele frequency would be 0.1/100 = 0.001. This allele is removed even though it would have a prevalence of 0.01 (1/100). This filtering occurs prior to any final summary frequency or prevalence calculations (see the MIP Variant Calling and Filtering section in the methods). This protects against errors occurring only at low frequency.
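      The worked example above can be expressed directly in code (a hedged sketch only; the matrix is synthetic and the names are not from the authors' pipeline):

```python
import numpy as np

def population_frequency(wsaf):
    """Population allele frequency as the mean within-sample
    allele frequency across samples (the unbiased estimator
    described in the response)."""
    return np.nanmean(wsaf, axis=0)

# hypothetical panel: 100 samples, one SNP carried by a single
# sample at 0.1 within-sample frequency
wsaf = np.zeros((100, 1))
wsaf[0, 0] = 0.1

freq = population_frequency(wsaf)        # 0.1 / 100 = 0.001
prevalence = (wsaf > 0).mean(axis=0)     # 1 / 100 = 0.01
keep = freq >= 0.005                     # fails the 0.005 filter
print(freq[0], prevalence[0], keep[0])
```

      This illustrates why population frequency (0.001) and prevalence (0.01) can differ by an order of magnitude for a single low-frequency carrier.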

      Discussion:

      (6) I was a bit surprised to read the following statement, given Zanzibar is one of the few places that has an effective reactive case detection program in place: "Thus, directly targeting local malaria transmission, including the asymptomatic reservoir which contributes to sustained transmission (Barry et al., 2021; Sumner et al., 2021), may be an important focus for ultimately achieving malaria control in the archipelago (Björkman & Morris, 2020)." I think the current RACD program should be mentioned and referenced. A number of studies have investigated this program.

      Thank you for this point. We have added additional context and clarification on lines 275-280, which now reads:

      “Thus, directly targeting local malaria transmission, including the asymptomatic reservoir which contributes to sustained transmission (Barry et al., 2021; Sumner et al., 2021), may be an important focus for ultimately achieving malaria control in the archipelago (Björkman & Morris, 2020). Currently, a reactive case detection program within index case households is being implemented, but local transmission continues and further investigation into how best to control this is warranted (Mkali et al., 2023).”

      (7) The discussion states that "In Zanzibar, we see this both within and between shehias, suggesting that parasite gene flow occurs over both short and long distances." I think the term 'long distances' should be better defined. Figure 4 shows that highly related infections rarely span beyond 20-30 km. In many epidemiological studies, this would still be considered short distances.

      Thank you for this point. We have edited the text at lines 287-288 to indicate that highly related parasites mainly occur at the range of 20-30km, which now reads:

      “In Zanzibar, highly related parasites mainly occur over a range of 20-30 km.”

      (8) Lines 330-331: "Polymorphisms associated with artemisinin resistance did not appear in this population." Do you refer to background mutations here? Otherwise, the sentence seems to repeat lines 324. Please clarify.

      We are referring to the list of Pfk13 polymorphisms stated in the Methods from lines 146-148. We added clarifying text on lines 326-329:

      “Although polymorphisms associated with artemisinin resistance did not appear in this population, continued surveillance is warranted given the emergence of these mutations in East Africa and reports of rare resistance mutations on the coast consistent with spread of emerging Pfk13 mutations (Moser et al., 2021).”

      (9) Line 344: The opinion paper by Bousema et al. in 2012 was followed by a field trial in Kenya (Bousema et al, 2016) that found that targeting hotspots did NOT have an impact beyond the actual hotspot. This (and other) more recent finding needs to be considered when arguing for hotspot-targeted interventions in Zanzibar.

      We added a clarification on this point on lines 335-345, which now reads:

      “A recent study identified “hotspot” shehias, defined as areas with comparatively higher malaria transmission than other shehias, near the port of Zanzibar town and in northern Pemba (Bisanzio et al., 2023). These regions overlapped with shehias in this study with high levels of IBD, especially in northern Pemba (Figure 4). These areas of substructure represent parasites that differentiated in relative isolation and are thus important locales to target intervention to interrupt local transmission (Bousema et al., 2012). While a field cluster-randomized controlled trial in Kenya targeting these hotspots did not confer much reduction of malaria outside of the hotspot (Bousema et al., 2016), if areas are isolated pockets, which genetic differentiation can help determine, targeted interventions in these areas are likely needed, potentially through both mass drug administration and vector control (Morris et al., 2018; Okell et al., 2011). Such strategies and measures preventing imported malaria could accelerate progress towards zero malaria in Zanzibar.”

      Figures and Tables:

      (10) Table 2: Why not enter '0' if a mutation was not detected? 'ND' is somewhat confusing, as the prevalence is indeed 0%.

      Thank you for this point. We have entered zeros and also provided CIs for better detail.

      (11) Figure 1: Panel A is very hard to read. I don't think there is a meaningful way to display a 3D-panel in 2D. Two panels showing PC1 vs. PC2 and PC1 vs. PC3 would be better. I also believe the legend 'PC2' is placed in the wrong position (along the Y-axis of panel 2).

      Supplementary Figure 2B suffers from the same issue.

      Thank you for your comment. A revised Figure 1 and Supplemental Figure 2 are included, where there are separate plots for PC1 vs. PC2 and PC1 vs. PC3.

      (12) The maps for Figures 1 and 2 don't correspond. Assuming Kati represents cluster 4 in Figure 2, the name is put in the wrong position. If the grouping of shehias is different between the Figures, please add an explanation of why this is.

      Thank you for this point. The districts with at least 5 samples present are plotted in the map in Figure 1B. In Figure 2, a totally separate analysis was performed, where all shehias were clustered into separate groups with k-means and the IBD values were compared between these clusters. These maps are not supposed to match, as they are separate analyses. Figure 1B is at the district level and Figure 2 is clustering shehias throughout Zanzibar.

      The figure legend of Figure 1B on lines 410-414 now reads:

      “B) A Discriminant Analysis of Principal Components (DAPC) was performed utilizing isolates with unique pseudohaplotypes, pruning highly related isolates to a single representative infection. Districts with at least 5 isolates remaining were included, to have sufficient samples for the DAPC. For plotting the inset map, the district coordinates (e.g. Mainland, Kati, etc.) were calculated from the averages of the shehia centroids within each district.”

      The figure legend of Figure 2 on lines 417-425 now reads:

      “Figure 2. Coastal Tanzania and Zanzibari parasites have more highly related pairs within their given region than between regions. K-means clustering was performed on the geographic coordinates of all shehias present in the sample population to generate 5 clusters (colored boxes). All shehias were included to assess pairwise IBD differences throughout Zanzibar. Pairwise comparisons of within-cluster IBD (column 1 of the IBD distribution plots) and between-cluster IBD (columns 2-5 of the IBD distribution plots) were done for all clusters. In general, within-cluster comparisons contained more pairs with high IBD identity.”

      (13) Figure 2: In the main panel, please clarify what the lines indicate (median and quartiles?). It is very difficult to see anything except the outliers. I wonder whether another way of displaying these data would be clearer. Maybe a table with medians and confidence intervals would be better (or that data could be added to the plots). The current plots might be misleading as they are dominated by outliers.

      Thank you for this point and it greatly improved this figure. We changed the plotting mechanisms through using a beeswarm plot, which plots all pairwise IBD values within each comparison group.

      (14) In the inset, the cluster number should not only be given as a color code but also added to the map. The current version will be impossible to read for people with color vision impairment, and it is confusing for any reader as the numbers don't appear to follow any logic (e.g. north to south).

      Thank you very much for these considerations. We changed the color coding to a color blind friendly palette and renamed the clusters to more informative names; Pemba, Unguja North (Unguja_N), Unguja Central (Unguja_C), Unguja South (Unguja_S) and mainland Tanzania (Mainland).

      (15) The legend for Figure 3 is difficult to follow. I do not understand what the difference in binning was in panels A and B compared to C.

      Thank you for this point. We have edited the legend to reflect these changes. The legend for Figure 3 on lines 427-433 now reads:

      “Figure 3. Isolation by distance is shown between all Zanzibari parasites (A), only Unguja parasites (B) and only Pemba parasites (C). Samples were analyzed based on geographic location, Zanzibar (N=136) (A), Unguja (N=105) (B) or Pemba (N=31) (C), and great-circle (GC) distances between pairs of parasite isolates were calculated based on shehia centroid coordinates. These distances were binned in 4 km increments out to 12 km. IBD beyond 12 km is shown in Supplemental Figure 8. The maximum GC distance was 135 km for all of Zanzibar, 58 km on Unguja and 12 km on Pemba. The mean IBD and 95% CI are plotted for each bin.”
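      For illustration, the great-circle distance calculation and 4 km binning described in this legend can be sketched as follows (assuming the haversine formula on centroid latitude/longitude; the example coordinates are invented, not actual shehia centroids):

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def bin_index(distance_km, width_km=4, max_km=12):
    """Assign a pairwise distance to a 4 km bin (0-4, 4-8, 8-12)."""
    if distance_km >= max_km:
        return None  # beyond the binned range
    return int(distance_km // width_km)

# e.g. two hypothetical centroids ~11 km apart fall in the 8-12 km bin
d = great_circle_km(-5.10, 39.70, -5.20, 39.70)
print(round(d, 1), bin_index(d))
```

      Mean IBD and its 95% CI would then be computed over all isolate pairs falling in each bin.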

      (16) Font sizes for panel C differ, and it is not aligned with the other panels.

      Thank you for pointing this out. Figure 3 and Supplemental Figure 10 are adjusted with matching formatting for each plot.

      (17) Why is Kusini included in Supplemental Figure 4, but not in Figure 1?

      In Supplemental Figure 4, all isolates were used in this analysis and isolates with unique pseudohaplotypes were not pruned to a single representative infection. That is why there are additional isolates in Kusini. The legend for Supplemental Figure 4 now reads:

      “Supplemental Figure 4. PCA with highly related samples shows population stratification radiating from coastal Mainland to Zanzibar. PCA of 282 total samples was performed using whole sample allele frequency (A) and DAPC was performed after retaining samples with unique pseudohaplotypes in districts that had 5 or more samples present (B). As opposed to Figure 1, all isolates were used in this analysis and isolates with unique pseudohaplotypes were not pruned to a single representative infection.”

      (18) Supplemental Figures 6 and 7: What does the width of the line indicate?

      The sentence below was added to the figure legends of Supplemental Figures 6 and 7 and the legends of each network plot were increased in size:

      “The width of each line represents higher magnitudes of IBD between pairs.”

      (19) What was the motivation not to put these lines on the map, as in Figure 4A? This might make it easier to interpret the data.

      Thank you for this comment. For Supplemental Figures 8 and 9, we omitted the lines representing lower pairwise IBD in order to draw the reader's attention to the highly related pairs between and within shehias.

      Reviewer #2 (Recommendations For The Authors):

      (1) There is a rather long paragraph (lines 300-323) on COI of asymptomatic infections and their genetic structure. Given that the current study did not investigate most of the hypotheses raised there (e.g. immunity, expression of variant genes), and the overall limited number of asymptomatic samples typed, this part of the discussion feels long and often speculative.

      Thank you for your perspective. The key sections highlighted in this comment, regarding immunity and expression of variant genes, were shortened. This section on lines 300-303 now reads:

      “Asymptomatic parasitemia has been shown to be common in falciparum malaria around the globe and has been shown to have increasing importance in Zanzibar (Lindblade et al., 2013; Morris et al., 2015). What underlies the biology and prevalence of asymptomatic parasitemia in very low transmission settings where anti-parasite immunity is not expected to be prevalent remains unclear (Björkman & Morris, 2020).”

      (2) As a detail, line 304 mentions "few previous studies" but only one is cited. Are there studies that investigated this and found opposite results?

      Thank you for this comment. We added additional studies that did not find an association between clinical disease and COI. These changes are on lines 303-308, which now reads:

      “Similar to a few previous studies, we found that asymptomatic infections had a higher COI than symptomatic infections across both the coastal mainland and Zanzibar parasite populations (Collins et al., 2022; Kimenyi et al., 2022; Sarah-Matio et al., 2022). Other studies have found lower COI in severe vs. mild malaria cases (Robert et al., 1996) or no significant difference between COI based on clinical status (Earland et al. 2019; Lagnika et al. 2022; Conway et al. 1991; Kun et al. 1998; Tanabe et al. 2015)”

      (3) Table 2: Percentages need to be checked. To take one of several examples, for Pfk13-K189N a frequency of 0.019 for the mutant allele is given among 137 samples. 2/137 equals to 0.015, and 3/137 to 0.022. 0.019 cannot be achieved. The same is true for several other markers. Possibly, it can be explained by the presence of polyclonal infections. If so, it should be clarified what the total of clones sequenced was, and whether the prevalence is calculated with the number of samples or number of clones as the denominator.

      Thank you for this point. We mistakenly reported allele frequency instead of prevalence. An updated Table 2 is now in the manuscript. The method for calculating the prevalence is now at lines 148-151:

      “Prevalence was calculated separately in Zanzibar or mainland Tanzania for each polymorphism by the number of samples with alternative genotype calls for this polymorphism over the total number of samples genotyped and an exact 95% confidence interval was calculated using the Pearson-Klopper method for each prevalence.”
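The quoted calculation (samples with the alternative genotype over samples genotyped, with an exact interval; the "Pearson-Klopper" method referenced here appears to correspond to the Clopper-Pearson exact binomial interval) can be sketched as below. The 2-of-137 count is taken from the reviewer's arithmetic check, purely for illustration.

```python
from scipy.stats import beta

def prevalence_ci(k, n, conf=0.95):
    """Prevalence k/n with a Clopper-Pearson (exact) confidence interval.

    Edge cases: the lower bound is 0 when k == 0 and the upper bound is 1
    when k == n, by convention for the exact interval.
    """
    alpha = 1 - conf
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return k / n, lower, upper

# e.g. 2 of 137 genotyped samples carrying a polymorphism
p, lo, hi = prevalence_ci(2, 137)
```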

  5. inst-fs-iad-prod.inscloudgate.net
    1. “Should I sing a different song?” I ask. “No, hijo. No singing. All you do is jump and count, jump and count, okay? Every day you training, you trying to jump a little more.”

Throughout Jaime Cortez’s short story Gordo, we find instances where Gordo is assimilating to the mainstream image of a boy with masculine interests. This need to be masculine is pushed by his father and Cesar. The passage highlights how Gordo quickly code-switches from singing a girly song to doing something masculine, namely counting. Although his father instructed him to count, Gordo could have gone back to singing, but he did not, even though his first instinct was to sing a song about being a princess. Gordo wants to be more like his father and have his approval; he wants to please his father, so he crosses a boundary to be accepted.

    1. if 00:05:22 you've looked at the code in your company you'll realize that wow I've got millions of millions of lines of code there and I have more than a sneaking suspicion that a lot of that code is 00:05:34 actually in my way it doesn't represent the actual bang per line of code that we'd expect from a higher-level language

      sneaking suspicion

      lot code in my way

      no bang per line of code

// We do not get much out of higher-level languages because people do not appreciate that "the advantage of high-level languages is notational rather than computational" (John Allen, Anatomy of Lisp).

I learned this 40 years ago.

I only realise now that the problem is with programming languages and with software "engineering", which is nothing of the sort.

The lesson from the first NATO conference on the Software Crisis

ended up identifying that what we do is not engineering,

that engineering was what was needed but clearly out of sight,

yet the pretence persisted and we kept kidding ourselves that what we do is engineering.

The reason is that programming languages are constituted in terms of means of primitives, means of abstraction and means of combination,

whereas what is needed is to raise the level of expressive power of our notations by building everything that is needed into a coherent, complex, self-organizing system that supports the complex ways of articulation that the task calls for.

Articulating intent to the point where it is amenable to actually run on a machine.

    1. Author response:

      Reviewer #3 (Public Review):

      Software UX design is not a trivial task and a point-and-click interface may become difficult to use or misleading when such design is not very well crafted. While Phantasus is a laudable effort to bring some of the out-of-the box transcriptomics workflows closer to the broader community of point-and-click users, there are a number of shortcomings that the authors may want to consider improving.

      Thank you for such an in-depth review. We really appreciate this feedback and have tried to address all of the concerns in the new version of Phantasus.

      Here I list the ones I found running Phantasus locally through the available Bioconductor package:

      (1) The feature of loading in one click one of the thousands of available GEO datasets is great. However, one important use of any such interfaces is the possibility for the users to analyze his/her own data. One of the standard formats for storing tables of RNA-seq counts are CSV files. However, if we try to upload from the computer a CSV file with expression data, such as the counts stored in the file GSE120660_PCamerge_hg38.csv.gz from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE120660, a first problem is that the system does not recognize that the CSV file is compressed. A second problem is that it does not recognize that values are separated by commas, the very original CSV format, giving a cryptic error "columnVector is undefined". If we transform the CSV format into tab-separated values (TSV) format, then it works, but this constitutes already a first barrier for the target user of Phantasus.

Thank you for highlighting this issue with file format support. We acknowledge how common CSV and CSV.gz files are in gene expression analysis. In response, we have updated our data loading procedure to support these file formats. Moreover, the most recent version of the web application recognizes gzip-archived files in any of the supported table formats: GCT, TSV, CSV and XLSX.
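As a generic illustration of the kind of detection involved (a sketch only, not Phantasus's actual implementation), gzip files can be recognized by their two-byte magic number, and the delimiter can be sniffed from the first lines of the table:

```python
import csv
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def read_table_text(path):
    """Return the decoded text of a table file, transparently handling gzip."""
    with open(path, "rb") as fh:
        head = fh.read(2)
    opener = gzip.open if head == GZIP_MAGIC else open
    with opener(path, "rt") as fh:
        return fh.read()

def sniff_delimiter(text):
    """Guess whether a table is comma- or tab-separated from its first lines."""
    sample = "\n".join(text.splitlines()[:5])
    return csv.Sniffer().sniff(sample, delimiters=",\t").delimiter

# Demo with hypothetical file contents (not real GEO data)
tmp = tempfile.mkdtemp()
plain = os.path.join(tmp, "expr.csv")
packed = os.path.join(tmp, "expr.csv.gz")
data = "gene,s1,s2\nactb,10,12\n"
with open(plain, "w") as f:
    f.write(data)
with gzip.open(packed, "wt") as f:
    f.write(data)
```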

      (2) Many RNA-seq processing pipelines use Ensembl annotations, which for the purpose of downstream interpretation of the analysis, need to be translated into HUGO gene symbols. When I try to annotate the rows to translate the Ensembl gene identifiers, I get the error

      "There is no AnnotationDB on server. Ask administrator to put AnnotationDB sqlite databases in cacheDir/annotationdb folder"

Thank you for revealing this issue. Indeed, locally installed instances of Phantasus could lose some functionality in the absence of auxiliary files; for example, gene annotation mapping is unavailable without annotation databases. Previously, the user had to perform additional setup steps to unlock a few features, which could be confusing and unclear. To overcome this, we have significantly revised the installation procedure. The newly added ‘setupPhantasus’ function creates all necessary configuration files and provides an interactive dialog that helps the user load all necessary data files from our official cache mirror (https://alserglab.wsutl.edu/files/phantasus/minimal-cache/). Docker-based installation follows the same approach, but is configured to install everything by default. Thus, with the help of the new installation procedure, a locally installed Phantasus now has the full functionality available at the official mirrors. A comprehensive installation description is available at https://ctlab.github.io/phantasus-doc/installation.

      (3) When trying to normalize the RNA-seq counts, there are no standard options such as within-library (RPKM, FPKM) or between-library (TMM) normalization procedures.

Appreciating your feedback, we have expanded the available normalization options in the updated version of Phantasus. We added support for TMM normalization as implemented in the edgeR package and voom normalization from the limma package. However, strategies such as RPKM/FPKM or TPM rely on gene-specific effective lengths, which are challenging to infer without protocol and alignment details. As Phantasus operates on gene expression matrices and does not execute alignment steps, implementing these normalizations is infeasible. On the other hand, if the user has a matrix of FPKM or TPM gene values (for example from a core facility), that matrix can be loaded into Phantasus and used for the analysis.

If I take log2(1+x) a new tab is created with the normalized data, but it's not easy to realize what happened because the tab has the same name as the previous one, and while the colors of the heatmap changed to reflect the new scale of the data, this is quite subtle. This may cause an inexperienced user to apply the same normalization step again on the normalized data. Ideally, the interface should lead the user through a pipeline, reducing unnecessary degrees of freedom associated with each step.

Thank you for your comment. Indeed, our approach of creating a new tab, with the same name, for each alteration to the expression values can be a source of confusion. On the other hand, generating informative tab names without overwhelming users with too much detail is also challenging. As a compromise, the user can manually rename the tab. Still, we agree that this remains an area for improvement. We also consider it part of a larger issue: for example, loaded data can already be log-scaled, so that even one round of log transformation in Phantasus would be incorrect. Accordingly, we are exploring ways to address this in the future by adding automated checks to the tools or, as you suggested, implementing stricter pipelines.
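As an illustration of what such an automated check could look like (purely a sketch, not Phantasus code, and the threshold is an assumption), a simple heuristic flags matrices whose maximum value is implausibly small for raw counts or intensities:

```python
import numpy as np

def looks_log_scaled(matrix, threshold=50.0):
    """Heuristic guess: raw counts/intensities routinely exceed the threshold,
    while log2-scaled expression values rarely do. A tool could warn before
    applying log2(1+x) a second time when this returns True."""
    return float(np.nanmax(matrix)) < threshold

# Hypothetical expression values
counts = np.array([[0.0, 1500.0], [230.0, 12.0]])
logged = np.log2(1.0 + counts)
```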

      (4.4) Phantasus allows one to filter out lowly-expressed genes by averaging expression of genes across samples and discarding/selecting genes using some cutoff value on that average. This strategy is fine, but to make an informed decision on that cutoff it would be useful to see a density plot of those averages that would allow one to identify the modes of low and high expression and decide the cutoff value that separates them.

      Thank you for the suggestion. Indeed a density plot might help users to make informed decisions during gene filtration. We have added such a plot into the ‘Plot/Chart’ tool as a ‘histogram’ chart type.

It would also be nice to have an interface to the filterByExpr() function from the edgeR package, which provides more control over how to filter out lowly-expressed genes.

      Thank you for proposing the inclusion of an interface for the filterByExpr() function from the edgeR package. In the recent update we have incorporated filterByExpr() as part of the voom normalization tool. For now, for simplicity, we have decided to keep only the default parameter values. However, we will explore the addition of the dedicated filtering tool in the future.

      (5) When attempting a differential expression (DE) analysis, a popup window appears saying:

      "Your dataset is filtered. Limma will apply to unfiltered dataset. Consider using New Heat Map tool."

      One of the main purposes of filtering lowly-expressed genes is mainly to conduct a DE analysis afterwards, so it does not make sense that the tool says that such an analysis will be done on the unfiltered dataset. The reference to the "New Heat Map tool" is vague and unclear where should the user look for that other tool, without any further information or link.

      Thank you for highlighting this issue. We agree that the message in the popup window and the default action were confusing. In response to your feedback, we've updated the default behavior of our DE tools to automatically use the filtered data in a new tab. Additionally, we've clarified the warning message to ensure a better understanding of this process.

      (6) The DE analysis only allows for a two-sample group comparison, which is an important limitation in the question we may want to address. The construction of more complex designs could be graphically aided by using the ExploreModelMatrix Bioconductor package (Soneson et al, F1000Research, 2020).

      Indeed, the ability to create complex designs and various comparisons is important for many applications for gene expression analysis. Accordingly, in the latest Phantasus version, we've introduced an advanced design feature for the DE analysis, enabling the utilization of multiple column annotations for the design matrix. Combined with the existing ability to create new annotations, this update facilitates the setup of diverse design matrices. While at the moment we do not allow setting a complex contrast, we hope that the current interface will cover most of the differential expression use cases.
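As a generic illustration of building a design matrix from multiple sample annotations (analogous to model.matrix(~ condition + batch) in R; the annotation values below are hypothetical and this is not Phantasus code):

```python
import pandas as pd

# Hypothetical sample annotations with two factors
samples = pd.DataFrame({
    "condition": ["control", "control", "treated", "treated"],
    "batch": ["b1", "b2", "b1", "b2"],
})

# Full-rank design matrix: intercept plus one dummy column per extra level
design = pd.get_dummies(samples, drop_first=True, dtype=float)
design.insert(0, "intercept", 1.0)
```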

      (7) When trying to perform a pathway analysis with FGSEA, I get the following error:

      "Couldn't load FGSEA meta information. Please try again in a moment. Error: cannot open the connection In call: file(file, "rt")

      We hope that this issue should be resolved after we have implemented a more streamlined setup process. Among others, the new approach aims to eliminate the unexpected absence of metafiles in local installations. The latest Phantasus package version explicitly prompts the user to load necessary additional files automatically during the initial run, reducing options for an invalid setup.

      Finally, there have been already some efforts to approach R and Bioconductor transcriptomics pipelines to point-and-click users, such as iSEE (Rue-Albrecht et al, 2018) and GeneTonic (Marini et al, 2021) but they are not compared or at least cited in the present work.

Indeed, our comparison was focused on tools that offer non-programmatic functionality for gene expression data analysis. While tools like iSEE and GeneTonic are adept at visualizing data and provide extensive capabilities of their own, they necessitate additional data preparation using R, which distinguishes them from the specific scope of tools we assessed.

      One nice features of these two tools that I missed in Phantasus is the possibility of generating the R code that produces the analysis performed through the interface. This is important to provide a way to ensure the reproducibility of the analyses performed.

The ability to generate R code within tools like these indeed aids in ensuring analysis reproducibility. We previously attempted to implement this functionality in Phantasus, but it proved hard to do in a useful fashion due to the potentially complex interactions between the user and the client-side part of Phantasus. Nevertheless, we acknowledge the significance of such a feature and aim to introduce it in the future.

    1. Usage is a two steps process: First, a schema is constructed using the provided types and constraints: const schema = Joi.object({ a: Joi.string() }); Note that joi schema objects are immutable which means every additional rule added (e.g. .min(5)) will return a new schema object.

      Sure! Imagine you're building a structure, like a house. Before you start building, you need a plan, right? That's what a schema is in programming – it's like your blueprint.

      So, in this code, we're using a tool called Joi to make our blueprint. We want our structure to have a specific type, like a string, and maybe some rules, like a minimum length.

      Here's a simple explanation:

      1. Constructing the Schema: First, we make our blueprint using Joi. In this case, we're saying we want something called a to be a string. Think of it like saying, "In my house blueprint, I want a room called a, and it should be a string."

const schema = Joi.object({ a: Joi.string() });

2. Adding Rules (Constraints): Now, let's say we want to add a rule to our blueprint, like saying that our room a must be at least 5 characters long. When we add rules, Joi gives us back a new blueprint with that rule added. It's like updating our original blueprint with extra details.

const schemaWithRule = schema.keys({ a: Joi.string().min(5) });

      So, in simple terms, we're creating a plan for our data, and then we can add rules to that plan to make sure our data follows certain conditions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

Seleit and colleagues set out to explore the genetics of developmental timing and tissue size by mapping natural genetic variation associated with segmentation clock period and presomitic mesoderm (PSM) size in different species of Medaka fish. They first establish the extent of variation between five different Medaka species in terms of organismal size, segmentation rate, segment size and presomitic mesoderm size, among other traits. They find that these traits are species-specific but strongly correlated. In a massive undertaking, they then perform developmental QTL mapping for segmentation clock period and PSM size in a set of ~600 F2 fish resulting from the cross of Oryzias sakaizumii (Kaga) and Oryzias latipes (Cab). Correlation between segmentation period and segment size was lost among the F2s, indicating that distinct genetic modules control these traits. Although the researchers fail to identify causal variants driving these traits, they perform proof-of-concept perturbations by analyzing F0 Crispants in which candidate genes were knocked out. Overall, the study introduces a completely new methodology (QTL mapping) to the field of segmentation and developmental tempo, and therefore provides multiple valuable insights into the forces driving evolution of these traits.

      Major comments: - The first sentence in the abstract reads "How the timing of development is linked to organismal size is a longstanding question". It is therefore disappointing that organismal size is not reported for the F2 hybrids. Was larval length measured in the F2s? If so, it should be reported. It is critical to understand whether the correlation between larval size and segmentation clock period is preserved in F2s or not, therefore determining if they represent a single or separate developmental modules. If larval length data were not collected, the authors need to be more careful with their wording.

The question the reviewer raises here is indeed a very relevant one, and one that we were also curious about ourselves. While it was not possible (logistically) to grow the 600 F2 fish to adulthood, we did measure larval length in a subset of F2 hatchlings (n=72) to ask precisely the question the reviewer raises here. Our results (new Supplementary Figure 5) show that the correlation between larval length and segmentation timing (which we report across the Oryzias species) is absent in the F2s. This indeed argues that the traits represent separate developmental modules.

      In the current version of the paper, organismal size is often incorrectly equated to tissue size (e.g. PSM size, segment size). For example, in page 3 lines 33-34, the authors state that faster segmentation occurred in embryos of smaller size (Fig. 1D). However, Fig. 1D shows correlation between segmentation rate and unsegmented PSM area. The appropriate data to show would be segmentation rate vs. larval or adult length.

The reviewer is correct. We have now linked the data more clearly to Supplementary Figure 1, which shows that adult length and adult mass are strongly correlated (S1A) and that adult mass is in turn strongly correlated with segmentation rate in the different Oryzias species (S1B). Additionally, main Figure 1B shows that larval length is correlated with PSM length. We have corrected the main text to reflect these relationships more clearly.

      • Is my understanding correct in that the her7-venus reporter is carried by the Cab F0 but not the Kaga F0? Presumably only F2s which carried the reporter were selected for phenotyping. I would expect the location of the reporter in the genome to be obvious in Figure 3J as a region that is only Cab or het but never Kaga. Can the authors please point to the location of the reporter?

      The reviewer is correct. Indeed the location of our her7-venus KI is on chromosome 16 and the recombination patterns on this chromosome overwhelmingly show either Hom Cab (green) or Het Cab/Kaga (Black). This is expected as we selected fish carrying the her7-venus KI for phenotyping.

      • devQTL mapping in this study seems like a wasted opportunity. The authors perform mapping only to then hand pick their targets based on GO annotations. This biases the study towards genes known to be involved in PSM development, when part of the appeal of QTL mapping is precisely its unbiased nature and the potential to discover new functionally relevant genes. The authors need to better justify their rationale for candidate prioritization from devQTL peaks. The GO analysis should be shown as supplemental data. What criteria were used to select genes based on GO annotations?

We have now commented on these valid points and outlined our rationale in more detail in the text (page 4, lines 20-30). Our rationale now also includes the selection of differentially expressed genes (n=5 genes) that fall within segmentation timing devQTL hits (for more details see below). Essentially, while we did ultimately focus on a proof of principle using known genes, these genes were previously not known to play a role in either setting the timing of segmentation or controlling the size of the PSM. Hence, we do think our strategy demonstrates "the potential to discover new functionally relevant genes", even though the genes themselves had been implicated in somitogenesis overall. We added the GO analysis as supplemental data as requested (new Supplementary Figure 7E).

      • Analysis of the predicted functional consequence of divergent SNPs (Fig. S6B, F) is superficial. Among missense variants, which genes harbor the most deleterious mutations? Which missense variants are located in highly conserved residues? Which genes carry variants in splice donors/acceptors? Carefully assessing the predicted effect of SNPs in coding regions would provide an alternative, less biased approach to prioritize candidate genes.

We have now included our analysis of SNPs based on the Variant Effect Predictor (VEP) tool from Ensembl. This analysis ranks the predicted severity of each SNP's effect on protein structure and function (impact: low, moderate, high) and annotates which variants can affect splice donors/acceptors. The VEP analysis for both phenotypes is now added to the manuscript as supplemental data (new Supplementary Data S2, S5).

      • Another potential way to prioritize candidate genes within devQTL peaks would be to use the RNA seq data. The authors should perform differential expression analysis between Kaga and Cab RNA-seq datasets. Do any of the differentially expressed genes fall within the devQTL peaks?

As suggested, we have performed this additional experiment and report the RNA-seq differential expression analysis in new Supplementary Figure 7C-D. The analysis revealed 2606 differentially expressed genes in the PSM between Kaga and Cab, five of which were candidate genes from the devQTL analysis. We have now tested all of these (5 in total, 4 new and 1 previously targeted, adgrg1) for segmentation timing by CRISPR/Cas9 KO in the her7-venus background, none of which showed a timing phenotype (new Supplementary Figure 7F-F'). We provide the complete set of results in new Supplementary Figure 7 and Supplementary Data file 3 (DE genes); all data were deposited in the publicly available repository BioStudies under accession number E-MTAB-13927.
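The prioritization described here boils down to intersecting significant DE genes with the devQTL candidate list. A minimal sketch, with partly made-up gene IDs and entirely made-up fold changes and p-values:

```python
import pandas as pd

# Hypothetical DE results table (gene, log2 fold change, adjusted p-value)
de = pd.DataFrame({
    "gene": ["mespb", "pcdh10b", "adgrg1", "geneX"],
    "log2fc": [2.1, -1.4, 0.9, 0.1],
    "padj": [0.001, 0.02, 0.04, 0.80],
})
qtl_candidates = {"adgrg1", "mespb", "geneZ"}

# Significant DE genes (padj < 0.05) that also fall within devQTL peaks
sig = de.loc[de["padj"] < 0.05, "gene"]
overlap = sorted(set(sig) & qtl_candidates)
```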

      • The use of crispants to functionally test candidate genes is inappropriate. Crispants do not mimic the effect of divergent SNPs and therefore completely fail to prove causality. While it is completely understandable that Medaka fish are not amenable to the creation of multiple knock-in lines where divergent SNPs are interconverted between species, better justification is needed. For instance, is there enough data to suggest that the divergent alleles for the candidate genes tested are loss of function? Why was a knockout approach chosen as opposed to overexpression?

We agree with the reviewer that we do not address the causality of SNPs with the CRISPR/Cas9 KO approach we followed, and medaka does offer the genome editing capabilities to create tailored sequence modifications, so in principle this could be done. In practice, however, we reasoned that any given SNP would contribute only partially to the observed phenotypes, and combinatorial sequence edits are simply very laborious given the current state of the art in genome editing technologies. We therefore opted for an alternative proof-of-principle approach that aims to discover new functionally relevant genes, not SNPs.

      -Along the same line, now that two candidate genes have been shown to modulate the clock period in crispants (mespb and pcdh10b), the authors should at least attempt to knock in the respective divergent SNPs for one of the genes. This is of course optional because it would imply several months of work, but it would significantly increase the impact of the study.

As above, this is in principle the correct rationale to follow, though it is very time-, cost- and labour-intensive. It is for the latter practical consideration that we decided not to follow this option.

      Minor Comments - It would be highly beneficial to describe the ecological differences between the two Medaka species. For example, do the northern O. sakaizumii inhabit a colder climate than the southern O. latipes? Is food more abundant or easily accessible for one species compared to the other? What, if anything, has been described about each species' ecology?

There are indeed differences in the ecology of the two species, with the northern O. sakaizumii inhabiting a colder climate than the southern O. latipes. In addition, the breeding season is known to be shorter in the north than in the south, and northern species have been shown to have a faster juvenile growth rate than southern species. While it would be premature to link those ecological factors to the timing differences we observe, we can certainly speculate. A line to this effect has been added to the main text (page 5, lines 28-30).

      • The authors describe two different methods for quantifying segmentation clock period (mean vs. intercept). It is still unclear what is the difference between Figs. 3A (clock period), S4A (mean period) and S4B (intercept period). Is clock period just mean period? Are the data then shown twice? How do Fig. 3A and S4A differ?

The clock period shown in all the main figures is the intercept period, which was also used for the devQTL analysis. Both measurements (mean and intercept) are indeed highly correlated, and we include both in the supplement for completeness.

      • devQTL as shorthand for developmental QTL should be defined in page 4 line 1 (where the term first appears), not later in line 12 of the same page.

      Noted and corrected, we thank the reviewer for spotting this error.

      • Python code for period quantification should be uploaded to Github and shared with reviewers.

All period quantification code used in this study was obtained from the publicly available tool pyBOAT (https://www.biorxiv.org/content/10.1101/2020.04.29.067744v3). All code used in pyBOAT is available from the creator's GitHub page (https://github.com/tensionhead/pyBOAT). Both are linked in the references and the materials and methods sections.
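pyBOAT itself uses continuous wavelet analysis; as a rough illustration of the underlying idea of recovering a dominant period from an oscillatory trace (not the pyBOAT implementation), here is a Fourier-based sketch on a synthetic her7-like signal with an assumed 30-minute period:

```python
import numpy as np

def dominant_period(signal, dt):
    """Return the period (in units of dt) of the strongest nonzero frequency."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC component
    freqs = np.fft.rfftfreq(x.size, d=dt)
    power = np.abs(np.fft.rfft(x)) ** 2
    peak = np.argmax(power[1:]) + 1       # skip the zero-frequency bin
    return 1.0 / freqs[peak]

# Synthetic oscillation: 30-minute period sampled every minute for 600 minutes
t = np.arange(0, 600, 1.0)
y = np.sin(2 * np.pi * t / 30.0)
period = dominant_period(y, dt=1.0)
```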

      • RNA-seq data should be uploaded to a publicly accessible repository and the reviewer token shared with reviewers.

      We have uploaded all RNA-sequencing Data to public repository BioStudies under accession numbers : E-MTAB-13927, E-MTAB-13928. This information is now also added to material and methods in the manuscript text.

• Why are the maintenance (27-28C) vs. imaging (30C) temperatures different?

Medaka fish have a wide range of temperatures they can physiologically tolerate, i.e. 17-33C. 30C was chosen for practical reasons: a slightly faster developmental rate enables higher sample throughput in overnight real-time imaging experiments.

      • For Crispants, control injections should have included a non-targeting sgRNA control instead of simply omitting the sgRNA.

We agree a non-targeting sgRNA control could be included, though we chose a different approach. For clarity, we now also include a control targeting Oca2, a gene involved in eye pigmentation, to probe for any injection-related effect on timing and PSM size. As expected, 3 sgRNAs + Cas9 against Oca2 had no impact on timing or PSM size. These data are shown in new Supplementary Figure 9F-G'.

• It is difficult to keep track of the species and strains. It would be most helpful if Fig. S1 appeared instead in main Figure 1.

      We agree and included an overview of the phylogenetic relationship of all species and their geographical locales in new Figure 1 A-B.

      Significance

      • The study introduces a new way of thinking about segmentation timing and size scaling by considering natural variation in the context of selection. This new framing will have an important impact on the field.
      • Perhaps the most significant finding is that the correlation between segment timing and size in wild populations is driven not by developmental constraints but rather selection pressure, whereas segment size scaling does form a single developmental module. This finding should be of interest to a broad audience and will influence how researchers in the field approach future studies.
      • It would be helpful to add to the conclusion the author's opinion on whether segmentation timing is a quantitative trait based on the number of QTL peaks identified.
      • The authors should be careful not to assign any causality to the candidate genes that they test in crispants.
      • The data and results are generally well-presented, and the research is highly rigorous.
      • Please note I do have the expertise to evaluate the statistical/bioinformatic methods used for devQTL mapping.

      Reviewer #2

      Evidence, reproducibility and clarity

Seleit et al. investigate the correlation between segment size, presomitic mesoderm size and the rhythm of periodic oscillations of the segmentation clock in developing medaka fish. Specifically, they aim to identify the genetic determinants of said traits. To do so, they employ a common garden approach and measure these traits in separate strains (F0) and in interbreedings across two generations (F1 and F2). They find that whereas presomitic mesoderm and segment size are genetically coupled, the tempo of her7 oscillations is not. Genetic mapping of the F0 and F2 progeny allows them to identify regions associated with said traits. They go on to perturb 7 loci associated with the segmentation clock and X related to segment size. They show that 2/7 have a tempo defect, and 2/ affect size.

      Major comments: The conclusions are convincing and well supported by the data. I think the work could be published as is in its current state, and no additional experiments that I can think of are needed to support the claims in the paper.

Minor comments: - The authors could provide a more detailed characterization of the identified SNPs associated with the clock and with PSM size. For the segmentation clock, the authors identify 46872 SNPs, most of which correspond to non-coding regions and are associated with 57 genes. They narrow down their approach to those expressed in the PSM of Cab and Kaga. Was the RNA selected from F1 hybrids? I wonder if this would impact the analysis for tempo and/or size in any way, as F2 are derived from these, and they show broader variability in the clock period than the F0 and F1 fish.

The RNA was obtained from the pure F0 strains, and we have now extended this analysis by deep bulk RNA sequencing and differential gene expression analysis. As also indicated to Reviewer 1, this revealed 2606 differentially expressed genes in the unsegmented tails of Kaga and Cab embryos, some of which occurred in devQTL peaks. Based on this information we expanded our list of CRISPR/Cas9 KOs by targeting all differentially expressed genes for segmentation timing (5 in total: 4 new and 1 previously targeted), none of which showed a timing phenotype (new Supplementary figure 7C-D). We provide the complete set of results in new Supplementary Figure 7 and Supplementary Data file 3 (DE-genes). All data were deposited in the publicly available repository BioStudies under accession number E-MTAB-13927.

It would be good if the authors could discuss whether there were any associated categories or overall functional relationships between the SNPs/genes associated with size. And what about in the case of timing?

      In the case of PSM size there were no clear GO terms or functional relationships between the genes that passed the significance threshold on chromosome 3.

      For the 35 genes related to segmentation timing, there were a number of GO enrichment terms directly related to somitogenesis. We have included the GO analysis in the new Supplementary Figure 7E.

• Have any of the candidate genes or regulatory loci been associated with clock defects (57) or segment size (204) previously in the literature?

To the best of our knowledge, none of the genes have been associated with clock or PSM size defects so far. It might be worthwhile using our results to probe their function in other systems that enable higher-throughput functional analysis, such as newly developed organoid models.

      • When the authors narrow down the candidate list, it is not clear if the genes selected as expressed in the PSM are tissue specific. If they are, I wonder if genes with ubiquitous expression would be more informative to investigate tempo of development more broadly. It would be good if the authors could specifically discuss this point in the manuscript.

We have not addressed the spatial expression pattern of the 35 identified PSM genes in this study, so we cannot speculate further. But the reviewer raises an important point: how the timing of individual processes (such as body axis segmentation) is linked at the organismal scale is indeed a fundamental, additional question that will be addressed in future studies. The in-vivo context we follow here would be ideal for such investigations.

• Can the authors speculate mechanistically why mespb or pcdh10b accelerates the period of her7 oscillations?

While we do not have a mechanistic explanation yet, an additional experiment we performed, i.e. bulk RNA-sequencing on WT and mespb mutant tails, provided additional insight; we have now added these data to the manuscript. This analysis revealed 808 differentially expressed genes between WT and mespb mutants. Interestingly, many of the affected genes are known to be expressed outside of the mespb domain, i.e. in the most posterior PSM (tbxt, foxb1, msgn1, axin2, fgf8, amongst others). This indicates that the effect of mespb downregulation is widespread and possibly occurs at an earlier developmental stage; this requires further follow-up studies. These data are now shown in new Supplementary figure 9A and Supplementary Data file S4. We now comment on this point in the revised manuscript.

• Are there any size differences associated with the functionally validated clock mutants?

We addressed this point directly and added this analysis as Supplementary Figure 9H-H'. While pcdh10b mutants do not show any detectable difference in PSM size, we find a small, statistically significant reduction in PSM size (area but not length) in mespb mutants. All these data are now included in the revised manuscript.

• Ref 27 shows a lack of correlation between body size and the segmentation period in various species of mammals. The present work supports their findings, and it would be good to see this discussed in the text.

      We are not certain how best to compare our in-vivo results in externally developing fish embryos to in-vitro mammalian 2-D cell cultures. In our view, the correlation of embryo size, larval and adult size that we find in Oryzias might not necessarily hold in mammalian species, which would make a comparison more difficult. We do cite the work mentioned so the reader is pointed towards this interesting, complementary literature.

      Significance

The work is quite remarkable in terms of the multigenerational genetic analysis performed. The authors have analysed >600 embryos from three separate generations to obtain quantitative data to answer their question (herculean task!). Moreover, they have associated this characterization with specific SNPs. Then, to go beyond the association, they have generated mutant lines and identified specific genes associated with the traits they set out to decipher.

      To my knowledge, this is the first project that aims to identify the genetic determinants for developmental timing. Recent work on developmental timing in mammals has focused on interspecies comparisons and does not provide genetic evidence or insight into how tempo is regulated in the genome. As for vertebrates, recent work from zebrafish has profiled temperature effects on cell proportions and developmental timing. However, the genetic approach of this work is quite elegant and neat.

Conceptually, it is quite important and unexpected that overall size and tempo are not related. Body size, lifespan, basal metabolic rate and gestational period correlate positively, and we tend to think that mechanistically they would all be connected to one another. This paper and Lazaro et al. 2023 (ref 27) are among the first in which this preconception is challenged in a very methodical and conclusive manner. I believe the work is a breakthrough for the field, and this work would be interesting for the field of biological timing, for the segmentation clock community and more broadly for all developmental biologists.

My field is quantitative stem cell biology and I work on developmental timing myself, so I acknowledge that I am biased in my enthusiasm for the work. It should be noted that, as an expert in the field, I have identified instances where other work hasn't been as insightful or well developed in comparison to this piece. It is also worth noting that I am not an expert in fish development, phylogenetic studies or GWAS analyses, so I am not able to assess any pitfalls in that respect.

Reviewer #3 (Evidence, reproducibility and clarity (Required)):

Summary:

      This manuscript explores the temporal and spatial regulation of vertebrate body axis development and patterning. In the early stages of vertebrate embryo development, the axial mesoderm (presomitic mesoderm - PSM) undergoes segmentation, forming structures known as somites. The exact genetic regulation governing somite and PSM size, and their relationship to the periodicity of somite formation remains unclear.

      To address this, the authors used two evolutionarily closely related Medaka species, Oryzias sakaizumii and Oryzias latipes, which, although having distinct characteristics, can produce viable offspring. Through analysis spanning parental (generation F0) and offspring (generations F1 and F2) generations, the authors observed a correlation between PSM and somite size. However, they found that size scaling does not correlate with the timing of somitogenesis.

      Furthermore, employing developmental quantitative trait loci (devQTL) mapping, the authors identified several new candidate loci that may play a role during somitogenesis, influencing timing of segment formation or segment size. The significance of these loci was confirmed through an innovative CRISPR-Cas9 gene editing approach.

      This study highlights that the spatial and temporal aspects of vertebrate segmentation are independently controlled by distinct genetic modular mechanisms.

Major comments:

1) In the main text, page 3, lines 11 and 12, the authors state that the periodicity of the embryo clock of the F1 generation is intermediate between the parental F0 lineages. However, the authors look only at the periodicity of the Cab strain (Oryzias latipes) segmentation clock. The authors would need a reporter fish line for the Kaga strain (Oryzias sakaizumii) to compare the segmentation clock of both parental strains and their offspring. Since this could be time-consuming and laborious, I advise alternatively rephrasing the text of the manuscript.

We agree that a careful distinction between segment formation rate (measured based on morphology) and clock period (measured using the novel reporter we generated) is essential. We show that both measures correlate very well in Cab, in F0, F1 and F2 carrying the Cab allele. For Kaga F0, we can indeed only provide the rate of somite formation, which nevertheless allows comparison due to the strong correlation with the clock period that we have found. We have rephrased the text accordingly.

      2) It is evident that only a few F0 and F1 animals were analyzed in comparison with the F2 generation. Could the authors kindly explain whether and how this could bias or skew the observed results?

We provide statistical evidence through the F-test of equality of variances that the variances of the F0, F1 and F2 samples are equal. Additionally, if we sub-sample and separate the F2 data into groups of 100 embryos (instead of all 638), we get the same distribution of the F2s. We therefore believe that this is sufficient evidence against a bias or skew in the results.
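The variance-equality F-test and the sub-sampling check described above can be sketched as follows. This is an illustrative stand-in, not the paper's data or pipeline: the sample sizes mimic the small F0 and large F2 cohorts, and the values are simulated.

```python
import random
import statistics

def f_ratio(sample_a, sample_b):
    """F statistic for a test of equality of variances:
    ratio of the two sample variances (larger over smaller, so F >= 1)."""
    va = statistics.variance(sample_a)
    vb = statistics.variance(sample_b)
    return max(va, vb) / min(va, vb)

# Simulated stand-in data: a small F0-like sample and a large F2-like
# sample drawn from distributions with equal variance.
rng = random.Random(0)
f0 = [rng.gauss(10.0, 1.0) for _ in range(30)]
f2 = [rng.gauss(10.5, 1.0) for _ in range(638)]

print(round(f_ratio(f0, f2), 2))  # close to 1 when variances are equal

# Sub-sampling check: split the F2-like sample into groups of 100 and
# compare the spread within each group.
groups = [f2[i:i + 100] for i in range(0, 600, 100)]
print([round(statistics.stdev(g), 2) for g in groups])  # similar spread per group
```

Under equal variances the ratio stays near 1; a formal p-value would additionally require the F distribution with the two samples' degrees of freedom.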

      3) It would be interesting to create fish lines with the validated CRISPR-Cas9 gene manipulations in different genetic contexts (Cab or Kaga) to analyze the true impact on the segmentation clock and/or PSM & somite sizes.

We agree with the reviewer that this would in principle be of interest; please see our earlier response to Reviewer 1.

      4) Please add the results of the Go Analysis as supplementary material.

      We have added the GO analysis in new Supplementary Figure 7E.

Minor comments:

      1) In the main text, page 2, line 29, Supplementary Figure 1D should be referenced.

We have added a clearer phylogeny and the geographical locations of the different species in new Figure 1 A-B, and reference it at the requested location.

      2) In the main text, page 2, line 32, the authors refer to Figure 1B, but it should be 1C.

      We have corrected the information.

      3) Regarding the topic "Correlation of segmentation timing and size in the Oryzias genus" the authors should also give information on the total time of development of the different Oryzias species, as well as the total number of formed somites.

We follow this recommendation and have added this information in new Supplementary Figure 5. We also now include segment number measured in F2 embryos. We indeed view segmentation rate as a proxy for developmental rate, which however needs to be distinguished from total developmental time. The latter can be measured, for instance, by quantifying hatching time, which we did. These measurements show that Kaga, Cab and O. hubbsi embryos kept at a constant 28°C started hatching on the same day, while O. minutillus and O. mekongensis embryos started hatching one day earlier. We have not included this data in the manuscript because we think a distinction should be made between rate of development and total development time.

      4) In Figures 3A and B, please add info on the F1 lines for comparison.

The information on F1 lines is provided in Supplementary Figure 3.

      5) Supplementary Figures 2F shows that the generation F1 PSM is similar to Cab F0, and not an intermediate between Kaga F0 and Cab F0. This is interesting and should be discussed.

We show that the F1 PSM is indeed closer to the PSM of Cab than to that of Kaga. This is intriguing, and we have now commented on this point directly in the text.

      6) Supplementary Figures 6C to H are not mentioned either in the main text or in the extended information. Please add/mention accordingly.

We have added references to both in the text.

      7) The order of Supplementary Figure 8 E to H and A to D appears to be not correct and not following the flow of the text. Please update/correct accordingly.

      We have updated the text accordingly.

      8) The authors should choose between "Fig.", "Fig", "fig.", "fig" or "Figure". All 'variants' can be found in the text.

      Noted, and updated. Fig. is used for main figures and fig. is used for supplementary figures.

      9) The color scheme of several figures (graphs with colored dots) should be revised. Several appear to be difficult to discern and analyze.

We have enhanced the colours and increased the font size on the figure panels. The colour palette was chosen to be colour-blind friendly.

      10) Please address/discuss following questions: What are the known somitogenesis regulating genes in Medaka? How do they correlate with the new candidates?

The candidates we found and tested had not been implicated in regulating the tempo of segmentation or PSM size, while for some a role in somite formation had been previously established, hence the enrichment of the GO term "somitogenesis" in our analysis.

      Reviewer #3 (Significance (Required)):

      General assessment:

This interesting manuscript describes a novel approach to study and find new players relevant to the regulation of vertebrate segmentation. By employing this innovative methodology, the authors could elegantly demonstrate that the segmentation clock periodicity is independent of the sizes of the PSM and forming somites. The authors were further able to find new genes that may be involved in the regulation of the segmentation clock periodicity and/or the size of the PSM & somites. A limitation of this study is the fact that the results mainly rely on differences between the two species. The integration of additional Medaka species would be beneficial and may help uncover relevant genes and genetic contexts.

      Advance:

      To my best knowledge this is the first time that such a methodology was employed to study the segmentation clock and axial development. Although the topic has been extensively studied in several model organisms, such as mice, chicken, and zebrafish, none of them correlated the size of the embryonic tissues and the periodicity of the embryo clock. This study brings novel technological and functional advances to the study of vertebrate axial development.

      Audience:

      This work is particularly interesting to basic researchers, especially in the field of developmental biology and represents a fresh new approach to study a core developmental process. This study further opens the exciting possibility of using a similar methodology to investigate other aspects of vertebrate development. It is a timely and important manuscript which could be of interest to a wider scientific audience and readership.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

Seleit and colleagues set out to explore the genetics of developmental timing and tissue size by mapping natural genetic variation associated with segmentation clock period and presomitic mesoderm (PSM) size in different species of Medaka fish. They first establish the extent of variation between five different Medaka species in terms of organismal size, segmentation rate, segment size and presomitic mesoderm size, among other traits. They find that these traits are species-specific but strongly correlated. In a massive undertaking, they then perform developmental QTL mapping for segmentation clock period and PSM size in a set of ~600 F2 fish resulting from the cross of Oryzias sakaizumii (Kaga) and Oryzias latipes (Cab). Correlation between segmentation period and segment size was lost among the F2s, indicating that distinct genetic modules control these traits. Although the researchers fail to identify causal variants driving these traits, they perform proof-of-concept perturbations by analyzing F0 Crispants in which candidate genes were knocked out. Overall, the study introduces a completely new methodology (QTL mapping) to the field of segmentation and developmental tempo, and therefore provides multiple valuable insights into the forces driving evolution of these traits.

      Major comments:

• The first sentence in the abstract reads "How the timing of development is linked to organismal size is a longstanding question". It is therefore disappointing that organismal size is not reported for the F2 hybrids. Was larval length measured in the F2s? If so, it should be reported. It is critical to understand whether the correlation between larval size and segmentation clock period is preserved in F2s or not, thereby determining whether they represent a single developmental module or separate ones. If larval length data were not collected, the authors need to be more careful with their wording. In the current version of the paper, organismal size is often incorrectly equated with tissue size (e.g. PSM size, segment size). For example, on page 3 lines 33-34, the authors state that faster segmentation occurred in embryos of smaller size (Fig. 1D). However, Fig. 1D shows the correlation between segmentation rate and unsegmented PSM area. The appropriate data to show would be segmentation rate vs. larval or adult length.
      • Is my understanding correct in that the her7-venus reporter is carried by the Cab F0 but not the Kaga F0? Presumably only F2s which carried the reporter were selected for phenotyping. I would expect the location of the reporter in the genome to be obvious in Figure 3J as a region that is only Cab or het but never Kaga. Can the authors please point to the location of the reporter?
      • devQTL mapping in this study seems like a wasted opportunity. The authors perform mapping only to then hand pick their targets based on GO annotations. This biases the study towards genes known to be involved in PSM development, when part of the appeal of QTL mapping is precisely its unbiased nature and the potential to discover new functionally relevant genes. The authors need to better justify their rationale for candidate prioritization from devQTL peaks. The GO analysis should be shown as supplemental data. What criteria were used to select genes based on GO annotations?
      • Analysis of the predicted functional consequence of divergent SNPs (Fig. S6B, F) is superficial. Among missense variants, which genes harbor the most deleterious mutations? Which missense variants are located in highly conserved residues? Which genes carry variants in splice donors/acceptors? Carefully assessing the predicted effect of SNPs in coding regions would provide an alternative, less biased approach to prioritize candidate genes.
      • Another potential way to prioritize candidate genes within devQTL peaks would be to use the RNA seq data. The authors should perform differential expression analysis between Kaga and Cab RNA-seq datasets. Do any of the differentially expressed genes fall within the devQTL peaks?
      • The use of crispants to functionally test candidate genes is inappropriate. Crispants do not mimic the effect of divergent SNPs and therefore completely fail to prove causality. While it is completely understandable that Medaka fish are not amenable to the creation of multiple knock-in lines where divergent SNPs are interconverted between species, better justification is needed. For instance, is there enough data to suggest that the divergent alleles for the candidate genes tested are loss of function? Why was a knockout approach chosen as opposed to overexpression?
      • Along the same line, now that two candidate genes have been shown to modulate the clock period in crispants (mespb and pcdh10b), the authors should at least attempt to knock in the respective divergent SNPs for one of the genes. This is of course optional because it would imply several months of work, but it would significantly increase the impact of the study.

      Minor Comments

      • It would be highly beneficial to describe the ecological differences between the two Medaka species. For example, do the northern O. sakaizumii inhabit a colder climate than the southern O. latipes? Is food more abundant or easily accessible for one species compared to the other? What, if anything, has been described about each species' ecology?
• The authors describe two different methods for quantifying segmentation clock period (mean vs. intercept). It is still unclear what the difference is between Figs. 3A (clock period), S4A (mean period) and S4B (intercept period). Is clock period just mean period? Are the data then shown twice? How do Fig. 3A and S4A differ?
      • devQTL as shorthand for developmental QTL should be defined in page 4 line 1 (where the term first appears), not later in line 12 of the same page.
      • Python code for period quantification should be uploaded to Github and shared with reviewers.
      • RNA-seq data should be uploaded to a publicly accessible repository and the reviewer token shared with reviewers.
      • Why are the maintenance (27-28C) vs. imaging (30C) temperatures different?
      • For Crispants, control injections should have included a non-targeting sgRNA control instead of simply omitting the sgRNA.
      • It is difficult to keep track of the species and strains. It would be most helpful if Fig. S1 appeared instead in main figure 1.

      Significance

      • The study introduces a new way of thinking about segmentation timing and size scaling by considering natural variation in the context of selection. This new framing will have an important impact on the field.
      • Perhaps the most significant finding is that the correlation between segment timing and size in wild populations is driven not by developmental constraints but rather selection pressure, whereas segment size scaling does form a single developmental module. This finding should be of interest to a broad audience and will influence how researchers in the field approach future studies.
      • It would be helpful to add to the conclusion the author's opinion on whether segmentation timing is a quantitative trait based on the number of QTL peaks identified.
      • The authors should be careful not to assign any causality to the candidate genes that they test in crispants.
      • The data and results are generally well-presented, and the research is highly rigorous.
      • Please note I do have the expertise to evaluate the statistical/bioinformatic methods used for devQTL mapping.
    1. Multi-factor authentication. December 2023. Page Version ID: 1188119370. URL: https://en.wikipedia.org/w/index.php?title=Multi-factor_authentication&oldid=1188119370 (visited on 2023-12-06).

Multi-factor authentication is a system where a site will only allow access when 2 or more pieces of authenticating evidence are presented. This may come in the form of a password along with a code sent through SMS or email. Requiring multiple factors of authentication makes a site more secure against unwanted access.
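The password-plus-code flow described above can be sketched as follows. This is a hypothetical illustration, not any real site's implementation: the function names and the fixed one-time code are made up, and a real system would also expire codes and rate-limit attempts.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes) -> bytes:
    # Derive a salted hash of the password (first factor: something you know).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def verify_two_factors(password: str, otp: str, salt: bytes,
                       stored_hash: bytes, issued_otp: str) -> bool:
    # Both factors must pass; compare_digest avoids timing side channels.
    knows_password = hmac.compare_digest(hash_password(password, salt), stored_hash)
    has_code = hmac.compare_digest(otp, issued_otp)  # second factor: something you have
    return knows_password and has_code

# Server-side setup: store the salted hash, then issue a one-time code
# (here a fixed string standing in for a code sent via SMS or email).
salt = secrets.token_bytes(16)
stored = hash_password("hunter2", salt)
code = "492117"

print(verify_two_factors("hunter2", code, salt, stored, code))      # both factors correct
print(verify_two_factors("hunter2", "000000", salt, stored, code))  # password alone is not enough
```

The point of the sketch is the conjunction: access is granted only when every factor verifies independently.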

    1. But stepping back even further, one can only see this imagined software as an enhancement to Latour’s larger model of interplay in his actor-network theory, a theory that does not need software or special equipment to exist. The activity in a spatial environment is not reliant on the digital environment. It may be enhanced by a code/text-based software, but a spatial software or protocol can be any platform that establishes variables for space as information
    2. We are not accustomed to the idea that non-human, inanimate objects possess agency and activity, just as we are not accustomed to the idea that they can carry information unless they are endowed with code/text-based information technologies. While accepting that a technology like mobile telephony has become the world’s largest shared platform for information exchange, we are perhaps less accustomed to the idea of space as a technology or medium of information—undeclared information that is not parsed as text or code. Indeed, the more ubiquitous code/text-based information devices become, the harder it is to see spatial technologies and networks that are independent of the digital. Few would look at a concrete highway system or an electrical grid and perceive agency in their static arrangement. Agency might only be ascribed to the moving cars or the electrical current. Spaces and urban arrangements are usually treated as collections of objects or volumes, not as actors. Yet the organization itself is active. It is doing something, and changes in the organization constitute information. Even so, the idea that information is carried in activity, or what we might call active form, must still struggle against many powerful habits of mind.
    1. He said no way - using haskell he was convinced he could implement anything I could implement, faster and better and with less code. We didn't test the claim - but I still wonder - is he right?

      Both are correct. This aspirational ideal - crafting a program with a small, tight, and beautiful core - is possible if a program is intended to be an artifact.

      One definition of an artifact - a program designed to serve a specific use case in a specific point of time forever. It is crafted then left untouched.

      By contrast, software to most businesses is a living, breathing beast - we have strict time constraints to implement, modify, adjust to, and tack on features or the business dies. This business of crafting a perfect, beautiful core would require a rewrite of the entire system every time you intended to add a new feature or reinvestigate the model.

      Software engineering is, then, a process of compromising - continuously declaring that edge X is the one least likely to shoot yourself in the foot.

1. res.redirect([status,] path)

   Redirects to the URL derived from the specified path, with specified status, a positive integer that corresponds to an HTTP status code. If not specified, status defaults to 302 "Found".

   res.redirect('/foo/bar')
   res.redirect('http://example.com')
   res.redirect(301, 'http://example.com')
   res.redirect('../login')

   Redirects can be a fully-qualified URL for redirecting to a different site:

   res.redirect('http://google.com')

   Redirects can be relative to the root of the host name. For example, if the application is on http://example.com/admin/post/new, the following would redirect to the URL http://example.com/admin:

   res.redirect('/admin')

   Redirects can be relative to the current URL. For example, from http://example.com/blog/admin/ (notice the trailing slash), the following would redirect to the URL http://example.com/blog/admin/post/new.

   res.redirect('post/new')

   Redirecting to post/new from http://example.com/blog/admin (no trailing slash) will redirect to http://example.com/blog/post/new. If you found the above behavior confusing, think of path segments as directories (with trailing slashes) and files, and it will start to make sense. Path-relative
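The trailing-slash rule in the redirect examples above follows standard RFC 3986 relative-reference resolution, which can be checked with Python's urljoin (used here only to demonstrate the path semantics, not Express itself):

```python
from urllib.parse import urljoin

# Trailing slash: the last segment is treated as a directory,
# so the relative path is appended inside it.
print(urljoin("http://example.com/blog/admin/", "post/new"))
# http://example.com/blog/admin/post/new

# No trailing slash: the last segment is treated as a file
# and is replaced by the relative path.
print(urljoin("http://example.com/blog/admin", "post/new"))
# http://example.com/blog/post/new

# A leading slash resolves against the root of the host.
print(urljoin("http://example.com/admin/post/new", "/admin"))
# http://example.com/admin
```

This mirrors the "directories (with trailing slashes) and files" mental model the documentation suggests.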
1. On the place of chiropractic, let us quote Wikipedia: the World Federation of Chiropractic (WFC) has been a member of the WHO since 1993; chiropractic has been recognized as a complementary health profession by the International Olympic Committee since 1992; as of 2009, chiropractic is the third-largest health profession in the United States, after general medicine and dental surgery; in France, chiropractic has been recognized since the law of 4 March 2002, and the practice is attached to the public health code by article 75 as a health profession. End of quotation from Wikipedia.
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2023-02218R

Corresponding author(s): Steven McMahon

      1. General Statements [optional]

We were pleased to receive the encouraging critiques and very much appreciate the Reviewers' specific comments and suggestions. In this revised version of our manuscript, we have made a number of substantive additions and modifications in response to these comments/suggestions. We hope you agree that the study is now improved to the point where it is suitable for publication.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

Summary

This study describes efforts to characterize differences in the roles of the two related human decapping factors Dcp1a and Dcp1b by assessing mRNA decay and protein associations in knockdown and knockout cell lines. The authors conclude that these proteins are non-redundant based on the observations that loss of Dcp1a versus Dcp1b impacts the decapping complex (interactome) and the transcriptome differentially.

Major comments

• While the experiments appear to be well designed and executed and the data of generally high quality, the conclusions are drawn without sufficient consideration of the fact that these two proteins form a heterotrimeric complex. The authors assume that there are distinct homotrimeric complexes rather than a single complex with both proteins in it. Homotrimers may have new/different functions not normally seen when both proteins are expressed. Thus, while it is acceptable to infer that the functions of these two proteins within the decapping complex are distinct, it is not clear that they act separately, or that complexes naturally exist without one or the other. A careful evaluation of the relative ratios of Dcp1a and Dcp1b, overall and in decapping complexes, would be informative if the authors want to make stronger statements about the roles of these two factors.

      RESPONSE: Thank you for this valuable comment. We have substantially edited the manuscript to incorporate these points. Examples include a detailed analysis of iBAQ values for the DDX6, DCP1a, and DCP1b interactomes (which now allows us to estimate the ratios of DCP1a and DCP1b in these complexes) and cellular fractionation to interrogate complex integrity (using Superose 6).
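      For readers unfamiliar with iBAQ-based stoichiometry estimates, the underlying arithmetic can be sketched as follows. This is a hedged illustration only - the helper function and all numbers are invented, not the authors' pipeline - but it shows how iBAQ values (summed peptide intensity divided by the number of theoretically observable tryptic peptides) yield an approximate molar ratio of DCP1a to DCP1b in a complex.

```python
# Hypothetical illustration of iBAQ-based stoichiometry.
# iBAQ divides a protein's summed peptide intensity by its number of
# theoretically observable tryptic peptides, giving a rough per-molecule
# abundance proxy. All numbers below are invented.

def ibaq(summed_intensity: float, n_theoretical_peptides: int) -> float:
    """Intensity-based absolute quantification (iBAQ) value."""
    return summed_intensity / n_theoretical_peptides

# Invented example intensities and peptide counts for the two paralogs
dcp1a = ibaq(summed_intensity=6.0e9, n_theoretical_peptides=40)
dcp1b = ibaq(summed_intensity=1.5e9, n_theoretical_peptides=30)

ratio = dcp1a / dcp1b  # estimated molar ratio of DCP1a to DCP1b
print(f"DCP1a:DCP1b ~ {ratio:.1f}:1")  # -> DCP1a:DCP1b ~ 3.0:1
```

      In practice such ratios would be averaged across replicates and read only as order-of-magnitude estimates of relative stoichiometry.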

      • The concept of buffering is not adequately introduced, and the interpretation of the observation that RNAs with increased half-life do not show increased protein abundance - that Dcp1a/b are involved in transcript buffering - is nebulous. In order to support this interpretation, the mRNA abundances (NOT protein abundances) should be assessed, and even then, there is no way to rule out indirect effects.

      RESPONSE: Thank you for this comment. In the revised version of the manuscript, we introduced the concept of transcript buffering at an earlier stage as one of the potential explanations for our findings. We were also able to use a new algorithm (grandR) to estimate half-lives and synthesis rates from our data. These new data add strength to the argument that DCP1a and DCP1b are linked to transcript buffering pathways.

      • It might be interesting to see what happens when both factors are depleted to get an idea of the overall importance of each one.

      RESPONSE: In our work we tried to emphasize the differences between the two paralogs. We believe that a double knockout or knockdown would mask the distinct impacts of the paralogs. In data not included in this study, we have shown that cells lacking both DCP1a and DCP1b are viable. We did check PARP cleavage in the CRISPR-generated cell pools of DCP1a KO, DCP1b KO, and the double KO. The WB measuring the PARP cleavage is shown in the supplemental material (Supplementary Material: Replicates).

      • The algorithms etc. used for data analysis should be included at the time of publication. Version numbers and settings used for SMART to define protein domains, and for Webgestalt, should be indicated.

      RESPONSE: We apologize for this oversight. Version numbers and settings used for the webtools (SMART, Webgestalt) are now included. The analysis pipeline for half-life and synthesis-rate estimation, as well as all the files and the code needed to generate the figures in the paper, are available on Zenodo (https://zenodo.org/records/10725429).

      • Statistical analysis is not provided for the IP experiments, the number of replicates performed is not indicated, and quantification of KD efficiency is not provided.

      RESPONSE: The number of replicates performed in each experiment is now clearly indicated and quantifications of knockdown efficiency are provided (Supplemental Figure 3A and 3B, Figure 3A, Figure 3B).

      • The possibility that the IP Antibody interferes with protein-protein interactions is not mentioned.

      RESPONSE: Thank you for this comment. The revised manuscript includes a discussion of the antibody epitope location and the potential for impact on protein-protein interactions.

      Minor comments • P4 - "This translational repression of mRNA associated with decapping can be reversed, providing another point at which gene expression can be regulated (21)" - implies that decapping can be reversed or that decapped RNAs are translated. I don't think this is technically true.

      RESPONSE: There have been several studies that document the reversal of decapping. These findings are summarized in the following reviews.

      Schoenberg, D. R., & Maquat, L. E. (2009). Re-capping the message. Trends in biochemical sciences, 34(9), 435-442.

      Trotman, J. B., & Schoenberg, D. R. (2019). A recap of RNA recapping. Wiley Interdisciplinary Reviews: RNA, 10(1), e1504.

      • P11 - how common is it for higher eukaryotes to have 2 DCP genes? *RESPONSE: Metazoans have 2 DCP1 genes. *

      • Fig S1 - says "mammalian tissues" in the text but the data is all human. The statement that "expression analyses revealed that DCP1a and DCP1b have concordant rather than reciprocal expression patterns across different mammalian tissues (Supplemental Figure 1)" is a bit misleading as no evidence for correlation or anti-correlation is provided. Also co-expression is not strong support for the idea that these genes have non-redundant functions. Both genes are just expressed in all tissues - there's no evidence provided that they are concordantly expressed. In bone marrow it may be worth noting that one is high and the other low - i.e. reciprocal. *RESPONSE: We appreciate this comment. We have corrected the interpretation of the aforementioned dataset. We have also incorporated a more detailed discussion in the text of the paper. As the Reviewer pointed out, there are a subset of tissues where their expression appears to be reciprocal. *

      • Fig 1A - it is not clear what the different colors mean. Does Sc DCP1 have 1 larger EVH or 2 distinct ones. Are the low complexity regions in Sc DCP2 the SLiMs. *RESPONSE: Thank you for this comment. We have corrected this ambiguity to reflect that Sc DCP1 has one EVH1 domain that is interconnected by a flexible hinge. The low-complexity regions typically contain short linear motifs (SLiMs); however, not all low-complexity regions have been verified to contain them. In the figure, only low-complexity regions are shown. The text of the paper refers only to verified SLiMs.*

      • P11 - why were HCT116 cells selected? RESPONSE: HCT116 cells are an easily transfectable human cell line and have been widely used in biochemical and molecular studies, including studies of mRNA decapping (see references below). Since decapping is impacted by viral proteins we avoided the use of other commonly used cell models such as HEK293T or HeLa.

      https://pubmed.ncbi.nlm.nih.gov/?term=decapping+hct116&sort=date&size=200

      • Fig 1B - what are the asterisks by the RNA names? Might be worth noting that over-expression of DCP1b reduced IP of DCP1a. There's no quantification and no indication of the number of times this experiment was repeated. Data from replicates and quantification of the knockdown efficiency in each replicate would be nice to see. *RESPONSE: Thank you for this comment. Asterisks indicate that those bands were from a second gel, as DCP1a and DCP1b run at approximately the same molecular weight. We have now included a note in our figure legend to indicate this. The knockdown efficiency is provided (Figure 3 and Supplemental Figure 3). We also noted the number of replicates for each IP in Figure 1. The replicates are provided as supplementary material (Supplementary Materials: Replicates).*

      • Fig 1C/1D - why are there 3 bands in the DCP1a blot? Quantification of the IP bands is necessary to say whether there is an effect or not of over-expression/KO. RESPONSE: The additional bands in DCP1a blots are background. When we stained the whole blot for DCP1a in cells with a complete DCP1a KO (clone A3), these bands still appear (Supplementary Material: Validation of the KO clones). Quantifications of the bands in the overexpression experiments are now provided.

      • Fig 3 - is it possible that differences are due to epitope positions for the antibodies used for IP? RESPONSE: We do not believe so. The DCP1a antibody binds roughly residues 300-400 of DCP1a, and the DCP1b antibody binds around Val202. The antibodies therefore do not bind the DCP1a or DCP1b low-complexity regions (which are largely responsible for interactions with the decapping complex interactome). The antibodies also do not bind the EVH1 domains or the trimerization domain, which are needed for their interaction with DCP2 and each other.

      • Fig 5A - the legend doesn't match the colors in the figure. It is not clear how the p[...]

      *RESPONSE: Thank you for this comment. We have corrected this issue in the revised version of the paper. High-confidence proteins are those with p[...]*

      *RESPONSE: Thank you for this comment. We have corrected this issue in the revised version of the paper.*

      • There are a few more recent studies on buffering that should be cited and more discussion of this in the introduction is necessary if conclusions are going to be drawn about buffering. *RESPONSE: We have included a discussion of transcript buffering in the introduction. *

      • The heatmaps in figure 2 are hard to interpret. RESPONSE: To clarify the heatmaps, we included a more detailed description in the figure legends, have enlarged the heatmaps themselves, and have added more extensive labeling.

      Reviewer #1 (Significance (Required)):

      • Strengths: The experiments appear to be done well and the datasets should be useful for the field.

      • Limitations: The results are overinterpreted - different genes are affected by knocking down one or other of these two similar proteins but this does not really tell us all that much about how the two proteins are functioning in a cell where both are expressed.

      • Audience: This study will appeal most to a specialized audience consisting of those interested in the basic mechanisms of mRNA decay. Others may find the dataset useful.

      • This study might complement and/or be informed by another recent study in BioRxiv - https://doi.org/10.1101/2023.09.04.556219

      • My field of expertise is mRNA decay - I am qualified to evaluate the findings within the context of this field. I do not have much experience of LC-MS/MS and therefore cannot evaluate the methods/analysis of this part of the study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors provide evidence that Dcp1a and Dcp1b - two paralogous proteins of the mRNA decapping complex - may have divergent functions in a cancer cell line. In the first part, the authors show that interaction of Dcp2 with EDC4 is diminished upon depletion of Dcp1a but not affected by depletion of Dcp1b. The results have been controlled by overexpression of Dcp1b as it may be a limiting factor (i.e. expression levels too low to compensate for depletion of Dcp1a: reduced interaction with EDC3/4, while depletion of Dcp1b led to the opposite and increased interactions). They then defined the protein interactome of DDX6 in parental and Dcp1a- or Dcp1b-depleted cells. Here, the authors show some differential association with EDC4 again, which is in line with the results shown in the first part. The authors further performed SLAM-seq and identified subsets of mRNAs whose decay rates are common but also different upon depletion of Dcp1a and Dcp1b. Interestingly, it seems that Dcp1a preferentially targets mRNAs for proteins regulating lymphocyte differentiation. To further test whether changes in RNA decay rates are also reflected at the protein level, they finally performed an MS analysis with Dcp1a/b-depleted cells. However, no significant overlap with mRNAs showing altered stability could be observed, and the authors suggested that the lack of congruence reflects translational repression.

      Major comments: 1. While functional differences between Dcp1a and Dcp1b are interesting and likely true, there are overinterpretations that need correction or further evidence for support. Sentences like "DCP1a regulates RNA cap binding proteins association with the decapping complex and DCP1b controls translational initiation factors interactions (Figure 2E)" sound misleading. While differential association with proteins has been recognised in the MS data, it does not necessarily imply an active process of control/regulation. To make the claim on 'control/regulation', an inducible system or introduction of mutants would be required.

      RESPONSE: This set of comments were particularly useful in helping us refine the presentation of our findings. We have edited our manuscript to be more specific about the limits of our data.

      2. The MS analysis is not clearly described in the text and it is unclear how the authors selected high-confidence proteins. The reader needs to consult the supplemental tables to find out what controls were used. Furthermore, the authors should show correlation plots of MS data between replicates. For instance, there seems to be limited correlation among some of the replicates (e.g. Dcp1b_ko3 sample, Fig. 2c). Any explanation for this variance?

      *RESPONSE: We have now included a clear description of how all high-confidence proteins were selected in the Methods and Results sections. The revised manuscript also includes a more thorough description of the controls used and the number of replicates for individual experiments. The PCA plots have now been included where appropriate. The variance in this sample is likely technical. *

      3. GO analysis for the proteome analysis should consider the proteome and not the genome as the background. The authors should also indicate the corrected P-values (multiple-testing FDRs).

      *RESPONSE: Webgestalt uses a reference set of IDs to recognize the input IDs, and it does not use it for the background analysis in the classical sense. We repeated a subset of our proteome analyses using the 'genome-protein coding' as background and obtained the same result as in our original analysis. All ontology analyses now include raw p-values and/or FDRs when appropriate. *

      4. Fig 2E. The figure displaying GO enrichments needs better explanation, and additional data should be added. The enrichment ratio is not explained (is this normalised?), and p-values, FDRs, and the number of proteins in the respective GO category should be added. *RESPONSE: More thorough explanations of the GO enrichments are now included. The supplemental data contain all p-values (raw and adjusted), as well as the number of proteins in each GO category. GO analyses are now displayed with p-values and/or FDR values. The enrichment ratio is normalized: it reflects the number of proteins from our input set found in a GO group relative to the number expected, and it accounts for proteins that are shared across multiple groups. The network analysis shows the FDR values and the number of proteins found in the groups compared.*
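      For readers unfamiliar with over-representation analysis, the quantities at stake here can be sketched as follows. This is a generic sketch of the standard enrichment ratio (observed over expected) and one-sided hypergeometric test, not Webgestalt's exact implementation; all counts are invented.

```python
# Generic over-representation analysis sketch (not Webgestalt's code).
from math import comb

def enrichment(n_hits, n_input, n_category, n_background):
    """Observed/expected ratio for a GO category.

    n_hits:       input proteins annotated to the category
    n_input:      size of the input list
    n_category:   background proteins annotated to the category
    n_background: size of the background (proteome or genome)
    """
    expected = n_input * n_category / n_background
    return n_hits / expected

def hypergeom_pval(n_hits, n_input, n_category, n_background):
    """P(X >= n_hits) when drawing n_input proteins without replacement."""
    total = comb(n_background, n_input)
    tail = 0
    for k in range(n_hits, min(n_input, n_category) + 1):
        tail += comb(n_category, k) * comb(n_background - n_category, n_input - k)
    return tail / total

# Invented numbers: 8 of 200 input proteins fall in a category covering
# 100 of a 10,000-protein background (expected = 2, so ratio = 4).
print(enrichment(8, 200, 100, 10_000))      # -> 4.0
print(hypergeom_pval(8, 200, 100, 10_000))  # small, i.e. enriched
```

      This also illustrates why the choice of background matters: shrinking `n_background` from the genome to the detected proteome lowers the expected count and changes both the ratio and the p-value.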

      Minor: 5. These studies were performed in a colorectal carcinoma cell line (HCT116). The authors should justify the choice of this specialised cell line. Furthermore, one wonders whether similar conclusions can be drawn with other cell lines or whether findings are specific to this cancer line.

      RESPONSE: The study that is currently in pre-print on BioRxiv (https://doi.org/10.1101/2023.09.04.556219) utilized HEK293T cells and found similar results to ours when examining the various relationships between the core decapping complex members.

      6. Fig. 1B. It is unclear what DCP1b* refers to. There are bands of different size that are not mentioned by the authors - are those protein isoforms, or what are they referring to? A molecular marker should be added to each blot. Uncropped Western images and markers should be provided in the Supplement. *RESPONSE: The asterisk indicates that these images came from a second western blot gel (DCP1a and DCP1b have a similar molecular weight and cannot be probed on the same membrane). Uncropped western blot images and markers (as available) are provided in the supplement.*

      7. MS data should be submitted to a public repository with the accession no. indicated in the manuscript.

      RESPONSE: MS data are submitted as supplementary datasets to the paper. They contain the analyzed data as well as the LC-MS/MS output. We are in the process of submitting the raw LC-MS/MS data to a public repository.

      Fig 3. A Venn Diagram displaying the overlap of identified proteins should be added. GO analysis should be done considering the proteome as background (as mentioned above).

      *RESPONSE: A Venn diagram showing the overlap among the proteins identified is now included in the revised version. *

      Reviewer #2 (Significance (Required)):

      Overall, this is a large-scale integrative -omics study that suggests functional differences between Dcp1 paralogues. While it seems clear that both paralogues have some different functions and impact, there are overinterpretations in place and further evidence would need to be provided to substantiate conclusions made in the paper. For instance, while the interactions of Dcp2/Ddx6 with EDC4/3 in the absence of Dcp1a/b may be altered (Fig. 1, 2), the functional implications of these changed associations remain unresolved and are not further discussed. As such, it remains somewhat disconnected from the following experiments and compromises the flow of the study. The observed differences in decay rates for distinct functionally related sets of mRNAs are interesting; however, it remains unclear whether those are direct or rather indirect effects. This is further obscured by the absence of any correlation to changes in protein levels, which the authors interpreted as 'transcriptional buffering'. In this regard, it is puzzling how the authors can make a statement about transcriptional buffering. While this may be an interesting aspect and concept for the discussion, there is no primary data showing such a functional impact.

      As such, the study is interesting as it claims functional differences between DCP1a/b paralogues in a cancer cell line. Nevertheless, I am not sure how trustworthy the MS analysis and decay measurements are, as there is no further validation. It would be interesting if the authors could go a bit further and draw some hypotheses as to how the selectivity could be achieved, i.e. interaction with RNA-binding proteins that may add some specificity towards the target RNAs for differential decay. As such, the study remains unfortunately rather descriptive without further functional insight.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Review on "Non-redundant roles for the human mRNA decapping cofactor paralogs DCP1a and DCP1b" by Steven McMahon and co-workers

      mRNA decay is a critical step in the regulation of gene expression. In eukaryotes, mRNA turnover typically begins with the removal of the poly(A) tail, followed by either removal of the 5' cap structure or exonucleolytic 3'-5' decay catalyzed by the exosome. The decapping enzyme DCP2 forms a complex with its co-activator DCP1, which enhances decapping activity. Mammals are equipped with two DCP1 paralogs, namely DCP1a and DCP1b. Metazoans' decapping complexes feature additional components, such as enhancer of decapping 4 (EDC4), which supports the interaction between DCP1 and DCP2, thereby amplifying the efficiency of decapping. This work focuses on DCP1a and DCP1b and investigates their distinct functions. Using DCP1a- and DCP1b-specific knockdowns as well as KO cell lines, the authors find surprising differences between the DCP1 paralogs. While DCP1a is essential for the assembly of EDC4-containing decapping complexes and interactions with mRNA cap binding proteins, DCP1b mediates interactions with the translational machinery. Furthermore, DCP1a and DCP1b target different mRNAs for degradation, indicating that they execute non-overlapping functions. The findings reported here expand our understanding of mRNA decapping in human cells, shedding light on the unique contributions of DCP1a and DCP1b to mRNA metabolism. The manuscript tackles an interesting subject. Historically, the emphasis has been on studying DCP1a, while DCP1b has been deemed a functionally redundant homolog of DCP1a. Therefore, it is commendable that the authors have taken on this topic and, with the help of knockout cell lines, aimed to dissect the function of DCP1a and DCP1b. Despite recognizing the significance of the subject and approach, the manuscript falls short of persuading me.
      Following a promising start in Figure 1 (which still has room for improvement), there is a distinct decline in overall quality, with only relatively standard analyses being conducted. However, I do not want to give the authors detailed advice on maximizing the potential of their data and presenting it convincingly. So, here are just a few key points for improvement: Figure 1C: Upon closer examination, a faint band is still visible at the size of DCP1a in the DCP1a knockout cells. Could this be leaky expression of DCP1a? The authors should provide an in-depth characterization of their cells (possibly as supplementary material), including identification of genomic changes (e.g. by sequencing of the locus) and Western blots with longer exposure, etc.

      *RESPONSE: Thank you for this comment. The in-depth characterization of our cells is now included in the Supplementary Material. DCP1a KO cells and DCP1b KO cells indicated as single cell clones have been confirmed to have no DCP1a or DCP1b expression. In Figure 1D and Figure 3, polyclonal pool cells were used as indicated (only for DCP1a KO). *

      Figure 2: It is great to see that the effects of the KOs are also visible in the DDX6 immunoprecipitation. However, I wonder if the IP clearly confirms that the KO cells indeed do not express DCP1a or DCP1b. In the heatmap in Figure 2B, it appears as if the proteins are only reduced by a log2-fold change of approximately 1.5? Additionally, Figure 2 shows a problem that persists in the subsequent figures. The visual presentation is not particularly appealing, and essential details, such as the scale of the heatmap in 2B (is it log2 fold?), are lacking.

      *RESPONSE: The in-depth characterization of our cells is included in the Supplementary Materials and confirms the presence of single-cell clones where indicated. As noted above, only Figure 1D and Figure 3 used DCP1a KO pooled cells. The heatmap in Figure 2B is scaled by row using the pheatmap function in R. The underlying data for the heatmap are protein intensities from the LC-MS/MS analysis. We have improved the visual presentation in the revised manuscript.*
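      For reference, row scaling of this kind can be sketched as follows. This is a minimal stand-in for pheatmap's scale = "row" behavior (each row is centered and divided by its standard deviation, so colors compare a protein across samples rather than across proteins); the intensity values are invented.

```python
# Minimal sketch of per-row z-scoring, as done by pheatmap with
# scale = "row". Rows are proteins, columns are samples.
from statistics import mean, stdev

def scale_rows(matrix):
    scaled = []
    for row in matrix:
        m, s = mean(row), stdev(row)
        scaled.append([(x - m) / s for x in row])
    return scaled

# Invented protein intensities: note how two proteins with very
# different absolute levels end up on the same color scale.
intensities = [
    [10.0, 12.0, 14.0],
    [100.0, 90.0, 110.0],
]
print(scale_rows(intensities))  # -> [[-1.0, 0.0, 1.0], [0.0, -1.0, 1.0]]
```

      This is also why a row-scaled heatmap cannot show the full depletion of a knocked-out protein: the scaling removes absolute abundance, which may explain the modest apparent changes the reviewer noted in Figure 2B.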

      Figure 3: I wonder why there are no primary data shown here, only processed GO analyses. Wouldn't one expect that DCP2 interacts mainly with DCP1a, but less with DCP1b? Is this visible in the data? Moreover, such analyses are rather uninformative (as reflected in the GO terms themselves, for instance, "oxoglutarate dehydrogenase complex" doesn't provide much meaningful insight). The authors should rather try to derive functional and mechanistic insights from their data.

      RESPONSE: We have now revised this Figure to include primary data as well as the IP of DCP1a in DCP1b KO cells (single-cell clones) and the IP of DCP1b in DCP1a KO cells (pooled cells). We identified EDC3 in the high-confidence protein pool. The EDC3:DCP1a interaction is enhanced in DCP1b KO cells. We also found that the EDC3:DCP1b interaction is less abundant in DCP1a KO cells. This is consistent with our data in Figures 1 and 2. DCP2 was not identified in the interactomes of either DCP1a or DCP1b. This is not unusual, as DCP2 is highly flexible and the association between the DCP1 paralogs and DCP2 is transient and facilitated by other proteins.

      In Fig. 4 the potential of the approach is not fully exploited. Firstly, I would advocate for omitting the GO analyses, as, in my opinion, they offer little insight. Again, crucial information is missing to assess the results. While 75 nt reads are mentioned in the methods, the sequencing depth remains unspecified. Figure 4b should be included in the supplements. Furthermore, I strongly recommend concentrating on insights into the mechanisms of DCP1a and DCP1b-containing complexes. E.g. what characteristics distinguish DCP1a and DCP1b-dependent mRNAs? Are these targets inherently unstable? Why are they degraded? Are they known decapping substrates?

      *RESPONSE: Thank you for this comment. We have now revised this figure and have included information about sequencing depth and other pertinent information. We have been able to use a newly available algorithm (grandR) and were able to estimate half-lives and synthesis rates. This is a significant addition to the paper. We were also able to compare significantly impacted mRNAs (by DCP1a or DCP1b loss) to the established DCP2 target list. *
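      The first-order kinetics underlying such estimates can be sketched as follows. This is a hedged illustration of the standard model that metabolic-labeling tools like grandR build on, not grandR's actual code; all numbers are invented. With 4sU labeling for time t, the new-to-total RNA ratio (NTR) of a transcript at steady state follows NTR = 1 - exp(-k*t), which yields the degradation rate k, the half-life ln(2)/k, and a steady-state synthesis rate s = k * abundance.

```python
# Sketch of first-order RNA turnover kinetics (not grandR's code).
from math import log

def decay_rate(ntr: float, t: float) -> float:
    """Degradation rate constant k from the new-to-total ratio after time t."""
    return -log(1.0 - ntr) / t

def half_life(k: float) -> float:
    return log(2.0) / k

def synthesis_rate(k: float, steady_state_level: float) -> float:
    """At steady state, synthesis balances decay: s = k * abundance."""
    return k * steady_state_level

# Invented example: half the reads are labeled after a 2 h pulse
k = decay_rate(ntr=0.5, t=2.0)       # = ln(2)/2 per hour
print(round(half_life(k), 3))        # -> 2.0
print(round(synthesis_rate(k, 100.0), 3))  # -> 34.657
```

      The same relations show why a longer half-life with unchanged abundance implies a lower synthesis rate, which is the arithmetic behind the transcript-buffering interpretation discussed above.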

      In general, I suggest the authors revise the manuscript with a focus on the potential readers. Reduce Gene Ontology (GO) analyses and heatmaps, and instead, incorporate more analyses regarding the molecular processes associated with the different decapping complexes.

      *RESPONSE: We removed selected GO analyses and heatmaps from the main body of the manuscript (included as Supplementary Figures instead). For our LC-MS/MS datasets, we added iBAQ analyses of the DDX6 IP, DCP1a IP, and DCP1b IP in the control conditions. Cellular fractionation studies (using Superose 6 chromatography) were also added to the paper and allow us to interrogate decapping complex composition in more detail. The revised version of the manuscript includes a new 4SU labeling experiment (pulse-chase) as well as estimation of half-lives and synthesis rates in our conditions. Also included is relevant information about DCP1b transcriptional regulation. *

      Reviewer #3 (Significance (Required)):

      The manuscript in its current form could benefit from substantial revisions for it to be considered impactful for researchers in the field.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      I have trialled the package on my lab's data and it works as advertised. It was straightforward to use and did not require any special training. I am confident this is a tool that will be approachable even to users with limited computational experience. The use of artificial data to validate the approach - and to provide clear limits on applicability - is particularly helpful.

      The main limitation of the tool is that it requires the user to manually select regions. This somewhat limits the generalisability and is also more subjective - users can easily choose "nice" regions that better match with their hypothesis, rather than quantifying the data in an unbiased manner. However, given the inherent challenges in quantifying biological data, such problems are not easily circumventable.


      I have some comments to clarify the manuscript:

      1. A "straightforward installation" is mentioned. Given this is a Method paper, the means of installation should be clearly laid out.

      This sentence is now modified. In the revised manuscript we now describe how to install the toolset and we give the link to the toolset website if further information is needed. On this website, we provide a full video tutorial and a user manual. The user manual is provided as supplementary material of the manuscript.

      2. It would be helpful if there was an option to generate an output with the regions analysed (i.e., a JPG image with the data and the drawn line(s) on top). There are two reasons for this: i) a major problem with user-driven quantification is accidental double counting of regions (e.g., a user quantifies a part of an image and then later quantifies the same region); ii) it allows other users to independently verify measurements at a later time.

      We agree that it is helpful to save the analyzed regions. To answer this comment and the other two reviewers' comments pointing at a similar feature, we have now included automatic saving of the regions of interest. The user will be able to reopen saved regions of interest using a new function we included in the new version of PatternJ.

      3. Related to the above point, it is highlighted that each time point would need to be analysed separately (lines 361-362). It seems like it should be relatively straightforward to allow a function where the analysis line can be mapped onto the next time point. The user could then adjust slightly for changes in position, but still be starting from near the previous time point. Given how prevalent timelapse imaging is, this (or something similar) seems like a clear benefit to add to the software.

      We agree that the analysis of time series images can be a useful addition. We have added the analysis of time-lapse series in the new version of PatternJ. The principles behind the analysis of time-lapse series and an example of such analysis are provided in Figure 1 - figure supplement 3 and Figure 5, with accompanying text lines 140-153 and 360-372. The analysis includes a semi-automated selection of regions of interest, which will make the analysis of such sequences more straightforward than having to draw a selection on each image of the series. The user is required to draw at least two regions of interest in two different frames, and the algorithm will automatically generate regions of interest in frames in which selections were not drawn. The algorithm generates the analysis immediately after selections are drawn by the user, which includes the tracking of the reference channel.
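      The propagation of selections between user-drawn keyframes can be illustrated with a simple linear interpolation of line endpoints. This is an illustrative sketch only, not PatternJ's actual implementation; the function and coordinates are invented.

```python
# Illustrative sketch: propagating a line selection drawn in two
# keyframes to the frames in between by linearly interpolating its
# (x, y) endpoints. Not PatternJ's actual code.
def interpolate_selection(p_start, p_end, frame, frame_start, frame_end):
    """Linearly interpolate an (x, y) endpoint between two keyframes."""
    f = (frame - frame_start) / (frame_end - frame_start)
    return tuple(a + f * (b - a) for a, b in zip(p_start, p_end))

# Endpoint drawn at (10, 20) in frame 0 and (20, 40) in frame 10;
# the generated selection for frame 5 sits halfway between them.
print(interpolate_selection((10.0, 20.0), (20.0, 40.0), 5, 0, 10))  # -> (15.0, 30.0)
```

      Applying this to both endpoints of each drawn line yields a candidate selection per intermediate frame, which the user can then nudge for drift, matching the semi-automated workflow described above.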

      4. Lines 134-135. The level of accuracy of the searching should be clarified here. This is discussed later in the manuscript, but it would be helpful to give readers an idea at this point of what level of tolerance the software has to noise and aperiodicity.


      We agree with the reviewer that a clarification of this part of the algorithm will help the user better understand the manuscript. We have modified the sentence to clarify the range of search used and the resulting limits on aperiodicity (now lines 176-181). Regarding the tolerance to noise, it is difficult to estimate a priori from the choices made at the algorithm stage, so we prefer to leave it to the validation part of the manuscript. We hope this solution satisfies the reviewer and future users.
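      The idea of a bounded search range can be illustrated with a toy sketch (not PatternJ's code): the next feature is only sought within a window around the position predicted from the expected spacing, which bounds both the aperiodicity tolerated and the influence of distant noise. The profile values below are invented.

```python
# Toy sketch of a bounded periodic search (not PatternJ's code).
def next_peak(profile, predicted, window):
    """Return the index of the brightest pixel within +/- window of the
    position predicted from the expected pattern spacing."""
    lo = max(0, predicted - window)
    hi = min(len(profile), predicted + window + 1)
    return lo + max(range(hi - lo), key=lambda i: profile[lo + i])

# Invented 1D intensity profile with peaks roughly every 4 px
profile = [0, 1, 9, 1, 0, 0, 8, 1, 0, 0, 7, 0]
# From the peak at index 2, predict the next at 2 + 4 = 6, +/- 1 px
print(next_peak(profile, predicted=6, window=1))  # -> 6
```

      The window size directly sets the aperiodicity tolerated: a feature drifting further than the window from its predicted position is missed, which is the kind of limit the revised text now states explicitly.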


      **Referees cross-commenting**

      I think the other reviewer comments are very pertinent. The authors have a fair bit to do, but they are reasonable requests. So, they should be encouraged to do the revisions fully so that the final software tool is as useful as possible.

      Reviewer #1 (Significance (Required)):

      Developing software tools for quantifying biological data that are approachable for a wide range of users remains a longstanding challenge. This challenge is due to: (1) the inherent problem of variability in biological systems; (2) the complexity of defining clearly quantifiable measurables; and (3) the broad spread of computational skills amongst likely users of such software.

      In this work, Blin et al. develop a simple plugin for ImageJ designed to quickly and easily quantify regular repeating units within biological systems - e.g., muscle fibre structure. They clearly and fairly discuss existing tools, with their pros and cons. The motivation for PatternJ is properly justified (which is sadly not always the case with such software tools).

      Overall, the paper is well written and accessible. The tool has limitations but it is clearly useful and easy to use. Therefore, this work is publishable with only minor corrections.

      We thank the reviewer for the positive evaluation of PatternJ and for pointing out its accessibility to users.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      # Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      # Major comments

      In our view the article and also the software could be improved in terms of defining the scope of their applicability and intended users. In many parts the article and software suggest that general biological patterns can be analysed, but in other parts very specific muscle/actin wording is used. We point this out in the "Minor comments" section below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles; in this case we would recommend also renaming the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend removing all actin-specific wording from the macro tool set, and the article should in parts be slightly re-written.


We agree with the reviewer that our initial manuscript used a mix of general and muscle-oriented vocabulary, which could make the use of PatternJ confusing, especially outside of the muscle field. To make PatternJ useful to the largest community, we have revised both the manuscript and the PatternJ toolset to use the general vocabulary needed to make them understandable to every biologist.

# Minor/detailed comments

      # Software

      We recommend considering the following suggestions for improving the software.

      ## File and folder selection dialogs

In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For novel users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.

With the current version of macOS, we observed that the file-browser dialog does not display any message; we suspect this is the issue raised by the reviewer. This has been a known issue with Fiji on macOS, and indeed with all macOS applications, since 2016. We provide guidelines in the user manual and in the tutorial video for correcting it by changing a parameter in Fiji. Given the trouble the reviewer had accessing the material on the PatternJ website, for which we apologize, we understand the issue raised, and we added an extra warning on the PatternJ website pointing to this problem and its solution. Additionally, we have limited the appearance of the file-browser dialog to what is strictly necessary, so the user will encounter fewer prompts, speeding up the analysis.


      ## Extract button

The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations.

We agree that this muscle-oriented vocabulary can make the use of PatternJ confusing. We have now corrected the user interface to provide both general and muscle-specific vocabulary ("center-to-center or edge-to-edge (M-line-to-M-line or Z-disc-to-Z-disc)").

      ## Manual selection accuracy

The 1st step of the analysis is always to start from a user hand-drawn profile across intensity patterns in the image. However, this step can cause inaccuracy that varies with the shape and curve of the line profile drawn. If not strictly perpendicular to for example the M line patterns, the distance between intensity peaks will be different. This will be more problematic when dealing with non-straight and parallelly poised features in the image. If the structure is bent with a curve, the line drawn over it also needs to reproduce this curve, to precisely capture the intensity pattern. I found this limits the reproducibility and easy-usability of the software.

We understand the reviewer's concern. On curved selections this is a difficult issue to solve, especially for "S"-shaped or more complex selections, and the user will have to be very careful in these situations. On straight samples, the issue may look concerning at first sight, but the error grows only with the inverse of the cosine of the tilt angle and is therefore rather low. For example, if the user creates a selection off by 5 degrees, which is visually obvious, lengths will be increased by only 0.38%. The point raised by the reviewer is important to discuss, and we therefore added a paragraph commenting on the choice of selection (lines 94-98) and a supplementary figure to help make it clear (Figure 1 - figure supplement 1).
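The 1/cos relation follows from the geometry of a tilted line selection: a selection at angle θ to the pattern axis samples the pattern at spacings d/cos θ instead of d. A minimal sketch of the arithmetic (the angles are illustrative):

```python
import math

def length_error_percent(angle_deg):
    """Relative increase in measured pattern length when the line
    selection is tilted by angle_deg from the pattern axis:
    1/cos(angle) - 1, expressed as a percentage."""
    return (1.0 / math.cos(math.radians(angle_deg)) - 1.0) * 100.0

# A 5-degree tilt, already obvious by eye, inflates lengths by ~0.38%;
# a 3-degree tilt by ~0.14%.
print(f"{length_error_percent(5):.2f}%")  # prints 0.38%
```

Because the cosine is flat near zero, small hand-drawing errors in selection angle have a second-order effect on measured lengths.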

      ### Reproducibility

Since the line profile drawn on the image is the first step and very essential to the entire process, it should be considered to save together with the analysis result. For example, as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are being saved (because the save button is also just asking for a folder containing images, so I cannot verify its functionality).

We agree that this is a very useful and important feature. We have added automatic ROI saving. Additionally, we now provide a simplified import function for all ROIs generated with PatternJ, together with automated extraction and analysis of a list of ROIs. This works both with ROIs generated previously in PatternJ and with ROIs generated by other ImageJ/Fiji algorithms. These new features are described in the manuscript in lines 120-121 and 130-132.


      ## ? button

      It would be great if that button would open up some usage instructions.


      We agree with the reviewer that the "?" button can be used in a better way. We have replaced this button with a Help menu, including a simple tutorial showing a series of images detailing the steps to follow by the user, a link to the user website, and a link to our video tutorial.

## Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow, by fitting and displaying 2D lines to the band or line structure in the image, that form the "patterns" the author aims to address. Thus, it extracts geometry models from the image, and the inter-line distance, and even the curve formed by these sets of lines can be further analyzed and studied. These fitted 2D lines can be also well integrated into ImageJ as Line ROI, and thus be saved, imported back, and checked or being further modified. I think this can largely increase the usefulness and reproducibility of the software.


We hope that we understood this comment correctly. We had sent a clarification request to the editor but unfortunately did not receive an answer within the 4 weeks allotted for this revision. Our understanding is the following: instead of using our 1D approach, in which we extract positions from a profile, the reviewer suggests extracting the position of each feature not as a single point but as a series of coordinates defining its shape. If so, this is a major modification of the tool that is beyond the scope of PatternJ. We believe that keeping our tool simple makes it robust, and this is the major strength of PatternJ. Local fitting would not benefit from line averaging, for instance, which would make the tool less reliable.

# Manuscript

      We recommend considering the following suggestions for improving the manuscript. Abstract: The abstract suggests that general patterns can be quantified, however the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.


      We modified the abstract to make this point clearer.

Line 58: Gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is not mentioned nor compared. At least there's a relatively recent study on Sarcomeres structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool with publication: https://doi.org/10.1002/cpz1.462

      We thank the reviewer for making us aware of this publication. We cite it now and have added it to our comparison of available approaches.

Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!

      We have modified this sentence to avoid potential confusion (lines 76-77).


Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript.

This sentence is now modified. We now mention how to install the toolset and provide a link to the toolset website if further information is needed (lines 86-88). On the website, we provide a full video tutorial and a user manual.

Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ.

      We agree with the reviewer that this could create some confusion. We modified "multicolor" to "multi-channel".

Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"?

      We agree with the reviewer that "sarcomeric actin" alone will not be clear to all readers. We modified the text to "block with a central band, as often observed in the muscle field for sarcomeric actin" (lines 103-104). The toolset was modified accordingly.

Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.

      We agree with the reviewer that this was not clear. We rewrote this paragraph (lines 101-114) and provided a supplementary figure to illustrate these definitions (Figure 1 - figure supplement 2).

Line 124 - 147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed known by the reader. For example, how it achieved sub-pixel resolution results is not clear. One can only assume that by fitting a Gaussian to the band, the center position (peak) can be calculated from a continuous curve rather than from pixels.

Note that the two sentences introducing this description are "Automated feature extraction is the core of the tool. The algorithm takes multiple steps to achieve this (Fig. S2):". We hoped this statement was clear, but the reviewer may be referring to something else. We agree that the description of some of the steps was too brief, and we have now expanded the description where needed.
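To illustrate the sub-pixel principle for readers: when a band profile is approximately Gaussian, its logarithm is a parabola, so the peak position can be refined from three samples in closed form. A minimal sketch of this idea (illustrative only, not PatternJ's actual code):

```python
import math

def subpixel_peak(profile, i):
    """Refine the integer peak index i of a 1D intensity profile to
    sub-pixel precision with a three-point Gaussian interpolation:
    fit a parabola to the log-intensities at i-1, i, i+1."""
    lm = math.log(profile[i - 1])
    l0 = math.log(profile[i])
    lp = math.log(profile[i + 1])
    return i + 0.5 * (lm - lp) / (lm - 2.0 * l0 + lp)

# A Gaussian band sampled on a pixel grid, true centre at x = 10.3
profile = [math.exp(-0.5 * ((x - 10.3) / 2.0) ** 2) for x in range(21)]
peak = subpixel_peak(profile, max(range(21), key=lambda x: profile[x]))
```

For an exactly Gaussian band this three-point estimate recovers the true centre; on real, noisy data, fitting over more points averages out the noise at the cost of a non-linear fit.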

Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.


We are sorry for the issues encountered when downloading the tool and additional material, and we thank the reviewer for pointing out these problems, which limited the accessibility of our tool. We simplified the downloading procedure on the website; it no longer goes through the Google Drive interface nor requires a Google account. Additionally, for the coder community, the code, user manual, and examples are now available from GitHub at github.com/PierreMangeol/PatternJ, and are provided as supplementary material with the manuscript. To our knowledge, update sites work for plugins but not for macro toolsets. From our experience sharing code with non-specialists, a classical website with a tutorial video is more accessible than coder-oriented platforms, which deter many users.

Reviewer #2 (Significance (Required)):

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap in terms of providing an easy-to-use software for users without computational skills that enables the analysis of muscle sarcomere patterns. We feel that if the below mentioned limitations could be addressed the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble to access. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting some "wizard" approach, where the user is guided through the steps.

As answered above, the links on the PatternJ website are now corrected. Regarding the workflow, we now provide a Help menu with:

1. a basic set of instructions to use the tool,
2. a direct link to the tutorial video in the PatternJ toolset, and
3. a direct link to the website on which both the tutorial video and a detailed user manual can be found. We hope this addresses the issues raised by this reviewer.

Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review.

      We agree that saving ROIs is very useful. It is now implemented in PatternJ.

We are not sure what this reviewer means by "enabling IJ Macro recording". The ImageJ Macro Recorder is indeed very useful, but to our knowledge, it is limited to built-in functions. Our code is open and we hope this will be sufficient for advanced users to modify the code and make it fit their needs.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

Summary

In this manuscript, the authors present a new toolset for the analysis of repetitive patterns in biological images named PatternJ. One of the main advantages of this new tool over existing ones is that it is simple to install and run and does not require any coding skills whatsoever, since it runs on the ImageJ GUI. Another advantage is that it does not only provide the mean length of the pattern unit but also the subpixel localization of each unit and the distributions of lengths and that it does not require GPU processing to run, unlike other existing tools. The major disadvantage of PatternJ is that it requires heavy, although very simple, user input in both the selection of the region to be analyzed and in the analysis steps. Another limitation is that, at least in its current version, PatternJ is not suitable for time-lapse imaging. The authors clearly explain the algorithm used by the tool to find the localization of pattern features and they thoroughly test the limits of their tool in conditions of varying SNR, periodicity and band intensity. Finally, they also show the performance of PatternJ across several biological models such as different kinds of muscle cells, neurons and fish embryonic somites, as well as different imaging modalities such as brightfield, fluorescence confocal microscopy, STORM and even electron microscopy.

This manuscript is clearly written, and both the section and the figures are well organized and tell a cohesive story. By testing PatternJ, I can attest to its ease of installation and use. Overall, I consider that PatternJ is a useful tool for the analysis of patterned microscopy images and this article is fit for publication. However, I do have some minor suggestions and questions that I would like the authors to address, as I consider they could improve this manuscript and the tool:

We are grateful to this reviewer for this very positive assessment of PatternJ and of our manuscript.

Minor Suggestions: The methodology section is missing a more detailed description of how the plotted metric was obtained: as normalized intensity or as precision in pixels.

      We agree with the reviewer that a more detailed description of the metric plotted was missing. We added this information in the method part and added information in the Figure captions where more details could help to clarify the value displayed.

The validation is based mostly on the SNR and patterns. They should include a dataset of real data to validate the algorithm in three of the standard patterns tested.

We validated our tool using computer-generated images, in which the localization of patterns is known with certainty. This allowed us to automatically analyze 30,000 images and, with varying settings, to sometimes analyze the same image 10 times, leading to about 150,000 analyzed selections. From these analyses, we can provide with confidence an unbiased assessment of the tool's precision and its capacity to extract patterns. We already provided examples of various biological images in Figures 4-6, showing all possible features that can be extracted with PatternJ. In these examples, we can judge by eye that PatternJ extracts patterns efficiently, but we cannot know how precise these extractions are, because the "real" positions of features in biological data are unknown. Such a validation would be limited to assessing whether a pattern was found or not, which we believe the examples in Figures 4-6 already provide.

The video tutorial available on the PatternJ website is very useful, maybe it would be worth it to include it as supplemental material for this manuscript, if the journal allows it.

      As the video tutorial may have been missed by other reviewers, we agree it is important to make it more prominent to users. We have now added a Help menu in the toolset that opens the tutorial video. Having the video as supplementary material could indeed be a useful addition if the size of the video is compatible with the journal limits.

An example image is provided to test the macro. However, it would be useful to provide further example images for each of the three possible standard patterns suggested: Block, actin sarcomere or individual band.

      We agree this can help users. We now provide another multi-channel example image on the PatternJ website including blocks and a pattern made of a linear intensity gradient that can be extracted with our simpler "single pattern" algorithm, which were missing in the first example. Additionally, we provide an example to be used with our new time-lapse analysis.

Access to both the manual and the sample images in the PatternJ website should be made publicly available. Right now they both sit in a private Drive account.

      As mentioned above, we apologize for access issues that occurred during the review process. These files can now be downloaded directly on the website without any sort of authentication. Additionally, these files are now also available on GitHub.

Some common errors are not properly handled by the macro and could be confusing for the user: When there is no selection and one tries to run a Check or Extraction: "Selection required in line 307 (called from line 14). profile=getProfile();". A simple "a line selection is required" message would be useful there. When "band" or "block" is selected for a channel in the "Set parameters" window, yet a 0 value is entered into the corresponding "Number of bands or blocks" section, one gets this error when trying to Extract: "Empty array in line 842 (called from line 113). if ( ( subloc.length == 1 ) & ( subloc[0] == 0 ) ) {". This error is not too rare, since the "Number of bands or blocks" section is populated with a 0 after choosing "sarcomeric actin" (after accepting the settings) and stays that way when one changes back to "blocks" or "bands".

      We thank the reviewer for pointing out these bugs. These bugs are now corrected in the revised version.

The fact that every time one clicks on the most used buttons, the getDirectory window appears is not only quite annoying but also, ultimately, a waste of time. Isn't it possible to choose the directory in which to store the files only once, from the "Set parameters" window?

      We have now found a solution to avoid this step. The user is only prompted to provide the image folder when pressing the "Set parameter" button. We kept the prompt for directory only when the user selects the time-lapse analysis or the analysis of multiple ROIs. The main reason is that it is very easy for the analysis to end up in the wrong folder otherwise.

The authors state that the outputs of the workflow are "user friendly text files". However, some of them lack descriptive headers (like the localisations and profiles) or even file names (like colors.txt). If there is something lacking in the manuscript, it is a brief description of all the output files generated during the workflow.

PatternJ generates multiple files, several of which are internal to the toolset. They are needed to keep track of which analyses were done and which colors were used in the images, amongst other things. From the user's perspective, only the files obtained after the analysis, All_localizations.channel_X.txt and sarcomere_lengths.txt, are useful. To improve the user experience, we have moved all internal files to a folder named "internal", which we think will clarify which outputs are useful for further analysis and which are not. We thank the reviewer for raising this point, and we now mention it in our tutorial.

I don't really see the point in saving the localizations from the "Extraction" step; they are even named "temp".

      We thank the reviewer for this comment, this was indeed not necessary. We modified PatternJ to delete these files after they are used.

In the same line, I DO see the point of saving the profiles and localizations from the "Extract & Save" step, but I think they should be deleted during the "Analysis" step, since all their information is then grouped in a single file, with descriptive headers. This deleting could be optional and set in the "Set parameters" window.

We understand the point raised by the reviewer. However, the analysis depends on the reference channel, which is chosen when starting an analysis and can be augmented with additional selections. If a user chose to modify the reference channel or to add a new profile to the analysis, deleting all these files would mean starting over, which we believe would create frustration. An optional deletion at the analysis step is simple to implement, but it could create problems for users who do not understand what it means in practice.

Moreover, I think it would be useful to also save the linear roi used for the "Extract & Save" step, and eventually combine them during the "Analysis step" into a single roi set file so that future re-analysis could be made on the same regions. This could be an optional feature set from the "Set parameters" window.

      We agree with the reviewer that saving ROIs is very useful. ROIs are now saved into a single file each time the user extracts and saves positions from a selection. Additionally, the user can re-use previous ROIs and analyze an image or image series in a single step.

In the "PatternJ workflow" section of the manuscript, the authors state that after the "Extract & Save" step "(...) steps 1, 2, 4, and 5 can be repeated on other selections (...)". However, technically, only steps 1 and 5 are really necessary (alternatively 1, 4 and 5 if the user is unsure of the quality of the patterning). If a user follows this to the letter, I think it can lead to wasted time.

      We agree with the reviewer and have corrected the manuscript accordingly (line 119-120).


I believe that the "Version Information" button, although important, has potential to be more useful if used as a "Help" button for the toolset. There could be links to useful sources like the manuscript or the PatternJ website but also some tips like "whenever possible, use a higher linewidth for your line selection".

We agree with the reviewer, as noted in our answers to the other reviewers. This button is now replaced by a Help menu, including a simple tutorial with a series of images detailing the steps to follow, a link to the user website, and a link to our video tutorial.

It would be interesting to mention to what extent does the orientation of the line selection in relation to the patterned structure (i.e. perfectly parallel vs more diagonal) affect pattern length variability?

As in our answer to the similar point above, we understand this concern, which needs to be clarified for readers. The issue may look concerning at first sight, but the error grows only with the inverse of the cosine of the tilt angle and is therefore rather low. For example, if the user creates a selection off by 3 degrees, which is visually obvious, lengths will be increased by only 0.14%. The point raised by the reviewer is important to discuss, and we have therefore added a comment on the choice of selection (lines 94-98) as well as a supplementary figure (Figure 1 - figure supplement 1).

When "the algorithm uses the peak of highest intensity as a starting point and then searches for peak intensity values one spatial period away on each side of this starting point" (line 133-135), does that search have a range? If so, what is the range?

We agree that this information is useful to share with the reader. The range is one pattern size. We have modified the sentence to clarify the search range used and the resulting limits on aperiodicity (now lines 176-181).
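In outline, the search strategy quoted by the reviewer can be sketched as follows (an illustration of the described strategy, not PatternJ's actual code; the search window of one pattern size is centred on the expected position):

```python
def chain_peaks(profile, period):
    """Starting from the global intensity maximum, walk left and right
    in steps of one period, keeping the brightest pixel found within a
    window of one period centred on each expected position."""
    start = max(range(len(profile)), key=lambda i: profile[i])
    peaks = [start]
    half = period // 2
    for step in (-period, period):  # walk left, then right
        pos = start
        while 0 <= pos + step < len(profile):
            expected = pos + step
            window = range(max(0, expected - half),
                           min(len(profile), expected + half + 1))
            pos = max(window, key=lambda i: profile[i])
            peaks.append(pos)
    return sorted(peaks)

# Bright bands every 5 pixels on a dim background: the chain walk
# recovers every band position.
bands = [1.0 if i % 5 == 2 else 0.1 for i in range(20)]
```

The window size is what limits tolerable aperiodicity: a peak drifting by more than half a period from its expected position falls outside the window and breaks the chain.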

Line 144 states that the parameters of the fit are saved and given to the user, yet I could not find such information in the outputs.

The parameters of the fits are saved for blocks. We have now clarified this point by modifying the manuscript (lines 186-198) and Figure 1 - figure supplement 5. We realized we had made an error in the description of how the edges of "block with middle band" patterns are extracted. This is now corrected.

In line 286, authors finish by saying "More complex patterns from electron microscopy images may also be used with PatternJ.". Since this statement is not backed by evidence in the manuscript, I suggest deleting it (or at the very least, providing some examples of what more complex patterns the authors refer to).

      This sentence is now deleted.

In the TEM image of the fly wing muscle in fig. 4 there is a subtle but clearly visible white stripe pattern in the original image. Since that pattern consists of 'dips', rather than 'peaks' in the profile of the inverted image, they do not get analyzed. I think it is worth mentioning that if the image of interest contains both "bright" and "dark" patterns, then the analysis should be performed in both the original and the inverted images because the nature of the algorithm does not allow it to detect "dark" patterns.

      We agree with the reviewer's comment. We now mention this point in lines 337-339.

In line 283, the authors mention using background correction. They should state explicitly what method of background correction they used. If they used ImageJ's "subtract background" tool, then specify the radius.

      We now describe this step in the method section.


      Reviewer #3 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. Being a software paper, the advance proposed by the authors is technical in nature. The novelty and significance of this tool is that it offers quick and simple pattern analysis at the single unit level to a broad audience, since it runs on the ImageJ GUI and does not require any programming knowledge. Moreover, all the modules and steps are well described in the paper, which allows easy going through the analysis.
      • Place the work in the context of the existing literature (provide references, where appropriate). The authors themselves provide a good and thorough comparison of their tool with other existing ones, both in terms of ease of use and on the type of information extracted by each method. While PatternJ is not necessarily superior in all aspects, it succeeds at providing precise single pattern unit measurements in a user-friendly manner.
• State what audience might be interested in and influenced by the reported findings. Most researchers working with microscopy images of muscle cells or fibers or any other patterned sample and interested in analyzing changes in that pattern in response to perturbations, time, development, etc. could use this tool to obtain useful, and otherwise laborious, information.

      We thank the reviewer for these enthusiastic comments about how straightforward for biologists it is to use PatternJ and its broad applicability in the bio community.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      Major comments

      In our view the article and also software could be improved in terms of defining the scope of its applicability and user-ship. In many parts the article and software suggest that general biological patterns can be analysed, but then in other parts very specific muscle actin wordings are used. We are pointing this out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles. In this case we would recommend to also rename the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend to remove all actin specific wordings from the macro tool set and also the article should be in parts slightly re-written.

      Minor/detailed comments

      Software

      We recommend considering the following suggestions for improving the software.

      File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For novel users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.

      Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations.

      Manual selection accuracy

      The 1st step of the analysis is always to start from a user hand-drawn profile across intensity patterns in the image. However, this step can cause inaccuracy that varies with the shape and curve of the line profile drawn. If not strictly perpendicular to for example the M line patterns, the distance between intensity peaks will be different. This will be more problematic when dealing with non-straight and parallelly poised features in the image. If the structure is bended with a curve, the line drawn over it also needs to reproduce this curve, to precisely capture the intensity pattern. I found this limits the reproducibility and easy-usability of the software.

      Reproducibility

      Since the line profile drawn on the image is the first and most essential step of the entire process, it should be saved together with the analysis results, for example as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. At present, only the extracted features are saved (the save button also just asks for a folder containing images, so I could not verify its functionality).

      ? button

      It would be great if that button would open up some usage instructions.

      Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow: fitting and displaying 2D lines on the band or line structures in the image that form the "patterns" the authors aim to address. The workflow would then extract geometric models from the image, and the inter-line distances, and even the curves formed by these sets of lines, could be analyzed and studied further. These fitted 2D lines could also be integrated into ImageJ as Line ROIs, and thus saved, re-imported, and checked or further modified. I think this could greatly increase the usefulness and reproducibility of the software.

      Manuscript

      We recommend considering the following suggestions for improving the manuscript.

      Abstract: The abstract suggests that general patterns can be quantified; however, the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.

      Line 58: The gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is neither mentioned nor compared. There is at least one relatively recent study on sarcomere structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool, with publication: https://doi.org/10.1002/cpz1.462

      Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement; e.g., we can imagine that there are many relevant two-dimensional patterns in biology.

      Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript.

      Line 87: "Multicolor images will give a graph with one profile per color." "Multicolor images" should be stated more precisely here as "multi-channel" images. Multicolor images could be confused with RGB images, which ImageJ's profile plot treats as 8-bit gray-value images (after type conversion).

      Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"?

      Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.

      Line 124 - 147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed known by the reader. For example, how sub-pixel resolution is achieved is not clear; one can only assume that by fitting a Gaussian to each band, the center position (peak) can be calculated from the continuous curve rather than from discrete pixels.

      Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.

      Significance

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available on the accessible open-source platform ImageJ/Fiji. In the introduction to the article, the authors provide an extensive review of comparable existing tools. Their new tool fills a gap by providing easy-to-use software that enables the analysis of muscle sarcomere patterns by users without computational skills. We feel that if the limitations mentioned below could be addressed, the tool could indeed be valuable to life scientists interested in muscle patterning who lack computational skills.

      In our view there are a few limitations, including the accessibility of the example data and tutorials at sites.google.com/view/patternj, which we had trouble accessing. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting a "wizard" approach in which the user is guided through the steps. Another limitation is the reproducibility of the analysis; here we recommend enabling IJ macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvement, please see the sections of our review above.

    1. Under the new license, cloud service providers hosting Redis offerings will no longer be permitted to use the source code of Redis free of charge. For example, cloud service providers will be able to deliver Redis 7.4 only after agreeing to licensing terms with Redis, the maintainers of the Redis code. These agreements will underpin support for existing integrated solutions and provide full access to forthcoming Redis innovations.

      How will this affect end customers?

      Microsoft will most likely start offering its own Redis-equivalent software as an alternative: https://github.com/microsoft/garnet

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Bell et al. provide an exhaustive and clear description of the diversity of a new class of predicted type IV restriction systems that the authors denote as CoCoNuTs, for their characteristic presence of coiled-coil segments and nuclease tandems. Along with a comprehensive analysis that includes phylogenetics, protein structure prediction, extensive protein domain annotations, and an in-depth investigation of encoding genomic contexts, they also provide detailed hypotheses about the biological activity and molecular functions of the members of this class of predicted systems. This work is highly relevant; it underscores the wide diversity of defence systems that are used by prokaryotes and demonstrates that there are still many systems to be discovered. The work is sound and backed up by a clear and reasonable bioinformatics approach. I do not have any major issues with the manuscript, but only some minor comments.

      Strengths:

      The analysis provided by the authors is extensive and covers the three most important aspects that can be covered computationally when analysing a new family/superfamily: phylogenetics, genomic context analysis, and protein-structure-based domain content annotation. With this, one can directly have an idea about the superfamily of the predicted system and infer their biological role. The bioinformatics approach is sound and makes use of the most current advances in the fields of protein evolution and structural bioinformatics.

      Weaknesses:

      It is not clear how coiled-coil segments were assigned (only from AF2-predicted models, or also backed by sequence analysis), as no description is provided in the methods. The structure prediction quality assessment is based solely on the average pLDDT of the obtained models (with a threshold of 80 or better). However, this is not enough, particularly when multimeric models are used. The PAE matrix should be used to evaluate relative orientations, particularly where parts of two proteins are predicted to interact. In the case of multimers, interface quality scores, such as the ipTM or pDockQ, should also be considered and, at minimum, reported.

      A description of the coiled-coil predictions has been added to the Methods. For multimeric models, PAE matrices and ipTM+pTM scores have been included in Supplementary Data File S1.
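      For readers who want to reproduce the kind of single-score quality summary discussed here: AlphaFold2 writes its per-residue pLDDT into the B-factor column of the PDB files it outputs, so an average pLDDT can be recovered directly from a model file. The following is a minimal sketch, assuming a standard fixed-column PDB file; `mean_plddt` is a name invented for this illustration, not part of any published tool.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column (columns 61-66), where AF2 stores pLDDT."""
    scores = [float(line[60:66])            # fixed-width B-factor field
              for line in pdb_text.splitlines()
              if line.startswith(("ATOM", "HETATM"))]
    return sum(scores) / len(scores)
```

      Note that this averages over atoms, which weights residues by their atom count; since AF2 assigns the same pLDDT to every atom of a residue, deduplicating by residue number first would give the exact per-residue mean.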

      Reviewer #2 (Public Review):

      Summary:

      In this work, using in-depth computational analysis, Bell et al. explore the diverse repertoire of type IV McrBC modification-dependent restriction systems. The prototypical two-component McrBC system has been structurally and functionally characterised and is known to act as a defence by restricting phage and foreign DNA containing methylated cytosines. Here, the authors find previously unanticipated complexity and versatility of these systems and focus on detailed analysis and classification of a distinct branch, the so-called CoCoNuTs, named after their composition of coiled-coil structures and tandem nucleases. These CoCoNuT systems are predicted to target RNA as well as DNA and to utilise defence mechanisms with some similarity to type III CRISPR-Cas systems.

      Strengths:

      This work is enriched with a plethora of ideas and a myriad of compelling hypotheses that now await experimental verification. The study comes from the group that was amongst the first to describe, characterize, and classify CRISPR-Cas systems. By analogy, the findings described here can similarly promote ingenious experimental and conceptual research that could further drive technological advances. It could also instigate vigorous scientific debates that will ultimately benefit the community.

      Weaknesses:

      The multi-component systems described here function in the context of large oligomeric complexes. Some of the single chain AF2 predictions shown in this work are not compatible, for example, with homohexameric complex formation due to incompatible orientation of domains. The recent advances in protein structure prediction, in particular AlphaFold2 (AF2) multimer, now allow us to confidently probe potential protein-protein interactions and protein complex formation. This predictive power could be exploited here to produce a better glimpse of these multimeric protein systems. It can also provide a more sound explanation for some of the observed differences amongst different McrBC types.

      Hexameric CnuB complexes with CnuC stimulatory monomers for Type I-A, I-B, I-C, II, and III-A CoCoNuT systems have been modeled with AF2 and included in Supplementary Data File S1, albeit without the domains fused to the GTPase N-terminus (with the exception of Type I-B, which lacks the long coiled-coil domain fused to the GTPase and was modeled with its entire sequence). Attempts to model the other full-length CnuB hexamers did not lead to convincing results.

      Recommendations for the authors:

      Reviewing Editor:

      The detailed recommendations by the two reviewers will help the authors to further strengthen the manuscript, but two points seem particularly worth considering:

      1. The methods are barely sketched in the manuscript; it would be useful to detail them more closely. This applies particularly to the coiled-coil segments, which currently play only a supporting role, useful mainly for the name of the family; more detail on their prediction, structural properties, and purpose would be very helpful.

      2. Due to its encyclopedic nature, the wealth of material presented in the paper makes it hard to penetrate in one go. Any effort to make it more accessible would be very welcome. Reviewer 1 in particular has made a number of suggestions regarding the figures, which would make them provide more support for the findings described in the text.

      A description of the techniques used to identify coiled-coil segments has been added to the Methods. Our predictions ranged from near certainty in the coiled-coils detected in CnuB homologs, to shorter helices at the limit of detection in other factors. We chose to report all probable coiled-coils, as the extensive coiled-coils fused to CnuB, which are often the only domain present other than the GTPase, imply involvement in mediating complex formation by interacting with coiled-coils in other factors, particularly the other CoCoNuT factors. The suggestions made by Reviewer 1 were thoughtful and we made an effort to incorporate them.

      Reviewer #1 (Recommendations For The Authors):

      I do not have any major issues with the manuscript. I have however some minor comments, as described below.

      • The last sentence of the abstract at first reads as a fact and not a hypothesis resulting from the work described in the manuscript. After the second read, I noticed the nuances in the sentence. I would suggest a rephrasing to emphasize that the activity described is a theoretical hypothesis not backed-up by experiments.

      This sentence has been rephrased to make explicit the hypothetical nature of the statement.

      • In line 64, the authors rename DUF3578 as ADAM because its function is in fact no longer unknown. Did the authors consider reaching out to InterPro to add this designation to this DUF? A search in InterPro for DUF3578 returns "MrcB-like, N-terminal domain"; if a name is being suggested, it may be worthwhile to take it to the InterPro team.

      We will suggest this nomenclature to InterPro.

      • I find Figure 1E hard to analyse and think it occupies too much space for the information it provides. The color scheme, the large amount of small slices, and the lack of numbers make its information content very small. I would suggest moving this to the supplementary and making it instead a bar plot. If removed from Figure 1, more space is made available for the other panels, particularly the structural superpositions, which in my opinion are much more important.

      We have removed Figure 1E from the paper as it adds little information beyond the abundance and phyletic distribution of sequenced prokaryotes, in which McrBC systems are plentiful.

      • In Figure 2, it is not clear due to the presence of many colorful "operon schemes" that the tree is for a single gene and not for the full operon segment. Highlighting the target gene in the operons or signalling it somehow would make the figure easy to understand even in the absence of the text and legend. The same applies to Supplementary Figure 1.

      The legend has been modified to show more clearly that this is a tree of McrB-like GTPases.

      • In line 146, the authors write "AlphaFold-predicted endonuclease fold" to say that a protein contains a region that AF2 predicts to fold like an endonuclease. This is a strange way of writing it and can be confusing to non-expert readers. I would suggest rephrasing for increased clarity.

      This sentence has been rephrased for greater clarity.

      • In line 167, there is a [47]. I believe this is probably due to a previous reference formatting.

      Indeed, this was a reference formatting error and has been fixed.

      • In most figures, the color palette and the use of very similar color palettes for taxonomy pie charts, genomic context composition schemes, and domain composition diagrams make it really hard to have a good understanding of the image at first. Legends are often close to each other, and it is not obvious at first which belong to what. I would suggest changing the layouts and maybe some color schemes to make it easier to extract the information that these figures want to convey.

      It seemed that Figure 4 was the most glaring example of these issues, and it has been rearranged for easier comprehension.

      • In the paragraph that starts at line 199, the authors mention an Ig-like domain that is often found at the N-terminus of Type I CoCoNuTs. Are they all related to each other? How conserved are these domains?

      These domains are all predicted to adopt a similar beta-sandwich fold and are found at the N-terminus of most CoCoNuT CnuC homologs, suggesting they are part of the same family, but we did not undertake a more detailed sequence-based analysis of these regions.

      We also find comparable domains in the CnuC/McrC-like partners of the abundant McrB-like NxD motif GTPases that are not part of CoCoNuT systems, and given the similarity of some of their predicted structures to Rho GDP-dissociation inhibitor 1, we suspect that they have coevolved as regulators of the non-canonical NxD motif GTPase type. Our CnuBC multimer models showing consistent proximity between these domains in CnuC and CnuB GTPase domains suggest this could indeed be the case. We plan to explore these findings further in a forthcoming publication.

      • In line 210, the authors write "suggesting a role in overcrowding-induced stress response". Why so? In all other cases, the authors justify their hypothesis, which I really appreciated, but not here.

      A supplementary note justifying this hypothesis has been added to Supplementary Data File S1.

      • At the end of the paragraph that starts in line 264, the authors mention that they constructed AF2 multimeric models to predict if 2 proteins would interact. However, no quality scores were provided, particularly the PAE matrix. This would allow for a better judgement of this prediction, and I would suggest adding the PAE matrix as another panel in the figure where the 3D model of the complex is displayed.

      The PAE matrix and ipTM+pTM scores for this and other multimer models have been added to Supplementary Data File S1. For this model in particular, the surface charge distribution of the model has been presented to support the role of the domains that have a higher PAE in RNA binding.

      • In line 306, "(supplementary data)" refers to what part of the file?

      This file has been renamed Supplementary Table S3 and referenced as such.

      • In line 464, the authors suggest that ShdA could interact with CoCoNuTs. Why not model the complex as done for the other cases? What would co-folding suggest?

      As we were not able to convincingly model full-length CnuB hexamers with N-terminal coiled-coils, we did not attempt modeling of this hypothetical complex with another protein with a long coiled-coil, but it remains an interesting possibility.

      • In line 528, why and how were some genes additionally analyzed with HHPred?

      Justification for this analysis has been added to the Methods, but briefly, these genes were additionally analyzed if there were no BLAST hits or to confirm the hits that were obtained.

      • In the first section of the methods, the first and second (particularly the second) paragraphs are extremely long. I would suggest breaking them to facilitate reading.

      This change has been made.

      • In line 545, what do the authors mean by "the alignment (...) were analyzed with HHPred"?

      A more detailed description of this step has been added to the Methods.

      • The authors provide the models they produced as well as extensive supplementary tables that make their data reusable, but they do not provide the code for the automated steps, as to excise target sequence sections out of multiple sequence alignments, for example.

      The code used for these steps has been in use in our group at the NCBI for many years. It will be difficult to utilize outside of the NCBI software environment, but for full disclosure, we have included a zipped repository with the scripts and custom-code dependencies, although there are external dependencies as well such as FastTree and BLAST. In brief, it involves PSI-BLAST detection of regions with the most significant homology to one of a set of provided alignments (seals-2-master/bin/wrappers/cog_psicognitor). In this case, the reference alignments of McrB-like GTPases and DUF2357 were generated manually using HHpred to analyze alignments of clustered PSI-BLAST results. This step provided an output of coordinates defining domain footprints in each query sequence, which were then combined and/or extended using scripts based on manual analysis of many examples with HHpred (footprint_finders/get_GTPase_frags.py and footprint_finders/get_DUF2357_frags.py), then these coordinates were used to excise such regions from the query amino acid sequence with a final script (seals-2-master/bin/misc/fa2frag).

      Reviewer #2 (Recommendations For The Authors):

      (1) Page 4, line 77 - 'PUA superfamily domains' could be more appropriate to use instead of "EVE superfamily".

      While this statement could perhaps be applied to PUA superfamily domains, the previous work we refer to, which strongly supports the assertion, was restricted to the EVE-like domains, and we prefer to retain the original language.

      (2) Page 5. lines 128-130 - AF2 multimer prediction model could provide a more sound explanation for these differences.

      Our AF2 multimer predictions added in this revision indeed show that the NxD motif McrB-like CoCoNuT GTPases interact with their respective McrC-like partners such that an immunoglobulin-like beta-sandwich domain, fused to the N-termini of the McrC homologs and similar to Rho GDP-dissociation inhibitor 1, has the potential to physically interact with the GTPase variants. However, we did not probe this in greater detail, as it is beyond the scope of this already highly complex article, but we plan to study it in the future.

      (3) Page 8, line 252 - The surface charge distribution of CnuH OB fold domain looks very different from SmpB (pdb3iyr). In fact, the regions that are in contact with RNA in SmpB are highly acidic in CoCoNut CnuH. Although it looks likely that this domain is involved in RNA binding, the mode of interaction should be very different.

      We did not detect a strong similarity between the CnuH SmpB-like SPB domain and PDB 3IYR, but when we compare the surface charge distribution of PDB 1WJX and the SPB domain, while there is a significant area that is positively charged in 1WJX that is negatively charged in SPB, there is much that overlaps with the same charge in both domains.

      The similarity between SmpB and the SPB domain is significant, but definitely not exact. An important question for future studies is: If the domains are indeed related due to an ancient fusion of SmpB to an ancestor of CnuH, would this degree of divergence be expected?

      In other words, can we say anything about how the function of a stand-alone tmRNA-binding protein could evolve after being fused to a complex predicted RNA helicase with other predicted RNA binding domains already present? Experimental validation will ultimately be necessary to resolve these kinds of questions, but for now, it may be safe to say that the presence of this domain, especially in conjunction with the neighboring RelE-like RTL domain and UPF1-like helicase domain, signals a likely interaction with the A-site of the ribosome, and perhaps restriction of aberrant/viral mRNA.

    1. Here is a detailed summary of the article "Super Charging Fine-Grained Reactive Performance" by Milo:

      1. Introduction to Reactivity in JavaScript

        • Definition and Importance: "Reactivity allows you to write lazy variables that are efficiently cached and updated, making it easier to write clean and fast code."
        • Introduction to Reactively: "I've been working on a new fine grained reactivity library called Reactively inspired by my work on the SolidJS team."
      2. Characteristics of Fine-Grained Reactivity Libraries

        • Library Examples and Usage: "Fine-grained reactivity libraries... Examples include new libraries like Preact Signals, µsignal, and now Reactively, as well as longer-standing libraries like Solid, S.js, and CellX."
        • Functionality and Advantages: "With a library like Reactively, you can easily add lazy variables, caching, and incremental recalculation to your typescript/javascript programs."
      3. Core Concepts in Reactively

        • Dependency Graphs: "Reactive libraries work by maintaining a graph of dependencies between reactive elements."
        • Implementation Example: "import { reactive } from '@reactively/core'; const nthUser = reactive(10);"
      4. Goals and Features of Reactive Libraries

        • Efficiency and State Consistency: "Efficient: Never overexecute reactive elements... Glitch free: Never allow user code to see intermediate state where only some reactive elements have updated."
      5. Comparison Between Lazy and Eager Evaluation

        • Evaluation Strategies: "A lazy library... will first ask B then C to update, then update D after the B and C updates have been completed."
        • Algorithm Challenges: "The first challenge is what we call the diamond problem... The second challenge is the equality check problem."
      6. Algorithm Descriptions

        • MobX: "MobX uses a two pass algorithm, with both passes proceeding from A down through its observers... MobX stores a count of the number of parents that need to be updated with each reactive element."
        • Preact Signals: "Preact checks whether the parents of any signal need to be updated before updating that signal... Preact also has two phases, and the first phase 'notifies' down from A."
        • Reactively: "Reactively uses one down phase and one up phase. Instead of version numbers, Reactively uses only graph coloring."
      7. Benchmarking Results

        • Performance Observations: "In early experiments with the benchmarking tool, what we've discovered so far is that Reactively is the fastest."
        • Framework Comparisons: "The Solid algorithm performs best on wider graphs... The Preact Signal implementation is fast and very memory efficient."

      This summary encapsulates the key concepts, methodologies, and findings presented in the article, focusing on the innovations and performance of various fine-grained reactivity libraries, especially the newly introduced Reactively.
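      The lazy, cached evaluation described in points 5 and 6 can be sketched in a few lines. This is a hedged illustration in Python rather than TypeScript, and it is not Reactively's actual implementation: real fine-grained libraries auto-track dependencies at read time, and Reactively in particular uses graph coloring rather than the explicit version numbers used here. `Signal`, `Computed`, and the `recomputes` counter are names invented for this sketch.

```python
class Signal:
    """A writable source value; its version advances only on real changes."""
    def __init__(self, value):
        self._value = value
        self.version = 0

    def get(self):
        return self._value

    def set(self, value):
        # Equality check: writing an equal value must not trigger updates.
        if value != self._value:
            self._value = value
            self.version += 1


class Computed:
    """A derived value, recomputed lazily when a dependency has changed."""
    def __init__(self, fn, deps):
        self.fn = fn
        self.deps = deps          # explicit deps; real libraries auto-track these
        self.version = 0
        self.recomputes = 0       # instrumentation for the demo
        self._cache = None
        self._seen = None         # dependency versions at the last recompute

    def get(self):
        # Bring derived dependencies up to date before comparing versions;
        # this pull-based walk is what makes evaluation lazy and glitch-free.
        for dep in self.deps:
            if isinstance(dep, Computed):
                dep.get()
        versions = tuple(dep.version for dep in self.deps)
        if versions != self._seen:
            value = self.fn()
            self._seen = versions
            self.recomputes += 1
            # Equality check again: only advance our version on a real change,
            # so downstream nodes are not recomputed needlessly.
            if value != self._cache:
                self._cache = value
                self.version += 1
        return self._cache
```

      Pulling a node at the bottom of a diamond (one source feeding two intermediates feeding one result) recomputes each node at most once per source change, and the equality check stops propagation when a recomputed value turns out unchanged, which are exactly the two challenges the article describes.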

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work provides a valuable contribution and assessment of what it means to replicate a null study finding, and what are the appropriate methods for doing so (apart from a rote p-value assessment). Through a convincing re-analysis of results from the Reproducibility Project: Cancer Biology using frequentist equivalence testing and Bayes factors, the authors demonstrate that even when reducing 'replicability success' to a single criterion, how precisely replication is measured may yield differing results. Less focus is directed to appropriate replication of non-null findings.

      Reviewer #1 (Public Review):

      Summary:

      The goal of Pawel et al. is to provide a more rigorous and quantitative approach for judging whether or not an initial null finding (conventionally with p ≥ 0.05) has been replicated by a second similarly null finding. They discuss important objections to relying on the qualitative significant/non-significant dichotomy to make this judgment. They present two complementary methods (one frequentist and the other Bayesian) which provide a superior quantitative framework for assessing the replicability of null findings.

      Strengths:

      Clear presentation; illuminating examples drawn from the well-known Reproducibility Project: Cancer Biology data set; R-code that implements suggested analyses. Using both methods as suggested provides a superior procedure for judging the replicability of null findings.

      Weaknesses:

      The proposed frequentist and Bayesian methods both rely on binary assessments of an original finding and its replication. I am not sure whether this is a weakness or whether it is inherent to making binary decisions based on continuous data.

      For the frequentist method, a null finding is considered replicated if the original and replication 90% confidence intervals for the effects both fall within the equivalence range. According to this approach, a null finding would be considered replicated if the p-values of both equivalence tests (original and replication) were, say, 0.049, whereas it would not be considered replicated if, for example, the equivalence test of the original study had a p-value of 0.051 and the replication had a p-value of 0.001. Intuitively, the evidence for replication would seem to be stronger in the second instance. The recommended Bayesian approach similarly relies on a dichotomy (e.g., Bayes factor > 1).

      Thanks for the suggestions, we now emphasize more strongly in the “Methods for assessing replicability of null results” and “Conclusions” sections that both TOST p-values and Bayes factors are quantitative measures of evidence that do not require dichotomization into “success” or “failure”.
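      The correspondence the response relies on (a TOST p-value at most α exactly when the (1 − 2α) confidence interval lies inside the equivalence margin) can be sketched as follows. This is an illustrative Python sketch for a normally distributed effect estimate, not the authors' R code; `tost_p` and `replicated` are names invented here.

```python
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def tost_p(estimate, se, margin):
    """Two one-sided tests (TOST) p-value for equivalence to (-margin, margin)."""
    p_lower = 1.0 - phi((estimate + margin) / se)   # H0: effect <= -margin
    p_upper = phi((estimate - margin) / se)         # H0: effect >= +margin
    return max(p_lower, p_upper)

def replicated(orig, se_orig, rep, se_rep, margin, alpha=0.05):
    """RPCB-style criterion: both studies must individually show equivalence,
    i.e. both 90% confidence intervals fall inside the margin when alpha = 0.05."""
    return (tost_p(orig, se_orig, margin) <= alpha
            and tost_p(rep, se_rep, margin) <= alpha)
```

      Because `tost_p` is continuous, it can also be reported directly as a graded measure of evidence, which is the non-dichotomous use the authors recommend.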

      Reviewer #2 (Public Review):

      Summary:

      The study demonstrates how inconclusive replications of studies initially with p > 0.05 can be and employs equivalence tests and Bayesian factor approaches to illustrate this concept. Interestingly, the study reveals that achieving a success rate of 11 out of 15, or 73%, as was accomplished with the non-significance criterion from the RPCB (Reproducibility Project: Cancer Biology), requires unrealistic margins of Δ > 2 for equivalence testing.

      Strengths:

      The study uses reliable and shareable/open data to demonstrate its findings, and shares the code for the statistical analysis as well. The study provides sensitivity analyses for different scenarios of equivalence margin and alpha level, as well as for different standard deviations of the prior for the Bayes factors and different thresholds. All analyses and code are open and can be replicated. The study also demonstrates, on a case-by-case basis, how the different criteria can diverge for one sample of a field of science (preclinical cancer biology), and it explains clearly what Bayes factors and equivalence tests are.

      Weaknesses:

      It would be interesting to investigate whether using Bayes factors and equivalence tests in addition to p-values results in a clearer picture when applied to replication data from other fields. As the authors mention, the Reproducibility Project: Experimental Philosophy (RPEP) and the Reproducibility Project: Psychology (RPP) contain data from attempts to replicate some original studies with null results. While the RPCB analysis yielded a similar picture when using both criteria, it is worth exploring whether this holds true for RPP and RPEP. Considerations for further research in this direction are suggested. Even if the original null results were excluded from the calculation of an overall replicability rate based on significance, sensitivity analyses considering them could have been conducted. The present authors could demonstrate replication success using the significance criteria in these two projects for studies with initially p < 0.05, both positive and non-positive.

      Other comments:

      • Overall picture vs. case-by-case scenario: An interesting finding is that the authors observe that in most cases, there is no substantial evidence for either the absence or the presence of an effect, as evidenced by the equivalence tests. Thus, using both suggested criteria results in a picture similar to the one initially raised by the paper itself. The work done by the authors highlights additional criteria that can be used to further analyze replication success on a case-by-case basis, and I believe that this is where the paper's main contributions lie. Despite not changing the overall picture much, I agree that the p-value criterion by itself does not distinguish between (1) a situation where the original study had low statistical power, resulting in a highly inconclusive non-significant result that does not provide evidence for the absence of an effect and (2) a scenario where the original study was adequately powered, and a non-significant result may indeed provide some evidence for the absence of an effect when analyzed with appropriate methods. Equivalence testing and Bayesian factor approaches are valuable tools in both cases.

      Regarding the 0.05 threshold, the choice of the prior distribution for the SMD under the alternative H1 is debatable, and this also applies to the equivalence margin. Sensitivity analyses, as highlighted by the authors, are helpful in these scenarios.

      Thank you for the thorough review and constructive feedback. We have added an additional “Appendix B: Null results from the RPP and RPEP” that shows equivalence testing and Bayes factor analyses for the RPP and RPEP null results.

      Reviewer #3 (Public Review):

      Summary:

      The paper points out that non-significance in both the original study and a replication does not ensure that the studies provide evidence for the absence of an effect. Nor can such a pattern be considered a "replication success". The main point of the paper is rather obvious. It may be that both studies are underpowered, in which case their non-significance does not prove anything. The absence of evidence is not evidence of absence! On the other hand, statistical significance is a confusing concept for many, so some extra clarification is always welcome.

      One might wonder if the problem that the paper addresses is really a big issue. The authors point to the "Reproducibility Project: Cancer Biology" (RPCB, Errington et al., 2021). They criticize Errington et al. because they "explicitly defined null results in both the original and the replication study as a criterion for replication success." This is true in a literal sense, but it is also a little bit uncharitable. Errington et al. assessed replication success of "null results" with respect to 5 criteria, just one of which was statistical (non-)significance.

      It is very hard to decide if a replication was "successful" or not. After all, the original significant result could have been a false positive, and the original null-result a false negative. In light of these difficulties, I found the paper of Errington et al. quite balanced and thoughtful. Replication has been called "the cornerstone of science" but it turns out that it's actually very difficult to define "replication success". I find the paper of Pawel, Heyard, Micheloud, and Held to be a useful addition to the discussion.

      Strengths:

      This is a clearly written paper that is a useful addition to the important discussion of what constitutes a successful replication.

      Weaknesses:

      To me, it seems rather obvious that non-significance in both the original study and a replication does not ensure that the studies provide evidence for the absence of an effect. I'm not sure how often this mistake is made.

      Thanks for the feedback. We do not have systematic data on how often the mistake of confusing absence of evidence with evidence of absence has been made in the replication context, but we do know that it has been made in at least three prominent large-scale replication projects (the RPP, RPEP, RPCB). We therefore believe that there is a need for our article.

      Moreover, we agree that the RPCB provided a nuanced assessment of replication success using five different criteria for the original null results. We emphasize this now more in the “Introduction” section. However, we do not consider our article as “a little bit uncharitable” to the RPCB, as we discuss all other criteria used in the RPCB and note that our intent is not to diminish the important contributions of the RPCB, but rather to build on their work and provide constructive recommendations for future researchers. Furthermore, in response to comments made by Reviewer #2, we have added an additional “Appendix B: Null results from the RPP and RPEP” that shows equivalence testing and Bayes factor analyses for null results from two other replication projects, where the same issue arises.

      Reviewer #1 (Recommendations For The Authors):

      The authors may wish to address the dichotomy issue I raise above, either in the analysis or in the discussion.

      Thank you, we now emphasize that Bayes factors and TOST p-values do not need to be dichotomized but can be interpreted as quantitative measures of evidence, both in the “Methods for assessing replicability of null results” and the “Conclusions” sections.

      Reviewer #2 (Recommendations For The Authors):

      Given that, the following are additional suggestions that the authors should consider in light of the manuscript's word count limit, so as not to obscure the paper's main idea:

      2) Referencing: Could you reference the three interesting cases among the 15 RPCB null results (specifically, the three effects from the original paper #48) where the Bayes factor differs qualitatively from the equivalence test?

      We now explicitly cite the original and replication study from paper #48.

      3) Equivalence testing: As the authors state, only 4 out of the 15 study pairs are able to establish replication success at the 5% level, in the sense that both the original and the replication 90% confidence intervals fall within the equivalence range. Among these 4, two (Paper #48, Exp #2, Effect #5 and Paper #48, Exp #2, Effect #6) were initially positive with very low p-values, one (Paper #48, Exp #2, Effect #4) had an initial p of 0.06 and was very precisely estimated, and the only one in which equivalence testing provides a clearer picture of replication success is Paper #41, Exp #2, Effect #1, which had an initial p-value of 0.54 and a replication p-value of 0.05. In this latter case (or in all these ones), one might question whether the "liberal" equivalence range of Δ = 0.74 is the most appropriate. As the authors state, "The post-hoc specification of equivalence margins is controversial."

      We agree that the post hoc choice of equivalence ranges is a controversial issue. The margins define an equivalence region where effect sizes are considered practically negligible, and we agree that in many contexts SMD = 0.74 is a large effect size that is not practically negligible. We therefore present sensitivity analyses for a wide range of margins. However, we do not think that the choice of this margin is more controversial for the mentioned studies with low p-values than for other studies with larger p-values, since the question of whether a margin plausibly encodes practically negligible effect sizes is not related to the observed p-value of a study. Nevertheless, for the new analyses of the RPP and RPEP data in Appendix B, we have added additional sensitivity analyses showing how the individual TOST p-values and Bayes factors vary as a function of the margin and the prior standard deviation. We think that these analyses provide readers with an even more transparent picture regarding the implications of the choice of these parameters than the “project-wise” sensitivity analyses in Appendix A.
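The margin sensitivity discussed above can be sketched with a small illustration. Assuming a normal approximation for the effect estimate, the TOST p-value is the larger of the two one-sided test p-values against the margins. The numbers below are illustrative only, not the RPCB data:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def tost_p(estimate, se, margin):
    """TOST p-value for an effect estimate with standard error se:
    the larger of the two one-sided tests against -margin and +margin."""
    p_lower = 1 - phi((estimate + margin) / se)  # H0: effect <= -margin
    p_upper = phi((estimate - margin) / se)      # H0: effect >= +margin
    return max(p_lower, p_upper)

# Illustrative sensitivity analysis: the same estimate can pass or fail the
# equivalence test at the 5% level depending on the chosen margin (SMD units).
for margin in (0.3, 0.5, 0.74, 1.0):
    print(f"margin = {margin:.2f}: p_TOST = {tost_p(0.1, 0.4, margin):.3f}")
```

Equivalence is only declared once the margin is wide relative to the standard error, which is one reason post hoc margin choices are contentious.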

      4) Bayes factor suggestions: For the Bayes factor approach, it would be interesting to discuss examples where the BF differs slightly. This is likely to occur in scenarios where sample sizes differ significantly between the original study and replication. For example, in Paper #48, Exp #2 and Effect #4, the initial p is 0.06, but the BF is 8.1. In the replication, the BF dramatically drops to < 1/1000, as does the p-value. The initial evidence of 8.1 indicates some evidence for the absence of an effect, but not strong evidence ("strong evidence for H0"), whereas a p-value of 0.06 does not lead to such a conclusion; instead, it favors H1. It would be interesting if the authors discussed other similar cases in the paper. It's worth noting that in Paper #5, Exp #1, Effect #3, the replication p-value is 0.99, while the BF01 is 2.4, almost indicating "moderate" evidence for H0, even though the p-value is inconclusive.

      We agree that some of the examples nicely illustrate conceptual differences between p-values and Bayes factors, e.g., how they take into account sample size and effect size. As methodologists, we find these aspects interesting ourselves, but we think that emphasizing them is beyond the scope of the paper and would distract eLife readers from the main messages.

      Concerning the conceptual differences between Bayes factors and TOST p-values, we already discuss a case where there are qualitative differences in more detail (original paper #48). We added another discussion of this phenomenon in Appendix B, as it also occurs for the replication of Ranganath and Nosek (2008) that was part of the RPP.

      5) p-values, magnitude and precision: It's noteworthy to emphasize, if the authors decide to discuss this, that the p-value is influenced by both the effect's magnitude and its precision, so in Paper #9, Exp #2, Effect #6, BF01 = 4.1 has a higher p-value than a BF01 = 2.3 in its replication. However, there are cases where both p-values and BF agree. For example, in Paper #15, Exp #2, Effect #2, both the original and replication studies have similar sample sizes, and as the p-value decreases from p = 0.95 to p = 0.23, BF01 decreases from 5.1 ("moderate evidence for H0") to 1.3 (region of "Absence of evidence"), moving away from H0 in both cases. This also occurs in Paper #24, Exp #3, Effect #6.

      We appreciate the suggestions but, as explained before, think that the message of our paper is better understood without additional discussion of more general differences between p-values and Bayes factors.

      6) The grey zone: Given the above topic, it is important to highlight that in the "Absence of evidence grey zone" for the null hypothesis, for example, in Paper #5, Exp #1, Effect #3 with a p = 0.99 and a BF01 = 2.4 in the replication, BF and p-values reach similar conclusions. It's interesting to note, as the authors emphasize, that Dawson et al. (2011), Exp #2, Effect #2 is an interesting example, as the p-value decreases, favoring H1, likely due to the effect's magnitude, even with a small sample size (n = 3 in both original and replications). Bayes factors are very close to one due to the small sample sizes, as discussed by the authors.

      We appreciate the constructive comments. We think that the two examples from Dawson et al. (2011) and Goetz et al. (2011) already nicely illustrate absence of evidence and evidence of absence, respectively, and therefore decided not to discuss additional examples in detail, to avoid redundancy.

      7) Using meta-analytical results (?): For papers from RPCB, comparing the initial study with the meta-analytical results using Bayes factor and equivalence testing approaches (thus, increasing the sample size of the analysis, but creating dependency of results since the initial study would affect the meta-analytical one) could change the conclusions. This would be interesting to explore in initial studies that are replicated by much larger ones, such as: Paper #9, Exp #2, Effect #6; Goetz et al. (2011), Exp #1, Effect #1; Paper #28, Exp #3, Effect #3; Paper #41, Exp #2, Effect #1; and Paper #47, Exp #1, Effect #5).

      Thank you for the suggestion. We considered adding meta-analytic TOST p-values and Bayes factors before, but decided that Figure 3 and the results section are already quite technical, so adding more analyses may confuse more than help. Nevertheless, these meta-analytic approaches are discussed in the “Conclusions” section.

      8) Other samples of fields of science: It would be interesting to investigate whether using Bayes factors and equivalence tests in addition to p-values results in a clearer scenario when applied to replication data from other fields. As mentioned by the authors, the Reproducibility Project: Experimental Philosophy (RPEP) and the Reproducibility Project: Psychology (RPP) have data attempting to replicate some original studies with null results. While the RPCB analysis yielded a similar picture when using both criteria, it is worth exploring whether this holds true for RPP and RPEP. Considerations for further research in this direction are suggested. Even if the original null results were excluded in the calculation of an overall replicability rate based on significance, sensitivity analyses considering them could have been conducted. The present authors can demonstrate replication success using the significance criteria in these two projects with initially p < 0.05 studies, both positive and non-positive.

      Thank you for the excellent suggestion. We added an Appendix B where the null results from the RPP and RPEP are analyzed with our proposed approaches. The results are also discussed in the “Results” and “Conclusions” sections.

      9) Other approaches: I am curious about the potential impact of using an approach based on equivalence testing (as described in https://arxiv.org/abs/2308.09112). It would be valuable if the authors could run such analyses or reference the mentioned work.

      Thank you. We were unaware of this preprint. It seems related to the framework proposed by Stahel W. A. (2021) New relevance and significance measures to replace p-values. PLoS ONE 16(6): e0252991. https://doi.org/10.1371/journal.pone.0252991

      We now cite both papers in the discussion.

      10) Additional evidence: There is another study in which replications of initially p > 0.05 studies with p > 0.05 replications were also considered as replication successes. You can find it here: https://www.medrxiv.org/content/10.1101/2022.05.31.22275810v2. Although it involves a small sample of initially p > 0.05 studies with already large sample sizes, the work is currently under consideration for publication in PLOS ONE, and all data and materials can be accessed through OSF (links provided in the work).

      Thank you for sharing this interesting study with us. We feel that it is beyond the scope of the paper to include further analyses as there are already analyses of the RPCB, RPP, and RPEP null results. However, we will keep this study in mind for future analysis, especially since all data are openly available.

      11) Additional evidence 02: Ongoing replication projects, such as the Brazilian Reproducibility Initiative (BRI) and The Sports Replication Centre (https://ssreplicationcentre.com/), continue to generate valuable data. BRI is nearing completion of its results, and it promises interesting data for analyzing replication success using p-values, equivalence regions, and Bayes factor approaches.

      We now cite these two initiatives as examples of ongoing replication projects in the introduction. As with your previous point, we think that it is beyond the scope of the paper to include further analyses as there are already analyses of the RPCB, RPP, and RPEP null results.

      Reviewer #3 (Recommendations For The Authors):

      I have no specific recommendations for the authors.

      Thank you for the constructive review.

      Reviewing Editor (Recommendations For the Authors):

      I recognize that it was suggested to the authors by the previous Reviewing Editor to reduce the amount of statistical material to be made more suitable for a non-statistical audience, and so what I am about to say contradicts advice you were given before. But, with this revised version, I actually found it difficult to understand the particulars of the construction of the Bayes Factors and would have appreciated a few more sentences on the underlying models that fed into the calculations. In my opinion, the provided citations (e.g., Dienes Z. 2014. Using Bayes to get the most out of non-significant results) did not provide sufficient background to warrant a lack of more technical presentation here.

      Thank you for the feedback. We added a new “Appendix C: Technical details on Bayes factors” that provides technical details on the models, priors, and calculations underlying the Bayes factors.
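As a rough sketch of the kind of calculation such an appendix covers (assuming, for illustration, a point null tested against a zero-mean normal prior on the effect under H1, with the estimate treated as approximately normal; this is a generic z-test Bayes factor, not necessarily the exact model in the paper's appendix):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def bf01(estimate, se, prior_sd):
    """Bayes factor BF01 comparing H0: effect = 0 against
    H1: effect ~ N(0, prior_sd^2); values above 1 favour the null.
    Under H1 the marginal distribution of the estimate is
    N(0, se^2 + prior_sd^2)."""
    return (normal_pdf(estimate, 0.0, se)
            / normal_pdf(estimate, 0.0, sqrt(se**2 + prior_sd**2)))

# A precisely estimated effect near zero yields evidence for H0 ...
print(bf01(0.0, 0.3, 1.0))  # BF01 > 1
# ... while a large, precisely estimated effect yields evidence for H1.
print(bf01(1.0, 0.3, 1.0))  # BF01 < 1
```

The dependence of BF01 on `prior_sd` is exactly why the responses above stress sensitivity analyses over the prior standard deviation.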

    1. We’ve been accessing Reddit through Python and the “PRAW” code library. The PRAW code library works by sending requests across the internet to Reddit, using what is called an “application programming interface” or API for short. APIs have a set of rules for what requests you can make, what happens when you make the request, and what information you can get back.

      The explanation provided about how the PRAW library functions as a mediator between Python applications and Reddit through the use of APIs is quite illuminating. APIs, as described, serve as the bridge that facilitates these interactions under a set of defined rules and protocols. This brings to mind the essential nature of understanding the limits and capabilities of any API when developing software that depends on external services. It would be interesting to explore further how robust the error handling capabilities of the PRAW library are. Specifically, how does PRAW manage or relay errors that arise from API limitations or disruptions in Reddit's service? This is crucial for developers to ensure their applications can gracefully handle such issues and maintain a good user experience.

    2. We’ve been accessing Reddit through Python and the “PRAW” code library. The PRAW code library works by sending requests across the internet to Reddit, using what is called an “application programming interface” or API for short. APIs have a set of rules for what requests you can make, what happens when you make the request, and what information you can get back.

      I am not well informed about how APIs work, but it sounds like they involve a lot of connections to the internet and other kinds of information systems. Going back to the sources of social media data, one of the things that platforms can record is what users click on, when they log on or off, etc. I think this ties back to the ethical frameworks we discussed in chapter 2. The question becomes what course of action is correct: even if the platform records information on user behavior to maximize user experience, should platforms be allowed to record information that could be personal?
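On the error-handling question raised in the first annotation: PRAW surfaces HTTP-level failures (such as rate limits) as exceptions from its prawcore dependency, and it is up to the calling code to catch them. A minimal sketch of the usual retry-with-backoff pattern, using a hypothetical stand-in exception and fetch function rather than a live Reddit call:

```python
import time

class RateLimited(Exception):
    """Stand-in for an API client's rate-limit error (PRAW itself raises
    exceptions defined in the prawcore package)."""

def fetch_with_retry(fetch, retries=3, backoff=1.0, sleep=time.sleep):
    """Call fetch(); on a rate-limit error, wait with exponential
    backoff and try again, re-raising after the final attempt."""
    for attempt in range(retries):
        try:
            return fetch()
        except RateLimited:
            if attempt == retries - 1:
                raise
            sleep(backoff * 2 ** attempt)

# Demo with a fake fetch that fails twice before succeeding.
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimited()
    return ["post 1", "post 2"]

print(fetch_with_retry(flaky_fetch, sleep=lambda s: None))
```

Injecting `sleep` as a parameter keeps the helper testable without real waiting; in production code the default `time.sleep` applies.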

    1. the 31,085 lines of configure for libtool still check if <sys/stat.h> and <stdlib.h> exist, even though the Unixen, which lacked them, had neither sufficient memory to execute libtool nor disks big enough for its 16-MB source code.

      yummy

    1. One of the biggest advantages of using Cubit is simplicity. When creating a Cubit, we only have to define the state as well as the functions which we want to expose to change the state. In comparison, when creating a Bloc, we have to define the states, events, and the EventHandler implementation. This makes Cubit easier to understand and there is less code involved.

      The difference between Cubit and Bloc:

      Cubit only requires defining the state and the functions that can change the state.

      Bloc requires defining states, events, and event handlers.
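The structural difference can be sketched language-agnostically (Python here rather than flutter_bloc's Dart, with hypothetical class names): a Cubit exposes methods that change state directly, while a Bloc routes events through an explicit event-to-handler mapping.

```python
class CounterCubit:
    """Cubit style: just state plus functions that change it."""
    def __init__(self):
        self.state = 0

    def increment(self):
        self.state += 1

class CounterBloc:
    """Bloc style: state changes only in response to events,
    via an explicit event-to-handler registry."""
    def __init__(self):
        self.state = 0
        self._handlers = {"increment": self._on_increment}

    def add(self, event):
        self._handlers[event]()  # dispatch the event to its handler

    def _on_increment(self):
        self.state += 1

cubit = CounterCubit()
cubit.increment()       # call the method directly

bloc = CounterBloc()
bloc.add("increment")   # raise an event instead
print(cubit.state, bloc.state)
```

The event layer is the extra code Cubit avoids; Bloc buys traceability (every state change has a named event) at the cost of that boilerplate.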

    1. The “DAO Model Law” guide by COALA researchers outlines 11 technical and governance requirements for DAOs to meet the requirements for legal recognition as an entity, including:

       1. Deployed on a blockchain,
       2. Provide a unique public address for others to review its operations,
       3. Open source software code,
       4. Get code audited,
       5. Have at least one interface for laypeople to read critical information on DAO smart contracts and tokens,
       6. Have by-laws that are comprehensible to lay people,
       7. Have governance that is technically decentralized (i.e. not controlled by a single party),
       8. Have at least one member at any given time,
       9. Have a specific way for people to contact the DAO,
       10. Have a binding internal dispute resolution mechanism for participants,
       11. Have an external dispute resolution mechanism to resolve disputes with third-parties (e.g. service providers).

       These factors and considerations constitute a legal basis for conceptualizing DAOs.
    1. Nicole Nguyen. Here's Who Facebook Thinks You Really Are. September 2016. Section: Tech. URL: https://www.buzzfeednews.com/article/nicolenguyen/facebook-ad-preferences-pretty-accurate-tbh (visited on 2024-01-30).

      In the article it mentioned that many non-Facebook sites use JavaScript code that tells the mothership what kind of content you're looking at when you're not on Facebook's site and apps. Even if done legally, that doesn't make it the most ethical choice. I think there are better and more ethical ways to understand the target market of your audience. It is smart to engage and understand users better, but it just doesn't feel right to poke about in someone's personal life when they don't know about it.

    1. toString Returns a String representation of an object. By default, it returns the class name and a hexadecimal representation of the hashCode. That's not very useful, so it's common to override this method. hashCode Returns an int code that's used for storing an object in hashed data structures. getClass Returns the Class associated to the object. An instance of a Class contains meta-data (names, parameters, annotations) associated to a class. equals Returns a boolean that indicates if this instance is equal to another object. By default, it evaluates to true if the objects share the same memory location -- called reference equality -- they share the same reference. It's common to override this method to inspect individual values instead of comparing references.

      Object methods

    2. Code changes:

      The Person has-a Student and Instructor (composition) rather than the Student is-a Person (inheritance)
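The `Object` methods described in the first annotation are usually overridden together. A small hypothetical class illustrating value equality, a matching `hashCode`, and a readable `toString`:

```java
import java.util.Objects;

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        // More useful than the default "ClassName@hexHashCode"
        return "Point(" + x + ", " + y + ")";
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;            // reference equality shortcut
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;               // compare values, not references
    }

    @Override
    public int hashCode() {
        // Equal objects must produce equal hash codes,
        // so hashed collections can find them.
        return Objects.hash(x, y);
    }
}
```

Overriding `equals` without `hashCode` breaks `HashSet`/`HashMap` lookups, which is why the two always travel as a pair.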

    1. Instructions:

       Step 1: Briefly summarize the “best fit occupations” results of the combined assessment (about 100 words).

       Step 2: Reflect on the combined results of your assessments as they relate to your current career interest (about 400 words). Consider responding to one or more of the following prompts:

       • In the Work Interest assessment, what is your Holland Code (please use the letters and descriptive titles)? How well do these three descriptors fit your current career interest? How might these descriptors help you select a better fitting career goal?
       • In the Leisure Interest assessment, what are your top three leisure interests? How well do these three descriptors fit your current career interest? How might these descriptors help you select a better fitting career goal?
       • What “best fit” occupation recommendations do you agree with? What recommendations do you disagree with? Why?
       • Which of the five assessments (work, leisure, skills, personality, values) are most important to you personally? Select three assessments and run another combined report. Are the results any different? Did the results provide you with any new insights?
       • You may also comment on the insights gained from the Focus 2 Career Assessment and how they relate to the results of previous assessments you have completed while in LEAD Scholars, including True Colors, Strengths, and 16-Personalities.

       Step 3: Provide one personal insight about your career path gained from this learning activity.

      delete instructions

    1. create an unordered list

      After creating the unordered list, a Prettier error appears in the console and Prettier no longer formats the code as before. Does anyone know why?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1

      (1) Given the low trial numbers, and the point of sequential vs clustered reactivation mentioned in the public review, it would be reassuring to see an additional sanity check demonstrating that future items that are currently not on-screen can be decoded with confidence, and if so, when in time the peak reactivation occurs. For example, the authors could show separately the decoding accuracy for near and far items in Fig. 5A, instead of plotting only the difference between them.

      We have now added the requested analysis showing the raw decoded probabilities for near and distant items separately in Figure 5A. We have also chosen to replace Figure 5B with the new figure as we think it provides more information than the previous Figure 5B. Instead, we have moved Figure 5B to the supplement. The median peak decoded accuracy for near and distant items is equivalent. We have added the following description to the figure:

      “Decoded raw probabilities for off-screen items that were up to two steps ahead of the current stimulus cue (‘near’) vs. distant items that were more than two steps away on the graph, on trials with correct answers. The median peak decoded probability for near and distant items was at the same time point for both probability categories. Note that displayed lines reflect the average probability while, to eliminate influence of outliers, the peak displays the median.”

      (2) The non-sequential reactivation analyses often use a time window of peak decodability, and it was not entirely clear to me what data this time window is determined on, e.g., was it determined based on all future reactivations irrespective of graph distance? This should be clarified in the methods.

      Thank you for raising this. We now clarify this in the relevant section to read: “First, we calculated a time point of interest by computing the peak probability estimate of decoders across all trials, i.e., the average probability for each timepoint of all trials (except previous onscreen items) of all distances, which is equivalent to the peak of the differential reactivation analysis”

      (3) Fig 4 shows evidence for forward and backward sequential reactivation, suggesting that both forward and backward replay peak at a lag of 40-50msec. It would be helpful if this counterintuitive finding could be picked up in the discussion, explaining how plausible it is, physiologically, to find forward and backward replay at the same lag, and whether this could be an artifact of the TDLM method.

      This is an important point and we agree that it appears counterintuitive. However, we would highlight that this exact time range has been reported in previous studies, though never for both forward and backward replay. We now include a discussion of this finding. The section now reads:

      “[…] Even though we primarily focused on the mean sequenceness scores across time lags, there appears to be a (non-significant) peak at 40-60 milliseconds. While simultaneous forward and backward replay is theoretically possible, we acknowledge that it is somewhat surprising and, given our paradigm, could relate to other factors such as autocorrelations (Liu, Dolan, et al., 2021).”

      (4) It is reported that participants with below 30% decoding accuracy are excluded from the main analyses. It would be helpful if the manuscript included very specific information about this exclusion, e.g., was the criterion established based on the localizer cross-validated data, the temporal generalisation to the cued item (Fig. 2), or only based on peak decodability of the future sequence items? If the latter, is it applied based on near or far reactivations, or both?

      We now clarify this point to include more specific information, which reads:

      “[…] Therefore, we decided a priori that participants with a peak decoding accuracy of below 30% would be excluded from the analysis (nine participants in all) as obtained from the cross-validation of localizer trials”

      (5) Regarding the low amount of data for the reactivation analysis, the manuscript should be explicit about the number of trials available for each participant. For example, Supplemental Fig. 1 could provide this information directly, rather than the proportion of excluded trials.

      We have adapted the plot in the supplement to show the absolute number of rejected epochs per participant, in addition to the ratio.

      (6) More generally, the supplements could include more detailed information in the legends.

      We agree and have added more extensive explanation of the plots in the supplement legends.

      (7) The choice of comparing the 2 nearest with all other future items in the clustered reactivation analysis should be better motivated, e.g., was this based on the Wimmer et al. (2020) study?

      We have added our motivation for taking the two nearest items and contrasting them with the items further away. The paragraph reads:

      “[…] We chose to combine the following two items for two reasons: First, this doubled the number of included trials; secondly, using this approach the number of trials for each category (“near” and “distant”) was more balanced. […]”

      Reviewer 2

      (1) Focus exclusively on retrieval data (and here just on the current image trials).

      If I understand correctly, you focus all your analyses (behavioural as well as MEG analyses) on retrieval data only and here just on the current image trials. I am surprised by that since I see some shortcomings due to that. These shortcomings can likely be addressed by including the learning data (and predecessor image trials) in your analyses.

      a) Number of trials: During each block, you presented each of the twelve edges once. During retrieval, participants then did one "single testing session block". Does that mean that all your results are based on max. 12 trials? Given that participants remembered, on average, 80% this means even fewer trials, i.e., 9-10 trials?

      This is correct and a limitation of the paper. However, while we used only correct trials for the reactivation analysis, the sequential analysis was conducted using all trials, disregarding response behaviour. To retain comparability with previous studies we mainly focused on data from after a consolidation phase. Nevertheless, despite the trial limitation, we consider the results robust and worth reporting. Additionally, based on the suggestion of the referee, we now include results from learning blocks (see below).

      b) Extend the behavioural and replay/reactivation analysis to predecessor images.

      Why do you restrict your analyses to the current image trials? Especially given that you have such a low trial number for your analyses, I was wondering why you did not include the predecessor trials (except the non-deterministic trials, like the zebra and the foot according to Figure 2B) as well.

      We agree it would be desirable to increase power by adding the predecessor images to the current image cue analysis (excluding the ambiguous trials); we did not do so because we consider the underlying retrieval processes of these trial types to be distinct, i.e., they cannot simply be combined. Nevertheless, we have performed the suggested analysis to check whether it increases our power. We found that the reactivation effect is robust and significant at the same time point of 220-230 ms. However, the effect size actually decreased: while peak differential reactivation was previously at 0.13, it is now at 0.07. This in fact makes conceptual sense. We suspect that the two processes elicited by showing a single cue and by showing a second, related cue are distinct insofar as the predecessor image acts as a primer for the current image, potentially changing the time course/speed of retrieval. Given our concern that the two processes are not actually the same, we consider it important to avoid mixing these data.

      We have added a statement to the manuscript discussing this point. The section reads:

      “Note that we only included data from the current image cue, and not from the predecessor image cue, as we assume the retrieval processes differ and should not be concatenated.”

      c) Extend the behavioural and replay/reactivation analysis to learning trials.

      Similar to point 1b, why did you not include learning trials in your analyses?

      The advantage of including (correct and incorrect) learning trials has the advantage that you do not have to exclude 7 participants due to ceiling performance (100%).

      Further, you could actually test the hypothesis that you outline in your discussion: "This implies that there may be a switch from sequential replay to clustered reactivation corresponding to when learned material can be accessed simultaneously without interference." Accordingly, you would expect to see more replay (and less "clustered" reactivation) in the first learning blocks compared to retrieval (after the rest period).

      Tracking reactivation and replay over the course of learning is a great idea. We have given a lot of thought as to how to integrate these findings but have not found a satisfying solution, as analysis of the learning data turned out to be quite tricky: We decided that each participant should perform as many blocks as necessary to reach at least 80% (with an upper limit of six and a lower bound of two, see Supplement figure 4). Indeed, some participants learned 100% of the sequence after one block (these were mostly medical students, for whom learning things by heart is a daily task). With the benefit of hindsight, we realise this design means that different blocks are not directly comparable between participants. In theory, we would expect that replay emerges in parallel with learning and then gradually gives way to clustered reactivation as memory traces become consolidated/stronger. However, it is unclear when replay should emerge and when precisely a switch to clustered reactivation would happen. For this reason, we initially decided not to include the learning trials in the paper.

      Nevertheless, to provide some insight into the learning process, and to see how consolidation impacts differential reactivation and replay, we have split our data into pre and post resting state, aggregating all learning trials of each participant. While this does not allow us to track processes on a block basis, it does offer potential (albeit limited) insight into the hypothesis we outline in the discussion.

      For reactivation, we see a clear increase, further strengthening the outlined hypothesis; for replay, however, the evidence is less clear, as we do not know over how many learning blocks replay is to be expected.

      We calculated individual trajectories of how reactivation and replay change from learning to retrieval and related these to performance. Indeed, we see that an increase in reactivation is nominally associated with higher learning performance, while an increase in replay strength is associated with lower performance (both non-significant). However, for the above-mentioned reasons we think it would be premature to add this weak evidence to the paper.

      To mitigate these problems of experiment design, we are currently implementing a follow-up study in which we aim to normalize the learning process across participants and index how replay/reactivation changes over the course of learning and after consolidation.

      We have added plots showing clustered reactivation and sequential replay measures during learning (Figure 5D and Supplement 8).

      The added section(s) now read:

      “To provide greater detail on how the 8-minute consolidation period affected reactivation we, post-hoc, looked at relevant measures across learning trials in contrast to retrieval trials. For all learning trials, for each participant, we calculated differential reactivation for the same time point we found significant in the previous analysis (220-260 milliseconds). On average, differential reactivation probability increased from pre to post resting state (Figure 5D). […]

      Nevertheless, even though our results show a nominal increase in reactivation from learning to retrieval (see Figure 5D), due to experimental design features our data do not enable us to test for a hypothesized switch for sequential replay (see also “limitations” and Supplement 8).”

      d) Introduction (last paragraph): "We examined the relationship of graph learning to reactivation and replay in a task where participants learned a ..." If all your behavioural analyses are based on retrieval performance, I think that you do not investigate graph learning (since you exclusively focus the analyses on retrieving the graph structure). However, relating the graph learning performance and replay/reactivation activity during learning trials (i.e., during graph learning) to retrieval trials might be interesting but beyond the scope of this paper.

      We agree. We have changed the wording to be more accurate. Indeed, we do not examine graph learning but instead examine retrieval from a graph, after graph learning. The mentioned sentence now reads

      “[…] relationship of retrieval from a learned graph structure to reactivation [...]”

      e) It is sometimes difficult to follow what phase of the experiment you refer to since you use the terms retrieval and test synonymously. Not a huge problem at all but maybe you want to stick to one term throughout the whole paper.

      Thank you for pointing this out. We have now adapted the manuscript to exclusively refer to “retrieval” and not to “test”.

      (2) Is your reactivation clustered?

      In Figure 5A, you compare the reactivation strength of the two items following the cue image (i.e., current image trials) with items further away on the graph. I do not completely understand why your results are evidence for clustered reactivation in contrast to replay.

      First, it would be interesting to see the reactivation of near vs. distant items before taking the difference (time course of item probabilities).

      (copied answer from response to Reviewer 1, as the same remark was raised)

      We have added the requested analysis showing the raw decoded probabilities for near and distant items separately in Figure 5A. We have chosen to replace Figure 5B with the new figure, as we think it offers more information than the previous Figure 5B, which we have moved to the supplement. The median peak decoded accuracy for near and distant items is equivalent. We have added the following description to the figure:

      “Decoded raw probabilities for off-screen items that were up to two steps ahead of the current stimulus cue (‘near’) vs. distant items that were more than two steps away on the graph, on trials with correct answers. The median peak decoded probability for near and distant items was at the same time point for both probability categories. Note that the displayed lines reflect the average probability while, to eliminate the influence of outliers, the peak displays the median.”

      Second, could it still be that the first item is reactivated before the second item? By averaging across both items, it does not become apparent what the temporal courses of probabilities of both items look like (and whether they follow a sequential pattern). Additionally, the Gaussian smoothing kernel across the time dimension might diminish sequential reactivation and favour clustered reactivation. (In the manuscript, what does a Gaussian smoothing kernel of σ = 1 refer to?). Could you please explain in more detail why you assume non-sequential clustered reactivation here and substantiate this with additional analyses?

      We apologise for the unclear description. Note that the Gaussian kernel is in fact only used for the reactivation analysis and not the replay analysis, so any small temporal successions would have been picked up by the sequential analysis. We now clarify this in the respective section of the sequential analysis and also explain the parameter σ = 1 in the reactivation analysis section. The paragraph now reads:

      “[…] As input for the sequential analysis, we used the raw probabilities of the ten classifiers corresponding to the stimuli. [...]

      […] Therefore, to address this we applied a Gaussian smoothing kernel (using scipy.ndimage.gaussian_filter with the default parameter of σ=1, which corresponds approximately to weighting the surrounding timesteps in both directions as follows: current time step: 40%, ±1 step: 25%, ±2 steps: 5%, ±3 steps: 0.5%) [...]”
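      The quoted weighting can be checked by filtering a unit impulse, which recovers the effective kernel that scipy.ndimage applies at σ = 1 (a minimal sketch; the impulse array is a stand-in, not the decoded probability time courses):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Filtering a unit impulse returns the effective smoothing kernel.
impulse = np.zeros(9)
impulse[4] = 1.0

kernel = gaussian_filter1d(impulse, sigma=1)

# Weights roughly match the quoted approximation:
# centre ~0.40, +/-1 step ~0.24, +/-2 steps ~0.05, +/-3 steps ~0.004
```

The default `truncate=4.0` means the kernel extends four standard deviations to each side before being cut off and renormalized.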

      (3) Replay and/or clustered reactivation?

      The relationship between the sequential forward replay, differential reactivation, and graph reactivation analysis is not really apparent. Wimmer et al. demonstrated that high performers show clustered reactivation rather than sequential reactivation. However, you did not differentiate in your differential reactivation analysis between high vs. low performers. (You point out in the discussion that this is due to a low number of low performers.)

      We agree that a split into high vs low performers would have been preferable for our analysis. However, one major obstacle made us opt for a correlational analysis instead: We employed criteria learning, rendering a categorical grouping conceptually biased. Even though not all participants reached the criterion of 80%, our sample did not naturally split into high and low performers but was biased towards higher performance, leaving the groups uneven. The median performance was 83% (mean ~81%), with six of our subjects (~1/4 of included participants) having exactly this performance. This makes a median or mean split difficult, as either binning assignment choice would strongly affect the results. We have added a limitations section in which we extensively discuss this shortcoming and our reasoning for not performing a median split as in Wimmer et al. (2020). The section now reads:

      “There are some limitations to our study, most of which originate from a suboptimal study design. [...], as we performed criteria learning, a sub-group analysis as in Wimmer et al., (2020) was not feasible, as median performance in our sample would have been 83% (mean 81%), with six participants exactly at that threshold. [...]”
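      The instability can be illustrated with invented numbers (hypothetical stand-in scores, not our data): when six participants sit exactly at the 83% median, the size of the "high performer" group depends entirely on how ties are assigned.

```python
import numpy as np

# Invented accuracy scores with six participants tied at the 83% median
scores = np.array([0.70, 0.75, 0.78, 0.80, 0.83, 0.83, 0.83,
                   0.83, 0.83, 0.83, 0.86, 0.90, 0.93, 1.00])
median = np.median(scores)                  # 0.83

n_high_ties_up = (scores >= median).sum()   # ties counted as "high"
n_high_ties_down = (scores > median).sum()  # ties counted as "low"
# the "high performer" group size swings by 6 of 14 participants
```

Either tie-breaking rule is defensible, which is why the split, rather than the data, would drive any group difference.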

      It might be worth trying to bring the analysis together, for example by comparing sequential forward replay and differential reactivation at the beginning of graph learning (when performance is low) vs. retrieval (when performance is high).

      Thank you for the suggestion to include the learning segments, which we think improves the paper quite substantially. However, analysis of the learning data turned out to be quite tricky: We had decided that each participant should perform as many blocks as necessary to reach at least 80% accuracy (with an upper limit of six and a lower bound of two, see Supplement figure 4). Some participants learned 100% of the sequence after one block (these were mostly medical students, for whom learning things by heart is a daily task). In hindsight this is an unfortunate design feature in relation to learning, as it means different blocks are not directly comparable between participants.

      In theory, we would expect that replay emerges in parallel with learning and then gradually changes to clustered reactivation as memory traces get consolidated/stronger. However, it is unclear when replay would emerge and when the switch to reactivation would happen. For this reason, we initially decided not to include the learning trials in the paper at all.

      Nevertheless, to give some insight into the learning process and to see how consolidation affects differential reactivation and replay, we have split our data into pre and post resting state, aggregating all learning trials of each participant. While this does not allow us to track measures of interest on a block basis, it gives some (albeit limited) insight into the hypothesis outlined in our discussion.

      For reactivation, we see a clear increase, further strengthening the outlined hypothesis. For replay, however, the evidence is less obvious, potentially due to the fact that we do not know across how many learning blocks replay is to be expected.

      The added section(s) now read:

      “To examine how the 8-minute consolidation period affected reactivation we, post-hoc, looked at relevant measures during learning trials in contrast to retrieval trials. For all learning trials, for each participant, we calculated differential reactivation for the time point we found significant during the previous analysis (220-260 milliseconds). On average, differential reactivation probability increased from pre to post resting state (Figure 5D).

      […]

      Nevertheless, even though our results show a nominal increase in reactivation from learning to retrieval (see Figure 5D), our data do not enable us to show a hypothesized switch for sequential replay (see also “limitations” and Supplement 8).”

      Additionally, the main research question is not that clear to me. Based on the introduction, I thought the focus was on replay vs. clustered reactivation and high vs. low performance (which I think is really interesting). However, the title is more about reactivation strength and graph distance within cognitive maps. Are these two research questions related? And if so, how?

      We agree we need to be clearer on this point. We have added two sentences to the introduction, which should address this point. The section now reads:

      “[…] In particular, the question remains how the brain keeps track of graph distances for successful recall and whether the previously found difference between high and low performers also holds true within a more complex graph learning context.”

      (4) Learning the graph structure.

      I was wondering whether you have any behavioural measures to show that participants actually learn the graph structure (instead of just pairs or triplets of objects). For example, do you see that participants chose the distractor image that was closer to the target more frequently than the distractor image that was further away (close vs. distal target comparison)? It should be random at the beginning of learning but might become more biased towards the close target.

      Thanks, this is an excellent suggestion. Our analysis indeed shows that people take the near lure more often than the far lure in later blocks, while it is random in the first block.

      Nevertheless, we have decided to put these data into the supplement and reference them in the text. This is because analysis of the learning blocks is generally challenging and prone to bias: each participant completed a different number of learning blocks depending on their learning rate, which makes it difficult to compare learning across participants. We have tried our best to accommodate and explain these difficulties in the figure legend. We thank the referee for the guidance here; this analysis indeed provides further evidence that participants learned the actual graph structure.

      The added section reads

      “Additionally, we have included an analysis showing that the wrong answers participants provided were random in the first block and biased towards closer graph nodes in later blocks. This is consistent with participants actually learning the underlying graph structure as opposed to independent triplets (see figure and legend of Supplement 6 for details).”

      (5) Minor comments

      a) "Replay analysis relies on a successive detection of stimuli where the chance of detection exponentially decreases with each step (e.g., detecting two successive stimuli with a chance of 30% leaves a 9% chance of detecting the replay event). " Could you explain in more detail why 30% is a good threshold then?

      Thank you. We have further clarified the section. As we work mainly with probabilities, it is useful to keep in mind that accuracy is a class-label metric that provides only a rough estimate of classifier ability. Alternatively, something like a top-3 accuracy would be preferable, but also slightly silly in the context of 10 classes.

      Nevertheless, subtle changes in probability estimates are present and can be picked up by the methods we employ. The 30% is therefore a rough lower bound, chosen based on pilot data showing that clean MEG data from attentive participants can usually reach this threshold. The section now reads:

      “(e.g., detecting two successive stimuli with a chance of 30% leaves a 9% chance of detecting a replay event). However, one needs to bear in mind that accuracy is a “winner-takes-all” metric indicating whether the top choice also has the highest probability, disregarding subtle, relative changes in assigned probability. As the methods used in this analysis are performed on probability estimates and not class labels, one can expect that the 30% is a rough lower bound and that the actual sensitivity of the analysis will be higher. Additionally, based on pilot data, we found that attentive participants were able to reach 30% decodability, allowing us to use decodability as a data quality check.”
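      The distinction can be illustrated with simulated data (made-up decoder outputs, not the study's pipeline): even when winner-takes-all accuracy is modest, the probability assigned to the true class can sit reliably above the 10% chance level, which is the signal that probability-based analyses exploit.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_classes = 2000, 10

# Simulated decoder outputs: the true class (index 0) gets a subtle
# boost, then each row is normalized into a probability distribution.
raw = rng.random((n_trials, n_classes))
raw[:, 0] += 0.15
probs = raw / raw.sum(axis=1, keepdims=True)

top1_accuracy = (probs.argmax(axis=1) == 0).mean()
mean_true_prob = probs[:, 0].mean()

# top-1 accuracy is modest (~25% here), yet the mean probability of
# the true class clearly exceeds the 1/10 chance level
```

The winner-takes-all metric discards the graded evidence that the second line of asserts below picks up.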

      b) Could you make explicit how your decoders were designed? Especially given that you added null data, did you train individual decoders for one class vs. all other classes (n = 9 + null data) or one class vs. null data?

      We added detail to the decoder training. The section now reads

      “Decoders were trained using a one-vs-all approach, which means that for each class, a separate classifier was trained using positive examples (target class) and negative examples (all other classes) plus null examples (data from before stimulus presentation, see below). In detail, null data was.”
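      A minimal sketch of this one-vs-all scheme, using random stand-in data and a generic logistic regression (the actual decoders, MEG features, and null-data construction in the study differ; all dimensions here are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_per_class, n_classes = 50, 10

# Stand-in "sensor" data: each class c has a distinct mean pattern
# (3 units on feature c); null examples mimic a pre-stimulus baseline.
class_data = [rng.normal(size=(n_per_class, n_classes)) + 3.0 * np.eye(n_classes)[c]
              for c in range(n_classes)]
null_data = rng.normal(size=(2 * n_per_class, n_classes))

# One-vs-all: positives are the target class; negatives are all other
# classes plus the null examples.
decoders = []
for c in range(n_classes):
    negatives = np.vstack([class_data[k] for k in range(n_classes) if k != c]
                          + [null_data])
    X = np.vstack([class_data[c], negatives])
    y = np.concatenate([np.ones(n_per_class), np.zeros(len(negatives))])
    decoders.append(LogisticRegression(max_iter=1000).fit(X, y))

# Each decoder independently reports the probability that its class is
# present; probing with the noise-free class-3 mean pattern should make
# decoder 3 respond most strongly.
probe = (3.0 * np.eye(n_classes)[3]).reshape(1, -1)
probabilities = np.array([clf.predict_proba(probe)[0, 1] for clf in decoders])
```

Because each classifier is trained independently, the ten outputs are not forced to sum to one, which is what allows the downstream analyses to read out graded, simultaneous evidence for several stimuli.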

      c) Why did you choose a ratio of 1:2 for your null data?

      Our choice of a higher ratio was based on previous publications reporting better sensitivity of TDLM with higher ratios, as spatial sensor correlations decrease. Nevertheless, this choice was not well investigated beforehand. We have added more information on this to the manuscript.

      d) You could think about putting the questionnaire results into the supplement if they are sanity checks.

      We have added the questionnaire results. However, due to the size of the tables, we have decided to include them as Excel files among the supplementary files of the code repository. We mention the existence of these files in the publication.

      e) Figure 2. There is a typo in D: It says "Precessor Image" instead of "Predecessor Image".

      Fixed typo in figure.

      f) You write "Trials for the localizer task were created from -0.1 to 0.5 seconds relative to visual stimulus onset to train the decoders and for the retrieval task, from 0 to 1.5 seconds after onset of the second visual cue image." But the Figure legend 3D starts at -0.1 seconds for the retrieval test.

      We have now clarified this. For the classifier cross-validation, the transfer sanity check, and the clustered reactivation analysis, we used trials from -0.1 to 0.5 seconds, whereas for the sequenceness analysis of the retrieval we used trials from 0 to 1.5 seconds.

    1. Taking values near 15/11 shows nothing too unusual:

      The following code is not working; I get the following error:

      julia> [xs i.(xs)]
      ERROR: UndefVarError: i not defined
      Stacktrace:
        [1] top-level scope
            @ REPL[41]:1

    1. All code execution happens inside the browser’s security sandbox, not on remote VMs or local binaries.

      All code runs in the browser, not on remote VMs.

      Step 1: Briefly summarize the “best fit occupations” results of the combined assessment (about 100 words). Step 2: Reflect on the combined results of your assessments as they relate to your current career interest (about 400 words). Consider responding to one or more of the following prompts: In the Work Interest assessment, what is your Holland Code (please use the letters and descriptive titles)? How well do these three descriptors fit your current career interest? How might these descriptors help you select a better-fitting career goal? In the Leisure Interest assessment, what are your top three leisure interests? How well do these three descriptors fit your current career interest? How might these descriptors help you select a better-fitting career goal? What “best fit” occupation recommendations do you agree with? What recommendations do you disagree with? Why? Which of the five assessments (work, leisure, skills, personality, values) are most important to you personally? Select three assessments and run another combined report. Are the results any different? Did the results provide you with any new insights? You may also comment on the insights gained from the Focus 2 Career Assessment and how they relate to the results of previous assessments you have completed while in LEAD Scholars, including True Colors, Strengths, and 16-Personalities. Step 3: Provide one personal insight about your career path gained from this learning activity.

      My best fit occupations included Toy Designer, Architect, Actor/Actress, and Funeral Director; I picked the top four to discuss. It’s interesting to me because the only one of those four that has really interested me would be architect. The toy designer occupation seems very interesting; it has to do with arts and entertainment, and since I consider myself a very creative person I can see why I got it. It said that my values, personality, skills, and leisure all aligned with this occupation.

The second one was Architect, which has to do with architecture and engineering. This occupation has interested me before because of the creativity it involves; it said that my values, skills, and leisure all aligned with it. The third one was Actress. This one was very cool to see, but the last time I performed in a play was seven years ago in middle school, and I was never a theater kid or interested in being one. For this one it said my personality and leisure aligned. And lastly, Funeral Director: I really did not know what to think about this one when I saw it. For this one it said my work, personality, and skills all aligned. My current career interest is becoming a Pediatric Nurse Practitioner. I love to work with kids because they are so happy all the time, and I also love science and how the human body works. Lastly, I want to do something meaningful in my life, like helping others. It was interesting to see how this assessment played out regarding my current career interest.

In the leisure assessment, my top three leisure interests were Aesthetic (The Creators), Correct (The Organizers), and Eager (The Persuaders), and I can 100% agree with these interests. It says The Creators tend to be creative and intuitive; enjoy activities like writing, painting, sculpting, playing a musical instrument, and performing; enjoy working in an unstructured environment where they can use their imagination and creativity; and are often described as being open, imaginative, original, intuitive, emotional, independent, idealistic, and unconventional. It says that The Organizers like to be involved in activities that follow set procedures and routines; like to work with data and details; have clerical or numerical ability and carry out tasks in great detail; and are often described as being conforming, practical, careful, obedient, thrifty, efficient, orderly, conscientious, and persistent.

And lastly, it says that The Persuaders like to influence others; enjoy persuading others to see their point of view; like to work with people and ideas rather than things; and are often described as being adventurous, energetic, optimistic, agreeable, extroverted, popular, sociable, self-confident, and ambitious. All of these characteristics describe me perfectly. I don’t really think that any of the “best fit” occupations are for me; the only one I could see myself in is architect, but again, that is nothing close to a nurse practitioner. The most important of the five assessments to me would be values. I decided to run another report with just values, personality, and skills to see what I would get, and the occupation that fit me best with those three was Clinical Psychologist. That is closer to an occupation I could see myself in, since it is more into the sciences, which I liked. As I scrolled through the careers that matched, I realized the only one remotely close to a Nurse Practitioner was Family Practitioner, which would require a medical degree. In conclusion, I very much enjoyed completing this assessment, and it made me realize other career options I could consider based on my personality, values, leisure, work interests, and skills.

      delete the section on your best fit occupations--that info goes into your Career Ready Portfolio, not the SLJ.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This important study advances our understanding of how past and future information is jointly considered in visual working memory by studying gaze biases in a memory task that dissociates the locations during encoding and memory tests. The evidence supporting the conclusions is convincing, with state-of-the-art gaze analyses that build on a recent series of experiments introduced by the authors. This work, with further improvements incorporating the existing literature, will be of broad interest to vision scientists interested in the interplay of vision, eye movements, and memory.

      We thank the Editors and the Reviewers for their enthusiasm and appreciation of our task, our findings, and our article. We also wish to thank the Reviewers for their constructive comments that we have embraced to improve our article. Please find below our point-by-point responses to this valuable feedback, where we also state relevant revisions that we have made to our article.

      In addition, please note that we have now also made our data and code publicly available.

      Reviewer 1, Comments:

      In this study, the authors offer a fresh perspective on how visual working memory operates. They delve into the link between anticipating future events and retaining previous visual information in memory. To achieve this, the authors build upon their recent series of experiments that investigated the interplay between gaze biases and visual working memory. In this study, they introduce an innovative twist to their fundamental task. Specifically, they disentangle the location where information is initially stored from the location where it will be tested in the future. Participants are tasked with learning a novel rule that dictates how the initial storage location relates to the eventual test location. The authors leverage participants' gaze patterns as an indicator of memory selection. Intriguingly, they observe that microsaccades are directed toward both the past encoding location and the anticipated future test location. This observation is noteworthy for several reasons. Firstly, participants' gaze is biased towards the past encoding location, even though that location lacks relevance to the memory test. Secondly, there's a simultaneous occurrence of an increased gaze bias towards both the past and future locations. To explore this temporal aspect further, the authors conduct a compelling analysis that reveals the joint consideration of past and future locations during memory maintenance. Notably, microsaccades biased towards the future test location also exhibit a bias towards the past encoding location. In summary, the authors present an innovative perspective on the adaptable nature of visual working memory. They illustrate how information relevant to the future is integrated with past information to guide behavior.

      Thank you for your enthusiasm for our article and findings as well as for your constructive suggestions for additional analyses that we respond to in detail below.

      This short manuscript presents one experiment with straightforward analyses, clear visualizations, and a convincing interpretation. For their analysis, the authors focus on a single time window in the experimental trial (i.e., 0-1000 ms after retro cue onset). While this time window is most straightforward for the purpose of their study, other time windows are similarly interesting for characterizing the joint consideration of past and future information in memory. First, assessing the gaze biases in the delay period following the cue offset would allow the authors to determine whether the gaze bias towards the future location is sustained throughout the entire interval before the memory test onset. Presumably, the gaze bias towards the past location may not resurface during this delay period, but it is unclear how the bias towards the future location develops in that time window. Also, the disappearance of the retro cue constitutes a visual transient that may leave traces on the gaze biases which speaks again for assessing gaze biases also in the delay period following the cue offset.

      Thank you for raising this important point. We initially focused on the time window during the cue given that our central focus was on gaze-biases associated with mnemonic item selection. By zooming in on this window, we could best visualize our main effects of interest: the joint selection (in time) of past and future memory attributes.

      At the same time, we fully agree that examining the gaze biases over a more extended time window yields a more comprehensive view of our data. To this end, we have now also extended our analysis to include a wider time range that includes the period between cue offset (1000 ms after cue onset) and test onset (1500 ms after cue onset). We present these data below. Because we believe our future readers are likely to be interested in this as well, we have now added this complementary visualization as Supplementary Figure 4 (while preserving the focus in our main figure on the critical mnemonic selection period of interest).

      Author response image 1.

      Supplementary Figure 4. Gaze biases in an extended time window, complementing Figure 1 and Supplementary Figure 2. This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset, the gaze bias towards the future location persists (panel a), and that while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely reflect preparation for optimally perceiving the anticipated test stimulus (panel b).

      This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset (consistent with our prior reports of this bias), the gaze bias towards the future location persists. Moreover, as revealed by the data in panel b above, while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely reflect preparation for optimally perceiving the anticipated test stimulus.

      We now also call out these additional findings and figure in our article:

      Page 2 (Results): “Gaze biases in both axes were driven predominantly by microsaccades (Supplementary Fig. 2) and occurred similarly in horizontal-to-vertical and vertical-to-horizontal trials (Supplementary Fig. 3). Moreover, while the past bias was relatively transient, the future bias continued to increase in anticipation of the test stimulus and increasingly incorporated eye-movements beyond the microsaccade range (see Supplementary Fig. 4 for a more extended time range)”.

      Moreover, assessing the gaze bias before retro-cue onset allows the authors to further characterize the observed gaze biases in their study. More specifically, the authors could determine whether the future location is considered already during memory encoding and the subsequent delay period (i.e., before the onset of the retro cue). In a trial, participants encode two oriented gratings presented at opposite locations. The future rule indicates the test locations relative to the encoding locations. In their example (Figure 1a), the test locations are shifted clockwise relative to the encoding location. Thus, there are two pairs of relevant locations (each pair consists of one stimulus location and one potential test location) facing each other at opposite locations and therefore forming an axis (in the illustration the axis would go from bottom left to top right). As the future rule is already known to the participants before trial onset it is possible that participants use that information already during encoding. This could be tested by assessing whether more microsaccades are directed along the relevant axis as compared to the orthogonal axis. The authors should assess whether such a gaze bias exists already before retro cue onset and discuss the theoretical consequences for their main conclusions (e.g., is the future location only jointly used if the test location is implicitly revealed by the retro cue).

      Thank you – this is another interesting point. We fully agree that additional analysis looking at the period prior to retrocue onset may also prove informative. In accordance with the suggested analysis, we have therefore now also analysed the distribution of saccade directions (including in the period from encoding to retrocue) as a function of the future rule (presented below, and now also included as Supplementary Fig. 5). Complementary recent work from our lab has shown how microsaccade directions can align to the axis of memory contents during retention (see de Vries & van Ede, eNeuro, 2024). Based on this finding, one may predict that if participants retain the items in a remapped fashion, their microsaccades may align with the axis of the future rule, and this could potentially already happen prior to cue onset.

      These complementary analyses show that saccade directions are predominantly influenced by the encoding locations rather than the test locations, as seen most clearly by the saccade distribution plots in the middle row of the figure below. To obtain time-courses, we categorized saccades as occurring along the axis of the future rule or along the orthogonal axis (bottom row of the figure below). Like the distribution plots, these time course plots also did not reveal any sign of a bias along the axis of the future rule itself.

      Importantly, note how this does not argue against our main findings of joint selection of past and future memory attributes: for that central analysis we focused on saccade biases that were specific to the selected memory item, whereas the analyses we present below focus on biases along the axes in which both memory items are defined, not only the cued/selected memory item.

      Author response image 2.

      Supplementary Figure 5. Distribution of saccade directions relative to the future rule from encoding onset. (Top panel) The spatial layouts in the four future rules. (Middle panel) Polar distributions of saccades during 0 to 1500 ms after encoding onset (i.e., the period between encoding onset and cue onset). The purple quadrants represent the axis of the future rule and the grey quadrants the orthogonal axis. (Bottom panel) Time courses of saccades along the above two axes. We did not observe any sign of a bias along the axis of the future rule itself.

      We agree that these additional results are important to bring forward when we interpret our findings. Accordingly, we now mention these findings at the relevant section in our Discussion:

      Page 5 (Discussion): “First, memory contents could have directly been remapped (cf. 4,24–26) to their future-relevant location. However, in this case, one may have expected to exclusively find a future-directed gaze bias, unlike what we observed. Moreover, using a complementary analysis of saccade directions along the axis of the future rule (cf. 24), we found no direct evidence for remapping in the period between encoding and cue (Supplementary Fig. 5)”.

      Reviewer 2, Comments:

      The manuscript by Liu et al. reports a task that is designed to examine the extent to which "past" and "future" information is encoded in working memory, combining a retro cue with rules that indicate the location of an upcoming test probe. An analysis of microsaccades on a fine temporal scale shows the extent to which shifts of attention track the location of the encoded item (past) and the location of the future item (test probe). The locations of the encoded grating and of the test probe were always on orthogonal axes (horizontal, vertical) so that biases in microsaccades could be used to track shifts of attention to one or the other axis (or mixtures of the two). The overall goal here was then to (1) create a methodology that could tease apart memory for the past and future, respectively, (2) to look at the time-course of attention to past/future, and (3) to test the extent to which microsaccades might jointly encode past and future memoranda. Finally, some remarks are made about the plausibility of various accounts of working memory encoding/maintenance based on the examination of these time courses.

      Strengths:

      This research has several notable strengths. It has a clear statement of its aims, is lucidly presented, and uses a clever experimental design that neatly orthogonalizes "past" and "future" as operationalized by the authors. Figure 1b-d shows fairly clearly that saccade directions have an early peak (around 300ms) for the past and a "ramping" up of saccades moving in the forward direction. This seems to be a nice demonstration that the method can measure shifts of attention at a fine temporal resolution and differentiate past from future-oriented saccades due to the orthogonal cue approach. The second analysis, shown in Figure 2, reveals a dependency in saccade direction such that saccades toward the future probe were more likely also to be toward the encoded location than away from the encoded direction. This suggests saccades are jointly biased by both locations "in memory".

      Thank you for your overall appreciation of our work and for highlighting the above strengths. We also thank you for your constructive comments and call for clarifications that we respond to below.

      Weaknesses:

      (1) The "central contribution" (as the authors characterize it) is that "the brain simultaneously retains the copy of both past and future-relevant locations in working memory, and (re)activates each during mnemonic selection", and that: "... while it is not surprising that the future location is considered, it is far less trivial that both past and future attributes would be retained and (re)activated together. This is our central contribution." However, to succeed at the task, participants must retain the content (grating orientation, past) and probe location (future) in working memory during the delay period. It is true that the location of the grating is functionally irrelevant once the cue is shown, but if we assume that features of a visual object are bound in memory, it is not surprising that location information of the encoded object would bias processing as indicated by microsaccades. Here the authors claim that joint representation of past and future is "far less trivial", this needs to be evaluaed from the standpoint of prior empirical data on memory decay in such circumstances, or some reference to the time-course of the "unbinding" of features in an encoded object.

      Thank you. We agree that our participants have to use the future rule – as otherwise they do not know to which test stimulus they should respond. This was a deliberate decision when designing the task. Critically, however, this does not require (nor imply) that participants have to incorporate and apply the rule to both memory items already prior to the selection cue. It is at least as conceivable that participants would initially retain the two items at their encoded (past) locations, then wait for the cue to select the target memory item, and only then consider the future location associated with the target memory item. After all, in every trial, there is only 1 relevant future location: the one associated with the cued memory item. The time-resolved nature of our gaze markers argues against such a scenario, by virtue of our observation of the joint (simultaneous) consideration of past and future memory attributes (as opposed to selection of past-before-future). These temporal dynamics are central to the insights provided by our study.

      In our view, it is thus not obvious that the rule would be applied at encoding. In this sense, we do not assume that the future location is part of both memory objects from encoding, but rather ask whether this is the case – and, if so, whether the future location takes over the role of the past location, or whether past and future locations are retained jointly.

      Our statements regarding what is “trivial” and what is “less trivial” regard exactly this point: it is trivial that the future is considered (after all, our task demanded it). However, it is less trivial that (1) the future location was already available at the time of initial item selection (as reflected in the simultaneous engagement of past and future locations), and (2) that in presence of the future location, the past location was still also present in the observed gaze biases.

      Having said that, we agree that an interesting possibility is that participants remap both memory items to their future-relevant locations ahead of the cue, but that the past location is not yet fully “unbound” by the time of the cue. This may trigger a gaze bias not only to the new future location but also to the “sticky” (unbound) past location. We now acknowledge this possibility in our discussion (also in response to comment 3 below) where we also suggest how future work may be able to tap into this:

      Page 6 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      (2) The authors refer to "future" and "past" information in working memory and this makes sense at a surface level. However, once the retrocue is revealed, the "rule" is retrieved from long-term memory, and the feature (e.g. right/left, top/bottom) is maintained in memory like any other item representation. Consider the classic test of digit span. The digits are presented and then recalled. Are the digits of the past or future? The authors might say that one cannot know, because past and future are perfectly confounded. An alternative view is that some information in working memory is relevant and some is irrelevant. In the digit span task, all the digits are relevant. Relevant information is relevant precisely because it is thought to be necessary in the future. Irrelevant information is irrelevant precisely because it is not thought to be needed in the immediate future. In the current study, the orientation of the grating is relevant, but its location is irrelevant; and the location of the test probe is also relevant.

      Thank you for this stimulating reflection. We agree that in our set-up, past location is technically “task-irrelevant” while future location is certainly “task-relevant”. At the same time, the engagement of the past location suggests to us that the brain uses past location for the selection – presumably because the brain uses spatial location to help individuate/separate the items, even if encoded locations are never asked about. Therefore, whether something is relevant or irrelevant ultimately depends on how one defines relevance (past location may be relevant/useful for the brain even if technically irrelevant from the perspective of the task). In comparison, the use of “past” and “future” may be less ambiguous.

      It is also worth noting how we interpret our findings in relation to demands on visual working memory, inspired by dynamic situations whereby visual stimuli may be last seen at one location but expected to re-appear at another, such as a bird disappearing behind a building (the example in our introduction). Thus, past for us does not refer to the memory item per se (as in the digit-span analogue) but, rather, quite specifically to the past location of a dynamic visual stimulus in memory (which, in our experiment, was operationalised by the future rule, for convenience).

      (3) It is not clear how the authors interpret the "joint representation" of past and future. Put aside "future" and "past" for a moment. If there are two elements in memory, both of which are associated with spatial bindings, the attentional focus might be a spatial average of the associated spatial indices. One might also view this as an interference effect, such that the location of the encoded location attracts spatial attention since it has not been fully deleted/removed from working memory. Again, for the impact of the encoded location to be exactly zero after the retrieval cue, requires zero interference or instantaneous decay of the bound location information. It would be helpful for the authors to expand their discussion to further explain how the results fit within a broader theoretical framework and how it fits with empirical data on how quickly an irrelevant feature of an object can be deleted from working memory.

      Thank you also for this point (that is related to the two points above). As we stated in our reply to comment 1 above, we agree that one possibility is that the past location is merely “sticky” and pulls the task-relevant future bias toward the past location. If so, our time courses suggest that such “pulling” occurs only until approximately 600 ms after cue onset, as the past bias is only transient. An alternative interpretation is that the past location may not be merely a residual irrelevant trace, but actually be useful and used by the brain.

      For example, the encoded (past) item locations provide a coordinate system in which to individuate/separate the two memory items. While the future locations also provide such a coordinate system, the brain may benefit from holding onto both coordinate systems at the same time, yielding our observation of joint selection in both frames. Indeed, in a recent VR experiment in which we had participants (rather than the items) rotate, we also found evidence for the joint use of two spatial frames, even if neither was technically required for the upcoming task (see Draschkow, Nobre, van Ede, Nature Human Behaviour, 2022). Though highly speculative at this stage, such reliance on multiple spatial frames may make our memories more robust to decay and/or interference. Moreover, while past location was never explicitly probed in our task, in daily life the past location may sometimes (unexpectedly) become relevant, hence it may be useful to hold onto it, just in case. Thus, considering the past location merely as an “irrelevant feature” (that takes time to delete) may not do sufficient justice to the potential roles of retaining past locations of dynamic visual objects held in working memory.

      As also stated in response to comment 1 above, we now added these relevant considerations to our Discussion:

      Page 5 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      Reviewer 3, Comments:

      This study utilizes saccade metrics to explore, what the authors term the "past and future" of working memory. The study features an original design: in each trial, two pairs of stimuli are presented, first a vertical pair and then a horizontal one. Between these two pairs comes the cue that points the participant to one target of the first pair and another of the second pair. The task is to compare the two cued targets. The design is novel and original but it can be split into two known tasks - the first is a classic working memory task (a post-cue informs participants which of two memorized items is the target), which the authors have used before; and the second is a classic spatial attention task (a pre-cue signal that attention should be oriented left or right), which was used by numerous other studies in the past. The combination of these two tasks in one design is novel and important, as it enables the examination of the dynamics and overlapping processes of these tasks, and this has a lot of merit. However, each task separately is not new. There are quite a few studies on working memory and microsaccades and many on spatial attention and microsaccades. I am concerned that the interpretation of "past vs. future" could mislead readers to think that this is a new field of research, when in fact it is the (nice) extension of an existing one. Since there are so many studies that examined pre-cues and post-cues relative to microsaccades, I expected the interpretation here to rely more heavily on the existing knowledge base in this field. I believe this would have provided a better context of these findings, which are not only on "past" vs. "future" but also on "working memory" vs. "spatial attention".

      Thank you for considering our findings novel and important, while at the same time reminding us of the parallels to prior tasks studying spatial attention in perception and working memory. We fully agree that our task likely engages both attention to the (past) memory item as well as spatial attention to the upcoming (future) test stimulus. At the same time, there is a critical difference in spatial attention for the future in our task compared with ample prior tasks engaging spatial cueing of attention for perception. In our task, the cue never directly cues the future location. Rather, it exclusively cues the relevant memory item. It is the memory item that is associated with the relevant future location, according to the future rule. This integration of the rule-based future location into the memory representation is distinct from classical spatial-attention tasks in which attention is cued directly to a specific location via, for example, a spatial cue such as an arrow.

      Thus, if we wish to think about our task as engaging cueing of spatial attention for perception, we have to at least also invoke the process of cueing the relevant location via the appropriate memory item. We feel it is more parsimonious to think of this as attending to both the past and future location of a dynamic visual object in working memory.

      If we return to our opening example, when we see a bird disappear behind a building, we can keep in working memory where we last saw it, while anticipating where it will re-appear to guide our external spatial attention. Here too, spatial attention is fully dependent on working-memory content (the bird itself) – mirroring the dynamic setting in our study. Thus, we believe our findings contribute a fresh perspective, while of course also extending established fields. We now contextualize our finding within the literature and clarify our unique contribution in our revised manuscript:

      Page 5 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g.,2,31-33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

      Reviewer 2, Recommendations:

      It would be helpful to set up predictions based on existing working memory models. Otherwise, the claim that the joint coding of past/future is "not trivial" is simply asserted, rather than contradicting an existing model or prior empirical results. If the non-trivial aspect is simply the ability to demonstrate the joint coding empirically through a good experimental design, make it clear that this is the contribution. For example, it may be that prevailing models predict exactly this finding, but nobody has been able to demonstrate it cleanly, as the authors do here. So the non-triviality is not that the result contradicts working memory models, but rather relates to the methodological difficulty of revealing such an effect.

      Thank you for your recommendation. First, please see our point-by-point responses to the individual comments above, where we also state relevant changes that we have made to our article, and where we clarify what we meant by “non-trivial”. As we currently also state in our introduction, our work took as a starting point the framework that working memory is inherently about the past while being for the future (cf. van Ede & Nobre, Annual Review of Psychology, 2023). By virtue of our unique task design, we were able to empirically demonstrate that visual contents in working memory are selected via both their past and their future-relevant locations – with past and future memory attributes being engaged together in time. With “not trivial” we merely intend to make clear that there are viable alternatives to the findings we observed. For example, past could have been replaced by the future, or it could have been that item selection (through its past location) was required before its future-relevant location could be considered (i.e. past-before-future, rather than joint selection as we reported). We outline these alternatives in the second paragraph of our Discussion:

      Page 5 (Discussion): “Our finding of joint utilisation of past and future memory attributes emerged from at least two alternative scenarios of how the brain may deal with dynamic everyday working memory demands in which memory content is encoded at one location but needed at another.

      First, [….]”

      Our work was not motivated by a particular theoretical debate and did not aim to challenge ongoing debates in the working-memory literature, such as: slot vs. resource, active vs. silent coding, decay vs. interference, and so on. To our knowledge, none of these debates makes specific claims about the retention and selection of past and future visual memory attributes – despite this being an important question for understanding working memory in dynamic everyday settings, as we hoped to make clear by our opening example.

      Reviewer 3, Recommendations:

      I recommend that the present findings be more clearly interpreted in the context of previous findings on working memory and attention. The task design includes two components - the first (post-cue) is a classic working memory task and the second (the pre-cue) is a classic spatial attention design. Both components were thoroughly studied in the past and this previous knowledge should be better integrated into the present conclusions. I specifically feel uncomfortable with the interpretation of past vs. future. I find this framework to be misleading because it reads like this paper is on a topic that is completely new and never studied before, when in fact this is a study on the interaction between working memory and spatial attention. I recommend the authors minimize this past-future framing or be more explicit in explaining how this new framework relates to the more common terminology in the field and make sure that the findings are not presented in a vacuum, as another contribution to the vibrant field that they are part of.

      Thank you for these recommendations. Please also see our point-by-point responses to the individual comments above. Here, we explained our logic behind using the terminology of past vs. future (in addition, see also our response to point 2 of reviewer 2). Here, we also stated relevant changes that we have made to our manuscript to explain how our findings complement – but are also distinct from – prior tasks that used pre-cues to direct spatial attention to an upcoming stimulus. As we explained above, in our task, the cue itself never contained information about the upcoming test location. Rather, the upcoming test location was a property of the memory item (given the future rule). Hence, we referred to this as a “future attribute” of the cued memory item, rather than as the “cued location” for external spatial attention. Still, we agree the future bias likely (also) reflects spatial allocation to the upcoming test array, and we explicitly acknowledge this in our discussion. For example:

      Page 5 (Discussion): “This signal may reflect either of two situations: the selection of a future-copy of the cued memory content or anticipatory attention to the anticipated location of its associated test-stimulus. Either way, by the nature of our experimental design, this future signal should be considered a content-specific memory attribute for two reasons. First, the two memory contents were always associated with opposite testing locations, hence the observed bias to the relevant future location must be attributed specifically to the cued memory content. Second, we cued which memory item would become tested based on its colour, but the to-be-tested location was dependent on the item’s encoding location, regardless of its colour. Hence, consideration of the item’s future-relevant location must have been mediated by selecting the memory item itself, as it could not have proceeded via cue colour directly.”

      Page 6 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g.,2,31-33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Gap junction channels establish gated intercellular conduits that allow the diffusion of solutes between two cells. Hexameric connexin26 (Cx26) hemichannels are closed under basal conditions and open in response to CO2. In contrast, when forming a dodecameric gap-junction, channels are open under basal conditions and close with increased CO2 levels. Previous experiments have implicated Cx26 residue K125 in the gating mechanism by CO2, which is thought to become carbamylated by CO2. Carbamylation is a labile post-translational modification that confers negative charge to the K125 side chain. How the introduction of a negative charge at K125 causes a change in gating is unclear, but it has been proposed that carbamylated K125 forms a salt bridge with the side chain at R104, causing a conformational change in the channel. It is also unclear how overall gating is controlled by changes in CO2, since there is significant variability between structures of gap-junction channels and the cytoplasmic domain is generally poorly resolved. Structures of WT Cx26 gap-junction channels determined in the presence of various concentrations of CO2 have suggested that the cytoplasmic N-terminus changes conformation depending on the concentration of the gas, occluding the pore when CO2 levels are high.

      In the present manuscript, Deborah H. Brotherton and collaborators use an intercellular dye-transfer assay to show that Cx26 gap-junction channels containing the K125E mutation, which mimics carbamylation caused by CO2, are constitutively closed even at CO2 concentrations where WT channels are open. Several cryo-EM structures of WT and mutant Cx26 gap junction channels were determined under various conditions and using classification procedures that extracted more than one structural class from some of the datasets. Together, the features of each of the different structures are generally consistent with previously obtained structures at different CO2 concentrations and support the mechanism that is proposed in the manuscript. The most populated class for K125E channels determined at high CO2 shows a pore that is constricted by the N-terminus, and a cytoplasmic region that was better resolved than in WT channels, suggesting increased stability. The K125E structure closely resembles one of the two major classes obtained for WT channels at high CO2. These findings support the hypothesis that the K125E mutation biases channels towards the closed state, while WT channels are in an equilibrium between open and closed states even in the presence of high CO2. Consistently, a structure of K125E obtained in the absence of CO2 appeared to also represent a closed state but at lower resolution, suggesting that CO2 has other effects on the channel beyond carbamylation of K125 that also contribute to stabilizing the closed state. Structures determined for K125R channels, which are constitutively open because arginine cannot be carbamylated, and would be predicted to represent open states, yielded apparently inconclusive results.

      A non-protein density was found to be trapped inside the pore in all structures obtained using both DDM and LMNG detergents, suggesting that the density represents a lipid rather than a detergent molecule. It is thought that the lipid could contribute to the process of gating, but this remains speculative. The cytoplasmic region in the tentatively closed structural class of the WT channel obtained using LMNG was better resolved. An additional portion of the cytoplasmic face could be resolved by focusing classification on a single subunit, which had a conformation that resembled the AlphaFold prediction. However, this single-subunit conformation was incompatible with a C6-symmetric arrangement. Together, the results suggest that the identified states of the channel represent open states and closed states resulting from interaction with CO2. Therefore, the observed conformational changes illuminate a possible structural mechanism for channel gating in response to CO2.

      Some of the discussion involving comparisons with structures of other gap junction channels is relatively hard to follow as currently written, especially for a general readership. Also, no additional functional experiments were carried out to test the hypotheses arising from the data. However, structures were determined under multiple conditions, with results consistent with the main hypothesis of the manuscript. No discussion is provided, even a speculative one, to explain the difference in behavior between hemichannels and gap junction channels. Also, no attempt was made to measure the dimensions of the pore, which is relevant given the importance of establishing whether the structures indeed represent open or closed states of the channel.

      We have considerably revised the manuscript in an attempt to make it more tractable. We respond to the individual comments below.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Brotherton et al. describes a structural study of connexin-26 (Cx26) gap junction channel mutant K125E, which is designed to mimic the CO2-inhibited form of the channel. In the wild-type Cx26, exposure to CO2 is presumed to close the channel through carbamylation of the residue K125. The authors mutated K125 to a negatively charged residue to mimic this effect, and they observed by cryo-EM analysis of the mutated channel that the pore of the channel is constricted. The authors were able to observe conformations of the channel with resolved density for the cytoplasmic loop (in which K125 is located). Based on the observed conformations and on the position of the N-terminal helix, which is involved in channel gating and in controlling the size of the pore, the authors propose the mechanisms of Cx26 regulation.

      Strengths:

      This is a very interesting and timely study, and the observations provide a lot of new information on connexin channel regulation. The authors use state-of-the-art cryo-EM analysis and 3D classification approaches to tease out the conformations of the channel that can be interpreted as "inhibited", with important implications for our understanding of how the conformations of connexin channels are controlled.

      Weaknesses:

      My fundamental question about the premise of this study is: to what extent can K125 carbamylation be recapitulated by a simple K125E mutation? Lysine has a large side chain, and carbamylation would make it slightly larger still. While the authors make a compelling case for E125-induced conformational changes, focusing primarily on the negative charge, I wonder whether they considered the extent to which their observations with this mutant may translate to the carbamylated lysine in the wild-type Cx26, considering not only the charge but also the size of the modified side chain.

      This is an important point. We agree that the difference in size could have a different effect on the structure. For kinases, aspartate or glutamate is often used as a mimic of phosphorylated serine or threonine, and these mimics have the same issue. The fact that we cannot resolve the relevant side-chains in the density may indicate that the mutation does not give the whole story: it may shift the equilibrium towards the closed conformation without stably trapping the molecule in that conformation. We include a comment to this effect in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The mechanism underlying the well-documented CO2-regulated activity of connexin 26 (Cx26) remains poorly understood. This is largely due to the labile nature of CO2-mediated carbamylation, making it challenging to visualize the effects of this reversible posttranslational modification. This paper by Brotherton et al. aims to address this gap by providing structural insights through cryo-EM structures of a carbamylation-mimetic mutant of the gap junction protein.

      Strengths:

      The combination of the mutation, elevated PCO2, and the use of LMNG detergent resulted in high-resolution maps that revealed, for the first time, the structure of the cytoplasmic loop between transmembrane helix (TM) 2 and 3.

      Weaknesses:

      The presented maps merely reinforce their previous findings, wherein wildtype Cx26 favored a closed conformation in the presence of high PCO2. While the structure of the TM2-TM3 loop may suggest a mechanism for stabilizing the closed conformation, no experimental data was provided to support this mechanism. Additionally, the cryo-EM maps were not effectively presented, making it difficult for readers to grasp the message.

      We have extensively revised the manuscript so that the novelty of this study is more apparent. There are three major points:

      (1) The carbamylation mimetic pushes the channel towards the closed conformation. Previously, we showed only that CO2 pushes the channel towards this conformation. Although we could show that this was not due to pH, and could speculate that it was due to carbamylation as suggested by previous mutagenesis studies, our data did not provide any mechanism by which Lys125 was involved.

      (2) In going from the open to the closed conformation, there is a conformational change not only in TM2, as we saw previously, but also in TM1, the linker to the N-terminus and the cytoplasmic loop. Thus, there is a clear connection between Lys125 and the conformation of the pore-closing N-terminus.

      (3) We observe, for the first time in any connexin structure, density for the cytoplasmic loop. Since this loop is important in regulation, knowing how it might influence the positions of the transmembrane helices is important if we are to understand how connexins can be regulated.

      Reviewing Editor:

      The reviewers have agreed on a list of suggested revisions that would improve the eLife assessment if implemented, which are as follows:

      (1) For completeness, Figure 1 could be supplied with an example of how the experiment would look in the presence of CO2 - for the wild-type and for the K125E mutant. Presumably for the wild-type this has been done previously in exactly this assay format, but this control would be an important part of the characterization of the mutant. Page 4, lines 105-106; "unsurprisingly, Cx26K125E gap junctions remain closed at a PCO2 of 55 mmHg." The data should be presented in the manuscript.

      We have now included the data with a PCO2 of 55 mmHg. This is now Figure 4 in our revised manuscript.

      (2) Would AlphaFold predictions show any interpretable differences in the E125 mutant, compared to the K125 (the wild-type)?

      We tried this in response to the reviewer's suggestion but did not see any interpretable differences. In general, AlphaFold is not recognised as giving meaningful information about point mutations.

      (3) The K125R mutant appears to be a more effective control for extracting significant features from the K125E maps. Given that the use of a buffer containing high PCO2 is essential for obtaining high-resolution maps, wildtype Cx26 is unsuitable as an appropriate control. The K125R map, obtained at a high resolution (2.1Å), supports its suitability as a robust control.

      Though we are unsure what the referee is referring to here, we have rewritten this section and now compare against the K125R map (Figure 5a) as well as the map derived from the wild-type protein. The important point is that the K125E mutation causes a structural change that is consistent with the closure of the gap junctions that we observe in the dye-transfer assays.

      (4) Likewise, the rationale for using wildtype Cx26 maps obtained in DDM is unclear. Wildtype Cx26 seems to yield much better cryo-EM maps in LMNG. We suggest focusing the manuscript on the higher-quality maps, and providing supporting information from the DDM maps to discuss consistency between observations and the likely possibility that the nonprotein density in the pore is lipid and not detergent.

      The rationale for comparing the mutants against the wt Cx26 maps obtained in DDM was because the mutants were also solubilised in DDM. However, taking the lead from the referees’ comments, we have now rewritten the manuscript so that we first focus on the data we obtain from protein solubilised in LMNG. We feel this makes our message much clearer.

      (5) In general, the rationale for utilizing cryo-EM maps with the entire set of selected particles is unclear. Although the overall resolutions may slightly improve in this approach, the regions of interest, such as the N-terminus and the cytoplasmic loop, appear to be better ordered after further classifications. The paper would be more comprehensible if it focused solely on the classes representing the pore-constricting N-terminus (PCN) and the pore-open flexible N-terminus (POFN) conformations. Also, the nomenclatures used in the manuscript, such as "WT90-Class1", "K125E90-1", "LMNG90-class1", "LMNG90-mon-pcn", are confusing.

      LMNG90s are also wildtype; K125E90-1 is Class 1 for this mutant and is similar to WT90-Class2, which represents the PCN conformation. More consistent and intuitive nomenclature would be helpful.

      We agree with the referees’ comments. This should now be clearer with our rewritten manuscript where we have simplified this considerably. We now call the conformations NConst (N-terminus defined and constricting the pore) and NFlex (N-terminus not visible) and keep this consistent throughout.

      (6) A potential salt bridge between the carbamylated K125 and R104 is proposed to account for the prevalence of Class-1 (i.e., PCN) in the majority of cryo-EM particles. However, the side chain densities are not well-defined, suggesting that such an interaction may not be strong enough to trap Cx26 in a closed conformation. Furthermore, the absence of experimental data to support this mechanism makes it unclear how likely this mechanism may be. Combining simple mutagenesis, such as R104E, with a dye transfer assay could offer support for this mechanism. Are there any published experimental results that could help address this question without the need for additional experimental work? Alternatively, as acknowledged in the discussion, this mechanism may be deemed as an "over-simplification." What is an alternative mechanism?

      R104 has been mutated to alanine in gap junctions and tested in a dye-transfer assay, as now mentioned in the text (Nijar et al, J Physiol 2021), supporting this role. In hemichannels, R104 has been mutated to both alanine and glutamate and tested through dye-loading assays (Meigh et al, eLife 2013). Also in hemichannels, R104 and K125 have been mutated to cysteines, allowing them to be cross-linked through a disulphide bond. This mutant responds to a change in redox potential in a similar way to how the wild-type protein responds to CO2 (Meigh et al, Open Biol 2015). Therefore, there is no doubt that the residues are important for the mechanism, and the salt-bridge interaction seems a plausible mechanism to reconcile the mutagenesis data; however, we cannot be sure that there are not other interactions involved that are necessary for closure. This information has now been included in the text.

      (7) The cryo-EM maps presented in the manuscript propose that gap junctions are constitutively open under normal PCO2 as the flexible N-terminus clears the solute permeation pathway in the middle of the channel. However, hemichannels appear to be closed under normal PCO2. It is puzzling how gap junctions can open when hemichannels are closed under normal PCO2 conditions. If this question has been addressed in previous studies, the underlying mechanism should be explicitly described in the introduction. If it remains an open question, differences in the opening mechanisms between hemichannels and gap junctions should be investigated.

      We suspect this is due to the difference in flexibility of gap junctions relative to hemichannels. However, a discussion of this is beyond the scope of this paper and would be pure speculation based on hemichannel structures of other connexins, determined in different buffering systems. There are no high-resolution structures of Cx26 hemichannels.

      (8) A mystery density likely representing a lipid is abruptly introduced, but the significance of this discovery is unclear. It is hard to place the lipid on Figure S6 in the wider context of everything else that is discussed in the text. It would be helpful for readers if a figure were provided to show where the density is located in relation to all the other regions that are extensively discussed in the text.

      In the revised text this section has been completely rewritten. We have now included a more informative view in a new figure (Figure 1 – figure supplement 3).

      (9) Including and displaying even tentative pore-diameter measurements for the different states - this would be helpful for readers and provide a more direct visual cue as to the difference between open and closed states.

      We have purposely avoided giving precise measurements of the pore diameter, since this depends on how we model the N-terminus. The first three residues are difficult to model into the density without causing steric clashes with the neighbouring subunits.

      (10) Given that no additional experiments for channel function were carried out, it would be useful to provide a more detailed discussion of additional mutagenesis results from the literature that are related to the experimental results presented.

      We have amplified this in the discussion (see answer to point 6).

      The reviewers also agreed that improvements in the presentation of the data would strengthen the manuscript. Here is a summary list of suggestions by reviewers aimed at helping improve how the data is presented:

      (1) Why is the pipette bright green in the top image, but rather weakly green in the bottom image in Figure 1 - is this the case for all images?

      (Now Figure 4) This depends on whether or not the pipette was in the focal plane. The important point of these images is the difference in intensity between the donor and the recipient cell. The graphs in Figure 4c illustrate clearly the difference between the wild-type and the mutant gap junctions.

      (2) In figures 2-5, labels would help a lot in understanding what is shown - while the legends do provide the information on what is presented, it would help the reader to see the models/maps with labels directly in the panel. For example, Figure 2a/b - just indicating "WT90 Cx26" in pink and "K125E90" in blue directly in the panel would reduce the work for the reader.

      We have extensively modified the labels in the figures to address this issue.

      (3) Figure 4 - magenta and pink are fairly close, and to avoid confusion it might be useful to use a different color selection. This is especially true when structures are overlaid, as in this figure - the presentation becomes rather complicated, so the less confusion the color code can introduce, the better.

      (Now Figure 2) We have now changed pink to blue.

      (4) Figure 5 - a remarkably under-labelled figure.

      Now added labels.

      (5) Figure 6 - it would be interesting to add a comparison to Cx32 here as well for completeness, since the structure has been published in the meantime.

      Cx32 has now been included.

      (6) Figure 7 - please add equivalent labels on both sides of the model, left and right. Add the connecting lines for all of the TM-helix tubes - this will help in tracing the structural elements shown. The legend does not quite explain the colors.

      We have modified the figure as suggested and explained the colours in the legend.

      (8) Fig.1 legend; Unclear what mCherry fluorescence represents. State that Cx26 was expressed as a translational fusion with mCherry.

      Now figure 4. We have now written “Montages each showing bright field DIC image of HeLa cells with mCherry fluorescence corresponding to the Cx26K125E-mCherry fusion superimposed (leftmost image) and the permeation of NBDG from the recorded cell to coupled cells.”

      (9) Fig. 3 b); Show R104 in the figure. Also E129-R98/R99 interaction is hard to acknowledge from the figure. It seems that the side chain density of E129 is not strong enough to support the modeled orientation.

      This is now Figure 1c. While the density in this region is sufficient to be confident of the main chain, we agree that the side chain density for the E129-R98/R99 interaction is not sufficiently clear to draw attention to and have removed the associated comment from the figure legend. The density is focussed on the linker between TM1 and the N-terminus and the KVRIEG motif. We prefer to omit R104, in order to keep the focus on this region. As described in the manuscript, the density for the R104 side chain is poor.

      (10) Fig. 3 c); Label the N-terminus and KVRIEG motif in the figure.

      Now Figure 1b. We have labelled the N-terminus. The KVRIEG motif is not visible in this map.

      (11) Page 9, lines 246-248; Restate, "We note, however, density near to Lys125, between Ser19 in the TM1-N-term linker, Tyr212 of TM4 and Tyr97 on TM3 of the neighbouring subunit, which we have been unable to explain with our modelling."

      We have reworded this.

      (12) Page 14, line 399; Patch clamp recording is not included in the manuscript.

      Patch clamp recordings were used to introduce dye into the donor cell.

      (13) On the same Figure 2, clashes are mentioned but these are hard to appreciate in any of the figures shown. Perhaps would be useful to include an inset showing this.

      We have modified Figure 2b slightly and added an explanation to highlight the clash. It is slightly confusing because the residues involved belong to neighbouring subunits.

      (14) The discussion related to Figure 6 is very hard to follow for readers who are not familiar with the context of abbreviations included on the figure labels. This figure could be improved to allow a general readership to identify more clearly each of the features and structural differences that are discussed in the text.

      We have extensively changed the text and updated the labels on the figure to make it much easier for the reader to follow.

      Below, you can find the individual reviews by each of the three reviewers.

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 2d-e, the text discusses differences between K125E90-1 and WT90-Class2 (7QEW), yet the figure compares K125E with 7QEQ. I suggest including a figure panel with a comparison between the two structures discussed in the manuscript text.

      This has been changed in the revised manuscript.

      Other comments have been addressed above.