    1. eLife Assessment

      This important work employed a recent functional muscle network analysis to evaluate rehabilitation outcomes in post-stroke patients. While the research direction is relevant and suggests the need for further investigation, the strength of evidence supporting the claims is incomplete. Muscle interactions can serve as biomarkers, but improvements in function are not directly demonstrated, and the method's robustness is not benchmarked against existing approaches.

    2. Reviewer #1 (Public review):

      While the revised manuscript includes additional methodological details and a supplementary comparison with conventional NMF, core issues remain. The authors should either add the points below as limitations in the manuscript or change the title and abstract accordingly:

      (1) The study claims to evaluate rehabilitation outcomes without demonstrating that patients actually improved functionally

      (2) The comparison with existing methods lacks the quantitative rigor needed to establish superiority

      (3) The added value of this complex framework over much simpler alternatives has not been demonstrated

      The strength of evidence supporting the main claims remains incomplete. I would encourage the authors to consider discussing these points by:

      (1) including or adding a limitation section about functional outcome measures that go beyond clinical scale scores, (2) providing/discussing quantitative benchmarks showing their method outperforms alternatives on specific, predefined metrics, and (3) clarifying the clinical pathway by which these biomarkers would inform treatment decisions.

      There are also specific, relatively minor points that require attention:

      The authors write: "we did not focus on such complementary evidence in this study." This is a weakness for a paper claiming to provide "biomarkers of therapeutic responsiveness." The FMA-UE threshold defines responders, but there's no independent validation that patients actually functioned better in daily life. Can you please clarify?

      Perhaps I missed where this is addressed, but with the added NMF plot the authors list 'lower dimensionality' among their framework's advantages (in the Supplementary Materials), and the basis for this claim is not clear: 12 network components were extracted (5 redundant + 7 synergistic networks) compared to 11 synergies from the conventional NMF approach, which does not support a clinical or outcome advantage for this method. Please clarify.

    3. Reviewer #2 (Public review):

      This study presents an important analysis of how interactions between muscles can serve as biomarkers to quantify therapeutic responses in post-stroke patients. To do so, the authors employ an information-theoretical metric (co-information) to define muscle networks and perform cluster analysis.

      I thank the authors for improving the clarity of the Methods section; the newly added Figure 5 is very helpful.

      One minor suggestion is that the authors should avoid overloading the notation "m" for both the EMG measurement and the matrix of II values (Eq. 1.1), which I now realise was the source of some of my initial confusion. I suggest that the authors use separate notation for these two quantities.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study addresses an important clinical challenge by proposing muscle network analysis as a tool to evaluate rehabilitation outcomes. The research direction is relevant, and the findings suggest further research. The strength of evidence supporting the claims is, however, limited: the improvements in function are not directly demonstrated, the robustness of the method is not benchmarked against already published approaches, and key terminology is not clearly defined, which reduces the clarity and impact of the work.

      Comments:

      There are several aspects of the current work that require clarification and improvement, both from a methodological and a conceptual standpoint.

      First, the actual improvements associated with the rehabilitation protocol remain unclear. While the authors report certain quantitative metrics, the study lacks more direct evidence of functional gains. Typically, rehabilitation interventions are strengthened by complementary material (e.g., videos or case examples) that clearly demonstrate improvements in activities of daily living. Including such evidence would make the findings more compelling.

      We thank the reviewer for their careful consideration of our work. We agree that direct evidence for the functional gains achieved by patients is important for establishing the efficacy of a clinical intervention and that this evidence should provide comprehensive insights for clinicians, from videos to case examples as suggested. Our aim here was to apply a novel computational framework to a cohort of patients undergoing rehabilitation and, in doing so, provide empirical support for its utility in standardised motor assessments. We have shown that our novel approach can identify distinct physiological responses to VR vs PT conditions across the post-stroke cohort (see Fig.2B and associated text). Hence, although the data contain virtual reality vs. conventional physical therapy experimental conditions, which likely hold important insights into the clinical use case of virtual reality interventions, we did not focus on such complementary evidence in this study. In future work, research groups (including our own) investigating the important question of clinical intervention efficacy will likely gain unique and useful mechanistic insights using our approach.

      Moreover, a threshold of 5 points on the FMA-UE was taken as the minimal clinically important difference (MCID) to distinguish between responder and non-responder patients, which represents an acknowledged and applicable measure in the clinical field. Single cases represent low-level evidence of change from the perspective of expert clinicians, raising concerns about the clinical meaningfulness of the reported results. Given all this, we chose to provide stronger evidence of clinical effect (i.e. a comparison between responders and non-responders), interpreted from the perspective of muscle synergies, rather than to support our results with single selected cases, which would introduce bias when translating to the population of stroke survivors.

      Second, the claim that the proposed muscle network analysis is robust is not sufficiently substantiated. The method is introduced without adequate reference to, or comparison with, the extensive literature that has proposed alternative metrics. It is also not evident whether a simpler analysis (e.g., EMG amplitude) might produce similar results. To highlight the added value of the proposed method, it would be important to benchmark it against established approaches. This would help clarify its specific advantages and potential applications. Moreover, several studies have shown very good outcomes when using AI and latent manifold analyses in patients with neural lesions. Interpreting the latent space appears even easier than interpreting muscle networks, as the manifolds provide a simple encoding-decoding representation of what the patient can still perform and what they can no longer do.

      To address the reviewer's concerns regarding adequate evidence for the claims made about the presented framework, we have now included an application of the conventional muscle synergy analysis approach, based on non-negative matrix factorisation, to the post-stroke cohort (see Supplementary materials Fig.5 and associated text). We made efforts to make this comparison as fair as possible by also applying the conventional approach at the population level and clustering the activation coefficients using a similar yet more conventional approach, agglomerative clustering. Accompanying the output of this application, we have included several points where our framework improves significantly upon conventional muscle synergy analysis:

      “Comparison with conventional approaches

      To more directly illustrate the advantages of the proposed framework, we carried out a standardised pre-processing of the EMG data in line with conventional muscle synergy analysis. This included rectification, low-pass filtration (cut-off: 20Hz) and smooth resampling of EMG waveforms to 50 timepoints. All data for each participant at each session was separately normalised by channel-wise variance, concatenated together and input into non-negative matrix factorisation (NMF) ('nnmf' Matlab function, 10 replications) to extract 11 muscle synergies (W1-11 of Supplementary Materials Fig.5(Left)) and their time-varying activations. The number of components to extract was determined in a conventional way as the number of components required to explain >75% of the data variance. The extracted muscle synergies included distinct shoulder- (e.g. W2), elbow (e.g. W8) and forearm-level (e.g. W1) muscle covariation patterns along with more isolated muscle contributions (e.g. UT in W3, TL in W10).

      To facilitate comparison between the clustering results of our framework and conventional approaches, we applied agglomerative clustering to the time-varying activation coefficients of all participants, trials and tasks, separately for pre- and post-sessions, and employed the 'evalclusters' Matlab function (Ward linkage clustering, Calinski-Harabasz criterion, Klist search = 2:21) for each session. We identified two clusters both at pre-session (Criterion = 1.69) and post-session (Criterion = 1.81) as optimal fits to the population data (see Supplementary Materials Fig.5(Right)). We found no associations between pre- or post-session cluster partitions and participants' FMA-UE scores. Nevertheless, we did identify significant associations between the pre-session clusterings and S_Pre (X<sup>2</sup> = 7.08, p = 0.008) and between post-session clusterings and conventionally-defined treatment responders (X<sup>2</sup> = 4.2, p = 0.04). These findings, along with the similar two-way clustering structure found using the NIF, highlight important commonalities between these approaches.

      To summarise the main advantages of our framework over this conventional approach:

      - Lower dimensionality and enhanced interpretability of extracted components.

      Our framework yields a lower number of population-level components that correspond more consistently to meaningful biomechanical and physiological functions.

      - Integration of pairwise muscle relationships.

      By incorporating muscle-pair level analysis, our framework captures coordinated interactions between primary and stabilising muscles—relationships that conventional NMF approaches overlook.

      - Separation of task-relevant and task-irrelevant activity.

      The NIF isolates task-relevant coordination patterns, distinguishing them from task-irrelevant interactions driven by biomechanical or task constraints. On the other hand, task-relevant and -irrelevant muscle contributions are intermixed in conventional muscle synergy analysis.

      - Ability to identify complementary functional roles.

      The NIF characterises whether muscle pairs act in similar or complementary ways, providing richer insight into motor control strategies.

      - Reduced dependence on variance-based optimisation.

      Unlike conventional methods that rely on maximising variance explained, our framework allows detection of subtle but functionally significant interactions that contribute less to total variance.

      - Improved detection of clinically relevant population structure.

      The clustering component of our framework revealed distinct post-stroke subgroups with important clinical relevance, distinguishing moderately and severely impaired cohorts and treatment responders and non-responders from pre-treatment data.”
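The chi-square associations reported in the quoted supplementary text (X<sup>2</sup> = 7.08, p = 0.008 and X<sup>2</sup> = 4.2, p = 0.04) can be sanity-checked with a short calculation. This is only a sketch: it assumes one degree of freedom (a 2×2 table of two clusters by two groups), which the response does not state, and `chi2_p_value_df1` is a hypothetical helper name rather than anything from the authors' Matlab code.

```python
import math


def chi2_p_value_df1(x2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the survival function reduces to erfc(sqrt(x2 / 2)).
    The df = 1 choice is an assumption (2 clusters x 2 groups).
    """
    return math.erfc(math.sqrt(x2 / 2.0))


# Statistics quoted in the response; the p-values are recomputed here.
p_pre = chi2_p_value_df1(7.08)   # pre-session clustering vs S_Pre
p_post = chi2_p_value_df1(4.2)   # post-session clustering vs responders

print(f"pre:  p = {p_pre:.4f}")
print(f"post: p = {p_post:.4f}")
```

Under that assumption the recomputed values round to the reported p = 0.008 and p = 0.04, which suggests both tests were run on 2×2 contingency tables.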

      This supplementary analysis is referred to in the Methods section of the main text with reference to previous similar comparisons between our framework and conventional approaches:

      “Towards finding an effective approach to clustering participants in this data based on differences in impairment severity and therapeutic (non-)responsiveness, we found that conventional clustering algorithms (e.g. agglomerative, k-means etc.) could not provide substantive outputs (see Supplementary Materials Fig.5 and associated text for a direct comparison with conventional approaches), perhaps resulting from the complex interdependencies between the modular activations.”

      “To facilitate comparisons with existing approaches, we performed a conventional muscle synergy analysis on the post-stroke cohort (see Supplementary Materials Fig.5 and associated text). Further comparisons with conventional approaches can be found in our previous work (O’Reilly & Delis, 2022).”
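The conventional pre-processing described in the supplementary comparison (rectification, 20 Hz low-pass filtering, resampling to 50 timepoints) could look roughly like the sketch below. Several things are assumptions here: the filter type is not given in the response, so a single-pole IIR low-pass stands in for whatever filter the authors used; the resampling is plain linear interpolation rather than their "smooth resampling"; the function names are hypothetical; and the input is a synthetic toy signal, not EMG data. The 2000 Hz sampling rate is taken from the authors' response to Reviewer #2.

```python
import math


def rectify(signal):
    """Full-wave rectification: absolute value of every sample."""
    return [abs(x) for x in signal]


def low_pass(signal, fs_hz, cutoff_hz):
    """Single-pole IIR low-pass filter (a stand-in, not the authors' filter)."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev = signal[0]
    for x in signal:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def resample(signal, n_points):
    """Linear interpolation onto n_points evenly spaced positions."""
    n = len(signal)
    out = []
    for i in range(n_points):
        pos = i * (n - 1) / (n_points - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[hi] * frac)
    return out


def preprocess(raw, fs_hz=2000.0, cutoff_hz=20.0, n_points=50):
    """Rectify, low-pass filter and resample one EMG channel."""
    return resample(low_pass(rectify(raw), fs_hz, cutoff_hz), n_points)


# Synthetic 1 s toy signal at 2000 Hz: an 80 Hz carrier with growing amplitude.
raw = [math.sin(2.0 * math.pi * 80.0 * t / 2000.0) * (t / 2000.0) for t in range(2000)]
envelope = preprocess(raw)
print(len(envelope), min(envelope), max(envelope))
```

The channel-wise variance normalisation and the NMF step itself are left out, since they depend on how the authors arranged the concatenated data.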

      Further, we have also referred to a previous analysis of this post-stroke dataset using the conventional approach in the discussion section, where we point out how our approach can identify salient features of post-stroke physiological responses that conventional approaches cannot:

      “Further, the NIF demonstrated here an enhanced capability over traditional approaches to identify these crucial patterns, as earlier work on related versions of this dataset could not identify any differentiable fractionation events across the cohort (Pregnolato et al., 2025).”

      Overall, the utility of conventional muscle synergy analysis is well recognised across the field (Hong et al., 2021). Our proposed approach builds on this conventional method by addressing key limitations to further enhance this clinical utility. We also agree that manifold learning approaches are an exciting area of research that we aim to incorporate into our framework in future research. Specifically, manifold learning methods like Laplacian eigenmaps can readily be applied to the co-membership matrix produced by our clustering algorithm, exploiting the geometry of this matrix to provide a continuous rather than discrete representation of population structure. We have highlighted this possibility in the discussion section:

      “Indeed, in future work, we aim to apply manifold learning approaches to the co-membership matrix derived from this clustering algorithm, providing a continuous representation of the population structure.”

      Third, the terminology used throughout the manuscript is sometimes ambiguous. A key example is the distinction made between "functional" and "redundant" synergies. The abstract states: "Notably, we identified a shift from redundancy to synergy in muscle coordination as a hallmark of effective rehabilitation-a transformation supported by a more precise quantification of treatment outcomes."

      However, in motor control research, redundancy is not typically seen as maladaptive. Rather, it is a fundamental property of the CNS, allowing the same motor task to be achieved through different patterns of muscle activity (e.g., alternative motor unit recruitment strategies). This redundancy provides flexibility and robustness, particularly under fatiguing conditions, where new synergies often emerge. Several studies have emphasized this adaptive role of redundancy. Thus, if the authors intend to use "redundancy" differently, it is essential to define the term explicitly and justify its use to avoid misinterpretation.

      We appreciate the reviewer's concerns regarding the terminology employed in this study. Indeed, we agree that redundancy is seen in the motor control literature as a positive feature of biological systems, appearing to contradict the interpretations of the redundancy-to-synergy information conversion result we have presented. We also wish to highlight that across the motor control literature and beyond, the idea of redundancy is often conflated with the related but distinct notion of degeneracy. Traditional motor control research has also recognised this difference; for example, Latash has outlined it in the seminal work on motor abundance (https://doi.org/10.1007/s00221-012-3000-4). A key reference discussing this conflation and these two concepts in an information-theoretic way is found here: https://doi.org/10.1093/cercor/bhaa148. To summarise what their arguments mean for our work:

      - System degeneracy relates to the ability of different system components to contribute towards the same task in a context-specific way.

      - System redundancy corresponds to the degree of functional overlap among system components.

      Hence, conceptually speaking, informational redundancy as employed in our study (i.e. functionally-similar muscle interactions) links with system redundancy in that it quantifies the functional overlap of system components. This definition of system redundancy implies that it is an unavoidable by-product of degenerate systems (an inefficient use of degrees of freedom) which should be minimised where possible. In our study and related previous work, patients displayed increased informational redundancy as a result of stroke. This links with the abnormal co-activations they typically experience and with previous results from traditional muscle synergy analysis showing fewer components extracted as a function of motor impairment post-stroke (i.e. higher informational redundancy) (Clark et al., 2010). Our novel contribution here is to convey how effective rehabilitation is underpinned by a redundancy-to-synergy information conversion across the muscle networks, relating loosely to a reduction in system redundancy and an enhancement of system degeneracy (i.e. functionally differentiated system components contributing towards task performance).

      Together with the mathematical descriptions of redundant (functionally-similar) and synergistic (functionally-complementary) information and the types of functional relationships they capture, we believe the intuition behind this finding has clear links with previous research showing a) the merging of muscle synergies in response to post-stroke impairment (i.e. functional de-differentiation) and b) the reduction in abnormal couplings with effective rehabilitation (i.e. functional re-differentiation). To communicate this more clearly to readers, we have included the following in the corresponding discussion section:

      “Previous research has shown that functional redundancy increases post-stroke (Cheung et al., 2012; Clark et al., 2010), reflecting the characteristic loss of functional specificity (i.e. functional de-differentiation) of muscle interactions post-stroke. Enhanced synergy with treatment here thus reflects the functional re-differentiation of predominantly flexor-driven muscle networks towards different, complementary task-objectives across the seven upper-limb motor tasks performed (Kim et al., 2024b), leading to improved motor function among responders.”

      Finally, we have screened the updated manuscript for consistent use of terminology including functional/redundant/synergistic.

      References

      Clark DJ, Ting LH, Zajac FE, Neptune RR, Kautz SA. Merging of healthy motor modules predicts reduced locomotor performance and muscle coordination complexity post-stroke. Journal of neurophysiology. 2010 Feb;103(2):844-57.

      Hong YN, Ballekere AN, Fregly BJ, Roh J. Are muscle synergies useful for stroke rehabilitation?. Current Opinion in Biomedical Engineering. 2021 Sep 1;19:100315.

      Latash ML. The bliss (not the problem) of motor abundance (not redundancy). Experimental brain research. 2012 Mar;217(1):1-5.

      O'Reilly D, Delis I. Dissecting muscle synergies in the task space. eLife. 2024 Feb 26;12:RP87651.

      Sajid N, Parr T, Hope TM, Price CJ, Friston KJ. Degeneracy and redundancy in active inference. Cerebral Cortex. 2020 Nov;30(11):5750-66.

      Reviewer #2 (Public review):

      Summary:

      This study analyzes muscle interactions in post-stroke patients undergoing rehabilitation, using information-theoretic and network analysis tools applied to sEMG signals with task performance measurements. The authors identified patterns of muscle interaction that correlate well with therapeutic measures and could potentially be used to stratify patients and better evaluate the effectiveness of rehabilitation.

      However, I found that the Methods and Materials section, as it stands, lacks sufficient detail and clarity for me to fully understand and evaluate the quality of the method. Below, I outline my main points of concern, which I hope the authors will address in a revision to improve the quality of the Methods section. I would also like to note that the methods appear to be largely based on a previous paper by the authors (O'Reilly & Delis, 2024), but I was unable to resolve my questions after consulting that work.

      I understand the general procedure of the method to be: (1) defining a connectivity matrix, (2) refining that matrix using network analysis methods, and (3) applying a lower-dimensional decomposition to the refined matrix, which defines the sub-component of muscle interaction. However, there are a few steps not fully explained in the text.

      (1) The muscle network is defined as the connectivity matrix A. Is each entry in A defined by the co-information? Is this quantity estimated for each time point of the sEMG signal and task variable? Given that there are only 10 repetitions of the measurement for each task, I do not fully understand how this is sufficient for estimating a quantity involving mutual information.

      We acknowledge the confusion caused here in how many datapoints were incorporated into the estimation of II. The number of datapoints included in each variable involved was in fact no. of timepoints x 10 repetitions. Hence for the EMGs employed in this analysis with a sampling rate of 2000Hz, the length of variables involved in this analysis could easily extend beyond 20,000 datapoints each. We have clarified this more specifically in the corresponding section of the methods:

      “We carried out this application in the spatial domain (i.e. interactions between muscles across time (Ó’Reilly & Delis, 2022)) by concatenating the 10 repetitions of each task executed on a particular side (i.e. variables of length no. of timepoints x 10 trials) and quantifying II with respect to this discrete task parameter codified to describe the motor task performed at each timepoint for each trial included.”
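As a rough worked check of the sample count described in this response, assume (hypothetically, since the trial length is not stated) that each repetition lasts $d$ seconds. With the stated sampling rate $f_s = 2000$ Hz and $n_{\text{rep}} = 10$ concatenated repetitions, the number of datapoints per variable is

$$
N = f_s \cdot d \cdot n_{\text{rep}} = 2000 \times d \times 10 = 20{,}000\,d ,
$$

so the figure of more than 20,000 datapoints quoted by the authors is reached whenever a repetition lasts longer than one second.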

      In the previous paper (O'Reilly & Delis, 2024), the authors initially defined the co-information (Equation 1.3) but then referred to mutual information (MI) in the subsequent text, which I found confusing. In addition, while the matrix A is symmetrical, it should not be orthogonal (the authors wrote A<sup>T</sup>A = I) unless some additional constraint was imposed?

      We thank the reviewer for spotting this typo in the previous paper, which described a symmetric matrix as A<sup>T</sup>A = I, a condition that in fact defines orthogonality. In the current study we have correctly described the symmetric matrix as A = A<sup>T</sup>:

      “We carried out this application in the spatial domain (i.e. interactions between muscles across time (Ó’Reilly & Delis, 2022)) by concatenating the 10 repetitions of each task executed on a particular side (i.e. variables of length no. of timepoints x 10 trials) and quantifying II with respect to this discrete task parameter codified to describe the motor task performed at each timepoint for each trial included. This computation was performed on all unique m<sub>x</sub> and m<sub>y</sub> pairings, generating symmetric matrices (A) (i.e. A = A<sup>T</sup>) composed separately of non-negative redundant and synergistic values (Fig.5).”
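To make the distinction between symmetry and orthogonality concrete, here is a small illustrative example (the values are arbitrary and not taken from the study). The matrix

$$
A = \begin{pmatrix} 0 & 0.5 \\ 0.5 & 0 \end{pmatrix}
$$

satisfies $A = A^{T}$, so it is symmetric, yet

$$
A^{T}A = \begin{pmatrix} 0.25 & 0 \\ 0 & 0.25 \end{pmatrix} \neq I ,
$$

so it is not orthogonal. A matrix of pairwise muscle couplings is symmetric by construction, because the coupling of $m_x$ with $m_y$ equals that of $m_y$ with $m_x$, and there is no reason for it to be orthogonal as well.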

      Regarding the reviewer's point about the reference to MI after Equation 1.3 of the previous paper, where co-information is defined: we were referring to the task-relevant and task-irrelevant estimates analysed there collectively as ‘MI estimates’, as both are derived from mutual information. The task-irrelevant estimate is the MI between two muscles conditioned on a task variable (conditional mutual information), and the task-relevant estimate is the difference between two MI values (co-I is a higher-order MI estimate). This removed the need to refer to each separately throughout the paper, which may in its own way cause some confusion. For clarity, in the results of that paper we also provided context on how each MI estimate was computed (see the beginning of the “Task-irrelevant muscle couplings”, “Task-redundant muscle couplings” and “Task-synergistic muscle couplings” results sections), referring throughout to the Venn diagrams depicting them (see Fig.1 of the previous paper). In the present study, however, for brevity and focus we did not analyse task-irrelevant muscle interactions and so decided to focus our terminology on co-I (II), a higher-order MI estimate. We acknowledge that this may have caused some confusion but highlight the efforts made to communicate each measure throughout the previous and present studies. We have explicitly pointed out this specific focus on task-dependent muscle couplings at the end of the introduction of the updated manuscript:

      “To do so, here we focussed our analysis on quantifying task-dependent muscle couplings (collectively referred to as II), extracting functionally-similar (i.e. redundant) and -complementary (i.e. synergistic) modules…”
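The relationship described above between mutual information, conditional mutual information and co-information can be illustrated with a toy calculation. This sketch rests on stated assumptions: it uses a discrete plug-in estimator on binary toy variables rather than the authors' estimator for continuous EMG, it takes co-I as I(m<sub>x</sub>; m<sub>y</sub>) minus I(m<sub>x</sub>; m<sub>y</sub> | task), and it adopts the sign convention that positive values are net redundant and negative values net synergistic. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


def co_information(x, y, task):
    """Co-information: MI between x and y minus their MI conditioned on task."""
    return mutual_information(x, y) - conditional_mutual_information(x, y, task)


# Redundant toy case: both "muscles" simply copy the task variable.
task_r = [0, 1, 0, 1]
co_i_redundant = co_information(task_r, task_r, task_r)

# Synergistic toy case: the task is the XOR of two independent "muscles".
pairs = list(product([0, 1], repeat=2))
mx = [a for a, _ in pairs]
my = [b for _, b in pairs]
task_s = [a ^ b for a, b in pairs]
co_i_synergistic = co_information(mx, my, task_s)

print(co_i_redundant, co_i_synergistic)
```

In the first case the two variables carry the same task information, giving a positive value; in the second the task can only be decoded from both together, giving a negative value.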

      (2) The authors should clarify what the following statement means: "Where a muscle interaction was determined to be net redundant/synergistic, their corresponding network edge in the other muscle network was set to zero."

      We acknowledge this sentence was unclear/misleading and have now clarified this statement in the following way:

      “This computation was performed on all unique m<sub>x</sub> and m<sub>y</sub> pairings, generating sparse symmetric matrices (A) (i.e. A = A<sup>T</sup>) composed separately of non-negative redundant and synergistic values (Fig.5).” We have also included a new figure (Fig.5) describing this text graphically.
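One way to read the clarified sentence is that each muscle pair contributes to only one of the two networks, depending on the sign of its net interaction. The sketch below illustrates that reading on an arbitrary 3-muscle example. The sign convention (positive means net redundant, negative means net synergistic), the function names and the toy values are all assumptions for illustration, not the authors' implementation.

```python
def split_co_information(ii):
    """Split a signed co-information matrix into two non-negative networks.

    Assumed convention: positive entries are net redundant, negative
    entries are net synergistic. Each edge is kept in one network and
    set to zero in the other.
    """
    redundant = [[v if v > 0 else 0.0 for v in row] for row in ii]
    synergistic = [[-v if v < 0 else 0.0 for v in row] for row in ii]
    return redundant, synergistic


def is_symmetric(a, tol=1e-12):
    """Check A = A^T element-wise."""
    n = len(a)
    return all(abs(a[i][j] - a[j][i]) <= tol for i in range(n) for j in range(n))


# Arbitrary symmetric toy matrix of signed co-information values.
ii = [
    [0.0, 0.30, -0.10],
    [0.30, 0.0, -0.25],
    [-0.10, -0.25, 0.0],
]
redundant, synergistic = split_co_information(ii)

for name, net in (("redundant", redundant), ("synergistic", synergistic)):
    print(name, net, "symmetric:", is_symmetric(net))
```

Both outputs inherit the symmetry of the input, contain only non-negative values, and are sparse in the sense that an edge present in one network is zero in the other.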

      (3) It should be clarified what the 'm' values are in Equation 1.1. Are these the co-information values after the sparsification and applying the Louvain algorithm to the matrix 'A'? Furthermore, since each task will yield a different co-information value, how is the information from different tasks (r) being combined here?

      We thank the reviewer for their attention to detail. For clarity, at the related section of Equation 1.1, we have clarified that the input matrix is composed of co-I estimates:

      “The input matrix for PNMF consisted of the sparsified A on both affected and unaffected sides from all participants at both pre- and post-sessions concatenated in their vectorised forms. More specifically, the input matrix composed of redundant or synergistic values was configured such that the set of unique muscle pairings (1 … K) on affected and unaffected sides (m<sub>aff</sub> and m<sub>unaff</sub> respectively)…”.

      The co-I estimates in this input matrix are indeed those that survived sparsification in previous steps. The Louvain algorithm, however, is used only to determine the number of modules to extract: it has no direct impact on the co-I estimates and does not transform them, and is simply employed to derive an empirical input parameter for dimensionality reduction. We refer the reviewer to the following part of this paragraph where this is described:

      “The number of muscle network modules identified in this final consensus partition was used as the input parameter for dimensionality reduction, namely projective non-negative matrix factorisation (PNMF) (Fig.1(D)) (Yang & Oja, 2010). The input matrix for PNMF consisted of the sparsified A on both affected and unaffected sides from all participants at both pre- and post-sessions concatenated together in their vectorised form.”
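The construction of the input matrix from vectorised networks can be sketched as follows. The layout here is an assumption: rows index the K unique muscle pairings and columns index individual networks (for example one per participant, side and session), whereas the response does not fully specify the orientation or how affected and unaffected sides are arranged. The helper names and toy values are hypothetical.

```python
def vectorise_upper(a):
    """Return the strict upper triangle of a symmetric matrix as a flat list.

    For n muscles this yields K = n * (n - 1) / 2 unique muscle pairings.
    """
    n = len(a)
    return [a[i][j] for i in range(n) for j in range(i + 1, n)]


def build_input_matrix(networks):
    """Stack vectorised networks as columns (assumed layout: K rows x N columns)."""
    columns = [vectorise_upper(a) for a in networks]
    k = len(columns[0])
    if any(len(col) != k for col in columns):
        raise ValueError("all networks must share the same muscle set")
    return [[col[r] for col in columns] for r in range(k)]


# Two toy sparsified networks over the same three muscles.
net_a = [[0.0, 0.3, 0.0], [0.3, 0.0, 0.2], [0.0, 0.2, 0.0]]
net_b = [[0.0, 0.0, 0.1], [0.0, 0.0, 0.4], [0.1, 0.4, 0.0]]

input_matrix = build_input_matrix([net_a, net_b])
for row in input_matrix:
    print(row)
```

A non-negative matrix of this shape is the kind of input a factorisation such as PNMF would take, with the number of components supplied separately (in the authors' pipeline, from the community-detection step).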

      Finally, as the reviewer has mentioned, the co-I estimates from the same muscle pairings but for different tasks, experimental sessions and participants are indeed different, reflecting their task-specific tuning, changes with rehabilitation and individual differences. To combine these representations into low-dimensional components, we employed projective non-negative matrix factorisation (PNMF). As outlined in the previous paper and earlier work on this framework (O’Reilly & Delis, 2022), dimensionality reduction here can generate highly generalisable motor components that effectively represent large populations of participants, tasks and sessions, while allowing the individual differences mentioned by the reviewer to be buffered into the corresponding activation coefficients. For this reason, these activation coefficients are the focus of the cluster analyses used to characterise the post-stroke cohort in the present study. We have explicitly provided this reason in the methods section of the updated manuscript:

      “We focussed on $a$ here as the extraction of population-level functional modules enabled the buffering of individual differences into the space of modular activations, making them an ideal target for identifying population structure.”

      (4) In general, I recommend improving the clarity of the Methods section, particularly by being more precise in defining the quantities that are being calculated. For example, the adjacency matrix should be defined clearly using co-information at the beginning, and explain how it is changed/used throughout the rest of the section.

      We thank the reviewer for their constructive advice and have gone to lengths to improve the clarity of the methods section. Firstly, we have addressed all of the reviewer's comments on specific sections of the methods, explaining more clearly the ‘why’ and ‘how’ of what was performed. Secondly, we have now included an additional figure illustrating how co-information was quantified at the network level and separated into redundant and synergistic values (see Fig.5 of updated manuscript). Finally, we have re-structured several paragraphs of the methods section to enhance flow, with additional subheadings for clarity.

      (5) In the previous paper (O'Reilly & Delis, 2024), the authors applied a tensor decomposition to the interaction matrix and extracted both the spatial and temporal factors. In the current work, the authors simply concatenated the temporal signals and only chose to extract the spatial mode instead. The authors should clarify this choice.

      The reviewer is correct that a different dimensionality reduction approach was employed in the previous paper. In the present study, we instead chose to employ projective non-negative matrix factorisation, as was employed in a preliminary paper on this framework (O’Reilly & Delis, 2022). This decision was made simply to maintain brevity and simplicity in the analysis and presentation of results as we introduce other tools to the framework (i.e. the clustering algorithm). Indeed, we could just as easily have employed the tensor decomposition to extract both spatial and temporal components; however, we believed the main take-away points of this paper could be more easily communicated using spatial networks only. To clarify this difference for readers we have included the following in the methods section:

      “The choice of PNMF here, in contrast to the space-time tensor decomposition employed in the parent study (O’Reilly & Delis, 2024), was chosen simply to maintain brevity by focussing subsequent analyses on the spatial domain.”

      References

      O'Reilly D, Delis I. A network information theoretic framework to characterise muscle synergies in space and time. Journal of Neural Engineering. 2022 Feb 18;19(1):016031.

      O'Reilly D, Delis I. Dissecting muscle synergies in the task space. eLife. 2024 Feb 26;12:RP87651.

      Recommendations for the authors:

      Reviewing Editor Comments:

      Both reviewers are concerned with the manuscript in its current form. They questioned the relevance of the current approach in providing functional or mechanistic explanations about the rehabilitation process of post-stroke patients. Our eLife Assessment would change if you include comparisons between your current method and classical ones, in addition to improving the description of your method to strengthen the evidence of its robustness.

      Reviewer #1 (Recommendations for the authors):

      There is a minor typographical error in Figure 2 ("compononents" should be corrected).

      This error has been rectified.

      Reviewer #2 (Recommendations for the authors):

      The authors should be able to address most of my concerns by providing a substantially improved version of the Methods section.

      See above responses to the reviewer's comments regarding the methods section.

      However, I would like the authors to explain in full detail (potentially including a simulation or power analysis) the procedure for estimating the co-information quantity, and to clarify whether it is robust given the sample size used in this paper.

      We refer the reviewer to our previous responses outlining with greater clarity the number of samples included in the estimation of co-I. We would also like to mention here that our framework does not make inferences on the statistical significance of individual muscle couplings (i.e. co-I estimates). Instead, these estimates are employed collectively for the sole purpose of pattern recognition. Nevertheless, to generate reliable estimates of the muscle couplings, we have employed a substantial number of samples for each co-I estimate (>20k samples in each variable), addressing the reviewer's main concern here.

    1. eLife Assessment

      This important work introduces a splitGFP-based labeling tool with an analysis pipeline for the synaptic scaffold protein bruchpilot, with tests in the adult Drosophila mushroom bodies, a learning center in the Drosophila brain. The evidence supporting the conclusions is convincing.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Wu et al. uses endogenous bruchpilot expression in a cell-type-specific manner to assess synaptic heterogeneity in adult Drosophila melanogaster mushroom body output neurons. The authors performed genomic on locus tagging of the presynaptic scaffold protein bruchpilot (brp) with one part of splitGFP (GFP11) using the CRISPR/Cas9 methodology and co-expressed the other part of splitGFP (GFP1-10) using the GAL4/UAS system. Upon expression of both parts of splitGFP, fluorescent GFP is assembled at the C-terminus of brp, exactly where brp is endogenously expressed in active zones. For manageable analysis, a high-throughput pipeline was developed. This analysis evaluated parameters like location of brp clusters, volume of clusters, and cluster intensity as a direct measure of the relative amount of brp expression levels on site using publicly available 3D analysis tools that are integrated in Fiji. Analysis was conducted for different mushroom body cell types in different mushroom body lobes using various specific GAL4 drivers. Further validation was provided by extending analysis to R8 photoreceptors that reside in the fly medulla. To test this new method of synapse assessment, Wu et al. performed an associative learning experiment in which an odor was paired with an aversive stimulus and found that in a specific time frame after conditioning, the new analysis solidly revealed changes in brp levels at specific synapses that are associated with aversive learning. Additionally, brp levels were assessed in R8 photoreceptor terminals upon extended exposure to light.

      Strengths:

      Expression of splitGFP bound to brp enables intensity analysis of brp expression levels as exactly one GFP molecule is expressed per brp. This is a great tool for synapse assessment. This tool can be widely used for any synapse as long as driver lines are available to co-express the other part of splitGFP in a cell-type-specific manner. As neuropils and thus brp label can be extremely dense, the analysis pipeline developed here is very useful and important. The authors have chosen an exceptionally dense neuropil - the mushroom bodies - for their analysis and compellingly show that brp assessment can be achieved even with such densely packed active zones. The result that brp levels change upon associative learning in an experiment with odor presentation paired with punishment is likewise compelling and strongly suggests that the tool and pipeline developed here can be used in an in vivo context. Thus, the tool and its uses have the potential to fundamentally advance protein analysis not only at the synapse but especially there.

      Weaknesses:

      The weaknesses I perceived originally were satisfactorily explained and refuted.

    3. Reviewer #2 (Public review):

      Summary:

      The authors developed a cell-type-specific fluorescence-tagging approach using a CRISPR/Cas9 induced split-GFP reconstitution system to visualize endogenous Bruchpilot (BRP) clusters at presynaptic active zones (AZ) in specific cell types of the mushroom body (MB) in the adult Drosophila brain. This AZ profiling approach was implemented in a high-throughput quantification process allowing comparison of synapse profiles within single cells, cell-types, MB compartments and between different individuals. The aim is to analyze in more detail neuronal connectivity and circuits in this center of associative learning, which is notoriously difficult to investigate due to the density of cells and structures within the cells. The authors detect and characterize cell-type specific differences in BRP-dependent profiling of presynapses in different compartments of the MB, while intracellular AZ distribution was found to be stereotyped. Next to the descriptive part characterizing various AZ profiles in the MB, the authors apply an associative learning assay and Rab3 knock-down and detect consequent AZ reorganization.

      Strengths:

      The strength of this study lies in the outstanding resolution of synapse profiling in the extremely dense compartments of the MB. This detailed analysis will serve as an entry point for many future studies of synapse diversity in connection with functional specificity to uncover the molecular mechanisms underlying learning and memory formation and neuronal network logic. Therefore, this approach is of high importance to the scientific community and represents a valuable tool to investigate and correlate AZ architecture and synapse function in the CNS.

      Weaknesses:

      The results and conclusions presented in this study are well supported by the data presented and appropriate controls. As a comment that could possibly aid and strengthen the manuscript (but not required for acceptance of the manuscript): The experiments in the study are based on split-GFP lines (BRP:GFP11 and UAS-GFP1-10). The authors clearly validate the new on-locus construct with a genomic GFP insertion (qPCR, confocal and STED imaging of the brain with anti-BRP (Nc82), MB morphology and memory formation). It would be important to comment on the significant overall intensity decrease of anti-BRP (Nc82) in Fig. S1B (R57C10>BRP::rGFP), and possibly a Western Blot with a correlative antibody staining against BRP might help to show that BRP protein levels are not affected. Additionally, it would be important to state, at least in the Materials and Methods section, that the flies are not homozygous viable (and to offer an explanation) and to state that all experiments were performed with heterozygous flies.

    4. Reviewer #3 (Public review):

      Summary:

      The authors develop a tool for marking presynaptic active zones in Drosophila brains, dependent on the GAL4 construct used to express a fragment of GFP, which will incorporate with a genome-engineered partial GFP attached to the active zone protein bruchpilot - signal will be specific to the GAL4 expressing neuronal compartment. They then use various GAL4s to examine innervation onto the mushroom bodies to dissect compartment specific differences in the size and intensity of active zones. After a description of these differences, they induce learning in flies with classic odour/electric shock pairing and observe changes after conditioning that are specific to the paired conditioning/learning paradigm.

      Strengths:

      The imaging and analysis appear strong. The tool is novel and exciting.

      Weaknesses:

      I feel that the tool could do with a little more characterisation. It is assumed that the puncta observed are AZs with no further definition or characterisation. It is not resolved whether the AZs visualised here are simply tagged, or whether the constructs are incorporated as an active, functional part of the AZ.

      Comments on revisions:

      Apologies, I should have thought of this in the first round of review. An experiment I would suggest (and it is not a difficult one) to address the functionality of the marker: It is mentioned that the genetically tagged half of the construct is homozygous lethal. Can this be placed in trans to a brp null, with a neuronal UAS-expression of the other half of Brp-GFP - Are the animals then 1) alive, and 2) able to fly (brp mutants can't fly, hence the name 'crashpilot') - a rescue would suggest (and that is all that would be needed here) that the reconstituted brp-GFP has function.

      On another note, the paper keeps switching between different DAN-GAL4 lines. In 1H, 2B and 4A, there are informative cartoons showing the extension of the neurons for PPL1, APL and DPM neurons - could these be incorporated into figures 5, 6 and 7, and the supplementary figures to help orient the reader? Ideally they would refer to a figure (in Fig 1?) that shows the groups of DANs in the adult brain that are known to innervate the MBs (e.g. Fig1 in Mao and Davis, Front in Neural Circuits 2009). I suggest this because I feel that this tool will be widely used, and if non-MB aficionados can follow what's being done here I feel it will be more widely accepted.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study by Wu et al. uses endogenous bruchpilot expression in a cell-type-specific manner to assess synaptic heterogeneity in adult Drosophila melanogaster mushroom body output neurons. The authors performed genomic on locus tagging of the presynaptic scaffold protein bruchpilot (BRP) with one part of splitGFP (GFP11) using the CRISPR/Cas9 methodology and co-expressed the other part of splitGFP (GFP1-10) using the GAL4/UAS system. Upon expression of both parts of splitGFP, fluorescent GFP is assembled at the N-terminus of BRP, exactly where BRP is endogenously expressed in active zones. For manageable analysis, a high-throughput pipeline was developed. This analysis evaluated parameters like location of BRP clusters, volume of clusters, and cluster intensity as a direct measure of the relative amount of BRP expression levels on site, using publicly available 3D analysis tools that are integrated in Fiji. Analysis was conducted for different mushroom body cell types in different mushroom body lobes using various specific GAL4 drivers. To test this new method of synapse assessment, Wu et al. performed an associative learning experiment in which an odor was paired with an aversive stimulus and found that, in a specific time frame after conditioning, the new analysis solidly revealed changes in BRP levels at specific synapses that are associated with aversive learning.

      Strengths:

      Expression of splitGFP bound to BRP enables intensity analysis of BRP expression levels as exactly one GFP molecule is expressed per BRP. This is a great tool for synapse assessment. This tool can be widely used for any synapse as long as driver lines are available to co-express the other part of splitGFP in a cell-type-specific manner. As neuropils and thus the BRP label can be extremely dense, the analysis pipeline developed here is very useful and important. The authors have chosen an exceptionally dense neuropil - the mushroom bodies - for their analysis and convincingly show that BRP assessment can be achieved with such densely packed active zones. The result that BRP levels change upon associative learning in an experiment with odor presentation paired with punishment is likewise convincing, and strongly suggests that the tool and pipeline developed here can be used in an in vivo context.

      Weaknesses:

      Although BRP is an important scaffold protein and its expression levels were associated with function and plasticity, I am still somewhat reluctant to accept that synapse structure profiling can be inferred from only assessing BRP expression levels and BRP cluster volume. Also, is it guaranteed that synaptic plasticity is not impaired by the large GFP fluorophore? Could the GFP10 construct that is tagged to BRP in all BRP-expressing cells, independent of GAL4, possibly hamper neuronal function? Is it certain that only active zones are labeled? I do see that plastic changes are made visible in this study after an associative learning experiment with BRP intensity and cluster volume as read-out, but I would be reassured by direct measurement of synaptic plasticity with splitGFP directly connected to BRP, maybe at a different synapse that is more accessible.

      We appreciate the reviewer’s comments. In the revised manuscript, we have clarified that Brp is an important, but not the only, player in the active zone. We have included new data to demonstrate that split-GFP tagging does not severely affect the localization and plasticity of Brp and the function of synapses by showing: (1) nanoscopic localization of Brp::rGFP using STED imaging; (2) colocalization between Brp::rGFP and anti-Brp signals/VGCCs; (3) activity-dependent Brp remodeling in R8 photoreceptors; (4) no defect in memory performance when labeling Brp::rGFP in KCs. These four lines of additional evidence further corroborate our approach to characterize endogenous Brp as a proxy of active zone structure.

      Reviewer #2 (Public review):

      Summary:

      The authors developed a cell-type specific fluorescence-tagging approach using a CRISPR/Cas9 induced split-GFP reconstitution system to visualize endogenous Bruchpilot (BRP) clusters as presynaptic active zones (AZ) in specific cell types of the mushroom body (MB) in the adult Drosophila brain. This AZ profiling approach was implemented in a high-throughput quantification process, allowing for the comparison of synapse profiles within single cells, cell types, MB compartments, and between different individuals. The aim is to analyse in more detail neuronal connectivity and circuits in this centre of associative learning. These are notoriously difficult to investigate due to the density of cells and structures within a cell. The authors detect and characterize cell-type-specific differences in BRP-dependent profiling of presynapses in different compartments of the MB, while intracellular AZ distribution was found to be stereotyped. Next to the descriptive part characterizing various AZ profiles in the MB, the authors apply an associative learning assay and detect consequent AZ re-organisation.

      Strengths:

      The strength of this study lies in the outstanding resolution of synapse profiling in the extremely dense compartments of the MB. This detailed analysis will be the entry point for many future analyses of synapse diversity in connection with functional specificity to uncover the molecular mechanisms underlying learning and memory formation and neuronal network logics. Therefore, this approach is of high importance for the scientific community and a valuable tool to investigate and correlate AZ architecture and synapse function in the CNS.

      Weaknesses:

      The results and conclusions presented in this study are, in many aspects, well-supported by the data presented. To further support the key findings of the manuscript, additional controls, comments, and possibly broader functional analysis would be helpful. In particular:

      (1) All experiments in the study are based on split-GFP lines (BRP:GFP11 and UAS-GFP1-10). The Materials and Methods section does not contain any cloning strategy (gRNA, primer, PCR/sequencing validation, exact position of tag insertion, etc.) and only refers to a bioRxiv publication. It might be helpful to add a Materials and Methods section (at least for the BRP:GFP11 line). Additionally, as this is an on-locus insertion in the BRP-ORF, it needs a general validation of this line, including controls (Western Blot and correlative antibody staining against BRP) showing that overall BRP expression is not compromised due to the GFP insertion and localizes as BRP in wild type flies, that flies are viable, have no defects in locomotion and learning and memory formation and MB morphology is not affected compared to wild type animals.

      We thank the reviewer for suggesting these important validations. We included details of the design of the construct and insertion site to the Methods section, performed several new experiments to validate the split-GFP tagging of Brp, and present the data in the revision.

      First, to examine whether the transcription of the brp gene is unaffected by the insertion of GFP<sub>11</sub>, we conducted qRT-PCR to compare the brp mRNA levels between brp::GFP<sub>11</sub>, UAS-GFP1-10 and UAS-GFP1-10 and found no difference (Figure 1 - figure supplement 1A).

      To further verify the effect of GFP<sub>11</sub> tagging at the protein level, we performed anti-Brp (nc82) immunohistochemistry of brains where GFP is reconstituted pan-neuronally. We found unaltered neuropile localization of nc82 signals (Figure 1 - figure supplement 1C). In presynaptic terminals of the mushroom body calyx, we found integration of Brp::rGFP to nc82 accumulation (Figure 1D). We performed super-resolution microscopy to verify the configuration of Brp::rGFP and confirmed the donut-shaped arrangement of Brp::rGFP in the terminals of motor neurons (see Wu, Eno et al., 2025 PLOS Biology), corroborating the nanoscopic assembly of Brp::rGFP at active zones (Kittel et al., 2006 Science).

      Furthermore, co-expression of RFP-tagged voltage-gated calcium channel alpha subunit Cacophony (Cac) and Brp::rGFP in PAM-γ5 dopaminergic neurons revealed strong presynaptic colocalization of their punctate clusters (Figure 1E), suggesting that rGFP tagging of Brp did not damage key protein assembly at active zones (Kawasaki et al., 2004 J Neuroscience; Kittel et al., Science).

      These lines of evidence suggest that the localization of endogenous Brp is barely affected by the C-terminal GFP<sub>11</sub> insertion or GFP reconstitution therewith. This is in line with a large body of studies confirming that the N-terminal region and coiled-coil domains, but not the C-terminal region, of Brp are necessary and sufficient for active zone localization (Fouquet et al., 2009 J Cell Biol; Oswald et al., 2010 J Cell Biol; Mosca and Luo, 2014 eLife; Kiragasi et al., 2017 Cell Rep; Akbergenova et al., 2018 eLife; Nieratschker et al., 2009 PLoS Genet; Johnson et al., 2009 PLoS Biol; Hallermann et al., 2010 J Neurosci). We nevertheless report homozygous lethality and found decreased immunoreactive signals in flies carrying the GFP<sub>11</sub> insertion (Figure 1 - figure supplement 1B).

      For these reasons, we always use heterozygotes for all the experiments, and therefore there is no conspicuous defect in locomotion as reported in the original study (Wagh et al., 2005 Neuron). To functionally validate the heterozygotes, we measured the aversive olfactory memory performance of flies where GFP reconstitution was induced in Kenyon cells using R13F02-GAL4. We found that these transgenes did not alter mushroom body morphology (Figure 7 - figure supplement 1) or memory performance as compared to wild-type flies (Figure 7 - figure supplement 2), suggesting the synapse function required for short-term memory formation is not affected by split-GFP tagging of Brp.

      (2) Several aspects of image acquisition and high-throughput quantification data analysis would benefit from a more detailed clarification.

      (a) For BRP cluster segmentation, it is stated in the Materials and Methods that intensity threshold and noise tolerance were "set" - this setting has a large effect on the quantification, and it should be specified and setting criteria named and justified (if set manually (how and why) or automatically (to what)). Additionally, if Python was used for "Nearest Neighbor" analysis, the code should be made available within this manuscript; otherwise, it is difficult to judge the quality of this quantification step.

      (b) To better evaluate the quality of both the imaging analysis and image presentation, it would be important to state whether the presented and analysed images are deconvolved and, if so, at least one proof-of-principle comparison of an original and a deconvolved file should be shown and quantified to show the impact of deconvolution on the output quality, as this is central to this study.

      We thank the reviewer for suggesting these clarifications. We have added further description to the revised manuscript to clarify the setting of segmentation, which was manually adjusted to optimize the F-score (previous Figure 1D, now moved to Figure 1 - figure supplement 5). We have included the code used for analyzing nearest neighbor distance, AZ density and local Brp density in the revised manuscript (Supplementary file 1), together with a pre-processed sample data sheet (Supplementary file 2).
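
      For orientation, a nearest neighbor distance computation of the kind discussed here can be sketched in a few lines. This is a minimal brute-force illustration, independent of the code in Supplementary file 1; the function name and the centroid values are hypothetical, and coordinates are assumed to be already converted to physical units.

```python
from math import dist


def nearest_neighbor_distances(centroids):
    """For each 3D centroid, return the distance to its closest other centroid.

    Brute force, O(n^2). Coordinates are assumed to be in physical units
    (e.g. micrometres), not voxel indices.
    """
    if len(centroids) < 2:
        raise ValueError("need at least two centroids")
    return [
        min(dist(p, q) for j, q in enumerate(centroids) if j != i)
        for i, p in enumerate(centroids)
    ]


# Hypothetical cluster centroids (x, y, z), for illustration only.
clusters = [(0.0, 0.0, 0.0), (0.3, 0.4, 0.0), (2.0, 0.0, 0.0)]
nnd = nearest_neighbor_distances(clusters)
```

      For the thousands of clusters in a dense neuropil, a spatial index such as a k-d tree would be the usual replacement for the quadratic loop.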

      Regarding image deconvolution, we have clarified the differential use of deconvolved and non-deconvolved images in the revised manuscript. We have also included a quantitative evaluation of Richardson-Lucy iterative deconvolution (Figure 1 - figure supplement 4). We used 20 iterations because the FWHM improved only marginally beyond this point (Figure 1 - figure supplement 4).

      (3) The major part of this study focuses on the description and comparison of the divergent synapse parameters across cell-types in MB compartments, which is highly relevant and interesting. Yet it would be very interesting to connect this new method with functional aspects of the heterogeneous synapses. This is done in Figure 7 with an associative learning approach, which is, in part, not trivial to follow for the reader and would profit from a more comprehensive analysis.

      (a) It would be important for the understanding and validation of the learning induced changes, if not (only) a ratio (of AZ density/local intensity) would be presented, but both values on their own, especially to allow a comparison to the quoted, previous AZ remodelling analysis quantifying BRP intensities (ref. 17, 18). It should be elucidated in more detail why only the ratio was presented here.

      We thank the reviewer for the suggestion on the presentation of learning-induced Brp remodeling. The reported values in Figure 7C are the correlation coefficient of AZ density and local intensity in each compartment, not the ratio. These results suggest that subcompartment-sized clusters of AZs with high Brp accumulation (Figure 6) undergo local structural remodeling upon associative learning (Figure 7). For clarity, we have added a schematic of this correlation and an example scatter plot to Figure 6. Unlike the previous studies (refs 17 and 18), we did not observe robust learning-dependent changes in the Brp intensity, possibly due to some confounding factors such as overall expression levels and conditioning protocols as described in the previous and following points, respectively.
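
      To make the distinction between a correlation coefficient and a ratio concrete, the sketch below computes a correlation between per-cluster AZ density and local intensity. It assumes a Pearson coefficient (the response does not state which coefficient was used), and the function name and data values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-cluster values within one compartment.
az_density = [1.0, 2.0, 3.0, 4.0]
brp_intensity = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(az_density, brp_intensity)
```

      Unlike a ratio, the coefficient is insensitive to a uniform scaling of either variable, so a global change in expression level alone would leave it unchanged.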

      (b) The reason why a single instead of a dual odour conditioning was performed could be clarified and discussed (would that have the same effects?).

      (c) Additionally, "controls" for the unpaired values - that is, flies receiving neither shock nor odour - would help to evaluate the unpaired control values in the different MB compartments.

      We use single odor conditioning because it is the simplest way to examine the effect of odor-shock association by comparing the paired and unpaired group. Standard differential conditioning with two odors contains unpaired odor presentation (CS-) even in the ‘paired’ group. We now show that single-odor conditioning induces memory that lasts one day as in differential conditioning (Figure 7B; Tully and Quinn, J Comp Phys A 1985).

      (d) The temporal resolution of the effect is very interesting (Figure 7D), and sampling more time points, especially between 90 and 270 min, might yield interesting results.

      The sampling time points after training were chosen based on approximately logarithmic intervals, as the memory decay is roughly exponential (Figure 7B). This transient remodeling is consistent with the previous studies reporting that the Brp plasticity was short-lived (Zhang et al., 2018 Neuron; Turrel et al., 2022 Current Biol).

      (e) Additionally, it would be very interesting and rewarding to have at least one additional assay, relating structure and function, e.g. on a molecular level by a correlative analysis of BRP and synaptic vesicles (by staining or co-expression of SV-protein markers) or calcium activity imaging or on a functional level by additional learning assays.

      We thank the reviewer for raising this important point. We have performed calcium imaging of KC presynaptic terminals to correlate the structure and function in another study (see Figure 2 in Wu, Eno et al., 2025 PLOS Biology for more detail). The basal presynaptic calcium pattern along the γ compartments is strikingly similar to the compartmental heterogeneity of Brp accumulation (see also Figure 2 in this study). Considering colocalization of other active-zone components, such as Cac (Figure 1E), we propose that the learning-induced remodeling of local Brp clusters should transiently modulate synaptic properties.

      As a response to other reviewers’ interest, we used Brp::rGFP to measure different forms of Brp-based structural plasticity upon constant light exposure in the photoreceptors and upon silencing rab3 in KCs. Since these experiments nicely reproduced the results of previous studies (Sugie et al., Neuron 2013; Graf et al., Neuron 2009), we believe the learning-induced plasticity of Brp clustering in KCs has a transient nature.

      Reviewer #3 (Public review):

      Summary:

      The authors develop a tool for marking presynaptic active zones in Drosophila brains, dependent on the GAL4 construct used to express a fragment of GFP, which will incorporate with a genome-engineered partial GFP attached to the active zone protein bruchpilot - signal will be specific to the GAL4-expressing neuronal compartment. They then use various GAL4s to examine innervation onto the mushroom bodies to dissect compartment-specific differences in the size and intensity of active zones. After a description of these differences, they induce learning in flies with classic odour/electric shock pairing and observe changes after conditioning that are specific to the paired conditioning/learning paradigm.

      Strengths:

      The imaging and analysis appear strong. The tool is novel and exciting.

      Weaknesses:

      I feel that the tool could do with a little more characterisation. It is assumed that the puncta observed are AZs with no further definition or characterisation.

      We performed additional validation on the tool, including (1) nanoscopic localization of Brp::rGFP using STED imaging; (2) colocalization between Brp::rGFP and anti-Brp signals/VGCCs (Figure 1D-E); (3) activity-dependent active zone remodeling in R8 photoreceptors (Figure 1F). These will be detailed in our point-by-point response below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The authors keep stating, they profile or assess synaptic structure by analyzing BRP localization, cluster volume, and intensity. However, I do not think that BRP cluster volume and intensity warrant an educated statement about presynaptic structure as a whole. I do not challenge the usefulness of BRP cluster analysis for synapse evaluation, but as there are so many more players involved in synaptic function, BRP analysis certainly cannot explain it all. This should at least be discussed.

      It is correct that Brp is not the only player in the active zone. We have included more discussion on the specific role of Brp (line 84 to 89) and other synaptic markers (line 250) and edited potentially misleading text.

      (2) I do see that changes in BRP expression were observed following associative learning, but is it certain, that synaptic plasticity is generally unaffected by the large GFP fluorophore? BRP is grabbing onto other proteins, both with its C- and N-termini. As the GFP is right before the stop codon, it should be at the N-terminus. How far could BRP function be hampered by this? Is there still enough space for other proteins to interact?

      We thank the reviewer for sharing these concerns. Here we provide three lines of evidence to demonstrate that the Brp assembly at active zones required for synaptic plasticity is unaffected by split-GFP tagging.

      First, we assessed olfactory memory of flies that have Brp::rGFP labeled in Kenyon cells and found the performance comparable to wild-type (Figure 7 - figure supplement 2), suggesting the Brp function required for olfactory memory (Knapek et al., J Neurosci 2011) is unaffected by split-GFP tagging.

      Second, we measured Brp remodeling in photoreceptors induced by constant light exposure (LL; Sugie et al., 2015 Neuron). Consistent with the previous study, we found that LL decreased the numbers of Brp::rGFP clusters in R8 terminals in the medulla, as compared to constant dark condition (DD). This result validates the synaptic plasticity involving dynamic Brp rearrangement in the photoreceptors. We have included this result into the revised manuscript (Figure 1F).

      To further validate protein interaction of Brp::rGFP, we focused on Rab3, as it was previously shown to control Brp allocation at active zones (Graf et al., 2009 Neuron). To this end, we silenced rab3 expression in Kenyon cells using RNAi and measured the intensity of Brp::rGFP clusters in γ Kenyon cells. As previously reported in the neuromuscular junction, we found that rab3 knock-down increased Brp::rGFP accumulation at the active zones, suggesting that Brp::rGFP represents the interaction with Rab3. We have included all the new data in the revised manuscript (Figure 1 - figure supplement 3).

      (3) It may well be that not only active-zone-associated BRP is labeled but possibly also BRP molecules elsewhere in the neuron. I would like to see more validation, e.g., the percentage of tagged endogenous BRP associated with other presynaptic proteins.

      To determine to what extent Brp::rGFP clusters represent active zones, we double-labelled Brp::rGFP and Cac::tdTomato (Cacophony, the alpha subunit of the voltage-gated calcium channels). We found that 97% of Brp::rGFP clusters co-localized with Cac::tdTomato in PAM-γ5 dopamine neuron terminals (Figure 1E), suggesting that most Brp::rGFP clusters represent functional AZs.

      (4) Z-size is ~200 nm, while x/y pixel size is ~75 nm during acquisition. How far down does the resolution go after deconvolution?

      The Z-step was 370 nm and the XY pixel size was 79 nm for image acquisition. We performed 20 iterations of Richardson-Lucy deconvolution using an empirical point spread function (PSF). We found that the full-width at half maximum (FWHM) of Brp::rGFP clusters improved only marginally beyond 20 iterations, at which point the XY FWHM was around 200 nm and the XZ FWHM was around 450 nm (Figure 1 - figure supplement 4).
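
      For readers unfamiliar with the procedure, the following is a minimal sketch of Richardson-Lucy deconvolution on a one-dimensional intensity profile, together with an FWHM estimate before and after. It is not the pipeline used in the study: it assumes a Gaussian PSF instead of an empirical one, a single point source, and an estimate initialized with the observed profile. All function names and the PSF width are hypothetical; only the 79 nm pixel size and the 20 iterations are taken from the response above.

```python
import math

def gaussian_psf(sigma_px, radius):
    """Normalized 1D Gaussian PSF sampled on 2 * radius + 1 pixels (assumed PSF shape)."""
    weights = [math.exp(-0.5 * (i / sigma_px) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_same(signal, kernel):
    """Zero-padded convolution with output as long as the input (odd-length kernel)."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - (k - radius)  # input sample paired with kernel tap k
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(acc)
    return out

def richardson_lucy(observed, psf, iterations=20):
    """Multiplicative Richardson-Lucy updates, initialized with the observed profile."""
    estimate = list(observed)
    mirrored = psf[::-1]  # correlation with the PSF equals convolution with its mirror
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / b if b > 1e-12 else 0.0 for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, mirrored)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

def fwhm(profile, pixel_size):
    """Full width at half maximum, linearly interpolated at the half-maximum crossings."""
    peak = max(profile)
    half = peak / 2.0
    center = profile.index(peak)

    def crossing(step):
        i = center
        while 0 <= i + step < len(profile) and profile[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(profile):
            return float(i)  # profile stays above half maximum up to the edge
        frac = (profile[i] - half) / (profile[i] - profile[j])
        return i + step * frac

    return (crossing(1) - crossing(-1)) * pixel_size

# Toy example: a single point source blurred by the assumed PSF.
truth = [0.0] * 101
truth[50] = 1.0
psf = gaussian_psf(sigma_px=3.0, radius=15)  # PSF width is arbitrary
blurred = convolve_same(truth, psf)
restored = richardson_lucy(blurred, psf, iterations=20)
print(fwhm(blurred, 79.0), fwhm(restored, 79.0))  # FWHM in nm before and after
```

      Re-running the last lines with different iteration counts gives a rough feel for how quickly the FWHM stops improving in this toy setting; real images with noise and an empirical PSF will behave differently.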

      (5) Figure Legend 7: What is a "cytoplasm membrane marker"? Does this mean membrane-bound tdTom is sticking into the cytoplasm?

      We apologize for the typo and have corrected it to “plasma membrane marker”.

      (6) At the end of the introduction: "characterizing multiple structural parameters..." - which were these parameters? I was under the assumption that BRP localization, cluster volume, and intensity were assessed. I do not see how these are structural parameters. Please define what exactly is meant by "structural parameters".

      We apologize for the confusion. By “structural parameters”, we indeed referred to the volume, intensity and molecular density of Brp::rGFP clusters. We have revised the sentence to “Characterizing the distinct parameters and localization of Brp::rGFP cluster.”

      (7) Next to last sentence of the introduction: "Characterizing multiple structural parameters revealed a significant synaptic heterogeneity within single neurons and AZ distribution stereotypy across individuals." What do the authors mean by "significant synaptic heterogeneity"?

      By “synaptic heterogeneity”, we refer to the intracellular variability of active zone cytomatrices reported by Brp clusters. For instance, the intensities of Brp::rGFP clusters within Kenyon cell subtypes were variable among compartments (Figure 2). Intracellular variability of the Brp concentration of individual active zones was higher in DPM and APL neurons than in Kenyon cells (Figure 3). These variabilities demonstrate intracellular synaptic heterogeneity. We have revised the sentence to be more specific about the different characteristics of Brp clusters.

      (8) I do not understand the last sentence of the introduction. "These cell-type-specific synapse profiles suggest that AZs are organized at multiple scales, ranging from neighboring synapses to across individuals." What do the authors mean by "ranging from neighboring synapses to across individuals"? Does this mean that even neighboring synapses in the same cell can be different?

      We have revised the sentence to “These cell-type-specific synapse profiles suggest that AZs are spatially organized at multiple scales, ranging from interindividual stereotypy to neighboring synapses in the same cells.”

      By “neighboring synapses”, we refer to the nearest neighbor similarity in Brp levels in some cell types (Figure 6A-C), and also the sub-compartmental dense AZ clusters with high Brp level in Kenyon cells (Figure 6D-H). By “across individuals”, we refer to the individually conserved active zone distribution patterns in some neurons (Figure 5).

      (9) The title talks about cell-type-specific spatial configurations. I do not understand what is meant by "spatial configurations"? Do you mean BRP cluster volume? I think the title is a little misleading.

      By “spatial configuration”, we refer to the arrangement of Brp clusters within individual mushroom body neurons. This statement is based on our findings on the intracellular synaptic heterogeneity (see also response to comment #7). We have streamlined the text description in the revised manuscript for clarity.

      Reviewer #2 (Recommendations for the authors):

      (1) For Figure 3A: exemplary two AZs are compared here, a histogram comparing more AZs would aid in making the point that in general, AZ of similar size have different BRP level (intensities) and how much variation exists.

      We have added histograms of Brp::rGFP intensity and cluster volume to Figure 3 in the revised manuscript.

      (2) Line 52: "endogenous synapses" is a confusing term; it's probably meant that the protein levels within the synapse are endogenous and not overexpressed. 

      We apologize for the confusion and have revised the term to “endogenous synaptic proteins.”

      (3) It is not clear from the Materials and Methods section, whether and where deconvolved or not-deconvolved images were used for the quantification pipeline. Please comment on this. 

      We have now revised the Methods section to clarify where deconvolved and non-deconvolved images were used in the pipeline.

      (4) Line 664 (C) not bold.

      We have corrected the error.

      (5) 725 "Files" should be Flies.

      We have corrected the error.

      (6) 727 two times "first".

      We have corrected the error.

      (7) Figure 7. All (A) etc., not bold - there should be consistent annotation. 

      We thank the reviewer for the detailed proofreading and have corrected all the errors spotted.

      Reviewer #3 (Recommendations for the authors):

      (1) Has there been an expression of the construct in a non-neuronal cell? Astrocyte-like cell? Any glia? As some sort of control for background and activity?

      As the reviewer suggested, we verified the neuronal expression specificity of Brp::rGFP. Using R86E01-GAL4 and Amon-GAL4, we compared Brp::rGFP in astrocyte-like glia and neuropeptide-releasing neurons. We found no Brp::rGFP puncta in the neuropils for astrocyte-like glia, in contrast to neurons, suggesting that Brp::rGFP is specific to neurons. We have included this new dataset in the revised manuscript (Figure 1 - figure supplement 2).

      (2) Similarly, expression of the construct co-expressed with a channelrhodopsin, and induction of a 'learning'-like regime of activity, similarly in a control type of experiment, expression of an inwardly rectifying channel (e.g. Kir2.1) to show that increases in size of the BRP puncta are truly activity dependent? The NMJ may be an optimal neuron to use to see the 'donut' structures of the AZs and their increase with activity. Also, are these truly AZs we are seeing here? Perhaps try co-expressing cacophony-dsRed? If the GFP Puncta are active zones, then they should be surrounded by cacophony.

      We would like to clarify that we did not find Brp::rGFP size increase upon learning. Instead, we demonstrated that associative training transiently remodelled sub-compartment-sized AZ “hot spots” in Kenyon cells, indicated by the correlation of local intensity and AZ density (Figure 6-7).

      To demonstrate that split-GFP tagging does not affect activity-dependent plasticity associated with Brp, we measured Brp remodeling in photoreceptors induced by constant light exposure (LL; Sugie et al., 2015 Neuron). Consistent with the previous study, we found that LL decreased the number of Brp::rGFP clusters in R8 terminals in the medulla, compared with the constant dark condition (DD). This result validates the synaptic plasticity involving dynamic Brp rearrangement in the photoreceptors (Figure 1F).

      As the reviewer suggested, we performed STED microscopy on larval motor neurons and confirmed the donut-shaped arrangement of Brp::rGFP (Wu, Eno et al., PLOS Biol 2025).

      Also following the reviewer’s suggestion, we double-labelled Brp::rGFP and Cac::tdTomato (Cacophony, the alpha subunit of the voltage-gated calcium channels). We found that 97% of Brp::rGFP clusters co-localized with Cac::tdTomato in PAM-γ5 dopamine neuron terminals (Figure 1E), suggesting that most Brp::rGFP clusters represent functional AZs.

      (3) In the introduction: Intro, a sentence about BRP - central organiser of the active zone, so a key regulator of activity.

      We have added a few more sentences about the role of Brp at active zones to the revised manuscript.

      (4) Figure 1 E, line 650 'cite the resource here'. 

      We thank the reviewer for pointing out the error and we have corrected it.

      (5) Many readers may not be MB aficionados, and to make the data more accessible, perhaps use a cartoon of an MB with the cell bodies of the neurons around the MB expressing the constructs highlighted so that the reader can have a wider idea of the anatomy in relation to the MB.

      We appreciate these comments and have appended cartoons of the MB to figures to help readers understand the anatomy.

    1. The scenarios Wooldridge imagines include a deadly software update for self-driving cars, an AI-powered hack that grounds global airlines, or a Barings bank-style collapse of a major company, triggered by AI doing something stupid. “These are very, very plausible scenarios,” he said. “There are all sorts of ways AI could very publicly go wrong.”

      Scenarios for a Hindenburg-style event: - deadly software update for self-driving cars - AI-powered hack grounding global airlines (not sure if that is clear enough to people, unlike self-driving cars running amok) - Barings-style collapse of a major company triggered by AI (if it's a tech company, it may cause less shock and more ridicule, but still)

    2. “It’s the classic technology scenario,” he said. “You’ve got a technology that’s very, very promising, but not as rigorously tested as you would like it to be, and the commercial pressure behind it is unbearable.”

      true for AI, but wasn't the case for Hindenburg I'd say.

    3. The race to get artificial intelligence to market has raised the risk of a Hindenburg-style disaster that shatters global confidence in the technology, a leading researcher has warned. Michael Wooldridge, a professor of AI at Oxford University, said the danger arose from the immense commercial pressures that technology firms were under to release new AI tools, with companies desperate to win customers before the products’ capabilities and potential flaws are fully understood.

      Prediction: Michael Wooldridge (Oxford, AI) sees a risk of a 'Hindenburg' event that shatters global confidence in AI tech. I'm not sure this analogy entirely fits, other than in its potential impact (AI isn't globally trusted, and the Hindenburg did not fail because of the tech itself but because helium was not allowed to be exported from the US at the time). Still, the Hindenburg did put an end to the entire zeppelin industry, no matter the causes.

    1. Map the Fields

      For each field below, click + Choose field, select the field, then click the blue + button to insert the value from the trigger.

      | User Tests Field | Map From (Bug trigger) | Notes |
      | --- | --- | --- |
      | Test Name | Type: "Retest: " then insert Bug ID from trigger | Prefix helps identify retests |
      | Status | Select: Todo | Static value |
      | Bugs | Insert Airtable record ID from trigger | Links retest back to the bug |
      | Iteration | Type: 2 | Static value (manually update if higher) |

      Confusing and not easy enough to read and understand, especially with no Airtable experience. The text is also a bit too close together and the explanations are too vague. It should explain everything step by step for the easiest possible understanding.

    2. Configure the Trigger

      - Trigger type: When record matches conditions
      - Table: User Tests
      - Add condition 1: Status → is → Failed
      - Add condition 2: Finalize → is → checked ✅

      3 Add Action: Create Record

      - Click + Add advanced logic or action
      - Select: Create record
      - Table: Bugs

      4 Map the Field

      - Click + Choose field
      - Select: Relevant Test
      - Click the blue + insert button on the right
      - Choose: Airtable record ID (from the trigger step "When record matches conditions")

      End step missing:

      Examples = -> Press confirm -> Move onto next step or move onto step 5

    1. eLife Assessment

      This useful study uses creative scalp EEG decoding methods to attempt to demonstrate that two forms of learned associations in a Stroop task are dissociable, despite sharing similar temporal dynamics. However, the evidence supporting the conclusions is incomplete due to concerns with the experimental design and methodology. This paper would be of interest to researchers studying cognitive control and adaptive behavior, if the concerns raised in the reviews can be addressed satisfactorily.

    2. Reviewer #1 (Public review):

      Summary:

      This study focuses on characterizing the EEG correlates of item-specific proportion congruency effects. Two types of learned associations are characterized, one being associations between stimulus features and control states (SC), and the other being stimulus features and responses (SR). Decoding methods are used to identify time-resolved SC and SR correlates, which are used to test properties of their dynamics.

      The conclusion is reached that SC and SR associations can independently and simultaneously guide behavior. This conclusion is based on results showing SC and SR correlates are: (1) not entirely overlapping in cross-decoding; (2) simultaneously observed on average over trials in overlapping time bins; (3) independently correlate with RT; and (4) have a positive within-trial correlation.

      Strengths:

      Fearless, creative use of EEG decoding to test tricky hypotheses regarding latent associations.

      Nice idea to orthogonalize ISPC condition (MC/MI) from stimulus features.

      Weaknesses:

      I still have my concern from the first round that the decoders are overfit to temporally structured noise. As I wrote before, the SC and SR classes are highly confounded with phase (chunk of session). I do not see how the control analyses conducted in the revision adequately deal with this issue.

      In the figures, there are several hints that these decoders are biased. Unfortunately, the figures are also constructed in such a way that hides or diminishes the salience of the clues of bias. This bias and lack of transparency discourage trust in the methods and results.

      I have two main suggestions:

      (1) Run a new experiment with a design that properly supports this question.

      I don't make this suggestion lightly, and I understand that it may not be feasible to implement given constraints; but I feel that this suggestion is warranted. The desired inferences rely on successful identification of SC and SR representations. Solidly identifying SC and SR representations necessitates an experimental design wherein these variables are sufficiently orthogonalized, within-subject, from temporally structured noise. The experimental design reported in this paper unfortunately does not meet this bar, in my opinion (and the opinion of a colleague I solicited).

      An adequate design would have enough phases to properly support "cross-phase" cross-validation. Deconfounding temporal noise is a basic requirement for decoding analyses of EEG and fMRI data (see e.g., leave-one-run-out CV that is effectively necessary in fMRI; in my experience, EEG is not much different, when the decoded classes are blocked in time, as here). In a journal with a typical acceptance-based review process, this would be grounds for rejection.

      Please note that this issue of decoder bias would seem to weaken the rest of the downstream analyses that are based on the decoded values. For instance, if the decoders are biased, in the within-trial correlation analysis, how can we be sure that co-fluctuations along certain dimensions within their projected values are driven by signal or noise? A similar issue clouds the LMM decoding-RT correlations.

      (2) Increase transparency in the reporting of results throughout main text.

      Please do not truncate stimulus-aligned timecourses at time=0. Displaying the baseline period is very useful to identify bias, that is, to verify that stimulus-dependent conditions cannot be decoded pre-stimulus. Bias is most expected to be revealed in the baseline interval when the data are NOT baseline-corrected, which is why I previously asked to see the results omitting baseline correction. (But also note that if the decoders are biased, baseline-correcting would not remove this bias; instead, it would spread it across the rest of the epoch, while the baseline interval would, on average, be centered at zero.)

      Please use a more standard p-value correction threshold, rather than Bonferroni-corrected p<0.001. This threshold is unusually conservative for this type of study. And yet, despite this conservativeness, stimulus-evoked information can be decoded from nearly every time bin, including at t=0. This does not encourage trust in the accuracy of these p-values. Instead, I suggest using permutation-based cluster correction, with corrected p<0.05. This is much more standard and would therefore allow for better comparison to many other studies.

      I don't think these things should be done as control analyses, tucked away in the supplemental materials, but instead should be done as a part of the figures in the main text -- including decoding, RSA, cross-trial correlations, and RT correlations.
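
      To make the suggested alternative concrete, the following is a minimal sketch of a one-sample, sign-flip cluster-mass permutation test over time points. It is an illustration under stated assumptions, not the analysis from the paper or any toolbox's implementation: the input is taken to be a subjects × time points table of decoding accuracy minus chance, the test is one-sided (positive clusters only), and the function names, the cluster-forming threshold and the number of permutations are hypothetical choices.

```python
import math
import random

def t_values(data):
    """One-sample t statistic against zero at each time point.

    data: list of subjects, each a list with one value per time point.
    """
    n = len(data)
    out = []
    for t in range(len(data[0])):
        column = [subject[t] for subject in data]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def clusters(tvals, threshold):
    """Return (start, stop, mass) for each run of contiguous time points with t > threshold."""
    found, start = [], None
    for i, t in enumerate(tvals + [float("-inf")]):  # sentinel closes a trailing run
        if t > threshold and start is None:
            start = i
        elif t <= threshold and start is not None:
            found.append((start, i, sum(tvals[start:i])))  # stop index is exclusive
            start = None
    return found

def cluster_permutation_test(data, threshold, n_perm=1000, seed=0):
    """Cluster-corrected p-value for each observed cluster (start, stop, p)."""
    rng = random.Random(seed)
    observed = clusters(t_values(data), threshold)
    null_max = []
    for _ in range(n_perm):
        # Under the null, each subject's time course is equally likely to have either sign.
        flipped = []
        for subject in data:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * x for x in subject])
        masses = [mass for _, _, mass in clusters(t_values(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, stop, mass in observed:
        p = (1 + sum(m >= mass for m in null_max)) / (n_perm + 1)
        results.append((start, stop, p))
    return results
```

      Because the whole time course is permuted at once and only the largest cluster mass is kept per permutation, the correction accounts for the temporal dependence between neighboring bins, which a per-bin Bonferroni threshold ignores.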

      Other issues:

      Regarding the analysis of the within-trial correlation of RSA betas, and "Cai 2019" bias:

      The correction that the authors perform in the revision -- estimating the correlation within the baseline time interval and subtracting this estimate from subsequent timepoints -- assumes that the "Cai 2019" bias is stationary. This is a fairly strong assumption, however, as this bias depends not only on the design matrix, but also on the structure of the noise (see the Cai paper), which can be non-stationary. No data were provided in support of stationarity. It seems safer and potentially more realistic to assume non-stationarity.

      This analysis was included in the supplemental material. However, given that the correlation analysis presented in the Results is subject to the "Cai 2019" bias, it would seem to be more appropriate to replace that analysis, rather than supplement it.

      Regardless, this seems to be a moot issue, given that the underlying decoders seem to be overfit to temporally structured noise (see point above regarding weakening of downstream analyses based on decoder bias).

      Outliers and t-values:

      More outliers with beta coefficients could be because the original SD estimates from the t-values are influenced more by extreme values. When you use a threshold on the median absolute deviation instead of mean +/-SD, do you still get more outliers with beta coefficients vs t-values?
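
      The difference between the two outlier rules can be illustrated with a small sketch. The cutoff of 3 and the function names are assumptions for illustration; the factor 1.4826 rescales the median absolute deviation so that it matches the standard deviation for normally distributed data.

```python
import statistics

def outliers_mean_sd(values, k=3.0):
    """Values further than k standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * sd]

def outliers_mad(values, k=3.0):
    """Values further than k scaled median absolute deviations from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scaled_mad = 1.4826 * mad  # comparable to an SD under normality
    return [v for v in values if abs(v - med) > k * scaled_mad]

# Two extreme values inflate the SD so much that the mean +/- SD rule flags nothing,
# whereas the median-based rule still flags both of them.
values = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.05, 0.95, 50.0, 60.0]
print(outliers_mean_sd(values))
print(outliers_mad(values))
```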

      Random slopes:

      Were random slopes (by subject) for all within-subject variables included in the LMMs? If not, please include them, and report this in the Methods.

    3. Reviewer #2 (Public review):

      Summary:

      In this EEG study, Huang et al. investigated the relative contribution of two accounts to the process of conflict control, namely the stimulus-control association (SC), which refers to the phenomenon that the ratio of congruent vs. incongruent trials affects the overall control demands, and the stimulus-response association (SR), stating that the frequency of stimulus-response pairings can also impact the level of control. The authors extended the Stroop task with novel manipulation of item congruencies across blocks in order to test whether both types of information are encoded and related to behaviour. Using decoding and RSA they showed that the SC and SR representations were concurrently present in voltage signals and they also positively co-varied. In addition, the variability in both of their strengths was predictive of reaction time. In general, the experiment has a solid design and the analyses are appropriate for the research questions.

      Strength:

      (1) The authors used an interesting task design that extended the classic Stroop paradigm and is effective in teasing apart the relative contribution of the two different accounts regarding item-specific proportion congruency effect.

      (2) Linking the strength of RSA scores with behavioural measure is critical to demonstrating the functional significance of the task representations in question.

      Weakness:

      (1) The distinction between Phase 2 and Phase 1&3 behavioral results, specifically the opposite effect of MC/MI in congruent trials raises some concerns with regard to the effectiveness of the ISPC manipulation. Why do RTs and error rates under MC congruent condition in Phase 2 seem to be worse than MI congruent? Could there be other factors at play here, e.g. order effect? How does this potentially affect the neural analyses where trials from different phases were combined? Also, the manuscript does not mention whether there is counterbalancing for the color groups across participants, so far as I can tell.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study uses creative scalp EEG decoding methods to attempt to demonstrate that two forms of learned associations in a Stroop task are dissociable, despite sharing similar temporal dynamics. However, the evidence supporting the conclusions is incomplete due to concerns with the experimental design and methodology. This paper would be of interest to researchers studying cognitive control and adaptive behavior, if the concerns raised in the reviews can be addressed satisfactorily.

      We thank the editors and the reviewers for their positive assessment of our work and for providing us with an opportunity to strengthen this manuscript. Please see below our responses to each comment raised in the reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study focuses on characterizing the EEG correlates of item-specific proportion congruency effects. In particular, two types of learned associations are characterized. One being associations between stimulus features and control states (SC), and the other being stimulus features and responses (SR). Decoding methods are used to identify SC and SR correlates and to determine whether they have similar topographies and dynamics.

      The results suggest SC and SR associations are simultaneously coactivated and have shared topographies, with the inference being that these associations may share a common generator.

      Strengths:

      Fearless, creative use of EEG decoding to test tricky hypotheses regarding latent associations. Nice idea to orthogonalize the ISPC condition (MC/MI) from stimulus features.

      Thank you for acknowledging the strength in EEG decoding and design. We have addressed all your concerns raised below point by point.

      Weaknesses:

      (1a) I'm relatively concerned that these results may be spurious. I hope to be proven wrong, but I would suggest taking another look at a few things.

      While a nice idea in principle, the ISPC manipulation seems to be quite confounded with the trial number. E.g., color-red is MI only during phase 2, and is MC primarily only during Phase 3 (since phase 1 is so sparsely represented). In my experience, EEG noise is highly structured across a session and easily exploited by decoders. Plus, behavior seems quite different between Phase 2 and Phase 3. So, it seems likely that the classes you are asking the decoder to separate are highly confounded with temporally structured noise.

      I suggest thinking of how to handle this concern in a rigorous way. A compelling way to address this would be to perform "cross-phase" decoding, however I am not sure if that is possible given the design.

      Thank you for raising this important issue. To test whether decoding might be confounded by temporally structured noise, we performed a control decoding analysis. As the reviewer correctly pointed out, cross-phase decoding is not possible due to the experimental design. Instead, to maximize temporal separation between the training and test data, we divided the EEG data in phase 2 and in phases 1&3 chronologically into first and second halves. Phases 1 and 3 were combined because they share the same MC and MI assignments. We then trained the decoders on one half and tested them on the other half. Finally, we averaged the decoding results across all possible assignments of training and test data. The similar patterns observed (Supplementary Fig. 1) confirmed that the decoding results are unlikely to be driven by temporally structured noise in the EEG data. This clarification has been added to page 13 of the revised manuscript.
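
      One possible reading of this split-half scheme is sketched below. It is an interpretation, not the authors' code: it assumes that trial indices increase with time, that each phase group is split at its midpoint, and that "all possible assignments" means every combination of which half of each group is used for training. The function names and group labels are hypothetical, and the decoder itself is left out.

```python
from itertools import product

def chronological_halves(trial_indices):
    """Split trial indices (assumed to increase with time) into first and second halves."""
    ordered = sorted(trial_indices)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]

def split_half_folds(phase_groups):
    """Enumerate train/test splits over phase groups.

    phase_groups: dict mapping a group label to that group's trial indices.
    Returns a list of (train_indices, test_indices), one per assignment of
    halves to training; the other half of each group is used for testing.
    """
    halves = {label: chronological_halves(idx) for label, idx in phase_groups.items()}
    labels = sorted(halves)
    folds = []
    for choice in product((0, 1), repeat=len(labels)):
        train, test = [], []
        for label, train_half in zip(labels, choice):
            train.extend(halves[label][train_half])
            test.extend(halves[label][1 - train_half])
        folds.append((sorted(train), sorted(test)))
    return folds

# Toy session of 16 trials: phase 1 (0-3), phase 2 (4-11), phase 3 (12-15).
groups = {"phase1+3": list(range(0, 4)) + list(range(12, 16)), "phase2": list(range(4, 12))}
for train, test in split_half_folds(groups):
    print(train, test)
```

      Decoding accuracy would then be computed for each fold and averaged across folds.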

      (1b) The time courses also seem concerning. What are we to make of the SR and SC timecourses, which have aggregate decoding dynamics that look to be <1Hz?

      As detailed in the response to your next comment, some new results using data without baseline correction show a narrower time window of above-chance decoding. We speculate that the remaining results of long-lasting above-chance decoding could be attributed to trials with slow responses (some responses were made near the response deadline of 1500 ms). Additionally, as shown in Figure 6a, the long-lasting above-chance decoding seems to be driven by color and congruency representations. Thus, it is also possible that the binding of color and congruency contributes to decoding. This interpretation has been added to page 17 of the revised manuscript.

      (1c) Some sanity checks would be one place to start. Time courses were baselined, but this is often not necessary with decoding; it can cause bias (10.1016/j.jneumeth.2021.109080), and can mask deeper issues. What do things look like when not baselined? Can variables be decoded when they should not be decoded? What does cross-temporal decoding look like - everything stable across all times, etc.?

      As the reviewer mentioned, baseline-corrected data may introduce bias to the decoding results. Thus, we cited the van Driel et al. (2021) paper in the revised manuscript to justify the use of EEG data without baseline correction in the decoding analyses (page 27 of the revised manuscript), and re-ran all decoding analyses accordingly. The new analyses yielded largely similar results (Fig. 2, 4, 6 and 8 in the revised manuscript), with the following exceptions: a narrower time window for separable SC and SR subspaces (Fig. 4b), a narrower time window for concurrent representations of SC and SR (Fig. 6a-b), and a wider time window for the correlations of SC/SR representations with RTs (Fig. 8).

      (2) The nature of the shared features between SR and SC subspaces is unclear.

      The simulation is framed in terms of the amount of overlap, revealing the number of shared dimensions between subspaces. In reality, it seems like it's closer to 'proportion of volume shared', i.e., a small number of dominant dimensions could drive a large degree of alignment between subspaces.

      What features drive the similarity? What features drive the distinctions between SR and SC? Aside from the temporal confounds I mentioned above, is it possible that some low-dimensional feature, like EEG congruency effect (e.g., low-D ERPs associated with conflict), or RT dynamics, drives discriminability among these classes? It seems plausible to me - all one would need is non-homogeneity in the size of the congruency effect across different items (subject-level idiosyncrasies could contribute: 10.1016/j.neuroimage.2013.03.039).

      Thank you for this question. To test what dimensions are shared between SC and SR subspaces, we first identified which factors can be shared across SC and SR subspaces. For SC, the eight conditions are the four colors × ISPC. Thus, the possible shared dimensions are color and ISPC. Additionally, because the four colors and words are divided into two groups (e.g., red-blue and green-yellow, counterbalanced across subjects, see Methods), the group is a third potential shared dimension. Similarly, for SR decoders, potential shared dimensions are word, ISPC and group. Note that each class in SC and SR decoders has both congruent and incongruent trials. Thus, congruency is not decodable from SC/SR decoders and hence unlikely to be a shared dimension in our analysis. To test the effect of sharing for each of the potential dimensions, we performed RSA on decoding results of the SC decoder trained on SR subspace (SR | SC) (Supplementary Fig. 4a) and the SR decoder trained on SC subspace (SC | SR) (Supplementary Fig. 4b), where the decoders indicated the decoding accuracy of shared SC and SR representations. In the SC classes of SR | SC, the words red and blue were mixed within the same class, as were the words yellow and green. The similarity matrix for “Group” of SR | SC (Supplementary Fig. 4a) shows the comparison between two word groups (red & blue vs. yellow & green). The similarity matrix for “Group” of SC | SR (Supplementary Fig. 4b) shows the comparison between two color groups (red & blue vs. yellow & green).

      The RSA results revealed that the contributions of group to the SC decoder (Supplementary Fig. 5a) and the SR decoder (Supplementary Fig. 5b) were significant. Meanwhile, a wider time window showed a significant effect of color on the SC decoder (approximately 100 - 1100 ms post-stimulus onset, Supplementary Fig. 5a) and a narrower time window showed a significant effect of word on the SR decoder (approximately 100 - 500 ms post-stimulus onset, Supplementary Fig. 5b). However, we found no significant effect of ISPC on either the SC or the SR decoder. We also performed the same analyses on response-locked data from the time window -800 to 200 ms. The results showed shared representation of color in the SC decoder (Supplementary Fig. 5c) and group in both decoders (Supplementary Fig. 5c-d). Overall, the above results demonstrated that color, word and group information are shared between SC and SR subspaces.

      Lastly, we would like to stress that our main hypothesis for the cross-subspace decoding analysis is that SR and SC subspaces are not identical. This hypothesis was supported by lower decoding accuracy for cross-subspace than within-subspace decoders, and it enables the subsequent analyses that treat SC and SR as separate representations.

      We have added this interpretation to pages 13-14 of the revised manuscript.

      (3) The time-resolved within-trial correlation of RSA betas is a cool idea, but I am concerned it is biased. Estimating correlations among different coefficients from the same GLM design matrix is, in general, biased, i.e., when the regressors are non-orthogonal. This bias comes from the expected covariance of the betas and is discussed in detail here (10.1371/journal.pcbi.1006299). In short, correlations could be inflated due to a combination of the design matrix and the structure of the noise. The most established solution, to cross-validate across different GLM estimations, is unfortunately not available here. I would suggest that the authors think of ways to handle this issue.

      Thank you for raising this important issue. Because the bias comes from the covariance between the regressors and the same GLM was applied to all time points in our analysis, we assumed that the inflation would be similar at different time points. Therefore, we calculated the correlation of SC and SR betas in the window from -200 to 0 ms relative to stimulus onset as a baseline (i.e., no SC or SR representation is expected before the stimulus onset) and compared the post-stimulus onset correlation coefficients against this baseline. We hypothesized that if the positive within-trial correlation of SC and SR betas resulted from simultaneous representation instead of inflation, we should observe a significantly higher correlation when compared with the baseline. To examine this hypothesis, we first performed the linear discriminant analysis (Supplementary Fig. 7a) and RSA regression (Supplementary Fig. 7b) on the -200 - 0 ms window relative to stimulus onset. We then calculated the average r<sub>baseline</sub> of SC and SR betas in that time window for each participant (group results at each time point are shown in Supplementary Fig. 7c) and computed the relative correlation at each post-stimulus onset time point as Fisher-z(r) - Fisher-z(r<sub>baseline</sub>). Finally, we performed a simple t test at the group level on the baseline-corrected correlation coefficients with Bonferroni correction. The results (Fig. 6c) showed a significantly more positive correlation from 100 - 500 ms post-stimulus onset compared with baseline, supporting our hypothesis that the positive within-trial correlation of SC and SR betas arises from simultaneous representation rather than inflation. The related interpretation was added to page 17 of the revised manuscript.
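
      The baseline correction described above can be sketched in a few lines. This is an illustration of the arithmetic only, under the assumption that each participant contributes one baseline correlation and one correlation per post-stimulus time point; the function names are hypothetical and the group-level test is written as a plain one-sample t statistic.

```python
import math

def baseline_corrected_z(r_timecourse, r_baseline):
    """Fisher-z transform each correlation and subtract the Fisher-z baseline."""
    z_baseline = math.atanh(r_baseline)  # Fisher z is the inverse hyperbolic tangent
    return [math.atanh(r) - z_baseline for r in r_timecourse]

def one_sample_t(values):
    """t statistic of the sample mean against zero (degrees of freedom: n - 1)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

# Toy example: three participants, two post-stimulus time points each.
r_post = [[0.30, 0.45], [0.25, 0.50], [0.35, 0.40]]
r_base = [0.10, 0.05, 0.15]
corrected = [baseline_corrected_z(r, b) for r, b in zip(r_post, r_base)]
t_per_timepoint = [one_sample_t([subj[t] for subj in corrected]) for t in range(2)]
print(t_per_timepoint, bonferroni_alpha(0.05, 2))
```

      Note that subtracting a single baseline value per participant presupposes that the inflation is the same at every time point, which is the assumption stated at the start of the response.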

      (4) Are results robust to running response-locked analyses? Especially the EEG-behavior correlation. Could this be driven by different RTs across trials & trial-types? I.e., at 400 ms poststim onset, some trials would be near or at RT/action execution, while others may not be nearly as close, and so EEG features would differ & "predict" RT.

      Thanks for this question. We now pair each of the stimulus-locked EEG analyses in the manuscript with a response-locked analysis. To control for RT variations among trial types, when using the linear mixed model (LMM) to predict RTs from trial-wise RSA results, we included a separate intercept for each of the eight trial types in SC or SR. Furthermore, at each time point, we only included trials that had not yet generated a response (for stimulus-locked analysis) or that had already started (for response-locked analysis). All the results (Fig. 3, 5, 7, 9 in the revised manuscript) support our hypothesis. We added these details to page 31 of the revised manuscript.

      (5) I suggest providing more explanation about the logic of the subspace decoding method - what trial types exactly constitute the different classes, why we would expect this method to capture something useful regarding ISPC, & what this something might be. I felt that the first paragraph of the results breezes by a lot of important logic.

      In general, this paper does not seem to be written for readers who are unfamiliar with this particular topic area. If authors think this is undesirable, I would suggest altering the text.

      To improve clarity, we revised the first paragraph of the SC and SR association subspace analysis to list the conditions for each of the SC and SR decoders and to explain in more detail how separability can be tested by cross-decoding between SC and SR subspaces. The revised paragraph now reads:

      “Prior to testing whether controlled and non-controlled associations were represented simultaneously, we first tested whether the two representations were separable in the EEG data.

      In other words, we reorganized the 16 experimental conditions into 8 conditions for SC (4 colors × MC/MI, while collapsing across SR levels) and SR (4 words × 2 possible responses per word, while collapsing across SC levels) associations separately. If SC and SR associations are not separable, it follows that they encode the same information, such that both SC and SR associations can be represented in the same subspace (i.e., by the same information encoded in both associations). For example, because (1) the word can be determined by the color and congruency and (2) the most-likely response can be determined by color and ISPC, the SR association (i.e., association between word and most-likely response) can in theory be represented using the same information as the SC association. On the other hand, if SC and SR associations are separable, they are expected to be represented in different subspaces (i.e., the information used to encode the two associations is different). Notably, if some, but not all, information is shared between SC and SR associations, they are still separable by the unique information encoded. In this case, the SC and SR subspaces will partially overlap but still differ in some dimensions. To summarize, whether SC and SR associations are separable is operationalized as whether the associations are represented in the same subspace of EEG data. To test this, we leveraged the subspace created by the LDA (see Methods). Briefly, to capture the subspace that best distinguishes our experimental conditions, we trained SC and SR decoders using their respective aforementioned 8 experimental conditions. We then projected the EEG data onto the decoding weights of the LDA for each of the SC and SR decoders to obtain its respective subspace. We hypothesized that if SC and SR subspaces are identical (i.e., not separable), SC/SR decoding accuracy should not differ by which subspace (SC or SR) the decoder is trained on. 
      For example, SC decoders trained in SC subspace should show similar decoding performance as SC decoders trained in SR subspace. On the other hand, if SC and SR association representations are in different subspaces, the SC/SR subspace will not encode all information for SR/SC associations. As a result, decoding accuracy should be higher using its own subspace (e.g., decoding SC using the SC subspace) than using the other subspace (e.g., decoding SC using the SR subspace). We used cross-validation to avoid artificially higher decoding accuracy for decoders using their own subspace (see Methods).” (Page 11-12).

      We also explicitly tested what information is shared between SC and SR representations (see response to comment #2). Lastly, to help the readers navigate the EEG results, we added a section “Overview of EEG analysis” to summarize the EEG analysis and their relations in the following manner:

      “EEG analysis overview. We started by validating that the 16 experimental conditions (8 unique stimuli × MC/MI) were represented in the EEG data. Evidence of representation was provided by above-chance decoding of the experimental conditions (Fig. 2-3). We then examined whether the SC and SR associations were separable (i.e., whether SC and SR associations were different representations of equivalent information). As our results supported separable representations of SC and SR association (Fig. 4-5), we further estimated the temporal dynamics of each representation within a trial using RSA. This analysis revealed that the temporal dynamics of SC and SR association representations overlapped (Fig. 6a-b, Fig. 7a-b). To explore the potential reason behind the temporal overlap of the two representations, we investigated whether SC and SR associations were represented simultaneously as part of the task representation, independently from each other, or competitively/exclusively (e.g., on some trials only SC association was represented, while on other trials only SR association was represented). This was done by assessing the correlation between the strength of SC and SR representations across trials (Fig. 6c, Fig. 7c). Lastly, we tested how SC and SR representations facilitated performance (Fig.8-9).” (Page 8-9).

      Minor suggestions:

      (6) I'd suggest using single-trial RSA beta coefficients, not t-values, as they can be more stable (it's a t-value based on 16 observations against 9 or so regressors.... the SE can be tiny).

      Thank you for your suggestion. To choose between using betas and t-values, we calculated the proportion of outliers (defined as values beyond mean ± 5 SD) for each predictor of the design matrix and each subject. We found that outliers were less frequent for t-values than for beta coefficients (t-values: mean = 0.07%, SD = 0.009%; beta-values: mean = 0.19%, SD = 0.033%). Thus, we decided to stay with t-values.
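The outlier criterion above (values beyond mean ± 5 SD) can be illustrated with a short sketch. The function name, the choice of the population standard deviation, and the synthetic data are assumptions for the example; the response does not state which SD estimator was used.

```python
from statistics import mean, pstdev


def outlier_proportion(values, n_sd=5.0):
    """Fraction of values lying beyond mean ± n_sd standard deviations."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population SD; a sample SD would also be defensible
    if sd == 0:
        return 0.0
    n_out = sum(1 for v in values if abs(v - m) > n_sd * sd)
    return n_out / len(values)


# Hypothetical single-trial estimates for one predictor and one subject:
# 999 well-behaved values between -1 and 1, plus one extreme value.
values = [0.1 * ((i % 21) - 10) for i in range(999)] + [500.0]
prop = outlier_proportion(values)
```

Running the same function on the beta estimates and on the t-values for each predictor and subject, then averaging the proportions, would give the kind of comparison reported above.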

      (7) Instead of prewhitening the RTs before the HLM with drift terms, try putting those in the HLM itself, to avoid two-stage regression bias.

      Thank you for your suggestion. Because our current LMM included each of the eight trial types in SC or SR as separate predictors with their own intercepts (as mentioned above), adding regressors of trial number and mini blocks (1-100 blocks) introduced collinearity (as ISPC flipped during the experiment). We therefore excluded these regressors from the current LMM (Page 31).

      (8) The text says classical MDS was performed on decoding *accuracy* - is this accurate?

      We now clarify in the manuscript that it is the decoders’ probabilistic classification results (Page 28).

      (9) At a few points, it was claimed that a negative correlation between SC and SR would be expected within single trials, if the two were temporally dissociable. Wouldn't it also be possible that they are not correlated/orthogonal?

      We agree with the reviewer and revised the null hypothesis in the cross-trial correlation analysis to include no correlation as SC and SR association representations may be independent from each other (Page 17, 22).

      Reviewer #2 (Public review):

      Summary:

      In this EEG study, Huang et al. investigated the relative contribution of two accounts to the process of conflict control, namely the stimulus-control association (SC), which refers to the phenomenon that the ratio of congruent vs. incongruent trials affects the overall control demands, and the stimulus-response association (SR), stating that the frequency of stimulus-response pairings can also impact the level of control. The authors extended the Stroop task with novel manipulation of item congruencies across blocks in order to test whether both types of information are encoded and related to behaviour. Using decoding and RSA, they showed that the SC and SR representations were concurrently present in voltage signals, and they also positively co-varied. In addition, the variability in both of their strengths was predictive of reaction time. In general, the experiment has a solid design, but there are some confounding factors in the analyses that should be addressed to provide strong support for the conclusions.

      Strengths:

      (1) The authors used an interesting task design that extended the classic Stroop paradigm and is potentially effective in teasing apart the relative contribution of the two different accounts regarding item-specific proportion congruency effect, provided that some confounds are addressed.

      (2) Linking the strength of RSA scores with behavioural measures is critical to demonstrating the functional significance of the task representations in question.

      Thank you for your positive feedback. We hope our responses below address your concerns.

      Weakness:

      (1) While the use of RSA to model the decoding strength vector is a fitting choice, looking at the RDMs in Figure 7, it seems that SC, SR, ISPC, and Identity matrices are all somewhat correlated. I wouldn't be surprised if some correlations would be quite high if they were reported. Total orthogonality is, of course, impossible depending on the hypothesis, but from experience, having highly covaried predictors in a regression can lead to unexpected results, such as artificially boosting the significance of one predictor in one direction, and the other one in the opposite direction. Perhaps some efforts to address how stable the time-resolved RSA correlations for SC and SR are with and without the other highly correlated predictors will be valuable to raising confidence in the findings.

      Thank you for this important point. The proportion of variability explained, shown in Author response table 1 below, indicated relatively higher correlations of SC/SR with Color and Identity. We agree that it is impossible to fully orthogonalize them. To address the issue of collinearity, we performed a control RSA by removing predictors highly correlated with others. Specifically, we calculated the variance inflation factor (VIF) for each predictor. The Identity predictor had a high VIF of 5 and was removed from the RSA. All other predictors had VIFs < 4 and were kept in the RSA. The results (Supplementary Fig. 6) showed patterns similar to the results with the Identity predictor, suggesting that the findings are not significantly influenced by collinearity. We have added the interpretation to page 17 of the revised manuscript.
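
      For reference, the VIF values above can be read through the standard definition of the variance inflation factor, where R<sub>j</sub><sup>2</sup> is the proportion of variance in predictor j explained by the remaining predictors (the symbol names are chosen here for illustration and do not come from the manuscript):

$$
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \qquad \mathrm{VIF}_j = 5 \iff R_j^2 = 0.8, \qquad \mathrm{VIF}_j = 4 \iff R_j^2 = 0.75 .
$$

      Under this definition, a VIF of 5 means that 80% of the variance of the Identity predictor is accounted for by the other predictors, and a VIF below 4 corresponds to less than 75%.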

      Author response table 1.

      Proportion of variability explained (r<sup>2</sup>) of RSA predictors.

      (2) In "task overview", SR is defined as the word-response pair; however, in the Methods, lines 495-496, the definition changed to "the pairing between word and ISPC" which is in accordance with the values in the RDMs (e.g., mccbb and mcirb have similarity of 1, but they are linked to different responses, so should they not be considered different in terms of SR?). This needs clarification as they have very different implications for the task design and interpretation of results, e.g., how correlated the SC and SR manipulations were.

      Thank you for pointing out this important issue with how our operationalization captures the concept in question. In the revised manuscript, we clarified that the stimulus-response (SR) association is the link between the word and the most-likely response (i.e., not necessarily the actual response on the current trial). This association is likely to be encoded based on statistical learning over several trials. On each trial, the association is updated based on the stimulus and the actual response. Over multiple trials, the accumulated association will be driven towards the most-common (i.e., most-likely) response. In our ISPC manipulation, a color is presented in mostly congruent/incongruent (MC/MI) trials, which will also pair a word with a most-likely response. For example, if the color blue is MC, the color blue, which leads to the response blue, will co-occur with the word blue with high frequency. In other words, the SR association here is between the word blue and the response blue. As the actual response is not part of the SR association, in the RDM two trial types with different responses may share the same SR association, as long as they share the same word and the same ISPC manipulation, which, by the logic above, will produce the same most-likely response. These clarifications have been added to pages 4 and 29 of the revised manuscript.

      In the revised manuscript (Page 17), we addressed how much the correlated SC and SR predictors in the RDM could affect the correlation analysis between SC and SR association representation strength. Specifically, we conducted the RSA using the same GLM on EEG data prior to stimulus onset (Supplementary Fig. 7a-b). As no SC and SR associations are expected to be present before stimulus onset, the correlation between SC and SR representation would serve as a baseline of inflation due to correlated predictors in the GLM (Supplementary Fig. 7c, also see comment #3 of R1). The SC-SR correlation coefficients following stimulus onset were then compared to the baseline to control for potential inflation (Fig. 6c). Significantly above-baseline correlation was still observed between ~100-500 ms post-stimulus onset, providing support for the hypothesis that SC and SR are encoded in the same task representation.

      Minor suggestions:

      (3) Overall, I find calling SC-controlled and SR-uncontrolled representations unwarranted. How is the level of controlledness defined? Both are essentially types of statistical expectation that provide contextual information for the block of tasks. Is one really more automatic, requiring less conscious processing, than the other? More background/justification could be provided if the authors would like to use these terms.

      Following your advice, we have added more discussion on how controlledness is conceptualized in this work and in the literature, which reads:

      “We consider SC and SR as controlled and uncontrolled respectively based on the literature investigating the mechanism of the ISPC effect. The SC account posits that the ISPC effect results from conflict and involves conflict adaptation, which requires the regulation of attention or control (Bugg & Hutchison, 2013; Bugg et al., 2011; Schmidt, 2018; Schmidt & Besner, 2008). On the other hand, the SR account argues that the ISPC effect does not require conflict adaptation but instead reflects contingency learning. That is, the response can be directly retrieved from the association between the stimulus and the most-likely response without top-down regulation of attention or control. As more empirical evidence emerged, researchers advocating the control view began to acknowledge the role of associative learning in cognitive control regarding the ISPC effect (Abrahamse et al., 2016). SC association has been thought to include both automatic processes that are fast and resource-saving and controlled processes that are flexible and generalizable (Chiu, 2019). Overall, we do not intend to claim that SC is entirely controlled or SR is completely automatic. We use SC-controlled and SR-uncontrolled representations to align with the original theoretical motivation and to highlight the conceptual difference between SC and SR associations.” (Page 24-25)

      (4) Figures 3c and d: the figures could benefit from more explanation of what they try to show to the readers. Also for 3d, the dimensions were aligned with color sets and congruencies, but word identities were not linearly separable, at least for the first 3 axes. Shouldn't one expect that words can be decoded in the SR subspace if word-response pairs were decodable (e.g., Figure 3b)?

      Thank you for the insightful observation. We now clarify that Fig. 3c and d in the original manuscript (Fig. 4c and d in the current manuscript) aim to show how each of the 8 trial types is represented in the SC and SR subspaces. The MDS approach we used for visualization tries to preserve dissimilarity between trial types when projecting data from a high-dimensional to a low-dimensional space. However, such a projection may also make patterns that are linearly separable in the high-dimensional space not linearly separable in the low-dimensional space. For example, if the word blue has two points (-1, -1) and (1, 1) and the word red has two points (-1, 1) and (1, -1), they are not linearly separable in the 2D space. Yet, if they are projected from a 3D space with coordinates of (-1, -1, -0.1), (1, 1, -0.1), (-1, 1, 0.1) and (1, -1, 0.1), the two words are linearly separable using the 3<sup>rd</sup> dimension. Thus, a better way to test whether words can be linearly separated in the SR subspace is to perform RSA in the original high-dimensional space. We performed the RSA with word (Supplementary Fig. 2) on the SR decoder trained on the SR subspace. Note that in Fig. 3c and d of the original manuscript (Fig. 4c and d in the current manuscript) there are two pairs of words that are not linearly separable: red-blue and yellow-green. Thus, we specifically tested the separability within the two pairs using one predictor for each pair, as shown in Supplementary Fig. 2. The results showed that within both word pairs individual words were represented above chance level (Supplementary Fig. 3). Considering that the decoders are linear, this finding indicates linear separability of the word pairs in the original SR subspace. The clarification has been added to page 13 (the end of the second paragraph) of the revised manuscript.
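The four-point example above can be checked directly. The sketch below uses the coordinates given in the text; the helper names and the particular separating plane are assumptions for illustration only.

```python
# Coordinates from the example above; the third value is the extra dimension.
blue = [(-1, -1, -0.1), (1, 1, -0.1)]
red = [(-1, 1, 0.1), (1, -1, 0.1)]


def centroid(points, dims):
    """Mean of the points over the first `dims` coordinates."""
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def separates(w, b, pos, neg):
    """True if w.x + b is positive for every point in `pos` and negative for every point in `neg`."""
    def score(p):
        return sum(wi * xi for wi, xi in zip(w, p)) + b
    return all(score(p) > 0 for p in pos) and all(score(p) < 0 for p in neg)


# In 2D the two class centroids coincide. A linear score is positive on average
# for one class and negative for the other only if the centroids differ, so no
# line can separate the two words in the 2D projection.
same_centroid_2d = centroid(blue, 2) == centroid(red, 2)

# In 3D, a plane that uses only the third axis separates the two words.
separable_3d = separates((0, 0, 1), 0.0, red, blue)
```

The coinciding centroids give a compact argument for non-separability in 2D, while the weight vector (0, 0, 1) shows that the small offset on the third axis is enough for a linear decoder in the original space.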

      References

      Abrahamse, E., Braem, S., Notebaert, W., & Verguts, T. (2016). Grounding cognitive control in associative learning. Psychological Bulletin, 142(7), 693-728. doi:10.1037/bul0000047.

      Bugg, J. M., & Hutchison, K. A. (2013). Converging evidence for control of color-word Stroop interference at the item level. Journal of Experimental Psychology: Human Perception and Performance, 39(2), 433-449. doi:10.1037/a0029145.

      Bugg, J. M., Jacoby, L. L., & Chanani, S. (2011). Why it is too early to lose control in accounts of item-specific proportion congruency effects. Journal of Experimental Psychology: Human Perception and Performance, 37(3), 844-859. doi:10.1037/a0019957.

      Chiu, Y.-C. (2019). Automating adaptive control with item-specific learning. In Psychology of Learning and Motivation (Vol. 71, pp. 1-37).

      Schmidt, J. R. (2018). Evidence against conflict monitoring and adaptation: An updated review. Psychonomic Bulletin & Review, 26(3), 753-771. doi:10.3758/s13423-018-1520-z.

      Schmidt, J. R., & Besner, D. (2008). The Stroop effect: Why proportion congruent has nothing to do with congruency and everything to do with contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(3), 514-523. doi:10.1037/0278-7393.34.3.514.

    1. beauty of online anonymity)

      This phrase reminded me of something. When I was in high school, I ran the Redmond [High School] Compliments Facebook page (yeah, it was 2014). People sent anonymous compliments for people at the school into a Google Form and I posted them anonymously onto the page. I had the page passed down to me anonymously and I had to keep my identity a secret until the end of the year when I revealed it in the yearbook as my "senior confession". I got to watch people react to anonymous kindness all year. It's one of the reasons why I really believe in the power of social media as a possible force for good. What this phrase "beauty of online anonymity" reminded me of specifically was a quote I found at the time that said something along the lines of "we read things from others in their voice. When we read something anonymously, we read it in our own voice." It meant so much to me to be a part of having people read aloud kindness in their own voices.

    1. Sherry Turkle, author of Alone Together, characterizes the offline world as a physical place, a kind of Edenic paradise. “Not too long ago,” she writes, “people walked with their heads up, looking at the water, the sky, the sand” — now, “they often walk with their heads down, typing.” […] Gone are the happy days when families would gather around a weekly televised program like our ancestors around the campfire!

      Yes, this kind of attitude drives me nuts. I hear this constantly, and as a historian it takes everything in me not to say "literally every generation fearmongered about the new technology and called the previous generation's technology wholesome!!" People were freaking out about the television's influence on children when it first got big, but it's all forgotten once something new comes around. They did it for videogames, TV, phones, social media, even PRINTED BOOKS.

    2. Many have anecdotal experiences with their own mental health and those they talk to. For example, cosmetic surgeons have seen how photo manipulation on social media has influenced people’s views of their appearance:

      As the frequency of social media usage increases, the level of standards rises as well. People who watch social media are constantly exposed to some of the most tailored and presentable people, which raises their own perceptions of what is normal.

    1. y pushing back three decades, past the recent waves of “new” immigrants from southern and Eastern Europe, Latin America, and Asia, the law made it extremely difficult for immigrants outside northern Europe to legally enter the United States.)

      Not much has changed since then. There is still this decline for new immigrants now.

    2. The Harlem Renaissance was manifested in theater, art, and music. For the first time, Broadway presented Black actors in serious roles.

      The Harlem Renaissance was a cultural and artistic movement centered in Harlem, New York, that redefined African American expression.

    3. . Occupations such as law and medicine remained overwhelmingly male: most female professionals were in feminized professions such as teaching and nursing. And even within these fields, it was difficult for women to rise to leadership positions.

      It sucks that during these times women could not choose careers that were considered to be "masculine"

    1. “Incel” is short for “involuntarily celibate,” meaning they are men who have centered their identity on wanting to have sex with women, but with no women “giving” them sex. Incels objectify women and sex, claiming they have a right to have women want to have sex with them. Incels believe they are being unfairly denied this sex because of the few sexually attractive men (”Chads”), and because feminism told women they could refuse to have sex. Some incels believe their biology (e.g., skull shape) means no women will “give” them sex. They will be forever alone, without sex, and unhappy. The incel community has produced multiple mass murderers and terrorist attacks.

      Often, this is a self-reinforcing cycle. Dwelling on the fact that women aren't interested in you won't make you more appealing. Additionally, with the rise in standards from social media, incels also won't go for women similar to them. Coupled together, these factors produce nothing good.

    2. masochistic epistemology

      This reminds me of ED Twitter a lot, where young women (usually) post their bodies and ask for honest critiques in order to worsen their own perception of their body. I think it's an incredibly harmful, and often overlooked, form of self-harm because people see it as just being young and on social media.

    1. The linking verb in this working thesis statement is the word are. Linking verbs often make thesis statements weak because they do not express action. Rather, they connect words and phrases to the second half of the sentence. Readers might wonder, “Why are they not paid enough?” But this statement does not compel them to ask many more questions. The writer should ask himself or herself questions in order to replace the linking verb with an action verb, thus forming a stronger thesis statement, one that takes a more definitive stance on the issue:

      The linking verb in this thesis is the word "are". Linking verbs don't show action; they just connect ideas, which can make a thesis feel weak. Instead of leaving readers unsure, you should use an action verb to make the thesis stronger and show a clear opinion or stance.

    2. Your thesis will probably change as you write, so you will need to modify it to reflect exactly what you have discussed in your essay. Your thesis statement begins as a working thesis statement, an indefinite statement that you make about your topic early in the writing process for the purpose of planning and guiding your writing. Working thesis statements often become stronger as you gather information and form new opinions and reasons for those opinions. Revision helps you strengthen your thesis so that it matches what you have expressed in the body of the paper. The best way to revise your thesis statement is to ask questions about it and then examine the answers to those questions. By challenging your own ideas and forming definite reasons for those ideas, you grow closer to a more precise point of view, which you can then incorporate into your thesis statement.

      The thesis could change as you write. Start with a working thesis, a rough idea to guide your writing. As you research more, your thesis gets stronger. To improve, ask yourself questions about your ideas and use the answers to make your thesis clearer and more specific.

    3. For any claim you make in your thesis, you must be able to provide reasons and examples for your opinion. You can rely on personal observations in order to do this, or you can consult outside sources to demonstrate that what you assert is valid. A worthy argument is backed by examples and details. Assertiveness A thesis statement that is assertive shows readers that you are, in fact, making an argument. The tone is authoritative and takes a stance that others might oppose. Confidence In addition to creating authority in your thesis statement, you must also use confidence in your claim. Phrases such as “I feel” or “I believe” actually weaken the readers’ sense of your confidence because these phrases imply that you are the only person who feels the way you do. In other words, your stance has insufficient backing. Taking an authoritative stance on the matter persuades your readers to have faith in your argument and open their minds to what you have to say.

      Focuses on 1-3 main points that you'll explain in the essay. A thesis shows what your essay will argue and how it's organized.

    4. A thesis is not your paper’s topic, but rather your interpretation of the question or subject. For whatever topic your professor gives you, you must ask yourself, “What do I want to write about it?” Asking and then answering this question is vital to forming a thesis that is precise, forceful, and confident. A thesis is generally one to two sentences long and appears toward the end of your introduction. It is specific and focuses on one to three points of a single idea—points that will be demonstrated in the body. The thesis forecasts the content of the essay and suggests how you will organize your information. Remember that a thesis statement does not summarize an issue but rather dissects it.

      A thesis is your main point or opinion about a topic. It's 1-2 sentences at the end of the intro and shows what your essay will argue.

    1. The best practices highlighted here apply to charts, graphs, figures, and other supplements, including the legends or captions that accompany them.

      Visual aids in the classroom are extremely helpful for students and teachers, both to show and to understand the point being made. I feel that this is being enforced now more than ever in earlier grades, rather than just middle school and up.

    2. Since printed materials, unlike their online counterparts, cannot be manipulated by the reader or read aloud by a screen reader, we need to follow best practices to increase their accessibility and readability.

      I appreciate how this mentions accessibility. This semester I have come to learn that accessibility is a need in the classroom. The only thing I question is: what if we have an emergent bilingual student or a student who struggles with reading, and technology resources are limited? Another current issue is how we can get students engaged in a reading on a device without having them wander off, and help them retain the information we are trying to get across.

    3. The brain’s limited capacity impacts its ability to engage in active processing and, hence, to learn. We can avoid cognitive overload and facilitate active processing by reducing or eliminating extraneous information from our instructional materials.

      I love this statement as a reminder to educators and as a reassurance to students. I feel as though at times in high school and higher education we are under so much distress in our personal and academic lives that we feel we are not doing "enough" or learning "enough" because we are receiving so much content. Now, from a teaching perspective, I love this reminder that brain breaks are NEEDED! There will be no growth for a student if the content is long and rigorous; rather, we should engage with them, break it into pieces, and not be stressed by the "timeline" we are given.

    4. Like other visual aids, images or photographs should be clear, bright, and of adequate size. If possible, crop large images to include only the relevant parts. Cropping reduces extraneous information, and you may be able to enlarge the cropped image for easier viewing.

      It's also important to remember that just because a person can see an image doesn't mean that they understand what's going on in it, so keep that in mind when it comes to visuals. Give students time to really understand what's in the photo.

    5. In some ways, written materials are more accessible online than in print because learners can manipulate the document to increase text size and brightness or use screen readers.

      I always want to make sure that my students can adjust the text the ways they need so they can read them more easily.

    6. Cognitive load “is a theory about learning built on the premise that since the brain can only do so many things at once, we should be intentional about what we ask it to do”

      It's important to keep this in mind, because students have whole lives outside of school. Returning to school can be overwhelming for them, because they're learning what their teachers' expectations are. It might be best to give them time to adapt to the course load before we start overloading them.

    1. Anxiety over public sharing, shyness, and disagreements with group members were common reasons for not sharing.

      Student anxiety is a big issue--and one that profs can't always solve.

    2. Moreover, students reported higher levels of intrinsic motivation (inherent interest and enjoyment) with renewable assignments than traditional assignments

      wonderful!

    3. Traditional assignments had higher levels of reported pressure.

      I would like to know more about what they mean by "pressure." Some stress can be useful in learning but pressure sounds negative.

    4. Examples of renewable assignments include creating websites, editing and contributing to Wikipedia articles, co-creating syllabi with instructors, and creating ancillary material like test bank items (Clinton-Lisell Citation2021; Wiley and Hilton Citation2018).

      I'd love to see a big list of renewable assignments!

    1. In what ways have you found social media bad for your mental health and good for your mental health? What responsibility do you think social media platforms have for the mental health of their users? Are there ways social media sites can be designed to be better for the mental health of its users? What are the ways social media companies monitoring of mental health could be beneficial or harmful?

      I feel like social media has kind of helped me get away from what's happening in my reality by allowing me to just sit and watch a few short videos. It frees me from a sense of responsibility and gives me a break from whatever I'm thinking about in general. However, using social media in general makes me feel lazy and guilty about how I'm spending my time. There are a lot of things I could be focusing my energy on to relax, and social media just doesn't seem like the best one.

    1. After identifying the main point, you will find the supporting points, the details, facts, and explanations that develop and clarify the main point

      Recognizing this will help me not ramble on in my writing and get to a direct point.

    2. These strategies fall into three broad categories: Planning strategies. To help you manage your reading assignments before you begin reading. Active Reading strategies. To help you understand the material while you read. Application strategies.

      Strategies to help become a strong reader will also help you become successful.

    1. There is no such thing as writing in general. Writing is always in particular

      This stood out to me because we usually think writing is one skill we can use everywhere, but she’s saying writing always depends on who we’re writing to and why. Writing a text, an essay, and a job email are all different.

    1. As social media companies have tried to detect talk of suicide and sometimes remove content that mentions it, users have found ways of getting around this by inventing new word uses, like “unalive.”

      It's honestly quite difficult trying to find a way to automate detection. Words often overlap in contexts, and I feel like people often use these trigger words in both positive and negative ways. For example, people may make videos/posts using the word "suicide" as a way to discourage bullying, promote seeking help in the form of support lines or therapy, or keep someone's story alive. On the other hand, people may also make videos or posts using "suicide" to encourage self-loathing or techniques to harm yourself/commit suicide. It's a double-edged sword that ends up affecting both sides no matter what the intentions are. Additionally, those who try to bypass the limitations to create positive videos are giving the other party (negative) another way to talk about it freely.

    2. For example, Facebook has a suicide detection algorithm, where they try to intervene if they think a user is suicidal (Inside Facebook’s suicide algorithm: Here’s how the company uses artificial intelligence to predict your mental state from your posts). As social media companies have tried to detect talk of suicide and sometimes remove content that mentions it, users have found ways of getting around this by inventing new word uses, like “unalive.”

      This shows how moderation and user behavior constantly adapt to each other. When platforms try to filter certain language, people often respond creatively, which makes the system harder to manage. It also raises questions about whether removing certain words actually addresses harm, or just shifts how people express it.

    3. like “unalive.”

      This is actually a really helpful thing to read right now, because for my group project I'm attempting to detect content on Bluesky that promotes eating disorders/disordered eating habits. It's a good reminder at least for my Dream version of the program I'm creating that only scanning keywords allows for loopholes.

    1. While many queer oral history projects have been developed over the past decades, similar to Chicana/Latina feminist history, there is still much work to be done.

      This ending thought leaves me with questions and makes me think about how these people have been neglected.

    1. It was exciting because it suggested there was a whole other level of skill at thinking.  It implied that college was about training the mind,

      This is actually exciting. I have been a lazy thinker prior to ENGL C1000. College is definitely expanding the mind and how we think.

    1. Any theory must identify which characteristics are considered relevant and which evidence will be accepted.

      This is my qualm with theory: how do we account for one thing and discount another aspect of a phenomenon to create a solution when we haven't accounted for the entirety of it?

    2. We are not in a position to identify with any certainty the conditions that gave El Tor a measure of evolutionary advantage over classic variants in recent outbreaks

      Relates to the epidemiological concept in Joralemon's book

    1. Resilient teachers are those that can think deeply, problem-solve, and feel confident in their ability to meet the needs of their students

      Resilience here is intellectual, not just emotional. It is rooted in reflective capacity and competence. This reframes teacher retention not as endurance alone, but as cultivated professional identity.

    2. results of this study support the notion that self-efficacy, derived from successful field and student teaching experiences and the ability to use reflection for problem solving actually outweighed positive school climate as a factor in novice teacher success.

      This is a bold claim: internal belief may override external conditions. Yet it also invites caution: how much should systems rely on individual resilience instead of structural reform?

    3. unsupportive school climates cause high efficacy teachers to transfer to other schools rather than leave the profession

      High efficacy teachers may not quit teaching, but they will exit toxic systems. Retention, therefore, may depend less on keeping teachers in specific schools and more on ensuring ethical, supportive environments.

    4. A positive and supportive school environment may not in itself be enough to support a struggling teacher. Conversely, unsupportive school climates cause high efficacy teachers to transfer to other schools rather than leave the profession

      This complicates the dominant narrative that environment alone determines retention. It suggests an interaction between personal competencies and contextual fit, raising questions about hiring, placement, and mentorship alignment.

    5. traditional induction programs that focus on transmitting knowledge in a short period of time have limited utility in enhancing the learning of novice teachers.

      This critiques the “information dump” model of induction. If learning requires processing, experimentation, and dialogue, then induction programs must mirror the reflective practices we expect teachers to use with students.

    6. All of the stories told by the teachers contained elements of the critical thinking model introduced to them in their teacher education program as a problem-solving tool (explained earlier) and used extensively during their student teaching semesters in the form of journal writing, action research projects, and seminar class discussions.

      This suggests that structured reflective models become internalized cognitive habits. Teacher education, then, does not simply impart knowledge; it shapes how teachers think under pressure.

    7. Critical reflection as a problem-solving tool empowers teachers to cope with the challenges that they encounter in their first few years of teaching.

      Reflection is presented as empowerment rather than mere introspection. This reframes reflective practice as an act of agency, an intellectual resistance against burnout and helplessness.

    8. “I think believing in yourself -- you are going to face different challenges

      Self-belief here is not naïve optimism; it is strategic endurance. The deeper question becomes: how do teacher education programs systematically cultivate belief without promoting overconfidence?

    9. Successful field and student teaching experiences that are connected to coursework build teachers’ confidence and self-efficacy and thus encourage a higher level of competence in their first year of teaching.

      Integration between theory and practice appears central. This suggests that disconnected coursework may unintentionally undermine teacher confidence. How often do teacher candidates experience coherence rather than fragmentation?

    10. Teachers need knowledge of how to reflect as well as time to think about their practice, both of which are essential to one’s ability to problem-solve and cope with challenges

      Reflection is not accidental; it must be structured and protected. In schools dominated by urgency and pacing guides, is time for reflection treated as essential professional practice, or a luxury?

    11. Knowledge and prior skill attainment are poor predictors of future performance because the beliefs people hold about their performance have more power than acquired learning

      This challenges traditional metrics of teacher quality (GPA, credentials). If belief outweighs skill, then teacher education must intentionally cultivate confidence rooted in authentic mastery, not just content coverage.

    12. Thus teacher resiliency and persistence are strongly related to teacher efficacy.

      This sentence reframes resilience from a personality trait to a belief system. If efficacy drives persistence, then strengthening teachers’ beliefs about their capability may be as critical as developing their technical skills.

    13. Growing evidence also suggests that teachers who lack adequate preparation to become teachers are more likely to leave the profession

      Preparation is framed here as protective armor. If inadequate preparation predicts departure, then retention may be less about resilience alone and more about justice, ensuring teachers are not placed in environments for which they are underprepared.

    14. recent estimates of teachers who choose to leave the profession within the first three years to pursue other careers remains at an unacceptably high level of 33.5 percent

      A one-third attrition rate within three years suggests not merely an individual failure but a systemic one. What does it reveal about the transition from preparation to practice—and are we designing teacher education programs with survival, sustainability, and identity formation in mind?

    1. At the same time, Sally Hemings knew a world in which the lives of all women were more circumscribed than those of males.

      Sally Hemings lived under slavery but also lived in a patriarchal society where women's lives were restricted compared to men's. Her experience was not only shaped by race, but also gender.

    1. We found that traders are buying cobalt without asking questions about how and where it was mined

      Easy way for large corporations to dodge accountability

    2. In a report into cobalt mining in the Democratic Republic of the Congo, it found children as young as seven working in dangerous conditions.

      Child labor in a company that claims zero tolerance -- What companies are being accused? Apple, Samsung, Sony

    3. It claimed that at least 80 miners had died underground in southern DRC between September 2014 and December 2015.

      High risk industry -- Raises questions about occupational safety data and how these deaths occurred and were they preventable

    4. The DRC produces at least 50% of the world's cobalt. Miners working in the area face long-term health problems and the risk of fatal accidents, according to Amnesty.

      DRC Cobalt mines but also what other types of mines

    1. In answering this letter, please state if there would be any safety for my Milly and Jane, who are now grown up, and both good-looking girls

      This sentence is important to me because it shows the fear of any parent through his words of worry: the fear of his daughters being dishonored. In this day and age there are still cases where a woman is disrespected or mistreated just because she is a woman, and this is today, when society has changed its views and is less misogynistic. I can only imagine the scenarios that played in Jourdon Anderson's mind back then. I believe he is a great father.

    1. Shrimp trawl fishermen actually try to avoid structured habitat (oyster beds) or obstructions because the trawl will likely be damaged or destroyed and results in significant economic loss due to lost fishing time and replacing/repairing expensive gear

      So they only care about the financial reasons for not fishing in oyster reefs

    2. More recent studies found that 82% of the Atlantic croaker, 55% of the spot, and 76% of the weakfish were observed alive in catches of commercial shrimp trawlers when brought on board

      at what stage were they sampled? right as the trawl unloads? or when the fish are actually sorted through

    3. All bottom-disturbing fishing gear is prohibited in these areas, including shrimp trawls and has been prohibited as such for over 35 years

      accounts for less than half of estuarine ecosystems in NC

    4. The most common organisms caught in shrimp trawls are shrimp, small fish, crabs, and jellyfish.

      we know that fish outnumber shrimp in catch abundance according to the 2016 Brown study.

    5. Scientists have found that trawling disturbance can stimulate an increase in population numbers of bottom invertebrates

      This misquotes the research: it is just polychaetes that are growing in numbers, due to dead bycatch creating a trophic shift.

    1. Teachers have had classroom phone policies for years; what’s new at schools like Bullard are that their bans are blanket, campus-wide restrictions. Many of the schools that moved early to adopt such bans are smaller and charter schools, like Soar Academy, a TK-8 charter school with 430 mostly low-income students in San Bernardino. Like Bullard, it also found enforcement of its ban was tough. Suspending students wasn’t an option. Neither was yanking phones from students’ hands. That left an honor system, which relied on students’ willingness to accept that smartphones and social media are harmful to their mental health and a distraction from learning.

      I believe this rule should apply to all schools (elementary, middle, and high school), because high school students are allowed to use their phones and it becomes a distraction from their education.

    1. Share some interesting facts, go into the possibly unknown details, or reflect common knowledge in a new light to make readers intrigued. Body paragraphs should discuss the inquiry process you followed to research your topic.

      makes the body paragraphs interesting and informative to keep the reader engaged while sharing facts or other information

    2. Define the topic. Provide short background information. Introduce who your intended audience is. State what your driving research question is. Create a thesis statement by identifying the scope of the informative essay (the main point you want your audience to understand about your topic).

      keeps the essay organized and focused on the topic to keep the reader engaged

    3. The purpose of an informative essay, sometimes called an expository essay, is to educate others on a certain topic. Typically, these essays aim to answer the five Ws and H questions:

      Answering these questions helps shape the research papers that you write and organizes all the information.

    1. Researchers follow an iterative process to solve problems. People (and research communities) are constantly revising research questions. The scope of research, the methods, and even the topic driving the research changes over time.

      Why do researchers need to revise or change their research questions instead of sticking to the original one?

    2. On the other hand, doing research for other reasons than to answer a question can be half-hearted or sloppy.

      yeah, agree! research is better when we truly want to do it, not just because we have to.

    3. When scholarship is working right, publication of research results produces inquiry by other scholars, which in turn produces more research

      It seems like research is an ongoing process instead of a one-time task.

    4. You will need to figure out the best and most reliable way to answer each question (and the questions will probably each need a different research strategy).

      This makes me realize that not all questions can be answered in the same way. That means research is not just finding information but finding the right information.

    1. Genres such as the novel and the lyrical poem were originally designed to engage with questions of the individual, the family, and the nation; non-fiction nature writing typically focused on the detailed exploration of a particular place

      a strange conceit. original design implies that there is some authoritative consensus building strict boundaries of what writers write about and why.

    2. In nature, we are concerned today with a highly synthetic product everywhere, an artificial “nature.” Not a hair or a crumb of it is still “natural,” if “natural” means nature being left to itself. (Risk 81)

      "being left to itself" is too vague. Humans are a part of nature and always will be. There is no separation, therefore it cannot be left to itself. We can identify behaviors that are more or less destructive to individuals and ecosystems thriving, but we can't say nature must just be left to itself.

    1. The laws of accumulation should be left free; the laws of distribution free. Individualism will continue. But the millionaire will be but a trustee for the poor; entrusted for a season with a part of the increased wealth of the community, but administering it for the community far better than it did, or would have done, of itself. The best in minds will thus have reached a stage in the development of the race in which it is clearly seen that there is no mode of disposing of surplus wealth creditable to thoughtful and earnest men into whose hands it flows save by using it year-by-year for the general good. This day already dawns.

      I find this interesting, as it's basically Carnegie saying that wealth and power are better served in the hands of a few select people, rather than in the hands of the masses. It's interesting because it's effectively the antithesis of the United States. The U.S. was founded on the idea that common people would have a say in government and in what the government spends money on, with individuals free to do as they please. Carnegie is arguing that by sacrificing the freedom to purchase things as an individual, you enable a singular person to make decisions on what's best for the community. I think it then raises the question of why we would all want to trust one person to decide what will benefit the common people the most, rather than letting the common people decide.

    2. The price which society pays for the law of competition, like the price it pays for cheap comforts and luxuries, is also great; but the advantages of this law are also greater still, for it is to this law that we owe our wonderful material development, which brings improved conditions in its train.

      The way I hear this, Carnegie is saying that competition leads to improvements in overall life. Analysing it from a military perspective suggests that this is actually true. The Second World War had dozens of nations fighting and competing to outfight the others, with the Allied and Axis powers being full of many major nations. The technology they developed to try and defeat each other would end up leading directly to the invention of the microwave, men in space, nuclear power, jet engines, and more. These are things that we interact with and use frequently. You'd be hard-pressed to find someone who has never had anything in their life affected by jet engines and modern aviation, be it through travel or shipping things. The same goes for the microwave: many people have interacted with one. A competition, not necessarily between two companies, but rather two halves of the world, did indeed produce new commodities which would help many people in the future.

    1. The song, "Song about Life in Virginia" is about a female indentured servant who shows us a glimpse into five years of her life. This woman writes very little about what she went through while telling a lot. The woman tells us “Five Years served I, Under master guy, In the land of Virginny, O: which made me for to know, sorrow, grief, and woe”; this one part shows us, without explanation, that she gave five years of her life to taking care of this master's needs and making his life better, only to live a life of misery herself. Indentured servants are those who work for a person in return for things like food and shelter and no pay. The woman in this poem shows that throughout her time she had to work and live in bad conditions, with little food, thin clothing, dirty areas to sleep in, etc. This shows us as readers that indentured servants gave many years of their lives working for these masters, making the masters' lives easier and better, only to live in horrible conditions in exchange for food, clothing, and shelter.

    1. “Why hasn’t your group completed the task in the allotted time?” “What is so challenging about this step?” “You look frustrated. What is causing you to feel that way?” “I notice no group has moved on to step 3. Why not?”

      I love how these questions get straight to the point, and they will be beneficial for us as educators. I think when teaching in the arts, things will always have to change and be explained. These questions address that, and they are not accusatory, since accusation does not help us understand the situation objectively.

    2. Each circle identifies what students do. Students 1) imagine, examine, and perceive; 2) explore, experiment, and develop craft; 3) create; 4) reflect, assess, and revise, and 5) share their products with others. The arrows indicate the ways teachers can guide students through the creative process.

      I appreciate having these steps to help students through the creative process. It is so important that students take ownership of their creativity, but I have often asked, "How do you do this?" These steps really help lay it out and help us know how to motivate them.

    1. REACT is precisely the kind of parallel constraint satisfaction that characterizes effective human composing and that distinguishes human writing from AI text generation.

      Constraints represent context, and generative AI has problems observing and identifying context.

    1. After getting sent to jail nearly 20 years later for racketeering, bribery and tax evasion, he ran for Congress again

      Holy real Trump parallel and they don't even know it

    2. civic associations, the Democratic Party itself, local growth-oriented elites, and social policies that reflected the worldview of industrial workers.

      Which all eroded as the Midwest became non-defined by industrialism

    3. More generally, as partisan conflict was reorganized around race, issues of economic equity declined in importance.

      And the unification of races around labor would have eroded

    1. Walking sims often center interiority in a way that more mainstream genres struggle with, perhaps another reason the mainstream finds them off-putting or threatening.

      Outlandish: Well, yes? Isn't that why there are so many genres to choose from? If a person thinks that it's not enough, then pick something else. Why does there have to be hate and finding something threatening?

    2. Real games are difficult, goes this argument: you can die in them; you can take “real” actions (i.e., shooting and loot collecting, not walking or investigating). Real game heroes are powerful and effective.

      Outlandish: I wholeheartedly disagree with this take on games. I believe that a game doesn't have to be complex or even hard to be considered interesting. Different people have different views on what's considered interesting: there are people who'd prefer easier and light hearted games.

    3. “Walking simulator” began as a derogatory label

      Outlandish: "walking simulator" being used as an insult is so stupid, like why would you be offended by that at all

    4. A number of traditional big-budget titles don’t demand this kind of moral engagement, which makes sense—asking a player to stop and consider the horrible things they’re doing is antithetical to moving forward” (Clark 2017). Slowness is forefronted in a game of permalife: adrenaline is neither the goal nor the appeal

      In my opinion, games typically considered walking simulators are best when they do what's described in this section. I honestly think this is the only kind of story which walking simulators excel at (stories in which the horrible truth about the main character is revealed slowly).

    5. Gone Home also plays with player agency by subverting expectations about danger and complicity. The first moments of the game create a sense of mystery more frequently associated with survival horror: the abandoned house is cast as unnatural and threatening, with the player invited to explore it suspiciously, suspecting some external danger behind the apparent disappearance of the family. That danger, of course, turns out to be internal, not external. The player becomes the intruder in what should be a familiar environment

      Outlandish: This interpretation of this misleading horror element of the game 'Gone Home' is interesting for sure. But, I do wonder if the creators of the game meant for it to be intentional because I believe a majority of players felt this way.

    6. As in adventure games, players of walking simulators strive to recreate the “ideal walkthrough,” the preexisting story that must be uncovered step by step through the player’s actions. But in these games, the next step is not occluded by puzzles: rather, it’s generally made so obvious it’s impossible to miss.

      Outlandish: It's more than easy to lose sight of what you're doing and not realize what the game is going for. I remember when I tried to play video games as a kid, I wouldn't realize what I was actually supposed to do and then would get mad when nothing was going my way. I felt that way when playing Gone Home at times.

    7. Even in text games like Adventure

      Outlandish: Not sure if this qualifies as outlandish, but if this is referring to the old Atari video game, I don't think it was text.

    8. Simply put, you cannot become better than someone else at a walking simulator, and this lack of a mechanism for dividing elite from noob might be what’s really behind some critiques complaining about the lack of gameplay.

      Outlandish: If there is no hierarchy of skill, then there is no clear status structure, which might threaten players who value mastery. It makes me rethink complaints about “no gameplay” as possibly complaints about “no competition.”

    9. walking for pleasure into new and unseen places is not an act of idleness but a necessary part of retaining our humanity in a modern world increasingly cut off from nature

      OUTLANDISH. I think this is pretty stupid. I can get walking out in nature or like in real life, but on a screen I'm not gonna be admiring a video game the same way I admire nature, even if it's super realistic. Like it's still a game, and though there could be aspects of beauty, in no way do I believe I am gonna retain my humanity.

    10. The hostile spaces he moves through after a plane crash in a remote icy landscape become a gauntlet not just of physical survival but of metaphorical endurance: the struggle of living with a crushing regret. Permalife games are difficult in an entirely different way than games requiring skill or strategy, requiring players to enact the motions of continuing existence, even in the face of survival under (or complicity with) the evils of that existence.

      Outlandish: I found this claim crazy, adding depth I had never considered to games with "no consequences". In this way, it contradicts itself. There are consequences. The consequence is knowing that what happened is fully your fault. If anything, that consequence is more scarring than any "death" could be.

    11. Yet this has only produced a tiny number of mildly successful games. But people still bitch and moan when the term gets applied to their work, or work they personally enjoy.

      Outlandish - Personally, I think I would react the same if a game I spent hundreds of hours making got grouped into a category of games that's considered to be trash by the general public.

    12. The player has to go through the boring process of walking about in order to make a mental picture of the surroundings. If they don’t, they cannot possibly know what the realm of their possibilities are.

      Outlandish: I'm surprised that someone defending walking simulators would admit to their critics that they're boring. It seems impossible to deny that they aren't as exciting as action-based games, so most of the debate centers around the strength of their message instead.

    13. Permalife games are difficult in an entirely different way than games requiring skill or strategy, requiring players to enact the motions of continuing existence, even in the face of survival under (or complicity with) the evils of that existence.

      OUTLANDISH: Permalife is not difficult. If the player is always alive, there are no stakes and nothing to lose. The player's instincts to run if they are seen or fight back or hide are completely neutralized. This does not benefit the player in any way. There is nothing to deal with in the face of survival because survival is guaranteed.

    14. The most visible difference between adventure games and walking sims is the removal of puzzles

      Outlandish: Then what's the point? Walking sims bore me without having an objective. Feels mindless.

    15. While the game attracts attention for its centering of a queer narrative, the distance of the avatar from that narrative invites critique:

      I think it's interesting to see how participating in a walking simulator game allows you to view a story from a completely different perspective than you might be used to. In Gone Home, the player is detached, not knowing what has been happening in their home and family during the time they have been gone, which presents the opportunity for judgement and other emotions to arise.

    16. While the game attracts attention for its centering of a queer narrative, the distance of the avatar from that narrative invites critique:there’s a fundamental passivity to the game that contradicts this praise, particularly where the queer-centered narrative is concerned.

      This quote really makes me think about the tension between immersion and observation in games. Gone Home lets the player witness an important queer story without ever participating in it…does that create empathy, or does it keep the player at arm’s length? It also raises questions about what it means to “experience” a story through a character: can understanding and reflection replace direct involvement, or does passivity limit the emotional impact?

    17. The player becomes the intruder in what should be a familiar environment by virtue of returning after long absence, seeing the intimate lives of her family with fresh eyes. The player’s initial fear that they might need to act quickly to defend themselves from some lurking supernatural horror becomes transmuted, by the end of the story, into the inevitable realization that their character has already lost her chance to act,

      The way that the player becomes an intruder in their own home is really crazy to think about. Personally, for me it can't be a home if you don't want to be there and you don't even recognize the character traits that make it a home. This "long absence" is something that can make you feel like an intruder, but it should be a quick turnaround to feel like you're home. This makes you realize that something is really wrong in this home, and that's really the horror aspect of this game.

    18. in adventure games specifically, it provides a space for thinking and reflecting, a necessary precursor to successfully overcoming obstacles. Walking “leaves us free to think without being wholly lost in our thoughts,” writes Rebecca Solnit in her book Wanderlust: A History of Walking: “The rhythm of walking generates a kind of rhythm of thinking, and the passage through a landscape echoes or stimulates the passage through a series of thoughts… one that suggests that the mind is also a landscape of sorts and that walking is one way to traverse it” (2001). Every walk is a chance “to assimilate the new into the known,” the fundamental precursor to that new perspective on the world that adventure games strive to induce.

      I do get what they’re saying, but I’m not fully convinced that walking is automatically grounds for reflection. Although slowing down can create space to think, it just ends up becoming boring for the player if there is nothing big happening. If the slowness of a game is intentional, like it is in walking simulators, then the world has to carry a lot of weight and substance.

    19. This is why most walking sims that descend from first-person shooters have been radical reimaginings taking years to produce, not merely removing enemies but crafting whole new environments, often with custom textures, objects, music, and narration: creating not just a new focus of interaction but an entirely different kind of world to support that focus.

      This idea is important because it refutes the notion that games typically considered walking simulators are simply equivalent to action games without the enemies. Walking simulators place the emphasis of their interactive aspect on the features of the world around the player.

    20. Sam’s words, though addressed to Katie, are also aimed at the player, serving as an invitation to connect and respect Sam’s choice. The request not to “hate me” is particularly poignant, given that the essential absence goes unfixed.

      While I was playing Gone Home, I found it hard to see what the protagonist's role in the story was. Ultimately, I think their main purpose was to serve as a judge for the family's situation, since they are the most separated from Sam's issues with her parents. Similarly, our job as a player is to assess whether or not Sam made a good decision, which is why this ending is so impactful to the player.

    21. What kind of exploration, then, do the worlds of walking simulators support? Contrary to expectations, these games are rarely just about exploration. There are a few exceptions: Proteus (2013) is a joyful exploration of a shifting island purely for its own sake, and experimental games like Césure and Lumiere (both 2013) place the player in explorable abstracted spaces of light, color, and shadow (Reed 2013). But the most famous and successful walking simulators are best understood as explorations not of environment, but of character. Just as the environments in first-person shooters exist to support action-packed combat, the environments in most walking sims are designed to be platforms for understanding and empathizing with characters. In games like Dear Esther, Virginia (2016), What Remains of Edith Finch (2017), and many others, 3D game worlds come to be understood as metaphorical spaces offering windows into the minds and stories of the people within them. Sometimes this is made literal as part of the game’s fiction (as in the 2014 games Mind: Path to Thalamus and Ether One, both about entering an environmental representation of another character’s mind) but more commonly we understand this reification as working in the same way experimental films signify abstract meanings with concrete visuals, or the reality-bending conventions of magical realism or unreliable narrators creating layers of truth in literature.

      This argument can definitely be backed by my own prior experiences. Whether it’s Gone Home or other similar games that I've played and seen–some with horror and mystery aspects–I never truly explored the environments simply for the sake of exploring my surroundings. Rather, I was always driven by a sense of curiosity to unfold the mysteries surrounding my character and the others around me.

    22. walking for pleasure into new and unseen places is not an act of idleness but a necessary part of retaining our humanity in a modern world increasingly cut off from nature,

      Personally dislike this. If I want to retain humanity and walk into unseen places, why do it on a screen? Would much rather go outside and take a walk. A proper mental reset turning off screens.

    23. To call something a “walking simulator” became not just a complaint about pacing but an existential fight for survival, spiraling to include larger and larger questions of who gets to be a gamer and what should be “counted” as a game (Chess and Shaw 2015). Real games are difficult, goes this argument: you can die in them; you can take “real” actions (i.e., shooting and loot collecting, not walking or investigating). Real game heroes are powerful and effective. An ugly corollary to this argument, advanced by some, was that “real games” shouldn’t be about the disenfranchised.

      When I read this portion of the passage, I wondered what the point was of being so hateful towards alternative types of media. I feel that it is odd for gamers to be so affected by different types of games entering the gaming sphere. Shouldn't they be happy that something they love is gaining more traction and new ideas are being implemented?

    24. In classic adventure games, you spend a lot of time walking. The world would usually be divided into stage-sized screens which your avatar must move across, at walking pace, to reach an edge and the next linked area. These animations can seem painfully slow by today’s standards. Some games, including parts of Loom, would zoom out to sprawling vistas to make environments seem especially epic, your character reduced to a cluster of tiny pixels lost in immensity, the journey to the edge of the screen even more drawn out. Even in text games like Adventure or point-and-click games like Myst, where movement is instantaneous, players still spent much of their time navigating complex environments, retracing their steps to return to earlier areas looking for clues, unsure where to go next. Mainstream game design has moved toward minimizing these down times, adding mechanics like fast travel or quest markers to get players straight to the next point of interest, another filing away of the adventure game’s rough corners.

      I think this point about adventure games making it easy to move to the interesting parts is something I see fairly often. There aren't many games I’ve played that don't have some sort of feature to skip over travel and get to the interesting parts. I have noticed that most games do require you to travel to a particular location the more time-consuming way at least once before you can skip there immediately. This definitely allows the player to appreciate the action of exploration more.

    25. There isn’t a lot of, “Walk through a door, hit a trigger, and watch this thing happen.” Everything that changes your perception of what the game means is through you interacting with what’s there and having an effect on the state of the world that in turn affects you. (qtd. in Suellentrop 2017)

      The ability to interact with objects further elevates the story and environmental atmosphere of 'Gone Home', which made the walking aspect of the game much more meaningful to me.

    26. Games scholar Bonnie Ruberg has called this notion “permalife,” for games which not only include but center the notion of making death impossible (2017). She notes that permalife games are often made by queer designers, positing that “permanent living represents a particularly potent trope for expressing both hopes and concerns about contemporary queer life in the face of an uncertain future.”

      I think this idea of Permalife is the most interesting idea in this chapter. Most games are focused on survival mechanics where death is the reset button. But permalife suggests that the challenge isn’t dying but continuing. It mirrors life, and that allows for emotional issues to really stand out through these games. But I also think that there needs to be something to keep the player moving forward. Sure, some people may go on without a failure condition, but others would feel unmotivated to do so. They would need to nail down aspects like emotional weight, narrative curiosity and more for the game to really work.

    27. Stripping the violence from a first-person shooter, however, often results in a strange interstitial kind of experience, something in-between and unrecognizable. O’Connor’s review of the tourism Quake mod highlights some of the unsuitability of these environments to casual exploration. The architecture of these games in their original form is a means to the end of success in combat: to the extent the player notices it at all, it is while looking for places to hide, physical obstacles, routes for evasion or ambush. Details are designed to be glanced at briefly, not lingered over.

      I find it interesting that calmer games like walking sims originated from more violent and action-packed games like first-person shooters. Wanting to explore a world usually unavailable due to actions and conflict is common in human beings. It’s like setting a pack of cookies on a table, saying no one can eat it, and then leaving. Surely no one will miss one cookie in the pack. It’s the same concept for these shooter games. Players are busy running around, surviving, and fighting, so they don’t get to actually appreciate the world they are in (by design). It makes a part of their brain curious to experience the world they are already in without the pressures of battle, so a modder will take a cookie from the pack and eat it, opening up the rest of the pack for other people to enjoy. That’s when they realize the game they were praising for graphics and immersion is actually just a rough sketch, raisin cookies when they thought they were chocolate chips. This creates a sense of disappointment. I can see how walking sims were born from it, to cure the disappointment players felt when they realized the game they loved wasn’t as polished as they thought. But without the allure of fighting, walking sims need something else, some other temptation. So, they promise ice cream with the cookies, sprinkling lore and stories into their world. Finding out pieces of the story gives players their dopamine boost while satisfying their curiosity for adventure. The article then mentioned punishments within the game and how walking sims remove that and instead explore living with the consequences of your actions. This adds more depth to the ice cream and cookies players have been enjoying before, forcing them to either love or hate them more intensely.

    28. Gone Home also plays with player agency by subverting expectations about danger and complicity. The first moments of the game create a sense of mystery more frequently associated with survival horror: the abandoned house is cast as unnatural and threatening, with the player invited to explore it suspiciously, suspecting some external danger behind the apparent disappearance of the family. That danger, of course, turns out to be internal, not external. The player becomes the intruder in what should be a familiar environment by virtue of returning after long absence, seeing the intimate lives of her family with fresh eyes. The player’s initial fear that they might need to act quickly to defend themselves from some lurking supernatural horror becomes transmuted, by the end of the story, into the inevitable realization that their character has already lost her chance to act,

      Originally, I was not a fan of the subversion of my expectations, but in the past week since I have played it and discussed the game in class, the horror elements have grown on me. Over time, I have started to realize the purpose of subverting my expectations. The realization that Katie is too late to help her sister is cleverly implemented. Seeing satanic imagery leads the player to believe in supernatural elements, but it is just a red herring. The fear the player has allows them to relate to Sam and her experience of coming out. The lightning and creaking noises make the player anxious but not so anxious that they can’t keep playing and have to take a break. As this feeling grows, it transforms into internal fears and intensifies the feeling for the player.

    29. First-person shooters like Doom and Unreal revolutionised our understanding of space, structure and embodiment in games. They put players into the body of a killing machine and set them lose [sic]. First-person walking sims have taken the environmental lessons, the same ideas of architectural structure as a form of storytelling, and diverted the focus from action to introversion. They leave the player alone in a world with their own thoughts. (2016)

      People would never have associated first-person shooters such as Doom and Unreal with walking simulators like Gone Home, but if you take the violence out of FPS games, they are walking sims. Especially in open-world games like Zelda, you walk around the majority of the time. The only difference is that there are fights from time to time, while walking sims tell stories through the environment.

    30. “Walking simulator” began as a derogatory label, and is still controversial among game creators: while some have reclaimed it as a useful category, to others it seems reductive or laden with too many negative associations.

      Walking simulators challenge the traditional view that games were designed only for players to win. The game’s purpose shifts from “Achieve a certain number of kills” or “Capture a certain number of enemy bases” to “Explore the house” or “Check your surroundings.” Players learn how to interpret and examine their environments.

    31. The Quake mod also changes the music, replacing the sharp-edged original soundtrack from Trent Reznor with tracks from one of his more “chilled-out” albums. The change in music is another important move to eliminate the game’s tension and replace it with a thoroughly different mind-set.

      It is interesting to me that there were mods created for Quake that removed the entire main goal of the game. I have never thought about taking out a main mechanic of a video game and then playing it. It makes me rethink all the games I've played and imagine them as pure exploration. I have noticed that playing games with different music can change the whole vibe of the game; I remember playing horror games with my own music, and it made them less scary.

    1. Surely the creative simply create, galvanized by a muse, unlike a lesser workaday mortal

      Goes against the idea that writers write well due to a spark within them. The author pushes back on this belief, which many people hold because of romanticization. Writers work hard to get their work done; it is unrealistic to believe that this is not the case.

    2. We hope to (or struggle to) move from one state to the other but we delay. We label those more disciplined than we are as plodders or hacks yet we chastise ourselves for our own procrastination

      I am someone who has been exposed to procrastination, especially when it comes to my schoolwork. I push all my work until the last minute, causing me to rush to get it done in the end.

    3. “Many writers use alcohol to help themselves write—to calm their anxieties, lift their inhibitions.” Yet heavy drinking can quickly lead to a vicious circle. Writing ultimately suffers because of drink, “the unhappy writer then drinks more; the writing then suffers more, and so on” (Acocella 2004, 116)

      Drinking can relieve writers of the pressure or anxiety they are feeling, so that the writing comes out more naturally.

    4. Writers’ block and procrastination are psychosocial states as much as they are physical ones

      It is not always about being deprived of sleep; writer's block and procrastination can come from psychological issues such as anxiety or depression. It can also come from social and physical factors.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Response to Review

      We would like to thank all three reviewers for their encouraging comments on our manuscript. We now submit our revised study after considerable efforts to address each of the reviewer concerns. I will first provide a response related to a major change we have made in the revision that addressed a concern common to all three reviewers, followed by a point-by-point response to individual comments.

      Replacing LRRK2ARM data with a LRRK2-specific type II kinase inhibitor: The most critical issue for all 3 reviewers was the use of our new CRISPR-generated truncation mutant of LRRK2 that we called LRRK2ARM. We had not provided direct evidence of the protein product of this truncation, which was a significant limitation. To address this we performed proteomics analysis of all clones, and to our surprise, we identified 7 peptides that were C-terminal to the "predicted" stop codon we had engineered into the CRISPR design. A repeat of the deep sequencing analysis in both directions then more clearly revealed site-specific mutations leading to 4 amino acid changes at the junction of exon 19, without introducing a stop codon. Given that we could not detect the protein by western blot (even though proteomics now indicated the region of LRRK2 recognized by our antibodies was present) we decided to remove this clone from the manuscript. In the meantime we had compared the ineffectiveness of MLi-2 to block Rab8 phosphorylation during iron overload in the LRRK2G2019S cells with a type II kinase inhibitor called rebastinib. The data showed very clearly that treatment with rebastinib reversed the iron-induced phospho-Rab8 at the plasma membrane (and by western blot, in new Fig 3). Since this inhibitor is very broad spectrum, inhibiting ~30% of the kinome, we reached out to Sam Reck-Peterson and Andres Leschziner, experts in LRRK2 structure/function, who recently developed much more selective LRRK2-specific type II kinase inhibitors they called RN341 and RN277 (developed with Stefan Knapp PMID: 40465731). These compounds effectively coupled the MLi-2 compound through an indole ring to a rebastinib type II compound to provide LRRK2 binding specificity to the efficient DYG "out" type II inhibitor. As with rebastinib, the new LRRK2-specific kinase inhibitors also effectively reversed the cell surface p-Rab8 seen in iron-loaded LRRK2G2019S cells.

      These new data provide the first biological paradigm where the kinase activity of LRRK2 is resistant to type I MLi-2, yet remains highly sensitive to type II inhibitors. While the loss of our LRRK2ARM clone marks a significant change in the manuscript we believe the main message is stronger with the addition of the new LRRK2-specific type II kinase inhibitor. Our data show that it is indeed the active kinase function of LRRK2G2019S that is impacting the iron phenotypes we observe but highlight the conformational specificity upon iron overload such that MLi-2 is ineffective. The overall phenotypes we observe in LRRK2G2019S macrophages remain unchanged and are now expanded within the manuscript. We hope reviewers will agree that our work provides important new insights into LRRK2 function in iron homeostasis while opening new avenues of research in future studies.

      Given this new information we have changed the title from "LRRK2G2019S acts as a dominant interfering mutant in the context of iron overload" to the more accurate "LRRK2G2019S interferes with NCOA4 trafficking in response to iron overload leading to oxidative stress and ferroptotic cell death."

      Response to Reviewer 1

      Reviewer 1 (R1): There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability.

      Heidi McBride (HM): We agree that the LRRK2G2019S lines express lower levels of LRRK2 than wild type, which is a previously documented phenomenon, presumably as the cell attempts to downregulate the increased kinase activity by reducing protein expression. However, the levels of Rab8 across tens of experiments do not consistently show any differences between the wild type, G2019S and KO. We have provided more comprehensive quantifications of the blots in the revised version, and the Rab8 levels are consistent across all the blots presented in the manuscript (Figure 1A and 1B).

      R1: Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding, see more detailed comments below, most seeking clarifications beyond the above.

      HM: As described in my common response above, we have removed the LRRK2ARM data from the manuscript.

      R1: Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)

      HM: The RAW cell line we generated is homozygous for the G2019S and the KO alleles. We added this to the beginning of the results section and methods.

      R1: The text of the results implies that MLi2 was used in both WT and G2019S RAW cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same baseline or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the RAW cells, but I don't think that is the case.

      HM: We have included the data from MLi-2 treatment of wild type cells in Fig 3C, quantified in D. Again, the baseline levels of Rab8 are unchanged across the genotypes. However, the reviewer is correct that there is some baseline LRRK2 kinase activity that is sensitive to MLi-2 in wild type cells. This is seen most clearly in the autophosphorylation of LRRK2 at S1292 in Fig 3C. The pRab8 blot is not as clear in wild type cells. It is likely that LRRK2 must be actively recruited to membranes (as seen by others with LLOME, etc) to easily visualize p-Rabs in wild type cells. Nevertheless, we do clearly see the activity of autophosphorylation in wild type cells. Therefore, while we understand the reviewer's point that there should be some Rab8 phosphorylation in wild type cells, we don't see a significant, or very convincing, amount of it in our RAW macrophages.

      R1: Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8 alone or together account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).

      HM: In our hands, the RAW cells with homozygous LRRK2G2019S mutations show clearly that the total protein level of LRRK2 is reduced compared to wild type, which is likely a compensatory effect to reduce cellular kinase activity overall. We understand that some of our previous blots were not so clear on the total Rab8 levels across the different experiments. We have repeated many of these experiments and hope the reviewer can see in Figs 1A, 3C, 3E, 3J, and Sup3A that the total Rab8 levels are stable across the conditions. We also present quantifications from 3 independent experiments normalizing the pRab8/Rab8 levels in all three genotypes in untreated and iron-loaded conditions (Supp Fig 3A and B), and upon MLi-2 treatment (Fig 3C). In 3C and D the data show the effectiveness of MLi-2 in reducing pRab8 in control conditions, but the resistance to MLi-2 in FAS-treated cells.

      R1: Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.

      HM: We apologize for not being clearer in the text describing the behavior of NCOA4. The reviewer is correct that the major change in G2019S is the increased Triton X-100 insoluble NCOA4. Previous work has established that NCOA4 segregates into detergent-insoluble foci upon iron overload as a way to release it from ferritin cages, and this fraction is then internalized into lysosomes through a microautophagy pathway (see Mizushima's work PMID: 36066504). In Fig 1I we show that the elevation in NCOA4 and ferritin heavy chain seen in untreated G2019S cells can be cleared upon iron chelation with DFO, indicating that the canonical NCOA4-mediated ferritinophagy (macroautophagy) pathway remains intact to recycle the iron in conditions of iron starvation. However, in Figure 2 we show that in conditions of iron overload, when NCOA4 segregates from ferritin (to allow cytosolic storage of iron), this form of NCOA4 cannot be degraded within the lysosome through the microautophagy pathway, and begins to accumulate. We see this with our live and fixed imaging compared to wild type cells (Fig 2A,D), and by the lack of clearance seen by western blot (Fig 2E). As for the impact of MLi-2, we observe some reversal of NCOA4 accumulation in untreated cells at 4 and 8 hrs after MLi-2 treatment (Supp Fig 2F). However, in iron-loaded conditions the high NCOA4 levels in G2019S cells are MLi-2 insensitive, while the elevated NCOA4 in wild type cells is reduced upon MLi-2 addition (Fig. 2F, compare lanes 3 vs 4 in WT with lanes 7 vs 8 in G2019S). This is consistent with a block in the microautophagy pathway of phase-separated NCOA4 degradation in G2019S cells.

      R1: Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.

      HM: The reviewer is correct. We showed that there is some turnover of NCOA4 in untreated conditions through canonical ferritinophagy, but in iron overload this appears to be blocked: the NCOA4 segregates from ferritin and remains within insoluble, phase-separated structures that cannot be degraded through microautophagy. We have rewritten the text to be clearer on these points.

      R1: These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.

      HM: We agree that many things will change after our FAS treatments and now provide a full proteomics dataset on wild type and G2019S cells with and without iron overload, which is presented in Figure 4A-B. Indeed Figure 4 is entirely new to this revised submission. The proteomics highlighted a series of cellular changes that reflect major cell stress responses including the upregulation of HMOX1 (western blots to validate in Supp Fig 4A), an NRF2 transcriptional target consistent with our observation that NRF2 is stabilized and translocated to the nucleus in G2019S iron-loaded cells (Sup Fig 4B,C). There are several interesting changes, and we highlighted the three major nodes. These are changes in iron response proteins and in lysosomal proteins, particularly a loss of catalytic enzymes like lysozymes and granzymes consistent with the loss of hydrolytic capacity we show in Fig. 4C,D. We also noted changes in cytoskeletal proteins, which we suspect are consistent with the "blebbing" of the plasma membrane we see decorated with pRab8 in Fig 3. To test the activation of lipid oxidation likely resulting from the elevation in Fe2+ and oxidation signatures we employed the C11-bodipy probe and observe strong signal specific to the G2019S iron-loaded cells, particularly labelling endocytic compartments and the cell surface (Fig. 4E-G).

      Lastly, an analysis of SYTOX green uptake experiments was done to monitor the uptake of the dye into cells that have died of cell membrane rupture, commonly used to examine ferroptotic cell death. We now show the G2019S cells are very susceptible to this form of death (Fig 4H,I). These data add new functional evidence for the consequence of the G2019S mutation in an increased susceptibility to iron stress.

      R1: The legend for 2F is awkward (BSADQRED)

      HM: We have changed this to BSA-DQRed, which is a widely used probe to monitor the hydrolytic capacity of the lysosome.

      R1: Why are WT cells not included in Fig 2G?

      HM: We have now included new panels in Fig 3C,D showing wild type and G2019S +/- FAS and +/-ML-i2 with quantifications of pRab8/Rab8.

      R1: The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.

      HM: We have removed all of the LRRK2ARM data given our confusion over the impact of the 4 amino acid changes in exon 19 and our inability to monitor this protein by western blot. The concept that NCOA4 enters into TX100 insoluble, phase separated compartments has been well established, so we didn't explore other detergents at this point.

      R1: Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?

      HM: We did initially show that pRab8 was also at the plasma membrane in the LRRK2ARM cells, and we still focus on this finding for the G2019S, seen in Fig 3A,B,F,H. We did try to look at other p-Rabs known to be targets of LRRK2 but none of them worked in immunofluorescence so we couldn't easily monitor specific traffic and/or localization changes for them.

      R1: The expression levels and therefore stability of the ARM fragment is not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.

      HM: We agree and our attempts to understand this clone resulted in its removal from the manuscript. We did also express cDNA encoding our ARM domain (up to exon 19), but it didn't phenocopy the CRISPR clone, which of course made sense once we had better proteomics and repeated our deep sequencing.

      In our further efforts to understand why our phenotype was MLi-2 resistant upon iron overload, we expanded to examine the impact of pan-specific type II kinase inhibitors, and then reached out to the Reck-Peterson and Leschziner labs to obtain a newly developed LRRK2-selective type II kinase inhibitor. These all very efficiently reversed the pRab8 signals seen at the plasma membrane of G2019S cells upon iron overload (Fig 3E-K). Therefore G2019S is not dominant negative, as we had initially supposed; rather, there is a specific conformation of LRRK2 in high iron that potentially opens the ATP binding pocket to bind the type II inhibitors, but not MLi-2. We do not understand exactly what this conformation is, but it likely involves new protein interactions specific to high iron, or perhaps LRRK2 binds iron directly as a sensor in a way that ultimately leads to the differential sensitivity we observe between type I and type II kinase inhibitors. Our data indicate that MLi-2 treatment in the clinic will not be protective against iron toxicity phenotypes that may contribute to PD, whereas these newer selective type II LRRK2 kinase inhibitors would be effective in this conformation-specific context of iron toxicity.

      R1: Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      HM: Iron overload does not induce pRab8 in LRRK2 KO cells, as seen by immunofluorescence in Fig 3A,B, and western blot in Supp Fig 3 A,B. With our new type II kinase inhibitor data we can confirm that the plasma membrane localized Rab8 is indeed phosphorylated by LRRK2.

      R1: Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation), while post mortem studies are by definition only correlative, early works suggested that the higher melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      HM: We appreciate that the links to iron in PD are correlative. We have maintained some of our discussion on this point within the manuscript given the lack of attention the field has paid to the cell biology of iron homeostasis in PD models. If there is a cell autonomous nature to the loss of DA neurons in PD, iron is very likely to be a part of this specificity in our opinion. Most of the newer MRI studies looking at iron levels in patient brains are showing higher free iron and are exploring this as a potential biomarker of disease. The precise timing of this relative to the stability/loss of neuromelanin is, I agree, not really clear.

      R1: (Significance (Required)): This study could shed light on both a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

      HM: We are so very appreciative that reviewer 1 feels our work will be of interest to the PD and cell biology communities.

      Response to Reviewer 2

      Reviewer 2 (R2): Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs. The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      HM: We did a series of experiments on primary BMDM derived from 3 pairs of wild type, LRRK2G2019S and LRRK2KO mice. We examined levels of ferritin heavy and light chains at steady state and in FAS treatment experiments. Unfortunately, the data did not phenocopy the RAW macrophage lines we present here, since FTL and FTH were mostly unchanged. We did observe an increase in NCOA4 levels, consistent with potential issues with microautophagy as observed in our RAW system.

      While we understand the danger that our phenotypes are nonspecific and linked to a CRISPR-based anomaly, there are a number of arguments we would make that these data and pathways are potentially very important to our understanding of LRRK2 mutant phenotypes and pathology. The first point is that we now include a LRRK2-specific type II kinase inhibitor that reverses the iron-overload pRab8 accumulation at the plasma membrane in LRRK2G2019S cells, showing that this is at least directly linked to LRRK2 kinase activity, even though it is resistant to MLi2.

      Second, Suzanne Pfeffer recently published their single cell RNAseq datasets from brains of untreated LRRK2G2019S mice (PMID: 39088390). She reported major changes in Ferritin heavy chain (it is lost) in very specific cell types of the brain, astrocytes, microglia and oligodendrocytes, with no changes in other cell types at all (her Fig 6 included left). This is consistent with a very context specific impact of LRRK2 on iron homeostasis that we don't yet understand.

      Third, the labs of Cookson, Mamais and LaVoie have been working on the impact of LRRK2 mutations on iron handling in a few different model systems, including iPSCs, and see changes in transferrin recycling and iron accumulation. Those studies did not go into much detail on ferritin, NCOA4 and other readouts of iron homeostasis but are roughly in agreement with our work here. In the last bioRxiv study, submitted after we sent this work for review, they concluded their phenotypes were reversed by MLi-2 treatment; however, they required 7 days of treatment for a ~20% restoration in iron levels. Given our work, it would seem the impact of LRRK2G2019S in high iron conditions is also very resistant to MLi-2 treatment. In all these studies we do not yet know for sure whether iron overload in the brain may be a precursor to DA neuron cell death, which could be exacerbated in G2019S carriers. But we hope the reviewer will agree that our approach and findings will be useful for the field to expand on these concepts within different models of PD.

      R2: Minor comments: Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0.

      HM: We agree with the reviewer that statistical testing is not appropriate when the WT control is fixed to a value of 1, as this necessarily eliminates variance in that group; accordingly, we have removed both statistical comparisons and standard deviation from the WT control while retaining variability measures for all experimental conditions. Raw densitometry values could not be pooled across independent experiments due to substantial inter-blot variability, and therefore normalization to the WT control was used solely to allow relative comparison within experiments, acknowledging the inherent quantitative limitations of Western blot densitometry. Ultimately the magnitude of the changes relative to the control lanes in each biological replicate was consistent across experiments, even if the absolute density of the bands between experiments was not always the same.
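The normalization point raised here can be illustrated with a small sketch. The band intensities below are hypothetical values invented for illustration, not data from the manuscript, and the helper names are assumptions. The sketch contrasts dividing each lane by the WT lane of its own blot (which fixes WT at exactly 1 with zero spread) with dividing by the mean WT intensity across blots (which keeps an error estimate for WT but carries inter-blot variability into every group).

```python
from statistics import mean, stdev

# Hypothetical band intensities (arbitrary units) from three independent blots.
blots = [
    {"WT": 1000.0, "G2019S": 2100.0},
    {"WT": 400.0, "G2019S": 760.0},
    {"WT": 2500.0, "G2019S": 5500.0},
]

def normalize_per_blot(blots):
    """Divide every lane by the WT lane of the same blot, so WT is always 1."""
    return [{lane: v / b["WT"] for lane, v in b.items()} for b in blots]

def normalize_to_mean_control(blots):
    """Divide every lane by the mean WT intensity across all blots."""
    wt_mean = mean(b["WT"] for b in blots)
    return [{lane: v / wt_mean for lane, v in b.items()} for b in blots]

def summarize(normalized, lane):
    """Return (mean, sample standard deviation) for one lane across blots."""
    values = [b[lane] for b in normalized]
    return mean(values), stdev(values)

per_blot = normalize_per_blot(blots)
pooled = normalize_to_mean_control(blots)

print("per-blot WT:", summarize(per_blot, "WT"))        # no variance in WT
print("per-blot G2019S:", summarize(per_blot, "G2019S"))
print("pooled WT:", summarize(pooled, "WT"))            # WT now has a spread
print("pooled G2019S:", summarize(pooled, "G2019S"))
```

With per-blot normalization the fold change in the mutant is consistent across replicates while the control carries no variance, which is why a two-group test against the control is not meaningful in that scheme.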

      R2: The raw data needs to be submitted to PRIDE or similar.

      HM: All of our data are being uploaded to the GEO databases, protocols to protocols.io, and raw data to Zenodo, in compliance with our ASAP funding requirements and those of the journal.

      R2: Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      HM: We now ensure there is quantification of all the blots for at least 3 independent experiments and have worked to improve the quality of them throughout the revision period.

      R2: (Significance (Required)): Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

      HM: We are so very grateful that the reviewer appreciates that the LRRK2 and PD community will find our work of interest. We hope our revisions will prove satisfactory even in the absence of ferritin changes in primary G2019S BMDM.

      Response to Reviewer 3

      Reviewer 3 (R3): What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells.

      HM: We thank the reviewer for pushing us to monitor the functional implications of the iron mishandling upon iron overload in the G2019S RAW cell system. We now add a completely new Figure 4 to get to these functional points. We employed two tools to look at established aspects of ferroptosis. First, we used the C11-bodipy probe, which labels oxidized lipids, and saw significant signals specific to the G2019S iron-loaded cells, where it labels endocytic membranes and the cell surface (Fig 4 E-G). This is consistent with the elevation of free Fe2+. Second, we used the SYTOX green death assay, in which the dye is internalized into cells when the cell surface is ruptured, and show that G2019S cells die upon iron overload, but not the LRRK2KO or wild type cells (Fig 4 H,I). Lastly, we performed full proteomics analysis of the wild type and G2019S RAW cells in iron overload conditions. These data provide a better view of the full stress response initiated in the G2019S cells, including the upregulation of HMOX1 (an NRF2 target gene), changes in lysosomal hydrolytic enzymes consistent with the reduction in BSA-DQRed signals, and changes in the cytoskeleton, which are consistent with the plasma membrane blebbing phenotypes we see in G2019S (Fig. 4A-D and Supp. Fig 4 data). We hope these new data help to position the phenotype into a more physiological output.

      R3: Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      HM: We took this comment to heart and, as explained in the general response, we removed the LRRK2ARM clones from the study. To understand the kinase function in the iron overload conditions, we first explored the pan-specific type II kinase inhibitor rebastinib, shown to inhibit LRRK2. In contrast to MLi-2, this drug effectively blocked p-Rab8 in G2019S cells exposed to high iron. However, since it is not specific and likely inhibits about 30-40% of all kinases, we reached out to the Reck-Peterson and Leschziner labs, who have developed a LRRK2-specific type II kinase inhibitor (published in June 2025, PMID: 40465731). They provided these to us (along with a great deal of discussion) and the two drugs both blocked the effect of LRRK2G2019S on p-Rab8 at the plasma membrane. These data show that the phenotypes we observe are indeed linked to the increased kinase activity of LRRK2, even though they are fully resistant to MLi-2. This suggests that high iron results in some alteration in LRRK2 conformation that alters the ability of MLi-2 to block the kinase activity, while still allowing the type II kinase inhibitors, which bind deeper in the ATP-binding pocket, to functionally block activity. We believe that these new data remove a great deal of the confusion we had in the initial submission in explaining the MLi-2 resistance.

      R3: There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?

      HM: We agree that the G2019S mutation leads to a reduction in total LRRK2 levels in the cell, which is likely a compensatory effect to lower kinase activity in the cell. We do show that the G2019S mutation clearly activates phosphorylation of both Rab8 and the autophosphorylation site S1292 of LRRK2, as seen in Fig 1A and quantified in Fig 1B. In untreated conditions, these phosphorylation events are reversible upon treatment with MLi-2. We also provide the sequencing data in the supplement to confirm the presence of the G2019S mutation in this clone, shown in Supp Fig. 1A.

      R3: The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.

      HM: The G2019S and LRRK2 KO are both homozygous. We state this early in the results section and the methods.

      R3: The transferrin phenotype validated through proteomics and western blot is solid. HM: We agree, thank you very much!

      R3: Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.

      HM: We apologize for the confusion. In control cells the Cherry tagged FTL is efficiently cycling through the lysosomes and we don't see a strong cytosolic (diffuse) pool, which likely reflects the relatively iron-poor culture conditions. However, in G2019S cells, there is a highly elevated amount of FTL, with a strong cytosolic/diffuse stain in steady state, with some flux into lysosomes. In this experiment we chelated iron to test whether this cytosolic pool of FTL was capable of clearing through the lysosomes (ferritinophagy). While there is a cytosolic (diffuse) pool that remains, the pool that fluxes into the lysosome increases in G2019S chelated cells. This is also seen by the reduction in total FTL seen by western blot (endogenous FTL). Our conclusion here is that the general ferritinophagy machinery remains functional in G2019S cells. We have changed the term "diffuse" to "cytosolic" and improved our description of this experiment in the text.

      R3: Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability. HM: We have improved the text to be more descriptive of the conclusions and added new sections

      R3: If the effect is armadillo-dependent, where is LRRK2 G2019S implicated, since there is no kinase domain in these cells?

      HM: Our new data employing the LRRK2-specific type II kinase inhibitors now confirm that the effects of G2019S on iron overload are indeed kinase dependent; they are just insensitive to MLi-2.

      R3: The authors do not show any controls (PCR, sequencing) confirming knockout or truncation. HM: We did higher resolution proteomics and deep sequencing and learned that the "Arm" mutation was not a truncation but a series of 4 point mutations around exon 19. Therefore we removed all data referring to this clone and replaced it with the type II kinase inhibitor experiments. We feel this removed a lot of confusion and provides much clearer conclusions on the role of the kinase activity in iron overload. We may continue to explore why the 4 amino acid mutations created such strong phenotypes, as it could reflect a critical conformational change that impacts the kinase activity. But that is for future work. We now include the sequencing files of the G2019S and KO as Supplementary Data Files 1 and 2.

      R3: The data is interesting and the image quality with the insets is very high. HM: We thank the reviewer for their positive comments!

      R3: Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear. HM: We have removed the clone from the manuscript.

      R3: The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains. HM: We agree and our new drugs allow us to confirm that the phenotypes are due to kinase activity, but there is a new conformation of LRRK2 induced in high iron that renders the kinase domain resistant to MLi-2 inhibition. We discuss this in the manuscript now.

      R3: In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown. HM: these data are now removed.

      R3: In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.

      HM: We show that the pRab8 is specific to the G2019S lines and not seen in LRRK2 KO (Fig 3A,B, Supp. Fig. 3A,B).

      R3: The vinculin bands in figure 4A are misaligned with the rest of the bands.

      HM: We now provide new blots for all of these experiments (in Fig 3) as we removed the LRRK2ARM data from the manuscript and the appropriate loading controls are all included.

      R3: The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed. HM: We now include siRNA validation of Rab8 (vs Rab10) to confirm the specificity of the antibody to pRab8 in IF where it labels the plasma membrane in G2019S iron loaded cells.

      R3: The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      HM: As mentioned above, we have added extensively to our new Fig 4 to include full proteomics analysis of the changes in iron loaded G2019S cells, we use C11-Bodipy probes to monitor lipid oxidation, and SYTOX green assays to monitor cell death through cell surface rupture (consistent with ferroptosis). We thank the reviewer for pushing us to do these experiments and provide further relevance to the potential for LRRK2 mutations to promote cell toxicity during neurodegeneration.

      R3: Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract

      HM: We thank the reviewers for alerting us to the bioRxiv papers, one of which was submitted after we sent our manuscript to review. We are excited to see the growing interest in the impact of LRRK2 function in iron homeostasis and hope our work will contribute to this. The study from the LaVoie lab does show some sensitivity of the iron-loaded phenotype in G2019S cells; however, they see a ~20% reduction in lysosomal iron after 7 days of MLi-2 treatment in astrocytes (their Fig 2L). To us, this is very likely an indication of a relatively high resistance to the drug. We expect that if they tried these new type II inhibitors, the iron load would be much more rapidly reversed. The specificity of their phenotype to Rab8 is also very interesting considering the cell surface localization we see for pRab8 in our iron-loaded system. Similar comments apply to the Gutierrez study in macrophages. We have included the findings of these papers within the manuscript and thank the reviewer for pointing them out.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors report an interesting phenotype of the LRRK2 G2019S mutation on iron homeostasis in RAW264.7 macrophages. The phenotype is well characterised through proteomic and western blot approaches investigating transferrin and ferritin trafficking. The study is well conducted and data of high quality. The authors also appear to have discovered a cellular context where Rab8 is phosphorylated independently of LRRK2. This is a major finding which can potentially have an important impact in the LRRK2 field. What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells. Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      Significance

      Major comments

      In Figure 1:

      • There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?
      • The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.
      • The transferrin phenotype validated through proteomics and western blot is solid.
      • Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.
      • Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability.

      In Figure 2:

      • If the effect is armadillo-dependent, where is LRRK2 G2019S implicated, since there is no kinase domain in these cells?
      • The authors do not show any controls (PCR, sequencing) confirming knockout or truncation.
      • The data is interesting and the image quality with the insets is very high.

      In Figure 3:

      • Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear.
      • The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains.
      • In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown.

      In Figure 4:

      • In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.
      • The vinculin bands in figure 4A are misaligned with the rest of the bands.
      • The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed.
      • The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract
    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript the authors describe an interesting connection between the Parkinson's kinase LRRK2 and iron trafficking in RAW macrophages. Expression of the LRRK2 G2019S mutation affects the abundance of ferritin heavy and light chains and therefore the uptake and storage of iron. Interestingly, the loss of the kinase domain still had a strong phenotype, suggesting that this is independent of the kinase function.

      The paper is well written and excellently cited. The data is convincing and of good quality.

      I have only one request and else very minor comments:

      Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs.

      The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      Minor comments:

      Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0. I would normalize to the average of the control, which will provide an error for the control.

      The raw data needs to be submitted to PRIDE or similar. This has not happened yet.

      Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      Significance

      Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Goldman et al describe some novel findings with respect to LRRK2 and iron handling in a series of RAW macrophage cell lines. This cell background is chosen for its recognized high levels of endogenous LRRK2 protein expression and its somewhat broad use in the field, and the investigators add its relevance due to phagocytosis of red blood cells, thus requiring robust iron metabolic processes. Proteomic analyses of WT and G2019S RAW cells revealed multiple iron-related proteins affected by LRRK2 mutation. A deeper candidate-based analysis revealed complex changes in ferritin heavy and light chain and changes in ferric and ferrous iron. Notably, reliable changes in the levels and/or solubility of NCOA4 result from this pathogenic LRRK2 mutation. Unexpectedly, however, these changes were not sensitive to LRRK2 kinase inhibitor treatment. The investigators suggest a dominant effect rather than loss-of-function as subsequent experiments revealed that these effects could be replicated with a LRRK2 variant lacking the kinase domain (LRRK2-ARM) and were not replicated by LRRK2 KO. The data are internally consistent throughout and could certainly shed new important light onto unique and unexpected effects of this LRRK2 mutation.

      There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability. Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding. See more detailed comments below, most seeking clarifications beyond the above.

      • Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)
      • The text of the results implies that MLi2 was used in both WT and G2019S RAW cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same baseline or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the RAW cells, but I don't think that is the case.
      • Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8, alone or together, account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).
      • Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.
      • Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.
      • These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.
      • The legend for 2F is awkward (BSADQRED)
      • Why are WT cells not included in Fig 2G?
      • The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.
      • Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?
      • The expression levels and therefore stability of the ARM fragment are not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of a second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.
      • Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation). While post mortem studies are by definition only correlative, early works suggested that the more highly melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      Significance

      This study could shed light on a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank all the reviewers for their comments and suggestions.

      Please find below our point-by-point response to the Reviewers' comments, which details the corrections already made and outlines the planned revisions, experiments, and analyses.

      Reviewer 1

      Major comments:

      • Reviewer 1 commented that the 'manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible'. We agree wholeheartedly with the reviewer and apologise. We have now revisited the text, figures, and associated figure legends to ensure that they are more easily accessible and fully comprehensible to readers from across disciplines. This includes adding labels to point out specific anatomical features on images, and ensuring figures and text align. Further specific examples are included in the points below.
      • In response to concerns raised by Reviewer 1 relating to Figure 1 and the lack of figure citations, 'the persistence of mCherry in the H2B Fucci', and how 'mCherry seems to persist longer in H1 (compare Figs 1D and 1G)':
      • We apologise for the lack of figure citations in the text. We have now reworked the figures relating to the constructs (original Figures 1 and S1) and have made these Figures 1, 2 and S1 in our updated version.
      • Figure 1 is now an introductory background figure which illustrates the differences between Fucci(SA) and Fucci(CA) reporters, with additional details provided in the associated legend, and call outs to the figure starting in the introduction.
      • Regarding 'the persistence of mCherry in the H2B Fucci', what we are trying to articulate is that the mCherry degradation that we observed in the Fucci(2A) expressing DF1 cells extended beyond the end of S phase and into G2/M, compared with what would be expected (Revised Figure 2H, arrows).
      • We have now replaced these montages with a more representative example. Additionally, the new images (Figures 2C and 2G) are synchronised (both starting at G2/M), restricted to a single cell cycle, are larger in size, and have the cell cycle stage labelled. We believe these changes will aid interpretation.
      • Specifically relating to the lack of labelling in Figure 3A, we agree that this figure was not labelled sufficiently, and neither was there enough detail included in the text or figure legend for readers to follow easily and make their own conclusions. We have now added additional labels to this figure, broken the figure down into more panels (Figures 4A-4D in revised manuscript), and included more detailed descriptions in the associated figure legend and text.
      • We thank the reviewer for making the important point that it is 'hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are' which was made more challenging by insufficient references to previous studies.
      • Firstly, as for the point above, we have now added labels to many of the panels (Figure 4 in revision), including highlighting features such as the non-proliferative dermal condensates and demarcating the proliferative retinal pigmented epithelium (Figures 4F and 4G in revision). Secondly, we have also now included additional references in the text, specifically relating to the neural tube, digits, and forming feathers, where our proliferation profiles are consistent with previous literature.
      • With regard to the Reviewer's comment on the difficulty in drawing conclusions 'about cell cycle in different tissue layers without sectioning' in original Figure 3B, we will include more sections of FuChi embryos which include structures such as mesenchymal condensates.
      • To make our data on cell cycle stages as 'cells egress from the primitive streak, to form prechordal plate' clearer we have added additional labels to the figures (Figures 4B and 6E in revised manuscript). We will complement this by adding sections of gastrulating FuChi embryos to further demonstrate the cell cycle status of cells that form the prechordal plate.

      Minor comments

      • We have added additional references relating to the data in original Figure 3 (now Figure 4 see above), and any new descriptions of known proliferation profiles that we include will have appropriate citations.
      • In this current revision we have addressed figure call out issues, and added labels to enhance readability, clarity and data interpretation.

      Reviewer 2

      Major comments

      • Reviewer 2 rightly pointed out that the 'description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors'. We have now added additional text to properly explain the CDT1 probe dynamics, as per the cited manuscript, and also referenced the schematics to help readers.
      • To address whether the FuChi model can be accurately 'used to study embryogenesis' and following up on the suggestion to 'indicate if the size of the embryos is comparable to the wildtype' we have now included size comparisons of FuChi and wild-type/non-transgenic embryos at mid (E9) and late (E18) gestational stages demonstrating that there is no significant difference between genotypes during embryogenesis (Figure 3D in revised manuscript). For all earlier stages, we did not see any developmental or size differences. We believe if there were any differences, these would be reflected in size at the mid and late gestational stages we analysed.
      • Reviewer 2 made very valuable observations and suggestions regarding our data and interpretation of somitogenesis, specifically in response to our sentence saying that "the mesenchyme, which is predominantly in G1 as they undergo condensation". Furthermore, they noted that Supplementary Video 4 "shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1)". We were asked to review the sentence/video to clarify if this is a significant finding or if this is not representative of our observations.
      • We thank the reviewer for this suggestion. From looking again at our timelapse movies, and also analysing additional static images, we agree that presomitic mesoderm (PSM) does appear to be green (S phase), which then may transition to G1 as the somites form. To address this, we plan to quantify cell cycle status in the PSM on embryos to see if this is a significant finding.
      • We hope this quantification of the PSM may also enable us to include discussion on how our findings relate to the Cell Cycle model for somitogenesis proposed in the Collier et al, 2000 paper suggested by the Reviewer.
      • We agree with the Reviewer that "the fluorescence profiles in original Figure 4C do not seem similar regarding the Myc-tag epitope" and believe this difference is likely just a reflection of the part of the image we used. We will include a more representative image once we have repeated the staining.
      • Reviewer 2 has asked for quantitative support for our fluorescence-based interpretations. We thank the reviewer for this suggestion and are now planning to perform quantitative analyses of different tissues (similar to our quantification in germ cells) and in embryos to support our observations. These will include the PSM (see above), neural tube, intestine, and early embryos (also see Reviewer 3 response for blastoderm quantification).
      • Since our original submission, we have further refined our in situ hybridisation protocol on FuChi embryos (Figures 5A & B in revision), finding that strong reporter expression is maintained for all the fluorescent proteins of the H1-Fucci(CA)2 reporter. Therefore, the "notably fainter" appearance of the hGMNN-mVenus in Figure 4A from the first version of the paper was likely a result of the experimental protocol not being 100% optimal.

      Minor comments

      • We have reordered the paragraphs relating to the different Fucci versions in the introduction as per the suggestions by the reviewer for better clarity.
      • To address the issues with Fucci system nomenclatures which made reading difficult, we have now added a background figure (new Figure 1 in revised draft) which is cited in the introduction, made sure constructs are introduced appropriately, and ensured we are consistent with our nomenclature.
      • Supplementary Figure lettering corrected.
      • All figure panels are now mentioned in the main text, and the incorrect call outs noted by the Reviewer have been corrected.
      • Removed period and included clarifying statement in the figure legend relating to the comment regarding the extraembryonic region in Figure 5 (original) / Figure 6 (revised).
      • Other issues raised relating to reference duplication and missing words have been resolved.
      • We have corrected the legend of Figure 1 of the original paper, see related Reviewer 1 response provided above.

      Reviewer 3

      Minor comments

      • We have corrected all the figure call outs (see responses to similar comments by Reviewers 1 and 2) to ensure that all data presented is accurately reported.
      • We would like to thank the reviewer for suggesting modifications to the cell cycle montages (original figures 1D, 1G and 2F). We agree it would help the reader to enlarge the image, and therefore reduced the montage to include just one cell cycle, and have also included annotations of cell cycle stages in Figures 2C and 2G of the revised manuscript. We have also added some labels to Figure 3E (original figure 2F) and enlarged this.
      • In response to Reviewer 3's comment regarding fluorescence intensity, we quantified fluorescence levels in multiple individual DF1 cells expressing either the H1.0-Fucci(CA)2 or H2B-Fucci(SA)2 reporters, and this is shown as the fluorescent index in Figures 2D, 2E, 2H and 2I of the revised manuscript, where reporter levels were measured across time. In terms of overall mean intensity levels of the reporters, we found the reporters to be comparable in brightness and have similar mean intensity levels across the cell populations in the flow cytometry data (Figures 2F and 2J).
      • To enhance speedy interpretation, we will also process our supplementary videos to include annotations and arrows to highlight key cells and events (e.g. a cell undergoing mitosis).
      • As recommended by Reviewer 3, we have now quantified cell cycle status in blastoderm cells, confirming that a high proportion are in the G2/M phase. We will include these data in the final revision, which will complement our planned quantification of cell cycle status in other tissues (see response to Reviewer 2).
      • For our final revision, we will include higher magnification/zoomed in images of selected regions of the somites, neural tube (lumen) and retina (epithelium). Revisiting our images of the neural tube showed that cells dividing at the lumen did so in the perpendicular plane, and we will include these images in our revision to provide further evidence of the fidelity of the FuChi reporter. We thank the reviewer for this excellent idea to show the efficacy of our system.
      • To address the levels of proliferation in somites, we plan to generate a cropped video with a fixed ROI to enable proliferation in individual cells of the forming somites to be more readily visualised. This will be further complemented by the quantification of cell cycle status in forming somites (see responses to other reviewers).
      • We have added lines to the discussion regarding the use of our reporter in other conventional model systems.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Sudderick and colleagues describes the development and characterisation of a new generation of cell cycle reporter that can distinguish between cells in G1, S, G2 and M phases. Furthermore, the authors have developed a transgenic chicken line incorporating this reporter and demonstrated faithful discrimination of cell cycle stages in the in vivo context of developing transgenic embryos. Of note is the addition of epitope tags, which facilitate discrimination of cell cycle stages in tissue fixed using various techniques. This is a very important paper for the following reasons:

      • The authors have achieved faithful discrimination of all four cell cycle stages, which is a major advance in itself.
      • This generation of the FuChi transgenic chick is of enormous importance. This will facilitate accurate in vivo studies in a broad range of fixed and living tissue types and is a major milestone in the further establishment of the chick as a transgenic model system.

      The characterisation of the cell cycle reporter as presented is robust and convincing. The authors further demonstrate the potential utility of the FuChi chickens through their observation of partial cell cycle synchrony during onset of development. I therefore only have minor suggestions that may facilitate easier interpretation of their data.

      Results 2

      • I can't see any mention of Figures 1C and D. Presumably the authors have carried out fluorescence intensity measurements using the two cell cycle reporters here, but this is not mentioned in the main text.
      • Figure 1D&G: I find these difficult to follow given the small size of the cells as presented. The authors may consider enlarging these and clearly annotating for cell cycle stage. They may find it helpful to focus on a single cell cycle, although I appreciate that displaying two cell cycles strengthens the claim of efficacy of the newly developed sensor. The supplementary videos associated with these figure panels are excellent as they display several cells with faithful reporter activity, but again, the authors may wish to annotate a few of these cells to enhance speedy interpretation. I have similar comments for Figure 2F and the associated movie.

      Results 4

      • The authors state that a large proportion of blastoderm cells were in G2/M. They may wish to formally quantify this, perhaps by performing simple cell counts in designated regions of interest. A similar quantification for gastrulating embryos would also be helpful.
      • It would be helpful to see zoomed in images of selected regions of the somites, neural tube and retina displayed in Figure 3B. This would be particularly appropriate in the context of the neural tube and retina (which are not discussed in the main text) as the positioning of the nucleus is defined by the stage of the cell cycle and should therefore serve to highlight the efficacy of the reporter.
      • Video 4 beautifully demonstrates the high levels of proliferation in somites, but again, it would be useful to have a zoomed in view. I appreciate the difficulty involved in doing this, given the movement of the embryo, but perhaps the authors could focus on a fixed ROI or present a separate movie of a few cells undergoing a full cell cycle.

      Discussion

      • The authors could perhaps expand on their discussion about potential utility in other conventional model systems (e.g. mouse, fish, etc).

      Significance

      General assessment: A timely piece of work that introduces a faithful cell cycle reporter that will be of broad interest.

      Advance: The ability to discriminate between all four stages of the cell cycle is a clear advance here.

      Audience: Broad interest, including those studying cell cycle and embryonic development in several tissue contexts.

      Expertise: Chick embryology, in vivo live imaging, neurogenesis, cellular developmental biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This work presents a novel transgenic chicken model with fluorescent reporters that allow in vivo monitoring of the four phases of the cell cycle. To achieve this, the authors clearly identify the limitations of previous Fucci systems and developed an optimised reporter construct that overcomes the major technical challenges identified. Addition of epitope tags to cell cycle stage-specific markers further enables antibody detection in fixed tissues. Proof of concept is provided by live imaging of chick embryos in early developmental stages, evidencing dynamic cell cycle states in tissues and migrating cells.

      Major comments:

      1. Introduction: Description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors. Namely: "...accumulation of the CTD1 probe..." should be expected in the G1-S transition (not S-G2) and the yellow reporter should be expected in G2 and M phases (not S and G2, as described). Please review this portion of the text.
      2. The authors state that "Of note, hatched FuChi chicks are initially smaller than wild type counterparts but grow at comparative rates and are fertile". If the model is to be used to study embryogenesis, it would be useful to indicate if the size of the embryos is comparable to the wildtype, at least for the major developmental stages mentioned in the manuscript.
      3. When referring to somitogenesis, the authors state "...the mesenchyme, which is predominantly in G1 as they undergo condensation". Suppl Video 4, however, shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1). The authors should review the sentence/video to clarify if this is a significant finding or if this is not representative of their observations.
      4. (Optional) It would be interesting to describe if the authors' observations of cell cycle dynamics in the presomitic mesoderm support the proposed Cell Cycle model for somitogenesis (Collier et al., J.Theor.Biol.2000).
      5. The fluorescence profiles in Figure 4C do not seem similar regarding the Myc-tag epitope (contrarily to what is stated). The authors should rephrase or revisit this image to clarify their findings.
      6. Quantitative support for several fluorescence-based interpretations made throughout the manuscript. In some instances, conclusions are drawn from qualitative differences in signal intensity. For example, the statement in Fig. 4A that hGMNN-mVenus appears "notably fainter" than the other reporters. Incorporating simple quantitative analyses would strengthen these claims and ensure that observed differences reflect biological behaviour rather than technical or optical factors.

      Minor comments:

      1. Organization of the information in the Introduction: Paragraphs 3-5 introduce sequentially improved versions of the Fucci system. Then, paragraph 6 returns to the system described in the 4th paragraph. Authors should consider including paragraph 5 (description of Fucci4 and its limitations) just prior to the description of chickens as valuable developmental models (current paragraph 8) for clarity of the text.
      2. Fucci system nomenclature. Many different Fucci systems are mentioned, but nomenclature consistency throughout the manuscript is lacking, which makes reading difficult. For example, the terms "Fucci(SA)2" and "Fucci(CA)2" should be defined in the introduction, as they are employed to describe the construction of the new biosensor in the following sections.
      3. Some figure panels are not mentioned in the main text (for ex. Figures 1B and C, Figure 2C)
      4. The legend of Figure 1 (D & G) mentions "denoted by *", but the * seems to be missing in the figure.
      5. Supplementary Figure 1 has two D panels (and is missing the E).
      6. In the main text, where it reads "...Flow cytometry analysis of three independent PGC lines... (Figures 2G & S2E)", S2E should be replaced by S1E.
      7. In the Figure 4A legend, hCDT1-mVenus should be corrected to hCDT1-mCherry. Also, it is not clear why the authors state that "hGMNN-mVenus expression is notably fainter compared with hCDT1-mVenus and H1.0-mCerulean expression".
      8. In Figure 5E, the optical sections "i" seem to pertain to the extraembryonic tissue/area opaca and not to anterior mesoderm, as stated in the figure legend. Also, there is a period between "prechordal plate" and "and" in the legend's last sentence.
      9. Discussion: The last sentence of the third paragraph lacks "to" between "used" and "interrogate".
      10. References 10 and 23 are identical.

      Referee cross-commenting

      I agree with all comments from reviewers 1 and 3

      Significance

      This is a beautiful paper, describing a long sought-after model system to study cell cycle dynamics in vivo. The methodological details are thorough, and the results obtained are clearly presented, highlighting the utility of the new model in various embryonic stages and tissues/organs.

      This work is of pivotal importance to the developmental/stem cell biology community, as well as to the wider community that employs the chicken embryo as a preclinical model to assess therapeutic or teratogenic potential of biologically- or chemically-derived products.

      My expertise is in chicken embryo development, namely gastrulation, somitogenesis and limb bud outgrowth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript reports the development of a novel Fucci (Fluorescent Ubiquitination-based cell cycle indicator) system for cell cycle analysis, including live imaging of the cell cycle. The novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that have (and in the future, will) particularly informed our understanding of early stages of embryogenesis, and of development of numerous tissues, including the neural tube, somites, limb bud. The authors conclude that the novel system has advantages over previous Fucci systems, including faithful labelling of all four cell cycle phases. Importantly, the authors have generated a stable germline of H1.0-Fucci(CA)2 transgenic chicks, enabling, for the first time, the discrimination and tracking of cells in all 4 phases of the cell cycle - i.e. in vivo studies of cell cycle progression in intact tissues and organs. Additional epitope tags mean that the biosensor can be detected in fixed tissues, enabling comparison of cell cycle with expression of mRNA and proteins that mediate other aspects of development/label particular cells and tissues. The authors map proliferation dynamics across numerous tissues in the developing chick, at numerous stages of development, and conclude in particular that transition from S phase may be a key morphogenetic event in gastrulation, as mesendoderm cells leave the primitive streak to form embryonic structures such as prechordal plate.

      Major comments:

      The novel biosensor looks to be an incredibly useful tool, and the manuscript suggests patterns of cell cycle progression in different tissues, and at different points in time, that look intriguing. But it is sometimes difficult to draw the strong conclusions suggested by the authors because the text and figures are sometimes difficult to follow. The manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible.

      Specifically:

      Conclusion 1: That the new FUCCI biosensor is a superior cell cycle probe, better at discriminating all cell cycle phases than previous versions. I was very convinced by the videos (video 1 and 2) but had problems with Figure 1. Potentially, this is because I am not an expert in these types of analyses - but it was not helped by the fact that components of the figure were not cited in the text. I was particularly confused by the statement remarking on 'the persistence of mCherry in the H2B Fucci' as mCherry seems to persist longer in H1 (compare Figs 1D and 1G). Please explain, in the Figure legend, why this appears to be the case.

      Conclusion 2: that the FuChi chicks are the first viable stably expressing avian cell cycle biosensor model. I agree, and the authors should be congratulated on the development of this important tool.

      Conclusion 3: the authors monitor cell cycle progression in chicks, in vivo, looking at stages from blastoderm, through gastrulation, and into organogenesis, and draw various conclusions

      For example: Fig 3A and text: 'as gastrulation progresses, the primitive streak an presomitic mesoderm display...., whereas the .... And neural plate contains...'

      Figure 3A covers an enormous range of stages and tissues. The figure is barely labelled. The text and figure need to better align, and key features in each figure panel need to be labelled so that the reader can better follow, and draw conclusions.

      Fig 3B: Reports expression in numerous tissues. There are some beautiful examples of cells segregating relative to cell cycle - for instance, in the neural tube. But I found it hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are. Again, this is not described adequately in the text (for instance, there is no mention of the neural tube). And in some cases, references are provided (allowing comparison with previous studies) - but in other cases, there are no references to previous studies. The reader must be given the opportunity to compare this study with previous studies.

      Overall - I can appreciate that there are some fascinating patterns, but it is very difficult to draw the conclusions suggested by the authors. Primarily this is due to poor labelling of figures, and lack of clarity between figures and text, and poor referencing. Additionally, it is not clear that strong conclusions can be drawn about cell cycle in different tissue layers without sectioning some embryos.

      Fig 3C: The authors remark 'The results confirm that the ... FuChi embryos recapitulate known cell cycle profiles of those tissues'. See my comments in 3B.

      Conclusion 4: Robust stability of biosensor in fixed tissues. I agree, and the authors should be congratulated for having made a construct that can be paired with in situ hybridisation and immunohistochemistry - this is invaluable.

      Conclusion 5: The authors investigate the potential of the new system for live imaging, and focus on a couple of novel dynamic examples.

      The data indicating that PGCs at initial migratory stages are not undergoing frequent cell division is clear.

      However, the data indicating that cell cycle status changes as cells egress from the primitive streak, to form prechordal plate, is not clear. The figures need to be better labelled, and the text needs to be clearer (eg ' and prechordal plate. and anterior mesoderm').

      Minor comments:

      • Specific experimental issues that are easily addressable.

      I would recommend that the authors section some embryos, to better support key conclusions (eg in figures 3 and 5)

      • Are prior studies referenced appropriately?

      Not always - see comment above (Fig 3)

      • Are the text and figures clear and accurate?

      No - this needs work. Not all figures are cited in the text, or they are cited in the wrong order; figures are poorly labelled, making it hard to follow

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Label figures more carefully and ensure figures and text align

      Referee cross-commenting

      I agree with all comments from reviewers 2 and 3

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Technically this is a fantastic resource. As detailed above, the novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that has particularly informed (and will continue to inform) our understanding of early stages of embryogenesis, and of development of numerous tissues, including the neural tube, somites, and limb bud. Increasingly, studies show the importance of cell cycle for development, differentiation and morphogenesis - it is a huge breakthrough to be able to perform in vivo studies of cell cycle progression in intact tissues and organs.

      • State what audience might be interested in and influenced by the reported findings.

      Broad basic research, including developmental biologists, stem cell biologists, modellers.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Developmental biologist, with expertise in chick

    1. That begs the obvious question: whether they’ve reached that goal yet. Not a chance, said Shah. “It’s a work in progress, right? It’s forever a work in progress. By definition, I don’t think we’ll ever reach it, but I think we are further along than almost anyone else.”

      I am about 50/50 on this way of thinking. While nothing is perfect and everything can improve and evolve over time, I would not lock myself into believing that I am far from, or could never reach, the goal I set for myself at a given moment. Everything in life is resilience and the continuous improvement of processes in a progressive, achievable way.